Evidence for a Global Sampling Process in Extraction of Summary Statistics of Item Sizes in a Set.
Tokita, Midori; Ueda, Sachiyo; Ishiguchi, Akira
2016-01-01
Several studies have shown that our visual system may construct a "summary statistical representation" over groups of visual objects. Although there is a general understanding that human observers can accurately represent sets of a variety of features, many questions about how summary statistics, such as an average, are computed remain unanswered. This study investigated the sampling properties of the visual information human observers use to extract two types of summary statistics of item sets: average and variance. We presented three ideal-observer models for extracting the summary statistics: a global sampling model without sampling noise, a global sampling model with sampling noise, and a limited sampling model. We compared the performance of an ideal observer under each model with that of human observers using statistical efficiency analysis. Results suggest that summary statistics of items in a set may be computed without representing individual items, which makes it possible to discard the limited sampling account. Moreover, the extraction of summary statistics may not necessarily require the representation of individual objects with focused attention when sets contain more than four items.
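A minimal simulation sketch of the comparison these models set up (not the authors' exact observer models; the item-size range, noise level, and subset size below are invented parameters): a global-sampling observer averages every item with per-item noise, a limited-sampling observer averages a noise-free random subset, and relative efficiency is read off the ratio of error variances.

```python
# Illustrative sketch (not the paper's exact models): compare a global-sampling
# observer that averages every item with added noise against a limited-sampling
# observer that averages only k randomly chosen items.
import numpy as np

rng = np.random.default_rng(0)
n_items, k, noise_sd, n_trials = 8, 2, 0.1, 20000

errors_global, errors_limited = [], []
for _ in range(n_trials):
    sizes = rng.uniform(0.5, 1.5, n_items)      # item sizes in a set
    true_mean = sizes.mean()
    # Global sampling with noise: all items contribute, each perturbed.
    est_global = (sizes + rng.normal(0, noise_sd, n_items)).mean()
    # Limited sampling: only k items are represented, noise-free.
    est_limited = rng.choice(sizes, k, replace=False).mean()
    errors_global.append(est_global - true_mean)
    errors_limited.append(est_limited - true_mean)

var_g, var_l = np.var(errors_global), np.var(errors_limited)
print(f"global-sampling error variance:  {var_g:.5f}")
print(f"limited-sampling error variance: {var_l:.5f}")
print(f"efficiency of limited vs global: {var_g / var_l:.3f}")
```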
Mohammed A. Kalkhan; Robin M. Reich; Raymond L. Czaplewski
1996-01-01
A Monte Carlo simulation was used to evaluate the statistical properties of measures of association and the Kappa statistic under double sampling with replacement. Three error matrices represented three levels of classification accuracy of Landsat TM data for four forest cover types in North Carolina. The overall accuracy of the five indices ranged from 0.35...
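A hedged sketch of the Kappa computation that such a simulation evaluates, using an invented 4 x 4 error matrix in place of the study's Landsat data:

```python
# Minimal sketch: Cohen's kappa and overall accuracy from a classification
# error (confusion) matrix; the 4x4 layout stands in for four cover types.
import numpy as np

def kappa(m):
    m = np.asarray(m, dtype=float)
    n = m.sum()
    p_o = np.trace(m) / n                          # observed agreement
    p_e = (m.sum(0) * m.sum(1)).sum() / n**2       # chance agreement
    return (p_o - p_e) / (1 - p_e)

error_matrix = [[50,  3,  2,  1],   # hypothetical counts: rows = classified,
                [ 4, 45,  5,  2],   # columns = reference cover type
                [ 2,  6, 40,  3],
                [ 1,  2,  4, 30]]
print(f"overall accuracy: {np.trace(error_matrix) / np.sum(error_matrix):.3f}")
print(f"kappa: {kappa(error_matrix):.3f}")
```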
ProUCL version 4.1.00 Documentation Downloads
ProUCL version 4.1.00 represents a comprehensive statistical software package equipped with the statistical methods and graphical tools needed to address many environmental sampling and statistical issues, as described in various guidance documents.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems exist when testing the accuracy of thematic maps and mapping: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table, sometimes called a classification error matrix. Usually the rows represent the interpretation and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent errors of omission. For tests of hypotheses that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for individual categories, the entire map, or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step would be to analyze the entire classification error matrices using the methods of discrete multivariate analysis or multivariate analysis of variance.
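The row/column reading described above translates directly into code; a minimal sketch with an invented 3 x 3 error matrix (rows = interpretation, columns = verification):

```python
# Sketch of the row/column reading described above: commission errors come
# from the off-diagonal of each row, omission errors from each column.
import numpy as np

m = np.array([[50,  3,  2],     # hypothetical error matrix:
              [ 4, 45,  5],     # rows = interpretation, cols = verification
              [ 2,  6, 40]], dtype=float)

diag = np.diag(m)
users_acc = diag / m.sum(axis=1)      # 1 - commission error rate, per row
producers_acc = diag / m.sum(axis=0)  # 1 - omission error rate, per column
for i, (u, p) in enumerate(zip(users_acc, producers_acc)):
    print(f"class {i}: commission error {1-u:.3f}, omission error {1-p:.3f}")
```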
ERIC Educational Resources Information Center
Chang, Chun-Yen; Cheng, Wei-Ying
2008-01-01
The interrelationship between senior high school students' science achievement (SA) and their self-confidence and interest in science (SCIS) was explored with a representative sample of approximately 1,044 11th-grade students from 30 classes attending four high schools throughout Taiwan. Statistical analyses indicated that a statistically…
ERIC Educational Resources Information Center
Bailey, Thomas; Jenkins, Davis; Leinbach, Timothy
2005-01-01
This report summarizes statistics on access and attainment in higher education, focusing particularly on community college students, using data from the National Education Longitudinal Study of 1988 (NELS:88), which follows a nationally representative sample of individuals who were eighth graders in the spring of 1988. A sample of these…
Statistical scaling of geometric characteristics in stochastically generated pore microstructures
Hyman, Jeffrey D.; Guadagnini, Alberto; Winter, C. Larrabee
2015-05-21
In this study, we analyze the statistical scaling of structural attributes of virtual porous microstructures that are stochastically generated by thresholding Gaussian random fields. Characterization of the extent to which randomly generated pore spaces can be considered representative of a particular rock sample depends on the metrics employed to compare the virtual sample against its physical counterpart. Typically, comparisons against features and/or patterns of geometric observables, e.g., porosity and specific surface area, flow-related macroscopic parameters, e.g., permeability, or autocorrelation functions are used to assess the representativeness of a virtual sample, and thereby the quality of the generation method. Here, we rely on manifestations of statistical scaling of geometric observables, recently observed in real millimeter-scale rock samples [13], as additional relevant metrics by which to characterize a virtual sample. We explore the statistical scaling of two geometric observables, namely porosity (Φ) and specific surface area (SSA), of porous microstructures generated using the method of Smolarkiewicz and Winter [42] and Hyman and Winter [22]. Our results suggest that the method can produce virtual pore space samples displaying the symptoms of statistical scaling observed in real rock samples. Order q sample structure functions (statistical moments of absolute increments) of Φ and SSA scale as a power of the separation distance (lag) over a range of lags, and extended self-similarity (linear relationship between log structure functions of successive orders) appears to be an intrinsic property of the generated media. The width of the range of lags where power-law scaling is observed and the Hurst coefficient associated with the variables we consider can be controlled by the generation parameters of the method.
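A minimal sketch of what an order-q sample structure function is, computed on a synthetic Brownian-like series standing in for a transect of a geometric observable (not real porosity data; the expected exponents assume Hurst coefficient H = 0.5):

```python
# S_q(lag) = mean(|x[i+lag] - x[i]|**q); power-law scaling shows up as a
# straight line in log(S_q) vs log(lag).
import numpy as np

rng = np.random.default_rng(9)
x = np.cumsum(rng.normal(size=4096))        # Brownian-like series, H ~ 0.5

def structure_function(x, lag, q):
    inc = np.abs(x[lag:] - x[:-lag])
    return (inc ** q).mean()

lags = np.array([1, 2, 4, 8, 16, 32, 64])
for q in (1, 2, 3):
    s = np.array([structure_function(x, l, q) for l in lags])
    slope = np.polyfit(np.log(lags), np.log(s), 1)[0]
    print(f"q={q}: scaling exponent ~ {slope:.2f} (expect ~ q*H with H=0.5)")
```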
Howard Stauffer; Nadav Nur
2005-01-01
The papers included in the Advances in Statistics section of the Partners in Flight (PIF) 2002 Proceedings represent a small sample of statistical topics of current importance to Partners In Flight research scientists: hierarchical modeling, estimation of detection probabilities, and Bayesian applications. Sauer et al. (this volume) examines a hierarchical model...
Efficient statistical tests to compare Youden index: accounting for contingency correlation.
Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan
2015-04-30
Youden index is widely utilized in studies evaluating the accuracy of diagnostic tests and the performance of predictive, prognostic, or risk models. However, both the one-sample and the two independent sample tests on Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. Moreover, a paired sample test on Youden index has been unavailable. This article develops efficient statistical inference procedures for one sample, independent, and paired sample tests on Youden index by accounting for contingency correlation, namely the associations between sensitivity and specificity and between paired samples typically represented in contingency tables. For the one and two independent sample tests, the variances are estimated by the Delta method, and the statistical inference is based on central limit theory; these are then verified by bootstrap estimates. For the paired samples test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of the kappa statistic, so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than does the original Youden's approach. Therefore, the simple explicit large-sample solution performs very well. Because we can readily implement the asymptotic and exact bootstrap computation with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
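A hedged sketch of the quantities involved, in Python rather than the R workflow the authors mention, with invented diagnostic counts: the Youden index J = sensitivity + specificity - 1, plus a simple parametric bootstrap CI (not the authors' Delta-method or kappa-based procedures).

```python
# Youden index from a hypothetical 2x2 diagnostic table with a percentile
# bootstrap confidence interval.
import numpy as np

rng = np.random.default_rng(1)
tp, fn, fp, tn = 80, 20, 15, 85          # hypothetical diagnostic counts

def youden(tp, fn, fp, tn):
    return tp / (tp + fn) + tn / (tn + fp) - 1

boot = []
for _ in range(5000):
    d = rng.binomial(1, tp / (tp + fn), tp + fn).sum()   # resampled diseased
    h = rng.binomial(1, tn / (tn + fp), tn + fp).sum()   # resampled healthy
    boot.append(d / (tp + fn) + h / (tn + fp) - 1)

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"J = {youden(tp, fn, fp, tn):.3f}, 95% bootstrap CI ({lo:.3f}, {hi:.3f})")
```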
Tilson, Julie K; Marshall, Katie; Tam, Jodi J; Fetters, Linda
2016-04-22
A primary barrier to the implementation of evidence based practice (EBP) in physical therapy is therapists' limited ability to understand and interpret statistics. Physical therapists demonstrate limited skills and report low self-efficacy for interpreting results of statistical procedures. While standards for physical therapist education include statistics, little empirical evidence is available to inform what should constitute such curricula. The purpose of this study was to conduct a census of the statistical terms and study designs used in physical therapy literature and to use the results to make recommendations for curricular development in physical therapist education. We conducted a bibliometric analysis of 14 peer-reviewed journals associated with the American Physical Therapy Association over 12 months (Oct 2011-Sept 2012). Trained raters recorded every statistical term appearing in identified systematic reviews, primary research reports, and case series and case reports. Investigator-reported study design was also recorded. Terms representing the same statistical test or concept were combined into a single, representative term. Cumulative percentage was used to identify the most common representative statistical terms. Common representative terms were organized into eight categories to inform curricular design. Of 485 articles reviewed, 391 met the inclusion criteria. These 391 articles used 532 different terms which were combined into 321 representative terms; 13.1 (sd = 8.0) terms per article. Eighty-one representative terms constituted 90% of all representative term occurrences. Of the remaining 240 representative terms, 105 (44%) were used in only one article. The most common study design was prospective cohort (32.5%). Physical therapy literature contains a large number of statistical terms and concepts for readers to navigate. However, in the year sampled, 81 representative terms accounted for 90% of all occurrences. These "common representative terms" can be used to inform curricula to promote physical therapists' skills, competency, and confidence in interpreting statistics in their professional literature. We make specific recommendations for curriculum development informed by our findings.
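A small sketch of the cumulative-percentage step described above, with invented term counts (the real study combined 532 terms into 321 representative terms):

```python
# Rank representative-term counts and find how many terms account for 90%
# of all occurrences. Term names and counts are made up.
from collections import Counter

term_counts = Counter({"p value": 250, "mean": 210, "confidence interval": 160,
                       "t test": 90, "ANOVA": 70, "kappa": 20, "GEE": 5})
total = sum(term_counts.values())
running, common = 0, []
for term, n in term_counts.most_common():
    running += n
    common.append(term)
    if running / total >= 0.90:
        break
print(f"{len(common)} terms cover 90% of occurrences: {common}")
```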
Simulation of Wind Profile Perturbations for Launch Vehicle Design
NASA Technical Reports Server (NTRS)
Adelfang, S. I.
2004-01-01
Ideally, a statistically representative sample of measured high-resolution wind profiles with wavelengths as small as tens of meters is required in design studies to establish aerodynamic load indicator dispersions and vehicle control system capability. At most potential launch sites, high-resolution wind profiles may not exist. Representative samples of Rawinsonde wind profiles to altitudes of 30 km are more likely to be available from the extensive network of measurement sites established for routine sampling in support of weather observing and forecasting activity. Such a sample, large enough to be statistically representative of relatively large wavelength perturbations, would be inadequate for launch vehicle design assessments because the Rawinsonde system accurately measures wind perturbations with wavelengths no smaller than 2000 m (1000 m altitude increment). The Kennedy Space Center (KSC) Jimsphere wind profiles (150/month and seasonal 2- and 3.5-hr pairs) are the only adequate samples of high-resolution profiles (approx. 150 to 300 m effective resolution, but over-sampled at 25 m intervals) that have been used extensively for launch vehicle design assessments. Therefore, a simulation process has been developed for enhancement of measured low-resolution Rawinsonde profiles that would be applicable in preliminary launch vehicle design studies at launch sites other than KSC.
Using Candy Samples to Learn about Sampling Techniques and Statistical Data Evaluation
ERIC Educational Resources Information Center
Canaes, Larissa S.; Brancalion, Marcel L.; Rossi, Adriana V.; Rath, Susanne
2008-01-01
A classroom exercise for undergraduate and beginning graduate students that takes about one class period is proposed and discussed. It is an easy, interesting exercise that demonstrates important aspects of sampling techniques (sample amount, particle size, and the representativeness of the sample in relation to the bulk material). The exercise…
Standard deviation and standard error of the mean.
Lee, Dong Kyu; In, Junyong; Lee, Sangseok
2015-06-01
In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinction between the SD and SEM in the medical literature. Because the processes of calculating the SD and SEM involve different statistical inferences, each has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents the sample data. The meaning of SEM, however, includes statistical inference based on the sampling distribution: SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of the appropriate use of each. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results.
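A minimal sketch of the distinction, using simulated measurements (the normal distribution parameters are invented): SD describes the spread of the data and stabilizes as n grows, while SEM = SD / sqrt(n) describes the uncertainty of the sample mean and shrinks.

```python
# SD vs SEM on simulated samples of increasing size.
import numpy as np

rng = np.random.default_rng(2)
for n in (10, 100, 1000):
    x = rng.normal(loc=120, scale=15, size=n)   # e.g. simulated blood pressure
    sd = x.std(ddof=1)
    sem = sd / np.sqrt(n)
    print(f"n={n:5d}  SD={sd:5.2f}  SEM={sem:5.2f}")
```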
Currens, J.C.
1999-01-01
Analytical data for nitrate and triazines from 566 samples collected over a 3-year period at Pleasant Grove Spring, Logan County, KY, were statistically analyzed to determine the minimum data set needed to calculate meaningful yearly averages for a conduit-flow karst spring. Results indicate that a biweekly sampling schedule augmented with bihourly samples from high-flow events will provide meaningful suspended-constituent and dissolved-constituent statistics. Unless collected over an extensive period of time, daily samples may not be representative and may also be autocorrelated. All high-flow events resulting in a significant deflection of a constituent from base-line concentrations should be sampled. Either the geometric mean or the flow-weighted average of the suspended constituents should be used. If automatic samplers are used, then they may be programmed to collect storm samples as frequently as every few minutes to provide details on the arrival time of constituents of interest. However, only samples collected bihourly should be used to calculate averages. By adopting a biweekly sampling schedule augmented with high-flow samples, the need to continuously monitor discharge, or to search for and analyze existing data to develop a statistically valid monitoring plan, is lessened.
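A minimal sketch of the two averages recommended above, using hypothetical biweekly concentrations and paired discharges (values and units are invented):

```python
# Flow-weighted average and geometric mean of constituent concentrations.
import numpy as np

conc = np.array([2.1, 2.4, 3.8, 2.2, 6.5, 2.0])   # mg/L, high-flow events included
flow = np.array([40., 45., 160., 42., 300., 38.])  # L/s at time of sampling

flow_weighted = (conc * flow).sum() / flow.sum()
geometric_mean = np.exp(np.log(conc).mean())
print(f"flow-weighted average: {flow_weighted:.2f} mg/L")
print(f"geometric mean:        {geometric_mean:.2f} mg/L")
```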
[Respondent-Driven Sampling: a new sampling method to study visible and hidden populations].
Mantecón, Alejandro; Juan, Montse; Calafat, Amador; Becoña, Elisardo; Román, Encarna
2008-01-01
The paper introduces a variant of chain-referral sampling: respondent-driven sampling (RDS). This sampling method shows that methods based on network analysis can be combined with the statistical validity of standard probability sampling methods. In this sense, RDS appears to be a mathematical improvement of snowball sampling oriented to the study of hidden populations. However, we try to prove its validity with populations that are not within a sampling frame but can nonetheless be contacted without difficulty. The basics of RDS are explained through our research on young people (aged 14 to 25) who go clubbing, consume alcohol and other drugs, and have sex. Fieldwork was carried out between May and July 2007 in three Spanish regions: Baleares, Galicia, and Comunidad Valenciana. The presentation of the study shows the utility of this type of sampling when the population is accessible but lacks a sampling frame. However, the sample obtained is not, in statistical terms, a randomly drawn representative sample of the target population. It must be acknowledged that the final sample is representative of a 'pseudo-population' that approximates the target population but is not identical to it.
VizieR Online Data Catalog: LVL global optical photometry (Cook+, 2014)
NASA Astrophysics Data System (ADS)
Cook, D. O.; Dale, D. A.; Johnson, B. D.; van Zee, L.; Lee, J. C.; Kennicutt, R. C.; Calzetti, D.; Staudaher, S. M.; Engelbracht, C. W.
2015-05-01
The LVL sample consists of 258 of our nearest galaxy neighbours reflecting a statistically complete, representative sample of the local Universe. The sample selection and description are detailed in Dale et al. (2009ApJ...703..517D, Cat. J/ApJ/703/517). (4 data files).
VizieR Online Data Catalog: LVL SEDs and physical properties (Cook+, 2014)
NASA Astrophysics Data System (ADS)
Cook, D. O.; Dale, D. A.; Johnson, B. D.; van Zee, L.; Lee, J. C.; Kennicutt, R. C.; Calzetti, D.; Staudaher, S. M.; Engelbracht, C. W.
2015-05-01
The LVL sample consists of 258 of our nearest galaxy neighbours reflecting a statistically complete, representative sample of the local Universe. The sample selection and description are detailed in Dale et al. (2009ApJ...703..517D, Cat. J/ApJ/703/517). (1 data file).
Sampling for mercury at subnanogram per litre concentrations for load estimation in rivers
Colman, J.A.; Breault, R.F.
2000-01-01
Estimation of constituent loads in streams requires collection of stream samples that are representative of constituent concentrations, that is, composites of isokinetic multiple verticals collected along a stream transect. An all-Teflon isokinetic sampler (DH-81) cleaned in 75°C, 4 N HCl was tested using blank, split, and replicate samples to assess systematic and random sample contamination by mercury species. Mean mercury concentrations in field-equipment blanks were low: 0.135 ng·L-1 for total mercury (ΣHg) and 0.0086 ng·L-1 for monomethyl mercury (MeHg). Mean square errors (MSE) for ΣHg and MeHg duplicate samples collected at eight sampling stations were not statistically different from the MSE of samples split in the laboratory, which represents the analytical and splitting error. Low field-blank concentrations and statistically equal duplicate- and split-sample MSE values indicate that no measurable contamination was occurring during sampling. Standard deviations associated with example mercury load estimations were four to five times larger, on a relative basis, than standard deviations calculated from duplicate samples, indicating that error of the load determination was primarily a function of the loading model used, not of the sampling or analytical methods.
75 FR 1415 - Submission for OMB Review: Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-11
... Department of Labor--Bureau of Labor Statistics (BLS), Office of Management and Budget, Room 10235... Statistics. Type of Review: Revision of a currently approved collection. Title of Collection: The Consumer... sector. The data are collected from a national probability sample of households designed to represent the...
2014-12-01
...responded to a variety of community policing and homeland security questions in both the 2000 and 2007 Bureau of Justice Statistics Law Enforcement Management and Administrative Statistics surveys. These agencies incorporate most major U.S. police departments as well as a representative sample of smaller...
Public and patient involvement in quantitative health research: A statistical perspective.
Hannigan, Ailish
2018-06-19
The majority of studies included in recent reviews of impact for public and patient involvement (PPI) in health research had a qualitative design. PPI in solely quantitative designs is underexplored, particularly its impact on statistical analysis. Statisticians in practice have a long history of working in both consultative (indirect) and collaborative (direct) roles in health research, yet their perspective on PPI in quantitative health research has never been explicitly examined. This article explores the potential and challenges of PPI from a statistical perspective at distinct stages of quantitative research (sampling, measurement, and statistical analysis), distinguishing between indirect and direct PPI. Statistical analysis is underpinned by having a representative sample, and a collaborative or direct approach to PPI may help achieve that by supporting access to and increasing participation of under-represented groups in the population. Acknowledging and valuing the role of lay knowledge of the context in statistical analysis and in deciding what variables to measure may support collective learning and advance scientific understanding, as evidenced by the use of participatory modelling in other disciplines. A recurring issue for quantitative researchers, which reflects quantitative sampling methods, is the selection and required number of PPI contributors, and this requires further methodological development. Direct approaches to PPI in quantitative health research may potentially increase its impact, but the facilitation and partnership skills required may require further training for all stakeholders, including statisticians. © 2018 The Authors Health Expectations published by John Wiley & Sons Ltd.
Gordon, J.D.; Schroder, L.J.; Morden-Moore, A. L.; Bowersox, V.C.
1995-01-01
Separate experiments by the U.S. Geological Survey (USGS) and the Illinois State Water Survey Central Analytical Laboratory (CAL) independently assessed the stability of hydrogen ion and specific conductance in filtered wet-deposition samples stored at ambient temperatures. The USGS experiment represented a test of sample stability under a diverse range of conditions, whereas the CAL experiment was a controlled test of sample stability. In the experiment by the USGS, a statistically significant (α = 0.05) relation between [H+] and time was found for the composited filtered, natural, wet-deposition solution when all reported values are included in the analysis. However, if two outlying pH values most likely representing measurement error are excluded from the analysis, the change in [H+] over time was not statistically significant. In the experiment by the CAL, randomly selected samples were reanalyzed between July 1984 and February 1991. The original analysis and reanalysis pairs revealed that [H+] differences, although very small, were statistically different from zero, whereas specific-conductance differences were not. Nevertheless, the results of the CAL reanalysis project indicate there appears to be no consistent, chemically significant degradation in sample integrity with regard to [H+] and specific conductance while samples are stored at room temperature at the CAL. Based on the results of the CAL and USGS studies, short-term (45-60 day) stability of [H+] and specific conductance in natural filtered wet-deposition samples that are shipped and stored unchilled at ambient temperatures was satisfactory.
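A minimal sketch of the kind of trend test described above (simulated, stable [H+] values over storage time; the concentrations and noise level are invented): regress concentration on time and ask whether the slope differs significantly from zero.

```python
# Test whether [H+] drifts over storage time (alpha = 0.05).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
days = np.arange(0, 60, 5, dtype=float)
h_conc = 1e-5 + rng.normal(0, 2e-7, days.size)   # stable [H+] plus noise

res = stats.linregress(days, h_conc)
print(f"slope = {res.slope:.3e} per day, p = {res.pvalue:.3f}")
print("significant trend" if res.pvalue < 0.05 else "no significant change")
```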
Drug Use and Crime. Bureau of Justice Statistics Special Report.
ERIC Educational Resources Information Center
Innes, Christopher A.
In 1974, 1979, and 1986, the Bureau of Justice Statistics sponsored surveys of nationally representative samples of inmates of state correctional facilities. Results from the 1986 Survey of Inmates of State Correctional Facilities, which included 13,711 inmates, indicated that inmates reported high levels of drug use prior to the commission of the…
U.S. EPA conducted a national statistical survey of fish tissue contamination at 540 river sites (representing 82 954 river km) in 2008–2009, and analyzed samples for 50 persistent organic pollutants (POPs), including 21 PCB congeners, 8 PBDE congeners, and 21 organoc...
Sampling western spruce budworm by counting larvae on lower crown branches.
R.R. Mason; B.E. Wickman; H.G. Paul
1989-01-01
A technique is described for sampling spruce budworm larvae after bud flush by nondestructively beating branches in the lower crown. Sample data were collected from 32 plots representing a wide range of budworm densities. Statistical analyses indicated that larvae were less aggregated in the lower crown than at the same density in the middle crown. In an independent...
Evaluation of Respondent-Driven Sampling
McCreesh, Nicky; Frost, Simon; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda Ndagire; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G
2012-01-01
Background Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex-workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total-population data. Methods Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, employing current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). Results We recruited 927 household-heads. Full and small RDS samples were largely representative of the total population, but both samples under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven-sampling statistical-inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven-sampling bootstrap 95% confidence intervals included the population proportion. Conclusions Respondent-driven sampling produced a generally representative sample of this well-connected non-hidden population. However, current respondent-driven-sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience-sampling method, and caution is required when interpreting findings based on the sampling method. PMID:22157309
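One widely used family of RDS inference methods weights each recruit by the inverse of their reported network size (the RDS-II or Volz-Heckathorn estimator); a toy sketch with simulated degrees and a simulated trait, not the study's data or necessarily the exact estimators evaluated above:

```python
# RDS-II style estimate: inverse-degree weighting of a binary trait.
import numpy as np

rng = np.random.default_rng(4)
degree = rng.integers(1, 30, size=500)            # reported network sizes
trait = rng.binomial(1, 0.3, size=500)            # e.g. HIV-positive or not

w = 1.0 / degree
rds_ii = (w * trait).sum() / w.sum()
print(f"naive sample proportion: {trait.mean():.3f}")
print(f"RDS-II estimate:         {rds_ii:.3f}")
```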
Scheid, Anika; Nebel, Markus E
2012-07-09
Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
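A toy numeric illustration of the perturbation setup (an arbitrary five-outcome distribution and a 5% error magnitude, not the grammar-derived sampling probabilities themselves): absolute errors swamp small probabilities, while relative errors preserve their proportions.

```python
# Add absolute vs relative errors to a sampling distribution, renormalize,
# and compare the distortion via total variation distance.
import numpy as np

rng = np.random.default_rng(5)
p = np.array([0.5, 0.25, 0.15, 0.07, 0.03])       # "exact" sampling probs
eps = 0.05

absolute = np.clip(p + rng.uniform(-eps, eps, p.size), 1e-12, None)
relative = p * (1 + rng.uniform(-eps, eps, p.size))
absolute /= absolute.sum()
relative /= relative.sum()

tv = lambda q: 0.5 * np.abs(p - q).sum()
print(f"TV distance, absolute errors: {tv(absolute):.4f}")
print(f"TV distance, relative errors: {tv(relative):.4f}")
```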
NASA Technical Reports Server (NTRS)
Tolson, R. H.
1981-01-01
A technique is described for evaluating the influence of spatial sampling on the determination of global mean total columnar ozone. A finite number of coefficients in the expansion are determined, and the truncated part of the expansion is shown to contribute an error to the estimate that depends strongly on the spatial sampling and is relatively insensitive to data noise. First- and second-order statistics are derived for each term in a spherical harmonic expansion that represents the ozone field, and the statistics are used to estimate systematic and random errors in the estimates of total ozone.
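A toy version of the underlying sampling-error question (a synthetic ozone-like field with an invented zonal P2 structure, not the paper's expansion): a field whose true area-weighted global mean is known is misestimated when sampling misses high latitudes.

```python
# Global mean of a zonally structured field, with and without polar coverage.
import numpy as np

lat = np.radians(np.linspace(-89.5, 89.5, 180))
field = 300 + 40 * (3 * np.sin(lat) ** 2 - 1) / 2        # mean + P2(sin lat)
w = np.cos(lat)                                          # area weights

true_mean = (field * w).sum() / w.sum()
mask = np.abs(np.degrees(lat)) <= 60                     # no polar sampling
sampled_mean = (field[mask] * w[mask]).sum() / w[mask].sum()
print(f"true global mean:    {true_mean:.2f} DU")
print(f"sampled (|lat|<=60): {sampled_mean:.2f} DU")
```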
Mueller, Amy V; Hemond, Harold F
2016-05-18
Knowledge of ionic concentrations in natural waters is essential to understand watershed processes. Inorganic nitrogen, in the form of nitrate and ammonium ions, is a key nutrient as well as a participant in redox, acid-base, and photochemical processes of natural waters, leading to spatiotemporal patterns of ion concentrations at scales as small as meters or hours. Current options for measurement in situ are costly, relying primarily on instruments adapted from laboratory methods (e.g., colorimetric, UV absorption); free-standing and inexpensive ISE sensors for NO3(-) and NH4(+) could be attractive alternatives if interferences from other constituents were overcome. Multi-sensor arrays, coupled with appropriate non-linear signal processing, offer promise in this capacity but have not yet successfully achieved signal separation for NO3(-) and NH4(+) in situ at naturally occurring levels in unprocessed water samples. A novel signal processor, underpinned by an appropriate sensor array, is proposed that overcomes previous limitations by explicitly integrating basic chemical constraints (e.g., charge balance). This work further presents a rationalized process for the development of such in situ instrumentation for NO3(-) and NH4(+), including a statistical-modeling strategy for instrument design, training/calibration, and validation. Statistical analysis reveals that historical concentrations of major ionic constituents in natural waters across New England strongly covary and are multi-modal. This informs the design of a statistically appropriate training set, suggesting that the strong covariance of constituents across environmental samples can be exploited through appropriate signal processing mechanisms to further improve estimates of minor constituents. Two artificial neural network architectures, one expanded to incorporate knowledge of basic chemical constraints, were tested to process outputs of a multi-sensor array, trained using datasets of varying degrees of statistical representativeness to natural water samples. The accuracy of ANN results improves monotonically with the statistical representativeness of the training set (error decreases by ∼5×), while the expanded neural network architecture contributes a further factor of 2-3.5 decrease in error when trained with the most representative sample set. Results using the most statistically accurate set of training samples (which retain environmentally relevant ion concentrations but avoid the potential interference of humic acids) demonstrated accurate, unbiased quantification of nitrate and ammonium at natural environmental levels (±20% down to <10 μM), as well as the major ions Na(+), K(+), Ca(2+), Mg(2+), Cl(-), and SO4(2-), in unprocessed samples. These results show promise for the development of new in situ instrumentation for the support of scientific field work.
Lindsey, Bruce D.; Rupert, Michael G.
2012-01-01
Decadal-scale changes in groundwater quality were evaluated by the U.S. Geological Survey National Water-Quality Assessment (NAWQA) Program. Samples of groundwater collected from wells during 1988-2000 - a first sampling event representing the decade ending the 20th century - were compared on a pair-wise basis to samples from the same wells collected during 2001-2010 - a second sampling event representing the decade beginning the 21st century. The data set consists of samples from 1,236 wells in 56 well networks, representing major aquifers and urban and agricultural land-use areas, with analytical results for chloride, dissolved solids, and nitrate. Statistical analysis was done on a network basis rather than by individual wells. Although spanning slightly more or less than a 10-year period, the two-sample comparison between the first and second sampling events is referred to as an analysis of decadal-scale change based on a step-trend analysis. The 22 principal aquifers represented by these 56 networks account for nearly 80 percent of the estimated withdrawals of groundwater used for drinking-water supply in the Nation. Well networks where decadal-scale changes in concentrations were statistically significant were identified using the Wilcoxon-Pratt signed-rank test. For the statistical analysis of chloride, dissolved solids, and nitrate concentrations at the network level, more than half revealed no statistically significant change over the decadal period. However, for networks that had statistically significant changes, increased concentrations outnumbered decreased concentrations by a large margin. Statistically significant increases of chloride concentrations were identified for 43 percent of 56 networks. Dissolved solids concentrations increased significantly in 41 percent of the 54 networks with dissolved solids data, and nitrate concentrations increased significantly in 23 percent of 56 networks. At least one of the three - chloride, dissolved solids, or nitrate - had a statistically significant increase in concentration in 66 percent of the networks. Statistically significant decreases in concentrations were identified in 4 percent of the networks for chloride, 2 percent of the networks for dissolved solids, and 9 percent of the networks for nitrate. A larger percentage of urban land-use networks had statistically significant increases in chloride, dissolved solids, and nitrate concentrations than agricultural land-use networks. In order to assess the magnitude of statistically significant changes, the median of the differences between constituent concentrations from the first full-network sampling event and those from the second full-network sampling event was calculated using the Turnbull method. The largest median decadal increases in chloride concentrations were in networks in the Upper Illinois River Basin (67 mg/L) and in the New England Coastal Basins (34 mg/L), whereas the largest median decadal decrease in chloride concentrations was in the Upper Snake River Basin (1 mg/L). The largest median decadal increases in dissolved solids concentrations were in networks in the Rio Grande Valley (260 mg/L) and the Upper Illinois River Basin (160 mg/L). The largest median decadal decrease in dissolved solids concentrations was in the Apalachicola-Chattahoochee-Flint River Basin (6.0 mg/L). The largest median decadal increases in nitrate as nitrogen (N) concentrations were in networks in the South Platte River Basin (2.0 mg/L as N) and the San Joaquin-Tulare Basins (1.0 mg/L as N). 
The largest median decadal decrease in nitrate concentrations was in the Santee River Basin and Coastal Drainages (0.63 mg/L). The magnitude of change in networks with statistically significant increases typically was much larger than the magnitude of change in networks with statistically significant decreases. The magnitude of change was greatest for chloride in the urban land-use networks and greatest for dissolved solids and nitrate in the agricultural land-use networks. Analysis of data from all networks combined indicated statistically significant increases for chloride, dissolved solids, and nitrate. Although chloride, dissolved solids, and nitrate concentrations were typically less than the drinking-water standards and guidelines, a statistical test was used to determine whether or not the proportion of samples exceeding the drinking-water standard or guideline changed significantly between the first and second full-network sampling events. The proportion of samples exceeding the U.S. Environmental Protection Agency (USEPA) Secondary Maximum Contaminant Level for dissolved solids (500 milligrams per liter) increased significantly between the first and second full-network sampling events when evaluating all networks combined at the national level. Also, for all networks combined, the proportion of samples exceeding the USEPA Maximum Contaminant Level (MCL) of 10 mg/L as N for nitrate increased significantly. One network in the Delmarva Peninsula had a significant increase in the proportion of samples exceeding the MCL for nitrate. A subset of 261 wells was sampled every other year (biennially) to evaluate decadal-scale changes using a time-series analysis. The analysis of the biennial data set showed that changes were generally similar to the findings from the analysis of decadal-scale change that was based on a step-trend analysis. Because of the small number of wells in a network with biennial data (typically 4-5 wells), the time-series analysis is more useful for understanding water-quality responses to changes in site-specific conditions rather than as an indicator of the change for the entire network.
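A minimal sketch of the network-level step-trend test named above, using hypothetical paired chloride concentrations from ten wells; scipy's zero_method="pratt" matches the Wilcoxon-Pratt variant, and the median of the paired differences summarizes the magnitude of change.

```python
# Paired decadal samples from the same wells, tested for a step trend.
import numpy as np
from scipy import stats

first = np.array([12., 18., 25., 9., 30., 22., 15., 27., 11., 20.])   # mg/L
second = np.array([15., 24., 31., 10., 42., 25., 18., 35., 12., 26.])  # mg/L

stat, p = stats.wilcoxon(first, second, zero_method="pratt")
median_change = np.median(second - first)
print(f"p = {p:.4f}, median decadal change = {median_change:.1f} mg/L")
```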
Report: Science to Support Rulemaking
Report #2003-P-00003, November 15, 2002. The rules included in the pilot study were not a representative statistical sample of EPA rules, and we did not identify all of the critical science inputs for every rule.
Shoulder strength value differences between genders and age groups.
Balcells-Diaz, Eudald; Daunis-I-Estadella, Pepus
2018-03-01
The strength of a normal shoulder differs according to gender and decreases with age. Therefore, the Constant score, which is a shoulder function measurement tool that allocates 25% of the final score to strength, differs from the absolute values but likely reflects a normal shoulder. To compare group results, a normalized Constant score is needed, and the first step to achieving normalization involves statistically establishing the gender differences and age-related decline. In this investigation, we sought to verify the gender difference and age-related decline in strength. We obtained a randomized representative sample of the general population in a small to medium-sized Spanish city. We then invited this population to participate in our study, and we measured their shoulder strength. We performed a statistical analysis with a power of 80% and a P value < .05. We observed a statistically significant difference between the genders and a statistically significant decline with age. To the best of our knowledge, this is the first investigation to study a representative sample of the general population from which conclusions can be drawn regarding Constant score normalization. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Allton, J. H.; Bevill, T. J.
2003-01-01
The strategy of raking rock fragments from the lunar regolith as a means of acquiring representative samples has wide support due to science return, spacecraft simplicity (reliability), and economy [3, 4, 5]. While there exists widespread agreement that raking or sieving the bulk regolith is a good strategy, there is lively discussion about the minimum sample size. Advocates of consortium studies desire fragments large enough to support petrologic and isotopic studies. Fragments from 5 to 10 mm are thought adequate [4, 5]. Yet, Jolliff et al. [6] demonstrated use of 2-4 mm fragments as representative of larger rocks. Here we make use of curatorial records and sample catalogs to give a different perspective on minimum sample size for a robotic sample collector.
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
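The coverage quantity estimated above can be sketched with the classical frequentist baseline, the Good-Turing estimate (not the authors' Bayesian nonparametric estimator); the read-to-gene assignments below are simulated, not from a real EST library.

```python
# Good-Turing coverage: 1 - (number of genes seen exactly once) / (number of reads).
from collections import Counter
import numpy as np

rng = np.random.default_rng(10)
reads = rng.zipf(1.7, size=2000) % 500          # skewed gene frequencies
counts = Counter(reads.tolist())

n_reads = len(reads)
singletons = sum(1 for c in counts.values() if c == 1)
coverage = 1 - singletons / n_reads
print(f"unique genes observed: {len(counts)}")
print(f"Good-Turing coverage estimate: {coverage:.3f}")
```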
Statistical evaluation of vibration analysis techniques
NASA Technical Reports Server (NTRS)
Milner, G. Martin; Miller, Patrice S.
1987-01-01
An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
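A minimal sketch of that detection-performance quantification, using simulated detector scores under assumed Gaussian no-fault and fault conditions (not the study's vibration data): sweep a threshold and read off probability of detection versus probability of false alarm.

```python
# POD vs PFA at several detection thresholds.
import numpy as np

rng = np.random.default_rng(6)
healthy = rng.normal(0.0, 1.0, 1000)     # detector scores, no fault present
faulty = rng.normal(1.5, 1.0, 1000)      # detector scores, fault seeded

for t in (0.5, 1.0, 1.5, 2.0):           # sweep the detection threshold
    pfa = (healthy > t).mean()
    pod = (faulty > t).mean()
    print(f"threshold {t:.1f}: PFA = {pfa:.3f}, POD = {pod:.3f}")
```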
One goal of regional-scale sample surveys is to estimate the status of a resource of interest from a statistically drawn representative sample of that resource. An expression of status is a frequency distribution of indicator scores capturing the variability of the attributes of...
Labrique, Alain; Blynn, Emily; Ahmed, Saifuddin; Gibson, Dustin; Pariyo, George; Hyder, Adnan A
2017-05-05
In low- and middle-income countries (LMICs), historically, household surveys have been carried out by face-to-face interviews to collect survey data related to risk factors for noncommunicable diseases. The proliferation of mobile phone ownership and the access it provides in these countries offers a new opportunity to remotely conduct surveys with increased efficiency and reduced cost. However, the near-ubiquitous ownership of phones, high population mobility, and low cost require a re-examination of statistical recommendations for mobile phone surveys (MPS), especially when surveys are automated. As with landline surveys, random digit dialing remains the most appropriate approach to develop an ideal survey-sampling frame. Once the survey is complete, poststratification weights are generally applied to reduce estimate bias and to adjust for selectivity due to mobile ownership. Since weights increase design effects and reduce sampling efficiency, we introduce the concept of automated active strata monitoring to improve representativeness of the sample distribution to that of the source population. Although some statistical challenges remain, MPS represent a promising emerging means for population-level data collection in LMICs. ©Alain Labrique, Emily Blynn, Saifuddin Ahmed, Dustin Gibson, George Pariyo, Adnan A Hyder. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 05.05.2017.
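A small sketch of the poststratification step described above, with invented strata shares and outcomes: each respondent is weighted by population share / sample share of their stratum, then a weighted estimate is formed.

```python
# Poststratification weights correcting a phone-ownership skew.
import numpy as np

pop_share = {"urban": 0.40, "rural": 0.60}       # census benchmark
sample_share = {"urban": 0.70, "rural": 0.30}    # skewed achieved sample

strata = np.array(["urban"] * 70 + ["rural"] * 30)
smokes = np.concatenate([np.repeat([1, 0], [21, 49]),    # 30% urban smokers
                         np.repeat([1, 0], [12, 18])])   # 40% rural smokers

w = np.array([pop_share[s] / sample_share[s] for s in strata])
print(f"unweighted prevalence: {smokes.mean():.3f}")
print(f"weighted prevalence:   {(w * smokes).sum() / w.sum():.3f}")
```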
Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards.
Bornstein, Marc H; Jager, Justin; Putnick, Diane L
2013-12-01
Sampling is a key feature of every study in developmental science. Although sampling has far-reaching implications, too little attention is paid to sampling. Here, we describe, discuss, and evaluate four prominent sampling strategies in developmental science: population-based probability sampling, convenience sampling, quota sampling, and homogeneous sampling. We then judge these sampling strategies by five criteria: whether they yield representative and generalizable estimates of a study's target population, whether they yield representative and generalizable estimates of subsamples within a study's target population, the recruitment efforts and costs they entail, whether they yield sufficient power to detect subsample differences, and whether they introduce "noise" related to variation in subsamples and whether that "noise" can be accounted for statistically. We use sample composition of gender, ethnicity, and socioeconomic status to illustrate and assess the four sampling strategies. Finally, we tally the use of the four sampling strategies in five prominent developmental science journals and make recommendations about best practices for sample selection and reporting.
Chao, Li-Wei; Szrek, Helena; Peltzer, Karl; Ramlagan, Shandir; Fleming, Peter; Leite, Rui; Magerman, Jesswill; Ngwenya, Godfrey B.; Pereira, Nuno Sousa; Behrman, Jere
2011-01-01
Finding an efficient method for sampling micro- and small-enterprises (MSEs) for research and statistical reporting purposes is a challenge in developing countries, where registries of MSEs are often nonexistent or outdated. This lack of a sampling frame creates an obstacle in finding a representative sample of MSEs. This study uses computer simulations to draw samples from a census of businesses and non-businesses in the Tshwane Municipality of South Africa, using three different sampling methods: the traditional probability sampling method, the compact segment sampling method, and the World Health Organization’s Expanded Programme on Immunization (EPI) sampling method. Three mechanisms by which the methods could differ are tested, the proximity selection of respondents, the at-home selection of respondents, and the use of inaccurate probability weights. The results highlight the importance of revisits and accurate probability weights, but the lesser effect of proximity selection on the samples’ statistical properties. PMID:22582004
Yin, Ge; Danielsson, Sara; Dahlberg, Anna-Karin; Zhou, Yihui; Qiu, Yanling; Nyberg, Elisabeth; Bignert, Anders
2017-10-01
Environmental monitoring typically assumes samples and sampling activities to be representative of the population being studied. Given a limited budget, an appropriate sampling strategy is essential to support detecting temporal trends of contaminants. In the present study, based on real chemical analysis data on polybrominated diphenyl ethers in snails collected from five subsites in Tianmu Lake, computer simulation is performed to evaluate three sampling strategies by estimating the sample size required to detect an annual change of 5% with a statistical power of 80% and 90% at a significance level of 5%. The results showed that sampling from an arbitrarily selected spot is the worst strategy, requiring many more individual analyses to achieve the above criteria than the other two approaches. A fixed sampling site requires the lowest sample size but may not be representative of the intended study object, e.g. a lake, and is also sensitive to changes at that particular site. In contrast, sampling at multiple sites along the shore each year, and using pooled samples when the cost to collect and prepare individual specimens is much lower than the cost of chemical analysis, would be the most robust and cost-efficient strategy in the long run. Using statistical power as the criterion, the results demonstrated quantitatively the consequences of various sampling strategies, and could guide users with respect to the sample sizes required under different sampling designs for long-term monitoring programs. Copyright © 2017 Elsevier Ltd. All rights reserved.
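A minimal sketch of this kind of power-driven sample size search, assuming (since the abstract does not specify the model) a log-linear trend with lognormal measurement noise and a regression test on the slope; the 5% annual change, 80% power, and 5% significance level come from the abstract, everything else is illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def trend_power(n_per_year, years=10, annual_change=0.05, cv=0.5,
                reps=2000, alpha=0.05):
    """Power of a log-linear regression to detect an annual decline."""
    sigma = np.sqrt(np.log(1.0 + cv**2))      # lognormal sigma for a given CV
    t = np.repeat(np.arange(years), n_per_year)
    slope_true = np.log(1.0 - annual_change)  # 5%/yr decline on the log scale
    hits = 0
    for _ in range(reps):
        y = np.log(50.0) + slope_true * t + rng.normal(0.0, sigma, t.size)
        hits += stats.linregress(t, y).pvalue < alpha
    return hits / reps

# Smallest annual sample size reaching 80% power under these assumptions.
for n in range(1, 30):
    if trend_power(n) >= 0.80:
        print("required n per year ~", n)
        break
```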
NASA Astrophysics Data System (ADS)
Nickles, C.; Zhao, Y.; Beighley, E.; Durand, M. T.; David, C. H.; Lee, H.
2017-12-01
The Surface Water and Ocean Topography (SWOT) satellite mission is being developed jointly by NASA and the French space agency (CNES), with participation from the Canadian and UK space agencies, to serve both the hydrology and oceanography communities. The SWOT mission will sample global surface water extents and elevations (lakes/reservoirs, rivers, estuaries, oceans, sea and land ice) at a finer spatial resolution than is currently possible, enabling hydrologic discovery, model advancements, and new applications that are not currently feasible or perhaps even conceivable. Although the mission will provide global coverage, analysis and interpolation of the data generated from the irregular space/time sampling represent a significant challenge. In this study, we explore the applicability of the unique space/time sampling for understanding river discharge dynamics throughout the Ohio River Basin. River network topology, SWOT sampling (i.e., orbit and identified SWOT river reaches) and spatial interpolation concepts are used to quantify the fraction of river reaches effectively sampled on each day of the three-year mission. Streamflow statistics for SWOT-generated river discharge time series are compared to continuous daily river discharge series. Relationships are presented to transform SWOT-generated streamflow statistics to equivalent continuous daily discharge time series statistics, intended to support hydrologic applications using low-flow and annual flow duration statistics.
Thomas, Elaine
2005-01-01
This article is the second in a series of three that will give health care professionals (HCPs) a sound introduction to medical statistics (Thomas, 2004). The objective of research is to find out about the population at large. However, it is generally not possible to study the whole of the population and research questions are addressed in an appropriate study sample. The next crucial step is then to use the information from the sample of individuals to make statements about the wider population of like individuals. This procedure of drawing conclusions about the population, based on study data, is known as inferential statistics. The findings from the study give us the best estimate of what is true for the relevant population, given the sample is representative of the population. It is important to consider how accurate this best estimate is, based on a single sample, when compared to the unknown population figure. Any difference between the observed sample result and the population characteristic is termed the sampling error. This article will cover the two main forms of statistical inference (hypothesis tests and estimation) along with issues that need to be addressed when considering the implications of the study results. Copyright (c) 2005 Whurr Publishers Ltd.
Evaluation of respondent-driven sampling.
McCreesh, Nicky; Frost, Simon D W; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G
2012-01-01
Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required when interpreting findings based on the sampling method.
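The RDS inference step evaluated above is not spelled out in the abstract; the following sketch assumes the common RDS-II (Volz-Heckathorn) estimator, which weights each recruit by the inverse of their reported network size. All numbers are invented:

```python
import numpy as np

# Hypothetical recruits: y = trait indicator (e.g., HIV+), degree = network size.
y = np.array([1, 0, 0, 1, 0, 1, 0, 0])
degree = np.array([10, 4, 25, 8, 3, 12, 6, 9])

w = 1.0 / degree                       # inclusion probability ~ degree, so weight ~ 1/degree
rds2 = np.sum(w * y) / np.sum(w)       # RDS-II (Volz-Heckathorn) estimate
naive = y.mean()                       # unweighted sample proportion
print(f"RDS-II {rds2:.3f} vs naive {naive:.3f}")
```

The paper's finding, that such reweighted estimates were often no closer to the truth than the raw sample proportions, is a comparison between exactly these two quantities.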
Optical Clock Distribution to VLSI Chips
1989-07-01
depletion layer edge represents a lower concentration of minority carriers, minority carriers generated outside the depletion layer statistically tend to...completed and the output voltage levels for the next state to be transferred to the intermediate buffer by means of the second pass transistor. Since 41...a sample size too small to have statistical significance, the results give an indication of the typical operating frequency that is possible with a
ERIC Educational Resources Information Center
LINDER, FORREST E.; AND OTHERS
Collected by interviewers from a representative sample of 42,000 households containing 134,000 persons, this data pertains to the hearing-impaired population in 1962-1963. The report presents the social, economic, and demographic characteristics of the population with impaired hearing, and also gives data on the utilization of and satisfaction…
A Hands-On Exercise Improves Understanding of the Standard Error of the Mean
ERIC Educational Resources Information Center
Ryan, Robert S.
2006-01-01
One of the most difficult concepts for statistics students is the standard error of the mean. To improve understanding of this concept, 1 group of students used a hands-on procedure to sample from small populations representing either a true or false null hypothesis. The distribution of 120 sample means (n = 3) from each population had standard…
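A quick simulation version of the hands-on exercise (the 120 sample means of n = 3 come from the abstract; the population parameters are invented) might look like:

```python
import numpy as np

rng = np.random.default_rng(42)
population = rng.normal(50.0, 10.0, size=10_000)   # stand-in "population"

n, draws = 3, 120                                  # mirrors the exercise
means = np.array([rng.choice(population, n, replace=False).mean()
                  for _ in range(draws)])

# The SD of the sampling distribution approximates sigma / sqrt(n).
print(means.std(ddof=1), population.std(ddof=0) / np.sqrt(n))
```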
STATISTICAL ANALYSIS OF TANK 18F FLOOR SAMPLE RESULTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 18F as per the statistical sampling plan developed by Shine [1]. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL [2]. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results [3] to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 18F. The uncertainty is quantified in this report by an upper 95% confidence limit (UCL95%) on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
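The report's exact UCL formula is not reproduced in the abstract (nor in the companion Tank 19F report below); assuming the standard one-sided Student-t upper confidence limit on the mean, a sketch of the computation with made-up concentrations is:

```python
import numpy as np
from scipy import stats

def ucl95(x):
    """One-sided upper 95% confidence limit on the mean:
    xbar + t(0.95, n-1) * s / sqrt(n)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return x.mean() + stats.t.ppf(0.95, n - 1) * x.std(ddof=1) / np.sqrt(n)

# Six scrape-sample results (each the mean of three determinations); made-up units.
print(ucl95([1.2, 0.9, 1.5, 1.1, 1.3, 1.0]))
```

This matches the abstract's description of the UCL95% as a function of the number of samples, the average, and the standard deviation.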
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis sample results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by a UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
Reconstruction of three-dimensional porous media using generative adversarial neural networks
NASA Astrophysics Data System (ADS)
Mosser, Lukas; Dubrule, Olivier; Blunt, Martin J.
2017-10-01
To evaluate the variability of multiphase flow properties of porous media at the pore scale, it is necessary to acquire a number of representative samples of the void-solid structure. While modern x-ray computer tomography has made it possible to extract three-dimensional images of the pore space, assessment of the variability in the inherent material properties is often experimentally not feasible. We present a method to reconstruct the solid-void structure of porous media by applying a generative neural network that allows an implicit description of the probability distribution represented by three-dimensional image data sets. We show, by using an adversarial learning approach for neural networks, that this method of unsupervised learning is able to generate representative samples of porous media that honor their statistics. We successfully compare measures of pore morphology, such as the Euler characteristic, two-point statistics, and directional single-phase permeability of synthetic realizations with the calculated properties of a bead pack, Berea sandstone, and Ketton limestone. Results show that generative adversarial networks can be used to reconstruct high-resolution three-dimensional images of porous media at different scales that are representative of the morphology of the images used to train the neural network. The fully convolutional nature of the trained neural network allows the generation of large samples while maintaining computational efficiency. Compared to classical stochastic methods of image reconstruction, the implicit representation of the learned data distribution can be stored and reused to generate multiple realizations of the pore structure very rapidly.
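One of the pore-morphology measures the authors compare, the two-point statistics, can be sketched with an FFT autocorrelation. This illustrative version (toy binary volume, periodic boundaries assumed) is not the authors' code:

```python
import numpy as np

def two_point_probability(img):
    """Radially averaged two-point probability S2(r) of a binary 3D image,
    via FFT autocorrelation under periodic boundary conditions."""
    f = np.fft.fftn(img.astype(float))
    corr = np.fft.ifftn(f * np.conj(f)).real / img.size   # <I(x) I(x+r)>
    # collapse the 3D correlation to a radial profile, honoring wrap-around
    idx = np.indices(img.shape)
    dist = np.sqrt(sum(np.minimum(i, s - i) ** 2
                       for i, s in zip(idx, img.shape)))
    r = dist.astype(int).ravel()
    return np.bincount(r, corr.ravel()) / np.bincount(r)

img = np.random.default_rng(0).random((64, 64, 64)) < 0.3  # toy porous volume
s2 = two_point_probability(img)
print(s2[0], img.mean())   # S2(0) equals the porosity
```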
Computed Tomography to Estimate the Representative Elementary Area for Soil Porosity Measurements
Borges, Jaqueline Aparecida Ribaski; Pires, Luiz Fernando; Belmont Pereira, André
2012-01-01
Computed tomography (CT) is a technique that provides images of different solid and porous materials. CT could be an ideal tool to study representative sizes of soil samples because of the noninvasive characteristic of this technique. The scrutiny of such representative elementary sizes (RESs) has been the target of attention of many researchers in the soil physics field owing to the strong relationship between physical properties and the size of the soil sample. In the current work, data from gamma-ray CT were used to assess RES in measurements of soil porosity (ϕ). For statistical analysis, a study of the full width at half maximum (FWHM) of the fitted distribution of ϕ for different areas (1.2 to 1162.8 mm2) selected inside the tomographic images is proposed herein. The results obtained point out that samples with a section area of at least 882.1 mm2 were the ones that provided representative values of ϕ for the studied Brazilian tropical soil. PMID:22666133
Evaluation of Differential DependencY (EDDY) is a statistical test for the differential dependency relationship of a set of genes between two given conditions. For each condition, possible dependency network structures are enumerated and their likelihoods are computed to represent a probability distribution of dependency networks. The difference between the probability distributions of dependency networks is computed between conditions, and its statistical significance is evaluated with random permutations of condition labels on the samples.
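EDDY's enumeration of network likelihoods is beyond a short sketch, but its permutation scheme for significance can be illustrated with a simpler dependency statistic (a correlation-matrix contrast standing in for the divergence between network distributions); the data and effect below are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def dependency_stat(a, b):
    """Proxy for EDDY's divergence: distance between the two conditions'
    gene-gene correlation structures (the real method compares likelihoods
    of enumerated network structures)."""
    return np.linalg.norm(np.corrcoef(a, rowvar=False) -
                          np.corrcoef(b, rowvar=False))

def permutation_pvalue(x, labels, n_perm=1000):
    obs = dependency_stat(x[labels == 0], x[labels == 1])
    null = np.empty(n_perm)
    for i in range(n_perm):
        p = rng.permutation(labels)                 # shuffle condition labels
        null[i] = dependency_stat(x[p == 0], x[p == 1])
    return (1 + np.sum(null >= obs)) / (1 + n_perm)

x = rng.normal(size=(60, 5))                        # 60 samples, 5 genes
labels = np.repeat([0, 1], 30)
x[labels == 1, 0] += 0.9 * x[labels == 1, 1]        # condition-specific dependency
print(permutation_pvalue(x, labels))
```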
Schramm, Jesper; Andersen, Morten; Vach, Kirstin; Kragstrup, Jakob; Peter Kampmann, Jens; Søndergaard, Jens
2007-01-01
Objective: To examine the extent and composition of pharmaceutical industry representatives' marketing techniques, with a particular focus on drug sampling in relation to drug age. Design: A group of 47 GPs prospectively collected data on drug promotional activities during a six-month period, and a sub-sample of 10 GPs furthermore recorded the representatives' marketing techniques in detail. Setting: Primary healthcare. Subjects: General practitioners in the County of Funen, Denmark. Main outcome measures: Promotional visits and corresponding marketing techniques. Results: The 47 GPs recorded 1050 visits, corresponding to a median of 19 (range 3 to 63) per GP over the six months. The majority of drugs promoted (52%) had been marketed more than five years earlier. There was a statistically significant decline in the proportion of visits at which drug samples were offered with increasing drug age, but the decline was small (OR 0.97, 95% CI 0.95 to 0.98, per year). Leaflets (68%), suggestions on how to improve therapy for a specific patient registered with the practice (53%), drug samples (48%), and gifts (36%) were the most frequently used marketing techniques. Conclusion: Drug-industry representatives use a variety of promotional methods. The tendency to hand out drug samples was statistically significantly associated with drug age, but the decline was small. PMID:17497486
Effects of sampling interval on spatial patterns and statistics of watershed nitrogen concentration
Wu, S.-S.D.; Usery, E.L.; Finn, M.P.; Bosch, D.D.
2009-01-01
This study investigates how the spatial patterns and statistics of a 30 m resolution, model-simulated, watershed nitrogen concentration surface change as the sampling interval increases from 30 m to 600 m in 30 m steps for the Little River Watershed (Georgia, USA). The results indicate that the mean, standard deviation, and variogram sills show no consistent trends with increasing sampling interval, whereas the variogram ranges remain constant. A sampling interval of 90 m or smaller is necessary to build a representative variogram. The interpolation accuracy, clustering level, and total hot spot areas show decreasing trends approximating a logarithmic function. The trends correspond to the nitrogen variogram and start to level off at a sampling interval of 360 m, which is therefore regarded as a critical spatial scale of the Little River Watershed. Copyright © 2009 by Bellwether Publishing, Ltd. All rights reserved.
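A toy version of the subsampling experiment, with a smoothed random field standing in for the simulated nitrogen surface, shows how the statistics are recomputed as the sampling interval grows:

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(3)
# Stand-in for the 30 m nitrogen surface: a spatially smoothed random field.
field = fftconvolve(rng.normal(size=(512, 512)),
                    np.ones((15, 15)) / 225.0, mode="same")

for step in (1, 2, 4, 8, 16, 20):   # 30, 60, 120, 240, 480, 600 m intervals
    sub = field[::step, ::step]     # sample every 'step'-th cell in each direction
    print(f"interval {30 * step:>3} m  mean {sub.mean():+.4f}  sd {sub.std(ddof=1):.4f}")
```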
ERIC Educational Resources Information Center
Beaver, Kevin M.; Schwartz, Joseph A.; Connolly, Eric J.; Al-Ghamdi, Mohammed Said; Kobeisy, Ahmed Nezar
2015-01-01
The role of parenting in the development of criminal behavior has been the source of a vast amount of research, with the majority of studies detecting statistically significant associations between dimensions of parenting and measures of criminal involvement. An emerging group of scholars, however, has drawn attention to the methodological…
Statistical Inference for Porous Materials using Persistent Homology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moon, Chul; Heath, Jason E.; Mitchell, Scott A.
2017-12-01
We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making a statistical inference of the fluid flow and transport properties of porous materials based on their geometry and connectivity.
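The diagram-to-vector step of such a pipeline can be sketched as follows, assuming a Gaussian rasterization of (birth, persistence) points followed by PCA via SVD; this is an illustration, not the authors' implementation:

```python
import numpy as np

def diagram_to_image(diagram, grid=20, sigma=0.05):
    """Rasterize a persistence diagram of (birth, death) pairs into a
    fixed-length vector by smoothing points in (birth, persistence) space,
    one common way to feed diagrams into standard statistics."""
    pts = np.asarray(diagram, dtype=float)
    birth, pers = pts[:, 0], pts[:, 1] - pts[:, 0]
    xs = np.linspace(0, 1, grid)
    gx, gy = np.meshgrid(xs, xs)
    img = sum(np.exp(-((gx - b) ** 2 + (gy - p) ** 2) / (2 * sigma ** 2))
              for b, p in zip(birth, pers))
    return img.ravel()

# Toy diagrams for several material subvolumes, then PCA via SVD.
rng = np.random.default_rng(7)
X = np.stack([diagram_to_image(rng.random((30, 2)).cumsum(axis=1) / 2)
              for _ in range(12)])
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Xc @ vt[:2].T   # features to regress against porosity, permeability, etc.
print(loadings.shape)
```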
AN ARCGIS TOOL FOR CREATING POPULATIONS OF WATERSHEDS
For the Landscape Investigations for Pesticides Study in the Midwest, the goal is to sample a representative subset of watersheds selected statistically from a target population of watersheds within the glaciated corn belt. This area stretches from Ohio to Iowa and includes parts...
40 CFR 403.7 - Removal credits.
Code of Federal Regulations, 2011 CFR
2011-07-01
... representative of the actual operation of the POTW Treatment Plant, an alternative sampling schedule will be... statistically valid description of daily, weekly and seasonal sewage treatment plant loadings and performance... the intentional or unintentional diversion of flow from the POTW before the POTW Treatment Plant...
The Timing of First Marriage: Are There Religious Variations?
ERIC Educational Resources Information Center
Xu, Xiaohe; Hudspeth, Clark D.; Bartkowski, John P.
2005-01-01
Using survey data from a nationally representative sample, this article explores how marriage timing varies across major religious denominations. Survival analysis indicates that net of statistical controls, Catholics, moderate Protestants, conservative Protestants, and Mormons marry significantly earlier than their unaffiliated counterparts. This…
Design of partially supervised classifiers for multispectral image data
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David
1993-01-01
A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori for only one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible, requirement for representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement for prior statistical information without sacrificing significant classifying capability. The first is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.
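The significance-testing idea behind the first algorithm can be illustrated with a Gaussian one-class model: accept a pixel when its Mahalanobis distance falls inside a chi-square acceptance region. Note that the paper estimates the optimal acceptance probability from the data, whereas this sketch fixes alpha for brevity:

```python
import numpy as np
from scipy import stats

def fit_one_class(train):
    """Gaussian model for the single known class; classification then
    becomes a significance test on the Mahalanobis distance."""
    mu = train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(train, rowvar=False))
    return mu, cov_inv

def accept(x, mu, cov_inv, alpha=0.05):
    d2 = (x - mu) @ cov_inv @ (x - mu)                   # squared Mahalanobis distance
    return d2 <= stats.chi2.ppf(1 - alpha, df=mu.size)   # inside the (1-alpha) contour

rng = np.random.default_rng(5)
train = rng.normal(0, 1, size=(200, 4))   # samples of the one labeled class
mu, cov_inv = fit_one_class(train)
print(accept(rng.normal(0, 1, 4), mu, cov_inv),   # in-class point
      accept(rng.normal(4, 1, 4), mu, cov_inv))   # outlier
```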
Seay, Kristen D.; Kohl, Patricia
2012-01-01
Using data from the National Survey of Child and Adolescent Well-Being II (NSCAW II), this article examines the impact of caregiver substance abuse on children’s exposure to violence in the home in a nationally representative sample of families involved with child protective services (CPS). Logistic regression analyses indicate an increased risk of witnessing mild and severe violence in the home for children whose primary caregiver was abusing alcohol or drugs. However, analyses did not find statistically significant relationships between child report of direct victimization in the home by mild or severe violence and caregiver alcohol or drug abuse. PMID:23440502
Historical changes in large river fish assemblages of the Americas: A synthesis
The objective of this synthesis is to summarize patterns in historical changes in the fish assemblages of selected large American rivers, to document causes for those changes, and to suggest rehabilitation measures. Although not a statistically representative sample of large riv...
Urban and Suburban Residents' Perceptions of Farmers and Agriculture.
ERIC Educational Resources Information Center
Molnar, Joseph J.; Duffy, Patricia A.
Attitudes about farming and government agricultural policies differed among residential categories ranging from urban to rural. A mail survey gathered 3,232 completed questionnaires from a national random sample of 9,250 households. Statistical weighting made respondent categories representative of national proportions. Although respondents…
EPA reviewed a statistically representative sample of oil and gas production wells reported by nine service companies to help understand the role of well design and construction practices preventing pathways for subsurface fluid movement.
Wiley, Jeffrey B.
2006-01-01
Five time periods between 1930 and 2002 are identified as having distinct patterns of annual minimum daily mean flows (minimum flows). Average minimum flows increased around 1970 at many streamflow-gaging stations in West Virginia. Before 1930, however, there might have been a period of minimum flows greater than any period identified between 1930 and 2002. The effects of climate variability are probably the principal causes of the differences among the five time periods. Comparisons of selected streamflow statistics are made between values computed for the five identified time periods and values computed for the 1930-2002 interval for 15 streamflow-gaging stations. The average difference between statistics computed for the five time periods and the 1930-2002 interval decreases with increasing magnitude of the low-flow statistic. The greatest individual-station absolute difference was 582.5 percent greater for the 7-day 10-year low flow computed for 1970-1979 compared to the value computed for 1930-2002. The hydrologically based low flows indicate approximately equal or smaller absolute differences than biologically based low flows. The average 1-day 3-year biologically based low flow (1B3) and 4-day 3-year biologically based low flow (4B3) are less than the average 1-day 10-year hydrologically based low flow (1Q10) and 7-day 10-year hydrologic-based low flow (7Q10) respectively, and range between 28.5 percent less and 13.6 percent greater. Seasonally, the average difference between low-flow statistics computed for the five time periods and 1930-2002 is not consistent between magnitudes of low-flow statistics, and the greatest difference is for the summer (July 1-September 30) and fall (October 1-December 31) for the same time period as the greatest difference determined in the annual analysis. The greatest average difference between 1B3 and 4B3 compared to 1Q10 and 7Q10, respectively, is in the spring (April 1-June 30), ranging between 11.6 and 102.3 percent greater. Statistics computed for the individual station's record period may not represent the statistics computed for the period 1930 to 2002 because (1) station records are available predominantly after about 1970 when minimum flows were greater than the average between 1930 and 2002 and (2) some short-term station records are mostly during dry periods, whereas others are mostly during wet periods. A criterion-based sampling of the individual station's record periods at stations was taken to reduce the effects of statistics computed for the entire record periods not representing the statistics computed for 1930-2002. The criterion used to sample the entire record periods is based on a comparison between the regional minimum flows and the minimum flows at the stations. Criterion-based sampling of the available record periods was superior to record-extension techniques for this study because more stations were selected and areal distribution of stations was more widespread. Principal component and correlation analyses of the minimum flows at 20 stations in or near West Virginia identify three regions of the State encompassing stations with similar patterns of minimum flows: the Lower Appalachian Plateaus, the Upper Appalachian Plateaus, and the Eastern Panhandle. All record periods of 10 years or greater between 1930 and 2002 where the average of the regional minimum flows are nearly equal to the average for 1930-2002 are determined as representative of 1930-2002. 
Selected statistics are presented for the longest representative record period that matches the record period for 77 stations in West Virginia and 40 stations near West Virginia. These statistics can be used to develop equations for estimating flow in ungaged stream locations.
Statistically optimal perception and learning: from behavior to neural representations
Fiser, József; Berkes, Pietro; Orbán, Gergő; Lengyel, Máté
2010-01-01
Human perception has recently been characterized as statistical inference based on noisy and ambiguous sensory inputs. Moreover, suitable neural representations of uncertainty have been identified that could underlie such probabilistic computations. In this review, we argue that learning an internal model of the sensory environment is another key aspect of the same statistical inference procedure and thus perception and learning need to be treated jointly. We review evidence for statistically optimal learning in humans and animals, and reevaluate possible neural representations of uncertainty based on their potential to support statistically optimal learning. We propose that spontaneous activity can have a functional role in such representations leading to a new, sampling-based, framework of how the cortex represents information and uncertainty. PMID:20153683
DeWeese, Lawrence R.; Stephens, Verlin C.; Short, Terry M.; Dubrovsky, Neil M.
2007-01-01
The U.S. Geological Survey National Water-Quality Assessment Program collected tissue samples from a variety of aquatic organisms during 1992-1999 within 47 study units across the United States. These tissue samples were collected to determine the occurrence and distribution of 20 major and minor trace elements in aquatic organisms. This report presents the tissue trace-element concentration data, sample summaries, and concentration statistics for 1,457 tissue samples representing 76 species or groups of fish, aquatic invertebrates, and plants collected at 824 sampling sites.
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
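The diagnostic-accuracy measures listed above follow directly from a 2x2 table; a small self-contained helper (the counts are invented) makes the definitions concrete:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Core test-efficacy measures from a 2x2 table of test vs disease status."""
    sens = tp / (tp + fn)                 # P(test+ | disease)
    spec = tn / (tn + fp)                 # P(test- | no disease)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "LR+": sens / (1 - spec),         # positive likelihood ratio
        "LR-": (1 - sens) / spec,         # negative likelihood ratio
    }

print(diagnostic_summary(tp=90, fp=25, fn=10, tn=175))
```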
VizieR Online Data Catalog: The ESO DIBs Large Exploration Survey (Cox+, 2017)
NASA Astrophysics Data System (ADS)
Cox, N. L. J.; Cami, J.; Farhang, A.; Smoker, J.; Monreal-Ibero, A.; Lallement, R.; Sarre, P. J.; Marshall, C. C. M.; Smith, K. T.; Evans, C. J.; Royer, P.; Linnartz, H.; Cordiner, M. A.; Joblin, C.; van Loon, J. T.; Foing, B. H.; Bhatt, N. H.; Bron, E.; Elyajouri, M.; de Koter, A.; Ehrenfreund, P.; Javadi, A.; Kaper, L.; Khosroshadi, H. G.; Laverick, M.; Le Petit, F.; Mulas, G.; Roueff, E.; Salama, F.; Spaans, M.
2018-01-01
We constructed a statistically representative survey sample that probes a wide range of interstellar environment parameters including reddening E(B-V), visual extinction AV, total-to-selective extinction ratio RV, and molecular hydrogen fraction fH2. EDIBLES provides the community with optical (~305-1042nm) spectra at high spectral resolution (R~70000 in the blue arm and 100000 in the red arm) and high signal-to-noise (S/N; median value ~500-1000), for a statistically significant sample of interstellar sightlines. Many of the >100 sightlines included in the survey already have auxiliary available ultraviolet, infrared and/or polarisation data on the dust and gas components. (2 data files).
Reaching Asian Americans: sampling strategies and incentives.
Lee, Soo-Kyung; Cheng, Yu-Yao
2006-07-01
Reaching and recruiting representative samples of minority populations is often challenging. This study examined, in Chinese and Korean Americans: 1) whether using two different sampling strategies (random sampling vs. convenience sampling) significantly affected the characteristics of recruited participants and 2) whether providing different incentives in the mail survey produced different response rates. We found statistically significant, though mostly unremarkable, differences between the random and convenience samples. Offering monetary incentives in the mail survey improved response rates among Chinese Americans, while offering a small gift did not improve response rates among either Chinese or Korean Americans. This information will be useful for researchers and practitioners working with Asian Americans.
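Comparing response rates between incentive groups, as done here, typically reduces to a two-proportion z-test; a sketch with invented counts (the paper's actual counts are not given in the abstract):

```python
import numpy as np
from scipy import stats

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two response rates."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * stats.norm.sf(abs(z))             # z statistic, p-value

# Hypothetical: monetary incentive vs no incentive, 400 mailed surveys each.
print(two_proportion_z(120, 400, 90, 400))
```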
Method of identifying clusters representing statistical dependencies in multivariate data
NASA Technical Reports Server (NTRS)
Borucki, W. J.; Card, D. H.; Lyle, G. C.
1975-01-01
The approach is first to cluster and then to compute spatial boundaries for the resulting clusters. The next step is to compute, from a set of Monte Carlo samples obtained by scrambling the data, estimates of the probabilities of obtaining at least as many points within the boundaries as were actually observed in the original data.
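A compact illustration of this scrambled-data Monte Carlo test, assuming rectangular cluster boundaries for simplicity:

```python
import numpy as np

rng = np.random.default_rng(2)

def count_inside(data, lo, hi):
    """Points falling inside a rectangular cluster boundary."""
    return np.sum(np.all((data >= lo) & (data <= hi), axis=1))

data = rng.normal(size=(200, 2))
data[:40] += 2.5                     # a genuine cluster of dependent points
lo, hi = np.array([1.5, 1.5]), np.array([3.5, 3.5])
observed = count_inside(data, lo, hi)

# Scramble each variable independently to destroy dependencies between them,
# then see how often the scrambled data put at least as many points in the box.
null = np.array([count_inside(np.column_stack([rng.permutation(c)
                                               for c in data.T]), lo, hi)
                 for _ in range(1000)])
p = (1 + np.sum(null >= observed)) / 1001
print(observed, p)
```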
Roadmap for Navy Family Research.
1980-08-01
of methodological limitations, including: small, often non-representative or narrowly defined samples; inadequate statistical controls; inadequate... Overview of the Research Roadmap... Methodology... the Office of Naval Research by the Westinghouse Public Applied Systems Division, and is designed to provide the Navy with a systematic framework for
Learning Opportunities for Group Learning
ERIC Educational Resources Information Center
Gil, Alfonso J.; Mataveli, Mara
2017-01-01
Purpose: This paper aims to analyse the impact of organizational learning culture and learning facilitators in group learning. Design/methodology/approach: This study was conducted using a survey method applied to a statistically representative sample of employees from Rioja wine companies in Spain. A model was tested using a structural equation…
Money Matters: The Influence of Financial Factors on Graduate Student Persistence
ERIC Educational Resources Information Center
Strayhorn, Terrell L.
2010-01-01
National statistics indicate that approximately 50 percent of all graduate students fail to complete their degree; thus, understanding the factors that influence their persistence is an important research objective. Using data from a nationally representative sample of bachelor's degree recipients, the study aimed to answer three questions: What…
Representing Learning With Graphical Models
NASA Technical Reports Server (NTRS)
Buntine, Wray L.; Lum, Henry, Jr. (Technical Monitor)
1994-01-01
Probabilistic graphical models are being used widely in artificial intelligence, for instance, in diagnosis and expert systems, as a unified qualitative and quantitative framework for representing and reasoning with probabilities and independencies. Their development and use spans several fields including artificial intelligence, decision theory and statistics, and provides an important bridge between these communities. This paper shows by way of example that these models can be extended to machine learning, neural networks and knowledge discovery by representing the notion of a sample on the graphical model. Not only does this allow a flexible variety of learning problems to be represented, it also provides the means for representing the goal of learning and opens the way for the automatic development of learning algorithms from specifications.
NASA Astrophysics Data System (ADS)
Lima, Pedro; Steger, Stefan; Glade, Thomas
2017-04-01
Landslides can represent a significant threat to people and infrastructure in hilly and mountainous landscapes worldwide. Understanding and predicting these geomorphic processes is crucial to avoid economic losses or even casualties. Statistically based landslide susceptibility models are well known to rely heavily on the quality, representativeness, and availability of input data. In this context, several studies indicate that the landslide inventory represents the most important input data, yet each landslide mapping technique or data collection has its drawbacks. Consequently, biased landslide inventories may commonly be introduced into statistical models, especially at regional or even national scale. It remains for the researcher to be aware of potential limitations and to design strategies that avoid or reduce the propagation of input data errors and biases into the modelling outcomes. Previous studies have shown that such erroneous landslide inventories may lead to unrealistic landslide susceptibility maps. We assume that one possibility to tackle systematic landslide-inventory biases is to concentrate on sampling strategies for the distribution of non-landslide locations. For this purpose, we test an approach for the Austrian territory that uses a modified non-landslide sampling strategy instead of the traditionally applied random sampling. It is expected that the way non-landslide locations are represented (e.g. equally over the area, or only within those areas where mapping campaigns have been conducted) is important for reducing a potential over- or underestimation of landslide susceptibility within specific areas. Presumably every landslide inventory is systematically incomplete, especially in areas where no mapping campaign was previously conducted; this also applies to the one currently available for the Austrian territory, comprising 14,519 shallow landslides. Within this study, we introduce the following explanatory variables to test the effect of different non-landslide strategies: lithological units, grouped by their geotechnical properties, and topographic parameters such as aspect, elevation, slope gradient, and topographic position. Landslide susceptibility maps will be derived by applying logistic regression, and systematic comparisons will be carried out between models created with different non-landslide sampling strategies. Models generated by the conventional random sampling are compared against models based on stratified and clustered sampling strategies. The modelling results will be compared in terms of their prediction performance, measured by the AUROC (Area Under the Receiver Operating Characteristic curve) obtained by means of k-fold cross-validation, and also by the spatial patterns of the maps. The outcomes of this study are intended to contribute to the understanding of how landslide-inventory biases may be counteracted.
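A schematic of the comparison described above, using scikit-learn logistic regression with cross-validated AUROC and two invented non-landslide sampling strategies; the data, predictors, and "mapped area" are synthetic stand-ins, not the Austrian inventory:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)

# Hypothetical predictor table: slope, elevation, lithology class (numerically encoded).
n_cells = 20_000
X_all = np.column_stack([rng.uniform(0, 45, n_cells),       # slope (deg)
                         rng.uniform(200, 2500, n_cells),   # elevation (m)
                         rng.integers(0, 5, n_cells)])      # lithology group
mapped = X_all[:, 1] < 1200                                 # surveyed lowlands only
landslide_idx = rng.choice(np.where(mapped)[0], 500, replace=False)

def non_landslide_sample(strategy, size=500):
    if strategy == "random":    # absences anywhere in the study area
        pool = np.setdiff1d(np.arange(n_cells), landslide_idx)
    else:                       # "mapped-only": absences restricted to surveyed area
        pool = np.setdiff1d(np.where(mapped)[0], landslide_idx)
    return rng.choice(pool, size, replace=False)

for strategy in ("random", "mapped-only"):
    neg = non_landslide_sample(strategy)
    X = np.vstack([X_all[landslide_idx], X_all[neg]])
    y = np.r_[np.ones(landslide_idx.size), np.zeros(neg.size)]
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          cv=5, scoring="roc_auc").mean()
    print(strategy, round(auc, 3))
```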
Slowdowns in diversification rates from real phylogenies may not be real.
Cusimano, Natalie; Renner, Susanne S
2010-07-01
Studies of diversification patterns often find a slowing in lineage accumulation toward the present. This seemingly pervasive pattern of rate downturns has been taken as evidence for adaptive radiations, density-dependent regulation, and metacommunity species interactions. The significance of rate downturns is evaluated with statistical tests (the gamma statistic and Monte Carlo constant rates (MCCR) test; birth-death likelihood models and Akaike Information Criterion [AIC] scores) that rely on null distributions, which assume that the included species are a random sample of the entire clade. Sampling in real phylogenies, however, often is nonrandom because systematists try to include early-diverging species or representatives of previous intrataxon classifications. We studied the effects of biased sampling, structured sampling, and random sampling by experimentally pruning simulated trees (60 and 150 species) as well as a completely sampled empirical tree (58 species) and then applying the gamma statistic/MCCR test and birth-death likelihood models/AIC scores to assess rate changes. For trees with random species sampling, the true model (i.e., the one fitting the complete phylogenies) could be inferred in most cases. Oversampling deep nodes, however, strongly biases inferences toward downturns, with simulations of structured and biased sampling suggesting that this occurs when sampling percentages drop below 80%. The magnitude of the effect and the sensitivity of diversification rate models is such that a useful rule of thumb may be not to infer rate downturns from real trees unless they have >80% species sampling.
NASA Technical Reports Server (NTRS)
Alberts, J. R.; Burden, H. W.; Hawes, N.; Ronca, A. E.
1996-01-01
To assess prenatal and postnatal developmental status in the offspring of a group of animals, it is typical to examine fetuses from some of the dams as well as infants born to the remaining dams. Statistical limitations often arise, particularly when the animals are rare or especially precious, because all offspring of the dam represent only a single statistical observation; littermates are not independent observations (biologically or statistically). We describe a study in which pregnant laboratory rats were laparotomized on day 7 of gestation (GD7) to ascertain the number and distribution of uterine implantation sites and were subjected to a simulated experience on a 10-day space shuttle flight. After the simulated landing on GD18, rats were unilaterally hysterectomized, thus providing a sample of fetuses from 10 independent uteruses, followed by successful vaginal delivery on GD22, yielding postnatal samples from 10 uteruses. A broad profile of maternal and offspring morphologic and physiologic measures indicated that these novel sampling procedures did not compromise maternal well-being and maintained normal offspring development and function. Measures included maternal organ weights and hormone concentrations, offspring body size, growth, organ weights, sexual differentiation, and catecholamine concentrations.
Alegana, Victor A; Wright, Jim; Bosco, Claudio; Okiro, Emelda A; Atkinson, Peter M; Snow, Robert W; Tatem, Andrew J; Noor, Abdisalan M
2017-11-21
One pillar of monitoring progress towards the Sustainable Development Goals is investment in high-quality data to strengthen the scientific basis for decision-making. At present, nationally representative surveys are the main source of data for establishing a scientific evidence base, monitoring, and evaluation of health metrics. However, the optimal precision of various population-level health and development indicators remains unquantified in nationally representative household surveys. Here, a retrospective analysis of the precision of prevalence estimates from these surveys was conducted. Using malaria indicators, data were assembled in nine sub-Saharan African countries with at least two nationally representative surveys. A Bayesian statistical model was used to estimate between- and within-cluster variability for fever and malaria prevalence, and insecticide-treated bed net (ITN) use in children under the age of 5 years. The intra-class correlation coefficient was estimated along with the optimal sample size for each indicator, with associated uncertainty. Results suggest that the sample sizes required for the current nationally representative surveys increase with declining malaria prevalence. Comparison between the actual sample size and the modelled estimate showed a requirement to increase the sample size for parasite prevalence by up to 77.7% (95% Bayesian credible intervals 74.7-79.4) for the 2015 Kenya MIS (estimated sample size of children 0-4 years 7218 [7099-7288]), and by 54.1% [50.1-56.5] for the 2014-2015 Rwanda DHS (12,220 [11,950-12,410]). This study highlights the importance of defining indicator-relevant sample sizes to achieve the required precision in the current national surveys. While expanding the current surveys would need additional investment, the study highlights the need for improved approaches to cost-effective sampling.
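The design-effect logic underlying such cluster-survey sample sizes can be sketched as follows. This uses the standard DEFF = 1 + (m - 1) x ICC inflation of a simple-random-sample size for a prevalence estimate at fixed relative precision, not the paper's Bayesian model; parameter values are invented:

```python
import numpy as np
from scipy import stats

def cluster_sample_size(p, rel_precision, icc, m, alpha=0.05):
    """Sample size for a prevalence estimate from a cluster survey:
    simple-random-sample size inflated by DEFF = 1 + (m - 1) * ICC,
    with m respondents per cluster."""
    z = stats.norm.ppf(1 - alpha / 2)
    d = rel_precision * p                    # absolute margin of error
    n_srs = z**2 * p * (1 - p) / d**2        # simple random sample size
    deff = 1 + (m - 1) * icc                 # clustering inflation
    return int(np.ceil(n_srs * deff))

# Lower prevalence -> larger required sample at the same relative precision,
# mirroring the paper's headline finding.
for prev in (0.30, 0.10, 0.05):
    print(prev, cluster_sample_size(prev, rel_precision=0.20, icc=0.05, m=20))
```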
Assessing Literacy: The Framework for the National Adult Literacy Survey.
ERIC Educational Resources Information Center
Campbell, Anne; And Others
To satisfy federal requirements, the National Center for Education Statistics and the Division of Adult Education and Literacy planned a nationally representative household sample survey to assess the literacy skills of the adult population of the United States, to be conducted by the Educational Testing Service with the assistance of Westat, Inc.…
Use of Longitudinal Regression in Quality Control. Research Report. ETS RR-14-31
ERIC Educational Resources Information Center
Lu, Ying; Yen, Wendy M.
2014-01-01
This article explores the use of longitudinal regression as a tool for identifying scoring inaccuracies. Student progression patterns, as evaluated through longitudinal regressions, typically are more stable from year to year than are scale score distributions and statistics, which require representative samples to conduct credibility checks.…
ERIC Educational Resources Information Center
Cardina, Catherine E.; DeNysschen, Carol
2018-01-01
Purpose: This study described professional development (PD) among public school physical education (PE) teachers and compared PE teachers to teachers of other subjects. Method: Data were collected from a nationally representative sample of public school teachers in the United States. Descriptive statistics were used to describe teachers' support…
ERIC Educational Resources Information Center
White, Susan C.
2016-01-01
Since 1987, the Statistical Research Center at the American Institute of Physics has regularly conducted a survey of high school physics teachers. This September we're at it again. This fall, we will look for physics teachers at each of the 4,000+ schools with 12th grade in our nationally representative sample of public and private schools. We…
ERIC Educational Resources Information Center
Hamaker, Ellen L.; Dolan, Conor V.; Molenaar, Peter C. M.
2005-01-01
Results obtained with interindividual techniques in a representative sample of a population are not necessarily generalizable to the individual members of this population. In this article the specific condition is presented that must be satisfied to generalize from the interindividual level to the intraindividual level. A way to investigate…
Geostatistical Sampling Methods for Efficient Uncertainty Analysis in Flow and Transport Problems
NASA Astrophysics Data System (ADS)
Liodakis, Stylianos; Kyriakidis, Phaedon; Gaganis, Petros
2015-04-01
In hydrogeological applications involving flow and transport in heterogeneous porous media, the spatial distribution of hydraulic conductivity is often parameterized in terms of a lognormal random field based on a histogram and variogram model inferred from data and/or synthesized from relevant knowledge. Realizations of simulated conductivity fields are then generated using geostatistical simulation involving simple random (SR) sampling and are subsequently used as inputs to physically-based simulators of flow and transport in a Monte Carlo framework for evaluating the uncertainty in the spatial distribution of solute concentration due to the uncertainty in the spatial distribution of hydraulic conductivity [1]. Realistic uncertainty analysis, however, calls for a large number of simulated concentration fields and hence can become expensive in terms of both time and computer resources. A more efficient alternative to SR sampling is Latin hypercube (LH) sampling, a special case of stratified random sampling, which yields a more representative distribution of simulated attribute values with fewer realizations [2]. Here, the term representative implies realizations that efficiently span the range of possible conductivity values corresponding to the lognormal random field. In this work we investigate the efficiency of alternatives to classical LH sampling within the context of simulation of flow and transport in a heterogeneous porous medium. More precisely, we consider the stratified likelihood (SL) sampling method of [3], in which attribute realizations are generated using the polar simulation method by exploiting the geometrical properties of the multivariate Gaussian distribution function. In addition, we propose a more efficient version of the above method, here termed minimum energy (ME) sampling, whereby a set of N representative conductivity realizations at M locations is constructed by: (i) generating a representative set of N points distributed on the surface of an M-dimensional, unit-radius hyper-sphere, (ii) relocating the N points onto a representative set of N hyper-spheres of different radii, and (iii) transforming the coordinates of those points to lie on N different hyper-ellipsoids spanning the multivariate Gaussian distribution. The above method is applied in a dimensionality-reduction context by defining flow-controlling points over which representative sampling of hydraulic conductivity is performed, thus also accounting for the sensitivity of the flow and transport model to the input hydraulic conductivity field. The performance of the various stratified sampling methods, LH, SL, and ME, is compared to that of SR sampling in terms of reproduction of ensemble statistics of hydraulic conductivity and solute concentration for different sample sizes N (numbers of realizations). The results indicate that ME sampling constitutes an equally if not more efficient simulation method than LH and SL sampling, as it can reproduce to a similar extent statistics of the conductivity and concentration fields, yet with smaller sampling variability than SR sampling. References [1] Gutjahr A.L. and Bras R.L. Spatial variability in subsurface flow and transport: A review. Reliability Engineering & System Safety, 42, 293-316, (1993). [2] Helton J.C. and Davis F.J. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliability Engineering & System Safety, 81, 23-69, (2003). [3] Switzer P. Multiple simulation of spatial fields.
In: Heuvelink G, Lemmens M (eds) Proceedings of the 4th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences, Coronet Books Inc., pp 629-635 (2000).
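A minimal sketch of the baseline comparison, LH versus SR sampling of a lognormal conductivity distribution via the inverse CDF; the SL and ME methods are more involved and are not reproduced here, and the lognormal parameters are invented:

```python
import numpy as np
from scipy import stats

def latin_hypercube(n, dim, rng):
    """n stratified samples in [0,1)^dim: one point per probability stratum
    in each dimension, with strata randomly paired across dimensions."""
    u = (rng.random((n, dim)) + np.arange(n)[:, None]) / n
    for j in range(dim):
        rng.shuffle(u[:, j])
    return u

rng = np.random.default_rng(8)
n, dim = 20, 3
u_lh = latin_hypercube(n, dim, rng)
u_sr = rng.random((n, dim))                    # simple random counterpart

# Map uniforms to lognormal hydraulic conductivity via the inverse CDF.
k_lh = stats.lognorm.ppf(u_lh, s=1.0, scale=1e-5)
k_sr = stats.lognorm.ppf(u_sr, s=1.0, scale=1e-5)
print(np.log(k_lh).std(), np.log(k_sr).std())  # LH spans the range more evenly
```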
Perugini, Monia; Visciano, Pierina; Manera, Maurizio; Abete, Maria Cesarina; Gavinelli, Stefania; Amorena, Michele
2013-11-01
The aim of this study was to evaluate mercury and selenium distribution in different portions (exoskeleton, white meat and brown meat) of Norway lobster (Nephrops norvegicus). Some samples were also analysed as whole specimens. The same portions were also examined after boiling, in order to observe whether this cooking practice could affect mercury and selenium concentrations. The highest mercury concentrations were detected in white meat, exceeding in all cases the maximum levels established by European legislation. The brown meat showed the highest selenium concentrations. In all boiled samples, mercury levels showed a statistically significant increase compared to the raw portions. On the contrary, selenium concentrations detected in boiled samples of white meat, brown meat and whole specimens showed a statistically significant decrease compared to the corresponding raw samples. These results indicate that boiling modifies mercury and selenium concentrations. The high mercury levels detected represent a possible risk for consumers, and the publication and dissemination of specific advisories concerning seafood consumption is recommended.
NASA Technical Reports Server (NTRS)
Torres-Pomales, Wilfredo
2014-01-01
This report describes a modeling and simulation approach for disturbance patterns representative of the environment experienced by a digital system in an electromagnetic reverberation chamber. The disturbance is modeled by a multivariate statistical distribution based on empirical observations. Extended versions of the Rejection Sampling and Inverse Transform Sampling techniques are developed to generate multivariate random samples of the disturbance. The results show that Inverse Transform Sampling returns samples with higher fidelity relative to the empirical distribution. This work is part of an ongoing effort to develop a resilience assessment methodology for complex safety-critical distributed systems.
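Inverse transform sampling from an empirical distribution, the better-performing of the two techniques according to the report, can be sketched in one dimension; the report's multivariate extension is not reproduced, and the disturbance data below are invented:

```python
import numpy as np

def empirical_inverse_transform(observed, n, rng):
    """Draw n samples whose distribution matches an empirical data set:
    uniform draws are pushed through the inverse of the empirical CDF."""
    xs = np.sort(observed)
    cdf = np.arange(1, xs.size + 1) / xs.size   # empirical CDF at sorted points
    u = rng.random(n)
    return np.interp(u, cdf, xs)                # piecewise-linear inverse CDF

rng = np.random.default_rng(4)
observed = rng.gamma(2.0, 1.5, size=500)        # stand-in for measured disturbances
sample = empirical_inverse_transform(observed, 10_000, rng)
print(observed.mean(), sample.mean())           # generated samples track the data
```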
The optical and near-infrared colors of galaxies, 1: The photometric data
NASA Technical Reports Server (NTRS)
Bershady, Matthew A.; Hereld, Mark; Kron, Richard G.; Koo, David C.; Munn, Jeffrey A.; Majewski, Steven R.
1994-01-01
We present optical and near-infrared photometry and spectroscopic redshifts of a well defined sample of 171 field galaxies selected from three high galactic latitude fields. This data set forms the basis for subsequent studies to characterize the trends, dispersion, and evolution of rest-frame colors and image structure. A subset of 143 galaxies constitutes a magnitude-limited sample to B approx. 19.9-20.75 (depending on field), with a median redshift of 0.14, and a maximum redshift of 0.54. This subset is statistically representative in its sampling of the apparent color distribution of galaxies. Thirty-six galaxies were selected to have the reddest red-optical colors in two redshift intervals between z = 0.2 and z = 0.3. Photometric passbands are similar to U, B, V, I, and K, and sample galaxy spectral energy distributions between 0.37 and 2.2 micrometers in the observed frame, or down to 0.26 micrometers in the rest frame for the most distant galaxies. B and K images of the entire sample are assembled to form the first optical and near-infrared atlas of a statistically-representative sample of field galaxies. We discuss techniques for faint field-galaxy photometry, including a working definition of a total magnitude, and a method for matching magnitudes in different passbands and different seeing conditions to ensure reliable, integrated colors. Photographic saturation, which substantially affects the brightest 12% of our sample in the optical bands, is corrected with a model employing measured plate-density distributions for each galaxy, calibrated via similar measurements for stars as a function of known saturation level. Both the relative and absolute calibration of our photometry are demonstrated.
Comparison of Sample Size by Bootstrap and by Formulas Based on Normal Distribution Assumption.
Wang, Zuozhen
2018-01-01
The bootstrapping technique is distribution-independent, which provides an indirect way to estimate the sample size for a clinical trial based on a relatively smaller sample. In this paper, sample size estimation to compare two parallel-design arms for continuous data by a bootstrap procedure is presented for various test types (inequality, non-inferiority, superiority, and equivalence). Meanwhile, sample size calculation by mathematical formulas (normal distribution assumption) for the same data is also carried out. The power difference between the two calculation methods is acceptably small for all the test types, showing that the bootstrap procedure is a credible technique for sample size estimation. After that, we compared the powers determined using the two methods based on data that violate the normal distribution assumption. To accommodate the feature of the data, the nonparametric Wilcoxon test was applied to compare the two groups during the bootstrap power estimation. As a result, the power estimated by the normal distribution-based formula is far larger than that estimated by bootstrap for each specific sample size per group. Hence, for this type of data, it is preferable that the bootstrap method be applied for sample size calculation at the beginning, and that the same statistical method as used in the subsequent statistical analysis be employed for each bootstrap sample during the course of bootstrap sample size estimation, provided historical data are available that are representative of the population to which the proposed trial intends to extrapolate.
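A minimal sketch of the bootstrap power estimation described above, assuming hypothetical lognormal pilot data and using the Wilcoxon rank-sum (Mann-Whitney) test as in the paper's non-normal case; the sample size would then be chosen as the smallest n per group whose estimated power reaches the target.

```python
# Hedged sketch: bootstrap power estimation for two parallel arms.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
pilot_a = rng.lognormal(0.0, 1.0, size=60)  # hypothetical historical data, arm A
pilot_b = rng.lognormal(0.4, 1.0, size=60)  # hypothetical historical data, arm B

def bootstrap_power(n_per_group, n_boot=2000, alpha=0.05):
    """Resample n per arm with replacement from the pilot data and
    count how often the test rejects at level alpha."""
    rejections = 0
    for _ in range(n_boot):
        a = rng.choice(pilot_a, size=n_per_group, replace=True)
        b = rng.choice(pilot_b, size=n_per_group, replace=True)
        if mannwhitneyu(a, b).pvalue < alpha:
            rejections += 1
    return rejections / n_boot

for n in (20, 40, 80):
    print(n, bootstrap_power(n))
```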
Afifi, Tracie O; Henriksen, Christine A; Asmundson, Gordon J G; Sareen, Jitender
2012-11-01
To examine the association between a history of 5 types of childhood maltreatment (that is, physical abuse, sexual abuse, emotional abuse, physical neglect, and emotional neglect) and several substance use disorders (SUDs), including alcohol, sedatives, tranquilizers, opioids, amphetamines, cannabis, cocaine, hallucinogens, heroin, and nicotine, in a nationally representative, sex-stratified sample of US adults. Data were drawn from the National Epidemiologic Survey of Alcohol and Related Conditions (NESARC), a nationally representative US sample of adults aged 20 years and older (n = 34 653). Logistic regression models were conducted to understand the relations between 5 types of childhood maltreatment and SUDs separately among men and women after adjusting for sociodemographic variables and Diagnostic and Statistical Manual of Mental Disorders (DSM) Axis I and II mental disorders. All 5 types of childhood maltreatment were associated with increased odds of all individual SUDs among men and women after adjusting for sociodemographic variables, with the exceptions, among men, of physical neglect and heroin abuse or dependence, and of emotional neglect and amphetamine and cocaine abuse or dependence (adjusted odds ratio range 1.3 to 4.7). After further adjustment for other DSM Axis I and II mental disorders, the relations between childhood maltreatment and SUDs were attenuated, but many remained statistically significant. Differences in the patterns of findings were noted for men and women for sexual abuse and emotional neglect. This research provides evidence of the robust nature of the relations between many types of childhood maltreatment and many individual SUDs. The prevention of childhood maltreatment may help to reduce SUDs in the general population.
Multiscale pore structure and constitutive models of fine-grained rocks
NASA Astrophysics Data System (ADS)
Heath, J. E.; Dewers, T. A.; Shields, E. A.; Yoon, H.; Milliken, K. L.
2017-12-01
A foundational concept of continuum poromechanics is the representative elementary volume or REV: an amount of material large enough that pore- or grain-scale fluctuations in relevant properties are dissipated to a definable mean, but smaller than length scales of heterogeneity. We determine 2D-equivalent representative elementary areas (REAs) of pore areal fraction of three major types of mudrocks by applying multi-beam scanning electron microscopy (mSEM) to obtain terapixel image mosaics. Image analysis obtains pore areal fraction and pore size and shape as a function of progressively larger measurement areas. Using backscattering imaging and mSEM data, pores are identified by the components within which they occur, such as in organics or the clastic matrix. We correlate pore areal fraction with nano-indentation, micropillar compression, and axisymmetric testing at multiple length scales on a terrigenous-argillaceous mudrock sample. The combined data set is used to: investigate representative elementary volumes (and areas for the 2D images); determine if scale separation occurs; and determine if transport and mechanical properties at a given length scale can be statistically defined. Clear scale separation occurs between REAs and observable heterogeneity in two of the samples. A highly laminated sample exhibits fine-scale heterogeneity and an overlapping in scales, in which case typical continuum assumptions on statistical variability may break down. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
Popov, Stanko Ilić; Stafilov, Trajče; Šajn, Robert; Tănăselia, Claudiu; Bačeva, Katerina
2014-01-01
A systematic study was carried out to investigate the distribution of fifty-six elements in the water samples from river Vardar (Republic of Macedonia and Greece) and its major tributaries. The samples were collected from 27 sampling sites. Analyses were performed by mass spectrometry with inductively coupled plasma (ICP-MS) and atomic emission spectrometry with inductively coupled plasma (ICP-AES). Cluster and R mode factor analysis (FA) was used to identify and characterise element associations and four associations of elements were determined by the method of multivariate statistics. Three factors represent the associations of elements that occur in the river water naturally while Factor 3 represents an anthropogenic association of the elements (Cd, Ga, In, Pb, Re, Tl, Cu, and Zn) introduced in the river waters from the waste waters from the mining and metallurgical activities in the country.
Rodrigues, Camila Carneiro Dos Santos; Santos, Ewerton; Ramos, Brunalisa Silva; Damasceno, Flaviana Cardoso; Correa, José Augusto Martins
2018-06-01
The 16 priority PAHs were determined in sediment samples from the insular zone of Guajará Bay and Guamá River (Southern Amazon River mouth). Low hydrocarbon levels were observed, and naphthalene was the most representative PAH. The low molecular weight PAHs represented 51% of the total PAHs. Statistical analysis showed that the sampling sites are not significantly different. Source analysis by PAH ratios and principal component analysis revealed that the PAHs derive primarily from a low rate of fossil fuel combustion, mainly related to local small-community activity. According to the sediment quality guidelines, no samples presented potential for biological stress or damage. This study discusses baselines for PAHs in surface sediments from Amazonian aquatic systems, based on source determination by PAH ratios and principal component analysis, sediment quality guidelines, and comparison with data from previous studies.
Toward Robust and Efficient Climate Downscaling for Wind Energy
NASA Astrophysics Data System (ADS)
Vanvyve, E.; Rife, D.; Pinto, J. O.; Monaghan, A. J.; Davis, C. A.
2011-12-01
This presentation describes a more accurate and economical (less time, money and effort) wind resource assessment technique for the renewable energy industry that incorporates innovative statistical techniques and new global mesoscale reanalyses. The technique judiciously selects a collection of "case days" that accurately represent the full range of wind conditions observed at a given site over a 10-year period, in order to estimate the long-term energy yield. We will demonstrate that this new technique provides a very accurate and statistically reliable estimate of the 10-year record of the wind resource by intelligently choosing a sample of approximately 120 case days. This means that the expense of downscaling to quantify the wind resource at a prospective wind farm can be cut by two thirds from the current industry practice of downscaling a randomly chosen 365-day sample to represent winds over a "typical" year. This new estimate of the long-term energy yield at a prospective wind farm also has far less statistical uncertainty than the current industry standard approach. This key finding has the potential to significantly reduce market barriers to both onshore and offshore wind farm development, since insurers and financiers charge prohibitive premiums on investments that are deemed to be high risk. Lower uncertainty directly translates to lower perceived risk, and therefore far more attractive financing terms could be offered to wind farm developers who employ this new technique.
Earning and Learning: The Impact of Paid Work on First-Generation Student Persistence
ERIC Educational Resources Information Center
Micka-Pickunka, Marilyn
2010-01-01
This study utilized the Beginning Postsecondary Student (BPS) longitudinal data set (2004-2006) from the National Center for Education Statistics (NCES), which follows, over six academic years, a nationally representative sample of students who began their postsecondary education during the 2004-2005 academic year. The purpose of this study is…
Adult Literacy in OECD Countries: Technical Report on the First International Adult Literacy Survey.
ERIC Educational Resources Information Center
Murray, T. Scott; Kirsch, Irwin S.; Jenkins, Lynn B.
In December 1995, the Organisation for Economic Co-Operation and Development (OECD) and Statistics Canada jointly published the results of the first International Adult Literacy Survey (IALS). For this survey, representative samples of adults aged 16 to 65 were interviewed and tested in their homes in Canada, France, Germany, the Netherlands,…
ERIC Educational Resources Information Center
National Field Research Center Inc., Iowa City, IA.
Educational programs in solid waste management offered by 16 schools in 9 states were surveyed. These programs represent a sample, only, of the various programs available nationwide. Enrollment and graduate statistics are presented. Overall, 116 full-time and 124 part-time faculty were involved in the programs surveyed. Curricula and sources of…
ERIC Educational Resources Information Center
Konold, Timothy R.; Glutting, Joseph J.
2008-01-01
This study employed a correlated trait-correlated method application of confirmatory factor analysis to disentangle trait and method variance from measures of attention-deficit/hyperactivity disorder obtained at the college level. The two trait factors were "Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition" ("DSM-IV")…
ERIC Educational Resources Information Center
National Field Research Center Inc., Iowa City, IA.
This report, together with volume II, (multiple degree programs), detail 105 post-secondary wastewater treatment programs from 33 states. These programs represent a sample, only, of the various programs available nationwide. Enrollment and graduate statistics are presented. The total number of faculty involved in all the programs surveyed was…
Interstate Survey: What Do Voters Say about K-12 Education in Six States? Polling Paper No. 1
ERIC Educational Resources Information Center
DiPerna, Paul
2010-01-01
The core purpose of the Interstate Survey series is to survey statistically representative statewide samples and report the "levels" and "gaps" of voter opinion, knowledge, and awareness when it comes to K-12 education and school choice reforms--particularly with respect to state performance, education spending, graduation…
Testing Structural Models of DSM-IV Symptoms of Common Forms of Child and Adolescent Psychopathology
ERIC Educational Resources Information Center
Lahey, Benjamin B.; Rathouz, Paul J.; Van Hulle, Carol; Urbano, Richard C.; Krueger, Robert F.; Applegate, Brooks; Garriock, Holly A.; Chapman, Derek A.; Waldman, Irwin D.
2008-01-01
Confirmatory factor analyses were conducted of "Diagnostic and Statistical Manual of Mental Disorders", Fourth Edition (DSM-IV) symptoms of common mental disorders derived from structured interviews of a representative sample of 4,049 twin children and adolescents and their adult caretakers. A dimensional model based on the assignment of symptoms…
NASA Astrophysics Data System (ADS)
Roberts, Michael J.; Braun, Noah O.; Sinclair, Thomas R.; Lobell, David B.; Schlenker, Wolfram
2017-09-01
We compare predictions of a simple process-based crop model (Soltani and Sinclair 2012), a simple statistical model (Schlenker and Roberts 2009), and a combination of both models to actual maize yields on a large, representative sample of farmer-managed fields in the Corn Belt region of the United States. After statistical post-model calibration, the process model (Simple Simulation Model, or SSM) predicts actual outcomes slightly better than the statistical model, but the combined model performs significantly better than either model. The SSM, statistical model and combined model all show similar relationships with precipitation, while the SSM better accounts for temporal patterns of precipitation, vapor pressure deficit and solar radiation. The statistical and combined models show a more negative impact associated with extreme heat for which the process model does not account. Due to the extreme heat effect, predicted impacts under uniform climate change scenarios are considerably more severe for the statistical and combined models than for the process-based model.
Observational studies of patients in the emergency department: a comparison of 4 sampling methods.
Valley, Morgan A; Heard, Kennon J; Ginde, Adit A; Lezotte, Dennis C; Lowenstein, Steven R
2012-08-01
We evaluate the ability of 4 sampling methods to generate representative samples of the emergency department (ED) population. We analyzed the electronic records of 21,662 consecutive patient visits at an urban, academic ED. From this population, we simulated different models of study recruitment in the ED by using 2 sample sizes (n=200 and n=400) and 4 sampling methods: true random, random 4-hour time blocks by exact sample size, random 4-hour time blocks by a predetermined number of blocks, and convenience or "business hours." For each method and sample size, we obtained 1,000 samples from the population. Using χ² tests, we measured the number of statistically significant differences between the sample and the population for 8 variables (age, sex, race/ethnicity, language, triage acuity, arrival mode, disposition, and payer source). Then, for each variable, method, and sample size, we compared the proportion of the 1,000 samples that differed from the overall ED population to the expected proportion (5%). Only the true random samples represented the population with respect to sex, race/ethnicity, triage acuity, mode of arrival, language, and payer source in at least 95% of the samples. Patient samples obtained using random 4-hour time blocks and business hours sampling systematically differed from the overall ED patient population for several important demographic and clinical variables. However, the magnitude of these differences was not large. Common sampling strategies selected for ED-based studies may affect parameter estimates for several representative population variables. However, the potential for bias for these variables appears small. Copyright © 2012. Published by Mosby, Inc.
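The comparison logic of this study can be imitated on synthetic data. In the hedged Python sketch below, a hypothetical arrival-mode variable depends on hour of arrival, and repeated samples drawn truly at random versus during business hours are tested against the population with χ² tests; all names, rates, and sizes are illustrative, not the study's data.

```python
# Hedged sketch: flag rates of two sampling schemes against a population.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(7)
n_pop = 20000
hour = rng.integers(0, 24, size=n_pop)
# Hypothetical dependence: ambulance arrivals more common at night.
p_amb = np.where((hour < 7) | (hour > 21), 0.35, 0.15)
ambulance = rng.uniform(size=n_pop) < p_amb
pop_props = np.array([1 - ambulance.mean(), ambulance.mean()])

def flag_rate(select, n_samples=1000, n=400, alpha=0.05):
    """Fraction of samples that differ significantly from the population."""
    flags = 0
    for _ in range(n_samples):
        counts = np.bincount(ambulance[select(n)].astype(int), minlength=2)
        if chisquare(counts, f_exp=pop_props * n).pvalue < alpha:
            flags += 1
    return flags / n_samples

true_random = lambda n: rng.choice(n_pop, size=n, replace=False)
day_idx = np.where((hour >= 8) & (hour < 17))[0]
business_hours = lambda n: rng.choice(day_idx, size=n, replace=False)
print("true random:", flag_rate(true_random))        # close to alpha
print("business hours:", flag_rate(business_hours))  # systematically higher
```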
Riskind, Rachel G; Tornello, Samantha L
2017-09-01
Previous researchers have found evidence for differences in parenting goals between lesbian and gay people and their heterosexual peers. However, no previous research has quantified the parenting goals of bisexual people or evaluated parenting goals as a function of sexual partner gender. In addition, political and social climates for sexual minority people had improved rapidly since the last representative data on lesbian and gay peoples' plans for parenthood were collected. We analyzed data from 3,941 childless lesbian, gay, bisexual, and heterosexual participants from the 2011-2013 National Survey of Family Growth (NSFG; United States Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, 2014), a nationally representative sample of United States residents aged 15 to 44 years. We found that statistically significant, within-gender sexual orientation differences in parenting plans persist, despite social and legal changes. Consistent with hypotheses, bisexual men's parenting desires and intentions were similar to those of their heterosexual male peers and different from those of their gay male peers, while bisexual women's reports were more mixed. Also consistent with hypotheses, the gender of the most recent sexual partner was a strong predictor of parenting goals. We discuss implications for mental and reproductive health-care providers, attorneys, social workers, and others who interact with sexual minority adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Susukida, Ryoko; Crum, Rosa M; Ebnesajjad, Cyrus; Stuart, Elizabeth A; Mojtabai, Ramin
2017-07-01
To compare randomized controlled trial (RCT) sample treatment effects with the population effects of substance use disorder (SUD) treatment. Statistical weighting was used to re-compute the effects from 10 RCTs such that the participants in the trials had characteristics that resembled those of patients in the target populations. Multi-site RCTs and usual SUD treatment settings in the United States. A total of 3592 patients in 10 RCTs and 1 602 226 patients from usual SUD treatment settings between 2001 and 2009. Three outcomes of SUD treatment were examined: retention, urine toxicology and abstinence. We weighted the RCT sample treatment effects using propensity scores representing the conditional probability of participating in RCTs. Weighting the samples changed the significance of estimated sample treatment effects. Most commonly, positive effects of trials became statistically non-significant after weighting (three trials for retention and urine toxicology and one trial for abstinence); also, non-significant effects became significantly positive (one trial for abstinence) and significantly negative effects became non-significant (two trials for abstinence). There was suggestive evidence of treatment effect heterogeneity in subgroups that are under- or over-represented in the trials, some of which were consistent with the differences in average treatment effects between weighted and unweighted results. The findings of randomized controlled trials (RCTs) for substance use disorder treatment do not appear to be directly generalizable to target populations when the RCT samples do not reflect adequately the target populations and there is treatment effect heterogeneity across patient subgroups. © 2017 Society for the Study of Addiction.
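One standard way to carry out the re-weighting this abstract describes is inverse-odds-of-participation weighting; the Python sketch below uses entirely synthetic data and a plain logistic model, and is not the authors' exact procedure.

```python
# Hedged sketch: weighting a trial sample toward a target population.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Pooled data: covariate x; s = 1 for trial members, 0 for the population.
x_trial = rng.normal(0.5, 1.0, size=(400, 1))  # trial over-samples higher x
x_pop = rng.normal(0.0, 1.0, size=(4000, 1))
X = np.vstack([x_trial, x_pop])
s = np.r_[np.ones(400), np.zeros(4000)]

p = LogisticRegression().fit(X, s).predict_proba(x_trial)[:, 1]
w = (1 - p) / p            # inverse odds of trial participation

# Hypothetical trial outcomes with effect heterogeneity in x.
treat = rng.integers(0, 2, size=400)
y = treat * (1.0 + 0.5 * x_trial[:, 0]) + rng.normal(size=400)
eff_sample = y[treat == 1].mean() - y[treat == 0].mean()
eff_weighted = (np.average(y[treat == 1], weights=w[treat == 1])
                - np.average(y[treat == 0], weights=w[treat == 0]))
print("sample effect:", eff_sample, "weighted effect:", eff_weighted)
```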
Further developments in cloud statistics for computer simulations
NASA Technical Reports Server (NTRS)
Chang, D. T.; Willand, J. H.
1972-01-01
This study is a part of NASA's continued program to provide global statistics of cloud parameters for computer simulation. The primary emphasis was on the development of the data bank of the global statistical distributions of cloud types and cloud layers and their applications in the simulation of the vertical distributions of in-cloud parameters such as liquid water content. These statistics were compiled from actual surface observations as recorded in Standard WBAN forms. Data for a total of 19 stations were obtained and reduced. These stations were selected to be representative of the 19 primary cloud climatological regions defined in previous studies of cloud statistics. Using the data compiled in this study, a limited study was conducted of the homogeneity of cloud regions, the latitudinal dependence of cloud-type distributions, the dependence of these statistics on sample size, and other factors in the statistics which are of significance to the problem of simulation. The application of the statistics in cloud simulation was investigated. In particular, the inclusion of the new statistics in an expanded multi-step Monte Carlo simulation scheme is suggested and briefly outlined.
Reflexion on linear regression trip production modelling method for ensuring good model quality
NASA Astrophysics Data System (ADS)
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport modelling is important. For certain cases, the conventional model still has to be used, for which having a good trip production model is essential. A good model can only be obtained from a good sample. Two of the basic principles of good sampling are having a sample capable of representing the population characteristics and capable of producing an acceptable error at a certain confidence level. It seems that these principles are not yet well understood and used in trip production modelling. Therefore, it is necessary to investigate trip production modelling practice in Indonesia and to try to formulate a better modelling method for ensuring model quality. The research results are presented as follows. Statistics provides a method for calculating the span of a predicted value at a certain confidence level for linear regression, called the Confidence Interval of the Predicted Value. Common modelling practice uses R² as the principal quality measure, while sampling practice varies and does not always conform to the sampling principles. An experiment indicates that a small sample is already capable of giving an excellent R² value and that the sample composition can significantly change the model. Hence, a good R² value does not always mean good model quality. These observations lead to three basic ideas for ensuring good model quality: reformulating the quality measure, the calculation procedure, and the sampling method. A quality measure is defined as having both a good R² value and a good Confidence Interval of the Predicted Value. The calculation procedure must incorporate the statistical calculation method and the appropriate statistical tests. A good sampling method must incorporate random, well-distributed, stratified sampling with a certain minimum number of samples. These three ideas need to be further developed and tested.
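The Confidence Interval of the Predicted Value invoked above has a closed form for simple linear regression; a minimal Python sketch on hypothetical trip-production data follows.

```python
# Hedged sketch: interval for a new observation in simple linear regression.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.uniform(50, 500, size=30)          # e.g. households per zone
y = 1.2 * x + rng.normal(0, 40, size=30)   # e.g. trips produced per zone

n = len(x)
b1, b0 = np.polyfit(x, y, 1)               # slope, intercept
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))    # residual standard error
sxx = np.sum((x - x.mean())**2)

def prediction_interval(x0, conf=0.95):
    """Interval for a new observation at x0 at the given confidence level."""
    t = stats.t.ppf(0.5 + conf / 2, df=n - 2)
    half = t * s * np.sqrt(1 + 1 / n + (x0 - x.mean())**2 / sxx)
    yhat = b0 + b1 * x0
    return yhat - half, yhat + half

print(prediction_interval(300.0))
```

A narrow interval at the design values of interest, not a high R² alone, is what certifies such a model for prediction.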
The relation between statistical power and inference in fMRI
Wager, Tor D.; Yarkoni, Tal
2017-01-01
Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial—especially in regard to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20–30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resembles the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches.
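The weak-effect power problem is easy to reproduce by simulation; a minimal Python sketch, assuming a hypothetical true brain-behavior correlation of r = 0.2 and n = 25 subjects:

```python
# Hedged sketch: power of a correlation test at a weak true effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
r_true, n, reps = 0.2, 25, 10_000

hits = 0
for _ in range(reps):
    behav = rng.standard_normal(n)
    brain = r_true * behav + np.sqrt(1 - r_true**2) * rng.standard_normal(n)
    r, p = stats.pearsonr(behav, brain)
    hits += p < 0.05
print("estimated power:", hits / reps)  # well below 0.5 for weak effects
```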
ERIC Educational Resources Information Center
Celebuski, Carin; Farris, Elizabeth
This report presents the findings from the "Nutrition Education in Public Schools, K-12" survey that was designed to provide data on the status of nutrition education in U.S. public schools. Questionnaires were sent to 1,000 school principals of a nationally representative sample of U.S. elementary, middle, and high schools. The survey…
The Mental Health of Children and Adolescents with Learning Disabilities in Britain
ERIC Educational Resources Information Center
Emerson, Eric; Hatton, Chris
2007-01-01
A previous project by the Foundation for People with Learning Disabilities analysed data that had been collected by the Office for National Statistics (ONS) in 1999 in their survey of the mental health of children and adolescents in Great Britain. The Foundation found that in this nationally representative sample of just over 10,000 children, 39%…
On the Public-Private School Achievement Debate. Faculty Research Working Papers Series
ERIC Educational Resources Information Center
Peterson, Paul E.; Llaudet, Elena
2006-01-01
On July 14, 2006, the U. S. Department of Education's National Center for Education Statistics (NCES) released a study that compared the performance in reading and math of 4th and 8th-graders attending private and public schools. Using information from a nationwide, representative sample of public and private school students collected in 2003 as…
ERIC Educational Resources Information Center
Lewis, Anne; And Others
If the American schoolhouse symbolizes public concern for children, millions of today's youngsters are receiving a negative message. Based on available statistics and information, and using a representative sample of one-half of the country's public school buildings, this investigation found that 25 percent of the nation's school buildings are…
ERIC Educational Resources Information Center
National Field Research Center Inc., Iowa City, IA.
This report, together with volume I (single degree programs), detail 105 post-secondary wastewater treatment programs from 33 states. These programs represent a sample, only, of the various programs available nationwide. Enrollment and graduate statistics are presented. The total number of faculty involved in all the programs surveyed was 1,106;…
Code of Federal Regulations, 2012 CFR
2012-07-01
... rigorous statistical experimental design and interpretation (Reference 16.4). 14.0 Pollution Prevention 14... fluids. 1.4 This method has been designed to show positive contamination for 5% of representative crude... .1 Sample collection bottles/jars—New, pre-cleaned bottles/jars, lot-certified to be free of artifacts...
Code of Federal Regulations, 2013 CFR
2013-07-01
... rigorous statistical experimental design and interpretation (Reference 16.4). 14.0 Pollution Prevention 14... fluids. 1.4 This method has been designed to show positive contamination for 5% of representative crude... .1 Sample collection bottles/jars—New, pre-cleaned bottles/jars, lot-certified to be free of artifacts...
Code of Federal Regulations, 2014 CFR
2014-07-01
... rigorous statistical experimental design and interpretation (Reference 16.4). 14.0 Pollution Prevention 14... fluids. 1.4 This method has been designed to show positive contamination for 5% of representative crude... .1 Sample collection bottles/jars—New, pre-cleaned bottles/jars, lot-certified to be free of artifacts...
NASA Astrophysics Data System (ADS)
Havens, Timothy C.; Cummings, Ian; Botts, Jonathan; Summers, Jason E.
2017-05-01
The linear ordered statistic (LOS) is a parameterized ordered statistic (OS) that is a weighted average of a rank-ordered sample. LOS operators are useful generalizations of aggregation as they can represent any linear aggregation, from minimum to maximum, including conventional aggregations, such as mean and median. In the fuzzy logic field, these aggregations are called ordered weighted averages (OWAs). Here, we present a method for learning LOS operators from training data, viz., data for which you know the output of the desired LOS. We then extend the learning process with regularization, such that a lower complexity or sparse LOS can be learned. Hence, we discuss what 'lower complexity' means in this context and how to represent that in the optimization procedure. Finally, we apply our learning methods to the well-known constant-false-alarm-rate (CFAR) detection problem, specifically for the case of background levels modeled by long-tailed distributions, such as the K-distribution. These backgrounds arise in several pertinent imaging problems, including the modeling of clutter in synthetic aperture radar and sonar (SAR and SAS) and in wireless communications.
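The LOS/OWA aggregation itself is compact; the Python sketch below shows the operator (the paper's learning and regularization machinery is not reproduced), with particular weight vectors recovering familiar aggregations.

```python
# Hedged sketch: a linear ordered statistic (OWA) aggregation.
import numpy as np

def los(sample, weights):
    """Apply weights to the sample sorted in descending rank order."""
    return float(np.dot(weights, np.sort(sample)[::-1]))

x = np.array([3.0, 9.0, 1.0, 7.0, 5.0])
n = len(x)
print(los(x, np.eye(n)[0]))       # maximum
print(los(x, np.eye(n)[-1]))      # minimum
print(los(x, np.full(n, 1 / n)))  # mean
print(los(x, np.eye(n)[n // 2]))  # median (odd n)
```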
Coherent radar imaging: Signal processing and statistical properties
NASA Astrophysics Data System (ADS)
Woodman, Ronald F.
1997-11-01
The recently developed technique for imaging radar scattering irregularities has opened a great scientific potential for ionospheric and atmospheric coherent radars. These images are obtained by processing the diffraction pattern of the backscattered electromagnetic field at a finite number of sampling points on the ground. In this paper, we review the mathematical relationship between the statistical covariance of these samples, ⟨φφ†⟩, and that of the radiating object field to be imaged, ⟨FF†⟩, in a self-contained and comprehensive way. It is shown that these matrices are related in a linear way by ⟨φφ†⟩ = aM⟨FF†⟩M†a*, where M is a discrete Fourier transform operator and a is a matrix operator representing the discrete and limited sampling of the field. The image, or brightness distribution, is the diagonal of ⟨FF†⟩. The equation can be linearly inverted only in special cases. In most cases, inversion algorithms which make use of a priori information or maximum entropy constraints must be used. A naive (biased) "image" can be estimated in a manner analogous to an optical camera by simply applying an inverse DFT operator to the sampled field φ and evaluating the average power of the elements of the resulting vector F̂. Such a transformation can be obtained either digitally or in an analog way. For the latter we can use a Butler matrix consisting of properly interconnected transmission lines. The case of radar targets in the near field is included as a new contribution. This case involves an additional matrix operator b, which is an analog of an optical lens used to compensate for the curvature of the phase fronts of the backscattered field. This "focusing" can be done after the statistics have been obtained. The formalism is derived for brightness distributions representing total powers. However, the derived expressions have been extended to include "color" images for each of the frequency components of the sampled time series. The frequency filtering is achieved by estimating spectra and cross spectra of the sample time series, in lieu of the power and cross correlations used in the derivation.
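The naive (biased) image estimate described above can be sketched directly; in the Python sketch below, the brightness vector, the random mask standing in for the limited-sampling operator a, and the far-field assumption (no focusing operator b) are all illustrative choices.

```python
# Hedged sketch: naive imaging by inverse DFT of the sampled field.
import numpy as np

rng = np.random.default_rng(4)
M, reps, n_rx = 64, 4000, 16
brightness = np.zeros(M)
brightness[20], brightness[45] = 1.0, 0.5  # two bright scattering regions

mask = np.zeros(M)                         # limited sampling on the ground
mask[rng.choice(M, size=n_rx, replace=False)] = 1.0

image = np.zeros(M)
for _ in range(reps):
    # Object field: zero-mean complex Gaussian with the assumed brightness.
    f = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) * np.sqrt(brightness / 2)
    phi = mask * np.fft.fft(f)             # field at the sampled points only
    image += np.abs(np.fft.ifft(phi))**2   # average power of the inverse DFT
image /= reps
print(np.argsort(image)[-2:])              # peaks recover the bright positions
```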
The coverage of a random sample from a biological community.
Engen, S
1975-03-01
A taxonomic group will frequently have a large number of species with small abundances. When a sample is drawn at random from this group, one is therefore faced with the problem that a large proportion of the species will not be discovered. A general definition of quantitative measures of "sample coverage" is proposed, and the problem of statistical inference is considered for two special cases, (1) the actual total relative abundance of those species that are represented in the sample, and (2) their relative contribution to the information index of diversity. The analysis is based on an extended version of the negative binomial species frequency model. The results are tabulated.
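For case (1), a widely used special-case estimator, often attributed to Turing and Good and distinct from the paper's negative binomial treatment, is one minus the fraction of sampled individuals belonging to species seen exactly once; a minimal Python sketch on a synthetic community:

```python
# Hedged sketch: Turing/Good coverage estimate vs the true coverage.
import numpy as np
from collections import Counter

rng = np.random.default_rng(5)
abundances = rng.pareto(1.2, size=300) + 1  # many rare species, few common
probs = abundances / abundances.sum()
sample = rng.choice(len(probs), size=200, p=probs)

counts = Counter(sample)
n = sample.size
f1 = sum(1 for c in counts.values() if c == 1)  # species seen exactly once
coverage_hat = 1 - f1 / n
true_coverage = probs[list(counts)].sum()       # abundance of observed species
print(coverage_hat, true_coverage)
```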
Vining, Kevin C.; Lundgren, Robert F.
2008-01-01
Sixty-five sampling sites, selected by a statistical design to represent lengths of perennial streams in North Dakota, were chosen to be sampled for fish and aquatic insects (macroinvertebrates) to establish unbiased baseline data. Channel catfish and common carp were the most abundant game and large fish species in the Cultivated Plains and Rangeland Plains, respectively. Blackflies were present in more than 50 percent of stream lengths sampled in the State; mayflies and caddisflies were present in more than 80 percent. Dragonflies were present in a greater percentage of stream lengths in the Rangeland Plains than in the Cultivated Plains.
TNO/Centaurs grouping tested with asteroid data sets
NASA Astrophysics Data System (ADS)
Fulchignoni, M.; Birlan, M.; Barucci, M. A.
2001-11-01
Recently, we discussed the possible subdivision into a few groups of a sample of 22 TNOs and Centaurs for which BVRIJ photometry was available (Barucci et al., 2001, A&A, 371, 1150). We obtained these results using the multivariate statistics adopted to define the current asteroid taxonomy, namely Principal Components Analysis and the G-mode method (Tholen & Barucci, 1989, in ASTEROIDS II). How do these methods work with a very small statistical sample such as the TNO/Centaurs one? Theoretically, the number of degrees of freedom of the sample is adequate: it is 88 in our case, and it has to be larger than 50 to cope with the requirements of the G-mode. Does the random sampling of a small number of members of a large population contain enough information to reveal some structure in the population? We extracted several samples of 22 asteroids out of a data-base of 86 objects of known taxonomic type for which BVRIJ photometry is available from ECAS (Zellner et al., 1985, ICARUS 61, 355), SMASS II (S.W. Bus, 1999, PhD Thesis, MIT), and the Bell et al. Atlas of the asteroid infrared spectra. The objects constituting the first sample were selected in order to give a good representation of the major asteroid taxonomic classes (at least three objects per class): C, S, D, A, and G. Both methods were able to distinguish all these groups, confirming the validity of the adopted methods. The S class is hard to single out as a consequence of the choice of the I and J variables, which implies a lack of information on the absorption band at 1 micron. The other samples were obtained by random choice of the objects. Not all the major groups were well represented (fewer than three objects per group), but the general trend of the asteroid taxonomy was always obtained. We conclude that the quoted grouping of TNO/Centaurs is representative of some physico-chemical structure of the outer solar system small body population.
Statistical auditing and randomness test of lotto k/N-type games
NASA Astrophysics Data System (ADS)
Coronel-Brizio, H. F.; Hernández-Montoya, A. R.; Rapallo, F.; Scalas, E.
2008-11-01
One of the most popular lottery games worldwide is the so-called “lotto k/N”. It considers N numbers 1,2,…,N from which k are drawn randomly, without replacement. A player selects k or more numbers and the first prize is shared amongst those players whose selected numbers match all of the k randomly drawn. Exact rules may vary in different countries. In this paper, mean values and covariances for the random variables representing the numbers drawn from this kind of game are presented, with the aim of using them to audit statistically the consistency of a given sample of historical results with theoretical values coming from a hypergeometric statistical model. The method can be adapted to test pseudorandom number generators.
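For k numbers drawn without replacement from {1, ..., N}, the exact values are E[X] = (N+1)/2, Var(X) = (N²-1)/12, and Cov(Xi, Xj) = -(N+1)/12 for i ≠ j; the Python sketch below checks simulated draws, standing in for a file of historical results, against these values.

```python
# Hedged sketch: auditing lotto draws against hypergeometric theory.
import numpy as np

rng = np.random.default_rng(6)
N, k, n_draws = 49, 6, 50_000
draws = np.array([rng.choice(np.arange(1, N + 1), size=k, replace=False)
                  for _ in range(n_draws)])

print("mean:", draws.mean(), "theory:", (N + 1) / 2)
print("var:", draws.var(), "theory:", (N**2 - 1) / 12)
cov = np.cov(draws[:, 0], draws[:, 1])[0, 1]  # positions are exchangeable
print("cov:", cov, "theory:", -(N + 1) / 12)
```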
Geospatial techniques for developing a sampling frame of watersheds across a region
Gresswell, Robert E.; Bateman, Douglas S.; Lienkaemper, George; Guy, T.J.
2004-01-01
Current land-management decisions that affect the persistence of native salmonids are often influenced by studies of individual sites that are selected based on judgment and convenience. Although this approach is useful for some purposes, extrapolating results to areas that were not sampled is statistically inappropriate because the sampling design is usually biased. Therefore, in recent investigations of coastal cutthroat trout (Oncorhynchus clarki clarki) located above natural barriers to anadromous salmonids, we used a methodology for extending the statistical scope of inference. The purpose of this paper is to apply geospatial tools to identify a population of watersheds and develop a probability-based sampling design for coastal cutthroat trout in western Oregon, USA. The population of mid-size watersheds (500-5800 ha) west of the Cascade Range divide was derived from watershed delineations based on digital elevation models. Because a database with locations of isolated populations of coastal cutthroat trout did not exist, a sampling frame of isolated watersheds containing cutthroat trout had to be developed. After the sampling frame of watersheds was established, isolated watersheds with coastal cutthroat trout were stratified by ecoregion and erosion potential based on dominant bedrock lithology (i.e., sedimentary and igneous). A stratified random sample of 60 watersheds was selected with proportional allocation in each stratum. By comparing watershed drainage areas of streams in the general population to those in the sampling frame and the resulting sample (n = 60), we were able to evaluate how representative the subset of watersheds was in relation to the population of watersheds. Geospatial tools provided a relatively inexpensive means to generate the information necessary to develop a statistically robust, probability-based sampling design.
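The proportional-allocation step reads compactly in code; in the Python sketch below the watershed frame, stratum labels, and sizes are hypothetical, not the study's data (rounding may shift the total by a unit or two).

```python
# Hedged sketch: stratified random sampling with proportional allocation.
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)
frame = pd.DataFrame({
    "watershed_id": np.arange(500),
    "ecoregion": rng.choice(["coastal", "interior"], size=500),
    "lithology": rng.choice(["sedimentary", "igneous"], size=500),
})

n_total = 60
strata = frame.groupby(["ecoregion", "lithology"])
# Allocate the sample to each stratum proportionally to its frame share.
alloc = (strata.size() / len(frame) * n_total).round().astype(int)
sample = pd.concat(g.sample(n=alloc[key], random_state=1)
                   for key, g in strata)
print(alloc)
print(len(sample))
```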
Are atmospheric surface layer flows ergodic?
NASA Astrophysics Data System (ADS)
Higgins, Chad W.; Katul, Gabriel G.; Froidevaux, Martin; Simeonov, Valentin; Parlange, Marc B.
2013-06-01
The transposition of atmospheric turbulence statistics from the time domain, as conventionally sampled in field experiments, is explained by the so-called ergodic hypothesis. In micrometeorology, this hypothesis assumes that the time average of a measured flow variable represents an ensemble of independent realizations from similar meteorological states and boundary conditions. That is, the averaging duration must be sufficiently long to include a large number of independent realizations of the sampled flow variable so as to represent the ensemble. While the validity of the ergodic hypothesis for turbulence has been confirmed in laboratory experiments, and numerical simulations for idealized conditions, evidence for its validity in the atmospheric surface layer (ASL), especially for nonideal conditions, continues to defy experimental efforts. There is some urgency to make progress on this problem given the proliferation of tall tower scalar concentration networks that are aimed at constraining climate models yet are impacted by nonideal conditions at the land surface. Recent advancements in water vapor concentration lidar measurements that simultaneously sample spatial and temporal series in the ASL are used to investigate the validity of the ergodic hypothesis for the first time. It is shown that ergodicity is valid in a strict sense above uniform surfaces away from abrupt surface transitions. Surprisingly, ergodicity may be used to infer the ensemble concentration statistics of a composite grass-lake system using only water vapor concentration measurements collected above the sharp transition delineating the lake from the grass surface.
Suglia, Shakira F; Pamplin, John R; Forde, Allana T; Shelton, Rachel C
2017-10-01
Prior studies examining the association between perceived stress and adiposity have reported mixed findings, and sex differences have largely not been examined. We examined the relationship between perceived stress and body mass index (BMI) and waist circumference in young adults in the National Longitudinal Study of Adolescent to Adult Health. Participants (mean age 29 years; N = 14,044) completed the short form of Cohen's Perceived Stress Scale during a home visit. Height, weight, and waist circumference were assessed during the same visit. BMI was calculated based on measured height and weight. In the sample, 52% were male and 65% were identified as white. In adjusted linear regression analyses, a sex by Perceived Stress Scale interaction was noted (P < .05) for both BMI and waist circumference. Perceived stress was statistically significantly associated with lower BMI (β: -0.09; standard error [SE]: 0.05) and was associated with lower waist circumference, although not statistically significant (β: -0.18; SE: 0.10) among men. No associations were noted among women. In this nationally representative sample of young adults, perceived stress was associated with lower levels of adiposity among men. Noted differences could be attributed to different behavioral and coping strategies in response to stress between men and women as well as biological mechanisms which should be explored further. Copyright © 2017 Elsevier Inc. All rights reserved.
Modelling the CO emission in southern Bok globules
NASA Astrophysics Data System (ADS)
Cecchi-Pestellini, Cesare; Casu, Silvia; Scappini, Flavio
2001-10-01
The analysis of the sample of southern globules investigated by Scappini et al. in the CO (4-3) transition has been extended using a statistical equilibrium-radiative transfer model and making use of the results of Bourke et al. and Henning & Launardt for those globules which are in common among these samples. CO column densities and excitation temperatures have been calculated and the results compared with a chemical model representative of the chemistry of a spherical dark cloud. In a number of cases the gas kinetic temperatures have been constrained.
Hargreaves, James R; Fearon, Elizabeth; Davey, Calum; Phillips, Andrew; Cambiano, Valentina; Cowan, Frances M
2016-01-05
Pragmatic cluster-randomised trials should seek to make unbiased estimates of effect and be reported according to CONSORT principles, and the study population should be representative of the target population. This is challenging when conducting trials amongst 'hidden' populations without a sample frame. We describe a pair-matched cluster-randomised trial of a combination HIV-prevention intervention to reduce the proportion of female sex workers (FSW) with a detectable HIV viral load in Zimbabwe, recruiting via respondent driven sampling (RDS). We will cross-sectionally survey approximately 200 FSW at baseline and at endline to characterise each of 14 sites. RDS is a variant of chain referral sampling and has been adapted to approximate random sampling. Primary analysis will use the 'RDS-2' method to estimate cluster summaries and will adapt Hayes and Moulton's '2-step' method to adjust effect estimates for individual-level confounders and further adjust for cluster baseline prevalence. We will adapt CONSORT to accommodate RDS. In the absence of observable refusal rates, we will compare the recruitment process between matched pairs. We will need to investigate whether cluster-specific recruitment or the intervention itself affects the accuracy of the RDS estimation process, potentially causing differential biases. To do this, we will calculate RDS-diagnostic statistics for each cluster at each time point and compare these statistics within matched pairs and time points. Sensitivity analyses will assess the impact of potential biases arising from assumptions made by the RDS-2 estimation. We are not aware of any other completed pragmatic cluster RCTs that are recruiting participants using RDS. Our statistical design and analysis approach seeks to transparently document participant recruitment and allow an assessment of the representativeness of the study to the target population, a key aspect of pragmatic trials. The challenges we have faced in the design of this trial are likely to be shared in other contexts aiming to serve the needs of legally and/or socially marginalised populations for which no sampling frame exists and especially when the social networks of participants are both the target of intervention and the means of recruitment. The trial was registered at Pan African Clinical Trials Registry (PACTR201312000722390) on 9 December 2013.
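The RDS-2 estimator named above is commonly written RDS-II (the Volz-Heckathorn estimator) and weights each respondent inversely to reported network degree; a minimal Python sketch on synthetic data, omitting the trial's cluster-summary and matched-pair machinery:

```python
# Hedged sketch: RDS-II (inverse-degree weighted) prevalence estimate.
import numpy as np

rng = np.random.default_rng(9)
degree = rng.integers(1, 30, size=200)  # reported personal network sizes
outcome = rng.uniform(size=200) < 0.25  # e.g. detectable viral load

w = 1.0 / degree                        # inverse-degree weights
rds2 = np.sum(w * outcome) / np.sum(w)
print("naive proportion:", outcome.mean())
print("RDS-II estimate:", rds2)
```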
Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Alhusseini, Tamera I; Bedford, Felicity E; Bennett, Dominic J; Booth, Hollie; Burton, Victoria J; Chng, Charlotte W T; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Emerson, Susan R; Gao, Di; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; Pask-Hale, Gwilym D; Pynegar, Edwin L; Robinson, Alexandra N; Sanchez-Ortiz, Katia; Senior, Rebecca A; Simmons, Benno I; White, Hannah J; Zhang, Hanbin; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Albertos, Belén; Alcala, E L; Del Mar Alguacil, Maria; Alignier, Audrey; Ancrenaz, Marc; Andersen, Alan N; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Arroyo-Rodríguez, Víctor; Aumann, Tom; Axmacher, Jan C; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Bakayoko, Adama; Báldi, András; Banks, John E; Baral, Sharad K; Barlow, Jos; Barratt, Barbara I P; Barrico, Lurdes; Bartolommei, Paola; Barton, Diane M; Basset, Yves; Batáry, Péter; Bates, Adam J; Baur, Bruno; Bayne, Erin M; Beja, Pedro; Benedick, Suzan; Berg, Åke; Bernard, Henry; Berry, Nicholas J; Bhatt, Dinesh; Bicknell, Jake E; Bihn, Jochen H; Blake, Robin J; Bobo, Kadiri S; Bóçon, Roberto; Boekhout, Teun; Böhning-Gaese, Katrin; Bonham, Kevin J; Borges, Paulo A V; Borges, Sérgio H; Boutin, Céline; Bouyer, Jérémy; Bragagnolo, Cibele; Brandt, Jodi S; Brearley, Francis Q; Brito, Isabel; Bros, Vicenç; Brunet, Jörg; Buczkowski, Grzegorz; Buddle, Christopher M; Bugter, Rob; Buscardo, Erika; Buse, Jörn; Cabra-García, Jimmy; Cáceres, Nilton C; Cagle, Nicolette L; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Caparrós, Rut; Cardoso, Pedro; Carpenter, Dan; Carrijo, Tiago F; Carvalho, Anelena L; Cassano, Camila R; Castro, Helena; Castro-Luna, Alejandro A; Rolando, Cerda B; Cerezo, Alexis; Chapman, Kim Alan; Chauvat, Matthieu; Christensen, Morten; Clarke, Francis M; Cleary, Daniel F R; Colombo, Giorgio; Connop, Stuart P; Craig, Michael D; Cruz-López, Leopoldo; Cunningham, Saul A; D'Aniello, Biagio; D'Cruze, Neil; da Silva, Pedro Giovâni; Dallimer, Martin; Danquah, Emmanuel; Darvill, Ben; Dauber, Jens; Davis, Adrian L V; Dawson, Jeff; de Sassi, Claudio; de Thoisy, Benoit; Deheuvels, Olivier; Dejean, Alain; Devineau, Jean-Louis; Diekötter, Tim; Dolia, Jignasu V; Domínguez, Erwin; Dominguez-Haydar, Yamileth; Dorn, Silvia; Draper, Isabel; Dreber, Niels; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Eggleton, Paul; Eigenbrod, Felix; Elek, Zoltán; Entling, Martin H; Esler, Karen J; de Lima, Ricardo F; Faruk, Aisyah; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Fensham, Roderick J; Fernandez, Ignacio C; Ferreira, Catarina C; Ficetola, Gentile F; Fiera, Cristina; Filgueiras, Bruno K C; Fırıncıoğlu, Hüseyin K; Flaspohler, David; Floren, Andreas; Fonte, Steven J; Fournier, Anne; Fowler, Robert E; Franzén, Markus; Fraser, Lauchlan H; Fredriksson, Gabriella M; Freire, Geraldo B; Frizzo, Tiago L M; Fukuda, Daisuke; Furlani, Dario; Gaigher, René; Ganzhorn, Jörg U; García, Karla P; Garcia-R, Juan C; Garden, Jenni G; Garilleti, Ricardo; Ge, Bao-Ming; Gendreau-Berthiaume, Benoit; Gerard, Philippa J; Gheler-Costa, Carla; Gilbert, Benjamin; Giordani, Paolo; Giordano, Simonetta; Golodets, Carly; Gomes, Laurens G L; Gould, Rachelle K; Goulson, Dave; Gove, Aaron D; Granjon, Laurent; Grass, Ingo; 
Gray, Claudia L; Grogan, James; Gu, Weibin; Guardiola, Moisès; Gunawardene, Nihara R; Gutierrez, Alvaro G; Gutiérrez-Lamus, Doris L; Haarmeyer, Daniela H; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hassan, Shombe N; Hatfield, Richard G; Hawes, Joseph E; Hayward, Matt W; Hébert, Christian; Helden, Alvin J; Henden, John-André; Henschel, Philipp; Hernández, Lionel; Herrera, James P; Herrmann, Farina; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Höfer, Hubert; Hoffmann, Anke; Horgan, Finbarr G; Hornung, Elisabeth; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishida, Hiroaki; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Hernández, F Jiménez; Johnson, McKenzie F; Jolli, Virat; Jonsell, Mats; Juliani, S Nur; Jung, Thomas S; Kapoor, Vena; Kappes, Heike; Kati, Vassiliki; Katovai, Eric; Kellner, Klaus; Kessler, Michael; Kirby, Kathryn R; Kittle, Andrew M; Knight, Mairi E; Knop, Eva; Kohler, Florian; Koivula, Matti; Kolb, Annette; Kone, Mouhamadou; Kőrösi, Ádám; Krauss, Jochen; Kumar, Ajith; Kumar, Raman; Kurz, David J; Kutt, Alex S; Lachat, Thibault; Lantschner, Victoria; Lara, Francisco; Lasky, Jesse R; Latta, Steven C; Laurance, William F; Lavelle, Patrick; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Lehouck, Valérie; Lencinas, María V; Lentini, Pia E; Letcher, Susan G; Li, Qi; Litchwark, Simon A; Littlewood, Nick A; Liu, Yunhui; Lo-Man-Hung, Nancy; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Luskin, Matthew S; MacSwiney G, M Cristina; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Malone, Louise A; Malonza, Patrick K; Malumbres-Olarte, Jagoba; Mandujano, Salvador; Måren, Inger E; Marin-Spiotta, Erika; Marsh, Charles J; Marshall, E J P; Martínez, Eliana; Martínez Pastur, Guillermo; Moreno Mateos, David; Mayfield, Margaret M; Mazimpaka, Vicente; McCarthy, Jennifer L; McCarthy, Kyle P; McFrederick, Quinn S; McNamara, Sean; Medina, Nagore G; Medina, Rafael; Mena, Jose L; Mico, Estefania; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Miranda-Esquivel, Daniel R; Moir, Melinda L; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Mudri-Stojnic, Sonja; Munira, A Nur; Muoñz-Alonso, Antonio; Munyekenye, B F; Naidoo, Robin; Naithani, A; Nakagawa, Michiko; Nakamura, Akihiro; Nakashima, Yoshihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Navarro-Iriarte, Luis; Ndang'ang'a, Paul K; Neuschulz, Eike L; Ngai, Jacqueline T; Nicolas, Violaine; Nilsson, Sven G; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Norton, David A; Nöske, Nicole M; Nowakowski, A Justin; Numa, Catherine; O'Dea, Niall; O'Farrell, Patrick J; Oduro, William; Oertli, Sabine; Ofori-Boateng, Caleb; Oke, Christopher Omamoke; Oostra, Vicencio; Osgathorpe, Lynne M; Otavo, Samuel Eduardo; Page, Navendu V; Paritsis, Juan; Parra-H, Alejandro; Parry, Luke; Pe'er, Guy; Pearman, Peter B; Pelegrin, Nicolás; Pélissier, Raphaël; Peres, Carlos A; Peri, Pablo L; Persson, Anna S; Petanidou, Theodora; Peters, Marcell K; Pethiyagoda, Rohan S; Phalan, Ben; Philips, T Keith; Pillsbury, Finn C; Pincheira-Ulbrich, Jimmy; Pineda, Eduardo; Pino, Joan; Pizarro-Araya, Jaime; Plumptre, A J; Poggio, Santiago L; Politi, Natalia; Pons, Pere; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Rader, Romina; Ramesh, B R; Ramirez-Pinilla, Martha P; Ranganathan, Jai; Rasmussen, Claus; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; 
Rey Benayas, José M; Rey-Velasco, Juan Carlos; Reynolds, Chevonne; Ribeiro, Danilo Bandini; Richards, Miriam H; Richardson, Barbara A; Richardson, Michael J; Ríos, Rodrigo Macip; Robinson, Richard; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rös, Matthias; Rosselli, Loreta; Rossiter, Stephen J; Roth, Dana S; Roulston, T'ai H; Rousseau, Laurent; Rubio, André V; Ruel, Jean-Claude; Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Sam, Katerina; Samnegård, Ulrika; Santana, Joana; Santos, Xavier; Savage, Jade; Schellhorn, Nancy A; Schilthuizen, Menno; Schmiedel, Ute; Schmitt, Christine B; Schon, Nicole L; Schüepp, Christof; Schumann, Katharina; Schweiger, Oliver; Scott, Dawn M; Scott, Kenneth A; Sedlock, Jodi L; Seefeldt, Steven S; Shahabuddin, Ghazala; Shannon, Graeme; Sheil, Douglas; Sheldon, Frederick H; Shochat, Eyal; Siebert, Stefan J; Silva, Fernando A B; Simonetti, Javier A; Slade, Eleanor M; Smith, Jo; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Soto Quiroga, Grimaldo; St-Laurent, Martin-Hugues; Starzomski, Brian M; Stefanescu, Constanti; Steffan-Dewenter, Ingolf; Stouffer, Philip C; Stout, Jane C; Strauch, Ayron M; Struebig, Matthew J; Su, Zhimin; Suarez-Rubio, Marcela; Sugiura, Shinji; Summerville, Keith S; Sung, Yik-Hei; Sutrisno, Hari; Svenning, Jens-Christian; Teder, Tiit; Threlfall, Caragh G; Tiitsaar, Anu; Todd, Jacqui H; Tonietto, Rebecca K; Torre, Ignasi; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; Tylianakis, Jason M; Uehara-Prado, Marcio; Urbina-Cardona, Nicolas; Vallan, Denis; Vanbergen, Adam J; Vasconcelos, Heraldo L; Vassilev, Kiril; Verboven, Hans A F; Verdasca, Maria João; Verdú, José R; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Virgilio, Massimiliano; Vu, Lien Van; Waite, Edward M; Walker, Tony R; Wang, Hua-Feng; Wang, Yanping; Watling, James I; Weller, Britta; Wells, Konstans; Westphal, Catrin; Wiafe, Edward D; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Wolters, Volkmar; Woodcock, Ben A; Wu, Jihua; Wunderle, Joseph M; Yamaura, Yuichi; Yoshikura, Satoko; Yu, Douglas W; Zaitsev, Andrey S; Zeidler, Juliane; Zou, Fasheng; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy
2017-01-01
The PREDICTS project-Projecting Responses of Ecological Diversity In Changing Terrestrial Systems (www.predicts.org.uk)-has collated from published studies a large, reasonably representative database of comparable samples of biodiversity from multiple sites that differ in the nature or intensity of human impacts relating to land use. We have used this evidence base to develop global and regional statistical models of how local biodiversity responds to these measures. We describe and make freely available this 2016 release of the database, containing more than 3.2 million records sampled at over 26,000 locations and representing over 47,000 species. We outline how the database can help in answering a range of questions in ecology and conservation biology. To our knowledge, this is the largest and most geographically and taxonomically representative database of spatial comparisons of biodiversity that has been collated to date; it will be useful to researchers and international efforts wishing to model and understand the global status of biodiversity.
Zhao, Qi; Liu, Yuanning; Zhang, Ning; Hu, Menghan; Zhang, Hao; Joshi, Trupti; Xu, Dong
2018-01-01
In recent years, an increasing number of studies have reported the presence of plant miRNAs in human samples, which resulted in a hypothesis asserting the existence of plant-derived exogenous microRNA (xenomiR). However, this hypothesis is not widely accepted in the scientific community due to possible sample contamination and small sample sizes lacking rigorous statistical analysis. This study provides a systematic statistical test that can validate (or invalidate) the plant-derived xenomiR hypothesis by analyzing 388 small RNA sequencing datasets from human samples in 11 types of body fluids/tissues. A total of 166 types of plant miRNAs were found in at least one human sample, of which 14 plant miRNAs represented more than 80% of the total plant miRNA abundance in human samples. Plant miRNA profiles were found to be tissue-specific in different human samples. Meanwhile, the plant miRNAs identified from the microbiome had negligible abundance compared to those from humans, while plant miRNA profiles in human samples were significantly different from those in plants, suggesting that sample contamination is an unlikely explanation for all the plant miRNAs detected in human samples. This study also provides a set of testable synthetic miRNAs with isotopes that can be detected in situ after being fed to animals.
Regional surnames and genetic structure in Great Britain.
Kandt, Jens; Cheshire, James A; Longley, Paul A
2016-10-01
Following the increasing availability of DNA-sequenced data, the genetic structure of populations can now be inferred and studied in unprecedented detail. Across social science, this innovation is shaping new bio-social research agendas, attracting substantial investment in the collection of genetic, biological and social data for large population samples. Yet genetic samples are special because the precise populations that they represent are uncertain and ill-defined. Unlike most social surveys, a genetic sample's representativeness of the population cannot be established by conventional procedures of statistical inference, and the implications for population-wide generalisations about bio-social phenomena are little understood. In this paper, we seek to address these problems by linking surname data to a censored and geographically uneven sample of DNA scans, collected for the People of the British Isles study. Based on a combination of global and local spatial correspondence measures, we identify eight regions in Great Britain that are most likely to represent the geography of genetic structure of Great Britain's long-settled population. We discuss the implications of this regionalisation for bio-social investigations. We conclude that, as the often highly selective collection of DNA and biomarkers becomes a more common practice, geography is crucial to understanding variation in genetic information within diverse populations.
Self-averaging and weak ergodicity breaking of diffusion in heterogeneous media
NASA Astrophysics Data System (ADS)
Russian, Anna; Dentz, Marco; Gouze, Philippe
2017-08-01
Diffusion in natural and engineered media is quantified in terms of stochastic models for the heterogeneity-induced fluctuations of particle motion. However, fundamental properties such as ergodicity and self-averaging and their dependence on the disorder distribution are often not known. Here, we investigate these questions for diffusion in quenched disordered media characterized by spatially varying retardation properties, which account for particle retention due to physical or chemical interactions with the medium. We link self-averaging and ergodicity to the disorder sampling efficiency Rn, which quantifies the number of disorder realizations a noise ensemble may sample in a single disorder realization. Diffusion for disorder scenarios characterized by a finite mean transition time is ergodic and self-averaging for any dimension. The strength of the sample-to-sample fluctuations decreases with increasing spatial dimension. For an infinite mean transition time, particle motion exhibits weak ergodicity breaking in any dimension because single particles cannot sample the heterogeneity spectrum in finite time. However, even though the noise ensemble is not representative of the single-particle time statistics, subdiffusive motion in q ≥ 2 dimensions is self-averaging, which means that the noise ensemble in a single realization samples a representative part of the heterogeneity spectrum.
Signal Sampling for Efficient Sparse Representation of Resting State FMRI Data
Ge, Bao; Makkie, Milad; Wang, Jin; Zhao, Shijie; Jiang, Xi; Li, Xiang; Lv, Jinglei; Zhang, Shu; Zhang, Wei; Han, Junwei; Guo, Lei; Liu, Tianming
2015-01-01
As the size of brain imaging data such as fMRI grows explosively, it provides us with unprecedented and abundant information about the brain. How to reduce the size of fMRI data without losing much information becomes a more and more pressing issue. Recent literature studies tried to deal with it by dictionary learning and sparse representation methods; however, their computational complexity is still high, which hampers the wider application of sparse representation methods to large-scale fMRI datasets. To effectively address this problem, this work proposes to represent resting state fMRI (rs-fMRI) signals of a whole brain via a statistical sampling based sparse representation. First, we sampled the whole brain's signals via different sampling methods; then the sampled signals were aggregated into an input data matrix to learn a dictionary; finally, this dictionary was used to sparsely represent the whole brain's signals and identify the resting state networks. Comparative experiments demonstrate that the proposed signal sampling framework can speed up the reconstruction of concurrent brain networks by a factor of ten without losing much information. The experiments on the 1000 Functional Connectomes Project further demonstrate its effectiveness and superiority. PMID:26646924
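The sampling-then-coding pipeline above lends itself to a compact sketch. The following Python fragment is illustrative only, not the authors' code: it assumes a hypothetical voxel-by-time matrix X, uses scikit-learn's MiniBatchDictionaryLearning in place of the paper's dictionary learning, and all sizes (voxel count, sample fraction, dictionary rank, sparsity penalty) are placeholder values.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)

# Hypothetical stand-in for a whole-brain rs-fMRI signal matrix:
# one row per voxel time series (n_voxels x n_timepoints).
X = rng.standard_normal((20000, 240))

# Step 1: random sampling -- keep only a fraction of the voxels.
sample_idx = rng.choice(X.shape[0], size=2000, replace=False)
X_sample = X[sample_idx]

# Step 2: learn the dictionary on the subsample only; this is the
# expensive step, so it now scales with 2,000 rows instead of 20,000.
dico = MiniBatchDictionaryLearning(n_components=50, alpha=1.0,
                                   batch_size=256, random_state=0)
dico.fit(X_sample)

# Step 3: sparse-code *all* voxel signals with the learned dictionary;
# the coefficient matrix is the compact whole-brain representation.
codes = dico.transform(X)
print(codes.shape)          # (20000, 50)
print(np.mean(codes != 0))  # sparsity of the representation
```

Confining the costly learning step to the subsample while still coding every voxel is what produces the order-of-magnitude speed-up the abstract reports, provided the subsample is representative.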
NASA Astrophysics Data System (ADS)
Gomo, M.; Vermeulen, D.
2015-03-01
An investigation was conducted to statistically compare the influence of non-purging and purging groundwater sampling methods on analysed inorganic chemistry parameters and calculated saturation indices. Groundwater samples were collected from 15 monitoring wells drilled in Karoo aquifers before and after purging for the comparative study. For the non-purging method, samples were collected from groundwater flow zones located in the wells using electrical conductivity (EC) profiling. The two data sets of non-purged and purged groundwater samples were analysed for inorganic chemistry parameters at the Institute of Groundwater Studies (IGS) laboratory of the University of the Free State in South Africa. Saturation indices for mineral phases that were found in the database of the PHREEQC hydrogeochemical model were calculated for each data set. Four one-way ANOVA tests were conducted using Microsoft Excel 2007 to investigate whether there is any statistically significant difference between: (1) all inorganic chemistry parameters measured in the non-purged and purged groundwater samples for each specific well, (2) all mineral saturation indices calculated for the non-purged and purged groundwater samples for each specific well, (3) individual inorganic chemistry parameters measured in the non-purged and purged groundwater samples across all wells and (4) individual mineral saturation indices calculated for non-purged and purged groundwater samples across all wells. For all the ANOVA tests conducted, the calculated p-values are greater than the 0.05 significance level and the test statistic (F) is less than the critical value (Fcrit) (F < Fcrit). The results imply that there was no statistically significant difference between the two data sets. With 95% confidence, it was therefore concluded that the variance between groups was due to random chance rather than to the influence of the sampling methods (tested factor). It may therefore be possible that in some hydrogeologic conditions, non-purged groundwater samples are just as representative as purged ones. The findings of this study can provide an important platform for future evidence-oriented research investigations to establish the necessity of purging prior to groundwater sampling in different aquifer systems.
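For orientation, the decision rule described above (p > 0.05 and F < Fcrit) can be reproduced with a minimal one-way ANOVA in Python; the study itself used Microsoft Excel 2007, and the concentration values below are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements of one inorganic parameter (mg/L)
# from the same wells: EC-profiled (non-purged) versus purged samples.
non_purged = np.array([112.0, 98.5, 130.2, 75.4, 101.3, 88.9])
purged     = np.array([109.4, 101.2, 127.8, 78.1, 99.7, 91.0])

f_stat, p_value = stats.f_oneway(non_purged, purged)
f_crit = stats.f.ppf(0.95, dfn=1, dfd=len(non_purged) + len(purged) - 2)

print(f"F = {f_stat:.3f}, Fcrit = {f_crit:.3f}, p = {p_value:.3f}")
if p_value > 0.05 and f_stat < f_crit:
    print("no statistically significant difference between methods")
```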
Mota, Natalie P; Medved, Maria; Whitney, Debbie; Hiebert-Murphy, Diane; Sareen, Jitender
2013-10-01
Although military interest in promoting psychological resilience is growing, resources protective against psychopathology have been understudied in female service members. Using a representative sample of Canadian Forces personnel, we investigated whether religious attendance, spirituality, coping, and social support were related to mental disorders and psychological distress in female service members, and whether sex differences occurred in these associations. Religious attendance and spirituality were self-reported. Coping items were taken from 3 scales and produced 3 factors (active, avoidance, and self-medication). Social support was assessed with the Medical Outcomes Study Social Support Survey. Past-year mental disorders were diagnosed with the World Mental Health Composite International Diagnostic Interview. The Kessler Psychological Distress Scale assessed distress. Multivariate regression models investigated links between correlates and psychological outcomes within each sex. For associations that were statistically significant in only one sex, sex by correlate interactions were computed. In female service members, inverse relations were found between social support and major depressive disorder (MDD), any MDD or anxiety disorder, suicidal ideation, and distress. No associations were found between religious attendance and outcomes, and spirituality was associated with an increased likelihood of some outcomes. Active coping was related to less psychological distress, while avoidance coping and self-medication were linked to a higher likelihood of most outcomes. Although several statistically significant associations were found in only one sex, only one sex by correlate interaction was statistically significant. Social support was found to be inversely related to several negative mental health outcomes in female service members. Few differences between men and women reached statistical significance. Future research should identify additional helpful resources for female service members.
The Complete Local-Volume Groups Sample (CLoGS): Early results from X-ray and radio observations
NASA Astrophysics Data System (ADS)
Vrtilek, Jan M.; O'Sullivan, Ewan; David, Laurence P.; Giacintucci, Simona; Kolokythas, Konstantinos
2017-08-01
Although the group environment is the dominant locus of galaxy evolution (in contrast to rich clusters, which contain only a few percent of galaxies), there has been a lack of reliable, representative group samples in the local Universe. In particular, X-ray selected samples are strongly biased in favor of the X-ray bright, centrally-concentrated cool-core systems. In response, we have designed the Complete Local-Volume Groups Sample (CLoGS), an optically-selected statistically-complete sample of 53 groups within 80 Mpc which is intended to overcome the limitations of X-ray selected samples and serve as a representative survey of groups in the local Universe. We have supplemented X-ray data from Chandra and XMM (70% complete to date, using both archival and new observations, with a 26-group high-richness subsample 100% complete) with GMRT radio continuum observations (at 235 and 610 MHz, complete for the entire sample). CLoGS includes groups with a wide variety of properties in terms of galaxy population, hot gas content, and AGN power. Here we describe early results from the survey, including the range of AGN activity observed in the dominant galaxies, the relative fraction of cool-core and non-cool-core groups in our sample, and the degree of disturbance observed in the IGM.
The Study on Mental Health at Work: Design and sampling.
Rose, Uwe; Schiel, Stefan; Schröder, Helmut; Kleudgen, Martin; Tophoven, Silke; Rauch, Angela; Freude, Gabriele; Müller, Grit
2017-08-01
The Study on Mental Health at Work (S-MGA) generates the first nationwide representative survey enabling the exploration of the relationship between working conditions, mental health and functioning. This paper describes the study design, sampling procedures and data collection, and presents a summary of the sample characteristics. S-MGA is a representative study of German employees aged 31-60 years subject to social security contributions. The sample was drawn from the employment register based on a two-stage cluster sampling procedure. Firstly, 206 municipalities were randomly selected from a pool of 12,227 municipalities in Germany. Secondly, 13,590 addresses were drawn from the selected municipalities for the purpose of conducting 4500 face-to-face interviews. The questionnaire covers psychosocial working and employment conditions, measures of mental health, work ability and functioning. Data from personal interviews were combined with employment histories from register data. Descriptive statistics of socio-demographic characteristics and logistic regression analyses were used for comparing population, gross sample and respondents. In total, 4511 face-to-face interviews were conducted. A test for sampling bias revealed that individuals in older cohorts participated more often, while individuals with an unknown educational level, residing in major cities or with a non-German ethnic background were slightly underrepresented. There is no indication of major deviations in characteristics between the basic population and the sample of respondents. Hence, S-MGA provides representative data for research on work and health, designed as a cohort study with plans to rerun the survey 5 years after the first assessment.
ERIC Educational Resources Information Center
Ali, Sundus Muhsin; Hussein, Khalid Shakir
2014-01-01
This paper presents an attempt to verify the comparative power of two statistical features: Type/Token, and Hapax legomena/Token ratios (henceforth TTR and HTR). A corpus of ten novels is compiled. Then sixteen samples (each is 5,000 tokens in length) are taken randomly out of these novels as representative blocks. The researchers observe the way…
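For readers unfamiliar with the two ratios, a minimal sketch of computing TTR and HTR for a token block follows; the tiny block and naive whitespace tokenization are placeholders for the 5,000-token samples used in the study.

```python
from collections import Counter

def ttr_htr(tokens):
    """Type/Token and Hapax-legomena/Token ratios for one block."""
    counts = Counter(tokens)
    n_hapax = sum(1 for c in counts.values() if c == 1)
    return len(counts) / len(tokens), n_hapax / len(tokens)

# Tiny stand-in for a 5,000-token block sampled from a novel.
block = "the cat sat on the mat and the dog sat on the rug".split()
ttr, htr = ttr_htr(block)
print(f"TTR = {ttr:.3f}, HTR = {htr:.3f}")
```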
ERIC Educational Resources Information Center
Brownridge, Douglas A.; Chan, Ko Ling; Hiebert-Murphy, Diane; Ristock, Janice; Tiwari, Agnes; Leung, Wing-Cheong; Santos, Susy C.
2008-01-01
The purpose of the study was to shed light on the potentially differing dynamics of violence against separated and divorced women by their ex-husbands and violence against married women by their current husbands. Using a nationally representative sample of 7,369 heterosexual women from Cycle 13 of Statistics Canada's General Social Survey,…
ERIC Educational Resources Information Center
Tyler, John; White, Susan
2014-01-01
During the 2012-13 academic year, the Statistical Research Center (SRC) collected data from a representative national sample of over 3,500 public and private high schools across the U.S. to inquire about physics availability and offerings. This report describes their findings. SRC takes two different approaches to describe the characteristics of…
ERIC Educational Resources Information Center
Neman, Ronald S.; And Others
The study represents an extension of previous research involving the development of scales for the five-card, orally administered, and tape-recorded version of the Thematic Apperception Test(TAT). Scale development is documented and national norms are presented based on a national probability sample of 1,398 youths administered the Cycle III test…
Employment Experience of Youths During the School Year and Summer. Bureau of Labor Statistics News.
ERIC Educational Resources Information Center
Bureau of Labor Statistics (DOL), Washington, DC.
Findings from the first four annual survey rounds of the National Longitudinal Survey of Youth 1997 provided data on employment experiences of a nationally representative sample of about 9,000 young men and women born during 1980-84. The survey indicated that the percent of students employed in employee jobs during any week of the 1999-2000 school…
Trutschel, Diana; Palm, Rebecca; Holle, Bernhard; Simon, Michael
2017-11-01
Because not every scientific question on effectiveness can be answered with randomised controlled trials, research methods that minimise bias in observational studies are required. Two major concerns influence the internal validity of effect estimates: selection bias and clustering. Hence, to reduce the bias of the effect estimates, more sophisticated statistical methods are needed. This article introduces statistical approaches such as propensity score matching and mixed models into a representative real-world analysis; the implementation in the statistical software R is presented so that the results can be reproduced. We perform a two-level analytic strategy to address the problems of bias and clustering: (i) generalised models with different abilities to adjust for dependencies are used to analyse binary data and (ii) the genetic matching and covariate adjustment methods are used to adjust for selection bias. Hence, we analyse the data from two population samples, the sample produced by the matching method and the full sample. The different analysis methods in this article present different results but still point in the same direction. In our example, the estimate of the probability of receiving a case conference is higher in the treatment group than in the control group. Both strategies, genetic matching and covariate adjustment, have their limitations but complement each other to provide the whole picture. The statistical approaches were feasible for reducing bias but were nevertheless limited by the sample used. For each study and obtained sample, the pros and cons of the different methods have to be weighed. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
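The paper's implementation is in R; as a language-neutral illustration of the matching half of the strategy, the Python sketch below pairs a logistic-regression propensity score with simple 1:1 nearest-neighbour matching, a plain stand-in for the genetic matching actually used. All data are simulated and every variable name is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Simulated observational data: X confounders, t treatment indicator
# (e.g. receiving a case conference), y binary outcome.
n = 500
X = rng.standard_normal((n, 3))
t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))            # selection bias
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * t + X[:, 0]))))

# Step 1: estimate propensity scores from the confounders.
ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

# Step 2: 1:1 nearest-neighbour matching on the propensity score.
treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]
matches = [control[np.argmin(np.abs(ps[control] - ps[i]))] for i in treated]

# Step 3: compare outcomes within the matched sample.
effect = y[treated].mean() - y[matches].mean()
print(f"matched difference in outcome probability: {effect:.3f}")
```

A clustered second level (the paper's mixed models) would be layered on top of this, e.g. by fitting a generalised mixed model to the matched sample.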
McDonough, Christine M.; Jette, Alan M.; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M.; Rasch, Elizabeth K.
2014-01-01
Objectives: To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Design: Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. Setting: In-person and semi-structured interviews; internet and telephone surveys. Participants: A sample of 1,017 SSA claimants, and a normative sample of 999 adults from the US general population. Interventions: Not applicable. Main Outcome Measure: Model fit statistics. Results: The final item pool consisted of 139 items. Within the claimant sample 58.7% were white; 31.8% were black; 46.6% were female; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution which included more items and allowed separate characterization of: 1) Changing and Maintaining Body Position, 2) Whole Body Mobility, 3) Upper Body Function and 4) Upper Extremity Fine Motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples respectively were: Comparative Fit Index = 0.93 and 0.98; Tucker-Lewis Index = 0.92 and 0.98; Root Mean Square Error Approximation = 0.05 and 0.04. Conclusions: The factor structure of the Physical Function item pool closely resembled the hypothesized content model. The four scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. PMID:23542402
McDonough, Christine M; Jette, Alan M; Ni, Pengsheng; Bogusz, Kara; Marfeo, Elizabeth E; Brandt, Diane E; Chan, Leighton; Meterko, Mark; Haley, Stephen M; Rasch, Elizabeth K
2013-09-01
To build a comprehensive item pool representing work-relevant physical functioning and to test the factor structure of the item pool. These developmental steps represent initial outcomes of a broader project to develop instruments for the assessment of function within the context of Social Security Administration (SSA) disability programs. Comprehensive literature review; gap analysis; item generation with expert panel input; stakeholder interviews; cognitive interviews; cross-sectional survey administration; and exploratory and confirmatory factor analyses to assess item pool structure. In-person and semistructured interviews and Internet and telephone surveys. Sample of SSA claimants (n=1017) and a normative sample of adults from the U.S. general population (n=999). Not applicable. Model fit statistics. The final item pool consisted of 139 items. Within the claimant sample, 58.7% were white; 31.8% were black; 46.6% were women; and the mean age was 49.7 years. Initial factor analyses revealed a 4-factor solution, which included more items and allowed separate characterization of: (1) changing and maintaining body position, (2) whole body mobility, (3) upper body function, and (4) upper extremity fine motor. The final 4-factor model included 91 items. Confirmatory factor analyses for the 4-factor models for the claimant and the normative samples demonstrated very good fit. Fit statistics for claimant and normative samples, respectively, were: Comparative Fit Index=.93 and .98; Tucker-Lewis Index=.92 and .98; and root mean square error approximation=.05 and .04. The factor structure of the physical function item pool closely resembled the hypothesized content model. The 4 scales relevant to work activities offer promise for providing reliable information about claimant physical functioning relevant to work disability. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
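As a toy illustration of the exploratory step in the analyses above (the confirmatory models and fit indices such as CFI, TLI and RMSEA require dedicated SEM software and are not reproduced here), the following sketch fits an exploratory factor model to simulated item responses with scikit-learn; the two-factor structure and all sizes are invented.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)

# Simulated item responses: 300 respondents x 12 items generated from
# two latent factors plus noise (a small stand-in for the 139-item pool).
loadings = np.zeros((12, 2))
loadings[:6, 0], loadings[6:, 1] = 0.8, 0.8
latent = rng.standard_normal((300, 2))
items = latent @ loadings.T + 0.5 * rng.standard_normal((300, 12))

# Exploratory factor analysis with a hypothesized number of factors.
fa = FactorAnalysis(n_components=2, random_state=0).fit(items)

# Estimated loadings: items 1-6 should load on one factor, 7-12 on the
# other (cf. the paper's 4 factors over body position, mobility, upper
# body function and fine motor).
print(np.round(fa.components_.T, 2))
```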
Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael
2013-12-01
Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.
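As a rough orientation to the kind of computation such tables provide (not the paper's random-effects calculations, which cover the replicate EQUIGEN designs), the sketch below sizes a standard 2x2 crossover for average bioequivalence using the normal-approximation power of the two one-sided tests (TOST) on the log scale; the CV, geometric mean ratio and 0.80-1.25 limits are conventional placeholder values.

```python
import numpy as np
from scipy.stats import norm

def tost_power(n, cv, gmr=0.95, alpha=0.05, lo=0.8, hi=1.25):
    """Normal-approximation TOST power for a 2x2 crossover, log scale."""
    sigma_w = np.sqrt(np.log(1 + cv**2))    # within-subject SD
    se = sigma_w * np.sqrt(2 / n)           # SE of the log-ratio estimate
    z = norm.ppf(1 - alpha)
    delta = np.log(gmr)
    return max(0.0, norm.cdf((np.log(hi) - delta) / se - z)
                    - norm.cdf((np.log(lo) - delta) / se + z))

def sample_size(cv, target=0.8, **kw):
    n = 4
    while tost_power(n, cv, **kw) < target:
        n += 2
    return n

# e.g. 30% intra-subject CV, true geometric mean ratio of 0.95
print(sample_size(cv=0.30))
```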
Defining And Characterizing Sample Representativeness For DWPF Melter Feed Samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shine, E. P.; Poirier, M. R.
2013-10-29
Representative sampling is important throughout the Defense Waste Processing Facility (DWPF) process, and the demonstrated success of the DWPF process to achieve glass product quality over the past two decades is a direct result of the quality of information obtained from the process. The objective of this report was to present sampling methods that the Savannah River Site (SRS) used to qualify waste being dispositioned at the DWPF. The goal was to emphasize the methodology, not a list of outcomes from those studies. This methodology includes proven methods for taking representative samples, the use of controlled analytical methods, and data interpretation and reporting that considers the uncertainty of all error sources. Numerous sampling studies were conducted during the development of the DWPF process and still continue to be performed in order to evaluate options for process improvement. Study designs were based on use of statistical tools applicable to the determination of uncertainties associated with the data needs. Successful designs are apt to be repeated, so this report chose only to include prototypic case studies that typify the characteristics of frequently used designs. Case studies have been presented for studying in-tank homogeneity, evaluating the suitability of sampler systems, determining factors that affect mixing and sampling, comparing the final waste glass product chemical composition and durability to that of the glass pour stream sample and other samples from process vessels, and assessing the uniformity of the chemical composition in the waste glass product. Many of these studies efficiently addressed more than one of these areas of concern associated with demonstrating sample representativeness and provide examples of statistical tools in use for DWPF. The time when many of these designs were implemented was in an age when the sampling ideas of Pierre Gy were not as widespread as they are today. Nonetheless, the engineers and statisticians used carefully thought out designs that systematically and economically provided plans for data collection from the DWPF process. Key shared features of the sampling designs used at DWPF and the Gy sampling methodology were the specification of a standard for sample representativeness, an investigation that produced data from the process to study the sampling function, and a decision framework used to assess whether the specification was met based on the data. Without going into detail with regard to the seven errors identified by Pierre Gy, as excellent summaries are readily available such as Pitard [1989] and Smith [2001], SRS engineers understood, for example, that samplers can be biased (Gy's extraction error), and developed plans to mitigate those biases. Experiments that compared installed samplers with more representative samples obtained directly from the tank may not have resulted in systematically partitioning sampling errors into the now well-known error categories of Gy, but did provide overall information on the suitability of sampling systems. Most of the designs in this report are related to the DWPF vessels, not the large SRS Tank Farm tanks. Samples from the DWPF Slurry Mix Evaporator (SME), which contains the feed to the DWPF melter, are characterized using standardized analytical methods with known uncertainty. The analytical error is combined with the established error from sampling and processing in DWPF to determine the melter feed composition.
This composition is used with the known uncertainty of the models in the Product Composition Control System (PCCS) to ensure that the wasteform that is produced is comfortably within the acceptable processing and product performance region. Having the advantage of many years of processing that meets the waste glass product acceptance criteria, the DWPF process has provided a considerable amount of data about itself in addition to the data from many special studies. Demonstrating representative sampling directly from the large Tank Farm tanks is a difficult, if not unsolvable enterprise due to limited accessibility. However, the consistency and the adequacy of sampling and mixing at SRS could at least be studied under the controlled process conditions based on samples discussed by Ray and others [2012a] in Waste Form Qualification Report (WQR) Volume 2 and the transfers from Tanks 40H and 51H to the Sludge Receipt and Adjustment Tank (SRAT) within DWPF. It is important to realize that the need for sample representativeness becomes more stringent as the material gets closer to the melter, and the tanks within DWPF have been studied extensively to meet those needs.
Kessler, Ronald C.; Avenevoli, Shelli; Costello, E. Jane; Green, Jennifer Greif; Gruber, Michael J.; Heeringa, Steven; Merikangas, Kathleen R.; Pennell, Beth-Ellen; Sampson, Nancy A.; Zaslavsky, Alan M.
2009-01-01
An overview is presented of the design and field procedures of the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A), a US face-to-face household survey of the prevalence and correlates of DSM-IV mental disorders. The survey was based on a dual-frame design that included 904 adolescent residents of the households that participated in the US National Comorbidity Survey Replication (85.9% response rate) and 9,244 adolescent students selected from a nationally representative sample of 320 schools (74.7% response rate). After setting out the logic of dual-frame designs, comparisons are presented of sample and population distributions on Census socio-demographic variables and, in the school sample, school characteristics. These document only minor differences between the samples and the population. The results of statistical analysis of the bias-efficiency trade-off in weight trimming are then presented. These show that modest trimming meaningfully reduces mean squared error. Analysis of comparative sample efficiency shows that the household sample is more efficient than the school sample, leading to the household sample receiving a higher weight relative to its size in the consolidated sample than the school sample does. Taken together, these results show that the NCS-A is an efficient sample of the target population with good representativeness on a range of socio-demographic and geographic variables. PMID:19507169
Robust model selection and the statistical classification of languages
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Viola, M. L. L.
2012-10-01
In this paper we address the problem of model selection for the set of finite memory stochastic processes with finite alphabet, when the data is contaminated. We consider m independent samples, with more than half of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We devise a model selection procedure such that for a sample size large enough, the selected process is the one with law Q. Our model selection strategy is based on estimating relative entropies to select a subset of samples that are realizations of the same law. Although the procedure is valid for any family of finite order Markov models, we will focus on the family of variable length Markov chain models, which includes the fixed order Markov chain model family. We define the asymptotic breakdown point (ABDP) for a model selection procedure, and we show the ABDP for our procedure. This means that if the proportion of contaminated samples is smaller than the ABDP, then, as the sample size grows, our procedure selects a model for the process with law Q. We also use our procedure in a setting where we have one sample formed by the concatenation of sub-samples of two or more stochastic processes, with most of the sub-samples having law Q. We conducted a simulation study. In the application section we address the question of the statistical classification of languages according to their rhythmic features using speech samples. This is an important open problem in phonology. A persistent difficulty in this problem is that the speech samples correspond to several sentences produced by diverse speakers, corresponding to a mixture of distributions. The usual procedure to deal with this problem has been to choose a subset of the original sample which seems to best represent each language. The selection is made by listening to the samples. In our application we use the full dataset without any preselection of samples. We apply our robust methodology, estimating a model which represents the main law for each language. Our findings agree with the linguistic conjecture related to the rhythm of the languages included in our dataset.
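A minimal sketch of the core selection idea, picking out the majority-law subset of samples via relative entropy, is given below. It is deliberately cruder than the paper's procedure: it uses order-1 Markov chains on a binary alphabet rather than variable length Markov chains, sums row-wise Kullback-Leibler divergences without stationary-distribution weighting, and all samples are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)

def trans_probs(seq, k=2):
    """Order-1 Markov transition matrix estimated from one sample."""
    counts = np.ones((k, k))                   # Laplace smoothing
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

# Simulated samples: five share the majority law Q, two are contaminated.
good = [rng.choice(2, 500, p=[0.7, 0.3]) for _ in range(5)]
bad = [rng.choice(2, 500, p=[0.2, 0.8]) for _ in range(2)]
samples = good + bad

P = [trans_probs(s) for s in samples]
m = len(samples)
D = np.array([[kl(P[i], P[j]) for j in range(m)] for i in range(m)])

# Samples with the smallest total divergence to all others form the
# majority-law subset (a crude proxy for the paper's selection rule).
order = D.sum(axis=1).argsort()
print("samples closest to the majority law:", order[: m // 2 + 1])
```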
Image statistics underlying natural texture selectivity of neurons in macaque V4
Okazawa, Gouki; Tajima, Satohiro; Komatsu, Hidehiko
2015-01-01
Our daily visual experiences are inevitably linked to recognizing the rich variety of textures. However, how the brain encodes and differentiates a plethora of natural textures remains poorly understood. Here, we show that many neurons in macaque V4 selectively encode sparse combinations of higher-order image statistics to represent natural textures. We systematically explored neural selectivity in a high-dimensional texture space by combining texture synthesis and efficient-sampling techniques. This yielded parameterized models for individual texture-selective neurons. The models provided parsimonious but powerful predictors for each neuron's preferred textures using a sparse combination of image statistics. As a whole population, the neuronal tuning was distributed in a way suitable for categorizing textures and quantitatively predicts human ability to discriminate textures. Together, we suggest that the collective representation of visual image statistics in V4 plays a key role in organizing natural texture perception. PMID:25535362
Haytowitz, David B; Pehrsson, Pamela R
2018-01-01
For nearly 20 years, the National Food and Nutrient Analysis Program (NFNAP) has expanded and improved the quantity and quality of data in the US Department of Agriculture's (USDA) food composition databases (FCDB) through the collection and analysis of nationally representative food samples. NFNAP employs statistically valid sampling plans, the Key Foods approach to identify and prioritize foods and nutrients, comprehensive quality control protocols, and analytical oversight to generate new and updated analytical data for food components. NFNAP has allowed the Nutrient Data Laboratory to keep up with the dynamic US food supply and emerging scientific research. Recently generated results for nationally representative food samples show marked changes compared to previous database values for selected nutrients. Monitoring changes in the composition of foods is critical in keeping FCDB up to date, so that they remain a vital tool in assessing the nutrient intake of national populations, as well as for providing dietary advice. Published by Elsevier Ltd.
Grebennikov, Vasily V; Smetana, Aleš
2015-02-18
Extensive litter sampling at eight forested localities in Yunnan and Sichuan detected 381 specimens of Micropeplinae rove beetles. DNA barcoding data from 85 representative specimens were analysed to delimit species and infer their relationships. Statistical methods were implemented to assess regional species diversity of understudied Micropeplinae. The total number of sampled Micropeplinae species varied between 14 and 17, depending on a splitting versus lumping approach for allopatric populations. A single Micropeplinae species was sampled in six of eight studied localities, three species were found on Mount Gongga, while ten species were discovered on hyperdiverse Mount Emei in Sichuan. All Micropeplinae specimens from our samples belong either to the genus Cerapeplus, or to three other inclusive groups temporarily retained inside Micropeplus sensu lato. Each of the three groups potentially represents a separate genus: tesserula group, sculptus group and Micropeplus sensu stricto. A new species Micropeplus jason sp. n. from Mount Emei in Sichuan is described. Numerous illustrations introduce regional fauna and clarify the discussed morphological characters.
Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.
Zhang, Xiang; Guan, Naiyang; Jia, Zhilong; Qiu, Xiaogang; Luo, Zhigang
2015-01-01
Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.
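For orientation, the sketch below runs the multiplicative update of unsupervised projective NMF in the Yang-Oja form on a simulated nonnegative matrix and reads classes off the positions of the maximum coefficients, as the abstract describes; the semi-supervised concatenation of labeled and unlabeled samples is omitted, so this is a simplified stand-in for Semi-PNMF, not the published algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical nonnegative expression matrix: genes x samples.
X = np.abs(rng.standard_normal((200, 60)))
r = 3                                     # presumed number of classes
W = np.abs(rng.standard_normal((200, r)))

A = X @ X.T
for _ in range(200):
    # Multiplicative update for projective NMF, min ||X - W W^T X||_F^2.
    numer = 2 * A @ W
    denom = W @ (W.T @ A @ W) + A @ W @ (W.T @ W) + 1e-12
    W *= numer / denom

# Class of each sample = row of the largest entry in the coefficient
# matrix H = W^T X (the rule the paper uses for label assignment).
labels = np.argmax(W.T @ X, axis=0)
print(labels[:10])
```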
Harding, Kaitlin A.; Willey, Brittany; Ahles, Joshua; Mezulis, Amy
2016-01-01
Background: Trait negative affect and trait positive affect are affective vulnerabilities to depressive symptoms in adolescence and adulthood. While trait affect and the state affect characteristic of depressive symptoms are proposed to be theoretically distinct, no studies have established that these constructs are statistically distinct. Therefore, the purpose of the current study was to determine whether the trait affect (e.g. temperament dimensions) that predicts depressive symptoms and the state affect characteristic of depressive symptoms are statistically distinct among early adolescents and adults. We hypothesized that trait negative affect, trait positive affect, and depressive symptoms would represent largely distinct factors in both samples. Method: Participants were 268 early adolescents (53.73% female) and 321 young adults (70.09% female) who completed self-report measures of demographic information, trait affect, and depressive symptoms. Results: Principal axis factoring with oblique rotation for both samples indicated distinct adolescent factor loadings and overlapping adult factor loadings. Confirmatory factor analyses in both samples supported distinct but related trait negative affect, trait positive affect, and depressive symptom factors. Limitations: Study limitations include the cross-sectional design, which prevented examination of self-reported fluctuations in trait affect and depressive symptoms, and the unknown potential effects of self-report biases among adolescents and adults. Conclusions: Findings support existing theoretical distinctions between adolescent constructs but highlight a need to revise or remove items to distinguish measurements of adult trait affect and depressive symptoms. Adolescent trait affect and depressive symptoms are statistically distinct, but adult trait affect and depressive symptoms statistically overlap and warrant further consideration. PMID:27085163
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
NASA Technical Reports Server (NTRS)
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described by rate constants. These problems are isomorphic with chemical kinetics problems. Recently, several efficient techniques for this purpose have been developed based on the approach originally proposed by Gillespie. Although the utility of the techniques mentioned above for Bayesian problems has not been determined, further research along these lines is warranted.
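Of the techniques surveyed above, parallel tempering is the easiest to demonstrate in a few lines. The sketch below runs Metropolis walkers at several temperatures on a double-well density with well-separated modes and occasionally swaps neighbouring replicas; the temperature ladder, step size and swap schedule are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def energy(x):
    """Negative log of a double-well pdf with modes near x = -3, +3."""
    return (x**2 - 9.0)**2 / 10.0

temps = [1.0, 3.0, 10.0, 30.0]           # one replica per temperature
x = np.zeros(len(temps))
samples = []

for step in range(20000):
    # Metropolis update within each replica (targets exp(-energy/T)).
    for k, T in enumerate(temps):
        prop = x[k] + rng.normal(0.0, 1.0)
        if rng.random() < np.exp(min(0.0, (energy(x[k]) - energy(prop)) / T)):
            x[k] = prop
    # Occasionally attempt to exchange neighbouring replicas.
    if step % 10 == 0:
        k = rng.integers(len(temps) - 1)
        d = (1 / temps[k] - 1 / temps[k + 1]) * (energy(x[k]) - energy(x[k + 1]))
        if rng.random() < np.exp(min(0.0, d)):
            x[k], x[k + 1] = x[k + 1], x[k]
    samples.append(x[0])                 # keep only the T = 1 chain

# Without the swaps, the cold chain would stay trapped in one well.
print("fraction of T=1 samples in the right-hand well:",
      np.mean(np.array(samples[5000:]) > 0))
```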
NASA Technical Reports Server (NTRS)
Currit, P. A.
1983-01-01
The Cleanroom software development methodology is designed to take the gamble out of product releases for both suppliers and receivers of the software. The ingredients of this procedure are a life cycle of executable product increments, representative statistical testing, and a standard estimate of the MTTF (Mean Time To Failure) of the product at the time of its release. A statistical approach to software product testing using randomly selected samples of test cases is considered. A statistical model is defined for the certification process which uses the timing data recorded during test. A reasonableness argument for this model is provided that uses previously published data on software product execution. Also included is a derivation of the certification model estimators and a comparison of the proposed least squares technique with the more commonly used maximum likelihood estimators.
Software for Data Analysis with Graphical Models
NASA Technical Reports Server (NTRS)
Buntine, Wray L.; Roy, H. Scott
1994-01-01
Probabilistic graphical models are being used widely in artificial intelligence and statistics, for instance, in diagnosis and expert systems, as a framework for representing and reasoning with probabilities and independencies. They come with corresponding algorithms for performing statistical inference. This offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper illustrates the framework with an example and then presents some basic techniques for the task: problem decomposition and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.
Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.
Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides
2010-01-01
We investigated sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to the adequacy or not of the sample size to estimate the HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate the HIV prevalence rate in 35 studies. The use of convenience samples limits statistical inference about the whole group. We observed an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and the HIV prevalence rate in this group.
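The sample-size adequacy question above has a standard closed form for prevalence surveys; the sketch below implements it with placeholder inputs and an optional design effect for non-simple (e.g. clustered) designs. The adequacy thresholds actually applied in the review are not reproduced here.

```python
from math import ceil
from scipy.stats import norm

def prevalence_sample_size(p_expected, margin, conf=0.95, deff=1.0):
    """Minimum n to estimate a prevalence within +/- margin, optionally
    inflated by a design effect for complex (non-SRS) sampling."""
    z = norm.ppf(1 - (1 - conf) / 2)
    n = z**2 * p_expected * (1 - p_expected) / margin**2
    return ceil(n * deff)

# e.g. expected 5% prevalence estimated within +/- 2 percentage points
print(prevalence_sample_size(0.05, 0.02))            # -> 457
print(prevalence_sample_size(0.05, 0.02, deff=2.0))  # -> 913
```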
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop shop for computational modelers wishing to use sampler-based Bayesian statistics.
Minimum and Maximum Times Required to Obtain Representative Suspended Sediment Samples
NASA Astrophysics Data System (ADS)
Gitto, A.; Venditti, J. G.; Kostaschuk, R.; Church, M. A.
2014-12-01
Bottle sampling is a convenient method of obtaining suspended sediment measurements for the development of sediment budgets. While these methods are generally considered to be reliable, recent analysis of depth-integrated sampling has identified considerable uncertainty in measurements of grain-size concentration between grain-size classes of multiple samples. Point-integrated bottle sampling is assumed to represent the mean concentration of suspended sediment, but the uncertainty surrounding this method is not well understood. Here we examine at-a-point variability in velocity, suspended sediment concentration, grain-size distribution, and grain-size moments to determine if traditional point-integrated methods provide a representative sample of suspended sediment. We present continuous hour-long observations of suspended sediment from the sand-bedded portion of the Fraser River at Mission, British Columbia, Canada, using a LISST laser-diffraction instrument. Spectral analysis shows no statistically significant peaks in energy density, suggesting the absence of periodic fluctuations in flow and suspended sediment. However, a slope break in the spectra at 0.003 Hz corresponds to a period of 5.5 minutes. This coincides with the threshold between large-scale turbulent eddies that scale with channel width/mean velocity and hydraulic phenomena related to channel dynamics. This suggests that suspended sediment samples taken over a period longer than 5.5 minutes incorporate variability at scales larger than turbulent phenomena in this channel. Examination of 5.5-minute periods of our time series indicates that ~20% of the time a stable mean value of volumetric concentration is reached within 30 seconds, a typical bottle sample duration. In ~12% of measurements a stable mean was not reached over the 5.5-minute sample duration. The remaining measurements achieve a stable mean in an even distribution over the intervening interval.
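A minimal version of the spectral step, with a hypothetical 1 Hz red-noise record standing in for the LISST concentration series, can be written with scipy's Welch estimator; only the slope comparison around the 0.003 Hz break (the ~5.5-minute period) is shown.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(6)

# Hypothetical 1 Hz, one-hour concentration record (random walk used
# as a red-noise stand-in for LISST volumetric concentration).
fs = 1.0
c = np.cumsum(rng.standard_normal(3600))

f, pxx = welch(c, fs=fs, nperseg=1024)

# Compare log-log spectral slopes on either side of the candidate
# break frequency of 0.003 Hz (~5.5-minute period).
f_break = 0.003
for name, mask in (("low", (f > 0) & (f <= f_break)), ("high", f > f_break)):
    slope = np.polyfit(np.log(f[mask]), np.log(pxx[mask]), 1)[0]
    print(f"{name}-frequency spectral slope: {slope:.2f}")
```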
Garofalo, Robert; Emerson, Erin M.
2010-01-01
Objectives. We examined associations of race/ethnicity, gender, and sexual orientation with mental disorders among lesbian, gay, bisexual, and transgender (LGBT) youths. Methods. We assessed mental disorders by administering a structured diagnostic interview to a community sample of 246 LGBT youths aged 16 to 20 years. Participants also completed the Brief Symptom Inventory 18 (BSI 18). Results. One third of participants met criteria for any mental disorder, 17% for conduct disorder, 15% for major depression, and 9% for posttraumatic stress disorder. Anorexia and bulimia were rare. Lifetime suicide attempts were frequent (31%) but less so in the prior 12 months (7%). Few racial/ethnic and gender differences were statistically significant. Bisexually identified youths had lower prevalences of every diagnosis. The BSI 18 had high negative predictive power (90%) and low positive predictive power (25%) for major depression. Conclusions. LGBT youths had higher prevalences of mental disorder diagnoses than youths in national samples, but were similar to representative samples of urban, racial/ethnic minority youths. Suicide behaviors were similar to those among representative youth samples in the same geographic area. Questionnaires measuring psychological distress may overestimate depression prevalence among this population. PMID:20966378
Exploring Sampling in the Detection of Multicategory EEG Signals
Siuly, Siuly; Kabir, Enamul; Wang, Hua; Zhang, Yanchun
2015-01-01
The paper presents a structure based on sampling and machine learning techniques for the detection of multicategory EEG signals, where random sampling (RS) and optimal allocation sampling (OS) are explored. In the proposed framework, before using the RS and OS schemes, the entire EEG signals of each class are partitioned into several groups based on a particular time period. The RS and OS schemes are used in order to have representative observations from each group of each category of EEG data. All of the samples selected by RS from the groups of each category are then combined into one set, named the RS set. In a similar way, an OS set is obtained for the OS scheme. Then eleven statistical features are extracted from the RS and OS sets separately. Finally, this study employs three well-known classifiers: k-nearest neighbor (k-NN), multinomial logistic regression with a ridge estimator (MLR), and support vector machine (SVM) to evaluate the performance of the RS and OS feature sets. The experimental outcomes demonstrate that the RS scheme represents the EEG signals well and that k-NN with RS is the optimum choice for detection of multicategory EEG signals. PMID:25977705
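The OS scheme corresponds to classical optimal (Neyman) allocation across the time-period groups, sampling each group in proportion to its size times its variability; a minimal sketch follows, with hypothetical group sizes and standard deviations.

```python
import numpy as np

def neyman_allocation(n_total, group_sizes, group_stds):
    """Optimal (Neyman) allocation: n_h proportional to N_h * sigma_h,
    so more variable groups contribute more observations."""
    w = np.asarray(group_sizes) * np.asarray(group_stds)
    return np.round(n_total * w / w.sum()).astype(int)

# Hypothetical EEG groups: segments per group and their signal SDs.
sizes = [1200, 1200, 1200, 1200]
stds = [4.1, 9.8, 2.3, 6.0]
print(neyman_allocation(400, sizes, stds))   # -> [ 74 177  41 108]
```

Rounding can make the allocations miss n_total by a unit or two; a production version would repair the remainder.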
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sevcik, R. S.; Hyman, D. A.; Basumallich, L.
2013-01-01
A technique for carbohydrate analysis for bioprocess samples has been developed, providing reduced analysis time compared to current practice in the biofuels R&D community. The Thermo Fisher CarboPac SA10 anion-exchange column enables isocratic separation of monosaccharides, sucrose and cellobiose in approximately 7 minutes. Additionally, use of a low-volume (0.2 mL) injection valve in combination with a high-volume detection cell minimizes the extent of sample dilution required to bring sugar concentrations into the linear range of the pulsed amperometric detector (PAD). Three laboratories, representing academia, industry, and government, participated in an interlaboratory study which analyzed twenty-one opportunistic samples representing biomass pretreatment, enzymatic saccharification, and fermentation samples. The technique's robustness, linearity, and interlaboratory reproducibility were evaluated and showed excellent-to-acceptable characteristics. Additionally, quantitation by the CarboPac SA10/PAD was compared with the current practice method utilizing an HPX-87P/RID. While these two methods showed good agreement, a statistical comparison found significant quantitation differences between them, highlighting the difference between selective and universal detection modes.
Wathen, John B; Lazorchak, James M; Olsen, Anthony R; Batt, Angela
2015-03-01
The U.S. EPA conducted a national statistical survey of fish fillet tissue with a sample size of 541 sites on boatable rivers (≥5th order) in 2008-2009. This is the first such study of mercury (Hg) in fish tissue from river sites focused on potential human health impacts from fish consumption that also addresses wildlife impacts. Sample sites were identified as being urban or non-urban. All sample mercury concentrations were above the 3.33 ug kg(-1) (ppb) quantitation limit, and an estimated 25.4% (±4.4%) of the 51,663 river miles assessed exceeded the U.S. EPA 300 ug kg(-1) fish-tissue-based water quality criterion for mercury, representing 13,144 ± 181.8 river miles. Estimates of river miles exceeding comparable aquatic life thresholds (translated from fillet concentrations to whole-fish equivalents) in avian species were similar to the number of river miles exceeding the human health threshold, whereas some mammalian species were at risk from lower mercury concentrations than humans. A comparison of means from the non-urban and urban data and among three ecoregions did not indicate a statistically significant difference in fish tissue Hg concentrations at p<0.05. Published by Elsevier Ltd.
Gasparini, Patrizia; Di Cosmo, Lucio; Cenni, Enrico; Pompei, Enrico; Ferretti, Marco
2013-07-01
In the frame of a process aiming at harmonizing National Forest Inventory (NFI) and ICP Forests Level I Forest Condition Monitoring (FCM) in Italy, we investigated (a) the long-term consistency between FCM sample points (a subsample of the first NFI, 1985, NFI_1) and recent forest area estimates (after the second NFI, 2005, NFI_2) and (b) the effect of tree selection method (tree-based or plot-based) on sample composition and defoliation statistics. The two investigations were carried out on 261 and 252 FCM sites, respectively. Results show that some individual forest categories (larch and stone pine, Norway spruce, other coniferous, beech, temperate oaks and cork oak forests) are over-represented and others (hornbeam and hophornbeam, other deciduous broadleaved and holm oak forests) are under-represented in the FCM sample. This is probably due to a change in forest cover, which has increased by 1,559,200 ha from 1985 to 2005. In case of a shift from a tree-based to a plot-based selection method, 3,130 (46.7%) of the original 6,703 sample trees will be abandoned, and 1,473 new trees will be selected. The balance between exclusion of former sample trees and inclusion of new ones will be particularly unfavourable for conifers (with only 16.4% of excluded trees replaced by new ones) and less so for deciduous broadleaves (with 63.5% of excluded trees replaced). The total number of tree species surveyed will not be impacted, while the number of trees per species will, and the resulting (plot-based) sample composition will have a much larger frequency of deciduous broadleaved trees. The newly selected trees have, in general, smaller diameter at breast height (DBH) and lower defoliation scores. Given the larger rate of turnover, the deciduous broadleaved part of the sample will be more impacted. Our results suggest that both a revision of the FCM network to account for forest area change and a plot-based approach to permit statistical inference and avoid bias in the tree sample composition in terms of DBH (and likely age and structure) are desirable in Italy. As the adoption of a plot-based approach will keep a large share of the trees formerly selected, direct tree-by-tree comparison will remain possible, thus limiting the impact on the time series comparability. In addition, the plot-based design will favour the integration with NFI_2.
Comparison of flume and towing methods for verifying the calibration of a suspended-sediment sampler
Beverage, J.P.; Futrell, J.C.
1986-01-01
Suspended-sediment samplers must sample isokinetically (at stream velocity) in order to collect representative water samples of rivers. Each sampler sold by the Federal Interagency Sedimentation Project or by the U.S. Geological Survey Hydrologic Instrumentation Facility has been adjusted to sample isokinetically and tested in a flume to verify the calibration. The test program for a modified U.S. P-61 sampler provided an opportunity to compare flume and towing tank tests. Although the two tests yielded statistically distinct results, the difference between them was quite small. The conclusion is that verifying the calibration of any suspended-sediment sampler by either the flume or towing method should give acceptable results.
SURVEILLANCE FOR AVIAN INFLUENZA VIRUS IN WILD BIRDS IN POLAND, 2008-15.
Świętoń, Edyta; Wyrostek, Krzysztof; Jóźwiak, Michał; Olszewska-Tomczyk, Monika; Domańska-Blicharz, Katarzyna; Meissner, Włodzimierz; Włodarczyk, Radosław; Minias, Piotr; Janiszewski, Tomasz; Minta, Zenon; Śmietanka, Krzysztof
2017-04-01
We tested wild birds in Poland during 2008-15 for avian influenza virus (AIV). We took 10,312 swab and feces samples from 6,314 live birds representing 12 orders and 84 bird species, mostly from the orders Anseriformes and Charadriiformes, for testing and characterization by various PCR methods. From PCR-positive samples, we attempted to isolate and subtype the virus. The RNA of AIV was detected in 1.8% (95% confidence interval [CI], 1.5-2.1%) of birds, represented by 48 Mallards (Anas platyrhynchos), 11 Mute Swans (Cygnus olor), 48 Common Teals (Anas crecca), three Black-headed Gulls (Chroicocephalus ridibundus), one Common Coot (Fulica atra), one Garganey (Spatula querquedula), and one unidentified bird species. Overall, the prevalence of AIV detection in Mallards and Mute Swans (the most frequently sampled species) was 2.0% (95% CI, 1.4-2.5%) and 0.5% (95% CI, 0.2-0.8%), respectively; the difference was statistically significant (P<0.001). Hemagglutinin subtypes from H1 to H13 were identified, including H5 and H7 low-pathogenic AIV subtypes. Mallards and Common Teals harbored the greatest diversity of subtypes. We observed seasonality of viral detection in Mallards, with higher AIV prevalence in late summer and autumn than in winter and spring; two peaks in prevalence, in summer (August) and autumn (November), were demonstrated. The prevalence of AIV in Mute Swans did not show any statistically significant seasonal pattern.
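The prevalence figures and their confidence intervals follow from straightforward binomial calculations. A minimal sketch using statsmodels, where the positive counts come from the abstract but the denominators are back-calculated from the reported prevalences and are therefore assumptions:

```python
from statsmodels.stats.proportion import proportion_confint, proportions_ztest

# Positive counts from the abstract; the denominators are back-calculated
# from the reported prevalences (2.0% and 0.5%) and are assumptions.
pos = [48, 11]          # AIV-positive Mallards, Mute Swans
n = [2400, 2200]        # assumed numbers of sampled birds per species

for species, k, m in zip(["Mallard", "Mute Swan"], pos, n):
    lo, hi = proportion_confint(k, m, alpha=0.05, method="wilson")
    print(f"{species}: prevalence {k/m:.1%} (95% CI {lo:.1%}-{hi:.1%})")

# Two-proportion z-test for the species difference (reported as P<0.001)
stat, pval = proportions_ztest(pos, n)
print(f"z = {stat:.2f}, P = {pval:.2g}")
```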
Methodology of the 2009 Survey on Living with Chronic Diseases in Canada--hypertension component.
Bienek, A S; Gee, M E; Nolan, R P; Kaczorowski, J; Campbell, N R; Bancej, C; Gwadry-Sridhar, F; Robitaille, C; Walker, R L; Dai, S
2013-09-01
The Survey on Living with Chronic Diseases in Canada--hypertension component (SLCDC-H) is a 20-minute cross-sectional telephone survey on hypertension diagnosis and management. Sampled from the 2008 Canadian Community Health Survey (CCHS), the SLCDC-H includes Canadians (aged ≥ 20 years) with self-reported hypertension from the ten provinces. The questionnaire was developed using the Delphi technique, externally reviewed and qualitatively tested. Statistics Canada designed the sampling strategy and carried out recruitment, data collection and processing. Proportions were weighted to represent the Canadian population, and 95% confidence intervals (CIs) were derived by the bootstrap method. Compared with the CCHS population reporting hypertension, the SLCDC-H sample (n = 6142) is slightly younger (SLCDC-H mean age: 61.2 years, 95% CI: 60.8-61.6; CCHS mean age: 62.2 years, 95% CI: 61.8-62.5), has more post-secondary school graduates (SLCDC-H: 52.0%, 95% CI: 49.7%-54.2%; CCHS: 47.5%, 95% CI: 46.1%-48.9%) and has fewer respondents on hypertension medication (SLCDC-H: 82.5%, 95% CI: 80.9%-84.1%; CCHS: 88.6%, 95% CI: 87.7%-89.6%). Overall, the 2009 SLCDC-H represents its source population and provides novel, comprehensive data on the diagnosis and management of hypertension. The survey has been adapted to other chronic conditions--diabetes, asthma/chronic obstructive pulmonary disease and neurological conditions. The questionnaire is available on the Statistics Canada website; descriptive results have been disseminated by the Public Health Agency of Canada.
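The bootstrap CIs reported here for weighted survey proportions can be illustrated with a naive percentile bootstrap. The sketch below uses simulated weights and responses, not SLCDC-H microdata, and Statistics Canada's actual procedure relies on survey bootstrap replicate weights rather than this simple resampling:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 6142                                  # SLCDC-H sample size
# Simulated stand-ins: survey weights and a binary indicator
# (e.g., "on hypertension medication"); not real survey data.
w = rng.lognormal(mean=0.0, sigma=0.5, size=n)
y = rng.binomial(1, 0.825, size=n)

def weighted_prop(y, w):
    return np.sum(w * y) / np.sum(w)

boots = []
for _ in range(2000):                     # 2000 bootstrap replicates
    idx = rng.integers(0, n, size=n)      # resample respondents
    boots.append(weighted_prop(y[idx], w[idx]))

lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"estimate {weighted_prop(y, w):.1%}, 95% CI ({lo:.1%}, {hi:.1%})")
```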
ERIC Educational Resources Information Center
Kalindi, Sylvia Chanda; McBride, Catherine; Tong, Xiuhong; Wong, Natalie Lok Lee; Chung, Kien Hoa Kevin; Lee, Chia-Ying
2015-01-01
To examine cognitive correlates of dyslexia in Chinese and reading difficulties in English as a foreign language, a total of 14 Chinese dyslexic children (DG), 16 poor readers of English (PE), and 17 poor readers of both Chinese and English (PB) were compared to a control sample (C) of 17 children, drawn from a statistically representative sample…
Review of Literature on Probability of Detection for Liquid Penetrant Nondestructive Testing
2011-11-01
increased maintenance costs, or catastrophic failure of safety-critical structure. Knowledge of the reliability achieved by NDT methods, including...representative components to gather data for statistical analysis, which can be prohibitively expensive. To account for sampling variability inherent in any...Sioux City and Pensacola. (Those recommendations were discussed in Section 3.4.) Drury et al. report on a factorial experiment aimed at identifying the
REPORT FOR COMMERCIAL GRADE NICKEL CHARACTERIZATION AND BENCHMARKING
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2012-12-20
Oak Ridge Associated Universities (ORAU), under the Oak Ridge Institute for Science and Education (ORISE) contract, has completed the collection, sample analysis, and review of analytical results to benchmark the concentrations of gross alpha-emitting radionuclides, gross beta-emitting radionuclides, and technetium-99 in commercial grade nickel. This report presents methods, change management, observations, and statistical analysis of materials procured from sellers representing nine countries on four continents. The data suggest there is a low probability of detecting alpha- and beta-emitting radionuclides in commercial nickel. Technetium-99 was not detected in any samples, thus suggesting it is not present in commercial nickel.
NASA Astrophysics Data System (ADS)
Pastukhov, A. V.; Kaverin, D. A.; Shchanov, V. M.
2016-09-01
A digital map of soil carbon pools was created for the forest-tundra ecotone in the Usa River basin with the use of ERDAS Imagine 2014 and ArcGIS 10.2 software. Supervised classification and thematic interpretation of satellite images and digital terrain models with the use of a georeferenced database on soil profiles were applied. Expert assessment of the natural diversity and representativeness of random samples for different soil groups was performed, and the minimal necessary size of the statistical sample was determined.
On evaluating compliance with air pollution levels 'not to be exceeded more than once per year'
NASA Technical Reports Server (NTRS)
Neustadter, H. E.; Sidik, S. M.
1974-01-01
The point of view taken is that the Environmental Protection Agency (EPA) Air Quality Standards (AQS) represent conditions which must be made to exist in the ambient environment. The statistical techniques developed should serve as tools for measuring the closeness to achieving the desired quality of air. It is shown that the sampling frequency recommended by EPA is inadequate to meet these objectives when the standard is expressed as a level not to be exceeded more than once per year and sampling frequency is once every three days or less frequent.
Skates, Steven J.; Gillette, Michael A.; LaBaer, Joshua; Carr, Steven A.; Anderson, N. Leigh; Liebler, Daniel C.; Ransohoff, David; Rifai, Nader; Kondratovich, Marina; Težak, Živana; Mansfield, Elizabeth; Oberg, Ann L.; Wright, Ian; Barnes, Grady; Gail, Mitchell; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Boja, Emily S.
2014-01-01
Protein biomarkers are needed to deepen our understanding of cancer biology and to improve our ability to diagnose, monitor and treat cancers. Important analytical and clinical hurdles must be overcome to allow the most promising protein biomarker candidates to advance into clinical validation studies. Although contemporary proteomics technologies support the measurement of large numbers of proteins in individual clinical specimens, sample throughput remains comparatively low. This problem is amplified in typical clinical proteomics research studies, which routinely suffer from a lack of proper experimental design, resulting in analysis of too few biospecimens to achieve adequate statistical power at each stage of a biomarker pipeline. To address this critical shortcoming, a joint workshop was held by the National Cancer Institute (NCI), National Heart, Lung and Blood Institute (NHLBI), and American Association for Clinical Chemistry (AACC), with participation from the U.S. Food and Drug Administration (FDA). An important output from the workshop was a statistical framework for the design of biomarker discovery and verification studies. Herein, we describe the use of quantitative clinical judgments to set statistical criteria for clinical relevance, and the development of an approach to calculate biospecimen sample size for proteomic studies in the discovery and verification stages prior to the clinical validation stage. This represents a first step towards building a consensus on quantitative criteria for statistical design of proteomics biomarker discovery and verification research. PMID:24063748
Ryberg, Karen R.; Hiemenz, Gregory
2009-01-01
The Bureau of Reclamation collected water-quality samples at 16 sites on the James River and the Arrowwood National Wildlife Refuge, N. Dak., as part of its refuge-monitoring program from 1987-93 and as part of an environmental impact statement commitment from 1999-2004. Climatic and hydrologic conditions varied greatly during both sampling periods. The first period was dominated by drought conditions, which abruptly changed to cooler and wetter conditions in 1992-93. During the second period, conditions were near normal to very wet and included higher inflow from the James River into the refuge. The two periods also differed in the sites sampled, seasons sampled, and properties and constituent concentrations measured. Summary statistics were reported separately for the two sampling periods for all physical properties and constituents. Nonparametric statistical tests were used to further analyze some of the water-quality data. During the first sampling period, 1987-93, specific conductance, turbidity, hardness, alkalinity, total dissolved solids, total suspended solids, nonvolatile suspended solids, calcium, magnesium, sodium, potassium, sulfate, chloride, phosphate, total phosphorus, total organic carbon, chlorophyll a, and arsenic were determined to have significantly different medians among the sites tested. During the second sampling period, 1999-2004, the medians of pH, sodium, chloride, barium, and boron varied significantly among sites. In addition, some constituents analyzed during the first period (1987-93) were not analyzed during the second (1999-2004), and winter sampling was done only during the second period. This variability reduces the number of direct comparisons that can be made between the two periods. Three sites had complete periods of record for both sampling periods and were compared; differences in variability and median concentration were identified between the two time periods. Sites representing inflow to and outflow from the refuge were compared statistically for the period when data were available for both sites, 1999-2004. Of the nutrients tested - ammonia plus organic nitrogen, phosphate, and total phosphorus - no significant statistical differences were found between the inflow samples and the outflow samples. Statistically significant differences were found for pH, sulfate, chloride, barium, and manganese. Nutrients are of particular interest in the refuge because of the aquatic plant and animal life and the use of the wetland resources by waterfowl. However, the nutrient data were highly censored and there were differences in the seasonal timing of sample collection between the two sampling periods. Therefore, the nutrient data were examined graphically with stripplots that highlighted differences in the seasonal timing of sample collection and concentration differences likely related to the differences in climatic and hydrologic conditions between the two periods.
Parsons, Nick R; Price, Charlotte L; Hiskens, Richard; Achten, Juul; Costa, Matthew L
2012-04-25
The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval, 10-26%) of the studies investigated, the conclusions were not clearly justified by the results; in 39% (30-49%) of studies a different analysis should have been undertaken; and in 17% (10-26%) a different analysis could have made a difference to the overall conclusions. It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.
King-Shier, Kathryn M; Hemmelgarn, Brenda R; Musto, Richard; Quan, Hude
2014-01-01
Background: Francophones who live outside the primarily French-speaking province of Quebec, Canada, risk being excluded from research by lack of a sampling frame. We examined the adequacy of random sampling, advertising, and respondent-driven sampling for recruitment of francophones for survey research. Methods: We recruited francophones residing in the city of Calgary, Alberta, through advertising and respondent-driven sampling. These 2 samples were then compared with a random subsample of Calgary francophones derived from the 2006 Canadian Community Health Survey (CCHS). We assessed the effectiveness of advertising and respondent-driven sampling in relation to the CCHS sample by comparing demographic characteristics and selected items from the CCHS (specifically self-reported general health status, perceived weight, and having a family doctor). Results: We recruited 120 francophones through advertising and 145 through respondent-driven sampling; the random sample from the CCHS consisted of 259 records. The samples derived from advertising and respondent-driven sampling differed from the CCHS in terms of age (mean ages 41.0, 37.6, and 42.5 years, respectively), sex (proportion of males 26.1%, 40.6%, and 56.6%, respectively), education (college or higher 86.7%, 77.9%, and 59.1%, respectively), place of birth (immigrants accounting for 45.8%, 55.2%, and 3.7%, respectively), and not having a regular medical doctor (16.7%, 34.5%, and 16.6%, respectively). Differences were not tested statistically because of limitations on the analysis of CCHS data imposed by Statistics Canada. Interpretation: The samples generated exclusively through advertising and respondent-driven sampling were not representative of the gold-standard sample from the CCHS. Use of such biased samples for research studies could generate misleading results. PMID:25426180
2008 Niday Perinatal Database quality audit: report of a quality assurance project.
Dunn, S; Bottomley, J; Ali, A; Walker, M
2011-12-01
This quality assurance project was designed to determine the reliability, completeness and comprehensiveness of the data entered into the Niday Perinatal Database. Quality of the data was measured by comparing data re-abstracted from the patient record to the original data entered into the Niday Perinatal Database. A representative sample of hospitals in Ontario was selected, and a random sample of 100 linked mother and newborn charts was audited for each site. A subset of 33 variables (representing 96 data fields) from the Niday dataset was chosen for re-abstraction. Of the data fields for which Cohen's kappa statistic or intraclass correlation coefficient (ICC) was calculated, 44% showed substantial or almost perfect agreement (beyond chance). However, about 17% showed less than 95% agreement and a kappa or ICC value of less than 60%, indicating only slight, fair or moderate agreement (beyond chance). Recommendations to improve the quality of these data fields are presented.
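Cohen's kappa and the ICC used in this audit are standard agreement statistics. An illustrative sketch with made-up audit pairs (not Niday records); the ICC variant shown is the one-way random-effects ICC(1,1):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Illustrative audit data: original database entry vs. re-abstracted value
# for a categorical field (not actual Niday records).
original   = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
reabstract = ["yes", "no", "yes", "no",  "no", "yes", "no", "yes", "yes", "yes"]
print(f"Cohen's kappa = {cohen_kappa_score(original, reabstract):.2f}")

# One-way random-effects ICC(1,1) for a continuous field (e.g., birth
# weight), computed from the classical ANOVA decomposition.
x1 = np.array([3200, 2890, 3410, 3100, 2750], float)   # original entry
x2 = np.array([3190, 2900, 3400, 3150, 2760], float)   # re-abstracted
data = np.stack([x1, x2], axis=1)                      # subjects x raters
n, k = data.shape
grand = data.mean()
msb = k * np.sum((data.mean(axis=1) - grand) ** 2) / (n - 1)   # between subjects
msw = np.sum((data - data.mean(axis=1, keepdims=True)) ** 2) / (n * (k - 1))
icc = (msb - msw) / (msb + (k - 1) * msw)
print(f"ICC(1,1) = {icc:.2f}")
```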
Wang, Ling-jia; Kissler, Hermann J; Wang, Xiaojun; Cochet, Olivia; Krzystyniak, Adam; Misawa, Ryosuke; Golab, Karolina; Tibudan, Martin; Grzanka, Jakub; Savari, Omid; Grose, Randall; Kaufman, Dixon B; Millis, Michael; Witkowski, Piotr
2015-01-01
Pancreatic islet mass, represented by islet equivalents (IEQ), is the most important parameter in decision making for clinical islet transplantation. To obtain the IEQ, a sample of islets is routinely counted manually under a microscope and discarded thereafter. Islet purity, another parameter in islet processing, is routinely acquired by estimation only. In this study, we validated our digital image analysis (DIA) system, developed using the Image Pro Plus software, for islet mass and purity assessment. Application of the DIA allows better compliance with current good manufacturing practice (cGMP) standards. Human islet samples were captured as calibrated digital images for the permanent record. Five trained technicians participated in determination of IEQ and purity by the manual counting method and DIA. IEQ counts showed statistically significant correlations between the manual method and DIA in all sample comparisons (r > 0.819, p < 0.0001). A statistically significant difference in IEQ between the two methods was found only in the high-purity 100 μL sample group (p = 0.029). For purity determination, statistically significant differences between manual assessment and DIA measurement were found in the high- and low-purity 100 μL samples (p < 0.005). In addition, islet particle number (IPN) and the IEQ/IPN ratio did not differ statistically between the manual counting method and DIA. In conclusion, the DIA used in this study is a reliable technique for determining IEQ and purity. Islet samples preserved as digital images and results produced by DIA can be permanently stored for verification, technical training and islet information exchange between different islet centers. Therefore, DIA complies better with cGMP requirements than the manual counting method. We propose DIA as a quality control tool to supplement the established standard manual method for islet counting and purity estimation. PMID:24806436
Investigation of Error Patterns in Geographical Databases
NASA Technical Reports Server (NTRS)
Dryer, David; Jacobs, Derya A.; Karayaz, Gamze; Gronbech, Chris; Jones, Denise R. (Technical Monitor)
2002-01-01
The objective of the research conducted in this project is to develop a methodology to investigate the accuracy of Airport Safety Modeling Data (ASMD) using statistical, visualization, and Artificial Neural Network (ANN) techniques. Such a methodology can contribute to answering the following research questions: Over a representative sampling of ASMD databases, can statistical error analysis techniques be accurately learned and replicated by ANN modeling techniques? This representative ASMD sample should include numerous airports and a variety of terrain characterizations. Is it possible to identify and automate the recognition of patterns of error related to geographical features? Do such patterns of error relate to specific geographical features, such as elevation or terrain slope? Is it possible to combine the errors in small regions into an error prediction for a larger region? What are the data density reduction implications of this work? ASMD may be used as the source of terrain data for a synthetic visual system to be used in the cockpit of aircraft when visual reference to ground features is not possible during conditions of marginal weather or reduced visibility. In this research, United States Geological Survey (USGS) digital elevation model (DEM) data have been selected as the benchmark. Artificial Neural Networks (ANNs) have been used and tested as alternate methods in place of the statistical methods in similar problems. They often perform better in pattern recognition, prediction, classification and categorization problems. Many studies show that when the data are complex and noisy, the accuracy of ANN models is generally higher than that of comparable traditional methods.
Statistical characterization of a large geochemical database and effect of sample size
Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.
2005-01-01
The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompassed 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States), and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. Part of the reason relates to the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively smaller numbers of data points showed that few elements passed standard statistical tests for normality or log-normality until sample size decreased to a few hundred data points. Large sample size enhances the power of statistical tests, and leads to rejection of most statistical hypotheses for real data sets. For large sample sizes (e.g., n > 1000), graphical methods such as histogram, stem-and-leaf, and probability plots are recommended for rough judgement of probability distribution if needed. © 2005 Elsevier Ltd. All rights reserved.
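The closing point, that standard normality tests reject almost any large real-world sample while small subsamples pass, is easy to reproduce. In the sketch below, a mixture of two lognormal subpopulations is a synthetic stand-in for a declustered element concentration, not NGS data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic stand-in for an element concentration: a mixture of two
# lognormal subpopulations, mimicking mixed geological sources.
conc = np.concatenate([
    rng.lognormal(mean=3.0, sigma=0.4, size=14000),
    rng.lognormal(mean=4.0, sigma=0.6, size=2500),
])

for n in [100, 500, 2000, 16000]:
    sub = rng.choice(conc, size=n, replace=False)
    # D'Agostino-Pearson test on log-transformed values (log-normality test)
    stat, p = stats.normaltest(np.log(sub))
    print(f"n={n:>6}: p={p:.3g} -> {'reject' if p < 0.05 else 'fail to reject'}")
```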
Preparing for the first meeting with a statistician.
De Muth, James E
2008-12-15
Practical statistical issues that should be considered when performing data collection and analysis are reviewed. The meeting with a statistician should take place early in the research development before any study data are collected. The process of statistical analysis involves establishing the research question, formulating a hypothesis, selecting an appropriate test, sampling correctly, collecting data, performing tests, and making decisions. Once the objectives are established, the researcher can determine the characteristics or demographics of the individuals required for the study, how to recruit volunteers, what type of data are needed to answer the research question(s), and the best methods for collecting the required information. There are two general types of statistics: descriptive and inferential. Presenting data in a more palatable format for the reader is called descriptive statistics. Inferential statistics involve making an inference or decision about a population based on results obtained from a sample of that population. In order for the results of a statistical test to be valid, the sample should be representative of the population from which it is drawn. When collecting information about volunteers, researchers should only collect information that is directly related to the study objectives. Important information that a statistician will require first is an understanding of the type of variables involved in the study and which variables can be controlled by researchers and which are beyond their control. Data can be presented in one of four different measurement scales: nominal, ordinal, interval, or ratio. Hypothesis testing involves two mutually exclusive and exhaustive statements related to the research question. Statisticians should not be replaced by computer software, and they should be consulted before any research data are collected. When preparing to meet with a statistician, the pharmacist researcher should be familiar with the steps of statistical analysis and consider several questions related to the study to be conducted.
Statistical universals reveal the structures and functions of human music.
Savage, Patrick E; Brown, Steven; Sakai, Emi; Currie, Thomas E
2015-07-21
Music has been called "the universal language of mankind." Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation.
Selecting the optimum plot size for a California design-based stream and wetland mapping program.
Lackey, Leila G; Stein, Eric D
2014-04-01
Accurate estimates of the extent and distribution of wetlands and streams are the foundation of wetland monitoring, management, restoration, and regulatory programs. Traditionally, these estimates have relied on comprehensive mapping. However, this approach is prohibitively resource-intensive over large areas, making it both impractical and statistically unreliable. Probabilistic (design-based) approaches to evaluating status and trends provide a more cost-effective alternative because, compared with comprehensive mapping, overall extent is inferred from mapping a statistically representative, randomly selected subset of the target area. In this type of design, the size of sample plots has a significant impact on program costs and on statistical precision and accuracy; however, no consensus exists on the appropriate plot size for remote monitoring of stream and wetland extent. This study utilized simulated sampling to assess the performance of four plot sizes (1, 4, 9, and 16 km²) for three geographic regions of California. Simulation results showed smaller plot sizes (1 and 4 km²) were most efficient for achieving desired levels of statistical accuracy and precision. However, larger plot sizes were more likely to contain rare and spatially limited wetland subtypes. Balancing these considerations led to selection of 4 km² for the California status and trends program.
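The simulated-sampling approach, in which only a random subset of plots is mapped and overall extent is inferred, can be sketched as follows; the synthetic wetland raster, plot counts, and the fixed-cost constraint are illustrative assumptions, not the California data or design:

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic landscape: a 200 x 200 km study area at 1-km resolution with a
# binary wetland layer (illustrative only).
field = rng.random((200, 200)) < 0.03
true_extent = field.sum()

def estimate_extent(plot_km, n_plots):
    """Estimate total wetland cells from n_plots random square plots."""
    side = int(np.sqrt(plot_km))             # plot side length in km
    est = []
    for _ in range(n_plots):
        r = rng.integers(0, 200 - side)
        c = rng.integers(0, 200 - side)
        density = field[r:r+side, c:c+side].mean()
        est.append(density * field.size)      # expand plot density to the area
    return np.mean(est), np.std(est, ddof=1) / np.sqrt(n_plots)

for plot_km in [1, 4, 9, 16]:
    # Hold the total mapped area fixed (~400 km^2) so costs are comparable.
    n_plots = 400 // plot_km
    mean, se = estimate_extent(plot_km, n_plots)
    print(f"{plot_km:>2} km^2 plots: estimate {mean:7.0f} +/- {se:5.0f} "
          f"(true {true_extent})")
```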
Rocha, Ana Cristina; Duarte, Cidália
2015-02-01
The aim of this study was to share Portugal's experience with school-based sexuality education and to describe its implementation at a local level, following an ecological model and using a mixed-methodology approach. The study also examines the impact of the latest policies put into effect, identifying potential strengths and weaknesses affecting the effective delivery of sexuality education. A representative sample of 296 schools in Portugal was analysed. Teachers representing each school completed a questionnaire and were asked to share official documents from their sexuality education project (such as curriculum content). A subsample of these documents was analysed by two coders. Quantitative analysis was carried out using descriptive statistics. The majority of Portuguese schools delivered sexuality education, in line with Portuguese technical guidelines and international recommendations. There were common procedures in the planning, implementation and evaluation of sexuality education. Some strengths and weaknesses were identified. Results highlighted the impact of the various systems on the planning, enforcement and evaluation of sexuality education in school. The latest policies introduced valuable changes in school-based sexuality education. A way of assessing the effectiveness of sexuality education is still needed.
Précis of statistical significance: rationale, validity, and utility.
Chow, S L
1998-04-01
The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.
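The point that statistical power requires two sampling distributions, one under H0 to set the critical value and one under the alternative to evaluate the rejection probability, can be made concrete with a two-sided z-test sketch (all numbers illustrative):

```python
from scipy.stats import norm

alpha, sigma, n = 0.05, 10.0, 25
delta = 4.0                        # assumed true effect under the alternative
se = sigma / n ** 0.5              # standard error of the sample mean

# The critical value comes from the H0 sampling distribution ...
crit = norm.ppf(1 - alpha / 2, loc=0, scale=se)
# ... but power is the probability of exceeding it under the H1 distribution.
power = norm.sf(crit, loc=delta, scale=se) + norm.cdf(-crit, loc=delta, scale=se)
print(f"critical value {crit:.2f}, power {power:.2f}")
```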
Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch
2017-06-06
An important aspect of chemoinformatics and material informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets of noise. RANSAC can be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using an applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
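RANSAC regression is available off the shelf in scikit-learn. A minimal sketch on synthetic linear data with gross outliers (not the metal-oxide solar-cell libraries), showing how the fitted inlier mask performs the outlier removal the abstract describes:

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor, LinearRegression

rng = np.random.default_rng(0)
# Synthetic structure-activity data: one descriptor, a linear response, and
# a handful of gross outliers (illustrative, not a real QSAR dataset).
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(0, 0.5, size=100)
y[:8] += 15.0                              # contaminate 8 samples

ransac = RANSACRegressor(LinearRegression(), residual_threshold=2.0,
                         random_state=0).fit(X, y)
print("slope:", ransac.estimator_.coef_[0])          # recovered ~2.0
print("outliers flagged:", (~ransac.inlier_mask_).sum())
```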
Lange, J H; Lange, P R; Reinhard, T K; Thomulka, K W
1996-08-01
Data were collected and analysed on airborne concentrations of asbestos generated by abatement of different asbestos-containing materials using various removal practices. Airborne concentrations of asbestos vary dramatically among the types of asbestos-containing material being abated. Abatement practices evaluated in this study were removal of boiler/pipe insulation in a crawl space, ceiling tile, transite, floor tile/mastic with traditional methods, and mastic removal with a high-efficiency particulate air filter blast track (shot-blast) machine. In general, abatement of boiler and pipe insulation produced the highest airborne fibre levels, while abatement of floor tile and mastic was observed to produce the lowest. Matched personal and area samples were not significantly different and exhibited a good correlation in regression analysis. After adjusting data for outliers, personal sample fibre concentrations were greater than area sample fibre concentrations. Statistical analysis and sample distributions of airborne asbestos concentrations appear to be best represented in logarithmic form. Area sample fibre concentrations were shown in this study to have larger variability than personal measurements. Evaluation of outliers in fibre concentration data and the ability of these values to skew sample populations is presented. The use of personal and area samples in determining exposure, selecting personal protective equipment, and its historical relevance to future abatement projects is discussed.
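Both the logarithmic representation of fibre concentrations and the personal-versus-area regression can be sketched quickly; the paired concentrations below are simulated stand-ins, with an assumed tendency of area samples to read slightly lower:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Simulated paired fibre concentrations (f/cc): area samples are assumed
# correlated with, and slightly lower than, personal samples.
personal = rng.lognormal(mean=-2.0, sigma=0.8, size=60)
area = personal * rng.lognormal(mean=-0.1, sigma=0.3, size=60)

# Lognormality check: test the log-transformed values for normality.
print("log-personal normality p =", stats.normaltest(np.log(personal)).pvalue)

# Regression of matched personal on area samples, in log space.
res = stats.linregress(np.log(area), np.log(personal))
print(f"slope {res.slope:.2f}, r = {res.rvalue:.2f}")
```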
Sampling pig farms at the abattoir in a cross-sectional study - Evaluation of a sampling method.
Birkegård, Anna Camilla; Halasa, Tariq; Toft, Nils
2017-09-15
A cross-sectional study design is relatively inexpensive, fast and easy to conduct when compared to other study designs. Careful planning is essential to obtaining a representative sample of the population, and the recommended approach is to use simple random sampling from an exhaustive list of units in the target population. This approach is rarely feasible in practice, and other sampling procedures must often be adopted. For example, when slaughter pigs are the target population, sampling the pigs on the slaughter line may be an alternative to on-site sampling at a list of farms. However, it is difficult to sample a large number of farms from an exact predefined list, due to the logistics and workflow of an abattoir. Therefore, it is necessary to have a systematic sampling procedure and to evaluate the obtained sample with respect to the study objective. We propose a method for 1) planning, 2) conducting, and 3) evaluating the representativeness and reproducibility of a cross-sectional study when simple random sampling is not possible. We used an example of a cross-sectional study with the aim of quantifying the association between antimicrobial resistance and antimicrobial consumption in Danish slaughter pigs. It was not possible to visit farms within the designated timeframe. Therefore, it was decided to use convenience sampling at the abattoir. Our approach was carried out in three steps: 1) planning: using data from meat inspection to plan at which abattoirs and how many farms to sample; 2) conducting: sampling was carried out at five abattoirs; 3) evaluation: representativeness was evaluated by comparing sampled and non-sampled farms, and the reproducibility of the study was assessed through simulated sampling based on meat inspection data from the period when the actual data collection was carried out. In the cross-sectional study, samples were taken from 681 Danish pig farms during five weeks from February to March 2015. The evaluation showed that the sampling procedure was reproducible, with results comparable to the collected sample. However, the sampling procedure favoured sampling of large farms. Furthermore, both under-sampled and over-sampled areas were found using scan statistics. In conclusion, sampling conducted at abattoirs can provide a spatially representative sample. Hence it is a possible cost-effective alternative to simple random sampling. However, it is important to assess the properties of the resulting sample so that any potential selection bias can be addressed when reporting the findings. Copyright © 2017 Elsevier B.V. All rights reserved.
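Step 3, evaluating representativeness by comparing sampled with non-sampled farms, reduces to two-sample tests on register variables. In this sketch the annual slaughter counts are simulated (the study used Danish meat-inspection data), with sampled farms deliberately skewed larger to mimic the reported size bias:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(11)
# Simulated annual slaughter counts: sampled farms skew larger, mimicking
# the size bias that convenience sampling at abattoirs can introduce.
sampled = rng.lognormal(mean=7.2, sigma=0.8, size=681)
non_sampled = rng.lognormal(mean=7.0, sigma=0.8, size=5000)

stat, p = mannwhitneyu(sampled, non_sampled, alternative="two-sided")
print(f"median sampled {np.median(sampled):.0f} vs "
      f"non-sampled {np.median(non_sampled):.0f}, P = {p:.3g}")
```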
US Food and Drug Administration survey of methyl mercury in canned tuna
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yess, J.
1993-01-01
Methyl mercury was determined by the US Food and Drug Administration (FDA) in 220 samples of canned tuna collected in 1991. Samples were chosen to represent different styles, colors, and packs as available. Emphasis was placed on water-packed tuna, small can size, and the highest-volume brand names. The average methyl mercury (expressed as Hg) found for the 220 samples was 0.17 ppm; the range was <0.10-0.75 ppm. Statistically, a significantly higher level of methyl mercury was found in solid white and chunk tuna. Methyl mercury level was not related to can size. None of the 220 samples had methyl mercury levels that exceeded the 1 ppm FDA action level. 11 refs., 1 tab.
Heterogenic Solid Biofuel Sampling Methodology and Uncertainty Associated with Prompt Analysis
Pazó, Jose A.; Granada, Enrique; Saavedra, Ángeles; Patiño, David; Collazo, Joaquín
2010-01-01
Accurate determination of the properties of biomass is of particular interest in studies on biomass combustion or cofiring. The aim of this paper is to develop a methodology for prompt analysis of heterogeneous solid fuels with an acceptable degree of accuracy. Special care must be taken with the sampling procedure to achieve an acceptable degree of error and low statistical uncertainty. A sampling and error determination methodology for prompt analysis is presented and validated. Two approaches for the propagation of errors are also given and some comparisons are made in order to determine which may be better in this context. Results show in general low, acceptable levels of uncertainty, demonstrating that the samples obtained in the process are representative of the overall fuel composition. PMID:20559506
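The two error-propagation approaches being compared can be sketched generically: first-order (Taylor) propagation against Monte Carlo simulation. The derived quantity q = m/(a*b) and all uncertainties below are placeholders, not the paper's variables:

```python
import numpy as np

rng = np.random.default_rng(5)
# Assumed means and standard deviations for three measured quantities.
m, sm = 50.0, 1.5
a, sa = 2.0, 0.08
b, sb = 5.0, 0.10

# Approach 1: first-order (Taylor) propagation for q = m / (a * b);
# relative variances add for products and quotients.
q = m / (a * b)
sq_taylor = q * np.sqrt((sm / m) ** 2 + (sa / a) ** 2 + (sb / b) ** 2)

# Approach 2: Monte Carlo propagation with assumed normal input errors.
qs = (rng.normal(m, sm, 100_000)
      / (rng.normal(a, sa, 100_000) * rng.normal(b, sb, 100_000)))

print(f"Taylor:      {q:.3f} +/- {sq_taylor:.3f}")
print(f"Monte Carlo: {qs.mean():.3f} +/- {qs.std(ddof=1):.3f}")
```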
NASA Astrophysics Data System (ADS)
Bruns, S.; Stipp, S. L. S.; Sørensen, H. O.
2017-09-01
Digital rock physics carries the dogmatic concept of having to segment volume images for quantitative analysis, but segmentation rejects huge amounts of signal information. Information essential for the analysis of difficult and marginally resolved samples, such as materials with very small features, is lost during segmentation. In X-ray nanotomography reconstructions of Hod chalk we observed partial volume voxels with an abundance that limits segmentation-based analysis. Therefore, we investigated the suitability of greyscale analysis for establishing statistical representative elementary volumes (sREV) for the important petrophysical parameters of this type of chalk, namely porosity, specific surface area and diffusive tortuosity, by using volume images without segmenting the datasets. Instead, grey-level intensities were transformed into voxel-level porosity estimates using a Gaussian mixture model. A simple model assumption was made that allowed formulating a two-point correlation function for surface area estimates using Bayes' theorem. The same assumption enables random walk simulations in the presence of severe partial volume effects. The established sREVs illustrate that in compacted chalk, these simulations cannot be performed in binary representations without increasing the resolution of the imaging system to a point where the spatial restrictions of the represented sample volume render the precision of the measurement unacceptable. We illustrate this by analyzing the origins of variance in the quantitative analysis of volume images, i.e. resolution dependence and intersample and intrasample variance. Although we cannot make any claims on the accuracy of the approach, eliminating the segmentation step from the analysis enables comparative studies with higher precision and repeatability.
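The greyscale step, turning voxel intensities into porosity estimates without segmentation, can be sketched with a two-component Gaussian mixture; the intensity model, end-member values, and the linear rescaling are assumptions for illustration, not the paper's calibration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(9)
# Synthetic grey levels: pore voxels, solid voxels, and partial-volume
# voxels whose intensity mixes the two end-members.
pore, solid = rng.normal(50, 8, 30_000), rng.normal(180, 10, 60_000)
frac = rng.random(10_000)                       # partial-volume pore fractions
mixed = frac * 50 + (1 - frac) * 180 + rng.normal(0, 8, 10_000)
grey = np.concatenate([pore, solid, mixed]).reshape(-1, 1)

# Fit a two-component mixture to recover the end-member intensities.
gmm = GaussianMixture(n_components=2, random_state=0).fit(grey)
mu = np.sort(gmm.means_.ravel())                # [pore mean, solid mean]

# Voxel-level porosity: linearly rescale intensity between the end-member
# means, clipped to [0, 1]; sample porosity is then a simple average.
phi = np.clip((mu[1] - grey.ravel()) / (mu[1] - mu[0]), 0.0, 1.0)
print(f"estimated porosity {phi.mean():.3f}")
```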
NASA Astrophysics Data System (ADS)
Feyen, Luc; Caers, Jef
2006-06-01
In this work, we address the problem of characterizing the heterogeneity and uncertainty of hydraulic properties for complex geological settings. Here, we distinguish between two scales of heterogeneity, namely the hydrofacies structure and the intrafacies variability of the hydraulic properties. We employ multiple-point geostatistics to characterize the hydrofacies architecture. The multiple-point statistics are borrowed from a training image that is designed to reflect the prior geological conceptualization. The intrafacies variability of the hydraulic properties is represented using conventional two-point correlation methods, more precisely, spatial covariance models under a multi-Gaussian spatial law. We address the different levels and sources of uncertainty in characterizing the subsurface heterogeneity, and explore their effect on groundwater flow and transport predictions. Typically, uncertainty is assessed by way of many images, termed realizations, of a fixed statistical model. However, in many cases, sampling from a fixed stochastic model does not adequately represent the space of uncertainty. It neglects the uncertainty related to the selection of the stochastic model and the estimation of its input parameters. We acknowledge the uncertainty inherent in the definition of the prior conceptual model of aquifer architecture and in the estimation of global statistics, anisotropy, and correlation scales. Spatial bootstrap is used to assess the uncertainty of the unknown statistical parameters. As an illustrative example, we employ a synthetic field that represents a fluvial setting consisting of an interconnected network of channel sands embedded within finer-grained floodplain material. For this highly non-stationary setting we quantify the groundwater flow and transport model prediction uncertainty for various levels of hydrogeological uncertainty. Results indicate the importance of accurately describing the facies geometry, especially for transport predictions.
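A spatial bootstrap differs from the ordinary bootstrap in resampling spatially coherent blocks so that correlation structure is preserved. A minimal one-dimensional sketch on a synthetic correlated transect (the block length and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(21)
# Synthetic correlated transect of log-hydraulic-conductivity values, built
# as a moving average of white noise (illustrative only).
noise = rng.normal(size=520)
lnK = np.convolve(noise, np.ones(20) / 20, mode="valid")   # length 501

block, n = 50, len(lnK)
means = []
for _ in range(2000):
    # Resample contiguous blocks with replacement until the series is rebuilt.
    starts = rng.integers(0, n - block, size=n // block)
    resampled = np.concatenate([lnK[s:s + block] for s in starts])
    means.append(resampled.mean())

lo, hi = np.percentile(means, [2.5, 97.5])
print(f"mean lnK {lnK.mean():.3f}, spatial-bootstrap 95% CI ({lo:.3f}, {hi:.3f})")
```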
Surveying immigrants without sampling frames - evaluating the success of alternative field methods.
Reichel, David; Morales, Laura
2017-01-01
This paper evaluates the sampling methods of an international survey, the Immigrant Citizens Survey, which aimed at surveying immigrants from outside the European Union (EU) in 15 cities in seven EU countries. In five countries, no sample frame was available for the target population. Consequently, alternative ways to obtain a representative sample had to be found. In three countries 'location sampling' was employed, while in two countries traditional methods were used with adaptations to reach the target population. The paper assesses the main methodological challenges of carrying out a survey among a group of immigrants for whom no sampling frame exists. The samples of the survey in these five countries are compared to results of official statistics in order to assess the accuracy of the samples obtained through the different sampling methods. It can be shown that alternative sampling methods can provide meaningful results in terms of core demographic characteristics although some estimates differ to some extent from the census results.
Multiscale dispersion-state characterization of nanocomposites using optical coherence tomography
Schneider, Simon; Eppler, Florian; Weber, Marco; Olowojoba, Ganiu; Weiss, Patrick; Hübner, Christof; Mikonsaari, Irma; Freude, Wolfgang; Koos, Christian
2016-01-01
Nanocomposite materials represent a success story of nanotechnology. However, development of nanomaterial fabrication still suffers from the lack of adequate analysis tools. In particular, achieving and maintaining well-dispersed particle distributions is a key challenge, both in material development and industrial production. Conventional methods like optical or electron microscopy need laborious, costly sample preparation and do not permit fast extraction of nanoscale structural information from statistically relevant sample volumes. Here we show that optical coherence tomography (OCT) represents a versatile tool for nanomaterial characterization, both in a laboratory and in a production environment. The technique does not require sample preparation and is applicable to a wide range of solid and liquid material systems. Large particle agglomerates can be directly found by OCT imaging, whereas dispersed nanoparticles are detected by model-based analysis of depth-dependent backscattering. Using a model system of polystyrene nanoparticles, we demonstrate nanoparticle sizing with high accuracy. We further prove the viability of the approach by characterizing highly relevant material systems based on nanoclays or carbon nanotubes. The technique is perfectly suited for in-line metrology in a production environment, which is demonstrated using a state-of-the-art compounding extruder. These experiments represent the first demonstration of multiscale nanomaterial characterization using OCT. PMID:27557544
NASA Astrophysics Data System (ADS)
Poppe, Sam; Barette, Florian; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu
2016-04-01
The Virunga Volcanic Province (VVP) is situated within the western branch of the East African Rift. The geochemistry and petrology of its volcanic products have been studied extensively, but in a fragmented manner. They represent a unique collection of silica-undersaturated, ultra-alkaline and ultra-potassic compositions, displaying marked geochemical variations over the area occupied by the VVP. We present a novel spatially explicit database of existing whole-rock geochemical analyses of the VVP volcanics, compiled from international publications, (post-)colonial scientific reports and PhD theses. In the database, a total of 703 geochemical analyses of whole-rock samples collected from the 1950s until recently have been characterised with a geographical location, eruption source location, analytical results and uncertainty estimates for each of these categories. Comparative box plots and Kruskal-Wallis H tests on subsets of analyses with contrasting ages or analytical methods suggest that the overall database accuracy is consistent. We demonstrate how statistical techniques such as Principal Component Analysis (PCA) and subsequent cluster analysis allow the identification of clusters of samples with similar major-element compositions. The spatial patterns represented by the contrasting clusters show that both historically active volcanoes represent compositional clusters that can be identified based on their contrasting silica and alkali contents. Furthermore, two sample clusters are interpreted to represent the most primitive, deep magma source within the VVP, distinct from the shallow magma reservoirs that feed the eight dominant large volcanoes. The samples from these two clusters systematically originate from locations that (1) are distal to the eight large volcanoes and (2) mostly coincide with the surface expressions of rift faults or NE-SW-oriented inherited Precambrian structures that were reactivated during rifting. The lava from the Mugogo eruption of 1957 belongs to these primitive clusters and is the only one known to have erupted outside the current rift valley in historical times. We thus infer a distributed vent-opening hazard in addition to the susceptibility associated with the main Virunga edifices. This study suggests that the statistical analysis of such a geochemical database may help in understanding complex volcanic plumbing systems and the spatial distribution of volcanic hazards in active and poorly known volcanic areas such as the Virunga Volcanic Province.
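The PCA-plus-cluster-analysis workflow described here is standard; a sketch on a simulated major-element table (the values are random stand-ins for the 703 compiled analyses, and the cluster count is an assumption):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(17)
# Simulated whole-rock major-element table (wt%) standing in for the 703
# compiled VVP analyses; columns: SiO2, TiO2, Al2O3, FeOt, MgO, CaO, Na2O, K2O.
X = np.column_stack([
    rng.normal(45, 4, 703), rng.normal(3, 1, 703), rng.normal(13, 2, 703),
    rng.normal(11, 2, 703), rng.normal(7, 3, 703), rng.normal(11, 2, 703),
    rng.normal(3, 1, 703), rng.normal(4, 2, 703),
])

# Standardize, project onto the first two principal components, then cluster.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)

# Mapping labels back to sample coordinates would reveal the spatial
# patterns; here we just report cluster sizes and mean silica per cluster.
for c in range(4):
    print(f"cluster {c}: n = {(labels == c).sum()}, "
          f"mean SiO2 = {X[labels == c, 0].mean():.1f} wt%")
```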
NASA Astrophysics Data System (ADS)
White, Susan C.
2016-11-01
We are continuing our examination of very different physics availability numbers reported by AIP Statistics and the U.S. Department of Education's Office for Civil Rights (OCR). The essential difference appears to be the number of schools included in the denominator. The U.S. Department of Education classifies schools into one of five types based upon the curriculum offered: regular, special education, vocational, alternative, and a fifth classification for schools that do not fit into any of the first four. In AIP Statistics' Quadrennial Survey of High School Physics Teachers, data are collected from a nationally representative sample of all public and private regular and vocational schools that have at least three seniors (students enrolled in 12th grade).
Protein Multiplexed Immunoassay Analysis with R.
Breen, Edmond J
2017-01-01
Plasma samples from 177 control and type 2 diabetes patients collected at three Australian hospitals were screened for 14 analytes using six custom-made multiplex kits across 60 96-well plates. In total, 354 samples were collected from the patients, representing one baseline and one end-point sample from each patient. R methods and source code for analyzing the analyte fluorescence response obtained from these samples by Luminex Bio-Plex® xMAP multiplexed immunoassay technology are disclosed. Techniques and R procedures for reading Bio-Plex® result files for statistical analysis and data visualization are also presented. The need for technical replicates and the number of technical replicates are addressed, as well as plate-layout design strategies. Multinomial regression is used to determine plate-to-sample covariate balance. Methods for matching clinical covariate information to Bio-Plex® results, and vice versa, are given, as are methods for measuring and inspecting the quality of the fluorescence responses. Both fixed- and mixed-effect approaches to immunoassay statistical differential analysis are presented and discussed. A random-effect approach to outlier analysis and detection is also shown. The bioinformatics R methodology presented here provides a foundation for rigorous and reproducible analysis of the fluorescence response obtained from multiplexed immunoassays.
ERIC Educational Resources Information Center
Radford, Alexandria Walton; Berkner, Lutz
2011-01-01
This Statistics in Brief applies IRS rules and data to a nationally representative sample of 2007-08 undergraduates to estimate who received education tax benefits and looks at the extent to which these benefits shaped their price of college attendance. Key findings include: (1) Nearly one-half of all 2007-08 undergraduates were estimated to have…
Overlay improvement methods with diffraction based overlay and integrated metrology
NASA Astrophysics Data System (ADS)
Nam, Young-Sun; Kim, Sunny; Shin, Ju Hee; Choi, Young Sin; Yun, Sang Ho; Kim, Young Hoon; Shin, Si Woo; Kong, Jeong Heung; Kang, Young Seog; Ha, Hun Hwan
2015-03-01
To meet the new requirement of securing more overlay margin, optical overlay measurement not only faces technical limitations in representing cell-pattern behavior, but larger measurement samples are also needed to minimize statistical errors and better estimate conditions within a lot. For these reasons, diffraction based overlay (DBO) and integrated metrology (IM) are proposed in this paper as new approaches for overlay enhancement.
Evaluation of errors in quantitative determination of asbestos in rock
NASA Astrophysics Data System (ADS)
Baietto, Oliviero; Marini, Paola; Vitaliti, Martina
2016-04-01
The quantitative determination of the asbestos content of rock matrices is a complex operation that is susceptible to substantial errors. The principal methodologies for the analysis are Scanning Electron Microscopy (SEM) and Phase Contrast Optical Microscopy (PCOM). Although the resolution of PCOM is inferior to that of SEM, PCOM analysis has several advantages, including better representativity of the analyzed sample, more effective recognition of chrysotile and a lower cost. The DIATI LAA internal methodology for PCOM analysis is based on mild grinding of a rock sample, its subdivision into 5-6 grain-size classes smaller than 2 mm and a subsequent microscopic analysis of a portion of each class. PCOM relies on the optical properties of asbestos and of the liquids of known refractive index in which the particles under analysis are immersed. The error evaluation in the analysis of rock samples, contrary to the analysis of airborne filters, cannot be based on a statistical distribution. For airborne filters, a Poisson distribution, which theoretically defines the variation in fiber counts over analysis fields chosen randomly on the filter, can be applied. The analysis of rock matrices, by contrast, cannot lean on any statistical distribution, because the most important object of the analysis is the size of the asbestiform fibers and fiber bundles observed, and the resulting ratio between the weight of the fibrous component and that of the granular one. The error estimates generally provided by public and private institutions vary between 50 and 150 percent; there are, however, no specific studies that discuss the origin of the error or that link it to the asbestos content. Our work aims to provide a reliable estimate of the error in relation to the applied methodologies and to the total asbestos content, especially for values close to the legal limits. The error assessment must be made through repetition of the same analysis on the same sample, to estimate both the error on the representativeness of the sample and the error related to the sensitivity of the operator, in order to provide a sufficiently reliable uncertainty for the method. We used about 30 natural rock samples with different asbestos contents, performing 3 analyses on each sample to obtain a trend sufficiently representative of the percentage. Furthermore, on one chosen sample we performed 10 repetitions of the analysis to define more specifically the error of the methodology.
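A minimal sketch of the replicate-based error estimate described here (all values are hypothetical): the spread of repeated analyses of the same sample gives a relative uncertainty that can then be compared across asbestos contents:

    import numpy as np

    # replicate_pcts[i, j]: asbestos content (%) from the j-th of 3 repeated
    # analyses of rock sample i (hypothetical values near a legal limit)
    replicate_pcts = np.array([[1.2, 0.9, 1.4],
                               [0.3, 0.5, 0.4],
                               [2.1, 1.7, 2.4]])
    means = replicate_pcts.mean(axis=1)
    sds = replicate_pcts.std(axis=1, ddof=1)    # sample standard deviation
    rel_err_pct = 100.0 * sds / means           # relative error per sample
    print(np.round(rel_err_pct, 1))             # typically larger at low contents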
Ngwakongnwi, Emmanuel; King-Shier, Kathryn M; Hemmelgarn, Brenda R; Musto, Richard; Quan, Hude
2014-01-01
Francophones who live outside the primarily French-speaking province of Quebec, Canada, risk being excluded from research by lack of a sampling frame. We examined the adequacy of random sampling, advertising, and respondent-driven sampling for recruitment of francophones for survey research. We recruited francophones residing in the city of Calgary, Alberta, through advertising and respondent-driven sampling. These 2 samples were then compared with a random subsample of Calgary francophones derived from the 2006 Canadian Community Health Survey (CCHS). We assessed the effectiveness of advertising and respondent-driven sampling in relation to the CCHS sample by comparing demographic characteristics and selected items from the CCHS (specifically self-reported general health status, perceived weight, and having a family doctor). We recruited 120 francophones through advertising and 145 through respondent-driven sampling; the random sample from the CCHS consisted of 259 records. The samples derived from advertising and respondent-driven sampling differed from the CCHS in terms of age (mean ages 41.0, 37.6, and 42.5 years, respectively), sex (proportion of males 26.1%, 40.6%, and 56.6%, respectively), education (college or higher 86.7%, 77.9%, and 59.1%, respectively), place of birth (immigrants accounting for 45.8%, 55.2%, and 3.7%, respectively), and not having a regular medical doctor (16.7%, 34.5%, and 16.6%, respectively). Differences were not tested statistically because of limitations on the analysis of CCHS data imposed by Statistics Canada. The samples generated exclusively through advertising and respondent-driven sampling were not representative of the gold-standard sample from the CCHS. Use of such biased samples for research studies could generate misleading results.
Data Analysis and Statistical Methods for the Assessment and Interpretation of Geochronologic Data
NASA Astrophysics Data System (ADS)
Reno, B. L.; Brown, M.; Piccoli, P. M.
2007-12-01
Ages are traditionally reported as a weighted mean with an uncertainty based on least squares analysis of analytical error on individual dates. This method does not take into account geological uncertainties and cannot accommodate asymmetries in the data. In most instances, it will understate the uncertainty on a given age, which may lead to over-interpretation of age data. Geologic uncertainty is difficult to quantify, but is typically greater than analytical uncertainty. These factors make traditional statistical approaches inadequate to fully evaluate geochronologic data. We propose a protocol to assess populations within multi-event datasets and to calculate age and uncertainty from each population of dates interpreted to represent a single geologic event using robust and resistant statistical methods. To assess whether populations thought to represent different events are statistically separate, exploratory data analysis is undertaken using a box plot, in which the range of the data is represented by a 'box' of length given by the interquartile range, divided at the median of the data, with 'whiskers' that extend to the furthest datapoint lying within 1.5 times the interquartile range beyond the box. If the boxes representing the populations do not overlap, they are interpreted to represent statistically different sets of dates. Ages are calculated from statistically distinct populations using a robust tool such as the tanh method of Kelsey et al. (2003, CMP, 146, 326-340), which is insensitive to assumptions about the underlying probability distribution from which the data are drawn. This method therefore takes into account the full range of the data and is not drastically affected by outliers. The interquartile range of each population of dates gives a first-pass expression of uncertainty, which accommodates asymmetry in the dataset; outliers have a minor effect on the uncertainty. To better quantify the uncertainty, a resistant tool that is insensitive to local misbehavior of the data is preferred, such as the normalized median absolute deviation proposed by Powell et al. (2002, Chem Geol, 185, 191-204). We illustrate the method using a dataset of 152 monazite dates determined using EPMA chemical data from a single sample from the Neoproterozoic Brasília Belt, Brazil. Results are compared with ages and uncertainties calculated using traditional methods to demonstrate the differences. The dataset was manually culled into three populations representing discrete compositional domains within chemically zoned monazite grains. The weighted mean ages and least squares uncertainties for these populations are 633±6 (2σ) Ma for a core domain, 614±5 (2σ) Ma for an intermediate domain and 595±6 (2σ) Ma for a rim domain. Probability distribution plots indicate asymmetric distributions for all populations, which cannot be accounted for with traditional statistical tools. The three domains record distinct ages, with non-overlapping interquartile ranges: the core domain lies in the subrange 642-624 Ma, the intermediate domain 617-609 Ma and the rim domain 606-589 Ma. The tanh estimator yields ages of 631±7 (2σ) for the core domain, 616±7 (2σ) for the intermediate domain and 601±8 (2σ) for the rim domain.
Whereas the uncertainties derived using a resistant statistical tool are larger than those derived from traditional statistical tools, the method yields more realistic uncertainties that better address the spread in the dataset and account for asymmetry in the data.
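The robust and resistant summaries discussed above can be computed directly; here is a sketch using the median, the interquartile range, and the normalized median absolute deviation (1.4826 × MAD, consistent with a Gaussian sigma) as a simple stand-in for the tanh estimator:

    import numpy as np

    def robust_age_summary(dates_ma):
        dates = np.asarray(dates_ma, dtype=float)
        med = np.median(dates)                          # resistant central age
        q1, q3 = np.percentile(dates, [25, 75])         # interquartile range
        nmad = 1.4826 * np.median(np.abs(dates - med))  # normalized MAD
        return med, (q1, q3), nmad

    # Hypothetical population of monazite dates (Ma) for one compositional domain
    med, iqr, nmad = robust_age_summary([642, 638, 633, 631, 629, 627, 624])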
Environmental Health Practice: Statistically Based Performance Measurement
Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.
2007-01-01
Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
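A sketch of the testing scheme described above, with hypothetical 2×2 counts (compliant versus non-compliant facilities at baseline and post-intervention) and a plain Bonferroni threshold standing in for the modified adjustment:

    from scipy.stats import fisher_exact

    # Hypothetical [[baseline pass, baseline fail], [post pass, post fail]]
    tables = {
        "occupational health": [[40, 42], [60, 22]],
        "air pollution":       [[35, 47], [55, 27]],
        "hazardous waste":     [[50, 32], [66, 16]],
        "wastewater":          [[45, 37], [58, 24]],
    }
    alpha, m = 0.05, len(tables)
    for name, tab in tables.items():
        odds, p = fisher_exact(tab)
        print(f"{name}: p={p:.4f} significant={p < alpha / m}")  # Bonferroni cutoff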
Improved Estimation and Interpretation of Correlations in Neural Circuits
Yatsenko, Dimitri; Josić, Krešimir; Ecker, Alexander S.; Froudarakis, Emmanouil; Cotton, R. James; Tolias, Andreas S.
2015-01-01
Ambitious projects aim to record the activity of ever larger and denser neuronal populations in vivo. Correlations in neural activity measured in such recordings can reveal important aspects of neural circuit organization. However, estimating and interpreting large correlation matrices is statistically challenging. Estimation can be improved by regularization, i.e. by imposing a structure on the estimate. The amount of improvement depends on how closely the assumed structure represents dependencies in the data. Therefore, the selection of the most efficient correlation matrix estimator for a given neural circuit must be determined empirically. Importantly, the identity and structure of the most efficient estimator informs about the types of dominant dependencies governing the system. We sought statistically efficient estimators of neural correlation matrices in recordings from large, dense groups of cortical neurons. Using fast 3D random-access laser scanning microscopy of calcium signals, we recorded the activity of nearly every neuron in volumes 200 μm wide and 100 μm deep (150–350 cells) in mouse visual cortex. We hypothesized that in these densely sampled recordings, the correlation matrix should be best modeled as the combination of a sparse graph of pairwise partial correlations representing local interactions and a low-rank component representing common fluctuations and external inputs. Indeed, in cross-validation tests, the covariance matrix estimator with this structure consistently outperformed other regularized estimators. The sparse component of the estimate defined a graph of interactions. These interactions reflected the physical distances and orientation tuning properties of cells: The density of positive ‘excitatory’ interactions decreased rapidly with geometric distances and with differences in orientation preference whereas negative ‘inhibitory’ interactions were less selective. Because of its superior performance, this ‘sparse+latent’ estimator likely provides a more physiologically relevant representation of the functional connectivity in densely sampled recordings than the sample correlation matrix. PMID:25826696
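A simplified stand-in for the estimator comparison described above: scikit-learn's cross-validated graphical lasso yields only the sparse part of the picture (it lacks the low-rank latent component of the paper's 'sparse+latent' estimator), but its regularized precision matrix can be contrasted with the raw sample covariance:

    import numpy as np
    from sklearn.covariance import GraphicalLassoCV

    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 40))      # hypothetical: time bins x neurons

    est = GraphicalLassoCV().fit(X)     # penalty chosen by cross-validation
    sparse_precision = est.precision_   # zeros = absent pairwise interactions
    sample_cov = np.cov(X, rowvar=False)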
The Apollo 16 regolith - A petrographically-constrained chemical mixing model
NASA Technical Reports Server (NTRS)
Kempa, M. J.; Papike, J. J.; White, C.
1980-01-01
A mixing model for Apollo 16 regolith samples has been developed, which differs from other A-16 mixing models in that it is both petrographically constrained and statistically sound. The model was developed using three components representative of rock types present at the A-16 site, plus a representative mare basalt. A linear least-squares fitting program employing the chi-squared test and sum of components was used to determine goodness of fit. Results for surface soils indicate that either there are no significant differences between Cayley and Descartes material at the A-16 site or, if differences do exist, they have been obscured by meteoritic reworking and mixing of the lithologies.
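The mixing-model fit can be sketched as a non-negative least-squares problem; the end-member compositions below are hypothetical placeholders, not the actual A-16 component values:

    import numpy as np
    from scipy.optimize import nnls

    # Columns: hypothetical end-member compositions (rows = oxides, wt%)
    A = np.array([[45.0, 44.5, 46.0, 45.5],    # SiO2
                  [27.0,  6.0, 15.0, 10.0],    # Al2O3
                  [ 4.5, 20.0, 10.0, 18.0]])   # FeO
    b = np.array([45.2, 16.0, 11.0])           # measured soil composition

    x, _ = nnls(A, b)                       # non-negative component proportions
    x = x / x.sum()                         # normalize: sum of components = 1
    chi2 = float(np.sum((A @ x - b) ** 2))  # simple (unweighted) goodness-of-fit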
Fasoli, Marianna; Dal Santo, Silvia; Zenoni, Sara; Tornielli, Giovanni Battista; Farina, Lorenzo; Zamboni, Anita; Porceddu, Andrea; Venturini, Luca; Bicego, Manuele; Murino, Vittorio; Ferrarini, Alberto; Delledonne, Massimo; Pezzotti, Mario
2012-09-01
We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ∼91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.
Talbott, Mariah J; Servid, Sarah A; Cavinato, Anna G; Van Eenennaam, Joel P; Doroshov, Serge I; Struffenegger, Peter; Webb, Molly A H
2014-02-01
Assessing stage of oocyte maturity in female sturgeon by calculating oocyte polarization index (PI) is a necessary tool for both conservation propagation managers and caviar producers to know when to hormonally induce spawning. We tested the assumption that sampling ovarian follicles from one section of one ovary is sufficient for calculating an oocyte PI representative of oocyte maturity for an individual animal. Short-wavelength near-infrared spectroscopy (SW-NIR) scans were performed on three positions per ovary for five fish prior to caviar harvest. Samples of ovarian follicles were subsequently taken from the exact location of the SW-NIR scans for calculation of oocyte PI and follicle diameter. Oocyte PI was statistically different though not biologically relevant within an ovary and between ovaries in four of five fish. Follicle diameter was statistically different but not biologically relevant within an ovary in three of five fish. There were no differences in follicle diameter between ovaries. No statistical differences were observed between SW-NIR spectra collected at different locations within an ovary or between ovaries. These results emphasize the importance of utilizing both oocyte PI measurement and progesterone-induced oocyte maturation assays while deciding when to hormonally induce spawning in sturgeon females.
Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods.
Vizcaíno, Iván P; Carrera, Enrique V; Muñoz-Romero, Sergio; Cumbal, Luis H; Rojo-Álvarez, José Luis
2017-10-16
Pollution of water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniform spatio-temporal sampled data structure for characterizing complex dynamic phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatial-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information in the quality parameter measurements through Support Vector Regression (SVR) based on Mercer's kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown by statistically interpolated spatial-temporal maps. The best interpolation performance in terms of mean absolute error was achieved by the SVR with Mercer's kernel given by either the Mahalanobis spatial-temporal covariance matrix or the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in estimation quality, consistently for all six water quality variables, which points out the relevance of including a priori knowledge of the problem. PMID:29035333
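The precomputed-kernel mechanism behind this approach can be sketched as follows; the separable Gaussian kernel below is a simple stand-in for the estimated autocorrelation kernel, and all coordinates, length scales and values are hypothetical:

    import numpy as np
    from sklearn.svm import SVR

    def st_kernel(U, V, ls_space=1.0, ls_time=5.0):
        # Separable spatio-temporal kernel over (position, time) pairs; the
        # paper instead builds the kernel from the bivariate autocorrelation.
        d2 = ((U[:, None, 0] - V[None, :, 0]) / ls_space) ** 2 \
           + ((U[:, None, 1] - V[None, :, 1]) / ls_time) ** 2
        return np.exp(-0.5 * d2)

    X_tr = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 7.0]])  # (km along river, day)
    y_tr = np.array([6.8, 7.1, 6.5])                       # e.g. dissolved oxygen

    svr = SVR(kernel="precomputed", C=10.0, epsilon=0.05)
    svr.fit(st_kernel(X_tr, X_tr), y_tr)
    y_new = svr.predict(st_kernel(np.array([[2.0, 4.0]]), X_tr))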
Lunar Samples: Apollo Collection Tools, Curation Handling, Surveyor III and Soviet Luna Samples
NASA Technical Reports Server (NTRS)
Allton, J.H.
2009-01-01
The 6 Apollo missions that landed on the lunar surface returned 2196 samples with a combined mass of 382 kg. The 58 samples weighing 21.5 kg collected on Apollo 11 expanded to 741 samples weighing 110.5 kg by the time of Apollo 17. The main goal on Apollo 11 was to obtain some material and return it safely to Earth. As we gained experience, the sampling tools and a more specific sampling strategy evolved. A summary of the sample types returned is shown in Table 1. By 1989, some statistics on allocation by sample type had been compiled [2]. The "scientific interest index" is based on the assumption that the more allocations per gram of sample, the higher the scientific interest; it is basically a reflection of the amount of diversity within a given sample type. Samples were also set aside for biohazard testing. The samples set aside and used for biohazard testing were representative, as opposed to diverse; they tended to be larger and to be comprised of less scientifically valuable material, such as dust and debris from the bottom of sample containers.
ERIC Educational Resources Information Center
White, Susan; Tesfaye, Casey Langer
2014-01-01
Since 1987, the Statistical Research Center at the American Institute of Physics has regularly conducted a nationwide survey of high school physics teachers to take a closer look at physics in U.S. high schools. We contact all of the teachers who teach at least one physics course at a nationally representative sample of all U.S. high schools-both…
Noninformative prior in the quantum statistical model of pure states
NASA Astrophysics Data System (ADS)
Tanaka, Fuyuhiko
2012-06-01
In the present paper, we consider a suitable definition of a noninformative prior on the quantum statistical model of pure states. While the full pure-states model is invariant under unitary rotation and admits the Haar measure, restricted models, which we often see in quantum channel estimation and quantum process tomography, have less symmetry and no compelling rationale for any choice. We adopt a game-theoretic approach that is applicable to classical Bayesian statistics and yields a noninformative prior for a general class of probability distributions. We define the quantum detection game and show that there exist noninformative priors for a general class of a pure-states model. Theoretically, it gives one of the ways that we represent ignorance on the given quantum system with partial information. Practically, our method proposes a default distribution on the model in order to use the Bayesian technique in the quantum-state tomography with a small sample.
Swahn, Monica H; Bossarte, Robert M
2007-08-01
To examine the cross-sectional associations between preteen alcohol use initiation and subsequent suicide ideation and attempts for boys and girls in a nationally representative sample of high school students. Analyses are computed using data from the 2005 national Youth Risk Behavior Survey, which includes a representative sample (n = 13,639) of high-school students in grades 9-12 in the United States. Cross-sectional logistic regression analyses were conducted to determine the associations between early alcohol use and reports of suicide ideation and suicide attempts for boys and girls while controlling for demographic characteristics, substance use, involvement in physical fights, weapon carrying, physical abuse by dating partner, sexual assault, and sadness. Among study participants, 25.4% reported drinking before age 13 years. Preteen alcohol use initiation was statistically significantly associated with suicidal ideation (adjusted OR = 1.89, 95% CI = 1.46-2.44) and suicide attempts (adjusted OR = 2.71, 95% CI = 1.82-4.02) relative to nondrinkers. Preteen alcohol use initiation was statistically significantly associated with suicidal ideation and attempts relative to nondrinkers for both boys and girls. Alcohol use among adolescents, particularly preteen alcohol use initiation, is an important risk factor for both suicide ideation and suicide attempts among boys and girls. Increased efforts to delay and reduce early alcohol use are needed, and may reduce suicide attempts.
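The adjusted odds ratios above come from logistic regression; a bare-bones Python sketch of the computation (entirely synthetic data, and without the complex survey weights a YRBS analysis would require):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "ideation": rng.binomial(1, 0.2, 1000),        # synthetic outcome
        "preteen_drink": rng.binomial(1, 0.25, 1000),  # drank before age 13
        "age": rng.integers(14, 19, 1000),
    })
    fit = smf.logit("ideation ~ preteen_drink + age", data=df).fit(disp=0)
    adj_or = np.exp(fit.params["preteen_drink"])        # adjusted odds ratio
    ci_lo, ci_hi = np.exp(fit.conf_int().loc["preteen_drink"])  # 95% CI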
NASA Astrophysics Data System (ADS)
Žukovič, Milan; Hristopulos, Dionissios T.
2009-02-01
A current problem of practical significance is how to analyze large, spatially distributed, environmental data sets. The problem is more challenging for variables that follow non-Gaussian distributions. We show by means of numerical simulations that the spatial correlations between variables can be captured by interactions between 'spins'. The spins represent multilevel discretizations of environmental variables with respect to a number of pre-defined thresholds. The spatial dependence between the 'spins' is imposed by means of short-range interactions. We present two approaches, inspired by the Ising and Potts models, that generate conditional simulations of spatially distributed variables from samples with missing data. Currently, the sampling and simulation points are assumed to be at the nodes of a regular grid. The conditional simulations of the 'spin system' are forced to respect locally the sample values and the system statistics globally. The second constraint is enforced by minimizing a cost function representing the deviation between normalized correlation energies of the simulated and the sample distributions. In the approach based on the Nc-state Potts model, each point is assigned to one of Nc classes. The interactions involve all the points simultaneously. In the Ising model approach, a sequential simulation scheme is used: the discretization at each simulation level is binomial (i.e., ± 1). Information propagates from lower to higher levels as the simulation proceeds. We compare the two approaches in terms of their ability to reproduce the target statistics (e.g., the histogram and the variogram of the sample distribution), to predict data at unsampled locations, as well as in terms of their computational complexity. The comparison is based on a non-Gaussian data set (derived from a digital elevation model of the Walker Lake area, Nevada, USA). We discuss the impact of relevant simulation parameters, such as the domain size, the number of discretization levels, and the initial conditions.
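A greatly simplified sketch of the Ising-style conditional simulation: free nodes are updated by Metropolis moves while nodes with data keep their observed values (all parameters are hypothetical, and the paper instead minimizes a correlation-energy cost between simulation and sample):

    import numpy as np

    rng = np.random.default_rng(2)
    n, J, beta = 32, 1.0, 0.8
    spins = rng.choice([-1, 1], size=(n, n))   # one binary discretization level
    fixed = rng.random((n, n)) < 0.10          # ~10% of nodes carry sample data
    spins[fixed] = 1                           # observed values (hypothetical)

    for _ in range(50 * n * n):                # Metropolis updates, free nodes only
        i, j = rng.integers(n, size=2)
        if fixed[i, j]:
            continue
        nb = spins[(i + 1) % n, j] + spins[i - 1, j] + spins[i, (j + 1) % n] + spins[i, j - 1]
        dE = 2.0 * J * spins[i, j] * nb        # energy change if this spin flips
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1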
Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B; Chen, Li; Wang, Yue; Clarke, Robert
2012-08-01
Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive 'noise' in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. xuan@vt.edu Supplementary data are available at Bioinformatics online.
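The outlier-sum statistic itself is easy to sketch (after the style of Tibshirani and Hastie); the Gibbs sampling of its marginal distribution described in the paper is omitted here, and the cutoff convention is an assumption:

    import numpy as np

    def outlier_sum(x_case, x_ctrl):
        """Sum of standardized case values beyond q75 + IQR of the pooled data."""
        pooled = np.concatenate([x_case, x_ctrl])
        med = np.median(pooled)
        mad = np.median(np.abs(pooled - med)) or 1e-9   # guard against zero MAD
        z_case = (x_case - med) / mad
        q1, q3 = np.percentile((pooled - med) / mad, [25, 75])
        cutoff = q3 + (q3 - q1)
        return float(z_case[z_case > cutoff].sum())

    # Hypothetical expression of one target gene in two sample groups
    os_stat = outlier_sum(np.array([0.1, 0.3, 4.2, 5.1]),
                          np.array([0.0, 0.2, -0.1, 0.1]))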
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-07-05
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.
Koh, Dong-Hee; Locke, Sarah J.; Chen, Yu-Cheng; Purdue, Mark P.; Friesen, Melissa C.
2016-01-01
Background Retrospective exposure assessment of occupational lead exposure in population-based studies requires historical exposure information from many occupations and industries. Methods We reviewed published US exposure monitoring studies to identify lead exposure measurement data. We developed an occupational lead exposure database from the 175 identified papers containing 1,111 sets of lead concentration summary statistics (21% area air, 47% personal air, 32% blood). We also extracted ancillary exposure-related information, including job, industry, task/location, year collected, sampling strategy, control measures in place, and sampling and analytical methods. Results Measurements were published between 1940 and 2010 and represented 27 2-digit standardized industry classification codes. The majority of the measurements were related to lead-based paint work, joining or cutting metal using heat, primary and secondary metal manufacturing, and lead acid battery manufacturing. Conclusions This database can be used in future statistical analyses to characterize differences in lead exposure across time, jobs, and industries. PMID:25968240
42 CFR 402.109 - Statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 42 Public Health 2 2011-10-01 2011-10-01 false Statistical sampling. 402.109 Section 402.109... Statistical sampling. (a) Purpose. CMS or OIG may introduce the results of a statistical sampling study to... or caused to be presented. (b) Prima facie evidence. The results of the statistical sampling study...
42 CFR 402.109 - Statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 42 Public Health 2 2010-10-01 2010-10-01 false Statistical sampling. 402.109 Section 402.109... Statistical sampling. (a) Purpose. CMS or OIG may introduce the results of a statistical sampling study to... or caused to be presented. (b) Prima facie evidence. The results of the statistical sampling study...
Szabolcsi, Zoltán; Farkas, Zsuzsa; Borbély, Andrea; Bárány, Gusztáv; Varga, Dániel; Heinrich, Attila; Völgyi, Antónia; Pamjav, Horolma
2015-11-01
When the DNA profile from a crime scene matches that of a suspect, the weight of the DNA evidence depends on unbiased estimation of the match probability of the profiles. For this reason, it is necessary to establish and expand databases that reflect the actual allele frequencies in the population concerned. 21,473 complete DNA profiles from Databank samples were used to establish the allele frequency database representing the population of Hungarian suspects. We used fifteen STR loci (PowerPlex ESI16), including five new ESS loci. The aim was to calculate the statistical, forensic efficiency parameters for the Databank samples and compare the newly detected data to the earlier report. The population substructure caused by relatedness may influence the estimated frequency of profiles. As our Databank profiles were considered non-random samples, possible relationships between the suspects can be assumed. Therefore, the population inbreeding effect was estimated using the FIS calculation. The overall inbreeding parameter was found to be 0.0106. Furthermore, we tested the impact of the two allele frequency datasets on 101 randomly chosen STR profiles, including full and partial profiles. The 95% confidence interval estimates for the profile frequencies (pM) resulted in a tighter range when we used the new dataset compared with the previously published ones. We found that FIS had less effect on frequency values in the 21,473 samples than the application of a minimum allele frequency. No genetic substructure was detected by STRUCTURE analysis. Due to the low level of the inbreeding effect and the high number of samples, the new dataset provides unbiased and precise estimates of LR for statistical interpretation of forensic casework and allows us to use lower allele frequencies.
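The effect of the inbreeding parameter on profile frequency can be sketched with the standard genotype-frequency corrections (a simplification of full forensic practice; the allele frequencies below are hypothetical):

    import math

    F = 0.0106  # overall inbreeding parameter reported above

    def genotype_freq(p, q=None, f=F):
        # Homozygote: p^2 + f*p*(1-p); heterozygote: 2*p*q*(1-f)
        if q is None:
            return p * p + f * p * (1.0 - p)
        return 2.0 * p * q * (1.0 - f)

    # Hypothetical profile over three loci: one homozygote, two heterozygotes
    profile = [(0.12,), (0.08, 0.21), (0.15, 0.05)]
    p_match = math.prod(genotype_freq(*g) for g in profile)  # match probability
    likelihood_ratio = 1.0 / p_match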
NASA Astrophysics Data System (ADS)
Peled, Ofra N.; Peled, Irit; Peled, Jonathan U.
2013-01-01
The phenomenon of birth of a baby is a common and familiar one, and yet college students participating in a general biology class did not possess the expected common knowledge of the equal probability of gender births. We found that these students held strikingly skewed conceptions regarding gender birth ratio, estimating the number of female births to be more than twice the number of male births. Possible sources of these beliefs were analysed, showing flaws in statistical thinking such as viewing small unplanned samples as representing the whole population and making inferences from an inappropriate population. Some educational implications are discussed and a short teaching example (using data assembly) demonstrates an instructional direction that might facilitate conceptual change.
NASA Astrophysics Data System (ADS)
Yang, Fang; Zhai, YunBo; Chen, Lin; Li, CaiTing; Zeng, GuangMing; He, YiDe; Fu, ZongMin; Peng, WenFeng
2010-04-01
Sixteen polycyclic aromatic hydrocarbons (PAHs) in TSP were identified and quantified in samples collected during May and September of 2008 in Changsha, at three different sites: the city environmental protection agency of Changsha (A), the Middle School Attached to Hunan Normal University (B) and Yuhua district (C). The filters containing the particulate matter were extracted with dichloromethane in an ultrasonic bath and then analyzed by gas chromatography/mass spectrometry (GC/MS). The mean total concentrations of the 16 PAHs in summer at sites A, B and C were 32.503 ng/m³, 19.360 ng/m³ and 26.784 ng/m³, respectively, while the values for autumn at sites A, B and C were 24.982 ng/m³, 17.088 ng/m³ and 15.465 ng/m³, respectively. The mean concentrations of PAHs across all samples at site A were 0.57 times higher than those measured at site B, and 0.38 times higher than at site C. The distribution amongst the main emission sources was analyzed through diagnostic concentration ratios of PAHs as well as statistical methods such as principal component analysis. The diagnostic ratios suggested that the major polluting sources in the Changsha region during the studied period were the combustion of fuels such as diesel oil, gasoline, wood and coal. The statistical analysis separated the 16 compounds studied into 3 factors for summer and 4 for autumn. Factor 1 in summer represents vehicular emissions, Factor 2 emissions from natural gas, and Factor 3 emissions from combustion. In autumn, vehicle emissions, combustion sources, natural gas and coke ovens were the major emission sources.
Airborne Bacteria in an Urban Environment
Mancinelli, Rocco L.; Shulls, Wells A.
1978-01-01
Samples were taken at random intervals over a 2-year period from urban air and tested for viable bacteria. The number of bacteria in each sample was determined, and each organism isolated was identified by its morphological and biochemical characteristics. The number of bacteria found ranged from 0.013 to 1.88 organisms per liter of air sampled. Representatives of 19 different genera were found in 21 samples. The most frequently isolated organisms and their percentages of occurrence were Micrococcus (41%), Staphylococcus (11%), and Aerococcus (8%). The bacteria isolated were correlated with various weather and air pollution parameters using the Pearson product-moment correlation coefficient method. Statistically significant correlations were found between the number of viable bacteria isolated and the concentrations of nitric oxide (−0.45), nitrogen dioxide (+0.43), and suspended particulate pollutants (+0.56). Calculated individually, the total numbers of Micrococcus, Aerococcus, and Staphylococcus, the number of rods, and the number of cocci isolated showed negative correlations with nitric oxide and positive correlations with nitrogen dioxide and particulates. Statistically significant positive correlations were found between the total number of rods isolated and the concentration of nitrogen dioxide (+0.54) and the percent relative humidity (+0.43). The other parameters tested, sulfur dioxide, hydrocarbons, and temperature, showed no significant correlations. PMID:677875
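The correlation computation reported above is a direct application of the Pearson product-moment coefficient; a toy sketch with hypothetical paired observations:

    from scipy.stats import pearsonr

    # Hypothetical paired samples: viable bacteria per liter vs NO2 level
    bacteria = [0.10, 0.45, 0.80, 1.20, 1.88, 0.05, 0.60]
    no2      = [0.02, 0.04, 0.06, 0.09, 0.11, 0.01, 0.05]
    r, p_value = pearsonr(bacteria, no2)   # r in [-1, 1], two-sided p-value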
Statistical methods for identifying and bounding a UXO target area or minefield
DOE Office of Scientific and Technical Information (OSTI.GOV)
McKinstry, Craig A.; Pulsipher, Brent A.; Gilbert, Richard O.
2003-09-18
The sampling unit for minefield or UXO area characterization is typically represented by a geographical block or transect swath that lends itself to characterization by geophysical instrumentation such as mobile sensor arrays. New spatially based statistical survey methods and tools, more appropriate for these unique sampling units, have been developed and implemented at PNNL (Visual Sample Plan software, ver. 2.0) with support from the US Department of Defense. Though originally developed to support UXO detection and removal efforts, these tools may also be used in current form or adapted to support demining efforts and aid in the development of new sensors and detection technologies by explicitly incorporating both sampling and detection error in performance assessments. These tools may be used to (1) determine transect designs for detecting and bounding target areas of critical size, shape, and density of detectable items of interest with a specified confidence probability, (2) evaluate the probability that target areas of a specified size, shape and density have not been missed by a systematic or meandering transect survey, and (3) support post-removal verification by calculating the number of transects required to achieve a specified confidence probability that no UXO or mines have been missed.
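The transect-design question in item (1) has a simple geometric core; a Monte Carlo sketch (not VSP's algorithm) of the probability that parallel transects at a given spacing intersect a circular target area:

    import numpy as np

    def p_traverse(target_diam, spacing, n_sim=100_000, seed=0):
        """P(at least one parallel transect crosses a circular target) when the
        target center falls uniformly at random between transect lines."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(0.0, spacing, n_sim)   # center position between lines
        dist = np.minimum(x, spacing - x)      # distance to nearest transect
        return float(np.mean(dist <= target_diam / 2.0))

    # Agrees with the analytic value min(1, diameter / spacing):
    print(p_traverse(25.0, 100.0))             # ~0.25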
Gustafson, Paul; Gilbert, Mark; Xia, Michelle; Michelow, Warren; Robert, Wayne; Trussler, Terry; McGuire, Marissa; Paquette, Dana; Moore, David M; Gustafson, Reka
2013-05-15
Venue sampling is a common sampling method for populations of men who have sex with men (MSM); however, men who visit venues frequently are more likely to be recruited. While statistical adjustment methods are recommended, these have received scant attention in the literature. We developed a novel approach to adjust for frequency of venue attendance (FVA) and assess the impact of associated bias in the ManCount Study, a venue-based survey of MSM conducted in Vancouver, British Columbia, Canada, in 2008-2009 to measure the prevalence of human immunodeficiency virus and other infections and associated behaviors. Sampling weights were determined from an abbreviated list of questions on venue attendance and were used to adjust estimates of prevalence for health and behavioral indicators using a Bayesian, model-based approach. We found little effect of FVA adjustment on biological or sexual behavior indicators (primary outcomes); however, adjustment for FVA did result in differences in the prevalence of demographic indicators, testing behaviors, and a small number of additional variables. While these findings are reassuring and lend credence to unadjusted prevalence estimates from this venue-based survey, adjustment for FVA did shed important insights on MSM subpopulations that were not well represented in the sample.
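The core of the FVA adjustment is to down-weight men recruited after many venue visits; a stripped-down, non-Bayesian sketch of that idea (the study itself uses a model-based Bayesian approach, and these numbers are hypothetical):

    import numpy as np

    visits = np.array([1, 4, 8, 2, 12, 6])   # self-reported venue visits/month
    y = np.array([0, 1, 0, 0, 1, 0])         # binary indicator of interest

    w = 1.0 / visits                  # weight ~ inverse attendance frequency
    w = w / w.sum()
    prev_naive = y.mean()             # over-represents frequent attendees
    prev_fva = float(np.sum(w * y))   # FVA-adjusted prevalence estimate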
Expression Profiling of Nonpolar Lipids in Meibum From Patients With Dry Eye: A Pilot Study
Chen, Jianzhong; Keirsey, Jeremy K.; Green, Kari B.; Nichols, Kelly K.
2017-01-01
Purpose The purpose of this investigation was to characterize differentially expressed lipids in meibum samples from patients with dry eye disease (DED) in order to better understand the underlying pathologic mechanisms. Methods Meibum samples were collected from postmenopausal women with DED (PW-DED; n = 5) and a control group of postmenopausal women without DED (n = 4). Lipid profiles were analyzed by direct infusion full-scan electrospray ionization mass spectrometry (ESI-MS). An initial analysis of 145 representative peaks from four classes of lipids in PW-DED samples revealed that additional manual corrections for peak overlap and isotopes only slightly affected the statistical analysis. Therefore, analysis of uncorrected data, which can be applied to a greater number of peaks, was used to compare more than 500 lipid peaks common to PW-DED and control samples. Statistical analysis of peak intensities identified several lipid species that differed significantly between the two groups. Data from contact lens wearers with DED (CL-DED; n = 5) were also analyzed. Results Many species of the two types of diesters (DE) and very long chain wax esters (WE) were decreased by ∼20% in PW-DED, whereas levels of triacylglycerols were increased by an average of 39% ± 3% in meibum from PW-DED compared to that in the control group. Approximately the same reduction (20%) of similar DE and WE was observed for CL-DED. Conclusions Statistical analysis of peak intensities from direct infusion ESI-MS results identified differentially expressed lipids in meibum from dry eye patients. Further studies are warranted to support these findings. PMID:28426869
Laganà, Pasqualina; Moscato, Umberto; Poscia, Andrea; La Milia, Daniele Ignazio; Boccia, Stefania; Avventuroso, Emanuela; Delia, Santi
2015-01-01
Legionnaires' disease is normally acquired by inhalation of legionellae from a contaminated environmental source. Water systems of large buildings, such as hospitals, are often contaminated with legionellae and therefore represent a potential risk for the hospital population. The aim of this study was to evaluate the potential contamination by Legionella pneumophila (LP) in a large hospital in Italy through georeferential statistical analysis, to assess the possible sources of dispersion and, consequently, the risk of exposure for both health care staff and patients. The distribution of LP serogroups 1 and 2-14 was considered in the wards housed on two consecutive floors of the hospital building. On the basis of information provided by 53 bacteriological analyses, a 'random' grid of points was chosen, and spatial geostatistics or FAIk Kriging was applied and compared with the results of classical statistical analysis. Over 50% of the examined samples were positive for Legionella pneumophila. LP 1 was isolated in 69% of samples from the ground floor and in 60% of samples from the first floor; LP 2-14 in 36% of samples from the ground floor and 24% from the first. The iso-estimation maps clearly show the most contaminated pipe and the difference in the diffusion of the different L. pneumophila serogroups. This experimental work demonstrates that geostatistical methods applied to the microbiological analysis of water matrices allow better modeling of the phenomenon under study, greater potential for risk management and a greater choice of methods of prevention and environmental recovery, compared with classical statistical analysis.
Soft-tissue imaging with C-arm cone-beam CT using statistical reconstruction
NASA Astrophysics Data System (ADS)
Wang, Adam S.; Webster Stayman, J.; Otake, Yoshito; Kleinszig, Gerhard; Vogt, Sebastian; Gallia, Gary L.; Khanna, A. Jay; Siewerdsen, Jeffrey H.
2014-02-01
The potential for statistical image reconstruction methods such as penalized-likelihood (PL) to improve C-arm cone-beam CT (CBCT) soft-tissue visualization for intraoperative imaging over conventional filtered backprojection (FBP) is assessed in this work by making a fair comparison in relation to soft-tissue performance. A prototype mobile C-arm was used to scan anthropomorphic head and abdomen phantoms as well as a cadaveric torso at doses substantially lower than typical values in diagnostic CT, and the effects of dose reduction via tube current reduction and sparse sampling were also compared. Matched spatial resolution between PL and FBP was determined by the edge spread function of low-contrast (~40-80 HU) spheres in the phantoms, which were representative of soft-tissue imaging tasks. PL using the non-quadratic Huber penalty was found to substantially reduce noise relative to FBP, especially at lower spatial resolution, where PL provides a contrast-to-noise ratio increase up to 1.4-2.2× over FBP at 50% dose reduction across all objects. Comparison of sampling strategies indicates that soft-tissue imaging benefits from fully sampled acquisitions at doses above ~1.7 mGy and benefits from 50% sparsity at doses below ~1.0 mGy. Therefore, an appropriate sampling strategy along with the improved low-contrast visualization offered by statistical reconstruction demonstrates the potential for extending intraoperative C-arm CBCT to applications in soft-tissue interventions in neurosurgery as well as thoracic and abdominal surgeries by overcoming conventional tradeoffs in noise, spatial resolution, and dose.
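The non-quadratic Huber penalty at the heart of the PL reconstruction trades off noise smoothing against edge preservation; a minimal sketch of the penalty function (in practice applied to differences between neighboring voxels):

    import numpy as np

    def huber(t, delta):
        """Quadratic for |t| <= delta (suppresses noise), linear beyond
        (preserves edges); delta tunes the noise/resolution tradeoff."""
        a = np.abs(t)
        return np.where(a <= delta, 0.5 * t ** 2, delta * a - 0.5 * delta ** 2)

    # Example: penalize neighbor differences of a 1D signal with a sharp edge
    x = np.array([10.0, 10.5, 11.0, 40.0, 40.2])   # soft-tissue edge at index 3
    penalty = huber(np.diff(x), delta=1.0).sum()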
Dunn, Richard A; Tan, Andrew K G
2011-01-01
As is the case in many developing nations, previous studies of breast cancer screening behavior in Malaysia have used relatively small samples that are not nationally representative, thereby limiting the generalizability of results. Therefore, this study uses nationally representative data from the Malaysia Non-Communicable Disease Surveillance-1 to investigate the role of socio-economic status on breast cancer screening behavior in Malaysia, particularly differences in screening behaviour between ethnic groups. The decisions of 816 women above age 40 in Malaysia to screen for breast cancer using mammography, clinical breast exams (CBE), and breast self-exams (BSE) are modeled using logistic regression. Results indicate that after adjusting for differences in age, education, household income, marital status, and residential location, Malay women are less likely than Chinese and Indian women to utilize mammography, but more likely to perform BSE. Education level and urban residence are positively associated with utilization of each method, but these relationships vary across ethnicity. Higher education levels are strongly related to using each screening method among Chinese women, but have no statistically significant relationship to screening among Malays.
Cabrera-Pivaral, Carlos E; Gutiérrez-Ruvalcaba, Clara Luz; Peralta-Heredia, Irma Concepción; Alonso-Reynoso, Carlos
2008-01-01
The purpose of this work was to measure family physicians' clinical aptitude for the diagnosis and treatment of metabolic syndrome in a representative sample from six Family Medicine Units (UMF) of the Mexican Institute for Social Security (IMSS) in Guadalajara, Jalisco, México. This is a cross-sectional study. A validated, structured instrument with a confidence coefficient (Kuder-Richardson) of 0.95 was applied to a representative sample of 90 family physicians across six UMFs in Guadalajara between 2003 and 2004. The Mann-Whitney U and Kruskal-Wallis tests were used to compare two or more groups, and the Perez-Viniegra test was used to define aptitude development levels. No statistically significant differences in aptitude development were found between the six family medicine unit groups or other comparison groups. The generally low level of clinical aptitude, and its indicators, reflects limitations on the part of family physicians at the IMSS in Jalisco in identifying and managing metabolic syndrome.
Data Analysis with Graphical Models: Software Tools
NASA Technical Reports Server (NTRS)
Buntine, Wray L.
1994-01-01
Probabilistic graphical models (directed and undirected Markov fields, and combined in chain graphs) are used widely in expert systems, image processing and other areas as a framework for representing and reasoning with probabilities. They come with corresponding algorithms for performing probabilistic inference. This paper discusses an extension to these models by Spiegelhalter and Gilks, called plates, which graphically models the notion of a sample. This offers a graphical specification language for representing data analysis problems. When combined with general methods for statistical inference, it also offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper outlines the framework and then presents some basic tools for the task: a graphical version of the Pitman-Koopman theorem for the exponential family, problem decomposition, and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.
Ma, Ruoshui; Zhang, Xiumei; Wang, Yi; Zhang, Xiao
2018-04-27
The heterogeneous and complex structural characteristics of lignin present a significant challenge for predicting its processability (e.g., depolymerization, modification) into valuable products. This study provides a detailed characterization and comparison of the structural properties of seven representative biorefinery lignin samples derived from forest and agricultural residues, which were subjected to representative pretreatment methods. A range of wet chemistry and spectroscopy methods were applied to determine specific lignin structural characteristics such as functional groups, inter-unit linkages and peak molecular weight. In parallel, oxidative depolymerization of these lignin samples to either monomeric phenolic compounds or dicarboxylic acids was conducted, and the product yields were quantified. Based on these results (lignin structural characteristics and monomer yields), we demonstrate for the first time the application of a multiple-variable linear estimation (MVLE) approach, using R statistics, to gain insight into a quantitative correlation between lignin structural properties and their conversion reactivity toward oxidative depolymerization to monomers.
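A minimal sketch of a multiple-variable linear estimation of yield from structural descriptors (the paper uses R; the descriptors and values below are hypothetical, and with only seven samples such a fit is illustrative, not predictive):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Rows: 7 lignin samples; columns: hypothetical structural descriptors,
    # e.g. beta-O-4 linkages (/100 units), phenolic OH (mmol/g), peak MW (Da)
    X = np.array([[62, 3.1, 4200], [55, 2.7, 5100], [48, 3.8, 3900],
                  [70, 2.2, 6000], [51, 3.3, 4500], [58, 2.9, 4800],
                  [45, 4.0, 3600]], dtype=float)
    y = np.array([18.2, 14.5, 12.9, 21.0, 13.8, 16.1, 11.5])  # monomer yield, %

    fit = LinearRegression().fit(X, y)
    print(fit.coef_, fit.intercept_, fit.score(X, y))  # coefficients and R^2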
Boka, V; Arapostathis, K; Karagiannis, V; Kotsanos, N; van Loveren, C; Veerkamp, J
2017-03-01
To present the normative data on dental fear and caries status, and the dental fear cut-off points, of young children in the city of Thessaloniki, Greece. Study Design: This is a cross-sectional study with two independent study groups. A first, representative sample consisted of 1484 children from 15 primary public schools of Thessaloniki. A second sample consisted of 195 randomly selected age-matched children, all patients of the Postgraduate Paediatric Dental Clinic of Aristotle University of Thessaloniki. First sample: In order to collect data on dental fear and caries, dental examination took place in the classroom with disposable mirrors and a penlight. All the children completed the Dental Subscale of the Children's Fear Survey Schedule (CFSS-DS). Second sample: In order to define the cut-off points of the CFSS-DS, dental treatment of the 195 children was performed at the University Clinic. Children's dental fear was assessed using the CFSS-DS, and their behaviour during dental treatment was observed by one calibrated examiner using the Venham scale. Statistical analysis of the data was performed with IBM SPSS Statistics 20 at a statistical significance level of <0.05. First sample: The mean CFSS-DS score was 27.1±10.8. Age was significantly (p<0.05) related to dental fear. Mean differences between boys and girls were not significant. Caries was not correlated with dental fear. Second sample: CFSS-DS < 33 was defined as 'no dental fear', scores 33-37 as 'borderline' and scores > 37 as 'dental fear'. In the first sample, 84.6% of the children did not suffer from dental fear (CFSS-DS < 33). Dental fear was correlated with age and not with caries or gender. The dental fear cut-off point for the CFSS-DS was estimated at 37 for 6-12 year old children, with 33-37 as borderline.
Technology Development Risk Assessment for Space Transportation Systems
NASA Technical Reports Server (NTRS)
Mathias, Donovan L.; Godsell, Aga M.; Go, Susie
2006-01-01
A new approach for assessing development risk associated with technology development projects is presented. The method represents technology evolution in terms of sector-specific discrete development stages. A Monte Carlo simulation is used to generate development probability distributions based on statistical models of the discrete transitions. Development risk is derived from the resulting probability distributions and specific program requirements. Two sample cases are discussed to illustrate the approach, a single rocket engine development and a three-technology space transportation portfolio.
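A sketch of the Monte Carlo idea described above, assuming hypothetical lognormal durations for each discrete development stage and an 8-year program requirement:

    import numpy as np

    rng = np.random.default_rng(3)
    stage_median_yrs = [1.0, 1.5, 2.5, 2.0]   # hypothetical per-stage medians
    n_sim = 100_000

    total = np.zeros(n_sim)
    for m in stage_median_yrs:                # sum stage durations per trial
        total += rng.lognormal(mean=np.log(m), sigma=0.4, size=n_sim)

    requirement = 8.0                               # years available to the program
    dev_risk = float(np.mean(total > requirement))  # P(technology arrives too late)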
Transmuted of Rayleigh Distribution with Estimation and Application on Noise Signal
NASA Astrophysics Data System (ADS)
Ahmed, Suhad; Qasim, Zainab
2018-05-01
This paper deals with transforming the one-parameter Rayleigh distribution into a transmuted probability distribution by introducing a new parameter (λ), with |λ| ≤ 1; the resulting distribution is useful for representing signal data and failure data models. The transmuted parameter λ and the original parameter (θ) are estimated by the methods of moments and maximum likelihood for different sample sizes (n = 25, 50, 75, 100), and the estimation results are compared using a statistical measure (mean square error, MSE).
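As a sketch of this estimation exercise, the Python code below draws transmuted Rayleigh samples by inverting the CDF G(x) = (1+λ)F(x) − λF(x)², where F is the Rayleigh CDF, fits (θ, λ) by maximum likelihood, and reports the MSE over repeated samples; the true parameter values and replication count are arbitrary choices for illustration.

    import numpy as np
    from scipy.optimize import minimize

    def rayleigh_cdf(x, theta):
        return 1.0 - np.exp(-x**2 / (2.0 * theta**2))

    def tr_pdf(x, theta, lam):
        # g(x) = f(x) * [(1 + lam) - 2*lam*F(x)], with f the Rayleigh pdf
        f = (x / theta**2) * np.exp(-x**2 / (2.0 * theta**2))
        return f * ((1.0 + lam) - 2.0 * lam * rayleigh_cdf(x, theta))

    def tr_sample(n, theta, lam, rng):
        # Invert u = (1+lam)F - lam*F^2 for F, then invert the Rayleigh CDF
        u = rng.uniform(size=n)
        F = u if abs(lam) < 1e-12 else \
            ((1.0 + lam) - np.sqrt((1.0 + lam)**2 - 4.0 * lam * u)) / (2.0 * lam)
        return theta * np.sqrt(-2.0 * np.log(1.0 - F))

    def mle(x):
        def nll(p):
            pdf = tr_pdf(x, p[0], p[1])
            return 1e10 if np.any(pdf <= 0.0) else -np.sum(np.log(pdf))
        return minimize(nll, x0=[np.std(x), 0.0],
                        bounds=[(1e-6, None), (-1.0, 1.0)]).x

    rng = np.random.default_rng(0)
    theta_true, lam_true = 1.0, 0.5
    for n in (25, 50, 75, 100):
        est = np.array([mle(tr_sample(n, theta_true, lam_true, rng))
                        for _ in range(200)])
        mse = ((est - [theta_true, lam_true])**2).mean(axis=0)
        print(f"n={n:4d}  MSE(theta)={mse[0]:.4f}  MSE(lambda)={mse[1]:.4f}")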
Challa, Shruthi; Potumarthi, Ravichandra
2013-01-01
Process analytical technology (PAT) is used to monitor and control critical process parameters in raw materials and in-process products to maintain critical quality attributes and build quality into the product. Process analytical technology can be successfully implemented in pharmaceutical and biopharmaceutical industries not only to impart quality into the products but also to prevent out-of-specification results and improve productivity. PAT implementation eliminates the drawbacks of traditional methods, which involve excessive sampling, and facilitates rapid testing through direct sampling without any destruction of the sample. However, to successfully adapt PAT tools to pharmaceutical and biopharmaceutical environments, a thorough understanding of the process is needed, along with mathematical and statistical tools to analyze the large multidimensional spectral data generated by PAT tools. Chemometrics is a chemical discipline which incorporates both statistical and mathematical methods to obtain and analyze relevant information from PAT spectral tools. Applications of commonly used PAT tools in combination with appropriate chemometric methods, along with their advantages and working principles, are discussed. Finally, the systematic application of PAT tools in a biopharmaceutical environment to control critical process parameters for achieving product quality is diagrammatically represented.
Statistical approaches to account for false-positive errors in environmental DNA samples.
Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid
2016-05-01
Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies. © 2015 John Wiley & Sons Ltd.
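A minimal sketch of one such approach is a two-state site-occupancy model with a false-positive detection probability, fitted by maximum likelihood. The simulated sample sizes and rates below are arbitrary, and, as the authors note, prior information or ancillary data are needed in practice for reliable identifiability; here a sensible starting point (true detection above false detection) selects the ecologically meaningful mode.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    def nll(params, y):
        # y: sites x replicates matrix of PCR detections (0/1)
        psi, p11, p10 = expit(params)  # occupancy, true and false detection
        occ = psi * np.prod(p11**y * (1.0 - p11)**(1.0 - y), axis=1)
        unocc = (1.0 - psi) * np.prod(p10**y * (1.0 - p10)**(1.0 - y), axis=1)
        return -np.sum(np.log(occ + unocc))

    rng = np.random.default_rng(1)
    S, K = 200, 6                       # sites, PCR replicates per site
    psi_t, p11_t, p10_t = 0.4, 0.8, 0.05
    z = rng.uniform(size=S) < psi_t     # latent occupancy states
    y = (rng.uniform(size=(S, K)) < np.where(z[:, None], p11_t, p10_t)).astype(float)

    start = np.array([0.0, 1.0, -2.0])  # psi ~ 0.5, p11 > p10 at start
    fit = minimize(nll, start, args=(y,), method="Nelder-Mead")
    print("psi, p11, p10 =", expit(fit.x).round(3))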
Trends in groundwater quality in principal aquifers of the United States, 1988-2012
Lindsey, Bruce D.; Rupert, Michael G.
2014-01-01
The U.S. Geological Survey (USGS) National Water-Quality Assessment (NAWQA) Program analyzed trends in groundwater quality throughout the nation for the sampling period of 1988-2012. Trends were determined for networks (sets of wells routinely monitored by the USGS) for a subset of constituents by statistical analysis of paired water-quality measurements collected on a near-decadal time scale. The data set for chloride, dissolved solids, and nitrate consisted of 1,511 wells in 67 networks, whereas the data set for methyl tert-butyl ether (MTBE) consisted of 1,013 wells in 46 networks. The 25 principal aquifers represented by these networks account for about 75 percent of withdrawals of groundwater used for drinking-water supply for the nation. Statistically significant changes in chloride, dissolved-solids, or nitrate concentrations were found in many well networks over a decadal period. Concentrations increased significantly in 48 percent of networks for chloride, 42 percent of networks for dissolved solids, and 21 percent of networks for nitrate. Chloride, dissolved-solids, and nitrate concentrations decreased significantly in 3, 3, and 10 percent of the networks, respectively. The magnitude of change in concentrations was typically small in most networks; however, the magnitude of change in networks with statistically significant increases was typically much larger than in networks with statistically significant decreases. The largest increases of chloride concentrations were in urban areas in the northeastern and north-central United States. The largest increases of nitrate concentrations were in networks in agricultural areas. Statistical analysis showed that 42 of the 46 networks had no statistically significant changes in MTBE concentrations. The four networks with statistically significant changes in MTBE concentrations were in the northeastern United States, where MTBE was widely used. Two networks had increasing concentrations, and two networks had decreasing concentrations. Production and use of MTBE peaked in about 2000, and MTBE has been effectively banned in many areas since about 2006. The two networks that had increasing concentrations were sampled for the second time close to the peak of MTBE production, whereas the two networks that had decreasing concentrations were sampled for the second time 10 years after the peak of MTBE production.
NASA Astrophysics Data System (ADS)
Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver
2013-04-01
With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, the application and verification of a statistical methodology for the climate downscaling of wind fields at surface level is presented in this work. This procedure is based on the combination of the Monte Carlo and Principal Component Analysis (PCA) statistical methods. First, the Monte Carlo method is used to create a large number of daily-based annual time series, so-called climate representative years, by stratified sampling of a 33-year-long time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Second, the representative years are evaluated such that the best set is chosen according to its capability to recreate the sea level pressure (SLP) temporal and spatial fields from the R-2 data set. The measure of this correspondence is based on the Euclidean distance between the Empirical Orthogonal Function (EOF) spaces generated by the PCA decomposition of the SLP fields from both the long-term and the representative-year data sets. The methodology was verified by comparing the selected 365-day period against a 9-year period of wind fields generated by dynamically downscaling the Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. These results showed that, compared to the traditional method of dynamically downscaling a random 365-day period, the error in the average wind velocity obtained with the PCA-based representative year was reduced by almost 30%. Moreover, the mean absolute errors (MAE) in the monthly and daily wind profiles were also reduced by almost 25% across all SKIRON grid points. The methodology yielded maximum errors in the mean wind speed of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. Beyond the bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as the velocity and directional frequency. Additional repetitions were performed to prove the reliability and robustness of this kind of statistical-dynamical downscaling method.
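The core of the selection step can be sketched as follows: compute the leading EOFs of the long-term SLP record and of each candidate representative year, and score each candidate by the Euclidean distance between the two EOF sets. This is an illustrative Python reduction of the procedure, with random placeholder fields standing in for the R-2 SLP data.

    import numpy as np

    def leading_eofs(field, k=5):
        # field: time x space matrix; returns the k leading spatial EOFs
        anom = field - field.mean(axis=0)
        _, _, vt = np.linalg.svd(anom, full_matrices=False)
        return vt[:k]

    def eof_distance(field_a, field_b, k=5):
        # Euclidean distance between EOF sets, after aligning arbitrary signs
        ea, eb = leading_eofs(field_a, k), leading_eofs(field_b, k)
        signs = np.sign(np.sum(ea * eb, axis=1, keepdims=True))
        return np.linalg.norm(ea - signs * eb)

    rng = np.random.default_rng(0)
    slp_long_term = rng.normal(size=(33 * 365, 500))  # placeholder SLP fields
    candidates = [slp_long_term[rng.choice(33 * 365, 365, replace=False)]
                  for _ in range(20)]
    dists = [eof_distance(slp_long_term, c) for c in candidates]
    print(int(np.argmin(dists)), min(dists))  # index and score of best year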
Is parenting style a predictor of suicide attempts in a representative sample of adolescents?
Donath, Carolin; Graessel, Elmar; Baier, Dirk; Bleich, Stefan; Hillemacher, Thomas
2014-04-26
Suicidal ideation and suicide attempts are serious but not rare conditions in adolescents. However, several research and practical suicide-prevention initiatives discuss the possibility of preventing serious self-harm, so profound knowledge about risk and protective factors is necessary. The aim of this study is a) to clarify the role of parenting behavior and parenting styles in adolescents' suicide attempts and b) to identify other statistically significant and clinically relevant risk and protective factors for suicide attempts in a representative sample of German adolescents. In the years 2007/2008, a representative written survey of N = 44,610 students in the 9th grade of different school types in Germany was conducted. In this survey, the lifetime prevalence of suicide attempts was investigated along with potential predictors including parenting behavior. A three-step statistical analysis was carried out: I) As the basic model, the association between parenting and suicide attempts was explored via binary logistic regression controlled for age and sex. II) The predictive values of 13 additional potential risk/protective factors were analyzed with single binary logistic regression analyses for each predictor alone; non-significant predictors were excluded in Step III. III) In a multivariate binary logistic regression analysis, all significant predictor variables from Step II and the parenting styles were included after testing for multicollinearity. Three parental variables showed a relevant association with suicide attempts in adolescents (all protective): mother's warmth and father's warmth in childhood, and mother's control in adolescence (Step I). In the full model (Step III), authoritative parenting (protective; OR = 0.79) and rejecting-neglecting parenting (risk; OR = 1.63) were identified as significant predictors (p < .001) of suicide attempts. Seven further variables were interpreted to be statistically significant and clinically relevant: ADHD, female sex, smoking, binge drinking, absenteeism/truancy, migration background, and parental separation events. Parenting style does matter: while children of authoritative parents benefit, children of rejecting-neglecting parents are put at risk, as we were able to show for suicide attempts in adolescence. Some of the identified risk factors contribute new knowledge and suggest potential areas of intervention for special groups such as migrants or children diagnosed with ADHD.
Patel, Harilal; Patel, Prakash; Modi, Nirav; Shah, Shaival; Ghoghari, Ashok; Variya, Bhavesh; Laddha, Ritu; Baradia, Dipesh; Dobaria, Nitin; Mehta, Pavak; Srinivas, Nuggehally R
2017-08-30
Because it avoids first-pass metabolism through direct and rapid absorption with improved permeability, the intranasal route represents a good alternative for extravascular drug administration. The aim of the study was to investigate the intranasal pharmacokinetics of two anti-migraine drugs (zolmitriptan and eletriptan), using retro-orbital sinus and jugular vein sampling sites. In a parallel study design, healthy male Sprague-Dawley (SD) rats aged between 8 and 12 weeks were divided into groups (n=4 or 5/group). The animals in each group received intranasal (~1.0 mg/kg) or oral (2.1 mg/kg) doses of either zolmitriptan or eletriptan. Serial blood sampling was performed from the jugular vein or retro-orbital site, and plasma samples were analyzed for drug concentrations using an LC-MS/MS assay. Standard pharmacokinetic parameters such as Tmax, Cmax, AUClast, AUC0-inf and T1/2 were calculated, and statistical comparison of the derived parameters was performed using an unpaired t-test. After intranasal dosing, the mean pharmacokinetic parameters Cmax and AUCinf of zolmitriptan/eletriptan showed about 17-fold and 3-5-fold higher values for retro-orbital sampling as compared to the jugular vein sampling site, whereas after oral administration the parameters derived for both drugs were largely comparable between the two sampling sites and statistically non-significant. In conclusion, the assessment of plasma levels after intranasal administration with retro-orbital sampling would result in spurious and misleading pharmacokinetics. Copyright © 2017 Elsevier B.V. All rights reserved.
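For context, non-compartmental parameters of the kind reported here can be computed as in the short Python sketch below. The sampling schedule and concentration-time profiles are simulated placeholders, and the ~17-fold scale difference is wired in merely to mimic the reported contrast, not derived from the study's data.

    import numpy as np
    from scipy import stats

    t = np.array([0.083, 0.25, 0.5, 1, 2, 4, 8, 24])  # h, assumed schedule

    def pk_params(c):
        # Cmax/Tmax from the observed maximum; AUC(0-last) by the linear
        # trapezoidal rule, as in standard non-compartmental analysis.
        i = int(np.argmax(c))
        return c[i], t[i], np.trapz(c, t)

    rng = np.random.default_rng(7)

    def profile(scale):
        # One-compartment-like shape with multiplicative noise (illustrative)
        return scale * (np.exp(-0.2 * t) - np.exp(-3.0 * t)) * \
               rng.lognormal(0.0, 0.15, size=t.size)

    auc_jugular = [pk_params(profile(100.0))[2] for _ in range(4)]
    auc_retro = [pk_params(profile(1700.0))[2] for _ in range(4)]
    print(stats.ttest_ind(auc_retro, auc_jugular, equal_var=False))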
Experimental scattershot boson sampling
Bentivegna, Marco; Spagnolo, Nicolò; Vitelli, Chiara; Flamini, Fulvio; Viggianiello, Niko; Latmiral, Ludovico; Mataloni, Paolo; Brod, Daniel J.; Galvão, Ernesto F.; Crespi, Andrea; Ramponi, Roberta; Osellame, Roberto; Sciarrino, Fabio
2015-01-01
Boson sampling is a computational task strongly believed to be hard for classical computers, but efficiently solvable by orchestrated bosonic interference in a specialized quantum computer. Current experimental schemes, however, are still insufficient for a convincing demonstration of the advantage of quantum over classical computation. A new variation of this task, scattershot boson sampling, leads to an exponential increase in speed of the quantum device, using a larger number of photon sources based on parametric down-conversion. This is achieved by having multiple heralded single photons being sent, shot by shot, into different random input ports of the interferometer. We report the first scattershot boson sampling experiments, where six different photon-pair sources are coupled to integrated photonic circuits. We use recently proposed statistical tools to analyze our experimental data, providing strong evidence that our photonic quantum simulator works as expected. This approach represents an important leap toward a convincing experimental demonstration of the quantum computational supremacy. PMID:26601164
Experience of elder abuse among older Korean immigrants.
Chang, Miya
2016-01-01
Studies on the scope and nature of Asian American elder abuse conducted with older immigrants are extremely limited. The overall purpose of this study was to examine the extent and type of elder abuse among older Korean immigrants, and to investigate critical predictors of elder abuse in this population. The sample consisted of 200 older Korean immigrants aged 60 to 90 years who resided in Los Angeles County in 2008. One of the key findings indicated that 58.3% of respondents experienced one or more types of elder abuse. Logistic regression indicated that the victims' health status and educational level were statistically significant predictors of the likelihood of experiencing abuse. The present study, although limited in sample size, measures, sampling methods, and population representation, has contributed to this important area of knowledge. It is recommended that future studies conduct research on elder abuse with more representative national samples that can measure the extent of abuse and neglect more accurately.
Time-integrated sampling of fluvial suspended sediment: a simple methodology for small catchments
NASA Astrophysics Data System (ADS)
Phillips, J. M.; Russell, M. A.; Walling, D. E.
2000-10-01
Fine-grained (<62.5 µm) suspended sediment transport is a key component of the geochemical flux in most fluvial systems. The highly episodic nature of suspended sediment transport imposes a significant constraint on the design of sampling strategies aimed at characterizing the biogeochemical properties of such sediment. A simple sediment sampler, utilizing ambient flow to induce sedimentation by settling, is described. The sampler can be deployed unattended in small streams to collect time-integrated suspended sediment samples. In laboratory tests involving chemically dispersed sediment, the sampler collected a maximum of 71% of the input sample mass. However, under natural conditions, the existence of composite particles or flocs can be expected to increase the trapping efficiency significantly. Field trials confirmed that the particle size composition and total carbon content of the sediment collected by the sampler were statistically representative of the ambient suspended sediment.
Walker, R.S.; Novare, A.J.; Nichols, J.D.
2000-01-01
Estimation of abundance of mammal populations is essential for monitoring programs and for many ecological investigations. The first step for any study of variation in mammal abundance over space or time is to define the objectives of the study and how and why abundance data are to be used. The data used to estimate abundance are count statistics in the form of counts of animals or their signs. There are two major sources of uncertainty that must be considered in the design of the study: spatial variation and the relationship between abundance and the count statistic. Spatial variation in the distribution of animals or signs may be taken into account with appropriate spatial sampling. Count statistics may be viewed as random variables, with the expected value of the count statistic equal to the true abundance of the population multiplied by a coefficient p. With direct counts, p represents the probability of detection or capture of individuals, and with indirect counts it represents the rate of production of the signs as well as their probability of detection. Comparisons of abundance using count statistics from different times or places i assume that the p_i are the same for all times or places being compared (p_i = p). In spite of considerable evidence that this assumption rarely holds true, it is commonly made in studies of mammal abundance, as when the minimum number alive or indices based on sign counts are used to compare abundance in different habitats or times. Alternatives to relying on this assumption are to calibrate the index used by testing the assumption of p_i = p, or to incorporate the estimation of p into the study design.
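The consequences of the p_i = p assumption are easy to see in a short simulation: with equal true abundance but unequal detection probabilities, raw counts differ two-fold, while counts calibrated by the detection probability agree. The abundances and detection probabilities below are arbitrary illustrative values.

    import numpy as np

    rng = np.random.default_rng(3)
    N_a, N_b = 100, 100      # equal true abundance in two habitats
    p_a, p_b = 0.6, 0.3      # unequal detection probabilities

    # E[count] = N * p, so raw counts mislead whenever p_a != p_b
    counts_a = rng.binomial(N_a, p_a, size=1000)
    counts_b = rng.binomial(N_b, p_b, size=1000)
    print(counts_a.mean(), counts_b.mean())      # ~60 vs ~30, same N

    # Calibrating with estimated detection probabilities restores parity
    print((counts_a / p_a).mean(), (counts_b / p_b).mean())  # both ~100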
ELF magnetic fields in electric and gasoline-powered vehicles.
Tell, R A; Sias, G; Smith, J; Sahl, J; Kavet, R
2013-02-01
We conducted a pilot study to assess magnetic field levels in electric compared to gasoline-powered vehicles, and established a methodology that would provide valid data for further assessments. The sample consisted of 14 vehicles, all manufactured between January 2000 and April 2009; 6 were gasoline-powered vehicles and 8 were electric vehicles of various types. Of the eight models available, three were represented by a gasoline-powered vehicle and at least one electric vehicle, enabling intra-model comparisons. Vehicles were driven over a 16.3 km test route. Each vehicle was equipped with six EMDEX Lite broadband meters with a 40-1,000 Hz bandwidth programmed to sample every 4 s. Standard statistical testing was based on the fact that the autocorrelation statistic damped quickly with time. For seven electric cars, the geometric mean (GM) of all measurements (N = 18,318) was 0.095 µT with a geometric standard deviation (GSD) of 2.66, compared to 0.051 µT (N = 9,301; GSD = 2.11) for four gasoline-powered cars (P < 0.0001). Using the data from a previous exposure assessment of residential exposure in eight geographic regions in the United States as a basis for comparison (N = 218), the broadband magnetic fields in electric vehicles covered the same range as personal exposure levels recorded in that study. All fields measured in all vehicles were much less than the exposure limits published by the International Commission on Non-Ionizing Radiation Protection (ICNIRP) and the Institute of Electrical and Electronics Engineers (IEEE). Future studies should include larger sample sizes representative of a greater cross-section of electric-type vehicles. Copyright © 2012 Wiley Periodicals, Inc.
Drinking water quality assessment.
Aryal, J; Gautam, B; Sapkota, N
2012-09-01
Drinking water quality is a great public health concern because it is a major risk factor for the high incidence of diarrheal diseases in Nepal. In recent years, the prevalence rate of diarrhoea has been found to be the highest in Myagdi district. This study was carried out to assess the quality of drinking water from different natural sources, reservoirs and collection taps at Arthunge VDC of Myagdi district. A cross-sectional study was carried out using a random sampling method in Arthunge VDC of Myagdi district from January to June 2010. Eighty-four water samples representing natural sources, reservoirs and collection taps from the study area were collected. The physico-chemical and microbiological analysis was performed following standard techniques set by APHA (1998), and statistical analysis was carried out using SPSS 11.5. The results were also compared with national and WHO guidelines. Of the 84 water samples (from natural sources, reservoirs and tap water) analyzed, the drinking water quality parameters (except arsenic and total coliform) of all water samples were found to be within the WHO and national standards. 15.48% of the water samples (13 samples) showed pH values higher than the WHO permissible guideline values. Similarly, 85.71% of the water samples (72 samples) showed arsenic values higher than the WHO guideline value. Further, the statistical analysis showed no significant difference (at the 0.05 level) in the physico-chemical parameters and total coliform counts of drinking water between the winter (January 2010) and summer (June 2010) collection-tap water samples. The microbiological examination of water samples revealed the presence of total coliform in 86.90% of water samples. The results obtained from the physico-chemical analysis of water samples were within national and WHO standards except for arsenic. The study also found coliform contamination to be the key problem with drinking water.
2013-01-01
Background Besides the development of comprehensive tools for high-throughput 16S ribosomal RNA amplicon sequence analysis, there exists a growing need for protocols emphasizing alternative phylogenetic markers such as those representing eukaryotic organisms. Results Here we introduce CloVR-ITS, an automated pipeline for comparative analysis of internal transcribed spacer (ITS) pyrosequences amplified from metagenomic DNA isolates and representing fungal species. This pipeline performs a variety of steps similar to those commonly used for 16S rRNA amplicon sequence analysis, including preprocessing for quality, chimera detection, clustering of sequences into operational taxonomic units (OTUs), taxonomic assignment (at class, order, family, genus, and species levels) and statistical analysis of sample groups of interest based on user-provided information. Using ITS amplicon pyrosequencing data from a previous human gastric fluid study, we demonstrate the utility of CloVR-ITS for fungal microbiota analysis and provide runtime and cost examples, including analysis of extremely large datasets on the cloud. We show that the largest fractions of reads from the stomach fluid samples were assigned to Dothideomycetes, Saccharomycetes, Agaricomycetes and Sordariomycetes but that all samples were dominated by sequences that could not be taxonomically classified. Representatives of the Candida genus were identified in all samples, most notably C. quercitrusa, while sequence reads assigned to the Aspergillus genus were only identified in a subset of samples. CloVR-ITS is made available as a pre-installed, automated, and portable software pipeline for cloud-friendly execution as part of the CloVR virtual machine package (http://clovr.org). Conclusion The CloVR-ITS pipeline provides fungal microbiota analysis that can be complementary to bacterial 16S rRNA and total metagenome sequence analysis allowing for more comprehensive studies of environmental and host-associated microbial communities. PMID:24451270
Dingus, Cheryl A; Teuschler, Linda K; Rice, Glenn E; Simmons, Jane Ellen; Narotsky, Michael G
2011-10-01
In complex mixture toxicology, there is growing emphasis on testing environmentally representative doses that improve the relevance of results for health risk assessment, but are typically much lower than those used in traditional toxicology studies. Traditional experimental designs with typical sample sizes may have insufficient statistical power to detect effects caused by environmentally relevant doses. Proper study design, with adequate statistical power, is critical to ensuring that experimental results are useful for environmental health risk assessment. Studies with environmentally realistic complex mixtures have practical constraints on sample concentration factor and sample volume as well as the number of animals that can be accommodated. This article describes methodology for calculation of statistical power for non-independent observations for a multigenerational rodent reproductive/developmental bioassay. The use of the methodology is illustrated using the U.S. EPA's Four Lab study in which rodents were exposed to chlorinated water concentrates containing complex mixtures of drinking water disinfection by-products. Possible experimental designs included two single-block designs and a two-block design. Considering the possible study designs and constraints, a design of two blocks of 100 females with a 40:60 ratio of control:treated animals and a significance level of 0.05 yielded maximum prospective power (~90%) to detect pup weight decreases, while providing the most power to detect increased prenatal loss.
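A simulation-based version of such a prospective power calculation is sketched below, using litter means to respect the non-independence of pups within a dam; the effect size and variance components are invented for illustration and are not the Four Lab study's values.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)

    def prospective_power(n_control=40, n_treated=60, effect=-0.04,
                          sd_litter=0.05, sd_pup=0.08, pups=10, n_sim=2000):
        # Each dam contributes one litter mean: a random litter effect plus
        # the average of `pups` within-litter pup deviations.
        hits = 0
        for _ in range(n_sim):
            ctrl = rng.normal(0.0, sd_litter, n_control) + \
                   rng.normal(0.0, sd_pup / np.sqrt(pups), n_control)
            trt = rng.normal(effect, sd_litter, n_treated) + \
                  rng.normal(0.0, sd_pup / np.sqrt(pups), n_treated)
            if stats.ttest_ind(ctrl, trt, equal_var=False).pvalue < 0.05:
                hits += 1
        return hits / n_sim

    print(prospective_power())  # power to detect the assumed weight decrease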
A comprehensive review of arsenic levels in the semiconductor manufacturing industry.
Park, Donguk; Yang, Haengsun; Jeong, Jeeyeon; Ha, Kwonchul; Choi, Sangjun; Kim, Chinyon; Yoon, Chungsik; Park, Dooyong; Paek, Domyung
2010-11-01
This paper presents a summary of arsenic level statistics from air and wipe samples taken from studies conducted in fabrication operations. The main objectives of this study were not only to describe arsenic measurement data but also, through a literature review, to categorize fabrication workers in accordance with observed arsenic levels. All airborne arsenic measurements reported were included in the summary statistics for analysis of the measurement data. The arithmetic mean was estimated assuming a lognormal distribution from the geometric mean and the geometric standard deviation or the range. In addition, weighted arithmetic means (WAMs) were calculated based on the number of measurements reported for each mean. Analysis of variance (ANOVA) was employed to compare arsenic levels classified according to several categories such as the year, sampling type, location sampled, operation type, and cleaning technique. Nine papers were found reporting airborne arsenic measurement data from maintenance workers or maintenance areas in semiconductor chip-making plants. A total of 40 statistical summaries from seven articles were identified that represented a total of 423 airborne arsenic measurements. Arsenic exposure levels taken during normal operating activities in implantation operations (WAM = 1.6 μg m⁻³, no. of samples = 77, no. of statistical summaries = 2) were found to be lower than exposure levels of engineers who were involved in maintenance works (7.7 μg m⁻³, no. of samples = 181, no. of statistical summaries = 19). The highest level (WAM = 218.6 μg m⁻³) was associated with various maintenance works performed inside an ion implantation chamber. ANOVA revealed no significant differences in the WAM arsenic levels among the categorizations based on operation and sampling characteristics. Arsenic levels (56.4 μg m⁻³) recorded during maintenance works performed in dry conditions were found to be much higher than those from maintenance works in wet conditions (0.6 μg m⁻³). Arsenic levels from wipe samples in process areas after maintenance activities ranged from non-detectable to 146 μg cm⁻², indicating the potential for dispersion into the air and hence inhalation. We conclude that workers who are regularly or occasionally involved in maintenance work have higher potential for occupational exposure than other employees who are in charge of routine production work. In addition, fabrication workers can be classified into two groups based on the reviewed arsenic exposure levels: operators with potential for low levels of exposure and maintenance engineers with high levels of exposure. These classifications could be used as a basis for a qualitative ordinal ranking of exposure in an epidemiological study.
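Two of the computations described here are simple enough to state directly: under a lognormal assumption the arithmetic mean follows from the geometric mean and geometric standard deviation as AM = GM · exp((ln GSD)² / 2), and the weighted arithmetic mean pools study-level means by sample size. A short sketch, with three hypothetical study summaries:

    import numpy as np

    def am_from_gm_gsd(gm, gsd):
        # Lognormal identity: AM = GM * exp((ln GSD)^2 / 2)
        return gm * np.exp(0.5 * np.log(gsd) ** 2)

    # Hypothetical study summaries: (GM in ug/m^3, GSD, number of samples)
    studies = [(2.1, 2.5, 30), (5.4, 3.1, 120), (0.8, 1.9, 31)]

    ams = np.array([am_from_gm_gsd(gm, gsd) for gm, gsd, _ in studies])
    n = np.array([s[2] for s in studies])
    wam = float(np.sum(ams * n) / np.sum(n))  # weighted arithmetic mean
    print(ams.round(2), round(wam, 2))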
Moreira, Andrios da Silva; Baptista, Cristiane Telles; Brasil, Carolina Litchina; Valente, Júlia de Souza Silveira; Bruhn, Fábio Raphael Pascoti; Pereira, Daniela Isabel Brayer
2018-01-01
This study investigated the frequency of oocysts of Cryptosporidium spp. in feces from dogs and cats in five municipalities in the southern region of the state of Rio Grande do Sul. The risk factors associated with infection were also investigated. Feces samples from 110 dogs and 18 cats were stained using the auramine method. At the time of feces sampling, a questionnaire with semi-open-ended questions was administered to the animal guardians, and all data obtained underwent statistical analysis. The true frequency of oocysts of Cryptosporidium spp. was 24.63% (27 dogs and two cats). Only four samples of dog feces were diarrheic, and no oocysts were observed in any of them. Variables that represented risk factors for infection were: homemade food, untreated water, circulation of animals on grassy terrain and living in the same environment as other animals (cattle). The results made it possible to infer that, within the population studied, the frequency of parasitism due to Cryptosporidium spp. in dogs was relevant; they also emphasize the asymptomatic nature of this infection. The adoption of control measures is highlighted, particularly in relation to variables that represent risk factors for this infection.
Visell, Yon
2015-04-01
This paper proposes a fast, physically accurate method for synthesizing multimodal, acoustic and haptic, signatures of distributed fracture in quasi-brittle heterogeneous materials, such as wood, granular media, or other fiber composites. Fracture processes in these materials are challenging to simulate with existing methods, due to the prevalence of large numbers of disordered, quasi-random spatial degrees of freedom, representing the complex physical state of a sample over the geometric volume of interest. Here, I develop an algorithm for simulating such processes, building on a class of statistical lattice models of fracture that have been widely investigated in the physics literature. This algorithm is enabled through a recently published mathematical construction based on the inverse transform method of random number sampling. It yields a purely time domain stochastic jump process representing stress fluctuations in the medium. The latter can be readily extended by a mean field approximation that captures the averaged constitutive (stress-strain) behavior of the material. Numerical simulations and interactive examples demonstrate the ability of these algorithms to generate physically plausible acoustic and haptic signatures of fracture in complex, natural materials interactively at audio sampling rates.
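The inverse transform step at the heart of the method can be sketched in a few lines: draw power-law distributed stress-drop magnitudes by inverting the Pareto CDF and assemble them into a time-domain jump process. The exponents, rates, and scales below are placeholders, not parameters from the paper or its lattice models.

    import numpy as np

    rng = np.random.default_rng(5)

    def pareto_sample(n, alpha, x_min):
        # Inverse transform: CDF F(x) = 1 - (x/x_min)^(1-alpha) for alpha > 1,
        # so x = x_min * (1 - u)^(1/(1-alpha)) with u ~ Uniform(0, 1).
        u = rng.uniform(size=n)
        return x_min * (1.0 - u) ** (1.0 / (1.0 - alpha))

    n_events = 500
    waits = rng.exponential(1e-3, n_events)       # s between avalanches
    drops = pareto_sample(n_events, alpha=2.5, x_min=1e-4)
    times = np.cumsum(waits)                      # event times
    stress = np.cumsum(-drops)                    # cumulative stress release
    # (times, stress) is a pure jump process usable to drive audio/haptics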
Statistical characterization of handwriting characteristics using automated tools
NASA Astrophysics Data System (ADS)
Ball, Gregory R.; Srihari, Sargur N.
2011-01-01
We provide a statistical basis for reporting the results of handwriting examination by questioned document (QD) examiners. As a facet of QD examination, the analysis and reporting of handwriting examination suffers from the lack of statistical data concerning the frequency of occurrence of combinations of particular handwriting characteristics. QD examiners tend to assign probative values to specific handwriting characteristics and their combinations based entirely on the examiner's experience and power of recall. The research uses databases of handwriting samples that are representative of the US population. Feature lists of characteristics provided by QD examiners are used to determine which frequencies need to be evaluated. Algorithms are used to automatically extract those characteristics; e.g., a functional software tool extracts most of the characteristics from the most common letter pair, th. For each letter combination, the marginal and conditional frequencies of their characteristics are evaluated. Based on statistical dependencies of the characteristics, the probability of any given letter formation is computed. The resulting algorithms are incorporated into a system for writer verification known as CEDAR-FOX.
Determinism and mass-media portrayals of genetics.
Condit, C M; Ofulue, N; Sheedy, K M
1998-01-01
Scholars have expressed concern that the introduction of substantial coverage of "medical genetics" in the mass media during the past 2 decades represents an increase in biological determinism in public discourse. To test this contention, we analyzed the contents of a randomly selected, structured sample of American public newspapers (n=250) and magazines (n=722) published during 1919-95. Three coders, using three measures, all with intercoder reliability >85%, were employed. Results indicate that the introduction of the discourse of medical genetics is correlated with both a statistically significant decrease in the degree to which articles attribute human characteristics to genetic causes (P<.001) and a statistically significant increase in the differentiation of attributions to genetic and other causes among various conditions or outcomes (P<.016). There has been no statistically significant change in the relative proportions of physical phenomena attributed to genetic causes, but there has been a statistically significant decrease in the number of articles assigning genetic causes to mental (P<.002) and behavioral (P<.000) characteristics. These results suggest that the current discourse of medical genetics is not accurately described as more biologically deterministic than its antecedents. PMID:9529342
Khang, Young-Ho; Kim, Hye-Ryun
2016-03-22
Investigations into socioeconomic inequalities in mortality have rarely used long-term mortality follow-up data from nationally representative samples in Asian countries, and only a limited subset of indicators of socioeconomic position was employed in prior studies. We examined socioeconomic inequalities in mortality using 12-year follow-up mortality data from nationally representative samples of South Koreans. A total of 10,137 individuals who took part in the 1998 and 2001 Korea National Health and Nutrition Examination Surveys were linked to mortality data from Statistics Korea. Of those individuals, 1,219 (12.1%) had died as of December 2012. Cox proportional hazard models were used to estimate the relative risks of mortality according to a wide range of socioeconomic position (SEP) indicators after taking into account primary sampling units, stratification, and sample weights. Our analysis showed strong evidence that individuals with disadvantaged SEP indicators had greater all-cause mortality risks than their counterparts. The magnitude of the association varied according to gender, age group, and specific SEP indicators. Cause-specific analyses using equivalized income quintiles showed that the magnitude of mortality inequalities tended to be greater for cardiovascular disease and external causes than for cancer. Inequalities in mortality exist across every SEP indicator, in both genders, across age groups, and for four broad causes of death. The South Korean economic development, previously described as effective in achieving both economic growth and relatively equitable income distribution, should be scrutinized regarding its impact on socioeconomic mortality inequalities. Policy measures to reduce inequalities in mortality should be implemented in South Korea.
NASA Astrophysics Data System (ADS)
Thurner, Stefan; Corominas-Murtra, Bernat; Hanel, Rudolf
2017-09-01
There are at least three distinct ways to conceptualize entropy: entropy as an extensive thermodynamic quantity of physical systems (Clausius, Boltzmann, Gibbs), entropy as a measure for information production of ergodic sources (Shannon), and entropy as a means for statistical inference on multinomial processes (Jaynes maximum entropy principle). Even though these notions represent fundamentally different concepts, the functional form of the entropy for thermodynamic systems in equilibrium, for ergodic sources in information theory, and for independent sampling processes in statistical systems is degenerate, H(p) = -∑_i p_i log p_i. For many complex systems, which are typically history-dependent, nonergodic, and nonmultinomial, this is no longer the case. Here we show that for such processes, the three entropy concepts lead to different functional forms of entropy, which we will refer to as S_EXT for extensive entropy, S_IT for the source information rate in information theory, and S_MEP for the entropy functional that appears in the so-called maximum entropy principle, which characterizes the most likely observable distribution functions of a system. We explicitly compute these three entropy functionals for three concrete examples: for Pólya urn processes, which are simple self-reinforcing processes; for sample-space-reducing (SSR) processes, which are simple history-dependent processes that are associated with power-law statistics; and finally for multinomial mixture processes.
Rogala, James T.; Gray, Brian R.
2006-01-01
The Long Term Resource Monitoring Program (LTRMP) uses a stratified random sampling design to obtain water quality statistics within selected study reaches of the Upper Mississippi River System (UMRS). LTRMP sampling strata are based on aquatic area types generally found in large rivers (e.g., main channel, side channel, backwater, and impounded areas). For hydrologically well-mixed strata (i.e., main channel), variance associated with spatial scales smaller than the strata scale is a relatively minor issue for many water quality parameters. However, analysis of LTRMP water quality data has shown that within-strata variability at the strata scale is high in off-channel areas (i.e., backwaters). A portion of that variability may be associated with differences among individual backwater lakes (i.e., small and large backwater regions separated by channels) that cumulatively make up the backwater stratum. The objective of the statistical modeling presented here is to determine if differences among backwater lakes account for a large portion of the variance observed in the backwater stratum for selected parameters. If variance associated with backwater lakes is high, then inclusion of backwater lake effects within statistical models is warranted. Further, lakes themselves may represent natural experimental units where associations of interest to management may be estimated.
[Flavouring estimation of quality of grape wines with use of methods of mathematical statistics].
Yakuba, Yu F; Khalaphyan, A A; Temerdashev, Z A; Bessonov, V V; Malinkin, A D
2016-01-01
The formation of an integral flavour quality score for wine during tasting is discussed, and the advantages and disadvantages of the existing procedures are described. The materials investigated were natural white and red wines from Russian manufacturers, produced with traditional technologies from Vitis vinifera, direct hybrids, blends, and experimental wines (more than 300 different samples). The aim of the research was to establish, by methods of mathematical statistics, the correlation between the content of nonvolatile wine components and the wine's tasting quality rating. The contents of organic acids, amino acids and cations in the wines were considered as the main factors influencing flavour, since they largely define the beverage's quality. These components were determined in the wine samples by the capillary electrophoresis system «CAPEL». In parallel with the analytical characterization of the wine samples, a representative group of specialists carried out a tasting evaluation of the wines using a 100-point scale. The possibility of statistically modelling the correlation between the wine tasting score and the analytical data on amino acids and cations, which reasonably describe the wine's flavour, was examined. Statistical modelling of the correlation between the tasting score and the content of the major cations (ammonium, potassium, sodium, magnesium, calcium) and free amino acids (proline, threonine, arginine), taking into account their level of influence on flavour and their analytical values within fixed quality-compliance limits, was carried out with Statistica. Adequate statistical models were constructed that can predict the tasting score, that is, determine the wine's quality from the content of the components forming its flavour properties. It is emphasized that, along with aromatic (volatile) substances, nonvolatile components - mineral substances and amino acids such as proline, threonine and arginine - influence the wine's flavour properties, and that these nonvolatile components contribute to the organoleptic and flavour quality rating of wines just as aromatic volatile substances do, taking part in forming the experts' evaluation.
Dzul, Maria C.; Dixon, Philip M.; Quist, Michael C.; Dinsomore, Stephen J.; Bower, Michael R.; Wilson, Kevin P.; Gaines, D. Bailey
2013-01-01
We used variance components to assess allocation of sampling effort in a hierarchically nested sampling design for ongoing monitoring of early life history stages of the federally endangered Devils Hole pupfish (DHP) (Cyprinodon diabolis). Sampling design for larval DHP included surveys (5 days each spring 2007–2009), events, and plots. Each survey was comprised of three counting events, where DHP larvae on nine plots were counted plot by plot. Statistical analysis of larval abundance included three components: (1) evaluation of power from various sample size combinations, (2) comparison of power in fixed and random plot designs, and (3) assessment of yearly differences in the power of the survey. Results indicated that increasing the sample size at the lowest level of sampling represented the most realistic option to increase the survey's power, fixed plot designs had greater power than random plot designs, and the power of the larval survey varied by year. This study provides an example of how monitoring efforts may benefit from coupling variance components estimation with power analysis to assess sampling design.
Sanna, Daria; Pala, Maria; Cossu, Piero; Dedola, Gian Luca; Melis, Sonia; Fresu, Giovanni; Morelli, Laura; Obinu, Domenica; Tonolo, Giancarlo; Secchi, Giannina; Triunfo, Riccardo; Lorenz, Joseph G.; Scheinfeldt, Laura; Torroni, Antonio; Robledo, Renato; Francalacci, Paolo
2011-01-01
We report a sampling strategy based on Mendelian Breeding Units (MBUs), representing an interbreeding group of individuals sharing a common gene pool. The identification of MBUs is crucial for case-control experimental design in association studies. The aim of this work was to evaluate the possible existence of bias in terms of genetic variability and haplogroup frequencies in the MBU sample, due to severe sample selection. In order to reach this goal, the MBU sampling strategy was compared to a standard selection of individuals according to their surname and place of birth. We analysed mitochondrial DNA variation (first hypervariable segment and coding region) in unrelated healthy subjects from two different areas of Sardinia: the area around the town of Cabras and the western Campidano area. No statistically significant differences were observed when the two sampling methods were compared, indicating that the stringent sample selection needed to establish a MBU does not alter original genetic variability and haplogroup distribution. Therefore, the MBU sampling strategy can be considered a useful tool in association studies of complex traits. PMID:21734814
Grey literature in meta-analyses.
Conn, Vicki S; Valentine, Jeffrey C; Cooper, Harris M; Rantz, Marilyn J
2003-01-01
In meta-analysis, researchers combine the results of individual studies to arrive at cumulative conclusions. Meta-analysts sometimes include "grey literature" in their evidential base, which includes unpublished studies and studies published outside widely available journals. Because grey literature is a source of data that might not employ peer review, critics have questioned the validity of its data and the results of meta-analyses that include it. To examine evidence regarding whether grey literature should be included in meta-analyses and strategies to manage grey literature in quantitative synthesis. This article reviews evidence on whether the results of studies published in peer-reviewed journals are representative of results from broader samplings of research on a topic as a rationale for inclusion of grey literature. Strategies to enhance access to grey literature are addressed. The most consistent and robust difference between published and grey literature is that published research is more likely to contain results that are statistically significant. Effect size estimates of published research are about one-third larger than those of unpublished studies. Unfunded and small sample studies are less likely to be published. Yet, importantly, methodological rigor does not differ between published and grey literature. Meta-analyses that exclude grey literature likely (a) over-represent studies with statistically significant findings, (b) inflate effect size estimates, and (c) provide less precise effect size estimates than meta-analyses including grey literature. Meta-analyses should include grey literature to fully reflect the existing evidential base and should assess the impact of methodological variations through moderator analysis.
Probing Protein Fold Space with a Simplified Model
Minary, Peter; Levitt, Michael
2008-01-01
We probe the stability and near-native energy landscape of protein fold space using powerful conformational sampling methods together with simple reduced models and statistical potentials. Fold space is represented by a set of 280 protein domains spanning all topological classes and having a wide range of lengths (0-300 residues), amino acid composition, and number of secondary structural elements. The degrees of freedom are taken as the loop torsion angles. This choice preserves the native secondary structure but allows the tertiary structure to change. The proteins are represented by three-point-per-residue, three-dimensional models with statistical potentials derived from a knowledge-based study of known protein structures. When this space is sampled by a combination of Parallel Tempering and Equi-Energy Monte Carlo, we find that the three-point model captures the known stability of protein native structures with stable energy basins that are near-native (all-α: 4.77 Å, all-β: 2.93 Å, α/β: 3.09 Å, α+β: 4.89 Å on average, and within 6 Å for 71.41%, 92.85%, 94.29% and 64.28% of the all-α, all-β, α/β and α+β classes, respectively). Denatured structures also occur, and these have interesting structural properties that shed light on the different landscape characteristics of α and β folds. We find that α/β proteins with alternating α and β segments (such as the beta-barrel) are more stable than proteins in other fold classes. PMID:18054792
Effect of the menstrual cycle on voice quality.
Silverman, E M; Zimmer, C H
1978-01-01
The question addressed was whether most young women with no vocal training exhibit premenstrual hoarseness. Spectral (acoustical) analyses of the sustained productions of three vowels produced by 20 undergraduates at ovulation and at premenstruation were rated for degree of hoarseness. Statistical analysis of the data indicated that the typical subject was no more hoarse at premenstruation than at ovulation. To determine whether this finding represented a genuine characteristic of women's voices or a type II statistical error, a systematic replication was undertaken with another sample of 27 undergraduates. The finding replicated that of the original investigation, suggesting that premenstrual hoarseness is a rarely occurring condition among young women with no vocal training. The apparent differential effect of the menstrual cycle on trained as opposed to untrained voices deserves systematic investigation.
Adequacy of laser diffraction for soil particle size analysis
Fisher, Peter; Aumann, Colin; Chia, Kohleth; O'Halloran, Nick; Chandra, Subhash
2017-01-01
Sedimentation has been a standard methodology for particle size analysis since the early 1900s. In recent years laser diffraction has begun to replace sedimentation as the preferred technique in some industries, such as marine sediment analysis. However, for the particle size analysis of soils, which have a diverse range of both particle size and shape, laser diffraction still requires evaluation of its reliability. In this study, the sedimentation-based sieve plummet balance method and the laser diffraction method were used to measure the particle size distribution of 22 soil samples representing four contrasting Australian Soil Orders. Initially, a precise wet riffling methodology was developed, capable of obtaining representative samples within the recommended obscuration range for laser diffraction. It was found that repeatable results were obtained even if measurements were made at the extreme ends of the manufacturer's recommended obscuration range. Results from statistical analysis suggested that the use of sample pretreatment to remove soil organic carbon (and possible traces of calcium-carbonate content) made minor differences to the laser diffraction particle size distributions compared with no pretreatment. These differences were found to be marginally statistically significant in the Podosol topsoil and Vertosol subsoil. There are well-known reasons why sedimentation methods may be considered to 'overestimate' plate-like clay particles, while laser diffraction will 'underestimate' the proportion of clay particles. In this study we used Lin's concordance correlation coefficient to determine the equivalence of laser diffraction and sieve plummet balance results. The results suggested that the laser diffraction equivalent thresholds corresponding to the sieve plummet balance cumulative particle sizes of < 2 μm, < 20 μm, and < 200 μm were < 9 μm, < 26 μm, and < 275 μm, respectively. The many advantages of laser diffraction for soil particle size analysis, and the empirical results of this study, suggest that deployment of laser diffraction as a standard test procedure can provide reliable results, provided consistent sample preparation is used. PMID:28472043
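Lin's concordance correlation coefficient combines Pearson correlation with a penalty for location and scale shifts, rho_c = 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2); a value of 1 means the two methods agree exactly on the 1:1 line. A short sketch with hypothetical clay-fraction pairs (not the study's measurements):

    import numpy as np

    def lins_ccc(x, y):
        # Agreement with the 45-degree line: precision (correlation) times
        # accuracy (penalizing mean and variance differences).
        mx, my = x.mean(), y.mean()
        sxy = np.mean((x - mx) * (y - my))
        return 2.0 * sxy / (x.var() + y.var() + (mx - my) ** 2)

    # Hypothetical clay fractions (%) for the same samples by both methods
    sieve_plummet = np.array([42.0, 18.0, 55.0, 30.0, 12.0, 60.0, 25.0])
    laser = np.array([28.0, 11.0, 40.0, 20.0, 7.0, 45.0, 16.0])
    print(round(lins_ccc(sieve_plummet, laser), 3))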
Identifying ontogenetic, environmental and individual components of forest tree growth
Chaubert-Pereira, Florence; Caraglio, Yves; Lavergne, Christian; Guédon, Yann
2009-01-01
Background and Aims This study aimed to identify and characterize the ontogenetic, environmental and individual components of forest tree growth. In the proposed approach, the tree growth data typically correspond to the retrospective measurement of annual shoot characteristics (e.g. length) along the trunk. Methods Dedicated statistical models (semi-Markov switching linear mixed models) were applied to data sets of Corsican pine and sessile oak. In the semi-Markov switching linear mixed models estimated from these data sets, the underlying semi-Markov chain represents both the succession of growth phases and their lengths, while the linear mixed models represent both the influence of climatic factors and the inter-individual heterogeneity within each growth phase. Key Results On the basis of these integrative statistical models, it is shown that growth phases are not only defined by average growth level but also by growth fluctuation amplitudes in response to climatic factors and inter-individual heterogeneity and that the individual tree status within the population may change between phases. Species plasticity affected the response to climatic factors while tree origin, sampling strategy and silvicultural interventions impacted inter-individual heterogeneity. Conclusions The transposition of the proposed integrative statistical modelling approach to cambial growth in relation to climatic factors and the study of the relationship between apical growth and cambial growth constitute the next steps in this research. PMID:19684021
Suzuki, Takakuni; Griffin, Sarah A; Samuel, Douglas B
2017-04-01
Several studies have shown structural and statistical similarities between the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) alternative personality disorder model and the Five-Factor Model (FFM). However, no study to date has evaluated the nomological network similarities between the two models. The relations of the Revised NEO Personality Inventory (NEO PI-R) and the Personality Inventory for DSM-5 (PID-5) with relevant criterion variables were examined in a sample of 336 undergraduate students (M age = 19.4; 59.8% female). The resulting profiles for each instrument were statistically compared for similarity. Four of the five domains of the two models have highly similar nomological networks, with the exception being FFM Openness to Experience and PID-5 Psychoticism. Further probing of that pair suggested that the NEO PI-R domain scores obscured meaningful similarity between PID-5 Psychoticism and specific aspects and lower-order facets of Openness. The results support the notion that the DSM-5 alternative personality disorder model trait domains represent variants of the FFM domains. Similarities of Openness and Psychoticism domains were supported when the lower-order aspects and facets of Openness domain were considered. The findings support the view that the DSM-5 trait model represents an instantiation of the FFM. © 2015 Wiley Periodicals, Inc.
Daigle, Courtney L; Siegford, Janice M
2014-03-01
Continuous observation is the most accurate way to determine animals' actual time budget and can provide a 'gold standard' representation of resource use, behavior frequency, and duration. Continuous observation is useful for capturing behaviors that are of short duration or occur infrequently. However, collecting continuous data is labor-intensive and time-consuming, making multiple-individual or long-term data collection difficult. Six non-cage laying hens were video recorded for 15 h, and behavioral data collected every 2 s were compared with data collected using scan sampling intervals of 5, 10, 15, 30, and 60 min and subsamples of 2-second observations performed for 10 min every 30 min, 15 min every 1 h, 30 min every 1.5 h, and 15 min every 2 h. Three statistical approaches were used to provide a comprehensive analysis to examine the quality of the data obtained via different sampling methods. General linear mixed models identified how the time budget from the sampling techniques differed from continuous observation. Correlation analysis identified how strongly results from the sampling techniques were associated with those from continuous observation. Regression analysis identified how well the results from the sampling techniques predicted those from continuous observation, whether the magnitude of estimates changed, and whether a sampling technique was biased. Static behaviors were well represented with scan and time sampling techniques, while dynamic behaviors were best represented with time sampling techniques. Methods for identifying an appropriate sampling strategy based upon the type of behavior of interest are outlined and results for non-caged laying hens are presented. Copyright © 2013 Elsevier B.V. All rights reserved.
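To make the comparison concrete, the sketch below simulates one binary behavior as a two-state chain observed every 2 s and then recomputes the time budget from instantaneous scans at the intervals used in the study. The transition probabilities and the single simulated bird are assumptions for illustration only.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 15 * 3600 // 2                   # 15 h of 2-s observation intervals
    p_start, p_stop = 0.01, 0.03         # assumed behavior on/off probabilities
    state, states = 0, np.empty(n, dtype=int)
    for i in range(n):
        if state == 0 and rng.random() < p_start:
            state = 1
        elif state == 1 and rng.random() < p_stop:
            state = 0
        states[i] = state

    continuous_budget = states.mean()    # 'gold standard' proportion of time
    for minutes in (5, 10, 15, 30, 60):
        step = minutes * 60 // 2         # scan interval in 2-s units
        scan_budget = states[::step].mean()
        print(f"{minutes:>2}-min scans: {scan_budget:.3f} vs continuous {continuous_budget:.3f}")

Rerunning with faster transition probabilities (shorter, rarer bouts) shows why dynamic behaviors degrade more quickly under sparse scan intervals than static ones.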
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Yousu; Etingov, Pavel V.; Ren, Huiying
This paper describes a probabilistic look-ahead contingency analysis application that incorporates smart sampling and high-performance computing (HPC) techniques. Smart sampling techniques are implemented to effectively represent the structure and statistical characteristics of uncertainty introduced by different sources in the power system. They can significantly reduce the data set size required for multiple look-ahead contingency analyses, and therefore reduce the time required to compute them. HPC techniques are used to further reduce computational time. These two techniques enable a predictive capability that forecasts the impact of various uncertainties on potential transmission limit violations. The developed package has been tested with real-world data from the Bonneville Power Administration. Case study results are presented to demonstrate the performance of the applications developed.
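The paper does not publish its sampling code; as one hedged illustration of the smart-sampling idea, the sketch below uses Latin hypercube sampling so that a small scenario set still covers the uncertainty space evenly, which is the property that lets a contingency analysis run on far fewer cases than plain Monte Carlo. The two uncertain inputs and their distributions are invented for the example.

    import numpy as np
    from scipy.stats import qmc, norm

    d, n = 2, 50                              # two uncertain inputs, 50 scenarios
    sampler = qmc.LatinHypercube(d=d, seed=7)
    u = sampler.random(n)                     # stratified points in the unit square

    # Map to assumed distributions: load forecast error ~ N(0, 50 MW),
    # wind forecast error ~ N(0, 120 MW) -- purely illustrative parameters
    load_err = norm.ppf(u[:, 0], loc=0.0, scale=50.0)
    wind_err = norm.ppf(u[:, 1], loc=0.0, scale=120.0)

    # Each row is one input scenario for a look-ahead contingency run
    scenarios = np.column_stack([load_err, wind_err])
    print(scenarios[:3])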
Thoresen, Carl J; Bradley, Jill C; Bliese, Paul D; Thoresen, Joseph D
2004-10-01
This study extends the literature on personality and job performance through the use of random coefficient modeling to test the validity of the Big Five personality traits in predicting overall sales performance and sales performance trajectories--or systematic patterns of performance growth--in 2 samples of pharmaceutical sales representatives at maintenance and transitional job stages (K. R. Murphy, 1989). In the maintenance sample, conscientiousness and extraversion were positively associated with between-person differences in total sales, whereas only conscientiousness predicted performance growth. In the transitional sample, agreeableness and openness to experience predicted overall performance differences and performance trends. All effects remained significant with job tenure statistically controlled. Possible explanations for these findings are offered, and theoretical and practical implications of findings are discussed. (c) 2004 APA, all rights reserved
Statistical power as a function of Cronbach alpha of instrument questionnaire items.
Heo, Moonseong; Kim, Namhee; Faith, Myles S
2015-10-14
In countless clinical trials, measurement of outcomes relies on instrument questionnaire items, which often suffer from measurement error that in turn reduces the statistical power of study designs. The Cronbach alpha, or coefficient alpha, here denoted by C(α), can be used as a measure of internal consistency of parallel instrument items that are developed to measure a target unidimensional outcome construct. The scale score for the target construct is often represented by the sum of the item scores. However, power functions based on C(α) have been lacking for various study designs. We formulate a statistical model for parallel items to derive power functions as a function of C(α) under several study designs. To this end, we assume a fixed true-score variance, as opposed to the usual fixed total variance. That assumption is critical and practically relevant to show that smaller measurement errors are associated with higher inter-item correlations, and thus that greater C(α) is associated with greater statistical power. We compare the derived theoretical statistical power with empirical power obtained through Monte Carlo simulations for the following comparisons: one-sample comparison of pre- and post-treatment mean differences, two-sample comparison of pre-post mean differences between groups, and two-sample comparison of mean differences between groups. It is shown that C(α) is the same as a test-retest correlation of the scale scores of parallel items, which enables testing the significance of C(α). Closed-form power functions and sample size determination formulas are derived in terms of C(α) for all of the aforementioned comparisons. Power functions are shown to be an increasing function of C(α), regardless of the comparison of interest. The derived power functions are well validated by simulation studies showing that the magnitudes of theoretical power are virtually identical to those of the empirical power. Regardless of research designs or settings, in order to increase statistical power, the development and use of instruments with greater C(α), or equivalently with greater inter-item correlations, is crucial for trials that intend to use questionnaire items for measuring research outcomes. Further development of the power functions for binary or ordinal item scores and under more general item correlation structures reflecting more real-world situations would be a valuable future study.
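The paper's central relationship can be reproduced empirically. The sketch below simulates parallel items with a fixed true-score variance and shrinking error variance, computes Cronbach's alpha from the item covariance matrix, and estimates two-sample power by Monte Carlo; all parameter values are assumptions chosen for illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def simulate(k, sigma_true, sigma_err, delta, n, reps=2000):
        # Returns mean Cronbach alpha and empirical two-sample power
        alphas, hits = [], 0
        for _ in range(reps):
            def group(shift):
                t = rng.normal(shift, sigma_true, size=(n, 1))    # true scores
                return t + rng.normal(0, sigma_err, size=(n, k))  # parallel items
            a, b = group(0.0), group(delta)
            c = np.cov(a, rowvar=False)                           # item covariance
            alphas.append(k / (k - 1) * (1 - np.trace(c) / c.sum()))
            if stats.ttest_ind(a.sum(axis=1), b.sum(axis=1)).pvalue < 0.05:
                hits += 1
        return np.mean(alphas), hits / reps

    for sigma_err in (2.0, 1.0, 0.5):       # smaller item error -> larger alpha
        alpha, power = simulate(k=5, sigma_true=1.0, sigma_err=sigma_err,
                                delta=0.5, n=50)
        print(f"alpha ~ {alpha:.2f}, empirical power ~ {power:.2f}")

Because the true-score variance is held fixed, alpha rises only through smaller error variance, and the printed power rises with it, matching the paper's monotonicity result.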
Mitchell, P J; Sarmiento, E E; Meldrum, D J
2012-10-01
Based on comparisons to non-statistically representative samples of humans and two great ape species (i.e. common chimpanzees Pan troglodytes and lowland gorillas Gorilla gorilla), Ward et al. (2011) concluded that a complete hominin fourth metatarsal (4th MT) from Hadar, AL 333-160, belonged to a committed terrestrial biped with fixed transverse and longitudinal pedal arches, which was no longer under selection favoring substantial arboreal behaviors. According to Ward et al., the Hadar 4th MT had (1) a torsion value indicating a transverse arch, (2) sagittal plane angles between the diaphyseal long axis and the planes of the articular surfaces indicating a longitudinal arch, and (3) a narrow mediolateral to dorsoplantar base ratio, an ectocuneiform facet, and tarsal articular surface contours all indicating a rigid foot without an ape-like mid-tarsal break. Comparisons of the Hadar 4th MT characters to those of statistically representative samples of humans, all five great ape species, baboons and proboscis monkeys show that none of the correlations Ward et al. make to localized foot function were supported by this analysis. The Hadar 4th MT characters are common to catarrhines that have a midtarsal break and lack fixed transverse or longitudinal arches. Further comparison of the AL 333-160 4th MT length, and base, midshaft and head circumferences to those of catarrhines with field collected body weights show that this bone is uniquely short with a large base. Its length suggests the AL 333-160 individual was a poor leaper with limited arboreal behaviors and lacked a longitudinal arch, i.e. its 4th MT long axis was usually held perpendicular to gravity. Its large base implies cuboid-4th MT joint mobility. A relatively short 4th MT head circumference indicates AL 333-160 had small proximal phalanges with a restricted range of mobility. Overall, AL 333-160 is most similar to the 4th MT of eastern gorillas, a slow moving quadruped that sacrifices arboreal behaviors for terrestrial ones. This study highlights evolutionary misconceptions underlying the practice of using localized anatomy and/or a single bony element to reconstruct overall locomotor behaviors and of summarizing great ape structure and behavior based on non-statistically representative samples of only a few living great ape species. Copyright © 2012. Published by Elsevier GmbH.
Landrine, Hope; Corral, Irma
2014-01-01
To conduct meaningful, epidemiologic research on racial–ethnic health disparities, racial–ethnic samples must be rendered equivalent on other social status and contextual variables via statistical controls of those extraneous factors. The racial–ethnic groups must also be equally familiar with and have similar responses to the methods and measures used to collect health data, must have equal opportunity to participate in the research, and must be equally representative of their respective populations. In the absence of such measurement equivalence, studies of racial–ethnic health disparities are confounded by a plethora of unmeasured, uncontrolled correlates of race–ethnicity. Those correlates render the samples, methods, and measures incomparable across racial–ethnic groups, and diminish the ability to attribute health differences discovered to race–ethnicity vs. to its correlates. This paper reviews the non-equivalent yet normative samples, methodologies and measures used in epidemiologic studies of racial–ethnic health disparities, and provides concrete suggestions for improving sample, method, and scalar measurement equivalence. PMID:25566524
The occurrence and distribution of trace metals in the Mississippi River and its tributaries
Taylor, Howard E.; Garbarino, J.R.; Brinton, T.I.
1990-01-01
Quantitative and semiquantitative analyses of dissolved trace metals are reported for designated sampling sites on the Mississippi River and its main tributaries utilizing depth-integrated and width-integrated sampling technology to collect statistically representative samples. Data are reported for three sampling periods, including: July-August 1987, November-December 1987, and May-June 1988. Concentrations of Al, As, Ba, Be, Cd, Co, Cr, Cu, Fe, Li, Mn, Mo, Pb, Sr, Tl, U, V, and Zn are reported quantitatively, with the remainder of the stable metals in the periodic table reported semiquantitatively. Correlations between As and V, Ba and U, Cu and Zn, Li and Ba, and Li and U are significant at the 99% confidence level for each of the sampling trips. Comparison of the results of this study for selected metals with other published data show generally good agreement for Cr, Cu, Fe, and Zn, moderate agreement for Mo, and poor agreement for Cd and V.
Lu, Alex Y; Turban, Jack L; Damisah, Eyiyemisi C; Li, Jie; Alomari, Ahmed K; Eid, Tore; Vortmeyer, Alexander O; Chiang, Veronica L
2017-08-01
OBJECTIVE Following an initial response of brain metastases to Gamma Knife radiosurgery, regrowth of the enhancing lesion as detected on MRI may represent either radiation necrosis (a treatment-related inflammatory change) or recurrent tumor. Differentiation of radiation necrosis from tumor is vital for management decision making but remains difficult by imaging alone. In this study, gas chromatography with time-of-flight mass spectrometry (GC-TOF) was used to identify differential metabolite profiles of the 2 tissue types obtained by surgical biopsy to find potential targets for noninvasive imaging. METHODS Specimens of pure radiation necrosis and pure tumor obtained from patient brain biopsies were flash-frozen and validated histologically. These formalin-free tissue samples were then analyzed using GC-TOF. The metabolite profiles of radiation necrosis and tumor samples were compared using multivariate and univariate statistical analysis. Statistical significance was defined as p ≤ 0.05. RESULTS For the metabolic profiling, GC-TOF was performed on 7 samples of radiation necrosis and 7 samples of tumor. Of the 141 metabolites identified, 17 (12.1%) were found to be statistically significantly different between comparison groups. Of these metabolites, 6 were increased in tumor, and 11 were increased in radiation necrosis. An unsupervised hierarchical clustering analysis found that tumor had elevated levels of metabolites associated with energy metabolism, whereas radiation necrosis had elevated levels of metabolites that were fatty acids and antioxidants/cofactors. CONCLUSIONS To the authors' knowledge, this is the first tissue-based metabolomics study of radiation necrosis and tumor. Radiation necrosis and recurrent tumor following Gamma Knife radiosurgery for brain metastases have unique metabolite profiles that may be targeted in the future to develop noninvasive metabolic imaging techniques.
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale (texture) is proposed by multivariate statistical analysis according to instrumental measurement and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework is verified by establishing a standard reference scale for the texture attribute hardness with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R(2) = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
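A hedged sketch of the selection pipeline described here: standardize the instrumental measurements, reduce them to their dominant axes, cluster, and nominate the most central food in each cluster as a candidate reference. The feature matrix below is random placeholder data, and the nine clusters mirror the nine reference foods.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(3)
    X = rng.normal(size=(100, 4))    # placeholder hardness-related TPA features

    Z = StandardScaler().fit_transform(X)
    scores = PCA(n_components=2).fit_transform(Z)

    km = KMeans(n_clusters=9, n_init=10, random_state=3).fit(scores)
    # One candidate reference per cluster: the sample closest to its centroid
    for j, c in enumerate(km.cluster_centers_):
        members = np.where(km.labels_ == j)[0]
        best = members[np.argmin(np.linalg.norm(scores[members] - c, axis=1))]
        print(f"cluster {j}: representative sample index {best}")

In the study, stability and substitutability screens (relative standard deviation, ANOVA across repeated measurements) would further filter these candidates.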
Batt, Angela L; Wathen, John B; Lazorchak, James M; Olsen, Anthony R; Kincaid, Thomas M
2017-03-07
U.S. EPA conducted a national statistical survey of fish tissue contamination at 540 river sites (representing 82 954 river km) in 2008-2009, and analyzed samples for 50 persistent organic pollutants (POPs), including 21 PCB congeners, 8 PBDE congeners, and 21 organochlorine pesticides. The survey results were used to provide national estimates of contamination for these POPs. PCBs were the most abundant, being measured in 93.5% of samples. Summed concentrations of the 21 PCB congeners had a national weighted mean of 32.7 μg/kg and a maximum concentration of 857 μg/kg, and exceeded the human health cancer screening value of 12 μg/kg in 48% of the national sampled population of river km, and in 70% of the urban sampled population. PBDEs (92.0%), chlordane (88.5%) and DDT (98.7%) were also detected frequently, although at lower concentrations. Results were examined by subpopulations of rivers, including urban or nonurban and three defined ecoregions. PCBs, PBDEs, and DDT occur at significantly higher concentrations in fish from urban rivers versus nonurban; however, the distribution varied more among the ecoregions. Wildlife screening values previously published for bird and mammalian species were converted from whole fish to fillet screening values, and used to estimate risk for wildlife through fish consumption.
Assigning African elephant DNA to geographic region of origin: Applications to the ivory trade
Wasser, Samuel K.; Shedlock, Andrew M.; Comstock, Kenine; Ostrander, Elaine A.; Mutayoba, Benezeth; Stephens, Matthew
2004-01-01
Resurgence of illicit trade in African elephant ivory is placing the elephant at renewed risk. Regulation of this trade could be vastly improved by the ability to verify the geographic origin of tusks. We address this need by developing a combined genetic and statistical method to determine the origin of poached ivory. Our statistical approach exploits a smoothing method to estimate geographic-specific allele frequencies over the entire African elephants' range for 16 microsatellite loci, using 315 tissue and 84 scat samples from forest (Loxodonta africana cyclotis) and savannah (Loxodonta africana africana) elephants at 28 locations. These geographic-specific allele frequency estimates are used to infer the geographic origin of DNA samples, such as could be obtained from tusks of unknown origin. We demonstrate that our method alleviates several problems associated with standard assignment methods in this context, and the absolute accuracy of our method is high. Continent-wide, 50% of samples were located within 500 km, and 80% within 932 km of their actual place of origin. Accuracy varied by region (median accuracies: West Africa, 135 km; Central Savannah, 286 km; Central Forest, 411 km; South, 535 km; and East, 697 km). In some cases, allele frequencies vary considerably over small geographic regions, making much finer discriminations possible and suggesting that resolution could be further improved by collection of samples from locations not represented in our study. PMID:15459317
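The assignment step can be caricatured as a likelihood comparison: score a query genotype against each candidate region's allele frequencies and pick the maximum. Everything below (three loci, three regions, the frequencies, the tusk genotype) is invented to show the principle; the actual method instead smooths allele frequencies continuously over space and uses 16 loci.

    import numpy as np

    # Assumed frequencies of one allele at 3 loci, per region (illustrative only)
    freqs = {
        "west": np.array([0.10, 0.55, 0.30]),
        "central": np.array([0.35, 0.40, 0.60]),
        "east": np.array([0.70, 0.20, 0.45]),
    }

    def genotype_loglik(copies, p):
        # Log-likelihood of allele-copy counts (0/1/2 per locus) under Hardy-Weinberg
        ll = 0.0
        for k, pi in zip(copies, np.clip(p, 1e-6, 1 - 1e-6)):
            probs = {0: (1 - pi) ** 2, 1: 2 * pi * (1 - pi), 2: pi ** 2}
            ll += np.log(probs[k])
        return ll

    tusk = [2, 0, 1]    # hypothetical genotype of a tusk of unknown origin
    lls = {r: genotype_loglik(tusk, f) for r, f in freqs.items()}
    print(lls, "->", max(lls, key=lls.get))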
Cluster Stability Estimation Based on a Minimal Spanning Trees Approach
NASA Astrophysics Data System (ADS)
Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora
2009-08-01
Among the areas of data and text mining which are employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples; in effect, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotically normal distribution of this statistic. Resting upon this fact, the standard score of the cross-sample edge count is computed, and the partition quality is represented by the worst cluster, i.e., the one with the minimal standard score. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left asymmetry. Numerical experiments presented in the paper demonstrate the ability of the approach to detect the true number of clusters.
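A hedged sketch of the edge-count statistic at the heart of the method: pool two samples, build the Euclidean minimal spanning tree, and count edges whose endpoints come from different samples (the Friedman-Rafsky count). In a stable cluster the samples are well mingled and the count is high.

    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from scipy.sparse.csgraph import minimum_spanning_tree

    rng = np.random.default_rng(5)
    a = rng.normal(0, 1, size=(40, 2))        # sample 1, one cluster
    b = rng.normal(0, 1, size=(40, 2))        # sample 2, same cluster
    pts = np.vstack([a, b])
    label = np.array([0] * 40 + [1] * 40)

    d = squareform(pdist(pts))
    mst = minimum_spanning_tree(d).tocoo()    # the n-1 MST edges
    cross = np.sum(label[mst.row] != label[mst.col])
    print(f"{cross} of {len(mst.data)} MST edges connect the two samples")

Standardizing this count against its null mean and variance gives the per-cluster standard score used to grade partitions.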
Padmavathi, P
2014-01-01
Premenstrual syndrome is the most common of gynaecologic complaints. It affects half of all female adolescents today and represents the leading cause of college/school absenteeism among that population. This study sought to assess the effectiveness of acupressure vs. reflexology on premenstrual syndrome among adolescents. A two-group pre-test and post-test true experimental design was adopted. Forty adolescent girls from Government Girls Secondary School, Erode, with pre-menstrual syndrome fulfilling the inclusion criteria were selected by simple random sampling. A pre-test was conducted by using a premenstrual symptoms assessment scale. Immediately after the pre-test, acupressure or reflexology was given once a week for 6 weeks, and a post-test was then conducted to assess the effectiveness of treatment. Collected data were analysed by using descriptive and inferential statistics. In the post-test, the mean score of the experimental group I sample was 97.3 (SD = 2.5) and the group II mean score was 70.8 (SD = 10.71), with paired 't' values of 19.2 and 31.9. This showed that reflexology was more effective than acupressure in enhancing the practice of the sample regarding pre-menstrual syndrome. Statistically, no significant association was found between the post-test scores of the sample and their demographic variables. The findings imply the need for educating adolescent girls on effective management of pre-menstrual syndrome.
Royle, J. Andrew; Dorazio, Robert M.
2008-01-01
A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including:
* occurrence or occupancy models for estimating species distribution
* abundance models based on many sampling protocols, including distance sampling
* capture-recapture models with individual effects
* spatial capture-recapture models based on camera trapping and related methods
* population and metapopulation dynamic models
* models of biodiversity, community structure and dynamics.
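As a small taste of the first model class in the list above, the sketch below writes down the single-season occupancy likelihood (occupancy probability psi, per-visit detection probability p) and maximizes it on simulated detection histories. The data and parameter values are invented, and the binomial coefficient is omitted because it does not affect the maximizer.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    rng = np.random.default_rng(2)
    n_sites, n_visits = 200, 4
    psi_true, p_true = 0.6, 0.4                     # assumed for the simulation
    z = rng.random(n_sites) < psi_true              # latent occupancy state
    y = (rng.random((n_sites, n_visits)) < p_true) & z[:, None]

    def negloglik(theta):
        psi, p = expit(theta)                       # keep both in (0, 1)
        det = y.sum(axis=1)
        occupied = psi * p ** det * (1 - p) ** (n_visits - det)
        never = psi * (1 - p) ** n_visits + (1 - psi)
        return -np.log(np.where(det > 0, occupied, never)).sum()

    fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
    print("psi_hat, p_hat =", np.round(expit(fit.x), 3))

The 'never detected' branch mixes true absence with occupied-but-missed sites, which is exactly the observation model that the hierarchical framework makes explicit.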
Jimsphere wind and turbulence exceedance statistics
NASA Technical Reports Server (NTRS)
Adelfang, S. I.; Court, A.
1972-01-01
Exceedance statistics of winds and gusts observed over Cape Kennedy with Jimsphere balloon sensors are described. Gust profiles containing positive and negative departures from smoothed profiles, in the wavelength ranges 100-2500, 100-1900, 100-860, and 100-460 meters, were computed from 1578 profiles with four 41-weight digital high-pass filters. Extreme values of the square root of gust speed are normally distributed. Monthly and annual exceedance probability distributions of normalized rms gust speeds in three altitude bands (2-7, 6-11, and 9-14 km) are log-normal. The rms gust speeds are largest in the 100-2500 m wavelength band between 9 and 14 km in late winter and early spring. A study of monthly and annual exceedance probabilities and the number of occurrences per kilometer of level crossings with positive slope indicates significant variability with season, altitude, and filter configuration. A decile sampling scheme is tested and an optimum approach is suggested for drawing a relatively small random sample that represents the characteristic extreme wind speeds and shears of a large parent population of Jimsphere wind profiles.
Jagucki, Martha L.; Kula, Stephanie P.; Mailot, Brian E.
2015-01-01
To evaluate whether constituent concentrations consistently increased or decreased over time, the strength of the association between sampling year (time) and constituent concentration was statistically evaluated for 116 water-quality samples collected by the USGS in 1978, 1980, 1986, 1999, and 2009 from a total of 65 wells across the county (generally domestic wells or wells serving small businesses or churches). Results indicate that many of the constituents that have been analyzed for decades exhibited no consistent temporal trends at a statistically significant level (p-value less than 0.05); fluctuations in concentrations of these constituents represent natural variation in groundwater quality. Dissolved oxygen, calcium, and sulfate concentrations and chloride:bromide ratios increased over time in one or more aquifers, while pH and concentrations of bromide and dissolved organic carbon decreased over time. Detections of total coliform bacteria and nitrate did not become more frequent from 1986 to 2009, even though potential sources of these constituents, such as number of septic systems (linked to population) and percent developed land in the county, increased during this period.
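A minimal version of this kind of trend screening is a rank-based association test between sampling year and concentration. The sketch below applies Kendall's tau to hypothetical sulfate values at one well; the numbers are illustrative, not the study's data.

    from scipy.stats import kendalltau

    years = [1978, 1980, 1986, 1999, 2009]
    sulfate = [18.0, 19.5, 22.1, 25.0, 30.2]    # hypothetical mg/L at one well

    tau, p = kendalltau(years, sulfate)
    verdict = "consistent trend" if p < 0.05 else "no consistent trend"
    print(f"Kendall tau = {tau:.2f}, p = {p:.3f} -> {verdict}")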
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-01-01
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007
Trull, Timothy J.; Vergés, Alvaro; Wood, Phillip K.; Jahng, Seungmin; Sher, Kenneth J.
2013-01-01
We examined the latent structure underlying the criteria for DSM–IV–TR (American Psychiatric Association, 2000, Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author.) personality disorders in a large nationally representative sample of U.S. adults. Personality disorder symptom data were collected using a structured diagnostic interview from approximately 35,000 adults assessed over two waves of data collection in the National Epidemiologic Survey on Alcohol and Related Conditions. Our analyses suggested that a seven-factor solution provided the best fit for the data, and these factors were marked primarily by one or at most two personality disorder criteria sets. A series of regression analyses that used external validators tapping Axis I psychopathology, treatment for mental health problems, functioning scores, interpersonal conflict, and suicidal ideation and behavior provided support for the seven-factor solution. We discuss these findings in the context of previous studies that have examined the structure underlying the personality disorder criteria as well as the current proposals for DSM-5 personality disorders. PMID:22506626
Gender differences in drug-addicted patients in a clinical treatment center of Spain.
Fernandez-Montalvo, Javier; Lopez-Goñi, José J; Azanza, Paula; Cacho, Raul
2014-01-01
This study explored the characteristics of a representative sample of patients who were addicted to drugs and analyzed the differential profile of addicted women and men. A sample of 195 addicted patients (95 female and 100 male) who sought outpatient treatment in a Spanish clinical center was assessed. Information on sociodemographic status, consumption patterns and associated characteristics was collected using the European Addiction Severity Index (EuropASI). The results showed statistically significant differences between groups. Demographically, the differences were centered on employment, with more labor problems in the female group. Regarding addiction severity, the EuropASI results showed statistically significant differences in both the interviewer severity ratings (ISR) and composite scores (CS). Women experienced more severe impacts in the medical, family/social and psychiatric areas. By contrast, addicted men had more severe legal problems than addicted women did. These results suggest that women who seek outpatient treatment in a clinical center presented with more severe addiction problems than men did. Moreover, they reported more significant maladjustment in the various aspects of life explored. © American Academy of Addiction Psychiatry.
Making Sense of 'Big Data' in Provenance Studies
NASA Astrophysics Data System (ADS)
Vermeesch, P.
2014-12-01
Huge online databases can be 'mined' to reveal previously hidden trends and relationships in society. One could argue that sedimentary geology has entered a similar era of 'Big Data', as modern provenance studies routinely apply multiple proxies to dozens of samples. Just like the Internet, sedimentary geology now requires specialised statistical tools to interpret such large datasets. These can be organised on three levels of progressively higher order:
A single sample: The most effective way to reveal the provenance information contained in a representative sample of detrital zircon U-Pb ages is with probability density estimators such as histograms and kernel density estimates. The widely popular 'probability density plots' implemented in IsoPlot and AgeDisplay compound analytical uncertainty with geological scatter and are therefore invalid.
Several samples: Multi-panel diagrams comprising many detrital age distributions or compositional pie charts quickly become unwieldy and uninterpretable. For example, if there are N samples in a study, then the number of pairwise comparisons between samples increases quadratically as N(N-1)/2. This is simply too much information for the human eye to process. To solve this problem, it is necessary to (a) express the 'distance' between two samples as a simple scalar and (b) combine all N(N-1)/2 such values in a single two-dimensional 'map', grouping similar and pulling apart dissimilar samples. This can be easily achieved using simple statistics-based dissimilarity measures and a standard statistical method called Multidimensional Scaling (MDS).
Several methods: Suppose that we use four provenance proxies: bulk petrography, chemistry, heavy minerals and detrital geochronology. This will result in four MDS maps, each of which likely shows slightly different trends and patterns. To deal with such cases, it may be useful to use a related technique called 'three-way multidimensional scaling'. This results in two graphical outputs: an MDS map, and a map of 'weights' showing to what extent the different provenance proxies influence the horizontal and vertical axes of the MDS map. Thus, detrital data can not only inform the user about the provenance of sediments, but also about the causal relationships between the mineralogy, geochronology and chemistry.
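A hedged sketch of the 'several samples' level: one common scalar dissimilarity between two age distributions is the two-sample Kolmogorov-Smirnov statistic, and the resulting matrix can be embedded in two dimensions with metric MDS. The simulated age samples stand in for real detrital zircon data.

    import numpy as np
    from scipy.stats import ks_2samp
    from sklearn.manifold import MDS

    rng = np.random.default_rng(4)
    # Five hypothetical detrital zircon U-Pb age samples (Ma)
    samples = [rng.normal(loc, 80, size=120) for loc in (500, 520, 900, 950, 1500)]

    n = len(samples)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = ks_2samp(samples[i], samples[j]).statistic

    coords = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=0).fit_transform(D)
    print(np.round(coords, 2))    # similar samples plot close together

Samples 1-2 and 3-4 end up adjacent on the map while sample 5 plots far away, which is the N(N-1)/2 comparisons collapsed into one readable picture.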
45 CFR 160.536 - Statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 45 Public Welfare 1 2010-10-01 2010-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...
42 CFR 1003.133 - Statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 42 Public Health 5 2011-10-01 2011-10-01 false Statistical sampling. 1003.133 Section 1003.133... AUTHORITIES CIVIL MONEY PENALTIES, ASSESSMENTS AND EXCLUSIONS § 1003.133 Statistical sampling. (a) In meeting... statistical sampling study as evidence of the number and amount of claims and/or requests for payment as...
45 CFR 160.536 - Statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 45 Public Welfare 1 2011-10-01 2011-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...
42 CFR 1003.133 - Statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 42 Public Health 5 2010-10-01 2010-10-01 false Statistical sampling. 1003.133 Section 1003.133... AUTHORITIES CIVIL MONEY PENALTIES, ASSESSMENTS AND EXCLUSIONS § 1003.133 Statistical sampling. (a) In meeting... statistical sampling study as evidence of the number and amount of claims and/or requests for payment as...
42 CFR 405.1064 - ALJ decisions involving statistical samples.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 42 Public Health 2 2011-10-01 2011-10-01 false ALJ decisions involving statistical samples. 405... Medicare Coverage Policies § 405.1064 ALJ decisions involving statistical samples. When an appeal from the QIC involves an overpayment issue and the QIC used a statistical sample in reaching its...
42 CFR 405.1064 - ALJ decisions involving statistical samples.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 42 Public Health 2 2010-10-01 2010-10-01 false ALJ decisions involving statistical samples. 405... Medicare Coverage Policies § 405.1064 ALJ decisions involving statistical samples. When an appeal from the QIC involves an overpayment issue and the QIC used a statistical sample in reaching its...
Statistical analyses to support guidelines for marine avian sampling. Final report
Kinlan, Brian P.; Zipkin, Elise; O'Connell, Allan F.; Caldow, Chris
2012-01-01
Interest in development of offshore renewable energy facilities has led to a need for high-quality, statistically robust information on marine wildlife distributions. A practical approach is described to estimate the amount of sampling effort required to have sufficient statistical power to identify species-specific “hotspots” and “coldspots” of marine bird abundance and occurrence in an offshore environment divided into discrete spatial units (e.g., lease blocks), where “hotspots” and “coldspots” are defined relative to a reference (e.g., regional) mean abundance and/or occurrence probability for each species of interest. For example, a location with average abundance or occurrence that is three times larger than the mean (3x effect size) could be defined as a “hotspot,” and a location that is three times smaller than the mean (1/3x effect size) as a “coldspot.” The choice of the effect size used to define hotspots and coldspots will generally depend on a combination of ecological and regulatory considerations. A method is also developed for testing the statistical significance of possible hotspots and coldspots. Both methods are illustrated with historical seabird survey data from the USGS Avian Compendium Database. Our approach consists of five main components:
1. A review of the primary scientific literature on statistical modeling of animal group size and avian count data to develop a candidate set of statistical distributions that have been used or may be useful to model seabird counts.
2. Statistical power curves for one-sample, one-tailed Monte Carlo significance tests of differences of observed small-sample means from a specified reference distribution. These curves show the power to detect “hotspots” or “coldspots” of occurrence and abundance at a range of effect sizes, given assumptions which we discuss (a minimal power simulation in this spirit follows this list).
3. A model selection procedure, based on maximum likelihood fits of models in the candidate set, to determine an appropriate statistical distribution to describe counts of a given species in a particular region and season.
4. Using a large database of historical at-sea seabird survey data, we applied this technique to identify appropriate statistical distributions for modeling a variety of species, allowing the distribution to vary by season. For each species and season, we used the selected distribution to calculate and map retrospective statistical power to detect hotspots and coldspots, and map p-values from Monte Carlo significance tests of hotspots and coldspots, in discrete lease blocks designated by the U.S. Department of Interior, Bureau of Ocean Energy Management (BOEM).
5. Because our definition of hotspots and coldspots does not explicitly include variability over time, we examine the relationship between the temporal scale of sampling and the proportion of variance captured in time series of key environmental correlates of marine bird abundance, as well as available marine bird abundance time series, and use these analyses to develop recommendations for the temporal distribution of sampling to adequately represent both short-term and long-term variability.
We conclude by presenting a schematic “decision tree” showing how this power analysis approach would fit in a general framework for avian survey design, and discuss implications of model assumptions and results. We discuss avenues for future development of this work, and recommendations for practical implementation in the context of siting and wildlife assessment for offshore renewable energy development projects.
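The sketch below is a minimal, assumption-laden version of component 2: counts are drawn from a negative binomial (one member of the kind of candidate distribution set described), a one-tailed Monte Carlo critical value is built under the reference mean, and power is the fraction of simulated 3x hotspots whose sample mean exceeds it. The dispersion, means, and number of surveys are invented.

    import numpy as np

    rng = np.random.default_rng(8)

    def nb_counts(mean, size, k=0.5):
        # Negative binomial with mean `mean` and dispersion k (assumed)
        return rng.negative_binomial(k, k / (k + mean), size=size)

    ref_mean, effect, n_surveys, reps = 2.0, 3.0, 25, 2000
    null_means = nb_counts(ref_mean, size=(reps, n_surveys)).mean(axis=1)
    crit = np.quantile(null_means, 0.95)     # one-tailed 5% critical value

    hot_means = nb_counts(ref_mean * effect, size=(reps, n_surveys)).mean(axis=1)
    print(f"power to detect a {effect}x hotspot with {n_surveys} surveys: "
          f"{(hot_means > crit).mean():.2f}")

Sweeping n_surveys traces out a power curve; overdispersed distributions like this one need markedly more effort than a Poisson assumption would suggest.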
AL-Waili, Noori; Al Ghamdi, Ahmad; Ansari, Mohammad Javed; Al-Attal, Yehya; Al-Mubarak, Aarif; Salom, Khelod
2013-05-01
Antibiotic multiresistant microbes represent a challenging problem. Because honey has a potent antibacterial property, the antimicrobial effects of different honey samples against multiresistant pathogens and their compositions were investigated. Five honey samples were used: Talah, Dhahian, Sumra-1, Sidr, and Sumra-2. Samples were analyzed to determine chemical composition such as fructose, glucose, sucrose, pH, total flavonoids, total phenolics, hydrogen peroxide concentration, minerals and trace elements. Antimicrobial activities of the samples against 17 (16 were multiresistant) human pathogenic bacteria and three types of fungi were studied. Specimens of the isolates were cultured into 10 mL of 10-100% (volume/volume) honey diluted in broth. Microbial growth was assessed on solid plate media after 24 h and 72 h incubation. The composition of the honey samples varied considerably. Sumra 1 and 2 contained the highest level of flavonoids and phenolics and the lowest level of hydrogen peroxide, whereas Dhahian honey contained the highest level of hydrogen peroxide. Sixteen pathogens were antibiotic multiresistant. A single dose of each honey sample inhibited all the pathogens tested after 24 h and 72 h incubation. The most sensitive pathogens were Aspergillus nidulans, Salmonella typhimurium and Staphylococcus epidermidis. Although there was no statistically significant difference in the effectiveness of the honey samples, the most effective honey against bacteria was Talah, and the most effective against fungi were Dhahian and Sumra-2. Various honey samples collected from different geographical areas and plant origins showed almost similar antimicrobial activities against multiresistant pathogens despite considerable variation in their composition. Honey may represent an alternative candidate to be tested as part of management of drug multiresistant pathogens. Copyright © 2013 IMSS. Published by Elsevier Inc. All rights reserved.
Characterization of stormwater runoff from bridge decks in eastern Massachusetts, 2014–16
Smith, Kirk P.; Sorenson, Jason R.; Granato, Gregory E.
2018-05-02
The quality of stormwater runoff from bridge decks (hereafter referred to as “bridge-deck runoff”) was characterized in a field study from August 2014 through August 2016 in which concentrations of suspended sediment (SS) and total nutrients were monitored. These new data were collected to supplement existing highway-runoff data collected in Massachusetts, which were deficient in bridge-deck runoff concentration data. Monitoring stations were installed at three bridges maintained by the Massachusetts Department of Transportation in eastern Massachusetts (State Route 2A in the city of Boston, Interstate 90 in the town of Weston, and State Route 20 near Quinsigamond Village in the city of Worcester). The bridges had annual average daily traffic volumes from 21,200 to 124,000 vehicles per day; the land use surrounding the monitoring stations was 25 to 67 percent impervious.
Automatic-monitoring techniques were used to collect more than 160 flow-proportional composite samples of bridge-deck runoff. Samples were analyzed for concentrations of SS, loss on ignition of suspended solids (LOI), particulate carbon (PC), total phosphorus (TP), total dissolved nitrogen (DN), and particulate nitrogen (PN). The distribution of particle size of SS also was determined for composite samples. Samples of bridge-deck runoff were collected year round during rain, mixed precipitation, and snowmelt runoff and with different dry antecedent periods throughout the 2-year sampling period.
At the three bridge-deck-monitoring stations, median concentrations of SS in composite samples of bridge-deck runoff ranged from 1,490 to 2,020 milligrams per liter (mg/L); however, the range of SS in individual composites was vast, from 44 to 142,000 mg/L. Median concentrations of SS were similar in composite samples collected from the State Route 2A and Interstate 90 bridges (2,010 and 2,020 mg/L, respectively) and lowest at the State Route 20 bridge (1,490 mg/L). Concentrations of coarse sediment (greater than 0.25 millimeters in diameter) dominated the SS matrix by more than an order of magnitude. Concentrations of LOI and PC in composite samples ranged from 15 to 1,740 mg/L and 6.68 to 1,360 mg/L, respectively, and generally represented less than 10 and 3 percent of the median mass of SS, respectively. Concentrations of TP in composite samples ranged from 0.09 to 7.02 mg/L; median concentrations of TP ranged from 0.505 to 0.69 mg/L and were highest at the bridge on State Route 2A in Boston. Concentrations of total nitrogen (TN) (the sum of DN and PN) in composite samples were variable (0.36 to 29 mg/L). Median DN concentrations (0.64 to 0.90 mg/L) generally represented about 40 percent of the TN concentration at each bridge and were similar to annual volume-weighted mean concentrations of nitrogen in precipitation in Massachusetts.
Nonparametric statistical methods were used to test for differences in sample constituent concentrations among the three bridges. These results indicated that there are no statistically significant differences in concentrations of SS, LOI, PC, and TP among the three bridges (one-way analysis of variance test on rank-transformed data, 95-percent confidence level). Test results for concentrations of TN in composite samples indicated that concentrations of TN collected on State Route 20 near Quinsigamond Village were significantly higher than concentrations collected on State Route 2A in Boston and Interstate 90 near Weston.
Median concentrations of TN were about 93 and 55 percent lower at State Route 2A and at Interstate 90, respectively, compared to the median concentrations of TN at State Route 20.
Samples of sediment were collected from five fixed locations on each bridge on three occasions during dry weather to calculate semiquantitative distributions of sediment yields on the bridge surface relative to the monitoring location. Mean yields of bridge-deck sediment during this study for State Route 2A in Boston, Interstate 90 near Weston, and State Route 20 near Quinsigamond Village were 1,500, 250, and 5,700 pounds per curb-mile, respectively. Sediment yields at each sampling location varied widely (26 to 25,000 pounds per curb-mile) but were similar to yields reported elsewhere in Massachusetts and the United States. Yields calculated for each sampling location indicated that the sediment was not evenly distributed across each bridge in this study for plausible reasons such as bridge slope, vehicular tracking, and bridge deterioration.
Bridge-deck sediment quality was largely affected by the distribution of sediment particle size. Concentrations of TP in the fine sediment-size fraction (less than 0.0625 millimeter in diameter) of samples of bridge-deck sediment were about 6 times greater than in the coarse size fraction. Concentrations for many total-recoverable metals were 2 to 17 times greater in the fine size fraction compared to concentrations in the coarse size fraction (greater than or equal to 0.25 millimeter in diameter), and concentrations of total-recoverable copper and lead in the fine size fraction were 2 to 65 times higher compared to concentrations in the intermediate (greater than or equal to 0.0625 to 0.25 millimeter in diameter) or the coarse size fraction. However, the proportion of sediment particles less than 0.0625 millimeter in diameter in composite samples of bridge-deck runoff was small (median values range from 4 to 8 percent at each bridge) compared to the larger sediment particle-size mass. As a result, more than 50 percent of the sediment-associated TP, aluminum, chromium, manganese, and nickel was estimated to be associated with the coarse size fraction of the SS load. In contrast, about 95 percent of the estimated sediment-associated copper concentration was associated with the fine size fraction of the SS load.
Version 1.0.2 of the Stochastic Empirical Loading and Dilution Model was used to simulate long-term (29–30-year) concentrations and annual yields of SS, TP, and TN in bridge-deck runoff and in discharges from a hypothetical stormwater treatment best-management practice structure. Three methods (traditional statistics, robust statistics, and L-moments) were used to calculate statistics for stochastic simulations because the high variability in measured concentration values during the field study resulted in extreme simulated concentrations. Statistics of each dataset, including the average, standard deviation, and skew of the common (base 10) logarithms, for each of the three bridges, and for a lumped dataset, were calculated and used for simulations; statistics representing the median of statistics calculated for the three bridges also were used for simulations. These median statistics were selected for the interpretive simulations so that the simulations could be used to estimate concentrations and yields from other, unmonitored bridges in Massachusetts.
Comparisons of the standard and robust statistics indicated that simulation results with either method would be similar, which indicated that the large variability in simulated results was not caused by a few outliers. Comparison to statistics calculated by the L-moments method indicated that L-moments do not produce extreme concentrations; however, they also do not produce results that represent the bulk of the concentration data.
The runoff-quality risk analysis indicated that bridge-deck runoff would exceed discharge standards commonly used for large, advanced wastewater treatment plants, but that commonly used stormwater best-management practices may reduce the percentage of exceedances by one-half. Results of simulations indicated that long-term average yields of TN, TP, and SS may be about 21.4, 6.44, and 40,600 pounds per acre per year, respectively. These yields are about 1.3, 3.4, and 16 times simulated ultra-urban highway yields in Massachusetts; however, simulations indicated that use of a best-management practice structure to treat bridge-deck runoff may reduce discharge yields to about 10, 2.8, and 4,300 pounds per acre per year, respectively.
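L-moments, one of the three methods named above, are linear combinations of order statistics that remain stable under the heavy-tailed concentrations seen in this study. A minimal sketch of the first two sample L-moments via probability-weighted moments follows; the SS values are hypothetical.

    import numpy as np

    def l_moments(x):
        # First two sample L-moments from unbiased probability-weighted moments:
        # l1 = b0 (location), l2 = 2*b1 - b0 (scale)
        x = np.sort(np.asarray(x, dtype=float))
        n = len(x)
        b0 = x.mean()
        i = np.arange(1, n + 1)
        b1 = np.sum((i - 1) / (n - 1) * x) / n
        return b0, 2 * b1 - b0

    ss = [44, 310, 1490, 2020, 8400, 142000]    # hypothetical SS values, mg/L
    l1, l2 = l_moments(np.log10(ss))            # on common logarithms, as in the study
    print(f"l1 = {l1:.3f}, l2 = {l2:.3f}")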
Bird, C B; Hoerner, R J; Restaino, L
2001-01-01
Four different food types along with environmental swabs were analyzed by the Reveal for E. coli O157:H7 test (Reveal) and the Bacteriological Analytical Manual (BAM) culture method for the presence of Escherichia coli O157:H7. Twenty-seven laboratories representing academia and private industry in the United States and Canada participated. Sample types were inoculated with E. coli O157:H7 at 2 different levels. Of the 1,095 samples and controls analyzed and confirmed, 459 were positive and 557 were negative by both methods. No statistical differences (p < 0.05) were observed between the Reveal and BAM methods.
Multinomial logistic regression in workers' health
NASA Astrophysics Data System (ADS)
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, namely in Portugal, it is common to hear people mention that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
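A hedged sketch of the model class named here: statsmodels' MNLogit fit to simulated data in which a burnout score shifts workers toward worse health-perception categories. The three-category outcome, the coefficient, and the variable names are assumptions for brevity; the survey itself used five ordered categories.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 400
    burnout = rng.normal(0, 1, n)                  # standardized burnout score
    latent = -0.8 * burnout + rng.logistic(size=n) # worse health with burnout
    health = pd.cut(latent, bins=[-np.inf, -1, 1, np.inf],
                    labels=[0, 1, 2]).astype(int)  # 0=poor, 1=fair, 2=good

    X = sm.add_constant(pd.DataFrame({"burnout": burnout}))
    fit = sm.MNLogit(health, X).fit(disp=False)
    print(fit.params)                              # log-odds vs baseline category
    print(fit.predict(np.array([[1.0, 2.0]])))     # probabilities at burnout = 2

With five ordered categories, an ordinal (proportional-odds) model is a common alternative; the multinomial form used in the paper leaves the category ordering unconstrained.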
Real-time, continuous water-quality monitoring in Indiana and Kentucky
Shoda, Megan E.; Lathrop, Timothy R.; Risch, Martin R.
2015-01-01
Water-quality “super” gages (also known as “sentry” gages) provide real-time, continuous measurements of the physical and chemical characteristics of stream water at or near selected U.S. Geological Survey (USGS) streamgages in Indiana and Kentucky. A super gage includes streamflow and water-quality instrumentation and representative stream sample collection for laboratory analysis. USGS scientists can use statistical surrogate models to relate instrument values to analyzed chemical concentrations at a super gage. Real-time, continuous and laboratory-analyzed concentration and load data are publicly accessible on USGS Web pages.
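The surrogate-model idea can be illustrated with a toy regression: relate lab-analyzed concentrations from discrete samples to concurrent sensor readings, then apply the fitted relation to the continuous record. All values below are hypothetical, and operational USGS surrogate models include steps not shown here (e.g., bias correction when retransforming from log space).

    import numpy as np

    # Hypothetical paired observations: sensor turbidity (FNU) and lab-analyzed
    # suspended-sediment concentration (mg/L) from discrete samples
    turb = np.array([5.0, 12.0, 30.0, 55.0, 120.0, 240.0])
    ssc = np.array([9.0, 22.0, 60.0, 100.0, 230.0, 500.0])

    # Fit the surrogate in log10 space: log10(SSC) = a + b * log10(turbidity)
    b, a = np.polyfit(np.log10(turb), np.log10(ssc), 1)

    def predict_ssc(sensor_turb):
        return 10 ** (a + b * np.log10(sensor_turb))

    print(f"estimated SSC at 80 FNU ~ {predict_ssc(80):.0f} mg/L")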
Statistical analysis of general aviation VG-VGH data
NASA Technical Reports Server (NTRS)
Clay, L. E.; Dickey, R. L.; Moran, M. S.; Payauys, K. W.; Severyn, T. P.
1974-01-01
To represent the loads spectra of general aviation aircraft operating in the Continental United States, VG and VGH data collected since 1963 in eight operational categories were processed and analyzed. The adequacy of the data sample and current operational categories, and the parameter distributions required for valid data extrapolation, were studied along with envelopes of equal probability of exceeding the normal load factor (n_z) versus airspeed for gust and maneuver loads, and the probability of exceeding current design maneuver, gust, and landing impact n_z limits. The significant findings are included.
DOE R&D Accomplishments Database
Tartarelli, G. F.; CDF Collaboration
1996-05-01
The authors present the latest results on top physics obtained by the CDF experiment at the Fermilab Tevatron collider. The data sample used for these analyses (about 110 pb⁻¹) represents almost the entire statistics collected by CDF during four years (1992-95) of data taking. This large data set has allowed detailed studies of top production and decay properties. The results discussed here include the determination of the top quark mass, the measurement of the production cross section, the study of the kinematics of the top events and a look at top decays.
Mathematics in modern immunology
Castro, Mario; Lythe, Grant; Molina-París, Carmen; Ribeiro, Ruy M.
2016-01-01
Mathematical and statistical methods enable multidisciplinary approaches that catalyse discovery. Together with experimental methods, they identify key hypotheses, define measurable observables and reconcile disparate results. We collect a representative sample of studies in T-cell biology that illustrate the benefits of modelling–experimental collaborations and that have proven valuable or even groundbreaking. We conclude that it is possible to find excellent examples of synergy between mathematical modelling and experiment in immunology, which have brought significant insight that would not be available without these collaborations, but that much remains to be discovered. PMID:27051512
Vingilis, Evelyn; Mann, Robert E; Erickson, Patricia; Toplak, Maggie; Kolla, Nathan J; Seeley, Jane; Jain, Umesh
2014-01-01
The purpose of this study is to examine the relationships among self-reported screening measures of attention deficit hyperactivity disorder (ADHD), other psychiatric problems, and driving-related outcomes in a provincially representative sample of adults 18 years and older living in the province of Ontario, Canada. The study examined the results of the Centre for Addictions and Mental Health (CAMH) Ontario Monitor, an ongoing repeated cross-sectional telephone survey of Ontario adults over a 2-year period. Measures included ADHD measures (Adult ADHD Self-Report Scale-V1.1 [ASRS-V1.1], previous ADHD diagnosis, ADHD medication use); psychiatric distress measures (General Health Questionnaire [GHQ12], use of pain, anxiety, and depression medication); antisocial behavior measure (The Antisocial Personality Disorder Scale from the Mini-International Neuropsychiatric Interview [APD]); substance use and abuse measures (alcohol, cannabis, and cocaine), Alcohol Use Disorders Identification Test (AUDIT), Alcohol, Smoking and Substance Involvement Screening Test (ASSIST), driving-related outcomes (driving after drinking, driving after cannabis use, street racing, collisions in past year), and sociodemographics (gender, age, vehicle-kilometers traveled). A total of 4,014 Ontario residents were sampled, of whom 3,485 reported having a valid driver's license. Overall, 3.22% screened positive for ADHD symptoms on the ASRS-V1.1 screening tool. A greater percentage of those who screened positive were younger, reported previous ADHD diagnosis and medication use, distress, antisocial behavior, anti-anxiety and antidepressant medication use, substance use, and social problems compared to those who screened negative. However, there were no statistically significant differences between those who screened positive or negative for ADHD symptoms on self-reported driving after having 2 or more drinks in the previous hour; driving within an hour of using cannabis, marijuana, or hash; or involvement in a street race or a collision as a driver in the past year. When a sequential regression was conducted to predict self-reported collisions, younger age and higher weekly kilometers driven showed higher odds of collision involvement, and the odds ratio for lifetime cannabis use approached statistical significance. This study is the first population-based study of a representative sample of adults 18 years and older living in Ontario, Canada. These results showed no relationship between the ADHD screen and collisions when age, sex, and kilometers driven were controlled for. However, these analyses are based on self-report screeners and not psychiatric diagnoses and a limited sample of ADHD respondents. Thus, these results should be interpreted with caution.
Support, shape and number of replicate samples for tree foliage analysis.
Luyssaert, Sebastiaan; Mertens, Jan; Raitio, Hannu
2003-06-01
Many fundamental features of a sampling program are determined by the heterogeneity of the object under study and by the settings for the error (alpha), the power (beta), the effect size (ES), the number of replicate samples, and the sample support, a feature that is often overlooked. The number of replicates, alpha, beta, ES, and sample support are interconnected. The effect of the sample support and its shape on the required number of replicate samples was investigated by means of a resampling method, applied to a simulated distribution of Cd in the crown of a Salix fragilis L. tree. Increasing the dimensions of the sample support decreases the variance of the element concentration under study. Because analysis of the variance is often the foundation of statistical tests, valid statistical testing requires the use of a fixed sample support throughout the experiment; this requirement may be difficult to meet in time-series analyses and long-term monitoring programs. Sample supports whose largest dimension is oriented in the direction of greatest heterogeneity, i.e. the direction of crown height, give more accurate results than supports with other shapes. Taking the relationships between the sample support and the variance of the element concentrations in tree crowns into account provides guidelines for sampling efficiency in terms of precision and costs. In terms of time, the optimal support to test whether the average Cd concentration of the crown exceeds a threshold value is 0.405 m3 (alpha = 0.05, beta = 0.20, ES = 1.0 mg kg(-1) dry mass). The average weight of this support is 23 g dry mass, and 11 replicate samples need to be taken. Note that in this case the optimal support applies to Cd under conditions similar to those of the simulation, and not necessarily to all examinations for this tree species, element, and hypothesis test.
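The interplay among alpha, beta, ES, and the support-dependent variance can be made concrete with a textbook power calculation. Below is a minimal sketch under a two-sided normal approximation, not the authors' resampling method; the standard deviation value is hypothetical:

```python
import math
from scipy import stats

def n_replicates(sigma, es, alpha=0.05, beta=0.20):
    """Replicates needed to detect a mean shift of size `es` (two-sided z-test).

    sigma: standard deviation of the concentration for a given sample support;
           larger supports average out heterogeneity, shrinking sigma.
    """
    z_a = stats.norm.ppf(1 - alpha / 2)  # critical value for the type I error
    z_b = stats.norm.ppf(1 - beta)       # critical value for the desired power
    return math.ceil(((z_a + z_b) * sigma / es) ** 2)

# hypothetical sigma of 1.1 mg/kg at alpha = 0.05, beta = 0.20, ES = 1.0 mg/kg
print(n_replicates(sigma=1.1, es=1.0))  # -> 10 replicates under these assumptions
```

Because sigma falls as the support grows, the same power target can be met either with many replicates of a small support or few replicates of a large one, which is the precision-versus-cost trade-off the paper optimizes.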
Amini, Mehdi; Pourshahbaz, Abbas; Mohammadkhani, Parvaneh; Ardakani, Mohammad-Reza Khodaie; Lotfi, Mozhgan
2014-12-01
The goal of this study was to examine the construct validity of the Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) conceptual model of antisocial and borderline personality disorders (PDs). More specifically, the aim was to determine whether the DSM-5 five-factor structure of pathological personality trait domains replicated in an independently collected sample that differs culturally from the derivation sample. The study used a sample of 346 individuals: those with antisocial PD (n = 122), those with borderline PD (n = 130), and nonclinical subjects (n = 94). Participants were randomly selected from prisoners, outpatients, and inpatients, and were recruited from Tehran prisons and the clinical psychology and psychiatry clinics of Razi and Taleghani Hospitals, Tehran, Iran. The SCID-II-PQ, the SCID-II, and the DSM-5 Personality Trait Rating Form (Clinician's PTRF) were used to diagnose PDs and to assess pathological traits. The data were analyzed by exploratory factor analysis, which revealed a 5-factor solution for the DSM-5 personality traits. Results showed that the DSM-5 model has adequate construct validity in an Iranian sample with antisocial and borderline PDs. The factors were similar in number to those found in other studies, but differed in content. Exploratory factor analysis revealed five homogeneous components of antisocial and borderline PDs that may represent personality, behavioral, and affective features central to these disorders. Furthermore, the present study helps clarify the adequacy of the DSM-5 dimensional approach to the evaluation of personality pathology, specifically in an Iranian sample.
NASA Astrophysics Data System (ADS)
Davis, C.; Rozo, E.; Roodman, A.; Alarcon, A.; Cawthon, R.; Gatti, M.; Lin, H.; Miquel, R.; Rykoff, E. S.; Troxel, M. A.; Vielzeuf, P.; Abbott, T. M. C.; Abdalla, F. B.; Allam, S.; Annis, J.; Bechtol, K.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; Cunha, C. E.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Doel, P.; Drlica-Wagner, A.; Fausti Neto, A.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gaztanaga, E.; Gerdes, D. W.; Giannantonio, T.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jeltema, T.; Krause, E.; Kuehn, K.; Kuhlmann, S.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Melchior, P.; Ogando, R. L. C.; Plazas, A. A.; Romer, A. K.; Sanchez, E.; Scarpine, V.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, M.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Wechsler, R. H.
2018-06-01
Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogues with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ˜ ±0.01. We forecast that our proposal can, in principle, control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Our results provide strong motivation to launch a programme to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.
Comparative Financial Statistics for Public Two-Year Colleges: FY 1993 National Sample.
ERIC Educational Resources Information Center
Dickmeyer, Nathan; Meeker, Bradley
This report provides comparative information derived from a national sample of 516 public two-year colleges, highlighting financial statistics for fiscal year 1992-93. The report provides space for colleges to compare their institutional statistics with national sample medians, quartile data for the national sample, and statistics presented in a…
7 CFR 52.38a - Definitions of terms applicable to statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 2 (2011-01-01): Definitions of terms applicable to statistical... Sampling § 52.38a Definitions of terms applicable to statistical sampling. (a) Terms applicable to both on... acceptable as a process average. At the AQLs contained in the statistical sampling plans of this subpart...
7 CFR 52.38a - Definitions of terms applicable to statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 2 (2010-01-01): Definitions of terms applicable to statistical... Sampling § 52.38a Definitions of terms applicable to statistical sampling. (a) Terms applicable to both on... acceptable as a process average. At the AQLs contained in the statistical sampling plans of this subpart...
Bartholomay, Roy C.; Davis, Linda C.; Fisher, Jason C.; Tucker, Betty J.; Raben, Flint A.
2012-01-01
The U.S. Geological Survey, in cooperation with the U.S. Department of Energy, analyzed water-quality data collected from 67 aquifer wells and 7 surface-water sites at the Idaho National Laboratory (INL) from 1949 through 2009. The data analyzed included major cations, anions, nutrients, trace elements, and total organic carbon. The analyses were performed to examine water-quality trends that might inform future management decisions about the number of wells to sample at the INL and the type of constituents to monitor. Water-quality trends were determined using (1) the nonparametric Kendall's tau correlation coefficient, p-value, Theil-Sen slope estimator, and summary statistics for uncensored data; and (2) the Kaplan-Meier method for calculating summary statistics, Kendall's tau correlation coefficient, p-value, and Akritas-Theil-Sen slope estimator for robust linear regression for censored data. Statistical analyses for chloride concentrations indicate that groundwater influenced by Big Lost River seepage has decreasing chloride trends or, in some cases, has variable chloride concentration changes that correlate with above-average and below-average periods of recharge. Analyses of trends for chloride in water samples from four sites located along the Big Lost River indicate a decreasing trend or no trend for chloride, and chloride concentrations generally are much lower at these four sites than those in the aquifer. Above-average and below-average periods of recharge also affect concentration trends for sodium, sulfate, nitrate, and a few trace elements in several wells. Analyses of trends for constituents in water from several of the wells that is mostly regionally derived groundwater generally indicate increasing trends for chloride, sodium, sulfate, and nitrate concentrations. These increases are attributed to agricultural or other anthropogenic influences on the aquifer upgradient of the INL. Statistical trends of chemical constituents from several wells near the Naval Reactors Facility may be influenced by wastewater disposal at the facility or by anthropogenic influence from the Little Lost River basin. Groundwater samples from three wells downgradient of the Power Burst Facility area show increasing trends for chloride, nitrate, sodium, and sulfate concentrations. The increases could be caused by wastewater disposal in the Power Burst Facility area. Some groundwater samples in the southwestern part of the INL and southwest of the INL show concentration trends for chloride and sodium that may be influenced by wastewater disposal. Some of the groundwater samples have decreasing trends that could be attributed to the decreasing concentrations in the wastewater from the late 1970s to 2009. The young fraction of groundwater in many of the wells is more than 20 years old, so samples collected in the early 1990s are more representative of groundwater discharged in the 1960s and 1970s, when concentrations in wastewater were much higher. Groundwater sampled in 2009 would be representative of the lower concentrations of chloride and sodium in wastewater discharged in the late 1980s. Analyses of trends for sodium in several groundwater samples from the central and southern part of the eastern Snake River aquifer show increasing trends. In most cases, however, the sodium concentrations are less than background concentrations measured in the aquifer. 
Many of the wells are open to larger mixed sections of the aquifer, and the increasing trends may indicate that the long history of wastewater disposal in the central part of the INL is increasing sodium concentrations in the groundwater.
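For the uncensored records, the trend machinery described above is available directly in SciPy. A minimal sketch on hypothetical chloride data follows; the censored-data treatment with Kaplan-Meier statistics and the Akritas-Theil-Sen estimator is not reproduced here:

```python
import numpy as np
from scipy import stats

# hypothetical annual chloride concentrations (mg/L) at a single well
years = np.arange(1990, 2010)
rng = np.random.default_rng(1)
chloride = 20 + 0.3 * (years - 1990) + rng.normal(0, 1.5, years.size)

tau, p = stats.kendalltau(years, chloride)                     # monotonic-trend test
slope, intercept, lo, hi = stats.theilslopes(chloride, years)  # robust trend slope

print(f"Kendall tau = {tau:.2f}, p = {p:.4f}")
print(f"Theil-Sen slope = {slope:.2f} mg/L/yr (95% CI {lo:.2f} to {hi:.2f})")
```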
A geostatistical approach to predicting sulfur content in the Pittsburgh coal bed
Watson, W.D.; Ruppert, L.F.; Bragg, L.J.; Tewalt, S.J.
2001-01-01
The US Geological Survey (USGS) is completing a national assessment of coal resources in the five top coal-producing regions in the US. Point-located data provide measurements on coal thickness and sulfur content. The sample data and their geologic interpretation represent the most regionally complete and up-to-date assessment of what is known about top-producing US coal beds. The sample data are analyzed using a combination of geologic and Geographic Information System (GIS) models to estimate tonnages and qualities of the coal beds. Traditionally, GIS practitioners use contouring to represent geographical patterns of "similar" data values. The tonnage and grade of coal resources are then assessed by using the contour lines as references for interpolation. An assessment taken to this point is only indicative of resource quantity and quality. Data users may benefit from a statistical approach that would allow them to better understand the uncertainty and limitations of the sample data. To develop a quantitative approach, geostatistics were applied to the data on coal sulfur content from samples taken in the Pittsburgh coal bed (located in the eastern US, in the southwestern part of the state of Pennsylvania, and in adjoining areas in the states of Ohio and West Virginia). Geostatistical methods that account for regional and local trends were applied to blocks 2.7 mi (4.3 km) on a side. The data and geostatistics support conclusions concerning the average sulfur content and its degree of reliability at regional- and economic-block scale over the large, contiguous part of the Pittsburgh outcrop, but not to a mine scale. To validate the method, a comparison was made with the sulfur contents in sample data taken from 53 coal mines located in the study area. The comparison showed a high degree of similarity between the sulfur content in the mine samples and the sulfur content represented by the geostatistically derived contours. Published by Elsevier Science B.V.
Meda, Shashwath A.; Giuliani, Nicole R.; Calhoun, Vince D.; Jagannathan, Kanchana; Schretlen, David J.; Pulver, Anne; Cascella, Nicola; Keshavan, Matcheri; Kates, Wendy; Buchanan, Robert; Sharma, Tonmoy; Pearlson, Godfrey D.
2008-01-01
Background Many studies have employed voxel-based morphometry (VBM) of MRI images as an automated method of investigating cortical gray matter differences in schizophrenia. However, results from these studies vary widely, likely due to different methodological or statistical approaches. Objective To use VBM to investigate gray matter differences in schizophrenia in a sample significantly larger than any published to date, and to increase statistical power sufficiently to reveal differences missed in smaller analyses. Methods Magnetic resonance whole-brain images were acquired from four geographic sites, all using the same model 1.5T scanner and software version, and combined to form a sample of 200 patients with schizophrenia (both first-episode and chronic) and 200 healthy controls, matched for age, gender, and scanner location. Gray matter concentration was assessed and compared using optimized VBM. Results Compared to the healthy controls, schizophrenia patients showed significantly less gray matter concentration in multiple cortical and subcortical regions, some previously unreported. Overall, we found lower concentrations of gray matter in regions identified in prior studies, most of which reported only subsets of the affected areas. Conclusions Gray matter differences in schizophrenia are most comprehensively elucidated using a large, diverse and representative sample. PMID:18378428
General aviation activity and avionics survey. 1978. Annual summary report cy 1978
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schwenk, J.C.
1980-03-01
This report presents the results and a description of the 1978 General Aviation Activity and Avionics Survey. The survey was conducted during early 1979 by the FAA to obtain information on the activity and avionics of the United States registered general aviation aircraft fleet, the dominant component of civil aviation in the U.S. The survey was based on a statistically selected sample of about 13.3 percent of the general aviation fleet and obtained a response rate of 74 percent. Survey results are based upon responses but are expanded upward to represent the total population. Survey results revealed that during 1978 an estimated 39.4 million hours of flying time were logged by the 198,778 active general aviation aircraft in the U.S. fleet, yielding a mean annual flight time per aircraft of 197.7 hours. The active aircraft represented 85 percent of the registered general aviation fleet. The report contains breakdowns of these and other statistics by manufacturer/model group, aircraft type, state and region of based aircraft, and primary use. Also included are fuel consumption, lifetime airframe hours, avionics, and engine hours estimates.
Setel, Philip W.; Sankoh, Osman; Rao, Chalapati; Velkoff, Victoria A.; Mathers, Colin; Gonghuan, Yang; Hemed, Yusuf; Jha, Prabhat; Lopez, Alan D.
2005-01-01
Registration of births, recording deaths by age, sex and cause, and calculating mortality levels and differentials are fundamental to evidence-based health policy, monitoring and evaluation. Yet few of the countries with the greatest need for these data have functioning systems to produce them, despite legislation providing for the establishment and maintenance of vital registration. Sample vital registration (SVR), when applied in conjunction with validated verbal autopsy procedures and implemented in a nationally representative sample of population clusters, represents an affordable, cost-effective, and sustainable short- and medium-term solution to this problem. SVR complements other information sources by producing age-, sex-, and cause-specific mortality data that are more complete and continuous than those currently available. The tools and methods employed in an SVR system, however, are imperfect and require rigorous validation and continuous quality assurance; sampling strategies for SVR are also still evolving. Nonetheless, interest in establishing SVR is rapidly growing in Africa and Asia. Better systems for reporting and recording data on vital events will be sustainable only if developed hand-in-hand with existing health information strategies at the national and district levels; governance structures; and agendas for social research and development monitoring. If the global community wishes to have mortality measurements 5 or 10 years hence, the foundation stones of SVR must be laid today. PMID:16184280
NASA Technical Reports Server (NTRS)
Deloach, Richard; Obara, Clifford J.; Goodman, Wesley L.
2012-01-01
This paper documents a check standard wind tunnel test conducted in the Langley 0.3-Meter Transonic Cryogenic Tunnel (0.3M TCT) that was designed and analyzed using the Modern Design of Experiments (MDOE). The test was designed to partition the unexplained variance of typical wind tunnel data samples into two constituent components, one attributable to ordinary random error and one attributable to systematic error induced by covariate effects. Covariate effects in wind tunnel testing are discussed, with examples. The impact of systematic (non-random) unexplained variance on the statistical independence of sequential measurements is reviewed. The corresponding correlation among experimental errors is discussed, as is the impact of such correlation on experimental results generally. The specific experiment documented herein was organized as a formal test for the presence of unexplained variance in representative samples of wind tunnel data, in order to quantify the frequency with which such systematic error was detected and its magnitude relative to ordinary random error. Levels of systematic and random error reported here are representative of those quantified in other facilities, as cited in the references.
NASA Astrophysics Data System (ADS)
Plaisance, L.; Knowlton, N.; Paulay, G.; Meyer, C.
2009-12-01
The cryptofauna associated with coral reefs accounts for a major part of the biodiversity in these ecosystems but has been largely overlooked in biodiversity estimates because the organisms are hard to collect and identify. We combine a semi-quantitative sampling design and a DNA barcoding approach to provide metrics for the diversity of reef-associated crustaceans. Twenty-two similar-sized dead heads of Pocillopora were sampled at 10 m depth from five central Pacific Ocean localities (four atolls in the Northern Line Islands and in Moorea, French Polynesia). All crustaceans were removed, and partial cytochrome oxidase subunit I was sequenced from 403 individuals, yielding 135 distinct taxa using a species-level criterion of 5% similarity. Most crustacean species were rare; 44% of the OTUs were represented by a single individual, and an additional 33% were represented by several specimens found only in one of the five localities. The Northern Line Islands and Moorea shared only 11 OTUs. Total numbers estimated by species richness statistics (Chao1 and ACE) suggest at least 90 species of crustaceans in Moorea and 150 in the Northern Line Islands for this habitat type. However, rarefaction curves for each region failed to approach an asymptote, and Chao1 and ACE estimators did not stabilize after sampling eight heads in Moorea, so even these diversity figures are underestimates. Nevertheless, even this modest sampling effort from a very limited habitat resulted in surprisingly high species numbers.
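The Chao1 estimator used here extrapolates total richness from the rare tail of the abundance distribution, using the counts of OTUs seen exactly once (F1) and twice (F2). A minimal sketch of the bias-corrected form with hypothetical counts:

```python
import numpy as np

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU abundances."""
    counts = np.asarray(counts)
    s_obs = int((counts > 0).sum())  # observed OTUs
    f1 = int((counts == 1).sum())    # singletons
    f2 = int((counts == 2).sum())    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# hypothetical abundances dominated by rare OTUs, as in the crustacean data
abundances = [1] * 60 + [2] * 20 + [3] * 10 + [5] * 8 + [20] * 4
print(round(chao1(abundances), 1))  # ~186, well above the 102 OTUs observed
```

Because the estimate is driven by singletons, undersampling inflates its uncertainty, which is why the non-asymptotic rarefaction curves lead the authors to treat even these figures as underestimates.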
Marin, Tania; Taylor, Anne Winifred; Grande, Eleonora Dal; Avery, Jodie; Tucker, Graeme; Morey, Kim
2015-05-19
The considerably lower average life expectancy of Aboriginal and Torres Strait Islander Australians, compared with non-Aboriginal and non-Torres Strait Islander Australians, has been widely reported. Prevalence data for chronic disease and health risk factors are needed to provide evidence-based estimates for Australian Aboriginal and Torres Strait Islander population health planning. Representative surveys for these populations are difficult due to complex methodology. The focus of this paper is to describe in detail the methodological challenges and resolutions of a representative South Australian Aboriginal population-based health survey. Using a stratified multi-stage sampling methodology based on the Australian Bureau of Statistics 2006 Census, with culturally appropriate and epidemiologically rigorous methods, 11,428 randomly selected dwellings were approached from a total of 209 census collection districts. All persons eligible for the survey identified as Aboriginal and/or Torres Strait Islander and were selected from dwellings identified as having one or more Aboriginal person(s) living there at the time of the survey. Overall, the 399 interviews from an eligible sample of 691 SA Aboriginal adults yielded a response rate of 57.7%. These face-to-face interviews were conducted by ten interviewers retained from a total of 27 trained Aboriginal interviewers. Challenges were found in three main areas: identification and recruitment of participants; interviewer recruitment and retention; and appropriate engagement with communities. These challenges were resolved, or at least largely overcome, by following local protocols with communities and their representatives and by reaching agreement on the process of research for Aboriginal people. Obtaining a representative sample of Aboriginal participants in a culturally appropriate way was methodologically challenging and required high levels of commitment and resources. Adhering to these principles has resulted in a rich and unique data set that provides an overview of the self-reported health status of Aboriginal people living in South Australia. This process provides some important principles to be followed when engaging with Aboriginal people and their communities for the purpose of health research.
Nummenmaa, Lauri; Glerean, Enrico; Hari, Riitta; Hietanen, Jari K
2014-01-14
Emotions are often felt in the body, and somatosensory feedback has been proposed to trigger conscious emotional experiences. Here we reveal maps of bodily sensations associated with different emotions using a unique topographical self-report method. In five experiments, participants (n = 701) were shown two silhouettes of bodies alongside emotional words, stories, movies, or facial expressions. They were asked to color the bodily regions whose activity they felt increasing or decreasing while viewing each stimulus. Different emotions were consistently associated with statistically separable bodily sensation maps across experiments. These maps were concordant across West European and East Asian samples. Statistical classifiers distinguished emotion-specific activation maps accurately, confirming independence of topographies across emotions. We propose that emotions are represented in the somatosensory system as culturally universal categorical somatotopic maps. Perception of these emotion-triggered bodily changes may play a key role in generating consciously felt emotions.
Flood Frequency Curves - Use of information on the likelihood of extreme floods
NASA Astrophysics Data System (ADS)
Faber, B.
2011-12-01
Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in that area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a Log-Pearson Type III distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate distribution parameters. The procedure assumes that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How do changes in the watershed or climate over time (non-stationarity) affect the probability distribution of floods? Potential approaches to avoiding these assumptions range from estimating trend and shift and removing them from early data (thus forming a homogeneous data set) to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically-based analysis produces a "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempts to describe an upper bound on flood magnitude in a particular watershed. This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.
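The Log-Pearson Type III fit mentioned above amounts to computing the first three moments of the log-transformed annual maxima and reading quantiles from a Pearson III distribution. A minimal sketch with hypothetical flows, ignoring the regional-skew weighting and low-outlier screening that operational guidelines add:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
peaks = 10 ** rng.normal(2.0, 0.25, size=60)   # hypothetical annual peaks (m^3/s)

logq = np.log10(peaks)
m = logq.mean()                                # method-of-moments parameters
s = logq.std(ddof=1)
g = stats.skew(logq, bias=False)               # station skew of the logs

# 1% annual exceedance probability ("100-year" flood)
q100 = 10 ** stats.pearson3(g, loc=m, scale=s).ppf(0.99)
print(f"100-year flood estimate: {q100:.0f} m^3/s")
```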
The developmental basis for germline mosaicism in mouse and Drosophila melanogaster.
Drost, J B; Lee, W R
1998-01-01
Data involving germline mosaics in Drosophila melanogaster and mouse are reconciled with developmental observations. Mutations that become fixed in the early embryo before separation of soma from the germline may, by the sampling process of development, continue as part of germline and/or differentiate into any somatic tissue. The cuticle of adult D. melanogaster, because of segmental development, can be used to estimate the proportion of mutant nuclei in the early embryo, but most somatic tissues and the germlines of both species continue from samples too small to be representative of the early embryo. Because of the small sample of cells/nuclei that remain in the germline after separation of soma in both species, mosaic germlines have percentages of mutant cells that vary widely, with a mean of 50% and an unusual platykurtic, flat-topped distribution. While the sampling process leads to similar statistical results for both species, their patterns of development are very different. In D. melanogaster the first differentiation is the separation of soma from germline with the germline continuing from a sample of only two to four nuclei, whereas the adult cuticle is a representative sample of cleavage nuclei. The presence of mosaicism in D. melanogaster germline is independent of mosaicism in the eye, head, and thorax. This independence was used to determine that mutations can occur at any of the early embryonic cell divisions and still average 50% mutant germ cells when the germline is mosaic; however, the later the mutation occurs, the higher the proportion of completely nonmutant germlines. In contrast to D. melanogaster, the first differentiation in the mouse does not separate soma from germline but produces the inner cell mass that is representative of the cleavage nuclei. Following formation of the primitive streak, the primordial germ cells develop at the base of the allantois and among a clonally related sample of cells, providing the same statistical distribution in the mouse germlines as in D. melanogaster. The proportion of mutations that are fixed during early embryonic development is greatly underestimated. For example, a DNA lesion in a postmeiotic gamete that becomes fixed as a dominant mutation during early embryonic development of the F1 may produce an individual completely mutant in the germ line and relevant somatic tissue or, alternatively, the F1 germline may be completely mutant but with no relevant somatic tissues for detecting the mutation until the F2. In both cases the mutation would be classified as complete in the F1 and F2, respectively, and not recognized as embryonic in origin. Because germ cells differentiate later in mammalian development, there are more opportunities for correlation between germline and soma in the mammal than Drosophila. However, because the germ cells and any somatic tissue, like blood, are derived from small samples, there may be many individuals that test negative in blood but have germlines that are either mosaic or entirely mutant.
Beaver, Kevin M; Schwartz, Joseph A; Connolly, Eric J; Al-Ghamdi, Mohammed Said; Kobeisy, Ahmed Nezar
2015-03-01
The role of parenting in the development of criminal behavior has been the source of a vast amount of research, with the majority of studies detecting statistically significant associations between dimensions of parenting and measures of criminal involvement. An emerging group of scholars, however, has drawn attention to the methodological limitations (mainly genetic confounding) of the parental socialization literature. The current study addressed this limitation by analyzing a sample of adoptees to assess the association between 8 parenting measures and 4 criminal justice outcome measures. The results revealed very little evidence of parental socialization effects on criminal behavior before controlling for genetic confounding and no evidence of parental socialization effects on criminal involvement after controlling for genetic confounding.
Paleomagnetism of the Mesozoic in Alaska. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Packer, D. R.
1972-01-01
Over 400 oriented cores of Permian, Triassic, Jurassic, and Cretaceous sedimentary and igneous rocks were collected from 34 sites at 10 areas throughout southern Alaska. After magnetic cleaning in successively higher alternating fields, 179 samples were considered to be stable and to give statistically consistent results within each site and age group. Due to the lack of a sufficient number of stable samples, the results from Permian, Triassic, and Cretaceous rocks were inconclusive. The nine remaining Jurassic sites represent 100 samples from three general areas in southern Alaska. The southern Alaskan Jurassic paleomagnetic pole is significantly different from the North American Jurassic pole. This suggests that since the Jurassic, southern Alaska must have moved approximately 18 degrees north and rotated 52 degrees clockwise to reach its present position. Tectonic interpretation of these results gives a possible explanation for many of the geologic features observed in southern Alaska.
Wolf, Erika J.; Mitchell, Karen S.; Sadeh, Naomi; Hein, Christina; Fuhrman, Isaac; Pietrzak, Robert H.; Miller, Mark W.
2015-01-01
The fifth edition of the Diagnostic and Statistical Manual (DSM-5) includes a dissociative subtype of posttraumatic stress disorder (PTSD), but no existing measures specifically assess it. This paper describes the initial evaluation of a 15-item self-report measure of the subtype called the Dissociative Subtype of PTSD Scale (DSPS) in an on-line survey of 697 trauma-exposed military veterans representative of the US veteran population. Exploratory factor analyses of the lifetime DSPS items supported the intended structure of the measure consisting of three factors reflecting derealization/depersonalization, loss of awareness, and psychogenic amnesia. Consistent with prior research, latent profile analyses assigned 8.3% of the sample to a highly dissociative class distinguished by pronounced symptoms of derealization and depersonalization. Overall, results provide initial psychometric support for the lifetime DSPS scales; additional research in clinical and community samples is needed to further validate the measure. PMID:26603115
Atomistic study of two-level systems in amorphous silica
NASA Astrophysics Data System (ADS)
Damart, T.; Rodney, D.
2018-01-01
Internal friction is analyzed in an atomic-scale model of amorphous silica. The potential energy landscape of more than 100 glasses is explored to identify a sample of about 700 two-level systems (TLSs). We discuss the properties of TLSs, particularly their energy asymmetry and barrier as well as their deformation potential, computed as longitudinal and transverse averages of the full deformation potential tensors. The discrete sampling is used to predict dissipation in the classical regime. Comparison with experimental data shows a better agreement with poorly relaxed thin films than well relaxed vitreous silica, as expected from the large quench rates used to produce numerical glasses. The TLSs are categorized in three types that are shown to affect dissipation in different temperature ranges. The sampling is also used to discuss critically the usual approximations employed in the literature to represent the statistical properties of TLSs.
The National Food and Nutrient Analysis Program: A decade of progress
Haytowitz, David B.; Pehrsson, Pamela R.; Holden, Joanne M.
2009-01-01
The National Food and Nutrient Analysis Program (NFNAP) was designed to expand the quantity and improve the quality of data in the United States Department of Agriculture (USDA) food composition databases through the collection and analysis of nationally representative samples of foods and beverages. This paper describes some of the findings from the NFNAP and its impact on the food composition databases produced by USDA. The NFNAP employs statistically valid sampling plans, comprehensive quality control, and USDA analytical oversight as part of the program to generate new and updated analytical data for food components. USDA food consumption and composition data were used to target those foods that are major contributors of nutrients of public health significance to the U.S. diet (454 Key Foods). Foods were ranked using a scoring system, divided into quartiles, and reviewed to determine the impact of changes in their composition compared to historical values. Foods were purchased from several types of locations, such as retail outlets and fast food restaurants in different geographic areas as determined by the sampling plan, then composited and sent for analysis to commercial laboratories and cooperators, along with quality control materials. Comparisons were made to assess differences between new NFNAP means generated from original analytical data and historical means. Recently generated results for nationally representative food samples show marked changes compared to database values for selected nutrients from unknown or non-representative sampling. A number of changes were observed in many high consumption foods, e.g., the vitamin A value for cooked carrots decreased from 1,225 to 860 μg RAE/100 g; the fat value for fast food French fried potatoes increased by 13% (14.08 to 17.06 g/100 g). Trans fatty acids in margarine have decreased as companies reformulate their products in response to the required declaration of trans fatty acid content on the nutrition label. Values decreased from 19.7 g/100 g in 2002 to 14.8 g/100 g in 2006 for 80%-fat stick margarines and to 4.52 g/100 g for 80%-fat tub margarines. These changes reflect improved strategies for sampling and analysis of representative food samples, which enhance the reliability of nutrient estimates for Key Foods and subsequent assessments of nutrient intake. PMID:19578546
ADAPTIVE MATCHING IN RANDOMIZED TRIALS AND OBSERVATIONAL STUDIES
van der Laan, Mark J.; Balzer, Laura B.; Petersen, Maya L.
2014-01-01
In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, the observed data can neither be represented as the realization of n independent random variables, nor, contrary to current practice, as the realization of n/2 independent random variables (treating the matched pair as the independent sampling unit). In this paper we study estimation of the average causal effect of a treatment under experimental designs in which treatment allocation potentially depends on the pre-intervention covariates of all units included in the sample. We define efficient targeted minimum loss-based estimators for this general design, present a theorem that establishes the desired asymptotic normality of these estimators and allows for asymptotically valid statistical inference, and discuss implementation of these estimators. We further investigate the relative asymptotic efficiency of this design compared with a design in which unit-specific treatment assignment depends only on the units' covariates. Our findings have practical implications for the optimal design and analysis of pair-matched cluster randomized trials, as well as for observational studies in which treatment decisions may depend on characteristics of the entire sample. PMID:25097298
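The pair-matched design analyzed above can be sketched in a few lines: pair communities greedily by covariate distance, then randomize one treated unit within each pair, so every unit's label depends on the covariates of the whole sample. A minimal illustration with hypothetical covariates; the targeted minimum loss-based estimator itself is not reproduced:

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
X = rng.normal(size=(10, 3))          # baseline covariates of 10 communities

d = cdist(X, X)                       # pairwise similarity metric
np.fill_diagonal(d, np.inf)
unmatched, pairs = list(range(10)), []
while unmatched:
    i = unmatched.pop(0)
    j = min(unmatched, key=lambda k: d[i, k])  # closest remaining community
    unmatched.remove(j)
    pairs.append((i, j))

treat = np.zeros(10, dtype=int)
for a, b in pairs:                    # randomize within each matched pair
    treat[(a, b)[rng.integers(2)]] = 1
print(pairs, treat)
```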
ERIC Educational Resources Information Center
Dierker, Lisa; Alexander, Jalen; Cooper, Jennifer L.; Selya, Arielle; Rose, Jennifer; Dasgupta, Nilanjana
2016-01-01
Introductory statistics needs innovative, evidence-based teaching practices that support and engage diverse students. To evaluate the success of a multidisciplinary, project-based course, we compared experiences of under-represented (URM) and non-URM students in 4 years of the course. While URM students considered the material more…
Rasch fit statistics and sample size considerations for polytomous data.
Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael
2008-05-29
Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire - 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.
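The fit statistics in question come from standardized response residuals. The sketch below simulates dichotomous Rasch responses (the study itself used the polytomous rating-scale and partial-credit models) and shows why the mean squares stay near 1 as the sample grows when the data fit the model:

```python
import numpy as np

def item_fit(n_persons, n_items=9, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(0, 1, (n_persons, 1))     # person abilities
    b = np.linspace(-1.5, 1.5, n_items)          # item difficulties
    p = 1 / (1 + np.exp(-(theta - b)))           # Rasch response probabilities
    x = (rng.random(p.shape) < p).astype(float)  # simulated responses
    w = p * (1 - p)                              # binomial variances
    outfit = ((x - p) ** 2 / w).mean(axis=0)     # unweighted mean square
    infit = ((x - p) ** 2).sum(axis=0) / w.sum(axis=0)  # information-weighted
    return outfit, infit

for n in (25, 400, 3200):
    print(n, np.round(item_fit(n)[0], 2))  # mean squares hover around 1.0
```

The corresponding t-statistics are obtained by normalizing these mean squares (e.g., via a Wilson-Hilferty-type transformation), which is where the strong sample-size dependence the authors report enters.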
Representation of Probability Density Functions from Orbit Determination using the Particle Filter
NASA Technical Reports Server (NTRS)
Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell
2012-01-01
Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using the Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as the Principal Component Analysis (PCA) are based on utilizing up to second order statistics, and hence will not suffice in maintaining maximum information content. Both the PCA and the ICA are applied to two scenarios that involve a highly eccentric orbit with a lower a priori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.
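The PCA-versus-ICA contrast can be sketched with scikit-learn: PCA uses only second-order statistics (variance), while FastICA seeks maximally non-Gaussian, statistically independent directions, which is what preserves the higher-order structure of a particle cloud. A minimal illustration on hypothetical non-Gaussian samples standing in for particle filter output:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
sources = np.column_stack([rng.laplace(size=2000),      # heavy-tailed component
                           rng.uniform(-1, 1, 2000)])   # flat component
particles = sources @ rng.normal(size=(2, 2))           # linearly mixed samples

pca = PCA(n_components=2).fit(particles)
ica = FastICA(n_components=2, random_state=0).fit(particles)

z = ica.transform(particles)                            # ~independent axes
excess_kurtosis = ((z - z.mean(0)) ** 4).mean(0) / z.var(0) ** 2 - 3
print("PCA explained variance ratio:", np.round(pca.explained_variance_ratio_, 2))
print("ICA component excess kurtosis:", np.round(excess_kurtosis, 2))
```

The nonzero excess kurtosis recovered by ICA is exactly the higher-order information a variance-only summary discards.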
Statistical parsimony networks and species assemblages in cephalotrichid nemerteans (Nemertea).
Chen, Haixia; Strand, Malin; Norenburg, Jon L; Sun, Shichun; Kajihara, Hiroshi; Chernyshev, Alexey V; Maslakova, Svetlana A; Sundberg, Per
2010-09-21
It has been suggested that statistical parsimony network analysis can be used to obtain an indication of the species represented in a set of nucleotide data, and the approach has been used to discuss species boundaries in some taxa. Based on 635 base pairs of the mitochondrial protein-coding gene cytochrome c oxidase I (COI), we analyzed 152 nemertean specimens using statistical parsimony network analysis with the connection probability set to 95%. The analysis revealed 15 distinct networks together with seven singletons. Statistical parsimony yielded three networks supporting the species status of Cephalothrix rufifrons, C. major, and C. spiralis as they currently have been delineated by morphological characters and geographical location. Many other networks contained haplotypes from nearby geographical locations. Cladistic structure inferred by maximum likelihood analysis overall supported the network analysis, but indicated a false positive result where subnetworks should have been connected into one network/species. This is probably caused by undersampling of the intraspecific haplotype diversity. Statistical parsimony network analysis provides a rapid and useful tool for detecting possible undescribed/cryptic species among cephalotrichid nemerteans based on the COI gene. It should be combined with phylogenetic analysis to get indications of false positive results, i.e., subnetworks that would have been connected with more extensive haplotype sampling.
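The connection limit in statistical parsimony translates, roughly, into a maximum number of substitution steps beyond which haplotypes are not joined; the resulting networks are then the connected components of the haplotype graph. A toy sketch with hypothetical aligned fragments, in which the TCS-style 95% probability calculation is replaced by a fixed step cutoff:

```python
import itertools
import networkx as nx

haplotypes = ["AAGT", "AAGA", "AATA", "CCGT", "CCGG"]  # hypothetical aligned COI
max_steps = 1                                          # connection limit (steps)

G = nx.Graph()
G.add_nodes_from(haplotypes)
for h1, h2 in itertools.combinations(haplotypes, 2):
    steps = sum(a != b for a, b in zip(h1, h2))        # Hamming distance
    if steps <= max_steps:
        G.add_edge(h1, h2)

# each connected component is one network, i.e. one candidate species
print([sorted(c) for c in nx.connected_components(G)])
# -> [['AAGA', 'AAGT', 'AATA'], ['CCGG', 'CCGT']]
```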
Azad, Ariful; Rajwa, Bartek; Pothen, Alex
2016-08-31
We describe algorithms for discovering immunophenotypes from large collections of flow cytometry samples and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several datasets, including one representing a healthy immune system and one of acute myeloid leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML and were able to distinguish acute promyelocytic leukemia (APL) samples with the markers provided. Clinically, this is helpful since APL has a different treatment regimen from other subtypes of AML. Core algorithms used in our data analysis are available in the flowMatch package at www.bioconductor.org. It has been downloaded nearly 6,000 times since 2014.
Understanding the Sampling Distribution and the Central Limit Theorem.
ERIC Educational Resources Information Center
Lewis, Charla P.
The sampling distribution is a common source of misuse and misunderstanding in the study of statistics. The sampling distribution, underlying distribution, and the Central Limit Theorem are all interconnected in defining and explaining the proper use of the sampling distribution of various statistics. The sampling distribution of a statistic is…
[Bacteriological quality of traditional, organic and hydroponic cultured lettuce in Costa Rica].
Monge, Claudio; Chaves, Carolina; Arias, María Laura
2011-03-01
The main objective of this work was to evaluate the microbiological quality of lettuces commercialized in the Metropolitan Area of San José, Costa Rica, and cultured in different ways, in order to detect differences between the culturing methods and the risk that these products may represent for public health. The study was done at the Food Microbiology Laboratory, Universidad de Costa Rica, from March to July 2010. Thirty lettuce samples were analyzed (10 obtained by traditional culture, 10 by organic culture, and 10 by hydroponics). All samples were obtained from markets where their origin was certified. Total aerobic plate count, total and fecal coliform counts, and Escherichia coli were determined for all samples, as well as the presence/absence of Salmonella spp. and Listeria monocytogenes in 25 g. Results show that there is no statistically significant difference (p < 0.001) between the different types of culture analyzed for any of the parameters evaluated. An important percentage of the samples presented coliforms; nevertheless, just one E. coli strain was isolated, from a traditionally cultured lettuce sample. Four different Salmonella spp. strains were isolated from the samples, as well as one Listeria monocytogenes strain. The data obtained show that consumption of this product raw, or without adequate hygiene and disinfection, may represent a health risk. Also, from the bacteriological point of view, there is no significant difference between the culturing methods evaluated, suggesting either that the specific directions for each type of culture are not followed or that there is inadequate handling of the products or post-harvest contamination.
ERIC Educational Resources Information Center
Garfield, Joan; Le, Laura; Zieffler, Andrew; Ben-Zvi, Dani
2015-01-01
This paper describes the importance of developing students' reasoning about samples and sampling variability as a foundation for statistical thinking. Research on expert-novice thinking as well as statistical thinking is reviewed and compared. A case is made that statistical thinking is a type of expert thinking, and as such, research…
Forget, Nathalie L; Kim Juniper, S
2013-01-01
We systematically studied free-living bacterial diversity within aggregations of the vestimentiferan tubeworm Ridgeia piscesae sampled from two contrasting flow regimes (High Flow and Low Flow) in the Endeavour Hydrothermal Vents Marine Protected Area (MPA) on the Juan de Fuca Ridge (Northeast Pacific). Eight samples of particulate detritus were recovered from paired tubeworm grabs from four vent sites. Most sequences (454 tag and Sanger methods) were affiliated with the Epsilonproteobacteria, and the sulfur-oxidizing genus Sulfurovum was dominant in all samples. Gammaproteobacteria were also detected, mainly in Low Flow sequence libraries, and were affiliated with known methanotrophs and decomposers. The co-occurrence of sulfur reducers from the Deltaproteobacteria and the Epsilonproteobacteria suggests internal sulfur cycling within these habitats. Other phyla detected included Bacteroidetes, Actinobacteria, Chloroflexi, Firmicutes, Planctomycetes, Verrucomicrobia, and Deinococcus–Thermus. Statistically significant relationships between sequence library composition and habitat type suggest a predictable pattern for High Flow and Low Flow environments. Most sequences significantly more represented in High Flow libraries were related to sulfur and hydrogen oxidizers, while mainly heterotrophic groups were more represented in Low Flow libraries. Differences in temperature, available energy for metabolism, and stability between High Flow and Low Flow habitats potentially explain their distinct bacterial communities. PMID:23401293
DALMATIAN: An Algorithm for Automatic Cell Detection and Counting in 3D.
Shuvaev, Sergey A; Lazutkin, Alexander A; Kedrov, Alexander V; Anokhin, Konstantin V; Enikolopov, Grigori N; Koulakov, Alexei A
2017-01-01
Current 3D imaging methods, including optical projection tomography, light-sheet microscopy, block-face imaging, and serial two-photon tomography, enable visualization of large samples of biological tissue. Large volumes of data obtained at high resolution require the development of automatic image processing techniques, such as algorithms for automatic cell detection or, more generally, point-like object detection. Current approaches to automated cell detection suffer from difficulties in detecting particular cell types, cell populations of differing brightness, non-uniformly stained cells, and overlapping cells. In this study, we present a set of algorithms for robust automatic cell detection in 3D. Our algorithms are suitable for, but not limited to, whole brain regions and individual brain sections. We used a watershed procedure to split regional maxima representing overlapping cells. We developed a bootstrap Gaussian fit procedure to evaluate the statistical significance of detected cells. We compared the cell detection quality of our algorithm and other software using 42 samples, representing 6 staining and imaging techniques. The results provided by our algorithm matched manual expert quantification with signal-to-noise-dependent confidence, including samples with cells of different brightness, non-uniform staining, and overlapping cells, for whole brain regions and individual tissue sections. Our algorithm provided the best cell detection quality among the tested free and commercial software.
Hitting Is Contagious in Baseball: Evidence from Long Hitting Streaks
Bock, Joel R.; Maewal, Akhilesh; Gough, David A.
2012-01-01
Data analysis is used to test the hypothesis that "hitting is contagious". A statistical model is described to study the effect of a hot hitter upon his teammates' batting during a consecutive-game hitting streak. Box score data for entire seasons comprising long hitting streaks were compiled. Treatment and control sample groups were constructed from core lineups of players on the streaking batter's team. The percentile-method bootstrap was used to calculate confidence intervals for statistics representing differences in the mean distributions of two batting statistics between groups. Batters in the treatment group (hot streak active) showed statistically significant improvements in hitting performance, as compared against the control. The mean batting average for the treatment group was higher during hot streaks, while the batting heat index introduced here likewise increased. For each performance statistic, the null hypothesis was rejected at the chosen significance level. We conclude that the evidence suggests the potential existence of a "statistical contagion effect". Psychological mechanisms essential to the empirical results are suggested, as several studies from the scientific literature lend credence to contagious phenomena in sports. Causal inference from these results is difficult, but we suggest and discuss several latent variables that may contribute to the observed results, and offer possible directions for future research. PMID:23251507
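The percentile-method bootstrap for a difference in group means is easy to reproduce. A minimal sketch with synthetic batting data; the group means, spreads, and sample sizes are placeholders, not the study's values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic batting statistics for illustration only (not the study's data).
treatment = rng.normal(0.270, 0.030, size=200)  # teammates during a hot streak
control = rng.normal(0.260, 0.030, size=200)    # teammates outside streaks

def percentile_bootstrap_ci(x, y, n_boot=10_000, alpha=0.05, rng=rng):
    """Percentile-method bootstrap CI for the difference mean(x) - mean(y)."""
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        diffs[i] = (rng.choice(x, size=x.size, replace=True).mean()
                    - rng.choice(y, size=y.size, replace=True).mean())
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

lo, hi = percentile_bootstrap_ci(treatment, control)
print(f"95% bootstrap CI for mean difference: ({lo:.4f}, {hi:.4f})")
# The null hypothesis of no difference is rejected when the CI excludes 0.
```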
DNA barcode identification of Podocarpaceae--the second largest conifer family.
Little, Damon P; Knopf, Patrick; Schulz, Christian
2013-01-01
We have generated matK, rbcL, and nrITS2 DNA barcodes for 320 specimens representing all 18 extant genera of the conifer family Podocarpaceae. The sample includes 145 of the 198 recognized species. Comparative analyses of sequence quality and species discrimination were conducted on the 159 individuals from which all three markers were recovered (representing 15 genera and 97 species). The vast majority of sequences were of high quality (B 30 = 0.596-0.989). Even the lowest quality sequences exceeded the minimum requirements of the BARCODE data standard. In the few instances that low quality sequences were generated, the responsible mechanism could not be discerned. There were no statistically significant differences in the discriminatory power of markers or marker combinations (p = 0.05). The discriminatory power of the barcode markers individually and in combination is low (56.7% of species at maximum). In some instances, species discrimination failed in spite of ostensibly useful variation being present (genotypes were shared among species), but in many cases there was simply an absence of sequence variation. Barcode gaps (maximum intraspecific p-distance > minimum interspecific p-distance) were observed in 50.5% of species when all three markers were considered simultaneously. The presence of a barcode gap was not predictive of discrimination success (p = 0.02) and there was no statistically significant difference in the frequency of barcode gaps among markers (p = 0.05). In addition, there was no correlation between number of individuals sampled per species and the presence of a barcode gap (p = 0.27).
Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online
Forsberg, Erica M; Huan, Tao; Rinehart, Duane; Benton, H Paul; Warth, Benedikt; Hilmers, Brian; Siuzdak, Gary
2018-01-01
Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes, and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes ~30 min. PMID:29494574
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study was to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. We reviewed a representative sample of articles indexed in MEDLINE (n = 428) with an observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting of model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimates, and specification of more than one adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles, and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Montagu, Dominic; Sudhinaraset, May; Lwin, Thandar; Onozaki, Ikushi; Win, Zaw; Aung, Tin
2013-01-10
Since 2004, the Sun Quality Health (SQH) franchise network has provided TB care in Myanmar through a network of established private medical clinics. This study compares the wealth distribution of the TB patients to non-TB patients to determine if TB is most common among the poor, and compares the wealth of all TB patients to SQH TB patients to assess whether the franchise achieves its goal of serving the poor. The study uses data from two sources: 1) Myanmar's first nationally representative TB prevalence study conducted in 2009, and 2) client exit interviews from TB patients from SQH clinics. In total, 1,114 TB-positive individuals were included in the study, including 739 from the national sample and 375 from the SQH sample. TB patients at SQH clinics were poorer than TB-positive individuals in the overall population, though not at a statistically significant level (p > 0.05). After stratification we found that in urban areas, TB patients at SQH clinics were more likely to be in the poorest quartile compared to general TB positive population (16.8% vs. 8.6%, respectively; p < 0.05). In rural areas, there was no statistically significant difference between the wealth distribution of SQH clinic patients and general TB positive individuals (p > 0.05). Franchised clinics in Myanmar are reaching poor populations of TB patients in urban areas; more efforts are needed in order to reach the most vulnerable in rural areas.
Blanquart, François; Bataillon, Thomas
2016-01-01
The fitness landscape defines the relationship between genotypes and fitness in a given environment and underlies fundamental quantities such as the distribution of selection coefficients and the magnitude and type of epistasis. A better understanding of variation in landscape structure across species and environments is thus necessary to understand and predict how populations will adapt. An increasing number of experiments investigate the properties of fitness landscapes by identifying mutations, constructing genotypes with combinations of these mutations, and measuring the fitness of these genotypes. Yet these empirical landscapes represent a very small sample of the vast space of all possible genotypes, and this sample is often biased by the protocol used to identify mutations. Here we develop a rigorous statistical framework based on Approximate Bayesian Computation to address these concerns and use this flexible framework to fit a broad class of phenotypic fitness models (including Fisher's model) to 26 empirical landscapes representing nine diverse biological systems. Despite uncertainty owing to the small size of most published empirical landscapes, the inferred landscapes have similar structure in similar biological systems. Surprisingly, goodness-of-fit tests reveal that this class of phenotypic models, which has been successful so far in interpreting experimental data, is plausible in only three of nine biological systems. More precisely, although Fisher's model was able to explain several statistical properties of the landscapes—including the mean and SD of selection and epistasis coefficients—it was often unable to explain the full structure of fitness landscapes. PMID:27052568
Saldaña, Erick; Castillo, Luiz Saldarriaga; Sánchez, Jorge Cabrera; Siche, Raúl; de Almeida, Marcio Aurélio; Behrens, Jorge H; Selani, Miriam Mabel; Contreras-Castillo, Carmen J
2018-06-01
The aim of this study was to perform a descriptive analysis (DA) of bacons smoked with woods from reforestation and liquid smokes in order to investigate their sensory profile. Six samples of bacon were selected: three smoked bacons with different wood species (Eucalyptus citriodora, Acacia mearnsii, and Bambusa vulgaris), two artificially smoked bacon samples (liquid smoke) and one negative control (unsmoked bacon). Additionally, a commercial bacon sample was also evaluated. DA was developed successfully, presenting a good performance in terms of discrimination, consensus and repeatability. The study revealed that the smoking process modified the sensory profile by intensifying the "saltiness" and differentiating the unsmoked from the smoked samples. The results from the current research represent the first methodological development of descriptive analysis of bacon and may be used by food companies and other stakeholders to understand the changes in sensory characteristics of bacon due to traditional smoking process. Copyright © 2018 Elsevier Ltd. All rights reserved.
Kinnear, John; Jackson, Ruth
2017-07-01
Although physicians are highly trained in the application of evidence-based medicine, and are assumed to make rational decisions, there is evidence that their decision making is prone to biases. One of the biases that has been shown to affect the accuracy of judgements is that of representativeness and base-rate neglect, where the saliency of a person's features leads to overestimation of their likelihood of belonging to a group. This results in the substitution of 'subjective' probability for statistical probability. This study examines clinicians' propensity to make estimations of subjective probability when presented with clinical information that is considered typical of a medical condition. The strength of the representativeness bias is tested by presenting choices in textual and graphic form. Understanding of statistical probability is also tested by omitting all clinical information. For the questions that included clinical information in textual and graphic form, 46.7% and 45.5% of clinicians, respectively, made judgements consistent with statistical probability. Where the question omitted clinical information, 79.9% of clinicians made a judgement consistent with statistical probability. There was a statistically significant difference in responses to the questions with and without representativeness information (χ2 (1, n=254)=54.45, p<0.0001). Physicians are strongly influenced by a representativeness bias, leading to base-rate neglect, even though they understand the application of statistical probability. One of the causes of this representativeness bias may be the way clinical medicine is taught, where stereotypic presentations are emphasised in diagnostic decision making.
Bhaskaran, Krishnan; Forbes, Harriet J; Douglas, Ian; Leon, David A; Smeeth, Liam
2013-01-01
Objectives To assess the completeness and representativeness of body mass index (BMI) data in the Clinical Practice Research Datalink (CPRD), and determine an optimal strategy for their use. Design Descriptive study. Setting Electronic healthcare records from primary care. Participants A random sample of one million patients from the UK CPRD primary care database, aged ≥16 years. Primary and secondary outcome measures BMI completeness in CPRD was evaluated by age, sex and calendar period. CPRD-based summary BMI statistics for each calendar year (2003–2010) were age-standardised and sex-standardised and compared with equivalent statistics from the Health Survey for England (HSE). Results BMI completeness increased over calendar time from 37% in 1990–1994 to 77% in 2005–2011, was higher among females and increased with age. When BMI at specific time points was assigned based on the most recent record, calendar-year-specific mean BMI statistics underestimated equivalent HSE statistics by 0.75–1.1 kg/m2. Restriction to those with a recent (≤3 years) BMI resulted in mean BMI estimates closer to HSE (≤0.28 kg/m2 underestimation), but excluded up to 47% of patients. An alternative strategy of imputing up-to-date BMI based on modelled changes in BMI over time since the last available record also led to mean BMI estimates that were close to HSE (≤0.37 kg/m2 underestimation). Conclusions Completeness of BMI in CPRD increased over time and varied by age and sex. At a given point in time, a large proportion of the most recent BMIs are unlikely to reflect current BMI; consequent BMI misclassification might be reduced by employing model-based imputation of current BMI. PMID:24038008
Experimental analysis of computer system dependability
NASA Technical Reports Server (NTRS)
Iyer, Ravishankar K.; Tang, Dong
1993-01-01
This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
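Importance sampling, introduced above as a way to accelerate Monte Carlo simulation, can be illustrated with a toy rare-event probability; the threshold and distributions are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
threshold = 5.0  # rare event: a standard normal "load" exceeding 5
n = 100_000

# Naive Monte Carlo: almost no samples land in the rare-event region.
naive = (rng.normal(0, 1, n) > threshold).mean()

# Importance sampling: draw from a proposal shifted into the rare region,
# then reweight each sample by the likelihood ratio p(x)/q(x).
x = rng.normal(threshold, 1, n)               # proposal q = N(threshold, 1)
weights = norm.pdf(x) / norm.pdf(x, loc=threshold)
is_estimate = np.mean((x > threshold) * weights)

print(f"true P     : {norm.sf(threshold):.3e}")
print(f"naive MC   : {naive:.3e}")
print(f"importance : {is_estimate:.3e}")
```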
A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis.
Gonzalez, Oscar; MacKinnon, David P
Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to an outcome. However, current methods do not allow researchers to study the relationships between general and specific aspects of a construct to an outcome simultaneously. This study proposes a bifactor measurement model for the mediating construct as a way to parse variance and represent the general aspect and specific facets of a construct simultaneously. Monte Carlo simulation results are presented to help determine the properties of mediated effect estimation when the mediator has a bifactor structure and a specific facet of a construct is the true mediator. This study also investigates the conditions under which researchers can detect the mediated effect when the multidimensionality of the mediator is ignored and the mediator is treated as unidimensional. Simulation results indicated that the mediation model with a bifactor mediator measurement model produced unbiased estimates and had adequate power to detect the mediated effect with a sample size greater than 500 and medium a- and b-paths. Also, results indicate that parameter bias and detection of the mediated effect in both the data-generating model and the misspecified model vary as a function of the amount of facet variance represented in the mediation model. This study contributes to the largely unexplored area of measurement issues in statistical mediation analysis.
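For intuition, in the simple single-mediator case the mediated effect is the product of the a-path (X to M) and the b-path (M to Y, adjusting for X). A minimal Monte Carlo sketch of that unidimensional case; the path values and sample size are arbitrary, not the study's simulation conditions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, a, b, c_prime = 500, 0.39, 0.39, 0.1  # "medium" paths; illustrative values

X = rng.normal(size=n)
M = a * X + rng.normal(size=n)                 # a-path
Y = c_prime * X + b * M + rng.normal(size=n)   # b-path plus direct effect

# Estimate a from the regression M ~ X, and b from Y ~ X + M.
a_hat = np.linalg.lstsq(np.column_stack([np.ones(n), X]), M, rcond=None)[0][1]
b_hat = np.linalg.lstsq(np.column_stack([np.ones(n), X, M]), Y, rcond=None)[0][2]
print(f"estimated mediated effect a*b = {a_hat * b_hat:.3f} (true {a * b:.3f})")
```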
NASA Astrophysics Data System (ADS)
Verma, Surendra P.; Pandarinath, Kailasa; Verma, Sanjeet K.
2011-07-01
In the lead presentation (invited talk) of Session SE05 (Frontiers in Geochemistry with Reference to Lithospheric Evolution and Metallogeny) of AOGS2010, we highlighted the need for correct statistical treatment of geochemical data. In most diagrams used for interpreting compositional data, the basic statistical assumption of an open sample space for all variables is violated. Among these graphic tools, discrimination diagrams have been in use for nearly 40 years to decipher tectonic setting. The newer set of five tectonomagmatic discrimination diagrams published in 2006 (based on major elements) and the two sets made available in 2008 and 2011 (both based on immobile elements) fulfill all statistical requirements for correct handling of compositional data, including the multivariate nature of compositional variables, representative sampling, and probability-based tectonic field boundaries. Additionally, in the most recent proposal of 2011, samples having normally distributed, discordant-outlier-free log-ratio variables were used in linear discriminant analysis. In these three sets of five diagrams each, discrimination was successfully documented for four tectonic settings (island arc, continental rift, ocean island, and mid-ocean ridge). The discrimination diagrams have been extensively evaluated for their performance by different workers. We exemplify these two sets of new diagrams (one set based on major elements and the other on immobile elements) using ophiolites from the Boso Peninsula, Japan. This example is included for illustration purposes only and is not meant as a test of these newer diagrams. Their evaluation and comparison with older, conventional bivariate or ternary diagrams have been reported in other papers.
Mura, Maria Chiara; De Felice, Marco; Morlino, Roberta; Fuselli, Sergio
2010-01-01
In step with the need to develop statistical procedures to manage small environmental samples, in this work we used concentration values of benzene (C6H6), concurrently detected by seven outdoor and indoor monitoring stations over 12,000 minutes, to assess the representativeness of collected data and the impact of the pollutant on the indoor environment. Clearly, the former issue is strictly connected to sampling-site geometry, which proves critical to correctly retrieving information from analysis of pollutants of sanitary interest. Therefore, according to current criteria for network planning, single stations have been interpreted as nodes of a set of adjoining triangles; then, a) node pairs have been taken into account in order to estimate pollutant stationarity on triangle sides, b) node triplets, to statistically associate data from air monitoring with the corresponding territory area, and c) node sextuplets, to assess the impact probability of the outdoor pollutant on the indoor environment for each area. Distributions from the various node combinations are all non-Gaussian; consequently, Kruskal-Wallis (KW) non-parametric statistics were used to test variability in the continuous density functions from each pair, triplet, and sextuplet. Results from the above-mentioned statistical analysis showed randomness of site selection, which did not allow a reliable generalization of monitoring data to the entire selected territory, except for a single "forced" case (70%); most important, they suggest a possible procedure to optimize network design.
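The Kruskal-Wallis test used above to compare non-Gaussian distributions across stations is available in SciPy. A sketch with synthetic, skewed benzene concentrations (the station values are invented):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(7)
# Synthetic benzene concentrations (ug/m3) at three monitoring stations;
# log-normal draws mimic the skewed, non-Gaussian data described above.
station_a = rng.lognormal(mean=0.5, sigma=0.4, size=60)
station_b = rng.lognormal(mean=0.6, sigma=0.4, size=60)
station_c = rng.lognormal(mean=0.9, sigma=0.4, size=60)

h, p = kruskal(station_a, station_b, station_c)
print(f"Kruskal-Wallis H = {h:.2f}, p = {p:.4f}")
# A small p-value indicates at least one station's distribution differs,
# arguing against pooling their data as if they sampled the same territory.
```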
The beta distribution: A statistical model for world cloud cover
NASA Technical Reports Server (NTRS)
Falls, L. W.
1973-01-01
Much work has been performed in developing empirical global cloud cover models. This investigation was made to determine an underlying theoretical statistical distribution to represent worldwide cloud cover. The beta distribution is proposed to represent the variability of this random variable, and its probability density function is given. It is shown that the beta distribution possesses the versatile statistical characteristics necessary to assume the wide variety of shapes exhibited by cloud cover. A total of 160 representative empirical cloud cover distributions were investigated, and the conclusion was reached that this study provides sufficient statistical evidence to accept the beta probability distribution as the underlying model for world cloud cover.
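Fitting a beta distribution to cloud-cover fractions takes one call in SciPy. A sketch with synthetic fractional cloud cover on [0, 1]; the shape parameters are arbitrary stand-ins for the empirical distributions the study examined:

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(3)
# Synthetic cloud-cover fractions; true shape parameters chosen arbitrarily.
cover = rng.beta(a=0.8, b=1.4, size=1000)

# Fix location 0 and scale 1 so the fitted support is exactly [0, 1].
a_hat, b_hat, loc, scale = beta.fit(cover, floc=0, fscale=1)
print(f"fitted shape parameters: a = {a_hat:.2f}, b = {b_hat:.2f}")
# Shape parameters below 1 produce the U- and J-shaped densities that make
# the beta family flexible enough for the variety of cloud regimes.
```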
Is parenting style a predictor of suicide attempts in a representative sample of adolescents?
2014-01-01
Background Suicidal ideation and suicide attempts are serious but not rare conditions in adolescents. However, there are several research and practical suicide-prevention initiatives that discuss the possibility of preventing serious self-harm. Profound knowledge about risk and protective factors is therefore necessary. The aim of this study is a) to clarify the role of parenting behavior and parenting styles in adolescents' suicide attempts and b) to identify other statistically significant and clinically relevant risk and protective factors for suicide attempts in a representative sample of German adolescents. Methods In the years 2007/2008, a representative written survey of N = 44,610 students in the 9th grade of different school types in Germany was conducted. In this survey, the lifetime prevalence of suicide attempts was investigated along with potential predictors including parenting behavior. A three-step statistical analysis was carried out: I) As a basic model, the association between parenting and suicide attempts was explored via binary logistic regression controlled for age and sex. II) The predictive values of 13 additional potential risk/protective factors were analyzed with single binary logistic regression analyses for each predictor alone. Non-significant predictors were excluded in Step III. III) In a multivariate binary logistic regression analysis, all significant predictor variables from Step II and the parenting styles were included after testing for multicollinearity. Results Three parental variables showed a relevant, protective association with suicide attempts in adolescents: mother's warmth and father's warmth in childhood, and mother's control in adolescence (Step I). In the full model (Step III), authoritative parenting (protective: OR: .79) and rejecting-neglecting parenting (risk: OR: 1.63) were identified as significant predictors (p < .001) for suicide attempts. Seven further variables were interpreted to be statistically significant and clinically relevant: ADHD, female sex, smoking, binge drinking, absenteeism/truancy, migration background, and parental separation events. Conclusions Parenting style does matter. While children of authoritative parents profit, children of rejecting-neglecting parents are put at risk, as we were able to show for suicide attempts in adolescence. Some of the identified risk factors contribute new knowledge and potential areas of intervention for special groups such as migrants or children diagnosed with ADHD. PMID:24766881
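The binary logistic regressions of Steps I-III can be reproduced with statsmodels. A minimal sketch with simulated predictors, where the simulated effects are chosen to echo the reported odds ratios but are otherwise invented:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 2000
authoritative = rng.integers(0, 2, n)  # 1 = authoritative parenting
rejecting = rng.integers(0, 2, n)      # 1 = rejecting-neglecting parenting
female = rng.integers(0, 2, n)

# Simulate a suicide-attempt outcome; log-odds roughly matching OR 0.79
# (protective) and OR 1.63 (risk) from the abstract, purely for illustration.
logit = -3.0 - 0.24 * authoritative + 0.49 * rejecting + 0.4 * female
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

X = sm.add_constant(np.column_stack([authoritative, rejecting, female]))
fit = sm.Logit(y, X).fit(disp=False)
odds_ratios = np.exp(fit.params)
for name, or_ in zip(["const", "authoritative", "rejecting", "female"],
                     odds_ratios):
    print(f"{name:14s} OR = {or_:.2f}")
```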
Kozlakidis, Zisis; Mant, Christine; Peters, Barry; Post, Frank; Fox, Julie; Philpott-Howard, John; Tong, William C-Y; Edgeworth, Jonathan; Peakman, Mark; Malim, Michael; Cason, John
2011-09-01
Biobanks have a primary responsibility to collect tissues that are a true reflection of their local population and thereby promote translational research, which is applicable to the community. The Infectious Diseases BioBank (IDB) at King's College London is located in the southeast of the city, an area that is ethnically diverse. Transplantation programs have frequently reported a low rate of donation among some ethnic minorities. To determine whether patients who volunteered peripheral venous blood samples to the IDB were representative of the local community, we compared local government demographic data to characteristics of patients who have donated to the IDB. There was a good match between these statistics, indicating that the IDB's volunteer population of human immunodeficiency virus patients was similar to local demographics.
Coming of Age in Spain: The Self-identification, Beliefs and Self-Esteem of the Second Generation
Portes, Alejandro; Vickstrom, Erik; Aparicio, Rosa
2013-01-01
We review the literature on determinants of ethnic/national self-identities and self-esteem as a prelude to examining these outcomes among a large, statistically representative sample of second generation adolescents in Madrid and Barcelona. While these psycho-social outcomes are malleable, they still represent important dimensions of immigrant adaptation and can have significant consequences both for individual mobility and collective mobilizations. Current theories are largely based on data from the USA and other Anglophone countries. The availability of a new large Spanish survey allows us to test those theories in an entirely different socio-cultural context. The analysis concludes with a structural equations model that summarizes key determinants of national identities and self-esteem among children of immigrants in Spain. Theoretical and practical implications of these findings are discussed. PMID:21899520
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS, including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprises 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS-specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high-throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.
Middleton, David A; Hughes, Eleri; Madine, Jillian
2004-08-11
We describe an NMR approach for detecting the interactions between phospholipid membranes and proteins, peptides, or small molecules. First, 1H-13C dipolar coupling profiles are obtained from hydrated lipid samples at natural isotope abundance using cross-polarization magic-angle spinning NMR methods. Principal component analysis of dipolar coupling profiles for synthetic lipid membranes in the presence of a range of biologically active additives reveals clusters that relate to different modes of interaction of the additives with the lipid bilayer. Finally, by representing profiles from multiple samples in the form of contour plots, it is possible to reveal statistically significant changes in dipolar couplings, which reflect perturbations in the lipid molecules at the membrane surface or within the hydrophobic interior.
Yang, Yi; Tokita, Midori; Ishiguchi, Akira
2018-01-01
A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been studied well in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollypops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks with size and orientation domains. We found that the performances in the mean and variance discrimination tasks were not correlated with each other and demonstrated that extractions of the mean and variance are mediated by different representation mechanisms. In addition, we tested the relationship between performances in size and orientation domains for each summary statistic (i.e. mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed.
NASA Astrophysics Data System (ADS)
Dietze, Michael; Fuchs, Margret; Kreutzer, Sebastian
2016-04-01
Many modern approaches to radiometric dating or geochemical fingerprinting rely on sampling sedimentary deposits. A key assumption of most concepts is that the extracted grain-size fraction of the sampled sediment adequately represents the actual process to be dated or the source area to be fingerprinted. However, these assumptions are not always well constrained. Rather, they have to align with arbitrary, method-determined size intervals, such as "coarse grain" or "fine grain", sometimes with differing definitions. Such arbitrary intervals violate principal process-based concepts of sediment transport and can thus introduce significant bias to the analysis outcome (i.e., a deviation of the measured from the true value). We present a flexible numerical framework (numOlum) for the statistical programming language R that allows quantifying the bias due to any given analysis size interval for different types of sediment deposits. This framework is applied to synthetic samples from the realms of luminescence dating and geochemical fingerprinting, i.e., a virtual reworked loess section. We show independent validation data from artificially dosed and subsequently mixed grain-size proportions, and we present a statistical approach (end-member modelling analysis, EMMA) that allows accounting for the effect of measuring the compound dosimetric history or geochemical composition of a sample. EMMA separates polymodal grain-size distributions into the underlying transport-process-related distributions and their contribution to each sample. These underlying distributions can then be used to adjust grain-size preparation intervals to minimise the incorporation of "undesired" grain-size fractions.
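EMMA itself is distributed for R, but its core idea, decomposing a samples-by-size-classes matrix into a few non-negative end members and their mixing proportions, can be imitated with non-negative matrix factorization. A simplified stand-in with synthetic data, not the actual EMMA algorithm:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(11)

# Two synthetic "transport process" end members over 50 grain-size classes.
grid = np.linspace(0, 10, 50)
em1 = np.exp(-0.5 * ((grid - 3) / 0.8) ** 2)   # e.g., a fine aeolian mode
em2 = np.exp(-0.5 * ((grid - 7) / 1.2) ** 2)   # e.g., a coarse fluvial mode
end_members = np.vstack([em1 / em1.sum(), em2 / em2.sum()])

# 30 samples = random mixtures of the two end members plus small noise.
mix = rng.dirichlet([2, 2], size=30)
X = mix @ end_members + rng.normal(0, 1e-4, (30, 50)).clip(min=0)

model = NMF(n_components=2, init="nndsvda", max_iter=2000)
loadings = model.fit_transform(X)   # per-sample end-member contributions
recovered = model.components_       # recovered end-member distributions
print("reconstruction error:", round(model.reconstruction_err_, 5))
```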
Korir, Robert Cheruiyot; Parveen, Salina; Hashem, Fawzy; Bowers, John
2016-06-01
The aim of this study was to investigate the microbiological quality of six types of fresh produce obtained from three retail stores located on the Eastern Shore of Maryland, USA. A total of 414 samples representing basil, cilantro, lettuce, scallion, spinach, and parsley were analyzed for total aerobic bacteria (APC), total coliforms, Escherichia coli, and three pathogenic bacteria (E. coli O157:H7, Listeria monocytogenes, and Salmonella), using standard methods. Presumptive pathogenic isolates were confirmed using the BAX PCR system. Total aerobic populations varied widely between samples, while 38.41% of samples were positive for total coliforms and only 10.15% for E. coli. Median abundance (log CFU/g) of total coliforms and E. coli was below the limit of detection, and that of APC ranged from 5.78 to 6.61 over the six produce types. There was a statistically significant difference in the prevalence of total coliforms among the retail stores, but not in the abundance of APC or the prevalence of E. coli. E. coli O157:H7 and L. monocytogenes were detected in one spinach sample each, while one parsley and one cilantro sample were positive for Salmonella. There were no statistically significant differences in microbiological quality among produce types. Although the results of this study provided some indices of sanitary and/or spoilage level, no relationship was observed among total aerobic bacteria, total coliforms, E. coli, and the presence of pathogenic bacteria in the samples tested.
Galway, Lp; Bell, Nathaniel; Sae, Al Shatari; Hagopian, Amy; Burnham, Gilbert; Flaxman, Abraham; Weiss, Wiliam M; Rajaratnam, Julie; Takaro, Tim K
2012-04-27
Mortality estimates can measure and monitor the impacts of conflict on a population, guide humanitarian efforts, and help to better understand the public health impacts of conflict. Vital statistics registration and surveillance systems are rarely functional in conflict settings, posing the challenge of estimating mortality using retrospective population-based surveys. We present a two-stage cluster sampling method for application in population-based mortality surveys. The sampling method utilizes gridded population data and a geographic information system (GIS) to select clusters in the first sampling stage, and Google Earth™ imagery and sampling grids to select households in the second sampling stage. The sampling method was implemented in a household mortality study in Iraq in 2011. Factors affecting feasibility and methodological quality are described. Sampling is a challenge in retrospective population-based mortality studies, and alternatives that improve on the conventional approaches are needed. The sampling strategy presented here was designed to generate a representative sample of the Iraqi population while reducing the potential for bias and considering the context-specific challenges of the study setting. This sampling strategy, or variations on it, is adaptable and should be considered and tested in other conflict settings.
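The first sampling stage, selecting clusters with probability proportional to estimated size from gridded population data, can be sketched as systematic PPS sampling. The grid-cell populations below are fabricated, and the imagery-based household stage is only indicated in a comment:

```python
import numpy as np

rng = np.random.default_rng(13)
grid_pop = rng.integers(100, 5000, size=200)  # fabricated grid-cell populations

def pps_systematic(sizes, n_clusters, rng):
    """Systematic probability-proportional-to-size selection of clusters.
    Very large cells can be selected more than once, as in standard PPS."""
    cum = np.cumsum(sizes)
    step = cum[-1] / n_clusters
    start = rng.uniform(0, step)
    points = start + step * np.arange(n_clusters)
    return np.searchsorted(cum, points)  # indices of selected grid cells

clusters = pps_systematic(grid_pop, n_clusters=20, rng=rng)
print("selected grid cells:", clusters)
# Stage two would then draw a fixed number of households per selected cell,
# e.g., from a sampling grid overlaid on satellite imagery.
```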
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.
2013-04-27
This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating 1) the number of samples required to achieve a specified confidence in characterization and clearance decisions, and 2) the confidence in making characterization and clearance decisions for a specified number of samples, for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically, if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: 1) qualitative data (detect and non-detect) when the FNR = 0, or when using statistical sampling methods that account for FNR > 0; 2) qualitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0; 3) quantitative data (e.g., contaminant concentrations expressed as CFU/cm2) when the FNR = 0, or when using statistical sampling methods that account for FNR > 0; and 4) quantitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0. For Situation 2, the hotspot sampling approach provides for stating with Z% confidence that a hotspot of specified shape and size with detectable contamination will be found. Also for Situation 2, the CJR approach provides for stating with X% confidence that at least Y% of the decision area does not contain detectable contamination. Forms of these statements for the other three situations are discussed in Section 2.2. Statistical methods that account for FNR > 0 currently exist only for the hotspot sampling approach with qualitative data (or quantitative data converted to qualitative data). This report documents the current status of methods and formulas for the hotspot and CJR sampling approaches. Limitations of these methods are identified. Extensions of the methods that are applicable when FNR = 0 to account for FNR > 0, or to address other limitations, will be documented in future revisions of this report if future funding supports the development of such extensions. For quantitative data, this report also presents statistical methods and formulas for 1) quantifying the uncertainty in measured sample results, 2) estimating the true surface concentration corresponding to a surface sample, and 3) quantifying the uncertainty of the estimate of the true surface concentration. All of the methods and formulas discussed in the report were applied to example situations to illustrate application of the methods and interpretation of the results.
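For the CJR-style statement with all-negative results and FNR = 0, the random-sample count follows a standard acceptance-sampling bound. A hedged sketch of that general form, not the report's exact CJR formula, which also credits judgment samples:

```python
import math

def n_random_samples(confidence=0.95, clean_fraction=0.99):
    """Samples needed so that all-negative results support `confidence`
    that at least `clean_fraction` of the area is free of detectable
    contamination, assuming random sampling and a false negative rate of 0."""
    return math.ceil(math.log(1 - confidence) / math.log(clean_fraction))

print(n_random_samples(0.95, 0.99))  # about 299 samples for 95%/99%
```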
The intrinsic three-dimensional shape of galactic bars
NASA Astrophysics Data System (ADS)
Méndez-Abreu, J.; Costantin, L.; Aguerri, J. A. L.; de Lorenzo-Cáceres, A.; Corsini, E. M.
2018-06-01
We present the first statistical study on the intrinsic three-dimensional (3D) shape of a sample of 83 galactic bars extracted from the CALIFA survey. We use the galaXYZ code to derive the bar intrinsic shape with a statistical approach. The method uses only the geometric information (ellipticities and position angles) of bars and discs obtained from a multi-component photometric decomposition of the galaxy surface-brightness distributions. We find that bars are predominantly prolate-triaxial ellipsoids (68%), with a small fraction of oblate-triaxial ellipsoids (32%). The typical flattening (intrinsic C/A semiaxis ratio) of the bars in our sample is 0.34, which matches well the typical intrinsic flattening of stellar discs at these galaxy masses. We demonstrate that, for prolate-triaxial bars, the intrinsic shape of bars depends on the galaxy Hubble type and stellar mass (bars in massive S0 galaxies are thicker and more circular than those in less massive spirals). The bar intrinsic shape correlates with bulge, disc, and bar parameters. In particular with the bulge-to-total (B/T) luminosity ratio, disc g - r color, and central surface brightness of the bar, confirming the tight link between bars and their host galaxies. Combining the probability distributions of the intrinsic shape of bulges and bars in our sample we show that 52% (16%) of bulges are thicker (flatter) than the surrounding bar at 1σ level. We suggest that these percentages might be representative of the fraction of classical and disc-like bulges in our sample, respectively.
Rosenthal, Mariana; Anderson, Katey; Tengelsen, Leslie; Carter, Kris; Hahn, Christine; Ball, Christopher
2017-08-24
The Right Size Roadmap was developed by the Association of Public Health Laboratories and the Centers for Disease Control and Prevention to improve influenza virologic surveillance efficiency. Guidelines were provided to state health departments regarding representativeness and statistical estimates of specimen numbers needed for seasonal influenza situational awareness, rare or novel influenza virus detection, and rare or novel influenza virus investigation. The aim of this study was to compare Roadmap sampling recommendations with Idaho's influenza virologic surveillance to determine implementation feasibility. We calculated the proportion of medically attended influenza-like illness (MA-ILI) from Idaho's influenza-like illness surveillance among outpatients during October 2008 to May 2014, applied data to Roadmap-provided sample size calculators, and compared calculations with actual numbers of specimens tested for influenza by the Idaho Bureau of Laboratories (IBL). We assessed representativeness among patients' tested specimens to census estimates by age, sex, and health district residence. Among outpatients surveilled, Idaho's mean annual proportion of MA-ILI was 2.30% (20,834/905,818) during a 5-year period. Thus, according to Roadmap recommendations, Idaho needs to collect 128 specimens from MA-ILI patients/week for situational awareness, 1496 influenza-positive specimens/week for detection of a rare or novel influenza virus at 0.2% prevalence, and after detection, 478 specimens/week to confirm true prevalence is ≤2% of influenza-positive samples. The mean number of respiratory specimens Idaho tested for influenza/week, excluding the 2009-2010 influenza season, ranged from 6 to 24. Various influenza virus types and subtypes were collected and specimen submission sources were representative in terms of geographic distribution, patient age range and sex, and disease severity. Insufficient numbers of respiratory specimens are submitted to IBL for influenza laboratory testing. Increased specimen submission would facilitate meeting Roadmap sample size recommendations.
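The 1496-specimen figure quoted above is consistent with the standard sample-size calculation for detecting at least one positive at a given prevalence; a sketch under the assumption of independent specimens (the Roadmap calculators may apply additional adjustments):

```python
import math

def n_for_detection(prevalence=0.002, confidence=0.95):
    """Specimens needed to detect at least one rare/novel virus with the
    given confidence, assuming each specimen is independently positive
    with probability `prevalence`."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))

print(n_for_detection(0.002, 0.95))  # 1497, close to the 1496/week quoted
```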
An audit of the statistics and the comparison with the parameter in the population
NASA Astrophysics Data System (ADS)
Bujang, Mohamad Adam; Sa'at, Nadiah; Joys, A. Reena; Ali, Mariana Mohamad
2015-10-01
The sample size needed to closely estimate the statistics for particular parameters remains an issue. Although a sample size may have been calculated to suit the objective of a study, it is difficult to confirm whether the resulting statistics are close to the parameters of the population concerned. To date, the guideline of a p-value less than 0.05 has been widely used as inferential evidence. Therefore, this study audited results analyzed from various subsamples and statistical analyses and compared them with the parameters of three different populations. Eight types of statistical analysis, with eight subsamples for each, were analyzed. The statistics were found to be consistent and close to the parameters when the sample covered at least 15% to 35% of the population. Larger sample sizes are needed to estimate parameters involving categorical variables than those involving numerical variables. Sample sizes of 300 to 500 are sufficient to estimate the parameters for a medium-sized population.
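The audit logic can be sketched in a few lines of code: draw subsamples of increasing fractions from a population and compare each sample statistic with the known parameter. This is a minimal illustration with a synthetic population, not the study's data or analyses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "population" of 10,000 numerical measurements (illustrative only).
population = rng.normal(loc=50, scale=10, size=10_000)
parameter = population.mean()  # the known parameter to be recovered

# Audit: how close is the sample statistic for increasing sampling fractions?
for fraction in (0.05, 0.15, 0.25, 0.35):
    n = int(fraction * population.size)
    sample = rng.choice(population, size=n, replace=False)
    print(f"{fraction:4.0%}: sample mean = {sample.mean():6.2f} "
          f"(parameter = {parameter:6.2f})")
```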
Determination of beta activity in water
Barker, F.B.; Robinson, B.P.
1963-01-01
Many elements have one or more naturally radioactive isotopes, and several hundred other radionuclides have been produced artificially. Radioactive substances may be present in natural water as a result of geochemical processes or the release of radioactive waste and other nuclear debris to the environment. The Geological Survey has developed methods for measuring certain of these radioactive substances in water. Radioactive substances often are present in water samples in microgram quantities or less. Therefore, precautions must be taken to prevent loss of material and to assure that the sample truly represents its source at the time of collection. Addition of acids, complexing agents, or stable isotopes often aids in preventing loss of radioactivity on container walls, on sediment, or on other solid materials in contact with the sample. The disintegration of radioactive atoms is a random process subject to established methods of statistical analysis. Because many water samples contain small amounts of radioactivity, low-level counting techniques must be used. The usual assumption that counting data follow a Gaussian distribution is invalid under these conditions, and statistical analyses must be based on the Poisson distribution. The gross beta activity in water samples is determined from the residue left after evaporation of the sample to dryness. Evaporation is accomplished first in a teflon dish; then the residue is transferred with distilled water to a counting planchet and again is reduced to dryness. The radioactivity on the planchet is measured with an anticoincidence-shielded, low-background beta counter and is compared with measurements of a strontium-90-yttrium-90 standard prepared and measured in the same manner. Control charts are used to assure consistent operation of the counting instrument.
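Because low-level counting data follow the Poisson distribution, the variance of a count equals the count itself, and the uncertainty of a net beta count rate follows directly from the gross and background counts. A minimal sketch with invented numbers (not values from the report):

```python
import math

# Gross and background counts over their counting times (illustrative values).
gross_counts, t_gross = 120, 100.0   # counts, minutes
bkg_counts, t_bkg = 80, 100.0

# Poisson statistics: var(N) = N, so each rate has uncertainty sqrt(N)/t.
net_rate = gross_counts / t_gross - bkg_counts / t_bkg
sigma_net = math.sqrt(gross_counts / t_gross**2 + bkg_counts / t_bkg**2)

print(f"net rate = {net_rate:.3f} +/- {sigma_net:.3f} counts per minute")
```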
Hu, Zonghui; Qin, Jing
2018-05-20
Many observational studies adopt what we call retrospective convenience sampling (RCS). With the sample size in each arm prespecified, RCS randomly selects subjects from the treatment-inclined subpopulation into the treatment arm and those from the control-inclined into the control arm. Samples in each arm are representative of the respective subpopulation, but the proportion of the 2 subpopulations is usually not preserved in the sample data. We show in this work that, under RCS, existing causal effect estimators actually estimate the treatment effect over the sample population instead of the underlying study population. We investigate how to correct existing methods for consistent estimation of the treatment effect over the underlying population. Although RCS is adopted in medical studies for ethical and cost-effective purposes, it also has a big advantage for statistical inference: When the tendency to receive treatment is low in a study population, treatment effect estimators under RCS, with proper correction, are more efficient than their parallels under random sampling. These properties are investigated both theoretically and through numerical demonstration. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
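A toy illustration of the re-weighting idea (not the paper's estimators): under RCS the prespecified arm sizes fix the mixing proportion in the sample, so a population-level summary must re-weight the arms by an externally supplied population proportion. All numbers below are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# RCS fixes the arm sizes (here 500/500), so the sample mixes the
# treatment-inclined and control-inclined subpopulations 50/50 even when
# the population proportion pi differs; pi must be supplied externally.
pi = 0.2                                  # assumed population proportion
y_treat = rng.normal(1.0, 1.0, size=500)  # outcomes, treatment arm
y_ctrl = rng.normal(0.0, 1.0, size=500)   # outcomes, control arm

# Summary over the *sample* population (implicit 50/50 weights).
sample_mean = np.concatenate([y_treat, y_ctrl]).mean()

# Summary over the *study* population (weights pi and 1 - pi).
population_mean = pi * y_treat.mean() + (1 - pi) * y_ctrl.mean()

print(f"sample-population mean: {sample_mean:.3f}")
print(f"study-population mean:  {population_mean:.3f}")
```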
Forensic discrimination of copper wire using trace element concentrations.
Dettman, Joshua R; Cassabaum, Alyssa A; Saunders, Christopher P; Snyder, Deanna L; Buscaglia, JoAnn
2014-08-19
Copper may be recovered as evidence in high-profile cases such as thefts and improvised explosive device incidents; comparison of copper samples from the crime scene and those associated with the subject of an investigation can provide probative associative evidence and investigative support. A solution-based inductively coupled plasma mass spectrometry method for measuring trace element concentrations in high-purity copper was developed using standard reference materials. The method was evaluated for its ability to use trace element profiles to statistically discriminate between copper samples considering the precision of the measurement and manufacturing processes. The discriminating power was estimated by comparing samples chosen on the basis of the copper refining and production process to represent the within-source (samples expected to be similar) and between-source (samples expected to be different) variability using multivariate parametric- and empirical-based data simulation models with bootstrap resampling. If the false exclusion rate is set to 5%, >90% of the copper samples can be correctly determined to originate from different sources using a parametric-based model and >87% with an empirical-based approach. These results demonstrate the potential utility of the developed method for the comparison of copper samples encountered as forensic evidence.
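The within-source versus between-source comparison can be caricatured with an empirical bootstrap: resample replicate measurements of one sample to build a null distribution of profile distances, then flag an observed distance beyond the 95th percentile (a 5% false exclusion rate). This is a simplified sketch, not the parametric- or empirical-based models of the paper, and all concentrations are made up:

```python
import numpy as np

rng = np.random.default_rng(2)

# Replicate trace-element concentration profiles (ppm) for two copper
# samples; 4 replicates x 3 elements, values invented for illustration.
sample_a = rng.normal([1.0, 5.0, 0.3], 0.05, size=(4, 3))
sample_b = rng.normal([1.1, 5.2, 0.3], 0.05, size=(4, 3))

def mean_distance(x, y):
    """Euclidean distance between mean element profiles."""
    return np.linalg.norm(x.mean(axis=0) - y.mean(axis=0))

observed = mean_distance(sample_a, sample_b)

# Bootstrap the distance expected for two samples from the SAME source
# by resampling replicates of sample A with replacement.
null_dist = [
    mean_distance(sample_a[rng.integers(0, 4, size=4)],
                  sample_a[rng.integers(0, 4, size=4)])
    for _ in range(5000)
]

threshold = np.quantile(null_dist, 0.95)  # 5% false exclusion rate
print(f"observed = {observed:.3f}, threshold = {threshold:.3f}, "
      f"exclude same source: {observed > threshold}")
```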
Representativeness of direct observations selected using a work-sampling equation.
Sharp, Rebecca A; Mudford, Oliver C; Elliffe, Douglas
2015-01-01
Deciding on appropriate sampling to obtain representative samples of behavior is important but not straightforward, because the relative duration of the target behavior may affect its observation in a given sampling interval. Work-sampling methods, which offer a way to adjust the frequency of sampling according to a priori or ongoing estimates of the behavior to achieve a preselected level of representativeness, may provide a solution. Full-week observations of 7 behaviors were conducted for 3 students with autism spectrum disorder and intellectual disabilities. Work-sampling methods were used to select momentary time samples from the full time-of-interest, which produced representative samples. However, work sampling required impractically high numbers of time samples to obtain representative samples. More practical momentary time samples produced less representative samples, particularly for low-duration behaviors. The utility and limits of work-sampling methods for applied behavior analysis are discussed. © Society for the Experimental Analysis of Behavior.
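The classical work-sampling equation sizes the number of momentary time samples from a prior estimate of the behavior's prevalence, which is why low-duration behaviors demand impractically many samples. A generic form of the equation is sketched below (the study may use a different variant):

```python
import math

def work_sampling_n(p, rel_accuracy=0.10, z=1.96):
    """Number of momentary time samples so the observed proportion p is
    within +/- rel_accuracy * p with ~95% confidence (classical
    work-sampling equation; a generic form for illustration)."""
    e = rel_accuracy * p  # absolute tolerance
    return math.ceil(z**2 * p * (1 - p) / e**2)

# Low-duration behaviors need far more samples than high-duration ones.
for p in (0.05, 0.20, 0.50):
    print(f"p = {p:.2f}: n = {work_sampling_n(p)}")
```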
On estimating the effects of clock instability with flicker noise characteristics
NASA Technical Reports Server (NTRS)
Wu, S. C.
1981-01-01
A scheme for flicker noise generation is given. The second approach is that of successive segmentation: a clock fluctuation is represented by 2^N piecewise linear segments and then converted into a summation of N+1 triangular pulse train functions. The statistics of the clock instability are then formulated in terms of two-sample variances at N+1 specified averaging times. The summation converges so rapidly that a value of N > 6 is seldom necessary. An application to radio interferometric geodesy shows excellent agreement between the two approaches. The limitations and relative merits of the two approaches are discussed.
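The two-sample variance referred to here is the standard Allan variance. A textbook estimator from time-deviation samples is sketched below for context; it is not the paper's segmentation scheme, and the simulated noise is white frequency noise rather than flicker noise:

```python
import numpy as np

def allan_variance(phase, tau0, m):
    """Non-overlapping two-sample (Allan) variance at averaging time
    m * tau0, from clock time-deviation samples taken at interval tau0."""
    x = phase[::m]                      # decimate to averaging time tau
    d2 = x[2:] - 2 * x[1:-1] + x[:-2]   # second differences of the phase
    tau = m * tau0
    return (d2**2).mean() / (2.0 * tau**2)

# White frequency noise: sigma_y^2(tau) should fall off as 1/tau.
rng = np.random.default_rng(3)
tau0 = 1.0
phase = np.cumsum(rng.normal(0.0, 1e-9, size=100_000)) * tau0
for m in (1, 10, 100):
    print(f"tau = {m * tau0:6.1f} s: {allan_variance(phase, tau0, m):.3e}")
```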
Systematic evaluation of serum and plasma collection on the endogenous metabolome.
Zhou, Zhi; Chen, Yanhua; He, Jiuming; Xu, Jing; Zhang, Ruiping; Mao, Yan; Abliz, Zeper
2017-02-01
In metabolomics research, the use of different blood collection methods may influence endogenous metabolites. Ultra-HPLC coupled with MS/MS was applied together with multivariate statistics to investigate metabolomic differences between serum and plasma samples prepared with different anticoagulants. A total of 135 known representative metabolites were assessed for a comprehensive evaluation of the effects of anticoagulants. Exogenous factors, including separation gel ingredients from the serum collection tubes and the anticoagulants, affected mass spectrometric detection. Heparin plasma yielded the best detection of different functional groups and is therefore the optimal blood specimen for metabolomics research, followed by potassium oxalate plasma.
Barnes, Stephen; Benton, H. Paul; Casazza, Krista; Cooper, Sara; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H.; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K.; Renfrow, Matthew B.; Tiwari, Hemant K.
2017-01-01
Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites, and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. PMID:28239968
Nummenmaa, Lauri; Glerean, Enrico; Hari, Riitta; Hietanen, Jari K.
2014-01-01
Emotions are often felt in the body, and somatosensory feedback has been proposed to trigger conscious emotional experiences. Here we reveal maps of bodily sensations associated with different emotions using a unique topographical self-report method. In five experiments, participants (n = 701) were shown two silhouettes of bodies alongside emotional words, stories, movies, or facial expressions. They were asked to color the bodily regions whose activity they felt increasing or decreasing while viewing each stimulus. Different emotions were consistently associated with statistically separable bodily sensation maps across experiments. These maps were concordant across West European and East Asian samples. Statistical classifiers distinguished emotion-specific activation maps accurately, confirming independence of topographies across emotions. We propose that emotions are represented in the somatosensory system as culturally universal categorical somatotopic maps. Perception of these emotion-triggered bodily changes may play a key role in generating consciously felt emotions. PMID:24379370
NASA Astrophysics Data System (ADS)
Deguillaume, L.; Charbouillot, T.; Joly, M.; Vaïtilingom, M.; Parazols, M.; Marinoni, A.; Amato, P.; Delort, A.-M.; Vinatier, V.; Flossmann, A.; Chaumerliac, N.; Pichon, J. M.; Houdier, S.; Laj, P.; Sellegri, K.; Colomb, A.; Brigante, M.; Mailhot, G.
2014-02-01
Long-term monitoring of the chemical composition of clouds (73 cloud events representing 199 individual samples) sampled at the puy de Dôme (pdD) station (France) was performed between 2001 and 2011. Physicochemical parameters, as well as the concentrations of the major organic and inorganic constituents, were measured and analyzed by multicomponent statistical analysis. Along with the corresponding back-trajectory plots, this allowed for distinguishing four different categories of air masses reaching the summit of the pdD: polluted, continental, marine and highly marine. The statistical analysis led to the determination of criteria (concentrations of inorganic compounds, pH) that differentiate each category of air masses. Highly marine clouds exhibited high concentrations of Na+ and Cl-; the marine category presented lower concentration of ions but more elevated pH. Finally, the two remaining clusters were classified as "continental" and "polluted"; these clusters had the second-highest and highest levels of NH4+, NO3-, and SO42-, respectively. This unique data set of cloud chemical composition is then discussed as a function of this classification. Total organic carbon (TOC) is significantly higher in polluted air masses than in the other categories, which suggests additional anthropogenic sources. Concentrations of carboxylic acids and carbonyls represent around 10% of the organic matter in all categories of air masses and are studied for their relative importance. Iron concentrations are significantly higher for polluted air masses and iron is mainly present in its oxidation state (+II) in all categories of air masses. Finally, H2O2 concentrations are much more varied in marine and highly marine clouds than in polluted clouds, which are characterized by the lowest average concentration of H2O2. This data set provides concentration ranges of main inorganic and organic compounds for modeling purposes on multiphase cloud chemistry.
Afifi, Tracie O; Cox, Brian J; Martens, Patricia J; Sareen, Jitender; Enns, Murray W
2010-01-01
Gambling has become an increasingly common activity among women since the widespread growth of the gambling industry. Currently, our knowledge of the relationship between problem gambling among women and mental and physical correlates is limited. Therefore, important relationships between problem gambling and health and functioning, mental disorders, physical health conditions, and help-seeking behaviours among women were examined using a nationally representative Canadian sample. Data were from the nationally representative Canadian Community Health Survey Cycle 1.2 (CCHS 1.2; n = 10,056 women aged 15 years and older; data collected in 2002). The statistical analysis included binary logistic regression, multinomial logistic regression, and linear regression models. Past 12-month problem gambling was associated with a significantly higher probability of current lower general health, suicidal ideation and attempts, decreased psychological well-being, increased distress, depression, mania, panic attacks, social phobia, agoraphobia, alcohol dependence, any mental disorder, comorbidity of mental disorders, chronic bronchitis, fibromyalgia, migraine headaches, help-seeking from a professional, attending a self-help group, and calling a telephone help line (odds ratios ranged from 1.5 to 8.2). Problem gambling was associated with a broad range of negative health correlates among women. Problem gambling is an important public health concern. These findings can be used to inform healthy public policies on gambling.
Wen, Quan; Stepanyants, Armen; Elston, Guy N.; Grosberg, Alexander Y.; Chklovskii, Dmitri B.
2009-01-01
The shapes of dendritic arbors are fascinating and important, yet the principles underlying these complex and diverse structures remain unclear. Here, we analyzed basal dendritic arbors of 2,171 pyramidal neurons sampled from mammalian brains and discovered 3 statistical properties: the dendritic arbor size scales with the total dendritic length, the spatial correlation of dendritic branches within an arbor has a universal functional form, and small parts of an arbor are self-similar. We proposed that these properties result from maximizing the repertoire of possible connectivity patterns between dendrites and surrounding axons while keeping the cost of dendrites low. We solved this optimization problem by drawing an analogy with maximization of the entropy for a given energy in statistical physics. The solution is consistent with the above observations and predicts scaling relations that can be tested experimentally. In addition, our theory explains why dendritic branches of pyramidal cells are distributed more sparsely than those of Purkinje cells. Our results represent a step toward a unifying view of the relationship between neuronal morphology and function. PMID:19622738
2D Affine and Projective Shape Analysis.
Bryner, Darshan; Klassen, Eric; Huiling Le; Srivastava, Anuj
2014-05-01
Current techniques for shape analysis tend to seek invariance to similarity transformations (rotation, translation, and scale), but certain imaging situations require invariance to larger groups, such as affine or projective groups. Here we present a general Riemannian framework for shape analysis of planar objects where metrics and related quantities are invariant to affine and projective groups. Highlighting two possibilities for representing object boundaries-ordered points (or landmarks) and parameterized curves-we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parameterized curves, we also achieve the desired goal of invariance to re-parameterizations. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition.
The Impact of DSM-5 A-Criteria Changes on Parent Ratings of ADHD in Adolescents.
Sibley, Margaret H; Yeguez, Carlos E
2018-01-01
Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) A-criteria for ADHD were expanded to include new descriptors referencing adolescent and adult symptom manifestations. This study examines the effect of these changes on symptom endorsement in a sample of adolescents with ADHD (N = 259; age range = 10.72-16.70). Parent ratings were collected and Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) and DSM-5 endorsement of ADHD symptoms were compared. Under the DSM-5, there were significant increases in reported inattention, but not hyperactivity/impulsivity (H/I) symptoms, with specific elevations for certain symptoms. The average adolescent met criteria for less than one additional symptom under the DSM-5, but the correlation between ADHD symptoms and impairment was attenuated when using the DSM-5 items. Impulsivity items appeared to represent adolescent deficits better than hyperactivity items. Results were not moderated by demographic factors. In a sample of adolescents with well-diagnosed DSM-IV-TR ADHD, developmental symptom descriptors led parents to endorse slightly more symptoms of inattention, but this elevation is unlikely to be clinically meaningful.
Nelson, Sarah E.; LaBrie, Richard A.; Shaffer, Howard J.
2011-01-01
Background: The purpose of this study was to examine the relationships between types of gambling and disordered gambling, with and without controlling for gambling involvement (i.e. the number of types of games with which respondents were involved during the past 12 months). Methods: We completed a secondary data analysis of the 2007 British Gambling Prevalence Survey (BGPS), which collected data in England, Scotland and Wales between September 2006 and March 2007. The sample included 9003 residents, aged 16 or older, recruited from 10,144 randomly selected addresses. A total of 5832 households contributed at least one participant. Post-facto weighting to produce a nationally representative sample yielded 8968 observations. The BGPS included four primary types of measures: participation in gambling (during the past 12 months and during the past 7 days), disordered gambling assessments, attitudes toward gambling and descriptive information. Results: Statistically controlling for gambling involvement substantially reduced or eliminated all statistically significant relationships between types of gambling and disordered gambling. Conclusions: Gambling involvement is an important predictor of disordered gambling status. Our analysis indicates that greater gambling involvement better characterizes disordered gambling than does any specific type of gambling. PMID:19892851
Trull, Timothy J; Vergés, Alvaro; Wood, Phillip K; Jahng, Seungmin; Sher, Kenneth J
2012-10-01
We examined the latent structure underlying the criteria for DSM-IV-TR (American Psychiatric Association, 2000, Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author.) personality disorders in a large nationally representative sample of U.S. adults. Personality disorder symptom data were collected using a structured diagnostic interview from approximately 35,000 adults assessed over two waves of data collection in the National Epidemiologic Survey on Alcohol and Related Conditions. Our analyses suggested that a seven-factor solution provided the best fit for the data, and these factors were marked primarily by one or at most two personality disorder criteria sets. A series of regression analyses that used external validators tapping Axis I psychopathology, treatment for mental health problems, functioning scores, interpersonal conflict, and suicidal ideation and behavior provided support for the seven-factor solution. We discuss these findings in the context of previous studies that have examined the structure underlying the personality disorder criteria as well as the current proposals for DSM-5 personality disorders. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Profe, Jörn; Ohlendorf, Christian
2017-04-01
XRF scanning has been the state-of-the-art technique for geochemical analyses in marine and lacustrine sedimentology for more than a decade. However, little attention has been paid to data precision and technical limitations so far. Using homogenized, dried and powdered samples (certified geochemical reference standards and samples from a lithologically contrasting loess-paleosol sequence) minimizes many adverse effects that influence the XRF signal when analyzing wet sediment cores. This allows the investigation of data precision under ideal conditions and documents a new application of the XRF core-scanner technology at the same time. Reliable interpretation of XRF results requires evaluating the precision of single elements as a function of X-ray tube, measurement time, sample compaction and quality of peak fitting. Data precision was determined from ten-fold measurement of each sample and theoretically obeys Poisson statistics. Fe and Ca exhibit the largest deviations from Poisson statistics; the same elements show the lowest mean relative standard deviations, in the range from 0.5% to 1%, which represents the technical limit of data precision achievable by the installed detector. Measurement times ≥ 30 s yield mean relative standard deviations below 4% for most elements. The quality of peak fitting is only relevant for elements with overlapping fluorescence lines, such as Ba, Ti and Mn, or for elements with low concentrations, such as Y. Differences in sample compaction are marginal and do not change the mean relative standard deviation considerably. Data precision is in the range reported for geochemical reference standards measured by conventional techniques. Therefore, XRF scanning of discrete samples provides a cost- and time-efficient alternative to conventional multi-element analyses. As the best trade-off between economical operation and data quality, we recommend a measurement time of 30 s, resulting in a total scan time of 30 minutes for 30 samples.
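The Poisson benchmark used in the paper can be checked directly: for a count total N, the counting-statistics limit on the relative standard deviation is 1/sqrt(N). A sketch with simulated repeat scans (the study's ten-fold measurements would replace the simulated counts):

```python
import numpy as np

rng = np.random.default_rng(4)

# Ten repeat measurements of one element's count total (simulated here).
counts = rng.poisson(lam=40_000, size=10)

rsd_observed = counts.std(ddof=1) / counts.mean() * 100
rsd_poisson = 100 / np.sqrt(counts.mean())  # Poisson limit: sqrt(N)/N

print(f"observed RSD = {rsd_observed:.2f}%, Poisson limit = {rsd_poisson:.2f}%")
# Elements whose observed RSD sits well above the Poisson limit (as
# reported for Fe and Ca) are limited by the detector rather than by
# counting statistics alone.
```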
Reconstruction of three-dimensional porous media using a single thin section
NASA Astrophysics Data System (ADS)
Tahmasebi, Pejman; Sahimi, Muhammad
2012-06-01
The purpose of any reconstruction method is to generate realizations of two- or multiphase disordered media that honor limited data for them, with the hope that the realizations provide accurate predictions for those properties of the media for which there are no data available, or their measurement is difficult. An important example of such stochastic systems is porous media for which the reconstruction technique must accurately represent their morphology—the connectivity and geometry—as well as their flow and transport properties. Many of the current reconstruction methods are based on low-order statistical descriptors that fail to provide accurate information on the properties of heterogeneous porous media. On the other hand, due to the availability of high resolution two-dimensional (2D) images of thin sections of a porous medium, and at the same time, the high cost, computational difficulties, and even unavailability of complete 3D images, the problem of reconstructing porous media from 2D thin sections remains an outstanding unsolved problem. We present a method based on multiple-point statistics in which a single 2D thin section of a porous medium, represented by a digitized image, is used to reconstruct the 3D porous medium to which the thin section belongs. The method utilizes a 1D raster path for inspecting the digitized image, and combines it with a cross-correlation function, a grid splitting technique for deciding the resolution of the computational grid used in the reconstruction, and the Shannon entropy as a measure of the heterogeneity of the porous sample, in order to reconstruct the 3D medium. It also utilizes an adaptive technique for identifying the locations and optimal number of hard (quantitative) data points that one can use in the reconstruction process. The method is tested on high resolution images for Berea sandstone and a carbonate rock sample, and the results are compared with the data. To make the comparison quantitative, two sets of statistical tests consisting of the autocorrelation function, histogram matching of the local coordination numbers, the pore and throat size distributions, multiple-points connectivity, and single- and two-phase flow permeabilities are used. The comparison indicates that the proposed method reproduces the long-range connectivity of the porous media, with the computed properties being in good agreement with the data for both porous samples. The computational efficiency of the method is also demonstrated.
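One of the low-order descriptors the authors argue is insufficient on its own, the two-point probability function, is easy to compute from a digitized thin section; a sketch on a toy binary image (all names and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Binary "thin section": 1 = pore, 0 = solid (random toy image; in
# practice this would be the digitized 2D image of the porous sample).
img = (rng.random((256, 256)) < 0.3).astype(float)

def two_point_probability(image, max_lag):
    """S2(r): probability that two pixels a horizontal distance r apart
    are both pore."""
    s2 = []
    for r in range(max_lag + 1):
        a, b = image[:, : image.shape[1] - r], image[:, r:]
        s2.append((a * b).mean())
    return np.array(s2)

s2 = two_point_probability(img, 20)
print(s2[:5])  # s2[0] is the porosity; s2(r) decays toward porosity**2
```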
Choice of Anchor Test in Equating. Research Report. ETS RR-06-35
ERIC Educational Resources Information Center
Sinharay, Sandip; Holland, Paul
2006-01-01
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
NASA Astrophysics Data System (ADS)
Kunert, Anna Theresa; Scheel, Jan Frederik; Helleis, Frank; Klimach, Thomas; Pöschl, Ulrich; Fröhlich-Nowoisky, Janine
2016-04-01
Freezing of water above the homogeneous freezing temperature is catalyzed by ice nucleation active (INA) particles called ice nuclei (IN), which can be of various inorganic or biological origins. The freezing temperatures reach up to -1 °C for some biological samples and depend on the chemical composition of the IN. The standard method for analyzing IN in solution is the droplet freezing assay (DFA) established by Gabor Vali in 1970. Several modifications and improvements have been made within the last decades, but they are still limited by small droplet numbers, large droplet volumes, or inadequate separation of the single droplets, resulting in mutual interference and therefore improper measurements. The probability that miscellaneous IN are concentrated together in one droplet increases with the volume of the droplet, which can be described by the Poisson distribution. At a given concentration, partitioning a droplet into several smaller droplets disperses the IN more finely, resulting in better statistics and therefore a better resolution of the nucleation spectrum. We designed a new customized high-performance droplet freezing assay (HP-DFA), which represents an upgrade of previously existing DFAs in terms of temperature range and statistics. The necessity of observing freezing events at temperatures lower than homogeneous freezing, due to freezing point depression, requires high-performance thermostats combined with optimal insulation. Furthermore, we developed a cooling setup that allows both large and very small temperature changes within a short period of time. In addition, the new DFA permits the analysis of more than 750 droplets per run with a small droplet volume of 5 μL. This enables a fast and more precise analysis of biological samples with complex IN composition as well as better statistics for every sample at the same time.
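The Poisson argument underlies the standard reduction of droplet-freezing data: Vali's (1971) cumulative formula converts the unfrozen fraction at each temperature into an ice-nuclei concentration per unit sample volume. A sketch with invented counts, using the 5 μL droplet volume quoted above:

```python
import numpy as np

# Vali (1971): K(T) = -ln(N_unfrozen / N_total) / V_drop.
n_total = 768  # droplets per run (the HP-DFA handles >750)
v_drop = 5e-6  # droplet volume in litres (5 microlitres)

temps = np.array([-5.0, -10.0, -15.0, -20.0])  # degrees C
n_frozen = np.array([10, 120, 500, 760])       # cumulative frozen counts (invented)

frac_unfrozen = (n_total - n_frozen) / n_total
K = -np.log(frac_unfrozen) / v_drop            # IN per litre of sample

for t, k in zip(temps, K):
    print(f"{t:6.1f} C: {k:10.3e} IN/L")
```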
Distribution of the entomopathogenic nematodes from La Rioja (Northern Spain).
Campos-Herrera, Raquel; Escuer, Miguel; Labrador, Sonia; Robertson, Lee; Barrios, Laura; Gutiérrez, Carmen
2007-06-01
The distribution of entomopathogenic nematodes (EPNs) in natural areas and crop field edges in La Rioja (Northern Spain) was studied, taking into account environmental and physical-chemical soil factors. Five hundred soil samples from 100 sites in the most representative habitats were assayed for the presence of EPNs. The occurrence of EPNs statistically fitted a negative binomial distribution, indicating that the natural distribution of these nematodes in La Rioja was aggregated. There were no statistical differences (p ≤ 0.05) in the abundance of EPNs with respect to environmental and physical-chemical variables, although there were statistical differences in recovery frequency by altitude, annual mean air temperature and rainfall, potential vegetation series, and moisture percentage. Twenty-seven samples from 14 sites were positive for EPNs. From these samples, twenty isolates were identified to species level and fifteen strains were selected: 11 Steinernema feltiae, two S. carpocapsae and two S. kraussei strains. S. kraussei was isolated from humid soils of cool, high-altitude habitats, and S. carpocapsae occurred in heavy soils of dry and temperate habitats. S. feltiae was the most common species, found across a wide range of altitude, temperature, rainfall, pH and soil moisture, although it preferred sandy soils. The virulence of the nematode strains was assessed using G. mellonella as the insect host, recording the larval mortality percentage and the time to insect death, as well as the number of infective juveniles produced, to evaluate the reproductive potential, and the time taken to leave the insect cadaver, to determine the length of the infection cycle. The ecological trends and biological results are discussed in relation to the future use of these strains in biological control.
Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep
2015-05-01
The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.
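The workflow described above (standardization, hierarchical CA, PCA/FA) maps onto common scientific-Python tools. A sketch on a randomly generated placeholder for the 38-site by 11-parameter monitoring matrix:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)

# Placeholder data: 38 sampling sites x 11 water-quality parameters.
X = rng.normal(size=(38, 11))
X_std = StandardScaler().fit_transform(X)

# Hierarchical cluster analysis (Ward linkage) to group the sites.
clusters = fcluster(linkage(X_std, method="ward"), t=3, criterion="maxclust")

# PCA to extract latent factors and their explained variance.
pca = PCA(n_components=4).fit(X_std)

print("cluster sizes:", np.bincount(clusters)[1:])
print("explained variance ratio:", pca.explained_variance_ratio_.round(2))
```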
Crowley, Erin; Bird, Patrick; Flannery, Jonathan; Benzinger, M Joseph; Fisher, Kiel; Boyle, Megan; Huffman, Travis; Bastin, Ben; Bedinghaus, Paige; Judd, William; Hoang, Thao; Agin, James; Goins, David; Johnson, Ronald L
2014-01-01
The VIDAS UP Listeria (LPT) is an automated rapid screening enzyme phage-ligand based assay for the detection of Listeria species in human food products and environmental samples. The VIDAS LPT method was compared in a multi-laboratory collaborative study to the AOAC Official Method 993.12 (Listeria monocytogenes in Milk and Dairy Products) reference method following current AOAC guidelines. A total of 14 laboratories participated, representing government and industry, throughout the United States. One matrix, queso fresco (soft Mexican cheese), was analyzed using two different test portion sizes, 25 and 125 g. Samples representing each test portion size were artificially contaminated with Listeria species at three levels: an uninoculated control level [0 colony-forming units (CFU)/test portion], a low-inoculum level (0.2-2 CFU/test portion), and a high-inoculum level (2-5 CFU/test portion). For this evaluation, 1800 unpaired replicate test portions were analyzed by either the VIDAS LPT or AOAC 993.12. Each inoculation level was analyzed using the Probability of Detection (POD) statistical model. For the low-level inoculated test portions, a difference in collaborator POD (dLPOD) value of 0.01 (95% confidence interval: -0.10, 0.13) was obtained for both the 25 and 125 g test portions. The confidence intervals for the dLPOD values for both test portion sizes contain the point 0.0, indicating no statistically significant difference in the number of positive samples detected between the VIDAS LPT and AOAC methods. In addition to Oxford agar, VIDAS LPT test portions were confirmed using Agar Listeria Ottavani and Agosti (ALOA), a proprietary chromogenic agar for the identification and differentiation of L. monocytogenes and Listeria species. No differences were observed between the two selective agars. The VIDAS LPT method, with the optional ALOA agar confirmation, was adopted as an Official First Action method for the detection of Listeria species in a variety of foods and environmental samples.
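The POD model reduces, at each contamination level, to comparing two binomial proportions. A simplified sketch of a dLPOD calculation with a Wald-style confidence interval (the AOAC harmonized procedure computes its intervals differently; counts below are invented):

```python
import math

def dlpod_ci(x1, n1, x2, n2, z=1.96):
    """dLPOD (candidate minus reference) with an approximate 95% CI.
    If the CI contains 0.0, the methods do not differ significantly."""
    p1, p2 = x1 / n1, x2 / n2
    d = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return d, (d - z * se, d + z * se)

# Illustrative low-level counts: positives out of replicates per method.
d, ci = dlpod_ci(45, 84, 44, 84)
print(f"dLPOD = {d:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```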
Mendoza, Lucía M; Neef, Alexander; Vignolo, Graciela; Belloch, Carmela
2017-10-01
Diversity and dynamics of yeasts associated with the fermentation of Argentinian maize-based beverage chicha was investigated. Samples taken at different stages from two chicha productions were analyzed by culture-dependent and culture-independent methods. Five hundred and ninety six yeasts were isolated by classical microbiological methods and 16 species identified by RFLPs and sequencing of D1/D2 26S rRNA gene. Genetic typing of isolates from the dominant species, Saccharomyces cerevisiae, by PCR of delta elements revealed up to 42 different patterns. High-throughput sequencing (HTS) of D1/D2 26S rRNA gene amplicons from chicha samples detected more than one hundred yeast species and almost fifty filamentous fungi taxa. Analysis of the data revealed that yeasts dominated the fermentation, although, a significant percentage of filamentous fungi appeared in the first step of the process. Statistical analysis of results showed that very few taxa were represented by more than 1% of the reads per sample at any step of the process. S. cerevisiae represented more than 90% of the reads in the fermentative samples. Other yeast species dominated the pre-fermentative steps and abounded in fermented samples when S. cerevisiae was in percentages below 90%. Most yeasts species detected by pyrosequencing were not recovered by cultivation. In contrast, the cultivation-based methodology detected very few yeast taxa, and most of them corresponded with very few reads in the pyrosequencing analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Völker, Sebastian; Kistemann, Thomas
2015-01-01
Legionella spp. represent a significant health risk for humans. To ensure hygienically safe drinking water, technical guidelines recommend a central potable water hot (PWH) supply temperature of at least 60°C at the calorifier. In a clinic building we monitored whether slightly lowered temperatures in the PWH system led to a systemic change in the growth of these pathogens. In four separate phases we tested different scenarios concerning PWH supply temperatures and disinfection with chlorine dioxide (ClO2). In each phase, we took 5 sets of samples at 17 representative sampling points in the building's drinking water plumbing system. In total we collected 476 samples from the PWH system. All samples were tested (culture-based) for Legionella spp. and serogroups. Additionally, quantitative parameters at each sampling point were collected which could possibly be associated with the presence of Legionella spp. (Pseudomonas aeruginosa, heterotrophic plate count at 20°C and 36°C, temperatures, time until constant temperatures were reached, and chlorine dioxide concentration). The presence of Legionella spp. showed no significant reaction after reducing the PWH supply temperature from 63°C to 60°C and 57°C, as long as disinfection with ClO2 was maintained. After omitting the disinfectant, the PWH system showed statistically significant growth rates at 57°C. PWH temperatures permanently lowered below the recommended values should be carefully accompanied by frequent testing, a thorough evaluation of the building's drinking water plumbing system, and hygiene expertise.
Design-based Sample and Probability Law-Assumed Sample: Their Role in Scientific Investigation.
ERIC Educational Resources Information Center
Ojeda, Mario Miguel; Sahai, Hardeo
2002-01-01
Discusses some key statistical concepts in probabilistic and non-probabilistic sampling to provide an overview for understanding the inference process. Suggests a statistical model constituting the basis of statistical inference and provides a brief review of the finite population descriptive inference and a quota sampling inferential theory.…
Comparative Financial Statistics for Public Two-Year Colleges: FY 1991 National Sample.
ERIC Educational Resources Information Center
Dickmeyer, Nathan; Cirino, Anna Marie
This report provides comparative financial information derived from a national sample of 503 public two-year colleges. The report includes space for colleges to compare their institutional statistics with data provided on national sample medians; quartile data for the national sample; and statistics presented in various formats, including tables,…
Estimation of distribution overlap of urn models.
Hampton, Jerrad; Lladser, Manuel E
2012-01-01
A classical problem in statistics is estimating the expected coverage of a sample, which has had applications in gene expression, microbial ecology, optimization, and even numismatics. Here we consider a related extension of this problem to random samples of two discrete distributions. Specifically, we estimate what we call the dissimilarity probability of a sample, i.e., the probability of a draw from one distribution not being observed in n draws from another distribution. We show our estimator of dissimilarity to be a U-statistic and a uniformly minimum variance unbiased estimator of dissimilarity over the largest appropriate range of n. Furthermore, despite the non-Markovian nature of our estimator when applied sequentially over n, we show it converges uniformly in probability to the dissimilarity parameter, and we present criteria when it is approximately normally distributed and admits a consistent jackknife estimator of its variance. As proof of concept, we analyze V35 16S rRNA data to discern between various microbial environments. Other potential applications concern any situation where dissimilarity of two discrete distributions may be of interest. For instance, in SELEX experiments, each urn could represent a random RNA pool and each draw a possible solution to a particular binding site problem over that pool. The dissimilarity of these pools is then related to the probability of finding binding site solutions in one pool that are absent in the other.
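For known urns the dissimilarity parameter has a closed form, sum_i p_i (1 - q_i)^n, which a direct simulation reproduces; the paper's contribution is an unbiased U-statistic estimator that works from samples alone. A sketch of the parameter itself with invented urns:

```python
import numpy as np

rng = np.random.default_rng(7)

# Two discrete distributions over the same support (toy "urns").
p = np.array([0.5, 0.3, 0.1, 0.1])
q = np.array([0.1, 0.1, 0.4, 0.4])
n = 5  # number of draws from the second urn

# Dissimilarity: P(a draw from p is absent from n draws from q).
exact = np.sum(p * (1 - q) ** n)

# Monte Carlo check by direct simulation.
draws_p = rng.choice(len(p), size=50_000, p=p)
draws_q = rng.choice(len(q), size=(50_000, n), p=q)
mc = np.mean([d not in row for d, row in zip(draws_p, draws_q)])

print(f"exact = {exact:.4f}, simulated = {mc:.4f}")
```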
Exploratory Spectroscopy of Magnetic Cataclysmic Variables Candidates and Other Variable Objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oliveira, A. S.; Palhares, M. S.; Rodrigues, C. V.
2017-04-01
The increasing number of synoptic surveys made by small robotic telescopes, such as the photometric Catalina Real-Time Transient Survey (CRTS), provides a unique opportunity to discover variable sources and improves the statistical samples of such classes of objects. Our goal is the discovery of magnetic Cataclysmic Variables (mCVs). These are rare objects that probe interesting accretion scenarios controlled by the white-dwarf magnetic field. In particular, improved statistics of mCVs would help to address open questions on their formation and evolution. We performed an optical spectroscopy survey to search for signatures of magnetic accretion in 45 variable objects selected mostly from the CRTS. In this sample, we found 32 CVs, 22 being mCV candidates, 13 of which were previously unreported as such. If the proposed classifications are confirmed, it would represent an increase of 4% in the number of known polars and 12% in the number of known IPs. A fraction of our initial sample was classified as extragalactic sources or other types of variable stars by the inspection of the identification spectra. Despite the inherent complexity in identifying a source as an mCV, variability-based selection, followed by spectroscopic snapshot observations, has proved to be an efficient strategy for their discoveries, being a relatively inexpensive approach in terms of telescope time.
Kruse, Johannes; Schmitz, Norbert; Thefeld, Wolfgang
2003-06-01
To determine the relationship between mental disorders and diabetes in a representative community sample. This was a cross-sectional study. Data on diabetes and HbA(1c) values were obtained by structured questionnaires and by laboratory assessments. Current psychiatric disorders were diagnosed by a modified version of the Composite International Diagnostic Interview (CIDI). People with diabetes (PWD) were not more likely to meet Diagnostic and Statistical Manual of Psychiatric Disorders, 4th edition (DSM-IV) criteria for at least one mental disorder than were individuals without diabetes. However, a different diagnostic pattern occurred compared with the general population: odds ratios (ORs) for anxiety disorders in PWD were higher (OR 1.93, 95% CI 1.19-3.14). Although PWD had higher prevalence rates of affective disorders, the relationship between diabetes and affective disorders was not statistically significant after controlling for age, sex, marital status, and socioeconomic status. In contrast, the relationship between diabetes and anxiety disorders remained significant after controlling for these variables. In contrast to individuals without mental disorders, PWD with affective or anxiety disorders more frequently had adequate glycemic control. Diabetes was associated with an increased likelihood of anxiety disorders. The association between mental disorders, diabetes, and glycemic control should be evaluated carefully in terms of potentially confounding sociodemographic variables, sample characteristics, and definitions of the disorders.
Relationship of children's salivary microbiota with their caries status: a pyrosequencing study.
Gomar-Vercher, S; Cabrera-Rubio, R; Mira, A; Montiel-Company, J M; Almerich-Silla, J M
2014-12-01
Different dental caries statuses may be related to alterations in the oral microbiota. Previous studies have collected saliva as a representative medium of the oral ecosystem. The purpose of this study was to assess the composition of the oral microbiota and its relation to the presence of dental caries at different degrees of severity. One hundred ten saliva samples from 12-year-old children were taken and divided into six groups defined in strict accordance with their dental caries prevalence according to the International Caries Detection and Assessment System II criteria. These samples were studied by pyrosequencing PCR products of the 16S ribosomal RNA gene. The results showed statistically significant intergroup differences at the class and genus taxonomic levels. Streptococcus was the most frequent genus in all groups, although it did not show intergroup statistical differences. In patients with cavities, Porphyromonas and Prevotella showed increased percentages compared to healthy individuals. Bacterial diversity diminished as the severity of the disease increased, so patients with more advanced stages of caries presented less bacterial diversity than healthy subjects. Although microbial composition tended to be different, the intragroup variation is large, as evidenced by the lack of clear intragroup clustering in principal component analyses. Thus, no clear differences were found, indicating that using bacterial composition as the sole source of biomarkers for dental caries may not be reliable in the unstimulated saliva samples used in the current study.
NASA Astrophysics Data System (ADS)
Panko, Julie M.; Chu, Jennifer; Kreider, Marisa L.; Unice, Ken M.
2013-06-01
In addition to industrial facilities, fuel combustion, forest fires and dust erosion, exhaust and non-exhaust vehicle emissions are an important source of ambient air respirable particulate matter (PM10). Non-exhaust vehicle emissions are formed from wear particles of vehicle components such as brakes, clutches, chassis and tires. Although the non-exhaust particles are relatively minor contributors to the overall ambient air particulate load, reliable exposure estimates are few. In this study, a global sampling program was conducted to quantify tire and road wear particles (TRWP) in the ambient air in order to understand potential human exposures and the overall contribution of these particles to the PM10. The sampling was conducted in Europe, the United States and Japan and the sampling locations were selected to represent a variety of settings including both rural and urban core; and within each residential, commercial and recreational receptors. The air samples were analyzed using validated chemical markers for rubber polymer based on a pyrolysis technique. Results indicated that TRWP concentrations in the PM10 fraction were low with averages ranging from 0.05 to 0.70 μg m^-3, representing an average PM10 contribution of 0.84%. The TRWP concentration in air was associated with traffic load and population density, but the trend was not statistically significant. Further, significant differences across days were not observed. This study provides a robust dataset to understand potential human exposures to airborne TRWP.
Subsampling for dataset optimisation
NASA Astrophysics Data System (ADS)
Ließ, Mareike
2017-04-01
Soil-landscapes have formed by the interaction of soil-forming factors and pedogenic processes. In modelling these landscapes in their pedodiversity and the underlying processes, a representative unbiased dataset is required. This concerns model input as well as output data. However, very often big datasets are available which are highly heterogeneous and were gathered for various purposes, but not to model a particular process or data space. As a first step, the overall data space and/or landscape section to be modelled needs to be identified including considerations regarding scale and resolution. Then the available dataset needs to be optimised via subsampling to well represent this n-dimensional data space. A couple of well-known sampling designs may be adapted to suit this purpose. The overall approach follows three main strategies: (1) the data space may be condensed and de-correlated by a factor analysis to facilitate the subsampling process. (2) Different methods of pattern recognition serve to structure the n-dimensional data space to be modelled into units which then form the basis for the optimisation of an existing dataset through a sensible selection of samples. Along the way, data units for which there is currently insufficient soil data available may be identified. And (3) random samples from the n-dimensional data space may be replaced by similar samples from the available dataset. While being a presupposition to develop data-driven statistical models, this approach may also help to develop universal process models and identify limitations in existing models.
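One way to realize strategy (2), structuring the data space and then selecting samples, is to cluster the covariate space and keep the nearest available sample to each cluster centre. A sketch under that assumption (k-means is one of several usable pattern-recognition methods; data are random placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)

# Large heterogeneous dataset in a 5-dimensional covariate space.
big = rng.normal(size=(10_000, 5))

# Structure the data space into k units, then keep the sample nearest
# each centroid so the subsample spans the whole space.
k = 100  # target subsample size
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(big)

subsample_idx = []
for c in range(k):
    members = np.flatnonzero(km.labels_ == c)
    dists = np.linalg.norm(big[members] - km.cluster_centers_[c], axis=1)
    subsample_idx.append(members[np.argmin(dists)])

subsample = big[subsample_idx]
print(subsample.shape)  # (100, 5): one representative sample per unit
```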
Development of quantitative screen for 1550 chemicals with GC-MS.
Bergmann, Alan J; Points, Gary L; Scott, Richard P; Wilson, Glenn; Anderson, Kim A
2018-05-01
With hundreds of thousands of chemicals in the environment, effective monitoring requires high-throughput analytical techniques. This paper presents a quantitative screening method for 1550 chemicals based on statistical modeling of responses, with identification and integration performed using deconvolution reporting software. The method was evaluated with representative environmental samples. We tested biological extracts, low-density polyethylene, and silicone passive sampling devices spiked with known concentrations of 196 representative chemicals. A multiple linear regression (R² = 0.80) was developed with molecular weight, logP, polar surface area, and fractional ion abundance to predict chemical responses within a factor of 2.5. Linearity beyond the calibration had R² > 0.97 for three orders of magnitude. Median limits of quantitation were estimated to be 201 pg/μL (1.9× standard deviation). The number of detected chemicals and the accuracy of quantitation were similar for environmental samples and standard solutions. To our knowledge, this is the most precise method for the largest number of semi-volatile organic chemicals lacking authentic standards. Accessible instrumentation and software make this method cost effective in quantifying a large, customizable list of chemicals. When paired with silicone wristband passive samplers, this quantitative screen will be very useful for epidemiology where binning of concentrations is common. Graphical abstract: A multiple linear regression of chemical responses measured with GC-MS allowed quantitation of 1550 chemicals in samples such as silicone wristbands.
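The core of the screen is an ordinary multiple linear regression from molecular descriptors to (log) instrument response, fit on chemicals with authentic standards and then applied to chemicals without them. A sketch with simulated descriptors and an invented response relationship, not the paper's fitted coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)

# Descriptors for 196 calibration chemicals: molecular weight, logP,
# polar surface area, fractional ion abundance (all simulated here).
X = np.column_stack([
    rng.uniform(100, 500, 196),   # molecular weight
    rng.uniform(0, 8, 196),       # logP
    rng.uniform(0, 150, 196),     # polar surface area
    rng.uniform(0.05, 1.0, 196),  # fractional ion abundance
])
log_response = (0.002 * X[:, 0] + 0.15 * X[:, 1]
                - 0.004 * X[:, 2] + 1.2 * X[:, 3]
                + rng.normal(0, 0.3, 196))  # invented relationship

model = LinearRegression().fit(X, log_response)
print(f"R^2 = {model.score(X, log_response):.2f}")

# Predict the response of a chemical lacking an authentic standard.
new_chemical = np.array([[250.0, 3.1, 40.0, 0.6]])
print(f"predicted log response = {model.predict(new_chemical)[0]:.2f}")
```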
Temporal Wind Pairs for Space Launch Vehicle Capability Assessment and Risk Mitigation
NASA Technical Reports Server (NTRS)
Decker, Ryan K.; Barbre, Robert E., Jr.
2015-01-01
Space launch vehicles incorporate upper-level wind assessments to determine wind effects on the vehicle and for a commit to launch decision. These assessments make use of wind profiles measured hours prior to launch and may not represent the actual wind the vehicle will fly through. Uncertainty in the winds over the time period between the assessment and launch introduces uncertainty in assessment of vehicle controllability and structural integrity that must be accounted for to ensure launch safety. Temporal wind pairs are used in engineering development of allowances to mitigate uncertainty. Five sets of temporal wind pairs at various times (0.75, 1.5, 2, 3 and 4-hrs) at the United States Air Force Eastern Range and Western Range, as well as the National Aeronautics and Space Administration's Wallops Flight Facility are developed for use in upper-level wind assessments on vehicle performance. Historical databases are compiled from balloon-based and vertically pointing Doppler radar wind profiler systems. Various automated and manual quality control procedures are used to remove unacceptable profiles. Statistical analyses on the resultant wind pairs from each site are performed to determine if the observed extreme wind changes in the sample pairs are representative of extreme temporal wind change. Wind change samples in the Eastern Range and Western Range databases characterize extreme wind change. However, the small sample sizes in the Wallops Flight Facility databases yield low confidence that the sample population characterizes extreme wind change that could occur.
Recruitment of Older Adults: Success May Be in the Details
McHenry, Judith C.; Insel, Kathleen C.; Einstein, Gilles O.; Vidrine, Amy N.; Koerner, Kari M.; Morrow, Daniel G.
2015-01-01
Purpose: Describe recruitment strategies used in a randomized clinical trial of a behavioral prospective memory intervention to improve medication adherence for older adults taking antihypertensive medication. Results: Recruitment strategies represent 4 themes: accessing an appropriate population, communication and trust-building, providing comfort and security, and expressing gratitude. Recruitment activities resulted in 276 participants with a mean age of 76.32 years, and study enrollment included 207 women, 69 men, and 54 persons representing ethnic minorities. Recruitment success was linked to cultivating relationships with community-based organizations, face-to-face contact with potential study participants, and providing service (e.g., blood pressure checks) as an access point to eligible participants. Seventy-two percent of potential participants who completed a follow-up call and met eligibility criteria were enrolled in the study. The attrition rate was 14.34%. Implications: The projected increase in the number of older adults intensifies the need to study interventions that improve health outcomes. The challenge is to recruit sufficient numbers of participants who are also representative of older adults to test these interventions. Failing to recruit a sufficient and representative sample can compromise statistical power and the generalizability of study findings. PMID:22899424
Who Governs Federally Qualified Health Centers?
Wright, Brad
2017-01-01
To make them more responsive to their community's needs, federally qualified health centers (FQHCs) are required to have a governing board composed of at least 51% consumers. However, the extent to which consumer board members actually resemble the typical FQHC patient has not been assessed, which, according to the political science literature on representation, may influence the board's ability to represent the community. This mixed-methods study uses four years of data from the Health Resources and Services Administration, combined with Uniform Data System, Bureau of Labor Statistics, and Area Resource File data, to describe and identify factors associated with the composition of FQHC governing boards. Board members are classified into one of three groups: non-consumers, non-representative consumers (who do not resemble the typical FQHC patient), and representative consumers (who resemble the typical FQHC patient). The analysis finds that a minority of board members are representative consumers, and telephone interviews with a stratified random sample of 30 FQHC board members confirmed the existence of significant socioeconomic gaps between consumer board members and FQHC patients. This may make FQHCs less responsive to the needs of the predominantly low-income communities they serve. PMID:23052684
CALIFA: a diameter-selected sample for an integral field spectroscopy galaxy survey
NASA Astrophysics Data System (ADS)
Walcher, C. J.; Wisotzki, L.; Bekeraité, S.; Husemann, B.; Iglesias-Páramo, J.; Backsmann, N.; Barrera Ballesteros, J.; Catalán-Torrecilla, C.; Cortijo, C.; del Olmo, A.; Garcia Lorenzo, B.; Falcón-Barroso, J.; Jilkova, L.; Kalinova, V.; Mast, D.; Marino, R. A.; Méndez-Abreu, J.; Pasquali, A.; Sánchez, S. F.; Trager, S.; Zibetti, S.; Aguerri, J. A. L.; Alves, J.; Bland-Hawthorn, J.; Boselli, A.; Castillo Morales, A.; Cid Fernandes, R.; Flores, H.; Galbany, L.; Gallazzi, A.; García-Benito, R.; Gil de Paz, A.; González-Delgado, R. M.; Jahnke, K.; Jungwiert, B.; Kehrig, C.; Lyubenova, M.; Márquez Perez, I.; Masegosa, J.; Monreal Ibero, A.; Pérez, E.; Quirrenbach, A.; Rosales-Ortega, F. F.; Roth, M. M.; Sanchez-Blazquez, P.; Spekkens, K.; Tundo, E.; van de Ven, G.; Verheijen, M. A. W.; Vilchez, J. V.; Ziegler, B.
2014-09-01
We describe and discuss the selection procedure and statistical properties of the galaxy sample used by the Calar Alto Legacy Integral Field Area (CALIFA) survey, a public legacy survey of 600 galaxies using integral field spectroscopy. The CALIFA "mother sample" was selected from the Sloan Digital Sky Survey (SDSS) DR7 photometric catalogue to include all galaxies with an r-band isophotal major axis between 45'' and 79.2'' and with a redshift 0.005 < z < 0.03. The mother sample contains 939 objects, 600 of which will be observed in the course of the CALIFA survey. The selection of targets for observations is based solely on visibility and thus keeps the statistical properties of the mother sample. By comparison with a large set of SDSS galaxies, we find that the CALIFA sample is representative of galaxies over a luminosity range of -19 > M_r > -23.1 and over a stellar mass range between 10^9.7 and 10^11.4 M⊙. In particular, within these ranges, the diameter selection does not lead to any significant bias against - or in favour of - intrinsically large or small galaxies. Only below luminosities of M_r = -19 (or stellar masses <10^9.7 M⊙) is there a prevalence of galaxies with larger isophotal sizes, especially of nearly edge-on late-type galaxies, but such galaxies form <10% of the full sample. We estimate volume-corrected distribution functions in luminosities and sizes and show that these are statistically fully compatible with estimates from the full SDSS when accounting for large-scale structure. For full characterization of the sample, we also present a number of value-added quantities determined for the galaxies in the CALIFA sample. These include consistent multi-band photometry based on growth curve analyses; stellar masses; distances and quantities derived from these; morphological classifications; and an overview of available multi-wavelength photometric measurements. We also explore different ways of characterizing the environments of CALIFA galaxies, finding that the sample covers environmental conditions from the field to genuine clusters. We finally consider the expected incidence of active galactic nuclei among CALIFA galaxies given the existing pre-CALIFA data, finding that the final observed CALIFA sample will contain approximately 30 Sey2 galaxies. Based on observations collected at the Centro Astronómico Hispano Alemán (CAHA) at Calar Alto, operated jointly by the Max Planck Institute for Astronomy and the Instituto de Astrofísica de Andalucía (CSIC). Publicly released data products from CALIFA are made available on the webpage http://www.caha.es/CALIFA
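The diameter and redshift cuts quoted above translate directly into a catalogue filter. Here is a minimal sketch with invented column names and toy rows (isoA_r stands in for the SDSS isophotal major axis):

```python
import pandas as pd

# Hypothetical SDSS-like catalogue: isophotal major axis (arcsec) and redshift
catalogue = pd.DataFrame({
    "objid":  [1, 2, 3, 4],
    "isoA_r": [50.1, 40.0, 78.0, 90.0],
    "z":      [0.012, 0.010, 0.028, 0.020],
})

# Mother-sample cuts: 45'' < isoA_r < 79.2'' and 0.005 < z < 0.03
mother = catalogue[
    catalogue["isoA_r"].between(45.0, 79.2)
    & catalogue["z"].between(0.005, 0.03)
]
print(mother)   # objects 1 and 3 survive the diameter + redshift selection
```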
General aviation activity and avionics survey. Annual report for CY81
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schwenk, J.C.; Carter, P.W.
1982-12-01
This report presents the results and a description of the 1981 General Aviation Activity and Avionics Survey. The survey was conducted during 1982 by the FAA to obtain information on the activity and avionics of the United States registered general aviation aircraft fleet, the dominant component of civil aviation in the U.S. The survey was based on a statistically selected sample of about 8.9 percent of the general aviation fleet and obtained a response rate of 61 percent. Survey results are based upon response but are expanded upward to represent the total population. Survey results revealed that during 1981 an estimated 40.7 million hours of flying time were logged by the 213,226 active general aviation aircraft in the U.S. fleet, yielding a mean annual flight time per aircraft of 188.1 hours. The active aircraft represented about 83 percent of the registered general aviation fleet. The report contains breakdowns of these and other statistics by manufacturer/model group, aircraft type, state and region of based aircraft, and primary use. Also included are fuel consumption, lifetime airframe hours, avionics, and engine hours estimates. In addition, tables are included for detailed analysis of the avionics capabilities of the GA fleet.
msap: a tool for the statistical analysis of methylation-sensitive amplified polymorphism data.
Pérez-Figueroa, A
2013-05-01
In this study, msap, an R package that analyses methylation-sensitive amplified polymorphism (MSAP or MS-AFLP) data, is presented. The program provides a deep analysis of epigenetic variation starting from a binary data matrix indicating the banding pattern between the isoschizomeric endonucleases HpaII and MspI, which have differential sensitivity to cytosine methylation. After comparing the restriction fragments, the program determines if each fragment is susceptible to methylation (representative of epigenetic variation) or if there is no evidence of methylation (representative of genetic variation). The package provides, in a user-friendly command line interface, a pipeline of different analyses of the variation (genetic and epigenetic) among user-defined groups of samples, as well as the classification of the methylation occurrences in those groups. Statistical testing provides support to the analyses. A comprehensive report of the analyses and several useful plots can help researchers to assess the epigenetic and genetic variation in their MSAP experiments. msap is downloadable from CRAN (http://cran.r-project.org/) and its own webpage (http://msap.r-forge.R-project.org/). The package is intended to be easy to use even for those unfamiliar with the R command line environment. Advanced users may take advantage of the available source code to adapt msap to more complex analyses. © 2013 Blackwell Publishing Ltd.
Probabilistic liver atlas construction.
Dura, Esther; Domingo, Juan; Ayala, Guillermo; Marti-Bonmati, Luis; Goceri, E
2017-01-13
Anatomical atlases are 3D volumes or shapes representing an organ or structure of the human body. They contain either the prototypical shape of the object of interest together with other shapes representing its statistical variations (statistical atlas) or a probability map of belonging to the object (probabilistic atlas). Probabilistic atlases are mostly built with simple estimations involving only the data at each spatial location. A new method for probabilistic atlas construction that uses a generalized linear model is proposed. This method aims to improve the estimation of the probability that a given location is covered by the liver. Furthermore, all methods to build an atlas involve prior coregistration of the available sample of shapes. The influence of the geometrical transformation adopted for registration on the quality of the final atlas has not been sufficiently investigated. The ability of an atlas to adapt to a new case is one of the most important quality criteria that should be taken into account. The presented experiments show that some methods for atlas construction are severely affected by the previous coregistration step. We show the good performance of the new approach. Furthermore, results suggest that extremely flexible registration methods are not always beneficial, since they can reduce the variability of the atlas and hence its ability to give sensible probability values when used as an aid in segmentation of new cases.
Hinney, Barbara; Gottwald, Michaela; Moser, Jasmine; Reicher, Bianca; Schäfer, Bhavapriya Jasmin; Schaper, Roland; Joachim, Anja; Künzel, Frank
2017-10-15
Several endoparasites of dogs cannot only be detrimental to their primary host but might also represent a threat to human health because of their zoonotic potential. Due to their high dog population densities, metropolitan areas can be highly endemic for such parasites. We aimed to estimate the prevalence of endoparasites in dogs in the Austrian capital of Vienna by examining a representative number of canine faecal samples and to compare the prevalences with two neighbouring peri-urban and rural regions. In addition we analysed whether the density of dog populations and cleanliness of dog zones correlated with parasite occurrence. We collected 1001 anonymous faecal samples from 55 dog zones from all 23 districts of the federal state of Vienna, as well as 480 faecal samples from the Mödling district and Wolkersdorf with a peri-urban and rural character, respectively. Faeces were examined by flotation and by Baermann technique. Additionally we evaluated 292 Viennese, 102 peri-urban and 50 rural samples for Giardia and Cryptosporidium by GiardiaFASTest ® and CryptoFASTest ® . Samples from "clean" dog zones were compared to samples from "dirty" zones. The infection rate of Toxocara was surprisingly low, ranging from 0.6% to 1.9%. Trichuris was the most frequent helminth (1.8-7.5%) and Giardia the most frequent protozoan (4.0-10.8%). Ancylostomatidae, Crenosoma, Capillaria, Taeniidae, Cystoisospora and Sarcocystis were found in 1.8-2.2%, 0-0.9%, 0-0.9%, 0-0.6%, 0.3-3.1% and 0-0.6% of the samples, respectively. Samples from "dirty" dog zones in Vienna showed a significantly higher rate of parasites overall (p=0.003) and of Trichuris (p=0.048) compared to samples from "clean" dog zones. There were no statistically significant differences in densely vs. less densely populated areas of Vienna. Samples from the rural region of Wolkersdorf had significantly higher overall parasite, Trichuris and Cystoisospora prevalences than the peri-urban Mödling district and Vienna (p=0.000-0.039), while samples from the Mödling district had a significantly higher Giardia, Crenosoma and Capillaria prevalence than those from Vienna (p=0.002-0.047). Parasite excretion is dynamic and representative sampling and monitoring are necessary for parasite surveillance. Dog owners should be informed about the zoonotic risk and encouraged to remove dog faeces and dispose of them properly to reduce the infection risk for both other dogs and humans. Copyright © 2017 Elsevier B.V. All rights reserved.
Yang, Yi; Tokita, Midori; Ishiguchi, Akira
2018-01-01
A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been studied well in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollypops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks with size and orientation domains. We found that the performances in the mean and variance discrimination tasks were not correlated with each other and demonstrated that extractions of the mean and variance are mediated by different representation mechanisms. In addition, we tested the relationship between performances in size and orientation domains for each summary statistic (i.e. mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed. PMID:29399318
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 2 2011-01-01 2011-01-01 false Statistical sampling procedures for lot inspection of processed fruits and vegetables by attributes. 52.38c Section 52.38c Agriculture Regulations of the... Regulations Governing Inspection and Certification Sampling § 52.38c Statistical sampling procedures for lot...
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 2 2011-01-01 2011-01-01 false Statistical sampling procedures for on-line inspection by attributes of processed fruits and vegetables. 52.38b Section 52.38b Agriculture Regulations of... Regulations Governing Inspection and Certification Sampling § 52.38b Statistical sampling procedures for on...
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Statistical sampling procedures for on-line inspection by attributes of processed fruits and vegetables. 52.38b Section 52.38b Agriculture Regulations of... Regulations Governing Inspection and Certification Sampling § 52.38b Statistical sampling procedures for on...
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Statistical sampling procedures for lot inspection of processed fruits and vegetables by attributes. 52.38c Section 52.38c Agriculture Regulations of the... Regulations Governing Inspection and Certification Sampling § 52.38c Statistical sampling procedures for lot...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palmintier, Bryan S; Bugbee, Bruce; Gotseff, Peter
Capturing technical and economic impacts of solar photovoltaics (PV) and other distributed energy resources (DERs) on electric distribution systems can require high-time-resolution (e.g. 1 minute), long-duration (e.g. 1 year) simulations. However, such simulations can be computationally prohibitive, particularly when including complex control schemes in quasi-steady-state time series (QSTS) simulation. Various approaches have been used in the literature to down-select representative time segments (e.g. days), but typically these are best suited for lower time resolutions or consider only a single data stream (e.g. PV production) for selection. We present a statistical approach that combines stratified sampling and bootstrapping to select representative days while also providing a simple method to reassemble annual results. We describe the approach in the context of a recent study with a utility partner. This approach enables much faster QSTS analysis by simulating only a subset of days, while maintaining accurate annual estimates.
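A minimal sketch of the day-selection idea: stratify days on a selection feature, sample a few days per stratum, and reassemble an annual estimate by weighting each sampled day by the number of days its stratum represents. The feature, strata count, and weighting are illustrative assumptions, and the bootstrap step the study uses for uncertainty estimates is omitted here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical daily selection feature (e.g., daily PV energy) for one year
daily_pv = rng.gamma(shape=4.0, scale=2.0, size=365)

# 1. Stratify days into quantile bins of the feature
n_strata, n_per_stratum = 5, 2
edges = np.quantile(daily_pv, np.linspace(0, 1, n_strata + 1))
strata = np.clip(np.digitize(daily_pv, edges[1:-1]), 0, n_strata - 1)

# 2. Sample a few representative days from each stratum
chosen = np.concatenate([
    rng.choice(np.flatnonzero(strata == s), n_per_stratum, replace=False)
    for s in range(n_strata)
])

# 3. Reassemble an annual estimate: each sampled day stands in for the
#    other days of its stratum, so weight it by stratum size / sample size
weights = np.array([np.sum(strata == strata[d]) / n_per_stratum for d in chosen])
annual_estimate = np.sum(daily_pv[chosen] * weights)
print(f"true annual total {daily_pv.sum():.0f}, estimate {annual_estimate:.0f}")
```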
75 FR 38871 - Proposed Collection; Comment Request for Revenue Procedure 2004-29
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-06
... comments concerning Revenue Procedure 2004-29, Statistical Sampling in Sec. 274 Context. DATES: Written... Internet, at [email protected] . SUPPLEMENTARY INFORMATION: Title: Statistical Sampling in Sec...: Revenue Procedure 2004-29 prescribes the statistical sampling methodology by which taxpayers under...
Crossing statistics of laser light scattered through a nanofluid.
Arshadi Pirlar, M; Movahed, S M S; Razzaghi, D; Karimzadeh, R
2017-09-01
In this paper, we investigate the crossing statistics of speckle patterns formed in the Fresnel diffraction region by a laser beam scattering through a nanofluid. We extend zero-crossing statistics to assess the dynamical properties of the nanofluid. According to the joint probability density function of laser beam fluctuation and its time derivative, the theoretical frameworks for Gaussian and non-Gaussian regimes are revisited. We count the number of crossings not only at zero level but also for all available thresholds to determine the average speed of moving particles. Using a probabilistic framework in determining crossing statistics, a priori Gaussianity is not essentially considered; therefore, even in the presence of deviation from Gaussian fluctuation, this modified approach is capable of computing relevant quantities, such as mean value of speed, more precisely. Generalized total crossing, which represents the weighted summation of crossings for all thresholds to quantify small deviation from Gaussian statistics, is introduced. This criterion can also manipulate the contribution of noises and trends to infer reliable physical quantities. The characteristic time scale for having successive crossings at a given threshold is defined. In our experimental setup, we find that increasing sample temperature leads to more consistency between Gaussian and perturbative non-Gaussian predictions. The maximum number of crossings does not necessarily occur at mean level, indicating that we should take into account other levels in addition to zero level to achieve more accurate assessments.
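To make the threshold-crossing idea concrete, the sketch below counts crossings of a fluctuating signal at a grid of levels and forms a weighted sum over all thresholds. The surrogate signal and the weighting function are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def crossings(signal, level):
    """Count crossings of `signal` through a given threshold level."""
    above = signal > level
    return np.count_nonzero(above[1:] != above[:-1])

rng = np.random.default_rng(1)
# Surrogate intensity fluctuation: smoothed Gaussian noise
x = np.convolve(rng.normal(size=5000), np.ones(25) / 25, mode="valid")

levels = np.linspace(x.min(), x.max(), 51)
counts = np.array([crossings(x, lv) for lv in levels])

# Weighted sum of crossings over all levels, in the spirit of the
# "generalized total crossing"; this particular weight is hypothetical
weights = np.abs(levels - x.mean())
total = np.sum(weights * counts)
print(f"crossings at mean level: {crossings(x, x.mean())}, weighted total: {total:.1f}")
```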
Time Series Analysis Based on Running Mann Whitney Z Statistics
USDA-ARS?s Scientific Manuscript database
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
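A hedged sketch of the running Mann-Whitney idea: slide a window along the series, compare the window against the remaining data with a Mann-Whitney U test, and convert U to Z. The window-versus-rest comparison and the normal approximation for the U-to-Z step are assumptions here; the abstract indicates the original method normalizes via Monte Carlo instead.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def running_mw_z(series, window):
    """Z statistic of each moving window vs. the rest of the series."""
    series = np.asarray(series)
    z = np.full(series.size, np.nan)
    for start in range(series.size - window + 1):
        inside = series[start:start + window]
        outside = np.delete(series, np.s_[start:start + window])
        u, _ = mannwhitneyu(inside, outside, alternative="two-sided")
        n1, n2 = inside.size, outside.size
        mu = n1 * n2 / 2.0                                # mean of U
        sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)   # sd of U (no ties)
        z[start + window // 2] = (u - mu) / sigma         # centre of window
    return z

t = np.arange(200)
y = np.where(t < 120, 0.0, 1.5) + np.random.default_rng(2).normal(size=200)
print(np.nanmax(np.abs(running_mw_z(y, 30))))   # peaks near the shift at t=120
```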
Results of the Excreta Bioassay Quality Control Program for April 1, 2009 through March 31, 2010
DOE Office of Scientific and Technical Information (OSTI.GOV)
Antonio, Cheryl L.
2012-07-19
A total of 58 urine samples and 10 fecal samples were submitted during the report period (April 1, 2009 through March 31, 2010) to General Engineering Laboratories, South Carolina by the Hanford Internal Dosimetry Program (IDP) to check the accuracy, precision, and detection levels of their analyses. Urine analyses for Sr, 238Pu, 239Pu, 241Am, 243Am, 235U, 238U, and elemental uranium and fecal analyses for 241Am, 238Pu and 239Pu were tested this year, as well as four tissue samples for 238Pu, 239Pu, 241Am and 241Pu. The number of QC urine samples submitted during the report period represented 1.3% of the total samples submitted. In addition to the samples provided by IDP, GEL was also required to conduct their own QC program, and submit the results of analyses to IDP. About 33% of the analyses processed by GEL during the third year of this contract were quality control samples. GEL tested the performance of 21 radioisotopes, all of which met or exceeded the specifications in the Statement of Work within statistical uncertainty (Table 4).
Results of The Excreta Bioassay Quality Control Program For April 1, 2010 Through March 31, 2011
DOE Office of Scientific and Technical Information (OSTI.GOV)
Antonio, Cheryl L.
2012-07-19
A total of 76 urine samples and 10 spiked fecal samples were submitted during the report period (April 1, 2010 through March 31, 2011) to GEL Laboratories, LLC in South Carolina by the Hanford Internal Dosimetry Program (IDP) to check the accuracy, precision, and detection levels of their analyses. Urine analyses for 14C, Sr, 238Pu, 239Pu, 241Am, 243Am, 235U, 238U, 238U-mass and fecal analyses for 241Am, 238Pu and 239Pu were tested this year. The number of QC urine samples submitted during the report period represented 1.1% of the total samples submitted. In addition to the samples provided by IDP, GEL was also required to conduct their own QC program, and submit the results of analyses to IDP. About 31% of the analyses processed by GEL during the first year of contract 112512 were quality control samples. GEL tested the performance of 23 radioisotopes, all of which met or exceeded the specifications in the Statement of Work within statistical uncertainty, except the slightly elevated relative bias for 243,244Cm (Table 4).
Kim, Il-Ho; Muntaner, Carles; Khang, Young-Ho; Paek, Domyung; Cho, Sung-Il
2006-08-01
In light of escalating job insecurity due to increasing numbers of nonstandard workers, this study examined the association between nonstandard employment and mental health among South Korean workers. We analyzed a representative weighted sample of 2086 men and 1194 women aged 20-64 years, using data from the 1998 Korean National Health and Nutrition Examination Survey. Nonstandard employment included part-time work, temporary work, and daily work. Mental health was measured with indicators of self-reported depression and suicidal ideation. Based on age-adjusted prevalence estimates, nonstandard employees were more likely to report poor mental health than standard employees. Furthermore, nonstandard work status was associated with poor mental health after adjusting for socioeconomic position (education, occupational class, and income) and health behaviors (smoking, alcohol consumption, and exercise). However, the pattern of the relationship between nonstandard work and mental health differed by gender. Female gender was significantly associated with poor mental health. Although males tended to report more suicidal ideation, this difference was not statistically significant. Considering the increasing prevalence of nonstandard working conditions in South Korea, the results call for more longitudinal research on the mental health effects of nonstandard work.
Microplastic in the gastrointestinal tract of fishes along the Saudi Arabian Red Sea coast.
Baalkhuyur, Fadiyah M; Bin Dohaish, El-Jawaher A; Elhalwagy, Manal E A; Alikunhi, Nabeel M; AlSuwailem, Abdulaziz M; Røstad, Anders; Coker, Darren J; Berumen, Michael L; Duarte, Carlos M
2018-06-01
This study assesses the presence of microplastic litter in the gastrointestinal tract contents of 26 commercial and non-commercial fish species from four different habitats sampled along the Saudi Arabian coast of the Red Sea. A total of 178 individuals were examined for microplastics. In total, 26 microplastic fragments were found; of these, 16 were films (61.5%) and 10 were fishing threads (38.5%). FTIR analysis revealed that the most abundant polymers were polypropylene and polyethylene. The grouper (Epinephelus spp.) sampled at Jazan registered the highest number of ingested microplastics. This fish species is benthic and feeds on benthic invertebrates. Although differences in the abundance of microplastic ingestion among species were not statistically significant, a significant change was observed when the level of ingestion of microplastic particles was compared among the habitats. The higher abundance of microplastic particles may be related to the habitats of fish and the presence of microplastic debris near the seabed. The results of this study provide first evidence that microplastic pollution is an emerging threat to Red Sea fishes, their food web and human consumers. Copyright © 2018 Elsevier Ltd. All rights reserved.
Riedel-Heller, S G; Schork, A; Matschinger, H; Angermeyer, M C
2000-02-01
Given the growing clinical interest in early indicators of dementia, numerous studies have examined the association between subjective memory complaints and cognitive performance in old age. Their results are contradictory. In this paper, studies carried out over the last 10 years are compared with regard to the study design and the assessment instruments used. The results are discussed with particular reference to the diagnostic validity of subjective memory complaints. The majority of case-control studies and cross-sectional studies of non-representative samples could not demonstrate an association between subjective memory complaints and cognitive performance. Most field studies of larger representative population samples, however, have come to the opposite conclusion. A consistent assessment of these statistically significant associations against the background of diagnostic validity showed that memory complaints cannot be taken as a clear clinical indicator of cognitive impairment. Subjective memory complaints may reflect depressive disorders and a multitude of other processes, of which an objective impairment of cognitive performance is just one aspect. As a consequence, the inclusion of subjective memory complaints as a diagnostic criterion for the diagnosis of "mild cognitive disorder" according to ICD-10 is not justified.
Hoy, Madita; Strauß, Bernhard; Kröger, Christoph; Brenk-Franz, Katja
2018-06-22
The New Sexual Satisfaction Scale (NSSS) is an internationally established questionnaire for assessing sexual satisfaction. It is based on 2 subscales (ego-centered and partner- and sexual activity-centered sexual satisfaction). The aim of the study was to evaluate the German short version of the questionnaire (NSSS-SD) in a representative sample (N=2524). In addition, relationships between sexual satisfaction and sociodemographic factors (age, sex, education) and characteristics of partnership and sexuality (relationship satisfaction, coitus frequency, number of sexual partners) were examined. The internal consistency of the NSSS-SD was excellent (Cronbach's Alpha = 0.96). The 2-dimensional structure of the long version could not be confirmed for the short version. One factor could be extracted, which explains 68.94% of the variance. An analysis of variance (ANOVA) revealed statistically significant differences in sexual satisfaction with respect to age, education, relationship satisfaction and coitus frequency. Sex and number of sexual partners did not influence sexual satisfaction. The NSSS-SD is a reliable questionnaire of sexual satisfaction for sexually active individuals. For sexually inactive individuals, a change of the instruction or a visual analogue scale might be useful. © Georg Thieme Verlag KG Stuttgart · New York.
Toussaint, Loren L.; Marschall, Justin C.; Williams, David R.
2012-01-01
The present investigation examines the prospective associations of religiousness/spirituality with depression and the extent to which various dimensions of forgiveness act as mediating mechanisms of these associations. Data are from a nationally representative sample of United States adults who were first interviewed in 1998 and reinterviewed six months later. Measures of religiousness/spirituality, forgiveness, and various sociodemographics were collected. Depression was assessed using the Composite International Diagnostic Interview administered by trained interviewers. Results showed that religiousness/spirituality, forgiveness of oneself and others, and feeling forgiven by God were associated, both cross-sectionally and longitudinally, with depressive status. After controlling for initial depressive status, only forgiveness of oneself and others remained statistically significant predictors of depression. Path analyses revealed that religiousness/spirituality conveyed protective effects, prospectively, on depression by way of an indirect path through forgiveness of others but not forgiveness of oneself. Hence, forgiveness of others acts as a mechanism of the salutary effect of religiousness/spirituality, but forgiveness of oneself is an independent predictor. Conclusions regarding the continued development of this type of research and for the treatment of clients with depression are offered. PMID:22675623
Paz, Henry A.; Anderson, Christopher L.; Muller, Makala J.; Kononoff, Paul J.; Fernando, Samodha C.
2016-01-01
The rumen microbial community in dairy cows plays a critical role in efficient milk production. However, there is a lack of data comparing the composition of the rumen bacterial community of the main dairy breeds. This study utilizes 16S rRNA gene sequencing to describe the rumen bacterial community composition in Holstein and Jersey cows fed the same diet by sampling the rumen microbiota via the rumen cannula (Holstein cows) or esophageal tubing (both Holstein and Jersey cows). After collection of the rumen sample via esophageal tubing, particles attached to the strainer were added to the sample to ensure representative sampling of both the liquid and solid fraction of the rumen contents. Alpha diversity metrics, Chao1 and observed OTUs estimates, displayed higher (P = 0.02) bacterial richness in Holstein compared to Jersey cows and no difference (P > 0.70) in bacterial community richness due to sampling method. The principal coordinate analysis displayed distinct clustering of bacterial communities by breed suggesting that Holstein and Jersey cows harbor different rumen bacterial communities. Family level classification of most abundant (>1%) differential OTUs displayed that OTUs from the bacterial families Lachnospiraceae and p-2534-18B5 to be predominant in Holstein cows compared to Jersey cows. Additionally, OTUs belonging to family Prevotellaceae were differentially abundant in the two breeds. Overall, the results from this study suggest that the bacterial community between Holstein and Jersey cows differ and that esophageal tubing with collection of feed particles associated with the strainer provides a representative rumen sample similar to a sample collected via the rumen cannula. Thus, in future studies esophageal tubing with addition of retained particles can be used to collect rumen samples reducing the cost of cannulation and increasing the number of animals used in microbiome investigations, thus increasing the statistical power of rumen microbial community evaluations. PMID:27536291
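For reference, the Chao1 richness estimate mentioned above can be computed directly from a vector of OTU counts. The sketch below uses the bias-corrected form of the estimator on made-up counts.

```python
import numpy as np

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from OTU counts."""
    counts = np.asarray(counts)
    s_obs = np.count_nonzero(counts)          # observed OTUs
    f1 = np.count_nonzero(counts == 1)        # singletons
    f2 = np.count_nonzero(counts == 2)        # doubletons
    # bias-corrected form also avoids division by zero when f2 == 0
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))

otu_counts = [5, 1, 1, 2, 0, 8, 1, 2, 3]      # hypothetical OTU table column
print(f"observed OTUs: {np.count_nonzero(otu_counts)}, Chao1: {chao1(otu_counts):.2f}")
```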
Hansen, John P
2003-01-01
Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.
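As a concrete example of the inferential step the article describes, the following sketch computes a 95% confidence interval for a population mean from a small sample using the t distribution; the data are invented.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of patient wait times (minutes) from one organization
sample = np.array([12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 12.8, 9.9, 11.7, 12.4])

mean = sample.mean()
sem = stats.sem(sample)      # standard error of the mean
# 95% confidence interval from the t distribution (appropriate for small n)
low, high = stats.t.interval(0.95, df=sample.size - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```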
Uses and misuses of compositional data in sedimentology
NASA Astrophysics Data System (ADS)
Tolosana-Delgado, Raimon
2012-12-01
This paper serves two goals. The first part shows how mass evolution processes of different nature become undistinguishable once we take a size-limited, noisy sample of its compositional fingerprint: processes of exponential decay, mass mixture and complementary accumulation are simulated, and then samples contaminated with noise are extracted. The aim of this exercise is to illustrate the limitations of typical graphical representations and statistical methods when dealing with compositional data, i.e. data in percentages, concentrations or proportions. The second part presents a series of concepts, tools and methods to represent and statistically treat a compositional data set attending to these limitations. The aim of this second part is to offer a state-of-the-art Compositional Data Analysis. This includes: descriptive statistics and graphics (the biplot); ternary diagrams with confidence regions for the mean; regression and ANalysis-Of-VAriance models to explain compositional variability; and the use of compositional information to predict environmental covariables or discriminate between groups. All these tools share a four-step algorithm: (1) transform compositions with an invertible log-ratio transformation; (2) apply a statistical method to the transformed scores; (3) back-transform the results to compositions; and (4) interpret results in relative terms. Using these techniques, a data set of sand petrographic composition has been analyzed, highlighting that: finer sands are richer in single-crystal grains in relation to polycrystalline grains, and that grain-size accounts for almost all compositional variability; a stronger water flow (river discharge) favors mica grains against quartz or rock fragment grains, possibly due to hydrodynamic sorting effects; a higher relief ratio implies shorter residence times, which may favor survival of micas and rock fragments, relatively more labile grains.
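The four-step log-ratio algorithm can be sketched in a few lines. Here the centred log-ratio (clr) transform stands in for the invertible log-ratio transformation, and the statistical method in step 2 is simply the mean; component names and values are invented.

```python
import numpy as np

def clr(x):
    """Step 1: centred log-ratio transform of compositions (rows sum to 1)."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def clr_inverse(y):
    """Step 3: back-transform clr scores to compositions summing to 1."""
    expy = np.exp(y)
    return expy / expy.sum(axis=1, keepdims=True)

# Hypothetical sand petrography: (quartz, mica, rock fragment) proportions
comps = np.array([[0.70, 0.10, 0.20],
                  [0.55, 0.25, 0.20],
                  [0.60, 0.15, 0.25]])

scores = clr(comps)                    # step 1
mean_scores = scores.mean(axis=0)      # step 2: ordinary statistics on scores
centre = clr_inverse(mean_scores[None, :])   # step 3
print(centre)                          # step 4: interpret in relative terms
```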
Monte Carlo Analysis of Reservoir Models Using Seismic Data and Geostatistical Models
NASA Astrophysics Data System (ADS)
Zunino, A.; Mosegaard, K.; Lange, K.; Melnikova, Y.; Hansen, T. M.
2013-12-01
We present a study on the analysis of petroleum reservoir models consistent with seismic data and geostatistical constraints performed on a synthetic reservoir model. Our aim is to invert directly for structure and rock bulk properties of the target reservoir zone. To infer the rock facies, porosity and oil saturation seismology alone is not sufficient but a rock physics model must be taken into account, which links the unknown properties to the elastic parameters. We then combine a rock physics model with a simple convolutional approach for seismic waves to invert the "measured" seismograms. To solve this inverse problem, we employ a Markov chain Monte Carlo (MCMC) method, because it offers the possibility to handle non-linearity, complex and multi-step forward models and provides realistic estimates of uncertainties. However, for large data sets the MCMC method may be impractical because of a very high computational demand. To face this challenge one strategy is to feed the algorithm with realistic models, hence relying on proper prior information. To address this problem, we utilize an algorithm drawn from geostatistics to generate geologically plausible models which represent samples of the prior distribution. The geostatistical algorithm learns the multiple-point statistics from prototype models (in the form of training images), then generates thousands of different models which are accepted or rejected by a Metropolis sampler. To further reduce the computation time we parallelize the software and run it on multi-core machines. The solution of the inverse problem is then represented by a collection of reservoir models in terms of facies, porosity and oil saturation, which constitute samples of the posterior distribution. We are finally able to produce probability maps of the properties we are interested in by performing statistical analysis on the collection of solutions.
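A minimal sketch of the sampling scheme described above: when proposals are drawn independently from the (geostatistical) prior, the prior terms cancel in the Metropolis ratio and acceptance depends only on the likelihood. The forward model, noise level, and prior here are toy stand-ins, not the study's rock-physics model.

```python
import numpy as np

rng = np.random.default_rng(7)

def log_likelihood(model, data, noise_sd=0.1):
    """Gaussian misfit; a stand-in for the rock-physics + convolutional
    seismic forward model used in the study."""
    return -0.5 * np.sum((data - model) ** 2) / noise_sd**2

def prior_sampler(n):
    """Stand-in for the geostatistical algorithm that proposes
    geologically plausible models (e.g., porosity fields)."""
    return rng.normal(0.2, 0.05, size=n)

observed = prior_sampler(50) + rng.normal(0, 0.1, 50)   # synthetic data

current = prior_sampler(50)
current_ll = log_likelihood(current, observed)
chain = []
for _ in range(2000):
    proposal = prior_sampler(50)        # independent draw from the prior
    proposal_ll = log_likelihood(proposal, observed)
    # Prior terms cancel when proposing from the prior itself, so the
    # Metropolis acceptance ratio reduces to a likelihood ratio.
    if np.log(rng.uniform()) < proposal_ll - current_ll:
        current, current_ll = proposal, proposal_ll
    chain.append(current)

posterior = np.array(chain)             # samples of the posterior
print(posterior.mean(), posterior.std())
```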
Measurement of surface roughness changes of unpolished and polished enamel following erosion
Austin, Rupert S.; Parkinson, Charles R.; Hasan, Adam; Bartlett, David W.
2017-01-01
Objectives To determine if Sa roughness data from measuring one central location of unpolished and polished enamel were representative of the overall surfaces before and after erosion. Methods Twenty human enamel sections (4×4 mm) were embedded in bis-acryl composite and randomised to either a native or polishing enamel preparation protocol. Enamel samples were subjected to an acid challenge (15 minutes in 100 mL orange juice, pH 3.2, titratable acidity 41.3 mmol OH/L, 62.5 rpm agitation, repeated for three cycles). Median (IQR) surface roughness [Sa] was measured at baseline and after erosion from both a centralised cluster and four peripheral clusters. Within each cluster, five smaller areas (0.04 mm2) provided the Sa roughness data. Results For both unpolished and polished enamel samples there were no significant differences between measuring one central cluster or four peripheral clusters, before and after erosion. For unpolished enamel the single central cluster had a median (IQR) Sa roughness of 1.45 (2.58) μm and the four peripheral clusters had a median (IQR) of 1.32 (4.86) μm before erosion; after erosion there were statistically significant reductions to 0.38 (0.35) μm and 0.34 (0.49) μm respectively (p<0.0001). Polished enamel had a median (IQR) Sa roughness of 0.04 (0.17) μm for the single central cluster and 0.05 (0.15) μm for the four peripheral clusters, which statistically significantly increased after erosion to 0.27 (0.08) μm for both (p<0.0001). Conclusion Measuring one central cluster of unpolished and polished enamel was representative of the overall enamel surface roughness, before and after erosion. PMID:28771562
Change in indoor particle levels after a smoking ban in Minnesota bars and restaurants.
Bohac, David L; Hewett, Martha J; Kapphahn, Kristopher I; Grimsrud, David T; Apte, Michael G; Gundel, Lara A
2010-12-01
Smoking bans in bars and restaurants have been shown to improve worker health and reduce hospital admissions for acute myocardial infarction. Several studies have also reported improved indoor air quality, although these studies generally used single visits before and after a ban for a convenience sample of venues. The primary objective of this study was to provide detailed time-of-day and day-of-week secondhand smoke-exposure data for representative bars and restaurants in Minnesota. This study improved on previous approaches by using a statistically representative sample of three venue types (drinking places, limited-service restaurants, and full-service restaurants), conducting repeat visits to the same venue prior to the ban, and matching the day of week and time of day for the before- and after-ban monitoring. The repeat visits included laser photometer fine particulate (PM₂.₅) concentration measurements, lit cigarette counts, and customer counts for 19 drinking places, eight limited-service restaurants, and 35 full-service restaurants in the Minneapolis/St. Paul metropolitan area. The more rigorous design of this study provides improved confidence in the findings and reduces the likelihood of systematic bias. The median reduction in PM₂.₅ was greater than 95% for all three venue types. Examination of data from repeated visits shows that making only one pre-ban visit to each venue would greatly increase the range of computed percentage reductions and lower the statistical power of pre-post tests. Variations in PM₂.₅ concentrations were found based on time of day and day of week when monitoring occurred. These comprehensive measurements confirm that smoking bans provide significant reductions in SHS constituents, protecting customers and workers from PM₂.₅ in bars and restaurants. Copyright © 2010 American Journal of Preventive Medicine. All rights reserved.
Guerrero, Natalie; Walsh, Matthew C; Malecki, Kristen C; Nieto, F Javier
2014-08-01
Food insecurity is a public health concern estimated to affect 18 million American households nationally, which can result in chronic nutritional deficiencies and other health risks. The relationships between food insecurity and specific demographic and geographic factors in Wisconsin are not well documented. The goals of this paper are to investigate sociodemographic and geographic features associated with food insecurity in a representative sample of Wisconsin adults. This study used data from the Survey of the Health of Wisconsin (SHOW). SHOW annually collects health-related data on a representative sample of Wisconsin residents. Between 2008-2012, 2,947 participants were enrolled in the SHOW study. The presence of food insecurity was defined based on the participant's affirmative answer to the question "In the last 12 months, have you been concerned about having enough food for you or your family?" After adjustment for age, race, and gender, 13.2% (95% CI, 10.8%-15.1%) of participants reported food insecurity, 56.7% (95% CI, 50.6%-62.7%) of whom were female. Food insecurity did not statistically differ by region (P = 0.30). The adjusted prevalence of food insecurity in the urban core, other urban, and rural areas was 14.1%, 6.5%, and 10.5%, respectively. These differences were not statistically significant (P = 0.13) and, for urban core and rural areas, persisted even when accounting for level of economic hardship in the community. The prevalence of food insecurity is substantial, affecting an estimated 740,000 or more Wisconsin residents. The prevalence was similarly high in all urbanicity levels and across all state public health regions in Wisconsin. Food insecurity is a common problem with potentially serious health consequences affecting populations across the entire state.
Frazier, Thomas W; Ratliff, Kristin R; Gruber, Chris; Zhang, Yi; Law, Paul A; Constantino, John N
2014-01-01
Understanding the factor structure of autistic symptomatology is critical to the discovery and interpretation of causal mechanisms in autism spectrum disorder. We applied confirmatory factor analysis and assessment of measurement invariance to a large (N = 9635) accumulated collection of reports on quantitative autistic traits using the Social Responsiveness Scale, representing a broad diversity of age, severity, and reporter type. A two-factor structure (corresponding to social communication impairment and restricted, repetitive behavior) as elaborated in the updated Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) criteria for autism spectrum disorder exhibited acceptable model fit in confirmatory factor analysis. Measurement invariance was appreciable across age, sex, and reporter (self vs other), but somewhat less apparent between clinical and nonclinical populations in this sample comprising both familial and sporadic autism spectrum disorders. The statistical power afforded by this large sample allowed relative differentiation of three factors among items encompassing social communication impairment (emotion recognition, social avoidance, and interpersonal relatedness) and two factors among items encompassing restricted, repetitive behavior (insistence on sameness and repetitive mannerisms). Cross-trait correlations remained extremely high, on the order of 0.66-0.92. These data clarify domains of statistically significant factorial separation that may relate to partially, but not completely, overlapping biological mechanisms, contributing to variation in human social competency. Given such robust intercorrelations among symptom domains, understanding their co-emergence remains a high priority in conceptualizing common neural mechanisms underlying autistic syndromes.
75 FR 53738 - Proposed Collection; Comment Request for Rev. Proc. 2007-35
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-01
... Revenue Procedure Revenue Procedure 2007-35, Statistical Sampling for purposes of Section 199. DATES... through the Internet, at [email protected] . SUPPLEMENTARY INFORMATION: Title: Statistical Sampling...: This revenue procedure provides for determining when statistical sampling may be used in purposes of...
Whole Frog Project and Virtual Frog Dissection Statistics
Arsenyev, P A; Trezvov, V V; Saratovskaya, N V
1997-01-01
This work presents a method that determines the phase composition of calcium hydroxylapatite from its infrared spectrum. The method uses factor analysis of the spectral data of a calibration set of samples to determine the minimal number of factors required to reproduce the spectra within experimental error. Multiple linear regression is applied to establish a correlation between the factor scores of the calibration standards and their properties. The regression equations can be used to predict the property value of an unknown sample. The regression model was built for the determination of beta-tricalcium phosphate content in hydroxylapatite. A statistical estimation of the quality of the model was carried out. Applying factor analysis to the spectral data increases the accuracy of beta-tricalcium phosphate determination and extends the determination range toward lower concentrations. Reproducibility of results is retained.
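A hedged sketch of the calibration idea: the factor-analysis step is approximated here by principal components, whose scores feed a linear regression predicting beta-tricalcium phosphate content. The spectra, band shape, and component count are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)

# Hypothetical calibration set: IR spectra (rows) with known beta-TCP content
n_samples, n_wavenumbers = 20, 300
beta_tcp = rng.uniform(0, 10, n_samples)               # weight-%
band = np.exp(-0.5 * ((np.arange(n_wavenumbers) - 150) / 10) ** 2)
spectra = np.outer(beta_tcp, band) + rng.normal(0, 0.05, (n_samples, n_wavenumbers))

# Factor analysis approximated by PCA: keep a minimal number of factors,
# then regress the target property on the factor scores
model = make_pipeline(PCA(n_components=3), LinearRegression())
model.fit(spectra, beta_tcp)

# Predict the property value of an "unknown" sample
unknown = 4.2 * band + rng.normal(0, 0.05, n_wavenumbers)
print(f"predicted beta-TCP: {model.predict(unknown[None, :])[0]:.2f} wt-%")
```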
Sellbom, Martin; Sansone, Randy A; Songer, Douglas A
2017-09-01
The current study evaluated the utility of the Self-Harm Inventory (SHI) as a proxy for and screening measure of borderline personality disorder (BPD), using several Diagnostic and Statistical Manual of Mental Disorders (DSM)-based BPD measures as criteria. We used a sample of 145 psychiatric inpatients who completed the SHI and a series of well-validated, DSM-based self-report measures of BPD. Using a series of latent trait and latent class analyses, we found that the SHI was substantially associated with a latent construct representing BPD, and differentiated latent classes of 'high' vs. 'low' BPD with good accuracy. The SHI can serve as a proxy for and a good screening measure of BPD, but future research needs to replicate these findings using structured interview-based measurement of BPD.
NASA Astrophysics Data System (ADS)
Mallamace, Domenico; Vasi, Sebastiano; Corsaro, Carmelo; Naccari, Clara; Clodoveo, Maria Lisa; Dugo, Giacomo; Cicero, Nicola
2017-11-01
The thermal properties of many organic extra Virgin Olive Oils (eVOOs) coming from different countries of the world were investigated by Differential Scanning Calorimetry (DSC). This technique, through a series of heating and cooling cycles, provides a specific curve, i.e., a thermogram, which represents the fingerprint of each eVOO sample. In fact, variations due to the different cultivars, geographical origin or chemical composition can be highlighted because they produce changes in the corresponding thermogram. In particular, in this work, we show the results of an unsupervised multivariate statistical analysis applied to the DSC thermograms of many organic eVOOs. This analysis allows us to discriminate the geographical origin of the different studied samples in terms of the peculiar features shown by the melting profiles of the triacylglycerol moieties.
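As an illustration of unsupervised discrimination of thermograms, the sketch below clusters synthetic DSC-like melting curves whose peak position shifts with origin. The curve model and clustering choice (Ward linkage on the raw curves) are assumptions, not the study's method.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(5)
t = np.linspace(-80, 20, 200)     # temperature axis (deg C)

def thermogram(peak_temp):
    """Synthetic melting profile: one Gaussian peak plus noise."""
    return np.exp(-0.5 * ((t - peak_temp) / 4.0) ** 2) + rng.normal(0, 0.02, t.size)

# Six samples each from two hypothetical origins with shifted melting peaks
curves = np.array([thermogram(-8.0) for _ in range(6)] +
                  [thermogram(-2.0) for _ in range(6)])

# Unsupervised grouping of the melting profiles
labels = fcluster(linkage(curves, method="ward"), t=2, criterion="maxclust")
print(labels)    # the two origins separate into two clusters
```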
Atlantic Bluefin Tuna (Thunnus thynnus) Biometrics and Condition.
Rodriguez-Marin, Enrique; Ortiz, Mauricio; Ortiz de Urbina, José María; Quelle, Pablo; Walter, John; Abid, Noureddine; Addis, Piero; Alot, Enrique; Andrushchenko, Irene; Deguara, Simeon; Di Natale, Antonio; Gatt, Mark; Golet, Walter; Karakulak, Saadet; Kimoto, Ai; Macias, David; Saber, Samar; Santos, Miguel Neves; Zarrad, Rafik
2015-01-01
The compiled data for this study represents the first Atlantic and Mediterranean-wide effort to pool all available biometric data for Atlantic bluefin tuna (Thunnus thynnus) with the collaboration of many countries and scientific groups. Biometric relationships were based on an extensive sampling (over 140,000 fish sampled), covering most of the fishing areas for this species in the North Atlantic Ocean and Mediterranean Sea. Sensitivity analyses were carried out to evaluate the representativeness of sampling and explore the most adequate procedure to fit the weight-length relationship (WLR). The selected model for the WLRs by stock included standardized data series (common measurement types) weighted by the inverse variability. There was little difference between annual stock-specific round weight-straight fork length relationships, with an overall difference of 6% in weight. The predicted weight by month was estimated as an additional component in the exponent of the weight-length function. The analyses of monthly variations of fish condition by stock, maturity state and geographic area reflect annual cycles of spawning and feeding behavior. We update and improve upon the biometric relationships for bluefin currently used by the International Commission for the Conservation of Atlantic Tunas, by incorporating substantially larger datasets than ever previously compiled, providing complete documentation of sources and employing robust statistical fitting. WLRs and other conversion factors estimated in this study differ from the ones used in previous bluefin stock assessments.
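The weight-length relationship W = aL^b is conventionally fitted on the log scale, and a monthly term can enter the exponent as the abstract describes. The sketch below fits such a model to synthetic data; the parameter values and the cosine form of the seasonal term are assumptions, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic data: straight fork length (cm), sampling month, round weight (kg)
n = 500
length = rng.uniform(60, 280, n)
month = rng.integers(1, 13, n)
# Assumed generating model: log W = log a + b log L + seasonal condition term
true_log_w = (np.log(2.95e-5) + 2.9 * np.log(length)
              + 0.03 * np.cos(2 * np.pi * (month - 6) / 12))
weight = np.exp(true_log_w + rng.normal(0, 0.05, n))

# Ordinary least squares on the log scale; the monthly cosine plays the role
# of the "additional component in the exponent" of the WLR
X = np.column_stack([np.ones(n), np.log(length),
                     np.cos(2 * np.pi * (month - 6) / 12)])
coef, *_ = np.linalg.lstsq(X, np.log(weight), rcond=None)
log_a, b, c = coef
print(f"a = {np.exp(log_a):.3g}, b = {b:.3f}, seasonal amplitude c = {c:.3f}")
```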
Winterfield, Craig; van de Voort, F R
2014-12-01
The Fluid Life Corporation assessed and implemented Fourier transform infrared spectroscopy (FTIR)-based methods using American Society for Testing and Materials (ASTM)-like stoichiometric reactions for determination of acid and base number for in-service mineral-based oils. The basic protocols, quality control procedures, calibration, validation, and performance of these new quantitative methods are assessed. ASTM correspondence is attained using a mixed-mode calibration, using primary reference standards to anchor the calibration, supplemented by representative sample lubricants analyzed by ASTM procedures. A partial least squares calibration is devised by combining primary acid/base reference standards and representative samples, focusing on the main spectral stoichiometric response with chemometrics assisting in accounting for matrix variability. FTIR(AN/BN) methodology is precise, accurate, and free of most interference that affects ASTM D664 and D4739 results. Extensive side-by-side operational runs produced normally distributed differences with mean differences close to zero and standard deviations of 0.18 and 0.26 mg KOH/g, respectively. Statistically, the FTIR methods are a direct match to the ASTM methods, with superior performance in terms of analytical throughput, preparation time, and solvent use. FTIR(AN/BN) analysis is a viable, significant advance for in-service lubricant analysis, providing an economic means of trending samples instead of tedious and expensive conventional ASTM(AN/BN) procedures. © 2014 Society for Laboratory Automation and Screening.
Vaasma, Taavi; Loosaar, Jüri; Kiisk, Madis; Tkaczyk, Alan Henry
2017-07-01
Several multi-day samplings were conducted over a 2-year period at an oil shale-fired power plant operating pulverized-fuel boilers equipped either with a novel integrated desulphurization system and bag filters or with electrostatic precipitators. Oil shale, bottom ash and fly ash samples were collected, and radionuclides from the 238U and 232Th series as well as 40K were determined. The work aimed at determining possible variations in the concentrations of naturally occurring radionuclides within the collected samples and at detecting the sources of these fluctuations. During the continuous multi-day samplings, various boiler parameters were recorded as well. With a couple of exceptions, no statistically significant differences were detected (significance level 0.05) between the measured radionuclide mean values in the various ash samples within the same sampling. When comparing the results between multiple years and samplings, no statistically significant variations were observed in the 238U and 226Ra values. However, there were significant differences in the fly ash values when comparing 210Pb, 40K, 228Ra and 232Th between the various samplings. In all cases the radionuclide activity concentrations in the specific fly ash remained under 100 Bq kg-1, posing no radiological concerns when using this material as an additive in construction or building materials. Correlation analysis between the registered boiler parameters and the measured radionuclide activity concentrations showed weak or no correlation. The obtained results suggest that the main source of variation is the characteristics of the fuel used. The changes in the radionuclide activity concentrations between multiple years were in general rather modest. The radionuclide activity concentrations varied dominantly between 4% and 15% from the measured mean within the same sampling. The relative standard deviation was, however, within the same range as the relative measurement uncertainty, suggesting that the main component of the fluctuations derives from the measurement method and approach. The obtained results indicate that representativeness of the data over a longer time period is valid only when a fuel with a similar composition is used and when the combustion boilers operate with a uniform setup (same boiler type and purification system). The results and the accompanying statistical analysis clearly demonstrated that, in order to obtain data with higher reliability, repeated multi-day sampling should be organized and combined with the registered boiler technical and operational parameters. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hagell, Peter; Westergren, Albert
Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Simulated Rasch-model-fitting data for 25-item dichotomous scales, with sample sizes ranging from N = 50 to N = 2500, were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N ≤ 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors, under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).
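The simulation design is easy to reproduce in outline. The sketch below generates Rasch-fitting responses for a 25-item dichotomous scale at several sample sizes and tallies rejections of a simple item-fit chi-square; it is an illustration of the study design only, not of RUMM's proprietary fit statistics, and it uses the true person and item parameters (a simplification, since RUMM estimates them).

    # Sketch: Type I error tallies for an item-fit chi-square under a
    # correctly specified Rasch model. Illustrative, not the RUMM statistics.
    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(1)
    n_items, alpha = 25, 0.05
    beta = np.linspace(-2, 2, n_items)              # item difficulties

    def simulate(n_persons):
        theta = rng.standard_normal(n_persons)      # person abilities
        p = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
        return (rng.random((n_persons, n_items)) < p).astype(int), theta

    def item_chi2_p(X, theta, item, n_groups=5):
        # Group persons by ability; compare observed vs expected item scores.
        edges = np.quantile(theta, np.linspace(0, 1, n_groups + 1))
        g = np.clip(np.searchsorted(edges, theta, side="right") - 1, 0, n_groups - 1)
        stat = 0.0
        for k in range(n_groups):
            m = g == k
            exp_p = 1 / (1 + np.exp(-(theta[m] - beta[item])))
            e, v = exp_p.sum(), (exp_p * (1 - exp_p)).sum()
            stat += (X[m, item].sum() - e) ** 2 / v
        return chi2.sf(stat, n_groups - 1)

    for n in (50, 250, 500, 2500):
        X, theta = simulate(n)
        rej = sum(item_chi2_p(X, theta, i) < alpha for i in range(n_items))
        print(n, rej / n_items)                     # empirical Type I error rate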
40 CFR Appendix I to Part 261 - Representative Sampling Methods
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 25 2010-07-01 2010-07-01 false Representative Sampling Methods I...—Representative Sampling Methods The methods and equipment used for sampling waste materials will vary with the form and consistency of the waste materials to be sampled. Samples collected using the sampling...
40 CFR Appendix I to Part 261 - Representative Sampling Methods
Code of Federal Regulations, 2011 CFR
2011-07-01
... 40 Protection of Environment 26 2011-07-01 2011-07-01 false Representative Sampling Methods I...—Representative Sampling Methods The methods and equipment used for sampling waste materials will vary with the form and consistency of the waste materials to be sampled. Samples collected using the sampling...
Evaluating the One-in-Five Statistic: Women's Risk of Sexual Assault While in College.
Muehlenhard, Charlene L; Peterson, Zoë D; Humphreys, Terry P; Jozkowski, Kristen N
In 2014, U.S. president Barack Obama announced a White House Task Force to Protect Students From Sexual Assault, noting that "1 in 5 women on college campuses has been sexually assaulted during their time there." Since then, this one-in-five statistic has permeated public discourse. It is frequently reported, but some commentators have criticized it as exaggerated. Here, we address the question, "What percentage of women are sexually assaulted while in college?" After discussing definitions of sexual assault, we systematically review available data, focusing on studies that used large, representative samples of female undergraduates and multiple behaviorally specific questions. We conclude that one in five is a reasonably accurate average across women and campuses. We also review studies that are inappropriately cited as either supporting or debunking the one-in-five statistic; we explain why they do not adequately address this question. We identify and evaluate several assumptions implicit in the public discourse (e.g., the assumption that college students are at greater risk than nonstudents). Given the empirical support for the one-in-five statistic, we suggest that the controversy occurs because of misunderstandings about studies' methods and results and because this topic has implications for gender relations, power, and sexuality; this controversy is ultimately about values.
78 FR 43002 - Proposed Collection; Comment Request for Revenue Procedure 2004-29
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-18
... comments concerning statistical sampling in Sec. 274 Context. DATES: Written comments should be received on... INFORMATION: Title: Statistical Sampling in Sec. 274 Context. OMB Number: 1545-1847. Revenue Procedure Number: Revenue Procedure 2004-29. Abstract: Revenue Procedure 2004-29 prescribes the statistical sampling...
42 CFR 1003.133 - Statistical sampling.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 42 Public Health 5 2014-10-01 2014-10-01 false Statistical sampling. 1003.133 Section 1003.133 Public Health OFFICE OF INSPECTOR GENERAL-HEALTH CARE, DEPARTMENT OF HEALTH AND HUMAN SERVICES OIG AUTHORITIES CIVIL MONEY PENALTIES, ASSESSMENTS AND EXCLUSIONS § 1003.133 Statistical sampling. (a) In meeting...
EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.
Tong, Xiaoxiao; Bentler, Peter M
2013-01-01
Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
Mozola, Mark; Norton, Paul; Alles, Susan; Gray, R Lucas; Tolan, Jerry; Caballero, Oscar; Pinkava, Lisa; Hosking, Edan; Luplow, Karen; Rice, Jennifer
2013-01-01
ANSR Salmonella is a new molecular diagnostic assay for the detection of Salmonella spp. in foods and environmental samples. The test is based on the nicking enzyme amplification reaction (NEAR) isothermal nucleic acid amplification technology. The assay platform features simple instrumentation, minimal labor, and, following a single-step 10-24 h enrichment (depending on sample type), an extremely short assay time of 30 min, including sample preparation. Detection occurs in real time using fluorescent molecular beacon probes. Inclusivity testing was performed using a panel of 113 strains of S. enterica and S. bongori, representing 109 serovars and all genetic subgroups. With the single exception of the rare serovar S. Weslaco, all serovars and genetic subgroups were detected. Exclusivity testing of 38 non-salmonellae, mostly Enterobacteriaceae, yielded no evidence of cross-reactivity. In comparative testing of chicken carcass rinse, raw ground turkey, raw ground beef, hot dogs, and oat cereal, there were no statistically significant differences in the number of positive results obtained with the ANSR and the U.S. Department of Agriculture-Food Safety and Inspection Service or U.S. Food and Drug Administration/Bacteriological Analytical Manual reference culture methods. In testing of swab or sponge samples from five different environmental surfaces, four trials showed no statistically significant differences in the number of positive results by the ANSR and the U.S. Food and Drug Administration/Bacteriological Analytical Manual reference methods; in the trial on the stainless steel surface, there were significantly more positive results by the ANSR method. Ruggedness experiments showed a high degree of assay robustness when deviations in reagent volumes and incubation times were introduced.
Bed load transport over a broad range of timescales: Determination of three regimes of fluctuations
NASA Astrophysics Data System (ADS)
Ma, Hongbo; Heyman, Joris; Fu, Xudong; Mettra, Francois; Ancey, Christophe; Parker, Gary
2014-12-01
This paper describes the relationship between the statistics of bed load transport flux and the timescale over which it is sampled. A stochastic formulation is developed for the probability distribution function of bed load transport flux, based on the Ancey et al. (2008) theory. An analytical solution for the variance of bed load transport flux over differing sampling timescales is presented. The solution demonstrates that the timescale dependence of the variance of bed load transport flux reduces to a three-regime relation demarcated by an intermittency timescale (tI) and a memory timescale (tc). As the sampling timescale t increases, this variance passes through an intermittent stage (t ≪ tI), an invariant stage (tI < t < tc), and a memoryless stage (t ≫ tc). We propose a dimensionless number (Ra) to represent the relative strength of fluctuation, which provides a common ground for comparison of fluctuation strength among different experiments, as well as different sampling timescales for each experiment. Our analysis indicates that correlated motion and the discrete nature of bed load particles are responsible for this three-regime behavior. We use the data from three experiments with high temporal resolution of bed load transport flux to validate the proposed three-regime behavior. The theoretical solution for the variance agrees well with all three sets of experimental data. Our findings contribute to the understanding of the observed fluctuations of bed load transport flux over monosize/multiple-size grain beds, to the characterization of an inherent connection between short-term measurements and long-term statistics, and to the design of appropriate sampling strategies for bed load transport flux.
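The qualitative three-regime behavior can be reproduced numerically. The sketch below simulates particle counts from a Poisson process whose rate follows an AR(1) memory process, a crude stand-in for the Ancey et al. (2008) birth-death formulation (not the paper's model), and shows how the relative variance of the flux changes as the sampling window widens past the correlation timescale; all parameter values are invented for illustration.

    # Sketch: variance of bed load flux vs sampling timescale, assuming
    # Poisson arrivals modulated by an AR(1) rate with memory timescale t_c.
    import numpy as np

    rng = np.random.default_rng(2)
    dt, n_steps, t_c = 0.01, 500_000, 5.0          # seconds
    phi = np.exp(-dt / t_c)                        # AR(1) memory parameter
    rate = np.empty(n_steps)
    rate[0] = 10.0
    for i in range(1, n_steps):                    # correlated transport rate
        rate[i] = 10.0 + phi * (rate[i - 1] - 10.0) + 0.5 * rng.standard_normal()
    counts = rng.poisson(np.clip(rate, 0, None) * dt)   # particles per dt

    for window in (0.01, 0.1, 1.0, 10.0, 100.0):   # sampling timescales (s)
        m = int(window / dt)
        n_bins = n_steps // m
        flux = counts[: n_bins * m].reshape(n_bins, m).sum(axis=1) / window
        print(f"t = {window:7.2f} s  relative variance = {flux.var() / flux.mean()**2:.4f}")

Short windows are dominated by the discreteness of particle arrivals (intermittent regime), intermediate windows by correlated motion, and long windows average the memory away, so the relative variance decays.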
Henriksen, Linda O; Faber, Nina R; Moller, Mette F; Nexo, Ebba; Hansen, Annebirthe B
2014-10-01
Suitable procedures for transport of blood samples from general practitioners to hospital laboratories are requested. Here we explore routine testing on samples stored and transported as whole blood in lithium-heparin or serum tubes. Blood samples were collected from 106 hospitalized patients, and analyzed on Architect c8000 or Advia Centaur XP for 35 analytes at base line, and after storage and transport of whole blood in lithium-heparin or serum tubes at 21 ± 1°C for 10 h. Bias and imprecision (representing variation from analysis and storage) were calculated from values at baseline and after storage, and differences tested by paired t-tests. Results were compared to goals set by the laboratory. We observed no statistically significant bias and results within the goal for imprecision between baseline samples and 10-h samples for albumin, alkaline phosphatase, antitrypsin, bilirubin, creatinine, free triiodothyronine, γ-glutamyl transferase, haptoglobin, immunoglobulin G, lactate dehydrogenase, prostate specific antigen, total carbon dioxide, and urea. Alanine aminotransferase, amylase, C-reactive protein, calcium, cholesterol, creatine kinase, ferritin, free thyroxine, immunoglobulin A, immunoglobulin M, orosomucoid, sodium, transferrin, and triglycerides met goals for imprecision, though they showed a minor, but statistically significant bias in results after storage. Cobalamin, folate, HDL-cholesterol, iron, phosphate, potassium, thyroid stimulating hormone and urate warranted concern, but only folate and phosphate showed deviations of clinical importance. We conclude that whole blood in lithium-heparin or serum tubes stored for 10 h at 21 ± 1°C, may be used for routine analysis without restrictions for all investigated analytes but folate and phosphate.
Gallé, Róbert; Urák, István; Nikolett, Gallé-Szpisjak; Hartel, Tibor
2017-01-01
The integration of food production and biodiversity conservation represents a key challenge for sustainability. Several studies suggest that even small structural elements in the landscape can make a substantial contribution to the overall biodiversity value of agricultural landscapes. Pastures can have high biodiversity potential. However, their intensive and monofunctional use typically erodes their natural capital, including biodiversity. Here we address the ecological value of fine-scale structural elements, represented by sparsely scattered trees and shrubs, for the spider communities of a moderately intensively grazed pasture in Transylvania, Eastern Europe. The pasture was grazed by sheep, cattle and buffalo (ca 1 Livestock Unit ha-1) and no chemical fertilizers were applied. Sampling sites covered the open pasture as well as the existing fine-scale heterogeneity created by scattered trees and shrubs. Forty sampling locations, each represented by three 1 m2 quadrats, were situated in a stratified design that assured the spatial independence of sampling locations. We identified 140 species of spiders, of which 18 were red-listed and four were new for the Romanian fauna. Spider species assemblages of the open pasture, scattered trees, trees and shrubs, and the forest edge were statistically distinct. Our study shows that sparsely scattered mature woody vegetation and shrubs substantially increase the ecological value of managed pastures. The structural complexity provided by scattered trees and shrubs makes the co-occurrence of high spider diversity and moderately high-intensity grazing possible in this wood-pasture. Our results are in line with recent empirical research showing that sparse trees and shrubs increase the biodiversity potential of pastures managed for commodity production.
A comparison of exact tests for trend with binary endpoints using Bartholomew's statistic.
Consiglio, J D; Shan, G; Wilding, G E
2014-01-01
Tests for trend are important in a number of scientific fields when trends associated with binary variables are of interest. Implementing the standard Cochran-Armitage trend test requires an arbitrary choice of scores assigned to represent the grouping variable. Bartholomew proposed a test for qualitatively ordered samples using asymptotic critical values, but Type I error control can be problematic in finite samples. To our knowledge, use of the exact probability distribution has not been explored, and we study its use in the present paper. Specifically, we consider an approach based on conditioning on both sets of marginal totals and three unconditional approaches where only the marginal totals corresponding to the group sample sizes are treated as fixed. While slightly conservative, all four tests are guaranteed to have actual Type I error rates below the nominal level. The unconditional tests are found to exhibit far less conservatism than the conditional test and thereby gain a power advantage.
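For reference, this is the standard Cochran-Armitage statistic the paper contrasts with Bartholomew's approach; the score vector is exactly the arbitrary choice the abstract refers to. A minimal asymptotic implementation (the paper's exact tests require enumerating tables and are not sketched here); the example data are invented.

    # Sketch: Cochran-Armitage trend test for binary outcomes across
    # ordered groups, with user-chosen scores.
    import numpy as np
    from scipy.stats import norm

    def cochran_armitage(successes, totals, scores):
        successes, totals, scores = map(np.asarray, (successes, totals, scores))
        N, S = totals.sum(), successes.sum()
        p = S / N                                   # pooled success probability
        num = np.sum(scores * (successes - totals * p))
        var = p * (1 - p) * (np.sum(totals * scores**2)
                             - np.sum(totals * scores) ** 2 / N)
        z = num / np.sqrt(var)
        return z, 2 * norm.sf(abs(z))               # two-sided asymptotic p-value

    # Equally spaced scores 0,1,2,3 -- one arbitrary but common choice.
    z, p = cochran_armitage([2, 6, 10, 18], [50, 50, 50, 50], [0, 1, 2, 3])
    print(f"z = {z:.3f}, p = {p:.4f}")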
Rommelmann, Vanessa; Setel, Philip W.; Hemed, Yusuf; Angeles, Gustavo; Mponezya, Hamisi; Whiting, David; Boerma, Ties
2005-01-01
OBJECTIVE: To examine the costs of complementary information generation activities in a resource-constrained setting and compare the costs and outputs of information subsystems that generate the statistics on poverty, health and survival required for monitoring, evaluation and reporting on health programmes in the United Republic of Tanzania. METHODS: Nine systems used by four government agencies or ministries were assessed. Costs were calculated from budgets and expenditure data made available by information system managers. System coverage, quality assurance and information production were reviewed using questionnaires and interviews. Information production was characterized in terms of 38 key sociodemographic indicators required for national programme monitoring. FINDINGS: In 2002-03 approximately US$ 0.53 was spent per Tanzanian citizen on the nine information subsystems that generated information on 37 of the 38 selected indicators. The census and reporting system for routine health service statistics had the largest participating populations and highest total costs. Nationally representative household surveys and demographic surveillance systems (which are not based on nationally representative samples) produced more than half the indicators and used the most rigorous quality assurance. Five systems produced fewer than 13 indicators and had comparatively high costs per participant. CONCLUSION: Policy-makers and programme planners should be aware of the many trade-offs with respect to system costs, coverage, production, representativeness and quality control when making investment choices for monitoring and evaluation. In future, formal cost-effectiveness studies of complementary information systems would help guide investments in the monitoring, evaluation and planning needed to demonstrate the impact of poverty-reduction and health programmes. PMID:16184275
78 FR 63568 - Proposed Collection; Comment Request for Rev. Proc. 2007-35
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-24
... Revenue Procedure 2007-35, Statistical Sampling for purposes of Section 199. DATES: Written comments... . SUPPLEMENTARY INFORMATION: Title: Statistical Sampling for purposes of Section 199. OMB Number: 1545-2072... statistical sampling may be used for purposes of section 199, which provides a deduction for income...
Willis, Brian H; Riley, Richard D
2017-09-20
An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice? Does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
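The leave-one-out idea is straightforward to sketch. Below, a DerSimonian-Laird random-effects summary is refit with each study held out and the held-out effect is compared to the prediction for a new study; this illustrates the cross-validation logic only, not the paper's exact Vn statistic or its derived distribution, and the study effects and variances are invented.

    # Sketch: leave-one-out cross-validation of a random-effects summary.
    import numpy as np

    def dl_summary(y, v):
        # DerSimonian-Laird estimate of the summary effect and tau^2.
        w = 1 / v
        q = np.sum(w * (y - np.sum(w * y) / w.sum()) ** 2)
        tau2 = max(0.0, (q - (len(y) - 1)) / (w.sum() - np.sum(w**2) / w.sum()))
        w_star = 1 / (v + tau2)
        mu = np.sum(w_star * y) / w_star.sum()
        return mu, 1 / w_star.sum(), tau2

    y = np.array([0.30, 0.15, 0.42, 0.05, 0.25, 0.38])   # study effects (toy)
    v = np.array([0.02, 0.03, 0.02, 0.04, 0.01, 0.03])   # within-study variances

    z = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        mu, var_mu, tau2 = dl_summary(y[keep], v[keep])
        # Predictive variance for a new study: summary + between + within.
        z.append((y[i] - mu) / np.sqrt(var_mu + tau2 + v[i]))
    print(np.round(z, 2), "fraction |z| > 1.96:", np.mean(np.abs(z) > 1.96))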
21 CFR 111.80 - What representative samples must you collect?
Code of Federal Regulations, 2010 CFR
2010-04-01
... Process Control System § 111.80 What representative samples must you collect? The representative samples... unique lot within each unique shipment); (b) Representative samples of in-process materials for each manufactured batch at points, steps, or stages, in the manufacturing process as specified in the master...
North Massif lithologies and chemical compositions viewed from 2-4 mm particles of soil sample 76503
NASA Technical Reports Server (NTRS)
Bishop, Kaylynn M.; Jolliff, Bradley L.; Korotev, Randy L.; Haskin, Larry A.
1992-01-01
We identify the lithologic and compositional components of soil 76503 based on INAA of 243 2-4-mm particles and 72 thin sections from these and associated 1-2-mm particles (76502). We present a statistical distribution of the major compositional types as the first step of a detailed comparative study of the North and South Massifs. The soil sample was collected well away from any boulder and is more representative of typical North Massif material than any single large rock or boulder sample. So far, our examination of the 76503 particles has provided a better definition of precursor igneous lithologies and their petrogenetic relationships. It has enabled us to refine the nature of mixing components for the North Massif less than 1-mm fines. It has confirmed the differences in lithologies and their proportions between materials of the North and South Massifs; e.g., the North Massif is distinguished by the absence of a 72275-type KREEP component, the abundance of a highly magnesian igneous component, and the absence of certain types of melt compositions found in the South Massif samples.
Martin, Derek; Cockell, Charles S
2015-02-01
Investigations of other planetary bodies, including Mars and icy moons such as Enceladus and Europa, show that they may have hosted aqueous environments in the past and may do so even today. Therefore, a major challenge in astrobiology is to build facilities that will allow us to study the geochemistry and habitability of these extraterrestrial environments. Here, we describe a simulation facility (PELS: Planetary Environmental Liquid Simulator) with the capability for liquid input and output that allows for the study of such environments. The facility, containing six separate sample vessels, allows for statistical replication of samples. Control of pressure, gas composition, UV irradiation conditions, and temperature allows for the precise replication of aqueous conditions, including subzero brines under martian atmospheric conditions. A sample acquisition system allows for the collection of both liquid and solid samples from within the chamber without breaking the atmospheric conditions, enabling detailed studies of the geochemical evolution and habitability of past and present extraterrestrial environments. The facility we describe represents a new frontier in planetary simulation: continuous flow-through simulation of extraterrestrial aqueous environments.
Geochemistry of sediments in the Northern and Central Adriatic Sea
NASA Astrophysics Data System (ADS)
De Lazzari, A.; Rampazzo, G.; Pavoni, B.
2004-03-01
Major, minor and trace elements, loss on ignition, specific surface area, quantities of calcite and dolomite, qualitative mineralogical composition, grain-size distribution and organic micropollutants (PAH, PCB, DDT) were determined on surficial marine sediments sampled during the 1990 ASCOP (Adriatic Scientific Cooperative Program) cruise. Mineralogical composition and carbonate content of the samples were found to be comparable with data previously reported in the literature, whereas the geochemical composition and distribution of major, minor and trace elements for samples in international waters and in the central basin have never been reported before. The large amount of information contained in the variables of different origin has been processed by means of a comprehensive approach which establishes the relations among the components through the mathematical-statistical calculation of principal components (factors). These account for the major part of the data variance, losing only marginal information, and are independent of the units of measure. The sample descriptors concerning natural components and contamination load are discussed by means of a statistical model based on an R-mode factor analysis calculating four significant factors which explain 86.8% of the total variance and represent important relationships between grain size, mineralogy, geochemistry and organic micropollutants. A description and an interpretation of the factor composition is discussed on the basis of pollution inputs, basin geology and hydrodynamics. The areal distribution of the factors showed that the fine grain-size fraction, with oxides and hydroxides of colloidal origin, is the main means of transport and thus the principal link between the chemical, physical and granulometric elements in the Adriatic.
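The R-mode factor analysis step can be sketched generically: variables are standardized (making the factors independent of units of measure, as noted above) and a rotated factor model is fit. The variable names and data below are placeholders, not the study's measurement list; FactorAnalysis with varimax rotation requires scikit-learn 0.24 or later.

    # Sketch: R-mode factor analysis on standardized sediment variables.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(3)
    variables = ["clay_frac", "Al", "Fe", "Mn", "PAH", "PCB", "CaCO3", "surf_area"]
    X = rng.standard_normal((60, len(variables)))      # placeholder data
    X[:, 1:6] += 2.0 * X[:, [0]]                       # fine fraction drives metals/organics

    Z = StandardScaler().fit_transform(X)              # unit-free variables
    fa = FactorAnalysis(n_components=4, rotation="varimax").fit(Z)
    loadings = fa.components_.T                        # variables x factors
    for name, row in zip(variables, loadings):
        print(f"{name:>10s} " + " ".join(f"{v:6.2f}" for v in row))

High loadings of the grain-size variable together with metals and organics on one factor would reproduce the kind of grain-size/contaminant association the abstract describes.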
Muths, Delphine; Le Couls, Sarah; Evano, Hugues; Grewe, Peter; Bourjea, Jerome
2013-01-01
Genetic population structure of the swordfish Xiphias gladius was examined based on 2231 individual samples, collected mainly between 2009 and 2010 among three major sampling areas within the Indian Ocean (IO; twelve distinct sites), Atlantic (two sites) and Pacific (one site) Oceans, using nineteen microsatellite loci (n = 2146) and mitochondrial ND2 sequence (n = 2001) data. Sample collection was stratified in time and space in order to investigate the stability of the observed genetic structure, with a special focus on the South West Indian Ocean. Significant AMOVA variance was observed for both markers, indicating that genetic population subdivision was present between oceans. The overall F-statistics for ND2 sequences confirmed that Atlantic and Indian Ocean swordfish represent two distinct genetic stocks. Indo-Pacific differentiation was also significant but lower than that observed between the Atlantic and Indian Oceans. However, microsatellite F-statistics failed to reveal structure even at the inter-oceanic scale, indicating that the resolving power of our microsatellite loci was insufficient for detecting population subdivision. At the scale of the Indian Ocean, results obtained from both markers are consistent with swordfish belonging to a single panmictic population. Analyses partitioned by sampling area, season, or sex also failed to identify any clear structure within this ocean. Such large spatial and temporal homogeneity of genetic structure, observed for such a highly mobile pelagic species, suggests that it is satisfactory to treat swordfish as a single panmictic population in the Indian Ocean. PMID:23717447
Energy drink consumption in Italian university students: food habits and lifestyle.
Vitiello, V; Diolordi, L; Pirrone, M; Donini, L M; Del Balzo, V
2016-01-01
The aim of this study was to investigate the consumption of energy drinks (ED) among young people, both alone and in association with alcohol, as well as the food habits and lifestyle of ED consumers. An anonymous closed-ended questionnaire was administered to a sample of students. The questionnaire is composed of 30 questions with multiple answers. The students, who come from different regions in Italy, were enrolled at two Italian universities: Rome and Cagliari. T-tests and analysis of variance (ANOVA) were performed, and the chi-square test was used to compare observed and expected frequencies. The sample was composed of 618 females and 389 males and revealed statistically significant gender differences in lifestyle and food habits. About two thirds of the sample had consumed ED at least once. ED consumers in the total sample accounted for 655 students (65.0%); 41.3% of the females and 58.8% of the males were ED consumers. Habitual consumers represent 15.8% of the ED consumers, and occasional consumers 84.2%. Habitual and occasional consumers show statistically significant differences in both lifestyle and food habits. Of the ED consumers, 72.1% drink ED in association with alcohol (ED-based cocktails). Our results suggest that it would be advisable to inform consumers about the side effects of excessive ED use, particularly when combined with alcohol: this combination is known to have adverse effects on the cardiovascular and nervous systems, leading in particular to sleeping disorders.
[Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].
Suzukawa, Yumi; Toyoda, Hideki
2012-04-01
This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in fields such as perception, cognition or learning, the effect sizes were relatively large, although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, even trivially small effects reached statistical significance. This implies that researchers who could not obtain large enough effect sizes would use larger samples to obtain significant results.
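The per-test computation underlying such a survey is a power calculation from the sample effect size and sample size. A minimal sketch using statsmodels for a two-sample t-test (one of many test types such a survey covers; the effect sizes and group sizes below are illustrative):

    # Sketch: sample statistical power for a two-sample t-test given an
    # observed effect size (Cohen's d) and per-group sample size.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d, n in [(0.8, 15), (0.5, 40), (0.2, 300)]:
        power = analysis.solve_power(effect_size=d, nobs1=n, alpha=0.05,
                                     ratio=1.0, alternative="two-sided")
        print(f"d = {d:.1f}, n = {n:4d} per group -> power = {power:.2f}")

The first two rows mimic the perception/cognition pattern (large d, small n); the last mimics a large-sample field where even d = 0.2 is reliably detected.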
Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan
2010-09-01
Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genomes. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and to help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessments and other environmental models. This model was validated by both statistical simulations and real-world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It performed reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvements in the precision of sample processing and qPCR reactions would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator for which a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
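A heavily simplified, single-host sketch of the Monte Carlo correction: sensitivity, cross-reacting background, and lognormal measurement error are drawn from assumed distributions (all invented for illustration; the paper estimates them from reference fecal samples), and the observation equation is inverted draw by draw to yield a distribution of true concentrations.

    # Sketch: Monte Carlo correction of a qPCR marker concentration via the
    # Law of Total Probability. One-host toy version; all distributions are
    # illustrative stand-ins for empirically estimated ones.
    import numpy as np

    rng = np.random.default_rng(4)
    n_draws = 100_000
    c_obs = 5_000.0                               # observed copies per 100 mL

    sens = rng.beta(40, 4, n_draws)               # P(detect | target present)
    false_bg = rng.gamma(2.0, 100.0, n_draws)     # cross-reaction background
    meas_err = rng.lognormal(0.0, 0.1, n_draws)   # sd 0.1: the 'small precision error' case

    # Invert c_obs = (c_true * sens + false_bg) * meas_err for c_true.
    c_true = np.clip((c_obs / meas_err - false_bg) / sens, 0, None)
    lo, med, hi = np.percentile(c_true, [2.5, 50, 97.5])
    print(f"true concentration ~ {med:.0f} (95% interval {lo:.0f}-{hi:.0f})")

Raising the measurement-error standard deviation widens the output interval sharply, mirroring the paper's finding that precision error dominates model performance.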
Trends in suspended-sediment concentration at selected stream sites in Kansas, 1970-2002
Putnam, James E.; Pope, Larry M.
2003-01-01
Knowledge of erosion, transport, and deposition of sediment relative to streams and impoundments is important to those involved directly or indirectly in the development and management of water resources. Monitoring the quantity of sediment in streams and impoundments is important because: (1) sediment may degrade the water quality of streams for such uses as municipal water supply, (2) sediment is detrimental to the health of some species of aquatic animals and plants, and (3) accumulation of sediment in water-supply impoundments decreases the amount of storage and, therefore, water available for users. One of the objectives of the Kansas Water Plan is to reduce the amount of sediment in Kansas streams by 2010. During the last 30 years, millions of dollars have been spent in Kansas watersheds to reduce sediment transport to streams. Because the last evaluation of trends in suspended-sediment concentrations in Kansas was completed in 1985, 14 sediment sampling sites that represent 10 of the 12 major river basins in Kansas were reestablished in 2000. The purpose of this report is to present the results of time-trend analyses at the reestablished sediment data-collection sites for the period of about 1970-2002 and to evaluate changes in the watersheds that may explain the trends. Time-trend tests for 13 of 14 sediment sampling sites in Kansas for the period from about 1970 to 2002 indicated that 3 of the 13 sites tested had statistically significant decreasing suspended-sediment concentrations; however, only 2 sites, Walnut River at Winfield and Elk River at Elk Falls, had trends that were statistically significant at the 0.05 probability level. Increasing suspended-sediment concentrations were indicated at three sites although none were statistically significant at the 0.05 probability level. Samples from five of the six sampling sites located upstream from reservoirs indicated decreasing suspended-sediment concentrations. Watershed impoundments located in the respective river basins may contribute to the decreasing suspended-sediment trends exhibited at most of the sampling sites because the impoundments are designed to trap sediment. Both sites that exhibited statistically significant decreasing suspended-sediment concentrations have a large number of watershed impoundments located in their respective drainage basins. The relation between percentage of the watershed affected by impoundments and trend in suspended-sediment concentration for 11 sites indicated that, as the number of impoundments in the watershed increases, suspended-sediment concentration decreases. Other conservation practices, such as terracing of farm fields and contour farming, also may contribute to the reduced suspended-sediment concentrations if their use has increased during the period of analysis. Regression models were developed for 13 of 14 sediment sampling sites in Kansas and can be used to estimate suspended-sediment concentration if the range in stream discharge for which they were developed is not exceeded and if time trends in suspended-sediment concentrations are not significant. For those sites that had a statistically significant trend in suspended-sediment concentration, a second regression model was developed using samples collected during 2000-02. Past and current studies by the U.S. Geological Survey have shown that regression models can be developed between in-stream measurements of turbidity and laboratory-analyzed sediment samples.
Regression models were developed for the relations between discharge and suspended-sediment concentration and between turbidity and suspended-sediment concentration for 10 sediment sampling sites using samples collected during 2000-02.
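Discharge-to-concentration regressions of this kind are conventionally fit in log space (a sediment rating curve). The report's actual model forms are not reproduced here; the sketch below shows the standard log-log fit on synthetic data, with all coefficients illustrative.

    # Sketch: sediment rating curve -- regression of suspended-sediment
    # concentration (SSC, mg/L) on stream discharge (Q), fit in log space.
    import numpy as np

    rng = np.random.default_rng(5)
    discharge = rng.lognormal(3.0, 1.0, 120)                   # synthetic Q
    ssc = 4.0 * discharge**0.7 * rng.lognormal(0.0, 0.3, 120)  # synthetic SSC

    b, log_a = np.polyfit(np.log(discharge), np.log(ssc), 1)
    print(f"SSC = {np.exp(log_a):.2f} * Q^{b:.2f}")
    # Per the report: apply only within the fitted discharge range, and
    # only where no significant time trend in SSC exists.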
Statistical Inference for Data Adaptive Target Parameters.
Hubbard, Alan E; Kherad-Pajouh, Sara; van der Laan, Mark J
2016-05-01
Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-sized subsamples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data adaptive statistical target parameter as the average of these V sample-specific target parameters. We present an estimator (and corresponding central limit theorem) of this type of data adaptive target parameter. This general methodology for generating data adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference in problems that are increasingly being addressed by clever, yet ad hoc, pattern finding methods. To suggest such potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules, are shown and give insight into how to structure such an approach. The results show that the data adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.
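A toy sketch of the V-fold scheme: in each split, the parameter-generating sample chooses the target data-adaptively (here, the covariate most correlated with the outcome, an invented example algorithm), the disjoint estimation sample estimates it, and the V estimates are averaged. The crude standard error stands in for the paper's central limit theorem machinery.

    # Sketch: sample-split data adaptive target parameter with V folds.
    import numpy as np
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(6)
    n, p, V = 500, 10, 5
    X = rng.standard_normal((n, p))
    y = 0.5 * X[:, 3] + rng.standard_normal(n)      # covariate 3 matters

    estimates = []
    for gen_idx, est_idx in KFold(V, shuffle=True, random_state=0).split(X):
        # Parameter-generating sample: pick the target data-adaptively.
        corrs = [abs(np.corrcoef(X[gen_idx, j], y[gen_idx])[0, 1]) for j in range(p)]
        j_star = int(np.argmax(corrs))
        # Estimation sample: regression coefficient for the chosen target.
        x = X[est_idx, j_star]
        estimates.append(np.cov(x, y[est_idx])[0, 1] / x.var(ddof=1))

    est = np.mean(estimates)
    se = np.std(estimates, ddof=1) / np.sqrt(V)     # crude CLT-style SE
    print(f"data adaptive estimate = {est:.3f} +/- {1.96 * se:.3f}")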
NASA Astrophysics Data System (ADS)
Gavilan, C.; Grunwald, S.; Quiroz, R.; Zhu, L.
2015-12-01
The Andes represent the largest and highest mountain range in the tropics. Geological and climatic differentiation favored landscape and soil diversity, resulting in ecosystems adapted to very different climatic patterns. Although several studies support the fact that the Andes are a vast sink of soil organic carbon (SOC), only a few have quantified this variable in situ. Estimating the spatial distribution of SOC stocks in data-poor and/or poorly accessible areas, like the Andean region, is challenging due to the lack of recent soil data at high spatial resolution and the wide range of coexistent ecosystems. Thus, the sampling strategy is vital in order to ensure the whole range of environmental covariates (EC) controlling SOC dynamics is represented. This approach allows grasping the variability of the area, which leads to more efficient statistical estimates and improves the modeling process. The objectives of this study were to (i) characterize and model the spatial distribution of SOC stocks in the Central Andean region using soil-landscape modeling techniques, and (ii) validate and evaluate the model for predicting SOC content in the area. For that purpose, three representative study areas were identified and a suite of variables including elevation, mean annual temperature, annual precipitation and Normalized Difference Vegetation Index (NDVI), among others, was selected as EC. A stratified random sampling design (namely, conditioned Latin hypercube sampling) was implemented and a total of 400 sampling locations were identified. At all sites, four composite topsoil samples (0-30 cm) were collected within a 2 m radius. SOC content was measured using dry combustion, and SOC stocks were estimated using bulk density measurements. Regression kriging was used to map the spatial variation of SOC stocks. The accuracy, fit and bias of the SOC models were assessed using a rigorous validation assessment. This study produced the first comprehensive, geospatial SOC stock assessment in this undersampled region, which serves as a baseline reference to assess potential impacts of climate and land use change.
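Regression kriging combines a trend model on the environmental covariates with spatial interpolation of its residuals. The sketch below uses a linear regression plus a Gaussian process on the residuals as a stand-in for residual kriging (the two are closely related, but this is not the study's implementation); coordinates, covariates, and coefficients are all synthetic placeholders.

    # Sketch: regression kriging for SOC stocks -- covariate trend plus a
    # Gaussian process on the spatial residuals. Synthetic data only.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(7)
    n = 400                                          # sampling locations
    coords = rng.uniform(0, 100, (n, 2))             # km
    covs = rng.standard_normal((n, 4))               # elevation, MAT, precip, NDVI
    soc = (60 + covs @ np.array([8.0, -5.0, 3.0, 6.0])
           + 10 * np.sin(coords[:, 0] / 15) + rng.standard_normal(n))

    trend = LinearRegression().fit(covs, soc)        # regression part
    resid = soc - trend.predict(covs)
    gp = GaussianProcessRegressor(RBF(20.0) + WhiteKernel(1.0)).fit(coords, resid)

    new_coords = rng.uniform(0, 100, (5, 2))
    new_covs = rng.standard_normal((5, 4))
    pred = trend.predict(new_covs) + gp.predict(new_coords)
    print(np.round(pred, 1))                         # predicted SOC stocks (synthetic units)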
A method for the measurement and analysis of ride vibrations of transportation systems
NASA Technical Reports Server (NTRS)
Catherines, J. J.; Clevenson, S. A.; Scholl, H. F.
1972-01-01
The measurement and recording of ride vibrations which affect passenger comfort in transportation systems and the subsequent data-reduction methods necessary for interpreting the data present exceptional instrumentation requirements and necessitate the use of computers for specialized analysis techniques. A method is presented for both measuring and analyzing ride vibrations of the type encountered in ground and air transportation systems. A portable system for measuring and recording low-frequency, low-amplitude accelerations and specialized data-reduction procedures are described. Sample vibration measurements in the form of statistical parameters representative of typical transportation systems are also presented to demonstrate the utility of the techniques.
Measurement of the proton structure function F2 from the 1993 HERA data
NASA Astrophysics Data System (ADS)
Derrick, M.; Krakauer, D.; Magill, S.; Musgrave, B.; Repond, J.; Schlereth, J.; Stanek, R.; Talaga, R. L.; Thron, J.; Arzarello, F.; Ayad, R.; Bari, G.; Basile, M.; Bellagamba, L.; Boscherini, D.; Bruni, A.; Bruni, G.; Bruni, P.; Cara Romeo, G.; Castellini, G.; Chiarini, M.; Cifarelli, L.; Cindolo, F.; Ciralli, F.; Contin, A.; D'Auria, S.; Frasconi, F.; Giusti, P.; Iacobucci, G.; Laurenti, G.; Levi, G.; Margotti, A.; Massam, T.; Nania, R.; Nemoz, C.; Palmonari, F.; Polini, A.; Sartorelli, G.; Timellini, R.; Zamora Garcia, Y.; Zichichi, A.; Bargende, A.; Crittenden, J.; Desch, K.; Diekmann, B.; Doeker, T.; Feld, L.; Frey, A.; Geerts, M.; Geitz, G.; Grothe, M.; Hartmann, H.; Haun, D.; Heinloth, K.; Hilger, E.; Jakob, H.-P.; Katz, U. F.; Mari, S. M.; Mass, A.; Mengel, S.; Mollen, J.; Paul, E.; Rembser, Ch.; Schattevoy, R.; Schneider, J.-L.; Schramm, D.; Stamm, J.; Wedemeyer, R.; Campbell-Robson, S.; Cassidy, A.; Dyce, N.; Foster, B.; George, S.; Gilmore, R.; Heath, G. P.; Heath, H. F.; Llewellyn, T. J.; Morgado, C. J. S.; Norman, D. J. P.; O'Mara, J. A.; Tapper, R. J.; Wilson, S. S.; Yoshida, R.; Rau, R. R.; Arneodo, M.; Iannotti, L.; Schioppa, M.; Susinno, G.; Bernstein, A.; Caldwell, A.; Gialas, I.; Parsons, J. A.; Ritz, S.; Sciulli, F.; Straub, P. B.; Wai, L.; Yang, S.; Borzemski, P.; Chwastowski, J.; Eskreys, A.; Piotrzkowski, K.; Zachara, M.; Zawiejski, L.; Adamczyk, L.; Bednarek, B.; Eskreys, K.; Jeleń, K.; Kisielewska, D.; Kowalski, T.; Rulikowska-Zarębska, E.; Suszycki, L.; Zając, J.; Kędzierski, T.; Kotański, A.; Przybycień, M.; Bauerdick, L. A. T.; Behrens, U.; Bienlein, J. K.; Böttcher, S.; Coldewey, C.; Drews, G.; Flasiński, M.; Gilkinson, D. J.; Göttlicher, P.; Gutjahr, B.; Haas, T.; Hagge, L.; Hain, W.; Hasell, D.; Heßling, H.; Hultschig, H.; Iga, Y.; Joos, P.; Kasemann, M.; Klanner, R.; Koch, W.; Köpke, L.; Kötz, U.; Kowalski, H.; Kröger, W.; Krüger, J.; Labs, J.; Ladage, A.; Löhr, B.; Löwe, M.; Lüke, D.; Mainusch, J.; Mańczak, O.; Ng, J. S. T.; Nickel, S.; Notz, D.; Ohrenberg, K.; Roco, M.; Rohde, M.; Roldán, J.; Schneekloth, U.; Schulz, W.; Selonke, F.; Stiliaris, E.; Voß, T.; Westphal, D.; Wolf, G.; Youngman, C.; Grabosch, H. J.; Leich, A.; Meyer, A.; Rethfeldt, C.; Schlenstedt, S.; Barbagli, G.; Pelfer, P.; Anzivino, G.; Maccarrone, G.; de Pasquale, S.; Qian, S.; Votano, L.; Bamberger, A.; Freidhof, A.; Poser, T.; Söldner-Rembold, S.; Schroeder, J.; Theisen, G.; Trefzger, T.; Brook, N. H.; Bussey, P. J.; Doyle, A. T.; Fleck, I.; Jamieson, V. A.; Saxon, D. H.; Utley, M. L.; Wilson, A. S.; Dannemann, A.; Holm, U.; Horstmann, D.; Kammerlocher, H.; Krebs, B.; Neumann, T.; Sinkus, R.; Wick, K.; Badura, E.; Burow, B. D.; Fürtjes, A.; Lohrmann, E.; Milewski, J.; Nakahata, M.; Pavel, N.; Poelz, G.; Schott, W.; Terron, J.; Zetsche, F.; Bacon, T. C.; Beuselinck, R.; Butterworth, I.; Gallo, E.; Harris, V. L.; Hung, B. H.; Long, K. R.; Miller, D. B.; Morawitz, P. P. O.; Prinias, A.; Sedgbeer, J. K.; Whitfield, A. F.; Mallik, U.; McCliment, E.; Wang, M. Z.; Zhang, Y.; Cloth, P.; Filges, D.; An, S. H.; Hong, S. M.; Nam, S. W.; Park, S. K.; Suh, M. H.; Yon, S. H.; Imlay, R.; Kartik, S.; Kim, H.-J.; McNeil, R. R.; Metcalf, W.; Nadendla, V. K.; Barreiro, F.; Cases, G.; Graciani, R.; Hernández, J. M.; Hervás, L.; Labarga, L.; Del Peso, J.; Puga, J.; de Trocóniz, J. F.; Ikraiam, F.; Mayer, J. K.; Smith, G. R.; Corriveau, F.; Hanna, D. S.; Hartmann, J.; Hung, L. W.; Lim, J. N.; Matthews, C. G.; Mitchell, J. W.; Patel, P. M.; Sinclair, L. E.; Stairs, D. G.; St. 
Laurent, M.; Ullmann, R.; Bashkirov, V.; Dolgoshein, B. A.; Stifutkin, A.; Bashindzhagyan, G. L.; Ermolov, P. F.; Gladilin, L. K.; Golubkov, Y. A.; Kobrin, V. D.; Kuzmin, V. A.; Proskuryakov, A. S.; Savin, A. A.; Shcheglova, L. M.; Solomin, A. N.; Zotov, N. P.; Bentvelsen, S.; Botje, M.; Chlebana, F.; Dake, A.; Engelen, J.; de Jong, P.; de Kamps, M.; Kooijman, P.; Kruse, A.; O'Dell, V.; Tenner, A.; Tiecke, H.; Verkerke, W.; Vreeswijk, M.; Wiggers, L.; de Wolf, E.; van Woudenberg, R.; Acosta, D.; Bylsma, B.; Durkin, L. S.; Honscheid, K.; Li, C.; Ling, T. Y.; McLean, K. W.; Murray, W. N.; Park, I. H.; Romanowski, T. A.; Seidlein, R.; Bailey, D. S.; Blair, G. A.; Byrne, A.; Cashmore, R. J.; Cooper-Sarkar, A. M.; Daniels, D.; Devenish, R. C. E.; Harnew, N.; Lancaster, M.; Luffman, P. E.; Lindemann, L.; McFall, J.; Nath, C.; Quadt, A.; Uijterwaal, H.; Walczak, R.; Wilson, F. F.; Yip, T.; Abbiendi, G.; Bertolin, A.; Brugnera, R.; Carlin, R.; Dal Corso, F.; de Giorgi, M.; Dosselli, U.; Gasparini, F.; Limentani, S.; Morandin, M.; Posocco, M.; Stanco, L.; Stroili, R.; Voci, C.; Bulmahn, J.; Butterworth, J. M.; Feild, R. G.; Oh, B. Y.; Whitmore, J. J.; D'Agostini, G.; Iori, M.; Marini, G.; Mattioli, M.; Nigro, A.; Tassi, E.; Hart, J. C.; McCubbin, N. A.; Prytz, K.; Shah, T. P.; Short, T. L.; Barberis, E.; Cartiglia, N.; Dubbs, T.; Heusch, C.; van Hook, M.; Hubbard, B.; Lockman, W.; Rahn, J. T.; Sadrozinski, H. F.-W.; Seiden, A.; Biltzinger, J.; Seifert, R. J.; Walenta, A. H.; Zech, G.; Abramowicz, H.; Briskin, G.; Dagan, S.; Levy, A.; Hasegawa, T.; Hazumi, M.; Ishii, T.; Kuze, M.; Mine, S.; Nagasawa, Y.; Nagira, T.; Nakao, M.; Suzuki, I.; Tokushuku, K.; Yamada, S.; Yamazaki, Y.; Chiba, M.; Hamatsu, R.; Hirose, T.; Homma, K.; Kitamura, S.; Nagayama, S.; Nakamitsu, Y.; Cirio, R.; Costa, M.; Ferrero, M. I.; Lamberti, L.; Maselli, S.; Peroni, C.; Sacchi, R.; Solano, A.; Staiano, A.; Dardo, M.; Bandyopadhyay, D.; Benard, F.; Brkic, M.; Crombie, M. B.; Gingrich, D. M.; Hartner, G. F.; Joo, K. K.; Levman, G. M.; Martin, J. F.; Orr, R. S.; Sampson, C. R.; Teuscher, R. J.; Catterall, C. D.; Jones, T. W.; Kaziewicz, P. B.; Lane, J. B.; Saunders, R. L.; Shulman, J.; Blankenship, K.; Kochocki, J.; Lu, B.; Mo, L. W.; Bogusz, W.; Charchula, K.; Ciborowski, J.; Gajewski, J.; Grzelak, G.; Kasprzak, M.; Krzyżanowski, M.; Muchorowski, K.; Nowak, R. J.; Pawlak, J. M.; Tymieniecka, T.; Wróblewski, A. K.; Zakrzewski, J. A.; Żarnecki, A. F.; Adamus, M.; Eisenberg, Y.; Glasman, C.; Karshon, U.; Revel, D.; Shapira, A.; Ali, I.; Behrens, B.; Dasu, S.; Fordham, C.; Foudas, C.; Goussiou, A.; Loveless, R. J.; Reeder, D. D.; Silverstein, S.; Smith, W. H.; Tsurugai, T.; Bhadra, S.; Frisken, W. R.; Furutani, K. M.
1995-09-01
The ZEUS detector has been used to measure the proton structure function F2. During 1993 HERA collided 26.7 GeV electrons on 820 GeV protons. The data sample corresponds to an integrated luminosity of 0.54 pb-1, representing a twentyfold increase in statistics compared to that of 1992. Results are presented for 7 < Q^2 < 10^4 GeV^2 and x values as low as 3×10^-4. The rapid rise in F2 as x decreases observed previously is now studied in greater detail and persists for Q^2 values up to 500 GeV^2.
NASA Astrophysics Data System (ADS)
De La Cruz-Agüero, José; García-Rodríguez, Francisco Javier; Cota-Gómez, Víctor Manuel; Melo-Barrera, Felipe Neri; González-Armas, Rogelio
2012-06-01
Fresh and preserved (type material) specimens of the black ghost chimaera Hydrolagus melanophasma were compared for morphometric characteristics. A molecular comparison was also performed on two mitochondrial gene sequences (12S rRNA and 16S rRNA gene sequences). While significant differences in measurements were found, the differences were not attributable to sexual dimorphism or the quality of the specimens, but to the sample size and the type of statistical tests. The result of the genetic characterization showed that 12S rRNA and 16S rRNA genes represented robust molecular markers that characterized the species.
Paoletti, Claudia; Esbensen, Kim H
2015-01-01
Material heterogeneity influences the effectiveness of sampling procedures. Most sampling guidelines used for the assessment of food and/or feed commodities are based on classical statistical distribution requirements (the normal, binomial, and Poisson distributions) and almost universally rely on the assumption of randomness. However, this is unrealistic. The scientific food and feed community recognizes a strong preponderance of nonrandom distribution within commodity lots, which should be the more realistic prerequisite for the definition of effective sampling protocols. Nevertheless, these heterogeneity issues are overlooked, as the prime focus is often placed only on financial, time, equipment, and personnel constraints instead of mandating the acquisition of documented representative samples under realistic heterogeneity conditions. This study shows how the principles promulgated in the Theory of Sampling (TOS), practically tested over 60 years, provide an effective framework for dealing with the complete set of adverse aspects of both compositional and distributional heterogeneity (material sampling errors), as well as with the errors incurred by the sampling process itself. The results of an empirical European Union study on genetically modified soybean heterogeneity, Kernel Lot Distribution Assessment, are summarized, as they have a strong bearing on the issue of proper sampling protocol development. TOS principles apply universally in the food and feed realm and must therefore be considered the only basis for the development of valid sampling protocols free from distributional constraints.
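The cost of ignoring distributional heterogeneity is easy to demonstrate numerically. In the sketch below, a lot's contaminated kernels sit in one contiguous pocket (extreme distributional heterogeneity, invented for illustration); a single grab sample and a multi-increment composite of the same total mass are compared, echoing the TOS argument for incremental sampling.

    # Sketch: grab vs composite sampling under clustered contamination.
    # A lot of 10,000 kernels contains 1% GM kernels in one pocket.
    import numpy as np

    rng = np.random.default_rng(8)
    lot = np.zeros(10_000)
    start = rng.integers(0, 9_900)
    lot[start:start + 100] = 1.0                   # contiguous GM pocket

    def grab(lot, size):                           # one increment at a random position
        i = rng.integers(0, len(lot) - size)
        return lot[i:i + size].mean()

    grabs = [grab(lot, 200) for _ in range(2_000)]                       # 1 x 200 kernels
    composites = [np.mean([grab(lot, 10) for _ in range(20)])            # 20 x 10 kernels
                  for _ in range(2_000)]
    print("grab      mean %.4f  sd %.4f" % (np.mean(grabs), np.std(grabs)))
    print("composite mean %.4f  sd %.4f" % (np.mean(composites), np.std(composites)))

Both estimators are unbiased for the 1% lot mean, but the composite's spread is several times smaller at identical sample mass: the classical random-mixing assumption hides exactly this difference.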
Scharfenberger, Christian; Wong, Alexander; Clausi, David A
2015-01-01
We propose a simple yet effective structure-guided statistical textural distinctiveness approach to salient region detection. Our method uses a multilayer approach to analyze the structural and textural characteristics of natural images as important features for salient region detection from a scale point of view. To represent the structural characteristics, we abstract the image using structured image elements and extract rotational-invariant neighborhood-based textural representations to characterize each element by an individual texture pattern. We then learn a set of representative texture atoms for sparse texture modeling and construct a statistical textural distinctiveness matrix to determine the distinctiveness between all representative texture atom pairs in each layer. Finally, we determine saliency maps for each layer based on the occurrence probability of the texture atoms and their respective statistical textural distinctiveness and fuse them to compute a final saliency map. Experimental results using four public data sets and a variety of performance evaluation metrics show that our approach provides promising results when compared with existing salient region detection approaches.
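A stripped-down, single-layer sketch of this pipeline: local patches serve as crude texture representations, k-means centroids play the role of the representative texture atoms, and each atom is scored by how distinct it is from the other atoms and how rarely it occurs. Rotational invariance, the structured image elements, sparse texture modeling, and the multilayer fusion are all omitted; this is an illustration of the idea, not the authors' method.

    # Sketch: texture-atom saliency on a toy image (single layer).
    import numpy as np
    from sklearn.cluster import KMeans
    from scipy.spatial.distance import cdist

    rng = np.random.default_rng(9)
    img = rng.random((64, 64))
    img[20:40, 20:40] += 1.0                       # a salient textured region

    k = 7                                          # 7x7 patches as texture vectors
    patches = np.lib.stride_tricks.sliding_window_view(img, (k, k)).reshape(-1, k * k)
    km = KMeans(n_clusters=8, n_init=5, random_state=0).fit(patches)

    occ = np.bincount(km.labels_, minlength=8) / len(km.labels_)   # occurrence prob.
    distinct = cdist(km.cluster_centers_, km.cluster_centers_)     # atom-pair distinctiveness
    atom_saliency = distinct.mean(axis=1) * (1 - occ)              # distinct and rare
    saliency = atom_saliency[km.labels_].reshape(64 - k + 1, 64 - k + 1)
    print(saliency.round(2)[::8, ::8])             # coarse view of the saliency map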
Bacterial diversity of surface sand samples from the Gobi and Taklamaken deserts.
An, Shu; Couteau, Cécile; Luo, Fan; Neveu, Julie; DuBow, Michael S
2013-11-01
Arid regions represent nearly 30 % of the Earth's terrestrial surface, but their microbial biodiversity is not yet well characterized. The surface sands of deserts, a subset of arid regions, are generally subjected to large temperature fluctuations and high UV light exposure, and are low in organic matter. We examined surface sand samples from the Taklamaken (China, three samples) and Gobi (Mongolia, two samples) deserts, using pyrosequencing of PCR-amplified 16S V1/V2 rDNA sequences from total extracted DNA in order to assess bacterial population diversity. In total, 4,088 OTUs (at the ≥97 % sequence similarity level) were discernible, with Chao1 estimates varying from 1,172 to 2,425 OTUs per sample. These could be grouped into 102 families belonging to 15 phyla, with OTUs belonging to the Firmicutes, Proteobacteria, Bacteroidetes, and Actinobacteria phyla being the most abundant. The bacterial population composition differed statistically among the samples, though members of 30 genera were common to all five samples. An increase in phylotype numbers with increasing C/N ratio was noted, suggesting a possible role for this ratio in the bacterial richness of these desert sand environments. Our results imply an unexpectedly large bacterial diversity residing in the harsh environment of these two Asian deserts, worthy of further investigation.
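For readers unfamiliar with the Chao1 richness estimator cited above, a hedged sketch follows; the OTU counts are invented, and real input would be per-sample OTU tables:

```python
# Chao1 richness estimator: S_chao1 = S_obs + F1^2 / (2 * F2),
# where F1 = number of singleton OTUs and F2 = number of doubleton OTUs.
import numpy as np

def chao1(otu_counts):
    counts = np.asarray(otu_counts)
    s_obs = np.count_nonzero(counts)   # observed OTU richness
    f1 = np.sum(counts == 1)           # OTUs seen exactly once
    f2 = np.sum(counts == 2)           # OTUs seen exactly twice
    if f2 == 0:                        # bias-corrected form avoids divide-by-zero
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 ** 2 / (2.0 * f2)

toy_sample = [1, 1, 2, 5, 17, 1, 2, 40, 3]   # reads per OTU (hypothetical)
print(f"Chao1 estimate: {chao1(toy_sample):.1f}")
```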
Two sampling techniques for game meat.
van der Merwe, Maretha; Jooste, Piet J; Hoffman, Louw C; Calitz, Frikkie J
2013-03-20
A study was conducted to compare the excision sampling technique used by the export market with the sampling technique preferred by European countries, namely the biotrace cattle and swine test. The measuring unit for excision sampling was grams (g), and for the swabbing technique square centimetres (cm²). The two techniques were compared after a pilot test on spiked approved beef carcasses (n = 12) statistically confirmed that the two measuring units correlated. The two sampling techniques were then applied to the same game carcasses (n = 13), and analyses were performed for aerobic plate count (APC), Escherichia coli and Staphylococcus aureus for both techniques. Swabbing gave a more representative result and caused no damage to the carcass, whereas the excision technique recovered fewer organisms and caused minor damage to the carcass. The recovery ratio of swabbing over excision was 5.4 for APC, 108.0 for E. coli and 3.4 for S. aureus. It was concluded that both excision and swabbing can be used to obtain bacterial profiles from export and local carcasses, and could indicate whether game carcasses intended for the local market are on par with those intended for the export market and therefore safe for human consumption.
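The recovery-ratio comparison reduces to simple arithmetic on paired mean counts. The counts below are hypothetical placeholders chosen only to reproduce the reported ratios of 5.4 (APC), 108.0 (E. coli) and 3.4 (S. aureus):

```python
# Worked sketch of the recovery-ratio calculation: mean swab count over
# mean excision count per organism group. All counts are invented.
swab_counts = {"APC": 5400.0, "E. coli": 108.0, "S. aureus": 340.0}    # per cm2
excision_counts = {"APC": 1000.0, "E. coli": 1.0, "S. aureus": 100.0}  # per g

for organism in swab_counts:
    ratio = swab_counts[organism] / excision_counts[organism]
    print(f"{organism:10s} recovery ratio (swab/excision): {ratio:.1f}")
```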
Forootan, Ehsan; Kusche, Jürgen
2016-04-01
Geodetic/geophysical observations, such as time series of global terrestrial water storage change or of sea level and temperature change, represent samples of physical processes and therefore contain information about complex physical interactions with many inherent time scales. Extracting relevant information from these samples, for example quantifying the seasonality of a physical process or its variability due to large-scale ocean-atmosphere interactions, is not possible with simple time series approaches. In recent decades, decomposition techniques have attracted increasing interest for extracting patterns from geophysical observations. Traditionally, principal component analysis (PCA) and, more recently, independent component analysis (ICA) are common techniques to extract statistically orthogonal (uncorrelated) and independent modes, respectively, that represent the maximum variance of observations. PCA and ICA can be classified as stationary signal decomposition techniques, since they are based on decomposing the auto-covariance matrix or diagonalizing higher (than two)-order statistical tensors derived from centered time series. However, the stationarity assumption is obviously not justifiable for many geophysical and climate variables, even after removing cyclic components, e.g., the seasonal cycles. In this paper, we present a new decomposition method, complex independent component analysis (CICA; Forootan, PhD-2014), which can be applied to extract non-stationary (changing in space and time) patterns from geophysical time series. Here, CICA is derived as an extension of real-valued ICA (Forootan and Kusche, JoG-2012), where we (i) define a new complex data set using a Hilbert transformation: the complex time series contain the observed values in their real part and the temporal rate of variability in their imaginary part; (ii) apply an ICA algorithm based on the diagonalization of fourth-order cumulants to decompose the complex data set from (i); and (iii) recognize dominant non-stationary patterns as independent complex patterns that represent amplitude and phase propagation in space and time. We present results of CICA on simulated and real cases, e.g., for quantifying the impact of large-scale ocean-atmosphere interaction on global mass changes. Forootan (PhD-2014) Statistical signal decomposition techniques for analyzing time-variable satellite gravimetry data, PhD thesis, University of Bonn, http://hss.ulb.uni-bonn.de/2014/3766/3766.htm. Forootan and Kusche (JoG-2012) Separation of global time-variable gravity signals into maximally independent components, Journal of Geodesy 86(7), 477-497, doi:10.1007/s00190-011-0532-5.
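Step (i) of the method, the Hilbert-transform construction of the complex data set, is easy to sketch. The decomposition shown as a stand-in for step (ii) is a complex PCA (eigendecomposition of the Hermitian covariance), not the paper's fourth-order-cumulant ICA; the toy series are our assumptions:

```python
# Sketch of step (i) of CICA: build a complex data set whose real part is
# the observed series and whose imaginary part is its Hilbert transform
# (interpreted as the temporal rate of variability). A complex PCA then
# illustrates complex-domain decomposition; the actual CICA algorithm
# diagonalizes fourth-order cumulants instead.
import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(3)
t = np.linspace(0, 10, 500)
# Two toy "station" records: a seasonal cycle and a drifting oscillation.
X = np.vstack([
    np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=t.size),
    np.sin(2 * np.pi * t * (1 + 0.05 * t)) + 0.1 * rng.normal(size=t.size),
])

Z = hilbert(X, axis=1)                 # analytic signal: X + i * H(X)
Z -= Z.mean(axis=1, keepdims=True)     # center the complex time series
C = Z @ Z.conj().T / Z.shape[1]        # Hermitian covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # complex modes (stand-in for ICA)
print("explained variance:", (eigvals[::-1] / eigvals.sum()).round(3))
# Each complex mode carries amplitude AND phase, which is what allows the
# representation of non-stationary (propagating) patterns.
```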
19 CFR 151.52 - Sampling procedures.
Code of Federal Regulations, 2012 CFR
2012-04-01
.... Representative commercial moisture and assay samples shall be taken under Customs supervision for testing by the Customs laboratory. The samples used for the moisture test shall be representative of the shipment at the... verified commercial moisture sample and prepared assay sample certified to be representative of the...
Labad, Javier; Martorell, Lourdes; Gaviria, Ana; Bayón, Carmen; Vilella, Elisabet; Cloninger, C. Robert
2015-01-01
Objectives. The psychometric properties regarding sex and age for the revised version of the Temperament and Character Inventory (TCI-R) and its derived short version, the Temperament and Character Inventory (TCI-140), were evaluated with a randomized sample from the community. Methods. A randomized sample of 367 normal adult subjects from a Spanish municipality, representative of the general population by sex and age, participated in the current study. Descriptive statistics and internal consistency according to the α coefficient were obtained for all of the dimensions and facets. T-tests and univariate analyses of variance, followed by Bonferroni tests, were conducted to compare the distributions of the TCI-R dimension scores by age and sex. Results. On both the TCI-R and TCI-140, women had higher scores for Harm Avoidance, Reward Dependence and Cooperativeness than men, whereas men had higher scores for Persistence. Age correlated negatively with Novelty Seeking, Reward Dependence and Cooperativeness, and positively with Harm Avoidance and Self-transcendence. Young subjects between 18 and 35 years had higher scores than older subjects in Novelty Seeking and Reward Dependence, whereas subjects between 51 and 77 years scored higher in both Harm Avoidance and Self-transcendence. The alphas for the dimensions were between 0.74 and 0.87 for the TCI-R and between 0.63 and 0.83 for the TCI-140. Conclusion. The results, obtained with a randomized sample, suggest that there are specific distributions of personality traits by sex and age. Overall, both the TCI-R and the abbreviated TCI-140 were reliable in the 'good-to-excellent' range. A strength of the current study is the representativeness of the sample. PMID:26713237
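The internal-consistency statistic reported above (Cronbach's alpha) can be computed directly from an item-score matrix. The sketch below uses simulated responses; real input would be the TCI-R item scores:

```python
# Cronbach's alpha = (k / (k - 1)) * (1 - sum(item variances) / var(total)),
# where k is the number of items. Responses below are simulated.
import numpy as np

def cronbach_alpha(items):
    """items: (n_subjects, k_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(4)
latent = rng.normal(size=(367, 1))                  # one trait, 367 subjects
items = latent + 1.5 * rng.normal(size=(367, 10))   # 10 noisy items
print(f"alpha = {cronbach_alpha(items):.2f}")        # roughly 0.8 here
```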