Austin, Peter C; Schuster, Tibor; Platt, Robert W
2015-10-15
Estimating statistical power is an important component of the design of both randomized controlled trials (RCTs) and observational studies. Methods for estimating statistical power in RCTs have been well described and can be implemented simply. In observational studies, statistical methods must be used to remove the effects of confounding that can occur due to non-random treatment assignment. Inverse probability of treatment weighting (IPTW) using the propensity score is an attractive method for estimating the effects of treatment using observational data. However, sample size and power calculations have not been adequately described for these methods. We used an extensive series of Monte Carlo simulations to compare the statistical power of an IPTW analysis of an observational study with time-to-event outcomes with that of an analysis of a similarly-structured RCT. We examined the impact of four factors on statistical power: the number of observed events, the prevalence of treatment, the marginal hazard ratio, and the strength of the treatment-selection process. We found that, on average, an IPTW analysis had lower statistical power than an analysis of a similarly-structured RCT, and that the difference in power increased with the strength of the treatment-selection process.
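A minimal Monte Carlo sketch of the kind of power comparison this entry describes, assuming a single confounder, a logistic treatment-selection model, exponential event and censoring times, and the scikit-learn and lifelines APIs; the sample size, hazard ratio, and selection strength below are illustrative choices, not the paper's settings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)

def one_replicate(n=1000, log_hr=np.log(0.8), selection=1.0):
    x = rng.normal(size=n)                                   # a single confounder
    z = rng.binomial(1, 1 / (1 + np.exp(-selection * x)))    # non-random treatment
    rate = 0.1 * np.exp(log_hr * z + 0.5 * x)                # hazard depends on z and x
    t, c = rng.exponential(1 / rate), rng.exponential(20.0, size=n)
    df = pd.DataFrame({"time": np.minimum(t, c),
                       "event": (t <= c).astype(int), "z": z})
    # stabilized IPTW weights from an estimated propensity score
    ps = LogisticRegression().fit(x[:, None], z).predict_proba(x[:, None])[:, 1]
    df["w"] = np.where(z == 1, z.mean() / ps, (1 - z.mean()) / (1 - ps))
    fit = CoxPHFitter().fit(df, "time", "event", weights_col="w", robust=True)
    return fit.summary.loc["z", "p"] < 0.05                  # effect detected?

power = np.mean([one_replicate() for _ in range(200)])
print(f"estimated power of the IPTW analysis: {power:.2f}")
```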
Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume
2014-06-28
Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.
A new u-statistic with superior design sensitivity in matched observational studies.
Rosenbaum, Paul R
2011-09-01
In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of a sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure", then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments (that is, it often has good Pitman efficiency), but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08, while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology.
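A minimal sketch reproducing the Wilcoxon half of the quoted example, using the standard sensitivity-analysis bounds for the signed rank statistic; the bias parameter gamma = 3 is an illustrative assumption, not a value taken from the paper.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)

def sens_pvalue(d, gamma):
    """Upper-bound p-value for Wilcoxon's signed rank statistic at bias gamma."""
    n = len(d)
    t = rankdata(np.abs(d))[d > 0].sum()        # signed rank statistic
    p_plus = gamma / (1 + gamma)                # worst-case sign probability
    mu = p_plus * n * (n + 1) / 2
    var = p_plus * (1 - p_plus) * n * (n + 1) * (2 * n + 1) / 6
    return norm.sf((t - mu) / np.sqrt(var))

# 250 pair differences ~ Normal(1/2, 1), as in the example quoted above
power = np.mean([sens_pvalue(rng.normal(0.5, 1.0, 250), gamma=3.0) < 0.05
                 for _ in range(2000)])
print(f"power of the sensitivity analysis at gamma = 3: {power:.2f}")
```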
Crosta, Fernando; Nishiwaki-Dantas, Maria Cristina; Silvino, Wilmar; Dantas, Paulo Elias Correa
2005-01-01
To verify the frequency of study design, applied statistical analysis, and approval by institutional review offices (Ethics Committees) of articles published in the "Arquivos Brasileiros de Oftalmologia" during a 10-year interval, with a subsequent comparative and critical analysis against some of the main international journals in the field of Ophthalmology. A systematic review without meta-analysis was performed. Scientific papers published in the "Arquivos Brasileiros de Oftalmologia" between January 1993 and December 2002 were reviewed by two independent reviewers and classified according to the applied study design, statistical analysis, and approval by the institutional review offices. To categorize these variables, a descriptive statistical analysis was used. After applying inclusion and exclusion criteria, 584 articles were reviewed for evaluation of statistical analysis and 725 articles for evaluation of study design. Contingency tables (23.10%) were the most frequently applied statistical method, followed by non-parametric tests (18.19%), Student's t test (12.65%), central tendency measures (10.60%), and analysis of variance (9.81%). Of the 584 reviewed articles, 291 (49.82%) presented no statistical analysis. Observational case series (26.48%) was the most frequently used type of study design, followed by interventional case series (18.48%), observational case description (13.37%), non-random clinical study (8.96%), and experimental study (8.55%). We found a higher frequency of observational clinical studies and a lack of statistical analysis in almost half of the published papers. An increase in studies with Ethics Committee approval was noted after such approval became mandatory in 1996.
APPLICATION OF STATISTICAL ENERGY ANALYSIS TO VIBRATIONS OF MULTI-PANEL STRUCTURES.
...cylindrical shell are compared with predictions obtained from statistical energy analysis. Generally good agreement is observed. The flow of mechanical... the coefficients of proportionality between power flow and average modal energy difference, which one must know in order to apply statistical energy analysis...
NASA Astrophysics Data System (ADS)
Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris
2017-04-01
Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to the static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and an operational variational data assimilation system, to provide a basis for understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions and to allow more observations to be assimilated without the need for strict background checks that would eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. In this presentation we provide details that explain the apparent benefit of using ensembles for cloudy radiance assimilation in an EnVar context.
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.
Statistical Analysis of Sport Movement Observations: the Case of Orienteering
NASA Astrophysics Data System (ADS)
Amouzandeh, K.; Karimipour, F.
2017-09-01
Study of movement observations is becoming more popular in several applications. In particular, analyzing sport movement time series has been considered a demanding area. However, most attempts at analyzing sport movement data have focused on spatial aspects of movement to extract movement characteristics, such as spatial patterns and similarities. This paper proposes statistical analysis of sport movement observations, which refers to analyzing changes in the spatial movement attributes (e.g. distance, altitude and slope) and non-spatial movement attributes (e.g. speed and heart rate) of athletes. As the case study, an example dataset of movement observations acquired during the "orienteering" sport is presented and statistically analyzed.
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced.
A new statistic for the analysis of circular data in gamma-ray astronomy
NASA Technical Reports Server (NTRS)
Protheroe, R. J.
1985-01-01
A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
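The abstract does not give the proposed statistic itself. As a baseline for comparison, here is a minimal sketch of the classical Rayleigh test of uniformity for phase data, the kind of test the new statistic is designed to outperform when only a small fraction of observations is grouped in a narrow phase range; the sample sizes are illustrative.

```python
import numpy as np

def rayleigh_test(phases):
    """Rayleigh test of uniformity for phases in [0, 2*pi); approximate p-value."""
    n = len(phases)
    r = np.hypot(np.cos(phases).sum(), np.sin(phases).sum()) / n
    z = n * r**2
    return np.exp(-z) * (1 + (2 * z - z**2) / (4 * n))   # large-sample approximation

rng = np.random.default_rng(1)
uniform = rng.uniform(0, 2 * np.pi, 200)
# 10% of events grouped near one phase: the Rayleigh p-value is often only
# borderline here, which is exactly the weakness motivating a sharper statistic
clustered = np.concatenate([rng.uniform(0, 2 * np.pi, 180),
                            rng.normal(1.0, 0.05, 20) % (2 * np.pi)])
print(rayleigh_test(uniform), rayleigh_test(clustered))
```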
Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka
2018-05-05
To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur in 6.4% of publications (95% CI 4.8% to 8.5%). This was observed in 1.1% of publications with a medical statistics expert (hereinafter 'expert') as the first author, in 3.5% if an expert was included as a coauthor, and in 12.2% if experts were not involved. In publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed at a high proportion of 20.2%. The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further analysis showed that the involvement of experts and the use of inappropriate multivariate analysis are negatively associated at the nation level (R=-0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models.
Transition-Region Ultraviolet Explosive Events in IRIS Si IV: A Statistical Analysis
NASA Astrophysics Data System (ADS)
Bartz, Allison
2018-01-01
Explosive events (EEs) in the solar transition region are characterized by broad, non-Gaussian line profiles with wings at Doppler velocities exceeding the speed of sound. We present a statistical analysis of 23 IRIS (Interface Region Imaging Spectrograph) sit-and-stare observations, observed between April 2014 and March 2017. Using the IRIS Si IV 1394 Å and 1403 Å spectral windows and the 1400 Å slit-jaw images, we have identified 581 EEs. We found that most EEs last less than 20 min and have a spatial scale along the slit of less than 10″, agreeing with measurements in previous work. We observed most EEs in active regions, regardless of date of observation, but selection bias of IRIS observations cannot be ruled out. We also present preliminary findings of optical depth effects from our statistical study.
Analysis of thrips distribution: application of spatial statistics and Kriging
John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard
1991-01-01
Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...
Analysis of Variance with Summary Statistics in Microsoft® Excel®
ERIC Educational Resources Information Center
Larson, David A.; Hsu, Ko-Cheng
2010-01-01
Students regularly are asked to solve Single Factor Analysis of Variance problems given only the sample summary statistics (number of observations per category, category means, and corresponding category standard deviations). Most undergraduate students today use Excel for data analysis of this type. However, Excel, like all other statistical…
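For readers working outside Excel, a minimal sketch of the same computation in Python: a single-factor ANOVA F test reconstructed entirely from per-category summary statistics (n, mean, standard deviation); the three example categories are made up.

```python
import numpy as np
from scipy.stats import f as f_dist

def anova_from_summary(ns, means, sds):
    ns, means, sds = map(np.asarray, (ns, means, sds))
    grand_mean = np.sum(ns * means) / ns.sum()
    ss_between = np.sum(ns * (means - grand_mean) ** 2)   # between-group SS
    ss_within = np.sum((ns - 1) * sds ** 2)               # within-group SS
    df_b, df_w = len(ns) - 1, ns.sum() - len(ns)
    f_stat = (ss_between / df_b) / (ss_within / df_w)
    return f_stat, f_dist.sf(f_stat, df_b, df_w)          # F and p-value

print(anova_from_summary(ns=[12, 15, 10],
                         means=[5.1, 6.3, 4.8],
                         sds=[1.2, 1.4, 1.1]))
```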
ERIC Educational Resources Information Center
Santos-Delgado, M. J.; Larrea-Tarruella, L.
2004-01-01
The back-titration methods are compared statistically for the determination of glycine in a nonaqueous medium of acetic acid. Important variations in the mean values of glycine due to interaction effects are observed with the analysis of variance (ANOVA) technique and a statistical study using computer software.
Examination of influential observations in penalized spline regression
NASA Astrophysics Data System (ADS)
Türkan, Semra
2013-10-01
In parametric or nonparametric regression models, the results of regression analysis are affected by anomalous observations in the data set. Thus, detection of these observations is one of the major steps in regression analysis. These observations are precisely detected by well-known influence measures, and Peña's statistic is one of them. In this study, Peña's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. Real and artificial data are used to illustrate the effectiveness of Peña's statistic relative to Cook's distance in detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's distance in detecting these observations in large data sets.
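For context, a minimal sketch of Cook's distance, the benchmark against which Peña's statistic is compared above, computed with statsmodels for an ordinary regression on synthetic data with one injected influential point; Peña's statistic itself is not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)
x[-1], y[-1] = 20.0, 0.0                      # inject an influential observation

results = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d, _ = results.get_influence().cooks_distance
print("largest Cook's distance at index:", int(np.argmax(cooks_d)))
```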
Evidence for a Global Sampling Process in Extraction of Summary Statistics of Item Sizes in a Set.
Tokita, Midori; Ueda, Sachiyo; Ishiguchi, Akira
2016-01-01
Several studies have shown that our visual system may construct a "summary statistical representation" over groups of visual objects. Although there is a general understanding that human observers can accurately represent sets of a variety of features, many questions on how summary statistics, such as an average, are computed remain unanswered. This study investigated sampling properties of visual information used by human observers to extract two types of summary statistics of item sets: average and variance. We present three models of ideal observers to extract the summary statistics: a global sampling model without sampling noise, a global sampling model with sampling noise, and a limited sampling model. We compared the performance of an ideal observer of each model with that of human observers using statistical efficiency analysis. Results suggest that summary statistics of items in a set may be computed without representing individual items, which makes it possible to discard the limited sampling account. Moreover, the extraction of summary statistics may not necessarily require the representation of individual objects with focused attention when sets contain more than four items.
The GEOS Ozone Data Assimilation System: Specification of Error Statistics
NASA Technical Reports Server (NTRS)
Stajner, Ivanka; Riishojgaard, Lars Peter; Rood, Richard B.
2000-01-01
A global three-dimensional ozone data assimilation system has been developed at the Data Assimilation Office of the NASA/Goddard Space Flight Center. The Total Ozone Mapping Spectrometer (TOMS) total ozone and the Solar Backscatter Ultraviolet (SBUV or SBUV/2) partial ozone profile observations are assimilated. The assimilation, into an off-line ozone transport model, is done using the global Physical-space Statistical Analysis Scheme (PSAS). This system became operational in December 1999. A detailed description of the statistical analysis scheme, and in particular the forecast and observation error covariance models, is given. A new global anisotropic horizontal forecast error correlation model accounts for a varying distribution of observations with latitude. Correlations are largest in the zonal direction in the tropics, where data are sparse. The forecast error variance model is proportional to the ozone field. The forecast error covariance parameters were determined by maximum likelihood estimation. The error covariance models are validated using χ² statistics. The analyzed ozone fields in the winter of 1992 are validated against independent observations from ozonesondes and the Halogen Occultation Experiment (HALOE). There is better than 10% agreement between mean HALOE and analysis fields between 70 and 0.2 hPa. The global root-mean-square (RMS) difference between TOMS observed and forecast values is less than 4%. The global RMS difference between SBUV observed and analyzed ozone between 50 and 3 hPa is less than 15%.
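A minimal sketch of a statistical analysis step in the spirit of PSAS, which solves for analysis increments in observation space: solve (H B Hᵀ + R) w = y - H x_b, then set x_a = x_b + B Hᵀ w. The forecast error variance is made proportional to the field, as described above, but the Gaussian correlation model, grid, and numbers are illustrative rather than the operational GEOS settings.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 60, 12                                   # state and observation sizes
grid = np.linspace(0, 1, n)
obs_idx = np.sort(rng.choice(n, p, replace=False))

xb = 1.0 + 0.5 * np.sin(2 * np.pi * grid)       # background (ozone-like) field
# background error covariance: variance proportional to the field,
# with Gaussian spatial correlations of length scale 0.1
corr = np.exp(-((grid[:, None] - grid[None, :]) / 0.1) ** 2)
B = 0.04 * np.outer(xb, xb) * corr

H = np.zeros((p, n))                            # observe the field at p points
H[np.arange(p), obs_idx] = 1.0
R = 0.01 * np.eye(p)                            # observation error covariance

truth = 1.0 + 0.5 * np.sin(2 * np.pi * grid + 0.3)
y = H @ truth + rng.normal(0, 0.1, p)           # synthetic observations

w = np.linalg.solve(H @ B @ H.T + R, y - H @ xb)   # observation-space solve
xa = xb + B @ H.T @ w                              # analyzed field
print("background RMS error:", np.sqrt(np.mean((xb - truth) ** 2)))
print("analysis RMS error:  ", np.sqrt(np.mean((xa - truth) ** 2)))
```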
Multiple outcomes are often measured on each experimental unit in toxicology experiments. These multiple observations typically imply the existence of correlation between endpoints, and a statistical analysis that incorporates it may result in improved inference. When both disc...
NASA Astrophysics Data System (ADS)
Donges, J. F.; Schleussner, C.-F.; Siegmund, J. F.; Donner, R. V.
2016-05-01
Studying event time series is a powerful approach for analyzing the dynamics of complex dynamical systems in many fields of science. In this paper, we describe the method of event coincidence analysis to provide a framework for quantifying the strength, directionality and time lag of statistical interrelationships between event series. Event coincidence analysis allows one to formulate and test null hypotheses on the origin of the observed interrelationships, including tests based on Poisson processes or, more generally, stochastic point processes with a prescribed inter-event time distribution and other higher-order properties. Applying the framework to country-level observational data yields evidence that flood events have acted as triggers of epidemic outbreaks globally since the 1950s. Facing projected future changes in the statistics of climatic extreme events, statistical techniques such as event coincidence analysis will be relevant for investigating the impacts of anthropogenic climate change on human societies and ecosystems worldwide.
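A minimal sketch of event coincidence analysis as characterized above: count how often events in one series are followed within a time window by events in a second series, and compare the observed rate against shuffled surrogates; the event series and window length below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def coincidence_rate(a_times, b_times, delta_t=2.0):
    """Fraction of A-events followed by at least one B-event within delta_t."""
    return np.mean([np.any((b_times >= t) & (b_times <= t + delta_t))
                    for t in a_times])

a = np.sort(rng.uniform(0, 100, 20))                         # e.g. flood events
b = np.sort(np.concatenate([a[:10] + rng.uniform(0, 1, 10),  # triggered events
                            rng.uniform(0, 100, 10)]))       # background events

observed = coincidence_rate(a, b)
# null model: B-events as a homogeneous Poisson (uniform) process
surrogates = np.array([coincidence_rate(a, np.sort(rng.uniform(0, 100, len(b))))
                       for _ in range(1000)])
print(f"rate {observed:.2f}, surrogate p-value {np.mean(surrogates >= observed):.3f}")
```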
SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS
NASA Technical Reports Server (NTRS)
Brownlow, J. D.
1994-01-01
The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval are removed by least-squares detrending. As many as ten channels of data may be analyzed at one time. Both tabular and plotted output may be generated by the SPA program. This program is written in FORTRAN IV and has been implemented on a CDC 6000 series computer with a central memory requirement of approximately 142K (octal) of 60 bit words. This core requirement can be reduced by segmentation of the program. The SPA program was developed in 1978.
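A minimal modern analogue of the SPA workflow in Python: time-domain summary statistics with a rough normality check, then a Welch power-spectrum estimate with a Hamming window; the synthetic signal, window choice, and segment length are illustrative.

```python
import numpy as np
from scipy import signal, stats

rng = np.random.default_rng(5)
fs = 100.0                                       # sampling frequency, Hz
t = np.arange(0, 30, 1 / fs)
x = np.sin(2 * np.pi * 4.0 * t) + rng.normal(0, 0.5, t.size)  # signal + noise

# time-domain statistics
print("mean", x.mean(), "variance", x.var(ddof=1),
      "rms", np.sqrt(np.mean(x ** 2)), "min/max", x.min(), x.max())
# rough goodness-of-fit check against a fitted normal (parameters estimated)
print(stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))))

# frequency-domain statistics: Welch power spectrum, Hamming window
f, pxx = signal.welch(x, fs=fs, window="hamming", nperseg=1024)
print("peak frequency:", f[np.argmax(pxx)], "Hz")
```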
Biometric Analysis – A Reliable Indicator for Diagnosing Taurodontism using Panoramic Radiographs
Hegde, Veda; Anegundi, Rajesh Trayambhak; Pravinchandra, K.R.
2013-01-01
Background: Taurodontism is a clinical entity with a morpho-anatomical change in the shape of the tooth, which was thought to be absent in modern man. Taurodontism is mostly observed as an isolated trait or a component of a syndrome. Various techniques have been devised to diagnose taurodontism. Aim: The aim of this study was to analyze whether a biometric analysis was useful in diagnosing taurodontism in radiographs which appeared to be normal on cursory observation. Setting and Design: This study was carried out in our institution using radiographs which were taken for routine procedures. Material and Methods: In this retrospective study, panoramic radiographs were obtained from dental records of children aged between 9-14 years who did not have any abnormality on cursory observation. Biometric analyses were carried out on permanent mandibular first molar(s) using a novel biometric method. The values were tabulated and analysed. Statistics: The Fisher exact probability test, chi-square test, and chi-square test with Yates correction were used for statistical analysis of the data. Results: Cursory observation did not yield any case of taurodontism. In contrast, the biometric analysis yielded a statistically significant number of cases of taurodontism. However, there was no statistically significant difference in the number of cases with taurodontism between the genders or across the age group considered. Conclusion: Taurodontism was thus diagnosed on a biometric analysis that was otherwise missed on cursory observation. It is therefore necessary, from the clinical point of view, to diagnose even the mildest form of taurodontism by using metric analysis rather than just relying on a visual radiographic assessment, as its occurrence has many clinical implications and diagnostic importance.
STRengthening analytical thinking for observational studies: the STRATOS initiative.
Sauerbrei, Willi; Abrahamowicz, Michal; Altman, Douglas G; le Cessie, Saskia; Carpenter, James
2014-12-30
The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodological developments are ignored in practice. Consequently, design and analysis of observational studies often exhibit serious weaknesses. The lack of guidance on vital practical issues discourages many applied researchers from using more sophisticated and possibly more appropriate methods when analyzing observational studies. Furthermore, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software. Consequently, even 'standard' analyses reported in the medical literature are often flawed, casting doubt on their results and conclusions. An efficient way to help researchers to keep up with recent methodological developments is to develop guidance documents that are spread to the research community at large. These observations led to the initiation of the strengthening analytical thinking for observational studies (STRATOS) initiative, a large collaboration of experts in many different areas of biostatistical research. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies. The guidance is intended for applied statisticians and other data analysts with varying levels of statistical education, experience and interests. In this article, we introduce the STRATOS initiative and its main aims, present the need for guidance documents and outline the planned approach and progress so far. We encourage other biostatisticians to become involved.
NASA Astrophysics Data System (ADS)
Bonetto, P.; Qi, Jinyi; Leahy, R. M.
2000-08-01
Describes a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, the authors derive a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. The theoretical analysis models both the Poisson statistics of PET data and the inhomogeneity of tracer uptake. The authors show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow the authors to analyze observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm.
Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM
NASA Astrophysics Data System (ADS)
Köylü, Ü.; Geymen, A.
2016-10-01
Dams block the flow of rivers and create artificial water reservoirs, which affect the climate and the land use characteristics of the river basin. In this research, the effect of the huge water body impounded by Yamula Dam in the Kızılırmak Basin on the surrounding area's land use and climate is analysed. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW), and Soil Conservation Service-Curve Number (SCS-CN) methods are integrated for spatial and temporal analysis of the research area. For this research, humidity, temperature, wind speed and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analyzed. This statistical information is then combined with GIS data over the years. An application was developed for GIS analysis in the Python programming language and integrated with ArcGIS software. Statistical analyses were calculated in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed in climate and land use characteristics. In this study, we show the effect of large dams on the local climate around the semi-arid Yamula Dam.
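A minimal sketch of the two trend methods named above, the Mann-Kendall test and the Theil-Sen slope, applied to one synthetic station series in Python (the language the authors report using for their application); tie and seasonality corrections are omitted for brevity.

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Mann-Kendall trend test (no tie correction); returns z and p-value."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    return z, 2 * norm.sf(abs(z))

def theil_sen(x):
    """Median of all pairwise slopes, a robust trend estimate."""
    return np.median([(x[j] - x[i]) / (j - i)
                      for i in range(len(x) - 1) for j in range(i + 1, len(x))])

rng = np.random.default_rng(6)
temps = 12.0 + 0.04 * np.arange(40) + rng.normal(0, 0.5, 40)  # warming trend
print(mann_kendall(temps), theil_sen(temps))
```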
Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D
2012-02-01
Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for the overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, an overlap measure, and a statistical measure of agreement or reliability analysis is required to fully report the inter-rater variability in delineation.
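A minimal sketch of two of the overlap measures discussed above, computed on binary delineation masks: the pairwise Jaccard coefficient, and a generalized conformity index taken here (as one common definition) to be the ratio of summed pairwise intersections to summed pairwise unions over all observer pairs; the eight synthetic "observers" are random shifts of one reference contour.

```python
import numpy as np
from itertools import combinations

def jaccard(a, b):
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def generalized_ci(masks):
    pairs = list(combinations(masks, 2))
    inter = sum(np.logical_and(a, b).sum() for a, b in pairs)
    union = sum(np.logical_or(a, b).sum() for a, b in pairs)
    return inter / union

rng = np.random.default_rng(7)
base = np.zeros((64, 64), dtype=bool)
base[20:44, 20:44] = True                       # reference structure
masks = [np.roll(base, tuple(rng.integers(-3, 4, 2)), axis=(0, 1))
         for _ in range(8)]                     # eight simulated observers
print("Jaccard(obs1, obs2):", jaccard(masks[0], masks[1]))
print("generalized CI:     ", generalized_ci(masks))
```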
Analysis tools for discovering strong parity violation at hadron colliders
NASA Astrophysics Data System (ADS)
Backović, Mihailo; Ralston, John P.
2011-07-01
Several arguments suggest parity violation may be observable in high energy strong interactions. We introduce new analysis tools to describe the azimuthal dependence of multiparticle distributions, or “azimuthal flow.” Analysis uses the representations of the orthogonal group O(2) and dihedral groups DN necessary to define parity completely in two dimensions. Classification finds that collective angles used in event-by-event statistics represent inequivalent tensor observables that cannot generally be represented by a single “reaction plane.” Many new parity-violating observables exist that have never been measured, while many parity-conserving observables formerly lumped together are now distinguished. We use the concept of “event-shape sorting” to suggest separating right- and left-handed events, and we discuss the effects of transverse and longitudinal spin. The analysis tools are statistically robust, and can be applied equally to low or high multiplicity events at the Tevatron, RHIC or RHIC Spin, and the LHC.
An Adaptive Buddy Check for Observational Quality Control
NASA Technical Reports Server (NTRS)
Dee, Dick P.; Rukhovets, Leonid; Todling, Ricardo; DaSilva, Arlindo M.; Larson, Jay W.; Einaudi, Franco (Technical Monitor)
2000-01-01
An adaptive buddy check algorithm is presented that adjusts tolerances for outlier observations based on the variability of surrounding data. The algorithm derives from a statistical hypothesis test combined with maximum-likelihood covariance estimation. Its stability is shown to depend on the initial identification of outliers by a simple background check. The adaptive feature ensures that the final quality control decisions are not very sensitive to the prescribed statistics of first-guess and observation errors, nor to other approximations introduced into the algorithm. The implementation of the algorithm in a global atmospheric data assimilation system is described. Its performance is contrasted with that of a non-adaptive buddy check, for the surface analysis of an extreme storm that took place in Europe on 27 December 1999. The adaptive algorithm allowed the inclusion of many important observations that differed greatly from the first guess and that would have been excluded on the basis of prescribed statistics. The analysis of the storm development was much improved as a result of these additional observations.
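A minimal sketch of a buddy check in the spirit described above: each observation is compared with the mean of its neighbours, and the rejection tolerance adapts to the local variability of the surrounding data; the thresholds, neighbour radius, and data are illustrative, not the operational implementation.

```python
import numpy as np

def buddy_check(locs, values, radius=1.0, base_tol=3.0, floor=0.5):
    keep = np.ones(len(values), dtype=bool)
    for i in range(len(values)):
        near = (np.abs(locs - locs[i]) < radius) & (np.arange(len(values)) != i)
        if near.sum() < 2:
            continue                            # too few buddies to judge
        # adaptive feature: tolerance grows with local variability
        tol = base_tol * max(values[near].std(ddof=1), floor)
        if abs(values[i] - values[near].mean()) > tol:
            keep[i] = False
    return keep

rng = np.random.default_rng(8)
locs = np.sort(rng.uniform(0, 10, 50))
values = np.sin(locs) + rng.normal(0, 0.2, 50)
values[25] += 5.0                               # a gross outlier
print("rejected indices:", np.where(~buddy_check(locs, values))[0])
```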
CADDIS Volume 4. Data Analysis: Biological and Environmental Data Requirements
Overview of PECBO Module, using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, methods for inferring environmental conditions, statistical scripts in module.
Modified Distribution-Free Goodness-of-Fit Test Statistic.
Chun, So Yeon; Browne, Michael W; Shapiro, Alexander
2018-03-01
Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.
Evidence of Nanoflare Heating in Coronal Loops Observed with Hinode-XRT and SDO-AIA
NASA Technical Reports Server (NTRS)
Lopez-Fuentes, M. C.; Klimchuk, James
2013-01-01
We study a series of coronal loop lightcurves from X-ray and EUV observations. In search of signatures of nanoflare heating, we analyze the statistical properties of the observed lightcurves and compare them with synthetic cases obtained with a 2D cellular-automaton model based on nanoflare heating driven by photospheric motions. Our analysis shows that the observed and the model lightcurves have similar statistical properties. The asymmetries observed in the distribution of the intensity fluctuations indicate the possible presence of widespread cooling processes in sub-resolution magnetic strands.
Tipping point analysis of atmospheric oxygen concentration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Livina, V. N.; Forbes, A. B.; Vaz Martins, T. M.
2015-03-15
We apply tipping point analysis to nine observational oxygen concentration records around the globe, analyse their dynamics, and perform projections under possible future scenarios leading to oxygen deficiency in the atmosphere. The analysis is based on a statistical physics framework with stochastic modelling, in which we represent the observed data as a composition of deterministic and stochastic components estimated from the observed data using Bayesian and wavelet techniques.
Analysis models for the estimation of oceanic fields
NASA Technical Reports Server (NTRS)
Carter, E. F.; Robinson, A. R.
1987-01-01
A general model for statistically optimal estimates is presented for dealing with scalar, vector and multivariate datasets. The method deals with anisotropic fields and treats space and time dependence equivalently. Problems addressed include the analysis, or production of synoptic time series of regularly gridded fields from irregular and gappy datasets, and the estimation of fields by compositing observations from several different instruments and sampling schemes. Technical issues are discussed, including the convergence of statistical estimates, the choice of representation of the correlations, the influential domain of an observation, and the efficiency of numerical computations.
Bayesian statistics: estimating plant demographic parameters
James S. Clark; Michael Lavine
2001-01-01
There are times when external information should be brought to bear on an ecological analysis. Experiments are never conducted in a knowledge-free context. The inference we draw from an observation may depend on everything else we know about the process. Bayesian analysis is a method that brings outside evidence into the analysis of experimental and observational data...
Statistical Quality Control of Moisture Data in GEOS DAS
NASA Technical Reports Server (NTRS)
Dee, D. P.; Rukhovets, L.; Todling, R.
1999-01-01
A new statistical quality control algorithm was recently implemented in the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The final step in the algorithm consists of an adaptive buddy check that either accepts or rejects outlier observations based on a local statistical analysis of nearby data. A basic assumption in any such test is that the observed field is spatially coherent, in the sense that nearby data can be expected to confirm each other. However, the buddy check resulted in excessive rejection of moisture data, especially during the Northern Hemisphere summer. The analysis moisture variable in GEOS DAS is water vapor mixing ratio. Observational evidence shows that the distribution of mixing ratio errors is far from normal. Furthermore, spatial correlations among mixing ratio errors are highly anisotropic and difficult to identify. Both factors contribute to the poor performance of the statistical quality control algorithm. To alleviate the problem, we applied the buddy check to relative humidity data instead. This variable explicitly depends on temperature and therefore exhibits a much greater spatial coherence. As a result, reject rates of moisture data are much more reasonable and homogeneous in time and space.
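A minimal sketch of the conversion behind the switch described above, from water vapour mixing ratio to relative humidity given temperature and pressure, using a Magnus-type saturation vapour-pressure formula; the constants follow one common parameterization and are approximate.

```python
import numpy as np

def mixing_ratio_to_rh(w, temp_c, press_hpa):
    """w in kg/kg, temperature in deg C, pressure in hPa -> RH in percent."""
    e_sat = 6.112 * np.exp(17.62 * temp_c / (243.12 + temp_c))  # hPa, Magnus
    e = w * press_hpa / (0.622 + w)             # vapour pressure from w
    return 100.0 * e / e_sat

print(mixing_ratio_to_rh(0.010, 20.0, 1000.0))  # roughly 68% at 20 C, 1000 hPa
```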
NASA Astrophysics Data System (ADS)
Jaranowski, Piotr; Królak, Andrzej
2000-03-01
We develop the analytic and numerical tools for data analysis of the continuous gravitational-wave signals from spinning neutron stars for ground-based laser interferometric detectors. The statistical data analysis method that we investigate is maximum likelihood detection, which for the case of Gaussian noise reduces to matched filtering. We study in detail the statistical properties of the optimum functional that needs to be calculated in order to detect the gravitational-wave signal and estimate its parameters. We find it particularly useful to divide the parameter space into elementary cells such that the values of the optimal functional are statistically independent in different cells. We derive formulas for false alarm and detection probabilities both for the optimal and the suboptimal filters. We assess the computational requirements needed to do the signal search. We compare a number of criteria to build sufficiently accurate templates for our data analysis scheme. We verify the validity of our concepts and formulas by means of Monte Carlo simulations. We present algorithms by which one can estimate the parameters of the continuous signals accurately. We find, confirming earlier work of other authors, that given 100 Gflops of computational power, an all-sky search for an observation time of 7 days and a directed search for an observation time of 120 days are possible, whereas an all-sky search for 120 days of observation time is computationally prohibitive.
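A minimal sketch of the matched-filtering core of the detection scheme described above, for a known unit-norm template in white Gaussian noise; the waveform, amplitude, and detection threshold are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
n, sigma = 4096, 1.0
template = np.sin(2 * np.pi * 30 * np.linspace(0, 1, n)) * np.hanning(n)
template /= np.linalg.norm(template)            # unit-norm template

data = rng.normal(0, sigma, n)                  # detector noise
data += 5.0 * sigma * template                  # add a weak signal

snr = np.dot(data, template) / sigma            # matched filter output
# under noise alone snr ~ N(0, 1), so a threshold at 5 has a false alarm
# probability of about 3e-7 per template
print(f"matched filter SNR: {snr:.1f}")
```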
CADDIS Volume 4. Data Analysis: Basic Analyses
Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.
NASA Astrophysics Data System (ADS)
Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The
2015-11-01
While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.
Statistical analysis of arthroplasty data
2011-01-01
It is envisaged that guidelines for statistical analysis and presentation of results will improve the quality and value of research. The Nordic Arthroplasty Register Association (NARA) has therefore developed guidelines for the statistical analysis of arthroplasty register data. The guidelines are divided into two parts, one with an introduction and a discussion of the background to the guidelines (Ranstam et al. 2011a, see pages x-y in this issue), and this one with a more technical statistical discussion on how specific problems can be handled. This second part contains (1) recommendations for the interpretation of methods used to calculate survival, (2) recommendations on how to deal with bilateral observations, and (3) a discussion of problems and pitfalls associated with analysis of factors that influence survival or comparisons between outcomes extracted from different hospitals.
A perceptual space of local image statistics.
Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M
2015-12-01
Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics has been well studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14 min, but sensitivity to second- and higher-order statistics was substantially lower at 1.4 min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules.
NASA Astrophysics Data System (ADS)
Karl, Thomas R.; Wang, Wei-Chyung; Schlesinger, Michael E.; Knight, Richard W.; Portman, David
1990-10-01
Important surface observations such as the daily maximum and minimum temperature, daily precipitation, and cloud ceilings often have localized characteristics that are difficult to reproduce with the current resolution and the physical parameterizations in state-of-the-art General Circulation climate Models (GCMs). Many of the difficulties can be partially attributed to mismatches in scale, local topography, regional geography and boundary conditions between models and surface-based observations. Here, we present a method, called climatological projection by model statistics (CPMS), to relate GCM grid-point free-atmosphere statistics, the predictors, to these important local surface observations. The method can be viewed as a generalization of the model output statistics (MOS) and perfect prog (PP) procedures used in numerical weather prediction (NWP) models. It consists of the application of three statistical methods: 1) principal component analysis (PCA), 2) canonical correlation, and 3) inflated regression analysis. The PCA reduces the redundancy of the predictors. The canonical correlation is used to develop simultaneous relationships between linear combinations of the predictors, the canonical variables, and the surface-based observations. Finally, inflated regression is used to relate the important canonical variables to each of the surface-based observed variables. We demonstrate that even an early version of the Oregon State University two-level atmospheric GCM (with prescribed sea surface temperature) produces free-atmosphere statistics that can, when standardized using the model's internal means and variances (the MOS-like version of CPMS), closely approximate the observed local climate. When the model data are standardized by the observed free-atmosphere means and variances (the PP version of CPMS), however, the model does not reproduce the observed surface climate as well. Our results indicate that in the MOS-like version of CPMS the differences between the output of a ten-year GCM control run and the surface-based observations are often smaller than the differences between the observations of two ten-year periods. Such positive results suggest that GCMs may already contain important climatological information that can be used to infer the local climate.
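A minimal sketch of the three-step CPMS pipeline using scikit-learn, with ordinary least squares standing in for step 3 (inflated regression additionally rescales the predictions to restore their variance, which is omitted here); all data are synthetic stand-ins for GCM grid-point statistics and surface observations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(10)
n = 300
predictors = rng.normal(size=(n, 20))           # free-atmosphere statistics
surface = (predictors[:, :3] @ rng.normal(size=(3, 4))
           + 0.5 * rng.normal(size=(n, 4)))     # surface-based observations

z = PCA(n_components=8).fit_transform(predictors)   # 1) reduce redundancy
cca = CCA(n_components=3).fit(z, surface)           # 2) canonical variables
u, _ = cca.transform(z, surface)
model = LinearRegression().fit(u, surface)          # 3) regression step
print("R^2 of the reconstructed surface variables:", model.score(u, surface))
```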
NASA Astrophysics Data System (ADS)
Slaski, G.; Ohde, B.
2016-09-01
The article presents the results of a statistical dispersion analysis of the energy and power demand for tractive purposes of a battery electric vehicle. The authors compare data distributions for different values of average speed in two approaches, namely a short and a long period of observation. The short period of observation (generally around several hundred meters) follows from a previously proposed macroscopic energy consumption model based on an average speed per road section. This approach yielded high values of standard deviation and of the coefficient of variation (the ratio between standard deviation and the mean), around 0.7-1.2. The long period of observation (about several kilometers long) is similar in length to the standardized speed cycles used in testing a vehicle's energy consumption and available range. The data were analysed to determine the impact of observation length on the variation in energy and power demand.
Statistical analysis of dynamic fibrils observed from NST/BBSO observations
NASA Astrophysics Data System (ADS)
Gopalan Priya, Thambaje; Su, Jiang-Tao; Chen, Jie; Deng, Yuan-Yong; Prasad Choudhury, Debi
2018-02-01
We present the results obtained from the analysis of dynamic fibrils in NOAA active region (AR) 12132, using high resolution Hα observations from the New Solar Telescope operating at Big Bear Solar Observatory. The dynamic fibrils are seen to be moving up and down, and most of these dynamic fibrils are periodic and have a jet-like appearance. We found from our observations that the fibrils follow almost perfect parabolic paths in many cases. A statistical analysis of the properties of the parabolic paths, covering the deceleration, maximum velocity, duration and kinetic energy of these fibrils, is presented here. We found the average maximum velocity to be around 15 km s⁻¹ and the mean deceleration to be around 100 m s⁻². The observed deceleration appears to be a fraction of solar gravity and is not compatible with a ballistic path under solar gravity. We found a positive correlation between deceleration and maximum velocity. This correlation is consistent with earlier simulations of magnetoacoustic shock waves propagating upward.
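A minimal sketch of the per-fibril kinematics described above: fit a parabola to an apex-height time series with numpy.polyfit and read off the deceleration and the maximum (initial) velocity; the synthetic track is generated near the reported average values.

```python
import numpy as np

rng = np.random.default_rng(11)
t = np.arange(0.0, 180.0, 10.0)                  # time, s (10 s cadence)
# synthetic apex-height track (metres) near the reported averages:
# initial velocity ~13.5 km/s and deceleration ~100 m/s^2, plus noise
height = 1.5e6 + 1.35e4 * t - 50.0 * t**2 + rng.normal(0, 2e4, t.size)

a2, a1, a0 = np.polyfit(t, height, 2)            # h(t) = a2 t^2 + a1 t + a0
print(f"deceleration: {-2 * a2:.0f} m/s^2")
print(f"max velocity: {abs(a1) / 1e3:.1f} km/s")
```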
NASA Astrophysics Data System (ADS)
De Angelis, Francesco; Cimini, Domenico; Löhnert, Ulrich; Caumont, Olivier; Haefele, Alexander; Pospichal, Bernhard; Martinet, Pauline; Navas-Guzmán, Francisco; Klein-Baltink, Henk; Dupont, Jean-Charles; Hocking, James
2017-10-01
Ground-based microwave radiometers (MWRs) offer the capability to provide continuous, high-temporal-resolution observations of the atmospheric thermodynamic state in the planetary boundary layer (PBL) with low maintenance. This makes the MWR an ideal instrument to supplement radiosonde and satellite observations when initializing numerical weather prediction (NWP) models through data assimilation. State-of-the-art data assimilation systems (e.g. variational schemes) require an accurate representation of the differences between model (background) and observations, which are then weighted by their respective errors to provide the best analysis of the true atmospheric state. In this perspective, one source of information is contained in the statistics of the differences between observations and their background counterparts (O-B). Monitoring of O-B statistics is crucial to detect and remove systematic errors coming from the measurements, the observation operator, and/or the NWP model. This work illustrates a 1-year O-B analysis for MWR observations in clear-sky conditions for a European-wide network of six MWRs. Observations include MWR brightness temperatures (TB) measured by the two most common types of MWR instruments. Background profiles are extracted from the French convective-scale model AROME-France before being converted into TB. The observation operator used to map atmospheric profiles into TB is the fast radiative transfer model RTTOV-gb. It is shown that O-B monitoring can effectively detect instrument malfunctions. O-B statistics (bias, standard deviation, and root mean square) for water vapour channels (22.24-30.0 GHz) are quite consistent across the instrumental sites, decreasing from the 22.24 GHz line centre (~2-2.5 K) towards the high-frequency wing (~0.8-1.3 K). Statistics for zenith and lower-elevation observations show a similar trend, though values increase with increasing air mass. O-B statistics for temperature channels show different behaviour for relatively transparent (51-53 GHz) and opaque (54-58 GHz) channels. Opaque channels show lower uncertainties (<0.8-0.9 K) and little variation with elevation angle. Transparent channels show larger biases (~2-3 K) with relatively low standard deviations (~1-1.5 K). The observation-minus-analysis TB statistics are similar to the O-B statistics, suggesting a possible improvement to be expected from assimilating MWR TB into NWP models. Lastly, the O-B TB differences have been evaluated to verify the normal-distribution hypothesis underlying variational and ensemble Kalman filter-based data assimilation systems. Absolute values of excess kurtosis and skewness are generally within 1 and 0.5, respectively, for all instrumental sites, demonstrating O-B normality for most of the channels and elevation angles.
Two-Bin Kanban: Ordering Impact at Navy Medical Center San Diego
2016-06-17
pretest (2013 data set) and posttest (2015 data set) analysis to avoid having the findings influenced by price changes. DMLSS does not track shipping...statistics based on those observations (Kabacoff, 2011, p. 112). Replacing the groups of observations with summary statistics allows the analyst...
Fisher statistics for analysis of diffusion tensor directional information.
Hutchinson, Elizabeth B; Rutecki, Paul A; Alexander, Andrew L; Sutula, Thomas P
2012-04-30
A statistical approach is presented for the quantitative analysis of diffusion tensor imaging (DTI) directional information using Fisher statistics, which were originally developed for the analysis of vectors in the field of paleomagnetism. In this framework, descriptive and inferential statistics have been formulated based on the Fisher probability density function, a spherical analogue of the normal distribution. The Fisher approach was evaluated for investigation of rat brain DTI maps to characterize tissue orientation in the corpus callosum, fornix, and hilus of the dorsal hippocampal dentate gyrus, and to compare directional properties in these regions following status epilepticus (SE) or traumatic brain injury (TBI) with values in healthy brains. Direction vectors were determined for each region of interest (ROI) for each brain sample, and Fisher statistics were applied to calculate the mean direction vector and variance parameters in the corpus callosum, fornix, and dentate gyrus of normal rats and rats that experienced TBI or SE. Hypothesis testing was performed by calculation of Watson's F-statistic and the associated p-value, giving the likelihood that grouped observations were from the same directional distribution. In the fornix and midline corpus callosum, no directional differences were detected between groups; however, in the hilus, significant (p<0.0005) differences were found that robustly confirmed observations suggested by visual inspection of directionally encoded color DTI maps. The Fisher approach is a potentially useful analysis tool that may extend the current capabilities of DTI investigation by providing a means of statistical comparison of tissue structural orientation. Copyright © 2012 Elsevier B.V. All rights reserved.
Humans make efficient use of natural image statistics when performing spatial interpolation.
D'Antona, Anthony D; Perry, Jeffrey S; Geisler, Wilson S
2013-12-16
Visual systems learn through evolution and experience over the lifespan to exploit the statistical structure of natural images when performing visual tasks. Understanding which aspects of this statistical structure are incorporated into the human nervous system is a fundamental goal in vision science. To address this goal, we measured human ability to estimate the intensity of missing image pixels in natural images. Human estimation accuracy is compared with various simple heuristics (e.g., local mean) and with optimal observers that have nearly complete knowledge of the local statistical structure of natural images. Human estimates are more accurate than those of simple heuristics, and they match the performance of an optimal observer that knows the local statistical structure of relative intensities (contrasts). This optimal observer predicts the detailed pattern of human estimation errors, and hence the results place strong constraints on the underlying neural mechanisms. However, humans do not reach the performance of an optimal observer that knows the local statistical structure of the absolute intensities, which reflect both local relative intensities and local mean intensity. As predicted from a statistical analysis of natural images, human estimation accuracy is negligibly improved by expanding the context from a local patch to the whole image. Our results demonstrate that the human visual system efficiently exploits the statistical structure of natural images.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data
ZHU, HONGTU; IBRAHIM, JOSEPH G.; TANG, NIANSHENG
2013-01-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures. PMID:24753718
Biometric Analysis - A Reliable Indicator for Diagnosing Taurodontism using Panoramic Radiographs.
Hegde, Veda; Anegundi, Rajesh Trayambhak; Pravinchandra, K R
2013-08-01
Taurodontism is a clinical entity with a morpho-anatomical change in the shape of the tooth, which was thought to be absent in modern man. Taurodontism is mostly observed as an isolated trait or as a component of a syndrome. Various techniques have been devised to diagnose taurodontism. The aim of this study was to analyze whether a biometric analysis was useful in diagnosing taurodontism in radiographs which appeared to be normal on cursory observation. This study was carried out in our institution by using radiographs which were taken for routine procedures. In this retrospective study, panoramic radiographs were obtained from the dental records of children aged between 9-14 years who did not have any abnormality on cursory observation. Biometric analyses were carried out on permanent mandibular first molar(s) by using a novel biometric method. The values were tabulated and analysed. The Fisher exact probability test, the chi-square test, and the chi-square test with Yates correction were used for statistical analysis of the data. Cursory observation did not yield any case of taurodontism. In contrast, the biometric analysis yielded a statistically significant number of cases of taurodontism. However, there was no statistically significant difference in the number of cases with taurodontism between the genders or across the age group considered. Thus, taurodontism was diagnosed on a biometric analysis which was otherwise missed on cursory observation. It is therefore necessary, from the clinical point of view, to diagnose even the mildest form of taurodontism by using metric analysis rather than just relying on a visual radiographic assessment, as its occurrence has many clinical implications and a diagnostic importance.
Toward improved analysis of concentration data: Embracing nondetects.
Shoari, Niloofar; Dubé, Jean-Sébastien
2018-03-01
Various statistical tests on concentration data serve to support decision-making regarding characterization and monitoring of contaminated media, assessing exposure to a chemical, and quantifying the associated risks. However, the routine statistical protocols cannot be directly applied because of challenges arising from nondetects or left-censored observations, which are concentration measurements below the detection limit of measuring instruments. Despite the existence of techniques based on survival analysis that can adjust for nondetects, these are seldom taken into account properly. A comprehensive review of the literature showed that managing policies regarding analysis of censored data do not always agree and that guidance from regulatory agencies may be outdated. Therefore, researchers and practitioners commonly resort to the most convenient way of tackling the censored data problem by substituting nondetects with arbitrary constants prior to data analysis, although this is generally regarded as a bias-prone approach. Hoping to improve the interpretation of concentration data, the present article aims to familiarize researchers in different disciplines with the significance of left-censored observations and provides theoretical and computational recommendations (under both frequentist and Bayesian frameworks) for adequate analysis of censored data. In particular, the present article synthesizes key findings from previous research with respect to 3 noteworthy aspects of inferential statistics: estimation of descriptive statistics, hypothesis testing, and regression analysis. Environ Toxicol Chem 2018;37:643-656. © 2017 SETAC.
Dark matter constraints from a joint analysis of dwarf Spheroidal galaxy observations with VERITAS
Archambault, S.; Archer, A.; Benbow, W.; ...
2017-04-05
We present constraints on the annihilation cross section of weakly interacting massive particle (WIMP) dark matter based on the joint statistical analysis of four dwarf galaxies with VERITAS. These results are derived from an optimized photon weighting statistical technique that improves on standard imaging atmospheric Cherenkov telescope (IACT) analyses by utilizing the spectral and spatial properties of individual photon events.
Mager, P P; Rothe, H
1990-10-01
Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of the regression coefficients of the ordinary least-squares (OLS) model usually applied to QSARs. Besides the diagnosis of simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. Only if the absolute values of the PCRA estimators are order statistics that decrease monotonically can the effects of multicollinearity be circumvented. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive power of a QSAR model.
Detector noise statistics in the non-linear regime
NASA Technical Reports Server (NTRS)
Shopbell, P. L.; Bland-Hawthorn, J.
1992-01-01
The statistical behavior of an idealized linear detector in the presence of threshold and saturation levels is examined. It is assumed that the noise is governed by the statistical fluctuations in the number of photons emitted by the source during an exposure. Since physical detectors cannot have infinite dynamic range, our model illustrates that all devices have non-linear regimes, particularly at high count rates. The primary effect is a decrease in the statistical variance about the mean signal due to a portion of the expected noise distribution being removed via clipping. Higher order statistical moments are also examined, in particular, skewness and kurtosis. In principle, the expected distortion in the detector noise characteristics can be calibrated using flatfield observations with count rates matched to the observations. For this purpose, some basic statistical methods that utilize Fourier analysis techniques are described.
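The clipping effect described above is easy to reproduce numerically. A minimal sketch, assuming Poisson photon noise and an arbitrary saturation level; both values are illustrative, not taken from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

mean_counts = 100.0                 # source rate per exposure (illustrative)
saturation = 110                    # detector full-well / saturation level

raw = rng.poisson(mean_counts, size=1_000_000)
clipped = np.minimum(raw, saturation)   # saturation removes the upper tail

for name, x in (("raw", raw), ("clipped", clipped)):
    print(name, "var=%.1f skew=%.3f kurt=%.3f"
          % (x.var(), stats.skew(x), stats.kurtosis(x)))
# The clipped signal shows reduced variance and negative skewness, i.e. the
# higher-order moments are distorted as the count rate approaches saturation.
```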
NASA DOE POD NDE Capabilities Data Book
NASA Technical Reports Server (NTRS)
Generazio, Edward R.
2015-01-01
This data book contains the Directed Design of Experiments for Validating Probability of Detection (POD) Capability of NDE Systems (DOEPOD) analyses of the nondestructive inspection data presented in the NTIAC, Nondestructive Evaluation (NDE) Capabilities Data Book, 3rd ed., NTIAC DB-97-02. DOEPOD is designed as a decision support system to validate inspection system, personnel, and protocol demonstrating 0.90 POD with 95% confidence at critical flaw sizes, a90/95. The test methodology used in DOEPOD is based on the field of statistical sequential analysis founded by Abraham Wald. Sequential analysis is a method of statistical inference whose characteristic feature is that the number of observations required by the procedure is not determined in advance of the experiment. The decision to terminate the experiment depends, at each stage, on the results of the observations previously made. A merit of the sequential method, as applied to testing statistical hypotheses, is that test procedures can be constructed which require, on average, a substantially smaller number of observations than equally reliable test procedures based on a predetermined number of observations.
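For readers unfamiliar with Wald's sequential method, its core is a running log-likelihood ratio compared against two stopping bounds, so the number of observations is not fixed in advance. A minimal sketch for binary hit/miss inspection outcomes; the POD hypotheses and error rates are illustrative, and DOEPOD's actual decision rules are more elaborate.

```python
import math

def sprt(hits, p0=0.70, p1=0.90, alpha=0.05, beta=0.05):
    """Wald SPRT for H0: POD = p0 vs H1: POD = p1 on a stream of 0/1 outcomes."""
    upper = math.log((1 - beta) / alpha)      # accept H1 above this bound
    lower = math.log(beta / (1 - alpha))      # accept H0 below this bound
    llr = 0.0
    for n, hit in enumerate(hits, start=1):
        p_alt = p1 if hit else 1 - p1
        p_null = p0 if hit else 1 - p0
        llr += math.log(p_alt / p_null)       # update the log-likelihood ratio
        if llr >= upper:
            return "accept H1 (POD >= p1)", n
        if llr <= lower:
            return "accept H0 (POD <= p0)", n
    return "continue testing", len(hits)

# Twelve consecutive hits are already enough to stop for these settings.
print(sprt([1] * 12))
```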
The large sample size fallacy.
Lantz, Björn
2013-06-01
Significance in the statistical sense has little to do with significance in the common practical sense. Statistical significance is a necessary but not a sufficient condition for practical significance. Hence, results that are extremely statistically significant may be highly nonsignificant in practice. The degree of practical significance is generally determined by the size of the observed effect, not the p-value. The results of studies based on large samples are often characterized by extreme statistical significance despite small or even trivial effect sizes. Interpreting such results as significant in practice without further analysis is referred to as the large sample size fallacy in this article. The aim of this article is to explore the relevance of the large sample size fallacy in contemporary nursing research. Relatively few nursing articles display explicit measures of observed effect sizes or include a qualitative discussion of observed effect sizes. Statistical significance is often treated as an end in itself. Effect sizes should generally be calculated and presented along with p-values for statistically significant results, and observed effect sizes should be discussed qualitatively through direct and explicit comparisons with the effects in related literature. © 2012 Nordic College of Caring Science.
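The fallacy is easy to demonstrate: with a large enough sample, a practically trivial effect yields an extreme p-value. A minimal sketch with simulated data; all numbers are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 500_000
a = rng.normal(0.00, 1.0, n)        # control group
b = rng.normal(0.01, 1.0, n)        # "treated" group, trivial true effect

t, p = stats.ttest_ind(a, b)
d = (b.mean() - a.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)

print(f"p = {p:.2e}")               # extremely "significant"
print(f"Cohen's d = {d:.3f}")       # yet the effect size is trivial in practice
```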
Periodontal disease and carotid atherosclerosis: A meta-analysis of 17,330 participants.
Zeng, Xian-Tao; Leng, Wei-Dong; Lam, Yat-Yin; Yan, Bryan P; Wei, Xue-Mei; Weng, Hong; Kwong, Joey S W
2016-01-15
The association between periodontal disease and carotid atherosclerosis has been evaluated primarily in single-center studies, and whether periodontal disease is an independent risk factor of carotid atherosclerosis remains uncertain. This meta-analysis aimed to evaluate the association between periodontal disease and carotid atherosclerosis. We searched PubMed and Embase for relevant observational studies up to February 20, 2015. Two authors independently extracted data from included studies, and odds ratios (ORs) with 95% confidence intervals (CIs) were calculated for overall and subgroup meta-analyses. Statistical heterogeneity was assessed by the chi-squared test (P<0.1 for statistical significance) and quantified by the I² statistic. Data analysis was conducted using the Comprehensive Meta-Analysis (CMA) software. Fifteen observational studies involving 17,330 participants were included in the meta-analysis. The overall pooled result showed that periodontal disease was associated with carotid atherosclerosis (OR: 1.27, 95% CI: 1.14-1.41; P<0.001), but statistical heterogeneity was substantial (I² = 78.90%). Subgroup analysis adjusted for smoking and diabetes mellitus showed borderline significance (OR: 1.08; 95% CI: 1.00-1.18; P=0.05). Sensitivity and cumulative analyses both indicated that our results were robust. Findings of our meta-analysis indicated that the presence of periodontal disease was associated with carotid atherosclerosis; however, further large-scale, well-conducted clinical studies are needed to explore the precise risk of developing carotid atherosclerosis in patients with periodontal disease. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
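For context, the I² statistic reported above is derived from Cochran's Q. A minimal sketch of a fixed-effect pooled OR and I² computation, using made-up study-level data, not the studies in this meta-analysis.

```python
import numpy as np

# Hypothetical per-study odds ratios and standard errors of the log OR.
log_or = np.log(np.array([1.10, 1.35, 1.52, 0.95, 1.28]))
se = np.array([0.10, 0.15, 0.20, 0.12, 0.18])

w = 1.0 / se**2                          # inverse-variance weights
pooled = np.sum(w * log_or) / np.sum(w)  # fixed-effect pooled log OR

q = np.sum(w * (log_or - pooled) ** 2)   # Cochran's Q
df = len(log_or) - 1
i2 = max(0.0, (q - df) / q) * 100.0      # I^2: share of non-chance variability

print(f"pooled OR = {np.exp(pooled):.2f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```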
The Analysis of Completely Randomized Factorial Experiments When Observations Are Lost at Random.
ERIC Educational Resources Information Center
Hummel, Thomas J.
An investigation was conducted of the characteristics of two estimation procedures and corresponding test statistics used in the analysis of completely randomized factorial experiments when observations are lost at random. For one estimator, contrast coefficients for cell means did not involve the cell frequencies. For the other, contrast…
Explorations in statistics: the log transformation.
Curran-Everett, Douglas
2018-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability-the standard deviation-varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.
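A minimal sketch of the core idea, using simulated lognormal data whose standard deviation grows with the mean; the group parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

# Two lognormal groups: on the raw scale the SD is roughly proportional
# to the mean, violating the homogeneity-of-variance assumption.
low = rng.lognormal(mean=1.0, sigma=0.5, size=200)
high = rng.lognormal(mean=3.0, sigma=0.5, size=200)

for name, g in (("low", low), ("high", high)):
    print(f"{name}: raw SD = {g.std(ddof=1):7.2f}, "
          f"log SD = {np.log(g).std(ddof=1):.2f}")
# After the log transformation the group SDs are nearly equal, and the
# skewed raw observations become approximately normal on the log scale.
```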
Statistical analysis of tiny SXR flares observed by SphinX
NASA Astrophysics Data System (ADS)
Gryciuk, Magdalena; Siarkowski, Marek; Sylwester, Janusz; Kepa, Anna; Gburek, Szymon; Mrozek, Tomasz; Podgórski, Piotr
2015-08-01
The Solar Photometer in X-rays (SphinX) was designed to observe soft X-ray solar emission in the energy range between ~1 keV and 15 keV with a resolution better than 0.5 keV. The instrument operated from February until November 2009 aboard the CORONAS-Photon satellite, during a phase of exceptionally deep minimum of solar activity. Here we use SphinX data for the analysis of micro-flares and brightenings. Despite the very low activity, more than a thousand small X-ray events have been recognized by semi-automatic inspection of SphinX light curves. A catalogue of the temporal and physical characteristics of these events is presented and discussed, and results of the statistical analysis of the catalogue data are given.
A statistical study of EMIC waves observed by Cluster: 1. Wave properties
NASA Astrophysics Data System (ADS)
Allen, R. C.; Zhang, J.-C.; Kistler, L. M.; Spence, H. E.; Lin, R.-L.; Klecker, B.; Dunlop, M. W.; André, M.; Jordanova, V. K.
2015-07-01
Electromagnetic ion cyclotron (EMIC) waves are an important mechanism for particle energization and losses inside the magnetosphere. In order to better understand the effects of these waves on particle dynamics, detailed information about the occurrence rate, wave power, ellipticity, normal angle, energy propagation angle distributions, and local plasma parameters are required. Previous statistical studies have used in situ observations to investigate the distribution of these parameters in the magnetic local time versus L-shell (MLT-L) frame within a limited magnetic latitude (MLAT) range. In this study, we present a statistical analysis of EMIC wave properties using 10 years (2001-2010) of data from Cluster, totaling 25,431 min of wave activity. Due to the polar orbit of Cluster, we are able to investigate EMIC waves at all MLATs and MLTs. This allows us to further investigate the MLAT dependence of various wave properties inside different MLT sectors and further explore the effects of Shabansky orbits on EMIC wave generation and propagation. The statistical analysis is presented in two papers. This paper focuses on the wave occurrence distribution as well as the distribution of wave properties. The companion paper focuses on local plasma parameters during wave observations as well as wave generation proxies.
NASA Technical Reports Server (NTRS)
Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)
2001-01-01
Independent Component Analysis (ICA) is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. an exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. This rotation uses no localization criterion like other rotation techniques (RT); only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to overcome the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.
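To make the "ICA as a rotation of PCA" point concrete, here is a minimal sketch with scikit-learn on synthetic linearly mixed signals. The data and component counts are illustrative, not the authors' geophysical pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(3)
t = np.linspace(0, 8, 2000)

# Two statistically independent sources, linearly mixed into 3 "channels".
sources = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]
mixing = np.array([[1.0, 0.5], [0.5, 2.0], [1.5, 1.0]])
x = sources @ mixing.T + 0.05 * rng.normal(size=(t.size, 3))

pca = PCA(n_components=2).fit(x)        # decorrelation: second-order only
ica = FastICA(n_components=2, whiten="unit-variance",
              random_state=0).fit(x)    # independence: higher-order statistics

# PCA components mix the two sources; ICA recovers them up to sign and
# scale, i.e. it rotates the whitened PCA solution.
print(np.round(pca.components_, 2))
print(np.round(ica.mixing_, 2))
```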
Statistical and Economic Techniques for Site-specific Nematode Management.
Liu, Zheng; Griffin, Terry; Kirkpatrick, Terrence L
2014-03-01
Recent advances in precision agriculture technologies and spatial statistics allow realistic, site-specific estimation of nematode damage to field crops and provide a platform for the site-specific delivery of nematicides within individual fields. This paper reviews the spatial statistical techniques that model correlations among neighboring observations and develop a spatial economic analysis to determine the potential of site-specific nematicide application. The spatial econometric methodology applied in the context of site-specific crop yield response contributes to closing the gap between data analysis and realistic site-specific nematicide recommendations and helps to provide a practical method of site-specifically controlling nematodes.
ERIC Educational Resources Information Center
Henry, Kimberly L.; Muthen, Bengt
2010-01-01
Latent class analysis (LCA) is a statistical method used to identify subtypes of related cases using a set of categorical or continuous observed variables. Traditional LCA assumes that observations are independent. However, multilevel data structures are common in social and behavioral research and alternative strategies are needed. In this…
ERIC Educational Resources Information Center
Torrens, Paul M.; Griffin, William A.
2013-01-01
The authors describe an observational and analytic methodology for recording and interpreting dynamic microprocesses that occur during social interaction, making use of space--time data collection techniques, spatial-statistical analysis, and visualization. The scheme has three investigative foci: Structure, Activity Composition, and Clustering.…
NASA Astrophysics Data System (ADS)
Vigan, A.; Chauvin, G.; Bonavita, M.; Desidera, S.; Bonnefoy, M.; Mesa, D.; Beuzit, J.-L.; Augereau, J.-C.; Biller, B.; Boccaletti, A.; Brugaletta, E.; Buenzli, E.; Carson, J.; Covino, E.; Delorme, P.; Eggenberger, A.; Feldt, M.; Hagelberg, J.; Henning, T.; Lagrange, A.-M.; Lanzafame, A.; Ménard, F.; Messina, S.; Meyer, M.; Montagnier, G.; Mordasini, C.; Mouillet, D.; Moutou, C.; Mugnier, L.; Quanz, S. P.; Reggiani, M.; Ségransan, D.; Thalmann, C.; Waters, R.; Zurlo, A.
2014-01-01
Over the past decade, a growing number of deep imaging surveys have started to provide meaningful constraints on the population of extrasolar giant planets at large orbital separation. Primary targets for these surveys have been carefully selected based on their age, distance and spectral type, and often on their membership in young nearby associations where all stars share common kinematic, photometric and spectroscopic properties. The next step is a wider statistical analysis of the frequency and properties of low mass companions as a function of stellar mass and orbital separation. In late 2009, we initiated a coordinated European Large Program using angular differential imaging in the H band (1.66 μm) with NaCo at the VLT. Our aim is to provide a comprehensive and statistically significant study of the occurrence of extrasolar giant planets and brown dwarfs at large (5-500 AU) orbital separation around ~150 young, nearby stars, a large fraction of which have never been observed at very deep contrast. The survey has now been completed and we present the data analysis and detection limits for the observed sample, for which we reach the planetary-mass domain at separations of >~50 AU on average. We also present the results of the statistical analysis that has been performed over the 75 targets newly observed at high contrast. We discuss the details of the statistical analysis and the physical constraints that our survey provides for the frequency and formation scenario of planetary mass companions at large separation.
Analyzing Dyadic Sequence Data—Research Questions and Implied Statistical Models
Fuchs, Peter; Nussbeck, Fridtjof W.; Meuwly, Nathalie; Bodenmann, Guy
2017-01-01
The analysis of observational data is often seen as a key approach to understanding dynamics in romantic relationships but also in dyadic systems in general. Statistical models for the analysis of dyadic observational data are not commonly known or applied. In this contribution, selected approaches to dyadic sequence data will be presented with a focus on models that can be applied when sample sizes are of medium size (N = 100 couples or less). Each of the statistical models is motivated by an underlying potential research question, the most important model results are presented and linked to the research question. The following research questions and models are compared with respect to their applicability using a hands on approach: (I) Is there an association between a particular behavior by one and the reaction by the other partner? (Pearson Correlation); (II) Does the behavior of one member trigger an immediate reaction by the other? (aggregated logit models; multi-level approach; basic Markov model); (III) Is there an underlying dyadic process, which might account for the observed behavior? (hidden Markov model); and (IV) Are there latent groups of dyads, which might account for observing different reaction patterns? (mixture Markov; optimal matching). Finally, recommendations for researchers to choose among the different models, issues of data handling, and advises to apply the statistical models in empirical research properly are given (e.g., in a new r-package “DySeq”). PMID:28443037
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kollias, Pavlos
This is a multi-institutional, collaborative project using a three-tier modeling approach to bridge field observations and global cloud-permitting models, with emphases on cloud population structural evolution through various large-scale environments. Our contribution was in data analysis for the generation of high-value cloud and precipitation products and the derivation of cloud statistics for model validation. There are two areas of data analysis to which we contributed: the development of a synergistic cloud and precipitation classification that identifies different cloud types (e.g. shallow cumulus, cirrus) and precipitation types (shallow, deep, convective, stratiform) using profiling ARM observations, and the development of a quantitative precipitation rate retrieval algorithm using profiling ARM observations. Similar efforts have been developed in the past for precipitation (weather radars), but not for the millimeter-wavelength (cloud) radar deployed at the ARM sites.
Flares, ejections, proton events
NASA Astrophysics Data System (ADS)
Belov, A. V.
2017-11-01
Statistical analysis is performed for the relationship of coronal mass ejections (CMEs) and X-ray flares with the fluxes of solar protons with energies >10 and >100 MeV observed near the Earth. The basis for this analysis was the events that took place in 1976-2015, for which there are reliable observations of X-ray flares on GOES satellites and CME observations with SOHO/LASCO coronagraphs. A fairly good correlation has been revealed between the magnitude of proton enhancements and the power and duration of flares, as well as the initial CME speed. The statistics do not give a clear advantage either to CMEs or the flares concerning their relation with proton events, but the characteristics of the flares and ejections complement each other well and are reasonable to use together in the forecast models. Numerical dependences are obtained that allow estimation of the proton fluxes to the Earth expected from solar observations; possibilities for improving the model are discussed.
NASA Astrophysics Data System (ADS)
Alekseenko, M. A.; Gendrina, I. Yu.
2017-11-01
Recently, owing to the abundance of various types of observational data in systems for vision through the atmosphere and the need to process them, the use of various methods of statistical research in the study of such systems, including correlation-regression analysis, dynamic series, and variance analysis, has become topical. We have attempted to apply elements of correlation-regression analysis to the study and subsequent prediction of the patterns of radiation transfer in these systems, as well as to the construction of radiation models of the atmosphere. In this paper, we present some results of statistical processing of the results of numerical simulation of the characteristics of vision systems through the atmosphere, obtained with the help of a special software package.
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.
On the Choice of Variable for Atmospheric Moisture Analysis
NASA Technical Reports Server (NTRS)
Dee, Dick P.; DaSilva, Arlindo M.; Atlas, Robert (Technical Monitor)
2002-01-01
The implications of using different control variables for the analysis of moisture observations in a global atmospheric data assimilation system are investigated. A moisture analysis based on either mixing ratio or specific humidity is prone to large extrapolation errors, due to the high variability in space and time of these parameters and to the difficulties in modeling their error covariances. Using the logarithm of specific humidity does not alleviate these problems, and has the further disadvantage that very dry background estimates cannot be effectively corrected by observations. Relative humidity is a better choice from a statistical point of view, because this field is spatially and temporally more coherent and error statistics are therefore easier to obtain. If, however, the analysis is designed to preserve relative humidity in the absence of moisture observations, then the analyzed specific humidity field depends entirely on analyzed temperature changes. If the model has a cool bias in the stratosphere this will lead to an unstable accumulation of excess moisture there. A pseudo-relative humidity can be defined by scaling the mixing ratio by the background saturation mixing ratio. A univariate pseudo-relative humidity analysis will preserve the specific humidity field in the absence of moisture observations. A pseudo-relative humidity analysis is shown to be equivalent to a mixing ratio analysis with flow-dependent covariances. In the presence of multivariate (temperature-moisture) observations it produces analyzed relative humidity values that are nearly identical to those produced by a relative humidity analysis. Based on a time series analysis of radiosonde observed-minus-background differences, it appears to be more justifiable to neglect specific humidity-temperature correlations (in a univariate pseudo-relative humidity analysis) than to neglect relative humidity-temperature correlations (in a univariate relative humidity analysis). A pseudo-relative humidity analysis is easily implemented in an existing moisture analysis system, by simply scaling observed-minus-background moisture residuals prior to solving the analysis equation, and rescaling the analyzed increments afterward.
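In symbols, the scaling described above can be written compactly. A sketch in LaTeX notation, under the assumption that q denotes the mixing ratio, q_s its saturation value, and superscripts b and o the background and observation; the symbols are ours, chosen for illustration.

```latex
\tilde{r} = \frac{q}{q_s^{\,b}}, \qquad
\tilde{r}^{\,o} - \tilde{r}^{\,b} = \frac{q^{o} - q^{b}}{q_s^{\,b}}, \qquad
q_s^{\,b} = q_s\!\left(T^{b}, p\right)
```

The analysis then operates on the scaled residuals, and the analyzed increments are multiplied back by the background saturation value afterward, which is why a univariate analysis of this variable preserves specific humidity when no moisture observations are present.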
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. © 2007 by the Ecological Society of America.
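A minimal sketch of this kind of workflow with scikit-learn's random forest implementation. The presence/absence data and covariates are synthetic stand-ins, not the ecological datasets above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for species presence/absence with environmental covariates.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
acc = cross_val_score(rf, X, y, cv=10).mean()    # cross-validated accuracy

rf.fit(X, y)
print(f"10-fold CV accuracy: {acc:.2f}")
print("variable importances:", np.round(rf.feature_importances_, 3))
```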
Considerations for the design, analysis and presentation of in vivo studies.
Ranstam, J; Cook, J A
2017-03-01
To describe, explain and give practical suggestions regarding important principles and key methodological challenges in the study design, statistical analysis, and reporting of results from in vivo studies. Pre-specifying endpoints and analysis, recognizing the common underlying assumption of statistically independent observations, performing sample size calculations, and addressing multiplicity issues are important parts of an in vivo study. A clear reporting of results and informative graphical presentations of data are other important parts. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Statistical analysis of CCSN/SS7 traffic data from working CCS subnetworks
NASA Astrophysics Data System (ADS)
Duffy, Diane E.; McIntosh, Allen A.; Rosenstein, Mark; Willinger, Walter
1994-04-01
In this paper, we report on an ongoing statistical analysis of actual CCSN traffic data. The data consist of approximately 170 million signaling messages collected from a variety of different working CCS subnetworks. The key findings from our analysis concern: (1) the characteristics of both the telephone call arrival process and the signaling message arrival process; (2) the tail behavior of the call holding time distribution; and (3) the observed performance of the CCSN with respect to a variety of performance and reliability measurements.
2014-09-30
for Analysis of Convective Mass Flux Parameterizations Using DYNAMO Direct Observations R. Michael Hardesty CIRES/University of Colorado/NOAA 325...the R/V Revelle during legs 2 & 3 of the DYNAMO experiment to help characterize vertical transport through the boundary layer and to build statistics...obtained during DYNAMO, and to investigate whether cold pools that emanate from convection organize the interplay between humidity and convection and
OSO 8 observational limits to the acoustic coronal heating mechanism
NASA Technical Reports Server (NTRS)
Bruner, E. C., Jr.
1981-01-01
An improved analysis of time-resolved line profiles of the C IV resonance line at 1548 A has been used to test the acoustic wave hypothesis of solar coronal heating. It is shown that the observed motions and brightness fluctuations are consistent with the existence of acoustic waves. Specific account is taken of the effect of photon statistics on the observed velocities, and a test is devised to determine whether the motions represent propagating or evanescent waves. It is found that on the average about as much energy is carried upward as downward, such that the net acoustic flux density is statistically consistent with zero. The statistical uncertainty in this null result is three orders of magnitude lower than the flux level needed to heat the corona.
A Divergence Statistics Extension to VTK for Performance Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pebay, Philippe Pierre; Bennett, Janine Camille
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
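Since the engine itself is C++ inside VTK and its API is not reproduced here, the following is only a language-agnostic sketch of the underlying computation: a divergence between an observed empirical histogram and a theoretical reference distribution. The bin edges and the reference model are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
data = rng.normal(0.2, 1.1, 10_000)          # observed samples

edges = np.linspace(-5, 5, 41)
counts, _ = np.histogram(data, bins=edges)
p = counts / counts.sum()                    # observed empirical distribution

cdf = stats.norm(0.0, 1.0).cdf               # theoretical "ideal" model
q = np.diff(cdf(edges))
q /= q.sum()                                 # reference probability per bin

mask = p > 0                                 # skip empty bins to avoid log(0)
kl = np.sum(p[mask] * np.log(p[mask] / q[mask]))
print(f"KL divergence (observed || ideal) = {kl:.4f}")
```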
Ignjatović, Aleksandra; Stojanović, Miodrag; Milošević, Zoran; Anđelković Apostolović, Marija
2017-12-02
The interest in developing risk models in medicine is appealing but also associated with many obstacles in different aspects of predictive model development. Initially, the association of biomarkers, or of several markers, with a specific outcome was established by statistical significance, but novel and demanding questions required the development of new and more complex statistical techniques. The progress of statistical analysis in biomedical research can best be observed through the history of the Framingham study and the development of the Framingham score. Evaluation of a predictive model rests on the combined results of several metrics. When logistic regression or Cox proportional hazards regression analysis is used, the calibration test and ROC curve analysis should be mandatory and eliminatory, and a central place should be taken by some newer statistical techniques. In order to obtain complete information on a new marker in the model, it has recently been recommended to use reclassification tables, calculating the net reclassification index and the integrated discrimination improvement. Decision curve analysis is a novel method for evaluating the clinical usefulness of a predictive model. It may be noted that the customizing and fine-tuning of the Framingham risk score initiated much of this development of statistical analysis. A clinically applicable predictive model should be a trade-off between all the abovementioned statistical metrics: a trade-off between calibration and discrimination, accuracy and decision-making, costs and benefits, and the quality and quantity of the patient's life.
A statistical study of EMIC waves observed by Cluster. 1. Wave properties. EMIC Wave Properties
Allen, R. C.; Zhang, J. -C.; Kistler, L. M.; ...
2015-07-23
Electromagnetic ion cyclotron (EMIC) waves are an important mechanism for particle energization and losses inside the magnetosphere. In order to better understand the effects of these waves on particle dynamics, detailed information about the occurrence rate, wave power, ellipticity, normal angle, energy propagation angle distributions, and local plasma parameters are required. Previous statistical studies have used in situ observations to investigate the distribution of these parameters in the magnetic local time versus L-shell (MLT-L) frame within a limited magnetic latitude (MLAT) range. In our study, we present a statistical analysis of EMIC wave properties using 10 years (2001-2010) of data from Cluster, totaling 25,431 min of wave activity. Due to the polar orbit of Cluster, we are able to investigate EMIC waves at all MLATs and MLTs. This allows us to further investigate the MLAT dependence of various wave properties inside different MLT sectors and further explore the effects of Shabansky orbits on EMIC wave generation and propagation. Thus, the statistical analysis is presented in two papers. Our paper focuses on the wave occurrence distribution as well as the distribution of wave properties. The companion paper focuses on local plasma parameters during wave observations as well as wave generation proxies.
Performance characteristics of a visual-search human-model observer with sparse PET image data
NASA Astrophysics Data System (ADS)
Gifford, Howard C.
2012-02-01
As predictors of human performance in detection-localization tasks, statistical model observers can have problems with tasks that are primarily limited by target contrast or structural noise. Model observers with a visual-search (VS) framework may provide a more reliable alternative. This framework provides for an initial holistic search that identifies suspicious locations for analysis by a statistical observer. A basic VS observer for emission tomography focuses on hot "blobs" in an image and uses a channelized nonprewhitening (CNPW) observer for analysis. In [1], we investigated this model for a contrast-limited task with SPECT images; herein, a statistical-noise-limited task involving PET images is considered. An LROC study used 2D image slices with liver, lung, and soft-tissue tumors. Human and model observers read the images in coronal, sagittal, and transverse display formats. The study thus measured the detectability of tumors in a given organ as a function of display format. The model observers were applied under several task variants that tested their response to structural noise both at the organ boundaries alone and over the organs as a whole. As measured by correlation with the human data, the VS observer outperformed the CNPW scanning observer.
ERIC Educational Resources Information Center
Altonji, Joseph G.; Pierret, Charles R.
A statistical analysis was performed to test the hypothesis that, if profit-maximizing firms have limited information about the general productivity of new workers, they may choose to use easily observable characteristics such as years of education to discriminate statistically among workers. Information about employer learning was obtained by…
Advanced Categorical Statistics: Issues and Applications in Communication Research.
ERIC Educational Resources Information Center
Denham, Bryan E.
2002-01-01
Discusses not only the procedures, assumptions, and applications of advanced categorical statistics, but also covers some common misapplications, from which a great deal can be learned. Addresses the use and limitations of cross-tabulation and chi-square analysis, as well as issues such as observation independence and artificial inflation of a…
2003-07-01
(Gnanadesikan, 1977). An entity whose measured features fall into one of the regions is classified accordingly. For the approaches we discuss here... Gnanadesikan, R. 1977. Methods for Statistical Data Analysis of Multivariate Observations. John Wiley & Sons, New York. Hassig, N. L., O'Brien, R. F
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses
Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.
2015-01-01
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
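The weighting principle is conceptually close to inverse-variance weighted regression. A minimal sketch with statsmodels illustrating why down-weighting noisy samples sharpens inference; this is an illustration of the principle only, not the limma/voom implementation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 60
group = np.repeat([0, 1], n // 2)

# Heterogeneous sample quality: the noise SD varies from sample to sample.
sd = rng.uniform(0.5, 3.0, n)
y = 1.0 + 0.8 * group + rng.normal(0.0, sd)

X = sm.add_constant(group.astype(float))
ols = sm.OLS(y, X).fit()                        # treats all samples equally
wls = sm.WLS(y, X, weights=1.0 / sd**2).fit()   # down-weights variable samples

# The weighted fit yields a smaller standard error for the group effect.
print(f"OLS SE: {ols.bse[1]:.3f}  WLS SE: {wls.bse[1]:.3f}")
```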
NASA Astrophysics Data System (ADS)
Matsuda, Takashi S.; Nakamura, Takuji; Shiokawa, Kazuo; Tsutsumi, Masaki; Suzuki, Hidehiko; Ejiri, Mitsumu K.; Taguchi, Makoto
Atmospheric gravity waves (AGWs), which are generated in the lower atmosphere, transport a significant amount of energy and momentum into the mesosphere and lower thermosphere and cause mean wind accelerations in the mesosphere. This momentum deposition drives the general circulation and affects the temperature structure. Among the many parameters that characterize AGWs, the horizontal phase velocity is very important for discussing vertical propagation. Airglow imaging is a useful technique for investigating the horizontal structures of AGWs at around 90 km altitude. Recently, there have been many reports on the statistical characteristics of AGWs observed by airglow imaging. However, comparison of these results obtained at various locations is difficult because each research group uses its own method for extracting and analyzing AGW events. We have developed a new statistical analysis method for obtaining the power spectrum in the horizontal phase velocity domain from airglow image data, so as to deal with huge amounts of imaging data obtained in different years and at various observation sites, without the bias caused by observer-dependent event extraction criteria. This method was applied to the data obtained at Syowa Station, Antarctica, in 2011 and compared with a conventional event analysis in which the phase fronts were traced manually in order to estimate horizontal characteristics. This comparison shows that our new method is adequate for deriving the horizontal phase velocity characteristics of AGWs observed by the airglow imaging technique. We plan to apply this method to airglow imaging data observed at Syowa Station in 2002 and between 2008 and 2013, and also to data observed at other stations in Antarctica (e.g. Rothera Station (67S, 68W) and Halley Station (75S, 26W)), in order to investigate the behavior of AGW propagation direction and source distribution in the MLT region over Antarctica. In this presentation, we will report interim analysis results for the data at Syowa Station.
Esthetic evaluation of maxillary single-tooth implants in the esthetic zone
Cho, Hae-Lyung; Lee, Jae-Kwan; Um, Heung-Sik
2010-01-01
Purpose The aim of this study is to assess the influence exerted by the observer's dental specialization and to compare patients' opinions with observers' opinions of the esthetics of maxillary single-tooth implants in the esthetic zone. Methods Forty-one adult patients, who were treated with a single implant in the esthetic zone, were enrolled in this study. Eight observers (2 periodontists, 2 prosthodontists, 2 orthodontists and 2 senior dental students) applied the pink esthetic score (PES)/white esthetic score (WES) to 41 implant-supported single restorations twice, with an interval of 4 weeks. We used a visual analog scale (VAS) to assess the patients' satisfaction with the treatment outcome from an esthetic point of view. Results In the PES/WES, very good and moderate intraobserver agreement was noted between the first and second rating. The mean total PES/WES was 11.19 ± 3.59. The mean PES was 5.17 ± 2.29 and the mean WES was 6.02 ± 1.96. In the total PES/WES, the difference between the groups was not significant. However, in the WES, the difference between the groups was significant, and prosthodontists were found to have assigned poorer ratings than the other groups. Periodontists gave higher ratings than prosthodontists and senior dental students. Orthodontists were clearly more critical than the other observers. The statistical analysis revealed a statistically significant correlation between patients' esthetic perception and dentists' perception of the anterior tooth. However, the correlation between the total PES/WES and the VAS score for the first premolar was not statistically significant. Conclusions The PES/WES is an objective tool for rating the esthetics of implant-supported single crowns and adjacent soft tissues. Orthodontists were the most critical observers, while periodontists were more generous than the other observers. The statistical analysis revealed a statistically significant correlation between patients' esthetic perception and dentists' perception of the anterior tooth. PMID:20827328
Statistical innovations in the medical device world sparked by the FDA.
Campbell, Gregory; Yue, Lilly Q
2016-01-01
The world of medical devices while highly diverse is extremely innovative, and this facilitates the adoption of innovative statistical techniques. Statisticians in the Center for Devices and Radiological Health (CDRH) at the Food and Drug Administration (FDA) have provided leadership in implementing statistical innovations. The innovations discussed include: the incorporation of Bayesian methods in clinical trials, adaptive designs, the use and development of propensity score methodology in the design and analysis of non-randomized observational studies, the use of tipping-point analysis for missing data, techniques for diagnostic test evaluation, bridging studies for companion diagnostic tests, quantitative benefit-risk decisions, and patient preference studies.
Statistical analysis of flight times for space shuttle ferry flights
NASA Technical Reports Server (NTRS)
Graves, M. E.; Perlmutter, M.
1974-01-01
Markov chain and Monte Carlo analysis techniques are applied to the simulated Space Shuttle Orbiter Ferry flights to obtain statistical distributions of flight time duration between Edwards Air Force Base and Kennedy Space Center. The two methods are compared, and are found to be in excellent agreement. The flights are subjected to certain operational and meteorological requirements, or constraints, which cause eastbound and westbound trips to yield different results. Persistence of events theory is applied to the occurrence of inclement conditions to find their effect upon the statistical flight time distribution. In a sensitivity test, some of the constraints are varied to observe the corresponding changes in the results.
Water quality management using statistical analysis and time-series prediction model
NASA Astrophysics Data System (ADS)
Parmar, Kulwinder Singh; Bhardwaj, Rashmi
2014-12-01
This paper deals with water quality management using statistical analysis and a time-series prediction model. The monthly variation of water quality standards has been used to compare the statistical mean, median, mode, standard deviation, kurtosis, skewness and coefficient of variation at the Yamuna River. The model was validated using R-squared, root mean square error, mean absolute percentage error, maximum absolute percentage error, mean absolute error, maximum absolute error, the normalized Bayesian information criterion, Ljung-Box analysis, predicted values and confidence limits. Using an autoregressive integrated moving average (ARIMA) model, future values of the water quality parameters have been estimated. It is observed that the predictive model is useful at the 95% confidence limits, and that the distribution is platykurtic for potential of hydrogen (pH), free ammonia, total Kjeldahl nitrogen, dissolved oxygen and water temperature (WT), and leptokurtic for chemical oxygen demand and biochemical oxygen demand. The predicted series is close to the original series, providing a good fit. All parameters except pH and WT cross the limits prescribed by the World Health Organization/United States Environmental Protection Agency, and thus the water is not fit for drinking, agricultural or industrial use.
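A minimal sketch of the time-series workflow the abstract describes (fit an ARIMA model to a monthly water quality series, check residuals with a Ljung-Box test, forecast with confidence limits), assuming statsmodels is available; the series and the ARIMA order are illustrative, not taken from the paper:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Illustrative monthly series (e.g., dissolved oxygen); real data would be loaded here.
rng = np.random.default_rng(0)
y = 8.0 + np.cumsum(rng.normal(0, 0.1, 120))   # 10 years of monthly values

res = ARIMA(y, order=(1, 1, 1)).fit()          # order chosen for illustration only
print("BIC:", res.bic)                          # model-selection criterion named in the abstract

# Ljung-Box test on residuals: large p-values suggest no leftover autocorrelation.
print(acorr_ljungbox(res.resid, lags=[12]))

# 12-month-ahead forecast with 95% confidence limits.
fc = res.get_forecast(steps=12)
print(fc.predicted_mean)
print(fc.conf_int(alpha=0.05))
```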
The problem of pseudoreplication in neuroscientific studies: is it affecting your analysis?
2010-01-01
Background Pseudoreplication occurs when observations are not statistically independent but are treated as if they are. This can occur when there are multiple observations on the same subjects, when samples are nested or hierarchically organised, or when measurements are correlated in time or space. Analysis of such data without taking these dependencies into account can lead to meaningless results, and examples can easily be found in the neuroscience literature. Results A single issue of Nature Neuroscience provided a number of examples and is used as a case study to highlight how pseudoreplication arises in neuroscientific studies and why the analyses in these papers are incorrect; appropriate analytical methods are provided. 12% of papers had pseudoreplication, and a further 36% were suspected of having pseudoreplication, but it was not possible to determine this for certain because insufficient information was provided. Conclusions Pseudoreplication can undermine the conclusions of a statistical analysis, and it would be easier to detect if the sample size, degrees of freedom, the test statistic, and precise p-values were reported. This information should be a requirement for all publications. PMID:20074371
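One standard remedy for pseudoreplication is a hierarchical (mixed-effects) model that keeps the subject-level dependency explicit. A minimal sketch with statsmodels, using hypothetical column names (response, treatment, animal) and synthetic data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
animals = np.repeat([f"a{i}" for i in range(8)], 6)   # 8 animals, 6 cells each
treatment = np.repeat(["ctl", "drug"], 24)            # treatment assigned per animal
animal_effect = np.repeat(rng.normal(0, 0.5, 8), 6)   # shared within-animal shift
y = 1.0 + 0.8 * (treatment == "drug") + animal_effect + rng.normal(0, 0.3, 48)

df = pd.DataFrame({"response": y, "treatment": treatment, "animal": animals})

# A random intercept per animal models the within-animal correlation; treating
# all 48 cells as independent replicates would be pseudoreplication (true n is 8).
m = smf.mixedlm("response ~ treatment", df, groups=df["animal"]).fit()
print(m.summary())
```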
An interactive environment for the analysis of large Earth observation and model data sets
NASA Technical Reports Server (NTRS)
Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.
1993-01-01
We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X DataSlice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.
NASA Astrophysics Data System (ADS)
Shulgina, T.; Genina, E.; Gordov, E.; Nikitchuk, K.
2009-04-01
At present, numerous data archives, including meteorological observations as well as climate modeling data, are available to Earth science specialists. Methods of mathematical statistics are widely used for their processing and analysis; in many cases they represent the only way to quantitatively assess meteorological and climatic information. A unified set of analysis methods allows us to compare climatic characteristics calculated on the basis of different datasets, with the purpose of performing a more detailed analysis of climate dynamics at both regional and global levels. The report presents the results of a comparative analysis of atmospheric temperature behavior over Northern Eurasia for the period from 1979 to 2004, based on the NCEP/NCAR Reanalysis, NCEP/DOE Reanalysis AMIP II, JMA/CRIEPI JRA-25 Reanalysis and ECMWF ERA-40 Reanalysis data, and on observations from meteorological stations of the former Soviet Union. Statistical processing of the temperature data included analysis of the homogeneity of time series of WMO-approved climate indices, such as "Number of frost days", "Number of summer days", "Number of icing days" and "Number of tropical nights", by means of parametric methods of mathematical statistics (Fisher and Student tests). This allowed a comprehensive study of the spatio-temporal features of atmospheric temperature. Analysis of the temperature dynamics revealed inhomogeneity of the data obtained over long observation intervals. In particular, analysis for the period 1979-2004 showed a significant increase in the number of frost and icing days of approximately 1 day every 2 years, and a decrease of roughly 1 day every 2 years in the number of summer days. The growing-season mean temperature also increased by 1.5-2 °C over the period considered. The use of different reanalysis datasets in conjunction with in-situ observations allowed comparison of climate index values calculated from different datasets, improving the reliability of the results obtained. Partial support of SB RAS Basic Research Program 4.5.2 (Project 2) is acknowledged.
Yue, Lilly Q
2012-01-01
In the evaluation of medical products, including drugs, biological products, and medical devices, comparative observational studies can play an important role when properly conducted randomized, well-controlled clinical trials are infeasible for ethical or practical reasons. However, various biases can be introduced at every stage and into every aspect of an observational study, and consequently the interpretation of the resulting statistical inference is of concern. While statistical techniques do exist for addressing some of the challenging issues, often based on propensity score methodology, these tools have probably not been as widely employed in prospectively designing observational studies as they should be. They are also at times implemented in an unscientific manner, such as performing propensity score model selection on a dataset that already includes the outcome data, so that the integrity of the observational study design and the interpretability of the outcome analysis results can be compromised. In this paper, regulatory considerations on prospective study design using propensity scores are shared and illustrated with hypothetical examples.
Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques
NASA Astrophysics Data System (ADS)
Mishra, D.; Goyal, P.
2014-12-01
Urban air pollution forecasting has emerged as an acute problem in recent years because of the severe environmental degradation caused by increasing concentrations of harmful pollutants in the ambient atmosphere. In this study, different statistical as well as artificial intelligence techniques are used for forecasting and analysis of air pollution over the Delhi urban area. These techniques are principal component analysis (PCA), multiple linear regression (MLR) and artificial neural networks (ANN), and their forecasts are in good agreement with the concentrations observed by the Central Pollution Control Board (CPCB) at different locations in Delhi. However, such methods provide limited accuracy because they are unable to predict extreme points, i.e. the pollution maximum and minimum cut-offs cannot be determined using these approaches. With advances in technology and research, an alternative to these traditional methods has been proposed: the coupling of statistical techniques with artificial intelligence (AI) for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used here for forecasting air pollutants over the Delhi urban area. The statistical measures, e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA), of the proposed model show better agreement than all the other models. Hence, the coupling of statistical and artificial intelligence methods can be used for forecasting air pollutants over urban areas.
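A minimal sketch of the statistical half of such a pipeline (PCA to decorrelate the predictors, then multiple linear regression on the leading components), assuming scikit-learn; the feature count, component count and data are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(365, 6))   # daily predictors: wind, temperature, humidity, ...
y = X @ np.array([1.5, -0.8, 0.3, 0.0, 0.0, 0.0]) + rng.normal(0, 0.5, 365)  # pollutant level

# Standardize -> project onto leading principal components -> regress.
model = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
model.fit(X[:300], y[:300])
print("R on held-out days:", np.corrcoef(model.predict(X[300:]), y[300:])[0, 1])
```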
Extreme Statistics of Storm Surges in the Baltic Sea
NASA Astrophysics Data System (ADS)
Kulikov, E. A.; Medvedev, I. P.
2017-11-01
A statistical analysis of the extreme values of Baltic Sea level has been performed on series of observations spanning 15-125 years at 13 tide gauge stations. It is shown that the empirical relation between the value of an extreme sea level rise or fall (caused by storm events) and its return period in the Baltic Sea can be well approximated by the Gumbel probability distribution. The maximum values of extreme floods/ebbs with a 100-year return period were observed in the Gulf of Finland and the Gulf of Riga. The two longest data series, observed in Stockholm and Vyborg over 125 years, show a significant deviation from the Gumbel distribution for the rarest events. Statistical analysis of the hourly sea level series reveals some asymmetry in the variability of the Baltic Sea level: the probability of rises proved higher than that of falls, and the magnitude of the 100-year surge considerably exceeded the magnitude of the 100-year ebb almost everywhere. This asymmetry can be attributed to the influence of low atmospheric pressure during storms. The extreme-value analysis has also been applied to sea levels for Narva over the period 1994-2000 simulated by the ROMS numerical model. Comparison of the "simulated" and "observed" extreme sea level distributions shows that the model reproduces extreme floods of "moderate" magnitude quite satisfactorily; however, it underestimates sea level changes for the most powerful storm surges.
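A minimal sketch of the Gumbel fitting and return-level calculation described above, assuming SciPy; the annual-maximum series is synthetic and only illustrates the mechanics:

```python
import numpy as np
from scipy.stats import gumbel_r

rng = np.random.default_rng(2)
annual_max = gumbel_r.rvs(loc=80, scale=25, size=60, random_state=rng)  # cm, synthetic

loc, scale = gumbel_r.fit(annual_max)

# Return level for a T-year event: the (1 - 1/T) quantile of the fitted Gumbel.
for T in (10, 50, 100):
    print(T, "year level:", gumbel_r.ppf(1 - 1 / T, loc=loc, scale=scale))
```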
A note on statistical analysis of shape through triangulation of landmarks
Rao, C. Radhakrishna
2000-01-01
In an earlier paper, the author jointly with S. Suryawanshi proposed statistical analysis of shape through triangulation of landmarks on objects. It was observed that the angles of the triangles are invariant to scaling, location, and rotation of objects. No distinction was made between an object and its reflection. The present paper provides the methodology of shape discrimination when reflection is also taken into account and makes suggestions for modifications to be made when some of the landmarks are collinear. PMID:10737780
Photon counting statistics analysis of biophotons from hands.
Jung, Hyun-Hee; Woo, Won-Myung; Yang, Joon-Mo; Choi, Chunho; Lee, Jonghan; Yoon, Gilwon; Yang, Jong S; Soh, Kwang-Sup
2003-05-01
The photon counting statistics of biophotons emitted from hands is studied with a view to testing its agreement with the Poisson distribution. The moments of the observed probability distribution up to seventh order have been evaluated. The moments of biophoton emission from hands are in good agreement with the theoretical values for a Poisson distribution, while those of the dark counts of the photomultiplier tube show large deviations. The present results are consistent with the conventional delta-value analysis of the second moment of probability.
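The comparison in the abstract amounts to computing sample moments of the count distribution and checking them against Poisson values with the same mean. A small sketch, assuming SciPy; the counts are simulated stand-ins for measured photon numbers:

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(3)
counts = rng.poisson(lam=4.2, size=50_000)   # stand-in for observed photon counts
mu = counts.mean()

# Compare raw sample moments E[n^k] up to 7th order with Poisson moments of equal mean.
for k in range(1, 8):
    sample = np.mean(counts.astype(float) ** k)
    theory = poisson.moment(k, mu)
    print(f"order {k}: sample={sample:.1f}  Poisson={theory:.1f}")
```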
NASA Astrophysics Data System (ADS)
Yan, Rui; Parrot, Michel; Pinçon, Jean-Louis
2017-12-01
In this paper, we present the results of a statistical study of ionospheric ion density variations above areas of seismic activity. The ion density was observed by the low-altitude satellite DEMETER between 2004 and 2010. In the statistical analysis a superposed epoch method is used, in which the ionospheric ion density observed close to the epicenters, both in space and in time, is compared to background values recorded at the same location and under the same conditions. Data associated with aftershocks have been carefully removed from the database to prevent spurious effects on the statistics. It is shown that, during nighttime, anomalous ionospheric perturbations related to earthquakes with magnitudes larger than 5 are evidenced. At the time of these perturbations the background ion fluctuation departs from a normal distribution. The perturbations occur up to 200 km from the epicenters and mainly 5 days before the earthquakes. As expected, an ion density perturbation occurring just after the earthquakes and close to the epicenters is also evidenced.
Staging Liver Fibrosis with Statistical Observers
NASA Astrophysics Data System (ADS)
Brand, Jonathan Frieman
Chronic liver disease is a worldwide health problem, and hepatic fibrosis (HF) is one of the hallmarks of the disease. Pathological diagnosis of HF is based on textural change in the liver as a lobular collagen network develops within portal triads. The scale of the collagen lobules is characteristically on the order of 1 mm, which is close to the resolution limit of in vivo Gd-enhanced MRI. In this work the methods used to collect training and testing images for a Hotelling observer are covered. An observer based on local texture analysis is trained and tested using wet-tissue phantoms, and the technique is used to optimize the MRI sequence based on task performance. The final method developed is a two-stage model observer that classifies fibrotic and healthy tissue in both phantoms and in vivo MRI images. The first-stage observer tests for the presence of local texture; test statistics from the first observer are then used to train the second-stage observer, which globally samples the local observer results. A decision on the disease class is made for an entire MRI image slice using test statistics collected from the second observer. The techniques are tested on wet-tissue phantoms and in vivo clinical patient data.
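For reference, the Hotelling observer named above is a linear observer whose template is the covariance-weighted difference of class means. A minimal numpy sketch under a Gaussian assumption; the data are synthetic feature vectors, not the dissertation's MRI textures:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 16                                    # texture-feature dimension (illustrative)
healthy = rng.normal(0.0, 1.0, (500, d))
fibrotic = rng.normal(0.4, 1.0, (500, d))

# Hotelling template: w = S^{-1} (mu1 - mu0), with S the pooled covariance.
mu0, mu1 = healthy.mean(0), fibrotic.mean(0)
S = 0.5 * (np.cov(healthy, rowvar=False) + np.cov(fibrotic, rowvar=False))
w = np.linalg.solve(S, mu1 - mu0)

# Scalar test statistic t = w^T x; threshold it to classify a new image/slice.
t_h, t_f = healthy @ w, fibrotic @ w
d_a = (t_f.mean() - t_h.mean()) / np.sqrt(0.5 * (t_f.var() + t_h.var()))
print("observer detectability d_a:", d_a)
```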
Teaching Students to Use Summary Statistics and Graphics to Clean and Analyze Data
ERIC Educational Resources Information Center
Holcomb, John; Spalsbury, Angela
2005-01-01
Textbooks and websites today abound with real data. One neglected issue is that statistical investigations often require a good deal of "cleaning" to ready data for analysis. The purpose of this dataset and exercise is to teach students to use exploratory tools to identify erroneous observations. This article discusses the merits of such…
Finding P-Values for F Tests of Hypothesis on a Spreadsheet.
ERIC Educational Resources Information Center
Rochowicz, John A., Jr.
The calculation of the F statistic for a one-factor analysis of variance (ANOVA) and the construction of an ANOVA table are easily implemented on a spreadsheet. This paper describes how to compute the p-value (observed significance level) for a particular F statistic on a spreadsheet. Decision making on a spreadsheet and applications to the…
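The p-value in question is the upper-tail area of the F distribution beyond the observed statistic; in a spreadsheet this is, e.g., Excel's =F.DIST.RT(F, df1, df2). The same computation in Python with SciPy, using illustrative numbers:

```python
from scipy.stats import f

F_stat, dfn, dfd = 4.07, 2, 27      # illustrative one-factor ANOVA result
p_value = f.sf(F_stat, dfn, dfd)    # survival function = upper-tail area
print(p_value)                       # ~0.029: reject H0 at the 5% level
```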
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
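The program itself is FORTRAN IV; the stepwise idea (add the predictor whose coefficient is most significant, stop when nothing passes the threshold) can be sketched in Python as a loose illustration of the procedure, not a port of the NOVA program:

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y, alpha=0.05):
    """Greedy forward selection: keep adding the predictor with the
    smallest p-value until none is significant at the given level."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        pvals = {}
        for j in remaining:
            cols = selected + [j]
            fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            pvals[j] = fit.pvalues[-1]          # p-value of the newest coefficient
        best = min(pvals, key=pvals.get)
        if pvals[best] > alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(size=200)
print(forward_stepwise(X, y))                   # expect [0, 3]
```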
Wei, Evelyn; Hipwell, Alison; Pardini, Dustin; Beyers, Jennifer M; Loeber, Rolf
2005-10-01
To provide reliability information for a brief observational measure of physical disorder and determine its relation with neighbourhood-level crime and health variables after controlling for census-based measures of concentrated poverty and minority concentration. Psychometric analysis of block observation data comprising a brief measure of neighbourhood physical disorder, and cross-sectional analysis of neighbourhood physical disorder, neighbourhood crime and birth statistics, and neighbourhood-level poverty and minority concentration. Pittsburgh, Pennsylvania, US (2000 population=334 563). Pittsburgh neighbourhoods (n=82) and their residents (as reflected in neighbourhood-level statistics). The physical disorder index showed adequate reliability and validity and was associated significantly with rates of crime, firearm injuries and homicides, and teen births, while controlling for concentrated poverty and minority population. This brief measure of neighbourhood physical disorder may help increase our understanding of how community-level factors reflect health and crime outcomes.
NASA Astrophysics Data System (ADS)
Pilger, Christoph; Schmidt, Carsten; Bittner, Michael
2013-02-01
The detection of infrasonic signals in temperature time series of the mesopause altitude region (at about 80-100 km) is performed at the German Remote Sensing Data Center of the German Aerospace Center (DLR-DFD) using GRIPS instrumentation (GRound-based Infrared P-branch Spectrometers). Mesopause temperature values with a temporal resolution of up to 10 s are derived from the observation of nocturnal airglow emissions and permit the identification of signals within the long-period infrasound range. Spectral intensities of wave signatures with periods between 2.5 and 10 min are estimated by applying the wavelet analysis technique to one-minute mean temperature values. Selected events as well as the statistical distribution over 40 months of observation are presented and discussed with respect to resonant modes of the atmosphere. The mechanism of acoustic resonance generated by strong infrasonic sources is a potential explanation of distinct features with periods between 3 and 5 min observed in the dataset.
Saini, Komal; Singh, Parminder; Bajwa, Bikramjit Singh
2016-12-01
An LED fluorimeter has been used for microanalysis of uranium concentration in groundwater samples collected from six districts of South West (SW), West (W) and North East (NE) Punjab, India. The average uranium content in water samples from SW Punjab is observed to be higher than the WHO and USEPA recommended safe limit of 30 µg/l as well as the AERB proposed limit of 60 µg/l, whereas for the W and NE regions of Punjab the average uranium concentration is within the AERB recommended limit of 60 µg/l. The average value observed in SW Punjab is around 3-4 times that observed in W Punjab and more than 17 times the average value observed in the NE region of Punjab. Carcinogenic as well as non-carcinogenic risks due to uranium have been statistically evaluated for each studied district.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noo, F; Guo, Z
2016-06-15
Purpose: Penalized weighted least-squares reconstruction has become an important research topic in CT, to reduce dose without affecting image quality. Two components impact image quality in this reconstruction: the statistical weights and the use of an edge-preserving penalty term. We are interested in assessing the influence of statistical weights on their own, without the edge-preserving feature. Methods: The influence of statistical weights on image quality was assessed in terms of low-contrast detail detection using LROC analysis. The task amounted to detecting and localizing a 6-mm lesion with random contrast inside the FORBILD head phantom. A two-alternative forced-choice experiment was used with two human observers performing the task. Reconstructions without and with statistical weights were compared, both using the same quadratic penalty term. The beam energy was set to 30 keV to amplify spatial differences in attenuation and thereby the role of statistical weights. A fan-beam data acquisition geometry was used. Results: Visual inspection of images clearly showed a difference in noise between the two reconstruction methods. As expected, the reconstruction without statistical weights exhibited noise streaks. The other reconstruction appeared better in this aspect, but presented other disturbing noise patterns and artifacts induced by the weights. The LROC analysis yielded the following 95-percent confidence interval for the difference in reader-averaged AUC (reconstruction without weights minus reconstruction with weights): [0.0026, 0.0599]. The mean AUC value was 0.9094. Conclusion: We have investigated the impact of statistical weights without the use of an edge-preserving penalty in penalized weighted least-squares reconstruction. A decrease rather than an increase in image quality was observed when using statistical weights; thus, the observers were better able to cope with the noise streaks than with the noise patterns and artifacts induced by the statistical weights. Different results might be obtained if the penalty term were used with a pixel-dependent weight. F Noo receives research support from Siemens Healthcare GmbH.
NASA Astrophysics Data System (ADS)
Lopez, S. R.; Hogue, T. S.
2011-12-01
Global climate models (GCMs) are primarily used to generate historical and future large-scale circulation patterns at a coarse resolution (typically on the order of 50,000 km2) and fail to capture climate variability at the ground level due to localized surface influences (i.e., topography, marine layer, land cover, etc.). Their inability to accurately resolve these processes has led to the development of numerous 'downscaling' techniques. The goal of this study is to enhance statistical downscaling of daily precipitation and temperature for regions with heterogeneous land cover and topography. Our analysis was divided into two periods, historical (1961-2000) and contemporary (1980-2000), and tested using sixteen predictand combinations from four GCMs (GFDL CM2.0, GFDL CM2.1, CNRM-CM3 and MRI-CGCM2 3.2a). The Southern California area was separated into five county regions: Santa Barbara, Ventura, Los Angeles, Orange and San Diego. Principal component analysis (PCA) was performed on ground-based observations in order to (1) reduce the number of redundant gauges and minimize dimensionality and (2) cluster gauges that behave statistically similarly for post-analysis. Post-PCA analysis included extensive testing of predictor-predictand relationships using an enhanced canonical correlation analysis (ECCA). The ECCA obtains the optimal predictand sets for all models within each spatial domain (county), as governed by daily and monthly overall statistics. Results show that all models maintain mean annual and monthly behavior within each county and that daily statistics are improved. The level of improvement depends strongly on the vegetation extent within each county and the land-to-ocean ratio within the GCM spatial grid. The utilization of the entire historical period also leads to better statistical representation of observed daily precipitation. The validated ECCA technique is being applied to future climate scenarios distributed by the IPCC in order to provide forcing data for regional hydrologic models and assess future water resources in the Southern California region.
NASA Technical Reports Server (NTRS)
Hughes, William O.; McNelis, Anne M.
2010-01-01
The Earth Observing System (EOS) Terra spacecraft was launched on an Atlas IIAS launch vehicle on its mission to observe planet Earth in late 1999. Prior to launch, the new design of the spacecraft's pyroshock separation system was characterized by a series of 13 separation ground tests. The analysis methods used to evaluate this unusually large amount of shock data will be discussed in this paper, with particular emphasis on population distributions and finding statistically significant families of data, leading to an overall shock separation interface level. The wealth of ground test data also allowed a derivation of a Mission Assurance level for the flight. All of the flight shock measurements were below the EOS Terra Mission Assurance level thus contributing to the overall success of the EOS Terra mission. The effectiveness of the statistical methodology for characterizing the shock interface level and for developing a flight Mission Assurance level from a large sample size of shock data is demonstrated in this paper.
Matysiak, W; Królikowska-Prasał, I; Staszyc, J; Kifer, E; Romanowska-Sarlej, J
1989-01-01
The studies were performed on 44 white female Wistar rats which were intratracheally administered suspensions of soil dust and electro-energetic ashes. The electro-energetic ashes were collected from 6 different local heat and power generating plants, while the soil dust came from several randomly chosen places in the country. A statistical analysis of the body and lung mass of the animals subjected to a single dust or ash insufflation was performed. The applied variants showed statistically significant differences in body and lung mass, and the observed differences are connected with the kinds of dust and ash used in the experiment.
NASA Astrophysics Data System (ADS)
Barré, Anthony; Suard, Frédéric; Gérard, Mathias; Montaru, Maxime; Riu, Delphine
2014-01-01
This paper describes the statistical analysis of data parameters recorded during electric vehicle use to study electrical battery ageing. These data permit traditional battery ageing investigation based on the evolution of capacity fade and resistance rise. The measured variables are examined in order to explain the correlation between battery ageing and operating conditions during the experiments; such a study enables us to identify the main ageing factors. Detailed statistical dependency explorations then reveal the factors responsible for battery ageing phenomena, and predictive battery ageing models are built from this approach. The results thereby demonstrate and quantify a relationship between the variables and global observations of battery ageing, and also allow accurate battery ageing diagnosis through the predictive models.
NASA Astrophysics Data System (ADS)
Kassem, M.; Soize, C.; Gagliardini, L.
2009-06-01
In this paper, an energy-density field approach applied to the vibroacoustic analysis of complex industrial structures in the low- and medium-frequency ranges is presented. This approach uses a statistical computational model. The analyzed system consists of an automotive vehicle structure coupled with its internal acoustic cavity. The objective of this paper is to make use of the statistical properties of the frequency response functions of the vibroacoustic system observed from previous experimental and numerical work. The frequency response functions are expressed in terms of a dimensionless matrix which is estimated using the proposed energy approach. Using this dimensionless matrix, a simplified vibroacoustic model is proposed.
Verification of forecast ensembles in complex terrain including observation uncertainty
NASA Astrophysics Data System (ADS)
Dorninger, Manfred; Kloiber, Simon
2017-04-01
Traditionally, verification means comparing a forecast (ensemble) with the truth as represented by observations. Observation errors are quite often neglected, on the argument that they are small compared to the forecast error. In this study, part of the MesoVICT (Mesoscale Verification Inter-comparison over Complex Terrain) project, it will be shown that observation errors have to be taken into account for verification purposes. The observation uncertainty is estimated from the VERA (Vienna Enhanced Resolution Analysis) and represented via two analysis ensembles which are compared to the forecast ensemble. Throughout the study, results from COSMO-LEPS provided by Arpae-SIMC Emilia-Romagna are used as the forecast ensemble. The time period covers the MesoVICT core case from 20-22 June 2007. In a first step, all ensembles are investigated with respect to their distribution. Several tests were executed (Kolmogorov-Smirnov test, Finkelstein-Schafer test, chi-square test, etc.), none of which identified an exact mathematical distribution, so the main focus is on non-parametric statistics (e.g. kernel density estimation, boxplots, etc.) and on the deviation between "forced" normally distributed data and the kernel density estimates. In a next step, the observational deviations due to the analysis ensembles are analysed. In a first approach, scores are calculated multiple times, with each single member of the analysis ensemble in turn regarded as the "true" observation. The results are presented as boxplots for the different scores and parameters. Additionally, the bootstrapping method is applied to the ensembles. These possible approaches to incorporating observational uncertainty into the computation of statistics will be discussed in the talk.
The Canadian Precipitation Analysis (CaPA): Evaluation of the statistical interpolation scheme
NASA Astrophysics Data System (ADS)
Evans, Andrea; Rasmussen, Peter; Fortin, Vincent
2013-04-01
CaPA (Canadian Precipitation Analysis) is a data assimilation system which employs statistical interpolation to combine observed precipitation with gridded precipitation fields produced by Environment Canada's Global Environmental Multiscale (GEM) climate model into a final gridded precipitation analysis. Precipitation is important in many fields and applications, including agricultural water management projects, flood control programs, and hydroelectric power generation planning. Precipitation is a key input to hydrological models, and there is a desire to have access to the best available information about precipitation in time and space. The principal goal of CaPA is to produce this type of information. In order to perform the necessary statistical interpolation, CaPA requires the estimation of a semi-variogram. This semi-variogram is used to describe the spatial correlations between precipitation innovations, defined as the observed precipitation amounts minus the GEM forecasted amounts predicted at the observation locations. Currently, CaPA uses a single isotropic variogram across the entire analysis domain. The present project investigates the implications of this choice by first conducting a basic variographic analysis of precipitation innovation data across the Canadian prairies, with specific interest in identifying and quantifying potential anisotropy within the domain. This focus is further expanded by identifying the effect of storm type on the variogram. The ultimate goal of the variographic analysis is to develop improved semi-variograms for CaPA that better capture the spatial complexities of precipitation over the Canadian prairies. CaPA presently applies a Box-Cox data transformation to both the observations and the GEM data, prior to the calculation of the innovations. The data transformation is necessary to satisfy the normal distribution assumption, but introduces a significant bias. The second part of the investigation aims at devising a bias correction scheme based on a moving-window averaging technique. For both the variogram and bias correction components of this investigation, a series of trial runs are conducted to evaluate the impact of these changes on the resulting CaPA precipitation analyses.
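A minimal sketch of the variographic step described above: an empirical semi-variogram of precipitation innovations (observation minus background) from station coordinates, assuming isotropy; numpy only, with illustrative binning:

```python
import numpy as np

rng = np.random.default_rng(6)
xy = rng.uniform(0, 500, (300, 2))   # station coordinates (km), synthetic
innov = rng.normal(0, 1.0, 300)      # observed minus GEM-forecast precipitation

# Empirical semi-variogram: gamma(h) = 0.5 * mean[(z_i - z_j)^2] over pairs in lag bin h.
i, j = np.triu_indices(len(innov), k=1)
h = np.hypot(*(xy[i] - xy[j]).T)
sq = 0.5 * (innov[i] - innov[j]) ** 2

bins = np.arange(0, 300, 25)
which = np.digitize(h, bins)
for b in range(1, len(bins)):
    m = which == b
    if m.any():
        print(f"lag {bins[b-1]:3.0f}-{bins[b]:3.0f} km: gamma = {sq[m].mean():.3f}")
```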
Mechanical properties of silicate glasses exposed to a low-Earth orbit
NASA Technical Reports Server (NTRS)
Wiedlocher, David E.; Tucker, Dennis S.; Nichols, Ron; Kinser, Donald L.
1992-01-01
The effects of a 5.8-year exposure to the low-Earth-orbit environment on the mechanical properties of commercial optical fused silica, low-iron soda-lime-silica, Pyrex 7740, Vycor 7913, BK-7, and the glass ceramic Zerodur were examined. Mechanical testing employed the ASTM F-394 piston-on-3-ball method in a liquid nitrogen environment. Samples were exposed on the Long Duration Exposure Facility (LDEF) in two locations. Impacts were observed on all specimens except Vycor. A Weibull analysis as well as a standard statistical evaluation were conducted. The Weibull analysis revealed no differences between the control samples and the two exposed sample sets. We thus conclude that the radiation components of the Earth-orbital environment did not degrade the mechanical strength of the samples examined, within the limits of experimental error. The upper bound of strength degradation for meteorite-impacted samples, based upon statistical analysis and observation, was 50 percent.
NASA Astrophysics Data System (ADS)
O'Connor, Alison; Kirtman, Benjamin; Harrison, Scott; Gorman, Joe
2016-05-01
The US Navy faces several limitations when planning operations in regard to forecasting environmental conditions. Currently, mission analysis and planning tools rely heavily on short-term (less than a week) forecasts or long-term statistical climate products. However, newly available data in the form of weather forecast ensembles provides dynamical and statistical extended-range predictions that can produce more accurate predictions if ensemble members can be combined correctly. Charles River Analytics is designing the Climatological Observations for Maritime Prediction and Analysis Support Service (COMPASS), which performs data fusion over extended-range multi-model ensembles, such as the North American Multi-Model Ensemble (NMME), to produce a unified forecast for several weeks to several seasons in the future. We evaluated thirty years of forecasts using machine learning to select predictions for an all-encompassing and superior forecast that can be used to inform the Navy's decision planning process.
Statistical Study of the Properties of Magnetosheath Lion Roars using MMS observations
NASA Astrophysics Data System (ADS)
Giagkiozis, S.; Wilson, L. B., III
2017-12-01
Intense whistler-mode waves of very short duration are frequently encountered in the magnetosheath. These emissions have been linked to mirror mode waves and the Earth's bow shock. They can efficiently transfer energy between different plasma populations. These electromagnetic waves are commonly referred to as Lion roars (LR), due to the sound generated when the signals are sonified. They are generally observed during dips of the magnetic field that are anti-correlated with increases of density. Using MMS data, we have identified more than 1750 individual LR burst intervals. Each emission was band-pass filtered and further split into >35,000 subintervals, for which the direction of propagation and the polarization were calculated. The analysis of subinterval properties provides a more accurate representation of their true nature than the more commonly used time- and frequency-averaged dynamic spectra analysis. The results of the statistical analysis of the wave properties will be presented.
Bispectral analysis of equatorial spread F density irregularities
NASA Technical Reports Server (NTRS)
Labelle, J.; Lund, E. J.
1992-01-01
Bispectral analysis has been applied to density irregularities at frequencies of 5-30 Hz observed with a sounding rocket launched from Peru in March 1983. Unlike the power spectrum, the bispectrum contains statistical information about the phase relations between the Fourier components which make up the waveform. In the case of the spread F data from 475 km, the 5-30 Hz portion of the spectrum displays overall enhanced bicoherence relative to that of the background instrumental noise and to that expected from statistical considerations, implying that the observed f^(-2.5) power-law spectrum has a significant non-Gaussian component. This is consistent with previous qualitative analyses. The bicoherence has also been calculated for simulated equatorial spread F density irregularities in approximately the same wavelength regime, and the resulting bispectrum has some features in common with that of the rocket data. The implications of this analysis for equatorial spread F are discussed, and some future investigations are suggested.
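A sketch of segment-averaged bicoherence estimation as used in such analyses, b^2(f1,f2) = |<X(f1)X(f2)X*(f1+f2)>|^2 normalized to lie in [0,1]; the signal is a synthetic quadratically phase-coupled triad, just to show the machinery:

```python
import numpy as np

rng = np.random.default_rng(7)
fs = nfft = 256                 # 1 Hz per FFT bin for simplicity
nseg = 200
t = np.arange(nfft) / fs
f1, f2 = 30.0, 45.0             # coupled pair; energy also at f1 + f2 = 75 Hz

segs = []
for _ in range(nseg):
    p1, p2 = rng.uniform(0, 2 * np.pi, 2)
    s = (np.cos(2*np.pi*f1*t + p1) + np.cos(2*np.pi*f2*t + p2)
         + 0.5 * np.cos(2*np.pi*(f1+f2)*t + p1 + p2)   # quadratic phase coupling
         + rng.normal(0, 1, nfft))
    segs.append(np.fft.rfft(s * np.hanning(nfft)))
X = np.array(segs)

i1, i2 = int(f1), int(f2)       # bin indices at 1 Hz resolution
triple = X[:, i1] * X[:, i2] * np.conj(X[:, i1 + i2])
b2 = abs(triple.mean())**2 / (np.mean(abs(X[:, i1] * X[:, i2])**2)
                              * np.mean(abs(X[:, i1 + i2])**2))
print("bicoherence b^2 at (f1, f2):", b2)   # near 1 => phase-coupled, non-Gaussian
```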
CADDIS Volume 3. Examples and Applications: Analytical Examples
Examples illustrating the use of statistical analysis to support different types of evidence: stream temperature, temperature inferred from macroinvertebrates, macroinvertebrate responses, zinc concentrations, and observed trait characteristics.
NASA Technical Reports Server (NTRS)
Prive, N. C.; Errico, R. M.; Tai, K.-S.
2013-01-01
The Global Modeling and Assimilation Office (GMAO) observing system simulation experiment (OSSE) framework is used to explore the response of analysis error and forecast skill to observation quality. In an OSSE, synthetic observations may be created that have much smaller error than real observations, and precisely quantified error may be applied to these synthetic observations. Three experiments are performed in which synthetic observations with magnitudes of applied observation error that vary from zero to twice the estimated realistic error are ingested into the Goddard Earth Observing System Model (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation for a one-month period representing July. The analysis increment and observation innovation are strongly impacted by observation error, with much larger variances for increased observation error. The analysis quality is degraded by increased observation error, but the change in root-mean-square error of the analysis state is small relative to the total analysis error. Surprisingly, in the 120-hour forecast, increased observation error yields only a slight decline in forecast skill in the extratropics, and no discernible degradation of forecast skill in the tropics.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.
Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E
2015-09-03
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package.
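The limma/voom pipeline itself is R; the core idea (down-weight observations from more variable samples in a weighted regression rather than discarding them) can be illustrated in Python. A conceptual sketch only, with the variance factors assumed known, not the limma algorithm:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
group = np.repeat([0, 1], 4)                     # two conditions, 4 samples each
sd = np.array([1, 1, 1, 3, 1, 1, 1, 3])          # samples 4 and 8 are low quality
y = 2.0 * group + rng.normal(0, sd)              # expression of one gene

X = sm.add_constant(group.astype(float))

# Down-weight the high-variation samples instead of discarding them.
w = 1.0 / sd**2                                  # inverse-variance weights (assumed known here)
wls = sm.WLS(y, X, weights=w).fit()
ols = sm.OLS(y, X).fit()
print("OLS p:", ols.pvalues[1], " WLS p:", wls.pvalues[1])
```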
Statistical issues on the analysis of change in follow-up studies in dental research.
Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S
2007-12-01
To provide an overview of the problems in the design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and non-randomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated in nonrandomized (observational) studies, and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies, and causal inferences from observational data should be avoided.
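The baseline-adjusted analysis recommended for RCTs is a one-line model: regress the follow-up value on treatment group plus baseline. A minimal statsmodels sketch with hypothetical column names (post, baseline, group); the interaction term shows how the key ANCOVA assumption can be checked:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 80
baseline = rng.normal(5.0, 1.0, n)      # e.g., baseline probing depth (mm)
group = rng.integers(0, 2, n)           # randomized treatment indicator
post = baseline * 0.6 - 0.5 * group + rng.normal(0, 0.5, n)
df = pd.DataFrame({"post": post, "baseline": baseline, "group": group})

# ANCOVA: follow-up adjusted for baseline; valid when randomization makes
# baseline independent of group.
print(smf.ols("post ~ baseline + C(group)", df).fit().params)

# Check the no-interaction assumption before trusting the ANCOVA estimate.
print(smf.ols("post ~ baseline * C(group)", df).fit().pvalues)
```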
Detection of crossover time scales in multifractal detrended fluctuation analysis
NASA Astrophysics Data System (ADS)
Ge, Erjia; Leung, Yee
2013-04-01
Fractals are employed in this paper as a scale-based method for identifying the scaling behavior of time series. Many spatial and temporal processes exhibiting complex multi(mono)-scaling behaviors are fractals. One of the important concepts in fractal analysis is the crossover time scale(s) that separates distinct regimes having different fractal scaling behaviors; a common method for finding it is multifractal detrended fluctuation analysis (MF-DFA). The detection of crossover time scales is, however, relatively subjective, since it has generally been made by eyeballing or subjective observation without rigorous statistical procedures. Crossover time scales so determined may be spurious and problematic, and may not reflect the genuine underlying scaling behavior of a time series. The purpose of this paper is to propose a statistical procedure to model complex fractal scaling behaviors and reliably identify the crossover time scales under MF-DFA. The scaling-identification regression model, grounded on a solid statistical foundation, is first proposed to describe the multi-scaling behaviors of fractals. Through regression analysis and statistical inference, we can (1) identify crossover time scales that cannot be detected by eyeballing, (2) determine the number and locations of the genuine crossover time scales, (3) give confidence intervals for the crossover time scales, and (4) establish a statistically significant regression model depicting the underlying scaling behavior of a time series. To substantiate our argument, the regression model is applied to analyze the multi-scaling behaviors of avian-influenza outbreaks, water consumption, daily mean temperature, and rainfall in Hong Kong. Through the proposed model, we can gain a deeper understanding of fractals in general and a statistical approach to identifying multi-scaling behavior under MF-DFA in particular.
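A compact sketch of the underlying machinery: a DFA-1 fluctuation function F(s) over a range of scales, followed by a crude two-segment log-log regression that picks the breakpoint with minimum residual error. This is a simplified stand-in for the paper's scaling-identification regression model, not its actual procedure:

```python
import numpy as np

def dfa_fluctuation(x, scales):
    """DFA-1 fluctuation function F(s) of a time series x."""
    y = np.cumsum(x - x.mean())                  # profile
    F = []
    for s in scales:
        n = len(y) // s
        segs = y[:n * s].reshape(n, s)
        t = np.arange(s)
        # Detrend each segment with a linear fit, collect RMS residuals.
        res = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segs]
        F.append(np.sqrt(np.mean(np.square(res))))
    return np.array(F)

rng = np.random.default_rng(10)
x = rng.normal(size=4096)
scales = np.unique(np.logspace(1.2, 3, 20).astype(int))
logs, logF = np.log(scales), np.log(dfa_fluctuation(x, scales))

# Two-segment regression: choose the crossover index minimizing total SSE.
def sse(u, v):
    r = v - np.polyval(np.polyfit(u, v, 1), u)
    return (r ** 2).sum()

k = min(range(3, len(logs) - 3),
        key=lambda k: sse(logs[:k], logF[:k]) + sse(logs[k:], logF[k:]))
print("candidate crossover scale:", scales[k])
```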
NASA Technical Reports Server (NTRS)
Bommier, V.; Leroy, J. L.; Sahal-Brechot, S.
1985-01-01
The Hanle effect method for magnetic field vector diagnostics has now provided results on the magnetic field strength and direction in quiescent prominences, from linear polarization measurements in the He I D3 line performed at the Pic-du-Midi and at Sacramento Peak. However, there is an inescapable ambiguity in the field vector determination: each polarization measurement provides two field vector solutions symmetrical with respect to the line of sight. A statistical analysis capable of resolving this ambiguity was applied to the large sample of prominences observed at the Pic-du-Midi (Leroy et al., 1984); the same method of analysis applied to the prominences observed at Sacramento Peak (Athay et al., 1983) gives results in agreement on the most probable magnetic structure of prominences; these results are detailed. The statistical results were confirmed in favorable individual cases: for 15 prominences observed at Pic-du-Midi, the two field vectors point to the same side of the prominence, and the alpha angles are large enough with respect to the measurement and interpretation inaccuracies that the field polarity is derived without any ambiguity.
A Statistical Analysis of Reviewer Agreement and Bias in Evaluating Medical Abstracts 1
Cicchetti, Domenic V.; Conn, Harold O.
1976-01-01
Observer variability affects virtually all aspects of clinical medicine and investigation. One important aspect, not previously examined, is the selection of abstracts for presentation at national medical meetings. In the present study, 109 abstracts, submitted to the American Association for the Study of Liver Disease, were evaluated by three “blind” reviewers for originality, design-execution, importance, and overall scientific merit. Of the 77 abstracts rated for all parameters by all observers, interobserver agreement ranged between 81 and 88%. However, corresponding intraclass correlations varied between 0.16 (approaching statistical significance) and 0.37 (p < 0.01). Specific tests of systematic differences in scoring revealed statistically significant levels of observer bias on most of the abstract components. Moreover, the mean differences in interobserver ratings were quite small compared to the standard deviations of these differences. These results emphasize the importance of evaluating the simple percentage of rater agreement within the broader context of observer variability and systematic bias. PMID:997596
Long-term sea level trends: Natural or anthropogenic?
NASA Astrophysics Data System (ADS)
Becker, M.; Karpytchev, M.; Lennartz-Sassinek, S.
2014-08-01
Detection and attribution of human influence on sea level rise are important topics that have not yet been explored in depth. We question whether the sea level changes (SLC) over the past century were natural in origin. SLC exhibit power-law long-term correlations. By estimating the Hurst exponent through Detrended Fluctuation Analysis and by applying the statistics of Lennartz and Bunde, we search for the lower bounds of statistically significant external sea level trends in the longest tidal records worldwide. We provide statistical evidence that the observed SLC, at global and regional scales, are beyond natural internal variability. The minimum anthropogenic sea level trend (MASLT) contributes more than 50% of the observed sea level rise in New York, Baltimore, San Diego, Marseille, and Mumbai. The MASLT is about 1 mm/yr in global sea level reconstructions, which is more than half of the total observed sea level trend during the 20th century.
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.
Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R
2012-08-01
Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival models including the accelerated failure time model with log-normal, log-logistic and Weibull distributions, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing proportions of missingness. The testing procedures discussed in this article can all be performed using readily available software such as R, and the R code is provided as supplemental material.
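A minimal sketch of the AFT idea, assuming the lifelines package is available (the paper's own code is in R). Missing low-abundance peaks are treated as censored at the detection limit; the handling of censoring direction is simplified for brevity:

```python
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(11)
n = 120
group = rng.integers(0, 2, n)                            # two biological conditions
intensity = np.exp(rng.normal(10 + 0.8 * group, 1.0, n)) # skewed, positive peaks

# Censoring: low-abundance features fall below the detection limit.
limit = np.exp(9.5)
observed = intensity > limit
y = np.where(observed, intensity, limit)

df = pd.DataFrame({"intensity": y, "observed": observed.astype(int), "group": group})

# Weibull AFT: models log(intensity) as a linear function of condition.
aft = WeibullAFTFitter()
aft.fit(df, duration_col="intensity", event_col="observed")
print(aft.summary)   # the 'group' coefficient tests differential expression
```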
Statistical Analysis of Large-Scale Structure of Universe
NASA Astrophysics Data System (ADS)
Tugay, A. V.
While galaxy cluster catalogs were compiled many decades ago, other structural elements of the cosmic web have been detected at a definite level only in the newest works. For example, extragalactic filaments have been described in recent years using velocity fields and the SDSS galaxy distribution. The large-scale structure of the Universe could also be mapped in the future using ATHENA observations in X-rays and SKA observations in the radio band. Until detailed observations become available for most of the volume of the Universe, some integral statistical parameters can be used for its description. Methods such as the galaxy correlation function, the power spectrum, statistical moments and peak statistics are commonly used with this aim. The parameters of the power spectrum and other statistics are important for constraining models of dark matter, dark energy, inflation and brane cosmology. In the present work we describe the growth of large-scale density fluctuations in the one- and three-dimensional cases with Fourier harmonics of hydrodynamical parameters. As a result we obtain a power-law relation for the matter power spectrum.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre
Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and chi-square independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference from moment-based statistics (which we discussed in [1]), where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
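The map-reduce pattern the paper describes is easy to sketch: each worker builds a partial contingency table from its shard, tables are merged by cell-wise addition, and derived statistics (here a chi-square test) are computed once at the end. A minimal single-process illustration:

```python
from collections import Counter
from scipy.stats import chi2_contingency

def map_shard(records):
    """Map step: partial contingency table (cell -> count) for one data shard."""
    return Counter((a, b) for a, b in records)

def reduce_tables(tables):
    """Reduce step: cell-wise addition merges partial tables."""
    total = Counter()
    for t in tables:
        total.update(t)
    return total

shards = [
    [("x", "u"), ("x", "v"), ("y", "u")],
    [("y", "v"), ("y", "v"), ("x", "u")],
]
table = reduce_tables(map_shard(s) for s in shards)

# Lay the merged counts out as a dense matrix for the chi-square statistic.
rows = sorted({r for r, _ in table})
cols = sorted({c for _, c in table})
dense = [[table[(r, c)] for c in cols] for r in rows]
chi2, p, dof, _ = chi2_contingency(dense)
print(chi2, p)
```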
Extracting chemical information from high-resolution Kβ X-ray emission spectroscopy
NASA Astrophysics Data System (ADS)
Limandri, S.; Robledo, J.; Tirao, G.
2018-06-01
High-resolution X-ray emission spectroscopy allows the chemical environment of a wide variety of materials to be studied. Chemical information can be obtained by fitting the X-ray spectra and observing the behavior of certain spectral features. Spectral changes can also be quantified by means of statistical parameters calculated by considering the spectrum as a probability distribution. Another possibility is to perform multivariate statistical analysis, such as principal component analysis. In this work, the performance of these procedures for extracting chemical information from X-ray emission spectra of mixtures of Mn2+ and Mn4+ oxides is studied. A detailed analysis of the parameters obtained, as well as the associated uncertainties, is presented. The methodologies are also applied to the Mn oxidation state characterization of the double perovskite oxides Ba1+xLa1-xMnSbO6 (with 0 ≤ x ≤ 0.7). The results show that statistical parameters and multivariate analysis are the most suitable for the analysis of this kind of spectra.
Pradhan, Biswajeet; Chaudhari, Amruta; Adinarayana, J; Buchroithner, Manfred F
2012-01-01
In this paper, an attempt has been made to assess, forecast and observe the dynamics of soil erosion using the universal soil loss equation (USLE) method on Penang Island, Malaysia. Multi-source (map-, space- and ground-based) datasets were used to obtain both the static and the dynamic factors of the USLE, and an integrated analysis was carried out in raster GIS format. A landslide location map was generated on the basis of image-element interpretation from aerial photos, satellite data and field observations, and was used to validate the soil erosion intensity in the study area. Further, a statistical frequency ratio analysis was carried out for correlation purposes. The results of the statistical correlation showed satisfactory agreement between the USLE-based soil erosion map and the landslide events/locations, the two being directly proportional to each other. Prognostic analysis of soil erosion helps user agencies and decision makers design proper conservation planning programs to reduce soil erosion. Temporal statistics on soil erosion amid the dynamic and rapid development of Penang Island indicate the co-existence and balance of the ecosystem.
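For reference, the USLE is the product A = R * K * LS * C * P (rainfall erosivity, soil erodibility, slope length-steepness, cover-management, and support practice factors). A worked toy example with made-up factor values, just to show the arithmetic:

```python
# USLE: annual soil loss A [t/ha/yr] as a product of five factors.
R  = 9000.0   # rainfall erosivity (MJ mm / (ha h yr)), illustrative tropical value
K  = 0.02     # soil erodibility (t ha h / (ha MJ mm)), illustrative
LS = 1.8      # slope length-steepness factor (dimensionless)
C  = 0.15     # cover-management factor (bare soil ~1, dense forest ~0.001)
P  = 1.0      # support practice factor (no conservation practice)

A = R * K * LS * C * P
print(f"predicted soil loss: {A:.1f} t/ha/yr")   # 9000*0.02*1.8*0.15 = 48.6
```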
rpsftm: An R Package for Rank Preserving Structural Failure Time Models
Allison, Annabel; White, Ian R; Bond, Simon
2018-01-01
Treatment switching in a randomised controlled trial occurs when participants change from their randomised treatment to the other trial treatment during the study. Failure to account for treatment switching in the analysis (i.e. by performing a standard intention-to-treat analysis) can lead to biased estimates of treatment efficacy. The rank preserving structural failure time model (RPSFTM) is a method used to adjust for treatment switching in trials with survival outcomes. The RPSFTM is due to Robins and Tsiatis (1991) and has been developed by White et al. (1997, 1999). The method is randomisation based and uses only the randomised treatment group, observed event times, and treatment history in order to estimate a causal treatment effect. The treatment effect, ψ, is estimated by balancing counter-factual event times (that would be observed if no treatment were received) between treatment groups. G-estimation is used to find the value of ψ such that a test statistic Z(ψ) = 0. This is usually the test statistic used in the intention-to-treat analysis, for example, the log rank test statistic. We present an R package that implements the method of rpsftm. PMID:29564164
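A bare-bones illustration of the g-estimation step the abstract describes: form counter-factual times U(psi) = T_off + exp(psi) * T_on and search for the psi at which the log-rank statistic comparing the randomised arms vanishes. Re-censoring, an important part of the real rpsftm, is omitted to keep the sketch short; lifelines supplies the log-rank test:

```python
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(12)
n, psi_true = 400, -0.5
arm = rng.integers(0, 2, n)                # randomised arm (1 = treatment)
U = rng.exponential(1.0, n)                # latent untreated event times

# Fraction of follow-up spent on treatment: arm 1 always on; some controls switch.
p = np.where(arm == 1, 1.0, rng.uniform(0.0, 0.5, n))
T = U / ((1 - p) + p * np.exp(psi_true))   # observed event times (all events, no censoring)

def counterfactual(psi):
    """U(psi) = T_off + exp(psi) * T_on."""
    return T * (1 - p) + np.exp(psi) * T * p

# G-estimation: the psi at which the arms' counter-factual times are balanced,
# i.e. the log-rank statistic is (closest to) zero.
grid = np.linspace(-1.5, 0.5, 81)
z = [logrank_test(counterfactual(s)[arm == 1],
                  counterfactual(s)[arm == 0]).test_statistic for s in grid]
print("psi estimate:", grid[int(np.argmin(z))])   # should be near psi_true
```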
NASA Astrophysics Data System (ADS)
Ndehedehe, Christopher E.; Agutu, Nathan O.; Okwuashi, Onuwa; Ferreira, Vagner G.
2016-09-01
Lake Chad has recently been perceived to be completely desiccated and almost extinct due to insufficient published ground observations. Given the high spatial variability of rainfall in the region, and the fact that extreme climatic conditions (for example, droughts) could be intensifying in the Lake Chad basin (LCB) due to human activities, a spatio-temporal approach to drought analysis becomes essential. This study employed independent component analysis (ICA), a statistical technique based on fourth-order cumulants, to decompose the standardised precipitation index (SPI), standardised soil moisture index (SSI), and terrestrial water storage (TWS) derived from the Gravity Recovery and Climate Experiment (GRACE) into spatial and temporal patterns over the LCB. In addition, this study uses satellite altimetry data to estimate variations in Lake Chad water levels, and further employs relevant climate teleconnection indices (El-Niño Southern Oscillation-ENSO, Atlantic Multi-decadal Oscillation-AMO, and Atlantic Meridional Mode-AMM) to examine their links to the observed drought temporal patterns over the basin. From the spatio-temporal drought analysis, temporal evolutions of SPI at 12-month aggregation show relatively wet conditions in the last two decades (although with marked alterations), with the 2012-2014 period being the wettest. In addition to the improved rainfall conditions during this period, there was a statistically significant increase of 0.04 m/yr in altimetry water levels observed over Lake Chad between 2008 and 2014, which confirms a shift in the hydrological conditions of the basin. The observed trend in TWS changes during the 2002-2014 period shows a statistically insignificant increase of 3.0 mm/yr at the centre of the basin, coinciding with soil moisture deficit indicated by the temporal evolutions of SSI at all monthly accumulations during the 2002-2003 and 2009-2012 periods. Further, SPI at 3- and 6-month scales indicated fluctuating drought conditions at the extreme south of the basin, coinciding with a statistically insignificant decline in TWS of about 4.5 mm/yr in the southern catchment of the basin. Finally, correlation analyses indicate that ENSO, AMO, and AMM are associated with extreme rainfall conditions in the basin, with AMO showing the strongest association (statistically significant correlation of 0.55) with SPI at 12-month aggregation. Therefore, this study provides a framework that will support drought monitoring in the LCB.
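For illustration, the sketch below applies FastICA, which exploits higher-order (non-Gaussian) statistics, to a matrix of SPI values arranged as months × grid cells, separating independent temporal drought modes and their spatial patterns. The data here are random placeholders and the number of components is an arbitrary assumption, not the study's choice.

```python
import numpy as np
from sklearn.decomposition import FastICA

# SPI values arranged as (n_months, n_grid_cells); placeholders here.
spi = np.random.default_rng(0).standard_normal((180, 2500))

# FastICA exploits higher-order (non-Gaussian) statistics to separate
# statistically independent temporal drought modes.
ica = FastICA(n_components=4, random_state=0)
temporal_modes = ica.fit_transform(spi)   # (180, 4) temporal evolutions
spatial_patterns = ica.mixing_.T          # (4, 2500) spatial patterns
```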
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kollias, Pavlos
2017-08-08
This is a multi-institutional, collaborative project using observations and modeling to study the evolution (e.g. formation and growth) of hydrometeors in continental convective clouds. Our contribution was in data analysis for the generation of high-value cloud and precipitation products and the derivation of cloud statistics for model validation. There are two areas of data analysis to which we contributed: i) the development of novel, state-of-the-art dual-wavelength radar algorithms for the retrieval of cloud microphysical properties and ii) the evaluation of large-domain, high-resolution models using comprehensive multi-sensor observations. Our research group developed statistical summaries from numerous sensors and developed retrievals of vertical air motion in deep convection.
Spectral Analysis of B Stars: An Application of Bayesian Statistics
NASA Astrophysics Data System (ADS)
Mugnes, J.-M.; Robert, C.
2012-12-01
To better understand the processes involved in stellar physics, it is necessary to obtain accurate stellar parameters (effective temperature, surface gravity, abundances…). Spectral analysis is a powerful tool for investigating stars, but it is also vital to reduce uncertainties at a decent computational cost. Here we present a spectral analysis method based on a combination of Bayesian statistics and grids of synthetic spectra obtained with TLUSTY. This method simultaneously constrains the stellar parameters by using all the lines accessible in observed spectra and thus greatly reduces uncertainties and improves the overall spectrum fitting. Preliminary results are shown using spectra from the Observatoire du Mont-Mégantic.
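A minimal sketch of the kind of grid-based Bayesian analysis described above: the likelihood of the observed spectrum is evaluated against every synthetic spectrum in a (Teff, log g) grid, all spectral lines contribute simultaneously through the chi-square term, and the posterior is marginalized for each parameter. A flat prior and Gaussian noise are assumed; the array layout is illustrative, not the authors' TLUSTY setup.

```python
import numpy as np

def grid_posterior(obs_flux, obs_err, model_grid):
    """Posterior over a (Teff, log g) grid of synthetic spectra.

    model_grid: array (n_teff, n_logg, n_wavelengths); every line in the
    observed spectrum contributes to the chi-square at once."""
    chi2 = np.sum(((model_grid - obs_flux) / obs_err) ** 2, axis=-1)
    log_like = -0.5 * chi2
    post = np.exp(log_like - log_like.max())   # flat prior assumed
    post /= post.sum()
    p_teff = post.sum(axis=1)   # marginal over log g
    p_logg = post.sum(axis=0)   # marginal over Teff
    return p_teff, p_logg
```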
Statistical analysis of traversal behavior under different types of traffic lights
NASA Astrophysics Data System (ADS)
Wang, Boran; Wang, Ziyang; Li, Zhiyin
2017-12-01
Video observation shows that the type of traffic signal has a significant effect on illegal crossing behavior by pedestrians at intersections. Using statistical and variance analysis, differences in pedestrian violation rates and waiting positions under different signal types are compared, and the influence of the traffic signal type on pedestrian crossing behavior is evaluated. The results show that the violation rate at intersections with static pedestrian lights is significantly higher than at intersections with countdown signal lights. There are also significant differences in waiting position between intersections with different signal types.
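A comparison of violation rates like the one described can be expressed as a two-proportion z-test; the sketch below uses statsmodels with made-up counts, since the study's actual counts are not reported in this abstract.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: violations out of pedestrians observed at each signal type.
violations = [132, 85]      # static pedestrian light, countdown light
observed = [400, 410]
z, p = proportions_ztest(violations, observed)
print(f"z = {z:.2f}, p = {p:.4f}")  # small p: violation rates differ
```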
Anomalous heat transfer modes of nanofluids: a review based on statistical analysis
NASA Astrophysics Data System (ADS)
Sergis, Antonis; Hardalupas, Yannis
2011-05-01
This paper contains the results of a concise statistical review analysis of a large number of publications regarding the anomalous heat transfer modes of nanofluids. The application of nanofluids as coolants is a novel practice with no established physical foundations explaining the observed anomalous heat transfer. As a consequence, traditional methods of performing a literature review may not be adequate in presenting objectively the results representing the bulk of the available literature. The current literature review analysis aims to resolve the problems faced by researchers in the past by employing an unbiased statistical analysis to present and reveal the current trends and general belief of the scientific community regarding the anomalous heat transfer modes of nanofluids. The thermal performance analysis indicated that statistically there exists a variable enhancement for conduction, convection/mixed heat transfer, pool boiling heat transfer and critical heat flux modes. The most popular proposed mechanisms in the literature to explain heat transfer in nanofluids are revealed, as well as possible trends between nanofluid properties and thermal performance. The review also suggests future experimentation to provide more conclusive answers to the control mechanisms and influential parameters of heat transfer in nanofluids. PMID:21711932
Teo, Guoshou; Kim, Sinae; Tsou, Chih-Chiang; Collins, Ben; Gingras, Anne-Claude; Nesvizhskii, Alexey I; Choi, Hyungwon
2015-11-03
Data independent acquisition (DIA) mass spectrometry is an emerging technique that offers more complete detection and quantification of peptides and proteins across multiple samples. DIA allows fragment-level quantification, which can be considered as repeated measurements of the abundance of the corresponding peptides and proteins in the downstream statistical analysis. However, few statistical approaches are available for aggregating these complex fragment-level data into peptide- or protein-level statistical summaries. In this work, we describe a software package, mapDIA, for statistical analysis of differential protein expression using DIA fragment-level intensities. The workflow consists of three major steps: intensity normalization, peptide/fragment selection, and statistical analysis. First, mapDIA offers normalization of fragment-level intensities by total intensity sums as well as a novel alternative normalization by local intensity sums in retention time space. Second, mapDIA removes outlier observations and selects peptides/fragments that preserve the major quantitative patterns across all samples for each protein. Last, using the selected fragments and peptides, mapDIA performs model-based statistical significance analysis of protein-level differential expression between specified groups of samples. Using a comprehensive set of simulation datasets, we show that mapDIA detects differentially expressed proteins with accurate control of the false discovery rates. We also describe the analysis procedure in detail using two recently published DIA datasets generated for the 14-3-3β dynamic interaction network and the prostate cancer glycoproteome. The software is written in C++ and the source code is freely available through the SourceForge website: http://sourceforge.net/projects/mapdia/.
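The first step of the workflow, normalization by total intensity sums, can be illustrated as below: each sample's log-intensities are shifted so that all samples share the same total. This is a minimal sketch of one plausible reading of that step, not the mapDIA source code.

```python
import numpy as np

def normalize_total_intensity(intensities):
    """Shift each sample's log2 fragment intensities so that all samples
    share the same total; rows = fragments, columns = samples."""
    log_x = np.log2(intensities + 1.0)
    totals = log_x.sum(axis=0)
    offsets = (totals - totals.mean()) / intensities.shape[0]
    return log_x - offsets  # broadcast one offset per sample column
```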
1984-02-01
Keywords: prediction; extratropical cyclones; objective analysis; bogus techniques. A quasi-objective statistical method for deriving 300 mb geopotential heights and 1000/300 mb thicknesses in the vicinity of extratropical cyclones with the aid of satellite imagery is presented. The technique utilizes satellite-observed extratropical spiral cloud pattern parameters in conjunction
Generalization of Entropy Based Divergence Measures for Symbolic Sequence Analysis
Ré, Miguel A.; Azad, Rajeev K.
2014-01-01
Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms. PMID:24728338
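As a concrete reference for the measure discussed above, the sketch below implements the weighted two-distribution form of the JSD, JSD(P,Q) = H(w1·P + w2·Q) − w1·H(P) − w2·H(Q), and applies it to nucleotide compositions; the sequences are toy examples, and the generalized Tsallis/Markovian variants from the paper are not reproduced here.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits."""
    p = np.asarray(p, float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def jsd(p, q, w=(0.5, 0.5)):
    """Weighted Jensen-Shannon divergence between distributions p and q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return entropy(w[0] * p + w[1] * q) - w[0] * entropy(p) - w[1] * entropy(q)

def composition(seq):
    """Nucleotide composition of a DNA string as a probability vector."""
    return np.array([seq.count(b) for b in "ACGT"], float) / len(seq)

print(jsd(composition("ACGTACGTAACCGGTT"), composition("GGGCCCGGCCGCGCGC")))
```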
NASA Technical Reports Server (NTRS)
Jasperson, W. H.; Nastrom, G. D.; Davis, R. E.; Holdeman, J. D.
1984-01-01
Summary studies are presented for the entire cloud observation archive from the NASA Global Atmospheric Sampling Program (GASP). Studies are also presented for GASP particle concentration data gathered concurrently with the cloud observations. Clouds were encountered on about 15 percent of the data samples overall, but the probability of cloud encounter is shown to vary significantly with altitude, latitude, and distance from the tropopause. Several meteorological circulation features are apparent in the latitudinal distribution of cloud cover, and the cloud encounter statistics are shown to be consistent with the classical mid-latitude cyclone model. Observations of clouds spaced more closely than 90 minutes are shown to be statistically dependent. The statistics for cloud and particle encounter are utilized to estimate the frequency of cloud encounter on long-range airline routes, and to assess the probability and extent of laminar flow loss due to cloud or particle encounter by aircraft utilizing laminar flow control (LFC). It is shown that the probability of extended cloud encounter is too low, of itself, to make LFC impractical.
Bunijevac, Mila; Petrović-Lazić, Mirjana; Jovanović-Simić, Nadica; Vuković, Mile
2016-02-01
The major role of the larynx in speech, respiration and swallowing makes carcinomas of this region and their treatment very influential on patients' quality of life. The aim of this study was to assess the importance of voice therapy in patients after open surgery on the vocal cords. This study included 21 male patients and a control group of 19 subjects. The vowel (A) was recorded and analyzed for each examinee. All the patients were recorded twice: first when they contacted the clinic, and second after a three-month vocal therapy, which was held twice per week on an outpatient basis. The voice analysis was carried out in the Ear, Nose and Throat (ENT) Clinic, Clinical Hospital Center "Zvezdara" in Belgrade. The values of the acoustic parameters in the patients who underwent open surgery on the vocal cords before vocal rehabilitation and in the control group subjects were significantly different for all specified parameters. These results suggest that the voice of the patients was damaged before vocal rehabilitation. The values of the acoustic parameters of the vowel (A) before and after vocal rehabilitation of the patients with open surgery on the vocal cords were statistically significantly different. For the parameters Jitter (%) and Shimmer (%), the observed difference was highly statistically significant (p < 0.01). The voice turbulence index and the noise/harmonic ratio were also notably improved, and the observed difference was statistically significant (p < 0.05). The analysis of the tremor intensity index showed no significant improvement, and the observed difference was not statistically significant (p > 0.05). In conclusion, there was a significant improvement in the acoustic parameters of the vowel (A) in the study subjects three months after vocal therapy. Only one out of five representative parameters showed no significant improvement.
Separate-channel analysis of two-channel microarrays: recovering inter-spot information.
Smyth, Gordon K; Altman, Naomi S
2013-05-26
Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intra-spot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
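The transformation at the heart of the reformulation, from per-channel intensities to M-values and A-values, is simple enough to state in code; the sketch below assumes background-corrected positive intensities and is illustrative, not the authors' implementation.

```python
import numpy as np

def ma_values(red, green):
    """Per-spot M (log-ratio) and A (average log-expression) values from
    background-corrected two-channel intensities (assumed positive)."""
    m = np.log2(red) - np.log2(green)
    a = 0.5 * (np.log2(red) + np.log2(green))
    return m, a

red = np.array([1200.0, 450.0, 8000.0])
green = np.array([1000.0, 900.0, 7900.0])
m, a = ma_values(red, green)  # a log-ratio analysis would keep only m
```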
Huncharek, M; Kupelnick, B
2001-01-01
The etiology of epithelial ovarian cancer is unknown. Prior work suggests that high dietary fat intake is associated with an increased risk of this tumor, although this association remains speculative. A meta-analysis was performed to evaluate this suspected relationship. Using previously described methods, a protocol was developed for a meta-analysis examining the association between high vs. low dietary fat intake and the risk of epithelial ovarian cancer. Literature search techniques, study inclusion criteria, and statistical procedures were prospectively defined. Data from observational studies were pooled using a general variance-based meta-analytic method employing confidence intervals (CI) previously described by Greenland. The outcome of interest was a summary relative risk (RRs) reflecting the risk of ovarian cancer associated with high vs. low dietary fat intake. Sensitivity analyses were performed when necessary to evaluate any observed statistical heterogeneity. The literature search yielded 8 observational studies enrolling 6,689 subjects. Data were stratified into three dietary fat intake categories: total fat, animal fat, and saturated fat. Initial tests for statistical homogeneity demonstrated that hospital-based studies accounted for observed heterogeneity possibly because of selection bias. Accounting for this, an RRs was calculated for high vs. low total fat intake, yielding a value of 1.24 (95% CI = 1.07-1.43), a statistically significant result. That is, high total fat intake is associated with a 24% increased risk of ovarian cancer development. The RRs for high saturated fat intake was 1.20 (95% CI = 1.04-1.39), suggesting a 20% increased risk of ovarian cancer among subjects with these dietary habits. High vs. low animal fat diet gave an RRs of 1.70 (95% CI = 1.43-2.03), consistent with a statistically significant 70% increased ovarian cancer risk. High dietary fat intake appears to represent a significant risk factor for the development of ovarian cancer. The magnitude of this risk associated with total fat and saturated fat is rather modest. Ovarian cancer risk associated with high animal fat intake appears significantly greater than that associated with the other types of fat intake studied, although this requires confirmation via larger analyses. Further work is needed to clarify factors that may modify the effects of dietary fat in vivo.
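The general variance-based pooling used in this meta-analysis can be sketched as follows: each study's relative risk is log-transformed, its variance is recovered from the reported 95% CI, and a weighted average is formed with inverse-variance weights. This is a minimal fixed-effect sketch under an assumption of homogeneity, with made-up study values, not the authors' exact procedure or data.

```python
import numpy as np

def pooled_rr(rr, lo, hi):
    """Inverse-variance pooling of relative risks on the log scale,
    recovering each study's variance from its reported 95% CI."""
    log_rr = np.log(rr)
    se = (np.log(hi) - np.log(lo)) / (2 * 1.96)
    w = 1.0 / se ** 2
    pooled = np.sum(w * log_rr) / np.sum(w)
    se_pooled = 1.0 / np.sqrt(np.sum(w))
    return np.exp(pooled), (np.exp(pooled - 1.96 * se_pooled),
                            np.exp(pooled + 1.96 * se_pooled))

# Hypothetical studies: RRs with 95% CIs (not the paper's extracted data).
rr_s, ci = pooled_rr(np.array([1.3, 1.1, 1.4]),
                     np.array([1.0, 0.9, 1.1]),
                     np.array([1.7, 1.4, 1.8]))
```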
Space Shuttle booster thrust imbalance analysis
NASA Technical Reports Server (NTRS)
Bailey, W. R.; Blackwell, D. L.
1985-01-01
An analysis of the Shuttle SRM thrust imbalance during the steady-state and tailoff portions of the boost phase of flight is presented. Results from flights STS-1 through STS-13 are included. A statistical analysis of the observed thrust imbalance data is presented. A 3 sigma thrust imbalance history versus time was generated from the observed data and is compared to the vehicle design requirements. The effect on Shuttle thrust imbalance from the use of replacement SRM segments is predicted. Comparisons of observed thrust imbalances with respect to predicted imbalances are presented for the two space shuttle flights which used replacement aft segments (STS-9 and STS-13).
Portraits of self-organization in fish schools interacting with robots
NASA Astrophysics Data System (ADS)
Aureli, M.; Fiorilli, F.; Porfiri, M.
2012-05-01
In this paper, we propose an enabling computational and theoretical framework for the analysis of experimental instances of collective behavior in response to external stimuli. In particular, this work addresses the characterization of aggregation and interaction phenomena in robot-animal groups through the exemplary analysis of fish schooling in the vicinity of a biomimetic robot. We adapt global observables from statistical mechanics to capture the main features of the shoal collective motion and its response to the robot from experimental observations. We investigate the shoal behavior by using a diffusion mapping analysis performed on these global observables that also informs the definition of relevant portraits of self-organization.
NASA Technical Reports Server (NTRS)
Bosilovich, Michael G.; Dasilva, Arindo M.
2012-01-01
Reanalyses have become important sources of data in weather and climate research. While observations are the most crucial component of the systems, few research projects consider carefully the multitudes of assimilated observations and their impact on the results. This is partly due to the diversity of observations and their individual complexity, but also due to the unfriendly nature of the data formats. Here, we discuss the NASA Modern-Era Retrospective analysis for Research and Applications (MERRA) and a companion dataset, the Gridded Innovations and Observations (GIO). GIO is simply a post-processing of the assimilated observations and their innovations (forecast error and analysis error) to a common spatio-temporal grid, following that of the MERRA analysis fields. This data includes in situ, retrieved and radiance observations that are assimilated and used in the reanalysis. While all these disparate observations and statistics are in a uniform easily accessible format, there are some limitations. Similar observations are binned to the grid, so that multiple observations are combined in the gridding process. The data is then implicitly thinned. Some details in the meta data may also be lost (e.g. aircraft or station ID). Nonetheless, the gridded observations should provide easy access to all the observations input to the reanalysis. To provide an example of the GIO data, a case study evaluating observing systems over the United States and statistics is presented, and demonstrates the evaluation of the observations and the data assimilation. The GIO data is used to collocate 200mb Radiosonde and Aircraft temperature measurements from 1979-2009. A known warm bias of the aircraft measurements is apparent compared to the radiosonde data. However, when larger quantities of aircraft data are available, they dominate the analysis and the radiosonde data become biased against the forecast. When AMSU radiances become available the radiosonde and aircraft analysis and forecast error take on an annual cycle. While this supports results of previous work that recommend bias corrections for the aircraft measurements, the interactions with AMSU radiances will also require further investigation. This also provides an example for reanalysis users in examining the available observations and their impact on the analysis. GIO data is presently available alongside the MERRA reanalysis.
AN EXPLORATION OF THE STATISTICAL SIGNATURES OF STELLAR FEEDBACK
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boyden, Ryan D.; Offner, Stella S. R.; Koch, Eric W.
2016-12-20
All molecular clouds are observed to be turbulent, but the origin, means of sustenance, and evolution of the turbulence remain debated. One possibility is that stellar feedback injects enough energy into the cloud to drive observed motions on parsec scales. Recent numerical studies of molecular clouds have found that feedback from stars, such as protostellar outflows and winds, injects energy and impacts turbulence. We expand upon these studies by analyzing magnetohydrodynamic simulations of molecular clouds, including stellar winds, with a range of stellar mass-loss rates and magnetic field strengths. We generate synthetic ¹²CO(1–0) maps assuming that the simulations are at the distance of the nearby Perseus molecular cloud. By comparing the outputs from different initial conditions and evolutionary times, we identify differences in the synthetic observations and characterize these using common astrostatistics. We quantify the different statistical responses using a variety of metrics proposed in the literature. We find that multiple astrostatistics, including the principal component analysis, the spectral correlation function, and the velocity coordinate spectrum (VCS), are sensitive to changes in stellar mass-loss rates and/or time evolution. A few statistics, including the Cramer statistic and VCS, are sensitive to the magnetic field strength. These findings demonstrate that stellar feedback influences molecular cloud turbulence and can be identified and quantified observationally using such statistics.
ERIC Educational Resources Information Center
Unicomb, Rachael; Colyvas, Kim; Harrison, Elisabeth; Hewat, Sally
2015-01-01
Purpose: Case-study methodology studying change is often used in the field of speech-language pathology, but it can be criticized for not being statistically robust. Yet with the heterogeneous nature of many communication disorders, case studies allow clinicians and researchers to closely observe and report on change. Such information is valuable…
NASA Astrophysics Data System (ADS)
LIU, J.; Bi, Y.; Duan, S.; Lu, D.
2017-12-01
It is well known that cloud characteristics, such as top and base heights, the layering structure of micro-physical parameters, spatial coverage and temporal duration, are very important factors influencing both the radiation budget and its vertical partitioning, as well as the hydrological cycle through precipitation. Cloud structure, its statistical distribution and typical values also vary geographically and seasonally. Ka-band radar is a powerful tool for obtaining the above parameters around the world, one example being the ARM cloud radar in Oklahoma, US. Since 2006, CloudSat, a member of NASA's A-Train satellite constellation, has continuously observed cloud structure with global coverage, but it monitors clouds over a given site only twice a day at the same local times. By using the IAP Ka-band Doppler radar, which has been operating continuously since early 2013 on the roof of the IAP building in Beijing, we obtained the statistical characteristics of clouds, including cloud layering and cloud top and base heights, as well as the thickness of each cloud layer and its distribution; these were analyzed for monthly, seasonal and diurnal variation, and a statistical analysis of cloud reflectivity profiles was also made. The analysis covers both non-precipitating and precipitating clouds. Some preliminary comparisons of the results with CloudSat/CALIPSO products for the same period and area are also made.
NASA Astrophysics Data System (ADS)
Karpov, A. V.; Yumagulov, E. Z.
2003-05-01
We have restored and ordered the archive of meteor observations carried out with the meteor radar complex "KGU-M5" since 1986. A relational database has been formed under the control of the Database Management System (DBMS) Oracle 8. We also improved and tested a statistical method for studying the fine spatial structure of meteor streams with allowance for the specific features of application of the DBMS. Statistical analysis of the results of observations made it possible to obtain information about the substance distribution in the Quadrantid, Geminid, and Perseid meteor streams.
Testing alternative ground water models using cross-validation and other methods
Foglia, L.; Mehl, S.W.; Hill, M.C.; Perona, P.; Burlando, P.
2007-01-01
Many methods can be used to test alternative ground water models. Of concern in this work are methods able to (1) rank alternative models (also called model discrimination) and (2) identify observations important to parameter estimates and predictions (equivalent to the purpose served by some types of sensitivity analysis). Some of the measures investigated are computationally efficient; others are computationally demanding. The latter are generally needed to account for model nonlinearity. The efficient model discrimination methods investigated include the information criteria: the corrected Akaike information criterion, Bayesian information criterion, and generalized cross-validation. The efficient sensitivity analysis measures used are dimensionless scaled sensitivity (DSS), composite scaled sensitivity, and parameter correlation coefficient (PCC); the other statistics are DFBETAS, Cook's D, and observation-prediction statistic. Acronyms are explained in the introduction. Cross-validation (CV) is a computationally intensive nonlinear method that is used for both model discrimination and sensitivity analysis. The methods are tested using up to five alternative parsimoniously constructed models of the ground water system of the Maggia Valley in southern Switzerland. The alternative models differ in their representation of hydraulic conductivity. A new method for graphically representing CV and sensitivity analysis results for complex models is presented and used to evaluate the utility of the efficient statistics. The results indicate that for model selection, the information criteria produce similar results at much smaller computational cost than CV. For identifying important observations, the only obviously inferior linear measure is DSS; the poor performance was expected because DSS does not include the effects of parameter correlation and PCC reveals large parameter correlations. © 2007 National Ground Water Association.
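Two of the efficient model-discrimination measures named above, the corrected Akaike information criterion and the Bayesian information criterion, are simple functions of fit and parsimony. The sketch below writes them for a least-squares model with Gaussian errors, in one common convention (alternatives differ by an additive constant); the model names and residual sums are hypothetical.

```python
import numpy as np

def aicc(n, k, ssr):
    """Corrected Akaike information criterion for a least-squares model:
    n observations, k estimated parameters, ssr = sum of squared residuals."""
    aic = n * np.log(ssr / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

def bic(n, k, ssr):
    """Bayesian information criterion under the same Gaussian-error assumption."""
    return n * np.log(ssr / n) + k * np.log(n)

# Rank alternative models by criterion value: smaller indicates more support.
models = {"homogeneous K": (100, 3, 52.1), "zoned K": (100, 6, 38.7)}
for name, (n, k, ssr) in models.items():
    print(name, round(aicc(n, k, ssr), 1), round(bic(n, k, ssr), 1))
```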
Evaluation of Satellite and Model Precipitation Products Over Turkey
NASA Astrophysics Data System (ADS)
Yilmaz, M. T.; Amjad, M.
2017-12-01
Satellite-based remote sensing, gauge stations, and models are the three major platforms for acquiring precipitation datasets. Among them, satellites and models have the advantage of retrieving spatially and temporally continuous and consistent datasets, while the uncertainty estimates of these retrievals are often required for many hydrological studies to understand the source and the magnitude of the uncertainty in hydrological response parameters. In this study, satellite and model precipitation data products are validated over various temporal scales (daily, 3-daily, 7-daily, 10-daily and monthly) using in-situ precipitation observations from a network of 733 gauges all over Turkey. Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42 version 7 and European Centre for Medium-Range Weather Forecasts (ECMWF) model estimates (daily, 3-daily, 7-daily and 10-daily accumulated forecasts) are used in this study. Retrievals are evaluated for their mean and standard deviation, and their accuracies are evaluated via bias, root mean square error, error standard deviation and correlation coefficient statistics. Intensity vs frequency analysis and contingency table statistics such as percent correct, probability of detection, false alarm ratio and critical success index are determined using daily time-series. Both ECMWF forecasts and TRMM observations, on average, overestimate the precipitation compared to gauge estimates; wet biases are 10.26 mm/month and 8.65 mm/month, respectively, for ECMWF and TRMM. RMSE values of ECMWF forecasts and TRMM estimates are 39.69 mm/month and 41.55 mm/month, respectively. Monthly correlations between Gauges-ECMWF, Gauges-TRMM and ECMWF-TRMM are 0.76, 0.73 and 0.81, respectively. The model and satellite error statistics are further compared against the gauge error statistics based on inverse distance weighting (IDW) analysis. Both the model and satellite data have smaller IDW errors (14.72 mm/month and 10.75 mm/month, respectively) compared to the gauge IDW error (21.58 mm/month). These results show that, on average, ECMWF forecast data have higher skill than TRMM observations. Overall, both ECMWF forecast data and TRMM observations show good potential for catchment-scale hydrological analysis.
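The contingency-table statistics mentioned (percent correct, probability of detection, false alarm ratio, critical success index) follow directly from the counts of hits, misses, false alarms and correct negatives in daily rain/no-rain comparisons. A minimal sketch with hypothetical counts, not the study's values:

```python
def contingency_scores(hits, misses, false_alarms, correct_negatives):
    """Daily rain/no-rain verification scores from a 2x2 contingency table."""
    pod = hits / (hits + misses)                 # probability of detection
    far = false_alarms / (hits + false_alarms)   # false alarm ratio
    csi = hits / (hits + misses + false_alarms)  # critical success index
    pc = (hits + correct_negatives) / (
        hits + misses + false_alarms + correct_negatives
    )                                            # percent correct (fraction)
    return pod, far, csi, pc

print(contingency_scores(820, 310, 270, 2250))   # hypothetical daily counts
```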
Stochastic space interval as a link between quantum randomness and macroscopic randomness?
NASA Astrophysics Data System (ADS)
Haug, Espen Gaarder; Hoff, Harald
2018-03-01
For many stochastic phenomena, we observe statistical distributions that have fat tails and high peaks compared to the Gaussian distribution. In this paper, we will explain how observable statistical distributions in the macroscopic world could be related to the randomness in the subatomic world. We show that fat-tailed (leptokurtic) phenomena in our everyday macroscopic world are ultimately rooted in Gaussian- or very-close-to-Gaussian-distributed subatomic particle randomness, but they are not, in a strict sense, Gaussian distributions. By running a truly random experiment over a three-and-a-half-year period, we observed a type of random behavior in trillions of photons. Combining our results with simple logic, we find that fat-tailed and high-peaked statistical distributions are exactly what we would expect to observe if the subatomic world is quantized and not continuously divisible. We extend our analysis to the fact that one typically observes fat tails and high peaks relative to the Gaussian distribution in stock and commodity prices and many aspects of the natural world; these instances are all observable and documentable macro phenomena that strongly suggest that the ultimate building blocks of nature are discrete (e.g. they appear in quanta).
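Fat tails and high peaks relative to the Gaussian are conventionally measured by excess kurtosis, which is zero for a normal distribution and positive for leptokurtic ones. The sketch below contrasts simulated Gaussian draws with a Student-t sample; the simulation stands in for the photon data, which are not reproduced here.

```python
import numpy as np
from scipy.stats import kurtosis

# Excess kurtosis: 0 for the normal distribution, > 0 for leptokurtic data.
gauss = np.random.default_rng(0).normal(size=100_000)
fat = np.random.default_rng(0).standard_t(df=3, size=100_000)
print(kurtosis(gauss), kurtosis(fat))  # ~0 versus clearly positive
```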
Structural Analysis of Covariance and Correlation Matrices.
ERIC Educational Resources Information Center
Joreskog, Karl G.
1978-01-01
A general approach to analysis of covariance structures is considered, in which the variances and covariances or correlations of the observed variables are directly expressed in terms of the parameters of interest. The statistical problems of identification, estimation and testing of such covariance or correlation structures are discussed.…
Structure in gamma ray burst time profiles: Statistical Analysis 1
NASA Technical Reports Server (NTRS)
Lestrade, John Patrick
1992-01-01
Since its launch on April 5, 1991, the Burst And Transient Source Experiment (BATSE) has observed and recorded over 500 gamma-ray bursts (GRBs). The analysis of the time profiles of these bursts has proven to be difficult. Attempts to find periodicities through Fourier analysis have been fruitless except in one celebrated case. Our goal is to be able to quantify the observed time-profile structure. Before applying this formalism to bursts, we have tested it on profiles composed of random Poissonian noise. This paper is a report of those preliminary results.
Upgrade Summer Severe Weather Tool
NASA Technical Reports Server (NTRS)
Watson, Leela
2011-01-01
The goal of this task was to upgrade the existing severe weather database by adding observations from the 2010 warm season, update the verification dataset with results from the 2010 warm season, apply statistical logistic regression analysis to the database, and develop a new forecast tool. The AMU analyzed 7 stability parameters that showed the possibility of providing guidance in forecasting severe weather, calculated verification statistics for the Total Threat Score (TTS), and calculated warm season verification statistics for the 2010 season. The AMU also performed statistical logistic regression analysis on the 22-year severe weather database. The results indicated that the logistic regression equation did not show an increase in skill over the previously developed TTS. The equation showed less accuracy than TTS at predicting severe weather, little ability to distinguish between severe and non-severe weather days, and worse standard categorical accuracy measures and skill scores than TTS.
Missing Data and Multiple Imputation: An Unbiased Approach
NASA Technical Reports Server (NTRS)
Foy, M.; VanBaalen, M.; Wear, M.; Mendez, C.; Mason, S.; Meyers, V.; Alexander, D.; Law, J.
2014-01-01
The default method of dealing with missing data in statistical analyses is to only use the complete observations (complete case analysis), which can lead to unexpected bias when data do not meet the assumption of missing completely at random (MCAR). For the assumption of MCAR to be met, missingness cannot be related to either the observed or unobserved variables. A less stringent assumption, missing at random (MAR), requires that missingness not be associated with the value of the missing variable itself, but can be associated with the other observed variables. When data are truly MAR as opposed to MCAR, the default complete case analysis method can lead to biased results. There are statistical options available to adjust for data that are MAR, including multiple imputation (MI) which is consistent and efficient at estimating effects. Multiple imputation uses informing variables to determine statistical distributions for each piece of missing data. Then multiple datasets are created by randomly drawing on the distributions for each piece of missing data. Since MI is efficient, only a limited number, usually less than 20, of imputed datasets are required to get stable estimates. Each imputed dataset is analyzed using standard statistical techniques, and then results are combined to get overall estimates of effect. A simulation study will be demonstrated to show the results of using the default complete case analysis, and MI in a linear regression of MCAR and MAR simulated data. Further, MI was successfully applied to the association study of CO2 levels and headaches when initial analysis showed there may be an underlying association between missing CO2 levels and reported headaches. Through MI, we were able to show that there is a strong association between average CO2 levels and the risk of headaches. Each unit increase in CO2 (mmHg) resulted in a doubling in the odds of reported headaches.
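A minimal sketch of the multiple-imputation workflow described above, using the chained-equations (MICE) implementation in statsmodels: distributions are estimated for each missing value from the observed covariates, several completed datasets are drawn, each is analyzed by ordinary regression, and the estimates are pooled. The simulated variables, including the MAR mechanism, are assumptions for illustration, not the CO2/headache study data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.imputation.mice import MICE, MICEData

# Simulate data that are missing at random (MAR): missingness in 'co2'
# depends on the observed covariate x, not on 'co2' itself.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
co2 = 3.0 + 0.5 * x + rng.normal(size=n)
y = 1.0 + 0.8 * co2 + rng.normal(size=n)
df = pd.DataFrame({"y": y, "co2": co2, "x": x})
df.loc[x > 0.5, "co2"] = np.nan          # MAR missingness

imp = MICEData(df)                        # chained-equation imputation model
mice = MICE("y ~ co2 + x", sm.OLS, imp)
print(mice.fit(10, 10).summary())         # estimates pooled across imputations
```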
Kelley, George A.; Kelley, Kristi S.; Kohrt, Wendy M.
2013-01-01
Objective. Examine the effects of exercise on femoral neck (FN) and lumbar spine (LS) bone mineral density (BMD) in premenopausal women. Methods. Meta-analysis of randomized controlled exercise trials ≥24 weeks in premenopausal women. Standardized effect sizes (g) were calculated for each result and pooled using random-effects models, Z score alpha values, 95% confidence intervals (CIs), and number needed to treat (NNT). Heterogeneity was examined using Q and I². Moderator and predictor analyses using mixed-effects ANOVA and simple metaregression were conducted. Statistical significance was set at P ≤ 0.05. Results. Statistically significant improvements were found for both FN (7 g's, 466 participants, g = 0.342, 95% CI = 0.132, 0.553, P = 0.001, Q = 10.8, P = 0.22, I² = 25.7%, NNT = 5) and LS (6 g's, 402 participants, g = 0.201, 95% CI = 0.009, 0.394, P = 0.04, Q = 3.3, P = 0.65, I² = 0%, NNT = 9) BMD. A trend for greater benefits in FN BMD was observed for studies published in countries other than the United States and for those who participated in home versus facility-based exercise. Statistically significant, or a trend for statistically significant, associations were observed for 7 different moderators and predictors, 6 for FN BMD and 1 for LS BMD. Conclusions. Exercise benefits FN and LS BMD in premenopausal women. The observed moderators and predictors deserve further investigation in well-designed randomized controlled trials. PMID:23401684
NASA Technical Reports Server (NTRS)
da Silva, Arlindo; Redder, Christopher
2010-01-01
MERRA is a NASA reanalysis for the satellite era using a major new version of the Goddard Earth Observing System Data Assimilation System Version 5 (GEOS-5). The project focuses on historical analyses of the hydrological cycle on a broad range of weather and climate time scales and places the NASA EOS suite of observations in a climate context. The characterization of uncertainty in reanalysis fields is a feature commonly requested by users of such data. While intercomparison with reference data sets is common practice for ascertaining the realism of the datasets, such studies typically are restricted to long-term climatological statistics and seldom provide state-dependent measures of the uncertainties involved. In principle, variational data assimilation algorithms have the ability of producing error estimates for the analysis variables (typically surface pressure, winds, temperature, moisture and ozone) consistent with the assumed background and observation error statistics. However, these "perceived error estimates" are expensive to obtain and are limited by the somewhat simplistic errors assumed in the algorithm. The observation minus forecast residuals (innovations), a by-product of any assimilation system, constitute a powerful tool for estimating the systematic and random errors in the analysis fields. Unfortunately, such data is usually not readily available with reanalysis products, often requiring the tedious decoding of large datasets and not-so-user-friendly file formats. With MERRA we have introduced a gridded version of the observations/innovations used in the assimilation process, using the same grid and data formats as the regular datasets. Such a dataset empowers the user with the ability of conveniently performing observing-system-related analysis and error estimates. The scope of this dataset will be briefly described. We will present a systematic analysis of MERRA innovation time series for the conventional observing system, including maximum-likelihood estimates of background and observation errors, as well as global bias estimates. Starting with the joint PDF of innovations and analysis increments at observation locations, we propose a technique for diagnosing bias among the observing systems, and document how these contextual biases have evolved during the satellite era covered by MERRA.
2016-09-26
The statistical analysis is done not only by examining the SSH forecast error across the entire Gulf of Mexico (GoM) domain, but also by concentrating on the area most densely covered by observations (the GLAD region). Statistics are shown for the FR, SSH1, and VEL experiments' 96-h SSH forecasts over the entire GoM domain and over the GLAD region only. [Figure residue: sea surface height (m) from AVISO and from the SSH1 experiment on 1 Aug, 20 Aug, 10 Sep, and 30 Sep.]
ERIC Educational Resources Information Center
Muehlberg, Jessica Marie
2013-01-01
Adelman (2006) observed that a large quantity of research on retention is "institution-specific or use institutional characteristics as independent variables" (p. 81). However, he observed that over 60% of the students he studied attended multiple institutions making the calculation of institutional effects highly problematic. He argued…
NASA Technical Reports Server (NTRS)
Tamayo, Tak Chai
1987-01-01
Quality of software is not only vital to the successful operation of the space station, it is also an important factor in establishing testing requirements, the time needed for software verification and integration, and launch schedules for the space station. Defense of management decisions can be greatly strengthened by combining engineering judgments with statistical analysis. Unlike hardware, software has the characteristics of no wearout and costly redundancy, making traditional statistical analysis unsuitable for evaluating software reliability. A statistical model was developed to provide a representation of the number as well as the types of failures that occur during software testing and verification. From this model, quantitative measures of software reliability based on failure history during testing are derived. Criteria to terminate testing based on reliability objectives and methods to estimate the expected number of fixes required are also presented.
Zeng, Irene Sui Lan; Lumley, Thomas
2018-01-01
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from a statistical perspective and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inference and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than features, and the use of Bayesian approaches when there is prior knowledge to be integrated, are also included in the commentary. For completeness of the review, a table of currently available software and packages for omics, drawn from 23 publications, is summarized in the appendix.
Testing averaged cosmology with type Ia supernovae and BAO data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santos, B.; Alcaniz, J.S.; Coley, A.A.
An important problem in precision cosmology is the determination of the effects of averaging and backreaction on observational predictions, particularly in view of the wealth of new observational data and improved statistical techniques. In this paper, we discuss the observational viability of a class of averaged cosmologies which consist of a simple parametrized phenomenological two-scale backreaction model with decoupled spatial curvature parameters. We perform a Bayesian model selection analysis and find that this class of averaged phenomenological cosmological models is favored with respect to the standard ΛCDM cosmological scenario when a joint analysis of current SNe Ia and BAO data is performed. In particular, the analysis provides observational evidence for non-trivial spatial curvature.
Hitting Is Contagious in Baseball: Evidence from Long Hitting Streaks
Bock, Joel R.; Maewal, Akhilesh; Gough, David A.
2012-01-01
Data analysis is used to test the hypothesis that “hitting is contagious”. A statistical model is described to study the effect of a hot hitter upon his teammates’ batting during a consecutive game hitting streak. Box score data were compiled for entire seasons containing long hitting streaks. Treatment and control sample groups were constructed from core lineups of players on the streaking batter’s team. The percentile method bootstrap was used to calculate confidence intervals for statistics representing differences in the mean distributions of two batting statistics between groups. Batters in the treatment group (hot streak active) showed statistically significant improvements in hitting performance compared with the control. Mean batting average for the treatment group was higher during hot streaks, and the batting heat index introduced here was also observed to increase. For each performance statistic, the null hypothesis was rejected at the chosen significance level. We conclude that the evidence suggests the potential existence of a “statistical contagion effect”. Psychological mechanisms essential to the empirical results are suggested, as several studies from the scientific literature lend credence to contagious phenomena in sports. Causal inference from these results is difficult, but we suggest and discuss several latent variables that may contribute to the observed results, and offer possible directions for future research.
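The percentile-method bootstrap named above is straightforward to reproduce. A minimal Python sketch with illustrative inputs only (the paper's actual samples and statistics are not reproduced here):

    import numpy as np

    def percentile_bootstrap_ci(treatment, control, n_boot=10000, alpha=0.05, rng=None):
        # Percentile-method bootstrap CI for the difference in group means:
        # resample each group with replacement, record the mean difference,
        # and take the empirical alpha/2 and 1 - alpha/2 quantiles.
        rng = rng or np.random.default_rng()
        diffs = np.array([rng.choice(treatment, len(treatment)).mean()
                          - rng.choice(control, len(control)).mean()
                          for _ in range(n_boot)])
        return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

    # Illustrative batting averages; a CI excluding zero supports a streak effect.
    rng = np.random.default_rng(7)
    print(percentile_bootstrap_ci(rng.normal(0.280, 0.03, 200),
                                  rng.normal(0.265, 0.03, 200), rng=rng))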
Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C
2018-03-07
Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
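Two of the simplest method families identified above can be written down directly. A hedged Python sketch of a range-based SD approximation and a quartile-based mean formula (the exact variants evaluated in the review may differ):

    import numpy as np

    def sd_from_range(minimum, maximum):
        # Practical approximation of a missing SD from the reported range
        # (the range/4 rule of thumb; other divisors exist for small samples).
        return (maximum - minimum) / 4.0

    def mean_from_quartiles(q1, median, q3):
        # Estimate a missing mean from the median and quartiles; the review
        # found quartile-based formulas preserved precision well.
        return (q1 + median + q3) / 3.0

    # Example: a trial reporting median 12, IQR 8-18, range 2-30
    print(mean_from_quartiles(8, 12, 18))  # approx. 12.7
    print(sd_from_range(2, 30))            # 7.0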
NASA Astrophysics Data System (ADS)
Rybák, J.; Mačura, R.; Bendík, P.; Bochníček, O.; Horecká, V.
2010-12-01
The paper presents statistical results obtained in the analysis of long-term series of meteorological observations of temperature, wind speed and direction, and daylight length measured in the period 1964-2009 at the SHMI Meteorological Observatory located at the Lomnický štít mountain peak. In relation to these meteorological data, the contribution also presents statistical results for time scales of various types of solar prominence and solar corona observations at the Lomnický štít Astronomical Observatory in the period 1980-2009. The obtained results were used to characterize the observatory from a meteorological perspective, presenting mainly the range and quality of observing conditions suitable for solar activity observations. The results show that observing conditions allow for observation of prominences on approximately one third of the days in a year, and observation of the emission corona on approximately one fifth of the days in a year. The contribution also documents the use of the obtained results in the preparation of new post-focal instruments for solar corona monitoring, i.e., the coronal multipolarimeter (COMP-S) that is at present under construction. Effects of local warming of the Earth's atmosphere are tested in a statistical analysis of the meteorological data collected in the period 1964-2009. In this period, a marked local warming occurred at Lomnický štít, with the minimum daily temperature increasing by 0.90°C, the maximum daily temperature by 1.84°C, and the mean of these quantities by 1.21°C.
Exploring the Link Between Streamflow Trends and Climate Change in Indiana, USA
NASA Astrophysics Data System (ADS)
Kumar, S.; Kam, J.; Thurner, K.; Merwade, V.
2007-12-01
Streamflow trends in Indiana are evaluated for 85 USGS streamflow gaging stations that have continuous unregulated streamflow records varying from 10 to 80 years. The trends are analyzed using the non-parametric Mann-Kendall test with trend-free pre-whitening to remove serial correlation in the data. A bootstrap method is used to establish field significance of the results. Trends are computed for 12 streamflow statistics covering low-, medium- (median and mean flow), and high-flow conditions on annual and seasonal time steps. The analysis is done for six study periods, ranging from 10 years to more than 65 years, all ending in 2003. The trends in annual average streamflow for the 50-year study period are compared with annual average precipitation trends from 14 National Climatic Data Center (NCDC) stations in Indiana that have 50 years of continuous daily record. The results show field significant positive trends in annual low and medium streamflow statistics at the majority of gaging stations for study periods that include 40 or more years of records. In the seasonal analysis, all flow statistics in summer and fall (low flow seasons), and only low flow statistics in winter and spring (high flow seasons), show positive trends. No field significant trends in annual and seasonal flow statistics are observed for study periods that include 25 or fewer years of records, except for northern Indiana, where localized negative trends are observed in the 10- and 15-year study periods. Further, streamflow trends are found to be highly correlated with precipitation trends on the annual time step. No apparent climate change signal is observed in Indiana streamflow records.
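For reference, the core of a Mann-Kendall trend test with trend-free pre-whitening can be sketched as follows (a simplified Python illustration without tie corrections; not the authors' code):

    import numpy as np
    from scipy import stats

    def mann_kendall_z(x):
        # Mann-Kendall S statistic and its normal approximation (no ties),
        # with the usual continuity correction.
        n = len(x)
        s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
        var_s = n * (n - 1) * (2 * n + 5) / 18.0
        return (s - np.sign(s)) / np.sqrt(var_s)

    def tfpw_mann_kendall(x):
        # Trend-free pre-whitening: remove the Sen's-slope trend, strip lag-1
        # autocorrelation from the residuals, restore the trend, then test.
        n, t = len(x), np.arange(len(x))
        slope = np.median([(x[j] - x[i]) / (j - i)
                           for i in range(n - 1) for j in range(i + 1, n)])
        resid = x - slope * t
        r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
        pw = resid[1:] - r1 * resid[:-1] + slope * t[1:]
        z = mann_kendall_z(pw)
        return z, 2 * stats.norm.sf(abs(z))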
Statistical inference for noisy nonlinear ecological dynamic systems.
Wood, Simon N
2010-08-26
Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
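The synthetic-likelihood construction itself is compact. A minimal Python sketch, assuming user-supplied simulate and summarize functions (illustrative names, not the paper's code):

    import numpy as np

    def synthetic_log_likelihood(theta, simulate, summarize, obs_stats, n_sim=200, rng=None):
        # Simulate replicate data sets at theta, reduce each to the summary
        # statistics, fit a multivariate normal to the summaries, and return
        # the log-density of the observed summaries under that fit.
        rng = rng or np.random.default_rng()
        s = np.array([summarize(simulate(theta, rng)) for _ in range(n_sim)])
        mu = s.mean(axis=0)
        cov = np.cov(s, rowvar=False) + 1e-8 * np.eye(s.shape[1])  # regularize
        d = obs_stats - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d @ np.linalg.solve(cov, d) + logdet)

    # This log-likelihood can then be explored with a standard
    # Metropolis-Hastings sampler over theta, as described above.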
A Primer on Observational Measurement.
Girard, Jeffrey M; Cohn, Jeffrey F
2016-08-01
Observational measurement plays an integral role in a variety of scientific endeavors within biology, psychology, sociology, education, medicine, and marketing. The current article provides an interdisciplinary primer on observational measurement; in particular, it highlights recent advances in observational methodology and the challenges that accompany such growth. First, we detail the various types of instrument that can be used to standardize measurements across observers. Second, we argue for the importance of validity in observational measurement and provide several approaches to validation based on contemporary validity theory. Third, we outline the challenges currently faced by observational researchers pertaining to measurement drift, observer reactivity, reliability analysis, and time/expense. Fourth, we describe recent advances in computer-assisted measurement, fully automated measurement, and statistical data analysis. Finally, we identify several key directions for future observational research to explore.
Estimating the deposition of urban atmospheric NO2 to the urban forest in Portland-Vancouver USA
NASA Astrophysics Data System (ADS)
Rao, M.; Gonzalez Abraham, R.; George, L. A.
2016-12-01
Cities are hotspots of atmospheric emissions of reactive nitrogen oxides, including nitrogen dioxide (NO2), a US EPA criteria pollutant that affects both human and environmental health. A fraction of this anthropogenic, atmospheric NO2 is deposited onto the urban forest, potentially mitigating the impact of NO2 on respiratory health within cities. However, the role of the urban forest in removal of atmospheric NO2 through deposition has not been well studied. Here, using an observationally-based statistical model, we first estimate the reduction of NO2 associated with the urban forest in Portland-Vancouver, USA, and the health benefits accruing from this reduction. In order to assess if this statistically observed reduction in NO2 associated with the urban forest is consistent with deposition, we then compare the amount of NO2 removed through deposition to the urban forest as estimated using a 4-km CMAQ simulation. We further undertake a sensitivity analysis in CMAQ to estimate the range of NO2 removed as a function of bulk stomatal resistance. We find that NO2 deposition estimated by CMAQ accounts for roughly one-third of the reduction in NO2 shown by the observationally-based statistical model (Figure). Our sensitivity analysis shows that a 3-10 fold increase in the bulk stomatal resistance parameter in CMAQ would align CMAQ-estimated deposition with the statistical model. The reduction of NO2 by the urban forest in the Portland-Vancouver area may yield a health benefit of at least $1.5 million USD annually, providing strong motivation to better understand the mechanism through which the urban forest may be removing air pollutants such as NO2 and thus helping create healthier urban atmospheres. Figure: Comparing the amount of NO2 deposition as estimated by CMAQ and the observationally-based statistical model (LURF). Each point corresponds to a single 4 x 4 km CMAQ grid cell.
NASA Astrophysics Data System (ADS)
Barra, Adriano; Contucci, Pierluigi; Sandell, Rickard; Vernia, Cecilia
2014-02-01
How does immigrant integration in a country change with immigration density? Guided by a statistical mechanics perspective, we propose a novel approach to this problem. The analysis focuses on classical integration quantifiers such as the percentage of jobs (temporary and permanent) given to immigrants, mixed marriages, and newborns with parents of mixed origin. We find that the average values of different quantifiers may exhibit either linear or non-linear growth with immigrant density, and we suggest that social action, a concept identified by Max Weber, causes the observed non-linearity. Using the statistical mechanics notion of interaction to quantitatively emulate social action, a unified mathematical model for integration is proposed and shown to explain both growth behaviors observed. The linear theory, by contrast, which ignores the possibility of interaction effects, would underestimate the quantifiers by up to 30% when immigrant densities are low, and overestimate them by as much when densities are high. The capacity to quantitatively isolate different types of integration mechanisms makes our framework a suitable tool in the quest for more efficient integration policies.
Bayesian Orbit Computation Tools for Objects on Geocentric Orbits
NASA Astrophysics Data System (ADS)
Virtanen, J.; Granvik, M.; Muinonen, K.; Oszkiewicz, D.
2013-08-01
We consider the space-debris orbital inversion problem via the concept of Bayesian inference. The methodology was put forward for the orbital analysis of solar system small bodies in the early 1990s [7] and results in a full solution of the statistical inverse problem, given in terms of an a posteriori probability density function (PDF) for the orbital parameters. We demonstrate the applicability of our statistical orbital analysis software to Earth-orbiting objects, using both well-established Monte Carlo (MC) techniques (for a review, see e.g. [13]) and recently developed Markov-chain MC (MCMC) techniques (e.g., [9]). In particular, we exploit the novel virtual observation MCMC method [8], which is based on the characterization of the phase-space volume of orbital solutions before the actual MCMC sampling. Our statistical methods and the resulting PDFs immediately enable probabilistic impact predictions to be carried out. Furthermore, this can readily be done even for very sparse data sets and data sets of poor quality, provided that some a priori information on the observational uncertainty is available. For asteroids, impact probabilities with the Earth from the discovery night onwards have been provided, e.g., by [11] and [10]; the latter study includes the sampling of the observational-error standard deviation as a random variable.
Sojoudi, Alireza; Goodyear, Bradley G
2016-12-01
Spontaneous fluctuations of blood-oxygenation level-dependent functional magnetic resonance imaging (BOLD fMRI) signals are highly synchronous between brain regions that serve similar functions. This provides a means to investigate functional networks; however, most analysis techniques assume functional connections are constant over time. This may be problematic in the case of neurological disease, where functional connections may be highly variable. Recently, several methods have been proposed to determine moment-to-moment changes in the strength of functional connections over an imaging session (so-called dynamic connectivity). Here, a novel analysis framework based on a hierarchical observation modeling approach is proposed to permit statistical inference of the presence of dynamic connectivity. A two-level linear model composed of overlapping sliding windows of fMRI signals, incorporating the fact that overlapping windows are not independent, is described. To test this approach, datasets were synthesized whereby functional connectivity was either constant (significant or insignificant) or modulated by an external input. The method successfully determines the statistical significance of a functional connection in phase with the modulation, and it exhibits greater sensitivity and specificity in detecting regions with variable connectivity than sliding-window correlation analysis. For real data, this technique possesses greater reproducibility and provides a more discriminative estimate of dynamic connectivity than sliding-window correlation analysis.
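The sliding-window correlation baseline that the proposed framework is compared against is simple to state. A minimal Python sketch (window length and step are illustrative):

    import numpy as np

    def sliding_window_correlation(x, y, window=30, step=1):
        # Pearson correlation of two region time series in overlapping
        # windows; the resulting series is the naive dynamic-connectivity
        # estimate, with no inference about window dependence.
        return np.array([np.corrcoef(x[s:s + window], y[s:s + window])[0, 1]
                         for s in range(0, len(x) - window + 1, step)])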
NASA Astrophysics Data System (ADS)
Mori, Kaya; Chonko, James C.; Hailey, Charles J.
2005-10-01
We have reanalyzed the 260 ks XMM-Newton observation of 1E 1207.4-5209. There are several significant improvements over previous work. First, a much broader range of physically plausible spectral models was used. Second, we have used a more rigorous statistical analysis. The standard F-distribution was not employed, but rather the exact finite statistics F-distribution was determined by Monte Carlo simulations. This approach was motivated by the recent work of Protassov and coworkers and Freeman and coworkers. They demonstrated that the standard F-distribution is not even asymptotically correct when applied to assess the significance of additional absorption features in a spectrum. With our improved analysis we do not find a third and fourth spectral feature in 1E 1207.4-5209 but only the two broad absorption features previously reported. Two additional statistical tests, one line model dependent and the other line model independent, confirmed our modified F-test analysis. For all physically plausible continuum models in which the weak residuals are strong enough to fit, the residuals occur at the instrument Au M edge. As a sanity check we confirmed that the residuals are consistent in strength and position with the instrument Au M residuals observed in 3C 273.
Schäffer, Beat; Pieren, Reto; Mendolia, Franco; Basner, Mathias; Brink, Mark
2017-05-01
Noise exposure-response relationships are used to estimate the effects of noise on individuals or a population. Such relationships may be derived from independent or repeated binary observations, and modeled by different statistical methods. Depending on the method by which they were established, their application in population risk assessment or estimation of individual responses may yield different results, i.e., predict "weaker" or "stronger" effects. As far as the present body of literature on noise effect studies is concerned, however, the underlying statistical methodology to establish exposure-response relationships has not always been paid sufficient attention. This paper gives an overview on two statistical approaches (subject-specific and population-averaged logistic regression analysis) to establish noise exposure-response relationships from repeated binary observations, and their appropriate applications. The considerations are illustrated with data from three noise effect studies, estimating also the magnitude of differences in results when applying exposure-response relationships derived from the two statistical approaches. Depending on the underlying data set and the probability range of the binary variable it covers, the two approaches yield similar to very different results. The adequate choice of a specific statistical approach and its application in subsequent studies, both depending on the research question, are therefore crucial.
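As an illustration of the population-averaged approach, repeated binary annoyance responses can be fit with GEE and an exchangeable working correlation; the subject-specific counterpart would instead use a random-effects (mixed) logistic model. A Python sketch with simulated data (all variable names and parameter values are hypothetical):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    df = pd.DataFrame({"subject": np.repeat(np.arange(50), 6),
                       "leq": np.tile(np.arange(45, 75, 5), 50).astype(float)})
    u = rng.normal(0, 1.5, 50)[df["subject"]]            # subject heterogeneity
    df["annoyed"] = rng.binomial(1, 1 / (1 + np.exp(12 - 0.18 * df["leq"] - u)))

    # Population-averaged exposure-response curve (marginal over subjects).
    gee = sm.GEE.from_formula("annoyed ~ leq", groups="subject", data=df,
                              family=sm.families.Binomial(),
                              cov_struct=sm.cov_struct.Exchangeable())
    print(gee.fit().params)   # the marginal slope is attenuated relative to
                              # the subject-specific slope when heterogeneity is large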
NASA Astrophysics Data System (ADS)
Karpushin, P. A.; Popov, Yu B.; Popova, A. I.; Popova, K. Yu; Krasnenko, N. P.; Lavrinenko, A. V.
2017-11-01
In this paper, the probabilities of faultless operation of aerologic stations are analyzed, the hypothesis of normality of the empirical data required for using the Kalman filter algorithms is tested, and the spatial correlation functions of distributions of meteorological parameters are determined. The results of a statistical analysis of two-term (0, 12 GMT) radiosonde observations of the temperature and wind velocity components at some preset altitude ranges in the troposphere in 2001-2016 are presented. These data can be used in mathematical modeling of physical processes in the atmosphere.
Modeling of a Robust Confidence Band for the Power Curve of a Wind Turbine.
Hernandez, Wilmar; Méndez, Alfredo; Maldonado-Correa, Jorge L; Balleteros, Francisco
2016-12-07
Having an accurate model of the power curve of a wind turbine allows us to better monitor its operation and plan storage capacity. Since wind speed and direction are of a highly stochastic nature, the forecasting of the power generated by the wind turbine is of the same nature as well. In this paper, a method for obtaining a robust confidence band containing the power curve of a wind turbine under test conditions is presented. Here, the confidence band is bounded by two curves which are estimated using parametric statistical inference techniques. However, the observations used for carrying out the statistical analysis are obtained by using the binning method, and in each bin the outliers are eliminated by a censorship process based on robust statistical techniques. The observations that are not outliers are then divided into observation sets. Finally, both the power curve of the wind turbine and the two curves that define the robust confidence band are estimated using each of the previously mentioned observation sets.
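The binning-plus-censoring step can be illustrated in a few lines. A Python sketch, assuming a median/MAD censoring rule as a stand-in for the paper's robust censorship process (bin width and cutoff k are illustrative):

    import numpy as np

    def robust_binned_curve(wind_speed, power, bin_width=0.5, k=3.0):
        # Bin observations by wind speed; inside each bin, censor outliers
        # with a robust median +/- k*MAD rule, then keep a robust center
        # from the surviving observations.
        edges = np.arange(wind_speed.min(), wind_speed.max() + bin_width, bin_width)
        centers, values = [], []
        for lo, hi in zip(edges[:-1], edges[1:]):
            p = power[(wind_speed >= lo) & (wind_speed < hi)]
            if len(p) < 5:
                continue
            med = np.median(p)
            mad = 1.4826 * np.median(np.abs(p - med))   # robust scale estimate
            kept = p[np.abs(p - med) <= k * mad]
            centers.append((lo + hi) / 2.0)
            values.append(np.median(kept))
        return np.array(centers), np.array(values)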
Modelling short time series in metabolomics: a functional data analysis approach.
Montana, Giovanni; Berk, Maurice; Ebbels, Tim
2011-01-01
Metabolomics is the study of the complement of small molecule metabolites in cells, biofluids and tissues. Many metabolomic experiments are designed to compare changes observed over time under two or more experimental conditions (e.g. a control and a drug-treated group), thus producing time course data. Models from traditional time series analysis are often unsuitable because, by design, only a few time points are available and there is a high number of missing values. We propose a functional data analysis approach for modelling short time series arising in metabolomic studies which overcomes these obstacles. Our model assumes that each observed time series is a smooth random curve, and we propose a statistical approach for inferring this curve from repeated measurements taken on the experimental units. A test statistic for detecting differences between temporal profiles associated with two experimental conditions is then presented. The methodology has been applied to NMR spectroscopy data collected in a pre-clinical toxicology study.
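In the same spirit, a smooth mean curve can be fit to each condition's sparse, replicate-level data, and a test statistic can be formed from the gap between the two fitted curves. A simplified Python sketch, with a smoothing spline standing in for the paper's random-curve model:

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def condition_curve(times, replicates, smoothing=None):
        # Pool replicate measurements (dropping missing values) and fit a
        # smoothing spline as the estimated mean temporal profile.
        t = np.concatenate([times[np.isfinite(r)] for r in replicates])
        y = np.concatenate([r[np.isfinite(r)] for r in replicates])
        order = np.argsort(t)
        return UnivariateSpline(t[order], y[order], s=smoothing)

    def curve_difference_stat(curve_a, curve_b, t_min, t_max, n_grid=200):
        # Integrated squared difference between two fitted profiles,
        # approximated on a uniform grid.
        g = np.linspace(t_min, t_max, n_grid)
        return ((curve_a(g) - curve_b(g)) ** 2).mean() * (t_max - t_min)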
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-08
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systematically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, whether correlated, independent, continuous, or binary, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study cross-phenotype (CP) associations by using summary statistics from GWASs of multiple phenotypes.
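A generic summary-statistic combination in this spirit (not the authors' exact statistic) tests the vector of per-SNP Z-scores across traits against the trait correlation matrix:

    import numpy as np
    from scipy import stats

    def cross_phenotype_pvalue(z_scores, trait_corr):
        # Under the null, z ~ N(0, R) across traits, so z' R^{-1} z follows
        # a chi-square with df equal to the number of traits.
        z = np.asarray(z_scores, dtype=float)
        return stats.chi2.sf(z @ np.linalg.solve(trait_corr, z), df=len(z))

    # Example: three correlated blood-pressure traits at one SNP.
    R = np.array([[1.0, 0.6, 0.4],
                  [0.6, 1.0, 0.5],
                  [0.4, 0.5, 1.0]])
    print(cross_phenotype_pvalue([2.5, 3.1, 1.8], R))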
Introduction to Latent Class Analysis with Applications
ERIC Educational Resources Information Center
Porcu, Mariano; Giambona, Francesca
2017-01-01
Latent class analysis (LCA) is a statistical method used to group individuals (cases, units) into classes (categories) of an unobserved (latent) variable on the basis of the responses made on a set of nominal, ordinal, or continuous observed variables. In this article, we introduce LCA in order to demonstrate its usefulness to early adolescence…
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Model Performance Evaluation and Scenario Analysis ...
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components, and the reconstruction back to time series, provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is due to the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the tool…
NASA Technical Reports Server (NTRS)
Weger, R. C.; Lee, J.; Zhu, Tianri; Welch, R. M.
1992-01-01
The current controversy regarding regularity vs. clustering in cloud fields is examined by means of analysis and simulation studies based upon nearest-neighbor cumulative distribution statistics. It is shown that the Poisson representation of random point processes is superior to pseudorandom-number-generated models, which bias the observed nearest-neighbor statistics towards regularity. Interpretation of these nearest-neighbor statistics is discussed for many cases of superpositions of clustering, randomness, and regularity. A detailed analysis is carried out of cumulus cloud field spatial distributions based upon Landsat, AVHRR, and Skylab data, showing that, when both large and small clouds are included in the cloud field distributions, the cloud field always has a strong clustering signal.
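A minimal version of the nearest-neighbor comparison reads as follows (a generic Clark-Evans-style ratio against the Poisson expectation; a sketch, not the authors' analysis):

    import numpy as np
    from scipy.spatial import cKDTree

    def clustering_ratio(points, area):
        # Mean nearest-neighbor distance divided by the Poisson (random)
        # expectation 1 / (2 sqrt(density)); values below 1 indicate
        # clustering, values above 1 regularity.
        pts = np.asarray(points, dtype=float)
        d, _ = cKDTree(pts).query(pts, k=2)   # k=2: nearest point besides self
        observed = d[:, 1].mean()
        expected = 0.5 / np.sqrt(len(pts) / area)
        return observed / expected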
Statistical analysis of subjective preferences for video enhancement
NASA Astrophysics Data System (ADS)
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
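A concrete way to obtain inferential scale values from pairwise preferences is a Bradley-Terry-style logistic regression; the Python sketch below, with hypothetical win counts among four enhancement levels, recovers scale values with standard errors, unlike classical Thurstone scaling:

    import numpy as np
    import statsmodels.api as sm

    # wins[i, j]: times level i was preferred over level j (illustrative data)
    wins = np.array([[0, 14, 9, 4],
                     [6, 0, 12, 7],
                     [11, 8, 0, 10],
                     [16, 13, 10, 0]])

    n = wins.shape[0]
    rows = []
    for i in range(n):
        for j in range(n):
            if i != j and wins[i, j] > 0:
                x = np.zeros(n - 1)          # level 0 is the reference
                if i > 0:
                    x[i - 1] += 1
                if j > 0:
                    x[j - 1] -= 1
                rows += [x] * wins[i, j]     # one row per recorded preference

    # Every row records "first item preferred", so the response is all ones;
    # identifiability comes from the signed design (each pair appears in both
    # orders), and the coefficients are scale values relative to level 0.
    fit = sm.Logit(np.ones(len(rows)), np.array(rows)).fit(disp=0)
    print(fit.params, fit.bse)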
Abdullah, Christopher; Parris, Julian; Lie, Richard; Guzdar, Amy; Tour, Ella
2015-01-01
The ability to think analytically and creatively is crucial for success in the modern workforce, particularly for graduate students, who often aim to become physicians or researchers. Analysis of the primary literature provides an excellent opportunity to practice these skills. We describe a course that includes a structured analysis of four research papers from diverse fields of biology and group exercises in proposing experiments that would follow up on these papers. To facilitate a critical approach to primary literature, we included a paper with questionable data interpretation and two papers investigating the same biological question yet reaching opposite conclusions. We report a significant increase in students’ self-efficacy in analyzing data from research papers, evaluating authors’ conclusions, and designing experiments. Using our science-process skills test, we observe a statistically significant increase in students’ ability to propose an experiment that matches the goal of investigation. We also detect gains in interpretation of controls and quantitative analysis of data. No statistically significant changes were observed in questions that tested the skills of interpretation, inference, and evaluation.
Sympathetic Nerve Injury in Thyroid Cancer.
Diamantis, Evangelos; Farmaki, Paraskevi; Savvanis, Spyridon; Athanasiadis, Georgios; Troupis, Theodoros; Damaskos, Christos
The double innervation of the thyroid comes from the sympathetic and parasympathetic nervous systems. Injury rates during surgery are around 30% but can be minimized by preparing the thyroid vessels upward at the level of the thyroid capsule. Several factors have been implicated in increasing the risk of injury, including age and tumor size. Our aim was to investigate whether there are indeed any correlations between these factors and an increase in injury rates following thyroidectomy. Seven studies were included in the meta-analysis. A statistically significant positive relationship was observed between injury of the sympathetic nerve and thyroid malignancy surgery (I² = 74%). No statistical correlations were observed for a negative or positive relationship between injury of the sympathetic nerve and tumor size. There was also no statistically significant correlation of the patients' age with the risk of sympathetic nerve injury (p = 0.388). The lack of significant correlation reported could be due to the small number of studies and the great heterogeneity between them.
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness-of-fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square, but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra and Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra and Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book are used to illustrate the real-world performance of this statistic.
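The flavor of the adjustment can be conveyed with a generic moment-matching sketch: rescale the statistic to match the null mean and choose the reference chi-square's degrees of freedom from the null skewness (for a chi-square with d degrees of freedom, skewness is sqrt(8/d)). This illustrates the idea only and is not Lin and Bentler's exact estimator:

    import numpy as np
    from scipy import stats

    def skewness_adjusted_pvalue(t_obs, null_draws):
        # null_draws: simulated or theoretical draws of the statistic under
        # the null, from which the mean and skewness are estimated.
        m = np.mean(null_draws)
        g = stats.skew(null_draws)
        d = 8.0 / g ** 2                            # df matching the third moment
        return stats.chi2.sf(t_obs * d / m, df=d)   # rescale to match the mean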
Limb-darkening and the structure of the Jovian atmosphere
NASA Technical Reports Server (NTRS)
Newman, W. I.; Sagan, C.
1978-01-01
By observing the transit of various cloud features across the Jovian disk, limb-darkening curves were constructed for three regions in the 4.6 to 5.1 μm band. Several models currently employed in describing the radiative or dynamical properties of planetary atmospheres are examined here to understand their implications for limb-darkening. The statistical problem of fitting these models to the observed data is reviewed, and methods for applying multiple regression analysis are discussed. Analysis of variance techniques are introduced to test the viability of a given physical process as a cause of the observed limb-darkening.
NASA Technical Reports Server (NTRS)
Jongeward, Andrew R.; Li, Zhanqing; He, Hao; Xiong, Xiaoxiong
2016-01-01
Aerosols contribute to Earth's radiative budget both directly and indirectly, and large uncertainties remain in quantifying aerosol effects on climate. Variability in aerosol distribution and properties, as might result from changing emissions and transport processes, must be characterized. In this study, variations in aerosol loading across the eastern seaboard of the United States and the North Atlantic Ocean during 2002 to 2012 are analyzed to examine the impacts of anthropogenic emission control measures, using monthly mean data from MODIS, AERONET, and IMPROVE observations and Goddard Chemistry Aerosol Radiation and Transport (GOCART) model simulation. MODIS observes a statistically significant negative trend in aerosol optical depth (AOD) over the midlatitudes (-0.030 per decade). Correlation analyses with surface AOD from AERONET sites in the upwind region, combined with trend analysis from GOCART component AOD, confirm that the observed decrease in the midlatitudes is chiefly associated with anthropogenic aerosols, which exhibit significant negative trends from the eastern U.S. coast extending over the western North Atlantic. Additional analysis of IMPROVE surface PM2.5 observations demonstrates statistically significant negative trends in the anthropogenic components, with decreasing mass concentrations over the eastern United States. Finally, a seasonal analysis of the observational datasets is performed. The negative trend seen by MODIS is strongest during spring (MAM) and summer (JJA) months. This is supported by AERONET seasonal trends and is identified from IMPROVE seasonal trends as resulting from ammonium sulfate decreases during these seasons.
Statistical Significance and Baseline Monitoring.
1984-07-01
Observed versus nominal α levels were compared for multivariate tests of data sets (50 runs of 4 groups each), based on the cumulative proportion of observations found at each nominal level. The observed α values are always higher than the nominal levels, and virtually all nominal α levels are below 0.20. In other words, the discriminant analysis models…
Visual Data Analysis for Satellites
NASA Technical Reports Server (NTRS)
Lau, Yee; Bhate, Sachin; Fitzpatrick, Patrick
2008-01-01
The Visual Data Analysis Package is a collection of programs and scripts that facilitate visual analysis of data available from NASA and NOAA satellites, as well as dropsonde, buoy, and conventional in-situ observations. The package features utilities for data extraction, data quality control, statistical analysis, and data visualization. The Hierarchical Data Format (HDF) satellite data extraction routines from NASA's Jet Propulsion Laboratory were customized for specific spatial coverage and file input/output. Statistical analysis includes the calculation of the relative error, the absolute error, and the root mean square error. Other capabilities include curve fitting through the data points to fill in missing data points between satellite passes or where clouds obscure satellite data. For data visualization, the software provides customizable Generic Mapping Tool (GMT) scripts to generate difference maps, scatter plots, line plots, vector plots, histograms, time series, and color fill images.
NASA Technical Reports Server (NTRS)
Jasperson, W. H.; Nastron, G. D.; Davis, R. E.; Holdeman, J. D.
1984-01-01
Summary studies are presented for the entire cloud observation archive from the NASA Global Atmospheric Sampling Program (GASP). Studies are also presented for GASP particle-concentration data gathered concurrently with the cloud observations. Cloud encounters are shown on about 15 percent of the data samples overall, but the probability of cloud encounter is shown to vary significantly with altitude, latitude, and distance from the tropopause. Several meteorological circulation features are apparent in the latitudinal distribution of cloud cover, and the cloud-encounter statistics are shown to be consistent with the classical mid-latitude cyclone model. Observations of clouds spaced more closely than 90 minutes are shown to be statistically dependent. The statistics for cloud and particle encounter are utilized to estimate the frequency of cloud encounter on long-range airline routes, and to assess the probability and extent of laminar flow loss due to cloud or particle encounter by aircraft utilizing laminar flow control (LFC). It is shown that the probability of extended cloud encounter is too low, of itself, to make LFC impractical. This report is presented in two volumes. Volume I contains the narrative, analysis, and conclusions. Volume II contains five supporting appendixes.
A Cyber-Attack Detection Model Based on Multivariate Analyses
NASA Astrophysics Data System (ADS)
Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi
In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known quantification method IV of Hayashi and the cluster analysis method. We quantify the observed qualitative audit event sequence via the quantification method IV, and collect similar audit event sequences into the same groups based on the cluster analysis. It is shown in simulation experiments that our model can improve cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.
Unicomb, Rachael; Colyvas, Kim; Harrison, Elisabeth; Hewat, Sally
2015-06-01
Case-study methodology is often used to study change in the field of speech-language pathology, but it can be criticized for not being statistically robust. Yet with the heterogeneous nature of many communication disorders, case studies allow clinicians and researchers to closely observe and report on change. Such information is valuable and can further inform large-scale experimental designs. In this research note, a statistical analysis for case-study data is outlined that employs a modification to the Reliable Change Index (Jacobson & Truax, 1991). The relationship between reliable change and clinical significance is discussed. Example data are used to guide the reader through the use and application of this analysis. A method of analysis is detailed that is suitable for assessing change in measures with binary categorical outcomes. The analysis is illustrated using data from one individual, measured before and after treatment for stuttering. The application of this approach to assessing change in categorical, binary data has potential application in speech-language pathology. It enables clinicians and researchers to analyze results from case studies for their statistical and clinical significance. This new method addresses a gap in the research design literature, that is, the lack of analysis methods for noncontinuous data (such as counts, rates, proportions of events) that may be used in case-study designs.
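For binary outcome data, a reliable-change-style computation can take the form of a two-proportion z statistic, a difference over its standard error, echoing the RCI's construction (a sketch, not the authors' exact modification):

    import numpy as np
    from scipy import stats

    def reliable_change_binary(k_pre, n_pre, k_post, n_post):
        # e.g., counts of stuttered syllables out of syllables sampled
        # before and after treatment for one individual.
        p1, p2 = k_pre / n_pre, k_post / n_post
        pooled = (k_pre + k_post) / (n_pre + n_post)
        se = np.sqrt(pooled * (1 - pooled) * (1 / n_pre + 1 / n_post))
        z = (p2 - p1) / se
        return z, 2 * stats.norm.sf(abs(z))

    print(reliable_change_binary(80, 1000, 30, 1000))  # illustrative counts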
NASA Astrophysics Data System (ADS)
Sheshukov, Aleksey Y.; Sekaluvu, Lawrence; Hutchinson, Stacy L.
2018-04-01
Topographic index (TI) models have been widely used to predict trajectories and initiation points of ephemeral gullies (EGs) in agricultural landscapes. Prediction of EGs strongly relies on the selected value of the critical TI threshold, and the accuracy depends on topographic features, agricultural management, and datasets of observed EGs. This study statistically evaluated the predictions of TI models in two paired watersheds in Central Kansas that had different levels of structural disturbance due to implemented conservation practices. Four TI models with sole dependency on the topographic factors of slope, contributing area, and planform curvature were used in this study. The observed EGs were obtained by field reconnaissance and through the process of hydrological reconditioning of digital elevation models (DEMs). Kernel Density Estimation analysis was used to evaluate the TI distribution within a 10-m buffer of the observed EG trajectories. EG occurrence within catchments was analyzed using kappa statistics of the error matrix approach, while the lengths of predicted EGs were compared with the observed dataset using Nash-Sutcliffe Efficiency (NSE) statistics. The TI frequency analysis produced a bi-modal distribution of topographic indexes, with the pixels within the EG trajectory having a higher peak. The graphs of kappa and NSE versus critical TI threshold showed similar profiles for all four TI models and both watersheds, with the maximum value representing the best comparison with the observed data. The Compound Topographic Index (CTI) model presented the overall best accuracy, with NSE of 0.55 and kappa of 0.32. The statistics for the disturbed watershed showed higher best critical TI threshold values than for the undisturbed watershed. Structural conservation practices implemented in the disturbed watershed reduced ephemeral channels in headwater catchments, thus producing less variability in catchments with EGs. The variation in critical thresholds for all TI models suggested that TI models tend to predict EG occurrence and length over a range of thresholds rather than at a single best value.
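The agreement computation behind the kappa statistic used above is standard; a short Python sketch for a catchment-level presence/absence error matrix (the counts are hypothetical):

    import numpy as np

    def cohens_kappa(cm):
        # cm: 2x2 error matrix, rows = observed EG presence/absence,
        # columns = predicted presence/absence at a given TI threshold.
        cm = np.asarray(cm, dtype=float)
        n = cm.sum()
        po = np.trace(cm) / n                             # observed agreement
        pe = (cm.sum(axis=1) @ cm.sum(axis=0)) / n ** 2   # chance agreement
        return (po - pe) / (1 - pe)

    print(cohens_kappa([[40, 18], [25, 117]]))   # hypothetical catchment counts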
1992-01-09
Analysis of Model Output Statistics Thunderstorm Prediction Model (Frank A. Lasley, Hanscom AFB). Model Output Statistics (MOS) thunderstorm prediction information and Service A weather observations…
Zhang, Harrison G; Ying, Gui-Shuang
2018-02-09
The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in the British Journal of Ophthalmology (BJO) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical analysis approaches for analysing the primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at the ocular level, one study (1%) analysed the overall summary of ocular findings per individual, and three (3%) studies used a paired comparison. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, as compared with 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, as compared with 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve in the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data.
[The concept "a case in outpatient treatment" in military policlinic activity].
Vinogradov, S N; Vorob'ev, E G; Shklovskiĭ, B L
2014-04-01
The article substantiates the need for military policlinics to transition to a system of accounting and evaluation of their activity based on completed cases of outpatient treatment. Only the automation of medical-statistical processes can solve this problem. Based on an analysis of the literature, the requirements of guidance documents, and observational results, it concludes that the existing concepts of medical statistics should first be revised (formalised) from the perspective of the information environment now in use, namely electronic databases. In this context, the main features of the outpatient treatment case as a unit of medical-statistical record are specified and its definition is formulated.
NASA Technical Reports Server (NTRS)
Myers, R. H.
1976-01-01
The depletion of ozone in the stratosphere is examined, and causes for the depletion are cited. Ground station and satellite measurements of ozone, which are taken on a worldwide basis, are discussed. Instruments used in ozone measurement are discussed, such as the Dobson spectrophotometer, which is credited with providing the longest and most extensive series of observations for ground-based observation of stratospheric ozone. Other ground-based instruments used to measure ozone are also discussed. The statistical differences of ground-based measurements of ozone from these different instruments are compared to each other, and to satellite measurements. Mathematical methods (i.e., trend analysis or linear regression analysis) of analyzing the variability of ozone concentration with respect to time and latitude are described. Various time series models which can be employed in accounting for ozone concentration variability are examined.
GAFFE: a gaze-attentive fixation finding engine.
Rajashekar, U; van der Linde, I; Bovik, A C; Cormack, L K
2008-04-01
The ability to automatically detect visually interesting regions in images has many practical applications, especially in the design of active machine vision and automatic visual surveillance systems. Analysis of the statistics of image features at observers' gaze can provide insights into the mechanisms of fixation selection in humans. Using a foveated analysis framework, we studied the statistics of four low-level local image features: luminance, contrast, and bandpass outputs of both luminance and contrast, and discovered that image patches around human fixations had, on average, higher values of each of these features than image patches selected at random. Contrast-bandpass showed the greatest difference between human and random fixations, followed by luminance-bandpass, RMS contrast, and luminance. Using these measurements, we present a new algorithm that selects image regions as likely candidates for fixation. These regions are shown to correlate well with fixations recorded from human observers.
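The foveated feature comparison reduces, at its simplest, to measuring patch statistics at fixated versus random locations. A generic Python sketch (patch size and the contrast definition are illustrative simplifications):

    import numpy as np

    def patch_stats(img, points, half=16):
        # Mean luminance and RMS contrast of square patches centred on the
        # given (row, col) points of a 2-D grayscale image.
        out = []
        for r, c in points:
            if r < half or c < half or r + half > img.shape[0] or c + half > img.shape[1]:
                continue                    # skip patches falling off the image
            p = img[r - half:r + half, c - half:c + half].astype(float)
            lum = p.mean()
            out.append((lum, p.std() / (lum + 1e-9)))
        return np.array(out)

    # Comparing patch_stats(img, fixations).mean(axis=0) against the same
    # quantity at random points reproduces the direction of the reported effect.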
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, J.
1999-01-01
A new atmospheric objective analysis system designed for parallel computers will be described. The system can produce a global analysis (on a 1 x 1 lat-lon grid with 18 levels of heights and winds and 10 levels of moisture) using 120,000 observations in 17 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g. MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first-guess error statistics to produce the best estimate of the atmospheric state. Static tests with a 2 x 2.5 resolution version of this system showed its analysis increments are comparable to the latest NASA operational system, including maintenance of mass-wind balance. Results from several months of cycling tests in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) as the current operational system.
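The optimal-interpolation update at the heart of such a system is compact; a minimal numpy sketch of one analysis step (generic OI, not the GEOS code):

    import numpy as np

    def oi_analysis(xb, B, H, y, R):
        # Blend background xb (error covariance B) with observations y
        # (error covariance R) through the observation operator H:
        # xa = xb + K (y - H xb), with gain K = B H' (H B H' + R)^{-1}.
        K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
        return xb + K @ (y - H @ xb)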
Crown, William; Chang, Jessica; Olson, Melvin; Kahler, Kristijan; Swindle, Jason; Buzinec, Paul; Shah, Nilay; Borah, Bijan
2015-09-01
Missing data, particularly missing variables, can create serious analytic challenges in observational comparative effectiveness research studies. Statistical linkage of datasets is a potential method for incorporating missing variables. Prior studies have focused upon the bias introduced by imperfect linkage. This analysis uses a case study of hepatitis C patients to estimate the net effect of statistical linkage on bias, also accounting for the potential reduction in missing variable bias. The results show that statistical linkage can reduce bias while also enabling parameter estimates to be obtained for the formerly missing variables. The usefulness of statistical linkage will vary depending upon the strength of the correlations of the missing variables with the treatment variable, as well as the outcome variable of interest.
Shitara, Kohei; Matsuo, Keitaro; Oze, Isao; Mizota, Ayako; Kondo, Chihiro; Nomura, Motoo; Yokota, Tomoya; Takahari, Daisuke; Ura, Takashi; Muro, Kei
2011-08-01
We performed a systematic review and meta-analysis to determine the impact of neutropenia or leukopenia experienced during chemotherapy on survival. Eligible studies included prospective or retrospective analyses that evaluated neutropenia or leukopenia as a prognostic factor for overall survival or disease-free survival. Statistical analyses were conducted to calculate a summary hazard ratio and 95% confidence interval (CI) using random-effects or fixed-effects models based on the heterogeneity of the included studies. Thirteen trials were selected for the meta-analysis, with a total of 9,528 patients. The hazard ratio of death was 0.69 (95% CI, 0.64-0.75) for patients with higher-grade neutropenia or leukopenia compared to patients with lower-grade or lack of cytopenia. Our analysis was also stratified by statistical method (any statistical method to decrease lead-time bias; time-varying analysis or landmark analysis), but no differences were observed. Our results indicate that neutropenia or leukopenia experienced during chemotherapy is associated with improved survival in patients with advanced cancer or hematological malignancies undergoing chemotherapy. Future prospective analyses designed to investigate the potential impact of chemotherapy dose adjustment coupled with monitoring of neutropenia or leukopenia on survival are warranted.
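The summary hazard ratio computation is the standard inverse-variance pooling of log hazard ratios; a fixed-effect Python sketch from 95% CIs (inputs are illustrative, not the thirteen included trials):

    import numpy as np

    def pooled_hazard_ratio(hr, lo, hi):
        # Back out each study's SE from its 95% CI on the log scale, weight
        # by inverse variance, and pool; a random-effects variant would add
        # a between-study variance term to each weight.
        log_hr = np.log(hr)
        se = (np.log(hi) - np.log(lo)) / (2 * 1.96)
        w = 1.0 / se ** 2
        est = np.sum(w * log_hr) / np.sum(w)
        se_est = 1.0 / np.sqrt(np.sum(w))
        return np.exp(est), np.exp(est - 1.96 * se_est), np.exp(est + 1.96 * se_est)

    print(pooled_hazard_ratio(np.array([0.65, 0.72, 0.70]),
                              np.array([0.50, 0.55, 0.60]),
                              np.array([0.85, 0.94, 0.82])))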
Statistical analysis of the 70 meter antenna surface distortions
NASA Technical Reports Server (NTRS)
Kiedron, K.; Chian, C. T.; Chuang, K. L.
1987-01-01
Statistical analysis of surface distortions of the 70 meter NASA/JPL antenna, located at Goldstone, was performed. The purpose of this analysis is to verify whether deviations due to gravity loading can be treated as quasi-random variables with normal distribution. Histograms of the RF pathlength error distribution for several antenna elevation positions were generated. The results indicate that the deviations from the ideal antenna surface are not normally distributed. The observed density distribution for all antenna elevation angles is taller and narrower than the normal density, which results in large positive values of kurtosis and a significant amount of skewness. The skewness of the distribution changes from positive to negative as the antenna elevation changes from zenith to horizon.
Wang, Hong-Qiang; Tsai, Chung-Jui
2013-01-01
With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. A web server for CorSig is provided at http://202.127.200.1:8080/probeWeb. R code for CorSig is freely available for non-commercial use at http://aspendb.uga.edu/downloads.
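The described test is easy to reproduce in outline: transform the observed r and the cutoff τ with Fisher's Z and compare against a one-sided normal reference (a minimal sketch, not the released CorSig implementation):

    import numpy as np
    from scipy import stats

    def corsig_pvalue(r, n, tau):
        # H0: rho <= tau vs. H1: rho > tau; Fisher's Z has variance ~1/(n-3).
        z = (np.arctanh(r) - np.arctanh(tau)) * np.sqrt(n - 3)
        return stats.norm.sf(z)

    print(corsig_pvalue(r=0.80, n=30, tau=0.50))  # small p: exceeds the cutoff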
A cohort mortality study of employees exposed to chlorinated chemicals.
Wong, O
1988-01-01
The cohort of this historical prospective mortality study consisted of 697 male employees at a chlorination plant. A majority of the cohort was potentially exposed to benzotrichloride, benzyl chloride, benzoyl chloride, and other related chemicals. The mortality experience of the cohort was observed from 1943 through 1982. For the cohort as a whole, no statistically significant mortality excess was detected. The overall Standardized Mortality Ratio (SMR) was 100, and the SMR for all cancers combined was 122 (not significant). The respiratory cancer SMR for the cohort as a whole was 246 (7 observed vs. 2.8 expected). The excess was of borderline statistical significance, the lower 95% confidence limit being 99. Analysis by race showed that all 7 respiratory cancer deaths came from the white male employees, with an SMR of 265 (p < 0.05). The respiratory cancer mortality excess was higher among employees in maintenance (SMR = 229) than among those in operations or production (SMR = 178). The lung cancer mortality excess among the laboratory employees was statistically significant (SMR = 1292). However, this observation should be viewed with caution, since it was based on only 2 deaths. Further analysis indicated that the respiratory cancer mortality excess was limited to the male employees with 15 or more years of employment (SMR = 379, p < 0.05). Based on animal data as well as other epidemiologic studies, together with the internal consistency of analysis by length of employment, the data suggest an association between the chlorination process of toluene at the plant and an increased risk of respiratory cancer.
A national streamflow network gap analysis
Kiang, Julie E.; Stewart, David W.; Archfield, Stacey A.; Osborne, Emily B.; Eng, Ken
2013-01-01
The U.S. Geological Survey (USGS) conducted a gap analysis to evaluate how well the USGS streamgage network meets a variety of needs, focusing on the ability to calculate various statistics at locations that have streamgages (gaged) and that do not have streamgages (ungaged). This report presents the results of analysis to determine where there are gaps in the network of gaged locations, how accurately desired statistics can be calculated with a given length of record, and whether the current network allows for estimation of these statistics at ungaged locations. The analysis indicated that there is variability across the Nation’s streamflow data-collection network in terms of the spatial and temporal coverage of streamgages. In general, the Eastern United States has better coverage than the Western United States. The arid Southwestern United States, Alaska, and Hawaii were observed to have the poorest spatial coverage, using the dataset assembled for this study. Except in Hawaii, these areas also tended to have short streamflow records. Differences in hydrology lead to differences in the uncertainty of statistics calculated in different regions of the country. Arid and semiarid areas of the Central and Southwestern United States generally exhibited the highest levels of interannual variability in flow, leading to larger uncertainty in flow statistics. At ungaged locations, information can be transferred from nearby streamgages if there is sufficient similarity between the gaged watersheds and the ungaged watersheds of interest. Areas where streamgages exhibit high correlation are most likely to be suitable for this type of information transfer. The areas with the most highly correlated streamgages appear to coincide with mountainous areas of the United States. Lower correlations are found in the Central United States and coastal areas of the Southeastern United States. Information transfer from gaged basins to ungaged basins is also most likely to be successful when basin attributes show high similarity. At the scale of the analysis completed in this study, the attributes of basins upstream of USGS streamgages cover the full range of basin attributes observed at potential locations of interest fairly well. Some exceptions included very high or very low elevation areas and very arid areas.
Dalgin, Rebecca Spirito; Dalgin, M Halim; Metzger, Scott J
2018-05-01
This article focuses on the impact of a peer-run warm line as part of the psychiatric recovery process. It utilized data including the Recovery Assessment Scale (RAS), community integration measures, and crisis service usage. Longitudinal statistical analysis was completed on 48 sets of data from 2011, 2012, and 2013. Although no statistically significant differences were observed for the RAS score, community integration data showed increases in visits to primary care doctors, leisure/recreation activities, and socialization with others. This study highlights the complexity of psychiatric recovery and suggests that nonclinical peer services like peer-run warm lines may be critical to the process.
Statistical analysis and modeling of intermittent transport events in the tokamak scrape-off layer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Johan, E-mail: anderson.johan@gmail.com; Halpern, Federico D.; Ricci, Paolo
The turbulence observed in the scrape-off-layer of a tokamak is often characterized by intermittent events of bursty nature, a feature which raises concerns about the prediction of heat loads on the physical boundaries of the device. It thus appears necessary to delve into the statistical properties of turbulent physical fields such as density, electrostatic potential, and temperature, focusing on the mathematical expression of tails of the probability distribution functions. The method followed here is to generate statistical information from time-traces of the plasma density stemming from Braginskii-type fluid simulations and check this against a first-principles theoretical model. The analysis of the numerical simulations indicates that the probability distribution function of the intermittent process contains strong exponential tails, as predicted by the analytical theory.
Parallel Climate Data Assimilation PSAS Package Achieves 18 GFLOPs on 512-Node Intel Paragon
NASA Technical Reports Server (NTRS)
Ding, H. Q.; Chan, C.; Gennery, D. B.; Ferraro, R. D.
1995-01-01
Several algorithms were added to the Physical-space Statistical Analysis System (PSAS) from Goddard, which assimilates observational weather data by correcting for different levels of uncertainty about the data and different locations for mobile observation platforms. The new algorithms and use of the 512-node Intel Paragon allowed a hundred-fold decrease in processing time.
NASA Astrophysics Data System (ADS)
Colone, L.; Hovgaard, M. K.; Glavind, L.; Brincker, R.
2018-07-01
A method for mass change detection on wind turbine blades using natural frequencies is presented. The approach is based on two statistical tests. The first test decides if there is a significant mass change and the second test is a statistical group classification based on Linear Discriminant Analysis. The frequencies are identified by means of Operational Modal Analysis using natural excitation. Based on the assumption of Gaussianity of the frequencies, a multi-class statistical model is developed by combining finite element model sensitivities in 10 classes of change location on the blade, the smallest area being 1/5 of the span. The method is experimentally validated for a full scale wind turbine blade in a test setup and loaded by natural wind. Mass change from natural causes was imitated with sand bags and the algorithm was observed to perform well with an experimental detection rate of 1, localization rate of 0.88 and mass estimation rate of 0.72.
Statistical analysis of NaOH pretreatment effects on sweet sorghum bagasse characteristics
NASA Astrophysics Data System (ADS)
Putri, Ary Mauliva Hada; Wahyuni, Eka Tri; Sudiyani, Yanni
2017-01-01
We analyze the behavior of sweet sorghum bagasse characteristics before and after NaOH pretreatment by statistical analysis. These characteristics include the percentages of lignocellulosic materials and the degree of crystallinity. We use the chi-square method to obtain the values of the fitted parameters, and then deploy Student's t-test to check whether they are significantly different from zero at the 99.73% confidence level (C.L.). We find that the percentages of hemicellulose and lignin decrease significantly after pretreatment. Crystallinity, on the other hand, does not behave similarly: the data show that all fitted parameters in this case are consistent with zero. Our statistical result is then cross-checked against the observations from X-ray diffraction (XRD) and Fourier Transform Infrared (FTIR) spectroscopy, showing good agreement. This result may indicate that the 10% NaOH pretreatment is not sufficient to change the crystallinity index of the sweet sorghum bagasse.
12 & 15 passenger vans tire pressure study : preliminary results
DOT National Transportation Integrated Search
2005-05-01
A study was conducted by the National Highway Traffic Safety Administration's (NHTSA's) National Center for Statistics and Analysis (NCSA) to determine the extent of underinflation and observe the tire condition in 12- and 15-passenger vans. This Res...
NASA Astrophysics Data System (ADS)
Malik, Abdul; Brönnimann, Stefan
2017-09-01
The Modes of Ocean Variability (MOV), namely the Atlantic Multidecadal Oscillation (AMO), Pacific Decadal Oscillation (PDO), and El Niño Southern Oscillation (ENSO), can have significant impacts on Indian Summer Monsoon Rainfall (ISMR) on different timescales. The timescales at which these MOV interact with ISMR, and the factors which may perturb their relationship with ISMR, need to be investigated. We employ De-trended Cross-Correlation Analysis (DCCA) and De-trended Partial-Cross-Correlation Analysis (DPCCA) to study the timescales of interaction of ISMR with AMO, PDO, and ENSO using an observational dataset (AD 1854-1999) and atmosphere-ocean-chemistry climate model simulations with SOCOL-MPIOM (AD 1600-1999). Further, this study uses De-trended Semi-Partial Cross-Correlation Analysis (DSPCCA) to address the relation between solar variability and the ISMR. We find statistically significant evidence of intrinsic correlations of ISMR with AMO, PDO, and ENSO on different timescales, consistent between model simulations and observations. However, the model fails to capture the modulation of the intrinsic relationship between ISMR and MOV by external signals. Our analysis indicates that AMO is a potential source of the non-stationary relationship between ISMR and ENSO. Furthermore, the pattern of correlation between ISMR and Total Solar Irradiance (TSI) is inconsistent between observations and model simulations. The observational dataset indicates a statistically insignificant negative intrinsic correlation between ISMR and TSI on decadal-to-centennial timescales. This statistically insignificant negative intrinsic correlation is transformed into a statistically significant positive extrinsic correlation by AMO on the 61-86-year timescale. We propose a new mechanism for the Sun-monsoon connection which operates through AMO by changes in the summer (June-September; JJAS) meridional gradient of tropospheric temperatures (ΔTTJJAS). There is a negative (positive) intrinsic correlation between ΔTTJJAS (AMO) and TSI. The negative intrinsic correlation between ΔTTJJAS and TSI indicates that high (low) solar activity weakens (strengthens) the meridional gradient of tropospheric temperature during the summer monsoon season, and subsequently the weak (strong) ΔTTJJAS decreases (increases) the ISMR. However, the presence of AMO transforms the negative intrinsic relation between ΔTTJJAS and TSI into a positive extrinsic one and strengthens the ISMR. We conclude that the positive relation between ISMR and solar activity, as found by other authors, is mainly due to the effect of AMO on ISMR.
Yoon, Hyun Jung; Chung, Myung Jin; Hwang, Hye Sun; Moon, Jung Won; Lee, Kyung Soo
2015-01-01
To assess the performance of adaptive statistical iterative reconstruction (ASIR)-applied ultra-low-dose CT (ULDCT) in detecting small lung nodules. Thirty patients underwent both ULDCT and standard dose CT (SCT). After determining the reference standard nodules, five observers, blinded to the reference standard reading results, independently evaluated SCT and both subsets of ASIR- and filtered back projection (FBP)-driven ULDCT images. Data assessed by observers were compared statistically. Converted effective doses in SCT and ULDCT were 2.81 ± 0.92 and 0.17 ± 0.02 mSv, respectively. A total of 114 lung nodules were detected on SCT as a standard reference. There was no statistically significant difference in sensitivity between ASIR-driven ULDCT and SCT for three out of the five observers (p = 0.678, 0.735, < 0.01, 0.038, and 0.868 for observers 1, 2, 3, 4, and 5, respectively). The sensitivity of FBP-driven ULDCT was significantly lower than that of ASIR-driven ULDCT in three out of the five observers (p < 0.01 for three observers, and p = 0.064 and 0.146 for two observers). In jackknife alternative free-response receiver operating characteristic analysis, the mean values of figure-of-merit (FOM) for FBP, ASIR-driven ULDCT, and SCT were 0.682, 0.772, and 0.821, respectively, and there were no significant differences in FOM values between ASIR-driven ULDCT and SCT (p = 0.11), but the FOM value of FBP-driven ULDCT was significantly lower than that of ASIR-driven ULDCT and SCT (p = 0.01 and 0.00). Adaptive statistical iterative reconstruction-driven ULDCT delivering a radiation dose of only 0.17 mSv offers acceptable sensitivity in nodule detection compared with SCT and has better performance than FBP-driven ULDCT.
Crew, Page E; Rhodes, Nathaniel J; O'Donnell, J Nicholas; Miglis, Cristina; Gilbert, Elise M; Zembower, Teresa R; Qi, Chao; Silkaitis, Christina; Sutton, Sarah H; Scheetz, Marc H
2018-03-01
The purpose of this single-center, ecologic study is to characterize the relationship between facility-wide (FacWide) antibiotic consumption and incident health care facility-onset Clostridium difficile infection (HO-CDI). FacWide antibiotic consumption and incident HO-CDI were tallied on a monthly basis and standardized, from January 2013 through April 2015. Spearman rank-order correlation coefficients were calculated using matched-months analysis and a 1-month delay. Regression analyses were performed, with P < .05 considered statistically significant. FacWide analysis identified a matched-months correlation between ceftriaxone and HO-CDI (ρ = 0.44, P = .018). A unit of stem cell transplant recipients did not have a significant correlation between carbapenems and HO-CDI in matched months (ρ = 0.37, P = .098), but a significant correlation was observed when a 1-month lag was applied (ρ = 0.54, P = .014). Three statistically significant lagged associations were observed between FacWide/unit-level antibiotic consumption and HO-CDI, and 1 statistically significant nonlagged association was observed FacWide. Consumption of antibiotic agents may therefore convey both immediate and prolonged ward-level risk for incident CDI. Additional studies are needed to investigate the immediate and delayed associations between antibiotic consumption and C difficile colonization, infection, and transmission at the hospital level.
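The matched-months and 1-month-lag correlations described above amount to the following computation, shown here on synthetic monthly series (the variable names and values are illustrative, not the study's data):

# Matched-months and lagged Spearman correlations on synthetic series.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
antibiotic_ddd = rng.gamma(5.0, 10.0, size=28)                   # monthly consumption (stand-in)
hocdi = 0.1 * np.roll(antibiotic_ddd, 1) + rng.normal(0, 1, 28)  # incident HO-CDI (stand-in)

rho_matched, p_matched = spearmanr(antibiotic_ddd, hocdi)
# 1-month lag: consumption in month t vs. infections in month t+1
rho_lag, p_lag = spearmanr(antibiotic_ddd[:-1], hocdi[1:])
print(rho_matched, p_matched, rho_lag, p_lag)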
Falivene, S; Pezzulla, D; Di Franco, R; Giugliano, F M; Esposito, E; Scoglio, C; Amato, B; Borzillo, V; D'Aiuto, M; Muto, P
2017-02-01
Bone metastases are a frequent complication of advanced oncologic disease. Pain associated with bone metastasis is a major cause of morbidity in cancer patients, especially in the elderly. The aim of this multicentric retrospective observational study is to evaluate the efficacy of different schedules of radiation therapy for pain relief in elderly patients. 206 patients over the age of 60 were enrolled over 1 year. Patients were treated with palliative intent for painful bone metastases. A pain intensity difference (PID) was found in 72% of patients and was statistically significant (p < 0.01). Pain intensity measured on a numeric rating scale was reduced significantly more (p < 0.05) by the one-fraction regimen than by the other two regimens. In recent years, numerous studies have evaluated the most appropriate fractionation regimen in individual cases; despite this, the best schedule is still debated. In our analysis, the single-fraction scheme (8 Gy) was confirmed to be significantly effective in reducing pain due to bone metastases. Radiation therapy provides significant pain relief for symptomatic bone metastases, but an appropriate radiotherapy schedule is needed in order to obtain a significant response to treatment. A multidisciplinary approach is warranted to weigh the therapeutic objectives against the patient's quality of life.
NASA Astrophysics Data System (ADS)
Matsuda, Takashi S.; Nakamura, Takuji; Ejiri, Mitsumu K.; Tsutsumi, Masaki; Shiokawa, Kazuo
2014-08-01
We have developed a new analysis method for obtaining the power spectrum in the horizontal phase velocity domain from airglow intensity image data to study atmospheric gravity waves. This method can deal with extensive amounts of imaging data obtained on different years and at various observation sites without bias caused by different event extraction criteria for the person processing the data. The new method was applied to sodium airglow data obtained in 2011 at Syowa Station (69°S, 40°E), Antarctica. The results were compared with those obtained from a conventional event analysis in which the phase fronts were traced manually in order to estimate horizontal characteristics, such as wavelengths, phase velocities, and wave periods. The horizontal phase velocity of each wave event in the airglow images corresponded closely to a peak in the spectrum. The statistical results of spectral analysis showed an eastward offset of the horizontal phase velocity distribution. This could be interpreted as the existence of wave sources around the stratospheric eastward jet. Similar zonal anisotropy was also seen in the horizontal phase velocity distribution of the gravity waves by the event analysis. Both methods produce similar statistical results about directionality of atmospheric gravity waves. Galactic contamination of the spectrum was examined by calculating the apparent velocity of the stars and found to be limited for phase speeds lower than 30 m/s. In conclusion, our new method is suitable for deriving the horizontal phase velocity characteristics of atmospheric gravity waves from an extensive amount of imaging data.
NASA Astrophysics Data System (ADS)
Bègue, Nelson; Mbatha, Nkanyiso; Bencherif, Hassan; Tato Loua, René; Sivakumar, Venkataraman; Leblanc, Thierry
2017-11-01
In this investigation a statistical analysis of the characteristics of mesospheric inversion layers (MILs) over tropical regions is presented. This study involves the analysis of 16 years of lidar observations recorded at Réunion (20.8° S, 55.5° E) and 21 years of lidar observations recorded at Mauna Loa (19.5° N, 155.6° W), together with SABER observations at these two locations. MILs appear in 10 and 9.3 % of the observed temperature profiles recorded by Rayleigh lidar at Réunion and Mauna Loa, respectively. The parameters defining MILs show a semi-annual cycle over the two selected sites, with maxima occurring near the equinoxes and minima occurring during the solstices. Over both sites, the maximum mean amplitude is observed in April and October, and this corresponds to a value greater than 35 K. According to lidar observations, the maximum and minimum mean base heights ranged from 79 to 80.5 km and from 76 to 77.5 km, respectively. The MILs at Réunion appear on average ~1 km thinner and ~1 km lower, with an amplitude ~2 K higher, than those at Mauna Loa. Generally, the statistical results for these two tropical locations as presented in this investigation are in fairly good agreement with previous studies. When compared to lidar measurements, on average SABER observations show MILs with greater amplitude, thickness, and base altitude, by 4 K, 0.75 km, and 1.1 km, respectively. Taking into account the temperature error of SABER in the mesosphere, it can therefore be concluded that the measurements obtained from lidar and SABER observations are in substantial agreement. The frequency spectrum analysis based on the lidar profiles and the 60-day averaged profile from SABER confirms the presence of the semi-annual oscillation, whose magnitude maximum is found to coincide with the height range of the temperature inversion zone. This connection between increases in the semi-annual component close to the inversion zone is in agreement with most previously reported studies over the tropics based on satellite observations. Results presented in this study confirm, through the use of the ground-based Rayleigh lidars at Réunion and Mauna Loa, that the semi-annual oscillation contributes to the formation of MILs over the tropical region.
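The detection of an inversion layer in a single temperature profile can be sketched as a scan for an altitude segment where temperature increases with height; this is an illustrative reconstruction under simplifying assumptions, not the authors' processing chain:

# Locate the largest temperature inversion (base, top, amplitude) in a profile.
import numpy as np

def find_inversion(z_km, t_k, min_amplitude=5.0):
    """Return (base_height, top_height, amplitude) of the largest inversion, or None."""
    dT = np.diff(t_k)
    best = None
    i = 0
    while i < len(dT):
        if dT[i] > 0:                      # start of a warming-with-height segment
            j = i
            while j < len(dT) and dT[j] > 0:
                j += 1
            amp = t_k[j] - t_k[i]          # temperature increase across the layer
            if amp >= min_amplitude and (best is None or amp > best[2]):
                best = (z_km[i], z_km[j], amp)
            i = j
        else:
            i += 1
    return best

z = np.linspace(70, 100, 61)                                   # altitude grid, km
t = 230 - 3 * (z - 70) + 30 * np.exp(-((z - 80) / 1.5) ** 2)   # synthetic profile with an MIL
print(find_inversion(z, t))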
Walsh, Daniel P.; Norton, Andrew S.; Storm, Daniel J.; Van Deelen, Timothy R.; Heisy, Dennis M.
2018-01-01
Implicit and explicit use of expert knowledge to inform ecological analyses is becoming increasingly common because it often represents the sole source of information in many circumstances. Thus, there is a need to develop statistical methods that explicitly incorporate expert knowledge, and can successfully leverage this information while properly accounting for associated uncertainty during analysis. Studies of cause-specific mortality provide an example of implicit use of expert knowledge when causes-of-death are uncertain and assigned based on the observer's knowledge of the most likely cause. To explicitly incorporate this use of expert knowledge and the associated uncertainty, we developed a statistical model for estimating cause-specific mortality using a data augmentation approach within a Bayesian hierarchical framework. Specifically, for each mortality event, we elicited the observer's belief of cause-of-death by having them specify the probability that the death was due to each potential cause. These probabilities were then used as prior predictive values within our framework. This hierarchical framework permitted a simple and rigorous estimation method that was easily modified to include covariate effects and regularizing terms. Although applied to survival analysis, this method can be extended to any event-time analysis with multiple event types, for which there is uncertainty regarding the true outcome. We conducted simulations to determine how our framework compared to traditional approaches that use expert knowledge implicitly and assume that cause-of-death is specified accurately. Simulation results supported the inclusion of observer uncertainty in cause-of-death assignment in modeling of cause-specific mortality to improve model performance and inference. Finally, we applied the statistical model we developed and a traditional method to cause-specific survival data for white-tailed deer, and compared results. We demonstrate that model selection results changed between the two approaches, and incorporating observer knowledge in cause-of-death increased the variability associated with parameter estimates when compared to the traditional approach. These differences between the two approaches can impact reported results, and therefore, it is critical to explicitly incorporate expert knowledge in statistical methods to ensure rigorous inference.
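A minimal sketch of the data-augmentation idea, under strong simplifications (a plain multinomial model with a Dirichlet prior on cause fractions, no covariates or survival component), might look as follows; all names and data are hypothetical:

# Gibbs sampler alternating between imputing each cause-of-death (weighted by
# the observer's elicited belief) and updating the cause fractions.
import numpy as np

rng = np.random.default_rng(2)
K = 3                                         # number of potential causes
elicited = rng.dirichlet(np.ones(K), size=50) # observer beliefs, one row per death (stand-in)
alpha = np.ones(K)                            # Dirichlet prior on cause fractions

pi = np.full(K, 1.0 / K)
draws = []
for it in range(2000):
    # impute cause-of-death: elicited belief times current cause fraction
    probs = elicited * pi
    probs /= probs.sum(axis=1, keepdims=True)
    causes = np.array([rng.choice(K, p=p) for p in probs])
    counts = np.bincount(causes, minlength=K)
    pi = rng.dirichlet(alpha + counts)        # conjugate Dirichlet update
    if it >= 500:                             # discard burn-in
        draws.append(pi)

print("posterior mean cause fractions:", np.mean(draws, axis=0))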
LaBudde, Robert A; Harnly, James M
2012-01-01
A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. The report describes the development and validation of studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis follow closely those of the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts for botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single collaborator and multicollaborative study examples are given.
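The basic observed statistic, the proportion of replicates identified, can be computed together with a Wilson score interval as follows; the counts are hypothetical and the interval choice is an assumption, not necessarily the one in the report:

# POI (probability of identification) with a Wilson score interval.
import math

def poi_wilson(identified, replicates, z=1.96):
    p = identified / replicates
    denom = 1 + z**2 / replicates
    center = (p + z**2 / (2 * replicates)) / denom
    half = z * math.sqrt(p * (1 - p) / replicates + z**2 / (4 * replicates**2)) / denom
    return p, center - half, center + half

print(poi_wilson(11, 12))   # e.g. 11 of 12 replicates identified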
The Statistical Value of Raw Fluorescence Signal in Luminex xMAP Based Multiplex Immunoassays
Breen, Edmond J.; Tan, Woei; Khan, Alamgir
2016-01-01
Tissue samples (plasma, saliva, serum or urine) from 169 patients classified as either normal or having one of seven possible diseases are analysed across three 96-well plates for the presence of 37 analytes using cytokine inflammation multiplexed immunoassay panels. Censoring of concentration data caused problems for analysis of the low-abundance analytes; using fluorescence responses instead of concentrations made these analytes analysable. Mixed-effects analysis on the resulting fluorescence and concentration responses reveals that the combination of censoring and mapping the fluorescence responses to concentration values, through a 5PL curve, changed the observed analyte concentrations. Simulation verifies this by showing a dependence of the observed analyte concentration levels on the mean fluorescence response and its distribution. Departures from normality in the fluorescence responses can lead to differences in concentration estimates and unreliable probabilities for treatment effects. When fluorescence responses are normally distributed, fluorescence-based t-tests have greater statistical power than the corresponding concentration-based t-tests. We add evidence that the fluorescence response, unlike concentration values, does not require censoring, and we show, with respect to differential analysis on the fluorescence responses, that background correction is not required.
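The 5PL mapping from concentration to fluorescence mentioned above, and its inverse (the step where censoring problems arise), can be written out as follows; the parameter values are illustrative only:

# Five-parameter logistic (5PL) curve and its inverse for back-calculation.
def logistic_5pl(x, a, d, c, b, g):
    """a = lower asymptote, d = upper asymptote, c = midpoint parameter,
    b = slope, g = asymmetry (g = 1 recovers the symmetric 4PL)."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

def inverse_5pl(y, a, d, c, b, g):
    """Back-calculate concentration from fluorescence (the censoring-prone step)."""
    return c * (((a - d) / (y - d)) ** (1.0 / g) - 1.0) ** (1.0 / b)

y = logistic_5pl(100.0, a=50, d=30000, c=500, b=1.2, g=0.8)
print(y, inverse_5pl(y, a=50, d=30000, c=500, b=1.2, g=0.8))  # round-trips to 100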
NASA Astrophysics Data System (ADS)
Erfanifard, Y.; Rezayan, F.
2014-10-01
Vegetation heterogeneity biases second-order summary statistics, e.g., Ripley's K-function, applied for spatial pattern analysis in ecology. Second-order investigation based on Ripley's K-function and related statistics (i.e., the L- and pair correlation function g) is widely used in ecology to develop hypotheses on underlying processes by characterizing spatial patterns of vegetation. The aim of this study was to demonstrate the effects of underlying heterogeneity of wild pistachio (Pistacia atlantica Desf.) trees on the second-order summary statistics of point pattern analysis in a part of the Zagros woodlands, Iran. The spatial distribution of 431 wild pistachio trees was accurately mapped in a 40 ha stand in the Wild Pistachio & Almond Research Site, Fars province, Iran. Three commonly used second-order summary statistics (i.e., the K-, L-, and g-functions) were applied to analyse their spatial pattern. The two-sample Kolmogorov-Smirnov goodness-of-fit test showed that the observed pattern significantly followed an inhomogeneous Poisson process null model in the study region. The results also showed that the heterogeneous pattern of wild pistachio trees biased the homogeneous forms of the K-, L-, and g-functions, indicating a stronger aggregation of the trees at scales of 0-50 m than actually existed, and an apparent aggregation at scales of 150-200 m where the trees were in fact regularly distributed. Consequently, we showed that heterogeneity of point patterns may bias the results of homogeneous second-order summary statistics, and we suggest applying inhomogeneous summary statistics with related null models for spatial pattern analysis of heterogeneous vegetation.
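For concreteness, a bare-bones homogeneous Ripley's K estimator (without the edge corrections or inhomogeneous variants a real analysis would use) looks like this:

# Naive Ripley's K: K(r) = (1/(lambda*n)) * number of ordered pairs closer than r.
import numpy as np

def ripley_k(points, r_values, area):
    """points: (n, 2) coordinates; area: study-region area."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    lam = n / area                                  # intensity (points per unit area)
    k = []
    for r in r_values:
        pairs = (d < r).sum() - n                   # exclude self-distances on the diagonal
        k.append(pairs / (lam * n))
    return np.array(k)

rng = np.random.default_rng(3)
pts = rng.uniform(0, 200, size=(431, 2))            # CSR stand-in on a 200 m square
print(ripley_k(pts, [10, 25, 50], area=200 * 200))  # ~ pi*r^2 under CSR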
Kurtuluş-Ulküer, M; Ulküer, U; Kesici, T; Menevşe, S
2002-09-01
In this study, the phenotype and allele frequencies of five enzyme systems were determined in a total of 611 unrelated Turkish individuals and analyzed using the exact test and the chi-square test. The following five red cell enzymes were identified by cellulose acetate electrophoresis: phosphoglucomutase (PGM), adenosine deaminase (ADA), phosphoglucose isomerase (PGI), adenylate kinase (AK), and 6-phosphogluconate dehydrogenase (6-PGD). The ADA, PGM and AK enzymes were found to be polymorphic in the Turkish population. The results of the statistical analysis showed that the phenotype frequencies of the five enzymes under study are in Hardy-Weinberg equilibrium. Statistical analysis was performed in order to examine whether there are significant differences in the phenotype frequencies between the Turkish population and four American population groups. This analysis showed that there are some statistically significant differences between the Turkish and the other groups. Moreover, the observed phenotype and allele frequencies were compared with those obtained in other population groups of Turkey.
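A Hardy-Weinberg check of the kind reported above reduces, for a two-allele system, to comparing observed genotype counts against expectations from the estimated allele frequency; the counts below are invented for illustration:

# Hardy-Weinberg chi-square test for a two-allele system such as ADA or AK.
from scipy.stats import chisquare

n_11, n_12, n_22 = 480, 110, 10          # hypothetical genotype counts
n = n_11 + n_12 + n_22
p = (2 * n_11 + n_12) / (2 * n)          # allele frequency by gene counting
q = 1 - p

expected = [n * p**2, n * 2 * p * q, n * q**2]
stat, pval = chisquare([n_11, n_12, n_22], expected, ddof=1)  # 1 df: allele freq estimated
print(p, stat, pval)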
The value of a statistical life: a meta-analysis with a mixed effects regression model.
Bellavance, François; Dionne, Georges; Lebeau, Martin
2009-03-01
The value of a statistical life (VSL) is a very controversial topic, but one which is essential to the optimization of governmental decisions. We see a great variability in the values obtained from different studies. The source of this variability needs to be understood, in order to offer public decision-makers better guidance in choosing a value and to set clearer guidelines for future research on the topic. This article presents a meta-analysis based on 39 observations obtained from 37 studies (from nine different countries) which all use a hedonic wage method to calculate the VSL. Our meta-analysis is innovative in that it is the first to use the mixed effects regression model [Raudenbush, S.W., 1994. Random effects models. In: Cooper, H., Hedges, L.V. (Eds.), The Handbook of Research Synthesis. Russel Sage Foundation, New York] to analyze studies on the value of a statistical life. We conclude that the variability found in the values studied stems in large part from differences in methodologies.
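In spirit, such a mixed-effects meta-regression can be sketched as follows; the variable names, covariate, and data are hypothetical placeholders, not the 39 observations analyzed in the article:

# Mixed-effects regression: VSL estimates with a random study effect.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(10)
study = np.repeat(list("ABCDEF"), 4)               # 6 studies, 4 estimates each (stand-in)
log_income = rng.normal(10.3, 0.4, size=24)
study_effect = dict(zip("ABCDEF", rng.normal(0, 0.2, 6)))
log_vsl = (5.0 + 1.0 * log_income
           + np.array([study_effect[s] for s in study])
           + rng.normal(0, 0.1, 24))

df = pd.DataFrame({"log_vsl": log_vsl, "log_income": log_income, "study": study})
res = smf.mixedlm("log_vsl ~ log_income", df, groups=df["study"]).fit()
print(res.params)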
Quantifying the Energy Landscape Statistics in Proteins - a Relaxation Mode Analysis
NASA Astrophysics Data System (ADS)
Cai, Zhikun; Zhang, Yang
Energy landscape, the hypersurface in the configurational space, has been a useful concept in describing complex processes that occur over a very long time scale, such as the multistep slow relaxations of supercooled liquids and folding of polypeptide chains into structured proteins. Despite extensive simulation studies, its experimental characterization still remains a challenge. To address this challenge, we developed a relaxation mode analysis (RMA) for liquids under a framework analogous to the normal mode analysis for solids. Using RMA, important statistics of the activation barriers of the energy landscape becomes accessible from experimentally measurable two-point correlation functions, e.g. using quasi-elastic and inelastic scattering experiments. We observed a prominent coarsening effect of the energy landscape. The results were further confirmed by direct sampling of the energy landscape using a metadynamics-like adaptive autonomous basin climbing computation. We first demonstrate RMA in a supercooled liquid when dynamical cooperativity emerges in the landscape-influenced regime. Then we show this framework reveals encouraging energy landscape statistics when applied to proteins.
Yue Xu, Selene; Nelson, Sandahl; Kerr, Jacqueline; Godbole, Suneeta; Patterson, Ruth; Merchant, Gina; Abramson, Ian; Staudenmayer, John; Natarajan, Loki
2018-04-01
Physical inactivity is a recognized risk factor for many chronic diseases. Accelerometers are increasingly used as an objective means to measure daily physical activity. One challenge in using these devices is missing data due to device nonwear. We used a well-characterized cohort of 333 overweight postmenopausal breast cancer survivors to examine missing data patterns of accelerometer outputs over the day. Based on these observed missingness patterns, we created pseudo-simulated datasets with realistic missing data patterns. We developed statistical methods to design imputation and variance-weighting algorithms to account for missing data effects when fitting regression models. Bias and precision of each method were evaluated and compared. Our results indicated that not accounting for missing data in the analysis yielded unstable estimates in the regression analysis. Incorporating variance weights and/or subject-level imputation improved precision by >50%, compared to ignoring missing data. We recommend that these simple, easy-to-implement statistical tools be used to improve analysis of accelerometer data.
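A drastically simplified sketch of the two ideas, subject-level imputation and completeness-based variance weights, is shown below; this is not the authors' algorithm, and the outcome model is invented for illustration:

# Impute missing days with the subject mean; weight subjects by completeness.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n_subj, n_days = 100, 7
activity = rng.normal(300, 60, (n_subj, n_days))          # minutes of activity (stand-in)
activity[rng.random((n_subj, n_days)) < 0.2] = np.nan     # ~20% nonwear days

observed = ~np.isnan(activity)
subj_mean = np.nanmean(activity, axis=1)
imputed = np.where(observed, activity, subj_mean[:, None])  # subject-level imputation

weights = observed.mean(axis=1)                             # completeness as precision proxy
bmi = 25 + 0.01 * (400 - subj_mean) + rng.normal(0, 1, n_subj)  # synthetic outcome
X = sm.add_constant(imputed.mean(axis=1))
print(sm.WLS(bmi, X, weights=weights).fit().params)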
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pasciuti, Katia, E-mail: k.pasciuti@virgilio.it; Kuthpady, Shrinivas; Anderson, Anne
To examine tumor and organ response when different radiotherapy planning techniques are used. Ten patients with confirmed bladder tumors were first treated using 3-dimensional conformal radiotherapy (3DCRT), and subsequently the original plans were re-optimized using intensity-modulated radiation treatment (IMRT) and volumetric-modulated arc therapy (VMAT) techniques. Target coverage in terms of conformity and homogeneity index, TCP, and organ dose limits, including integral dose analysis, were evaluated. In addition, MUs and treatment delivery times were compared. Better minimum target coverage (1.3%) was observed in VMAT plans when compared to 3DCRT and IMRT plans, confirmed by statistically significant conformity index (CI) results. Large differences were observed among techniques in the integral dose results for the femoral heads. Although no statistically significant differences were reported for rectum and tissue, a large amount of energy deposition was observed in 3DCRT plans. In any case, VMAT plans provided better organ and tissue sparing, confirmed also by the normal tissue complication probability (NTCP) analysis as well as a better tumor control probability (TCP) result. Our analysis showed better overall results for plans using VMAT techniques. Furthermore, the reduction in total treatment time observed among techniques, including gantry and collimator rotation, could encourage use of the more recent technique, reducing target movement and patient discomfort.
Radiation from quantum weakly dynamical horizons in loop quantum gravity.
Pranzetti, Daniele
2012-07-06
We provide a statistical mechanical analysis of quantum horizons near equilibrium in the grand canonical ensemble. By matching the description of the nonequilibrium phase in terms of weakly dynamical horizons with a local statistical framework, we implement loop quantum gravity dynamics near the boundary. The resulting radiation process provides a quantum gravity description of the horizon evaporation. For large black holes, the spectrum we derive presents a discrete structure which could be potentially observable.
Frans, Lonna M.; Helsel, Dennis R.
2005-01-01
Trends in nitrate concentrations in water from 474 wells in 17 subregions in the Columbia Basin Ground Water Management Area (GWMA) in three counties in eastern Washington were evaluated using a variety of statistical techniques, including the Friedman test and the Kendall test. The Kendall test was modified from its typical 'seasonal' version into a 'regional' version by using well locations in place of seasons. No statistically significant trends in nitrate concentrations were identified in samples from wells in the GWMA, the three counties, or the 17 subregions from 1998 to 2002 when all data were included in the analysis. For wells in which nitrate concentrations were greater than 10 milligrams per liter (mg/L), however, a significant downward trend of -0.4 mg/L per year was observed between 1998 and 2002 for the GWMA as a whole, as well as for Adams County (-0.35 mg/L per year) and for Franklin County (-0.46 mg/L per year). Trend analysis for a smaller but longer-term 51-well dataset in Franklin County found a statistically significant upward trend in nitrate concentrations of 0.1 mg/L per year between 1986 and 2003. The largest increase of nitrate concentrations occurred between 1986 and 1991. No statistically significant differences were observed in this dataset between 1998 and 2003 indicating that the increase in nitrate concentrations has leveled off.
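The Mann-Kendall test underlying the seasonal and 'regional' variants mentioned above can be written compactly as follows (without the tie correction a production analysis would include); the nitrate series is hypothetical:

# Mann-Kendall trend test: S statistic, normal approximation, two-sided p-value.
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    x = np.asarray(x)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0       # no tie correction in this sketch
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    return s, z, 2 * norm.sf(abs(z))               # two-sided p-value

nitrate = [12.1, 11.8, 11.5, 11.9, 11.2, 10.9, 10.6]  # hypothetical annual mg/L values
print(mann_kendall(nitrate))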
Lee, Yong Seuk; Lee, Sang Bok; Oh, Won Seok; Kwon, Yong Eok; Lee, Beom Koo
2016-01-01
The objectives of this study were (1) to evaluate the clinical and radiologic outcomes of open-wedge high tibial osteotomy, focusing on patellofemoral alignment, and (2) to search for correlations between variables and patellofemoral malalignment. A total of 46 knees (46 patients) from 32 females and 14 males who underwent open-wedge high tibial osteotomy were included in this retrospective case series. Outcomes were evaluated using clinical scales and radiologic parameters at the last follow-up. Pre-operative and final follow-up values were compared for the outcome analysis. For the focused analysis of the patellofemoral joint, correlation analyses were performed between patellofemoral variables and the pre- and post-operative weight-bearing line (WBL), clinical score, posterior slope, Blackburne-Peel ratio, lateral patellar tilt, lateral patellar shift, and congruence angle. The minimum follow-up period was 2 years and the median follow-up period was 44 months (range 24-88 months). The weight-bearing line percentage shifted from 17.2 ± 11.1% to 56.7 ± 12.7%, and this change was statistically significant (p < 0.01). Regarding the clinical results, statistical significance was observed for all scores (p < 0.01). In the radiologic evaluation, patellar descent was observed with statistical significance (p < 0.01). Lateral patellar tilt at the last follow-up was decreased with statistical significance (p < 0.01). In the correlation analysis between variables of patellofemoral malalignment, the pre-operative weight-bearing line showed an association with the change in lateral patellar tilt and lateral patellar shift (correlation coefficient: 0.3). After open-wedge high tibial osteotomy, clinical results showed improvement compared to pre-operative values. The patellar tilt and lateral patellar shift were not changed; however, descent of the patella was observed. Therefore, mild patellofemoral problems should not be a contraindication for open-wedge high tibial osteotomy. Case series, Level IV.
Econophysical visualization of Adam Smith’s invisible hand
NASA Astrophysics Data System (ADS)
Cohen, Morrel H.; Eliazar, Iddo I.
2013-02-01
Consider a complex system whose macrostate is statistically observable, but whose operating mechanism is an unknown black box. In this paper we address the problem of inferring, from the system’s macrostate statistics, the system’s intrinsic force yielding the observed statistics. The inference is established via two diametrically opposite approaches which result in the very same intrinsic force: a top-down approach based on the notion of entropy, and a bottom-up approach based on the notion of Langevin dynamics. The general results established are applied to the problem of visualizing the intrinsic socioeconomic force, Adam Smith’s invisible hand, shaping the distribution of wealth in human societies. Our analysis yields quantitative econophysical representations of figurative socioeconomic forces, quantitative definitions of “poor” and “rich”, and a quantitative characterization of the “poor-get-poorer” and the “rich-get-richer” phenomena.
NASA Astrophysics Data System (ADS)
Brizzi, S.; Sandri, L.; Funiciello, F.; Corbi, F.; Piromallo, C.; Heuret, A.
2018-03-01
The observed maximum magnitude of subduction megathrust earthquakes is highly variable worldwide. One key question is which conditions, if any, favor the occurrence of giant earthquakes (Mw ≥ 8.5). Here we carry out a multivariate statistical study in order to investigate the factors affecting the maximum magnitude of subduction megathrust earthquakes. We find that the trench-parallel extent of subduction zones and the thickness of trench sediments provide the largest discriminating capability between subduction zones that have experienced giant earthquakes and those having significantly lower maximum magnitude. Monte Carlo simulations show that the observed spatial distribution of giant earthquakes cannot be explained by pure chance to a statistically significant level. We suggest that the combination of a long subduction zone with thick trench sediments likely promotes a great lateral rupture propagation, characteristic of almost all giant earthquakes.
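A Monte Carlo test of this kind can be sketched as a permutation of which zones host giant earthquakes; the lengths and labels below are synthetic placeholders rather than the real subduction-zone data:

# Permutation test: is the mean trench length of giant-earthquake zones larger
# than expected if giant events were assigned to zones at random?
import numpy as np

rng = np.random.default_rng(5)
lengths = rng.lognormal(7.0, 0.6, size=30)      # trench-parallel lengths (km, stand-in)
has_giant = np.zeros(30, dtype=bool)
has_giant[np.argsort(lengths)[-6:]] = True      # pretend the 6 longest zones host Mw >= 8.5

observed = lengths[has_giant].mean()
null = np.array([lengths[rng.permutation(30)[:6]].mean() for _ in range(10000)])
p_value = (null >= observed).mean()
print(observed, p_value)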
A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.
Tang, Liansheng Larry; Balakrishnan, N
2011-01-01
The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.
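The fixed-sample-size Wilcoxon-Mann-Whitney comparison that the proposed random-sum statistic generalizes can be run as follows on synthetic marker values; note that U/(mn) estimates the area under the ROC curve:

# Wilcoxon-Mann-Whitney comparison of two groups with tie handling via SciPy.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(6)
diseased = rng.normal(1.0, 1.0, 40)             # hypothetical marker values
healthy = rng.normal(0.0, 1.0, 50)
u, p = mannwhitneyu(diseased, healthy, alternative="two-sided")
print(u / (len(diseased) * len(healthy)), p)    # U/(mn) estimates the AUC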
Statistical power analysis in wildlife research
Steidl, R.J.; Hayes, J.P.
1997-01-01
Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤ 0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
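A prospective power analysis of the kind recommended in point (1) can be as small as the normal-approximation sample-size formula below; the effect size, alpha, and power are of course study-specific choices:

# Normal-approximation sample size per group for a two-sample comparison.
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return 2 * ((z_a + z_b) / d) ** 2

print(n_per_group(0.5))   # ~63 per group for a 'medium' standardized effect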
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis
Lin, Johnny; Bentler, Peter M.
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra-Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra-Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
Statistics analysis of distribution of Bradysia Ocellaris insect on Oyster mushroom cultivation
NASA Astrophysics Data System (ADS)
Sari, Kurnia Novita; Amelia, Ririn
2015-12-01
Bradysia ocellaris is an insect pest of oyster mushroom cultivation. Its distribution follows a particular pattern that can be observed weekly, under the usual assumptions of independence, normality, and homogeneity. The number of B. ocellaris in each week is first examined through descriptive analysis. The distribution pattern is then described by the semivariogram, a plot of the variance of the differences between pairs of observations separated by a distance d. A spherical isotropic model was found to fit the B. ocellaris data best.
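The spherical semivariogram model reported as the best fit has a simple closed form, sketched below with illustrative nugget, sill, and range values (the study's fitted values are not reproduced here):

# Spherical semivariogram: gamma(d) rises as 1.5(d/a) - 0.5(d/a)^3 up to the
# range a, then flattens at the sill.
import numpy as np

def spherical_variogram(d, nugget, sill, rng_):
    d = np.asarray(d, dtype=float)
    h = np.minimum(d / rng_, 1.0)
    return nugget + (sill - nugget) * (1.5 * h - 0.5 * h**3)

print(spherical_variogram([0.5, 1.0, 2.0, 5.0], nugget=0.1, sill=1.0, rng_=2.0))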
NASA Astrophysics Data System (ADS)
Zhao, Runchen; Ientilucci, Emmett J.
2017-05-01
Hyperspectral remote sensing systems provide spectral data composed of hundreds of narrow spectral bands. Spectral remote sensing systems can be used, for example, to identify targets without physical interaction. Often it is of interest to characterize the spectral variability of targets or objects. The purpose of this paper is to identify and characterize the LWIR spectral variability of targets based on an improved earth-observing statistical performance model, known as the Forecasting and Analysis of Spectroradiometric System Performance (FASSP) model. FASSP contains three basic modules: a scene model, a sensor model, and a processing model. Instead of using only the mean surface reflectance as input to the model, FASSP transfers user-defined statistical characteristics of a scene through the image chain (i.e., from source to sensor). The radiative transfer model MODTRAN is used to simulate the radiative transfer based on user-defined atmospheric parameters. To retrieve class emissivity and temperature statistics, i.e., temperature/emissivity separation (TES), a LWIR atmospheric compensation method is necessary. The FASSP model has a method to transform statistics in the visible (i.e., ELM) but currently does not have a LWIR TES algorithm in place. This paper addresses the implementation of such a TES algorithm and its associated transformation of statistics.
Analysis of statistical misconception in terms of statistical reasoning
NASA Astrophysics Data System (ADS)
Maryati, I.; Priatna, N.
2018-05-01
Reasoning skills are needed by everyone in the globalization era, because every person must be able to manage and use the information that can now be obtained so easily from all over the world. Statistical reasoning skill is the ability to collect, group, process, and interpret information and to draw conclusions from it. Developing this skill can be done at various levels of education. However, this skill is often weak because many people, students included, assume that statistics is merely counting and applying formulas. Students also retain negative attitudes toward courses related to research. The purpose of this research is to analyze students' misconceptions in a descriptive statistics course in relation to statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by observing the effect of students' misconceptions on their statistical reasoning skill. The sample of this research was 32 students of a mathematics education department who had taken the descriptive statistics course. The mean value of the misconception test was 49.7 with a standard deviation of 10.6, whereas the mean value of the statistical reasoning skill test was 51.8 with a standard deviation of 8.5. Against a minimum value of 65 for standard achievement of course competence, the students' mean values fall below the standard. The misconception results highlight which subtopics need attention. Based on the assessment, students' misconceptions occur in: 1) writing mathematical sentences and symbols correctly, 2) understanding basic definitions, and 3) determining which concept to use in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistical format, 4) probability, 5) samples, and 6) association.
Statistical methods for quantitative mass spectrometry proteomic experiments with labeling.
Oberg, Ann L; Mahoney, Douglas W
2012-01-01
Mass spectrometry utilizing labeling allows multiple specimens to be analyzed simultaneously. As a result, between-experiment variability is reduced. Here we describe the use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis, how to assess the need for normalization, perform normalization, and check whether it worked, and how to build a model explaining the observed values and test for differential protein abundance, along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.
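The normalization step described above often amounts to median-centering log-transformed reporter intensities; the following sketch assumes a simple global median shift, which may differ from the normalization chosen in a given case study:

# Log-transform labeled reporter intensities, inspect channel medians, median-center.
import numpy as np

rng = np.random.default_rng(7)
intensities = rng.lognormal(10, 1, size=(500, 4))     # 500 peptides x 4 channels (stand-in)
log2_int = np.log2(intensities)

medians = np.median(log2_int, axis=0)
print("channel medians before:", medians)             # unequal medians suggest channel bias

normalized = log2_int - medians + medians.mean()      # median-center each channel
print("channel medians after:", np.median(normalized, axis=0))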
NASA Astrophysics Data System (ADS)
Kawahara, Hajime; Reese, Erik D.; Kitayama, Tetsu; Sasaki, Shin; Suto, Yasushi
2008-11-01
Our previous analysis indicates that small-scale fluctuations in the intracluster medium (ICM) from cosmological hydrodynamic simulations follow the lognormal probability density function. In order to test the lognormal nature of the ICM directly against X-ray observations of galaxy clusters, we develop a method of extracting statistical information about the three-dimensional properties of the fluctuations from the two-dimensional X-ray surface brightness. We first create a set of synthetic clusters with lognormal fluctuations around their mean profile given by spherical isothermal β-models, later considering polytropic temperature profiles as well. Performing mock observations of these synthetic clusters, we find that the resulting X-ray surface brightness fluctuations also follow the lognormal distribution fairly well. Systematic analysis of the synthetic clusters provides an empirical relation between the three-dimensional density fluctuations and the two-dimensional X-ray surface brightness. We analyze Chandra observations of the galaxy cluster Abell 3667, and find that its X-ray surface brightness fluctuations follow the lognormal distribution. While the lognormal model was originally motivated by cosmological hydrodynamic simulations, this is the first observational confirmation of the lognormal signature in a real cluster. Finally we check the synthetic cluster results against clusters from cosmological hydrodynamic simulations. As a result of the complex structure exhibited by simulated clusters, the empirical relation between the two- and three-dimensional fluctuation properties calibrated with synthetic clusters when applied to simulated clusters shows large scatter. Nevertheless we are able to reproduce the true value of the fluctuation amplitude of simulated clusters within a factor of 2 from their two-dimensional X-ray surface brightness alone. Our current methodology combined with existing observational data is useful in describing and inferring the statistical properties of the three-dimensional inhomogeneity in galaxy clusters.
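A minimal sketch of the projection idea, assuming statistically independent voxels (real ICM fluctuations are spatially correlated, so this is a strong simplification of the paper's synthetic clusters):

```python
import numpy as np
from scipy import stats

# Hedged sketch: draw a 3-D lognormal density field, project the
# emission measure (integral of n^2 along the line of sight), and check
# whether the resulting 2-D "surface brightness" is close to lognormal.
rng = np.random.default_rng(1)
sigma_ln = 0.3                                  # 3-D lognormal width (illustrative)
n = np.exp(rng.normal(-0.5 * sigma_ln**2, sigma_ln, size=(64, 64, 64)))

sx = (n**2).sum(axis=2)                         # X-ray-like projection
log_sx = np.log(sx / sx.mean())
# near-zero skewness of log S_x is consistent with a lognormal distribution
print("skewness of log S_x:", stats.skew(log_sx.ravel()))
```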
Kaneta, Tomohiro; Nakatsuka, Masahiro; Nakamura, Kei; Seki, Takashi; Yamaguchi, Satoshi; Tsuboi, Masahiro; Meguro, Kenichi
2016-01-01
SPECT is an important diagnostic tool for dementia. Recently, statistical analysis of SPECT has been commonly used in dementia research. In this study, we evaluated the accuracy of visual SPECT evaluation and/or statistical analysis for the diagnosis (Dx) of Alzheimer disease (AD) and other forms of dementia in our community-based study, "The Osaki-Tajiri Project." Eighty-nine consecutive outpatients with dementia were enrolled and underwent brain perfusion SPECT with 99mTc-ECD. Diagnostic accuracy of SPECT was tested using 3 methods: visual inspection (SPECT Dx), an automated diagnostic tool using statistical analysis with the easy Z-score imaging system (eZIS Dx), and visual inspection plus eZIS (integrated Dx). Integrated Dx showed the highest sensitivity, specificity, and accuracy, whereas eZIS was the second most accurate method. We also observed a higher-than-expected rate of false-negative SPECT images in cases of AD. Among these, 50% showed hypofrontality and were diagnosed as frontotemporal lobar degeneration. These cases typically showed regional "hot spots" in the primary sensorimotor cortex (i.e., a sensorimotor hot spot sign), which we determined were associated with AD rather than frontotemporal lobar degeneration. We concluded that diagnostic ability was improved by the integrated use of visual assessment and statistical analysis. In addition, detection of a sensorimotor hot spot sign was useful for detecting AD when hypofrontality was present and improved the ability to properly diagnose AD.
An Ultra-high Resolution Synthetic Precipitation Data for Ungauged Sites
NASA Astrophysics Data System (ADS)
Kim, Hong-Joong; Choi, Kyung-Min; Oh, Jai-Ho
2018-05-01
Despite the enormous damage caused by record heavy rainfall, the amount of precipitation in areas without observation points cannot be known precisely. One way to overcome this difficulty is to estimate meteorological data at ungauged sites. In this study, we have used observation data over the city of Seoul to calculate high-resolution (250-m) synthetic precipitation over a 10-year (2005-2014) period. Furthermore, three cases are analyzed by evaluating the rainfall intensity and performing statistical analysis over the 10-year period. In the case in which typhoon Meari passed along the west coast during 28-30 June 2011, the Pearson correlation coefficient was 0.93 for seven validation points, which implies that the temporal correlation between the observed and synthetic precipitation was very good. The time series of synthetic precipitation over this period almost completely matches the observed rainfall. On 28-29 June 2011, the estimate of continuous strong precipitation of 10-30 mm h-1 was correct. In addition, the synthetic precipitation closely follows the observed precipitation for all three cases. Statistical analysis of the 10 years of data reveals a very high correlation coefficient between synthetic and observed rainfall (0.86). Thus, the synthetic precipitation data show good agreement with the observations, and the 250-m resolution synthetic precipitation calculated in this study is useful as basic data in weather applications such as urban flood detection.
Rejection of Multivariate Outliers.
1983-05-01
…available in Gnanadesikan (1977). The motivation for the present investigation lies in a recent paper of Schwager and Margolin (1982), who derive a… References: Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York. Hawkins, D.M. (1980). Identification of Outliers. Chapman and Hall, London.
Providing peak river flow statistics and forecasting in the Niger River basin
NASA Astrophysics Data System (ADS)
Andersson, Jafet C. M.; Ali, Abdou; Arheimer, Berit; Gustafsson, David; Minoungou, Bernard
2017-08-01
Flooding is a growing concern in West Africa. Improved quantification of discharge extremes and associated uncertainties is needed to improve infrastructure design, and operational forecasting is needed to provide timely warnings. In this study, we use discharge observations, a hydrological model (Niger-HYPE), and extreme value analysis to estimate peak river flow statistics (e.g., the discharge magnitude with a 100-year return period) across the Niger River basin. To test the model's capacity to predict peak flows, we compared 30-year maximum discharge and peak flow statistics derived from the model with those derived from nine observation stations. The results indicate that the model simulates peak discharge reasonably well (on average +20%). However, the peak flow statistics have a large uncertainty range, which ought to be considered in infrastructure design. We then applied the methodology to derive basin-wide maps of peak flow statistics and their associated uncertainty. The results indicate that the method is applicable across the hydrologically active part of the river basin, and that the uncertainty varies substantially depending on location. Subsequently, we used the most recent bias-corrected climate projections to analyze potential changes in peak flow statistics in a changed climate. The results are generally ambiguous, with consistent changes in only very few areas. To test the forecasting capacity, we ran Niger-HYPE with a combination of meteorological data sets for the 2008 high-flow season and compared with observations. The results indicate reasonable forecasting capacity (on average 17% deviation), but additional years should also be evaluated. We finish by presenting a strategy and pilot project to develop an operational flood monitoring and forecasting system based on in-situ data, earth observations, modelling, and extreme statistics. In this way we aim to build capacity to ultimately improve resilience to floods, protecting lives and infrastructure in the region.
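The extreme value step can be sketched as follows; the discharge series is synthetic rather than Niger-HYPE output, and the GEV fit is a generic choice, not necessarily the authors' exact estimator.

```python
import numpy as np
from scipy.stats import genextreme

# Hedged sketch: fit a GEV distribution to annual maximum discharge and
# read off the 100-year return level. Data are simulated for illustration.
rng = np.random.default_rng(2)
annual_max = genextreme.rvs(c=-0.1, loc=800, scale=200, size=30,
                            random_state=rng)

c, loc, scale = genextreme.fit(annual_max)
q100 = genextreme.ppf(1 - 1.0 / 100, c, loc=loc, scale=scale)
print(f"estimated 100-year discharge: {q100:.0f} m^3/s")
```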
Regression analysis of mixed recurrent-event and panel-count data with additive rate models.
Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L
2015-03-01
Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. © 2014, The International Biometric Society.
Clinical applications of a quantitative analysis of regional left ventricular wall motion
NASA Technical Reports Server (NTRS)
Leighton, R. F.; Rich, J. M.; Pollack, M. E.; Altieri, P. I.
1975-01-01
Observations that may have clinical application are summarized. They were obtained from a quantitative analysis of wall motion used to detect both hypokinesis and tardokinesis in left ventricular cineangiograms. The method was based on statistical comparisons with normal values for regional wall motion derived from the cineangiograms of patients found not to have heart disease.
[Changes in cerebrospinal fluid in patients with tuberculosis of the central nervous system].
Jedrychowski, Michał; Garlicki, Aleksander
2008-01-01
The aim of the study was to analyze cerebrospinal fluid parameters in patients with tuberculosis of the central nervous system confirmed by culture or molecular methods, in comparison to patients without such confirmation. Medical records of 13 patients with CNS tuberculosis (10 male, 3 female) hospitalized at the Clinic of Infectious Diseases in Kraków between 2001 and 2006 were analyzed. The following cerebrospinal fluid parameters were taken into account in both groups of patients: cytology and protein, glucose, and chloride concentrations. Statistical analysis was done using the non-parametric Mann-Whitney U test. The only parameter for which a statistically significant difference between the two groups was found was the CSF glucose level (p<0.05). Lower glucose concentration was observed in the group with etiologically confirmed CNS tuberculosis. Moreover, additional localizations of tuberculosis were observed in this group of patients. The introduction of molecular methods into diagnosis made it possible to detect the etiologic agent more often.
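The group comparison reported above can be sketched as follows; the glucose values are invented for illustration, not the study's data.

```python
from scipy.stats import mannwhitneyu

# Hedged sketch: non-parametric Mann-Whitney U test of CSF glucose
# between confirmed and unconfirmed groups. Values below are hypothetical.
glucose_confirmed = [1.4, 1.8, 2.0, 1.1, 1.6, 2.2]        # mmol/L, invented
glucose_unconfirmed = [2.8, 3.1, 2.5, 3.4, 2.9, 2.6, 3.0]  # mmol/L, invented

u, p = mannwhitneyu(glucose_confirmed, glucose_unconfirmed,
                    alternative="two-sided")
print(f"U = {u}, p = {p:.3f}")
```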
Aftershock identification problem via the nearest-neighbor analysis for marked point processes
NASA Astrophysics Data System (ADS)
Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.
2007-12-01
The centennial observations of world seismicity have revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide the most reliable information about earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on the hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in the space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time-oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from the ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.
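A hedged sketch of the Baiesi-Paczuski proximity the abstract refers to, on a toy one-dimensional synthetic catalog; the fractal dimension df and b-value below are typical choices, not fitted parameters.

```python
import numpy as np

# Hedged sketch: eta_ij = t_ij * (r_ij)^df * 10^(-b*m_i) for a parent i
# preceding child j; the parent is the event minimizing eta. Toy catalog.
df, b = 1.6, 1.0
t = np.array([0.0, 1.0, 1.5, 4.0])    # event times (days)
x = np.array([0.0, 0.5, 0.3, 10.0])   # epicenter coordinate (km), 1-D toy
m = np.array([5.0, 3.0, 2.5, 4.0])    # magnitudes

for j in range(1, len(t)):
    i = np.arange(j)                  # candidate parents (earlier events)
    dt = t[j] - t[i]
    r = np.abs(x[j] - x[i]) + 1e-6    # 1-D distance for illustration
    eta = dt * r**df * 10.0**(-b * m[i])
    parent = i[np.argmin(eta)]
    print(f"event {j}: nearest neighbor (parent) = event {parent}")
```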
Non-linear dielectric spectroscopy of microbiological suspensions
Treo, Ernesto F; Felice, Carmelo J
2009-01-01
Background Non-linear dielectric spectroscopy (NLDS) of microorganisms is characterized by the generation of harmonics in the polarization current when a microorganism suspension is exposed to a sinusoidal electric field. The biological non-linear response initially described was not well verified by other authors, and the results were susceptible to ambiguous interpretation. In this paper NLDS was performed on yeast suspensions in tripolar and tetrapolar configurations with a recently developed analyzer. Methods Tripolar analysis was carried out by applying sinusoidal voltages up to 1 V at the electrode interface. Tetrapolar analysis was carried out with sinusoidal field strengths from 0.1 V cm-1 to 70 V cm-1. Both analyses were performed within a frequency range from 1 Hz through 100 Hz. The harmonic amplitudes were Fourier-analyzed and expressed in dB. The third harmonic, as reported previously, was investigated. Statistical analysis (ANOVA) was used to test the effect of an inhibitor and an activator of the plasma membrane enzyme on the measured response. Results No significant non-linearities were observed in tetrapolar analysis, and no observable changes occurred when the inhibitor and activator were added to the suspension. Statistical analysis confirmed these results. When a pure sinusoidal voltage was applied to an electrode-yeast suspension interface, variations higher than 25 dB in the 3rd harmonic were observed. Variations higher than 20 dB in the 3rd harmonic were also found when adding an inhibitor or activator of the membrane-bound enzymes. These variations did not occur when the suspension was boiled. Discussion The lack of results in tetrapolar cells suggests that there is little, if any, harmonic generation in the bulk microbiological suspension. The non-linear response observed originated at the electrode-electrolyte interface. The frequency and voltage windows observed in previous tetrapolar analyses were reproduced in the tripolar measurements, but maxima were not observed at the same values. Conclusion Contrary to previous assertions, no repeatable dielectric non-linearity was exhibited by the bulk suspensions tested under the field and frequency conditions reported with this recently designed analyzer. Interface-related harmonics, however, were observed and monitored during biochemical stimuli, and the changes were coherent with the expected biological response. PMID:19772595
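The harmonic measurement can be sketched as follows; the "measured" current is simulated, and the signal parameters are arbitrary rather than the analyzer's settings.

```python
import numpy as np

# Hedged sketch: excite at f0, Fourier-analyze the response, and express
# the 3rd-harmonic amplitude in dB relative to the fundamental.
fs, f0, n = 10_000, 50, 10_000
t = np.arange(n) / fs
current = np.sin(2 * np.pi * f0 * t) + 0.01 * np.sin(2 * np.pi * 3 * f0 * t)

spec = np.abs(np.fft.rfft(current))
freqs = np.fft.rfftfreq(n, d=1 / fs)
fund = spec[np.argmin(np.abs(freqs - f0))]
third = spec[np.argmin(np.abs(freqs - 3 * f0))]
print(f"3rd harmonic: {20 * np.log10(third / fund):.1f} dB re fundamental")
```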
Golder, Su; Loke, Yoon K.; Bland, Martin
2011-01-01
Background There is considerable debate as to the relative merits of using randomised controlled trial (RCT) data as opposed to observational data in systematic reviews of adverse effects. This meta-analysis of meta-analyses aimed to assess the level of agreement or disagreement in the estimates of harm derived from meta-analysis of RCTs as compared to meta-analysis of observational studies. Methods and Findings Searches were carried out in ten databases in addition to reference checking, contacting experts, citation searches, and hand-searching key journals, conference proceedings, and Web sites. Studies were included where a pooled relative measure of an adverse effect (odds ratio or risk ratio) from RCTs could be directly compared, using the ratio of odds ratios, with the pooled estimate for the same adverse effect arising from observational studies. Nineteen studies, yielding 58 meta-analyses, were identified for inclusion. The pooled ratio of odds ratios of RCTs compared to observational studies was estimated to be 1.03 (95% confidence interval 0.93–1.15). There was less discrepancy with larger studies. The symmetric funnel plot suggests that there is no consistent difference between risk estimates from meta-analysis of RCT data and those from meta-analysis of observational studies. In almost all instances, the estimates of harm from meta-analyses of the different study designs had 95% confidence intervals that overlapped (54/58, 93%). In terms of statistical significance, in nearly two-thirds (37/58, 64%), the results agreed (both studies showing a significant increase or significant decrease or both showing no significant difference). In only one meta-analysis about one adverse effect was there opposing statistical significance. Conclusions Empirical evidence from this overview indicates that there is no difference on average in the risk estimate of adverse effects of an intervention derived from meta-analyses of RCTs and meta-analyses of observational studies. This suggests that systematic reviews of adverse effects should not be restricted to specific study types. PMID:21559325
Mounting ground sections of teeth: Cyanoacrylate adhesive versus Canada balsam
Vangala, Manogna RL; Rudraraju, Amrutha; Subramanyam, RV
2016-01-01
Introduction: Hard tissues can be studied either by decalcification or by preparing ground sections. Various mounting media have been tried and used for ground sections of teeth. However, there are very few studies on the use of cyanoacrylate adhesive as a mounting medium. Aims: The aim of our study was to evaluate the efficacy of cyanoacrylate adhesive (Fevikwik™) as a mounting medium for ground sections of teeth and to compare these ground sections with those mounted with Canada balsam. Materials and Methods: Ground sections were prepared from twenty extracted teeth. Each section was divided into two halves and mounted on one slide, one half with cyanoacrylate adhesive (Fevikwik™) and the other with Canada balsam. Scoring for various features in the ground sections was done by two independent observers. Statistical Analysis Used: Statistical analysis using Student's t-test (unpaired) of average scores was performed for each feature observed. Results: No statistically significant difference was found between the two media for most of the features. However, cyanoacrylate was found to be better than Canada balsam for observing striae of Retzius (P < 0.0205), enamel lamellae (P < 0.036), dentinal tubules (P < 0.0057), interglobular dentin (P < 0.0001), sclerotic dentin under transmitted light (P < 0.00001), sclerotic dentin under polarized light (P < 0.0002), and Sharpey's fibers (P < 0.0004). Conclusions: This initial study shows that cyanoacrylate is better than Canada balsam for observing certain features of ground sections of teeth. However, it remains to be seen whether it will be useful for studying undecalcified sections of carious teeth and for soft tissue sections. PMID:27194857
Analysis of the dependence of extreme rainfalls
NASA Astrophysics Data System (ADS)
Padoan, Simone; Ancey, Christophe; Parlange, Marc
2010-05-01
The aim of spatial analysis is to quantitatively describe the behavior of environmental phenomena such as precipitation levels, wind speed, or daily temperatures. A number of generic approaches to spatial modeling have been developed [1], but these are not necessarily ideal for handling extremal aspects given their focus on mean process levels. The areal modelling of the extremes of a natural process observed at points in space is important in environmental statistics; for example, understanding extremal spatial rainfall is crucial in flood protection. In light of recent concerns over climate change, the use of robust mathematical and statistical methods for such analyses has grown in importance. Multivariate extreme value models and the class of max-stable processes [2] have a similar asymptotic motivation to the univariate generalized extreme value (GEV) distribution, but provide a general approach to modeling extreme processes that incorporates temporal or spatial dependence. Statistical methods for max-stable processes and data analyses of practical problems are discussed by [3] and [4]. This work illustrates methods for the statistical modelling of spatial extremes and gives examples of their use through an analysis of extreme precipitation levels in Switzerland. [1] Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York. [2] de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, USA. [3] Padoan, S. A., Ribatet, M. and Sisson, S. A. (2009). Likelihood-Based Inference for Max-Stable Processes. Journal of the American Statistical Association, Theory & Methods. In press. [4] Davison, A. C. and Gholamrezaee, M. (2009). Geostatistics of extremes. Journal of the Royal Statistical Society, Series B. To appear.
Ames Research Center SR&T program and earth observations
NASA Technical Reports Server (NTRS)
Poppoff, I. G.
1972-01-01
An overview is presented of the research activities in earth observations at Ames Research Center. Most of the tasks involve the use of research aircraft platforms. The program is also directed toward the use of the ILLIAC IV computer for statistical analysis. Most tasks are weighted toward Pacific coast and Pacific basin problems, with emphasis on water applications, air applications, animal migration studies, and geophysics.
The influence of ENSO, PDO and PNA on secular rainfall variations in Hawai‘i
Abby G. Frazier; Oliver Elison Timm; Thomas W. Giambelluca; Henry F. Diaz
2017-01-01
Over the last century, significant declines in rainfall across the state of Hawai‘i have been observed, and it is unknown whether these declines are due to natural variations in climate or manifestations of human-induced climate change. Here, a statistical analysis of the observed rainfall variability was applied as a first step towards better understanding causes for...
ERIC Educational Resources Information Center
Froelich, Amy G.; Nettleton, Dan
2013-01-01
In this article, we present a study to test whether neutral observers perceive a resemblance between a parent and a child. We demonstrate the general approach for two separate parent/child pairs using survey data collected from introductory statistics students serving as neutral observers. We then present ideas for incorporating the study design…
NASA Astrophysics Data System (ADS)
Evans, Ian; Primini, Francis A.; Glotfelty, Kenny J.; Anderson, Craig S.; Bonaventura, Nina R.; Chen, Judy C.; Davis, John E.; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Galle, Elizabeth C.; Gibbs, Danny G., II; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; He, Xiang Qun (Helen); Houck, John C.; Karovska, Margarita; Kashyap, Vinay L.; Lauer, Jennifer; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph B.; Mitschang, Arik W.; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Nowak, Michael A.; Plummer, David A.; Refsdal, Brian L.; Rots, Arnold H.; Siemiginowska, Aneta L.; Sundheim, Beth A.; Tibbetts, Michael S.; van Stone, David W.; Winkelman, Sherry L.; Zografou, Panagoula
2009-09-01
The first release of the Chandra Source Catalog (CSC) was published in 2009 March, and includes information about 94,676 X-ray sources detected in a subset of public ACIS imaging observations from roughly the first eight years of the Chandra mission. This release of the catalog includes point and compact sources with observed spatial extents <~30''. The CSC is a general-purpose virtual X-ray astrophysics facility that provides access to a carefully selected set of generally useful quantities for individual X-ray sources, and is designed to satisfy the needs of a broad-based group of scientists, including those who may be less familiar with astronomical data analysis in the X-ray regime. The catalog (1) provides access to the best estimates of the X-ray source properties for detected sources, with good scientific fidelity, and directly supports medium-sophistication scientific analysis using the individual source data; (2) facilitates analysis of a wide range of statistical properties for classes of X-ray sources; (3) provides efficient access to calibrated observational data and ancillary data products for individual X-ray sources, so that users can perform detailed further analysis using existing tools; and (4) includes real X-ray sources detected with flux significance greater than a predefined threshold, while maintaining the number of spurious sources at an acceptable level. For each detected X-ray source, the CSC provides commonly tabulated quantities, including source position, extent, multi-band fluxes, hardness ratios, and variability statistics, derived from the observations in which the source is detected. In addition to these traditional catalog elements, for each X-ray source the CSC includes an extensive set of file-based data products that can be manipulated interactively, including source images, event lists, light curves, and spectra from each observation in which a source is detected.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Spencer; Rodrigues, George, E-mail: george.rodrigues@lhsc.on.ca; Department of Epidemiology/Biostatistics, University of Western Ontario, London
2013-01-01
Purpose: To perform a rigorous technological assessment and statistical validation of a software technology for anatomic delineations of the prostate on MRI datasets. Methods and Materials: A 3-phase validation strategy was used. Phase I consisted of anatomic atlas building using 100 prostate cancer MRI data sets to provide training data sets for the segmentation algorithms. In phase II, 2 experts contoured 15 new MRI prostate cancer cases using 3 approaches (manual, N points, and region of interest). In phase III, 5 new physicians with variable MRI prostate contouring experience segmented the same 15 phase II datasets using 3 approaches: manual, N points with no editing, and full autosegmentation with user editing allowed. Statistical analyses for time and accuracy (using the Dice similarity coefficient) endpoints used traditional descriptive statistics, analysis of variance, analysis of covariance, and pooled Student t test. Results: In phase I, average (SD) total and per-slice contouring times for the 2 physicians were 228 (75), 17 (3.5), 209 (65), and 15 (3.9) seconds, respectively. In phase II, statistically significant differences in physician contouring time were observed based on physician, type of contouring, and case sequence. The N points strategy resulted in superior segmentation accuracy when initial autosegmented contours were compared with final contours. In phase III, statistically significant differences in contouring time were again observed based on physician, type of contouring, and case sequence. The average relative time savings for N points and autosegmentation were 49% and 27%, respectively, compared with manual contouring. The N points and autosegmentation strategies resulted in average Dice values of 0.89 and 0.88, respectively. Pre- and post-edited autosegmented contours demonstrated a higher average Dice similarity coefficient of 0.94. Conclusion: The software provided robust contours with minimal editing required. Time savings were observed for all physicians irrespective of experience level and baseline manual contouring speed.
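The accuracy endpoint above is the Dice similarity coefficient, DSC = 2|A∩B|/(|A|+|B|); a minimal sketch on random masks (in the study these would be physician contours) follows.

```python
import numpy as np

# Hedged sketch: Dice similarity coefficient between two binary masks.
rng = np.random.default_rng(3)
a = rng.random((128, 128)) > 0.5
b = a.copy()
b[:8] = ~b[:8]                      # perturb one mask slightly

dice = 2 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
print(f"Dice similarity coefficient: {dice:.3f}")
```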
Application of econometric and ecology analysis methods in physics software
NASA Astrophysics Data System (ADS)
Han, Min Cheol; Hoff, Gabriela; Kim, Chan Hyeong; Kim, Sung Hun; Grazia Pia, Maria; Ronchieri, Elisabetta; Saracco, Paolo
2017-10-01
Some data analysis methods typically used in econometric studies and in ecology have been evaluated and applied in physics software environments. They concern the evolution of observables through objective identification of change points and trends, and measurements of inequality, diversity, and evenness across a data set. Within each analysis area, various statistical tests and measures have been examined. This conference paper gives a brief overview of some of these methods.
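One of the inequality measures the abstract alludes to is the Gini coefficient; a minimal sketch on an arbitrary sample, not one of the paper's actual data sets:

```python
import numpy as np

# Hedged sketch: Gini coefficient of inequality over a data set.
def gini(values):
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    # Gini = sum_i (2i - n - 1) * v_i / (n * sum(v)) for sorted v (1-indexed)
    idx = np.arange(1, n + 1)
    return np.sum((2 * idx - n - 1) * v) / (n * v.sum())

print(gini([1, 1, 2, 3, 10]))    # unequal sample -> closer to 1
print(gini([5, 5, 5, 5, 5]))     # perfectly equal -> 0
```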
Rooban, T; Joshua, Elizabeth; Rao, Umadevi K; Ranganathan, K
2012-01-01
Tobacco use is reported to be rampant in urban slums in developing countries. Demographic variations in tobacco use between males living in urban slums and those living in non-slum areas in India have not been reported, and this study was undertaken to address this issue. Secondary data analysis of the National Family Health Survey-3 (NFHS-3) was undertaken to study demographic variations in tobacco use between urban slum dwellers and non-slum dwellers in eight Indian cities. Demographic determinants of the use of smoking and chewing forms of tobacco in the two groups were analyzed. SPSS version 16.0 (SPSS Inc., IL, USA) was used for statistical analysis. The study population comprised 6887 (41.8%) males from slum areas and 9588 (58.2%) from non-slum areas of eight urban cities. Cigarette/beedi smoking was the commonest form of tobacco use in the study population. Pan masala was the least common form of smokeless tobacco use, next only to snuff. High statistical significance was observed for the various demographic parameters studied in both slum- and non-slum-dwelling males. However, on comparing the two groups, statistical significance at P≤.001 was observed for age (15-49), secondary education, religion, household structure, and marital status. The difference between the two groups in the mean number of cigarettes/beedis smoked was not statistically significant (P=.598). Male slum dwellers are a distinct urban population whose health needs assessment requires a different approach than that for non-slum dwellers, who often can afford the services that an urban Indian city can offer.
Machine learning for neuroimaging with scikit-learn.
Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël
2014-01-01
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
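A minimal sketch of the decoding setting described above, using scikit-learn itself: predict a binary condition label from synthetic "voxel" data with a linear SVM and cross-validation. This is a generic illustration, not a specific analysis from the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Hedged sketch: supervised decoding with cross-validated accuracy.
rng = np.random.default_rng(4)
X = rng.normal(size=(80, 500))       # 80 "scans" x 500 "voxels" (synthetic)
y = rng.integers(0, 2, size=80)      # binary condition labels
X[y == 1, :10] += 0.8                # inject a weak signal

scores = cross_val_score(LinearSVC(dual=False), X, y, cv=5)
print("decoding accuracy:", scores.mean())
```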
Do climate extreme events foster violent civil conflicts? A coincidence analysis
NASA Astrophysics Data System (ADS)
Schleussner, Carl-Friedrich; Donges, Jonathan F.; Donner, Reik V.
2014-05-01
Civil conflicts promoted by adverse environmental conditions represent one of the most important potential feedbacks in the global socio-environmental nexus. While the role of climate extremes as a triggering factor is often discussed, no consensus has yet been reached about the cause-and-effect relation in the observed data record. Here we present results of a rigorous statistical coincidence analysis based on the Munich Re Inc. extreme events database and the Uppsala conflict data program. We report evidence for statistically significant synchronicity between climate extremes with high economic impact and violent conflicts for various regions, although no coherent global signal emerges from our analysis. Our results indicate the importance of regional vulnerability and might aid in identifying hot-spot regions for potential climate-triggered violent social conflicts.
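An event-coincidence analysis of this kind can be sketched as follows; the event times are synthetic and the tolerance window is an arbitrary choice, not the authors' exact settings.

```python
import numpy as np

# Hedged sketch: count conflict onsets within a window after climate
# extremes, then assess significance by shuffling the conflict series.
rng = np.random.default_rng(5)
extremes = np.sort(rng.uniform(0, 1000, 40))    # extreme-event days (synthetic)
conflicts = np.sort(rng.uniform(0, 1000, 30))   # conflict-onset days (synthetic)
window = 30.0                                   # days (illustrative)

def coincidences(a, b, w):
    # number of events in b preceded by an event in a within w
    return sum(np.any((t - a >= 0) & (t - a <= w)) for t in b)

obs = coincidences(extremes, conflicts, window)
null = [coincidences(extremes, np.sort(rng.uniform(0, 1000, 30)), window)
        for _ in range(999)]
p = (1 + sum(n >= obs for n in null)) / 1000
print(f"observed coincidences: {obs}, permutation p = {p:.3f}")
```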
Evaluating statistical cloud schemes: What can we gain from ground-based remote sensing?
NASA Astrophysics Data System (ADS)
Grützun, V.; Quaas, J.; Morcrette, C. J.; Ament, F.
2013-09-01
Statistical cloud schemes with prognostic probability distribution functions have become more important in atmospheric modeling, especially since they are in principle scale adaptive and capture cloud physics in more detail. While in theory the schemes have great potential, their accuracy is still questionable. High-resolution three-dimensional observational data of water vapor and cloud water, which could be used for testing them, are missing. We explore the potential of ground-based remote sensing such as lidar, microwave, and radar to evaluate prognostic distribution moments using the "perfect model approach." This means that we employ a high-resolution weather model as virtual reality and retrieve full three-dimensional atmospheric quantities and virtual ground-based observations. We then use statistics from the virtual observations to validate the modeled 3-D statistics. Since the data are entirely consistent, any discrepancy that occurs is due to the method. Focusing on total water mixing ratio, we find that the mean mixing ratio can be evaluated reasonably well, but whether the variance and skewness are reliable depends strongly on the meteorological conditions. Using a simple schematic description of different synoptic conditions, we show how statistics obtained from point or line measurements can be poor at representing the full three-dimensional distribution of water in the atmosphere. We argue that a careful analysis of measurement data and detailed knowledge of the meteorological situation are necessary to judge whether the data can be used to evaluate the higher moments of the humidity distribution used by a statistical cloud scheme.
NASA Astrophysics Data System (ADS)
Choquet, Élodie; Pueyo, Laurent; Soummer, Rémi; Perrin, Marshall D.; Hagan, J. Brendan; Gofas-Salas, Elena; Rajan, Abhijith; Aguilar, Jonathan
2015-09-01
The ALICE program, for Archival Legacy Investigation of Circumstellar Environment, is currently conducting a virtual survey of about 400 stars by re-analyzing the HST-NICMOS coronagraphic archive with advanced post-processing techniques. We present here the strategy that we adopted to identify detections and potential candidates for follow-up observations, and we give a preliminary overview of our detections. We present a statistical analysis conducted to evaluate the confidence level of these detections and the completeness of our candidate search.
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, James G.
1999-01-01
A new objective analysis system designed for parallel computers is described. The system can produce a global analysis (on a 2 x 2.5 lat-lon grid with 20 levels of heights and winds and 10 levels of moisture) using 120,000 observations in less than 3 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g., MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first-guess error statistics to produce the best estimate of the atmospheric state. It also includes a new quality control (buddy check) system. Static tests with the system showed its analysis increments are comparable to the latest NASA operational system, including maintenance of mass-wind balance. Results from a 2-month cycling test in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) throughout the entire two months.
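The optimal-interpolation analysis equation the abstract refers to can be sketched in toy dimensions; the covariances and observation operator below are invented placeholders, not the system's actual error statistics.

```python
import numpy as np

# Hedged sketch: x_a = x_b + K (y - H x_b), K = B H^T (H B H^T + R)^-1.
n_grid, n_obs = 5, 2
x_b = np.zeros(n_grid)                    # first guess (toy)
B = 0.5 * np.eye(n_grid)                  # background error covariance (toy)
R = 0.1 * np.eye(n_obs)                   # observation error covariance (toy)
H = np.zeros((n_obs, n_grid))
H[0, 1] = H[1, 3] = 1.0                   # observation operator (toy)
y = np.array([1.0, -0.5])                 # observations (toy)

K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
x_a = x_b + K @ (y - H @ x_b)                  # analysis
print("analysis increments:", x_a)
```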
SDGs and Geospatial Frameworks: Data Integration in the United States
NASA Astrophysics Data System (ADS)
Trainor, T.
2016-12-01
Responding to the need to monitor a nation's progress towards meeting the Sustainable Development Goals (SDG) outlined in the 2030 U.N. Agenda requires the integration of earth observations with statistical information. The urban agenda proposed in SDG 11 challenges the global community to find a geospatial approach to monitor and measure inclusive, safe, resilient, and sustainable cities and communities. Target 11.7 identifies public safety, accessibility to green and public spaces, and the most vulnerable populations (i.e., women and children, older persons, and persons with disabilities) as the most important priorities of this goal. A challenge for both national statistical organizations and earth observation agencies in addressing SDG 11 is the requirement for detailed statistics at a sufficient spatial resolution to provide the basis for meaningful analysis of the urban population and city environments. Using an example for the city of Pittsburgh, this presentation proposes data and methods to illustrate how earth science and statistical data can be integrated to respond to Target 11.7. Finally, a preliminary series of data initiatives are proposed for extending this method to other global cities.
[How reliable is the monitoring for doping?].
Hüsler, J
1990-12-01
The reliability of doping control, i.e., of the chemical analysis of urine samples in the accredited laboratories and the resulting decisions, is discussed using probabilistic and statistical methods. Basically, we evaluated and estimated the positive predictive value, which is the probability that a urine sample contains prohibited doping substances given a positive test decision. Since statistical data are lacking for some important quantities related to the predictive value, an exact evaluation is not possible; only conservative lower bounds can be given. We found that the predictive value is at least 90% or 95% with respect to the analysis and decision based on the A-sample only, and at least 99% with respect to both A- and B-samples. A more realistic assessment, though without sufficient statistical confidence, suggests that the true predictive value is considerably larger than these lower bounds.
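The quantity being bounded is the positive predictive value via Bayes' rule; the sensitivity, specificity, and prevalence values in the sketch below are illustrative only, not the paper's estimates.

```python
# Hedged sketch: PPV = se*p / (se*p + (1-sp)*(1-p)).
def ppv(sensitivity, specificity, prevalence):
    tp = sensitivity * prevalence            # true-positive mass
    fp = (1 - specificity) * (1 - prevalence)  # false-positive mass
    return tp / (tp + fp)

# Even a very specific assay yields a depressed PPV at low prevalence:
print(ppv(0.99, 0.999, 0.02))    # ~0.95
print(ppv(0.99, 0.9999, 0.02))   # ~0.995
```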
Statistical analysis of effective singular values in matrix rank determination
NASA Technical Reports Server (NTRS)
Konstantinides, Konstantinos; Yao, Kung
1988-01-01
A major problem in using SVD (singular-value decomposition) as a tool for determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values. To this end, confidence regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theories of perturbations of singular values and statistical significance tests. Threshold bounds for perturbations due to finite-precision and i.i.d. random models are evaluated. In random models, the threshold bounds depend on the dimension of the matrix, the noise variance, and a predefined level of statistical significance. The results are applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.
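The rank-determination problem can be sketched as follows; the threshold used here is a common noise-level rule of thumb, not the paper's exact confidence-region bound.

```python
import numpy as np

# Hedged sketch: threshold the singular values of a noisy low-rank
# matrix against a bound that grows with matrix size and noise level.
rng = np.random.default_rng(6)
m, n, sigma = 50, 30, 0.05
A = rng.normal(size=(m, 3)) @ rng.normal(size=(3, n))   # true rank 3
A_noisy = A + sigma * rng.normal(size=(m, n))

s = np.linalg.svd(A_noisy, compute_uv=False)
threshold = sigma * (np.sqrt(m) + np.sqrt(n))   # noise-level rule of thumb
print("effective rank:", int(np.sum(s > threshold)))   # expect 3
```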
Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J
2000-04-01
A statistically derived disease reaction index based on parasitological, clinical, and haematological measurements observed in 309 Boran cattle aged 5 to 8 months following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures including first appearance of schizonts, first appearance of piroplasms, and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, defined the disease reaction index on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data than was previously possible with the clinically derived categories of non-, mild, moderate, and severe reactions.
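The index construction can be sketched as follows; the 13-variable data matrix is simulated, and the min-max rescaling to 0-10 is one plausible choice, not necessarily the authors' exact scaling.

```python
import numpy as np

# Hedged sketch: score animals on the first principal component of
# standardized reaction measures, then rescale the scores to 0-10.
rng = np.random.default_rng(7)
X = rng.normal(size=(309, 13))                 # animals x measurements (simulated)
Z = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize each variable

# First principal component via the correlation matrix eigendecomposition
evals, evecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
pc1 = Z @ evecs[:, -1]                         # scores on the leading eigenvector

index = 10 * (pc1 - pc1.min()) / (pc1.max() - pc1.min())
print("index range:", index.min(), index.max())
```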
Shrout, Patrick E; Rodgers, Joseph L
2018-01-04
Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.
Depression and Oxidative Stress: Results From a Meta-Analysis of Observational Studies
Palta, Priya; Samuel, Laura J.; Miller, Edgar R.; Szanton, Sarah L.
2014-01-01
Objective To perform a systematic review and meta-analysis that quantitatively tests and summarizes the hypothesis that depression results in elevated oxidative stress and lower antioxidant levels. Methods We performed a meta-analysis of studies that reported an association between depression and oxidative stress and/or antioxidant status markers. PubMed and EMBASE databases were searched for articles published from January 1980 through December 2012. A random-effects model, weighted by inverse variance, was performed to pool standard deviation (Cohen's d) effect size estimates across studies for oxidative stress and antioxidant status measures, separately. Results Twenty-three studies with 4980 participants were included in the meta-analysis. Depression was most commonly measured using the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition criteria. A Cohen's d effect size of 0.55 (95% confidence interval = 0.47–0.63) was found for the association between depression and oxidative stress, indicating a roughly 0.55 of 1-standard-deviation increase in oxidative stress among individuals with depression compared with those without depression. The results of the studies displayed significant heterogeneity (I2 = 80.0%, p < .001). A statistically significant effect was also observed for the association between depression and antioxidant status markers (Cohen's d = −0.24, 95% confidence interval = −0.33 to −0.15). Conclusions This meta-analysis observed an association between depression and oxidative stress and antioxidant status across many different studies. Differences in measures of depression and markers of oxidative stress and antioxidant status markers could account for the observed heterogeneity. These findings suggest that well-established associations between depression and poor health outcomes may be mediated by high oxidative stress. PMID:24336428
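The inverse-variance random-effects pooling described above can be sketched as follows; the effect sizes and variances are invented, not the 23 included studies, and DerSimonian-Laird is one standard estimator of the between-study variance.

```python
import numpy as np

# Hedged sketch: DerSimonian-Laird random-effects meta-analysis of
# Cohen's d values weighted by inverse variance. Data are invented.
d = np.array([0.4, 0.7, 0.5, 0.9, 0.3])        # per-study effect sizes
v = np.array([0.02, 0.05, 0.03, 0.08, 0.04])   # within-study variances

w = 1 / v
d_fixed = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fixed) ** 2)              # heterogeneity statistic
tau2 = max(0.0, (Q - (len(d) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

w_re = 1 / (v + tau2)                           # random-effects weights
d_re = np.sum(w_re * d) / np.sum(w_re)
se = np.sqrt(1 / np.sum(w_re))
print(f"pooled d = {d_re:.2f}, "
      f"95% CI = ({d_re - 1.96*se:.2f}, {d_re + 1.96*se:.2f})")
```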
Li, Wen-Chin; Harris, Don; Yu, Chung-San
2008-03-01
The human factors analysis and classification system (HFACS) is based upon Reason's organizational model of human error. HFACS was developed as an analytical framework for the investigation of the role of human error in aviation accidents; however, there is little empirical work formally describing the relationship between the components in the model. This research analyses 41 civil aviation accidents occurring to aircraft registered in the Republic of China (ROC) between 1999 and 2006 using the HFACS framework. The results show statistically significant relationships between errors at the operational level and organizational inadequacies at both the immediately adjacent level (preconditions for unsafe acts) and higher levels in the organization (unsafe supervision and organizational influences). The pattern of the 'routes to failure' observed in these data from civil aircraft accidents shows great similarities to that observed in the analysis of military accidents. This research lends further support to Reason's model, which suggests that active failures are promoted by latent conditions in the organization. Fallible decisions at upper management levels were found to directly affect supervisory practices, thereby creating the psychological preconditions for unsafe acts and hence indirectly impairing the performance of pilots, ultimately leading to accidents.
Distribution of lod scores in oligogenic linkage analysis.
Williams, J T; North, K E; Martin, L J; Comuzzie, A G; Göring, H H; Blangero, J
2001-01-01
In variance component oligogenic linkage analysis, it can happen that the residual additive genetic variance is estimated at its lower bound of zero when estimating the effect of the ith quantitative trait locus. Using quantitative trait Q1 from the Genetic Analysis Workshop 12 simulated general population data, we compare the observed lod scores from oligogenic linkage analysis with the empirical lod score distribution under a null model of no linkage. We find that zero residual additive genetic variance in the null model alters the usual distribution of the likelihood-ratio statistic.
Scout trajectory error propagation computer program
NASA Technical Reports Server (NTRS)
Myler, T. R.
1982-01-01
Since 1969, flight experience has been used as the basis for predicting Scout orbital accuracy. The data used for calculating the accuracy consist of errors in the trajectory parameters (altitude, velocity, etc.) at stage burnout as observed on Scout flights. Approximately 50 sets of errors are used in a Monte Carlo analysis to generate error statistics for the trajectory parameters. A covariance matrix is formed which may be propagated in time. The mechanization of this process resulted in the computer program Scout Trajectory Error Propagation (STEP), which is described herein. Computer program STEP may be used in conjunction with the Statistical Orbital Analysis Routine to generate accuracy estimates for the orbit parameters (apogee, perigee, inclination, etc.) based upon flight experience.
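The core computation can be sketched in miniature; the 50 error sets and the transition matrix below are simulated stand-ins, not Scout flight data or STEP's actual dynamics.

```python
import numpy as np

# Hedged sketch: build a covariance matrix from observed burnout errors
# and propagate it with a state transition matrix, P' = Phi P Phi^T.
rng = np.random.default_rng(8)
errors = rng.normal(size=(50, 3))   # e.g., altitude, velocity, flight-path angle
P = np.cov(errors, rowvar=False)    # sample covariance at burnout

dt = 60.0                           # seconds (illustrative)
Phi = np.eye(3)
Phi[0, 1] = dt                      # toy transition: altitude += velocity * dt
P_later = Phi @ P @ Phi.T
print("propagated standard deviations:", np.sqrt(np.diag(P_later)))
```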
The effects of multiple repairs on Inconel 718 weld mechanical properties
NASA Technical Reports Server (NTRS)
Russell, C. K.; Nunes, A. C., Jr.; Moore, D.
1991-01-01
Inconel 718 weldments were repaired 3, 6, 9, and 13 times using the gas tungsten arc welding process. The welded panels were machined into mechanical test specimens, postweld heat treated, and nondestructively tested. Tensile properties and high-cycle fatigue life were evaluated and the results compared to unrepaired weld properties. Mechanical property data were analyzed using the statistical methods of difference in means for tensile properties and difference in log means and Weibull analysis for high-cycle fatigue properties. Statistical analysis performed on the data did not show a significant decrease in tensile or high-cycle fatigue properties due to the repeated repairs. Some degradation was observed in all properties; however, it was minimal.
Pounds, Stan; Cheng, Cheng; Cao, Xueyuan; Crews, Kristine R; Plunkett, William; Gandhi, Varsha; Rubnitz, Jeffrey; Ribeiro, Raul C; Downing, James R; Lamba, Jatinder
2009-08-15
In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables. Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org.
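Since the abstract defines the PROMISE statistic as the dot product of observed association statistics with the vector of most interesting values, assessed by permutation, a minimal sketch follows; the data, correlation-based association statistics, and sign pattern are toy choices for illustration.

```python
import numpy as np

# Hedged sketch: project observed endpoint associations onto a
# biologically "interesting" pattern; significance by permutation.
rng = np.random.default_rng(9)
genomic = rng.normal(size=200)               # one genomic variable, 200 subjects
endpoints = rng.normal(size=(200, 3))
endpoints[:, 0] += 0.3 * genomic             # make one endpoint correlated
pattern = np.array([1.0, -1.0, 1.0])         # hypothesized signs of interest

def promise_stat(g, E, pattern):
    r = np.array([np.corrcoef(g, E[:, k])[0, 1] for k in range(E.shape[1])])
    return r @ pattern                       # projection onto the pattern

obs = promise_stat(genomic, endpoints, pattern)
null = [promise_stat(rng.permutation(genomic), endpoints, pattern)
        for _ in range(999)]
p = (1 + sum(abs(x) >= abs(obs) for x in null)) / 1000
print(f"PROMISE statistic = {obs:.3f}, permutation p = {p:.3f}")
```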
Bi-telescopic, deep, simultaneous meteor observations
NASA Technical Reports Server (NTRS)
Taff, L. G.
1986-01-01
A statistical summary is presented of 10 hours of observing sporadic meteors and two meteor showers using the Experimental Test System of the Lincoln Laboratory. The observatory is briefly described along with the real-time and post-processing hardware, the analysis, and the data reduction. The principal observational results are given for the sporadic meteor zenithal hourly rates. The unique properties of the observatory include twin telescopes to allow the discrimination of meteors by parallax, deep limiting magnitude, good time resolution, and sophisticated real-time and post-observing video processing.
Charlton, Alex; Sakrabani, Ruben; Tyrrel, Sean; Rivas Casado, Monica; McGrath, Steve P; Crooks, Bill; Cooper, Pat; Campbell, Colin D
2016-12-01
The Long-Term Sludge Experiments (LTSE) began in 1994 as part of continuing research into the effects of sludge-borne heavy metals on soil fertility. The long-term effects of Zn, Cu, and Cd on soil microbial biomass carbon (Cmic) were monitored for 8 years (1997-2005) in sludge-amended soils at nine UK field sites. To assess the statutory limits set by the UK Sludge (Use in Agriculture) Regulations, the experimental data have been reviewed using the statistical methods of meta-analysis. Previous LTSE studies have focused predominantly on statistical significance rather than effect size, whereas meta-analysis focuses on the magnitude and direction of an effect, i.e., the practical significance, rather than its statistical significance. The results presented here show that significant decreases in Cmic have occurred in soils where the total concentrations of Zn and Cu fall below the current UK statutory limits. For soils receiving sewage sludge predominantly contaminated with Zn, decreases of approximately 7-11% were observed at concentrations below the UK statutory limit. The effect of Zn appeared to increase over time, with increasingly greater decreases in Cmic observed over a period of 8 years. This may be due to an interactive effect between Zn and confounding Cu contamination, which has augmented the bioavailability of these metals over time. Similar decreases (7-12%) in Cmic were observed in soils receiving sewage sludge predominantly contaminated with Cu; however, Cmic appeared to show signs of recovery after a period of 6 years. Application of sewage sludge predominantly contaminated with Cd appeared to have no effect on Cmic at concentrations below the current UK statutory limit.
Calegari, Saron Souza; Konopka, Cristine Kolling; Balestrin, Bruna; Hoffmann, Maurício Scopel; de Souza, Floriano Soeiro; Resener, Elaine Verena
2012-08-01
To determine the epidemiological profile of women admitted for urinary tract infection (UTI), as well as to identify the most prevalent agents and the response to antibiotic therapy. A retrospective study of 106 pregnant women admitted to a university hospital for UTI treatment from January 2007 to December 2010. The evaluation was based on analysis of the medical records of these pregnant women, examining hospitalization and pregnancy data as well as pregnancy outcomes. Statistical analysis was performed using the Statistical Package for the Social Sciences, version 15.0. The two-tailed Fisher exact test and Student's t-test were used for data analysis, as well as descriptive statistical methods. Positive urine cultures were observed in 60.5% of pregnant women admitted due to UTI. The most frequent infectious agent was Escherichia coli, and no difference in resistance, recurrence or complications was observed between the most frequent etiologic agents. Pregnant women with previous UTI had a higher recurrence risk (OR=10.8; p<0.05). The antibiotics most commonly used during hospitalization were ampicillin and cefazolin. Change of therapeutic agent due to bacterial resistance occurred in 11.9% of patients who took cefazolin and in 20% of patients who took ampicillin (OR=5.5; p<0.05). The rate of gestational complications was the same for both treatments, and there was no difference in the mean number of days of hospitalization. In the studied population, ampicillin showed a higher rate of bacterial resistance than cefazolin, requiring a larger number of treatment regimen changes, without resulting in differences in clinical outcome or length of hospitalization.
Statistical validation of a solar wind propagation model from 1 to 10 AU
NASA Astrophysics Data System (ADS)
Zieger, Bertalan; Hansen, Kenneth C.
2008-08-01
A one-dimensional (1-D) numerical magnetohydrodynamic (MHD) code is applied to propagate the solar wind from 1 AU through 10 AU, i.e., beyond the heliocentric distance of Saturn's orbit, in a non-rotating frame of reference. The time-varying boundary conditions at 1 AU are obtained from hourly solar wind data observed near the Earth. Although similar MHD simulations have been carried out and used by several authors, very little work has been done to validate the statistical accuracy of such solar wind predictions. In this paper, we present an extensive analysis of the prediction efficiency, using 12 selected years of solar wind data from the major heliospheric missions Pioneer, Voyager, and Ulysses. We map the numerical solution to each spacecraft in space and time, and validate the simulation by comparing the propagated solar wind parameters with in-situ observations. We do not restrict our statistical analysis to the times of spacecraft alignment, as most of the earlier case studies do. Our superposed epoch analysis suggests that the prediction efficiency is significantly higher during periods with a high recurrence index of solar wind speed, typically in the late declining phase of the solar cycle. Among the solar wind variables, the solar wind speed can be predicted to the highest accuracy, with a linear correlation of 0.75 on average close to the time of opposition. We estimate the accuracy of shock arrival times to be as high as 10-15 hours within ±75 d from apparent opposition during years with a high recurrence index. During solar activity maximum, there is a clear bias for the model to predict shocks arriving later than observed in the data, suggesting that during these periods there is an additional acceleration mechanism in the solar wind that is not included in the model.
Uncertainty Analysis and Order-by-Order Optimization of Chiral Nuclear Interactions
Carlsson, Boris; Forssen, Christian; Fahlin Strömberg, D.; ...
2016-02-24
Chiral effective field theory (χEFT) provides a systematic approach to describe low-energy nuclear forces. Moreover, χEFT is able to provide well-founded estimates of statistical and systematic uncertainties, although this unique advantage has not yet been fully exploited. We fill this gap by performing an optimization and statistical analysis of all the low-energy constants (LECs) up to next-to-next-to-leading order. Our optimization protocol corresponds to a simultaneous fit to scattering and bound-state observables in the pion-nucleon, nucleon-nucleon, and few-nucleon sectors, thereby utilizing the full model capabilities of χEFT. Finally, we study the effect on other observables by demonstrating forward-error-propagation methods that can easily be adopted by future works. We employ mathematical optimization and implement automatic differentiation to attain efficient and machine-precise first- and second-order derivatives of the objective function with respect to the LECs. This is also vital for the regression analysis. We use power-counting arguments to estimate the systematic uncertainty that is inherent to χEFT and we construct chiral interactions at different orders with quantified uncertainties. Statistical error propagation is compared with Monte Carlo sampling, showing that statistical errors are in general small compared to systematic ones. In conclusion, we find that a simultaneous fit to different sets of data is critical to (i) identify the optimal set of LECs, (ii) capture all relevant correlations, (iii) reduce the statistical uncertainty, and (iv) attain order-by-order convergence in χEFT. Furthermore, certain systematic uncertainties in the few-nucleon sector are shown to get substantially magnified in the many-body sector, in particular when varying the cutoff in the chiral potentials. The methodology and results presented in this paper open a new frontier for uncertainty quantification in ab initio nuclear theory.
Bertolaccini, Luca; Viti, Andrea; Cavallo, Antonio; Terzi, Alberto
2014-04-01
The role of the electro-thermal bipolar tissue sealing system (LigaSure®, LS; Covidien, Inc., CO, USA) in thoracic surgery is still undefined, and reports of its use are limited. The objective of the trial was to evaluate the costs and benefits of LS in major lung resection surgery. A randomized blinded study of a consecutive series of 100 patients undergoing lobectomy was undertaken. After muscle-sparing thoracotomy and classification of lung fissures according to Craig-Walker, patients with fissure Grade 2-4 were randomized to Stapler-group or LS-group fissure completion. Recorded parameters were analysed for differences in selected intraoperative and postoperative outcomes. Statistical analysis was performed with the bootstrap method. Pearson's χ² test and Fisher's exact test were used to calculate probability values for comparisons of dichotomous variables. Cost-benefit evaluation was performed using Pareto optimal analysis. There were no significant differences between groups regarding demographic and baseline characteristics. No patient was withdrawn from the study and no adverse effect was recorded. There were no deaths or major complications in either group. There were no statistically significant differences in operative time or morbidity between patients in the LS group and the Stapler group. In the LS group there was a statistically non-significant increase in air leaks in the first 24 postoperative hours, while a statistically significant increase in drainage volume was observed. No statistically significant difference in hospital length of stay was observed. Overall, the LS group had a favourable multi-criteria cost/benefit analysis with a good 'Pareto optimum'. LS is a safe device for thoracic surgery and can be a valid alternative to staplers. In this setting, LS allows functional lung tissue preservation. As to costs, LS seems equivalent to staplers.
NASA Astrophysics Data System (ADS)
Bierstedt, Svenja E.; Hünicke, Birgit; Zorita, Eduardo; Ludwig, Juliane
2017-07-01
We statistically analyse the relationship between the structure of migrating dunes in the southern Baltic and the driving wind conditions over the past 26 years, with the long-term aim of using migrating dunes as a proxy for past wind conditions at an interannual resolution. The present analysis is based on the dune record derived from geo-radar measurements by Ludwig et al. (2017). The dune system is located at the Baltic Sea coast of Poland and is migrating from west to east along the coast. The dunes present layers with different thicknesses that can be assigned to absolute dates at interannual timescales and put in relation to seasonal wind conditions. To statistically analyse this record and calibrate it as a wind proxy, we used a gridded regional meteorological reanalysis data set (coastDat2) covering recent decades. The identified link between the dune annual layers and wind conditions was additionally supported by the co-variability between dune layers and observed sea level variations in the southern Baltic Sea. We include precipitation and temperature in our analysis, in addition to wind, to learn more about the dependency between these three atmospheric factors and their common influence on the dune system. We set up a statistical linear model based on the correlation between the frequency of days with specific wind conditions in a given season and the dune migration velocities derived for that season. To some extent, the dune records can be seen as analogous to tree-ring width records, and hence we use a proxy validation method usually applied in dendrochronology when the observational record is short: cross-validation with the leave-one-out method. The correlations between the wind record from the reanalysis and the wind record derived from the dune structure are in the range of 0.28 to 0.63, yielding statistical validation skill similar to that of dendroclimatological records.
On statistical inference in time series analysis of the evolution of road safety.
Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora
2013-11-01
Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) numbers of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations on the applicability of standard methods of statistical inference, which leads to under- or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating accident occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches, are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
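As a sketch of the core point, the toy example below contrasts a naive OLS trend fit with a model that accounts for serial dependency. The AR(1) disturbance, the synthetic fatality series, and the choice of statsmodels' GLSAR (one of many dedicated estimators) are illustrative assumptions, not the paper's exact models:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(30.0)                       # 30 annual observations
e = np.zeros(30)
for i in range(1, 30):                    # AR(1) serially dependent disturbances
    e[i] = 0.7 * e[i - 1] + rng.normal(scale=5.0)
y = 400.0 - 4.0 * t + e                   # declining fatality counts

X = sm.add_constant(t)

# Naive OLS ignores the serial dependency; with positive autocorrelation
# the standard error of the trend is typically underestimated.
ols = sm.OLS(y, X).fit()
print("OLS slope SE:  ", ols.bse[1])

# GLSAR models the AR(1) errors explicitly before estimating the trend.
gls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print("GLSAR slope SE:", gls.bse[1])
```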
NASA Astrophysics Data System (ADS)
Yerlikaya, Emrah; Karageçili, Hasan; Aydin, Ruken Zeynep
2016-04-01
Obesity is a key risk factor for the development of hyperglycemia, hypertension, hyperlipidemia, and insulin resistance, which are collectively referred to as the metabolic disorders. Diabetes mellitus, a metabolic disorder, is related to hyperglycemia and altered metabolism of lipids, carbohydrates and proteins. The minimum defining characteristic used to identify diabetes mellitus is chronic, substantiated elevation of the circulating glucose concentration. This study aimed to analyze the body composition of obese and obese+diabetic patients. We studied data taken from three independent groups with a body composition analyzer. The instrument calculates body parameters, such as body fat ratio, body fat mass, fat free mass, estimated muscle mass, and basal metabolic rate, on the basis of data obtained by Dual Energy X-ray Absorptiometry using Bioelectrical Impedance Analysis. All patients and healthy subjects had applied to Siirt University Medico, and their data were recorded. The Statistical Package for the Social Sciences, version 21, was used for descriptive data analysis. When we compared the data of the three groups, we found statistically significant differences between the obese, obese+diabetic and control groups. ANOVA and Tukey tests were used to analyze differences between groups and to make multiple comparisons; t-tests were used to analyze differences between genders. We observed statistically significant differences in age and mineral amount (p<0.00) between the diabetic+obese and obese groups. Moreover, when these patient groups were compared with the control group, significant differences were found for most parameters. In terms of education level, statistically significant differences (p<0.05) between illiterate subjects and university graduates were observed in fat mass (kg), fat percentage, internal fat, body mass index, water percentage, protein mass percentage, and mineral percentage. This difference may especially result from a sedentary lifestyle.
NASA Astrophysics Data System (ADS)
Lin, Shu; Wang, Rui; Xia, Ning; Li, Yongdong; Liu, Chunliang
2018-01-01
Statistical multipactor theories are critical prediction approaches for multipactor breakdown determination. However, these approaches still require a trade-off between calculation efficiency and accuracy. This paper presents an improved stationary statistical theory for efficient threshold analysis of two-surface multipactor. A general integral equation over the distribution function of the electron emission phase, with both single-sided and double-sided impacts considered, is formulated. The modeling results indicate that the improved stationary statistical theory can not only obtain equally good accuracy of multipactor threshold calculation as the nonstationary statistical theory, but also achieve high calculation efficiency concurrently. By using this improved stationary statistical theory, the total time consumption in calculating full multipactor susceptibility zones of parallel plates can be decreased by as much as a factor of four relative to the nonstationary statistical theory. It also shows that the effect of single-sided impacts is indispensable for accurate multipactor prediction in coaxial lines and is even more significant for high-order multipactor. Finally, the influence of secondary emission yield (SEY) properties on the multipactor threshold is further investigated. It is observed that the first cross energy and the energy range between the first cross and the SEY maximum both play a significant role in determining the multipactor threshold, which agrees with the numerical simulation results in the literature.
Climate drivers on malaria transmission in Arunachal Pradesh, India.
Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Chenna, Sumana; Parasaram, Vaideesh; Kadiri, Madhusudhan Rao
2015-01-01
The present study was conducted during the years 2006 to 2012 and provides information on the prevalence of malaria and its relationship to various climatic factors in East Siang district of Arunachal Pradesh, India. Correlation analysis, Principal Component Analysis (PCA) and Hotelling's T² statistics were used to understand the effect of weather variables on malaria transmission. The epidemiological study shows that the prevalence of malaria is mostly caused by the parasite Plasmodium vivax, followed by Plasmodium falciparum. The number of malaria cases declined gradually from 2006 to 2012. Malaria transmission was higher during the rainy season than in the summer and winter seasons. Further, analysis with PCA and Hotelling's T² statistic revealed that climatic variables such as temperature and rainfall are the most influential factors for the high rate of malaria transmission in East Siang district of Arunachal Pradesh.
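A hedged sketch of the Hotelling's T² screening step described above; the variable set, sample size and control limit are illustrative, not those of the study:

```python
import numpy as np
from scipy.stats import chi2

def hotelling_t2(X):
    """Hotelling's T^2 for each row of a multivariate series X (n x p),
    e.g., monthly temperature, rainfall and humidity. Large values flag
    months that deviate from the mean multivariate climate state."""
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Xc, rowvar=False))
    # quadratic form x_i' S^{-1} x_i for every observation i
    return np.einsum("ij,jk,ik->i", Xc, S_inv, Xc)

rng = np.random.default_rng(2)
X = rng.normal(size=(84, 3))       # 7 years of monthly data, 3 weather variables
t2 = hotelling_t2(X)
# approximate chi-square control limit with p degrees of freedom
print(np.where(t2 > chi2.ppf(0.99, df=X.shape[1]))[0])
```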
Historical Phenological Observations: Past Climate Impact Analyses and Climate Reconstructions
NASA Astrophysics Data System (ADS)
Rutishauser, T.; Luterbacher, J.; Meier, N.; Jeanneret, F.; Pfister, C.; Wanner, H.
2007-12-01
Plant phenological observations have been found to be an important indicator of climate change impacts on seasonal and interannual vegetation development in the late 20th/early 21st century. Our contribution contains three parts that are essential for the understanding (part 1), the analysis (part 2) and the application (part 3) of historical phenological observations in global change research. First, we propose a definition for historical phenology (Rutishauser, 2007). We briefly portray the first appearance of phenological observations in Medieval philosophical and literary sources, the usage and application of this method in the Age of Enlightenment (Carl von Linné, Charles Morren), as well as the development in the 20th century (Schnelle, Lieth) to present-day networks (COST725, USA-NPN). Second, we introduce a methodological approach to estimate 'statistical plants' from historical phenological observations (Rutishauser et al., JGR-Biogeosciences, in press). We combine spatial averaging methods and regression transfer modeling to estimate 'statistical plant' dates from historical observations that often contain gaps, changing observers and changing locations. We apply the concept to reconstruct a statistical 'Spring plant' as the weighted mean of the flowering dates of cherry and apple tree and beech budburst in Switzerland for 1702-2005. Including dating uncertainty, we estimate a total data uncertainty of 10 days at interannual and 3.4 days at decadal time scales. Third, we apply two long-term phenological records to describe the plant phenological response to spring temperature and to reconstruct warm-season temperatures from grape harvest dates (Rutishauser et al., submitted; Meier et al., GRL, in press).
Impact of satellite-based data on FGGE general circulation statistics
NASA Technical Reports Server (NTRS)
Salstein, David A.; Rosen, Richard D.; Baker, Wayman E.; Kalnay, Eugenia
1987-01-01
The NASA Goddard Laboratory for Atmospheres (GLA) analysis/forecast system was run in two different parallel modes in order to evaluate the influence that data from satellites and other FGGE observation platforms can have on analyses of the large-scale circulation; in the first mode, data from all observation systems were used, while in the second only conventional upper-air and surface reports were used. The GLA model was also integrated for the same period without insertion of any data, and an independent objective analysis based only on rawinsonde and pilot balloon data was also performed. A small decrease in the vigor of the general circulation is noted to follow from the inclusion of satellite observations.
Observational difference between gamma and X-ray properties of optically dark and bright GRBs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balazs, L. G.; Horvath, I.; Bagoly, Zs.
2008-05-22
Using discriminant analysis, a multivariate statistical method, we compared the distributions of the physical quantities of optically dark and bright GRBs detected by the BAT and XRT instruments on board the Swift satellite. We found that GRBs with detected optical transients (OT) have systematically higher peak fluxes and lower HI column densities than those without OT.
A nonparametric analysis of plot basal area growth using tree based models
G. L. Gadbury; H. K. Iyer; H. T. Schreuder; C. Y. Ueng
1997-01-01
Tree based statistical models can be used to investigate data structure and predict future observations. We used nonparametric and nonlinear models to reexamine the data sets on tree growth used by Bechtold et al. (1991) and Ruark et al. (1991). The growth data were collected by Forest Inventory and Analysis (FIA) teams from 1962 to 1972 (4th cycle) and 1972 to 1982 (...
Multi-criteria evaluation of CMIP5 GCMs for climate change impact analysis
NASA Astrophysics Data System (ADS)
Ahmadalipour, Ali; Rana, Arun; Moradkhani, Hamid; Sharma, Ashish
2017-04-01
Climate change is expected to have severe impacts on the global hydrological cycle along with the food-water-energy nexus. Currently, many climate models are used in predicting important climatic variables. Though there have been advances in the field, many problems remain to be resolved related to reliability, uncertainty, and computing needs, among others. In the present work, we have analyzed the performance of 20 different global climate models (GCMs) from the Climate Model Intercomparison Project Phase 5 (CMIP5) dataset over the Columbia River Basin (CRB) in the Pacific Northwest USA. We demonstrate a statistical multicriteria approach, using univariate and multivariate techniques, for selecting suitable GCMs for climate change impact analysis in the region. Univariate methods include the mean, standard deviation, coefficient of variation, relative change (variability), Mann-Kendall test, and Kolmogorov-Smirnov test (KS-test); the multivariate methods used were principal component analysis (PCA), singular value decomposition (SVD), canonical correlation analysis (CCA), and cluster analysis. The analysis is performed on raw GCM data, i.e., before bias correction, for precipitation and temperature for all 20 models, to capture the reliability and nature of each model at the regional scale. The analysis is based on spatially averaged datasets of GCMs and observations for the period 1970 to 2000. Each GCM is ranked based on its performance against gridded observational data on various temporal scales (daily, monthly, and seasonal). The results provide insight into each of the methods and the statistical properties they address when ranking GCMs. Further, raw GCM simulations were also evaluated against different gridded observational datasets in the area.
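As an illustration of the univariate part of such a multicriteria ranking, the sketch below scores hypothetical GCM series against observations using mean bias and the KS statistic; the sum-of-ranks combination rule is an assumption for illustration, not the paper's exact scheme:

```python
import numpy as np
from scipy.stats import ks_2samp

def rank_gcms(gcm_series, obs):
    """Rank GCMs by agreement with observations using two of the
    univariate criteria named above: absolute mean bias and the
    two-sample KS statistic (distributional mismatch)."""
    scores = {name: (abs(np.mean(sim) - np.mean(obs)),
                     ks_2samp(sim, obs).statistic)
              for name, sim in gcm_series.items()}
    by_bias = sorted(scores, key=lambda m: scores[m][0])
    by_ks = sorted(scores, key=lambda m: scores[m][1])
    # smaller is better on both criteria; combine by sum of ranks
    return sorted(scores, key=lambda m: by_bias.index(m) + by_ks.index(m))

rng = np.random.default_rng(3)
obs = rng.normal(10, 3, size=360)     # 30 years of monthly observations
gcms = {f"GCM{i}": rng.normal(10 + 0.5 * i, 3, size=360) for i in range(5)}
print(rank_gcms(gcms, obs))
```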
Kim, Jung-in; Choi, Chang Heon; Wu, Hong-Gyun; Kim, Jin Ho; Kim, Kyubo; Park, Jong Min
2017-01-01
The aim of this work was to investigate correlations between 2D and quasi-3D gamma passing rates. A total of 20 patients (10 prostate cases and 10 head and neck, H&N, cases) were retrospectively selected. For each patient, both intensity-modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT) plans were generated. For each plan, 2D gamma evaluation with radiochromic films and quasi-3D gamma evaluation with fluence measurements were performed with both 2%/2 mm and 3%/3 mm criteria. Gamma passing rates were grouped according to delivery technique and treatment site. Statistical analyses were performed to examine the correlation between 2D and quasi-3D gamma evaluations. A statistically significant difference between delivery techniques was observed only in the quasi-3D gamma passing rates with 2%/2 mm. Statistically significant differences were observed between treatment sites in the 2D gamma passing rates (differences of less than 8%). No statistically significant correlations were observed between 2D and quasi-3D gamma passing rates, except for the VMAT group and the group including both IMRT and VMAT with 3%/3 mm (r = 0.564, p = 0.012 and r = 0.372, p = 0.020, respectively), and even these correlations were not strong. No strong correlations were observed between 2D and quasi-3D gamma evaluations. PMID:27690300
Wrestling with Philosophy: Improving Scholarship in Higher Education
ERIC Educational Resources Information Center
Kezar, Adrianna
2004-01-01
Method is usually viewed as completely separate from philosophy or theory, focusing instead on techniques and procedures of interviewing, focus groups, observation, or statistical analysis. Several texts on methodology published recently have added significant sections on philosophy, such as Creswell's (1998) Qualitative inquiry and research…
Unjamming a granular hopper by vibration
NASA Astrophysics Data System (ADS)
Janda, A.; Maza, D.; Garcimartín, A.; Kolb, E.; Lanuza, J.; Clément, E.
2009-07-01
We present an experimental study of the outflow of a hopper continuously vibrated by a piezoelectric device. Outpouring of grains can be achieved for apertures much below the usual jamming limit observed for non-vibrated hoppers. Granular flow persists down to the physical limit of one grain diameter, a limit reached for a finite vibration amplitude. For the smaller orifices, we observe an intermittent regime characterized by alternating periods of flow and blockage. Vibrations do not significantly modify the flow rates in either the continuous or the intermittent regime. The analysis of the statistical features of the flowing regime shows that the flow time significantly increases with the vibration amplitude. However, at low vibration amplitude and small orifice sizes, the jamming time distribution displays anomalous statistics.
Arrival directions of cosmic rays of E > 0.4 EeV
NASA Technical Reports Server (NTRS)
Baltrusaitis, R. M.; Cady, R.; Cassiday, G. I.; Cooper, R.; Elbert, J. W.; Gerhardy, P. R.; Ko, S.; Loh, E. C.; Mizumoto, Y.; Salamon, M. H.
1985-01-01
The anisotropy of cosmic rays observed by the Utah Fly's Eye detector has been studied. Emphasis has been placed on examining distributions of events in galactic coordinates. No statistically significant departure from isotropy has been observed for energies greater than 0.4 EeV (1 EeV = 10^18 eV). Results of the standard harmonic analysis in right ascension are also presented.
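The standard first-harmonic analysis in right ascension reduces to a Rayleigh test; a minimal sketch on synthetic (isotropic) arrival directions:

```python
import numpy as np

def first_harmonic(ra_deg):
    """First-harmonic analysis in right ascension.

    Returns the amplitude and phase of the first harmonic and the
    Rayleigh probability that an amplitude this large arises by chance
    from an isotropic (uniform) distribution."""
    a = np.radians(ra_deg)
    n = len(a)
    C, S = np.sum(np.cos(a)), np.sum(np.sin(a))
    amplitude = 2.0 * np.hypot(C, S) / n
    phase = np.degrees(np.arctan2(S, C)) % 360.0
    p_chance = np.exp(-n * amplitude**2 / 4.0)   # Rayleigh test
    return amplitude, phase, p_chance

rng = np.random.default_rng(4)
print(first_harmonic(rng.uniform(0, 360, size=1000)))   # isotropic sky
```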
Analysis of the cycle-to-cycle pressure distribution variations in dynamic stall
NASA Astrophysics Data System (ADS)
Harms, Tanner; Nikoueeyan, Pourya; Naughton, Jonathan
2017-11-01
Dynamic stall is an unsteady flow phenomenon observed on blades and wings that, despite decades of focused study, remains a challenging problem for rotorcraft and wind turbine applications. Traditionally, dynamic stall has been studied on pitch-oscillating airfoils by measuring the phase-averaged unsteady pressure distribution, from which the typical flow pattern may be observed and quantified. In cases where light to deep dynamic stall is observed, pressure distributions with high levels of variance are present in regions of separation. It was recently observed that, under certain conditions, this scatter may be the result of a two-state flow solution, as if there were a bifurcation in the unsteady pressure distribution behavior on the suction side of the airfoil. This is significant since phase-averaged dynamic stall data are often used to tune dynamic stall models and for validation of simulations of dynamic stall. In order to better understand this phenomenon, statistical analysis of the pressure data using probability density functions (PDFs) and other statistical approaches has been carried out for the SC 1094R8, DU97-W-300, and NACA 0015 airfoil geometries. This work uses airfoil data acquired under Army contract W911W60160C-0021, DOE Grant DE-SC0001261, and a gift from BP Alternative Energy North America, Inc.
Volume analysis of heat-induced cracks in human molars: A preliminary study
Sandholzer, Michael A.; Baron, Katharina; Heimel, Patrick; Metscher, Brian D.
2014-01-01
Context: Only a few methods have been published dealing with the visualization of heat-induced cracks inside bones and teeth. Aims: As a novel approach, this study used nondestructive X-ray microtomography (micro-CT) for volume analysis of heat-induced cracks to observe the reaction of human molars to various levels of thermal stress. Materials and Methods: Eighteen clinically extracted third molars were rehydrated and burned under controlled temperatures (400, 650, and 800°C) using an electric furnace with a heating rate of 25°C/min. The subsequent high-resolution scans (voxel size 17.7 μm) were made with a compact micro-CT scanner (SkyScan 1174). In total, 14 scans were automatically segmented with Definiens XD Developer 1.2, and three-dimensional (3D) models were computed with Visage Imaging Amira 5.2.2. The results of the automated segmentation were analyzed with an analysis of variance (ANOVA) and uncorrected post hoc least significant difference (LSD) tests using the Statistical Package for Social Sciences (SPSS) 17. A probability level of P < 0.05 was used as an index of statistical significance. Results: A temperature-dependent increase of heat-induced cracks was observed between the three temperature groups (P < 0.05, ANOVA post hoc LSD). In addition, the distributions and shapes of the heat-induced changes could be classified using the computed 3D models. Conclusion: The macroscopic heat-induced changes observed in this preliminary study correspond with previous observations of unrestored human teeth, yet the current observations also take into account the entire microscopic 3D expansion of heat-induced cracks within the dental hard tissues. Using the same experimental conditions proposed in the literature, this study confirms previous results, adds new observations, and offers new perspectives in the investigation of forensic evidence. PMID:25125923
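A minimal sketch of the ANOVA-plus-uncorrected-LSD analysis, approximating the LSD step by unadjusted pairwise t-tests on illustrative (not the study's) crack-volume data; the study itself used SPSS 17:

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Illustrative crack volumes (arbitrary units) per temperature group
groups = {400: np.array([1.1, 1.4, 0.9, 1.2, 1.0]),
          650: np.array([2.3, 2.8, 2.1, 2.6, 2.4]),
          800: np.array([4.0, 3.6, 4.4, 3.9, 4.2])}

f, p = stats.f_oneway(*groups.values())
print(f"ANOVA: F={f:.2f}, p={p:.4f}")

# Uncorrected post hoc comparisons at alpha = 0.05, with no
# multiple-comparison adjustment (in the spirit of Fisher's LSD).
for a, b in combinations(groups, 2):
    t, p = stats.ttest_ind(groups[a], groups[b])
    print(f"{a}C vs {b}C: p={p:.4f}")
```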
NASA Astrophysics Data System (ADS)
Ding, Xiangyi; Liu, Jiahong; Gong, Jiaguo
2018-02-01
Precipitation is one of the important factors in the water cycle and a main source of regional water resources. Analyzing the evolution of precipitation under a changing environment is of great significance for identifying the evolution of water resources, and can provide a scientific reference for their sustainable utilization and for the formulation of related policies and measures. Generally, such an analysis consists of three levels: analyzing the observed precipitation change based on measured data, exploring the possible factors responsible for that change, and estimating the future trend of precipitation under a changing environment. The Haihe river basin contains the political and cultural centre of China, and its climatic conditions have changed greatly in recent decades. This study analyses the evolution of precipitation in the basin under a changing environment based on observed meteorological data, GCMs and statistical methods. Firstly, based on observed precipitation data for 1961-2000 at 26 meteorological stations in the basin, the actual precipitation change is analyzed. Secondly, the observed precipitation change is attributed using the fingerprint-based attribution method, and its causes are identified. Finally, the future trend of precipitation in the basin under climate change is projected based on GCMs and a statistical downscaling model. The results indicate that: 1) during 1961-2000, precipitation in the basin showed a decreasing trend, with a possible change point in 1965; 2) natural variability may be the factor responsible for the observed precipitation change; 3) under future climate change, precipitation in the basin will increase slightly, by 4.8% compared with the average, and the extremes will not vary significantly.
Dahabreh, Issa J.; Sheldrick, Radley C.; Paulus, Jessica K.; Chung, Mei; Varvarigou, Vasileia; Jafri, Haseeb; Rassen, Jeremy A.; Trikalinos, Thomas A.; Kitsios, Georgios D.
2012-01-01
Aims Randomized controlled trials (RCTs) are the gold standard for assessing the efficacy of therapeutic interventions because randomization protects from biases inherent in observational studies. Propensity score (PS) methods, proposed as a potential solution to confounding of the treatment–outcome association, are widely used in observational studies of therapeutic interventions for acute coronary syndromes (ACS). We aimed to systematically assess agreement between observational studies using PS methods and RCTs on therapeutic interventions for ACS. Methods and results We searched for observational studies of interventions for ACS that used PS methods to estimate treatment effects on short- or long-term mortality. Using a standardized algorithm, we matched observational studies to RCTs based on patients’ characteristics, interventions, and outcomes (‘topics’), and we compared estimates of treatment effect between the two designs. When multiple observational studies or RCTs were identified for the same topic, we performed a meta-analysis and used the summary relative risk for comparisons. We matched 21 observational studies investigating 17 distinct clinical topics to 63 RCTs (median = 3 RCTs per observational study) for short-term (7 topics) and long-term (10 topics) mortality. Estimates from PS analyses differed statistically significantly from randomized evidence in two instances; however, observational studies reported more extreme beneficial treatment effects compared with RCTs in 13 of 17 instances (P = 0.049). Sensitivity analyses limited to large RCTs, and using alternative meta-analysis models yielded similar results. Conclusion For the treatment of ACS, observational studies using PS methods produce treatment effect estimates that are of more extreme magnitude compared with those from RCTs, although the differences are rarely statistically significant. PMID:22711757
Identifying unusual performance in Australian and New Zealand intensive care units from 2000 to 2010
Solomon, Patricia J; Kasza, Jessica; Moran, John L
2014-04-22
The Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database (APD) collects voluntary data on patient admissions to Australian and New Zealand intensive care units (ICUs). This paper presents an in-depth statistical analysis of risk-adjusted mortality of ICU admissions from 2000 to 2010 for the purpose of identifying ICUs with unusual performance. A cohort of 523,462 patients from 144 ICUs was analysed. For each ICU, the natural logarithm of the standardised mortality ratio (log-SMR) was estimated from a risk-adjusted, three-level hierarchical model. This is the first time a three-level model has been fitted to such a large ICU database anywhere. The analysis was conducted in three stages which included the estimation of a null distribution to describe usual ICU performance. Log-SMRs with appropriate estimates of standard errors are presented in a funnel plot using 5% false discovery rate thresholds. False coverage-statement rate confidence intervals are also presented. The observed numbers of deaths for ICUs identified as unusual are compared to the predicted true worst numbers of deaths under the model for usual ICU performance. Seven ICUs were identified as performing unusually over the period 2000 to 2010, in particular, demonstrating high risk-adjusted mortality compared to the majority of ICUs. Four of the seven were ICUs in private hospitals. Our three-stage approach to the analysis detected outlying ICUs which were not identified in a conventional (single) risk-adjusted model for mortality using SMRs to compare ICUs. We also observed a significant linear decline in mortality over the decade. Distinct yearly and weekly respiratory seasonal effects were observed across regions of Australia and New Zealand for the first time. The statistical approach proposed in this paper is intended to be used for the review of observed ICU and hospital mortality. Two important messages from our study are firstly, that comprehensive risk-adjustment is essential in modelling patient mortality for comparing performance, and secondly, that the appropriate statistical analysis is complicated.
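A simplified sketch of flagging unusual units at a false-discovery-rate threshold. It assumes a Poisson/normal approximation for the log-SMR under a null of usual performance, a deliberate simplification of the study's three-level hierarchical model; all numbers are illustrative:

```python
import numpy as np
from scipy import stats

def log_smr_flags(deaths_obs, deaths_exp, fdr=0.05):
    """Flag ICUs whose log standardised mortality ratio is extreme,
    using the Benjamini-Hochberg step-up procedure at level `fdr`.
    Under the null, log-SMR ~ N(0, 1/expected) by the delta method."""
    log_smr = np.log(deaths_obs / deaths_exp)
    z = log_smr * np.sqrt(deaths_exp)
    p = 2 * stats.norm.sf(np.abs(z))            # two-sided p-values
    order = np.argsort(p)
    m = len(p)
    passed = p[order] <= fdr * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    flagged = np.zeros(m, dtype=bool)
    flagged[order[:k]] = True
    return log_smr, flagged

deaths_obs = np.array([52.0, 40, 95, 60, 130])
deaths_exp = np.array([50.0, 42, 70, 61, 100])
print(log_smr_flags(deaths_obs, deaths_exp))
```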
NASA Astrophysics Data System (ADS)
Palozzi, Jason; Pantopoulos, George; Maravelis, Angelos G.; Nordsvan, Adam; Zelilidis, Avraam
2018-02-01
This investigation presents an outcrop-based integrated study of internal division analysis and statistical treatment of turbidite bed thickness applied to a Carboniferous deep-water channel-levee complex in the Myall Trough, southeast Australia. Turbidite beds of the studied succession are characterized by a range of sedimentary structures grouped into two main associations, a thick-bedded and a thin-bedded one, that reflect channel-fill and overbank/levee deposits, respectively. Three vertically stacked channel-levee cycles have been identified. Results of statistical analysis of bed thickness, grain-size and internal division patterns applied on the studied channel-levee succession, indicate that turbidite bed thickness data seem to be well characterized by a bimodal lognormal distribution, which is possibly reflecting the difference between deposition from lower-density flows (in a levee/overbank setting) and very high-density flows (in a channel fill setting). Power law and exponential distributions were observed to hold only for the thick-bedded parts of the succession and cannot characterize the whole bed thickness range of the studied sediments. The succession also exhibits non-random clustering of bed thickness and grain-size measurements. The studied sediments are also characterized by the presence of statistically detected fining-upward sandstone packets. A novel quantitative approach (change-point analysis) is proposed for the detection of those packets. Markov permutation statistics also revealed the existence of order in the alternation of internal divisions in the succession expressed by an optimal internal division cycle reflecting two main types of gravity flow events deposited within both thick-bedded conglomeratic and thin-bedded sandstone associations. The analytical methods presented in this study can be used as additional tools for quantitative analysis and recognition of depositional environments in hydrocarbon-bearing research of ancient deep-water channel-levee settings.
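One way to operationalize the bimodal-lognormal observation is to fit a two-component mixture in log-thickness space; a sketch under that assumption, with synthetic bed thicknesses and scikit-learn's GaussianMixture:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Synthetic bed thicknesses (cm): thin-bedded levee/overbank and
# thick-bedded channel-fill populations, each lognormal
thickness = np.concatenate([rng.lognormal(1.5, 0.4, 300),   # thin beds
                            rng.lognormal(3.5, 0.5, 120)])  # thick beds

# A bimodal lognormal is a two-component Gaussian mixture in log space
gm = GaussianMixture(n_components=2, random_state=0)
gm.fit(np.log(thickness).reshape(-1, 1))
print("log-means:", gm.means_.ravel())   # one mode per bed association
print("weights:  ", gm.weights_)
```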
Analysis of repeated measurement data in the clinical trials
Singh, Vineeta; Rana, Rakesh Kumar; Singhal, Richa
2013-01-01
Statistics is an integral part of clinical trials. Elements of statistics span clinical trial design, data monitoring, analyses, and reporting. A solid understanding of statistical concepts by clinicians improves the comprehension, and thus the resulting quality, of clinical trials. In biomedical research, researchers frequently use the t-test and ANOVA to compare means between groups of interest, irrespective of the nature of the data. In clinical trials, however, data are often recorded on the same patients at more than two time points. In such a situation, standard ANOVA procedures are not appropriate, as they do not account for dependencies between observations within subjects. To deal with such study data, Repeated Measures ANOVA should be used. In this article, the application of one-way Repeated Measures ANOVA is demonstrated using SPSS (Statistical Package for Social Sciences) version 15.0 on data collected at four time points (day 0, 15th day, 30th day, and 45th day) of a multicentre clinical trial conducted on Pandu Roga (~Iron Deficiency Anemia) with the Ayurvedic formulation Dhatrilauha. PMID:23930038
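A sketch of the same one-way repeated-measures analysis outside SPSS, using statsmodels' AnovaRM on synthetic long-format data; the variable names and effect sizes are illustrative:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(6)
days = [0, 15, 30, 45]                   # the four assessment time points
rows = []
for pid in range(20):                    # 20 patients
    base = rng.normal(9.0, 1.0)          # e.g., baseline haemoglobin
    for i, d in enumerate(days):
        # within-subject dependency: each patient improves from own baseline
        rows.append({"patient": pid, "day": d,
                     "hb": base + 0.4 * i + rng.normal(0, 0.3)})
df = pd.DataFrame(rows)

# One-way repeated measures ANOVA: 'day' is the within-subject factor
print(AnovaRM(df, depvar="hb", subject="patient", within=["day"]).fit())
```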
High order statistical signatures from source-driven measurements of subcritical fissile systems
NASA Astrophysics Data System (ADS)
Mattingly, John Kelly
1998-11-01
This research focuses on the development and application of high order statistical analyses applied to measurements performed with subcritical fissile systems driven by an introduced neutron source. The signatures presented are derived from counting statistics of the introduced source and radiation detectors that observe the response of the fissile system. It is demonstrated that successively higher order counting statistics possess progressively higher sensitivity to reactivity. Consequently, these signatures are more sensitive to changes in the composition, fissile mass, and configuration of the fissile assembly. Furthermore, it is shown that these techniques are capable of distinguishing the response of the fissile system to the introduced source from its response to any internal or inherent sources. This ability combined with the enhanced sensitivity of higher order signatures indicates that these techniques will be of significant utility in a variety of applications. Potential applications include enhanced radiation signature identification of weapons components for nuclear disarmament and safeguards applications and augmented nondestructive analysis of spent nuclear fuel. In general, these techniques expand present capabilities in the analysis of subcritical measurements.
Tiedeman, Claire; Ely, D. Matthew; Hill, Mary C.; O'Brien, Grady M.
2004-01-01
We develop a new observation‐prediction (OPR) statistic for evaluating the importance of system state observations to model predictions. The OPR statistic measures the change in prediction uncertainty produced when an observation is added to or removed from an existing monitoring network, and it can be used to guide refinement and enhancement of the network. Prediction uncertainty is approximated using a first‐order second‐moment method. We apply the OPR statistic to a model of the Death Valley regional groundwater flow system (DVRFS) to evaluate the importance of existing and potential hydraulic head observations to predicted advective transport paths in the saturated zone underlying Yucca Mountain and underground testing areas on the Nevada Test Site. Important existing observations tend to be far from the predicted paths, and many unimportant observations are in areas of high observation density. These results can be used to select locations at which increased observation accuracy would be beneficial and locations that could be removed from the network. Important potential observations are mostly in areas of high hydraulic gradient far from the paths. Results for both existing and potential observations are related to the flow system dynamics and coarse parameter zonation in the DVRFS model. If system properties in different locations are as similar as the zonation assumes, then the OPR results illustrate a data collection opportunity whereby observations in distant, high‐gradient areas can provide information about properties in flatter‐gradient areas near the paths. If this similarity is suspect, then the analysis produces a different type of data collection opportunity involving testing of model assumptions critical to the OPR results.
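A minimal sketch of the first-order second-moment calculation behind an OPR-type statistic: the percent change in prediction standard deviation when one observation is removed from the network. The matrix shapes and data are illustrative, not the DVRFS model:

```python
import numpy as np

def prediction_sd(J_obs, w, dz_dtheta):
    """First-order second-moment prediction standard deviation.

    J_obs: (n, k) sensitivities of observations to parameters
    w: (n,) observation weights (1 / observation variance)
    dz_dtheta: (k,) sensitivity of the prediction to the parameters"""
    cov_theta = np.linalg.inv(J_obs.T @ (w[:, None] * J_obs))
    return float(np.sqrt(dz_dtheta @ cov_theta @ dz_dtheta))

def opr_statistic(J_obs, w, dz_dtheta, i):
    """Percent increase in prediction uncertainty when observation i
    is removed (one common form of the OPR statistic)."""
    base = prediction_sd(J_obs, w, dz_dtheta)
    keep = np.arange(len(w)) != i
    return 100.0 * (prediction_sd(J_obs[keep], w[keep], dz_dtheta) - base) / base

rng = np.random.default_rng(7)
J = rng.normal(size=(30, 4))       # 30 head observations, 4 parameters
w = np.ones(30)
dz = rng.normal(size=4)            # sensitivity of an advective-path metric
print([round(opr_statistic(J, w, dz, i), 2) for i in range(3)])
```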
Ephemeris data and error analysis in support of a Comet Encke intercept mission
NASA Technical Reports Server (NTRS)
Yeomans, D. K.
1974-01-01
Utilizing an orbit determination based upon 65 observations over the 1961 - 1973 interval, ephemeris data were generated for the 1976-77, 1980-81 and 1983-84 apparitions of short period comet Encke. For the 1980-81 apparition, results from a statistical error analysis are outlined. All ephemeris and error analysis computations include the effects of planetary perturbations as well as the nongravitational accelerations introduced by the outgassing cometary nucleus. In 1980, excellent observing conditions and a close approach of comet Encke to the earth permit relatively small uncertainties in the cometary position errors and provide an excellent opportunity for a close flyby of a physically interesting comet.
Hayes, Andrew F; Rockwood, Nicholas J
2017-11-01
There have been numerous treatments in the clinical research literature about various design, analysis, and interpretation considerations when testing hypotheses about mechanisms and contingencies of effects, popularly known as mediation and moderation analysis. In this paper we address the practice of mediation and moderation analysis using linear regression in the pages of Behaviour Research and Therapy and offer some observations and recommendations, debunk some popular myths, describe some new advances, and provide an example of mediation, moderation, and their integration as conditional process analysis using the PROCESS macro for SPSS and SAS. Our goal is to nudge clinical researchers away from historically significant but increasingly old school approaches toward modifications, revisions, and extensions that characterize more modern thinking about the analysis of the mechanisms and contingencies of effects. Copyright © 2016 Elsevier Ltd. All rights reserved.
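As a sketch of the modern approach the authors advocate, the code below bootstraps an indirect (mediation) effect from two linear regressions. It re-implements the generic computation with synthetic data; it is not the PROCESS macro itself:

```python
import numpy as np

def indirect_effect(x, m, y):
    """a*b indirect effect from two OLS fits: m ~ x and y ~ x + m."""
    a = np.polyfit(x, m, 1)[0]                       # x -> mediator path
    X = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]      # mediator -> y path
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the indirect effect, the inferential
    approach popularized for mediation analysis."""
    rng = np.random.default_rng(seed)
    n = len(x)
    est = np.array([indirect_effect(*(v[idx] for v in (x, m, y)))
                    for idx in rng.integers(0, n, size=(n_boot, n))])
    return np.quantile(est, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(8)
x = rng.normal(size=200)                 # treatment
m = 0.5 * x + rng.normal(size=200)       # mediator
y = 0.4 * m + 0.2 * x + rng.normal(size=200)
print(bootstrap_ci(x, m, y))             # CI excluding 0 suggests mediation
```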
High Agreement and High Prevalence: The Paradox of Cohen's Kappa.
Zec, Slavica; Soriani, Nicola; Comoretto, Rosanna; Baldi, Ileana
2017-01-01
Cohen's Kappa is the most widely used agreement statistic in the literature. However, under certain conditions it is affected by a paradox which returns biased estimates of the statistic itself. The aim of this study is to provide sufficient information to allow the reader to make an informed choice of the correct agreement measure, by underlining some optimal properties of Gwet's AC1 in comparison to Cohen's Kappa, using a real data example. During the process of a literature review, we asked a panel of three evaluators to judge the quality of 57 randomized controlled trials, assigning a score to each trial using the Jadad scale. Quality was evaluated according to the following dimensions: adopted design, randomization unit, and type of primary endpoint. For each of these features, the agreement between the three evaluators was calculated using Cohen's Kappa statistic and Gwet's AC1 statistic, and the values were then compared with the observed agreement. The values of Cohen's Kappa statistic would lead one to believe that the agreement levels for the variables Unit, Design and Primary Endpoints are totally unsatisfactory. The AC1 statistic, on the contrary, shows plausible values which are in line with the respective values of the observed concordance. We conclude that it would always be appropriate to adopt the AC1 statistic, thus bypassing any risk of incurring the paradox and drawing wrong conclusions about the results of agreement analysis.
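The two statistics differ only in how chance agreement is estimated; a sketch for a two-rater K-category table, with a skewed-prevalence example that reproduces the paradox (the counts are illustrative, not the study's data):

```python
import numpy as np

def kappa_and_ac1(table):
    """Cohen's Kappa and Gwet's AC1 from a K x K agreement table;
    table[i][j] counts items rated i by rater 1 and j by rater 2."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    po = np.trace(t) / n                           # observed agreement
    p1, p2 = t.sum(axis=1) / n, t.sum(axis=0) / n  # marginal proportions
    pe_kappa = np.sum(p1 * p2)                     # Kappa's chance agreement
    q = (p1 + p2) / 2.0
    pe_ac1 = np.sum(q * (1 - q)) / (len(q) - 1)    # AC1's chance agreement
    return (po - pe_kappa) / (1 - pe_kappa), (po - pe_ac1) / (1 - pe_ac1)

# 94% observed agreement with highly skewed prevalence: Kappa collapses
# (the paradox) while AC1 stays close to the observed agreement.
print(kappa_and_ac1([[46, 2], [1, 1]]))   # approx (0.37, 0.93)
```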
A statistical model for water quality predictions from a river discharge using coastal observations
NASA Astrophysics Data System (ADS)
Kim, S.; Terrill, E. J.
2007-12-01
Understanding and predicting coastal ocean water quality has benefits for reducing human health risks, protecting the environment, and improving local economies which depend on clean beaches. Continuous observations of coastal physical oceanography increase the understanding of the processes which control the fate and transport of a riverine plume which potentially contains high levels of contaminants from the upstream watershed. A data-driven model of the fate and transport of river plume water from the Tijuana River has been developed using surface current observations provided by a network of HF radar operated as part of a local coastal observatory that has been in place since 2002. The model outputs are compared with water quality sampling of shoreline indicator bacteria, and the skill of an alarm for low water quality is evaluated using the receiver operating characteristic (ROC) curve. In addition, statistical analysis of beach closures in comparison with environmental variables is also discussed.
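A sketch of the alarm-skill evaluation step, assuming a scalar model output and binary exceedance observations; the data are synthetic and the threshold rule (maximizing TPR minus FPR) is one common convention:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(9)
# Illustrative data: model-predicted plume exposure at a beach vs.
# whether shoreline indicator-bacteria sampling exceeded the standard
exposure = rng.normal(size=300)
exceed = (exposure + rng.normal(scale=1.0, size=300)) > 1.0

fpr, tpr, thresholds = roc_curve(exceed, exposure)
print("AUC:", roc_auc_score(exceed, exposure))
best = np.argmax(tpr - fpr)        # pick an operating point for the alarm
print("alarm threshold:", thresholds[best],
      "TPR:", tpr[best], "FPR:", fpr[best])
```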
Walden-Schreiner, Chelsey; Leung, Yu-Fai
2013-07-01
Ecological impacts associated with nature-based recreation and tourism can compromise park and protected area goals if left unrestricted. Protected area agencies are increasingly incorporating indicator-based management frameworks into their management plans to address visitor impacts. Development of indicators requires empirical evaluation of indicator measures and examining their ecological and social relevance. This study addresses the development of the informal trail indicator in Yosemite National Park by spatially characterizing visitor use in open landscapes and integrating use patterns with informal trail condition data to examine their spatial association. Informal trail and visitor use data were collected concurrently during July and August of 2011 in three, high-use meadows of Yosemite Valley. Visitor use was clustered at statistically significant levels in all three study meadows. Spatial data integration found no statistically significant differences between use patterns and trail condition class. However, statistically significant differences were found between the distance visitors were observed from informal trails and visitor activity type with active activities occurring closer to trail corridors. Gender was also found to be significant with male visitors observed further from trail corridors. Results highlight the utility of integrated spatial analysis in supporting indicator-based monitoring and informing management of open landscapes. Additional variables for future analysis and methodological improvements are discussed.
Wiedermann, Wolfgang; Li, Xintong
2018-04-16
In nonexperimental data, at least three possible explanations exist for the association of two variables x and y: (1) x is the cause of y, (2) y is the cause of x, or (3) an unmeasured confounder is present. Statistical tests that identify which of the three explanatory models fits best would be a useful adjunct to the use of theory alone. The present article introduces one such statistical method, direction dependence analysis (DDA), which assesses the relative plausibility of the three explanatory models on the basis of higher-moment information about the variables (i.e., skewness and kurtosis). DDA involves the evaluation of three properties of the data: (1) the observed distributions of the variables, (2) the residual distributions of the competing models, and (3) the independence properties of the predictors and residuals of the competing models. When the observed variables are nonnormally distributed, we show that DDA components can be used to uniquely identify each explanatory model. Statistical inference methods for model selection are presented, and macros to implement DDA in SPSS are provided. An empirical example is given to illustrate the approach. Conceptual and empirical considerations are discussed for best-practice applications in psychological data, and sample size recommendations based on previous simulation studies are provided.
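A simplified screening version of the three DDA components, using residual skewness/kurtosis and a crude higher-moment independence check; this illustrates the idea under stated assumptions and is not the full SPSS implementation the authors provide:

```python
import numpy as np
from scipy import stats

def dda_components(x, y):
    """Compare the two directional candidates x->y and y->x.

    Under the correctly specified model (with a nonnormal cause), the
    residuals should be closer to normal and independent of the
    predictor beyond mere zero correlation."""
    out = {}
    for label, (pred, resp) in {"x->y": (x, y), "y->x": (y, x)}.items():
        slope, intercept = np.polyfit(pred, resp, 1)
        resid = resp - (intercept + slope * pred)
        out[label] = {
            "resid_skew": stats.skew(resid),
            "resid_kurt": stats.kurtosis(resid),
            # crude nonlinear-independence check: correlate squares
            "indep_p": stats.pearsonr(pred**2, resid**2)[1],
        }
    return out

rng = np.random.default_rng(10)
x = rng.exponential(size=500) - 1.0            # nonnormal cause
y = 0.6 * x + rng.normal(scale=0.5, size=500)  # true model is x -> y
print(dda_components(x, y))
```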
EVALUATION OF THE EXTRACELLULAR MATRIX OF INJURED SUPRASPINATUS IN RATS
Almeida, Luiz Henrique Oliveira; Ikemoto, Roberto; Mader, Ana Maria; Pinhal, Maria Aparecida Silva; Munhoz, Bruna; Murachovsky, Joel
2016-01-01
ABSTRACT Objective: To evaluate the evolution of injuries of the supraspinatus muscle by immunohistochemistry (IHC) and anatomopathological analysis in an animal model (Wistar rats). Methods: Twenty-five Wistar rats were submitted to complete injury of the supraspinatus tendon and subsequently sacrificed in groups of five animals at the following time points: immediately after the injury, 24 h, 48 h, 30 days, and three months after the injury. All groups underwent histological and IHC analysis. Results: Regarding vascular proliferation and inflammatory infiltrate, we found a statistically significant difference between groups 1 (control group) and 2 (24 h after injury). IHC analysis showed a statistically significant difference in the expression of vascular endothelial growth factor (VEGF) between groups 1 and 2, and in type 1 collagen (Col-1) between groups 1 and 4. Conclusion: We observed changes in the extracellular matrix components compatible with remodeling and healing. Remodeling is more intense 24 h after injury. However, VEGF and Col-1 are substantially increased at 24 h and 30 days after the injury, respectively. Level of Evidence I, Experimental Study. PMID:26997907
Neyeloff, Jeruza L; Fuchs, Sandra C; Moreira, Leila B
2012-01-20
Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General-purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but no guide was previously available. We constructed a step-by-step guide to performing a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. It is possible to conduct a meta-analysis using only Microsoft Excel. More importantly, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software.
Statistical analysis of the calibration procedure for personnel radiation measurement instruments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bush, W.J.; Bengston, S.J.; Kalbeitzer, F.L.
1980-11-01
Thermoluminescent analyzer (TLA) calibration procedures were used to estimate personnel radiation exposure levels at the Idaho National Engineering Laboratory (INEL). A statistical analysis is presented herein based on data collected over a six-month period in 1979 on four TLAs located in the Department of Energy (DOE) Radiological and Environmental Sciences Laboratory at the INEL. The data were collected according to the day-to-day procedure in effect at that time. Both gamma and beta radiation models are developed. Observed TLA readings of thermoluminescent dosimeters are correlated with known radiation levels. This correlation is then used to predict unknown radiation doses from future analyzer readings of personnel thermoluminescent dosimeters. The statistical techniques applied in this analysis include weighted linear regression, estimation of systematic and random error variances, prediction interval estimation using Scheffé's theory of calibration, estimation of the ratio of the means of two bivariate normally distributed random variables and their corresponding confidence limits according to Kendall and Stuart, tests of normality, experimental design, a comparison between instruments, and quality control.
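A minimal sketch of the calibration idea described above, assuming a linear dose-response with reading-dependent variance; inverse prediction then maps a new analyzer reading to an estimated dose. The numbers and the variance model are illustrative, not the INEL procedure.

```python
import numpy as np

# Known calibration doses (mR) and observed analyzer readings; illustrative values.
dose = np.array([10., 20., 50., 100., 200.])
reading = np.array([12.1, 23.8, 58.0, 119.5, 236.0])
weights = 1.0 / reading            # assume variance roughly proportional to reading

# Weighted least-squares fit of reading = a + b * dose
W = np.diag(weights)
X = np.column_stack([np.ones_like(dose), dose])
a, b = np.linalg.solve(X.T @ W @ X, X.T @ W @ reading)

# Inverse prediction (classical calibration): dose estimate for a new reading
new_reading = 75.0
print((new_reading - a) / b)
```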
The Global Oscillation Network Group site survey. 1: Data collection and analysis methods
NASA Technical Reports Server (NTRS)
Hill, Frank; Fischer, George; Grier, Jennifer; Leibacher, John W.; Jones, Harrison B.; Jones, Patricia P.; Kupke, Renate; Stebbins, Robin T.
1994-01-01
The Global Oscillation Network Group (GONG) Project is planning to place a set of instruments around the world to observe solar oscillations as continuously as possible for at least three years. The Project has now chosen the sites that will comprise the network. This paper describes the methods of data collection and analysis that were used to make this decision. Solar irradiance data were collected with a one-minute cadence at fifteen sites around the world and analyzed to produce statistics of cloud cover, atmospheric extinction, and transparency power spectra at the individual sites. Nearly 200 reasonable six-site networks were assembled from the individual stations, and a set of statistical measures of the performance of the networks was analyzed using a principal component analysis. An accompanying paper presents the results of the survey.
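A minimal sketch of the principal component step, assuming each candidate network has been reduced to a vector of performance statistics; the first principal component then provides a single axis along which networks can be ranked. The matrix below is illustrative.

```python
import numpy as np

# Rows: candidate networks; columns: performance statistics (illustrative numbers).
stats = np.array([[0.93, 0.12, 0.85],
                  [0.89, 0.15, 0.80],
                  [0.95, 0.10, 0.88],
                  [0.91, 0.14, 0.83]])

centered = stats - stats.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[0]          # projection onto the first principal component
print(np.argsort(scores)[::-1])    # networks ranked along the dominant axis
```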
A methodological analysis of chaplaincy research: 2000-2009.
Galek, Kathleen; Flannelly, Kevin J; Jankowski, Katherine R B; Handzo, George F
2011-01-01
The present article presents a comprehensive review and analysis of quantitative research conducted in the United States on chaplaincy and closely related topics published between 2000 and 2009. A combined search strategy identified 49 quantitative studies in 13 journals. The analysis focuses on the methodological sophistication of the studies, compared to earlier research on chaplaincy and pastoral care. Cross-sectional surveys of convenience samples still dominate the field, but sample sizes have increased somewhat over the past three decades. Reporting of the validity and reliability of measures continues to be low, although reporting of response rates has improved. Improvements in the use of inferential statistics and statistical controls were also observed, compared to previous research. The authors conclude that more experimental research is needed on chaplaincy, along with an increased use of hypothesis testing, regardless of the research designs that are used.
NASA Technical Reports Server (NTRS)
Edwards, B. F.; Waligora, J. M.; Horrigan, D. J., Jr.
1985-01-01
This analysis was done to determine whether various decompression response groups could be characterized by the pooled nitrogen (N2) washout profiles of the group members; pooling individual washout profiles provided a smooth, time-dependent function of means representative of each decompression response group. No statistically significant differences were detected. The statistical comparisons of the profiles were performed by means of univariate weighted t-tests at each 5-minute profile point, at significance levels of 5 and 10 percent. The estimated powers of the tests (i.e., the probabilities of detecting the observed differences in the pooled profiles) were of the order of 8 to 30 percent.
Multivariate analysis in thoracic research.
Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego
2015-03-01
Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods were developed to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, multivariate statistical methods should be used. Multivariate methods are designed to analyze several variables simultaneously, i.e., different variables for each person or object studied. Keep in mind at all times that all variables must be treated in a way that accurately reflects the reality of the problem addressed. There are different types of multivariate analysis (dependence, interdependence, and structural methods), and each should be employed according to the type of variables to be analyzed. In conclusion, multivariate methods are ideal for the analysis of large data sets and for finding cause-and-effect relationships between variables; there is a wide range of analysis types that we can use.
NASA Astrophysics Data System (ADS)
Flores-Marquez, Leticia Elsa; Ramirez Rojaz, Alejandro; Telesca, Luciano
2015-04-01
Two statistical approaches are analyzed for two different types of data sets: the seismicity generated by subduction processes off the southern Pacific coast of Mexico between 2005 and 2012, and synthetic seismic data generated by a stick-slip experimental model. The statistical methods used for the present study are the visibility graph, to investigate the time dynamics of the series, and the scaled probability density function in the natural time domain, to investigate the critical order of the system. This comparison has the purpose of showing the similarities between the dynamical behaviors of both types of data sets from the point of view of critical systems. The observed behaviors allow us to conclude that the experimental setup globally reproduces the behavior obtained when the same statistical approaches are applied to the seismicity of the subduction zone. The present study was supported by the Bilateral Project Italy-Mexico "Experimental stick-slip models of tectonic faults: innovative statistical approaches applied to synthetic seismic sequences", jointly funded by MAECI (Italy) and AMEXCID (Mexico) in the framework of the Bilateral Agreement for Scientific and Technological Cooperation PE 2014-2016.
Generalized statistical mechanics approaches to earthquakes and tectonics.
Vallianatos, Filippos; Papadakis, Giorgos; Michas, Georgios
2016-12-01
Despite the extreme complexity that characterizes the mechanism of the earthquake generation process, simple empirical scaling relations apply to the collective properties of earthquakes and faults in a variety of tectonic environments and scales. The physical characterization of those properties and the scaling relations that describe them attract a wide scientific interest and are incorporated in the probabilistic forecasting of seismicity on local, regional and planetary scales. Considerable progress has been made in the analysis of the statistical mechanics of earthquakes, which, based on the principle of entropy, can provide a physical rationale to the macroscopic properties frequently observed. The scale-invariant properties, the (multi) fractal structures and the long-range interactions that have been found to characterize fault and earthquake populations have recently led to the consideration of non-extensive statistical mechanics (NESM) as a consistent statistical mechanics framework for the description of seismicity. The consistency between NESM and observations has been demonstrated in a series of publications on seismicity, faulting, rock physics and other fields of geosciences. The aim of this review is to present in a concise manner the fundamental macroscopic properties of earthquakes and faulting and how these can be derived by using the notions of statistical mechanics and NESM, providing further insights into earthquake physics and fault growth processes. PMID:28119548
Detection of semi-volatile organic compounds in permeable ...
The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete, and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January 2010. This paper describes a subset of the water quality analysis, the analysis of semivolatile organic compounds (SVOCs), to determine whether hydrocarbons were present in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected on 11 dates over a 3-year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical tests) were detected at a rate of 10% or less; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5%). Fundamental and exploratory statistical analyses were performed on this latter group of results by grouping them by surface type. The statistical analyses were limited by the low frequency of detections and by sample dilutions, which impacted detection limits. The infiltrate data through the three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware comparison hypothesis test. Additionally Spearman Rank order non-parame
Robustness of fit indices to outliers and leverage observations in structural equation modeling.
Yuan, Ke-Hai; Zhong, Xiaoling
2013-06-01
Normal-distribution-based maximum likelihood (NML) is the most widely used method in structural equation modeling (SEM), although practical data tend to be nonnormally distributed. The effect of nonnormally distributed data or data contamination on the normal-distribution-based likelihood ratio (LR) statistic is well understood due to many analytical and empirical studies. In SEM, fit indices are used as widely as the LR statistic. In addition to NML, robust procedures have been developed for more efficient and less biased parameter estimates with practical data. This article studies the effect of outliers and leverage observations on fit indices following NML and two robust methods. Analysis and empirical results indicate that good leverage observations following NML and one of the robust methods lead most fit indices to give more support to the substantive model. While outliers tend to make a good model superficially bad according to many fit indices following NML, they have little effect on those following the two robust procedures. Implications of the results to data analysis are discussed, and recommendations are provided regarding the use of estimation methods and interpretation of fit indices. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Evans, Ian N.; Primini, Francis A.; Glotfelty, Kenny J.; Anderson, Craig S.; Bonaventura, Nina R.; Chen, Judy C.; Davis, John E.; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Galle, Elizabeth C.; Gibbs, Danny G., II; Grier, John D.; Hain, Roger M.; Hall, Diane M.; Harbo, Peter N.; He, Xiangqun Helen; Houck, John C.; Karovska, Margarita; Kashyap, Vinay L.; Lauer, Jennifer; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph B.; Mitschang, Arik W.; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Nowak, Michael A.; Plummer, David A.; Refsdal, Brian L.; Rots, Arnold H.; Siemiginowska, Aneta; Sundheim, Beth A.; Tibbetts, Michael S.; Van Stone, David W.; Winkelman, Sherry L.; Zografou, Panagoula
2010-07-01
The Chandra Source Catalog (CSC) is a general purpose virtual X-ray astrophysics facility that provides access to a carefully selected set of generally useful quantities for individual X-ray sources, and is designed to satisfy the needs of a broad-based group of scientists, including those who may be less familiar with astronomical data analysis in the X-ray regime. The first release of the CSC includes information about 94,676 distinct X-ray sources detected in a subset of public Advanced CCD Imaging Spectrometer imaging observations from roughly the first eight years of the Chandra mission. This release of the catalog includes point and compact sources with observed spatial extents ≲30''. The catalog (1) provides access to the best estimates of the X-ray source properties for detected sources, with good scientific fidelity, and directly supports scientific analysis using the individual source data; (2) facilitates analysis of a wide range of statistical properties for classes of X-ray sources; and (3) provides efficient access to calibrated observational data and ancillary data products for individual X-ray sources, so that users can perform detailed further analysis using existing tools. The catalog includes real X-ray sources detected with flux estimates that are at least 3 times their estimated 1σ uncertainties in at least one energy band, while maintaining the number of spurious sources at a level of ≲1 false source per field for a 100 ks observation. For each detected source, the CSC provides commonly tabulated quantities, including source position, extent, multi-band fluxes, hardness ratios, and variability statistics, derived from the observations in which the source is detected. In addition to these traditional catalog elements, for each X-ray source the CSC includes an extensive set of file-based data products that can be manipulated interactively, including source images, event lists, light curves, and spectra from each observation in which a source is detected.
Optimizing fixed observational assets in a coastal observatory
NASA Astrophysics Data System (ADS)
Frolov, Sergey; Baptista, António; Wilkin, Michael
2008-11-01
The proliferation of coastal observatories necessitates an objective approach to managing observational assets. In this article, we used our experience in the coastal observatory for the Columbia River estuary and plume to identify and address common problems in managing fixed observational assets, such as salinity, temperature, and water level sensors attached to pilings and moorings. Specifically, we addressed the following problems: assessing the quality of an existing array, adding stations to an existing array, removing stations from an existing array, validating an array design, and targeting an array toward data assimilation or monitoring. Our analysis was based on a combination of methods from the oceanographic and statistical literature, mainly on the statistical machinery of the best linear unbiased estimator. The key information required for our analysis was the covariance structure for a field of interest, which was computed from the output of assimilated and non-assimilated models of the Columbia River estuary and plume. The network optimization experiments in the Columbia River estuary and plume proved to be successful, largely withstanding the scrutiny of sensitivity and validation studies, and hence providing valuable insight into the optimization and operation of the existing observational network. Our success in the Columbia River estuary and plume suggests that algorithms for optimal placement of sensors are reaching maturity and are likely to play a significant role in the design of emerging ocean observatories, such as the United States' Ocean Observatories Initiative (OOI) and Integrated Ocean Observing System (IOOS) observatories, and smaller regional observatories.
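A minimal sketch of the best-linear-unbiased-estimator machinery that underlies this kind of array assessment, assuming the field covariance has been estimated from model output; the trace of the posterior covariance measures how well a candidate sensor set constrains the unobserved field. The covariance here is randomly generated for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Covariance of the field at 6 candidate locations, estimated from model output
# (here: a random symmetric positive-definite matrix purely for illustration).
A = rng.standard_normal((6, 6))
C = A @ A.T + 6 * np.eye(6)

def analysis_error(C, obs_idx, obs_noise=0.1):
    """Total posterior variance of the field given sensors at obs_idx (BLUE/kriging update)."""
    o = np.array(obs_idx)
    Coo = C[np.ix_(o, o)] + obs_noise * np.eye(len(o))   # observed-observed + noise
    Cfo = C[:, o]                                        # field-observed covariance
    post = C - Cfo @ np.linalg.solve(Coo, Cfo.T)         # posterior covariance
    return np.trace(post)

# Compare two candidate 2-sensor arrays by expected analysis error
print(analysis_error(C, [0, 3]), analysis_error(C, [1, 2]))
```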
Wear behavior of AA 5083/SiC nano-particle metal matrix composite: Statistical analysis
NASA Astrophysics Data System (ADS)
Hussain Idrisi, Amir; Ismail Mourad, Abdel-Hamid; Thekkuden, Dinu Thomas; Christy, John Victor
2018-03-01
This paper reports a statistical analysis of the wear characteristics of AA5083/SiC nanocomposite. Aluminum matrix composites with different wt.% (0%, 1%, and 2%) of SiC nanoparticles were fabricated by the stir casting route. The developed composites were used in the manufacturing of spur gears on which the study was conducted. A specially designed test rig was used to test the wear performance of the gears. The wear was investigated under different conditions of applied load (10 N, 20 N, and 30 N) and operation time (30 mins, 60 mins, 90 mins, and 120 mins). The analysis was carried out at room temperature under a constant speed of 1450 rpm. The wear parameters were optimized by using Taguchi's method. In this statistical approach, an L27 orthogonal array was selected for the analysis of the output. Furthermore, analysis of variance (ANOVA) was used to investigate the influence of applied load, operation time, and SiC wt.% on wear behaviour. The wear resistance was analyzed by selecting the "smaller is better" characteristic as the objective of the model. From this research, it is observed that operation time and SiC wt.% have the most significant effect on the wear performance, followed by the applied load.
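A minimal sketch of the Taguchi "smaller is better" signal-to-noise ratio used in such an analysis; the wear values are illustrative, not the measured data.

```python
import numpy as np

def sn_smaller_is_better(y):
    """Taguchi S/N ratio for a 'smaller is better' response (e.g., wear)."""
    y = np.asarray(y, float)
    return -10.0 * np.log10(np.mean(y**2))

# Replicate wear measurements for one factor combination (illustrative units)
print(sn_smaller_is_better([0.12, 0.15, 0.11]))
```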
Order statistics applied to the most massive and most distant galaxy clusters
NASA Astrophysics Data System (ADS)
Waizmann, J.-C.; Ettori, S.; Bartelmann, M.
2013-06-01
In this work, we present an analytic framework for calculating the individual and joint distributions of the nth most massive or nth highest redshift galaxy cluster for a given survey characteristic allowing us to formulate Λ cold dark matter (ΛCDM) exclusion criteria. We show that the cumulative distribution functions steepen with increasing order, giving them a higher constraining power with respect to the extreme value statistics. Additionally, we find that the order statistics in mass (being dominated by clusters at lower redshifts) is sensitive to the matter density and the normalization of the matter fluctuations, whereas the order statistics in redshift is particularly sensitive to the geometric evolution of the Universe. For a fixed cosmology, both order statistics are efficient probes of the functional shape of the mass function at the high-mass end. To allow a quick assessment of both order statistics, we provide fits as a function of the survey area that allow percentile estimation with an accuracy better than 2 per cent. Furthermore, we discuss the joint distributions in the two-dimensional case and find that for the combination of the largest and the second largest observation, it is most likely to find them realized with similar values, with a broadly peaked distribution. When combining the largest observation with higher orders, it is more likely to find a larger gap between the observations, and when combining higher orders in general, the joint probability density function peaks more strongly. Having introduced the theory, we apply the order statistical analysis to the South Pole Telescope (SPT) massive cluster sample and the Meta-Catalogue of X-ray detected Clusters of galaxies (MCXC) and find that the 10 most massive clusters in the sample are consistent with ΛCDM and the Tinker mass function. For the order statistics in redshift, we find a discrepancy between the data and the theoretical distributions, which could in principle indicate a deviation from the standard cosmology. However, we attribute this deviation to the uncertainty in the modelling of the SPT survey selection function. In turn, by assuming the ΛCDM reference cosmology, order statistics can also be utilized for consistency checks of the completeness of the observed sample and of the modelling of the survey selection function.
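For reference, the standard order-statistics result that underpins this kind of analysis, stated in generic notation rather than the paper's own: if N cluster masses are drawn independently from a parent cumulative distribution F(m), the nth largest value M_(n) is at most m exactly when at most n-1 draws exceed m.

```latex
% CDF of the n-th largest of N i.i.d. draws with parent CDF F(m):
P\!\left(M_{(n)} \le m\right) \;=\;
\sum_{k=0}^{n-1} \binom{N}{k}\,\bigl[1 - F(m)\bigr]^{k}\,F(m)^{\,N-k}
```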
Applications of Bayesian Statistics to Problems in Gamma-Ray Bursts
NASA Technical Reports Server (NTRS)
Meegan, Charles A.
1997-01-01
This presentation will describe two applications of Bayesian statistics to gamma-ray bursts (GRBs). The first attempts to quantify the evidence for a cosmological versus galactic origin of GRBs using only the observations of the dipole and quadrupole moments of the angular distribution of bursts. The cosmological hypothesis predicts isotropy, while the galactic hypothesis is assumed to produce a uniform probability distribution over positive values for these moments. The observed isotropic distribution indicates that the Bayes factor for the cosmological hypothesis over the galactic hypothesis is about 300. Another application of Bayesian statistics is in the estimation of chance associations of optical counterparts with galaxies. The Bayesian approach is preferred to frequentist techniques here because it easily accounts for galaxy mass distributions and because one can incorporate three disjoint hypotheses: (1) bursts come from galactic centers, (2) bursts come from galaxies in proportion to luminosity, and (3) bursts do not come from external galaxies. This technique was used in the analysis of the optical counterpart to GRB 970228.
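In generic form (the abstract does not give the specific likelihoods), the Bayes factor compares how well the two hypotheses predict the observed dipole and quadrupole moments D:

```latex
B \;=\; \frac{P(D \mid H_{\mathrm{cosmological}})}{P(D \mid H_{\mathrm{galactic}})}
  \;=\; \frac{\int P(D \mid \theta_1, H_1)\,\pi(\theta_1)\,d\theta_1}
             {\int P(D \mid \theta_2, H_2)\,\pi(\theta_2)\,d\theta_2}
```

An observed value of about 300 therefore constitutes strong evidence for the cosmological hypothesis.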
NASA Technical Reports Server (NTRS)
Lien, Guo-Yuan; Kalnay, Eugenia; Miyoshi, Takemasa; Huffman, George J.
2016-01-01
Assimilation of satellite precipitation data into numerical models presents several difficulties, with two of the most important being the non-Gaussian error distributions associated with precipitation and large model and observation errors. As a result, improving the model forecast beyond a few hours by assimilating precipitation has been found to be difficult. To identify the challenges and propose practical solutions to the assimilation of precipitation, statistics are calculated for global precipitation in a low-resolution NCEP Global Forecast System (GFS) model and the TRMM Multisatellite Precipitation Analysis (TMPA). The samples are constructed using the same model with the same forecast period, observation variables, and resolution as in the follow-on GFS-TMPA precipitation assimilation experiments presented in the companion paper. The statistical results indicate that the T62 and T126 GFS models generally have a positive bias in precipitation compared to the TMPA observations, and that the simulation of marine stratocumulus precipitation is not realistic in the T62 GFS model. It is necessary to apply to precipitation either the commonly used logarithm transformation or the newly proposed Gaussian transformation to obtain a better relationship between the model and observational precipitation. When the Gaussian transformations are separately applied to the model and observational precipitation, they serve as a bias correction that corrects the amplitude-dependent biases. In addition, using a spatially and/or temporally averaged precipitation variable, such as the 6-h accumulated precipitation, should be advantageous for precipitation assimilation.
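A minimal sketch of a rank-based Gaussian transformation of the kind proposed, assuming average ranks handle ties (including the spike at zero precipitation) adequately; this illustrates the idea rather than reproducing the authors' exact transform.

```python
import numpy as np
from scipy import stats

def gaussian_transform(x):
    """Map a sample to standard-normal quantiles via its empirical ranks."""
    x = np.asarray(x, float)
    ranks = stats.rankdata(x)                  # average ranks handle ties (incl. zeros)
    u = ranks / (len(x) + 1.0)                 # plotting positions in (0, 1)
    return stats.norm.ppf(u)

# 6-h accumulated precipitation with a spike at zero (illustrative values, mm)
precip = np.array([0., 0., 0., 0.2, 1.5, 3.0, 8.2, 0., 12.0, 0.5])
print(gaussian_transform(precip).round(2))
```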
Pounds, Stan; Cheng, Cheng; Cao, Xueyuan; Crews, Kristine R.; Plunkett, William; Gandhi, Varsha; Rubnitz, Jeffrey; Ribeiro, Raul C.; Downing, James R.; Lamba, Jatinder
2009-01-01
Motivation: In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables. Results: Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Availability: Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org. Contact: stanley.pounds@stjude.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19528086
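A minimal sketch of the PROMISE statistic as described: per-endpoint association statistics are projected onto the vector of biologically most interesting values, and significance is assessed by permutation. The data and the choice of Pearson correlation as the association measure are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def promise_stat(g, endpoints, interesting):
    """Dot product of per-endpoint correlations with the 'most interesting' pattern."""
    r = np.array([np.corrcoef(g, e)[0, 1] for e in endpoints])
    return r @ interesting

# One genomic variable, two endpoints, and the pattern of interest (+1, -1):
g = rng.standard_normal(50)
endpoints = [g + rng.standard_normal(50), -g + rng.standard_normal(50)]
interesting = np.array([1.0, -1.0])

obs = promise_stat(g, endpoints, interesting)
perm = np.array([promise_stat(rng.permutation(g), endpoints, interesting)
                 for _ in range(2000)])
print(obs, np.mean(perm >= obs))               # statistic and permutation p-value
```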
Observational Word Learning: Beyond Propose-But-Verify and Associative Bean Counting.
Roembke, Tanja; McMurray, Bob
2016-04-01
Learning new words is difficult. In any naming situation, there are multiple possible interpretations of a novel word. Recent approaches suggest that learners may solve this problem by tracking co-occurrence statistics between words and referents across multiple naming situations (e.g. Yu & Smith, 2007), overcoming the ambiguity in any one situation. Yet, there remains debate around the underlying mechanisms. We conducted two experiments in which learners acquired eight word-object mappings using cross-situational statistics while eye-movements were tracked. These addressed four unresolved questions regarding the learning mechanism. First, eye-movements during learning showed evidence that listeners maintain multiple hypotheses for a given word and bring them all to bear in the moment of naming. Second, trial-by-trial analyses of accuracy suggested that listeners accumulate continuous statistics about word/object mappings, over and above prior hypotheses they have about a word. Third, consistent, probabilistic context can impede learning, as false associations between words and highly co-occurring referents are formed. Finally, a number of factors not previously considered in prior analysis impact observational word learning: knowledge of the foils, spatial consistency of the target object, and the number of trials between presentations of the same word. This evidence suggests that observational word learning may derive from a combination of gradual statistical or associative learning mechanisms and more rapid real-time processes such as competition, mutual exclusivity and even inference or hypothesis testing.
Statistical scaling of pore-scale Lagrangian velocities in natural porous media.
Siena, M; Guadagnini, A; Riva, M; Bijeljic, B; Pereira Nunes, J P; Blunt, M J
2014-08-01
We investigate the scaling behavior of sample statistics of pore-scale Lagrangian velocities in two different rock samples, Bentheimer sandstone and Estaillades limestone. The samples are imaged using x-ray computer tomography with micron-scale resolution. The scaling analysis relies on the study of the way qth-order sample structure functions (statistical moments of order q of absolute increments) of Lagrangian velocities depend on separation distances, or lags, traveled along the mean flow direction. In the sandstone block, sample structure functions of all orders exhibit a power-law scaling within a clearly identifiable intermediate range of lags. Sample structure functions associated with the limestone block display two diverse power-law regimes, which we infer to be related to two overlapping spatially correlated structures. In both rocks and for all orders q, we observe linear relationships between logarithmic structure functions of successive orders at all lags (a phenomenon that is typically known as extended power scaling, or extended self-similarity). The scaling behavior of Lagrangian velocities is compared with the one exhibited by porosity and specific surface area, which constitute two key pore-scale geometric observables. The statistical scaling of the local velocity field reflects the behavior of these geometric observables, with the occurrence of power-law-scaling regimes within the same range of lags for sample structure functions of Lagrangian velocity, porosity, and specific surface area.
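In generic notation (consistent with the description above, though not necessarily the authors' symbols), the qth-order structure function, its power-law scaling in the lag l, and the extended self-similarity relation between orders p and q read:

```latex
S_q(l) \;=\; \bigl\langle\, \lvert v(x+l) - v(x) \rvert^{q} \,\bigr\rangle
       \;\propto\; l^{\,\zeta(q)}

% Extended self-similarity: log-log linearity between orders p and q
S_q(l) \;\propto\; S_p(l)^{\,\zeta(q)/\zeta(p)}
```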
Cross-cultural variation of memory colors of familiar objects.
Smet, Kevin A G; Lin, Yandan; Nagy, Balázs V; Németh, Zoltan; Duque-Chica, Gloria L; Quintero, Jesús M; Chen, Hung-Shing; Luo, Ronnier M; Safi, Mahdi; Hanselaer, Peter
2014-12-29
The effect of cross-regional or cross-cultural differences on color appearance ratings and memory colors of familiar objects was investigated in seven different countries/regions: Belgium, Hungary, Brazil, Colombia, Taiwan, China, and Iran. In each region the familiar objects were presented on a calibrated monitor in over 100 different colors to a test panel of observers who were asked to rate the similarity of the presented object color with respect to what they thought the object looks like in reality (its memory color). For each object and region the mean observer ratings were modeled by a bivariate Gaussian function. A statistical analysis showed significant (p < 0.001) differences between the region average observers and the global average observer obtained by pooling the data from all regions. However, the effect size of geographical region or culture was found to be small. In fact, the differences between the region average observers and the global average observer were found to be of the same magnitude as, or smaller than, the typical within-region inter-observer variability. Thus, although statistical differences in color appearance ratings and memory colors between regions were found, the regional impact is not likely to be of practical importance.
Zheng, Jusheng; Yang, Bin; Huang, Tao; Yu, Yinghua; Yang, Jing; Li, Duo
2011-01-01
Observational studies on tea consumption and prostate cancer (PCa) risk are still inconsistent. The authors conducted a meta-analysis to investigate the association of green tea and black tea consumption with PCa risk. Thirteen studies providing data on green tea or black tea consumption, identified by searching the PubMed and ISI Web of Science databases and by secondary referencing, qualified for inclusion. A random-effects model was used to calculate the summary odds ratios (ORs) and their corresponding 95% confidence intervals (CIs). For green tea, the summary OR of PCa indicated a borderline significant association in Asian populations for the highest green tea consumption vs. non/lowest (OR = 0.62; 95% CI: 0.38-1.01); the pooled estimate reached a statistically significant level for case-control studies (OR = 0.43; 95% CI: 0.25-0.73), but not for prospective cohort studies (OR = 1.00; 95% CI: 0.66-1.53). For black tea, no statistically significant association was observed for the highest vs. non/lowest black tea consumption (OR = 0.99; 95% CI: 0.82-1.20). In conclusion, this meta-analysis supports that green tea, but not black tea, may have a protective effect on PCa, especially in Asian populations. Further research regarding green tea consumption across regions apart from Asia is needed.
NASA Astrophysics Data System (ADS)
Tzou, Chia-Yu; altwegg, kathrin; Bieler, Andre; Calmonte, Ursina; Gasc, Sébastien; Le Roy, Léna; Rubin, Martin
2016-10-01
ROSINA, the Rosetta Orbiter Spectrometer for Ion and Neutral Analysis, is an in situ instrument suite on board Rosetta, one of the cornerstone missions of the European Space Agency (ESA), which orbited the Jupiter-family comet 67P/Churyumov-Gerasimenko (67P) and delivered a lander to its surface. ROSINA consists of two mass spectrometers and a pressure sensor. The Reflectron Time of Flight spectrometer (RTOF) and the Double Focusing Mass Spectrometer (DFMS) complement each other in mass and time resolution. The Comet Pressure Sensor (COPS) provides density measurements of the neutral molecules in the cometary coma of 67P. COPS has two gauges: a nude gauge that measures the total neutral density and a ram gauge that measures the dynamic pressure from the comet. Combining the two gauges, COPS is also capable of providing gas-dynamic information such as the gas velocity and gas temperature of the coma. Since Rosetta started orbiting 67P in August 2014, COPS has observed diurnal and seasonal variations of the neutral gas density in the coma. Surprisingly, in addition to these major density variation patterns, COPS occasionally observed small spikes in the density that are associated with dust. These dust signals can be interpreted as the result of cometary dust releasing volatiles while being heated near COPS. A statistical analysis of dust signals detected by COPS will be presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Croft, S.; Favalli, Andrea; Weaver, Brian Phillip
2015-10-06
In this paper we develop and investigate several criteria for assessing how well a proposed spectral form fits observed spectra. We consider the classical improved figure of merit (FOM) along with several modifications, as well as criteria motivated by Poisson regression from the statistical literature. We also develop a new FOM that is based on the statistical idea of the bootstrap. A spectral simulator has been developed to assess the performance of these different criteria under multiple data configurations.
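A minimal sketch of a bootstrap-based figure of merit of the kind described, assuming Poisson channel counts: synthetic spectra resampled from the fitted form give a reference distribution for the observed fit statistic (a parametric bootstrap). The chi-square-like statistic and the arrays are illustrative, not the criteria developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def fom(observed, model):
    """Chi-square-like figure of merit for Poisson channel counts."""
    return np.sum((observed - model)**2 / np.maximum(model, 1.0))

model = np.array([50., 80., 120., 90., 40.])       # fitted spectral form (counts)
observed = rng.poisson(model)                      # observed spectrum

# Bootstrap: distribution of the FOM if the fitted form were the truth
boot = np.array([fom(rng.poisson(model), model) for _ in range(5000)])
print(fom(observed, model), np.mean(boot >= fom(observed, model)))  # FOM and its p-value
```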
Predictive data modeling of human type II diabetes related statistics
NASA Astrophysics Data System (ADS)
Jaenisch, Kristina L.; Jaenisch, Holger M.; Handley, James W.; Albritton, Nathaniel G.
2009-04-01
During the course of routine Type II diabetes treatment of one of the authors, it was decided to derive predictive analytical data models of the daily sampled vital statistics, namely weight, blood pressure, and blood sugar, to determine whether the covariance among the observed variables could yield a descriptive equation-based model or, better still, a predictive analytical model that could forecast the expected future trend of the variables and possibly reduce the number of finger sticks required to monitor blood sugar levels. The personal history and analysis, with the resulting models, are presented.
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate data to near-normality; (5) a test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) a test of fit for continuous distributions based upon the generalized minimum chi-square; (7) the effect of correlated observations on confidence sets based upon chi-square statistics; and (8) the generation of random variates from specified distributions.
LP-search and its use in analysis of the accuracy of control systems with acoustical models
NASA Technical Reports Server (NTRS)
Sergeyev, V. I.; Sobol, I. M.; Statnikov, R. B.; Statnikov, I. N.
1973-01-01
LP-search is proposed as an analog of the Monte Carlo method for exploring nonlinear statistical systems. It is concluded that, to attain the required accuracy in solving the control problem for a statistical system, LP-search requires considerably fewer tests than the Monte Carlo method, and that LP-search allows multiple repetitions of tests under identical conditions together with observability of the system's output variables.
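LP-search builds on LPτ (Sobol) low-discrepancy sequences; a minimal sketch, assuming a modern quasi-random generator stands in for the original LPτ tables, contrasting quasi-random with plain pseudo-random exploration of a parameter space. The objective function is illustrative.

```python
import numpy as np
from scipy.stats import qmc

def objective(x):
    """Illustrative response surface over a 2-D parameter space."""
    return np.sin(3 * x[:, 0]) + (x[:, 1] - 0.5)**2

n = 256
sobol = qmc.Sobol(d=2, scramble=True, seed=3).random(n)   # LP(tau)-style points
mc = np.random.default_rng(3).random((n, 2))              # plain Monte Carlo points

# Quasi-random points cover the space more evenly, so extrema and averages
# are typically located with fewer trials.
print(objective(sobol).min(), objective(mc).min())
```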
Linear theory for filtering nonlinear multiscale systems with model error
Berry, Tyrus; Harlim, John
2014-01-01
In this paper, we study filtering of multiscale dynamical systems with model error arising from limitations in resolving the smaller scale processes. In particular, the analysis assumes the availability of continuous-time noisy observations of all components of the slow variables. Mathematically, this paper presents new results on higher-order asymptotic expansions of the first two moments of a conditional measure. In particular, we are interested in the application of filtering multiscale problems in which the conditional distribution is defined over the slow variables, given noisy observation of the slow variables alone. From the mathematical analysis, we learn that for a continuous-time linear model with Gaussian noise, there exists a unique choice of parameters in a linear reduced model for the slow variables which gives the optimal filtering when only the slow variables are observed. Moreover, these parameters simultaneously give the optimal equilibrium statistical estimates of the underlying system, and as a consequence they can be estimated offline from the equilibrium statistics of the true signal. By examining a nonlinear test model, we show that the linear theory extends in this non-Gaussian, nonlinear configuration as long as we know the optimal stochastic parametrization and the correct observation model. However, when the stochastic parametrization model is inappropriate, parameters chosen for good filter performance may give poor equilibrium statistical estimates and vice versa; this finding is based on analytical and numerical results on our nonlinear test model and the two-layer Lorenz-96 model. Finally, even when the correct stochastic ansatz is given, it is imperative to estimate the parameters simultaneously and to account for the nonlinear feedback of the stochastic parameters into the reduced filter estimates. In numerical experiments on the two-layer Lorenz-96 model, we find that the parameters estimated online, as part of a filtering procedure, simultaneously produce accurate filtering and equilibrium statistical prediction. In contrast, an offline estimation technique based on a linear regression, which fits the parameters to a training dataset without using the filter, yields filter estimates which are worse than the observations or even divergent when the slow variables are not fully observed. This finding does not imply that all offline methods are inherently inferior to the online method for nonlinear estimation problems; it only suggests that an ideal estimation technique should estimate all parameters simultaneously, whether online or offline. PMID:25002829
Statistical Analysis of Acoustic Signal Propagating Through the South China Sea Basin
2016-03-01
Internal tidal constituents are observed in both spectra, and the diurnal (D) and semidiurnal (SD) internal waves' energy is strong. Different bandwidths were utilized during the frequency-smoothing process to ensure the reliability of the spectra at the meso-, tidal, and internal-wave scales. The mooring temperature sensors capture the internal waves' energy, and six high-amplitude peaks are observed in the spectra in the internal tidal band.
NASA Astrophysics Data System (ADS)
Mitchell, M. J.; Pichugina, Y. L.; Banta, R. M.
2015-12-01
Models are important tools for assessing the potential of wind energy sites, but the accuracy of their projections has not been properly validated. In this study, High Resolution Doppler Lidar (HRDL) data obtained with high temporal and spatial resolution at the heights of modern turbine rotors were compared to output from the WRF-Chem model in order to help improve the model's performance in producing accurate wind forecasts for the industry. HRDL data were collected from January 23 to March 1, 2012 during the Uintah Basin Winter Ozone Study (UBWOS) field campaign. The model validation method was based on qualitative comparison of wind field images, time-series analysis, and statistical analysis of the observed and modeled wind speed and direction, both for case studies and for the whole experiment. To compare the WRF-Chem model output to the HRDL observations, the model heights and forecast times were interpolated to match the observed times and heights. Then, time-height cross-sections of the HRDL and WRF-Chem wind speeds and directions were plotted to select case studies. Cross-sections of the differences between the observed and forecasted wind speeds and directions were also plotted to visually analyze the model performance in different wind flow conditions. The statistical analysis includes the calculation of vertical profiles and time series of bias, correlation coefficient, root mean squared error, and coefficient of determination between the two datasets. The results from this analysis reveal where and when the model typically struggles in forecasting winds at the heights of modern turbine rotors, so that in the future the model can be improved for the industry.
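A minimal sketch of the point-verification statistics listed above (bias, RMSE, correlation coefficient, and coefficient of determination), assuming the model values have already been interpolated to the observation times and heights; the arrays are illustrative.

```python
import numpy as np

def verify(obs, mod):
    """Standard point-verification statistics for matched model/observation pairs."""
    obs, mod = np.asarray(obs, float), np.asarray(mod, float)
    bias = np.mean(mod - obs)
    rmse = np.sqrt(np.mean((mod - obs)**2))
    r = np.corrcoef(obs, mod)[0, 1]
    r2 = 1.0 - np.sum((obs - mod)**2) / np.sum((obs - np.mean(obs))**2)
    return bias, rmse, r, r2

# Matched wind speeds at one rotor-height level (illustrative, m/s)
obs = np.array([5.2, 6.1, 7.8, 6.5, 4.9, 8.3])
mod = np.array([5.6, 5.8, 8.4, 6.9, 5.5, 7.9])
print(verify(obs, mod))
```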
Spatial analysis of relative humidity during ungauged periods in a mountainous region
NASA Astrophysics Data System (ADS)
Um, Myoung-Jin; Kim, Yeonjoo
2017-08-01
Although atmospheric humidity influences environmental and agricultural conditions, thereby influencing plant growth, human health, and air pollution, efforts to develop spatial maps of atmospheric humidity using statistical approaches have thus far been limited. This study therefore aims to develop statistical approaches for inferring the spatial distribution of relative humidity (RH) for a mountainous island, for which data are not uniformly available across the region. A multiple regression analysis based on various mathematical models was used to identify the optimal model for estimating monthly RH by incorporating not only temperature but also location and elevation. Based on the regression analysis, we extended the monthly RH data from weather stations to cover the ungauged periods when no RH observations were available. Then, two different types of station-based data, the observational data and the data extended via the regression model, were used to form grid-based data with a resolution of 100 m. The grid-based data that used the extended station-based data captured the increasing RH trend along an elevation gradient. Furthermore, annual RH values averaged over the regions were examined. Decreasing temporal trends were found in most cases, with magnitudes varying based on the season and region.
Characterizing Giant Exoplanets through Multiwavelength Transit Observations: KELT-9b
NASA Astrophysics Data System (ADS)
Gardner, Cristilyn N.; Cole, Jackson L.; Garver, Bethany R.; Jarka, Kyla L.; Kar, Aman; McGough, Aylin M.; PeQueen, David J.; Rivera, Daniel I.; Kasper, David; Jang-Condell, Hannah; Kobulnicky, Henry A.; Dale, Daniel A.
2018-01-01
Multiwavelength observations of host stellar light scattered through an exoplanet's atmosphere during a transit characterizes exoplanetary parameters. Using the Wyoming Infrared Observatory 2.3-meter telescope, we observed primary transits of KELT-9b in the ugriz Sloan filters. We present an analysis of the phase-folded transit observations of KELT-9b using a Bayesian statistical approach. By plotting the transit depth as a function of wavelength, our preliminary results are indicative of scattering in the atmosphere surrounding KELT-9b. This work is supported by the National Science Foundation under REU grant AST 1560461 and PAARE grant AST 1559559.
NASA Technical Reports Server (NTRS)
Bauman, William H., III
2010-01-01
The AMU conducted an objective analysis of the MesoNAM forecasts compared to observed values from sensors on specified KSC/CCAFS wind towers by calculating the following statistics to verify the performance of the model: 1) bias (mean difference), 2) standard deviation of the bias, 3) root mean square error (RMSE), and 4) a hypothesis test for bias = 0. The 45 WS LWOs use the MesoNAM to support launch weather operations. However, the actual performance of the model at KSC and CCAFS had not been measured objectively. The analysis compared the MesoNAM forecast winds, temperature, and dew point to the observed values from the sensors on the wind towers. The data were stratified by tower sensor, month, and onshore/offshore wind direction based on the orientation of the coastline to each tower's location. The model's performance statistics were then calculated for each wind tower based on sensor height and model initialization time. The period of record for the data used in this task was based on the operational start of the current MesoNAM in mid-August 2006, and so the task began with the first full month of data, September 2006, through May 2010. The analysis of model performance indicated: a) the accuracy decreased as the forecast valid time from the model initialization increased, b) there was a diurnal signal in temperature (T) with a cool bias during the late night and a warm bias during the afternoon, c) there was a diurnal signal in dew point (Td) with a low bias during the afternoon and a high bias during the late night, and d) the model parameters at each vertical level most closely matched the observed parameters at heights closest to those vertical levels. The AMU developed a GUI that consists of a multi-level drop-down menu written in JavaScript embedded within the HTML code. This tool allows the LWO to easily and efficiently navigate among the charts and spreadsheet files containing the model performance statistics. The objective statistics give the LWOs knowledge of the model's strengths and weaknesses, and the GUI allows quick access to the data, which will result in improved forecasts for operations.
Influence of nonlinear effects on statistical properties of the radiation from SASE FEL
NASA Astrophysics Data System (ADS)
Saldin, E. L.; Schneidmiller, E. A.; Yurkov, M. V.
1998-02-01
The paper presents an analysis of the statistical properties of the radiation from a self-amplified spontaneous emission (SASE) free-electron laser operating in the nonlinear mode. The present approach allows one to calculate the following statistical properties of the SASE FEL radiation: time and spectral field correlation functions, the distribution of the fluctuations of the instantaneous radiation power, the distribution of the energy in the electron bunch, the distribution of the radiation energy after a monochromator installed at the FEL amplifier exit, and the radiation spectrum. It has been observed that the statistics of the instantaneous radiation power from a SASE FEL operating in the nonlinear regime change significantly with respect to the linear regime. All numerical results presented in the paper have been calculated for the 70 nm SASE FEL at the TESLA Test Facility under construction at DESY.
Low statistical power in biomedical science: a review of three human research domains.
Dumas-Mallet, Estelle; Button, Katherine S; Boraud, Thomas; Gonon, Francois; Munafò, Marcus R
2017-02-01
Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0-10% or 11-20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation. PMID:28386409
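A minimal sketch of the power calculation implicit in such reviews, using the normal approximation to a two-sided two-sample test: taking the meta-analytic effect size (Cohen's d) as the assumed true effect, the power of a study with n subjects per group follows directly. Numbers are illustrative.

```python
from scipy.stats import norm

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for effect size d."""
    z_crit = norm.ppf(1 - alpha / 2)
    ncp = abs(d) * (n_per_group / 2) ** 0.5    # noncentrality of the test statistic
    return norm.cdf(ncp - z_crit)

# A 'typical' meta-analytic effect (d = 0.3) with 30 subjects per group:
print(round(power_two_sample(0.3, 30), 2))     # well below the conventional 0.80
```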
On Muthen's Maximum Likelihood for Two-Level Covariance Structure Models
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Hayashi, Kentaro
2005-01-01
Data in social and behavioral sciences are often hierarchically organized. Special statistical procedures that take into account the dependence of such observations have been developed. Among procedures for 2-level covariance structure analysis, Muthen's maximum likelihood (MUML) has the advantage of easier computation and faster convergence. When…
Component Models for Fuzzy Data
ERIC Educational Resources Information Center
Coppi, Renato; Giordani, Paolo; D'Urso, Pierpaolo
2006-01-01
The fuzzy perspective in statistical analysis is first illustrated with reference to the "Informational Paradigm" allowing us to deal with different types of uncertainties related to the various informational ingredients (data, model, assumptions). The fuzzy empirical data are then introduced, referring to "J" LR fuzzy variables as observed on "I"…
The Effects of Auditory Tempo Changes on Rates of Stereotypic Behavior in Handicapped Children.
ERIC Educational Resources Information Center
Christopher, R.; Lewis, B.
1984-01-01
Rates of stereotypic behaviors in six severely/profoundly retarded children (eight to 15 years old) were observed during varying presentations of auditory beats produced by a metronome. Visual and statistical analysis of research results suggested a significant reaction to stimulus presentation. However, additional data following…
Multi-Parameter Linear Least-Squares Fitting to Poisson Data One Count at a Time
NASA Technical Reports Server (NTRS)
Wheaton, W.; Dunklee, A.; Jacobson, A.; Ling, J.; Mahoney, W.; Radocinski, R.
1993-01-01
A standard problem in gamma-ray astronomy data analysis is the decomposition of a set of observed counts, described by Poisson statistics, according to a given multi-component linear model, with underlying physical count rates or fluxes which are to be estimated from the data.
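A minimal sketch of this decomposition, assuming a known response matrix A mapping component fluxes to expected channel counts; the flux estimates maximize the Poisson log-likelihood. The generic optimizer stands in for the paper's one-count-at-a-time scheme, and all arrays are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

A = np.abs(rng.standard_normal((20, 3)))     # response: channels x components
true_flux = np.array([2.0, 5.0, 1.0])
counts = rng.poisson(A @ true_flux)          # observed Poisson counts

def neg_log_like(x):
    """Negative Poisson log-likelihood (up to a constant) for fluxes x."""
    mu = A @ x                               # expected counts per channel
    return np.sum(mu - counts * np.log(mu + 1e-12))

res = minimize(neg_log_like, x0=np.ones(3), bounds=[(1e-9, None)] * 3)
print(res.x.round(2))                        # recovered component fluxes
```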
How does new evidence change our estimates of probabilities? Carnap's formula revisited
NASA Technical Reports Server (NTRS)
Kreinovich, Vladik; Quintana, Chris
1992-01-01
The formula originally proposed by R. Carnap in his analysis of induction is reviewed and its natural generalization is presented. A situation is considered where the probability of a certain event must be determined without using standard statistical methods, due to a lack of observations.
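For context, Carnap's λ-continuum of inductive methods gives the usual form of the formula in question (stated from the general literature, not from the paper itself): after observing m occurrences of an attribute in n trials, with κ equally wide attribute classes, the degree of confirmation is

```latex
c(h \mid e) \;=\; \frac{m + \lambda/\kappa}{\,n + \lambda\,}
```

Setting λ = κ = 2 recovers Laplace's rule of succession, (m + 1)/(n + 2).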
Ocean Surface Wave Optical Roughness - Analysis of Innovative Measurements
2011-09-30
Statistics such as whitecap coverage (e.g., Phillips et al., 2001; Gemmrich et al., 2008) and microscale breaker crest length spectral density (e.g., Jessup and Phadnis, 2005) have been reported. Statistics of breaking waves observed as whitecaps in the open sea, Journal of Physical Oceanography, 16, 290-297. Jessup, A.T. and Phadnis, K.R
USDA-ARS?s Scientific Manuscript database
Land data assimilation systems are typically based on highly uncertain assumptions regarding the statistical structure of observation and modeling errors. Left uncorrected, poor assumptions can degrade the quality of analysis products generated by land data assimilation systems. Recently, Crow and van de...
NASA Astrophysics Data System (ADS)
Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.
2013-04-01
Most of the measurement strategies suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring airborne particle concentrations in real time (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature, ranging from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and an appropriate and robust method is still being sought. In this context, this exploratory study investigates a statistical method for analysing time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films have been used. In this workplace study, the background issue was addressed through the near-field and far-field approaches, and several size-integrated and time-resolved devices were used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. One was measuring at the source of the released particles while the other was measuring far-field in parallel. The Bayesian probabilistic approach allows probabilistic modelling of the data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time-resolved data obtained at the source can be compared with those issuing from the time-resolved data obtained far-field, leading to a quantitative estimation of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of such results requires specific experience in statistics.
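A minimal sketch of one Bayesian treatment consistent with this description, assuming counts per sampling interval are Poisson with a conjugate Gamma prior on the rate; comparing the posterior rates at the source and in the far field yields a probabilistic estimate of the released contribution. The priors, counts, and names are illustrative, not the study's actual model.

```python
import numpy as np
from scipy.stats import gamma

# CPC counts per sampling interval during the task (illustrative particle counts)
near = np.array([520, 610, 580, 700, 660])    # near-field (source)
far = np.array([310, 290, 330, 300, 320])     # far-field (background)

def posterior(counts, a0=1.0, b0=1e-3):
    """Gamma posterior for a Poisson rate with a Gamma(a0, rate=b0) prior."""
    return gamma(a=a0 + counts.sum(), scale=1.0 / (b0 + len(counts)))

post_near, post_far = posterior(near), posterior(far)

# Monte Carlo posterior for the source contribution (near minus background)
release = post_near.rvs(10000, random_state=5) - post_far.rvs(10000, random_state=6)
print(np.percentile(release, [2.5, 50, 97.5]).round(1))   # 95% credible interval
```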
Statistics and classification of the microwave zebra patterns associated with solar flares
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, Baolin; Tan, Chengming; Zhang, Yin
2014-01-10
The microwave zebra pattern (ZP) is the most interesting, intriguing, and complex spectral structure frequently observed in solar flares. A comprehensive statistical study will certainly help us to clarify the formation mechanism, which is not yet fully understood. This work presents a comprehensive statistical analysis of a large sample of 202 ZP events collected from observations by the Chinese Solar Broadband Radio Spectrometer at Huairou and the Ondřejov Radiospectrograph in the Czech Republic at frequencies of 1.00-7.60 GHz from 2000 to 2013. After investigating the parameter properties of ZPs, such as the occurrence in flare phase, frequency range, polarization degree, duration, etc., we find that the variation of zebra stripe frequency separation with respect to frequency is the best indicator for a physical classification of ZPs. Microwave ZPs can be classified into three types: equidistant ZPs, variable-distant ZPs, and growing-distant ZPs, possibly corresponding to the Bernstein wave model, the whistler wave model, and the double plasma resonance model, respectively. This statistical classification may help to clarify the controversies between the various existing theoretical models and to understand the physical processes in the source regions.
Barnett, L A; Lewis, M; Mallen, C D; Peat, G
2017-12-04
Selection bias is a concern when designing cluster randomised controlled trials (c-RCTs). Despite addressing potential issues at the design stage, bias cannot always be eradicated from a trial design. The application of bias analysis presents an important step forward in evaluating whether trial findings are credible. The aim of this paper is to give an example of the technique to quantify potential selection bias in c-RCTs. This analysis uses data from the Primary care Osteoarthritis Screening Trial (POST). The primary aim of this trial was to test whether screening for anxiety and depression, and providing appropriate care for patients consulting their GP with osteoarthritis, would improve clinical outcomes. Quantitative bias analysis is a seldom-used technique that can quantify types of bias present in studies. Due to the lack of information on the selection probability, probabilistic bias analysis with a range of triangular distributions was used, applied at all three follow-up time points: 3, 6, and 12 months post consultation. A simple bias analysis was also applied to the study. Worse pain outcomes were observed among intervention participants than control participants (crude odds ratio at 3, 6, and 12 months: 1.30 (95% CI 1.01, 1.67), 1.39 (95% CI 1.07, 1.80), and 1.17 (95% CI 0.90, 1.53), respectively). Probabilistic bias analysis suggested that the observed effect became statistically non-significant if the selection probability ratio was between 1.2 and 1.4. Selection probability ratios of > 1.8 were needed to mask a statistically significant benefit of the intervention. The use of probabilistic bias analysis in this c-RCT suggested that the worse outcomes observed in the intervention arm could plausibly be attributed to selection bias. A very large degree of selection bias would have been needed to mask a beneficial effect of the intervention, making that interpretation less plausible.
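As a rough sketch of how such a probabilistic bias analysis operates, the snippet below draws the selection-probability ratio from a triangular distribution and divides it out of the observed odds ratio on the log scale. The simple multiplicative bias model is an assumption, and the numbers only echo the 3-month crude OR above for illustration; this is not the actual POST analysis.

```python
# Probabilistic bias analysis for selection bias: observed OR is assumed to be
# the true OR multiplied by a selection-probability ratio S (simplification).
import numpy as np

rng = np.random.default_rng(1)
or_obs, se_log_or = 1.30, 0.128          # e.g. crude OR 1.30 (95% CI ~1.01-1.67)
S = rng.triangular(left=1.0, mode=1.2, right=1.4, size=100_000)

# combine systematic (selection) and random error on the log-odds scale
log_or_adj = np.log(or_obs) - np.log(S) + rng.normal(0.0, se_log_or, S.size)
lo, med, hi = np.exp(np.percentile(log_or_adj, [2.5, 50, 97.5]))
print(f"bias-adjusted OR: {med:.2f} (95% simulation interval {lo:.2f}-{hi:.2f})")
```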
Automatic Generation of Algorithms for the Statistical Analysis of Planetary Nebulae Images
NASA Technical Reports Server (NTRS)
Fischer, Bernd
2004-01-01
Analyzing data sets collected in experiments or by observations is a core scientific activity. Typically, experimental and observational data are fraught with uncertainty, and the analysis is based on a statistical model of the conjectured underlying processes. The large data volumes collected by modern instruments make computer support indispensable for this. Consequently, scientists spend significant amounts of their time on the development and refinement of data analysis programs. AutoBayes [GF+02, FS03] is a fully automatic synthesis system for generating statistical data analysis programs. Externally, it looks like a compiler: it takes an abstract problem specification and translates it into executable code. Its input is a concise description of a data analysis problem in the form of a statistical model as shown in Figure 1; its output is optimized and fully documented C/C++ code which can be linked dynamically into the Matlab and Octave environments. Internally, however, it is quite different: AutoBayes derives a customized algorithm implementing the given model using a schema-based process, and then further refines and optimizes the algorithm into code. A schema is a parameterized code template with associated semantic constraints which define and restrict the template's applicability. The schema parameters are instantiated in a problem-specific way during synthesis as AutoBayes checks the constraints against the original model or, recursively, against emerging sub-problems. AutoBayes' schema library contains problem decomposition operators (which are justified by theorems in a formal logic in the domain of Bayesian networks) as well as machine learning algorithms (e.g., EM, k-Means) and numeric optimization methods (e.g., Nelder-Mead simplex, conjugate gradient). AutoBayes augments this schema-based approach by symbolic computation to derive closed-form solutions whenever possible. This is a major advantage over other statistical data analysis systems which use numerical approximations even in cases where closed-form solutions exist. AutoBayes is implemented in Prolog and comprises approximately 75,000 lines of code. In this paper, we take one typical scientific data analysis problem, analyzing planetary nebulae images taken by the Hubble Space Telescope, and show how AutoBayes can be used to automate the implementation of the necessary analysis programs. We initially follow the analysis described by Knuth and Hajian [KH02] and use AutoBayes to derive code for the published models. We show the details of the code derivation process, including the symbolic computations and automatic integration of library procedures, and compare the results of the automatically generated and manually implemented code. We then go beyond the original analysis and use AutoBayes to derive code for a simple image segmentation procedure based on a mixture model which can be used to automate a manual preprocessing step. Finally, we combine the original approach with the simple segmentation, which yields a more detailed analysis. This also demonstrates that AutoBayes makes it easy to combine different aspects of data analysis.
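To make concrete the kind of algorithm such a schema instantiates, here is a minimal hand-written sketch of EM for a two-component 1D Gaussian mixture, the class of mixture-model code mentioned above; it is not AutoBayes output, and all data are simulated.

```python
# EM for a two-component 1D Gaussian mixture on simulated data.
import numpy as np

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

w, mu, sd = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(200):
    # E-step: responsibilities of each component for each point
    dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted updates of mixture weights, means, and deviations
    n_k = r.sum(axis=0)
    w, mu = n_k / x.size, (r * x[:, None]).sum(axis=0) / n_k
    sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
print("weights", w, "means", mu, "sds", sd)
```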
LEP Events, TLE's, and Q-bursts observed from the Antarctic
NASA Astrophysics Data System (ADS)
Moore, R. C.; Kim, D.; Flint, Q. A.
2017-12-01
ELF/VLF measurements at Palmer Station, McMurdo Station, and South Pole Station, Antarctica are used to detect lightning-generated ELF/VLF radio atmospherics from around the globe and to remote sense ionospheric disturbances in the Southern hemisphere. The Antarctic ELF/VLF receivers complement a Northern hemisphere ELF/VLF monitoring array. In this paper, we present our latest observational results, including a full statistical analysis of conjugate observations of lightning-induced electron precipitation and radio atmospherics associated specifically with the transient luminous events known as gigantic jets and sprites.
NASA Astrophysics Data System (ADS)
Bencomo, Jose Antonio Fagundez
The main goal of this study was to relate physical changes in image quality, measured by the Modulation Transfer Function (MTF), to diagnostic accuracy. One hundred and fifty Kodak Min-R screen/film combination conventional craniocaudal mammograms obtained with the Pfizer Microfocus Mammographic system were selected from the files of the Department of Radiology at M.D. Anderson Hospital and Tumor Institute. The mammograms included 88 cases with a variety of benign diagnoses and 62 cases with a variety of malignant biopsy diagnoses. The average age of the patient population was 55 years. 70 cases presented calcifications, 30 of which had calcifications smaller than 0.5 mm. 46 cases presented irregular-bordered masses larger than 1 cm. 30 cases presented smooth-bordered masses, 20 of them larger than 1 cm. Four separate copies of the original images were made, each having a different change in the MTF, using a defocusing technique whereby copies of the original were obtained by light exposure through different thicknesses (spacings) of transparent film base. The mammograms were randomized and evaluated by three experienced mammographers for the degree of visibility of various anatomical breast structures and pathological lesions (masses and calcifications), subjective image quality, and mammographic interpretation. 3,000 separate evaluations were analyzed by several statistical techniques, including Receiver Operating Characteristic curve analysis, the McNemar test for differences between proportions, and the Landis et al. method of agreement (weighted kappa) for ordinal categorical data. Results from the statistical analysis show: (1) There were no statistically significant differences in the diagnostic accuracy of the observers when diagnosing from mammograms with the same MTF. (2) There were no statistically significant differences in diagnostic accuracy for each observer when diagnosing from mammograms with the different MTFs used in the study. (3) There were statistically significant differences in detail visibility between the copies and the originals; detail visibility was better in the originals. (4) Feature interpretations were not significantly different between the originals and the copies. (5) Perception of image quality did not affect image interpretation. Continuation and improvement of this research can be accomplished by using a case population more sensitive to MTF changes (i.e., asymptomatic women with minimal breast cancer), involving more observers (including less experienced radiologists and experienced technologists), and using a minimum of 200 benign and 200 malignant cases.
NASA Astrophysics Data System (ADS)
Počakal, Damir; Štalec, Janez
In the continental part of Croatia, operational hail suppression has been conducted for more than 30 years. The current protected area is 25,177 km² and has about 492 hail suppression stations, which are managed from eight weather radar centres. This paper presents a statistical analysis of parameters connected with hail occurrence at hail suppression stations in the western part of the protected area in the 1981-2000 period. This analysis compares data from two periods with different intensities of hail suppression activity and was made as part of a project for the assessment of hail suppression efficiency in Croatia. Because of the disruption of the hail suppression system during the independence war in Croatia (1991-1995), a lack of rockets, and other objective circumstances, it is considered that in the 1991-2000 period the hail suppression system could not act properly. A comparison of hail suppression data for the two periods was therefore made. The first period (1981-1990), characterised by full application of hail suppression technology, is compared with the second period (1991-2000). The protected area is divided into quadrants (9×9 km) such that every quadrant has at least one hail suppression station and the intercomparison is more precise. Discriminant analysis was performed on the yearly values of each quadrant. These values included the number of cases with solid precipitation, hail damage, heavy hail damage, the number of active hail suppression stations, the number of days with solid precipitation, solid precipitation damage, heavy solid precipitation damage, and the number and duration of air traffic control bans. The discriminant analysis shows that there is a significant difference between the two periods. Average values on the isolated discriminant function 1 are -0.36 for the first period (1981-1990) and +0.23 for the second period, in units of the standard deviation of all observations. The analysis of all eight variables shows statistically substantial differences in the number of hail suppression stations (which has a positive correlation) and in the number of cases with an air traffic control ban, which has, like all other variables, a negative correlation. Results of the statistical analysis for the two periods show a positive influence of the hail suppression system. The discriminant analysis made for three periods shows that these three periods cannot be compared because of the short time period, differences in hail suppression technology and working conditions, and possible differences in meteorological conditions. Therefore, neither the effectiveness nor the ineffectiveness of hail suppression operations, nor their efficiency, can be statistically proven. For an exact assessment of hail suppression effectiveness, it is necessary to develop a project which would take into consideration all the parameters used in previous such projects around the world, including a hailpad polygon.
Rock Statistics at the Mars Pathfinder Landing Site, Roughness and Roving on Mars
NASA Technical Reports Server (NTRS)
Haldemann, A. F. C.; Bridges, N. T.; Anderson, R. C.; Golombek, M. P.
1999-01-01
Several rock counts have been carried out at the Mars Pathfinder landing site, producing consistent statistics of rock coverage and size-frequency distributions. These rock statistics provide a primary element of "ground truth" for anchoring the remote sensing information used to pick the Pathfinder, and future, landing sites. The observed rock population statistics should also be consistent with the emplacement and alteration processes postulated to govern the landing site landscape. The rock population databases can, however, be used in ways that go beyond the calculation of cumulative number and cumulative area distributions versus rock diameter and height. Since the spatial parameters measured to characterize each rock are determined with stereo image pairs, the rock database serves as a subset of the full landing site digital terrain model (DTM). Insofar as a rock count can be carried out in a speedier, albeit coarser, manner than the full DTM analysis, rock counting offers several operational and scientific products in the near term. Quantitative rock mapping adds further information to the geomorphic study of the landing site, and can also be used for rover traverse planning. Statistical analysis of surface roughness using the rock-count proxy DTM is sufficiently accurate, when checked against the full DTM, for comparison with radar remote-sensing roughness measures and with rover traverse profiles.
Statistical Inference for Data Adaptive Target Parameters.
Hubbard, Alan E; Kherad-Pajouh, Sara; van der Laan, Mark J
2016-05-01
Consider a setting in which one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-size subsamples and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and the corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data-adaptive statistical target parameter as the average of these V sample-specific target parameters. We present an estimator (and a corresponding central limit theorem) for this type of data-adaptive target parameter. This general methodology for generating data-adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. The new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed in this paper provides a new impetus for greater involvement of statistical inference in problems that are increasingly being addressed by clever, yet ad hoc, pattern-finding methods. To suggest this potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules, are shown and give insight into how to structure such an approach. The results show that the data-adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.
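A minimal sketch of the sample-splitting construction, using simulated data and an invented target-selection rule: each parameter-generating sample picks the covariate most correlated with the outcome, and the held-out estimation sample estimates that correlation.

```python
# Sample-split data-adaptive target parameter on simulated data.
import numpy as np

rng = np.random.default_rng(3)
n, V = 500, 5
X = rng.normal(size=(n, 10))
y = 0.4 * X[:, 3] + rng.normal(size=n)

folds = np.array_split(rng.permutation(n), V)
estimates = []
for v in range(V):
    est, gen = folds[v], np.concatenate([folds[u] for u in range(V) if u != v])
    # parameter-generating sample: choose the most correlated covariate
    corrs = [np.corrcoef(X[gen, j], y[gen])[0, 1] for j in range(X.shape[1])]
    j_star = int(np.argmax(np.abs(corrs)))
    # estimation sample: estimate the chosen correlation
    estimates.append(np.corrcoef(X[est, j_star], y[est])[0, 1])

psi = np.mean(estimates)   # data-adaptive target parameter estimate
print("split-specific estimates:", np.round(estimates, 3), "average:", round(psi, 3))
```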
NASA Astrophysics Data System (ADS)
Deidda, Roberto; Mascaro, Giuseppe; Hellies, Matteo; Baldini, Luca; Roberto, Nicoletta
2013-04-01
COSMO-SkyMed (CSK) is an important programme of the Italian Space Agency aimed at supporting environmental monitoring and the management of exogenous, endogenous and anthropogenic risks through X-band Synthetic Aperture Radar (X-SAR) on board a constellation of 4 satellites. Most typical SAR applications focus on land or ocean observation. However, X-band SAR can detect precipitation, which produces a specific signature caused by the combination of attenuation of surface returns induced by precipitation and enhancement of backscattering by the hydrometeors in the SAR resolution volume. Within the CSK programme, we conducted an intercomparison between the statistical properties of precipitation fields derived from CSK SARs and those derived from the CNR Polar 55C (C-band) ground-based weather radar located in Rome (Italy). This contribution presents the main results of this research, which aimed at a robust characterisation of rainfall statistical properties across different scales by means of scale-invariance analysis and multifractal theory. The analysis was performed on a dataset of more than two years of precipitation observations collected by the CNR Polar 55C radar and on rainfall fields derived from available images collected by the CSK satellites during intense rainfall events. Scale-invariance laws and multifractal properties were detected in the most intense rainfall events derived from the CNR Polar 55C radar for spatial scales from 4 km to 64 km. The analysis of X-SAR retrieved rainfall fields, although based on few images, led to similar results and confirmed the existence of scale-invariance and multifractal properties for scales larger than 4 km. These outcomes encourage investigating SAR methodologies for the future development of meteo-hydrological forecasting models based on multifractal theory.
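A minimal sketch of the moment-scaling step of such a scale-invariance analysis, run on a synthetic random field rather than radar or SAR rainfall data; scaling exponents K(q) are estimated as log-log slopes of block-averaged moments.

```python
# Moment-scaling (multifractal) analysis of a synthetic 64x64 "rain" field.
import numpy as np

rng = np.random.default_rng(4)
field = rng.gamma(shape=0.5, scale=2.0, size=(64, 64))   # stand-in rain field

qs, scales = [0.5, 1.0, 2.0, 3.0], [1, 2, 4, 8, 16]
moments = np.empty((len(qs), len(scales)))
for j, s in enumerate(scales):
    # average over s x s blocks (coarse-graining to scale s)
    coarse = field.reshape(64 // s, s, 64 // s, s).mean(axis=(1, 3))
    norm = coarse / coarse.mean()
    for i, q in enumerate(qs):
        moments[i, j] = (norm ** q).mean()

for i, q in enumerate(qs):
    slope = np.polyfit(np.log(scales), np.log(moments[i]), 1)[0]
    print(f"q = {q}: scaling exponent K(q) ~ {-slope:.3f}")
```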
Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results
NASA Technical Reports Server (NTRS)
Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)
1994-01-01
In the last three years extensive performance data have been reported for parallel machines, based both on the NAS Parallel Benchmarks and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS Parallel Benchmarks, we have also included the peak performance of the machine and the LINPACK n and n_(1/2) values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP each have a unique signature. 3) The remaining NPB can be grouped into three groups: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize overall NPB performance. Our poster presentation will follow a standard poster format and will present the data of our statistical analysis in detail.
Spatial statistical analysis of tree deaths using airborne digital imagery
NASA Astrophysics Data System (ADS)
Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael
2013-04-01
High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).
NASA Astrophysics Data System (ADS)
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an Android statistics data analysis application that can be accessed through mobile devices, making access easier for users. The application covers various topics in basic statistics along with a parametric statistical data analysis module. The output of the application is a parametric statistical data analysis that can be used by students, lecturers, and other users who need the results of statistical calculations quickly and in an easily understood form. The Android application is developed using the Java programming language. The server-side programming language is PHP with the CodeIgniter framework, and the database used is MySQL. The system development methodology is the Waterfall methodology, with stages of analysis, design, coding, testing, implementation and system maintenance. This statistical data analysis application is expected to support statistics lecturing activities and make it easier for students to understand statistical analysis on mobile devices.
NASA Astrophysics Data System (ADS)
Titov, A. G.; Gordov, E. P.; Okladnikov, I.; Shulgina, T. M.
2011-12-01
Analysis of recent climatic and environmental changes in Siberia performed on the basis of the CLEARS (CLimate and Environment Analysis and Research System) information-computational system is presented. The system was developed using a specialized software framework for the rapid development of thematic information-computational systems based on Web-GIS technologies. It comprises structured environmental datasets, a computational kernel, a specialized web portal implementing the web mapping application logic, and a graphical user interface. Functional capabilities of the system include a number of procedures for mathematical and statistical analysis, data processing and visualization. At present a number of georeferenced datasets are available for processing, including two editions of the NCEP/NCAR Reanalysis, the JMA/CRIEPI JRA-25 Reanalysis, the ECMWF ERA-40 and ERA Interim Reanalyses, meteorological observation data for the territory of the former USSR, and others. First, using the functionality of the computational kernel and approved statistical methods, it was shown that the most reliable spatio-temporal characteristics of surface temperature and precipitation in Siberia in the second half of the 20th and the beginning of the 21st centuries are provided by the ERA-40/ERA Interim Reanalyses and the APHRODITE JMA Reanalysis, respectively; namely, those reanalyses are statistically consistent with reliable in situ meteorological observations. Analysis of surface temperature and precipitation dynamics for the territory of Siberia performed on the basis of the developed information-computational system reveals fine spatial and temporal details in the heterogeneous patterns obtained for the region earlier. The dynamics of bioclimatic indices determining the impact of climate change on the structure and functioning of regional vegetation cover was investigated as well. The analysis shows significant positive trends in growing season length accompanied by a statistically significant increase in the sum of growing degree days and in total annual precipitation over the south of Western Siberia. In particular, we conclude that analysis of trends in growing season length, sum of growing degree-days and total precipitation during the growing season reveals a tendency towards an increase in vegetation ecosystem productivity across the south of Western Siberia (55°-60°N, 59°-84°E) in the past several decades. The developed system functionality, providing instruments for comparison of modelling and observational data and for reliable climatological analysis, allowed us to obtain new results characterizing regional manifestations of global change. Each analysis performed using the system also generates an archive of spatio-temporal data fields ready for subsequent use by other specialists. In particular, the archive of bioclimatic indices obtained will allow further detailed studies of the interrelations between local climate and vegetation cover changes, including changes of carbon uptake related to variations in the types and amount of vegetation and spatial shifts of vegetation zones. This work is partially supported by RFBR grants #10-07-00547 and #11-05-01190-a and SB RAS Basic Program Projects 4.31.1.5 and 4.31.2.7.
Mróz, Tomasz; Szufa, Katarzyna; Frontasyeva, Marina V; Tselmovich, Vladimir; Ostrovnaya, Tatiana; Kornaś, Andrzej; Olech, Maria A; Mietelski, Jerzy W; Brudecki, Kamil
2018-01-01
Seven lichen samples (Usnea antarctica and U. aurantiacoatra) and nine moss samples (Sanionia uncinata) collected on King George Island were analyzed using instrumental neutron activation analysis, and the concentrations of major and trace elements were calculated. For some elements, the concentrations observed in the moss samples were higher than corresponding values reported from other sites in Antarctica, but in the lichens they were in the same range of concentrations. Scanning electron microscopy (SEM) and statistical analysis showed a large influence of volcanic-origin particles. Interplanetary cosmic particles (ICP) were also observed in the investigated samples, as mosses and lichens are good collectors of ICP and micrometeorites.
Usman, Mohammad N.; Umar, Muhammad D.
2018-01-01
Background: Recent studies have revealed that pharmacists have an interest in conducting research; however, lack of confidence is a major barrier. Objective: This study evaluated pharmacists' self-perceived competence and confidence to plan and conduct health-related research. Method: This cross-sectional study was conducted during the 89th Annual National Conference of the Pharmaceutical Society of Nigeria in November 2016. An adapted questionnaire was validated and administered to 200 pharmacist delegates during the conference. Result: Overall, 127 questionnaires were included in the analysis. At least 80% of the pharmacists had previous health-related research experience. Pharmacists' competence and confidence scores were lowest for research skills such as using software for statistical analysis, choosing and applying appropriate inferential statistical tests and methods, and outlining a detailed statistical plan for data analysis. The highest competence and confidence scores were observed for conception of a research idea, literature search, and critical appraisal of literature. Pharmacists with previous research experience had higher competence and confidence scores than those with no previous research experience (p<0.05). The only predictor of moderate-to-extreme self-competence and confidence was having at least one journal article publication during the last 5 years. Conclusion: Nigerian pharmacists indicated interest in participating in health-related research. However, self-competence and confidence to plan and conduct research were low, particularly for skills related to statistical analysis. Training programs and the building of a Pharmacy Practice Research Network are recommended to enhance pharmacists' research capacity. PMID:29619141
Booth, Brian G; Keijsers, Noël L W; Sijbers, Jan; Huysmans, Toon
2018-05-03
Pedobarography produces large sets of plantar pressure samples that are routinely subsampled (e.g. using regions of interest) or aggregated (e.g. center of pressure trajectories, peak pressure images) in order to simplify statistical analysis and provide intuitive clinical measures. We hypothesize that these data reductions discard gait information that can be used to differentiate between groups or conditions. To test this hypothesis of information loss, we created an implementation of statistical parametric mapping (SPM) for dynamic plantar pressure datasets (i.e. plantar pressure videos). Our SPM software framework brings all plantar pressure videos into anatomical and temporal correspondence, then performs statistical tests at each sampling location in space and time. As a novel element, we introduce non-linear temporal registration into the framework in order to normalize for timing differences within the stance phase. We refer to our software framework as STAPP: spatiotemporal analysis of plantar pressure measurements. Using STAPP, we tested our hypothesis on plantar pressure videos from 33 healthy subjects walking at different speeds. As walking speed increased, STAPP was able to identify significant decreases in plantar pressure at mid-stance from the heel through the lateral forefoot. The extent of these plantar pressure decreases has not previously been observed using existing plantar pressure analysis techniques. We therefore conclude that the subsampling of plantar pressure videos - a task which led to the discarding of gait information in our study - can be avoided using STAPP.
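A minimal sketch of the pixel-wise testing idea behind SPM (not the STAPP code): registration is assumed already done, the array shapes and values are invented, and a crude Bonferroni correction stands in for the random-field-theory thresholding normally used in SPM.

```python
# Pixel-wise paired tests over registered pressure videos (invented data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# 33 subjects x (16 x 8 spatial grid over the foot) x 20 stance-time frames
slow = rng.normal(100.0, 15.0, size=(33, 16, 8, 20))
fast = slow - 5.0 * (rng.random((16, 8, 20)) > 0.7) + rng.normal(0, 5, slow.shape)

t_map, p_map = stats.ttest_rel(fast, slow, axis=0)     # paired test per sample
alpha = 0.05 / t_map.size                              # crude Bonferroni control
print("significant samples:", int((p_map < alpha).sum()), "of", t_map.size)
```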
Impact of ontology evolution on functional analyses.
Groß, Anika; Hartung, Michael; Prüfer, Kay; Kelso, Janet; Rahm, Erhard
2012-10-15
Ontologies are used in the annotation and analysis of biological data. As knowledge accumulates, ontologies and annotation undergo constant modifications to reflect this new knowledge. These modifications may influence the results of statistical applications such as functional enrichment analyses that describe experimental data in terms of ontological groupings. Here, we investigate to what degree modifications of the Gene Ontology (GO) impact these statistical analyses for both experimental and simulated data. The analysis is based on new measures for the stability of result sets and considers different ontology and annotation changes. Our results show that past changes in the GO are non-uniformly distributed over different branches of the ontology. Considering the semantic relatedness of significant categories in analysis results allows a more realistic stability assessment for functional enrichment studies. We observe that the results of term-enrichment analyses tend to be surprisingly stable despite changes in ontology and annotation.
Multivariate space - time analysis of PRE-STORM precipitation
NASA Technical Reports Server (NTRS)
Polyak, Ilya; North, Gerald R.; Valdes, Juan B.
1994-01-01
This paper presents the methodologies and results of the multivariate modeling and two-dimensional spectral and correlation analysis of PRE-STORM rainfall gauge data. Estimated parameters of the models for the specific spatial averages clearly indicate the eastward and southeastward wave propagation of rainfall fluctuations. A relationship between the coefficients of the diffusion equation and the parameters of the stochastic model of rainfall fluctuations is derived that leads directly to the exclusive use of rainfall data to estimate advection speed (about 12 m/s) as well as other coefficients of the diffusion equation of the corresponding fields. The statistical methodology developed here can be used for confirmation of physical models by comparison of the corresponding second-moment statistics of the observed and simulated data, for generating multiple samples of any size, for solving the inverse problem of the hydrodynamic equations, and for application in some other areas of meteorological and climatological data analysis and modeling.
RP and RQA Analysis for Floating Potential Fluctuations in a DC Magnetron Sputtering Plasma
NASA Astrophysics Data System (ADS)
Sabavath, Gopikishan; Banerjee, I.; Mahapatra, S. K.
2016-04-01
The nonlinear dynamics of a direct current magnetron sputtering plasma is visualized using the recurrence plot (RP) technique. RPs underlie recurrence quantification analysis (RQA), which is an efficient method for observing critical regime transitions in dynamics. Further, RQA provides insight into the system's behavior. We observed the floating potential fluctuations of the plasma as a function of discharge voltage using a Langmuir probe. The system exhibits quasi-periodic-chaotic-quasi-periodic-chaotic transitions. These transitions are quantified from the determinism, Lmax, and entropy measures of RQA. Statistical measures such as kurtosis and skewness were also studied for these transitions and are in good agreement with the RQA results.
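A minimal sketch of building a recurrence matrix and the DET (determinism) measure on a synthetic series; for brevity it omits phase-space embedding and does not exclude the main diagonal, unlike standard RQA practice.

```python
# Recurrence plot matrix and DET for a noisy periodic signal.
import numpy as np

rng = np.random.default_rng(6)
t = np.linspace(0, 20 * np.pi, 400)
x = np.sin(t) + 0.1 * rng.normal(size=t.size)        # noisy periodic "signal"

eps = 0.2 * x.std()
R = (np.abs(x[:, None] - x[None, :]) < eps).astype(int)   # recurrence matrix

def diag_line_points(R, l_min=2):
    """Count recurrence points lying on diagonal lines of length >= l_min."""
    n, on_lines = R.shape[0], 0
    for k in range(-(n - 1), n):
        d = np.diagonal(R, offset=k)
        padded = np.concatenate(([0], d, [0]))
        starts = np.where(np.diff(padded) == 1)[0]
        ends = np.where(np.diff(padded) == -1)[0]
        on_lines += sum(e - s for s, e in zip(starts, ends) if e - s >= l_min)
    return on_lines

det = diag_line_points(R) / R.sum()
print(f"DET ~ {det:.2f} (near 1 for periodic, lower for stochastic dynamics)")
```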
A critique of supernova data analysis in cosmology
NASA Astrophysics Data System (ADS)
Gopal Vishwakarma, Ram; Narlikar, Jayant V.
2010-12-01
Observational astronomy has shown significant growth over the last decade and has made important contributions to cosmology. A major paradigm shift in cosmology was brought about by observations of Type Ia supernovae. The notion that the universe is accelerating has led to several theoretical challenges. Unfortunately, although high-quality supernova datasets are being produced, their statistical analysis leaves much to be desired. Instead of using the data to directly test the model, several studies seem to concentrate on assuming the model to be correct and limiting themselves to estimating model parameters and internal errors. As shown here, the important purpose of testing a cosmological theory is thereby vitiated.
Use of acetaminophen and risk of endometrial cancer: evidence from observational studies.
Ding, Yuan-Yuan; Yao, Peng; Verma, Surya; Han, Zhen-Kai; Hong, Tao; Zhu, Yong-Qiang; Li, Hong-Xi
2017-05-23
Previous meta-analyses suggested that aspirin was associated with a reduced risk of endometrial cancer. However, no study has comprehensively summarized the evidence on acetaminophen use and the risk of endometrial cancer from observational studies. We systematically searched electronic databases (PubMed, EMBASE, Web of Science, and Cochrane Library) for relevant cohort or case-control studies up to February 28, 2017. Two independent authors performed the eligibility evaluation and data extraction. All differences were resolved by discussion. A random-effects model was applied to estimate summary relative risks (RRs) with 95% CIs. All statistical tests were two-sided. Seven observational studies, including four prospective cohort studies and three case-control studies with 3874 endometrial cancer cases, were included in the final analysis. Compared with never using acetaminophen, ever using the drug was not associated with the risk of endometrial cancer (summarized RR = 1.02; 95% CI: 0.93-1.13, I2 = 0%). A similar null association was observed when comparing the highest category of frequency/duration with never use (summarized RR = 0.88; 95% CI: 0.70-1.11, I2 = 15.2%). Additionally, the finding was robust in subgroup analyses stratified by study characteristics and by adjustment for potential confounders and risk factors. There was no evidence of publication bias by visual inspection of a funnel plot or formal statistical tests. In summary, the present meta-analysis reveals no association between acetaminophen use and the risk of endometrial cancer. More large-scale prospective cohort studies are warranted to confirm our findings and to carry out a dose-response analysis of the aforementioned association.
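A minimal sketch of the DerSimonian-Laird random-effects summary underlying such pooled RRs; the study log-RRs and variances below are invented, not the seven included studies.

```python
# DerSimonian-Laird random-effects meta-analysis on invented inputs.
import numpy as np

log_rr = np.array([0.05, -0.10, 0.12, 0.00, -0.03, 0.08, -0.05])
var = np.array([0.02, 0.05, 0.04, 0.01, 0.03, 0.06, 0.02])

w = 1.0 / var                                   # fixed-effect weights
ybar = np.sum(w * log_rr) / np.sum(w)
Q = np.sum(w * (log_rr - ybar) ** 2)            # Cochran's heterogeneity Q
k = log_rr.size
tau2 = max(0.0, (Q - (k - 1)) / (w.sum() - (w ** 2).sum() / w.sum()))
i2 = max(0.0, (Q - (k - 1)) / Q) * 100          # I^2 heterogeneity statistic

w_re = 1.0 / (var + tau2)                       # random-effects weights
mu = np.sum(w_re * log_rr) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"summary RR = {np.exp(mu):.2f} "
      f"(95% CI {np.exp(mu - 1.96 * se):.2f}-{np.exp(mu + 1.96 * se):.2f}), "
      f"I2 = {i2:.0f}%")
```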
NASA Astrophysics Data System (ADS)
Ryu, K.; Jangsoo, C.; Kim, S. G.; Jeong, K. S.; Parrot, M.; Pulinets, S. A.; Oyama, K. I.
2014-12-01
Examples of intensified equatorial ionization anomaly (EIA) features temporally and spatially related to large earthquakes, observed by satellites and GPS-TEC, are introduced. The precursory, concurrent, and ex-post enhancements of the EIA, represented by the equatorial electron density, which are thought to be related to the M8.7 Northern Sumatra earthquake of March 2005, the M8.0 Pisco earthquake of August 2007, and the M7.9 Wenchuan earthquake of 12 May 2008, are shown together with the space weather conditions. Based on these case studies, a statistical analysis of the ionospheric electron density data measured by the Detection of Electro-Magnetic Emissions Transmitted from Earthquake Regions satellite (DEMETER) over the period 2005-2010 was performed in order to investigate the correlation between seismic activity and equatorial plasma density variations. To simplify the analysis, three equatorial regions with frequent earthquakes were selected, and one-dimensional time series analyses between the daily seismic activity indices and the EIA intensity indices were performed for each region, excluding possible effects from geomagnetic and solar activity. The statistically significant values of the lagged cross-correlation function, particularly in the region with minimal effects of longitudinal asymmetry, indicate that some of the very large earthquakes with M > 7.0 in the low-latitude region can be accompanied by observable seismo-ionospheric coupling phenomena in the form of EIA enhancements, even though seismic activity is not the most significant driver of equatorial ionospheric evolution. The physical mechanisms of seismo-ionospheric coupling that could explain the observations, and the possibility of earthquake prediction using EIA intensity variations, are discussed.
Wu, Johnny C; Gardner, David P; Ozer, Stuart; Gutell, Robin R; Ren, Pengyu
2009-08-28
The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms depends on several factors, including the energy functions. However, an RNA's higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with the transition of a single-stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNA's 3D structure cannot be determined experimentally, statistically derived potential energies have been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequencies obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energies were computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their locations on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
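The abstract does not spell out how frequencies are converted to energies; the standard knowledge-based (inverse-Boltzmann) form, consistent with the Boltzmann-like behaviour described, is

\[ \Delta G_{\mathrm{stat}} = -RT \,\ln\!\left(\frac{f_{\mathrm{obs}}}{f_{\mathrm{ref}}}\right), \]

where \(f_{\mathrm{obs}}\) is the observed frequency of a motif in the alignment database and \(f_{\mathrm{ref}}\) a reference frequency.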
A statistical spatial power spectrum of the Earth's lithospheric magnetic field
NASA Astrophysics Data System (ADS)
Thébault, E.; Vervelidou, F.
2015-05-01
The magnetic field of the Earth's lithosphere arises from rock magnetization contrasts that were shaped over geological times. The field can be described mathematically in spherical harmonics or with distributions of magnetization. We exploit this dual representation and assume that the lithospheric field is induced by spatially varying susceptibility values within a shell of constant thickness. By introducing a statistical assumption about the power spectrum of the susceptibility, we then derive a statistical expression for the spatial power spectrum of the crustal magnetic field for spatial scales ranging from 60 to 2500 km. This expression depends on the mean induced magnetization, the thickness of the shell, and a power-law exponent for the power spectrum of the susceptibility. We test the relevance of this form with a misfit analysis against the power spectrum of the observational NGDC-720 lithospheric magnetic field model. This allows us to estimate, at the 95 per cent confidence level, a mean global apparent induced magnetization value between 0.3 and 0.6 A m⁻¹, a mean magnetic crustal thickness value between 23 and 30 km, and a root-mean-square field value between 190 and 205 nT. These estimates are in good agreement with independent models of crustal magnetization and of seismic crustal thickness. We carry out the same analysis in the continental and oceanic domains separately. We complement the misfit analyses with a Kolmogorov-Smirnov goodness-of-fit test and conclude that the observed power spectrum can in each case be a sample of the statistical one.
Statistical analysis of iron geochemical data suggests limited late Proterozoic oxygenation
NASA Astrophysics Data System (ADS)
Sperling, Erik A.; Wolock, Charles J.; Morgan, Alex S.; Gill, Benjamin C.; Kunzmann, Marcus; Halverson, Galen P.; MacDonald, Francis A.; Knoll, Andrew H.; Johnston, David T.
2015-07-01
Sedimentary rocks deposited across the Proterozoic-Phanerozoic transition record extreme climate fluctuations, a potential rise in atmospheric oxygen or re-organization of the seafloor redox landscape, and the initial diversification of animals. It is widely assumed that the inferred redox change facilitated the observed trends in biodiversity. Establishing this palaeoenvironmental context, however, requires that changes in marine redox structure be tracked by means of geochemical proxies and translated into estimates of atmospheric oxygen. Iron-based proxies are among the most effective tools for tracking the redox chemistry of ancient oceans. These proxies are inherently local, but have global implications when analysed collectively and statistically. Here we analyse about 4,700 iron-speciation measurements from shales 2,300 to 360 million years old. Our statistical analyses suggest that subsurface water masses in mid-Proterozoic oceans were predominantly anoxic and ferruginous (depleted in dissolved oxygen and iron-bearing), but with a tendency towards euxinia (sulfide-bearing) that is not observed in the Neoproterozoic era. Analyses further indicate that early animals did not experience appreciable benthic sulfide stress. Finally, unlike proxies based on redox-sensitive trace-metal abundances, iron geochemical data do not show a statistically significant change in oxygen content through the Ediacaran and Cambrian periods, sharply constraining the magnitude of the end-Proterozoic oxygen increase. Indeed, this re-analysis of trace-metal data is consistent with oxygenation continuing well into the Palaeozoic era. Therefore, if changing redox conditions facilitated animal diversification, it did so through a limited rise in oxygen past critical functional and ecological thresholds, as is seen in modern oxygen minimum zone benthic animal communities.
Towards Precision Spectroscopy of Baryonic Resonances
NASA Astrophysics Data System (ADS)
Döring, Michael; Mai, Maxim; Rönchen, Deborah
2017-01-01
Recent progress in baryon spectroscopy is reviewed. In a common effort, various groups have analyzed a set of new high-precision polarization observables from ELSA. The Jülich-Bonn group has finalized the analysis of pion-induced meson-baryon production, the photoproduction of pions and eta mesons, and (almost) the KΛ final state. As data become more precise, statistical aspects in the analysis of excited baryons become increasingly relevant, and several advances in this direction are proposed.
Physics Education: A Significant Backbone of Sustainable Development in Developing Countries
NASA Astrophysics Data System (ADS)
Akintola, R. A.
2006-08-01
In the quest for technological self-reliance, many policies, programs and projects have been proposed and implemented in order to procure solutions to the problems of technological inadequacy in developing countries. It has been observed that all of these have failed. This research identifies the problems, proposes lasting solutions to emancipate physics education in developing nations, and highlights possible future gains. The statistical analysis employed was based on questionnaires, interviews and data analysis.
Materials Approach to Dissecting Surface Responses in the Attachment Stages of Biofouling Organisms
2016-04-25
their settlement behavior with regard to the coating surfaces. 5) Multivariate statistical analysis was used to examine the effect (if any) of the... applied to glass rods and were deployed in the field to evaluate settlement preferences. Canonical Analysis of Principal Coordinates was applied to... the influence of coating surface properties on the patterns in settlement observed in the field in the extension of this work over the coming year
Descalzo, Miguel Á; Garcia, Virginia Villaverde; González-Alvaro, Isidoro; Carbonell, Jordi; Balsa, Alejandro; Sanmartí, Raimon; Lisbona, Pilar; Hernandez-Barrera, Valentín; Jiménez-Garcia, Rodrigo; Carmona, Loreto
2013-02-01
To describe the results of different statistical approaches to a radiographic outcome affected by missing data (the multiple imputation (MI) technique, inverse probability weights, and complete case (CC) analysis), using data from an observational study. A random sample of 96 RA patients was selected for a follow-up study in which radiographs of hands and feet were scored. Radiographic progression was tested by comparing the change in the total Sharp-van der Heijde radiographic score (TSS) and the joint erosion score (JES) from baseline to the end of the second year of follow-up. The MI technique, inverse probability weights in a weighted estimating equation (WEE), and CC analysis were used to fit a negative binomial regression. Major predictors of radiographic progression were JES and joint space narrowing (JSN) at baseline, together with baseline disease activity measured by DAS28 for TSS and MTX use for JES. Results from the CC analysis show larger coefficients and standard errors compared with the MI and weighted techniques. The results from the WEE model were quite in line with those of MI. If it seems plausible that CC or MI analysis may be valid, then MI should be preferred because of its greater efficiency. CC analysis resulted in inefficient estimates or, translated into non-statistical terminology, could guide us to inaccurate results and unwise conclusions. The methods discussed here will contribute to the use of alternative approaches for tackling missing data in observational studies.
NASA Astrophysics Data System (ADS)
Kovalenko, I. D.; Doressoundiram, A.; Lellouch, E.; Vilenius, E.; Müller, T.; Stansberry, J.
2017-11-01
Context. Gravitationally bound multiple systems provide an opportunity to estimate the mean bulk density of the objects, whereas this characteristic is not available for single objects. Being a primitive population of the outer solar system, binary and multiple trans-Neptunian objects (TNOs) provide unique information about bulk density and internal structure, improving our understanding of their formation and evolution. Aims: The goal of this work is to analyse the parameters of multiple trans-Neptunian systems observed with the Herschel and Spitzer space telescopes. In particular, statistical analysis is performed for the radiometric size and geometric albedo obtained from photometric observations, and for the estimated bulk density. Methods: We use Monte Carlo simulation to estimate the real size distribution of TNOs. For this purpose, we expand the dataset of diameters by adopting the Minor Planet Center database list with the available values of absolute magnitude therein, together with the albedo distribution derived from Herschel radiometric measurements. We use the two-sample Anderson-Darling non-parametric statistical method to test whether the two samples of diameters, for binary and single TNOs, come from the same distribution. Additionally, we use Spearman's coefficient as a measure of rank correlations between parameters. Uncertainties of the estimated parameters, together with the lack of data, are taken into account. Conclusions about correlations between parameters are based on statistical hypothesis testing. Results: We have found that the difference in the size distributions of multiple and single TNOs is biased by small objects. The test on correlations between parameters shows that the effective diameter of binary TNOs correlates strongly with heliocentric orbital inclination and with the magnitude difference between the components of a binary system. The correlation between diameter and magnitude difference implies that small and large binaries are formed by different mechanisms. Furthermore, the statistical test indicates, although not significantly given the sample size, that a moderately strong correlation exists between diameter and bulk density. Herschel is an ESA space observatory with science instruments provided by European-led Principal Investigator consortia and with important participation from NASA.
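A minimal sketch of the two named tests applied to invented diameter and inclination data, using scipy's anderson_ksamp and spearmanr; the real analysis additionally propagates measurement uncertainties.

```python
# Anderson-Darling k-sample test and Spearman rank correlation on invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
d_binary = rng.lognormal(mean=5.0, sigma=0.6, size=40)    # diameters, km
d_single = rng.lognormal(mean=4.8, sigma=0.7, size=60)

ad = stats.anderson_ksamp([d_binary, d_single])           # same distribution?
print("Anderson-Darling statistic:", round(ad.statistic, 3),
      "significance level:", ad.significance_level)

inclination = rng.uniform(0, 30, size=40)                 # invented, degrees
rho, p = stats.spearmanr(d_binary, inclination)           # rank correlation
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```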
Conservative Tests under Satisficing Models of Publication Bias.
McCrary, Justin; Christensen, Garret; Fanelli, Daniele
2016-01-01
Publication bias leads consumers of research to observe a selected sample of statistical estimates calculated by producers of research. We calculate critical values for statistical significance that could help to adjust after the fact for the distortions created by this selection effect, assuming that the only source of publication bias is file drawer bias. These adjusted critical values are easy to calculate and differ from unadjusted critical values by approximately 50%: rather than rejecting a null hypothesis when the t-ratio exceeds 2, the analysis suggests rejecting a null hypothesis when the t-ratio exceeds 3. Samples of published social science research indicate that on average, across research fields, approximately 30% of published t-statistics fall between the standard and adjusted cutoffs. PMID:26901834
2007-01-01
Background The US Food and Drug Administration approved the Charité artificial disc on October 26, 2004. This approval was based on an extensive analysis and review process; 20 years of disc usage worldwide; and the results of a prospective, randomized, controlled clinical trial that compared lumbar artificial disc replacement to fusion. The results of the investigational device exemption (IDE) study led to a conclusion that clinical outcomes following lumbar arthroplasty were at least as good as outcomes from fusion. Methods The author performed a new analysis of the Visual Analog Scale pain scores and the Oswestry Disability Index scores from the Charité artificial disc IDE study and used a nonparametric statistical test, because observed data distributions were not normal. The analysis included all of the enrolled subjects in both the nonrandomized and randomized phases of the study. Results Subjects from both the treatment and control groups improved from the baseline situation (P < .001) at all follow-up times (6 weeks to 24 months). Additionally, these pain and disability levels with artificial disc replacement were superior (P < .05) to the fusion treatment at all follow-up times including 2 years. Conclusions The a priori statistical plan for an IDE study may not adequately address the final distribution of the data. Therefore, statistical analyses more appropriate to the distribution may be necessary to develop meaningful statistical conclusions from the study. A nonparametric statistical analysis of the Charité artificial disc IDE outcomes scores demonstrates superiority for lumbar arthroplasty versus fusion at all follow-up time points to 24 months. PMID:25802574
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
NASA Astrophysics Data System (ADS)
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismicity are studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes at different magnitude levels. It is known that the interoccurrence time statistics are well approximated by the Weibull distribution, but more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. First, we propose the Embedding Equation Theory (EET), in which the conditional probability is described by two kinds of correlation coefficients: one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from the numerical data analysis carried out with the Preliminary Determination of Epicenters (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Second, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics, and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results reproduce well all the numerical data in our analysis, where several common features or invariant aspects are clearly observed. In particular, in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve, and in the case of non-stationary (moving-time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving time. It is emphasized that the multi-fractal relation plays an important role in unifying the statistical laws of seismicity: the Gutenberg-Richter law and the Weibull distribution are in fact unified in the multi-fractal relation, and some universality conjectures regarding seismicity are briefly discussed.
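The Weibull form referred to is the standard one: for interoccurrence time \(\tau\) with scale \(\tau_0\) and shape \(\beta\),

\[ P(\tau) = \frac{\beta}{\tau_0}\left(\frac{\tau}{\tau_0}\right)^{\beta-1} \exp\!\left[-\left(\frac{\tau}{\tau_0}\right)^{\beta}\right]; \]

the paper's conditional-probability and multi-fractal expressions are not reproduced here.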
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahfuz, H.; Maniruzzaman, M.; Vaidya, U.
1997-04-01
Monotonic tensile and fatigue response of continuous silicon carbide fiber reinforced silicon nitride (SiC{sub f}/Si{sub 3}N{sub 4}) composites has been investigated. The monotonic tensile tests have been performed at room and elevated temperatures. Fatigue tests have been conducted at room temperature (RT), at a stress ratio R = 0.1 and a frequency of 5 Hz. It is observed during the monotonic tests that the composite retains only 30% of its room-temperature strength at 1,600°C, suggesting substantial chemical degradation of the matrix at that temperature. The softening of the matrix at elevated temperature also causes a reduction in tensile modulus; the total reduction in modulus is around 45%. Fatigue data have been generated at three load levels, and the fatigue strength of the composite has been found to be considerably high, about 75% of its ultimate room-temperature strength. Extensive statistical analysis has been performed to understand the degree of scatter in the fatigue as well as in the static test data. Weibull shape factors and characteristic values have been determined for each set of tests, and their relationship with the response of the composites has been discussed. A statistical fatigue life prediction method developed from the Weibull distribution is also presented. A maximum likelihood estimator with censoring techniques and data pooling schemes has been employed to determine the distribution parameters for the statistical analysis. These parameters have been used to generate the S-N diagram with the desired level of reliability. Details of the statistical analysis and a discussion of the static and fatigue behavior of the composites are presented in this paper.
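A minimal sketch of the Weibull step of such a scatter analysis, fitting shape and characteristic life to invented, uncensored fatigue lives by plain maximum likelihood; the paper's censored maximum-likelihood estimator and pooling schemes are not reproduced.

```python
# Two-parameter Weibull fit to invented fatigue lives.
import numpy as np
from scipy import stats

lives = np.array([1.2e5, 2.3e5, 3.1e5, 1.8e5, 4.0e5, 2.7e5, 3.6e5, 2.1e5])  # cycles

shape, loc, scale = stats.weibull_min.fit(lives, floc=0)   # location fixed at 0
print(f"Weibull shape (slope) = {shape:.2f}, characteristic life = {scale:.3g} cycles")

# life with 95% reliability (5th percentile of the fitted distribution)
print(f"B5 life ~ {stats.weibull_min.ppf(0.05, shape, scale=scale):.3g} cycles")
```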
Singh, Ajai; Kumar, Vineet; Ali, Sabir; Mahdi, Abbas Ali; Srivastava, Rajeshwer Nath
2017-01-01
Aims: The aim of this study is to analyze the serial estimation of phosphorylated neurofilament heavy (pNF-H) in blood plasma as a potential biomarker for early prediction of the neurological severity of acute spinal cord injuries (SCI) in adults. Settings and Design: Pilot study/observational study. Subjects and Methods: A total of 40 patients (28 cases and 12 controls) of spine injury were included in this study. In the enrolled cases, the plasma level of pNF-H was evaluated in blood samples and neurological evaluation was performed with the American Spinal Injury Association Injury Scale at specified periods. Serial plasma neurofilament heavy values were then correlated with the neurological status of these patients during follow-up visits and analyzed statistically. Statistical Analysis Used: Statistical analysis was performed using GraphPad InStat software (version 3.05 for Windows, San Diego, CA, USA). The correlation analysis between clinical progression and pNF-H expression was done using Spearman's correlation. Results: The mean baseline level of pNF-H in cases was 6.40 ± 2.49 ng/ml, whereas in controls it was 0.54 ± 0.27 ng/ml. On analyzing the association between the two by the Mann-Whitney U test, the difference in levels was found to be statistically significant. The association between neurological progression and pNF-H expression was determined using correlation analysis (Spearman's correlation). At the 95% confidence level, the correlation coefficient was found to be 0.64, and the correlation was statistically significant. Conclusions: Plasma pNF-H levels were elevated in accordance with the severity of SCI. Therefore, pNF-H may be considered a potential biomarker for early determination of the severity of SCI in adult patients. PMID:29291173
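For readers unfamiliar with the two tests named above, a minimal Python sketch using scipy reproduces the workflow on invented placeholder values (not the study data):

    # Sketch of the two tests named in the abstract; values are placeholders.
    import numpy as np
    from scipy import stats

    cases    = np.array([6.4, 5.1, 8.2, 4.9, 7.3])   # ng/ml, hypothetical
    controls = np.array([0.5, 0.3, 0.8, 0.6])

    u, p_u = stats.mannwhitneyu(cases, controls, alternative="two-sided")
    print(f"Mann-Whitney U = {u:.1f}, p = {p_u:.4f}")

    severity = np.array([4, 3, 5, 2, 4])             # clinical grade, hypothetical
    rho, p_rho = stats.spearmanr(cases, severity)
    print(f"Spearman rho = {rho:.2f}, p = {p_rho:.4f}")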
Interactions and triggering in a 3D rate and state asperity model
NASA Astrophysics Data System (ADS)
Dublanchet, P.; Bernard, P.
2012-12-01
Precise relocation of micro-seismicity and careful analysis of seismic source parameters have progressively imposed the concept of seismic asperities embedded in a creeping fault segment as one of the most important aspects that should appear in a realistic representation of micro-seismic sources. Another important issue concerning micro-seismic activity is the existence of robust empirical laws describing the temporal and magnitude distribution of earthquakes, such as the Omori law, the distribution of inter-event times and the Gutenberg-Richter law. In this framework, this study aims at understanding the statistical properties of earthquakes by generating synthetic catalogs with a 3D, quasi-dynamic, continuous rate and state asperity model that takes into account a realistic geometry of asperities. Our approach contrasts with ETAS models (Kagan and Knopoff, 1981) usually implemented to produce earthquake catalogs, in the sense that the nonlinearity observed in rock friction experiments (Dieterich, 1979) is fully taken into account by the use of a rate and state friction law. Furthermore, our model differs from discrete models of faults (Ziv and Cochard, 2006) because the continuity allows us to define realistic geometries and distributions of asperities by assembling sub-critical computational cells that always fail in a single event. Moreover, this model allows us to address the question of the influence of barriers and the distribution of asperities on the event statistics. After recalling the main observations of asperities in the specific case of the Parkfield segment of the San Andreas Fault, we analyse the earthquake statistical properties computed for this area. Then, we present synthetic statistics obtained by our model that allow us to discuss the role of barriers in clustering and triggering phenomena among a population of sources. It appears that an effective barrier size, which depends on its frictional strength, controls the presence or absence, in the synthetic catalog, of statistical laws similar to those observed for real earthquakes. As an application, we attempt to draw a comparison between synthetic statistics and the observed statistics of Parkfield in order to characterize what could be a realistic frictional model of the Parkfield area. More generally, we obtained synthetic statistical properties that are in agreement with power-law decays characterized by exponents that match the observations at a global scale, showing that our mechanical model is able to provide new insights into the understanding of earthquake interaction processes in general.
NASA Astrophysics Data System (ADS)
Abbasnezhadi, K.; Rasmussen, P. F.; Stadnyk, T.
2014-12-01
This study was undertaken to gain a better understanding of the spatiotemporal distribution of rainfall over the Churchill River basin. The research incorporates gridded precipitation data from the Canadian Precipitation Analysis (CaPA) system. CaPA has been developed by Environment Canada and provides near real-time precipitation estimates on a 10 km by 10 km grid over North America at a temporal resolution of 6 hours. The spatial fields are generated by combining forecasts from the Global Environmental Multiscale (GEM) model with precipitation observations from the network of synoptic weather stations. CaPA's skill is highly influenced by the number of weather stations in the region of interest as well as by the quality of the observations. To evaluate the performance of CaPA as a function of the density of the weather station network, a dual-stage design algorithm that simulates CaPA using generated weather fields is proposed. More specifically, we adopt a controlled design algorithm generally known as an Observing System Simulation Experiment (OSSE). The advantage of this experiment is that one can define reference precipitation fields assumed to represent the true state of rainfall over the region of interest. In the first stage of the OSSE, a coupled stochastic model of gridded precipitation and temperature fields is calibrated and validated. The performance of the generator is then validated by comparing model statistics with observed statistics and by using the generated samples as input to the WATFLOOD™ hydrologic model. In the second stage of the experiment, to account for the systematic error of station observations and GEM fields, representative errors are added to the reference field using by-products of CaPA's variographic analysis, which describe the variance of station observations and background errors.
Management Modalities for Traumatic Macular Hole: A Systematic Review and Single-Arm Meta-Analysis.
Gao, Min; Liu, Kun; Lin, Qiurong; Liu, Haiyun
2017-02-01
The purposes of this study were to (i) determine macular hole (MH) closure rates and visual outcomes by comparing two methods of managing traumatic MH (TMH), an event resulting in severe loss of visual acuity (VA); (ii) characterize patients who undergo spontaneous TMH closure; (iii) determine which TMH patients should be observed before resorting to surgical repair; and (iv) elucidate factors that influence postoperative visual outcomes. Studies (n=10) of patients who were managed by surgery or observation for TMH were meta-analyzed retrospectively. Management modalities included surgical repair (surgery group) and observation for spontaneous hole closure (observation group). In addition, a 12-case series drawn from articles (1990-2014) on spontaneous hole closure was statistically summarized. SAS and Comprehensive Meta-Analysis (CMA) (version 3.0) were used for analysis. For surgery group patients, the fixed-model pooled event rate for hole closure was 0.919 (range, 0.861-0.954) and for observation group patients, 0.368 (range, 0.236-0.448). The random-model pooled event rate for improvement of VA for surgery group patients was 0.748 (range, 0.610-0.849) and for observation group patients, 0.505 (range, 0.397-0.613). In the spontaneous-closure case series, the mean age at closure was 18.71±10.64 years; the mean size of TMHs, 0.18±0.06 disc diameters (DD); and the mean time to hole closure, 3.38±3.08 months. Hole closure and VA improvement rates of surgery group patients were significantly higher than those of observation group patients. Patients ≤ 24 years of age with MH sizes ≤ 0.2 DD were more likely to achieve spontaneous hole closure. The interval from injury to surgery was statistically significantly associated with the level of visual improvement.
Cancer mortality in Minamata disease patients exposed to methylmercury through fish diet.
Kinjo, Y; Akiba, S; Yamaguchi, N; Mizuno, S; Watanabe, S; Wakamiya, J; Futatsuka, M; Kato, H
1996-09-01
We report here a historical cohort study on cancer mortality among Minamata disease (MD) patients (n = 1,351) in Kagoshima and Kumamoto Prefectures of Japan. Taking into account their living area, sex, age and fish-eating habits, residents (n = 5,667; 40 years of age or over in 1966) living in coastal areas of Kagoshima who consumed fish daily were selected as a reference group from the six-prefecture cohort study conducted by Hirayama et al. The observation periods of the MD patients and of the reference group were from 1973 to 1984 and from 1970 to 1981, respectively. Survival analysis using the Poisson regression model was applied to compare mortality between the MD patients and the reference group. No excess relative risk (RR), adjusted for attained age, sex and follow-up period, was observed for mortality from all causes, all cancers, or all non-cancers combined. Analysis of site-specific cancers showed a statistically significant decrease in mortality from stomach cancer among MD patients (RR, 0.49; 95% confidence interval, 0.26-0.94). In addition, a statistically significant eight-fold excess risk, based on 5 observed deaths, was noted for mortality from leukemia (RR, 8.35; 95% confidence interval, 1.61-43.3). It is, however, unlikely that these observed risks derive from methylmercury exposure alone. Further studies are needed to understand the mechanisms involved in the observed risks among MD patients.
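A minimal sketch of a Poisson-regression rate comparison of this kind, with person-years as an offset, is shown below; all counts and person-years are fabricated placeholders, and the code is only analogous in spirit to the authors' survival analysis:

    # Sketch: Poisson regression rate ratio with a person-years offset.
    # All numbers are fabricated placeholders.
    import numpy as np
    import statsmodels.api as sm

    deaths  = np.array([5, 12])          # deaths: exposed group, reference group
    pyears  = np.array([1.2e4, 5.6e4])   # person-years at risk
    exposed = np.array([1.0, 0.0])       # 1 = patients, 0 = reference

    X = sm.add_constant(exposed)
    model = sm.GLM(deaths, X, family=sm.families.Poisson(),
                   offset=np.log(pyears)).fit()
    rr = np.exp(model.params[1])         # rate ratio for the exposure term
    ci = np.exp(model.conf_int()[1])
    print(f"RR = {rr:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")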
DOE Office of Scientific and Technical Information (OSTI.GOV)
van Lier-Walqui, Marcus; Fridlind, Ann; Ackerman, Andrew S
2016-02-01
The representation of deep convection in general circulation models is in part informed by cloud-resolving models (CRMs) that function at higher spatial and temporal resolution; however, recent studies have shown that CRMs often fail at capturing the details of deep convection updrafts. With the goal of providing constraints on CRM simulation of deep convection updrafts, ground-based remote sensing observations are analyzed and statistically correlated for four deep convection events observed during the Midlatitude Continental Convective Clouds Experiment (MC3E). Since positive values of specific differential phase observed above the melting level are associated with deep convection updraft cells, so-called columns are analyzed using two scanning polarimetric radars in Oklahoma: the National Weather Service Vance WSR-88D (KVNX) and the Department of Energy C-band Scanning Atmospheric Radiation Measurement (ARM) Precipitation Radar (C-SAPR). KVNX and C-SAPR column volumes are then statistically correlated with vertical winds retrieved via multi-Doppler wind analysis, lightning flash activity derived from the Oklahoma Lightning Mapping Array, and KVNX differential reflectivity. Results indicate strong correlations of column volume above the melting level with updraft mass flux, lightning flash activity, and intense rainfall. Analysis of the columns reveals signatures of changing updraft properties from one storm event to another as well as during event evolution. Comparison of the two radars' column measures shows commonalities in their information content, as well as potential problems associated with observational artifacts.
The effect of viewing distance on observer performance in skeletal radiographs
NASA Astrophysics Data System (ADS)
Butler, M. L.; Lowe, J.; Toomey, R. J.; Maher, M.; Evanoff, M. E.; Rainford, L.
2013-03-01
A number of different viewing distances are recommended by international agencies; however, none makes specific reference to radiologist performance. The purpose of this study was to ascertain the extent to which radiologists' performance in softcopy skeletal reporting is affected by viewing distance. Eighty dorsi-palmar (DP) wrist radiographs, of which half featured one or more fractures, were viewed by seven observers at two viewing distances, 30 cm and 70 cm. Observers rated the images as normal or not on a scale of 1 to 5 and could mark multiple locations on the images when they visualised a fracture. Viewing distance was measured from the centre of the face plate to the outer canthus of the eye. The DBM MRMC analysis showed no statistically significant difference between the areas under the curve for the two distances (p = 0.482). The JAFROC analysis, however, demonstrated a statistically significantly higher area under the curve with the 30 cm viewing distance than with the 70 cm distance (p = 0.035). This suggests that while observers were able to make decisions about whether an image contained a fracture equally well at both viewing distances, they may have been less reliable in terms of fracture localisation or detection of multiple fractures. The impact of viewing distance warrants further attention from both clinical and scientific perspectives.
Isolating the anthropogenic component of Arctic warming
Chylek, Petr; Hengartner, Nicholas; Lesins, Glen; ...
2014-05-28
Structural equation modeling is used in statistical applications as both confirmatory and exploratory modeling to test models and to suggest the most plausible explanation for a relationship between independent and dependent variables. Although structural analysis cannot prove causation, it can suggest the most plausible set of factors that influence the observed variable. Here, we apply structural model analysis to the annual mean Arctic surface air temperature from 1900 to 2012 to find the most effective set of predictors and to isolate the anthropogenic component of the recent Arctic warming by subtracting the effects of natural forcing and variability from the observed temperature. We find that anthropogenic greenhouse gas and aerosol radiative forcing and the Atlantic Multidecadal Oscillation internal mode dominate Arctic temperature variability. Finally, our structural model analysis of observational data suggests that about half of the recent Arctic warming of 0.64 K/decade may have anthropogenic causes.
Patange Subba Rao, Sheethal Prasad; Lewis, James; Haddad, Ziad; Paringe, Vishal; Mohanty, Khitish
2014-10-01
The aim of the study was to evaluate the inter-observer reliability and intra-observer reproducibility of the three-column and Schatzker classification systems using 2D and 3D CT models. Fifty-two consecutive patients with tibial plateau fractures were evaluated by five orthopaedic surgeons. All patients were classified into the Schatzker and three-column classification systems using x-rays and 2D and 3D CT images. Inter-observer reliability was evaluated in the first round, and intra-observer reliability was determined during a second round 2 weeks later. The average intra-observer reproducibility for the three-column classification ranged from substantial to excellent in all subclassifications, compared with the Schatzker classification. The inter-observer kappa values increased from substantial to excellent for the three-column classification and to moderate for the Schatzker classification. The average values for the three-column classification across all categories are as follows: (I-III) k2D = 0.718, 95% CI 0.554-0.864, p < 0.0001 and average k3D = 0.874, 95% CI 0.754-0.890, p < 0.0001. For the Schatzker classification system, the average values for all six categories are as follows: (I-VI) k2D = 0.536, 95% CI 0.365-0.685, p < 0.0001 and average k3D = 0.552, 95% CI 0.405-0.700, p < 0.0001. These values are statistically significant. Statistically significant inter-observer values in both rounds were noted with the three-column classification, making it statistically an excellent agreement. The intra-observer reproducibility of the three-column classification was improved compared with the Schatzker classification. The three-column classification seems to be an effective way to characterise and classify fractures of the tibial plateau.
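The kappa agreement computation underlying such inter-observer statistics can be sketched in a few lines; the two rating vectors below are invented examples, not the study data:

    # Sketch of the Cohen kappa agreement statistic; ratings are invented.
    from sklearn.metrics import cohen_kappa_score

    observer_1 = [1, 2, 2, 3, 1, 3, 2, 1]   # class assignments, hypothetical
    observer_2 = [1, 2, 3, 3, 1, 3, 2, 2]
    kappa = cohen_kappa_score(observer_1, observer_2)
    print(f"Cohen kappa = {kappa:.3f}")     # >0.8 is usually read as excellent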
Ménard, Richard; Deshaies-Jacques, Martin; Gasset, Nicolas
2016-09-01
An objective analysis is one of the main components of data assimilation. By combining observations with the output of a predictive model we combine the best features of each source of information: the complete spatial and temporal coverage provided by models, with a close representation of the truth provided by observations. The process of combining observations with a model output is called an analysis. To produce an analysis requires knowledge of the observation and model errors, as well as the spatial correlation of the model error. This paper is devoted to the development of methods of estimating these error variances and the characteristic length-scale of the model error correlation for operational use in the Canadian objective analysis system. We first argue in favor of using compact support correlation functions, and then introduce three estimation methods: the Hollingsworth-Lönnberg (HL) method in local and global form, the maximum likelihood method (ML), and the χ² diagnostic method. We perform one-dimensional (1D) simulation studies where the error variance and true correlation length are known, and estimate both error variances and the correlation length when both are non-uniform. We show that a local version of the HL method can capture the error variances and correlation length accurately at each observation site, provided that the spatial variability is not too strong. However, the operational objective analysis requires only a single, globally valid correlation length. We examine whether any statistic of the local HL correlation lengths could be a useful estimate, or whether other global estimation methods, such as the global HL, ML, or χ² methods, should be used. We found, both in 1D simulation and using real data, that the ML method is able to capture physically significant aspects of the correlation length, while most other estimates give unphysical and larger length-scale values. This paper describes a proposed improvement of the objective analysis of surface pollutants at Environment and Climate Change Canada (formerly known as Environment Canada). Objective analyses are essentially surface maps of air pollutants that are obtained by combining observations with an air quality model output, and are thought to provide a complete and more accurate representation of the air quality. The highlight of this study is an analysis of methods to estimate the model (or background) error correlation length-scale. The error statistics are an important and critical component of the analysis scheme.
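A rough sketch of the local Hollingsworth-Lönnberg idea, under strong simplifying assumptions, is to bin covariances of observation-minus-background residuals by station separation and fit a correlated (background) component whose extrapolation to zero separation isolates the background error variance; the binned covariances and the exponential correlation shape below are assumptions for illustration only:

    # Sketch: fit a background-error covariance model to binned O-B covariances.
    # Data and correlation shape are illustrative assumptions.
    import numpy as np
    from scipy.optimize import curve_fit

    sep_km = np.array([25, 50, 100, 200, 400, 800], dtype=float)
    cov    = np.array([0.82, 0.74, 0.60, 0.40, 0.18, 0.04])  # binned covariances

    def background_cov(r, sigma_b2, L):
        return sigma_b2 * np.exp(-r / L)   # assumed exponential correlation model

    (sigma_b2, L), _ = curve_fit(background_cov, sep_km, cov, p0=[1.0, 200.0])
    # the remainder of the total variance at r = 0 is attributed to observation error
    print(f"background variance = {sigma_b2:.2f}, length-scale = {L:.0f} km")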
Lonjon, Guillaume; Porcher, Raphael; Ergina, Patrick; Fouet, Mathilde; Boutron, Isabelle
2017-05-01
To describe the evolution of the use and reporting of propensity score (PS) analysis in observational studies assessing a surgical procedure. Assessing surgery in randomized controlled trials raises several challenges. Observational studies with PS analysis are a robust alternative for comparative effectiveness research. In this methodological systematic review, we identified all PubMed reports of observational studies with PS analysis that evaluated a surgical procedure and described the evolution of their use over time. Then, we selected a sample of articles published from August 2013 to July 2014 and systematically appraised the quality of reporting and potential bias of the PS analysis used. We selected 652 reports of observational studies with PS analysis. The publications increased over time, from 1 report in 1987 to 198 in 2013. Among the 129 reports assessed, 20% (n = 24) did not detail the covariates included in the PS and 77% (n = 100) did not report a justification for including these covariates in the PS. The rate of missing data for potential covariates was reported in 9% of articles. When a crossover by conversion was possible, only 14% of reports (n = 12) mentioned this issue. For matched analysis, 10% of articles reported all 4 key elements that allow for reproducibility of a PS-matched analysis (matching ratio, method to choose the nearest neighbors, replacement and method for statistical analysis). Observational studies with PS analysis in surgery are increasing in frequency, but specific methodological issues and weaknesses in reporting exist.
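For orientation, the four reproducibility elements listed above for a PS-matched analysis can be made concrete with a small sketch (synthetic data; a logistic-regression PS with 1:1 greedy nearest-neighbour matching; matching here is effectively with replacement, and a strict no-replacement scheme would additionally remove each control once matched):

    # Sketch: propensity score estimation and 1:1 nearest-neighbour matching.
    # Data are synthetic; this is not the method of any reviewed study.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))                    # covariates
    treated = (X[:, 0] + rng.normal(size=200)) > 0   # non-random assignment

    ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

    # 1:1 nearest-neighbour matching on the PS (greedy, with replacement)
    t_idx, c_idx = np.where(treated)[0], np.where(~treated)[0]
    nn = NearestNeighbors(n_neighbors=1).fit(ps[c_idx, None])
    pairs = nn.kneighbors(ps[t_idx, None], return_distance=False).ravel()
    matched_controls = c_idx[pairs]
    print(f"{len(t_idx)} treated matched to {len(set(matched_controls))} controls")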
Statistical Analysis of Hubble/WFC3 Transit Spectroscopy of Extrasolar Planets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fu, Guangwei; Deming, Drake; Knutson, Heather
2017-10-01
Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST/WFC3 transit spectra for 1.1-1.65 μm water vapor absorption and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-Transmit code. We express the magnitude of the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.
Statistical Analysis of Hubble/WFC3 Transit Spectroscopy of Extrasolar Planets
NASA Astrophysics Data System (ADS)
Fu, Guangwei; Deming, Drake; Knutson, Heather; Madhusudhan, Nikku; Mandell, Avi; Fraine, Jonathan
2018-01-01
Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST/WFC3 transit spectra for 1.1-1.65 micron water vapor absorption, and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-Transmit code. We express the magnitude of the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.
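The normalization to scale heights described in this abstract can be made concrete with a short worked example; the planetary and stellar values below are illustrative assumptions, not taken from the observed sample:

    # Worked example: express a water-feature amplitude in scale heights,
    # H = k*T_eq / (mu * g). All planet/star values are illustrative.
    k_B  = 1.380649e-23      # Boltzmann constant, J/K
    m_H  = 1.6735575e-27     # hydrogen atom mass, kg
    T_eq = 1400.0            # K, assumed equilibrium temperature
    mu   = 2.3 * m_H         # mean molecular weight, H2-dominated atmosphere
    g    = 10.0              # m/s^2, assumed surface gravity

    H = k_B * T_eq / (mu * g)                    # scale height in metres
    delta_depth = 2.0e-4                         # assumed feature amplitude in transit depth
    R_star, R_p = 6.96e8, 9.0e7                  # m, assumed stellar/planet radii
    # depth = (Rp/Rs)^2, so d(depth) = 2*Rp*dRp/Rs^2 => dRp = d(depth)*Rs^2/(2*Rp)
    dR = delta_depth * R_star**2 / (2.0 * R_p)
    print(f"H = {H/1e3:.0f} km, absorption = {dR/H:.1f} scale heights")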
Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús
2009-01-01
Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short-term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large-scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at the scan and peptide levels was negligible in three large-scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replicate at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative for quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660
Statistical Analyses of Scatterplots to Identify Important Factors in Large-Scale Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kleijnen, J.P.C.; Helton, J.C.
1999-04-01
The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (1) linear relationships with correlation coefficients, (2) monotonic relationships with rank correlation coefficients, (3) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (4) trends in variability as defined by variances and interquartile ranges, and (5) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from the analyses include: (1) Type I errors are unavoidable, (2) Type II errors can occur when inappropriate analysis procedures are used, (3) physical explanations should always be sought for why statistical procedures identify variables as being important, and (4) the identification of important variables tends to be stable for independent Latin hypercube samples.
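The pattern search described above can be sketched as follows, moving from linear to distribution-free tests; x and y are synthetic stand-ins for a sampled input and a model output:

    # Sketch of tests (1)-(3) and (5) named above, on synthetic data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.uniform(0, 1, 500)
    y = np.sin(3 * x) + rng.normal(scale=0.3, size=500)   # nonlinear relationship

    print("linear:    ", stats.pearsonr(x, y))
    print("monotonic: ", stats.spearmanr(x, y))

    # trend in central tendency: Kruskal-Wallis across quantile bins of x
    bins = np.quantile(x, [0, .2, .4, .6, .8, 1])
    groups = [y[(x >= lo) & (x <= hi)] for lo, hi in zip(bins[:-1], bins[1:])]
    print("medians:   ", stats.kruskal(*groups))

    # deviation from randomness: chi-square on a coarse x-y grid
    table, _, _ = np.histogram2d(x, y, bins=4)
    chi2, p, dof, _ = stats.chi2_contingency(table)
    print(f"grid:       chi2 = {chi2:.1f}, p = {p:.3g}")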
Avalappampatty Sivasamy, Aneetha; Sundan, Bose
2015-01-01
The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
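A minimal sketch of the Hotelling T2 distance used for flagging anomalous traffic profiles is shown below; the training and probe vectors are synthetic placeholders, and the threshold comes from an empirical quantile rather than the paper's central-limit-theorem derivation:

    # Sketch: Hotelling T^2 anomaly score with an empirical threshold.
    # Feature vectors are synthetic placeholders.
    import numpy as np

    rng = np.random.default_rng(3)
    train = rng.normal(size=(1000, 5))          # normal-traffic feature vectors
    mu = train.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(train, rowvar=False))

    def t_square(x):
        d = x - mu
        return float(d @ S_inv @ d)             # T^2 = (x-mu)' S^-1 (x-mu)

    threshold = np.quantile([t_square(x) for x in train], 0.99)
    probe = rng.normal(loc=3.0, size=5)          # a shifted, attack-like profile
    print("attack" if t_square(probe) > threshold else "normal")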
Scaling Laws in Canopy Flows: A Wind-Tunnel Analysis
NASA Astrophysics Data System (ADS)
Segalini, Antonio; Fransson, Jens H. M.; Alfredsson, P. Henrik
2013-08-01
An analysis of velocity statistics and spectra measured above a wind-tunnel forest model is reported. Several measurement stations downstream of the forest edge have been investigated, and it is observed that, while the mean velocity profile adjusts quickly to the new canopy boundary condition, the turbulence lags behind and shows a continuous penetration towards the free stream along the canopy model. The statistical profiles illustrate this growth and do not collapse when plotted as a function of the vertical coordinate. However, when the statistics are plotted as a function of the local mean velocity (normalized with a characteristic velocity scale), they do collapse, independently of the streamwise position and freestream velocity. A new scaling for the spectra of all three velocity components is proposed based on the velocity variance and integral time scale. This normalization improves the collapse of the spectra compared to existing scalings adopted in atmospheric measurements, and allows the determination of a universal function that provides the velocity spectrum. Furthermore, a comparison of the proposed scaling laws for two different canopy densities is shown, demonstrating that the vertical velocity variance is the statistical quantity most sensitive to the characteristics of the canopy roughness.
Redshift data and statistical inference
NASA Technical Reports Server (NTRS)
Newman, William I.; Haynes, Martha P.; Terzian, Yervant
1994-01-01
Frequency histograms and the 'power spectrum analysis' (PSA) method, the latter developed by Yu & Peebles (1969), have been widely employed as techniques for establishing the existence of periodicities. We provide a formal analysis of these two classes of methods, including controlled numerical experiments, to better understand their proper use and application. In particular, we note that typical published applications of frequency histograms commonly employ far greater numbers of class intervals, or bins, than is advisable by statistical theory, sometimes giving rise to the appearance of spurious patterns. The PSA method generates a sequence of random numbers from observational data which, it is claimed, is exponentially distributed with unit mean and variance, essentially independent of the distribution of the original data. We show that the derived random process is nonstationary and produces a small but systematic bias in the usual estimates of the mean and variance. Although the derived variable may be reasonably described by an exponential distribution, the tail of the distribution is far removed from that of an exponential, thereby rendering statistical inference and confidence testing based on the tail of the distribution completely unreliable. Finally, we examine a number of astronomical examples wherein these methods have been used, giving rise to widespread acceptance of statistically unconfirmed conclusions.
NASA Astrophysics Data System (ADS)
Nykyri, K.; Moore, T.; Dimmock, A. P.
2017-12-01
In the Earth's magnetosphere, the magnetotail plasma sheet ions are much hotter than in the shocked solar wind. On the dawn sector, the cold-component ions are more abundant and hotter by 30-40 percent when compared to the dusk sector. Recent statistical studies of the flank magnetopause and magnetosheath have shown that the level of temperature asymmetry of the magnetosheath is unable to account for this, so additional physical mechanisms must be at play, either at the magnetopause or in the plasma sheet, that contribute to this asymmetry. In this study, we perform a statistical analysis of the ion-scale wave properties in the three main plasma regimes common to flank magnetopause boundary crossings when the boundary is unstable to the Kelvin-Helmholtz instability (KHI): hot and tenuous magnetospheric, cold and dense magnetosheath, and mixed [Hasegawa et al., 2004]. These statistics of ion-scale wave properties are compared to observations of fast magnetosonic wave modes that have recently been linked to Kelvin-Helmholtz vortex-centered ion heating [Moore et al., 2016]. The statistical analysis shows that during KH events there is enhanced non-adiabatic heating, calculated during (temporal) ion-scale wave intervals, when compared to non-KH events.
NASA Astrophysics Data System (ADS)
Franz, T. E.; Avery, W. A.; Finkenbiner, C. E.; Wang, T.; Brocca, L.
2014-12-01
Approximately 40% of global food production comes from irrigated agriculture. With the increasing demand for food, even greater pressures will be placed on water resources within these systems. In this work we aimed to characterize the spatial and temporal patterns of soil moisture at the field scale (~500 m) using the newly developed cosmic-ray neutron rover near Waco, NE. Here we mapped the soil moisture of 144 quarter-section fields (a mix of maize, soybean, and natural areas) each week during the 2014 growing season (May to September). The 11 x 11 km study domain also contained 3 stationary cosmic-ray neutron probes for independent validation of the rover surveys. Basic statistical analysis of the domain indicated a strong inverted parabolic relationship between the mean and variance of soil moisture. The relationships between the mean and higher-order moments were not as strong. Geostatistical analysis indicated that the range of the soil moisture semi-variogram was significantly shorter during periods of heavy irrigation as compared to non-irrigated periods. Scaling analysis indicated strong power-law behavior between the variance of soil moisture and averaging area, with minimal dependence of the slope of the power-law function on mean soil moisture. Statistical relationships derived from the rover dataset offer a novel set of observations that will be useful in: 1) calibrating and validating land surface models, 2) calibrating and validating crop models, 3) soil moisture covariance estimates for statistical downscaling of remote sensing products such as SMOS and SMAP, and 4) providing center-pivot-scale mean soil moisture data for optimal irrigation timing and volume amounts.
Alignments of parity even/odd-only multipoles in CMB
NASA Astrophysics Data System (ADS)
Aluri, Pavan K.; Ralston, John P.; Weltman, Amanda
2017-12-01
We compare the statistics of parity even and odd multipoles of the cosmic microwave background (CMB) sky from Planck full mission temperature measurements. An excess power in odd multipoles compared to even multipoles has previously been found on large angular scales. Motivated by this apparent parity asymmetry, we evaluate directional statistics associated with even compared to odd multipoles, along with their significances. Primary tools are the Power tensor and Alignment tensor statistics. We limit our analysis to the first 60 multipoles i.e. l = [2, 61]. We find no evidence for statistically unusual alignments of even parity multipoles. More than one independent statistic finds evidence for alignments of anisotropy axes of odd multipoles, with a significance equivalent to ∼2σ or more. The robustness of alignment axes is tested by making Galactic cuts and varying the multipole range. Very interestingly, the region spanned by the (a)symmetry axes is found to broadly contain other parity (a)symmetry axes previously observed in the literature.
Saba, Luca; Atzeni, Matteo; Ribuffo, Diego; Mallarini, Giorgio; Suri, Jasjit S
2012-08-01
Our purpose was to compare two post-processing techniques, Maximum-Intensity-Projection (MIP) and Volume Rendering (VR), for the study of perforator arteries. Thirty patients who underwent Multi-Detector-Row CT Angiography (MDCTA) between February 2010 and May 2010 were retrospectively analyzed. For each patient and for each reconstruction method, the image quality was evaluated and the inter- and intra-observer agreement was calculated according to Cohen kappa statistics. The Hounsfield Unit (HU) value in the common femoral artery was quantified and the correlation (Pearson statistic) between image quality and HU value was explored. The Pearson r between the right and left common femoral artery was excellent (r=0.955). The highest image quality score was obtained using MIP for both observers (total value 75, with a mean value of 2.67, for observer 1 and total value of 79, with a mean value of 2.82, for observer 2). The highest agreement between the two observers was detected using the MIP protocol, with a Cohen kappa value of 0.856. The ROC area under the curve (Az) for VR is 0.786 (0.086 SD; p value=0.0009), whereas the ROC area under the curve (Az) for MIP is 0.928 (0.051 SD; p value=0.0001). MIP showed the optimal inter- and intra-observer agreement and the highest quality scores, and should therefore be used as the post-processing technique in the analysis of perforating arteries.
Evaluating the Impact of Database Heterogeneity on Observational Study Results
Madigan, David; Ryan, Patrick B.; Schuemie, Martijn; Stang, Paul E.; Overhage, J. Marc; Hartzema, Abraham G.; Suchard, Marc A.; DuMouchel, William; Berlin, Jesse A.
2013-01-01
Clinical studies that use observational databases to evaluate the effects of medical products have become commonplace. Such studies begin by selecting a particular database, a decision that published papers invariably report but do not discuss. Studies of the same issue in different databases, however, can and do generate different results, sometimes with strikingly different clinical implications. In this paper, we systematically study heterogeneity among databases, holding other study methods constant, by exploring relative risk estimates for 53 drug-outcome pairs and 2 widely used study designs (cohort studies and self-controlled case series) across 10 observational databases. When holding the study design constant, our analysis shows that estimated relative risks range from a statistically significant decreased risk to a statistically significant increased risk in 11 of 53 (21%) of drug-outcome pairs that use a cohort design and 19 of 53 (36%) of drug-outcome pairs that use a self-controlled case series design. This exceeds the proportion of pairs that were consistent across databases in both direction and statistical significance, which was 9 of 53 (17%) for cohort studies and 5 of 53 (9%) for self-controlled case series. Our findings show that clinical studies that use observational databases can be sensitive to the choice of database. More attention is needed to consider how the choice of data source may be affecting results. PMID:23648805
Upper Atmosphere Research Satellite (UARS) onboard attitude determination using a Kalman filter
NASA Technical Reports Server (NTRS)
Garrick, Joseph
1993-01-01
The Upper Atmosphere Research Satellite (UARS) requires highly accurate knowledge of its attitude to accomplish its mission. Propagation of the attitude state using gyro measurements is not sufficient to meet the accuracy requirements and must be supplemented by an observer/compensation process to correct for dynamics and observation anomalies. The process of amending the attitude state utilizes a well known method, the discrete Kalman filter. This study is a sensitivity analysis of the discrete Kalman filter as implemented in the UARS Onboard Computer (OBC). The stability of the Kalman filter used in the normal on-orbit control mode within the OBC is investigated for the effects of corrupted observations and nonlinear errors. Also, a statistical analysis of the residuals of the Kalman filter is performed. These analyses are based on simulations using the UARS Dynamics Simulator (UARSDSIM) and are compared against attitude requirements as defined by General Electric (GE). An independent verification of expected accuracies is performed using the Attitude Determination Error Analysis System (ADEAS).
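A generic discrete Kalman filter step (predict plus measurement update) is sketched below to make the filtering process concrete; the matrices are toy placeholders, not the UARS OBC dynamics or measurement models:

    # Sketch: one predict/update step of a discrete Kalman filter.
    # All matrices are toy placeholders.
    import numpy as np

    def kalman_step(x, P, z, F, Q, H, R):
        # propagate state and covariance with the dynamics model
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # amend the state with the new observation
        y = z - H @ x_pred                        # innovation (residual)
        S = H @ P_pred @ H.T + R                  # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new, y                    # residual y feeds the statistics

    # one toy update of a 2-state system observed through its first component
    x, P = np.zeros(2), np.eye(2)
    F, Q = np.array([[1., 1.], [0., 1.]]), 0.01 * np.eye(2)
    H, R = np.array([[1., 0.]]), np.array([[0.25]])
    x, P, resid = kalman_step(x, P, np.array([0.9]), F, Q, H, R)
    print(x, resid)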
NASA Astrophysics Data System (ADS)
Harrison, R. A.; Davies, J. A.; Barnes, D.; Byrne, J. P.; Perry, C. H.; Bothmer, V.; Eastwood, J. P.; Gallagher, P. T.; Kilpua, E. K. J.; Möstl, C.; Rodriguez, L.; Rouillard, A. P.; Odstrčil, D.
2018-05-01
We present a statistical analysis of coronal mass ejections (CMEs) imaged by the Heliospheric Imager (HI) instruments on board NASA's twin-spacecraft STEREO mission between April 2007 and August 2017 for STEREO-A and between April 2007 and September 2014 for STEREO-B. The analysis exploits a catalogue that was generated within the FP7 HELCATS project. Here, we focus on the observational characteristics of CMEs imaged in the heliosphere by the inner (HI-1) cameras, while following papers will present analyses of CME propagation through the entire HI fields of view. More specifically, in this paper we present distributions of the basic observational parameters - namely occurrence frequency, central position angle (PA) and PA span - derived from nearly 2000 detections of CMEs in the heliosphere by HI-1 on STEREO-A or STEREO-B from the minimum between Solar Cycles 23 and 24 to the maximum of Cycle 24; STEREO-A analysis includes a further 158 CME detections from the descending phase of Cycle 24, by which time communication with STEREO-B had been lost. We compare heliospheric CME characteristics with properties of CMEs observed at coronal altitudes, and with sunspot number. As expected, heliospheric CME rates correlate with sunspot number, and are not inconsistent with coronal rates once instrumental factors/differences in cataloguing philosophy are considered. As well as being more abundant, heliospheric CMEs, like their coronal counterparts, tend to be wider during solar maximum. Our results confirm previous coronagraph analyses suggesting that CME launch sites do not simply migrate to higher latitudes with increasing solar activity. At solar minimum, CMEs tend to be launched from equatorial latitudes, while at maximum, CMEs appear to be launched over a much wider latitude range; this has implications for understanding the CME/solar source association. Our analysis provides some supporting evidence for the systematic dragging of CMEs to lower latitude as they propagate outwards.
Friedman, David B
2012-01-01
All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.
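The PCA screening step described above can be sketched in a few lines; the data matrix is a random placeholder for a (samples x proteins) table of expression values:

    # Sketch: PCA projection of replicate expression profiles to look for
    # groupings and outliers. The matrix is a random placeholder.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(4)
    expression = rng.normal(size=(12, 800))       # 12 biological replicates

    scores = PCA(n_components=2).fit_transform(expression)
    for i, (pc1, pc2) in enumerate(scores):
        print(f"sample {i:2d}: PC1 = {pc1:6.2f}, PC2 = {pc2:6.2f}")
    # replicates of one condition should cluster; a lone distant point is a
    # candidate outlier or fouled sample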
Provision for Learners with Special Educational Needs in Botswana: A Situational Analysis
ERIC Educational Resources Information Center
Dart, Gareth
2007-01-01
This paper considers the support of children with special educational needs in Botswana. A variety of sources including policy documents, literature, statistical data, interviews with key personnel and observation, are used to analyse the context and delivery of provision. Botswana is a middle-income country that has seen rapid economic expansion…
Clinical Efficacy of Psychoeducational Interventions with Family Caregivers
ERIC Educational Resources Information Center
Limiñana-Gras, Rosa M.; Colodro-Conde, Lucía; Cuéllar-Flores, Isabel; Sánchez-López, M. Pilar
2016-01-01
The goal of this study is to investigate the efficacy of psychoeducational interventions geared to reducing psychological distress for caregivers in a sample of 90 family caregivers of elderly dependent (78 women and 12 men). We conducted an analysis of the statistical and clinical significance of the changes observed in psychological health…
Model Identification in Time-Series Analysis: Some Empirical Results.
ERIC Educational Resources Information Center
Padia, William L.
Model identification of time-series data is essential to valid statistical tests of intervention effects. Model identification is, at best, inexact in the social and behavioral sciences where one is often confronted with small numbers of observations. These problems are discussed, and the results of independent identifications of 130 social and…
ERIC Educational Resources Information Center
O'Connell, Ann Aileen
The relationships among types of errors observed during probability problem solving were studied. Subjects were 50 graduate students in an introductory probability and statistics course. Errors were classified as text comprehension, conceptual, procedural, and arithmetic. Canonical correlation analysis was conducted on the frequencies of specific…
Toward User Interfaces and Data Visualization Criteria for Learning Design of Digital Textbooks
ERIC Educational Resources Information Center
Railean, Elena
2014-01-01
User interface and data visualisation criteria are central issues in digital textbook design. However, when applying mathematical modelling of the learning process to the analysis of the possible solutions, it can be observed that results differ. Mathematical modelling of learning views cognition on the basis of statistics and probability theory, graph…
ERIC Educational Resources Information Center
Acar, Tu¨lin
2014-01-01
In literature, it has been observed that many enhanced criteria are limited by factor analysis techniques. Besides examinations of statistical structure and/or psychological structure, such validity studies as cross validation and classification-sequencing studies should be performed frequently. The purpose of this study is to examine cross…
Modeling Conditional Probabilities in Complex Educational Assessments. CSE Technical Report.
ERIC Educational Resources Information Center
Mislevy, Robert J.; Almond, Russell; Dibello, Lou; Jenkins, Frank; Steinberg, Linda; Yan, Duanli; Senturk, Deniz
An active area in psychometric research is coordinated task design and statistical analysis built around cognitive models. Compared with classical test theory and item response theory, there is often less information from observed data about the measurement-model parameters. On the other hand, there is more information from the grounding…
Challenge in Enhancing the Teaching and Learning of Variable Measurements in Quantitative Research
ERIC Educational Resources Information Center
Kee, Chang Peng; Osman, Kamisah; Ahmad, Fauziah
2013-01-01
Statistical analysis is one component that cannot be avoided in quantitative research. Initial observations noted that students in a higher education institution faced difficulty analysing quantitative data, which was attributed to confusion over the various variable measurements. This paper aims to compare the outcomes of two approaches applied in…
ERIC Educational Resources Information Center
Hicks, Catherine
2018-01-01
Purpose: This paper aims to explore predicting employee learning activity via employee characteristics and usage for two online learning tools. Design/methodology/approach: Statistical analysis focused on observational data collected from user logs. Data are analyzed via regression models. Findings: Findings are presented for over 40,000…
NASA Technical Reports Server (NTRS)
Bauman, William H., III
2010-01-01
The 45th Weather Squadron (45 WS) Launch Weather Officers use the 12-km resolution North American Mesoscale (NAM) model (MesoNAM) text and graphical product forecasts extensively to support launch weather operations. However, the actual performance of the model at Kennedy Space Center (KSC) and Cape Canaveral Air Force Station (CCAFS) has not been measured objectively. In order to have tangible evidence of model performance, the 45 WS tasked the Applied Meteorology Unit to conduct a detailed statistical analysis of model output compared to observed values. The model products are provided to the 45 WS by ACTA, Inc. and include hourly forecasts from 0 to 84 hours based on model initialization times of 00, 06, 12 and 18 UTC. The objective analysis compared the MesoNAM forecast winds, temperature and dew point, as well as the changes in these parameters over time, to the observed values from the sensors in the KSC/CCAFS wind tower network. Objective statistics will give the forecasters knowledge of the model's strengths and weaknesses, which will result in improved forecasts for operations.
A statistical study of ionopause perturbation and associated boundary wave formation at Venus.
NASA Astrophysics Data System (ADS)
Chong, G. S.; Pope, S. A.; Walker, S. N.; Zhang, T.; Balikhin, M. A.
2017-12-01
In contrast to Earth, Venus does not possess an intrinsic magnetic field. Hence the interaction between the solar wind and Venus is significantly different from that at Earth, even though these two planets were once considered similar. Within the induced magnetosphere and ionosphere of Venus, previous studies have shown the existence of ionospheric boundary waves. These structures may play an important role in the atmospheric evolution of Venus. Using Venus Express data, the crossings of the ionopause boundary are determined based on the observations of photoelectrons during 2011. Pulses of dropouts in the electron energy spectrometer data were observed in 92 events, which suggests potential perturbations of the boundary. Minimum variance analysis of the 1 Hz magnetic field data for the perturbations is conducted and used to confirm the occurrence of the boundary waves. Statistical analysis shows that they were propagating mainly in the ±VSO-Y direction in the polar north terminator region. The generation mechanisms of the boundary waves and their evolution into the potential nonlinear regime are discussed and analysed.
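Minimum variance analysis itself is a short eigen-decomposition, sketched below; the input array is a synthetic stand-in for 1 Hz magnetic field data, and real use would restrict to an interval around each boundary crossing:

    # Sketch: minimum variance analysis (MVA) of a magnetic-field time series.
    # The eigenvector with the smallest eigenvalue estimates the boundary normal.
    import numpy as np

    rng = np.random.default_rng(5)
    B = rng.normal(size=(600, 3))                 # stand-in for 1 Hz B-field data

    M = np.cov(B, rowvar=False)                   # magnetic variance matrix
    eigvals, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    normal = eigvecs[:, 0]                        # minimum-variance direction
    quality = eigvals[1] / eigvals[0]             # intermediate-to-minimum ratio
    print("boundary normal ~", np.round(normal, 3),
          " lambda_int/lambda_min =", round(quality, 2))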
NASA Technical Reports Server (NTRS)
Watson, Leela R.
2011-01-01
The 45th Weather Squadron Launch Weather Officers use the 12-km resolution North American Mesoscale model (MesoNAM) forecasts to support launch weather operations. In Phase I, the performance of the model at KSC/CCAFS was measured objectively by conducting a detailed statistical analysis of model output compared to observed values. The objective analysis compared the MesoNAM forecast winds, temperature, and dew point to the observed values from the sensors in the KSC/CCAFS wind tower network. In Phase II, the AMU modified the current tool by adding an additional 15 months of model output to the database and recalculating the verification statistics. The bias, standard deviation of the bias, root mean square error, and a hypothesis test for bias were calculated to verify the performance of the model. The results indicated that the accuracy decreased as the forecast progressed, there was a diurnal signal in temperature with a cool bias during the late night and a warm bias during the afternoon, and there was a diurnal signal in dewpoint temperature with a low bias during the afternoon and a high bias during the late night.
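The verification quantities named above can be computed in a few lines; the forecast and observed arrays below are hypothetical tower-sensor pairs, not AMU data:

    # Sketch: bias, standard deviation of bias, RMSE, and a bias hypothesis test.
    # Forecast/observed pairs are hypothetical.
    import numpy as np
    from scipy import stats

    forecast = np.array([21.3, 22.1, 24.0, 25.7, 24.9])   # deg C, hypothetical
    observed = np.array([20.8, 22.5, 23.1, 26.0, 24.2])

    err = forecast - observed
    bias = err.mean()
    sd_bias = err.std(ddof=1)
    rmse = np.sqrt((err ** 2).mean())
    t, p = stats.ttest_1samp(err, 0.0)             # hypothesis test for zero bias
    print(f"bias={bias:.2f}, sd={sd_bias:.2f}, RMSE={rmse:.2f}, p(bias=0)={p:.3f}")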
NASA Astrophysics Data System (ADS)
Schmith, Torben; Thejll, Peter; Johansen, Søren
2016-04-01
We analyse the statistical relationship between changes in global temperature, global steric sea level and radiative forcing in order to reveal causal relationships. There are, however, potential pitfalls here due to the trending nature of the time series. We therefore apply a statistical method called cointegration analysis, originating from the field of econometrics, which is able to correctly handle the analysis of series with trends and other long-range dependencies. We find a relationship between steric sea level and temperature, and find that temperature causally depends on the steric sea level, which can be understood as a consequence of the large heat capacity of the ocean. This result is obtained both when analyzing observed data and data from a CMIP5 historical model run. Moreover, we find that in the data from the historical run, the steric sea level, in turn, is driven by the external forcing. Finally, we demonstrate that combining these two results can lead to a novel estimate of radiative forcing back in time based on observations.
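As a sketch of a cointegration test between two trending series (here the Engle-Granger variant from statsmodels, not necessarily the authors' exact procedure), with synthetic series sharing a stochastic trend:

    # Sketch: Engle-Granger cointegration test on synthetic trending series.
    import numpy as np
    from statsmodels.tsa.stattools import coint

    rng = np.random.default_rng(6)
    trend = np.cumsum(rng.normal(size=160))        # shared stochastic trend
    steric = trend + rng.normal(scale=0.4, size=160)
    temperature = 0.6 * trend + rng.normal(scale=0.4, size=160)

    t_stat, p_value, _ = coint(temperature, steric)
    print(f"Engle-Granger t = {t_stat:.2f}, p = {p_value:.3f}")
    # a small p rejects "no cointegration", i.e. the two series share a trend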
Statistical analysis of the ambiguities in the asteroid period determinations
NASA Astrophysics Data System (ADS)
Butkiewicz-Bąk, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.; Marciniak, A.
2017-09-01
Among asteroids there exist ambiguities in their rotation period determinations. They are due to incomplete coverage of the rotation, noise and/or aliases resulting from gaps between separate lightcurves. To help remove such uncertainties, the basic characteristics of lightcurves resulting from constraints imposed by asteroid shapes and observation geometries should be identified. We simulated light variations of asteroids whose shapes were modelled as Gaussian random spheres, with random orientations of spin vectors and phase angles changed every 5° from 0° to 65°. This produced 1.4 million lightcurves. For each simulated lightcurve, Fourier analysis was performed and the harmonic of the highest amplitude was recorded. From the statistical point of view, all lightcurves observed at phase angles α < 30°, with peak-to-peak amplitudes A > 0.2 mag, are bimodal. The second most frequently dominant harmonic is the first, followed by the third. For 1 per cent of lightcurves with amplitudes A < 0.1 mag and phase angles α < 40°, the fourth harmonic dominates.
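Identifying the dominant harmonic of a lightcurve amounts to a least-squares fit of a truncated Fourier series at the trial period and a comparison of the harmonic amplitudes. A minimal sketch (the synthetic lightcurve stands in for the Gaussian-random-sphere simulations):

```python
import numpy as np

def dominant_harmonic(t, mag, period, n_harm=6):
    """Fit a Fourier series at a fixed period and return the (1-based)
    index of the harmonic with the largest amplitude."""
    phase = 2 * np.pi * t / period
    cols = [np.ones_like(t)]
    for k in range(1, n_harm + 1):
        cols += [np.sin(k * phase), np.cos(k * phase)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, mag, rcond=None)
    amps = np.hypot(coef[1::2], coef[2::2])   # amplitude of each harmonic
    return int(np.argmax(amps)) + 1

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 10, 400))
mag = 0.3 * np.sin(2 * (2 * np.pi * t / 5.0)) + 0.02 * rng.normal(size=400)
print(dominant_harmonic(t, mag, period=5.0))   # -> 2, a bimodal lightcurve
```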
Variables in psychology: a critique of quantitative psychology.
Toomela, Aaro
2008-09-01
Mind is hidden from direct observation; it can be studied only by observing behavior. Variables encode information about behaviors. There is, however, no one-to-one correspondence between behaviors and the mental events underlying them. In order to understand mind it would be necessary to understand exactly what information is represented in variables. This aim cannot be reached after variables are already encoded. Therefore, statistical data analysis can be very misleading in studies aimed at understanding the mind that underlies behavior. In this article, the different kinds of information that can be represented in variables are described. It is shown how the informational ambiguity of variables leads to problems of theoretically meaningful interpretation of the results of statistical data analysis procedures in terms of hidden mental processes. Reasons are provided why the presence of dependence between variables does not imply a causal relationship between the events they represent, and why the absence of dependence between variables cannot rule out a causal dependence between those events. It is concluded that variable-psychology has a very limited range of application for the development of a theory of mind-psychology.
NASA Astrophysics Data System (ADS)
Kim, J.; Kim, J. H.; Jee, G.; Lee, C.; Kim, Y.
2017-12-01
The Spectral Airglow Temperature Imager (SATI) installed at King Sejong Station (62.22°S, 58.78°W), Antarctica, has continuously measured the airglow emissions from the OH (6-2) Meinel and O2 (0-1) atmospheric bands since 2002, in order to investigate the dynamics of the polar MLT region. The measurements allow us to derive the rotational temperature at the peak emission heights, known to be about 87 km and 94 km for the OH and O2 airglows, respectively. In this study, we briefly introduce an improved analysis technique that modifies the original analysis code. The major change compared to the original program is an improved function for finding the exact center position in the observed image. In addition to this brief introduction of the improved technique, we also present results of a statistical investigation of the periodic variations in the temperatures of the two layers during the period 2002 through 2011 and compare our results with temperatures measured by satellite.
Mridula, Meenu R; Nair, Ashalatha S; Kumar, K Satheesh
2018-02-01
In this paper, we compared the efficacy of an observation-based modelling approach using a genetic algorithm with regular statistical analysis as an alternative methodology in plant research. Preliminary experimental data on in vitro rooting were taken for this study with the aim of understanding the effect of charcoal and naphthalene acetic acid (NAA) on successful rooting, and of optimizing the two variables for maximum effect. Observation-based modelling, as well as the traditional approach, could identify NAA as a critical factor in rooting of the plantlets under the experimental conditions employed. Symbolic regression analysis using the software deployed here optimised the treatments studied and was successful in identifying the complex non-linear interaction among the variables, with minimal preliminary data. The presence of charcoal in the culture medium had a significant impact on root generation by reducing basal callus mass formation. Such an approach is advantageous for establishing in vitro culture protocols, as these models have significant potential for saving time and expenditure in plant tissue culture laboratories, and further reduce the need for a specialised background.
Anderson localization of shear waves observed by magnetic resonance imaging
NASA Astrophysics Data System (ADS)
Papazoglou, S.; Klatt, D.; Braun, J.; Sack, I.
2010-07-01
In this letter we present for the first time an experimental investigation of shear wave localization using motion-sensitive magnetic resonance imaging (MRI). Shear wave localization was studied in gel phantoms containing arrays of randomly positioned parallel glass rods. The phantoms were exposed to continuous harmonic vibrations in a frequency range from 25 to 175 Hz, yielding wavelengths on the order of the elastic mean free path, i.e. the Ioffe-Regel criterion of Anderson localization was satisfied. The experimental setup was further chosen such that purely shear horizontal waves were induced to avoid effects due to mode conversion and pressure waves. Analysis of the distribution of shear wave intensity in experiments and simulations revealed a significant deviation from Rayleigh statistics indicating that shear wave energy is localized. This observation is further supported by experiments on weakly scattering samples exhibiting Rayleigh statistics and an analysis of the multifractality of wave functions. Our results suggest that motion-sensitive MRI is a promising tool for studying Anderson localization of time-harmonic shear waves, which are increasingly used in dynamic elastography.
Statistical Analysis of Bus Networks in India
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commuting, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of the small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns of the networks. We also observe that these networks, although robust and resilient to random attacks, are particularly degree-sensitive. Unlike real-world networks such as the Internet, WWW and airline networks, which are virtual, bus networks are physically constrained. Our findings therefore throw light on the evolution of such geographically constrained networks and will help in designing more efficient bus networks in the future.
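In the L-space representation used here, each stop is a node and consecutive stops on any route are joined by an edge. The basic statistics are easy to reproduce with networkx (the toy route list below is illustrative, not the Indian bus data):

```python
import networkx as nx

# Toy L-space graph: nodes are stops; edges join consecutive stops on a route
routes = [[1, 2, 3, 4, 5], [3, 6, 7], [7, 8, 2]]
G = nx.Graph()
for route in routes:
    nx.add_path(G, route)

print("degree histogram:", nx.degree_histogram(G))
print("clustering coefficient:", nx.average_clustering(G))
print("average shortest path:", nx.average_shortest_path_length(G))
```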
Persistent homology and non-Gaussianity
NASA Astrophysics Data System (ADS)
Cole, Alex; Shiu, Gary
2018-03-01
In this paper, we introduce the topological persistence diagram as a statistic for Cosmic Microwave Background (CMB) temperature anisotropy maps. A central concept in 'Topological Data Analysis' (TDA), the idea of persistence is to represent a data set by a family of topological spaces. One then examines how long topological features 'persist' as the family of spaces is traversed. We compute persistence diagrams for simulated CMB temperature anisotropy maps featuring various levels of primordial non-Gaussianity of local type. Postponing the analysis of observational effects, we show that persistence diagrams are more sensitive to local non-Gaussianity than previous topological statistics including the genus and Betti number curves, and can constrain Δf_NL^loc = 35.8 at the 68% confidence level on the simulation set, compared to Δf_NL^loc = 60.6 for the Betti number curves. Given the resolution of our simulations, we expect applying persistence diagrams to observational data will give constraints competitive with those of the Minkowski Functionals. This is the first in a series of papers where we plan to apply TDA to different shapes of non-Gaussianity in the CMB and Large Scale Structure.
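For a pixelized map, the persistence diagram of a sublevel-set filtration can be computed with a cubical complex. A sketch using the gudhi library (assuming its CubicalComplex API accepts a 2D array of pixel values; the random field is a stand-in for a simulated CMB patch):

```python
import numpy as np
import gudhi

rng = np.random.default_rng(4)
field = rng.normal(size=(64, 64))       # stand-in for a temperature map

# Sublevel-set filtration of the pixel values on a cubical complex
cc = gudhi.CubicalComplex(top_dimensional_cells=field)
diagram = cc.persistence()              # list of (dimension, (birth, death))

# Lifetimes of connected components (dimension-0 features)
lifetimes = [d - b for dim, (b, d) in diagram
             if dim == 0 and d != float("inf")]
print(len(lifetimes), max(lifetimes))
```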
Quantifying discrimination of Framingham risk functions with different survival C statistics.
Pencina, Michael J; D'Agostino, Ralph B; Song, Linye
2012-07-10
Cardiovascular risk prediction functions offer an important diagnostic tool for clinicians and patients themselves. They are usually constructed with the use of parametric or semi-parametric survival regression models. It is essential to be able to evaluate the performance of these models, preferably with summaries that offer natural and intuitive interpretations. The concept of discrimination, popular in the logistic regression context, has been extended to survival analysis. However, the extension is not unique. In this paper, we define discrimination in survival analysis as the model's ability to separate those with longer event-free survival from those with shorter event-free survival within some time horizon of interest. This definition remains consistent with that used in logistic regression, in the sense that it assesses how well the model-based predictions match the observed data. Practical and conceptual examples and numerical simulations are employed to examine four C statistics proposed in the literature to evaluate the performance of survival models. We observe that they differ in the numerical values and aspects of discrimination that they capture. We conclude that the index proposed by Harrell is the most appropriate to capture discrimination described by the above definition. We suggest researchers report which C statistic they are using, provide a rationale for their selection, and be aware that comparing different indices across studies may not be meaningful. Copyright © 2012 John Wiley & Sons, Ltd.
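Harrell's C counts the fraction of usable subject pairs in which predicted and observed survival orderings agree, handling censoring by discarding non-comparable pairs. The lifelines package provides it directly (synthetic times and risk scores for illustration):

```python
import numpy as np
from lifelines.utils import concordance_index

rng = np.random.default_rng(5)
event_times = rng.exponential(10.0, 200)        # event-free survival times
event_observed = rng.uniform(size=200) < 0.7    # 1 = event, 0 = censored
# Predictions correlated with the truth; higher score = longer survival
predicted = event_times + rng.normal(0.0, 3.0, 200)

c = concordance_index(event_times, predicted, event_observed)
print("Harrell's C:", round(c, 3))
```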
NASA Astrophysics Data System (ADS)
Wu, Chong; Liu, Liping; Wei, Ming; Xi, Baozhu; Yu, Minghui
2018-03-01
A modified hydrometeor classification algorithm (HCA) is developed in this study for Chinese polarimetric radars, based on the U.S. operational HCA. A methodology of statistics-based optimization is proposed, including calibration checking, dataset selection, membership function modification, computation threshold modification, and effect verification. These procedures are applied to the Zhuhai radar, the first operational polarimetric radar in South China. The systematic calibration bias is corrected; the reliability of the radar measurements is found to deteriorate when the signal-to-noise ratio is low, and the correlation coefficient within the melting layer is usually lower than that of the U.S. WSR-88D radar. Through modification based on statistical analysis of the polarimetric variables, an HCA localized specifically for Zhuhai is obtained, and it performs well over a one-month test through comparison with sounding and surface observations. The algorithm is then used to analyze a squall-line process on 11 May 2014 and is found to provide reasonable detail with respect to horizontal and vertical structures; the HCA results, especially in the mixed rain-hail region, can reflect the life cycle of the squall line. In addition, the kinematic and microphysical processes of cloud evolution and the differences between radar-detected hail and surface observations are also analyzed. The results of this study provide evidence for the improvement of this HCA developed specifically for China.
Trends in bromide wet deposition concentrations in the contiguous United States, 2001-2016.
Wetherbee, Gregory A; Lehmann, Christopher M B; Kerschner, Brian M; Ludtke, Amy S; Green, Lee A; Rhodes, Mark F
2018-02-01
Bromide (Br⁻) and other solute concentration data from wet deposition samples collected and analyzed by the National Atmospheric Deposition Program (NADP) from 2001 to 2016 were statistically analyzed for trends, both geographically and temporally, by precipitation type. Analysis was limited to NADP sites in the contiguous 48 United States. The Br⁻ concentrations for this time period had a high number of values censored at the detection limits, with greater than 86 percent of sample concentrations below analytical detection. Bromide was more frequently detected at NADP sites in coastal regions. Analysis using specialized statistical techniques for censored data revealed that Br⁻ concentrations varied by precipitation type, with higher concentrations usually observed in liquid precipitation versus precipitation containing snow. Negative temporal trends in Br⁻ wet deposition concentrations were observed at a majority of NADP sites; approximately 25 percent of these trend values were statistically significant at significance levels ranging from 0.05 to 0.10. Potential causes for the negative trends were explored, including annual and seasonal changes in precipitation depth, reduced emissions of methyl bromide (CH₃Br) from coastal wetlands, and declining industrial use of bromine compounds. The results indicate that Br⁻ in non-coastal wet deposition comes mainly from long-range transport, not local sources. Correlations between Br⁻, chloride, and nitrate concentrations also were evaluated. Published by Elsevier Ltd.
Toward statistical modeling of saccadic eye-movement and visual saliency.
Sun, Xiaoshuai; Yao, Hongxun; Ji, Rongrong; Liu, Xian-Ming
2014-11-01
In this paper, we present a unified statistical framework for modeling both saccadic eye movements and visual saliency. By analyzing the statistical properties of human eye fixations on natural images, we found that human attention is sparsely distributed and usually deployed to locations with abundant structural information. These observations inspired us to model saccadic behavior and visual saliency based on super-Gaussian component (SGC) analysis. Our model sequentially obtains SGCs using projection pursuit and generates eye movements by selecting the location with the maximum SGC response. Beyond simulating human saccadic behavior, we also demonstrate the superior effectiveness and robustness of our approach over the state of the art through extensive experiments on synthetic patterns and human eye fixation benchmarks. Multiple key issues in saliency modeling research, such as individual differences and the effects of scale and blur, are explored in this paper. Based on extensive qualitative and quantitative experimental results, we show the promising potential of statistical approaches for human behavior research.
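A closely related way to extract super-Gaussian components from image data is independent component analysis, which likewise seeks maximally non-Gaussian projections. The sketch below uses scikit-learn's FastICA as a stand-in for the paper's projection-pursuit SGC step (the patch data are synthetic, and this is not the authors' implementation):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(6)
# Stand-in data: 1000 image patches flattened to 64-dimensional vectors,
# mixed so that the underlying sparse (super-Gaussian) sources are hidden
patches = rng.laplace(size=(1000, 64)) @ rng.normal(size=(64, 64))

ica = FastICA(n_components=16, random_state=0)
sources = ica.fit_transform(patches)    # super-Gaussian component responses

# Saliency-style selection: the patch with the maximum component response
strongest = np.abs(sources).max(axis=1).argmax()
print(strongest)
```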
Dynamic heterogeneity and non-Gaussian statistics for acetylcholine receptors on live cell membrane
NASA Astrophysics Data System (ADS)
He, W.; Song, H.; Su, Y.; Geng, L.; Ackerson, B. J.; Peng, H. B.; Tong, P.
2016-05-01
The Brownian motion of molecules at thermal equilibrium usually has a finite correlation time and will eventually be randomized after a long delay time, so that their displacements follow Gaussian statistics. This is true even when the molecules have experienced a complex environment with a finite correlation time. Here, we report that the lateral motion of acetylcholine receptors on live muscle cell membranes does not follow the Gaussian statistics of normal Brownian diffusion. From a careful analysis of a large volume of protein trajectories obtained over a wide range of sampling rates and long durations, we find that the normalized histogram of protein displacements shows an exponential tail, which is robust and universal for cells under different conditions. The experiment indicates that the observed non-Gaussian statistics and dynamic heterogeneity are inherently linked to slow, active remodelling of the underlying cortical actin network.
An astronomer's guide to period searching
NASA Astrophysics Data System (ADS)
Schwarzenberg-Czerny, A.
2003-03-01
We concentrate on the analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on the classical statistical principles of Fisher and his successors. Except for the discussion of resolution issues, it is best for the reader to forget temporarily about Fourier transforms and to concentrate on the problem of fitting a time series with a model curve. According to their statistical content we divide the issues into several sections: (ii) statistical and numerical aspects of model fitting; (iii) evaluation of fitted models as hypothesis testing; (iv) the role of orthogonal models in signal detection; (v) conditions for equivalence of periodograms; and (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from a formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluating the performance of periodograms and in the quantitative design of large variability surveys.
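For unevenly sampled series of this kind, the standard least-squares period search is the Lomb-Scargle periodogram, which is equivalent to fitting a sinusoid at each trial frequency. A minimal example with astropy (synthetic data with gaps):

```python
import numpy as np
from astropy.timeseries import LombScargle

rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 100, 300))      # uneven sampling with gaps
y = 0.5 * np.sin(2 * np.pi * t / 7.3) + 0.2 * rng.normal(size=300)

frequency, power = LombScargle(t, y).autopower()
best_period = 1.0 / frequency[power.argmax()]
print(round(best_period, 2))               # recovers ~7.3
```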
Color stability comparison of silicone facial prostheses following disinfection.
Goiato, Marcelo Coelho; Pesqueira, Aldiéris Alves; dos Santos, Daniela Micheline; Zavanelli, Adriana Cristina; Ribeiro, Paula do Prado
2009-04-01
The purpose of this study was to evaluate the color stability of two silicones for use in facial prostheses under the influence of chemical disinfection and storage time. Twenty-eight specimens were obtained, half made from Silastic MDX 4-4210 silicone and the other half from Silastic 732 RTV silicone. The specimens were divided into four groups: Silastic 732 RTV and MDX 4-4210 disinfected three times a week with Efferdent, and Silastic 732 RTV and MDX 4-4210 disinfected with neutral soap. Color stability was analyzed by spectrophotometry immediately and 2 months after making the specimens. After obtaining the results, ANOVA and the Tukey test at the 1% significance level were used for statistical analysis. Statistical differences between mean color values were observed. Disinfection with Efferdent did not statistically influence the mean color values. The factors of storage time and disinfection statistically influenced color stability; disinfection acts as a bleaching agent in silicone materials.
Thompson, Cheryl Bagley
2009-01-01
This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.
Vecchiato, Giovanni; Astolfi, Laura; Tabarrini, Alessandro; Salinari, Serenella; Mattia, Donatella; Cincotti, Febo; Bianchi, Luigi; Sorrentino, Domenica; Aloise, Fabio; Soranzo, Ramon; Babiloni, Fabio
2010-01-01
The use of modern brain imaging techniques can be useful for understanding which brain areas are involved in the observation of video clips related to commercial advertising, the support of political campaigns, and Public Service Announcements (PSAs). In this paper we describe the capability of tracking brain activity during the observation of commercials, political spots, and PSAs with advanced high-resolution EEG statistical techniques in the time and frequency domains in a group of normal subjects. We analyzed the statistically significant cortical spectral power activity in different frequency bands during the observation of a commercial video clip related to the use of a beer in a group of 13 normal subjects. In addition, a TV speech of the Prime Minister of Italy was analyzed in two groups of swing and "supporter" voters. Results suggested that the cortical activity during the observation of commercial spots can vary considerably across the spot. This suggests the possibility of removing the parts of the spot that are not particularly attractive by using such cerebral indexes. The cortical activity during the observation of the political speech indicated greater cortical activity in the supporter group when compared to the swing voters. In this case, it is possible to conclude that the communication failed to raise attention or interest among swing voters. In conclusion, high-resolution EEG statistical techniques have proved able to generate useful insights into the particular fruition of TV messages, in both the commercial and political fields.
Time averaging, ageing and delay analysis of financial time series
NASA Astrophysics Data System (ADS)
Cherstvy, Andrey G.; Vinod, Deepak; Aghion, Erez; Chechkin, Aleksei V.; Metzler, Ralf
2017-06-01
We introduce three strategies for the analysis of financial time series based on time averaged observables. These comprise the time averaged mean squared displacement (MSD) as well as the ageing and delay time methods for varying fractions of the financial time series. We explore these concepts via statistical analysis of historic time series for several Dow Jones Industrial indices for the period from the 1960s to 2015. Remarkably, we discover a simple universal law for the delay time averaged MSD. The observed features of the financial time series dynamics agree well with our analytical results for the time averaged measurables for geometric Brownian motion, underlying the famed Black-Scholes-Merton model. The concepts we promote here are shown to be useful for financial data analysis and enable one to unveil new universal features of stock market dynamics.
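The time averaged MSD of a single trajectory x(t) of length T is δ²(Δ) = (1/(T−Δ)) Σ_t [x(t+Δ)−x(t)]². A numpy sketch applied to geometric Brownian motion, the price process underlying the Black-Scholes-Merton model (drift and volatility values are illustrative):

```python
import numpy as np

def tamsd(x, lags):
    """Time averaged mean squared displacement of one trajectory."""
    return np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])

rng = np.random.default_rng(8)
n, dt, mu, sigma = 5000, 1.0, 1e-4, 0.01
# Geometric Brownian motion via its exact log-space solution
log_x = np.cumsum((mu - sigma**2 / 2) * dt
                  + sigma * np.sqrt(dt) * rng.normal(size=n))
price = np.exp(log_x)

lags = np.unique(np.logspace(0, 3, 20).astype(int))
print(tamsd(price, lags))
```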
Local sensitivity analysis for inverse problems solved by singular value decomposition
Hill, M.C.; Nolan, B.T.
2010-01-01
Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process-model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA's Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not from its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ in that (1) CSS/PCC can be more awkward because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov chain Monte Carlo, given common nonlinear processes and often even more nonlinear models.
Jager, Tjalling
2013-02-05
The individuals of a species are not equal. These differences frustrate experimental biologists and ecotoxicologists who wish to study the response of a species (in general) to a treatment. In the analysis of data, differences between model predictions and observations on individual animals are usually treated as random measurement error around the true response. These deviations, however, are mainly caused by real differences between the individuals (e.g., differences in physiology and in initial conditions). Understanding these intraspecies differences, and accounting for them in the data analysis, will improve our understanding of the response to the treatment we are investigating and allow for a more powerful, less biased, statistical analysis. Here, I explore a basic scheme for statistical inference to estimate parameters governing stress that allows individuals to differ in their basic physiology. This scheme is illustrated using a simple toxicokinetic-toxicodynamic model and a data set for growth of the springtail Folsomia candida exposed to cadmium in food. This article should be seen as proof of concept; a first step in bringing more realism into the statistical inference for process-based models in ecotoxicology.
Complexity quantification of dense array EEG using sample entropy analysis.
Ramanand, Pravitha; Nampoori, V P N; Sreenivasan, R
2004-09-01
In this paper, a time series complexity analysis of dense array electroencephalogram signals is carried out using the recently introduced Sample Entropy (SampEn) measure. This statistic quantifies the regularity in signals recorded from systems that can vary from the purely deterministic to purely stochastic realm. The present analysis is conducted with an objective of gaining insight into complexity variations related to changing brain dynamics for EEG recorded from the three cases of passive, eyes closed condition, a mental arithmetic task and the same mental task carried out after a physical exertion task. It is observed that the statistic is a robust quantifier of complexity suited for short physiological signals such as the EEG and it points to the specific brain regions that exhibit lowered complexity during the mental task state as compared to a passive, relaxed state. In the case of mental tasks carried out before and after the performance of a physical exercise, the statistic can detect the variations brought in by the intermediate fatigue inducing exercise period. This enhances its utility in detecting subtle changes in the brain state that can find wider scope for applications in EEG based brain studies.
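Sample entropy is the negative logarithm of the ratio of (m+1)-length to m-length template matches within a tolerance r, excluding self-matches. A direct (unoptimized) implementation suitable for short epochs like these (parameter defaults follow common practice, m=2 and r=0.2 times the standard deviation):

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) of a 1-D series, computed by brute force."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()

    def match_count(length):
        tpl = np.array([x[i:i + length] for i in range(len(x) - length + 1)])
        # Chebyshev distance between every pair of templates
        d = np.abs(tpl[:, None, :] - tpl[None, :, :]).max(axis=2)
        return ((d <= r).sum() - len(tpl)) / 2      # exclude self-matches

    return -np.log(match_count(m + 1) / match_count(m))

rng = np.random.default_rng(9)
print(sample_entropy(rng.normal(size=500)))         # irregular: higher value
print(sample_entropy(np.sin(np.arange(500) / 5)))   # regular: lower value
```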
NASA Astrophysics Data System (ADS)
Roy, P. K.; Pal, S.; Banerjee, G.; Biswas Roy, M.; Ray, D.; Majumder, A.
2014-12-01
Rivers are among the main sources of freshwater all over the world, so analysis and maintenance of this water resource is globally considered a matter of major concern. This paper deals with the assessment of the surface water quality of the Ichamati river using multivariate statistical techniques. Eight distinct surface water quality observation stations were located and samples were collected. Statistical techniques were applied to the physico-chemical parameters and depth of siltation of the collected samples. Cluster analysis was performed to determine the relations between surface water quality and siltation depth of the river Ichamati. Multiple regression and mathematical equation modeling were used to characterize the surface water quality of the Ichamati river on the basis of physico-chemical parameters. It was found that the surface water quality of the downstream river was different from the water quality of the upstream. The analysis of the water quality parameters of the Ichamati river clearly indicates a high pollution load on the river water, which can be attributed to agricultural discharge, tidal effect and soil erosion. The results further reveal that with increasing depth of siltation, water quality degraded.
Patterson, Megan S; Goodson, Patricia
2017-05-01
Compulsive exercise, a form of unhealthy exercise often associated with prioritizing exercise and feeling guilty when exercise is missed, is a common precursor to and symptom of eating disorders. College-aged women are at high risk of exercising compulsively compared with other groups. Social network analysis (SNA) is a theoretical perspective and methodology allowing researchers to observe the effects of relational dynamics on the behaviors of people. SNA was used to assess the relationship between compulsive exercise and body dissatisfaction, physical activity, and network variables. Descriptive statistics were computed using SPSS, and quadratic assignment procedure (QAP) analyses were conducted using UCINET. QAP regression analysis revealed a statistically significant model (R² = .375, P < .0001) predicting compulsive exercise behavior. Physical activity, body dissatisfaction, and network variables were statistically significant predictors in the QAP regression model. In our sample, women who are connected to "important" or "powerful" people in their network are likely to have higher compulsive exercise scores. This result provides healthcare practitioners with key target points for intervention within similar groups of women. For scholars researching eating disorders and associated behaviors, this study supports examining group dynamics and network structure in conjunction with body dissatisfaction and exercise frequency.
Capturing rogue waves by multi-point statistics
NASA Astrophysics Data System (ADS)
Hadjihosseini, A.; Wächter, Matthias; Hoffmann, N. P.; Peinke, J.
2016-01-01
As an example of a complex system with extreme events, we investigate ocean wave states exhibiting rogue waves. We present a statistical method of data analysis based on multi-point statistics which, for the first time, allows extreme rogue wave events to be captured in a statistically satisfactory manner. The key to the success of the approach is mapping the complexity of multi-point data onto the statistics of hierarchically ordered height increments for different time scales, for which we can show that a stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities, as well as the Fokker-Planck equation itself, can be estimated directly from the available observational data. With this stochastic description, surrogate data sets can in turn be generated, which makes it possible to work out arbitrary statistical features of the complex sea state in general, and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics.
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-07-01
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Yamada, Yoshiyuki; Gouda, Naoteru; Yoshioka, Satoshi
2015-08-01
We are planning JASMINE (Japan Astrometric Satellite Mission for INfrared Exploration) as a series of missions: Nano-JASMINE, Small-JASMINE, and JASMINE. Nano-JASMINE data analysis will be performed in collaboration with the Gaia data analysis team. We apply the Gaia core processing software, AGIS, as the Nano-JASMINE core solution; its applicability has been confirmed by D. Michalik and the Gaia DPAC team. Converting telemetry data to AGIS input is the JASMINE team's task, and it includes centroid calculation of the stellar images. The accuracy of Gaia is two orders of magnitude better than that of Nano-JASMINE, but these are the only two astrometric satellite missions with CCD detectors for global astrometry, so Nano-JASMINE will have a role in calibrating Gaia data. Bright star centroiding is the most important science target. Small-JASMINE has a completely different observation strategy: it will perform step-stair observations with about a million observations of each individual star. Sub-milliarcsecond centroid errors of individual stellar images will be reduced by two orders of magnitude, reaching 10 microarcsecond astrometric accuracy, by applying the square-root-N law to a million observations. Various systematic noise sources must be estimated, modelled, and subtracted. Some statistical studies will be shown in this poster.
Lazić-Mitrović, Tanja; Djukić, Milan; Cutura, Nedjo; Andjelić, Spaso; Curković, Aleksandar; Soldo, Vesna; Radlović, Nedeljko
2010-01-01
According to numerous studies, transitory hypothermia is part of the neonatological energetic triangle and represents a significant prognostic factor for morbidity and mortality in newborns with intrauterine growth retardation (IUGR), who are, due to their characteristics, more prone to transitory hypothermia. The aim of the study was to analyse the frequency of transitory hypothermia in term newborns with IUGR, as well as the frequency of the most common pathological conditions typical of IUGR newborns (hypoglycaemia, perinatal asphyxia, hyperbilirubinaemia and hypocalcaemia) depending on the presence of transitory hypothermia after birth. The study included 143 term newborns with IUGR treated at the Neonatology Ward of the Gynaecology-Obstetrics Clinic "Narodni front", Belgrade. The newborns were divided into two groups: those with registered transitory hypothermia (the observed group) and those without (the control group). The data analysis covered the frequency of transitory hypothermia depending on gestation and body mass, as well as the frequency of pathological conditions (perinatal asphyxia, hypoglycaemia, hypocalcaemia, hyperbilirubinaemia) depending on the presence of hypothermia. The analysis was done by statistical tests of analytic and descriptive statistics. The morbidity structure was dominated by hypothermia (65.03%), hypoglycaemia (43.36%), perinatal asphyxia (37.76%), hyperbilirubinaemia (30.77%) and hypocalcaemia (25.17%). There were 93 newborns in the observed group and 50 in the control group. The mean measured body temperature was 35.9°C; 20 newborns (32.26%) had moderate hypothermia and 73 (67.74%) had mild hypothermia. Average gestation was 39.0 weeks in the observed group and 39.6 weeks in the control group (p < 0.01). Average body mass at birth in the whole group was 2339 g: 2214 g in the observed and 2571 g in the control group. The frequency of hypoglycaemia was 53.8% in the observed group and 24% in the control group (p < 0.01). The frequency of pH < 7.25 was 38.71% in the observed group and 14% in the control group (p < 0.05). The frequency of hyperbilirubinaemia was 38.71% in the observed group and 16% in the control group (p < 0.01). The frequency of hypocalcaemia was 32.26% in the observed group and 12% in the control group (p < 0.01). Transitory hypothermia in the first ten hours of life represents a significant risk factor for deepening hypoglycaemia, asphyxia, hyperbilirubinaemia and hypocalcaemia in term newborns with IUGR.
The relationship between temporomandibular dysfunction and head and cervical posture.
Matheus, Ricardo Alves; Ramos-Perez, Flávia Maria de Moraes; Menezes, Alynne Vieira; Ambrosano, Gláucia Maria Bovi; Haiter-Neto, Francisco; Bóscolo, Frab Norberto; de Almeida, Solange Maria
2009-01-01
This study aimed to evaluate possible correlations between disc displacement and parameters used for evaluation of skull positioning in relation to the cervical spine: craniocervical angle, suboccipital space between C0-C1, cervical curvature and position of the hyoid bone, in individuals with and without symptoms of temporomandibular dysfunction. The patients were evaluated following the RDC/TMD guidelines. Magnetic resonance imaging was used to establish disc positioning in the temporomandibular joints (TMJs) of 30 volunteer patients without temporomandibular dysfunction symptoms and 30 patients with symptoms. Evaluation of skull positioning in relation to the cervical spine was performed on lateral cephalograms obtained with the individual in natural head position. Data were submitted to statistical analysis by Fisher's exact test at a 5% significance level. To measure the degree of reproducibility/agreement between surveys, the kappa (κ) statistic was used. Significant differences were observed in the C0-C1 measurement for both symptomatic (p=0.04) and asymptomatic (p=0.02) groups. No statistical differences were observed regarding craniocervical angle, C1-C2 and hyoid bone position in relation to the TMJs with and without disc displacement. Although a statistically significant difference was found in the C0-C1 space, no association with internal temporomandibular joint disorder could be established. Based on the results observed in this study, no direct relationship could be determined between the presence of disc displacement and the variables assessed.
New axion and hidden photon constraints from a solar data global fit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vinyoles, N.; Serenelli, A.; Isern, J.
2015-10-01
We present a new statistical analysis that combines helioseismology (sound speed, surface helium and convective radius) and solar neutrino observations (the ⁸B and ⁷Be fluxes) to place upper limits on the properties of non-standard weakly interacting particles. Our analysis includes theoretical and observational errors, accounts for tensions between input parameters of solar models, and can be easily extended to include other observational constraints. We present two applications to test the method: the well-studied case of axions and axion-like particles, and the more novel case of low-mass hidden photons. For axions we obtain a 3σ upper limit on the axion-photon coupling constant of g_aγ < 4.1 × 10⁻¹⁰ GeV⁻¹. For hidden photons we obtain the most restrictive upper limit available across a wide range of masses for the product of the kinetic mixing and mass, χm < 1.8 × 10⁻¹² eV at 3σ. Both cases improve on previous solar constraints based on Standard Solar Models, showing the power of using a global statistical approach.
McSwain, Kristen Bukowski; Strickland, A.G.
2010-01-01
Groundwater conditions in Brunswick County, North Carolina, have been monitored continuously since 2000 through the operation and maintenance of groundwater-level observation wells in the surficial, Castle Hayne, and Peedee aquifers of the North Atlantic Coastal Plain aquifer system. Groundwater-resource conditions for the Brunswick County area were evaluated by relating the normal range (25th to 75th percentile) monthly mean groundwater-level and precipitation data for water years 2001 to 2008 to median monthly mean groundwater levels and monthly sum of daily precipitation for water year 2008. Summaries of precipitation and groundwater conditions for the Brunswick County area and hydrographs and statistics of continuous groundwater levels collected during the 2008 water year are presented in this report. Groundwater levels varied by aquifer and geographic location within Brunswick County, but were influenced by drought conditions and groundwater withdrawals. Water levels were normal in two of the eight observation wells and below normal in the remaining six wells. Seasonal Kendall trend analysis performed on more than 9 years of monthly mean groundwater-level data collected in an observation well located within the Brunswick County well field indicated there is a strong downward trend, with water levels declining at a rate of about 2.2 feet per year.
NASA Technical Reports Server (NTRS)
Ellis, David L.
2007-01-01
Room temperature tensile testing of Chemically Pure (CP) Titanium Grade 2 was conducted for as-received commercially produced sheet and following thermal exposure at 550 and 650 K for times up to 5,000 h. No significant changes in microstructure or failure mechanism were observed. A statistical analysis of the data was performed. Small statistical differences were found, but all properties were well above minimum values for CP Ti Grade 2 as defined by ASTM standards and likely would fall within normal variation of the material.
Lucijanic, Marko; Petrovecki, Mladen
2012-01-01
Analyzing events over time is often complicated by incomplete, or censored, observations. Special non-parametric statistical methods were developed to overcome difficulties in summarizing and comparing censored data. The life-table (actuarial) method and the Kaplan-Meier method are described, with an explanation of survival curves. For didactic purposes, the authors prepared a workbook based on the most widely used Kaplan-Meier method. It should help the reader understand how the Kaplan-Meier method is conceptualized and how it can be used to obtain the statistics and survival curves needed to completely describe a sample of patients. The log-rank test and hazard ratio are also discussed.
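Both methods described here are implemented in the lifelines package; the sketch below fits a Kaplan-Meier curve and runs a log-rank comparison on synthetic censored data (group sizes and distributions are illustrative):

```python
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(10)
t_a = rng.exponential(12.0, 80)         # survival times, group A
t_b = rng.exponential(8.0, 80)          # group B with shorter survival
obs_a = rng.uniform(size=80) < 0.7      # True = event observed, False = censored
obs_b = rng.uniform(size=80) < 0.7

kmf = KaplanMeierFitter()
kmf.fit(t_a, event_observed=obs_a)      # Kaplan-Meier survival curve, group A
print(kmf.median_survival_time_)

result = logrank_test(t_a, t_b, event_observed_A=obs_a, event_observed_B=obs_b)
print(result.p_value)
```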
Linguistic Analysis of the Human Heartbeat Using Frequency and Rank Order Statistics
NASA Astrophysics Data System (ADS)
Yang, Albert C.-C.; Hseu, Shu-Shya; Yien, Huey-Wen; Goldberger, Ary L.; Peng, C.-K.
2003-03-01
Complex physiologic signals may carry unique dynamical signatures that are related to their underlying mechanisms. We present a method based on rank order statistics of symbolic sequences to investigate the profile of different types of physiologic dynamics. We apply this method to heart rate fluctuations, the output of a central physiologic control system. The method robustly discriminates patterns generated from healthy and pathologic states, as well as aging. Furthermore, we observe increased randomness in the heartbeat time series with physiologic aging and pathologic states and also uncover nonrandom patterns in the ventricular response to atrial fibrillation.
Gender discrimination and prediction on the basis of facial metric information.
Fellous, J M
1997-07-01
Horizontal and vertical facial measurements are statistically independent. Discriminant analysis shows that five such normalized distances explain over 95% of the gender differences in "training" samples and predict the gender of 90% of novel test faces exhibiting various facial expressions. The robustness of the method and its results are assessed. It is argued that these distances (termed fiducial) are compatible with those found experimentally in psychophysical and neurophysiological studies. Consequently, partial explanations for the effects observed in these experiments can be found in the intrinsic statistical nature of the facial stimuli used.
NASA Astrophysics Data System (ADS)
Choi, B. H.; Min, B. I.; Yoshinobu, T.; Kim, K. O.; Pelinovsky, E.
2012-04-01
Data from a field survey of the 2011 tsunami in the Sanriku area of Japan are presented and used to plot the distribution function of runup heights along the coast. It is shown that the distribution function can be approximated by a theoretical log-normal curve [Choi et al., 2002]. The characteristics of the distribution functions derived from the runup-height data obtained during the 2011 event are compared with data from two previous gigantic tsunamis (1896 and 1933) that occurred in almost the same region. The number of observations from the 2011 tsunami is very large (more than 5,247), which provides an opportunity to revise the conception of the distribution of tsunami wave heights and the relationship between statistical characteristics and the number of observations suggested by Kajiura [1983]. The distribution function of the 2011 event demonstrates sensitivity to the number of observation points (many of which cannot be considered independent measurements) and can be used to determine the characteristic scale of the coast that corresponds to statistical independence of the observed wave heights.
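Fitting a log-normal curve to a sample of runup heights and checking the fit takes a few lines with scipy (the heights below are synthetic, not the survey data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
runup = rng.lognormal(mean=1.5, sigma=0.6, size=500)   # stand-in heights (m)

# Fit with the location fixed at zero, then test the goodness of fit
shape, loc, scale = stats.lognorm.fit(runup, floc=0)
ks_stat, p_value = stats.kstest(runup, "lognorm", args=(shape, loc, scale))
print(shape, scale, p_value)
```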
On the Determination of Poisson Statistics for Haystack Radar Observations of Orbital Debris
NASA Technical Reports Server (NTRS)
Stokely, Christopher L.; Benbrook, James R.; Horstman, Matt
2007-01-01
A convenient and powerful method is used to determine whether radar detections of orbital debris occur according to Poisson statistics. This is done by analyzing the time intervals between detection events. For Poisson statistics, the probability distribution of the time interval between events is shown to be an exponential distribution; this distribution is a special case of the Erlang distribution, which is used in estimating traffic loads on telecommunication networks. Poisson statistics form the basis of many orbital debris models, but the statistical basis of these models had not been clearly demonstrated empirically until now. Interestingly, in the fiscal year 2003 observations with the Haystack radar in a fixed staring mode, no statistically significant deviations from Poisson statistics are observed, whether or not the data are partitioned by altitude or inclination. One might expect some significant clustering of events in time as a result of satellite breakups, but the presence of Poisson statistics indicates that such debris disperse rapidly with respect to Haystack's very narrow radar beam. An exception to Poisson statistics is observed in the months following the intentional breakup of the Fengyun satellite in January 2007.
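The test described — Poisson arrivals imply exponentially distributed waiting times — can be applied directly to detection timestamps. A sketch on simulated event times (the Haystack data are not reproduced here, and fitting the scale from the same sample makes the KS p-value approximate):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
event_times = np.sort(rng.uniform(0.0, 3600.0, 120))   # detection epochs (s)
intervals = np.diff(event_times)                       # waiting times

# For a Poisson process the waiting times are exponential with the
# observed mean; a large p-value is consistent with Poisson statistics.
ks_stat, p_value = stats.kstest(intervals, "expon", args=(0.0, intervals.mean()))
print("KS p-value:", p_value)
```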
Effects of ozone (O3) therapy on cisplatin-induced ototoxicity in rats.
Koçak, Hasan Emre; Taşkın, Ümit; Aydın, Salih; Oktay, Mehmet Faruk; Altınay, Serdar; Çelik, Duygu Sultan; Yücebaş, Kadir; Altaş, Bengül
2016-12-01
The aim of this study is to investigate the effect of rectal and intratympanic ozone therapy on cisplatin-induced ototoxicity in rats. Eighteen female Wistar albino rats were included in our study. External auditory canal and tympanic membrane examinations were normal in all rats. The rats were randomly divided into three groups. Initially, all rats were tested with distortion product otoacoustic emissions (DPOAE), and emissions were measured as normal. All rats were then injected intraperitoneally with 5 mg/kg/day cisplatin for 3 days. Ototoxicity developed in all rats, as confirmed with DPOAE after 1 week. Group 1 received both rectal and intratympanic ozone therapy. Group 2 received no treatment and served as the control. Group 3 was treated with rectal ozone only. All rats were tested with DPOAE under general anesthesia, and all were sacrificed for pathological examination 1 week after ozone administration. Their cochleas were removed, and outer hair cell damage and stria vascularis damage were examined. In the statistical analysis, a statistically significant difference between Group 1 and Group 2 was observed at all frequencies in the DPOAE test, and a statistically significant difference was also observed between Group 2 and Group 3. However, no statistically significant difference was observed between Group 1 and Group 3 in the DPOAE test. According to histopathological scoring, the outer hair cell damage score was statistically significantly higher in Group 2 than in Group 1, and also statistically significantly higher in Group 2 than in Group 3. Outer hair cell damage scores were low in Groups 1 and 3, with no statistically significant difference between these groups. There was no statistically significant difference between the groups in the stria vascularis damage scores. Systemic ozone gas therapy is effective in the treatment of cell damage in cisplatin-induced ototoxicity; intratympanic administration of ozone gas offers no additional advantage over rectal administration.
Nimptsch, Ulrike; Wengler, Annelene; Mansky, Thomas
2016-11-01
In Germany, nationwide hospital discharge data (DRG statistics provided by the research data centers of the Federal Statistical Office and the Statistical Offices of the 'Länder') are increasingly used as a data source for health services research. Within these data, hospitals can be distinguished via their hospital identifier ([Institutionskennzeichen] IK). However, this hospital identifier primarily designates the invoicing unit and is not necessarily equivalent to one hospital location. Aiming to investigate the direction and extent of possible bias in hospital-level analyses, this study examines the continuity of the hospital identifier in cross-sectional and longitudinal approaches and compares the results to official hospital census statistics. Within the DRG statistics from 2005 to 2013, the annual number of hospitals as classified by hospital identifiers was counted for each year of observation. The annual number of hospitals derived from the DRG statistics was compared to the number of hospitals in the official census statistics 'Grunddaten der Krankenhäuser'. Subsequently, the temporal continuity of hospital identifiers in the DRG statistics was analyzed within cohorts of hospitals. Until 2013, the annual number of hospital identifiers in the DRG statistics fell by 175 (from 1,725 to 1,550). This decline affected only providers with small or medium case volumes. The number of hospitals identified in the DRG statistics was lower than the number given in the census statistics (e.g., in 2013, 1,550 IKs vs. 1,668 hospitals in the census statistics). The longitudinal analyses revealed that the majority of hospital identifiers persisted over the years of observation, while one fifth of hospital identifiers changed. In cross-sectional studies of German hospital discharge data, separating hospitals via the hospital identifier may lead to underestimating the number of hospitals and consequently overestimating the caseload per hospital. Discontinuities of hospital identifiers over time may impair the follow-up of hospital cohorts. These limitations must be taken into account in analyses of German hospital discharge data focusing on the hospital level. Copyright © 2016. Published by Elsevier GmbH.
NASA Astrophysics Data System (ADS)
Campbell, Adam J.; Hulbe, Christina L.; Lee, Choon-Ki
2018-01-01
As time series observations of Antarctic change proliferate, it is imperative that mathematical frameworks through which they are understood keep pace. Here we present a new method of interpreting remotely sensed change using spatial statistics and apply it to the specific case of thickness change on the Ross Ice Shelf. First, a numerical model of ice shelf flow is used together with empirical orthogonal function analysis to generate characteristic patterns of response to specific forcings. Because they are continuous and scalable in space and time, the patterns allow short duration observations to be placed in a longer time series context. Second, focusing only on changes that are statistically significant, the synthetic response surfaces are used to extract magnitude and timing of past events from the observational data. Slowdown of Kamb and Whillans Ice Streams is clearly detectable in remotely sensed thickness change. Moreover, those past events will continue to drive thinning into the future.
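The EOF step can be sketched as a singular value decomposition of the space-time anomaly matrix; the Python fragment below is illustrative only, with random numbers standing in for the modeled thickness-change fields, and the array shapes chosen arbitrarily.

    import numpy as np

    # thickness: (n_times, n_gridpoints) field from the ice-shelf flow model
    rng = np.random.default_rng(0)
    thickness = rng.normal(size=(50, 400))
    anomaly = thickness - thickness.mean(axis=0)   # remove the time mean

    # SVD: rows of Vt are spatial EOF patterns, U*S are their time series
    U, S, Vt = np.linalg.svd(anomaly, full_matrices=False)
    explained = S**2 / np.sum(S**2)                # variance fraction per mode
    pattern_1 = Vt[0]                              # leading spatial pattern
    amplitude_1 = U[:, 0] * S[0]                   # its temporal coefficients
    print(f"mode 1 explains {explained[0]:.1%} of the variance")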
Statistical properties of Galactic CMB foregrounds: dust and synchrotron
NASA Astrophysics Data System (ADS)
Kandel, D.; Lazarian, A.; Pogosyan, D.
2018-07-01
Recent Planck observations have revealed some of the important statistical properties of synchrotron and dust polarization, namely the B- to E-mode power ratio and the temperature-E (TE) cross-correlation. In this paper, we extend our analysis in Kandel et al., which studied the B- to E-mode power ratio for polarized dust emission, to include the TE cross-correlation, and we develop an analogous formalism for the synchrotron signal, all using a realistic model of magnetohydrodynamical turbulence. Our results suggest that the Planck results for both synchrotron and dust polarization can be understood if the turbulence in the Galaxy is sufficiently sub-Alfvénic. Making use of the observed poor magnetic field-density correlation, we show that the observed positive TE correlation for dust corresponds to our theoretical expectations. We also show how the B- to E-mode ratio, as well as the TE cross-correlation, can be used to study media magnetization, compressibility, and the level of density-magnetic field correlation.
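Assuming healpy is available, the two observables can be computed from (T, Q, U) sky maps roughly as follows; the maps here are random placeholders rather than Planck data, and the normalized coefficient r_l is one common convention, not necessarily the paper's.

    import healpy as hp
    import numpy as np

    nside = 64
    # Placeholder (T, Q, U) maps; real analyses would use foreground maps
    maps = np.random.default_rng(1).normal(size=(3, hp.nside2npix(nside)))

    # anafast with polarization returns TT, EE, BB, TE, EB, TB spectra
    tt, ee, bb, te, eb, tb = hp.anafast(maps, pol=True, lmax=2 * nside)
    ell = np.arange(len(tt))
    good = ell >= 2                      # skip monopole and dipole

    print("mean BB/EE power ratio:", np.mean(bb[good] / ee[good]))
    # Normalized TE correlation: r_l = C_l^TE / sqrt(C_l^TT C_l^EE)
    r_te = te[good] / np.sqrt(tt[good] * ee[good])
    print("mean TE correlation coefficient:", np.mean(r_te))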
An asymptotic analysis of the logrank test.
Strawderman, R L
1997-01-01
Asymptotic expansions for the null distribution of the logrank statistic and its distribution under local proportional hazards alternatives are developed in the case of iid observations. The results, which are derived from the work of Gu (1992) and Taniguchi (1992), are easy to interpret, and provide some theoretical justification for many behavioral characteristics of the logrank test that have been previously observed in simulation studies. We focus primarily upon (i) the inadequacy of the usual normal approximation under treatment group imbalance and (ii) the effects of treatment group imbalance on power and sample size calculations. A simple transformation of the logrank statistic, derived from results in Konishi (1991), is found to substantially improve the standard normal approximation to the statistic's distribution under the null hypothesis of no survival difference when there is treatment group imbalance.
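For concreteness, here is a small Python sketch of the two-sample logrank statistic and its standard normal approximation under deliberately imbalanced arms; the data are simulated and uncensored, and the sketch illustrates only the statistic whose approximation the paper studies, not the paper's expansions.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    n1, n0 = 20, 180                         # deliberately imbalanced arms
    t = np.concatenate([rng.exponential(1.0, n1), rng.exponential(1.5, n0)])
    group = np.concatenate([np.ones(n1), np.zeros(n0)])   # 1 = treatment

    obs_minus_exp, var = 0.0, 0.0
    for ti in np.unique(t):                  # no censoring in this toy example
        at_risk = t >= ti
        d = np.sum(t == ti)                  # deaths at time ti
        d1 = np.sum((t == ti) & (group == 1))
        n = at_risk.sum()
        n1_risk = np.sum(at_risk & (group == 1))
        obs_minus_exp += d1 - d * n1_risk / n          # observed minus expected
        if n > 1:                                      # hypergeometric variance
            var += d * (n1_risk / n) * (1 - n1_risk / n) * (n - d) / (n - 1)

    z = obs_minus_exp / np.sqrt(var)
    print(f"logrank Z = {z:.2f}, normal-approx p = {2 * norm.sf(abs(z)):.3f}")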
Associations between host characteristics and antimicrobial resistance of Salmonella Typhimurium.
Ruddat, I; Tietze, E; Ziehm, D; Kreienbrock, L
2014-10-01
A collection of Salmonella Typhimurium isolates obtained from sporadic salmonellosis cases in humans in Lower Saxony, Germany, between June 2008 and May 2010 was used to perform an exploratory risk-factor analysis of antimicrobial resistance (AMR) using comprehensive host information on sociodemographic attributes, medical history, food habits, and animal contact. Multivariate resistance profiles of minimum inhibitory concentrations for 13 antimicrobial agents were analysed using a non-parametric approach with multifactorial models adjusted for phage types. Statistically significant associations were observed for consumption of antimicrobial agents, region type, and three factors related to egg-purchasing behaviour, indicating that, besides antimicrobial use, proximity to other community members, health consciousness, and other lifestyle-related attributes may play a role in the dissemination of resistance. Furthermore, a statistically significant increase in AMR from the first study year to the second was observed.
Statistical properties of derivatives: A journey in term structures
NASA Astrophysics Data System (ADS)
Lautier, Delphine; Raynaud, Franck
2011-06-01
This article presents an empirical study of 13 derivative markets for commodities and financial assets. The study goes beyond standard statistical analysis by including maturity as a variable for the daily returns of futures contracts from 1998 to 2010, for delivery dates up to 120 months. We observe that the mean and variance of the commodity returns follow a scaling behavior in the maturity dimension, with an exponent characteristic of the Samuelson effect. A comparison of the tails of the probability distributions across expiration dates shows a segmentation in the fat-tail exponent term structure above the Lévy stable region. Finally, we compute the average tail exponent for each maturity and observe two regimes of extreme events for derivative markets, reminiscent of a phase diagram with a sharp transition at the 18th delivery month.
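The maturity-dimension scaling can be illustrated by fitting the variance of returns against time to maturity on a log-log scale; the Python sketch below uses synthetic returns with a built-in Samuelson-style volatility decay, not the studied market data, and the exponent value is arbitrary.

    import numpy as np

    rng = np.random.default_rng(3)
    maturities = np.arange(1, 121)                     # months to delivery
    # Toy Samuelson effect: volatility decays as maturity grows
    true_sigma = 0.02 * maturities ** -0.3
    returns = rng.normal(0.0, true_sigma[:, None], size=(120, 2500))

    # Scaling exponent: slope of log(variance) vs log(maturity)
    var = returns.var(axis=1)
    slope, intercept = np.polyfit(np.log(maturities), np.log(var), 1)
    print(f"estimated variance scaling exponent: {slope:.2f}")  # ~ -0.6 here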
How weak values emerge in joint measurements on cloned quantum systems.
Hofmann, Holger F
2012-07-13
A statistical analysis of optimal universal cloning shows that it is possible to identify an ideal (but nonpositive) copying process that faithfully maps all properties of the original Hilbert space onto two separate quantum systems, resulting in perfect correlations for all observables. The joint probabilities for noncommuting measurements on separate clones then correspond to the real parts of the complex joint probabilities observed in weak measurements on a single system, where the measurements on the two clones replace the corresponding sequence of weak measurement and postselection. The imaginary parts of weak measurement statistics can be obtained by replacing the cloning process with a partial swap operation. A controlled-swap operation combines both processes, making the complete weak measurement statistics accessible as a well-defined contribution to the joint probabilities of fully resolved projective measurements on the two output systems.
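The quantity at the center of this construction is the complex weak value B_w = <f|B|i>/<f|i>, whose real and imaginary parts the cloning and swap schemes reproduce. A minimal numerical illustration follows; the states and observable are arbitrary choices, not taken from the paper.

    import numpy as np

    i_state = np.array([1.0, 0.0])                   # initial state |i>
    f_state = np.array([1.0, 1j]) / np.sqrt(2)       # postselected state |f>
    B = np.array([[0, 1], [1, 0]])                   # observable (Pauli X)

    # Complex weak value: B_w = <f|B|i> / <f|i>
    B_w = (f_state.conj() @ B @ i_state) / (f_state.conj() @ i_state)
    print(f"weak value: Re = {B_w.real:.2f}, Im = {B_w.imag:.2f}")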
RCT: Module 2.03, Counting Errors and Statistics, Course 8768
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillmer, Kurt T.
2017-04-01
Radiological sample analysis involves the observation of a random process that may or may not occur and an estimation of the amount of radioactive material present based on that observation. Across the country, radiological control personnel use these activity measurements to make decisions that may affect the health and safety of workers at those facilities and their surrounding environments. This course presents an overview of measurement processes, a statistical evaluation of both measurements and equipment performance, and actions to take to minimize the sources of error in count room operations. This course will prepare the student with the skills necessary for radiological control technician (RCT) qualification by passing quizzes, tests, and the RCT Comprehensive Phase 1, Unit 2 Examination (TEST 27566) and by providing in-the-field skills.
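A representative count-room calculation of the kind such a module covers is the Poisson uncertainty of a net count rate (gross minus background); the counts and counting times in this Python sketch are made-up numbers.

    import math

    gross_counts, gross_time = 1500, 10.0   # counts, minutes
    bkg_counts, bkg_time = 300, 10.0

    # Net rate and its error: Poisson errors add in quadrature for a difference
    net_rate = gross_counts / gross_time - bkg_counts / bkg_time
    net_err = math.sqrt(gross_counts / gross_time**2 + bkg_counts / bkg_time**2)
    print(f"net rate = {net_rate:.1f} +/- {net_err:.1f} counts/min")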