John, Majnu; Lencz, Todd; Malhotra, Anil K; Correll, Christoph U; Zhang, Jian-Ping
2018-06-01
Meta-analysis of genetic association studies is being increasingly used to assess phenotypic differences between genotype groups. When the underlying genetic model is assumed to be dominant or recessive, assessing the phenotype differences based on summary statistics, reported for individual studies in a meta-analysis, is a valid strategy. However, when the genetic model is additive, a similar strategy based on summary statistics will lead to biased results. This fact about the additive model is one of the things that we establish in this paper, using simulations. The main goal of this paper is to present an alternate strategy for the additive model based on simulating data for the individual studies. We show that the alternate strategy is far superior to the strategy based on summary statistics.
Analysis of half diallel mating designs I: a practical analysis procedure for ANOVA approximation.
G.R. Johnson; J.N. King
1998-01-01
Procedures to analyze half-diallel mating designs using the SAS statistical package are presented. The procedure requires two runs of PROC VARCOMP and results in estimates of additive and non-additive genetic variation. The procedures described can be modified to work on most statistical software packages which can compute variance component estimates. The...
Statistical models and NMR analysis of polymer microstructure
USDA-ARS's Scientific Manuscript database
Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...
Langan, Dean; Higgins, Julian P T; Gregory, Walter; Sutton, Alexander J
2012-05-01
We aim to illustrate the potential impact of a new study on a meta-analysis, which gives an indication of the robustness of the meta-analysis. A number of augmentations are proposed to one of the most widely used of graphical displays, the funnel plot. Namely, 1) statistical significance contours, which define regions of the funnel plot in which a new study would have to be located to change the statistical significance of the meta-analysis; and 2) heterogeneity contours, which show how a new study would affect the extent of heterogeneity in a given meta-analysis. Several other features are also described, and the use of multiple features simultaneously is considered. The statistical significance contours suggest that one additional study, no matter how large, may have a very limited impact on the statistical significance of a meta-analysis. The heterogeneity contours illustrate that one outlying study can increase the level of heterogeneity dramatically. The additional features of the funnel plot have applications including 1) informing sample size calculations for the design of future studies eligible for inclusion in the meta-analysis; and 2) informing the updating prioritization of a portfolio of meta-analyses such as those prepared by the Cochrane Collaboration. Copyright © 2012 Elsevier Inc. All rights reserved.
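The idea behind the statistical significance contours can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance meta-analysis and a two-sided test at z = 1.96; the abstract does not state which pooling model the authors use, and the function names and study values below are hypothetical.

```python
import math


def pooled_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


def is_significant(effects, std_errors, z_crit=1.96):
    """True if the pooled estimate differs from zero at the given z threshold."""
    estimate, se = pooled_fixed_effect(effects, std_errors)
    return abs(estimate / se) > z_crit


def significant_after_new_study(effects, std_errors, new_effect, new_se, z_crit=1.96):
    """Re-run the pooled test with one hypothetical new study appended."""
    return is_significant(effects + [new_effect], std_errors + [new_se], z_crit)


# Hypothetical existing studies (effect sizes and standard errors).
effects = [0.30, 0.10, 0.25]
std_errors = [0.15, 0.20, 0.18]

before = is_significant(effects, std_errors)
# A small, imprecise null study versus a large, precise opposing study.
after_small_null = significant_after_new_study(effects, std_errors, 0.0, 0.5)
after_large_opposing = significant_after_new_study(effects, std_errors, -0.5, 0.1)
print(before, after_small_null, after_large_opposing)
```

Evaluating `significant_after_new_study` over a grid of (effect, standard error) pairs and shading the funnel plot by the result would give a rough version of such a contour.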
Online Statistical Modeling (Regression Analysis) for Independent Responses
NASA Astrophysics Data System (ADS)
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Statistical models have been developed in various directions to handle various types of data and complex relationships. A rich variety of advanced and recent statistical modelling techniques is available in open source software (one example is R). However, these advanced techniques are not very friendly to novice R users, since they rely on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling techniques are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical modelling techniques in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including modelling using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes it easier to compare models in order to find the most appropriate model for the data.
Research of Extension of the Life Cycle of Helicopter Rotor Blade in Hungary
2003-02-01
Radiography (DXR), and (iii) Vibration Diagnostics (VD) with Statistical Energy Analysis (SEA) were semi- simultaneously applied [1]. The used three...2.2. Vibration Diagnostics (VD)) Parallel to the NDT measurements the Statistical Energy Analysis (SEA) as a vibration diagnostical tool were...noises were analysed with a dual-channel real time frequency analyser (BK2035). In addition to the Statistical Energy Analysis measurement a small
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.
NASA Astrophysics Data System (ADS)
Jokhio, Gul A.; Syed Mohsin, Sharifah M.; Gul, Yasmeen
2018-04-01
It has been established that Adobe provides, in addition to being sustainable and economic, a better indoor air quality without spending extensive amounts of energy as opposed to the modern synthetic materials. The material, however, suffers from weak structural behaviour when subjected to adverse loading conditions. A wide range of mechanical properties has been reported in literature owing to lack of research and standardization. The present paper presents the statistical analysis of the results that were obtained through compressive and flexural tests on Adobe samples. Adobe specimens with and without wire mesh reinforcement were tested and the results were reported. The statistical analysis of these results presents an interesting read. It has been found that the compressive strength of adobe increases by about 43% after adding a single layer of wire mesh reinforcement. This increase is statistically significant. The flexural response of Adobe has also shown improvement with the addition of wire mesh reinforcement, however, the statistical significance of the same cannot be established.
Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak
2016-06-01
Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistic formulas within MS Excel.
Frank, Till D.; Carmody, Aimée M.; Kholodenko, Boris N.
2012-01-01
We derive a statistical model of transcriptional activation using equilibrium thermodynamics of chemical reactions. We examine to what extent this statistical model predicts synergy effects of cooperative activation of gene expression. We determine parameter domains in which greater-than-additive and less-than-additive effects are predicted for cooperative regulation by two activators. We show that the statistical approach can be used to identify different causes of synergistic greater-than-additive effects: nonlinearities of the thermostatistical transcriptional machinery and three-body interactions between RNA polymerase and two activators. In particular, our model-based analysis suggests that at low transcription factor concentrations cooperative activation cannot yield synergistic greater-than-additive effects, i.e., DNA transcription can only exhibit less-than-additive effects. Accordingly, transcriptional activity turns from synergistic greater-than-additive responses at relatively high transcription factor concentrations into less-than-additive responses at relatively low concentrations. In addition, two types of re-entrant phenomena are predicted. First, our analysis predicts that under particular circumstances transcriptional activity will feature a sequence of less-than-additive, greater-than-additive, and eventually less-than-additive effects when for fixed activator concentrations the regulatory impact of activators on the binding of RNA polymerase to the promoter increases from weak, to moderate, to strong. Second, for appropriate promoter conditions when activator concentrations are increased then the aforementioned re-entrant sequence of less-than-additive, greater-than-additive, and less-than-additive effects is predicted as well. 
Finally, our model-based analysis suggests that even for weak activators that individually induce only negligible increases in promoter activity, promoter activity can exhibit greater-than-additive responses when transcription factors and RNA polymerase interact by means of three-body interactions. Overall, we show that versatility of transcriptional activation is brought about by nonlinearities of transcriptional response functions and interactions between transcription factors, RNA polymerase and DNA. PMID:22506020
Teaching Statistics from the Operating Table: Minimally Invasive and Maximally Educational
ERIC Educational Resources Information Center
Nowacki, Amy S.
2015-01-01
Statistics courses that focus on data analysis in isolation, discounting the scientific inquiry process, may not motivate students to learn the subject. By involving students in other steps of the inquiry process, such as generating hypotheses and data, students may become more interested and vested in the analysis step. Additionally, such an…
On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis.
Li, Bing; Chun, Hyonho; Zhao, Hongyu
2014-09-01
We introduce a nonparametric method for estimating non-gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis.
Statistics for People Who (Think They) Hate Statistics. Third Edition
ERIC Educational Resources Information Center
Salkind, Neil J.
2007-01-01
This text teaches an often intimidating and difficult subject in a way that is informative, personable, and clear. The author takes students through various statistical procedures, beginning with correlation and graphical representation of data and ending with inferential techniques and analysis of variance. In addition, the text covers SPSS, and…
Markov Logic Networks in the Analysis of Genetic Data
Sakhanenko, Nikita A.
2010-01-01
Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi-squared minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers in using sophisticated tools and in matching these tools with specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997.
It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili
2016-09-01
Soy sauce, a well-known seasoning all over the world, especially in Asia, is available in the global market in a wide range of types based on its purpose and processing methods. Its composition varies with respect to the fermentation processes and the addition of additives, preservatives and flavor enhancers. Comprehensive ¹H NMR based study of the metabonomic variations of soy sauce, to differentiate among the different types available on the global market, has been limited due to the complexity of the mixture. In the present study, ¹³C NMR spectroscopy coupled with multivariate statistical data analysis, such as principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using ¹³C NMR spectroscopy; they were otherwise difficult to assign and quantify due to serious signal overlaps in ¹H NMR spectra. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be directly linked to the addition of caramel in soy sauce. Similarly, the significantly higher level of glutamate in super light as compared to super dark and mushroom flavored soy sauce may come from the addition of monosodium glutamate. The study highlights the potential of ¹³C NMR based metabonomics coupled with multivariate statistical data analysis in differentiating between the types of soy sauce on the basis of the level of additives, raw materials and fermentation procedures. Copyright © 2016 Elsevier B.V. All rights reserved.
Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization
NASA Technical Reports Server (NTRS)
Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred
2014-01-01
In order to provide a complete description of a material's thermoelectric power factor, an uncertainty interval is required in addition to the measured nominal value. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. The work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters, in addition to uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.
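As a rough illustration of how an uncertainty interval on the power factor can be formed, the sketch below applies standard first-order propagation to PF = S²/ρ, assuming the Seebeck coefficient S and resistivity ρ have independent uncertainties. This is a generic textbook calculation, not the bias-by-bias analysis described in the abstract, and the function names and numerical values are hypothetical.

```python
import math


def power_factor(seebeck, resistivity):
    """Thermoelectric power factor PF = S^2 / rho (SI units: W m^-1 K^-2)."""
    return seebeck ** 2 / resistivity


def power_factor_uncertainty(seebeck, u_seebeck, resistivity, u_resistivity):
    """First-order propagated uncertainty of PF, assuming independent inputs.

    Relative form: u_PF / PF = sqrt((2 u_S / S)^2 + (u_rho / rho)^2).
    """
    relative = math.sqrt((2.0 * u_seebeck / seebeck) ** 2
                         + (u_resistivity / resistivity) ** 2)
    return power_factor(seebeck, resistivity) * relative


# Hypothetical measurement: S = 200 uV/K (5% uncertainty), rho = 1e-5 ohm m (3%).
S, u_S = 200e-6, 10e-6
rho, u_rho = 1e-5, 3e-7

pf = power_factor(S, rho)
u_pf = power_factor_uncertainty(S, u_S, rho, u_rho)
print(f"PF = {pf:.3e} +/- {u_pf:.1e} W/(m K^2)")
```

Because S enters squared, its relative uncertainty counts double, so in this example the Seebeck term dominates the combined uncertainty.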
[Statistical analysis using freely-available "EZR (Easy R)" software].
Kanda, Yoshinobu
2015-10-01
Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical function of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and ensure that the statistical process is overseen by a supervisor.
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2011-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result, a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning. PMID:21451741
Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C
2018-03-07
Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. 
Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
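Two of the method families named above can be sketched briefly. The abstract does not give the exact formulas evaluated, so the block assumes two commonly used approximations: SD ≈ range / 4, and mean ≈ (lower quartile + median + upper quartile) / 3. Both are rough and depend on sample size and skewness; the function names and trial values are hypothetical.

```python
def sd_from_range(minimum, maximum, divisor=4.0):
    """Approximate a missing SD from the reported range.

    The divisor of 4 is a common rule of thumb (an assumption here);
    larger divisors are sometimes suggested for large samples.
    """
    if maximum < minimum:
        raise ValueError("maximum must be at least minimum")
    return (maximum - minimum) / divisor


def mean_from_quartiles(lower_quartile, median, upper_quartile):
    """Approximate a missing mean as the average of the three quartiles."""
    return (lower_quartile + median + upper_quartile) / 3.0


# Hypothetical trial reporting only min/max and quartiles of a skewed outcome.
approx_sd = sd_from_range(10.0, 30.0)
approx_mean = mean_from_quartiles(12.0, 18.0, 27.0)
print(approx_sd, approx_mean)
```

The imputed values can then enter a standard inverse-variance meta-analysis, ideally with a sensitivity analysis that omits the imputed trials.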
NASA Technical Reports Server (NTRS)
Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)
2000-01-01
The use of the Principal Component Analysis technique for the analysis of geophysical time series has been questioned, in particular for its tendency to extract components that mix several physical phenomena even when the signal is just their linear sum. We demonstrate with a data simulation experiment that Independent Component Analysis (ICA), a recently developed technique, is able to solve this problem. This new technique requires the statistical independence of components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. Furthermore, ICA does not require additional a priori information such as the localization constraint used in Rotational Techniques.
2005-04-01
the radiography gauging. In addition to the Statistical Energy Analysis (SEA) measurement a small exciter table (BK4810) and impedance head (BK 8000... Statistical Energy Analysis ; 7th Conf. on Vehicle System Dynamics, Identification and Anomalies (VSDIA2000), 6-8 Nov. 2000 Budapest, Proc. pp. 491-493... Energy Analysis (SEA) and Ultrasound Test. (UT) were concurrently applied. These methods collect accessory information on the objects under inspection
Lachowiec, Jennifer; Shen, Xia; Queitsch, Christine; Carlborg, Örjan
2015-01-01
Efforts to identify loci underlying complex traits generally assume that most genetic variance is additive. Here, we examined the genetics of Arabidopsis thaliana root length and found that the genomic narrow-sense heritability for this trait in the examined population was statistically zero. The low amount of additive genetic variance that could be captured by the genome-wide genotypes likely explains why no associations to root length could be found using standard additive-model-based genome-wide association (GWA) approaches. However, as the broad-sense heritability for root length was significantly larger, and primarily due to epistasis, we also performed an epistatic GWA analysis to map loci contributing to the epistatic genetic variance. Four interacting pairs of loci were revealed, involving seven chromosomal loci that passed a standard multiple-testing corrected significance threshold. The genotype-phenotype maps for these pairs revealed epistasis that cancelled out the additive genetic variance, explaining why these loci were not detected in the additive GWA analysis. Small population sizes, such as in our experiment, increase the risk of identifying false epistatic interactions due to testing for associations with very large numbers of multi-marker genotypes in few phenotyped individuals. Therefore, we estimated the false-positive risk using a new statistical approach that suggested half of the associated pairs to be true positive associations. Our experimental evaluation of candidate genes within the seven associated loci suggests that this estimate is conservative; we identified functional candidate genes that affected root development in four loci that were part of three of the pairs. The statistical epistatic analyses were thus indispensable for confirming known, and identifying new, candidate genes for root length in this population of wild-collected A. thaliana accessions. 
We also illustrate how epistatic cancellation of the additive genetic variance explains the insignificant narrow-sense and significant broad-sense heritability by using a combination of careful statistical epistatic analyses and functional genetic experiments.
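How epistasis can cancel additive variance is easiest to see in a toy two-locus model. The sketch below is not the authors' analysis: it assumes fully homozygous lines (alleles coded 0/1 at each locus), independent loci, and a hypothetical genotype-phenotype map in which the trait is high only when the two loci carry different alleles.

```python
from itertools import product


def variance_components(phenotype, freq_a=0.5, freq_b=0.5):
    """Split genetic variance into additive and epistatic parts.

    phenotype maps (allele_a, allele_b) -> genotypic value for two
    independent loci in homozygous lines; freq_* is the frequency of allele 1.
    """
    pa = {0: 1.0 - freq_a, 1: freq_a}
    pb = {0: 1.0 - freq_b, 1: freq_b}
    weight = {(a, b): pa[a] * pb[b] for a, b in product((0, 1), repeat=2)}
    mean = sum(weight[g] * phenotype[g] for g in weight)
    total = sum(weight[g] * (phenotype[g] - mean) ** 2 for g in weight)
    # Marginal (main) effect of each allele, averaged over the other locus.
    eff_a = {a: sum(pb[b] * phenotype[(a, b)] for b in (0, 1)) - mean for a in (0, 1)}
    eff_b = {b: sum(pa[a] * phenotype[(a, b)] for a in (0, 1)) - mean for b in (0, 1)}
    additive = (sum(pa[a] * eff_a[a] ** 2 for a in (0, 1))
                + sum(pb[b] * eff_b[b] ** 2 for b in (0, 1)))
    return total, additive, total - additive


# Hypothetical maps: pure interaction versus purely additive effects.
xor_map = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}
additive_map = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

xor_total, xor_additive, xor_epistatic = variance_components(xor_map)
add_total, add_additive, add_epistatic = variance_components(additive_map)
print(xor_total, xor_additive, xor_epistatic)
print(add_total, add_additive, add_epistatic)
```

In the interaction-only map each allele has the same average phenotype, so a single-locus additive scan sees nothing even though the genetic variance is nonzero, which mirrors the zero narrow-sense but significant broad-sense heritability described above.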
Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Background: Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives: This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods: We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results: There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion: The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
PMID:26053876
Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
Implementation of Head Start Planned Variation: 1970-1971. Part II.
ERIC Educational Resources Information Center
Lukas, Carol Van Deusen; Wohlleb, Cynthia
This volume of appendices is Part II of a study of program implementation in 12 models of Head Start Planned Variation. It presents details of the data analysis, copies of data collection instruments, and additional analyses and statistics. The appendices are: (A) Analysis of Variance Designs, (B) Copies of Instruments, (C) Additional Analyses,…
Trend Analysis Using Microcomputers.
ERIC Educational Resources Information Center
Berger, Carl F.
A trend analysis statistical package and additional programs for the Apple microcomputer are presented. They illustrate strategies of data analysis suitable to the graphics and processing capabilities of the microcomputer. The programs analyze data sets using examples of: (1) analysis of variance with multiple linear regression; (2) exponential…
Investigation of trends in flooding in the Tug Fork basin of Kentucky, Virginia, and West Virginia
Hirsch, Robert M.; Scott, Arthur G.; Wyant, Timothy
1982-01-01
Statistical analysis indicates that the average size of annual-flood peaks of the Tug Fork (Ky., Va., and W. Va.) has been increasing. However, additional statistical analysis does not indicate that the flood levels that were exceeded typically once or twice a year in the period 1947-79 are any more likely to be exceeded now than in 1947. Possible trends in stream-channel size also are investigated at three locations. No discernible trends in channel size are noted. Further statistical analysis of the trend in the size of annual-flood peaks shows that much of the annual variation is related to local rainfall and to the 'natural' hydrologic response in a relatively undisturbed subbasin. However, some statistical indication of trend persists after accounting for these natural factors, though it is of borderline statistical significance. Further study in the basin may relate flood magnitudes to both rainfall and land use.
Robbins, L G
2000-01-01
Graduate school programs in genetics have become so full that courses in statistics have often been eliminated. In addition, typical introductory statistics courses for the "statistics user" rather than the nascent statistician are laden with methods for analysis of measured variables while genetic data are most often discrete numbers. These courses are often seen by students and genetics professors alike as largely irrelevant cookbook courses. The powerful methods of likelihood analysis, although commonly employed in human genetics, are much less often used in other areas of genetics, even though current computational tools make this approach readily accessible. This article introduces the MLIKELY.PAS computer program and the logic of do-it-yourself maximum-likelihood statistics. The program itself, course materials, and expanded discussions of some examples that are only summarized here are available at http://www.unisi.it/ricerca/dip/bio_evol/sitomlikely/mlikely.html. PMID:10628965
Noise limitations in optical linear algebra processors.
Batsell, S G; Jong, T L; Walkup, J F; Krile, T F
1990-05-10
A general statistical noise model is presented for optical linear algebra processors. A statistical analysis which includes device noise, the multiplication process, and the addition operation is undertaken. We focus on those processes which are architecturally independent. Finally, experimental results which verify the analytical predictions are also presented.
Improved score statistics for meta-analysis in single-variant and gene-level association studies.
Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo
2018-06-01
Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss incurred by the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods that perform equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
The Content of Statistical Requirements for Authors in Biomedical Research Journals
Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang
2016-01-01
Background: Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Methods: Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines as well as particular requirements were summarized by frequency and grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Results: In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups update their statistical instructions for authors gradually as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including “address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation,” and “statistical methods and the reasons.” Conclusions: Statistical requirements for authors are becoming increasingly refined. Statistical requirements for authors remind researchers that they should give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would help provide stronger evidence and make critical appraisal of the evidence more accessible. PMID:27748343
The Content of Statistical Requirements for Authors in Biomedical Research Journals.
Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang
2016-10-20
Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines as well as particular requirements were summarized by frequency and grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups update their statistical instructions for authors gradually as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation," and "statistical methods and the reasons." Statistical requirements for authors are becoming increasingly refined. Statistical requirements for authors remind researchers that they should give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would help provide stronger evidence and make critical appraisal of the evidence more accessible.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.
This study statistically analyzed a grain-size based additivity model that has been proposed to scale reaction rates and parameters from laboratory to field. The additivity model assumed that reaction properties in a sediment including surface area, reactive site concentration, reaction rate, and extent can be predicted from field-scale grain size distribution by linearly adding reaction properties for individual grain size fractions. This study focused on the statistical analysis of the additivity model with respect to reaction rate constants using multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment as an example. Experimental data of rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of multi-rate parameters for individual grain size fractions. The statistical properties of the rate constants for the individual grain size fractions were then used to analyze the statistical properties of the additivity model to predict rate-limited U(VI) desorption in the composite sediment, and to evaluate the relative importance of individual grain size fractions to the overall U(VI) desorption. The result indicated that the additivity model provided a good prediction of the U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model, and U(VI) desorption in individual grain size fractions has to be simulated in order to apply the additivity model. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated. The result found that the approximate model provided a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel size fraction (2-8 mm), which is often ignored in modeling U(VI) sorption and desorption, is statistically significant to the U(VI) desorption in the sediment.
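The linear additivity idea in the abstract above can be sketched in a few lines. This is only an illustration, assuming that each grain size fraction is weighted by its mass fraction (the abstract does not state the weighting); the function name `additive_property` and the numbers are hypothetical.

```python
def additive_property(mass_fractions, fraction_properties):
    """Composite property as a weighted linear sum over grain size fractions.

    Assumption: weights are mass fractions that sum to 1. The abstract only
    says the properties are added linearly.
    """
    if len(mass_fractions) != len(fraction_properties):
        raise ValueError("one property value is needed per grain size fraction")
    if abs(sum(mass_fractions) - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to 1")
    return sum(f * p for f, p in zip(mass_fractions, fraction_properties))


# Hypothetical example: three size fractions with made-up site concentrations.
composite = additive_property([0.2, 0.3, 0.5], [10.0, 20.0, 30.0])
print(composite)
```

As the abstract notes, this kind of sum applies to the reaction properties and simulated desorption of each fraction; the rate constants themselves were found not to scale directly in this way.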
Introductory Statistics and Fish Management.
ERIC Educational Resources Information Center
Jardine, Dick
2002-01-01
Describes how fisheries research and management data (available on a website) have been incorporated into an Introductory Statistics course. In addition to the motivation gained from seeing the practical relevance of the course, some students have participated in the data collection and analysis for the New Hampshire Fish and Game Department. (MM)
Methodologies for the Statistical Analysis of Memory Response to Radiation
NASA Astrophysics Data System (ADS)
Bosser, Alexandre L.; Gupta, Viyas; Tsiligiannis, Georgios; Frost, Christopher D.; Zadeh, Ali; Jaatinen, Jukka; Javanainen, Arto; Puchner, Helmut; Saigné, Frédéric; Virtanen, Ari; Wrobel, Frédéric; Dilillo, Luigi
2016-08-01
Methodologies are proposed for in-depth statistical analysis of Single Event Upset data. The motivation for using these methodologies is to obtain precise information on the intrinsic defects and weaknesses of the tested devices, and to gain insight on their failure mechanisms, at no additional cost. The case study is a 65 nm SRAM irradiated with neutrons, protons and heavy ions. This publication is an extended version of a previous study [1].
Chevance, Aurélie; Schuster, Tibor; Steele, Russell; Ternès, Nils; Platt, Robert W
2015-10-01
Robustness of an existing meta-analysis can justify decisions on whether to conduct an additional study addressing the same research question. We illustrate the graphical assessment of the potential impact of an additional study on an existing meta-analysis using published data on statin use and the risk of acute kidney injury. A previously proposed graphical augmentation approach is used to assess the sensitivity of the current test and heterogeneity statistics extracted from existing meta-analysis data. In addition, we extended the graphical augmentation approach to assess potential changes in the pooled effect estimate after updating a current meta-analysis and applied the three graphical contour definitions to data from meta-analyses on statin use and acute kidney injury risk. In the considered example data, the pooled effect estimates and heterogeneity indices proved considerably robust to the addition of a future study. However, for some previously inconclusive meta-analyses, a study update might yield a statistically significant increase in kidney injury risk associated with higher statin exposure. The illustrated contour approach should become a standard tool for the assessment of the robustness of meta-analyses. It can guide decisions on whether to conduct additional studies addressing a relevant research question. Copyright © 2015 Elsevier Inc. All rights reserved.
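The effect of one additional study on a pooled estimate, its test statistic, and heterogeneity can be sketched numerically. The example below assumes a fixed-effect inverse-variance model and uses Cochran's Q as the heterogeneity statistic; the abstract does not say which model or index the authors use, and the function name and data are hypothetical.

```python
import math


def pooled_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling (an assumed model, not the paper's).

    Returns (pooled estimate, its standard error, z statistic, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    q = sum(w * (y - estimate) ** 2 for w, y in zip(weights, effects))
    return estimate, std_error, estimate / std_error, q


# Hypothetical existing meta-analysis of two studies (log effect sizes).
effects, std_errors = [0.2, 0.4], [0.1, 0.1]
before = pooled_fixed_effect(effects, std_errors)

# Add one hypothetical new study with a null effect and the same precision.
after = pooled_fixed_effect(effects + [0.0], std_errors + [0.1])

print("before:", before)
print("after: ", after)
```

Evaluating `pooled_fixed_effect` over a grid of candidate new-study effects and standard errors gives the raw values from which contours of this kind could be drawn.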
Study Designs and Statistical Analyses for Biomarker Research
Gosho, Masahiko; Nagashima, Kengo; Sato, Yasunori
2012-01-01
Biomarkers are becoming increasingly important for streamlining drug discovery and development. In addition, biomarkers are widely expected to be used as a tool for disease diagnosis, personalized medication, and surrogate endpoints in clinical research. In this paper, we highlight several important aspects related to study design and statistical analysis for clinical research incorporating biomarkers. We describe the typical and current study designs for exploring, detecting, and utilizing biomarkers. Furthermore, we introduce statistical issues such as confounding and multiplicity for statistical tests in biomarker research. PMID:23012528
Distribution of lod scores in oligogenic linkage analysis.
Williams, J T; North, K E; Martin, L J; Comuzzie, A G; Göring, H H; Blangero, J
2001-01-01
In variance component oligogenic linkage analysis it can happen that the residual additive genetic variance bounds to zero when estimating the effect of the ith quantitative trait locus. Using quantitative trait Q1 from the Genetic Analysis Workshop 12 simulated general population data, we compare the observed lod scores from oligogenic linkage analysis with the empirical lod score distribution under a null model of no linkage. We find that zero residual additive genetic variance in the null model alters the usual distribution of the likelihood-ratio statistic.
Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping
2018-05-22
Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. Contact: 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.
NASA Technical Reports Server (NTRS)
Simmons, D. B.; Marchbanks, M. P., Jr.; Quick, M. J.
1982-01-01
The results of an effort to thoroughly and objectively analyze the statistical and historical information gathered during the development of the Shuttle Orbiter Primary Flight Software are given. The particular areas of interest include cost of the software, reliability of the software, requirements for the software and how the requirements changed during development of the system. Data related to the current version of the software system produced some interesting results. Suggestions are made for the saving of additional data which will allow additional investigation.
ERIC Educational Resources Information Center
Armijo, Michael; Lundy-Wagner, Valerie; Merrill, Elizabeth
2012-01-01
This paper asks how doctoral students understand the use of race variables in statistical modeling. More specifically, it examines how doctoral students at two universities are trained to define, operationalize, and analyze race variables. The authors interviewed students and instructors in addition to conducting a document analysis of their texts…
Batch reporting of forest inventory statistics using the EVALIDator
Patrick D. Miles
2015-01-01
The EVALIDator Web application, developed in 2007, provides estimates and sampling errors of forest statistics (e.g., forest area, number of trees, tree biomass) from data stored in the Forest Inventory and Analysis database. In response to user demand, new features have been added to the EVALIDator. The most recent additions are 1) the ability to generate multiple...
Ma, Junshui; Wang, Shubing; Raubertas, Richard; Svetnik, Vladimir
2010-07-15
With the increasing popularity of using electroencephalography (EEG) to reveal the treatment effect in drug development clinical trials, the vast volume and complex nature of EEG data compose an intriguing, but challenging, topic. In this paper the statistical analysis methods recommended by the EEG community, along with methods frequently used in the published literature, are first reviewed. A straightforward adjustment of the existing methods to handle multichannel EEG data is then introduced. In addition, based on the spatial smoothness property of EEG data, a new category of statistical methods is proposed. The new methods use a linear combination of low-degree spherical harmonic (SPHARM) basis functions to represent a spatially smoothed version of the EEG data on the scalp, which is close to a sphere in shape. In total, seven statistical methods, including both the existing and the newly proposed methods, are applied to two clinical datasets to compare their power to detect a drug effect. Contrary to the EEG community's recommendation, our results suggest that (1) the nonparametric method does not outperform its parametric counterpart; and (2) including baseline data in the analysis does not always improve the statistical power. In addition, our results recommend that (3) simple paired statistical tests should be avoided due to their poor power; and (4) the proposed spatially smoothed methods perform better than their unsmoothed versions. Copyright 2010 Elsevier B.V. All rights reserved.
Spatio-temporal analysis of annual rainfall in Crete, Greece
NASA Astrophysics Data System (ADS)
Varouchakis, Emmanouil A.; Corzo, Gerald A.; Karatzas, George P.; Kotsopoulou, Anastasia
2018-03-01
Analysis of rainfall data from the island of Crete, Greece was performed to identify key hydrological years and return periods as well as to analyze the inter-annual behavior of the rainfall variability during the period 1981-2014. The rainfall spatial distribution was also examined in detail to identify vulnerable areas of the island. Statistical tools and spectral analysis were applied to investigate and interpret the temporal course of the available rainfall data set. In addition, spatial analysis techniques were applied and compared to determine the rainfall spatial distribution on the island of Crete. The analysis showed that, in contrast to Regional Climate Model estimations, rainfall rates have not decreased, while return periods vary depending on seasonality and geographic location. A small but statistically significant increasing trend was detected in the inter-annual rainfall variations, as well as a significant rainfall cycle almost every 8 years. In addition, a statistically significant correlation of the island's rainfall variability with the North Atlantic Oscillation is identified for the examined period. Furthermore, a regression kriging method incorporating surface elevation as secondary information improved the estimation of the annual rainfall spatial variability on the island of Crete by 70% compared to ordinary kriging. The rainfall spatial and temporal trends on the island of Crete have variable characteristics that depend on the geographical area and on the hydrological period.
Reif, David M.; Israel, Mark A.; Moore, Jason H.
2007-01-01
The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666
Kim, Sung-Min; Choi, Yosoon
2017-01-01
To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required. PMID:28629168
Kim, Sung-Min; Choi, Yosoon
2017-06-18
To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
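The Getis-Ord Gi* z-score and the four-group classification described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes binary distance-band weights (each sample's neighbours are all samples within `radius`, including itself), and the cut-offs in `classify` are hypothetical placeholders, since the abstract does not give the thresholds used.

```python
import math


def getis_ord_gi_star(values, coords, radius):
    """Gi* z-score for every sample, using assumed binary distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for xi, yi in coords:
        # w_ij = 1 if sample j lies within `radius` of sample i (i included).
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= radius else 0.0
             for xj, yj in coords]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        local_sum = sum(wj * vj for wj, vj in zip(w, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator (constant data or all-inclusive band) gives no signal.
        z_scores.append((local_sum - mean * sum_w) / denom if denom > 0 else 0.0)
    return z_scores


def classify(normalized_content, z_score, content_cut=0.5, z_cut=1.96):
    """Label a sample HH, HL, LH or LL; both cut-offs are hypothetical."""
    first = "H" if normalized_content >= content_cut else "L"
    second = "H" if z_score >= z_cut else "L"
    return first + second


# Hypothetical transect: three high readings followed by three low readings.
values = [10.0, 12.0, 11.0, 1.0, 2.0, 1.0]
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
z = getis_ord_gi_star(values, coords, radius=1.5)
print([round(score, 2) for score in z])
```

A sample labelled HL or LH by `classify` would correspond to the suspect cases that the abstract attributes to possible measurement error.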
On an additive partial correlation operator and nonparametric estimation of graphical models.
Lee, Kuang-Yao; Li, Bing; Zhao, Hongyu
2016-09-01
We introduce an additive partial correlation operator as an extension of partial correlation to the nonlinear setting, and use it to develop a new estimator for nonparametric graphical models. Our graphical models are based on additive conditional independence, a statistical relation that captures the spirit of conditional independence without having to resort to high-dimensional kernels for its estimation. The additive partial correlation operator completely characterizes additive conditional independence, and has the additional advantage of putting marginal variation on appropriate scales when evaluating interdependence, which leads to more accurate statistical inference. We establish the consistency of the proposed estimator. Through simulation experiments and analysis of the DREAM4 Challenge dataset, we demonstrate that our method performs better than existing methods in cases where the Gaussian or copula Gaussian assumption does not hold, and that a more appropriate scaling for our method further enhances its performance.
On an additive partial correlation operator and nonparametric estimation of graphical models
Li, Bing; Zhao, Hongyu
2016-01-01
We introduce an additive partial correlation operator as an extension of partial correlation to the nonlinear setting, and use it to develop a new estimator for nonparametric graphical models. Our graphical models are based on additive conditional independence, a statistical relation that captures the spirit of conditional independence without having to resort to high-dimensional kernels for its estimation. The additive partial correlation operator completely characterizes additive conditional independence, and has the additional advantage of putting marginal variation on appropriate scales when evaluating interdependence, which leads to more accurate statistical inference. We establish the consistency of the proposed estimator. Through simulation experiments and analysis of the DREAM4 Challenge dataset, we demonstrate that our method performs better than existing methods in cases where the Gaussian or copula Gaussian assumption does not hold, and that a more appropriate scaling for our method further enhances its performance. PMID:29422689
Cost-Effectiveness Analysis: a proposal of new reporting standards in statistical analysis
Bang, Heejung; Zhao, Hongwei
2014-01-01
Cost-effectiveness analysis (CEA) is a method for evaluating the outcomes and costs of competing strategies designed to improve health, and has been applied to a variety of different scientific fields. Yet, there are inherent complexities in cost estimation and CEA from statistical perspectives (e.g., skewness, bi-dimensionality, and censoring). The incremental cost-effectiveness ratio that represents the additional cost per one unit of outcome gained by a new strategy has served as the most widely accepted methodology in the CEA. In this article, we call for expanded perspectives and reporting standards reflecting a more comprehensive analysis that can elucidate different aspects of available data. Specifically, we propose that mean and median-based incremental cost-effectiveness ratios and average cost-effectiveness ratios be reported together, along with relevant summary and inferential statistics as complementary measures for informed decision making. PMID:24605979
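The ratios named in the abstract above can be illustrated with a minimal sketch. It assumes per-patient cost and outcome samples for two strategies; the function names `icer` and `acer` and all numbers are hypothetical, and the sketch ignores censoring and inference, which the article treats as essential.

```python
from statistics import mean, median

def icer(cost_new, cost_old, eff_new, eff_old, summary=mean):
    """Incremental cost-effectiveness ratio: additional cost per one
    unit of outcome gained, using `summary` (mean or median)."""
    d_cost = summary(cost_new) - summary(cost_old)
    d_eff = summary(eff_new) - summary(eff_old)
    if d_eff == 0:
        raise ZeroDivisionError("no difference in outcome; ICER undefined")
    return d_cost / d_eff

def acer(cost, eff, summary=mean):
    """Average cost-effectiveness ratio of a single strategy."""
    return summary(cost) / summary(eff)

# Hypothetical data; one very expensive patient skews the new strategy's costs.
cost_new = [1200, 1500, 1800, 9500]
cost_old = [1000, 1100, 1300, 2600]
eff_new = [2.0, 2.5, 3.0, 2.5]
eff_old = [1.5, 2.0, 2.0, 2.5]

mean_icer = icer(cost_new, cost_old, eff_new, eff_old, summary=mean)
median_icer = icer(cost_new, cost_old, eff_new, eff_old, summary=median)
acer_new = acer(cost_new, eff_new)
```

With skewed costs the mean-based and median-based ratios can differ widely, which is one reason to report them together.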
16 CFR 1000.26 - Directorate for Epidemiology.
Code of Federal Regulations, 2011 CFR
2011-01-01
.... In addition, staff in the Hazard Analysis Division design special studies, design and analyze data from experiments for testing of consumer products, and provide statistical expertise and advice to...
16 CFR 1000.26 - Directorate for Epidemiology.
Code of Federal Regulations, 2012 CFR
2012-01-01
.... In addition, staff in the Hazard Analysis Division design special studies, design and analyze data from experiments for testing of consumer products, and provide statistical expertise and advice to...
16 CFR 1000.26 - Directorate for Epidemiology.
Code of Federal Regulations, 2014 CFR
2014-01-01
.... In addition, staff in the Hazard Analysis Division design special studies, design and analyze data from experiments for testing of consumer products, and provide statistical expertise and advice to...
Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis
Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.
2006-01-01
In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
Analysis and meta-analysis of single-case designs: an introduction.
Shadish, William R
2014-04-01
The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCord, R.A.; Olson, R.J.
1988-01-01
Environmental research and assessment activities at Oak Ridge National Laboratory (ORNL) include the analysis of spatial and temporal patterns of ecosystem response at a landscape scale. Analysis through use of geographic information system (GIS) involves an interaction between the user and thematic data sets frequently expressed as maps. A portion of GIS analysis has a mathematical or statistical aspect, especially for the analysis of temporal patterns. ARC/INFO is an excellent tool for manipulating GIS data and producing the appropriate map graphics. INFO also has some limited ability to produce statistical tabulation. At ORNL we have extended our capabilities by graphically interfacing ARC/INFO and SAS/GRAPH to provide a combined mapping and statistical graphics environment. With the data management, statistical, and graphics capabilities of SAS added to ARC/INFO, we have expanded the analytical and graphical dimensions of the GIS environment. Pie or bar charts, frequency curves, hydrographs, or scatter plots as produced by SAS can be added to maps from attribute data associated with ARC/INFO coverages. Numerous, small, simplified graphs can also become a source of complex map "symbols." These additions extend the dimensions of GIS graphics to include time, details of the thematic composition, distribution, and interrelationships. 7 refs., 3 figs.
16 CFR § 1000.26 - Directorate for Epidemiology.
Code of Federal Regulations, 2013 CFR
2013-01-01
.... In addition, staff in the Hazard Analysis Division design special studies, design and analyze data from experiments for testing of consumer products, and provide statistical expertise and advice to...
Analysis of Two Methods to Evaluate Antioxidants
ERIC Educational Resources Information Center
Tomasina, Florencia; Carabio, Claudio; Celano, Laura; Thomson, Leonor
2012-01-01
This exercise is intended to introduce undergraduate biochemistry students to the analysis of antioxidants as a biotechnological tool. In addition, some statistical resources will also be used and discussed. Antioxidants play an important metabolic role, preventing oxidative stress-mediated cell and tissue injury. Knowing the antioxidant content…
NASA Astrophysics Data System (ADS)
Ishida, Shigeki; Mori, Atsuo; Shinji, Masato
The main method to reduce the blasting charge noise that occurs in a tunnel under construction is to install a sound insulation door in the tunnel. However, a numerical analysis technique to accurately predict the transmission loss of the sound insulation door has not been established. In this study, we measured the blasting charge noise and the vibration of the sound insulation door in a tunnel during blasting, analyzed the measurements, and modified the acoustic feature. In addition, we reproduced the noise reduction effect of the sound insulation door by the statistical energy analysis method and confirmed that numerical simulation is possible by this procedure.
New software for statistical analysis of Cambridge Structural Database data
Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.
2011-01-01
A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784
Precipitate statistics in an Al-Mg-Si-Cu alloy from scanning precession electron diffraction data
NASA Astrophysics Data System (ADS)
Sunde, J. K.; Paulsen, Ø.; Wenner, S.; Holmestad, R.
2017-09-01
The key microstructural feature providing strength to age-hardenable Al alloys is nanoscale precipitates. Alloy development requires a reliable statistical assessment of these precipitates, in order to link the microstructure with material properties. Here, it is demonstrated that scanning precession electron diffraction combined with computational analysis enables the semi-automated extraction of precipitate statistics in an Al-Mg-Si-Cu alloy. Among the main findings is the precipitate number density, which agrees well with a conventional method based on manual counting and measurements. By virtue of its data analysis objectivity, our methodology is therefore seen as an advantageous alternative to existing routines, offering reproducibility and efficiency in alloy statistics. Additional results include improved qualitative information on phase distributions. The developed procedure is generic and applicable to any material containing nanoscale precipitates.
Quantile regression for the statistical analysis of immunological data with many non-detects.
Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth
2012-07-07
Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
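The reason quantiles tolerate non-detects can be shown with a minimal sketch. It uses the sample median, which is the intercept-only special case of median regression, rather than a full quantile regression; the function name, the `None` encoding of non-detects, and the data are all hypothetical.

```python
from statistics import mean, median

def fill_nondetects(values, detection_limit, fill):
    """Replace non-detects (encoded here as None) with a placeholder
    that does not exceed the detection limit."""
    if fill > detection_limit:
        raise ValueError("placeholder must not exceed the detection limit")
    return [fill if v is None else v for v in values]

# Hypothetical measurements with detection limit 0.5; two of seven are non-detects.
raw = [None, None, 0.8, 1.1, 1.9, 2.4, 3.0]
limit = 0.5

low = fill_nondetects(raw, limit, fill=0.0)     # substitute zero
high = fill_nondetects(raw, limit, fill=limit)  # substitute the limit itself

median_low, median_high = median(low), median(high)
mean_low, mean_high = mean(low), mean(high)
```

As long as fewer than half of the values are non-detects, the median does not depend on the placeholder, whereas the mean does; higher percentiles tolerate correspondingly larger fractions of non-detects.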
Harkness, Mark; Fisher, Angela; Lee, Michael D; Mack, E Erin; Payne, Jo Ann; Dworatzek, Sandra; Roberts, Jeff; Acheson, Carolyn; Herrmann, Ronald; Possolo, Antonio
2012-04-01
A large, multi-laboratory microcosm study was performed to select amendments for supporting reductive dechlorination of high levels of trichloroethylene (TCE) found at an industrial site in the United Kingdom (UK) containing dense non-aqueous phase liquid (DNAPL) TCE. The study was designed as a fractional factorial experiment involving 177 bottles distributed between four industrial laboratories and was used to assess the impact of six electron donors, bioaugmentation, addition of supplemental nutrients, and two TCE levels (0.57 and 1.90 mM or 75 and 250 mg/L in the aqueous phase) on TCE dechlorination. Performance was assessed based on the concentration changes of TCE and reductive dechlorination degradation products. The chemical data was evaluated using analysis of variance (ANOVA) and survival analysis techniques to determine both main effects and important interactions for all the experimental variables during the 203-day study. The statistically based design and analysis provided powerful tools that aided decision-making for field application of this technology. The analysis showed that emulsified vegetable oil (EVO), lactate, and methanol were the most effective electron donors, promoting rapid and complete dechlorination of TCE to ethene. Bioaugmentation and nutrient addition also had a statistically significant positive impact on TCE dechlorination. In addition, the microbial community was measured using phospholipid fatty acid analysis (PLFA) for quantification of total biomass and characterization of the community structure and quantitative polymerase chain reaction (qPCR) for enumeration of Dehalococcoides organisms (Dhc) and the vinyl chloride reductase (vcrA) gene. The highest increase in levels of total biomass and Dhc was observed in the EVO microcosms, which correlated well with the dechlorination results. Copyright © 2012 Elsevier B.V. All rights reserved.
Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.
Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill
2016-01-01
Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of data analysis techniques and statistical power of these studies is currently lacking and is imperative for enabling sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60s (from 1h predose to 24h post-dose) and reduced to 15min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated measure analysis of variance was used for statistical analysis of crossover studies and a repeated measure analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100mg/kg are reported as a representative example of data analysis methods. Phentolamine produced a CV profile characteristic of alpha adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7mmHg. Detectable heart rate changes for both study designs were 20-22bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early stage CV studies. 
Copyright © 2016 Elsevier Inc. All rights reserved.
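The link between sample size, variability, and detectable change can be illustrated with a rough sketch. It uses a normal approximation for a paired (crossover-style) comparison, not the repeated-measures models of the study; the function name and the standard deviation of 5 mmHg are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def detectable_change(sd_diff, n, alpha=0.05, power=0.80):
    """Smallest mean within-animal change detectable by a two-sided
    paired test at the given power (normal approximation).
    sd_diff is the standard deviation of within-animal differences."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sd_diff / sqrt(n)

# Hypothetical: within-animal SD of 5 mmHg, 8 animals.
delta_8 = detectable_change(sd_diff=5.0, n=8)
# Quadrupling the sample size halves the detectable change.
delta_32 = detectable_change(sd_diff=5.0, n=32)
```

Evaluating the function over a range of `n` or `power` values gives a simple analogue of a power curve.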
Statistical analysis of fNIRS data: a comprehensive review.
Tak, Sungho; Ye, Jong Chul
2014-01-15
Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
Overholser, Brian R; Sowinski, Kevin M
2007-12-01
Biostatistics is the application of statistics to biologic data. The field of statistics can be broken down into 2 fundamental parts: descriptive and inferential. Descriptive statistics are commonly used to categorize, display, and summarize data. Inferential statistics can be used to make predictions based on a sample obtained from a population or some large body of information. It is these inferences that are used to test specific research hypotheses. This 2-part review will outline important features of descriptive and inferential statistics as they apply to commonly conducted research studies in the biomedical literature. Part 1 in this issue will discuss fundamental topics of statistics and data analysis. Additionally, some of the most commonly used statistical tests found in the biomedical literature will be reviewed in Part 2 in the February 2008 issue.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to the main effect model; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, for reference in big data analysis in clinical epidemiology and in the analysis of genetic multiplicative and additive interactions.
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Hu, Xiangdong; Liu, Yujiang; Qian, Linxue
2017-10-01
Real-time elastography (RTE) and shear wave elastography (SWE) are noninvasive and easily available imaging techniques that measure tissue strain, and it has been reported that the sensitivity and specificity of elastography in differentiating between benign and malignant thyroid nodules are better than those of conventional technologies. Relevant articles were searched in multiple databases; the comparison of elasticity index (EI) was conducted with Review Manager 5.0. Forest plots of the sensitivity and specificity and SROC curves of RTE and SWE were produced with STATA 10.0 software. In addition, sensitivity analysis and bias analysis of the studies were conducted to examine the quality of the articles, and a funnel plot and the Egger test were used to estimate possible publication bias. Finally, 22 articles that satisfied the inclusion criteria were included in this study. After eliminating ineligible cases, the benign and malignant nodules numbered 2106 and 613, respectively. The meta-analysis suggested that the difference in EI between benign and malignant nodules was statistically significant (SMD = 2.11, 95% CI [1.67, 2.55], P < .00001). The overall sensitivities of RTE and SWE were roughly comparable, whereas the difference in specificity between these 2 methods was statistically significant. In addition, a statistically significant difference in AUC was observed between RTE and SWE (P < .01). The specificity of RTE was statistically significantly higher than that of SWE, which suggests that, compared with SWE, RTE may be more accurate in differentiating benign and malignant thyroid nodules.
Effects of additional data on Bayesian clustering.
Yamazaki, Keisuke
2017-10-01
Hierarchical probabilistic models, such as mixture models, are used for cluster analysis. These models have two types of variables: observable and latent. In cluster analysis, the latent variable is estimated, and it is expected that additional information will improve the accuracy of the estimation of the latent variable. Many proposed learning methods are able to use additional data; these include semi-supervised learning and transfer learning. However, from a statistical point of view, a complex probabilistic model that encompasses both the initial and additional data might be less accurate due to having a higher-dimensional parameter. The present paper presents a theoretical analysis of the accuracy of such a model and clarifies which factor has the greatest effect on its accuracy, the advantages of obtaining additional data, and the disadvantages of increasing the complexity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Statistical Modeling for Radiation Hardness Assurance
NASA Technical Reports Server (NTRS)
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: What models are used, what errors exist in real test data, and what the model allows us to say about the DUT will be discussed. In addition, how to use other sources of data such as historical, heritage, and similar part and how to apply experience, physics, and expert opinion to the analysis will be covered. Also included will be concepts of Bayesian statistics, data fitting, and bounding rates.
ERIC Educational Resources Information Center
Aragón, Sonia; Lapresa, Daniel; Arana, Javier; Anguera, M. Teresa; Garzón, Belén
2017-01-01
Polar coordinate analysis is a powerful data reduction technique based on the Zsum statistic, which is calculated from adjusted residuals obtained by lag sequential analysis. Its use has been greatly simplified since the addition of a module in the free software program HOISAN for performing the necessary computations and producing…
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
NASA Astrophysics Data System (ADS)
Wang, Fang; Liao, Gui-ping; Li, Jian-hui; Zou, Rui-biao; Shi, Wen
2013-03-01
A novel method, which we call the analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic ρAMF-XA, which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine or not. Our analysis finds that both the price and load time series in the California electricity market exhibit a multifractal nature. However, as indicated by the ρAMF-XA statistical test, there is a huge difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market.
Wang, Fang; Liao, Gui-ping; Li, Jian-hui; Zou, Rui-biao; Shi, Wen
2013-03-01
A novel method, which we call the analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic ρAMF-XA, which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine or not. Our analysis finds that both the price and load time series in the California electricity market exhibit a multifractal nature. However, as indicated by the ρAMF-XA statistical test, there is a huge difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market.
Additive scales in degenerative disease--calculation of effect sizes and clinical judgment.
Riepe, Matthias W; Wilkinson, David; Förstl, Hans; Brieden, Andreas
2011-12-16
The therapeutic efficacy of an intervention is often assessed in clinical trials by scales measuring multiple diverse activities that are added to produce a cumulative global score. Medical communities and health care systems subsequently use these data to calculate pooled effect sizes to compare treatments. This is done because major doubt has been cast over the clinical relevance of statistically significant findings relying on p values, which have the potential to report chance findings. Hence, in an aim to overcome this, pooling the results of clinical studies into a meta-analysis with a statistical calculus has been assumed to be a more definitive way of deciding efficacy. We simulate the therapeutic effects as measured with additive scales in patient cohorts with different disease severity and assess the limitations of an effect size calculation of additive scales, which are proven mathematically. We demonstrate that the major problem, which cannot be overcome by current numerical methods, is the complex nature and neurobiological foundation of clinical psychiatric endpoints in particular and additive scales in general. This is particularly relevant for endpoints used in dementia research. 'Cognition' is composed of functions such as memory, attention, orientation and many more. These individual functions decline in varied and non-linear ways. Here we demonstrate that with progressive diseases cumulative values from multidimensional scales are subject to distortion by the limitations of the additive scale. The non-linearity of the decline of function impedes the calculation of effect sizes based on cumulative values from these multidimensional scales. Statistical analysis needs to be guided by boundaries of the biological condition. Alternatively, we suggest a different approach avoiding the error imposed by over-analysis of cumulative global scores from additive scales.
NASA Technical Reports Server (NTRS)
Hill, C. L.
1984-01-01
A computer-implemented classification has been derived from Landsat-4 Thematic Mapper data acquired over Baldwin County, Alabama on January 15, 1983. One set of spectral signatures was developed from the data by utilizing a 3x3 pixel sliding window approach. An analysis of the classification produced from this technique identified forested areas. Additional information regarding only the forested areas was extracted by employing a pixel-by-pixel signature development program which derived spectral statistics only for pixels within the forested land covers. The spectral statistics from both approaches were integrated and the data classified. This classification was evaluated by comparing the spectral classes produced from the data against corresponding ground verification polygons. This iterative data analysis technique resulted in an overall classification accuracy of 88.4 percent correct for slash pine, young pine, loblolly pine, natural pine, and mixed hardwood-pine. An accuracy assessment matrix has been produced for the classification.
NASA Astrophysics Data System (ADS)
Mori, Kaya; Chonko, James C.; Hailey, Charles J.
2005-10-01
We have reanalyzed the 260 ks XMM-Newton observation of 1E 1207.4-5209. There are several significant improvements over previous work. First, a much broader range of physically plausible spectral models was used. Second, we have used a more rigorous statistical analysis. The standard F-distribution was not employed, but rather the exact finite statistics F-distribution was determined by Monte Carlo simulations. This approach was motivated by the recent work of Protassov and coworkers and Freeman and coworkers. They demonstrated that the standard F-distribution is not even asymptotically correct when applied to assess the significance of additional absorption features in a spectrum. With our improved analysis we do not find a third and fourth spectral feature in 1E 1207.4-5209 but only the two broad absorption features previously reported. Two additional statistical tests, one line model dependent and the other line model independent, confirmed our modified F-test analysis. For all physically plausible continuum models in which the weak residuals are strong enough to fit, the residuals occur at the instrument Au M edge. As a sanity check we confirmed that the residuals are consistent in strength and position with the instrument Au M residuals observed in 3C 273.
Vibration Response Models of a Stiffened Aluminum Plate Excited by a Shaker
NASA Technical Reports Server (NTRS)
Cabell, Randolph H.
2008-01-01
Numerical models of structural-acoustic interactions are of interest to aircraft designers and the space program. This paper describes a comparison between two energy finite element codes, a statistical energy analysis code, a structural finite element code, and the experimentally measured response of a stiffened aluminum plate excited by a shaker. Different methods for modeling the stiffeners and the power input from the shaker are discussed. The results show that the energy codes (energy finite element and statistical energy analysis) accurately predicted the measured mean square velocity of the plate. In addition, predictions from an energy finite element code had the best spatial correlation with measured velocities. However, predictions from a considerably simpler, single subsystem, statistical energy analysis model also correlated well with the spatial velocity distribution. The results highlight a need for further work to understand the relationship between modeling assumptions and the prediction results.
From fields to objects: A review of geographic boundary analysis
NASA Astrophysics Data System (ADS)
Jacquez, G. M.; Maruca, S.; Fortin, M.-J.
Geographic boundary analysis is a relatively new approach unfamiliar to many spatial analysts. It is best viewed as a technique for defining objects - geographic boundaries - on spatial fields, and for evaluating the statistical significance of characteristics of those boundary objects. This is accomplished using null spatial models representative of the spatial processes expected in the absence of boundary-generating phenomena. Close ties to the object-field dialectic eminently suit boundary analysis to GIS data. The majority of existing spatial methods are field-based in that they describe, estimate, or predict how attributes (variables defining the field) vary through geographic space. Such methods are appropriate for field representations but not object representations. As the object-field paradigm gains currency in geographic information science, appropriate techniques for the statistical analysis of objects are required. The methods reviewed in this paper are a promising foundation. Geographic boundary analysis is clearly a valuable addition to the spatial statistical toolbox. This paper presents the philosophy of, and motivations for geographic boundary analysis. It defines commonly used statistics for quantifying boundaries and their characteristics, as well as simulation procedures for evaluating their significance. We review applications of these techniques, with the objective of making this promising approach accessible to the GIS-spatial analysis community. We also describe the implementation of these methods within geographic boundary analysis software: GEM.
Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José
2013-11-01
To investigate whether introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20≥ 0.90) to be kept as examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually, but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011:0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter to be evaluated allowing comparison. While individual item performance analysis is worthwhile to undertake as secondary analysis, drawing final conclusions seems to be more difficult. Performance parameters need to be related, as shown by IRT analysis. 
Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. Introduction of negative marking has led to a significant increase in the reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
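For readers unfamiliar with the KR-20 coefficient cited above, a minimal sketch of the standard formula follows. It assumes dichotomous 0/1 item scores and uses the population variance of total scores; it does not model the negative marking scheme described in the abstract, and the function name `kr20` and the sample data are illustrative only.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    responses: list of examinee rows, each a list of 0/1 scores per item.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    # Proportion of examinees answering each item correctly.
    p = [sum(row[i] for row in responses) / n_people for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)

sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(sample)
```

Some texts use the sample variance (divisor n - 1) for the total scores instead, which gives a slightly different value on small data sets.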
Hazing DEOCS 4.1 Construct Validity Summary
2017-08-01
Hazing DEOCS 4.1 Construct Validity Summary DEFENSE EQUAL OPPORTUNITY MANAGEMENT INSTITUTE DIRECTORATE OF...the analysis. Tables 4 – 6 provide additional information regarding the descriptive statistics and reliability of the Hazing items. Table 7 provides
Damron, T A; McBeath, A A
1995-04-01
With the increasing duration of follow up on total knee arthroplasties, more revision arthroplasties are being performed. When revision is not advisable, a salvage procedure such as arthrodesis or resection arthroplasty is indicated. This article provides a comprehensive review of the literature regarding arthrodesis following failed total knee arthroplasty. In addition, a statistical meta-analysis of five studies using modern arthrodesis techniques is presented. A statistically significant greater fusion rate with intramedullary nail arthrodesis compared to external fixation is documented. Gram negative and mixed infections are found to be significant risk factors for failure of arthrodesis.
Evidence-Based Medicine as a Tool for Undergraduate Probability and Statistics Education
Masel, J.; Humphrey, P. T.; Blackburn, B.; Levine, J. A.
2015-01-01
Most students have difficulty reasoning about chance events, and misconceptions regarding probability can persist or even strengthen following traditional instruction. Many biostatistics classes sidestep this problem by prioritizing exploratory data analysis over probability. However, probability itself, in addition to statistics, is essential both to the biology curriculum and to informed decision making in daily life. One area in which probability is particularly important is medicine. Given the preponderance of pre health students, in addition to more general interest in medicine, we capitalized on students’ intrinsic motivation in this area to teach both probability and statistics. We use the randomized controlled trial as the centerpiece of the course, because it exemplifies the most salient features of the scientific method, and the application of critical thinking to medicine. The other two pillars of the course are biomedical applications of Bayes’ theorem and science and society content. Backward design from these three overarching aims was used to select appropriate probability and statistics content, with a focus on eliciting and countering previously documented misconceptions in their medical context. Pretest/posttest assessments using the Quantitative Reasoning Quotient and Attitudes Toward Statistics instruments are positive, bucking several negative trends previously reported in statistics education. PMID:26582236
Rare-Variant Association Analysis: Study Designs and Statistical Tests
Lee, Seunggeung; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong
2014-01-01
Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions. PMID:24995866
Egbewale, Bolaji E; Lewis, Martyn; Sim, Julius
2014-04-09
Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. 126 hypothetical trial scenarios were evaluated (126,000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power.
2014-01-01
Background Analysis of variance (ANOVA), change-score analysis (CSA) and analysis of covariance (ANCOVA) respond differently to baseline imbalance in randomized controlled trials. However, no empirical studies appear to have quantified the differential bias and precision of estimates derived from these methods of analysis, and their relative statistical power, in relation to combinations of levels of key trial characteristics. This simulation study therefore examined the relative bias, precision and statistical power of these three analyses using simulated trial data. Methods 126 hypothetical trial scenarios were evaluated (126 000 datasets), each with continuous data simulated by using a combination of levels of: treatment effect; pretest-posttest correlation; direction and magnitude of baseline imbalance. The bias, precision and power of each method of analysis were calculated for each scenario. Results Compared to the unbiased estimates produced by ANCOVA, both ANOVA and CSA are subject to bias, in relation to pretest-posttest correlation and the direction of baseline imbalance. Additionally, ANOVA and CSA are less precise than ANCOVA, especially when pretest-posttest correlation ≥ 0.3. When groups are balanced at baseline, ANCOVA is at least as powerful as the other analyses. Apparently greater power of ANOVA and CSA at certain imbalances is achieved in respect of a biased treatment effect. Conclusions Across a range of correlations between pre- and post-treatment scores and at varying levels and direction of baseline imbalance, ANCOVA remains the optimum statistical method for the analysis of continuous outcomes in RCTs, in terms of bias, precision and statistical power. PMID:24712304
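The contrast between the three estimators under baseline imbalance can be seen in a small noise-free example. This is an illustrative sketch rather than the simulation design of the study: the data are hypothetical, the function name `treatment_estimates` is an assumption, and ANCOVA is computed in its closed form as the post-test mean difference adjusted by the pooled within-group slope.

```python
def mean(values):
    return sum(values) / len(values)

def cross_dev(x, y):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def treatment_estimates(pre_c, post_c, pre_t, post_t):
    """Return (ANOVA, change-score, ANCOVA) estimates of the treatment effect."""
    anova = mean(post_t) - mean(post_c)
    csa = (mean(post_t) - mean(pre_t)) - (mean(post_c) - mean(pre_c))
    # Pooled within-group regression slope of post-test on pre-test.
    slope = (cross_dev(pre_c, post_c) + cross_dev(pre_t, post_t)) / (
        cross_dev(pre_c, pre_c) + cross_dev(pre_t, pre_t)
    )
    ancova = anova - slope * (mean(pre_t) - mean(pre_c))
    return anova, csa, ancova

# Hypothetical data: post = 0.5 * pre (+ 3 in the treated arm),
# with the treated arm starting 2 points higher at baseline.
pre_control, pre_treated = [8, 10, 12], [10, 12, 14]
post_control = [0.5 * x for x in pre_control]
post_treated = [0.5 * x + 3 for x in pre_treated]
anova_est, csa_est, ancova_est = treatment_estimates(
    pre_control, post_control, pre_treated, post_treated
)
```

With a true effect of 3, ANOVA overshoots (4) and the change-score analysis undershoots (2), while the adjusted estimate recovers 3. The direction of each bias depends on the sign of the imbalance and on the pre-post slope.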
A statistical analysis of the impact of advertising signs on road safety.
Yannis, George; Papadimitriou, Eleonora; Papantoniou, Panagiotis; Voulgari, Chrisoula
2013-01-01
This research aims to investigate the impact of advertising signs on road safety. An exhaustive review of international literature was carried out on the effect of advertising signs on driver behaviour and safety. Moreover, a before-and-after statistical analysis with control groups was applied on several road sites with different characteristics in the Athens metropolitan area, in Greece, in order to investigate the correlation between the placement or removal of advertising signs and the related occurrence of road accidents. Road accident data for the 'before' and 'after' periods on the test sites and the control sites were extracted from the database of the Hellenic Statistical Authority, and the selected 'before' and 'after' periods vary from 2.5 to 6 years. The statistical analysis shows no statistical correlation between road accidents and advertising signs at any of the nine sites examined, as the estimated safety effects are non-significant at the 95% confidence level. This can be explained by the fact that, in the examined road sites, drivers are overloaded with information (traffic signs, direction signs, labels of shops, pedestrians and other vehicles, etc.) so that the additional information load from advertising signs may not further distract them.
Analysis techniques for residual acceleration data
NASA Technical Reports Server (NTRS)
Rogers, Melissa J. B.; Alexander, J. Iwan D.; Snyder, Robert S.
1990-01-01
Various aspects of residual acceleration data are of interest to low-gravity experimenters. Maximum and mean values and various other statistics can be obtained from data as collected in the time domain. Additional information may be obtained through manipulation of the data. Fourier analysis is discussed as a means of obtaining information about dominant frequency components of a given data window. Transformation of data into different coordinate axes is useful in the analysis of experiments with different orientations and can be achieved by the use of a transformation matrix. Application of such analysis techniques to residual acceleration data provides more information than a time history alone and increases the effectiveness of post-flight analysis of low-gravity experiments.
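Both techniques named in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions: a naive discrete Fourier transform is used for clarity (a real analysis would use an FFT and windowing), the signal is synthetic, and the function names `dominant_frequency` and `transform` are hypothetical.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the largest non-DC DFT component."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate_hz / n

def transform(matrix, vector):
    """Apply a 3x3 transformation matrix to an acceleration vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Synthetic one-second window: a 5 Hz oscillation sampled at 64 Hz.
window = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
peak_hz = dominant_frequency(window, 64.0)

# Rotation by 90 degrees about the z axis, written with exact entries.
rotation_z90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
rotated = transform(rotation_z90, [1, 0, 0])
```

The frequency resolution of the result is the sample rate divided by the window length, so longer data windows separate closely spaced components more clearly.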
Sunspot activity and influenza pandemics: a statistical assessment of the purported association.
Towers, S
2017-10-01
Since 1978, a series of papers in the literature have claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses, and attempts to recreate the three most recent statistical analyses by Ertel (1994), Tapping et al. (2001), and Yeung (2006), which all have purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, in each analysis arbitrary selections or assumptions were also made, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods, rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. However, while the focus in this particular analysis was on the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls; inattention to analysis reproducibility and robustness assessment are common problems in the sciences, that are unfortunately not noted often enough in review.
Additional Support for the Information Systems Analyst Exam as a Valid Program Assessment Tool
ERIC Educational Resources Information Center
Carpenter, Donald A.; Snyder, Johnny; Slauson, Gayla Jo; Bridge, Morgan K.
2011-01-01
This paper presents a statistical analysis to support the notion that the Information Systems Analyst (ISA) exam can be used as a program assessment tool in addition to measuring student performance. It compares ISA exam scores earned by students in one particular Computer Information Systems program with scores earned by the same students on the…
Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A
2009-10-01
Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: Selecting the correct statistical test and data mining method depends highly on the measurement scale of data, type of variables, and purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We have presented two clustering examples with ordinal variables, the more challenging variable type in analysis, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: The sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: By using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan
2017-10-01
This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as feature selections for selecting and choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper will also discuss the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using the parametric statistical significance tests. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. In addition, the experimental results have proven that the ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain quality, optimal feature subset that can represent the actual data in customer review data.
Crop Identification Technology Assessment for Remote Sensing (CITARS)
NASA Technical Reports Server (NTRS)
Bauer, M. E.; Cary, T. K.; Davis, B. J.; Swain, P. H.
1975-01-01
The results of classifications and experiments performed for the Crop Identification Technology Assessment for Remote Sensing (CITARS) project are summarized. Fifteen data sets were classified using two analysis procedures. One procedure used class weights while the other assumed equal probabilities of occurrence for all classes. In addition, 20 data sets were classified using training statistics from another segment or date. The results of both the local and non-local classifications in terms of classification and proportion estimation are presented. Several additional experiments are described which were performed to provide additional understanding of the CITARS results. These experiments investigated alternative analysis procedures, training set selection and size, effects of multitemporal registration, the spectral discriminability of corn, soybeans, and other, and analysis of aircraft multispectral data.
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.
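The distance-based approach described at the start of the abstract can be sketched as follows. This is a simplified illustration, assuming Euclidean distance as the dissimilarity measure and the common convention of regressing dissimilarity on the square root of the time lag; the abstract does not state which measure the study used, and the function name `time_lag_slope` and the abundance data are hypothetical.

```python
from math import sqrt

def time_lag_slope(series):
    """Slope of community dissimilarity regressed on sqrt(time lag).

    series: list of abundance vectors, one per equally spaced sampling date.
    """
    lags, distances = [], []
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            lags.append(sqrt(j - i))
            distances.append(
                sqrt(sum((a - b) ** 2 for a, b in zip(series[i], series[j])))
            )
    mean_lag = sum(lags) / len(lags)
    mean_dist = sum(distances) / len(distances)
    # Ordinary least-squares slope.
    numerator = sum((l - mean_lag) * (d - mean_dist) for l, d in zip(lags, distances))
    denominator = sum((l - mean_lag) ** 2 for l in lags)
    return numerator / denominator

# Hypothetical two-species communities sampled on four dates.
directional = [[10, 0], [8, 2], [6, 4], [4, 6]]
static = [[5, 5], [5, 5], [5, 5], [5, 5]]
slope_directional = time_lag_slope(directional)
slope_static = time_lag_slope(static)
```

A positive slope indicates directional change, while a slope near zero indicates a stable or purely fluctuating community. Because the pairwise distances share sampling dates, they are not independent, which is one reason ordinary regression p-values are unreliable here.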
Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard
2013-01-01
Purpose: With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. Methods: A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. Results: The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. Conclusions: The work demonstrates the viability of the design approach and the software tool for analysis of large data sets. PMID:24320426
Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard
2013-11-01
With emergence of clinical outcomes databases as tools utilized routinely within institutions, comes need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code was evaluated using benchmark data sets. The approach provides data needed to evaluate combinations of statistical measurements for ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operator characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. The work demonstrates the viability of the design approach and the software tool for analysis of large data sets.
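Two of the building blocks named in the abstract, a threshold chosen from a receiver operating characteristic curve and a Fisher exact test on the resulting contingency table, can be sketched in the standard library. This is not the authors' C#/R implementation: choosing the threshold by Youden's J statistic is an assumption, as the abstract does not say how the cut-point is selected, and the function names and dose data are hypothetical.

```python
from math import comb

def youden_threshold(values, outcomes):
    """Cut-point c maximising sensitivity + specificity - 1 for the rule value >= c."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, o in zip(values, outcomes) if v >= cut and o)
        false_pos = sum(1 for v, o in zip(values, outcomes) if v >= cut and not o)
        j = true_pos / n_pos - false_pos / n_neg
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme as the observed one.
    return sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )

doses = [10, 20, 30, 40, 50, 60]
events = [0, 0, 0, 1, 1, 1]
cut = youden_threshold(doses, events)
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

In a screening setting, the threshold step would be run per dose metric and the exact test applied to the table of events above and below the cut-point, with the other tests listed in the abstract acting as further filters.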
MAGMA: Generalized Gene-Set Analysis of GWAS Data
de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle
2015-01-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
MAGMA: generalized gene-set analysis of GWAS data.
de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle
2015-04-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
Re-Analysis Report: Daylighting in Schools, Additional Analysis. Tasks 2.2.1 through 2.2.5.
ERIC Educational Resources Information Center
Heschong, Lisa; Elzeyadi, Ihab; Knecht, Carey
This study expands and validates previous research that found a statistical correlation between the amount of daylight in elementary school classrooms and the performance of students on standardized math and reading tests. The researchers reanalyzed the 1997-1998 school year student performance data from the Capistrano Unified School District…
Modeling Longitudinal Data with Generalized Additive Models: Applications to Single-Case Designs
ERIC Educational Resources Information Center
Sullivan, Kristynn J.; Shadish, William R.
2013-01-01
Single case designs (SCDs) are short time series that assess intervention effects by measuring units repeatedly over time both in the presence and absence of treatment. For a variety of reasons, interest in the statistical analysis and meta-analysis of these designs has been growing in recent years. This paper proposes modeling SCD data with…
Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan
2015-01-08
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L
2014-01-01
We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.
Meta-analysis inside and outside particle physics: two traditions that should converge?
Baker, Rose D; Jackson, Dan
2013-06-01
The use of meta-analysis in medicine and epidemiology really took off in the 1970s. However, in high-energy physics, the Particle Data Group has been carrying out meta-analyses of measurements of particle masses and other properties since 1957. Curiously, there has been virtually no interaction between those working inside and outside particle physics. In this paper, we use statistical models to study two major differences in practice. The first is the usefulness of systematic errors, which physicists are now beginning to quote in addition to statistical errors. The second is whether it is better to treat heterogeneity by scaling up errors as do the Particle Data Group or by adding a random effect as does the rest of the community. Besides fitting models, we derive and use an exact test of the error-scaling hypothesis. We also discuss the other methodological differences between the two streams of meta-analysis. Our conclusion is that systematic errors are not currently very useful and that the conventional random effects model, as routinely used in meta-analysis, has a useful role to play in particle physics. The moral we draw for statisticians is that we should be more willing to explore 'grassroots' areas of statistical application, so that good statistical practice can flow both from and back to the statistical mainstream. Copyright © 2012 John Wiley & Sons, Ltd.
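The two treatments of heterogeneity contrasted in this abstract can be sketched side by side. The following is a minimal illustration, not the authors' code: it assumes the scale factor S = sqrt(Q / (N - 1)) applied only when S > 1, and it uses the DerSimonian-Laird moment estimator for the random-effects variance, which is one common choice and may differ from the models fitted in the paper. All function names are hypothetical.

```python
import math

def fixed_effect(values, sigmas):
    """Inverse-variance weighted mean, its standard error, and Cochran's Q."""
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    q = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return mean, se, q

def scaled_error(values, sigmas):
    """Error scaling: inflate the standard error by S = sqrt(Q / (N - 1)) if S > 1."""
    mean, se, q = fixed_effect(values, sigmas)
    scale = max(1.0, math.sqrt(q / (len(values) - 1)))
    return mean, se * scale, scale

def random_effects(values, sigmas):
    """Random-effects mean using the DerSimonian-Laird estimate of tau^2 (assumed estimator)."""
    _, _, q = fixed_effect(values, sigmas)
    w = [1.0 / s ** 2 for s in sigmas]
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(values) - 1)) / c)
    # Each study's variance is increased by the between-study variance tau^2.
    w_star = [1.0 / (s ** 2 + tau2) for s in sigmas]
    mean = sum(wi * x for wi, x in zip(w_star, values)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, se, tau2
```

When all studies report the same standard error, the two approaches give the same pooled mean and the same inflated standard error; they diverge once the reported errors differ, because error scaling keeps the original relative weights while the random effect flattens them.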
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
Comparative analysis on the selection of number of clusters in community detection
NASA Astrophysics Data System (ADS)
Kawamoto, Tatsuro; Kabashima, Yoshiyuki
2018-02-01
We conduct a comparative analysis on various estimates of the number of clusters in community detection. An exhaustive comparison requires testing of all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, map equation, Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendency of overfit and underfit that the assessment criteria and algorithms have becomes apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful to determine the number of clusters.
Statistics for demodulation RFI in inverting operational amplifier circuits
NASA Astrophysics Data System (ADS)
Sutu, Y.-H.; Whalen, J. J.
An investigation was conducted with the objective to determine statistical variations for RFI demodulation responses in operational amplifier (op amp) circuits. Attention is given to the experimental procedures employed, a three-stage op amp LED experiment, NCAP (Nonlinear Circuit Analysis Program) simulations of demodulation RFI in 741 op amps, and a comparison of RFI in four op amp types. Three major recommendations for future investigations are presented on the basis of the obtained results. One is concerned with the conduction of additional measurements of demodulation RFI in inverting amplifiers, while another suggests the employment of an automatic measurement system. It is also proposed to conduct additional NCAP simulations in which parasitic effects are accounted for more thoroughly.
Abboud, R; Issa, H; Abed-Allah, Y D; Bakraji, E H
2015-11-01
Statistical analysis based on chemical composition, determined by radioisotope X-ray fluorescence, has been applied to 39 ancient pottery fragments from the excavation at the Tell Al-Kasra archaeological site, Syria. Three groups were defined by applying cluster and factor analysis. Thermoluminescence (TL) dating was carried out on three sherds taken from the bathroom (hammam) on the site. The multiple aliquot additive dose (MAAD) method was used to estimate the paleodose, and gamma spectrometry was used to estimate the dose rate. The average age was found to be 715±36 years. Copyright © 2015 Elsevier Ltd. All rights reserved.
Hu, Xiangdong; Liu, Yujiang; Qian, Linxue
2017-01-01
Background: Real-time elastography (RTE) and shear wave elastography (SWE) are noninvasive and easily available imaging techniques that measure tissue strain, and elastography has been reported to have better sensitivity and specificity than conventional technologies in differentiating between benign and malignant thyroid nodules. Methods: Relevant articles were searched in multiple databases; the comparison of elasticity index (EI) was conducted with Review Manager 5.0. Forest plots of the sensitivity and specificity and the SROC curves of RTE and SWE were produced with STATA 10.0 software. In addition, sensitivity analysis and bias analysis of the studies were conducted to examine the quality of the articles, and a funnel plot and the Egger test were used to estimate possible publication bias. Results: Twenty-two articles that satisfied the inclusion criteria were included in this study. After exclusions, there were 2106 benign and 613 malignant nodules. The meta-analysis suggested that the difference in EI between benign and malignant nodules was statistically significant (SMD = 2.11, 95% CI [1.67, 2.55], P < .00001). The overall sensitivities of RTE and SWE were roughly comparable, whereas the difference in specificities between these 2 methods was statistically significant. In addition, a statistically significant difference in AUC was observed between RTE and SWE (P < .01). Conclusion: The specificity of RTE was statistically higher than that of SWE, which suggests that, compared with SWE, RTE may be more accurate in differentiating benign and malignant thyroid nodules. PMID:29068996
A novel bi-level meta-analysis approach: applied to biological pathway analysis.
Nguyen, Tin; Tagett, Rebecca; Donato, Michele; Mitrea, Cristina; Draghici, Sorin
2016-02-01
The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenges in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework versus classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as against a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases, acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors to correctly identify pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. The R scripts are available on demand from the authors. Contact: sorin@wayne.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.
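The classical P-value combination rules named in the preceding abstract (Fisher's, Stouffer's, and the additive method) can be sketched as follows. This is a minimal illustration of the textbook single-level formulas, not the bi-level framework itself; the function names are hypothetical, and the additive method is shown with a normal approximation from the Central Limit Theorem, which is an assumption about how the approximation is applied. Inputs are assumed to be independent one-sided P-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist

def fisher(pvalues):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k degrees of freedom."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # half of the test statistic
    # Closed-form chi-square survival function, valid for even degrees of freedom 2k.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

def stouffer(pvalues):
    """Stouffer's method: average the normal quantiles of the P-values."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - nd.cdf(z)

def additive_clt(pvalues):
    """Additive method with a CLT approximation.

    Under the null, the sum of k Uniform(0, 1) P-values has mean k/2 and
    variance k/12, so the standardized sum is approximately standard normal.
    """
    k = len(pvalues)
    z = (sum(pvalues) - k / 2.0) / math.sqrt(k / 12.0)
    return NormalDist().cdf(z)
```

With a single input, Fisher's and Stouffer's rules return that P-value unchanged, which is a convenient sanity check; the normal approximation in `additive_clt` is only reasonable for larger k.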
Statistical process management: An essential element of quality improvement
NASA Astrophysics Data System (ADS)
Buckner, M. R.
Successful quality improvement requires a balanced program involving the three elements that control quality: organization, people and technology. The focus of the SPC/SPM User's Group is to advance the technology component of Total Quality by networking within the Group and by providing an outreach within Westinghouse to foster the appropriate use of statistical techniques to achieve Total Quality. SPM encompasses the disciplines by which a process is measured against its intrinsic design capability, in the face of measurement noise and other obscuring variability. SPM tools facilitate decisions about the process that generated the data. SPM deals typically with manufacturing processes, but with some flexibility of definition and technique it accommodates many administrative processes as well. The techniques of SPM are those of Statistical Process Control, Statistical Quality Control, Measurement Control, and Experimental Design. In addition, techniques such as job and task analysis and concurrent engineering are important elements of systematic planning and analysis that are needed early in the design process to ensure success. The SPC/SPM User's Group is endeavoring to achieve its objectives by sharing successes that have occurred within the members' own Westinghouse departments as well as within other US and foreign industry. In addition, failures are reviewed to establish lessons learned in order to improve future applications. In broader terms, the Group is interested in making SPM the accepted way of doing business within Westinghouse.
Tips and Tricks for Successful Application of Statistical Methods to Biological Data.
Schlenker, Evelyn
2016-01-01
This chapter discusses experimental design and the use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
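The remark that relative risks and odds ratios both start from a 2 × 2 table but are calculated differently can be made concrete with a short sketch. The cell labels a, b, c, d and the function names below are illustrative conventions, not taken from the chapter.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    return (a * d) / (b * c)
```

For a table with a = 20, b = 80, c = 10, d = 90, the relative risk is 2.0 while the odds ratio is 2.25; the two measures approach each other only when the outcome is rare in both groups.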
Mediation analysis in nursing research: a methodological review.
Liu, Jianghong; Ulrich, Connie
2016-12-01
Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask - and answer - more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science.
Hollunder, Jens; Friedel, Maik; Kuiper, Martin; Wilhelm, Thomas
2010-04-01
Many large 'omics' datasets have been published and many more are expected in the near future. New analysis methods are needed for best exploitation. We have developed a graphical user interface (GUI) for easy data analysis. Our discovery of all significant substructures (DASS) approach elucidates the underlying modularity, a typical feature of complex biological data. It is related to biclustering and other data mining approaches. Importantly, DASS-GUI also allows handling of multi-sets and calculation of statistical significances. DASS-GUI contains tools for further analysis of the identified patterns: analysis of the pattern hierarchy, enrichment analysis, module validation, analysis of additional numerical data, easy handling of synonymous names, clustering, filtering and merging. Different export options allow easy usage of additional tools such as Cytoscape. Source code, pre-compiled binaries for different systems, a comprehensive tutorial, case studies and many additional datasets are freely available at http://www.ifr.ac.uk/dass/gui/. DASS-GUI is implemented in Qt.
The Practicality of Statistical Physics Handout Based on KKNI and the Constructivist Approach
NASA Astrophysics Data System (ADS)
Sari, S. Y.; Afrizon, R.
2018-04-01
Statistical physics lectures show that: 1) the performance of lecturers, the social climate, students' competence, and the soft skills needed at work all fall in the "sufficient" category; 2) students find the lectures difficult to follow because the material is abstract; 3) 40.72% of students need further support in the form of repetition, practice questions, and structured tasks; and 4) the depth of the statistical physics material needs to be improved gradually and in a structured way. This indicates that learning materials aligned with the Indonesian National Qualification Framework, or Kerangka Kualifikasi Nasional Indonesia (KKNI), together with an appropriate learning approach, are needed to help lecturers and students. The authors have designed statistical physics handouts that meet the "very valid" criterion (90.89%) according to expert judgment. The practicality of the handouts also needs to be considered so that they are easy to use, interesting, and efficient in lectures. The purpose of this research is to determine the practicality of statistical physics handouts based on the KKNI and a constructivist approach. This work is part of a research and development project using the 4-D model developed by Thiagarajan and has reached the development-testing part of the Develop stage. Data were collected using a questionnaire distributed to lecturers and students and analyzed with descriptive techniques expressed as percentages. The questionnaire analysis shows that the handouts meet the "very practical" criterion. The study concludes that statistical physics handouts based on the KKNI and a constructivist approach are practical for use in lectures.
Traumatic injury among drywall installers, 1992 to 1995.
Chiou, S S; Pan, C S; Keane, P
2000-11-01
This study examined the traumatic-injury characteristics associated with one of the high-risk occupations in the construction industry--drywall installers--through an analysis of the traumatic-injury data obtained from the Bureau of Labor Statistics. An additional objective was to demonstrate a feasible and economic approach to identify risk factors associated with a specific occupation by using an existing database. An analysis of nonfatal traumatic injuries with days away from work among wage-and-salary drywall installers was performed for 1992 through 1995 using the Occupational Injury and Illness Survey conducted by the Bureau of Labor Statistics. Results from this study indicate that drywall installers are at a high risk of overexertion and falls to a lower level. More than 40% of the injured drywall installers suffered sprains, strains, and/or tears. The most frequently injured body part was the trunk. More than one-third of the trunk injuries occurred while handling solid building materials, mainly drywall. In addition, the database analysis used in this study is valid in identifying overall risk factors for specific occupations.
ERIC Educational Resources Information Center
Creagh, Sue
2016-01-01
This article presents a Foucauldian analysis of the political rationalities of national testing and accountability practices in Australia, and their inconsistencies for students for whom English is a second or additional language. It focuses on a problem associated with the statistical data category "Language Background Other Than…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robinson, D.G.; Eubanks, L.
1998-03-01
This software assists the engineering designer in characterizing the statistical uncertainty in the performance of complex systems as a result of variations in manufacturing processes, material properties, system geometry or operating environment. The software is composed of a graphical user interface that provides the user with easy access to Cassandra uncertainty analysis routines. Together this interface and the Cassandra routines are referred to as CRAX (CassandRA eXoskeleton). The software is flexible enough that, with minor modification, it can interface with large modeling and analysis codes such as heat transfer or finite element analysis software. The current version permits the user to manually input a performance function, the number of random variables and their associated statistical characteristics: density function, mean, coefficients of variation. Additional uncertainty analysis modules are continuously being added to the Cassandra core.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robinson, David G.
1999-02-20
This software assists the engineering designer in characterizing the statistical uncertainty in the performance of complex systems as a result of variations in manufacturing processes, material properties, system geometry or operating environment. The software is composed of a graphical user interface that provides the user with easy access to Cassandra uncertainty analysis routines. Together this interface and the Cassandra routines are referred to as CRAX (CassandRA eXoskeleton). The software is flexible enough that, with minor modification, it can interface with large modeling and analysis codes such as heat transfer or finite element analysis software. The current version permits the user to manually input a performance function, the number of random variables and their associated statistical characteristics: density function, mean, coefficients of variation. Additional uncertainty analysis modules are continuously being added to the Cassandra core.
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2012-11-01
Formosat-2 imagery is high-spatial-resolution (2 m GSD) remote sensing satellite data comprising one panchromatic band and four multispectral bands (blue, green, red, near-infrared). An essential step in the daily processing of received Formosat-2 images is to estimate the cloud statistic of each image using an Automatic Cloud Coverage Assessment (ACCA) algorithm. The cloud statistic is subsequently recorded as important metadata for the image product catalog. In this paper, we propose an ACCA method with two consecutive stages: pre-processing and post-processing analysis. For pre-processing analysis, unsupervised K-means classification, Sobel's method, a thresholding method, re-examination of non-cloudy pixels, and a cross-band filter method are implemented in sequence to determine the cloud statistic. For post-processing analysis, the box-counting fractal method is implemented. In other words, the cloud statistic is first determined via pre-processing analysis, and the correctness of the cloud statistic for the different spectral bands is then cross-examined qualitatively and quantitatively via post-processing analysis. The selection of an appropriate thresholding method is critical to the result of the ACCA method. Therefore, in this work, we first conduct a series of experiments comparing the performance of clustering-based and spatial thresholding methods, including Otsu's, Local Entropy (LE), Joint Entropy (JE), Global Entropy (GE), and Global Relative Entropy (GRE) methods. The results show that Otsu's and GE methods both perform better than the others for Formosat-2 images. Additionally, our proposed ACCA method, with Otsu's method selected as the thresholding method, successfully extracted the cloudy pixels of Formosat-2 images for accurate cloud statistic estimation.
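Otsu's method, which this abstract selects as the thresholding step, picks the gray level that maximizes the between-class variance of the histogram. The sketch below illustrates only that single step on a flat list of integer pixel values; it is not the ACCA pipeline, and the function names, the 256-level assumption, and the convention that brighter pixels are cloud are all assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class; pixels > t form the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    best_t, best_score = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (the constant 1/total^2 is dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t

def cloud_fraction(pixels, levels=256):
    """Fraction of pixels brighter than the Otsu threshold (assumed to be cloud)."""
    t = otsu_threshold(pixels, levels)
    return sum(1 for p in pixels if p > t) / len(pixels)
```

On a strongly bimodal band the threshold falls between the two modes, so the returned fraction approximates the share of bright pixels; real scenes with snow, haze, or bright ground would need the additional steps the abstract lists.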
1985-06-01
Naval Postgraduate School, Monterey, California 93943... determine the socioeconomic representativeness of the Army’s enlistees in that particular year. In addition, the socioeconomic overview of Republic of... accomplished with the use of the Statistical Analysis System (SAS), an integrated computer system for data analysis. TABLE 2 The States in Each District
GIS and statistical analysis for landslide susceptibility mapping in the Daunia area, Italy
NASA Astrophysics Data System (ADS)
Mancini, F.; Ceppi, C.; Ritrovato, G.
2010-09-01
This study focuses on landslide susceptibility mapping in the Daunia area (Apulian Apennines, Italy) and achieves this by using a multivariate statistical method and data processing in a Geographical Information System (GIS). The Logistic Regression (hereafter LR) method was chosen to produce a susceptibility map over an area of 130 000 ha where small settlements are historically threatened by landslide phenomena. By means of LR analysis, the tendency to landslide occurrences was, therefore, assessed by relating a landslide inventory (dependent variable) to a series of causal factors (independent variables) which were managed in the GIS, while the statistical analyses were performed by means of the SPSS (Statistical Package for the Social Sciences) software. The LR analysis produced a reliable susceptibility map of the investigated area and the probability level of landslide occurrence was ranked in four classes. The overall performance achieved by the LR analysis was assessed by local comparison between the expected susceptibility and an independent dataset extrapolated from the landslide inventory. Of the samples classified as susceptible to landslide occurrences, 85% correspond to areas where landslide phenomena have actually occurred. In addition, the consideration of the regression coefficients provided by the analysis demonstrated that a major role is played by the "land cover" and "lithology" causal factors in determining the occurrence and distribution of landslide phenomena in the Apulian Apennines.
Optical Logarithmic Transformation of Speckle Images with Bacteriorhodopsin Films
NASA Technical Reports Server (NTRS)
Downie, John D.
1995-01-01
The application of logarithmic transformations to speckle images is sometimes desirable in converting the speckle noise distribution into an additive, constant-variance noise distribution. The optical transmission properties of some bacteriorhodopsin films are well suited to implement such a transformation optically in a parallel fashion. I present experimental results of the optical conversion of a speckle image into a transformed image with signal-independent noise statistics, using the real-time photochromic properties of bacteriorhodopsin. The original and transformed noise statistics are confirmed by histogram analysis.
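The reason a logarithm turns speckle into additive, constant-variance noise can be checked numerically. The sketch below assumes the standard model of fully developed speckle, in which the observed intensity is the true intensity multiplied by unit-mean exponential noise; that model, the function name, and the sample size are assumptions for illustration and say nothing about the optical implementation in bacteriorhodopsin films.

```python
import math
import random
import statistics

def noise_spread(mean_intensity, n=20000, seed=0):
    """Return (std of raw intensity, std of log intensity) for simulated speckle."""
    rng = random.Random(seed)
    # Multiplicative model: observed = true intensity * Exp(1) noise.
    raw = [mean_intensity * rng.expovariate(1.0) for _ in range(n)]
    # log(observed) = log(true intensity) + log(noise): the noise term is now additive.
    logged = [math.log(x) for x in raw if x > 0.0]
    return statistics.pstdev(raw), statistics.pstdev(logged)
```

Calling `noise_spread(1.0)` and `noise_spread(10.0)` shows the raw standard deviation growing in proportion to the intensity, while the standard deviation after the logarithm stays the same for both, close to the theoretical value of pi / sqrt(6) for the log of an exponential variable.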
Laser Velocimeter Measurements and Analysis in Turbulent Flows with Combustion. Part 2.
1983-07-01
sampling error for this sample size. Mean velocities and turbulence intensities were found to be statistically accurate to ±1% and 13%, respectively... Although the statistical error was found to be rather small (±1% for mean velocities and 13% for turbulence intensities), there can be additional... "Computational and Experimental Study of a Captive Annular Eddy," Journal of Fluid Mechanics, Vol. 28, pt. 1, pp. 43-63, 12 April, 1967.
ERIC Educational Resources Information Center
Joyce, Theodore
1990-01-01
Analyzes the incidence of low birth weight in New York City using monthly time-series statistical data from 1968 through 1988. Finds that a downward trend before 1984 for both Blacks and Whites has reversed, with 3,110 additional low birth weight births to Blacks and 1,385 additional low birth weight births to Whites over the numbers expected.…
Quantitative trait nucleotide analysis using Bayesian model selection.
Blangero, John; Goring, Harald H H; Kent, Jack W; Williams, Jeff T; Peterson, Charles P; Almasy, Laura; Dyer, Thomas D
2005-10-01
Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.
Censored data treatment using additional information in intelligent medical systems
NASA Astrophysics Data System (ADS)
Zenkova, Z. N.
2015-11-01
Statistical procedures are an important part of modern intelligent medical systems. They are used for processing, mining and analysis of different types of data about patients and their diseases, and they help in making various decisions regarding diagnosis, treatment, medication or surgery. In many cases the data can be censored or incomplete. It is well known that censoring considerably reduces the efficiency of statistical procedures. In this paper the author briefly reviews approaches that improve the procedures by using additional information, and describes a modified estimator of an unknown cumulative distribution function that incorporates additional information about a quantile which is known exactly. The additional information is used by projecting a classical estimator onto a set of estimators with certain properties. The Kaplan-Meier estimator is taken as the estimator of the unknown cumulative distribution function, and the properties of the modified estimator are investigated by simulation for the case of single right censoring.
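As an illustration of the classical starting point named in this abstract, the following is a minimal sketch of the Kaplan-Meier product-limit estimator for right-censored data. It does not implement the author's quantile-based modification; the function name `kaplan_meier` and its argument layout are assumptions made for this example. The estimated cumulative distribution function is one minus the returned survival value.

```python
from collections import Counter


def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct event time.

    times    -- follow-up time for each subject
    observed -- True if the event was seen, False if right-censored
    (Hypothetical helper for illustration only.)
    """
    events = Counter(t for t, seen in zip(times, observed) if seen)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone with time t (event or censored) leaves the risk set.
        at_risk -= sum(1 for x in times if x == t)
    return curve


curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, True, False])
```

Subjects censored at a given time are counted as still at risk for events at that same time, which is the usual tie convention.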
Kaneta, Tomohiro; Nakatsuka, Masahiro; Nakamura, Kei; Seki, Takashi; Yamaguchi, Satoshi; Tsuboi, Masahiro; Meguro, Kenichi
2016-01-01
SPECT is an important diagnostic tool for dementia. Recently, statistical analysis of SPECT has been commonly used for dementia research. In this study, we evaluated the accuracy of visual SPECT evaluation and/or statistical analysis for the diagnosis (Dx) of Alzheimer disease (AD) and other forms of dementia in our community-based study "The Osaki-Tajiri Project." Eighty-nine consecutive outpatients with dementia were enrolled and underwent brain perfusion SPECT with 99mTc-ECD. Diagnostic accuracy of SPECT was tested using 3 methods: visual inspection (SPECT Dx), automated diagnostic tool using statistical analysis with easy Z-score imaging system (eZIS Dx), and visual inspection plus eZIS (integrated Dx). Integrated Dx showed the highest sensitivity, specificity, and accuracy, whereas eZIS was the second most accurate method. We also observed that a higher than expected rate of SPECT images indicated false-negative cases of AD. Among these, 50% showed hypofrontality and were diagnosed as frontotemporal lobar degeneration. These cases typically showed regional "hot spots" in the primary sensorimotor cortex (ie, a sensorimotor hot spot sign), which we determined were associated with AD rather than frontotemporal lobar degeneration. We concluded that the diagnostic abilities were improved by the integrated use of visual assessment and statistical analysis. In addition, the detection of a sensorimotor hot spot sign was useful to detect AD when hypofrontality is present and improved the ability to properly diagnose AD.
Spontaneous collective synchronization in the Kuramoto model with additional non-local interactions
NASA Astrophysics Data System (ADS)
Gupta, Shamik
2017-10-01
In the context of the celebrated Kuramoto model of globally-coupled phase oscillators of distributed natural frequencies, which serves as a paradigm to investigate spontaneous collective synchronization in many-body interacting systems, we report on a very rich phase diagram in presence of thermal noise and an additional non-local interaction on a one-dimensional periodic lattice. Remarkably, the phase diagram involves both equilibrium and non-equilibrium phase transitions. In two contrasting limits of the dynamics, we obtain exact analytical results for the phase transitions. These two limits correspond to (i) the absence of thermal noise, when the dynamics reduces to that of a non-linear dynamical system, and (ii) the oscillators having the same natural frequency, when the dynamics becomes that of a statistical system in contact with a heat bath and relaxing to a statistical equilibrium state. In the former case, our exact analysis is based on the use of the so-called Ott-Antonsen ansatz to derive a reduced set of nonlinear partial differential equations for the macroscopic evolution of the system. Our results for the case of statistical equilibrium are on the other hand obtained by extending the well-known transfer matrix approach for nearest-neighbor Ising model to consider non-local interactions. The work offers a case study of exact analysis in many-body interacting systems. The results obtained underline the crucial role of additional non-local interactions in either destroying or enhancing the possibility of observing synchrony in mean-field systems exhibiting spontaneous synchronization.
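For orientation, here is a minimal sketch of the standard mean-field Kuramoto model and its synchronization order parameter r = |(1/N) Σ exp(iθ_j)|. It deliberately omits the thermal noise and the non-local lattice interaction studied in the paper; the function names, the forward-Euler scheme, and the step size are assumptions chosen for illustration.

```python
import cmath
import math


def order_parameter(thetas):
    """Magnitude r of the mean of exp(i*theta); r = 1 means full synchrony."""
    return abs(sum(cmath.exp(1j * th) for th in thetas) / len(thetas))


def simulate_kuramoto(omegas, coupling, thetas, dt=0.01, steps=2000):
    """Forward-Euler integration of the globally coupled Kuramoto model.

    d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i),
    where r * exp(i*psi) is the complex mean field.
    (No noise and no non-local term: a simplification for illustration.)
    """
    n = len(thetas)
    thetas = list(thetas)
    for _ in range(steps):
        z = sum(cmath.exp(1j * th) for th in thetas) / n
        r, psi = abs(z), cmath.phase(z)
        thetas = [
            th + dt * (w + coupling * r * math.sin(psi - th))
            for th, w in zip(thetas, omegas)
        ]
    return thetas


# Identical natural frequencies: the phases should collapse onto one value.
initial = [0.1 * i for i in range(15)]
final = simulate_kuramoto([0.0] * 15, 2.0, initial)
```

With identical natural frequencies and positive coupling, phases that start inside a half circle contract toward a common value, so the order parameter approaches 1.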
Statistical plant set estimation using Schroeder-phased multisinusoidal input design
NASA Technical Reports Server (NTRS)
Bayard, D. S.
1992-01-01
A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
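The input design mentioned here can be sketched as follows, assuming the common flat-spectrum form of the Schroeder phases, φ_k = -π k (k - 1) / K for the k-th of K excited harmonics. The function names and the direct DFT helper are illustrative assumptions, not the paper's implementation. The check at the end shows the property the abstract relies on: over one period, input energy appears only at the excited frequency bins.

```python
import cmath
import math


def schroeder_multisine(n_samples, harmonics):
    """One period of a unit-amplitude multisine with Schroeder phases.

    harmonics -- DFT bin indices to excite (each below n_samples / 2)
    Assumes the flat-spectrum formula phi_k = -pi * k * (k - 1) / K.
    """
    count = len(harmonics)
    phases = [-math.pi * i * (i - 1) / count for i in range(1, count + 1)]
    return [
        sum(
            math.cos(2 * math.pi * k * n / n_samples + p)
            for k, p in zip(harmonics, phases)
        )
        for n in range(n_samples)
    ]


def dft_bin(x, k):
    """Direct evaluation of a single DFT coefficient of x at bin k."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


x = schroeder_multisine(64, [1, 2, 3, 4, 5])
excited = abs(dft_bin(x, 3))    # about n_samples / 2 for a unit cosine
unexcited = abs(dft_bin(x, 7))  # numerically zero
```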
A phylogenetic transform enhances analysis of compositional microbiota data.
Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A
2017-02-15
Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.
Mediation analysis in nursing research: a methodological review
Liu, Jianghong; Ulrich, Connie
2017-01-01
Mediation statistical models help clarify the relationship between independent predictor variables and dependent outcomes of interest by assessing the impact of third variables. This type of statistical analysis is applicable for many clinical nursing research questions, yet its use within nursing remains low. Indeed, mediational analyses may help nurse researchers develop more effective and accurate prevention and treatment programs as well as help bridge the gap between scientific knowledge and clinical practice. In addition, this statistical approach allows nurse researchers to ask – and answer – more meaningful and nuanced questions that extend beyond merely determining whether an outcome occurs. Therefore, the goal of this paper is to provide a brief tutorial on the use of mediational analyses in clinical nursing research by briefly introducing the technique and, through selected empirical examples from the nursing literature, demonstrating its applicability in advancing nursing science. PMID:26176804
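A minimal sketch of one common way to carry out such an analysis, the product-of-coefficients approach with a single mediator, may help fix ideas. It assumes ordinary least squares on continuous variables; the function name `mediation_effects` and the toy data are hypothetical, and the paper itself may emphasize other procedures. Path a comes from regressing the mediator M on the predictor X, and path b and the direct effect come from regressing the outcome Y on M and X together.

```python
def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def mediation_effects(x, m, y):
    """Product-of-coefficients mediation estimates (illustrative helper).

    a        -- slope of M on X
    b        -- slope of Y on M, adjusting for X
    direct   -- slope of Y on X, adjusting for M
    indirect -- a * b
    total    -- slope of Y on X alone (equals direct + indirect under OLS)
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, smx = _dot(xc, xc), _dot(mc, mc), _dot(mc, xc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    a = smx / sxx
    det = smm * sxx - smx * smx  # zero if M is an exact linear function of X
    b = (sxx * smy - smx * sxy) / det
    direct = (smm * sxy - smx * smy) / det
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": sxy / sxx}


# Toy data built so that M = 2X + e and Y = 3M + 0.5X, with e orthogonal to X.
x = [-1.0, -1.0, 1.0, 1.0]
e = [1.0, -1.0, -1.0, 1.0]
m = [2.0 * xi + ei for xi, ei in zip(x, e)]
y = [3.0 * mi + 0.5 * xi for mi, xi in zip(m, x)]
effects = mediation_effects(x, m, y)
```

In practice the indirect effect is usually reported with a bootstrap confidence interval, which this sketch leaves out.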
Reconnection properties in Kelvin-Helmholtz instabilities
NASA Astrophysics Data System (ADS)
Vernisse, Y.; Lavraud, B.; Eriksson, S.; Gershman, D. J.; Dorelli, J.; Pollock, C. J.; Giles, B. L.; Aunai, N.; Avanov, L. A.; Burch, J.; Chandler, M. O.; Coffey, V. N.; Dargent, J.; Ergun, R.; Farrugia, C. J.; Genot, V. N.; Graham, D.; Hasegawa, H.; Jacquey, C.; Kacem, I.; Khotyaintsev, Y. V.; Li, W.; Magnes, W.; Marchaudon, A.; Moore, T. E.; Paterson, W. R.; Penou, E.; Phan, T.; Retino, A.; Schwartz, S. J.; Saito, Y.; Sauvaud, J. A.; Schiff, C.; Torbert, R. B.; Wilder, F. D.; Yokota, S.
2017-12-01
Kelvin-Helmholtz instabilities are particular laboratories to study strong guide field reconnection processes. In particular, unlike the usual dayside magnetopause, the conditions across the magnetopause in KH vortices are quasi-symmetric, with low differences in beta and magnetic shear angle. We study these properties by means of statistical analysis of the high-resolution data of the Magnetospheric Multiscale mission. Several events of Kelvin-Helmholtz instabilities past the terminator plane and a long-lasting dayside instability event were used to produce this statistical analysis. Early results show consistency between the data and the theory. In addition, the results emphasize the importance of the thickness of the magnetopause as a driver of magnetic reconnection in low magnetic shear events.
Uses and Misuses of the Correlation Coefficient.
ERIC Educational Resources Information Center
Onwuegbuzie, Anthony J.; Daniel, Larry G.
The purpose of this paper is to provide an in-depth critical analysis of the use and misuse of correlation coefficients. Various analytical and interpretational misconceptions are reviewed, beginning with the egregious assumption that correlational statistics may be useful in inferring causality. Additional misconceptions, stemming from…
Nonclassical point of view of the Brownian motion generation via fractional deterministic model
NASA Astrophysics Data System (ADS)
Gilardi-Velázquez, H. E.; Campos-Cantón, E.
In this paper, we present a dynamical system based on the Langevin equation without a stochastic term and using fractional derivatives that exhibits properties of Brownian motion, i.e. a deterministic model to generate Brownian motion is proposed. The stochastic process is replaced by considering an additional degree of freedom in the second-order Langevin equation. Thus, it is transformed into a system of three first-order linear differential equations; additionally, α-fractional derivatives are considered, which allow us to obtain better statistical properties. Switching surfaces are established as a part of fluctuating acceleration. The final system of three α-order linear differential equations does not contain a stochastic term, so the system generates motion in a deterministic way. Nevertheless, from the time series analysis, we found that the behavior of the system exhibits statistical properties of Brownian motion, such as a linear growth in time of the mean square displacement and a Gaussian distribution. Furthermore, we use detrended fluctuation analysis to prove the Brownian character of this motion.
Statistical Field Estimation for Complex Coastal Regions and Archipelagos (PREPRINT)
2011-04-09
and study the computational properties of these schemes. Specifically, we extend a multiscale Objective Analysis (OA) approach to complex coastal regions and... multiscale free-surface code builds on the primitive-equation model of the Harvard Ocean Prediction System (HOPS, Haley et al. (2009)). Additionally
NASA Astrophysics Data System (ADS)
Bierstedt, Svenja E.; Hünicke, Birgit; Zorita, Eduardo; Ludwig, Juliane
2017-07-01
We statistically analyse the relationship between the structure of migrating dunes in the southern Baltic and the driving wind conditions over the past 26 years, with the long-term aim of using migrating dunes as a proxy for past wind conditions at an interannual resolution. The present analysis is based on the dune record derived from geo-radar measurements by Ludwig et al. (2017). The dune system is located at the Baltic Sea coast of Poland and is migrating from west to east along the coast. The dunes present layers with different thicknesses that can be assigned to absolute dates at interannual timescales and put in relation to seasonal wind conditions. To statistically analyse this record and calibrate it as a wind proxy, we used a gridded regional meteorological reanalysis data set (coastDat2) covering recent decades. The identified link between the dune annual layers and wind conditions was additionally supported by the co-variability between dune layers and observed sea level variations in the southern Baltic Sea. We include precipitation and temperature in our analysis, in addition to wind, to learn more about the dependency between these three atmospheric factors and their common influence on the dune system. We set up a statistical linear model based on the correlation between the frequency of days with specific wind conditions in a given season and dune migration velocities derived for that season. To some extent, the dune records can be seen as analogous to tree-ring width records, and hence we use a proxy validation method usually applied in dendrochronology, cross-validation with the leave-one-out method, when the observational record is short. The revealed correlations between the wind record from the reanalysis and the wind record derived from the dune structure are in the range between 0.28 and 0.63, yielding statistical validation skill similar to that of dendroclimatological records.
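The validation step described above can be sketched in a few lines, assuming a simple one-predictor linear model: each year is withheld in turn, the line is fitted to the remaining years, and the withheld value is predicted; the validation skill is then the correlation between predictions and observations. The function names and the toy numbers are assumptions for illustration and are not taken from the dune record.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_predictions(xs, ys):
    """Leave-one-out predictions: each point is predicted from all the others."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical proxy values (xs) and target values (ys), one pair per year.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
skill = pearson(loo_predictions(xs, ys), ys)
```

Because every prediction comes from a model that never saw the withheld point, the resulting correlation is a less optimistic skill estimate than the in-sample fit, which matters when the record is short.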
Lefebvre, Alexandre; Rochefort, Gael Y.; Santos, Frédéric; Le Denmat, Dominique; Salmon, Benjamin; Pétillon, Jean-Marc
2016-01-01
Over the last decade, biomedical 3D-imaging tools have gained widespread use in the analysis of prehistoric bone artefacts. While initial attempts to characterise the major categories used in osseous industry (i.e. bone, antler, and dentine/ivory) have been successful, the taxonomic determination of prehistoric artefacts remains to be investigated. The distinction between reindeer and red deer antler can be challenging, particularly in cases of anthropic and/or taphonomic modifications. In addition to the range of destructive physicochemical identification methods available (mass spectrometry, isotopic ratio, and DNA analysis), X-ray micro-tomography (micro-CT) provides convincing non-destructive 3D images and analyses. This paper presents the experimental protocol (sample scans, image processing, and statistical analysis) we have developed in order to identify modern and archaeological antler collections (from Isturitz, France). This original method is based on bone microstructure analysis combined with advanced statistical support vector machine (SVM) classifiers. A combination of six microarchitecture biomarkers (bone volume fraction, trabecular number, trabecular separation, trabecular thickness, trabecular bone pattern factor, and structure model index) were screened using micro-CT in order to characterise internal alveolar structure. Overall, reindeer alveoli presented a tighter mesh than red deer alveoli, and statistical analysis allowed us to distinguish archaeological antler by species with an accuracy of 96%, regardless of anatomical location on the antler. In conclusion, micro-CT combined with SVM classifiers proves to be a promising additional non-destructive method for antler identification, suitable for archaeological artefacts whose degree of human modification and cultural heritage or scientific value has previously made it impossible (tools, ornaments, etc.). PMID:26901355
Geostatistical applications in environmental remediation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, R.N.; Purucker, S.T.; Lyon, B.F.
1995-02-01
Geostatistical analysis refers to a collection of statistical methods for addressing data that vary in space. By incorporating spatial information into the analysis, geostatistics has advantages over traditional statistical analysis for problems with a spatial context. Geostatistics has a history of success in earth science applications, and its popularity is increasing in other areas, including environmental remediation. Due to recent advances in computer technology, geostatistical algorithms can be executed at a speed comparable to many standard statistical software packages. When used responsibly, geostatistics is a systematic and defensible tool that can be used in various decision frameworks, such as the Data Quality Objectives (DQO) process. At every point in the site, geostatistics can estimate both the concentration level and the probability or risk of exceeding a given value. Using these probability maps can assist in identifying clean-up zones. Given any decision threshold and an acceptable level of risk, the probability maps identify those areas that are estimated to be above or below the acceptable risk. Those areas that are above the threshold are of the most concern with regard to remediation. In addition to estimating clean-up zones, geostatistics can assist in designing cost-effective secondary sampling schemes. Those areas of the probability map with high levels of estimated uncertainty are areas where more secondary sampling should occur. In addition, geostatistics has the ability to incorporate soft data directly into the analysis. These data include historical records, a highly correlated secondary contaminant, or expert judgment. The role of geostatistics in environmental remediation is that of a tool which, in conjunction with other methods, can provide a common forum for building consensus.
Stucki, Sheldon Lee; Biss, David J.
2000-01-01
An analysis was performed using the National Automotive Sampling System Crashworthiness Data System (NASS-CDS) database to compare the injury/fatality rates of variously restrained driver occupants as compared to unrestrained driver occupants in the total database of drivers/frontals, and also by Delta-V. A structured search of the NASS-CDS was done using the SAS® statistical analysis software to extract the data for this analysis and the SUDAAN software package was used to arrive at statistical significance indicators. In addition, this paper goes on to investigate different methods for presenting results of accident database searches including significance results; a risk versus Delta-V format for specific exposures; and, a percent cumulative injury versus Delta-V format to characterize injury trends. These alternative analysis presentation methods are then discussed by example using the present study results. PMID:11558105
Tsallis statistics and neurodegenerative disorders
NASA Astrophysics Data System (ADS)
Iliopoulos, Aggelos C.; Tsolaki, Magdalini; Aifantis, Elias C.
2016-08-01
In this paper, we perform statistical analysis of time series deriving from four neurodegenerative disorders, namely epilepsy, amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), Huntington's disease (HD). The time series are concerned with electroencephalograms (EEGs) of healthy and epileptic states, as well as gait dynamics (in particular stride intervals) of the ALS, PD and HDs. We study data concerning one subject for each neurodegenerative disorder and one healthy control. The analysis is based on Tsallis non-extensive statistical mechanics and in particular on the estimation of Tsallis q-triplet, namely {qstat, qsen, qrel}. The deviation of Tsallis q-triplet from unity indicates non-Gaussian statistics and long-range dependencies for all time series considered. In addition, the results reveal the efficiency of Tsallis statistics in capturing differences in brain dynamics between healthy and epileptic states, as well as differences between ALS, PD, HDs from healthy control subjects. The results indicate that estimations of Tsallis q-indices could be used as possible biomarkers, along with others, for improving classification and prediction of epileptic seizures, as well as for studying the gait complex dynamics of various diseases providing new insights into severity, medications and fall risk, improving therapeutic interventions.
Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.
Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe
2017-12-27
The present study aimed to evaluate the characteristics and quality of statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of the quality of their statistical methodology. The four main concerns identified were i) use of nonparametric tests in the presence of large samples, coupled with lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient and the consequent lack of a cluster-specific approach in data analysis. Owing to these concerns, statistical methodology was judged as inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required to obtain a sound assessment of the efficacy of desensitizing agents.
Multiple Phenotype Association Tests Using Summary Statistics in Genome-Wide Association Studies
Liu, Zhonghua; Lin, Xihong
2017-01-01
Summary We study in this paper jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. PMID:28653391
Multiple phenotype association tests using summary statistics in genome-wide association studies.
Liu, Zhonghua; Lin, Xihong
2018-03-01
We study in this article jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. © 2017, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Noel, Jean; Prieto, Juan C.; Styner, Martin
2017-03-01
Functional Analysis of Diffusion Tensor Tract Statistics (FADTTS) is a toolbox for analysis of white matter (WM) fiber tracts. It allows associating diffusion properties along major WM bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these WM tract properties. However, to use this toolbox, a user must have an intermediate knowledge of scripting languages (MATLAB). FADTTSter was created to overcome this issue and make the statistical analysis accessible to any non-technical researcher. FADTTSter is actively being used by researchers at the University of North Carolina. FADTTSter guides non-technical users through a series of steps, including quality control of subjects and fibers, in order to set up the necessary parameters to run FADTTS. Additionally, FADTTSter implements interactive charts for FADTTS' outputs. These interactive charts enhance the researcher experience and facilitate the analysis of the results. FADTTSter's motivation is to improve usability and provide a new analysis tool to the community that complements FADTTS. Ultimately, by opening FADTTS to a broader audience, FADTTSter seeks to accelerate hypothesis testing in neuroimaging studies involving heterogeneous clinical data and diffusion tensor imaging. This work is submitted to the Biomedical Applications in Molecular, Structural, and Functional Imaging conference. The source code of this application is available in NITRC.
Mild cognitive impairment and fMRI studies of brain functional connectivity: the state of the art
Farràs-Permanyer, Laia; Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel
2015-01-01
In the last 15 years, many articles have studied brain connectivity in Mild Cognitive Impairment patients with fMRI techniques, seemingly using different connectivity statistical models in each investigation to identify complex connectivity structures so as to recognize typical behavior in this type of patient. This diversity in statistical approaches may cause problems in results comparison. This paper seeks to describe how researchers approached the study of brain connectivity in MCI patients using fMRI techniques from 2002 to 2014. The focus is on the statistical analysis proposed by each research group in reference to the limitations and possibilities of those techniques to identify some recommendations to improve the study of functional connectivity. The included articles came from a search of Web of Science and PsycINFO using the following keywords: fMRI, MCI, and functional connectivity. Eighty-one papers were found, but two of them were discarded because of the lack of statistical analysis. Accordingly, 79 articles were included in this review. We summarized some parts of the articles, including the goal of every investigation, the cognitive paradigm and methods used, brain regions involved, use of ROI analysis and statistical analysis, with emphasis on the connectivity estimation model used in each investigation. The present analysis allowed us to confirm the remarkable variability of the statistical analysis methods found. Additionally, the study of brain connectivity in this type of population is not providing, at the moment, any significant information or results related to clinical aspects relevant for prediction and treatment. We propose following guidelines for publishing fMRI data, which would be a good solution to the problem of study replication. The latter aspect could be important for future publications because a higher homogeneity would benefit the comparison between publications and the generalization of results. PMID:26300802
Adams, James; Kruger, Uwe; Geis, Elizabeth; Gehn, Eva; Fimbres, Valeria; Pollard, Elena; Mitchell, Jessica; Ingram, Julie; Hellmers, Robert; Quig, David; Hahn, Juergen
2017-01-01
Introduction: A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods: In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. “Leave-one-out” cross-validation was used to ensure statistical independence of results. Results and Discussion: Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate Speech), but significant associations were found for UTM with all eleven autism-related assessments with cross-validation R2 values ranging from 0.12–0.48. PMID:28068407
Excoffier, L; Smouse, P E; Quattro, J M
1992-06-01
We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as phi-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision. The method is flexible enough to accommodate several alternative input matrices, corresponding to different types of molecular data, as well as different types of evolutionary assumptions, without modifying the basic structure of the analysis. The significance of the variance components and phi-statistics is tested using a permutational approach, eliminating the normality assumption that is conventional for analysis of variance but inappropriate for molecular data. Application of AMOVA to human mitochondrial DNA haplotype data shows that population subdivisions are better resolved when some measure of molecular differences among haplotypes is introduced into the analysis. At the intraspecific level, however, the additional information provided by knowing the exact phylogenetic relations among haplotypes or by a nonlinear translation of restriction-site change into nucleotide diversity does not significantly modify the inferred population genetic structure. Monte Carlo studies show that site sampling does not fundamentally affect the significance of the molecular variance components. The AMOVA treatment is easily extended in several different directions and it constitutes a coherent and flexible framework for the statistical analysis of molecular data.
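To make the procedure concrete, the following is a minimal sketch of a one-level AMOVA (individuals within populations) computed from a matrix of squared distances, with a label-permutation test in place of a normality assumption. It uses the usual sums-of-squared-deviations decomposition and the standard unbalanced-design coefficient n0; the function names, the permutation count, and the toy matrix are assumptions for illustration rather than the authors' software.

```python
import random


def phi_st(dist2, groups):
    """Phi_ST from a symmetric matrix of squared distances and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squared deviations: all pairs, divided by N.
    total = sum(dist2[i][j] for i in range(n) for j in range(i + 1, n)) / n
    within = 0.0
    sizes = []
    for g in labels:
        members = [i for i in range(n) if groups[i] == g]
        sizes.append(len(members))
        pair_sum = sum(
            dist2[a][b] for k, a in enumerate(members) for b in members[k + 1:]
        )
        within += pair_sum / len(members)
    among = total - within
    n_groups = len(labels)
    ms_among = among / (n_groups - 1)
    ms_within = within / (n - n_groups)
    # Effective group size for unbalanced designs.
    n0 = (n - sum(s * s for s in sizes) / n) / (n_groups - 1)
    var_among = (ms_among - ms_within) / n0
    return var_among / (var_among + ms_within)


def permutation_p_value(dist2, groups, n_perm=999, seed=0):
    """Share of label permutations whose Phi_ST is at least the observed one."""
    rng = random.Random(seed)
    observed = phi_st(dist2, groups)
    shuffled = list(groups)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(dist2, shuffled) >= observed - 1e-12:
            hits += 1
    return hits / (n_perm + 1)


# Toy example: two populations, identical haplotypes within, distance 1 between.
d2 = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
pops = ["A", "A", "B", "B"]
phi = phi_st(d2, pops)
p = permutation_p_value(d2, pops)
```

With only four individuals the permutation test has almost no power, since many relabelings reproduce the observed grouping; the toy data are there only to show the mechanics.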
New Developments in the Embedded Statistical Coupling Method: Atomistic/Continuum Crack Propagation
NASA Technical Reports Server (NTRS)
Saether, E.; Yamakov, V.; Glaessgen, E.
2008-01-01
A concurrent multiscale modeling methodology that embeds a molecular dynamics (MD) region within a finite element (FEM) domain has been enhanced. The concurrent MD-FEM coupling methodology uses statistical averaging of the deformation of the atomistic MD domain to provide interface displacement boundary conditions to the surrounding continuum FEM region, which, in turn, generates interface reaction forces that are applied as piecewise constant traction boundary conditions to the MD domain. The enhancement is based on the addition of molecular dynamics-based cohesive zone model (CZM) elements near the MD-FEM interface. The CZM elements are a continuum interpretation of the traction-displacement relationships taken from MD simulations using Cohesive Zone Volume Elements (CZVE). The addition of CZM elements to the concurrent MD-FEM analysis provides a consistent set of atomistically-based cohesive properties within the finite element region near the growing crack. Another set of CZVEs are then used to extract revised CZM relationships from the enhanced embedded statistical coupling method (ESCM) simulation of an edge crack under uniaxial loading.
Women's health and women's work in health services: what statistics tell us.
Hedman, B; Herner, E
1988-01-01
This article draws together statistical information in several broad areas that relate to women's health, women's reproductive activities and women's occupations in Sweden. The statistical analysis reflects the major changes that have occurred in Swedish society and that have had a major impact on the health and well-being, as well as on the social participation rate, of women. Much of the data is drawn from a recent special effort at Statistics Sweden aimed at influencing the classification, collection and presentation of statistical data in all fields in such a way that family, working, education, health and other conditions of women can be more readily and equitably compared with those of men. In addition, social changes have seen the shifting of the responsibility of health care from the unpaid duties of women in the home to health care institutions, where female employees predominate. These trends are also discussed.
The discrimination of sea ice types using SAR backscatter statistics
NASA Technical Reports Server (NTRS)
Shuchman, Robert A.; Wackerman, Christopher C.; Maffett, Andrew L.; Onstott, Robert G.; Sutherland, Laura L.
1989-01-01
X-band (HH) synthetic aperture radar (SAR) data of sea ice collected during the Marginal Ice Zone Experiment in March and April of 1987 was statistically analyzed with respect to discriminating open water, first-year ice, multiyear ice, and Odden. Odden are large expanses of nilas ice that rapidly form in the Greenland Sea and transform into pancake ice. A first-order statistical analysis indicated that mean versus variance can segment out open water and first-year ice, and skewness versus modified skewness can segment the Odden and multiyear categories. In addition to first-order statistics, a model has been generated for the distribution function of the SAR ice data. Segmentation of ice types was also attempted using textural measurements. In this case, the general co-occurrence matrix was evaluated. The textural method did not generate better results than the first-order statistical approach.
GWAR: robust analysis and meta-analysis of genome-wide association studies.
Dimou, Niki L; Tsirigos, Konstantinos D; Elofsson, Arne; Bagos, Pantelis G
2017-05-15
In the context of genome-wide association studies (GWAS), there is a variety of statistical techniques in order to conduct the analysis, but, in most cases, the underlying genetic model is usually unknown. Under these circumstances, the classical Cochran-Armitage trend test (CATT) is suboptimal. Robust procedures that maximize the power and preserve the nominal type I error rate are preferable. Moreover, performing a meta-analysis using robust procedures is of great interest and has never been addressed in the past. The primary goal of this work is to implement several robust methods for analysis and meta-analysis in the statistical package Stata and subsequently to make the software available to the scientific community. The CATT under a recessive, additive and dominant model of inheritance as well as robust methods based on the Maximum Efficiency Robust Test statistic, the MAX statistic and the MIN2 were implemented in Stata. Concerning MAX and MIN2, we calculated their asymptotic null distributions relying on numerical integration resulting in a great gain in computational time without losing accuracy. All the aforementioned approaches were employed in a fixed or a random effects meta-analysis setting using summary data with weights equal to the reciprocal of the combined cases and controls. Overall, this is the first complete effort to implement procedures for analysis and meta-analysis in GWAS using Stata. A Stata program and a web-server are freely available for academic users at http://www.compgen.org/tools/GWAR. Contact: pbagos@compgen.org.
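For readers unfamiliar with the CATT, the sketch below evaluates the standard Cochran-Armitage trend statistic on a 2x3 genotype table under recessive, additive and dominant scores. It is written in Python rather than Stata and is not the GWAR implementation; the genotype counts and the name `catt` are hypothetical. The largest absolute Z is printed only to show what the MAX statistic is built from; its p-value needs the dedicated null distribution mentioned in the abstract, which is not attempted here.

```python
import math

# Genotype scores for (aa, Aa, AA), with A taken as the risk allele.
WEIGHTS = {"recessive": (0, 0, 1), "additive": (0, 1, 2), "dominant": (0, 1, 1)}


def catt(cases, controls, model="additive"):
    """Cochran-Armitage trend test: return (Z, two-sided normal p-value)."""
    t = WEIGHTS[model]
    n = [r + s for r, s in zip(cases, controls)]  # genotype column totals
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    score_cases = sum(ti * ri for ti, ri in zip(t, cases))
    score_all = sum(ti * ni for ti, ni in zip(t, n))
    stat = total * score_cases - n_cases * score_all
    var = (n_cases * n_controls / total) * (
        total * sum(ti * ti * ni for ti, ni in zip(t, n)) - score_all ** 2
    )
    z = stat / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical genotype counts (aa, Aa, AA).
cases = [20, 30, 50]
controls = [40, 40, 20]
results = {model: catt(cases, controls, model) for model in WEIGHTS}
for model, (z, p) in results.items():
    print(f"{model:9s} Z = {z:6.3f}  p = {p:.2e}")
max_abs_z = max(abs(z) for z, _ in results.values())
print(f"max |Z| over the three models = {max_abs_z:.3f}")
```

Under the dominant scores the squared statistic coincides with the Pearson chi-square of the collapsed 2x2 table, which gives a quick sanity check of the formula.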
Dittmar, John C.; Pierce, Steven; Rothstein, Rodney; Reid, Robert J. D.
2013-01-01
Genome-wide experiments often measure quantitative differences between treated and untreated cells to identify affected strains. For these studies, statistical models are typically used to determine significance cutoffs. We developed a method termed “CLIK” (Cutoff Linked to Interaction Knowledge) that overlays biological knowledge from the interactome on screen results to derive a cutoff. The method takes advantage of the fact that groups of functionally related interacting genes often respond similarly to experimental conditions and, thus, cluster in a ranked list of screen results. We applied CLIK analysis to five screens of the yeast gene disruption library and found that it defined a significance cutoff that differed from traditional statistics. Importantly, verification experiments revealed that the CLIK cutoff correlated with the position in the rank order where the rate of true positives drops off significantly. In addition, the gene sets defined by CLIK analysis often provide further biological perspectives. For example, applying CLIK analysis retrospectively to a screen for cisplatin sensitivity allowed us to identify the importance of the Hrq1 helicase in DNA crosslink repair. Furthermore, we demonstrate the utility of CLIK to determine optimal treatment conditions by analyzing genome-wide screens at multiple rapamycin concentrations. We show that CLIK is an extremely useful tool for evaluating screen quality, determining screen cutoffs, and comparing results between screens. Furthermore, because CLIK uses previously annotated interaction data to determine biologically informed cutoffs, it provides additional insights into screen results, which supplement traditional statistical approaches. PMID:23589890
ERIC Educational Resources Information Center
Kadhi, T.; Holley, D.; Garrison, P.; Green, T.; Palasota, A.
2010-01-01
The following report of descriptive statistics gives the passing percentages of the Bar examination for the Thurgood Marshall School of Law (TMSL) for the calendar years of 2005-2009. A Five Year Analysis is given for the entire period, followed by a Three Year Analysis of years 2005-2007, 2006-2008, and 2007-2009. In addition, an Annual Analysis…
Method for factor analysis of GC/MS data
Van Benthem, Mark H; Kotula, Paul G; Keenan, Michael R
2012-09-11
The method of the present invention provides a fast, robust, and automated multivariate statistical analysis of gas chromatography/mass spectroscopy (GC/MS) data sets. The method can involve systematic elimination of undesired, saturated peak masses to yield data that follow a linear, additive model. The cleaned data can then be subjected to a combination of PCA and orthogonal factor rotation followed by refinement with MCR-ALS to yield highly interpretable results.
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shear, Trevor Allan
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
Effect of local and global geomagnetic activity on human cardiovascular homeostasis.
Dimitrova, Svetla; Stoilova, Irina; Yanev, Toni; Cholakov, Ilia
2004-02-01
The authors investigated the effects of local and planetary geomagnetic activity on human physiology. They collected data in Sofia, Bulgaria, from a group of 86 volunteers during the periods of the autumnal and vernal equinoxes. They used the factors local/planetary geomagnetic activity, day of measurement, gender, and medication use to apply a four-factor multiple analysis of variance. They also used a post hoc analysis to establish the statistical significance of the differences between the average values of the measured physiological parameters in the separate factor levels. In addition, the authors performed correlation analysis between the physiological parameters examined and geophysical factors. The results revealed that geomagnetic changes had a statistically significant influence on arterial blood pressure. Participants expressed this reaction with weak local geomagnetic changes and when major and severe global geomagnetic storms took place.
Integration of statistical and physiological analyses of adaptation of near-isogenic barley lines.
Romagosa, I; Fox, P N; García Del Moral, L F; Ramos, J M; García Del Moral, B; Roca de Togores, F; Molina-Cano, J L
1993-08-01
Seven near-isogenic barley lines, differing for three independent mutant genes, were grown in 15 environments in Spain. Genotype x environment interaction (G x E) for grain yield was examined with the Additive Main Effects and Multiplicative interaction (AMMI) model. The results of this statistical analysis of multilocation yield-data were compared with a morpho-physiological characterization of the lines at two sites (Molina-Cano et al. 1990). The first two principal component axes from the AMMI analysis were strongly associated with the morpho-physiological characters. The independent but parallel discrimination among genotypes reflects genetic differences and highlights the power of the AMMI analysis as a tool to investigate G x E. Characters which appear to be positively associated with yield in the germplasm under study could be identified for some environments.
Open Source Tools for Seismicity Analysis
NASA Astrophysics Data System (ADS)
Powers, P.
2010-12-01
The spatio-temporal analysis of seismicity plays an important role in earthquake forecasting and is integral to research on earthquake interactions and triggering. For instance, the third version of the Uniform California Earthquake Rupture Forecast (UCERF), currently under development, will use Epidemic Type Aftershock Sequences (ETAS) as a model for earthquake triggering. UCERF will be a "living" model and therefore requires robust, tested, and well-documented ETAS algorithms to ensure transparency and reproducibility. Likewise, as earthquake aftershock sequences unfold, real-time access to high quality hypocenter data makes it possible to monitor the temporal variability of statistical properties such as the parameters of the Omori Law and the Gutenberg Richter b-value. Such statistical properties are valuable as they provide a measure of how much a particular sequence deviates from expected behavior and can be used when assigning probabilities of aftershock occurrence. To address these demands and provide public access to standard methods employed in statistical seismology, we present well-documented, open-source JavaScript and Java software libraries for the on- and off-line analysis of seismicity. The Javascript classes facilitate web-based asynchronous access to earthquake catalog data and provide a framework for in-browser display, analysis, and manipulation of catalog statistics; implementations of this framework will be made available on the USGS Earthquake Hazards website. The Java classes, in addition to providing tools for seismicity analysis, provide tools for modeling seismicity and generating synthetic catalogs. These tools are extensible and will be released as part of the open-source OpenSHA Commons library.
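To make the two statistical properties named above concrete, here is a small sketch of a Gutenberg-Richter b-value estimate and a modified Omori rate function. It is written in Python for brevity and is not part of the JavaScript or Java libraries the abstract describes. The Aki maximum-likelihood estimator, the synthetic catalog, and the names `b_value_aki` and `omori_rate` are assumptions chosen for illustration.

```python
import math
import random


def b_value_aki(magnitudes, m_c, bin_width=0.0):
    """Aki maximum-likelihood b-value for events at or above completeness m_c.

    bin_width is the magnitude rounding interval (0 for unrounded magnitudes).
    Returns the estimate and its first-order standard error b / sqrt(N).
    """
    above = [m for m in magnitudes if m >= m_c]
    mean_mag = sum(above) / len(above)
    b = math.log10(math.e) / (mean_mag - (m_c - bin_width / 2.0))
    return b, b / math.sqrt(len(above))


def omori_rate(t, k, c, p):
    """Modified Omori law: aftershock rate at time t after the mainshock."""
    return k / (t + c) ** p


# Synthetic catalog (assumption): magnitudes above m_c are exponentially
# distributed with rate b * ln(10), which corresponds to a true b-value of 1.
rng = random.Random(1)
m_c = 2.0
catalog = [m_c + rng.expovariate(1.0 * math.log(10.0)) for _ in range(20000)]
b_hat, b_err = b_value_aki(catalog, m_c)
print(f"b = {b_hat:.3f} +/- {b_err:.3f}")
print(f"Omori rate one day after mainshock: {omori_rate(1.0, 100.0, 0.05, 1.1):.1f}")
```

Re-running the estimate on a sliding window of recent events is one simple way to monitor how these parameters vary as a sequence unfolds.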
On Statistical Analysis of Neuroimages with Imperfect Registration
Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas
2016-01-01
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic: they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yield strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168
The Effects of Auditory Tempo Changes on Rates of Stereotypic Behavior in Handicapped Children.
ERIC Educational Resources Information Center
Christopher, R.; Lewis, B.
1984-01-01
Rates of stereotypic behaviors in six severely/profoundly retarded children (eight to 15 years old) were observed during varying presentations of auditory beats produced by a metronome. Visual and statistical analysis of research results suggested a significant reaction to stimulus presentation. However, additional data following…
The Spreadsheet in an Educational Setting. Microcomputing Working Paper Series F 84-4.
ERIC Educational Resources Information Center
Wozny, Lucy
This overview of a specific spreadsheet, Microsoft's Multiplan for the Apple Macintosh microcomputer, emphasizes specific features that are important to the academic community, including the mathematical functions of algebra, trigonometry, and statistical analysis. Additional features are summarized, including data formats for both numerical and…
Effectiveness of propolis on oral health: a meta-analysis.
Hwu, Yueh-Juen; Lin, Feng-Yu
2014-12-01
The use of propolis mouth rinse or gel as a supplementary intervention has increased during the last decade in Taiwan. However, the effect of propolis on oral health is not well understood. The purpose of this meta-analysis was to present the best available evidence regarding the effects of propolis use on oral health, including oral infection, dental plaque, and stomatitis. Researchers searched seven electronic databases for relevant articles published between 1969 and 2012. Data were collected using inclusion and exclusion criteria. The Joanna Briggs Institute Meta Analysis of Statistics Assessment and Review Instrument was used to evaluate the quality of the identified articles. Eight trials published from 1997 to 2011 with 194 participants had extractable data. The result of the meta-analysis indicated that, although propolis had an effect on reducing dental plaque, this effect was not statistically significant. The results were not statistically significant for oral infection or stomatitis. Although there are a number of promising indications, in view of the limited number and quality of studies and the variation in results among studies, this review highlights the need for additional well-designed trials to draw conclusions that are more robust.
Quinolizidine alkaloids from Lupinus lanatus
NASA Astrophysics Data System (ADS)
Neto, Alexandre T.; Oliveira, Carolina Q.; Ilha, Vinicius; Pedroso, Marcelo; Burrow, Robert A.; Dalcol, Ionara I.; Morel, Ademir F.
2011-10-01
In this study, one new quinolizidine alkaloid, lanatine A (1), together with three other known alkaloids, 13-α-trans-cinnamoyloxylupanine (2), 13-α-hydroxylupanine (3), and (-)-multiflorine (4), were isolated from the aerial parts of Lupinus lanatus (Fabaceae). The structures of alkaloids 1-4 were elucidated by spectroscopic data analysis. The stereochemistry of 1 was determined by single crystal X-ray analysis. Bayesian statistical analysis of the Bijvoet differences suggests the absolute stereochemistry of 1. In addition, the antimicrobial potential of alkaloids 1-4 is also reported.
Test data analysis for concentrating photovoltaic arrays
NASA Astrophysics Data System (ADS)
Maish, A. B.; Cannon, J. E.
A test data analysis approach for use with steady state efficiency measurements taken on concentrating photovoltaic arrays is presented. The analysis procedures can be used to identify biased and erroneous data. The steps involved in analyzing the test data are screening the data, developing coefficients for the performance equation, analyzing statistics to ensure adequacy of the regression fit to the data, and plotting the data. In addition, this paper analyzes the sources and magnitudes of precision and bias errors that affect measurement accuracy.
NASA Astrophysics Data System (ADS)
Edjah, Adwoba; Stenni, Barbara; Cozzi, Giulio; Turetta, Clara; Dreossi, Giuliano; Tetteh Akiti, Thomas; Yidana, Sandow
2017-04-01
Adwoba Kua-Manza Edjah (a), Barbara Stenni (b,c), Giuliano Dreossi (b), Giulio Cozzi (c), Clara Turetta (c), T.T. Akiti (d), Sandow Yidana (e). (a,e) Department of Earth Science, University of Ghana, Legon, Ghana, West Africa; (b) Department of Environmental Sciences, Informatics and Statistics, Ca' Foscari University of Venice, Italy; (c) Institute for the Dynamics of Environmental Processes, CNR, Venice, Italy; (d) Department of Nuclear Application and Techniques, Graduate School of Nuclear and Allied Sciences, University of Ghana, Legon. This research is part of a PhD research work "Hydrogeological Assessment of the Lower Tano river basin for sustainable economic usage, Ghana, West - Africa". In this study, the researcher investigated surface water and groundwater quality in the Lower Tano river basin. This assessment was based on some selected sampling sites associated with mining activities, and the development of oil and gas. A statistical approach was applied to characterize the quality of surface water and groundwater. Also, water stable isotopes, which are natural tracers of the hydrological cycle, were used to investigate the origin of groundwater recharge in the basin. The study revealed that Pb and Ni values of the surface water and groundwater samples exceeded the WHO standards for drinking water. In addition, the water quality index (WQI), based on physicochemical parameters (EC, TDS, pH) and major ions (Ca2+, Na+, Mg2+, HCO3-, NO3-, Cl-, SO42-, K+), exhibited good quality water for 60% of the sampled surface water and groundwater. Other statistical techniques, such as the heavy metal pollution index (HPI), degree of contamination (Cd), and heavy metal evaluation index (HEI), based on trace element parameters in the water samples, reveal that 90% of the surface water and groundwater samples belong to a high level of pollution. Principal component analysis (PCA) also suggests that the water quality in the basin is likely affected by rock - water interaction and anthropogenic activities (sea water intrusion).
This was confirmed by further statistical analysis (cluster analysis and correlation matrix) of the water quality parameters. The spatial distribution of water quality parameters, trace elements and the results obtained from the statistical analysis were determined using a geographical information system (GIS). In addition, the isotopic analysis of the sampled surface water and groundwater revealed that most of the surface water and groundwater were of meteoric origin with little or no isotopic variations. It is expected that outcomes of this research will form a baseline for making appropriate decisions on water quality management by decision makers in the Lower Tano river Basin. Keywords: Water stable isotopes, Trace elements, Multivariate statistics, Evaluation indices, Lower Tano river basin.
A Monte Carlo study of Weibull reliability analysis for space shuttle main engine components
NASA Technical Reports Server (NTRS)
Abernethy, K.
1986-01-01
The incorporation of a number of additional capabilities into an existing Weibull analysis computer program and the results of a Monte Carlo computer simulation study to evaluate the usefulness of the Weibull methods using samples with a very small number of failures and extensive censoring are discussed. Since the censoring mechanism inherent in the Space Shuttle Main Engine (SSME) data is hard to analyze, it was decided to use a random censoring model, generating censoring times from a uniform probability distribution. Some of the statistical techniques and computer programs that are used in the SSME Weibull analysis are described. The methods documented in were supplemented by adding computer calculations of approximate (using iterative methods) confidence intervals for several parameters of interest. These calculations are based on a likelihood ratio statistic which is asymptotically a chi-squared statistic with one degree of freedom. The assumptions built into the computer simulations are described. The simulation program and the techniques used in it are also described. Simulation results are tabulated for various combinations of Weibull shape parameters and the numbers of failures in the samples.
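The simulation setup described above can be sketched as follows: Weibull failure times are censored at uniformly distributed random times, the shape parameter is estimated by maximizing the profile likelihood, and an approximate 95% interval is found iteratively from the likelihood ratio statistic with the one-degree-of-freedom chi-squared critical value 3.841. This is a minimal sketch and not the SSME program. The sample size, parameter values, search bounds and function names are all assumptions, and the sample is far larger than the very small failure counts studied in the report.

```python
import math
import random


def simulate_censored_sample(n, shape, scale, censor_max, rng):
    """Weibull failure times with uniform random censoring on (0, censor_max)."""
    times, failed = [], []
    for _ in range(n):
        t_fail = rng.weibullvariate(scale, shape)  # arguments are (scale, shape)
        t_cens = rng.uniform(0.0, censor_max)
        times.append(min(t_fail, t_cens))
        failed.append(t_fail <= t_cens)
    return times, failed


def profile_loglik(shape, times, failed):
    """Log-likelihood in the shape, with the scale replaced by its MLE."""
    r = sum(failed)
    s = sum(t ** shape for t in times)
    sum_log_fail = sum(math.log(t) for t, f in zip(times, failed) if f)
    return r * math.log(shape) - r * math.log(s / r) + (shape - 1.0) * sum_log_fail - r


def fit_shape(times, failed, lo=0.01, hi=50.0):
    """Ternary search for the maximizer of the profile log-likelihood."""
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profile_loglik(m1, times, failed) < profile_loglik(m2, times, failed):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def lr_interval(times, failed, crit=3.841):
    """Likelihood ratio interval for the shape, found by bisection."""
    b_hat = fit_shape(times, failed)
    l_hat = profile_loglik(b_hat, times, failed)

    def excess(b):
        return 2.0 * (l_hat - profile_loglik(b, times, failed)) - crit

    def root(a, b):
        for _ in range(100):
            mid = 0.5 * (a + b)
            if (excess(a) > 0) == (excess(mid) > 0):
                a = mid
            else:
                b = mid
        return 0.5 * (a + b)

    return root(0.01, b_hat), b_hat, root(b_hat, 50.0)


rng = random.Random(1)
times, failed = simulate_censored_sample(200, shape=2.0, scale=1.0, censor_max=2.0, rng=rng)
lower, shape_hat, upper = lr_interval(times, failed)
print(f"failures: {sum(failed)} of {len(times)}")
print(f"shape estimate {shape_hat:.2f}, approximate 95% LR interval ({lower:.2f}, {upper:.2f})")
```

Repeating the last three lines over many seeds and counting how often the interval covers the true shape would give the kind of coverage table a Monte Carlo study of this sort reports.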
New U.S. Geological Survey Method for the Assessment of Reserve Growth
Klett, Timothy R.; Attanasi, E.D.; Charpentier, Ronald R.; Cook, Troy A.; Freeman, P.A.; Gautier, Donald L.; Le, Phuong A.; Ryder, Robert T.; Schenk, Christopher J.; Tennyson, Marilyn E.; Verma, Mahendra K.
2011-01-01
Reserve growth is defined as the estimated increases in quantities of crude oil, natural gas, and natural gas liquids that have the potential to be added to remaining reserves in discovered accumulations through extension, revision, improved recovery efficiency, and additions of new pools or reservoirs. A new U.S. Geological Survey method was developed to assess the reserve-growth potential of technically recoverable crude oil and natural gas to be added to reserves under proven technology currently in practice within the trend or play, or which reasonably can be extrapolated from geologically similar trends or plays. This method currently is in use to assess potential additions to reserves in discovered fields of the United States. The new approach involves (1) individual analysis of selected large accumulations that contribute most to reserve growth, and (2) conventional statistical modeling of reserve growth in remaining accumulations. This report will focus on the individual accumulation analysis. In the past, the U.S. Geological Survey estimated reserve growth by statistical methods using historical recoverable-quantity data. Those statistical methods were based on growth rates averaged by the number of years since accumulation discovery. Accumulations in mature petroleum provinces with volumetrically significant reserve growth, however, bias statistical models of the data; therefore, accumulations with significant reserve growth are best analyzed separately from those with less significant reserve growth. Large (greater than 500 million barrels) and older (with respect to year of discovery) oil accumulations increase in size at greater rates late in their development history in contrast to more recently discovered accumulations that achieve most growth early in their development history. Such differences greatly affect the statistical methods commonly used to forecast reserve growth. 
The individual accumulation-analysis method involves estimating the in-place petroleum quantity and its uncertainty, as well as the estimated (forecasted) recoverability and its respective uncertainty. These variables are assigned probabilistic distributions and are combined statistically to provide probabilistic estimates of ultimate recoverable quantities. Cumulative production and remaining reserves are then subtracted from the estimated ultimate recoverable quantities to provide potential reserve growth. In practice, results of the two methods are aggregated to various scales, the highest of which includes an entire country or the world total. The aggregated results are reported along with the statistically appropriate uncertainties.
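The arithmetic of the individual accumulation analysis can be illustrated with a small Monte Carlo sketch: draw an in-place quantity and a recovery factor from assumed distributions, multiply them to get an ultimate recoverable quantity, and subtract cumulative production and remaining reserves. Every number below is hypothetical, and the triangular distributions are an assumption made for simplicity; the report does not specify the distribution shapes used by the U.S. Geological Survey.

```python
import random


def simulate_reserve_growth(n_draws=20000, seed=42):
    """Monte Carlo sketch of potential reserve growth for one accumulation."""
    rng = random.Random(seed)
    # Hypothetical inputs (minimum, mode, maximum); volumes in million barrels.
    in_place = (4000.0, 5000.0, 6500.0)
    recovery_factor = (0.30, 0.40, 0.55)
    cumulative_production = 800.0
    remaining_reserves = 300.0

    growth = []
    for _ in range(n_draws):
        # random.triangular takes (low, high, mode).
        oip = rng.triangular(in_place[0], in_place[2], in_place[1])
        rf = rng.triangular(recovery_factor[0], recovery_factor[2], recovery_factor[1])
        ultimate_recoverable = oip * rf
        growth.append(ultimate_recoverable - cumulative_production - remaining_reserves)
    growth.sort()
    return growth


def percentile(sorted_values, q):
    """Simple empirical percentile of an already sorted list (q in [0, 1])."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


growth = simulate_reserve_growth()
mean_growth = sum(growth) / len(growth)
print(f"mean potential reserve growth: {mean_growth:.0f} million barrels")
for q in (0.05, 0.50, 0.95):
    print(f"{int(q * 100):2d}th percentile: {percentile(growth, q):.0f}")
```

Reporting several percentiles rather than a single value mirrors the abstract's point that results are given along with their uncertainties.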
Four modes of optical parametric operation for squeezed state generation
NASA Astrophysics Data System (ADS)
Andersen, U. L.; Buchler, B. C.; Lam, P. K.; Wu, J. W.; Gao, J. R.; Bachor, H.-A.
2003-11-01
We report a versatile instrument, based on a monolithic optical parametric amplifier, which reliably generates four different types of squeezed light. We obtained vacuum squeezing, low power amplitude squeezing, phase squeezing and bright amplitude squeezing. We show a complete analysis of this light, including a full quantum state tomography. In addition we demonstrate the direct detection of the squeezed state statistics without the aid of a spectrum analyser. This technique makes the nonclassical properties directly visible and allows complete measurement of the statistical moments of the squeezed quadrature.
NASA Astrophysics Data System (ADS)
Hernandez-Cardoso, G. G.; Alfaro-Gomez, M.; Rojas-Landeros, S. C.; Salas-Gutierrez, I.; Castro-Camus, E.
2018-03-01
In this article, we present a series of hydration mapping images of the foot soles of diabetic and non-diabetic subjects measured by terahertz reflectance. In addition to the hydration images, we present a series of RYG-color-coded (red yellow green) images where pixels are assigned one of the three colors in order to easily identify areas at risk of ulceration. We also present the statistics of the number of pixels with each color as a potential quantitative indicator for diabetic foot-syndrome deterioration.
NASA Astrophysics Data System (ADS)
Palozzi, Jason; Pantopoulos, George; Maravelis, Angelos G.; Nordsvan, Adam; Zelilidis, Avraam
2018-02-01
This investigation presents an outcrop-based integrated study of internal division analysis and statistical treatment of turbidite bed thickness applied to a Carboniferous deep-water channel-levee complex in the Myall Trough, southeast Australia. Turbidite beds of the studied succession are characterized by a range of sedimentary structures grouped into two main associations, a thick-bedded and a thin-bedded one, that reflect channel-fill and overbank/levee deposits, respectively. Three vertically stacked channel-levee cycles have been identified. Results of statistical analysis of bed thickness, grain-size and internal division patterns applied on the studied channel-levee succession, indicate that turbidite bed thickness data seem to be well characterized by a bimodal lognormal distribution, which is possibly reflecting the difference between deposition from lower-density flows (in a levee/overbank setting) and very high-density flows (in a channel fill setting). Power law and exponential distributions were observed to hold only for the thick-bedded parts of the succession and cannot characterize the whole bed thickness range of the studied sediments. The succession also exhibits non-random clustering of bed thickness and grain-size measurements. The studied sediments are also characterized by the presence of statistically detected fining-upward sandstone packets. A novel quantitative approach (change-point analysis) is proposed for the detection of those packets. Markov permutation statistics also revealed the existence of order in the alternation of internal divisions in the succession expressed by an optimal internal division cycle reflecting two main types of gravity flow events deposited within both thick-bedded conglomeratic and thin-bedded sandstone associations. 
The analytical methods presented in this study can be used as additional tools for quantitative analysis and recognition of depositional environments in hydrocarbon-bearing research of ancient deep-water channel-levee settings.
Managing heteroscedasticity in general linear models.
Rosopa, Patrick J; Schaffer, Meline M; Schroeder, Amber N
2013-09-01
Heteroscedasticity refers to a phenomenon where data violate a statistical assumption. This assumption is known as homoscedasticity. When the homoscedasticity assumption is violated, this can lead to increased Type I error rates or decreased statistical power. Because this can adversely affect substantive conclusions, the failure to detect and manage heteroscedasticity could have serious implications for theory, research, and practice. In addition, heteroscedasticity is not uncommon in the behavioral and social sciences. Thus, in the current article, we synthesize extant literature in applied psychology, econometrics, quantitative psychology, and statistics, and we offer recommendations for researchers and practitioners regarding available procedures for detecting heteroscedasticity and mitigating its effects. In addition to discussing the strengths and weaknesses of various procedures and comparing them in terms of existing simulation results, we describe a 3-step data-analytic process for detecting and managing heteroscedasticity: (a) fitting a model based on theory and saving residuals, (b) the analysis of residuals, and (c) statistical inferences (e.g., hypothesis tests and confidence intervals) involving parameter estimates. We also demonstrate this data-analytic process using an illustrative example. Overall, detecting violations of the homoscedasticity assumption and mitigating its biasing effects can strengthen the validity of inferences from behavioral and social science data.
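The 3-step process described above can be sketched on simulated data. The specific choices here are assumptions and are not taken from the article: a simple one-predictor regression, a studentized Breusch-Pagan-style check (n times R-squared from regressing squared residuals on the predictor) for step (b), and an HC3 heteroscedasticity-consistent standard error for step (c).

```python
import math
import random


def ols(x, y):
    """Simple linear regression; returns intercept, slope and residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def r_squared(x, y):
    _, _, resid = ols(x, y)
    my = sum(y) / len(y)
    return 1.0 - sum(e * e for e in resid) / sum((yi - my) ** 2 for yi in y)


def slope_standard_errors(x, resid):
    """Classical and HC3 standard errors for the slope."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    meat = 0.0
    for xi, e in zip(x, resid):
        leverage = 1.0 / n + (xi - mx) ** 2 / sxx
        meat += (xi - mx) ** 2 * (e / (1.0 - leverage)) ** 2
    return classical, math.sqrt(meat) / sxx


# Simulated data (assumption): error spread grows in proportion to x.
rng = random.Random(7)
x = [rng.uniform(1.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]

# (a) Fit the model and save the residuals.
intercept, slope, resid = ols(x, y)
# (b) Analyze the residuals: do squared residuals depend on x?
lm_stat = len(x) * r_squared(x, [e * e for e in resid])
bp_p = math.erfc(math.sqrt(lm_stat / 2.0))  # chi-squared tail, 1 degree of freedom
# (c) Base inference on a standard error that tolerates unequal variances.
se_classical, se_hc3 = slope_standard_errors(x, resid)
print(f"slope = {slope:.3f}, check p = {bp_p:.2e}")
print(f"classical SE = {se_classical:.3f}, HC3 SE = {se_hc3:.3f}")
```

When the check in step (b) signals unequal variances, the robust standard error from step (c) is the one to carry into hypothesis tests and confidence intervals for the slope.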
SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS
NASA Technical Reports Server (NTRS)
Brownlow, J. D.
1994-01-01
The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components.
Frequency components with periods longer than the data collection interval are removed by least-squares detrending. As many as ten channels of data may be analyzed at one time. Both tabular and plotted output may be generated by the SPA program. This program is written in FORTRAN IV and has been implemented on a CDC 6000 series computer with a central memory requirement of approximately 142K (octal) of 60 bit words. This core requirement can be reduced by segmentation of the program. The SPA program was developed in 1978.
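The time domain quantities listed in this abstract can be illustrated with a short Python sketch. This is not the SPA FORTRAN code: the function names are hypothetical, the variance uses the n - 1 (sample) divisor as an assumption, and the detrending step removes only a least-squares straight line, which is one simple reading of "least-squares detrending".

```python
import math


def time_domain_stats(x):
    """Basic time domain estimates for a sequence of at least two numbers."""
    n = len(x)
    mean = sum(x) / n
    # Assumption: sample variance with the n - 1 divisor.
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    mean_square = sum(v * v for v in x) / n
    return {
        "n": n,
        "min": min(x),
        "max": max(x),
        "mean": mean,
        "variance": variance,
        "std": math.sqrt(variance),
        "mean_square": mean_square,
        "rms": math.sqrt(mean_square),
    }


def detrend_linear(x):
    """Subtract the least-squares straight line, with samples taken at t = 0..n-1."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tx = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = s_tx / s_tt
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]
```

Calling `time_domain_stats(detrend_linear(samples))` would give the statistics of the detrended series, mirroring the abstract's note that the calculations may be run on raw or filtered data.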
Detection of semi-volatile organic compounds in permeable ...
Abstract The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January, 2010. This paper describes a subset of the water quality analysis, analysis of semivolatile organic compounds (SVOCs) to determine if hydrocarbons were in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected from 11 dates over a 3 year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical test) were detected at a rate of less than 10%; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5% detections). Fundamental and exploratory statistical analyses were performed on these latter results by grouping them by surface type. The statistical analyses were limited due to low frequency of detections and dilutions of samples which impacted detection limits. The infiltrate data through three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware Comparison Hypothesis Test. Additionally Spearman Rank order non-parame
Paranjpe, Madhav G; Denton, Melissa D; Vidmar, Tom; Elbekai, Reem H
2014-01-01
Carcinogenicity studies have been performed in conventional 2-year rodent studies for at least 3 decades, whereas the short-term carcinogenicity studies in transgenic mice, such as Tg.rasH2, have only been performed over the last decade. In the 2-year conventional rodent studies, interlinked problems, such as increasing trends in the initial body weights, increased body weight gains, high incidence of spontaneous tumors, and low survival, that complicate the interpretation of findings have been well established. However, these end points have not been evaluated in the short-term carcinogenicity studies involving the Tg.rasH2 mice. In this article, we present retrospective analysis of data obtained from control groups in 26-week carcinogenicity studies conducted in Tg.rasH2 mice since 2004. Our analysis showed statistically significant decreasing trends in initial body weights of both sexes. Although the terminal body weights did not show any significant trends, there was a statistically significant increasing trend toward body weight gains, more so in males than in females, which correlated with increasing trends in the food consumption. There were no statistically significant alterations in mortality trends. In addition, the incidence of all common spontaneous tumors remained fairly constant with no statistically significant differences in trends. © The Author(s) 2014.
Role of diversity in ICA and IVA: theory and applications
NASA Astrophysics Data System (ADS)
Adalı, Tülay
2016-05-01
Independent component analysis (ICA) has been the most popular approach for solving the blind source separation problem. Starting from a simple linear mixing model and the assumption of statistical independence, ICA can recover a set of linearly-mixed sources to within a scaling and permutation ambiguity. It has been successfully applied to numerous data analysis problems in areas as diverse as biomedicine, communications, finance, geophysics, and remote sensing. ICA can be achieved using different types of diversity (statistical property) and can be posed to simultaneously account for multiple types of diversity such as higher-order statistics, sample dependence, non-circularity, and nonstationarity. A recent generalization of ICA, independent vector analysis (IVA), extends ICA to multiple data sets and adds the use of one more type of diversity, statistical dependence across the data sets, for jointly achieving independent decomposition of multiple data sets. With the addition of each new diversity type, identification of a broader class of signals becomes possible, and in the case of IVA, this includes sources that are independent and identically distributed Gaussians. We review the fundamentals and properties of ICA and IVA when multiple types of diversity are taken into account, and then ask the question whether diversity plays an important role in practical applications as well. Examples from various domains are presented to demonstrate that in many scenarios it might be worthwhile to jointly account for multiple statistical properties. This paper is submitted in conjunction with the talk delivered for the "Unsupervised Learning and ICA Pioneer Award" at the 2016 SPIE Conference on Sensing and Analysis Technologies for Biomedical and Cognitive Applications.
Comparison of the predictive validity of diagnosis-based risk adjusters for clinical outcomes.
Petersen, Laura A; Pietz, Kenneth; Woodard, LeChauncy D; Byrne, Margaret
2005-01-01
Many possible methods of risk adjustment exist, but there is a dearth of comparative data on their performance. We compared the predictive validity of 2 widely used methods (Diagnostic Cost Groups [DCGs] and Adjusted Clinical Groups [ACGs]) for 2 clinical outcomes using a large national sample of patients. We studied all patients who used Veterans Health Administration (VA) medical services in fiscal year (FY) 2001 (n = 3,069,168) and assigned both a DCG and an ACG to each. We used logistic regression analyses to compare predictive ability for death or long-term care (LTC) hospitalization for age/gender models, DCG models, and ACG models. We also assessed the effect of adding age to the DCG and ACG models. Patients in the highest DCG categories, indicating higher severity of illness, were more likely to die or to require LTC hospitalization. Surprisingly, the age/gender model predicted death slightly more accurately than the ACG model (c-statistic of 0.710 versus 0.700, respectively). The addition of age to the ACG model improved the c-statistic to 0.768. The highest c-statistic for prediction of death was obtained with a DCG/age model (0.830). The lowest c-statistics were obtained for age/gender models for LTC hospitalization (c-statistic 0.593). The c-statistic for use of ACGs to predict LTC hospitalization was 0.783, and improved to 0.792 with the addition of age. The c-statistics for use of DCGs and DCG/age to predict LTC hospitalization were 0.885 and 0.890, respectively, indicating the best prediction. We found that risk adjusters based upon diagnoses predicted an increased likelihood of death or LTC hospitalization, exhibiting good predictive validity. In this comparative analysis using VA data, DCG models were generally superior to ACG models in predicting clinical outcomes, although ACG model performance was enhanced by the addition of age.
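For readers unfamiliar with the c-statistic reported above, the following Python sketch shows how it is commonly computed for a binary outcome: the share of (event, non-event) pairs in which the event case received the higher predicted risk, with ties counted as one half. The function name and the toy data are illustrative assumptions, not taken from the study, and the pairwise loop is only practical for small samples.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for predicted risks and 0/1 outcomes."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                concordant += 1.0   # event ranked above non-event
            elif e == ne:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Toy example: hypothetical predicted risks and observed outcomes.
risks = [0.9, 0.8, 0.3, 0.2]
died = [1, 0, 1, 0]
print(c_statistic(risks, died))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the 0.593 to 0.890 values in the abstract should be read.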
Eijssen, Lars M T; Goelela, Varshna S; Kelder, Thomas; Adriaens, Michiel E; Evelo, Chris T; Radonjic, Marijana
2015-06-30
Illumina whole-genome expression bead arrays are a widely used platform for transcriptomics. Most of the tools available for the analysis of the resulting data are not easily applicable by less experienced users. ArrayAnalysis.org provides researchers with an easy-to-use and comprehensive interface to the functionality of R and Bioconductor packages for microarray data analysis. As a modular open source project, it allows developers to contribute modules that provide support for additional types of data or extend workflows. To enable data analysis of Illumina bead arrays for a broad user community, we have developed a module for ArrayAnalysis.org that provides a free and user-friendly web interface for quality control and pre-processing for these arrays. This module can be used together with existing modules for statistical and pathway analysis to provide a full workflow for Illumina gene expression data analysis. The module accepts data exported from Illumina's GenomeStudio, and provides the user with quality control plots and normalized data. The outputs are directly linked to the existing statistics module of ArrayAnalysis.org, but can also be downloaded for further downstream analysis in third-party tools. The Illumina bead arrays analysis module is available at http://www.arrayanalysis.org . A user guide, a tutorial demonstrating the analysis of an example dataset, and R scripts are available. The module can be used as a starting point for statistical evaluation and pathway analysis provided on the website or to generate processed input data for a broad range of applications in life sciences research.
NASA Astrophysics Data System (ADS)
Sanchez, J.
2018-06-01
In this paper, the asymptotic approximation method is applied to a single degree-of-freedom system and analyzed. The original concepts are summarized, and the necessary probabilistic concepts are developed and applied to single degree-of-freedom systems. Then, these concepts are united, and the theoretical and computational models are developed. To determine the viability of the proposed method in a probabilistic context, numerical experiments are conducted, consisting of a frequency analysis, an analysis of the effects of measurement noise, and a statistical analysis. In addition, two examples are presented and discussed.
Model-Based Linkage Analysis of a Quantitative Trait.
Song, Yeunjoo E; Song, Sunah; Schnell, Audrey H
2017-01-01
Linkage Analysis is a family-based method of analysis to examine whether any typed genetic markers cosegregate with a given trait, in this case a quantitative trait. If linkage exists, this is taken as evidence in support of a genetic basis for the trait. Historically, linkage analysis was performed using a binary disease trait, but has been extended to include quantitative disease measures. Quantitative traits are desirable as they provide more information than binary traits. Linkage analysis can be performed using single-marker methods (one marker at a time) or multipoint (using multiple markers simultaneously). In model-based linkage analysis the genetic model for the trait of interest is specified. There are many software options for performing linkage analysis. Here, we use the program package Statistical Analysis for Genetic Epidemiology (S.A.G.E.). S.A.G.E. was chosen because it also includes programs to perform data cleaning procedures and to generate and test genetic models for a quantitative trait, in addition to performing linkage analysis. We demonstrate in detail the process of running the program LODLINK to perform single-marker analysis, and MLOD to perform multipoint analysis using output from SEGREG, where SEGREG was used to determine the best fitting statistical model for the trait.
Analysis of S-box in Image Encryption Using Root Mean Square Error Method
NASA Astrophysics Data System (ADS)
Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan
2012-07-01
The use of substitution boxes (S-boxes) in encryption applications has proven to be an effective nonlinear component in creating confusion and randomness. The S-box is evolving and many variants appear in literature, which include advanced encryption standard (AES) S-box, affine power affine (APA) S-box, Skipjack S-box, Gray S-box, Lui J S-box, residue prime number S-box, Xyi S-box, and S8 S-box. These S-boxes have algebraic and statistical properties which distinguish them from each other in terms of encryption strength. In some circumstances, the parameters from algebraic and statistical analysis yield results which do not provide clear evidence in distinguishing an S-box for an application to a particular set of data. In image encryption applications, the use of S-boxes needs special care because the visual analysis and perception of a viewer can sometimes identify artifacts embedded in the image. In addition to existing algebraic and statistical analysis already used for image encryption applications, we propose an application of root mean square error technique, which further elaborates the results and enables the analyst to vividly distinguish between the performances of various S-boxes. While the use of the root mean square error analysis in statistics has proven to be effective in determining the difference in original data and the processed data, its use in image encryption has shown promising results in estimating the strength of the encryption method. In this paper, we show the application of the root mean square error analysis to S-box image encryption. The parameters from this analysis are used in determining the strength of S-boxes
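As a rough illustration of the root mean square error measure discussed in this abstract, the Python sketch below compares a flattened list of 8-bit pixel values with its substituted version. The `substitute` helper and the bit-complement lookup table are hypothetical stand-ins chosen for brevity; the table is not one of the S-boxes named above and has no cryptographic strength, and the authors' exact procedure is not given in the abstract.

```python
import math


def rmse(original, processed):
    """Root mean square error between two equal-length pixel sequences."""
    if not original or len(original) != len(processed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(original, processed))
    return math.sqrt(squared / len(original))


def substitute(pixels, sbox):
    """Apply a 256-entry lookup table to every 8-bit pixel value."""
    return [sbox[p] for p in pixels]


# Toy lookup table (assumption): maps v to 255 - v. Illustrative only.
toy_sbox = [255 - v for v in range(256)]

plain = [0, 10, 128, 255]
cipher = substitute(plain, toy_sbox)
print(rmse(plain, cipher))
```

Under this reading, a larger RMSE means the substituted image differs more from the original; an identity table would give an RMSE of zero.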
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, J.
1999-01-01
A new atmospheric objective analysis system designed for parallel computers will be described. The system can produce a global analysis (on a 1 X 1 lat-lon grid with 18 levels of heights and winds and 10 levels of moisture) using 120,000 observations in 17 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g. MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first guess error statistics to produce the best estimate of the atmospheric state. Static tests with a 2 X 2.5 resolution version of this system showed its analysis increments are comparable to the latest NASA operational system including maintenance of mass-wind balance. Results from several months of cycling test in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) as the current operational system.
Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.
Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei
2016-02-01
Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.
Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions
Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei
2015-01-01
Summary Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT) which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979
Statistical quality control through overall vibration analysis
NASA Astrophysics Data System (ADS)
Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, the quality results of the finished parts under different combinations of process variables are evaluated. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley, and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control.
This paper demonstrates the existence of predictive variables (high-frequency vibration displacements) that are sensitive to the process setup and the quality of the products obtained. Based on the result of this overall vibration analysis, a second paper will analyse self-induced vibration spectrums in order to define limit vibration bands, controllable every cycle or connected to permanent vibration-monitoring systems able to adjust the sensitive process variables identified by ANOVA, once the vibration readings exceed established quality limits.
Yigzaw, Kassaye Yitbarek; Michalas, Antonis; Bellika, Johan Gustav
2017-01-03
Techniques have been developed to compute statistics on distributed datasets without revealing private information except the statistical results. However, duplicate records in a distributed dataset may lead to incorrect statistical results. Therefore, to increase the accuracy of the statistical analysis of a distributed dataset, secure deduplication is an important preprocessing step. We designed a secure protocol for the deduplication of horizontally partitioned datasets with deterministic record linkage algorithms. We provided a formal security analysis of the protocol in the presence of semi-honest adversaries. The protocol was implemented and deployed across three microbiology laboratories located in Norway, and we ran experiments on the datasets in which the number of records for each laboratory varied. Experiments were also performed on simulated microbiology datasets and data custodians connected through a local area network. The security analysis demonstrated that the protocol protects the privacy of individuals and data custodians under a semi-honest adversarial model. More precisely, the protocol remains secure with the collusion of up to N - 2 corrupt data custodians. The total runtime for the protocol scales linearly with the addition of data custodians and records. One million simulated records distributed across 20 data custodians were deduplicated within 45 s. The experimental results showed that the protocol is more efficient and scalable than previous protocols for the same problem. The proposed deduplication protocol is efficient and scalable for practical uses while protecting the privacy of patients and data custodians.
Medicaid reimbursement, prenatal care and infant health.
Sonchak, Lyudmyla
2015-12-01
This paper evaluates the impact of state-level Medicaid reimbursement rates for obstetric care on prenatal care utilization across demographic groups. It also uses these rates as an instrumental variable to assess the importance of prenatal care on birth weight. The analysis is conducted using a unique dataset of Medicaid reimbursement rates and 2001-2010 Vital Statistics Natality data. Conditional on county fixed effects, the study finds a modest, but statistically significant positive relationship between Medicaid reimbursement rates and the number of prenatal visits obtained by pregnant women. Additionally, higher rates are associated with an increase in the probability of obtaining adequate care, as well as a reduction in the incidence of going without any prenatal care. However, the effect of an additional prenatal visit on birth weight is virtually zero for black disadvantaged mothers, while an additional visit yields a substantial increase in birth weight of over 20 g for white disadvantaged mothers. Copyright © 2015 Elsevier B.V. All rights reserved.
A phylogenetic transform enhances analysis of compositional microbiota data
Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A
2017-01-01
Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities. DOI: http://dx.doi.org/10.7554/eLife.21887.001 PMID:28198697
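The isometric log-ratio idea behind this abstract can be sketched for a single balance between two groups of taxa. The Python below is a minimal, unweighted illustration and not the PhILR package itself; that PhILR takes its two-group splits from the internal nodes of a phylogenetic tree is an assumption drawn from the abstract's description, and the function names are hypothetical. All abundances must be strictly positive, so zero counts would need a pseudocount first.

```python
import math


def closure(x):
    """Rescale positive abundances so they sum to 1 (relative abundance)."""
    total = sum(x)
    return [v / total for v in x]


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(composition, numerator_idx, denominator_idx):
    """One ILR balance: the scaled log-ratio of two groups' geometric means."""
    r = len(numerator_idx)
    s = len(denominator_idx)
    g_num = geometric_mean([composition[i] for i in numerator_idx])
    g_den = geometric_mean([composition[i] for i in denominator_idx])
    return math.sqrt(r * s / (r + s)) * math.log(g_num / g_den)


# Hypothetical counts for three taxa; taxon 0 is contrasted with taxa 1 and 2.
counts = [1.0, 2.0, 4.0]
print(ilr_balance(closure(counts), [0], [1, 2]))
```

Because the balance depends only on ratios, it gives the same value for raw counts and for relative abundances, which is the property that lets standard statistical tools be applied after the transform.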
A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.
Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer
2016-09-10
When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results
NASA Technical Reports Server (NTRS)
Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)
1994-01-01
In the last three years extensive performance data have been reported for parallel machines both based on the NAS Parallel Benchmarks, and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included peak performance of the machine, and the LINPACK n and n(sub 1/2) values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP each have a unique signature. 3) The remaining NPB can be grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format, and will present the data of our statistical analysis in detail.
Factorial analysis of trihalomethanes formation in drinking water.
Chowdhury, Shakhawat; Champagne, Pascale; McLellan, P James
2010-06-01
Disinfection of drinking water reduces pathogenic infection, but may pose risks to human health through the formation of disinfection byproducts. The effects of different factors on the formation of trihalomethanes were investigated using a statistically designed experimental program, and a predictive model for trihalomethanes formation was developed. Synthetic water samples with different factor levels were produced, and trihalomethanes concentrations were measured. A replicated fractional factorial design with center points was performed, and significant factors were identified through statistical analysis. A second-order trihalomethanes formation model was developed from 92 experiments, and the statistical adequacy was assessed through appropriate diagnostics. This model was validated using additional data from the Drinking Water Surveillance Program database and was applied to the Smiths Falls water supply system in Ontario, Canada. The model predictions were correlated strongly to the measured trihalomethanes, with correlations of 0.95 and 0.91, respectively. The resulting model can assist in analyzing risk-cost tradeoffs in the design and operation of water supply systems.
An Adaptive Buddy Check for Observational Quality Control
NASA Technical Reports Server (NTRS)
Dee, Dick P.; Rukhovets, Leonid; Todling, Ricardo; DaSilva, Arlindo M.; Larson, Jay W.; Einaudi, Franco (Technical Monitor)
2000-01-01
An adaptive buddy check algorithm is presented that adjusts tolerances for outlier observations based on the variability of surrounding data. The algorithm derives from a statistical hypothesis test combined with maximum-likelihood covariance estimation. Its stability is shown to depend on the initial identification of outliers by a simple background check. The adaptive feature ensures that the final quality control decisions are not very sensitive to prescribed statistics of first-guess and observation errors, nor to other approximations introduced into the algorithm. The implementation of the algorithm in a global atmospheric data assimilation is described. Its performance is contrasted with that of a non-adaptive buddy check, for the surface analysis of an extreme storm that took place in Europe on 27 December 1999. The adaptive algorithm allowed the inclusion of many important observations that differed greatly from the first guess and that would have been excluded on the basis of prescribed statistics. The analysis of the storm development was much improved as a result of these additional observations.
A Tool for Estimating Variability in Wood Preservative Treatment Retention
Patricia K. Lebow; Adam M. Taylor; Timothy M. Young
2015-01-01
Composite sampling is standard practice for evaluation of preservative retention levels in preservative-treated wood. Current protocols provide an average retention value but no estimate of uncertainty. Here we describe a statistical method for calculating uncertainty estimates using the standard sampling regime with minimal additional chemical analysis. This tool can...
An Interpersonal Analysis of Pathological Personality Traits in "DSM-5"
ERIC Educational Resources Information Center
Wright, Aidan G. C.; Pincus, Aaron L.; Hopwood, Christopher J.; Thomas, Katherine M.; Markon, Kristian E.; Krueger, Robert F.
2012-01-01
The proposed changes to the personality disorder section of the "Diagnostic and Statistical Manual of Mental Disorders" (5th ed.) place an increased focus on interpersonal impairment as one of the defining features of personality psychopathology. In addition, a proposed trait model has been offered to provide a means of capturing…
ERIC Educational Resources Information Center
Bitler, Marianne; Domina, Thurston; Penner, Emily; Hoynes, Hilary
2015-01-01
We use quantile treatment effects estimation to examine the consequences of the random-assignment New York City School Choice Scholarship Program across the distribution of student achievement. Our analyses suggest that the program had negligible and statistically insignificant effects across the skill distribution. In addition to contributing to…
Linguistic Features of Humor in Academic Writing
ERIC Educational Resources Information Center
Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S.
2016-01-01
A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using Factor Analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were also…
ERIC Educational Resources Information Center
Palazotto, Anthony N.; And Others
This report is the result of a pilot program to seek out ways for developing an educational institution's transportation flow. Techniques and resulting statistics are discussed. Suggestions for additional uses of the information obtained are indicated. (Author)
Multi-Scale Surface Descriptors
Cipriano, Gregory; Phillips, George N.; Gleicher, Michael
2010-01-01
Local shape descriptors compactly characterize regions of a surface, and have been applied to tasks in visualization, shape matching, and analysis. Classically, curvature has been used as a shape descriptor; however, this differential property characterizes only an infinitesimal neighborhood. In this paper, we provide shape descriptors for surface meshes designed to be multi-scale, that is, capable of characterizing regions of varying size. These descriptors capture statistically the shape of a neighborhood around a central point by fitting a quadratic surface. They therefore mimic differential curvature, are efficient to compute, and encode anisotropy. We show how simple variants of mesh operations can be used to compute the descriptors without resorting to expensive parameterizations, and additionally provide a statistical approximation for reduced computational cost. We show how these descriptors apply to a number of uses in visualization, analysis, and matching of surfaces, particularly to tasks in protein surface analysis. PMID:19834190
Evaluation of statistical distributions to analyze the pollution of Cd and Pb in urban runoff.
Toranjian, Amin; Marofi, Safar
2017-05-01
Heavy metal pollution in urban runoff causes severe environmental damage. Identification of these pollutants and their statistical analysis is necessary to provide management guidelines. In this study, 45 continuous probability distribution functions were selected to fit the Cd and Pb data in the runoff events of an urban area during October 2014-May 2015. The sampling was conducted from the outlet of the city basin during seven precipitation events. For evaluation and ranking of the functions, we used the goodness of fit Kolmogorov-Smirnov and Anderson-Darling tests. The results of Cd analysis showed that Hyperbolic Secant, Wakeby and Log-Pearson 3 are suitable for frequency analysis of the event mean concentration (EMC), the instantaneous concentration series (ICS) and instantaneous concentration of each event (ICEE), respectively. In addition, the LP3, Wakeby and Generalized Extreme Value functions were chosen for the EMC, ICS and ICEE related to Pb contamination.
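The ranking step described in this abstract (fit candidate distributions, then order them by a goodness-of-fit statistic) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses only three candidate distributions fitted by simple moment or log-moment estimators, only the Kolmogorov-Smirnov distance, and hypothetical concentration values. The function names are assumptions made for this example.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def fit_candidates(sample):
    """Return {name: cdf} for a few candidates (assumed set, positive data only)."""
    logs = [math.log(x) for x in sample]
    normal = NormalDist(mean(sample), stdev(sample))
    log_normal = NormalDist(mean(logs), stdev(logs))
    rate = 1.0 / mean(sample)
    return {
        "normal": normal.cdf,
        "lognormal": lambda x: log_normal.cdf(math.log(x)),
        "exponential": lambda x: 1.0 - math.exp(-rate * x),
    }


def rank_by_ks(sample):
    """Sort candidate distributions from best (smallest distance) to worst."""
    scores = {name: ks_statistic(sample, cdf)
              for name, cdf in fit_candidates(sample).items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical concentrations for illustration; not data from the study.
concentrations = [0.8, 1.1, 1.3, 1.9, 2.4, 2.6, 3.5, 4.8, 7.2, 11.0]
ranking = rank_by_ks(concentrations)
for name, distance in ranking:
    print(f"{name}: D = {distance:.3f}")
```

A smaller distance indicates a closer fit. A fuller treatment would add the Anderson-Darling statistic, which weights the tails more heavily, and would account for the fact that the parameters were estimated from the same sample.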
MAI statistics estimation and analysis in a DS-CDMA system
NASA Astrophysics Data System (ADS)
Alami Hassani, A.; Zouak, M.; Mrabti, M.; Abdi, F.
2018-05-01
A primary limitation of Direct Sequence Code Division Multiple Access (DS-CDMA) link performance and system capacity is multiple access interference (MAI). To examine the performance of CDMA systems in the presence of MAI, i.e., in a multiuser environment, several works assumed that the interference can be approximated by a Gaussian random variable. In this paper, we first develop a new and simple approach to characterize the MAI in a multiuser system. In addition to statistically quantifying the MAI power, the paper also proposes a statistical model for both variance and mean of the MAI for synchronous and asynchronous CDMA transmission. We show that the MAI probability density function (PDF) is Gaussian for the equal-received-energy case and validate it by computer simulations.
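The kind of computer simulation mentioned in this abstract can be illustrated with a textbook setup, which is an assumption here and not the model developed in the paper: a synchronous system with random binary (+1/-1) spreading codes, equal received energy, and no noise. Under those assumptions each interferer contributes a normalized cross-correlation with variance 1/N, so the MAI seen by the desired user has mean 0 and variance (K-1)/N for K users and N chips per bit.

```python
import random
from statistics import mean, pvariance


def simulate_mai(num_users, chips, trials, seed=0):
    """Sample the MAI at the matched-filter output of user 1 (toy model)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        desired = [rng.choice((-1, 1)) for _ in range(chips)]
        mai = 0.0
        for _ in range(num_users - 1):
            code = [rng.choice((-1, 1)) for _ in range(chips)]
            bit = rng.choice((-1, 1))
            # Normalized cross-correlation between the two spreading codes.
            rho = sum(a * b for a, b in zip(desired, code)) / chips
            mai += bit * rho
        samples.append(mai)
    return samples


K, N = 10, 31                      # users and chips per bit (illustrative values)
samples = simulate_mai(K, N, trials=2000)
print("sample mean    :", mean(samples))
print("sample variance:", pvariance(samples))
print("theory (K-1)/N :", (K - 1) / N)
```

Comparing a histogram of `samples` with a normal density of the same variance is the usual way to check the Gaussian approximation visually.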
Effect of sexual steroids on boar kinematic sperm subpopulations.
Ayala, E M E; Aragón, M A
2017-11-01
Here, we show the effects of sexual steroids, progesterone, testosterone, or estradiol on motility parameters of boar sperm. Sixteen commercial seminal doses, four from each of four adult boars, were analyzed using computer assisted sperm analysis (CASA). Mean values of motility parameters were analyzed by bivariate and multivariate statistics. Principal component analysis (PCA), followed by hierarchical clustering, was applied on data of motility parameters, provided automatically as intervals by the CASA system. Effects of sexual steroids were described in the kinematic subpopulations identified from multivariate statistics. Mean values of motility parameters were not significantly changed after addition of sexual steroids. Multivariate graphics showed that sperm subpopulations were not sensitive to the addition of either testosterone or estradiol, but sperm subpopulations responsive to progesterone were found. Distributions of motility parameters were wide in controls but sharpened at distinct concentrations of progesterone. We conclude that kinematic sperm subpopulations responsive to progesterone are present in boar semen, and these subpopulations are masked in evaluations of mean values of motility parameters. © 2017 International Society for Advancement of Cytometry.
Panattil, Prabitha; Sreelatha, M
2016-09-01
Proteinuria is always associated with intrinsic kidney disease and is a strong predictor of later development of End Stage Renal Disease (ESRD). As Renin Angiotensin Aldosterone System (RAAS) has a role in mediating proteinuria, inhibitors of this system are renoprotective and patients with refractory proteinuria are put on a combination of these agents. The routinely employed triple blockade of RAAS with Angiotensin Converting Enzyme (ACE) inhibitor, ARB and Aldosterone antagonist has many limitations. Addition of Aliskiren to this combination suppresses the RAAS at the earliest stage and can offset many of these limitations. This study was conducted to assess the safety and efficacy of complete RAAS blockade by the addition of Aliskiren in those patients with refractory proteinuria who were already on triple blockade with ACE inhibitor, ARB and Aldosterone antagonist. This study was conducted in Nephrology Department, Calicut Medical College. A total of 36 patients with refractory proteinuria who were already on ACE inhibitor, ARB and Aldosterone antagonist were divided into two groups, A and B. Group A received Aliskiren in addition to the above combination whereas group B continued the same treatment for 12 weeks. Efficacy of the treatment was assessed by recording 24hr urine protein and safety by S.Creatinine, S.Potassium every 2 weeks of the treatment period. Statistical analysis of the lab values was done using SPSS software. Unpaired t-test, Paired t-test and Chi-square test were done for data analysis. Statistical analysis revealed that addition of Aliskiren to the combination therapy with ACE inhibitor + ARB + Aldosterone antagonist offers no advantage. But mean reduction in proteinuria was more with Group A than Group B. There is no statistically significant change in S.Creatinine and S.Potassium at the end of treatment. As proteinuria is a strong risk factor for progression to ESRD, even a mild decrease in proteinuria by treatment is renoprotective.
Hence treatment with group A may be considered clinically superior to group B with no alteration in safety and tolerability. But further multicentre studies with larger sample size and dose escalation are required for confirmation.
CORSSA: Community Online Resource for Statistical Seismicity Analysis
NASA Astrophysics Data System (ADS)
Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.
2011-12-01
Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology, especially to those aspects with great impact on public policy, statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles with an additional six in draft form along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.
Do perfume additives termed human pheromones warrant being termed pheromones?
Winman, Anders
2004-09-30
Two studies of the effects of perfume additives, termed human pheromones by the authors, have conveyed the message that these substances can promote an increase in human sociosexual behaviour [Physiol. Behav. 75 (2003) R1; Arch. Sex. Behav. 27 (1998) R2]. The present paper presents an extended analysis of this data. It is shown that in neither study is there a statistically significant increase in any of the sociosexual behaviours for the experimental groups. In the control groups of both studies, there are, however, moderate but statistically significant decreases in the corresponding behaviour. Most notably, there is no support in data for the claim that the substances increase the attractiveness of the wearers of the substances to the other sex. It is concluded that more research using matched homogenous groups of participants is needed. Copyright 2004 Elsevier Inc.
INFORMATION: THEORY, BRAIN, AND BEHAVIOR
Jensen, Greg; Ward, Ryan D.; Balsam, Peter D.
2016-01-01
In the 65 years since its formal specification, information theory has become an established statistical paradigm, providing powerful tools for quantifying probabilistic relationships. Behavior analysis has begun to adopt these tools as a novel means of measuring the interrelations between behavior, stimuli, and contingent outcomes. This approach holds great promise for making more precise determinations about the causes of behavior and the forms in which conditioning may be encoded by organisms. In addition to providing an introduction to the basics of information theory, we review some of the ways that information theory has informed the studies of Pavlovian conditioning, operant conditioning, and behavioral neuroscience. In addition to enriching each of these empirical domains, information theory has the potential to act as a common statistical framework by which results from different domains may be integrated, compared, and ultimately unified. PMID:24122456
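As a concrete illustration of the quantities this review builds on, the sketch below estimates Shannon entropy and mutual information from paired observations, for example a stimulus and an outcome recorded on each trial. It uses plug-in (relative frequency) estimates, which are biased for small samples; the trial data are hypothetical and the function names are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(symbols):
    """Plug-in Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits, from paired observations."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical trials: 1 = cue present / reward delivered, 0 = absent.
cue = [0, 0, 1, 1]
reward_contingent = [0, 0, 1, 1]    # reward always follows the cue
reward_independent = [0, 1, 0, 1]   # reward unrelated to the cue

print(mutual_information(cue, reward_contingent))   # 1 bit: cue fully predicts reward
print(mutual_information(cue, reward_independent))  # 0 bits: cue carries no information
```

Mutual information is symmetric and, unlike a correlation coefficient, does not assume a linear relationship between the two variables.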
Analyzing Immunoglobulin Repertoires
Chaudhary, Neha; Wesemann, Duane R.
2018-01-01
Somatic assembly of T cell receptor and B cell receptor (BCR) genes produces a vast diversity of lymphocyte antigen recognition capacity. The advent of efficient high-throughput sequencing of lymphocyte antigen receptor genes has recently generated unprecedented opportunities for exploration of adaptive immune responses. With these opportunities have come significant challenges in understanding the analysis techniques that most accurately reflect underlying biological phenomena. In this regard, sample preparation and sequence analysis techniques, which have largely been borrowed and adapted from other fields, continue to evolve. Here, we review current methods and challenges of library preparation, sequencing and statistical analysis of lymphocyte receptor repertoire studies. We discuss the general steps in the process of immune repertoire generation including sample preparation, platforms available for sequencing, processing of sequencing data, measurable features of the immune repertoire, and the statistical tools that can be used for analysis and interpretation of the data. Because BCR analysis harbors additional complexities, such as immunoglobulin (Ig) (i.e., antibody) gene somatic hypermutation and class switch recombination, the emphasis of this review is on Ig/BCR sequence analysis. PMID:29593723
Load Model Verification, Validation and Calibration Framework by Statistical Analysis on Field Data
NASA Astrophysics Data System (ADS)
Jiao, Xiangqing; Liao, Yuan; Nguyen, Thai
2017-11-01
Accurate load models are critical for power system analysis and operation. A large amount of research work has been done on load modeling. Most of the existing research focuses on developing load models, while little has been done on developing formal load model verification and validation (V&V) methodologies or procedures. Most of the existing load model validation is based on qualitative rather than quantitative analysis. In addition, not all aspects of model V&V problem have been addressed by the existing approaches. To complement the existing methods, this paper proposes a novel load model verification and validation framework that can systematically and more comprehensively examine load model's effectiveness and accuracy. Statistical analysis, instead of visual check, quantifies the load model's accuracy, and provides a confidence level of the developed load model for model users. The analysis results can also be used to calibrate load models. The proposed framework can be used as a guidance to systematically examine load models for utility engineers and researchers. The proposed method is demonstrated through analysis of field measurements collected from a utility system.
Chua, Felicia H Z; Thien, Ady; Ng, Lee Ping; Seow, Wan Tew; Low, David C Y; Chang, Kenneth T E; Lian, Derrick W Q; Loh, Eva; Low, Sharon Y Y
2017-03-01
Posterior fossa syndrome (PFS) is a serious complication faced by neurosurgeons and their patients, especially in paediatric medulloblastoma patients. The uncertain aetiology of PFS, myriad of cited risk factors and therapeutic challenges make this phenomenon an elusive entity. The primary objective of this study was to identify associative factors related to the development of PFS in medulloblastoma patients post-tumour resection. This is a retrospective study based at a single institution. Patient data and all related information were collected from the hospital records, in accordance with a list of possible risk factors associated with PFS. These included pre-operative tumour volume, hydrocephalus, age, gender, extent of resection, metastasis, ventriculoperitoneal shunt insertion, post-operative meningitis and radiological changes in MRI. Additional variables included molecular and histological subtypes of each patient's medulloblastoma tumour. Statistical analysis was employed to determine evidence of each variable's significance in PFS permanence. A total of 19 patients with appropriately complete data were identified. Initial univariate analysis did not show any statistical significance. However, multivariate analysis for MRI-specific changes showed that bilateral DWI restricted diffusion changes involving both right and left sides of the surgical cavity were statistically significant for PFS permanence. The authors performed a clinical study that evaluated possible risk factors for permanent PFS in paediatric medulloblastoma patients. Analysis of collated results found that post-operative DWI restriction in bilateral regions within the surgical cavity demonstrated statistical significance as a predictor of PFS permanence, a novel finding in the current literature.
Effects of ozone (O3) therapy on cisplatin-induced ototoxicity in rats.
Koçak, Hasan Emre; Taşkın, Ümit; Aydın, Salih; Oktay, Mehmet Faruk; Altınay, Serdar; Çelik, Duygu Sultan; Yücebaş, Kadir; Altaş, Bengül
2016-12-01
The aim of this study is to investigate the effect of rectal ozone and intratympanic ozone therapy on cisplatin-induced ototoxicity in rats. Eighteen female Wistar albino rats were included in our study. External auditory canal and tympanic membrane examinations were normal in all rats. The rats were randomly divided into three groups. Initially, all the rats were tested with distortion product otoacoustic emissions (DPOAE), and emissions were measured normally. All rats were injected with 5-mg/kg/day cisplatin for 3 days intraperitoneally. Ototoxicity had developed in all rats, as confirmed with DPOAE after 1 week. Group 1 received rectal and intratympanic ozone therapy. No treatment was administered for the rats in Group 2 as the control group. The rats in Group 3 were treated with rectal ozone. All the rats were tested with DPOAE under general anesthesia, and all were sacrificed for pathological examination 1 week after ozone administration. Their cochleas were removed. The outer hair cell damage and stria vascularis damage were examined. In the statistical analysis conducted, a statistically significant difference between Group 1 and Group 2 was observed in all frequencies according to the DPOAE test. In addition, between Group 2 and Group 3, a statistically significant difference was observed in the DPOAE test. However, a statistically significant difference was not observed between Group 1 and Group 3 according to the DPOAE test. According to histopathological scoring, the outer hair cell damage score was statistically significantly high in Group 2 compared with Group 1. In addition, the outer hair cell damage score was also statistically significantly high in Group 2 compared with Group 3. Outer hair cell damage scores were low in Group 1 and Group 3, but there was no statistically significant difference between these groups. There was no statistically significant difference between the groups in terms of stria vascularis damage score examinations.
Systemic ozone gas therapy is effective in the treatment of cell damage in cisplatin-induced ototoxicity. The intratympanic administration of ozone gas does not have any additional advantage over the rectal administration.
Descriptive Statistics and Cluster Analysis for Extreme Rainfall in Java Island
NASA Astrophysics Data System (ADS)
E Komalasari, K.; Pawitan, H.; Faqih, A.
2017-03-01
This study aims to describe the regional pattern of extreme rainfall based on maximum daily rainfall for the period 1983 to 2012 on Java Island. Descriptive statistical analysis was performed to obtain the central tendency, variation and distribution of the maximum precipitation data. Mean and median are used to measure central tendency, while interquartile range (IQR) and standard deviation are used to measure variation. In addition, skewness and kurtosis are used to describe the shape of the distribution of the rainfall data. Cluster analysis using squared Euclidean distance and Ward's method is applied to perform regional grouping. The results show that the mean of maximum daily rainfall in the Java region during 1983-2012 is around 80-181 mm, with a median between 75 and 160 mm and a standard deviation between 17 and 82. Cluster analysis produces four clusters and shows that the western area of Java tends to have higher annual maxima of daily rainfall than the northern area, and more variation in annual maximum values.
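The descriptive measures named in this abstract can be computed for a single station's series as in the sketch below. The estimator choices are assumptions of this example (population standard deviation, moment-based skewness and excess kurtosis, and the default quartile method of Python's statistics module), as are the rainfall values; the abstract does not say which variants the authors used.

```python
from statistics import mean, median, pstdev, quantiles


def describe(values):
    """Central tendency, spread and shape measures for one data series."""
    m = mean(values)
    sd = pstdev(values)
    q1, _, q3 = quantiles(values, n=4)          # quartile cut points
    z = [(v - m) / sd for v in values]          # standardized values
    return {
        "mean": m,
        "median": median(values),
        "iqr": q3 - q1,
        "std": sd,
        "skewness": mean([t ** 3 for t in z]),
        "excess_kurtosis": mean([t ** 4 for t in z]) - 3.0,
    }


# Hypothetical annual maxima of daily rainfall (mm) for one station.
annual_max_mm = [82, 95, 101, 110, 118, 124, 137, 150, 176, 240]
summary = describe(annual_max_mm)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A positive skewness, as expected for annual maxima, indicates a long right tail, which is one reason the median and IQR are reported alongside the mean and standard deviation.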
Uei, Shu-Lin; Tsai, Chung-Hung; Kuo, Yu-Ming
2016-04-29
Telehealth cost analysis has become a crucial issue for governments in recent years. In this study, we examined cases of metabolic syndrome in Hualien County, Taiwan. This research adopted the framework proposed by Marchand to establish a study process. In addition, descriptive statistics, a t test, analysis of variance, and regression analysis were employed to analyze 100 questionnaires. The results of the t test revealed significant differences in medical health expenditure, number of clinical visits for medical treatment, average amount of time spent commuting to clinics, amount of time spent undergoing medical treatment, and average number of people accompanying patients to medical care facilities or assisting with other tasks in the past month, indicating that offering telehealth care services can reduce health expenditure. The statistical analysis results revealed that customer satisfaction has a positive effect on reducing health expenditure. Therefore, this study proves that telehealth care systems can effectively reduce health expenditure and directly improve customer satisfaction with medical treatment.
Lamart, Stephanie; Griffiths, Nina M; Tchitchek, Nicolas; Angulo, Jaime F; Van der Meeren, Anne
2017-03-01
The aim of this work was to develop a computational tool that integrates several statistical analysis features for biodistribution data from internal contamination experiments. These data represent actinide levels in biological compartments as a function of time and are derived from activity measurements in tissues and excreta. These experiments aim at assessing the influence of different contamination conditions (e.g. intake route or radioelement) on the biological behavior of the contaminant. The ever increasing number of datasets and diversity of experimental conditions make the handling and analysis of biodistribution data difficult. This work sought to facilitate the statistical analysis of a large number of datasets and the comparison of results from diverse experimental conditions. Functional modules were developed using the open-source programming language R to facilitate specific operations: descriptive statistics, visual comparison, curve fitting, and implementation of biokinetic models. In addition, the structure of the datasets was harmonized using the same table format. Analysis outputs can be written in text files and updated data can be written in the consistent table format. Hence, a data repository is built progressively, which is essential for the optimal use of animal data. Graphical representations can be automatically generated and saved as image files. The resulting computational tool was applied using data derived from wound contamination experiments conducted under different conditions. In facilitating biodistribution data handling and statistical analyses, this computational tool ensures faster analyses and a better reproducibility compared with the use of multiple office software applications. Furthermore, re-analysis of archival data and comparison of data from different sources is made much easier. Hence this tool will help to understand better the influence of contamination characteristics on actinide biokinetics. 
Our approach can aid the optimization of treatment protocols and therefore contribute to the improvement of the medical response after internal contamination with actinides.
[Evaluative designs in public health: methodological considerations].
López, Ma José; Marí-Dell'Olmo, Marc; Pérez-Giménez, Anna; Nebot, Manel
2011-06-01
Evaluation of public health interventions poses numerous methodological challenges. Randomization of individuals is not always feasible and interventions are usually composed of multiple factors. To face these challenges, certain elements, such as the selection of the most appropriate design and the use of a statistical analysis that includes potential confounders, are essential. The objective of this article was to describe the most frequently used designs in the evaluation of public health interventions (policies, programs or campaigns). The characteristics, strengths and weaknesses of each of these evaluative designs are described. Additionally, a brief explanation of the most commonly used statistical analysis in each of these designs is provided. Copyright © 2011 Sociedad Española de Salud Pública y Administración Sanitaria. Published by Elsevier Espana. All rights reserved.
Docking studies on NSAID/COX-2 isozyme complexes using Contact Statistics analysis
NASA Astrophysics Data System (ADS)
Ermondi, Giuseppe; Caron, Giulia; Lawrence, Raelene; Longo, Dario
2004-11-01
The selective inhibition of COX-2 isozymes should lead to a new generation of NSAIDs with significantly reduced side effects; e.g. celecoxib (Celebrex®) and rofecoxib (Vioxx®). To obtain inhibitors with higher selectivity it has become essential to gain additional insight into the details of the interactions between COX isozymes and NSAIDs. Although X-ray structures of COX-2 complexed with a small number of ligands are available, experimental data are missing for two well-known selective COX-2 inhibitors (rofecoxib and nimesulide) and docking results reported are controversial. We use a combination of a traditional docking procedure with a new computational tool (Contact Statistics analysis) that identifies the best orientation among a number of solutions to shed some light on this topic.
Wu, Johnny C; Gardner, David P; Ozer, Stuart; Gutell, Robin R; Ren, Pengyu
2009-08-28
The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNA's 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
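The idea of turning observed motif frequencies into energies can be sketched with the generic inverse-Boltzmann relation E = -RT ln(f_observed / f_reference). This is an assumption about the general form of such potentials, not the specific formula, reference state, or counts used by the authors; the uniform reference distribution and the stack counts below are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def statistical_energy(observed_freq, reference_freq, temperature_k=310.15):
    """Inverse-Boltzmann energy (kcal/mol) of a motif relative to a reference."""
    if observed_freq <= 0 or reference_freq <= 0:
        raise ValueError("frequencies must be positive")
    return -GAS_CONSTANT * temperature_k * math.log(observed_freq / reference_freq)


def motif_energies(counts, temperature_k=310.15):
    """Energies for each motif, using a uniform reference (an assumption)."""
    total = sum(counts.values())
    reference = 1.0 / len(counts)
    return {motif: statistical_energy(count / total, reference, temperature_k)
            for motif, count in counts.items()}


# Hypothetical base-pair stack counts; not values from rCAD.
stack_counts = {"GC/CG": 900, "AU/UA": 450, "GU/UG": 150}
energies = motif_energies(stack_counts)
for motif, energy in energies.items():
    print(f"{motif}: {energy:+.3f} kcal/mol")
```

Motifs seen more often than the reference predicts receive negative (favourable) energies, and rarer motifs receive positive ones. The choice of reference state strongly affects the absolute values.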
Exploring the use of statistical process control methods to assess course changes
NASA Astrophysics Data System (ADS)
Vollstedt, Ann-Marie
This dissertation pertains to the field of Engineering Education. The Department of Mechanical Engineering at the University of Nevada, Reno (UNR) is hosting this dissertation under a special agreement. This study was motivated by the desire to find an improved, quantitative measure of student quality that is both convenient to use and easy to evaluate. While traditional statistical analysis tools such as ANOVA (analysis of variance) are useful, they are somewhat time consuming and are subject to error because they are based on grades, which are influenced by numerous variables, independent of student ability and effort (e.g. inflation and curving). Additionally, grades are currently the only measure of quality in most engineering courses even though most faculty agree that grades do not accurately reflect student quality. Based on a literature search, in this study, quality was defined as content knowledge, cognitive level, self efficacy, and critical thinking. Nineteen treatments were applied to a pair of freshmen classes in an effort to increase these qualities. The qualities were measured via quiz grades, essays, surveys, and online critical thinking tests. Results from the quality tests were adjusted and filtered prior to analysis. All test results were subjected to Chauvenet's criterion in order to detect and remove outlying data. In addition to removing outliers from data sets, it was felt that individual course grades needed adjustment to accommodate for the large portion of the grade that was defined by group work. A new method was developed to adjust grades within each group based on the residual of the individual grades within the group and the portion of the course grade defined by group work. It was found that the grade adjustment method agreed 78% of the time with the manual grade changes instructors made in 2009, and also increased the correlation between group grades and individual grades.
Using these adjusted grades, Statistical Process Control (SPC) methods were employed to evaluate the impact of the treatments applied to improve the courses. It was determined that using SPC is advantageous because it does not require additional resources and is not affected if a course is curved by adding the same amount of points to each student's grade. It was also determined that SPC results, unlike average grade, correlated well with anecdotal evidence from the instructors concerning how well the students performed in any given year. In addition to application of SPC to evaluate curriculum change, statistical analysis was used to show that course grades correlate with quiz grades, but do not correlate with critical thinking, self efficacy, or cognitive level which implies that treatments need to be implemented to increase these qualities.
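Two of the techniques named in this abstract, Chauvenet's criterion for outlier removal and control limits from Statistical Process Control, can be sketched together. This is a generic illustration under stated assumptions, not the dissertation's procedure: a single pass of Chauvenet's criterion, a Shewhart-style chart with limits at the baseline mean plus or minus three sample standard deviations, and hypothetical scores. The function names are invented for this example.

```python
import math
from statistics import mean, stdev


def chauvenet_filter(values):
    """Keep values whose expected count at that deviation is at least 0.5."""
    m, s = mean(values), stdev(values)
    n = len(values)
    kept = []
    for v in values:
        z = abs(v - m) / s
        # Two-sided normal tail probability P(|Z| >= z), scaled by sample size.
        expected_count = n * math.erfc(z / math.sqrt(2))
        if expected_count >= 0.5:
            kept.append(v)
    return kept


def control_limits(baseline, sigmas=3.0):
    """Lower limit, centre line and upper limit from baseline data."""
    m, s = mean(baseline), stdev(baseline)
    return m - sigmas * s, m, m + sigmas * s


def out_of_control(points, limits):
    """Return the points that fall outside the control limits."""
    lower, _, upper = limits
    return [p for p in points if p < lower or p > upper]


# Hypothetical quiz scores; the value 20 is an obvious outlier.
scores = [72, 75, 78, 74, 76, 73, 77, 75, 74, 20]
baseline = chauvenet_filter(scores)
limits = control_limits(baseline)
signals = out_of_control([75, 82, 70], limits)   # later observations to monitor
print(baseline, limits, signals)
```

Adding the same number of points to every score shifts the centre line and both limits by that amount, so which observations fall outside the limits does not change. This is consistent with the abstract's remark that SPC is unaffected by a uniform curve.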
Wagner, Tyler; Irwin, Brian J.; Bence, James R.; Hayes, Daniel B.
2016-01-01
Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
Simulator Evaluation of Lineup Visual Landing Aids for Night Carrier Landing.
1987-03-10
recognized that the system is less than optimum (2,3). Because the information from the meatball is of zero order (displacement only), there are...gives the analysis-of-variance summaries of glideslope performance across the flight segments for TOT glideslope + 0.3 degrees (± 1.0 meatball), RMS...accepted as reliable. In addition, analysis-of-variance of percent TOT glideslope ± 0.45 degrees (± 1.5 meatball) did not reveal any statistical
NASA Astrophysics Data System (ADS)
Ndehedehe, Christopher E.; Agutu, Nathan O.; Okwuashi, Onuwa; Ferreira, Vagner G.
2016-09-01
Lake Chad has recently been perceived to be completely desiccated and almost extinct due to insufficient published ground observations. Given the high spatial variability of rainfall in the region, and the fact that extreme climatic conditions (for example, droughts) could be intensifying in the Lake Chad basin (LCB) due to human activities, a spatio-temporal approach to drought analysis becomes essential. This study employed independent component analysis (ICA), a fourth-order cumulant statistics, to decompose standardised precipitation index (SPI), standardised soil moisture index (SSI), and terrestrial water storage (TWS) derived from Gravity Recovery and Climate Experiment (GRACE) into spatial and temporal patterns over the LCB. In addition, this study uses satellite altimetry data to estimate variations in the Lake Chad water levels, and further employs relevant climate teleconnection indices (El-Niño Southern Oscillation-ENSO, Atlantic Multi-decadal Oscillation-AMO, and Atlantic Meridional Mode-AMM) to examine their links to the observed drought temporal patterns over the basin. From the spatio-temporal drought analysis, temporal evolutions of SPI at 12 month aggregation show relatively wet conditions in the last two decades (although with marked alterations) with the 2012-2014 period being the wettest. In addition to the improved rainfall conditions during this period, there was a statistically significant increase of 0.04 m/yr in altimetry water levels observed over Lake Chad between 2008 and 2014, which confirms a shift in the hydrological conditions of the basin. Observed trend in TWS changes during the 2002-2014 period shows a statistically insignificant increase of 3.0 mm/yr at the centre of the basin, coinciding with soil moisture deficit indicated by the temporal evolutions of SSI at all monthly accumulations during the 2002-2003 and 2009-2012 periods. 
Further, SPI at 3 and 6 month scales indicated fluctuating drought conditions at the extreme south of the basin, coinciding with a statistically insignificant decline in TWS of about 4.5 mm/yr at the southern catchment of the basin. Finally, correlation analyses indicate that ENSO, AMO, and AMM are associated with extreme rainfall conditions in the basin, with AMO showing the strongest association (statistically significant correlation of 0.55) with SPI 12 month aggregation. Therefore, this study provides a framework that will support drought monitoring in the LCB.
2007-01-01
Background The US Food and Drug Administration approved the Charité artificial disc on October 26, 2004. This approval was based on an extensive analysis and review process; 20 years of disc usage worldwide; and the results of a prospective, randomized, controlled clinical trial that compared lumbar artificial disc replacement to fusion. The results of the investigational device exemption (IDE) study led to a conclusion that clinical outcomes following lumbar arthroplasty were at least as good as outcomes from fusion. Methods The author performed a new analysis of the Visual Analog Scale pain scores and the Oswestry Disability Index scores from the Charité artificial disc IDE study and used a nonparametric statistical test, because observed data distributions were not normal. The analysis included all of the enrolled subjects in both the nonrandomized and randomized phases of the study. Results Subjects from both the treatment and control groups improved from the baseline situation (P < .001) at all follow-up times (6 weeks to 24 months). Additionally, these pain and disability levels with artificial disc replacement were superior (P < .05) to the fusion treatment at all follow-up times including 2 years. Conclusions The a priori statistical plan for an IDE study may not adequately address the final distribution of the data. Therefore, statistical analyses more appropriate to the distribution may be necessary to develop meaningful statistical conclusions from the study. A nonparametric statistical analysis of the Charité artificial disc IDE outcomes scores demonstrates superiority for lumbar arthroplasty versus fusion at all follow-up time points to 24 months. PMID:25802574
NASA Astrophysics Data System (ADS)
Nykyri, K.; Moore, T.; Dimmock, A. P.
2017-12-01
In the Earth's magnetosphere, the magnetotail plasma sheet ions are much hotter than in the shocked solar wind. On the dawn sector, the cold-component ions are more abundant and hotter by 30-40 percent when compared to the dusk sector. Recent statistical studies of the flank magnetopause and magnetosheath have shown that the level of temperature asymmetry of the magnetosheath is unable to account for this, so additional physical mechanisms must be at play, either at the magnetopause or plasma sheet, that contribute to this asymmetry. In this study, we perform a statistical analysis on the ion-scale wave properties in the three main plasma regimes common to flank magnetopause boundary crossings when the boundary is unstable to KHI: hot and tenuous magnetospheric, cold and dense magnetosheath, and mixed [Hasegawa et al., 2004]. These statistics of ion-scale wave properties are compared to observations of fast magnetosonic wave modes that have recently been linked to Kelvin-Helmholtz vortex centered ion heating [Moore et al., 2016]. The statistical analysis shows that during KH events there is enhanced non-adiabatic heating calculated during (temporal) ion scale wave intervals when compared to non-KH events.
auf dem Keller, Ulrich; Prudova, Anna; Gioia, Magda; Butler, Georgina S.; Overall, Christopher M.
2010-01-01
Terminal amine isotopic labeling of substrates (TAILS), our recently introduced platform for quantitative N-terminome analysis, enables wide dynamic range identification of original mature protein N-termini and protease cleavage products. Modifying TAILS by use of isobaric tag for relative and absolute quantification (iTRAQ)-like labels for quantification together with a robust statistical classifier derived from experimental protease cleavage data, we report reliable and statistically valid identification of proteolytic events in complex biological systems in MS2 mode. The statistical classifier is supported by a novel parameter evaluating ion intensity-dependent quantification confidences of single peptide quantifications, the quantification confidence factor (QCF). Furthermore, the isoform assignment score (IAS) is introduced, a new scoring system for the evaluation of single peptide-to-protein assignments based on high confidence protein identifications in the same sample prior to negative selection enrichment of N-terminal peptides. By these approaches, we identified and validated, in addition to known substrates, low abundance novel bioactive MMP-2 targets including the plasminogen receptor S100A10 (p11) and the proinflammatory cytokine proEMAP/p43 that were previously undescribed. PMID:20305283
Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J
2017-12-01
qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values (Cq) and efficiencies of reactions (E). The Common Base Method keeps all calculations in the log scale as long as possible by working with log10(E) ∙ Cq, which we call the efficiency-weighted Cq value; subsequent statistical analyses are then applied in the log scale. We show how efficiency-weighted Cq values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
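The efficiency-weighted Cq idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the sign convention, and the sample values below are assumptions made for illustration. The only fact relied on is that the starting template amount is proportional to E^(-Cq) at a common threshold, so its log10 equals a constant minus log10(E) · Cq.

```python
import math


def efficiency_weighted_cq(efficiency, cq):
    """Return log10(E) * Cq, the efficiency-weighted Cq value."""
    return math.log10(efficiency) * cq


def log10_relative_expression(e_goi, cq_goi, e_ref, cq_ref):
    """log10 of gene-of-interest expression relative to a reference gene.

    Assumed convention: starting amount is proportional to E**(-Cq),
    so a smaller weighted Cq means more starting template.
    """
    return efficiency_weighted_cq(e_ref, cq_ref) - efficiency_weighted_cq(e_goi, cq_goi)


# Hypothetical wells: perfect doubling (E = 2.0) for both genes.
control = log10_relative_expression(e_goi=2.0, cq_goi=25.0, e_ref=2.0, cq_ref=20.0)
treated = log10_relative_expression(e_goi=2.0, cq_goi=23.0, e_ref=2.0, cq_ref=20.0)

# Stay in the log scale until the very end, then convert once.
log10_fold_change = treated - control
fold_change = 10 ** log10_fold_change
print(round(fold_change, 6))
```

Because every intermediate quantity is a log10 value, means and t-tests across replicates could be applied to `control` and `treated` style values directly, with the back-transformation done only for reporting.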
Hall, Lenwood W; Anderson, Ronald D; Killen, William D
2016-02-01
The objective of this study was to assess temporal and spatial trends for eight pyrethroids monitored in sediment spanning 10 years from 2006 to 2015 in a residential stream in California (Pleasant Grove Creek). The timeframe for this study included sampling 3 years during a somewhat normal non-drought period (2006-2008) and 3 years during a severe drought period (2013-2015). Regression analysis of pyrethroid concentrations in Pleasant Grove Creek for 2006, 2007, 2008, 2012, 2013, 2014, and 2015 using ½ the detection limit for nondetected concentrations showed statistically significant declining trends for cyfluthrin, cypermethrin, deltamethrin, permethrin, and total pyrethroids. Additional trends analysis of the Pleasant Grove Creek pyrethroid data using only measured concentrations, without nondetected values, showed similar statistically significant declining trends for cyfluthrin, cypermethrin, deltamethrin, esfenvalerate, fenpropathrin, permethrin, and total pyrethroids. Spatial trends analysis for the specific creek sites showed that six of the eight pyrethroids had a greater number of sites with statistically significant declining concentrations. Possible reasons for reduced pyrethroid concentrations in the stream bed in Pleasant Grove Creek during this 10-year period are label changes in 2012 that reduced residential use and lack of precipitation during the later severe drought years of 2013-2015.
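The two treatments of nondetects mentioned in this abstract (substituting half the detection limit, or using measured values only) can be sketched as follows. The concentrations and the detection limit are hypothetical, the helper names are invented for illustration, and only the trend slope is computed; the significance testing reported in the study is not reproduced.

```python
def substitute_nondetects(values, detection_limit):
    """Replace nondetects (None) with half the detection limit."""
    return [detection_limit / 2 if v is None else v for v in values]


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Sampling years from the abstract; concentrations are made up (None = nondetect).
years = [2006, 2007, 2008, 2012, 2013, 2014, 2015]
concentrations = [8.0, 6.5, 7.0, 3.0, None, 2.0, None]
detection_limit = 1.0  # hypothetical, same units as the concentrations

# Approach 1: keep every year, filling nondetects with DL/2.
filled = substitute_nondetects(concentrations, detection_limit)
slope_with_substitution = ols_slope(years, filled)

# Approach 2: drop nondetects and regress on measured values only.
measured = [(yr, c) for yr, c in zip(years, concentrations) if c is not None]
slope_measured_only = ols_slope([yr for yr, _ in measured], [c for _, c in measured])

print(slope_with_substitution, slope_measured_only)
```

Comparing the two slopes is a quick sensitivity check on how much the substitution rule drives the apparent trend.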
Hewett, Paul; Bullock, William H
2014-01-01
For more than 20 years CSX Transportation (CSXT) has collected exposure measurements from locomotive engineers and conductors who are potentially exposed to diesel emissions. The database included measurements for elemental and total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, carbon monoxide, and nitrogen dioxide. This database was statistically analyzed and summarized, and the resulting statistics and exposure profiles were compared to relevant occupational exposure limits (OELs) using both parametric and non-parametric descriptive and compliance statistics. Exposure ratings, using the American Industrial Hygiene Association (AIHA) exposure categorization scheme, were determined using both the compliance statistics and Bayesian Decision Analysis (BDA). The statistical analysis of the elemental carbon data (a marker for diesel particulate) strongly suggests that the majority of levels in the cabs of the lead locomotives (n = 156) were less than the California guideline of 0.020 mg/m³. The sample 95th percentile was roughly half the guideline, resulting in an AIHA exposure rating of category 2/3 (determined using BDA). The elemental carbon (EC) levels in the trailing locomotives tended to be greater than those in the lead locomotive; however, locomotive crews rarely ride in the trailing locomotive. Lead locomotive EC levels were similar to those reported by other investigators studying locomotive crew exposures and to levels measured in urban areas. Lastly, both the EC sample mean and 95% UCL were less than the Environmental Protection Agency (EPA) reference concentration of 0.005 mg/m³. With the exception of nitrogen dioxide, the overwhelming majority of the measurements for total carbon, polycyclic aromatic hydrocarbons, aromatics, aldehydes, and combustion gases in the cabs of CSXT locomotives were either non-detects or considerably less than the working OELs for the years represented in the database.
When compared to the previous American Conference of Governmental Industrial Hygienists (ACGIH) threshold limit value (TLV) of 3 ppm the nitrogen dioxide exposure profile merits an exposure rating of AIHA exposure category 1. However, using the newly adopted TLV of 0.2 ppm the exposure profile receives an exposure rating of category 4. Further evaluation is recommended to determine the current status of nitrogen dioxide exposures. [Supplementary materials are available for this article. Go to the publisher's online edition of Journal of Occupational and Environmental Hygiene for the following free supplemental resource: additional text on OELs, methods, results, and additional figures and tables.].
Association between pathology and texture features of multi parametric MRI of the prostate
NASA Astrophysics Data System (ADS)
Kuess, Peter; Andrzejewski, Piotr; Nilsson, David; Georg, Petra; Knoth, Johannes; Susani, Martin; Trygg, Johan; Helbich, Thomas H.; Polanec, Stephan H.; Georg, Dietmar; Nyholm, Tufve
2017-10-01
The role of multi-parametric (mp)MRI in the diagnosis and treatment of prostate cancer has increased considerably. An alternative to visual inspection of mpMRI is the evaluation using histogram-based (first order statistics) parameters and textural features (second order statistics). The aims of the present work were to investigate the relationship between benign and malignant sub-volumes of the prostate and textures obtained from mpMR images. The performance of tumor prediction was investigated based on the combination of histogram-based and textural parameters. Subsequently, the relative importance of mpMR images was assessed and the benefit of additional imaging analyzed. Finally, sub-structures based on the PI-RADS classification were investigated as potential regions to automatically detect malignant lesions. Twenty-five patients who received mpMRI prior to radical prostatectomy were included in the study. The imaging protocol included T2, DWI, and DCE. Delineation of tumor regions was performed based on pathological information. First and second order statistics were derived from each structure and for all image modalities. The resulting data were processed with multivariate analysis, using PCA (principal component analysis) and OPLS-DA (orthogonal partial least squares discriminant analysis) for separation of malignant and healthy tissue. PCA showed a clear difference between tumor and healthy regions in the peripheral zone for all investigated images. The predictive ability of the OPLS-DA models increased for all image modalities when first and second order statistics were combined. The predictive value reached a plateau after adding ADC and T2, and did not increase further with the addition of other image information. The present study indicates a distinct difference in the signatures between malignant and benign prostate tissue. This is an absolute prerequisite for automatic tumor segmentation, but only the first step in that direction.
For the specific identified signature, DCE did not add complementary information to T2 and ADC maps.
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved.
The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.
Statistical issues on the analysis of change in follow-up studies in dental research.
Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S
2007-12-01
To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by use of statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies and causal inferences from observational data should be avoided.
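A small simulation may help show why correlating change with baseline can mislead, and what Oldham's method does instead. In the sketch below, which uses simulated data and illustrative names that are assumptions rather than material from the paper, there is no real relation between change and initial level, yet the naive correlation is strongly negative because measurement error in the baseline appears in both variables. Oldham's method correlates change with the average of the baseline and follow-up values.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Each subject has a stable true level; both visits add independent noise
# with equal variance, so there is no genuine treatment-baseline interaction.
true_level = [rng.gauss(0, 1) for _ in range(n)]
baseline = [t + rng.gauss(0, 1) for t in true_level]
follow_up = [t + rng.gauss(0, 1) for t in true_level]

change = [f - b for f, b in zip(follow_up, baseline)]
average = [(f + b) / 2 for f, b in zip(follow_up, baseline)]

r_naive = pearson_r(change, baseline)   # spuriously negative (about -0.5 here)
r_oldham = pearson_r(change, average)   # close to zero, as it should be

print(round(r_naive, 3), round(r_oldham, 3))
```

The contrast between the two coefficients is the point of the example; a real analysis would also need a significance test and, with more than one follow-up, the multilevel approach the abstract recommends.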
Statistical significance of trace evidence matches using independent physicochemical measurements
NASA Astrophysics Data System (ADS)
Almirall, Jose R.; Cole, Michael; Furton, Kenneth G.; Gettinby, George
1997-02-01
A statistical approach to the significance of glass evidence is proposed using independent physicochemical measurements and chemometrics. Traditional interpretation of the significance of trace evidence matches or exclusions relies on qualitative descriptors such as 'indistinguishable from,' 'consistent with,' 'similar to' etc. By performing physical and chemical measurements which are independent of one another, the significance of object exclusions or matches can be evaluated statistically. One of the problems with this approach is that the human brain is excellent at recognizing and classifying patterns and shapes but performs less well when that object is represented by a numerical list of attributes. Chemometrics can be employed to group similar objects using clustering algorithms and provide statistical significance in a quantitative manner. This approach is enhanced when population databases exist or can be created and the data in question can be evaluated given these databases. Since the selection of the variables used and their pre-processing can greatly influence the outcome, several different methods could be employed in order to obtain a more complete picture of the information contained in the data. Presently, we report on the analysis of glass samples using refractive index measurements and the quantitative analysis of the concentrations of the metals: Mg, Al, Ca, Fe, Mn, Ba, Sr, Ti and Zr. The extension of this general approach to fiber and paint comparisons also is discussed. This statistical approach should not replace the current interpretative approaches to trace evidence matches or exclusions but rather yields an additional quantitative measure. The lack of sufficient general population databases containing the needed physicochemical measurements and the potential for confusion arising from statistical analysis currently hamper this approach and ways of overcoming these obstacles are presented.
Regression analysis of mixed recurrent-event and panel-count data with additive rate models.
Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L
2015-03-01
Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. © 2014, The International Biometric Society.
Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.
Kwak, Il-Youp; Pan, Wei
2017-01-01
To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ Contact: weip@biostat.umn.edu Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Morphological texture assessment of oral bone as a screening tool for osteoporosis
NASA Astrophysics Data System (ADS)
Analoui, Mostafa; Eggertsson, Hafsteinn; Eckert, George
2001-07-01
Three classes of texture analysis approaches have been employed to assess the textural characteristics of oral bone. A set of linear structuring elements was used to compute granulometric features of trabecular bone. Multifractal analysis was also used to compute the fractal dimension of the corresponding tissues. In addition, some statistical features and histomorphometric parameters were computed. To assess the proposed approach we acquired digital intraoral radiographs of 47 subjects (14 males and 33 females). All radiographs were captured at 12 bits/pixel. Images were converted to binary form through a sliding locally adaptive thresholding approach. Each subject was scanned by DEXA for bone dosimetry. Subjects were classified into one of the following three categories according to the World Health Organization (WHO) standard: (1) healthy, (2) with osteopenia, and (3) with osteoporosis. In this study fractal dimension showed very low correlation with bone mineral density (BMD) measurements, which did not reach a level of statistical significance (p<0.5). However, entropy of pattern spectrum (EPS), along with statistical features and histomorphometric parameters, has shown correlation coefficients ranging from low to high, with statistical significance for both males and females. The results of this study indicate the utility of this approach for bone texture analysis. It is conjectured that designing a 2-D structuring element, specially tuned to trabecular bone texture, will increase the efficacy of the proposed method.
Hiza, Elise A; Gottschalk, Michael B; Umpierrez, Erica; Bush, Patricia; Reisman, William M
2015-07-01
The objective of this study is to analyze the effect of an orthopaedic trauma advanced practice provider on length of stay (LOS) and cost in a level I trauma center. The hypothesis of this study is that the addition of a single full-time nurse practitioner (NP) to the orthopaedic trauma team at a level I Trauma center would decrease overall LOS and hospital cost. A retrospective chart review of all patients discharged from the orthopaedic surgery service 1 year before the addition of a NP (pre-NP) and 1 year after the hiring of a NP (post-NP) were reviewed. Chart review included age, gender, LOS, discharge destination, intravenous antibiotic use, wound VAC therapy, admission location, and length of time to surgery. Statistical analysis was performed using the Wilcoxon/Kruskal-Wallis test. The hiring of a NP yielded a statistically significant decrease in the LOS across the following patient subgroups: patients transferred from the trauma service (13.56 compared with 7.02 days, P < 0.001), patients aged 60 years and older (7.34 compared with 5.04 days, P = 0.037), patients discharged to a rehabilitation facility (10.84 compared with 8.31 days, P = 0.002), and patients discharged on antibiotics/wound VAC therapy (15.16 compared with 11.24 days, P = 0.017). Length of time to surgery was also decreased (1.48 compared with 1.31 days, P = 0.37). The addition of a dedicated orthopaedic trauma advanced practice provider at a county level I trauma center resulted in a statistically significant decrease in LOS and thus reduced indirect costs to the hospital. Economic Level IV. See Instructions for Authors for a complete description of levels of evidence.
Kanda, Junya
2016-01-01
The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.
Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan
2017-12-01
Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized.
It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Asal, F. F.
2012-07-01
Digital elevation data obtained from different engineering surveying techniques are used to generate Digital Elevation Models (DEMs), which are employed in many engineering and environmental applications. These data are usually in discrete point format, making it necessary to use an interpolation approach to create the DEM. Quality assessment of the DEM is a vital issue controlling its use in different applications; however, this assessment relies heavily on statistical methods and neglects visual methods. This research applies visual analysis to DEMs generated using the inverse distance weighting (IDW) interpolator with varying powers, in order to examine the potential of visual methods for assessing the effects of varying the IDW power on DEM quality. Real elevation data were collected in the field using a total station instrument in a corrugated terrain. DEMs were generated from the data at a unified cell size using the IDW interpolator with power values ranging from one to ten. Visual analysis was undertaken using 2D and 3D views of the DEM; in addition, statistical analysis was performed to assess the validity of the visual techniques. Visual analysis showed that smoothing of the DEM decreases as the power value increases up to a power of four; however, increasing the power beyond four produces no noticeable changes in the 2D and 3D views of the DEM. The statistical analysis supported these results: the standard deviation (SD) of the DEM increased with increasing power. More specifically, changing the power from one to two produced 36% of the total increase in SD (the increase in SD due to changing the power from one to ten), and changing to powers of three and four gave 60% and 75%, respectively. This indicates a decrease in DEM smoothing as the IDW power increases. The study also showed that visual methods supported by statistical analysis have good potential in DEM quality assessment.
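The effect of the IDW power described in this abstract can be explored with a minimal sketch. The survey points, elevations, and grid below are hypothetical values chosen for illustration, not the study's data, and the function name `idw_interpolate` is an assumption of this example.

```python
import numpy as np

def idw_interpolate(xy_known, z_known, xy_query, power=2.0):
    """Inverse distance weighted estimate at each query point."""
    xy_known = np.asarray(xy_known, dtype=float)
    z_known = np.asarray(z_known, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)
    # Pairwise distances, shape (n_query, n_known)
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    z_out = np.empty(len(xy_query))
    for i, row in enumerate(d):
        hit = row == 0.0
        if hit.any():
            # Query coincides with a survey point: return its elevation
            z_out[i] = z_known[hit][0]
        else:
            w = 1.0 / row ** power
            z_out[i] = np.sum(w * z_known) / np.sum(w)
    return z_out

# Hypothetical survey points (x, y) and elevations, for illustration only
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elev = [100.0, 104.0, 98.0, 110.0]
grid = [(x, y) for x in np.linspace(0, 10, 11) for y in np.linspace(0, 10, 11)]

# Print the standard deviation of the gridded surface for several powers
for p in (1, 2, 4, 10):
    dem = idw_interpolate(pts, elev, grid, power=p)
    print(p, round(float(dem.std()), 3))
```

Comparing the printed standard deviations across powers is one way to reproduce the kind of SD-versus-power comparison the abstract reports, under the stated assumptions.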
Comparisons of modeled height predictions to ocular height estimates
W.A. Bechtold; S.J. Zarnoch; W.G. Burkman
1998-01-01
Equations used by USDA Forest Service Forest Inventory and Analysis projects to predict individual tree heights on the basis of species and d.b.h. were improved by the addition of mean overstory height. However, ocular estimates of total height by field crews were more accurate than the statistically improved models, especially for hardwood species. Height predictions...
Targeting Terrorist Leaders: A Case Study
2011-03-01
PRESSURE ... IV. STATISTICAL ANALYSIS ON ISRAELI LEADERSHIP TARGETING ... V. CONCLUSION: HAVE ISRAELI ATTEMPTS TO ... organization. D. POTENTIAL WEAKNESSES The use of one case to evaluate the efficacy of the terrorist-leadership targeting model is problematic ... times who was in charge. In addition, the literature on Hamas is widely split on the role that inspirational leaders had on operational matters
Appropriate statistical analyses are critical for evaluating interactions of mixtures with a common mode of action, as is often the case for cumulative risk assessments. Our objective is to develop analyses for use when a response variable is ordinal, and to test for interaction...
JCMT COADD: UKT14 continuum and photometry data reduction
NASA Astrophysics Data System (ADS)
Hughes, David; Oliveira, Firmin J.; Tilanus, Remo P. J.; Jenness, Tim
2014-11-01
COADD was used to reduce photometry and continuum data from the UKT14 instrument on the James Clerk Maxwell Telescope in the 1990s. The software can co-add multiple observations and perform sigma clipping and Kolmogorov-Smirnov statistical analysis. Additional information on the software is available in the JCMT Spring 1993 newsletter (large PDF).
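Sigma clipping, mentioned above, can be illustrated with a short sketch. This is a generic iterative clip written for illustration; it is not COADD's implementation, and the function name, the 3-sigma default, and the sample values are assumptions.

```python
import numpy as np

def sigma_clip_mean(samples, n_sigma=3.0, max_iter=10):
    """Iteratively reject points more than n_sigma standard deviations from the mean."""
    data = np.asarray(samples, dtype=float)
    for _ in range(max_iter):
        mean, std = data.mean(), data.std()
        keep = np.abs(data - mean) <= n_sigma * std
        if keep.all():
            break  # nothing left to reject
        data = data[keep]
    return data.mean(), data.size

# Hypothetical photometry samples with one spike, for illustration only
fluxes = [1.0] * 20 + [100.0]
clipped_mean, n_kept = sigma_clip_mean(fluxes)
```

Note that with very few samples a single outlier cannot exceed a 3-sigma threshold, so the clip only becomes effective once enough observations are co-added.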
Faculty Salary Equity: Issues in Regression Model Selection. AIR 1992 Annual Forum Paper.
ERIC Educational Resources Information Center
Moore, Nelle
This paper discusses the determination of college faculty salary inequity and identifies the areas in which human judgment must be used in order to conduct a statistical analysis of salary equity. In addition, it provides some informed guidelines for making those judgments. The paper provides a framework for selecting salary equity models, based…
HOW WELL ARE HYDRAULIC CONDUCTIVITY VARIATIONS APPROXIMATED BY ADDITIVE STABLE PROCESSES? (R826171)
Analysis of the higher statistical moments of a hydraulic conductivity (K) and an intrinsic permeability (k) data set leads to the conclusion that the increments of the data and the logs of the data are not governed by Levy-stable or Gaussian dis...
NASA Astrophysics Data System (ADS)
Magazù, Salvatore; Mezei, Ferenc; Migliardo, Federica
2018-05-01
In a variety of applications of inelastic neutron scattering spectroscopy the goal is to single out the elastic scattering contribution from the total scattered spectrum as a function of momentum transfer and sample environment parameters. The elastic part of the spectrum is defined in such a case by the energy resolution of the spectrometer. Variable elastic energy resolution offers a way to distinguish between elastic and quasi-elastic intensities. Correlation spectroscopy lends itself as an efficient, high-intensity approach for accomplishing this at both continuous and pulsed neutron sources. On the one hand, in beam modulation methods the Liouville theorem coupling between intensity and resolution is relaxed, and time-of-flight analysis of the neutron velocity distribution can be performed with 50% duty factor exposure for all available resolutions. On the other hand, the (quasi)elastic part of the spectrum generally contains the major part of the integrated intensity at a given detector, and thus correlation spectroscopy can be applied with the most favorable signal to statistical noise ratio. The novel spectrometer CORELLI at SNS is an example of this type of application of the correlation technique at a pulsed source. On a continuous neutron source a statistical chopper can be used for quasi-random time-dependent beam modulation, and the total time-of-flight of the neutron from the statistical chopper to detection is determined by analyzing the correlation between the temporal fluctuation of the neutron detection rate and the statistical chopper beam modulation pattern. The correlation analysis can be used to determine either the incoming neutron velocity or the scattered neutron velocity, depending on the position of the statistical chopper along the neutron trajectory. These two options are considered together with an evaluation of spectrometer performance compared to conventional spectroscopy, in particular for variable resolution elastic neutron scattering (RENS) studies of relaxation processes and the evolution of mean square displacements. A particular focus of our analysis is the unique feature of correlation spectroscopy of delivering high and resolution-independent beam intensity; thus the same statistical chopper scan contains both high-intensity and high-resolution information at the same time, and can be evaluated both ways. This flexibility for variable resolution data handling represents an additional asset for correlation spectroscopy in variable resolution work. Changing the beam width for the same statistical chopper allows us to additionally trade resolution for intensity in two different experimental runs, similarly to conventional single slit chopper spectroscopy. The combination of these two approaches is a capability of particular value in neutron spectroscopy studies requiring variable energy resolution, such as the systematic study of quasi-elastic scattering and mean square displacement. Furthermore, the statistical chopper approach is particularly advantageous for studying samples with low scattering intensity in the presence of a high, sample-independent background.
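The correlation step described in this abstract can be sketched on a toy, noise-free case. The 7-slot pattern, the spectrum values, and the idealized circular timing below are assumptions for illustration; real chopper sequences are much longer and the data carry counting noise.

```python
import numpy as np

# 7-slot pseudo-random open(1)/closed(0) pattern; an m-sequence, chosen for illustration
pattern = np.array([1, 1, 1, 0, 1, 0, 0], dtype=float)
n = pattern.size

# Hypothetical time-of-flight spectrum to be recovered (counts per channel)
spectrum = np.array([0.0, 5.0, 40.0, 12.0, 3.0, 0.0, 1.0])

# Detector signal: channel j of the spectrum arrives j slots behind the pattern
detected = sum(spectrum[j] * np.roll(pattern, j) for j in range(n))

# Cross-correlate the detected signal with the shifted modulation pattern
corr = np.array([np.dot(np.roll(pattern, k), detected) for k in range(n)])

# For this m-sequence the circular autocorrelation is (n+1)/2 at zero lag and
# (n+1)/4 elsewhere, so corr[k] = (n+1)/4 * (spectrum[k] + spectrum.sum()).
total = 2.0 * detected.sum() / (n + 1)      # equals spectrum.sum()
recovered = 4.0 * corr / (n + 1) - total    # invert the relation above
```

The beam is open for roughly half of the slots, which is the sense in which the abstract speaks of a 50% duty factor, yet each time-of-flight channel is still resolved by the correlation.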
Mars Exploration Rover Six-Degree-Of-Freedom Entry Trajectory Analysis
NASA Technical Reports Server (NTRS)
Desai, Prasun N.; Schoenenberger, Mark; Cheatwood, F. M.
2003-01-01
The Mars Exploration Rover mission will be the next opportunity for surface exploration of Mars in January 2004. Two rovers will be delivered to the surface of Mars using the same entry, descent, and landing scenario that was developed and successfully implemented by Mars Pathfinder. This investigation describes the trajectory analysis that was performed for the hypersonic portion of the MER entry. In this analysis, a six-degree-of-freedom trajectory simulation of the entry is performed to determine the entry characteristics of the capsules. In addition, a Monte Carlo analysis is performed to statistically assess the robustness of the entry design to off-nominal conditions and to ensure that all entry requirements are satisfied. The results show that the attitudes at peak heating and at parachute deployment are well within entry limits. In addition, the parachute deployment dynamic pressure and Mach number are also well within the design requirements.
Impact of tailored feedback in assessment of communication skills for medical students
Uhm, Seilin; Lee, Gui H.; Jin, Jeong K.; Bak, Yong I.; Jeoung, Yeon O.; Kim, Chan W.
2015-01-01
Background: Finding effective ways of teaching and assessing communication skills remains a challenging part of medical education. This study aims to explore the usefulness and effectiveness of additional feedback based on qualitative analysis in the assessment of communication skills in undergraduate medical training. We also determined the possibilities of using qualitative analysis in developing tailored strategies for improvement in communication skills training. Methods: This study was carried out on medical students (n=87) undergoing their final year clinical performance examination on communication skills using standardized patients, by video-recording and transcribing their performances. Video-recordings of 26 students were randomly selected for qualitative analysis, and additional feedback was provided. We assessed the level of acceptance of communication skills scores between the study and nonstudy group and within the study group, before and after receiving feedback based on qualitative analysis. Results: There was a statistically significant increase in the level of acceptance of feedback after delivering additional feedback using qualitative analysis, where the percentage of agreement with feedback increased from 15.4 to 80.8% (p<0.001). Conclusions: Incorporating feedback based on qualitative analysis for communication skills assessment gives essential information for medical students to learn and self-reflect, which could potentially lead to improved communication skills. As evident from our study, feedback becomes more meaningful and effective with additional feedback using qualitative analysis. PMID:26154864
NASA Astrophysics Data System (ADS)
Donges, Jonathan F.; Heitzig, Jobst; Beronov, Boyan; Wiedermann, Marc; Runge, Jakob; Feng, Qing Yi; Tupikina, Liubov; Stolbova, Veronika; Donner, Reik V.; Marwan, Norbert; Dijkstra, Henk A.; Kurths, Jürgen
2015-11-01
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis, recurrence networks, visibility graphs, and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology.
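As a rough illustration of the recurrence-based ideas mentioned above, the following sketch builds a recurrence matrix and its recurrence rate with plain NumPy. It does not use or reproduce the pyunicorn API; the function names, the absolute-difference distance, and the sample series are assumptions of this example.

```python
import numpy as np

def recurrence_matrix(series, threshold):
    """R[i, j] = 1 when states i and j are closer than the threshold."""
    x = np.asarray(series, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])  # pairwise distances for a scalar series
    return (dist <= threshold).astype(int)

def recurrence_rate(matrix):
    """Fraction of recurrent pairs, the simplest recurrence quantification measure."""
    return float(matrix.mean())

# Hypothetical period-2 series, for illustration only
R = recurrence_matrix([0.0, 1.0, 0.0, 1.0], threshold=0.5)
rr = recurrence_rate(R)
```

The same binary matrix can also be read as the adjacency matrix of a recurrence network once the diagonal is removed, which is how recurrence analysis connects to the network methods the abstract describes.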
Statistical Analysis of the Uncertainty in Pre-Flight Aerodynamic Database of a Hypersonic Vehicle
NASA Astrophysics Data System (ADS)
Huh, Lynn
The objective of the present research was to develop a new method to derive the aerodynamic coefficients and the associated uncertainties for flight vehicles via post-flight inertial navigation analysis using data from the inertial measurement unit. Statistical estimates of vehicle state and aerodynamic coefficients are derived using Monte Carlo simulation. Trajectory reconstruction using the inertial navigation system (INS) is a simple and well-used method. However, deriving realistic uncertainties in the reconstructed state and any associated parameters is not so straightforward. Extended Kalman filters, batch minimum variance estimation and other approaches have been used. However, these methods generally depend on assumed physical models or assumed statistical distributions (usually Gaussian), or have convergence issues for non-linear problems. The approach here assumes no physical models, is applicable to any statistical distribution, and does not have any convergence issues. The new approach obtains the statistics directly from a sufficient number of Monte Carlo samples using only the generally well known gyro and accelerometer specifications, and could be applied to systems of non-linear form and non-Gaussian distribution. When redundant data are available, the set of Monte Carlo simulations is constrained to satisfy the redundant data within the uncertainties specified for the additional data. The proposed method was applied to validate the uncertainty in the pre-flight aerodynamic database of the X-43A Hyper-X research vehicle. In addition to gyro and acceleration data, the actual flight data include redundant measurements of position and velocity from the global positioning system (GPS). The criterion derived from the blend of the GPS and INS accuracy was used to select valid trajectories for statistical analysis. The aerodynamic coefficients were derived from the selected trajectories either by a direct extraction method based on the equations of dynamics, or by querying the pre-flight aerodynamic database. After the application of the proposed method to the case of the X-43A Hyper-X research vehicle, it was found that 1) there were consistent differences in the aerodynamic coefficients from the pre-flight aerodynamic database and post-flight analysis, 2) the pre-flight estimation of the pitching moment coefficients was significantly different from the post-flight analysis, 3) the type of distribution of the states from the Monte Carlo simulation was affected by that of the perturbation parameters, 4) the uncertainties in the pre-flight model were overestimated, 5) the range where the aerodynamic coefficients from the pre-flight aerodynamic database and post-flight analysis are in closest agreement is between Mach *.* and *.* and more data points may be needed between Mach * and ** in the pre-flight aerodynamic database, 6) the selection criterion for valid trajectories from the Monte Carlo simulations was mostly driven by the horizontal velocity error, 7) the selection criterion must be based on a reasonable model to ensure the validity of the statistics from the proposed method, and 8) the results from the proposed method applied to the two different flights with identical geometry and similar flight profile were consistent.
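The idea of constraining Monte Carlo samples with redundant data can be sketched in one dimension. Everything below is hypothetical: the acceleration history, the bias specification, the GPS value, and the 3-sigma acceptance rule are assumptions chosen for illustration and are not taken from the X-43A analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5000
dt = 0.1  # seconds per accelerometer sample

# Hypothetical measured along-track acceleration history (m/s^2)
accel_measured = np.full(100, 2.0)

# Assumed accelerometer specification: one constant bias per Monte Carlo trajectory
bias_sigma = 0.05
biases = rng.normal(0.0, bias_sigma, size=n_samples)

# Each sample removes a candidate bias and integrates to a final velocity
final_velocity = ((accel_measured[None, :] - biases[:, None]) * dt).sum(axis=1)

# Hypothetical redundant GPS velocity and its stated uncertainty
gps_velocity, gps_sigma = 20.1, 0.2
valid = np.abs(final_velocity - gps_velocity) <= 3.0 * gps_sigma

# Statistics are taken directly from the trajectories that satisfy the constraint
velocity_mean = final_velocity[valid].mean()
velocity_std = final_velocity[valid].std()
```

Because the statistics come straight from the accepted samples, no Gaussian assumption on the result is needed, which is the property the abstract emphasizes.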
The spectral analysis of fuel oils using terahertz radiation and chemometric methods
NASA Astrophysics Data System (ADS)
Zhan, Honglei; Zhao, Kun; Zhao, Hui; Li, Qian; Zhu, Shouming; Xiao, Lizhi
2016-10-01
The combustion characteristics of fuel oils are closely related to both engine efficiency and pollutant emissions, and the analysis of oils and their additives is thus important. These oils and additives have been found to generate distinct responses to terahertz (THz) radiation as the result of various molecular vibrational modes. In the present work, THz spectroscopy was employed to identify a number of oils, including lubricants, gasoline and diesel, with different additives. The identities of dozens of these oils could be readily established using statistical models based on principal component analysis. The THz spectra of gasoline, diesel, sulfur and methyl methacrylate (MMA) were acquired and linear fittings were obtained. By using chemometric methods, including back propagation, artificial neural network and support vector machine techniques, typical concentrations of sulfur in gasoline (ppm-grade) could be detected, together with MMA in diesel below 0.5%. The absorption characteristics of the oil additives were also assessed using 2D correlation spectroscopy, and several hidden absorption peaks were discovered. The technique discussed herein should provide a useful new means of analyzing fuel oils with various additives and impurities in a non-destructive manner and therefore will be of benefit to the field of chemical detection and identification.
Evidence-Based Medicine as a Tool for Undergraduate Probability and Statistics Education.
Masel, J; Humphrey, P T; Blackburn, B; Levine, J A
2015-01-01
Most students have difficulty reasoning about chance events, and misconceptions regarding probability can persist or even strengthen following traditional instruction. Many biostatistics classes sidestep this problem by prioritizing exploratory data analysis over probability. However, probability itself, in addition to statistics, is essential both to the biology curriculum and to informed decision making in daily life. One area in which probability is particularly important is medicine. Given the preponderance of pre-health students, in addition to more general interest in medicine, we capitalized on students' intrinsic motivation in this area to teach both probability and statistics. We use the randomized controlled trial as the centerpiece of the course, because it exemplifies the most salient features of the scientific method, and the application of critical thinking to medicine. The other two pillars of the course are biomedical applications of Bayes' theorem and science and society content. Backward design from these three overarching aims was used to select appropriate probability and statistics content, with a focus on eliciting and countering previously documented misconceptions in their medical context. Pretest/posttest assessments using the Quantitative Reasoning Quotient and Attitudes Toward Statistics instruments are positive, bucking several negative trends previously reported in statistics education. © 2015 J. Masel et al. CBE—Life Sciences Education © 2015 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online
Forsberg, Erica M; Huan, Tao; Rinehart, Duane; Benton, H Paul; Warth, Benedikt; Hilmers, Brian; Siuzdak, Gary
2018-01-01
Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes ~30 min. PMID:29494574
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, James G.
1999-01-01
A new objective analysis system designed for parallel computers will be described. The system can produce a global analysis (on a 2 x 2.5 lat-lon grid with 20 levels of heights and winds and 10 levels of moisture) using 120,000 observations in less than 3 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g. MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first guess error statistics to produce the best estimate of the atmospheric state. It also includes a new quality control (buddy check) system. Static tests with the system showed its analysis increments are comparable to the latest NASA operational system, including maintenance of mass-wind balance. Results from a 2-month cycling test in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) throughout the entire two months.
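The way optimal interpolation weighs first guess and observation error statistics can be shown in its simplest, single-variable form. This is a textbook scalar sketch, not the system described above; the function name and the numbers are assumptions for illustration.

```python
def oi_update(first_guess, obs, var_first_guess, var_obs):
    """Scalar optimal-interpolation update of a first guess with one observation."""
    # Gain: fraction of the observation-minus-first-guess difference to apply
    gain = var_first_guess / (var_first_guess + var_obs)
    analysis = first_guess + gain * (obs - first_guess)
    # The analysis error variance is never larger than the first-guess variance
    var_analysis = (1.0 - gain) * var_first_guess
    return analysis, var_analysis

# Hypothetical values: equal error variances give an analysis halfway between
analysis, var_analysis = oi_update(first_guess=10.0, obs=12.0,
                                   var_first_guess=4.0, var_obs=4.0)
```

In a full system the same relation is applied with covariance matrices and an observation operator, so that each grid point draws on many nearby observations at once.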
Smoking increases the risk of diabetic foot amputation: A meta-analysis.
Liu, Min; Zhang, Wei; Yan, Zhaoli; Yuan, Xiangzhen
2018-02-01
Accumulating evidence suggests that smoking is associated with diabetic foot amputation. However, the currently available results are inconsistent and controversial. Therefore, the present study performed a meta-analysis to systematically review the association between smoking and diabetic foot amputation and to investigate the risk factors of diabetic foot amputation. Public databases, including PubMed and Embase, were searched prior to 29th February 2016. The heterogeneity was assessed using Cochran's Q statistic and the I² statistic, and odds ratio (OR) and 95% confidence interval (CI) were calculated and pooled appropriately. Sensitivity analysis was performed to evaluate the stability of the results. In addition, Egger's test was applied to assess any potential publication bias. Based on the research, a total of eight studies, including five cohort studies and three case-control studies, were included. The data indicated that smoking significantly increased the risk of diabetic foot amputation (OR=1.65; 95% CI, 1.09-2.50; P<0.0001) compared with non-smoking. Sensitivity analysis demonstrated that the pooled analysis did not vary substantially following the exclusion of any one study. Additionally, there was no evidence of publication bias (Egger's test, t=0.1378; P=0.8958). Furthermore, no significant difference was observed between the minor and major amputation groups in patients who smoked (OR=0.79; 95% CI, 0.24-2.58). The results of the present meta-analysis suggested that smoking is a notable risk factor for diabetic foot amputation. Smoking cessation appears to reduce the risk of diabetic foot amputation.
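The pooling and heterogeneity statistics named in this abstract can be sketched as follows. The sketch assumes a fixed-effect inverse-variance model on the log odds ratio scale, which the abstract does not specify, and the three input studies are hypothetical rather than the eight studies analyzed.

```python
import math

def pool_odds_ratios(odds_ratios, ci_lows, ci_highs):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I-squared (%)."""
    log_or = [math.log(o) for o in odds_ratios]
    # Standard error recovered from the 95% CI width on the log scale
    se = [(math.log(h) - math.log(l)) / (2 * 1.96)
          for l, h in zip(ci_lows, ci_highs)]
    w = [1.0 / s ** 2 for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w, log_or)) / sum(w)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_or))
    df = len(log_or) - 1
    # I-squared: share of variation beyond what chance alone would give
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return math.exp(pooled), q, i2

# Hypothetical study results, for illustration only
pooled_or, q_stat, i_squared = pool_odds_ratios(
    [1.4, 2.1, 1.6], [0.9, 1.2, 1.0], [2.2, 3.7, 2.6])
```

When I² is large, a random-effects model is usually preferred over the fixed-effect pooling shown here.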
Interpretation of correlations in clinical research.
Hung, Man; Bounsanga, Jerry; Voss, Maren Wright
2017-11-01
Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
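The alpha inflation discussed above has a simple closed form when the tests are assumed independent, which is an assumption of this sketch; the function names are also introduced here for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / n_tests

# Ten correlations each tested at 0.05 carry roughly a 40% chance of a spurious hit
inflated = familywise_error_rate(0.05, 10)
per_test = bonferroni_threshold(0.05, 10)
```

Bonferroni is conservative when the tests are positively correlated, which is one reason effect sizes should be read alongside corrected p-values.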
Chung, Dongjun; Kim, Hang J; Zhao, Hongyu
2017-02-01
Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.
NASA Astrophysics Data System (ADS)
Bruña, Ricardo; Poza, Jesús; Gómez, Carlos; García, María; Fernández, Alberto; Hornero, Roberto
2012-06-01
Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on Euclidean distance ED) and three statistical complexities (based on Lopez Ruiz-Mancini-Calbet complexity LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p < 0.01). In addition, statistically significant differences between MCI subjects and controls were achieved by ED and LMC (p < 0.05). In order to assess the diagnostic ability of the parameters, a linear discriminant analysis with a leave-one-out cross-validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
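One common form of the quantities named in this abstract is sketched below: normalized Shannon entropy, a Euclidean-distance disequilibrium, and their product as an LMC-type complexity. Applying them to a normalized power spectrum, and the particular normalizations used, are assumptions of this example and may differ from the study's definitions.

```python
import numpy as np

def lmc_complexity(power_spectrum):
    """Return (normalized Shannon entropy, disequilibrium, LMC-type complexity)."""
    p = np.asarray(power_spectrum, dtype=float)
    p = p / p.sum()                      # treat the spectrum as a probability distribution
    n = p.size
    nz = p[p > 0]                        # 0 * log(0) is taken as 0
    shannon = float(-(nz * np.log(nz)).sum() / np.log(n))
    # Squared Euclidean distance from the uniform (equiprobable) distribution
    diseq = float(((p - 1.0 / n) ** 2).sum())
    return shannon, diseq, shannon * diseq

# Hypothetical spectra: a flat one and a single-peak one, for illustration only
flat = lmc_complexity([1.0, 1.0, 1.0, 1.0])
peak = lmc_complexity([1.0, 0.0, 0.0, 0.0])
```

Both extremes give zero complexity, the flat spectrum because its disequilibrium vanishes and the single peak because its entropy vanishes, so the measure is largest for intermediate degrees of irregularity.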
Loktionov, A S; Prianishnikov, V A
1981-05-01
A system has been proposed to provide automatic analysis of data from: a) point cytophotometry, b) two-wave cytophotometry, c) cytofluorimetry. The system feeds the data from a photomultiplier into a specialized computer, "Electronica-T3-16M", and performs statistical analysis of the data at the same time. Information on the programs used is presented. The advantages of the system, compared with some commercially available cytophotometers, are indicated.
Quantitative analysis of drainage obtained from aerial photographs and RBV/LANDSAT images
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Formaggio, A. R.; Epiphanio, J. C. N.; Filho, M. V.
1981-01-01
Data obtained from aerial photographs (1:60,000) and LANDSAT return beam vidicon imagery (1:100,000) concerning drainage density, drainage texture, hydrography density, and the average length of channels were compared. Statistical analysis shows that significant differences exist in data from the two sources. The highly drained area lost more information than the less drained area. In addition, it was observed that the loss of information about the number of rivers was higher than that about the length of the channels.
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.; Rahman, Zia-Ur; Woodell, Glenn A.; Hines, Glenn D.
2004-01-01
Noise is the primary visibility limit in the process of non-linear image enhancement, and is no longer a statistically stable additive noise in the post-enhancement image. Therefore novel approaches are needed to both assess and reduce spatially variable noise at this stage in overall image processing. Here we will examine the use of edge pattern analysis both for automatic assessment of spatially variable noise and as a foundation for new noise reduction methods.
Accuracy of trace element determinations in alternate fuels
NASA Technical Reports Server (NTRS)
Greenbauer-Seng, L. A.
1980-01-01
A review of the techniques used at Lewis Research Center (LeRC) in trace metals analysis is presented, including the results of Atomic Absorption Spectrometry and DC Arc Emission Spectrometry of blank levels and recovery experiments for several metals. The design of an Interlaboratory Study conducted by LeRC is presented. Several factors were investigated, including: laboratory, analytical technique, fuel type, concentration, and ashing additive. Conclusions drawn from the statistical analysis will help direct research efforts toward those areas most responsible for the poor interlaboratory analytical results.
Friedman, L.C.; Schroder, L.J.; Brooks, M.G.
1986-01-01
Solutions containing volatile organic compounds were prepared in organic-free water and 2% methanol and submitted to two U.S. Geological Survey laboratories. Data from the determination of volatile compounds in these samples were compared to analytical data for the same volatile compounds that had been kept in solutions 100 times more concentrated until immediately before analysis; there was no statistically significant difference in the analytical recoveries. Addition of 2% methanol to the storage containers hindered the recovery of bromomethane and vinyl chloride. Methanol addition did not enhance sample stability. Further, there was no statistically significant difference in results from the two laboratories, and the recovery efficiency was more than 80% in more than half of the determinations made. In a subsequent study, six of eight volatile compounds showed no significant loss of recovery after 34 days.
Gene flow analysis method, the D-statistic, is robust in a wide parameter space.
Zheng, Yichen; Janke, Axel
2018-01-08
We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis. The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting by diluting the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background. The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
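For readers unfamiliar with the D-statistic, the usual ABBA-BABA form can be sketched as below. This is a simplified illustration under stated assumptions: one aligned haploid sequence per taxon in the topology ((P1, P2), P3), outgroup, with the outgroup allele treated as ancestral. The function name and input format are hypothetical, and real analyses typically work from allele frequencies and add a block-jackknife significance test, which is omitted here.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences.

    ABBA sites: P1 carries the outgroup (ancestral) allele, P2 and P3 share
    the derived allele. BABA sites: P2 carries the ancestral allele, P1 and
    P3 share the derived allele.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if c == o:
            continue  # P3 is ancestral here, so the site is uninformative
        if a == o and b == c:
            abba += 1
        elif b == o and a == c:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba), abba, baba
```

A value of D near zero is what incomplete lineage sorting alone would produce on average; an excess of either pattern is read as a signal of gene flow between P3 and P2 (positive D) or P3 and P1 (negative D).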
Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)
NASA Astrophysics Data System (ADS)
Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee
2010-12-01
Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. 
A special article will compare and review available statistical seismology software packages.
Multivariate meta-analysis: potential and promise.
Jackson, Dan; Riley, Richard; White, Ian R
2011-09-10
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various viewpoints and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
Local sensitivity analysis for inverse problems solved by singular value decomposition
Hill, M.C.; Nolan, B.T.
2010-01-01
Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. 
All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) CSS/PCC can be more awkward to use because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
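The two process-model statistics discussed above, CSS and PCC, can be sketched from a sensitivity (Jacobian) matrix as follows. This is an illustration under assumptions, not the PEST or study implementation: it uses the commonly cited definitions, with dimensionless scaled sensitivities (dy_i/db_j) * b_j * sqrt(w_i) and a parameter covariance proportional to the inverse of X^T W X with diagonal weights. The function names, the native-value scaling, and the small hand-written matrix inverse are all hypothetical choices made to keep the example self-contained.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # crude absolute singularity check
            raise ValueError("singular matrix: parameters not separately estimable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def css_and_pcc(jacobian, params, weights):
    """Composite scaled sensitivities and parameter correlation coefficients.

    jacobian[i][j] is dy_i/db_j, params[j] is b_j, weights[i] is w_i.
    """
    nd, npar = len(jacobian), len(params)
    dss = [[jacobian[i][j] * params[j] * math.sqrt(weights[i])
            for j in range(npar)] for i in range(nd)]
    css = [math.sqrt(sum(dss[i][j] ** 2 for i in range(nd)) / nd)
           for j in range(npar)]
    # Covariance is proportional to (X^T W X)^-1; the constant cancels in PCC.
    xtwx = [[sum(weights[i] * jacobian[i][j] * jacobian[i][k] for i in range(nd))
             for k in range(npar)] for j in range(npar)]
    cov = invert(xtwx)
    pcc = [[cov[j][k] / math.sqrt(cov[j][j] * cov[k][k])
            for k in range(npar)] for j in range(npar)]
    return css, pcc
```

In this form a parameter can have a respectable CSS and still be hard to estimate if its PCC with another parameter is close to 1 in absolute value, which is the kind of distinction the abstract draws for BD1 and WFC1.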
van der Krieke, Lian; Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith Gm; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter
2015-08-07
Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher's tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. 
Analysis of additional datasets is needed in order to validate and refine the application for general use.
Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter
2015-01-01
Background: Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective: This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods: We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results: An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions: Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. 
Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160
da Silva, Camila Sousa; de Souza, Evaristo Jorge Oliveira; Pereira, Gerfesson Felipe Cavalcanti; Cavalcante, Edwilka Oliveira; de Lima, Ewerton Ivo Martins; Torres, Thaysa Rodrigues; da Silva, José Ricardo Coelho; da Silva, Daniel Cézar
2017-02-01
The objective was to evaluate the intake, digestibility, and ingestive behavior of sheep fed phytogenic additives derived from plant extracts. Five non-emasculated sheep without defined breed at 28 ± 1.81 kg initial body weight and 6 months age were used. Treatments consisted of administering four phytogenic additives from extracts of garlic, coriander seed, oregano, and mesquite pods, plus a control treatment (without additive). The ration was composed of Tifton 85 hay grass, corn, soybean meal, and mineral salt. As experimental design, we used a 5 × 5 Latin square design (five treatments and five periods). The data were analyzed with a mixed model using the PROC MIXED procedure of the Statistical Analysis System (SAS) software, version 9.1, comparing the treatment without additive (control) with the phytogenic additives produced from vegetable extracts of mesquite pod, coriander seed, garlic bulb, and oregano leaves. There were no significant differences for the nutrient intake and ingestive behavior patterns. However, intake of the additives derived from mesquite pod and coriander extracts provided an increase in digestibility. Extracts from garlic, coriander, and mesquite pods can be used as phytogenic additives in feeding sheep.
Wallach, Joshua D; Sullivan, Patrick G; Trepanowski, John F; Sainani, Kristin L; Steyerberg, Ewout W; Ioannidis, John P A
2017-04-01
Many published randomized clinical trials (RCTs) make claims for subgroup differences. To evaluate how often subgroup claims reported in the abstracts of RCTs are actually supported by statistical evidence (P < .05 from an interaction test) and corroborated by subsequent RCTs and meta-analyses. This meta-epidemiological survey examines data sets of trials with at least 1 subgroup claim, including Subgroup Analysis of Trials Is Rarely Easy (SATIRE) articles and Discontinuation of Randomized Trials (DISCO) articles. We used Scopus (updated July 2016) to search for English-language articles citing each of the eligible index articles with at least 1 subgroup finding in the abstract. Articles with a subgroup claim in the abstract with or without evidence of statistical heterogeneity (P < .05 from an interaction test) in the text and articles attempting to corroborate the subgroup findings. Study characteristics of trials with at least 1 subgroup claim in the abstract were recorded. Two reviewers extracted the data necessary to calculate subgroup-level effect sizes, standard errors, and the P values for interaction. For individual RCTs and meta-analyses that attempted to corroborate the subgroup findings from the index articles, trial characteristics were extracted. Cochran Q test was used to reevaluate heterogeneity with the data from all available trials. The number of subgroup claims in the abstracts of RCTs, the number of subgroup claims in the abstracts of RCTs with statistical support (subgroup findings), and the number of subgroup findings corroborated by subsequent RCTs and meta-analyses. Sixty-four eligible RCTs made a total of 117 subgroup claims in their abstracts. Of these 117 claims, only 46 (39.3%) in 33 articles had evidence of statistically significant heterogeneity from a test for interaction. 
In addition, out of these 46 subgroup findings, only 16 (34.8%) ensured balance between randomization groups within the subgroups (eg, through stratified randomization), 13 (28.3%) entailed a prespecified subgroup analysis, and 1 (2.2%) was adjusted for multiple testing. Only 5 (10.9%) of the 46 subgroup findings had at least 1 subsequent pure corroboration attempt by a meta-analysis or an RCT. In all 5 cases, the corroboration attempts found no evidence of a statistically significant subgroup effect. In addition, all effect sizes from meta-analyses were attenuated toward the null. A minority of subgroup claims made in the abstracts of RCTs are supported by their own data (ie, a significant interaction effect). For those that have statistical support (P < .05 from an interaction test), most fail to meet other best practices for subgroup tests, including prespecification, stratified randomization, and adjustment for multiple testing. Attempts to corroborate statistically significant subgroup differences are rare; when done, the initially observed subgroup differences are not reproduced.
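The heterogeneity test mentioned in this abstract can be sketched as below. This is a minimal illustration, assuming subgroup-level effect estimates on an additive scale (for example log hazard ratios) with their standard errors: Cochran's Q is the inverse-variance weighted sum of squared deviations from the pooled estimate. The function names are hypothetical, and the P value helper covers only the two-subgroup case (1 degree of freedom), where the chi-square tail reduces to a normal tail.

```python
import math

def cochran_q(effects, std_errors):
    """Cochran's Q for k estimates; compare with chi-square on k - 1 df."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

def interaction_p_two_subgroups(effect_a, se_a, effect_b, se_b):
    """P value for interaction between exactly two subgroups (1 df)."""
    q = cochran_q([effect_a, effect_b], [se_a, se_b])
    # For 1 df, P(chi2 > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2)).
    return math.erfc(math.sqrt(q / 2.0))
```

With two subgroups this is equivalent to a z-test on the difference between the two estimates, which is the usual form of the interaction test that the survey uses as its threshold for statistical support (P < .05).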
Re-analysis of survival data of cancer patients utilizing additive homeopathy.
Gleiss, Andreas; Frass, Michael; Gaertner, Katharina
2016-08-01
In this short communication we present a re-analysis of homeopathic patient data in comparison to control patient data from the same Outpatient's Unit "Homeopathy in malignant diseases" of the Medical University of Vienna. In this analysis we took account of a probable immortal time bias. For patients suffering from advanced stages of cancer and surviving the first 6 or 12 months after diagnosis, respectively, the results show that utilizing homeopathy gives a statistically significant (p<0.001) advantage over control patients regarding survival time. In conclusion, bearing in mind all limitations, the results of this retrospective study suggest that patients with advanced stages of cancer might benefit from additional homeopathic treatment until a survival time of up to 12 months after diagnosis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Venter, Anre; Maxwell, Scott E; Bolig, Erika
2002-06-01
Adding a pretest as a covariate to a randomized posttest-only design increases statistical power, as does the addition of intermediate time points to a randomized pretest-posttest design. Although typically 5 waves of data are required in this instance to produce meaningful gains in power, a 3-wave intensive design allows the evaluation of the straight-line growth model and may reduce the effect of missing data. The authors identify the statistically most powerful method of data analysis in the 3-wave intensive design. If straight-line growth is assumed, the pretest-posttest slope must assume fairly extreme values for the intermediate time point to increase power beyond the standard analysis of covariance on the posttest with the pretest as covariate, ignoring the intermediate time point.
NASA Astrophysics Data System (ADS)
Woolley, Thomas W.; Dawson, George O.
It has been two decades since the first power analysis of a psychological journal and 10 years since the Journal of Research in Science Teaching made its contribution to this debate. One purpose of this article is to investigate what power-related changes, if any, have occurred in science education research over the past decade as a result of the earlier survey. In addition, previous recommendations are expanded and expounded upon within the context of more recent work in this area. The absence of any consistent mode of presenting statistical results, as well as little change with regard to power-related issues are reported. Guidelines for reporting the minimal amount of information demanded for clear and independent evaluation of research results by readers are also proposed.
Heart Rate Variability Dynamics for the Prognosis of Cardiovascular Risk
Ramirez-Villegas, Juan F.; Lam-Espinosa, Eric; Ramirez-Moreno, David F.; Calvo-Echeverry, Paulo C.; Agredo-Rodriguez, Wilfredo
2011-01-01
Statistical, spectral, multi-resolution and non-linear methods were applied to heart rate variability (HRV) series linked with classification schemes for the prognosis of cardiovascular risk. A total of 90 HRV records were analyzed: 45 from healthy subjects and 45 from cardiovascular risk patients. A total of 52 features from all the analysis methods were evaluated using standard two-sample Kolmogorov-Smirnov test (KS-test). The results of the statistical procedure provided input to multi-layer perceptron (MLP) neural networks, radial basis function (RBF) neural networks and support vector machines (SVM) for data classification. These schemes showed high performances with both training and test sets and many combinations of features (with a maximum accuracy of 96.67%). Additionally, there was a strong consideration for breathing frequency as a relevant feature in the HRV analysis. PMID:21386966
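The feature-screening step described here relies on the two-sample Kolmogorov-Smirnov test, whose statistic is the largest gap between the two empirical distribution functions. The sketch below computes only that statistic for one feature; the function name is hypothetical, and the conversion to a P value (and any selection threshold the authors used) is deliberately left out.

```python
def ks_two_sample_statistic(x, y):
    """Maximum absolute difference between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every observation equal to v (handles ties).
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d
```

Applied per feature to the healthy and at-risk groups, a larger statistic indicates a feature whose distributions differ more between the two groups.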
Advanced functional network analysis in the geosciences: The pyunicorn package
NASA Astrophysics Data System (ADS)
Donges, Jonathan F.; Heitzig, Jobst; Runge, Jakob; Schultz, Hanna C. H.; Wiedermann, Marc; Zech, Alraune; Feldhoff, Jan; Rheinwalt, Aljoscha; Kutza, Hannes; Radebach, Alexander; Marwan, Norbert; Kurths, Jürgen
2013-04-01
Functional networks are a powerful tool for analyzing large geoscientific datasets such as global fields of climate time series originating from observations or model simulations. pyunicorn (pythonic unified complex network and recurrence analysis toolbox) is an open-source, fully object-oriented and easily parallelizable package written in the language Python. It allows for constructing functional networks (aka climate networks) representing the structure of statistical interrelationships in large datasets and, subsequently, investigating this structure using advanced methods of complex network theory such as measures for networks of interacting networks, node-weighted statistics or network surrogates. Additionally, pyunicorn allows the study of the complex dynamics of geoscientific systems as recorded by time series by means of recurrence networks and visibility graphs. The range of possible applications of the package is outlined drawing on several examples from climatology.
Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu
2018-05-01
Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only a few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to focus on solid cancer mortality risk from LDIR in the nuclear industry using standardized mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects by IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality.
Statistical Analysis of Zebrafish Locomotor Response.
Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
Statistical Analysis of Zebrafish Locomotor Response
Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai
2015-01-01
Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling’s T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling’s T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure. PMID:26437184
Validating an Air Traffic Management Concept of Operation Using Statistical Modeling
NASA Technical Reports Server (NTRS)
He, Yuning; Davies, Misty Dawn
2013-01-01
Validating a concept of operation for a complex, safety-critical system (like the National Airspace System) is challenging because of the high dimensionality of the controllable parameters and the infinite number of states of the system. In this paper, we use statistical modeling techniques to explore the behavior of a conflict detection and resolution algorithm designed for the terminal airspace. These techniques predict the robustness of the system simulation to both nominal and off-nominal behaviors within the overall airspace. They also can be used to evaluate the output of the simulation against recorded airspace data. Additionally, the techniques carry with them a mathematical value of the worth of each prediction: a statistical uncertainty for any robustness estimate. Uncertainty Quantification (UQ) is the process of quantitative characterization and ultimately a reduction of uncertainties in complex systems. UQ is important for understanding the influence of uncertainties on the behavior of a system and therefore is valuable for design, analysis, and verification and validation. In this paper, we apply advanced statistical modeling methodologies and techniques on an advanced air traffic management system, namely the Terminal Tactical Separation Assured Flight Environment (T-TSAFE). We show initial results for a parameter analysis and safety boundary (envelope) detection in the high-dimensional parameter space. For our boundary analysis, we developed a new sequential approach based upon the design of computer experiments, allowing us to incorporate knowledge from domain experts into our modeling and to determine the most likely boundary shapes and their parameters. We carried out the analysis on system parameters and describe an initial approach that will allow us to include time-series inputs, such as the radar track data, into the analysis.
Sarode, D; Bari, D A; Cain, A C; Syed, M I; Williams, A T
2017-04-01
To critically evaluate the evidence comparing success rates of endonasal dacryocystorhinostomy (EN-DCR) with and without silicone tubing and to thus determine whether silicone intubation is beneficial in primary EN-DCR. Systematic review and meta-analysis. A literature search was performed on AMED, EMBASE, HMIC, MEDLINE, PsycINFO, BNI, CINAHL, HEALTH BUSINESS ELITE, CENTRAL and Cochrane Ear, Nose and Throat disorders groups trials register using a combination of various MeSH terms. The date of last search was January 2016. This review was limited to randomised controlled trials (RCTs) in the English language. Risk of bias was assessed using the Cochrane Collaboration's risk of bias tool. Chi-square and I² statistics were calculated to determine the presence and extent of statistical heterogeneity. Study selection, data extraction and risk of bias scoring were performed independently by two authors in concordance with the PRISMA statement. Five RCTs (447 primary EN-DCR procedures in 426 patients) were included for analysis. Moderate interstudy statistical heterogeneity was demonstrated (Chi² = 6.18; d.f. = 4; I² = 35%). Bicanalicular silicone stents were used in 229 and not used in 218 procedures. The overall success rate of EN-DCR was 92.8% (415/447). The success rate of EN-DCR was 93.4% (214/229) with silicone tubing and 92.2% (201/218) without silicone tubing. Meta-analysis using a random-effects model showed no statistically significant difference in outcomes between the two groups (P = 0.63; RR = 0.79; 95% CI = 0.3-2.06). Our review and meta-analysis did not demonstrate an additional advantage of silicone stenting. A high-quality well-powered prospective multicentre RCT is needed to further clarify the benefit of silicone stents. © 2016 John Wiley & Sons Ltd.
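The heterogeneity figure and the success rates quoted in this abstract can be re-derived from the reported counts. The following is a minimal sketch, assuming the I² statistic is the usual Higgins form, (Q - d.f.)/Q truncated at zero; the function names are illustrative and not taken from the paper.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins' I^2 as a percentage: share of Q in excess of its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def success_rate(successes: int, total: int) -> float:
    """Success rate as a percentage."""
    return 100.0 * successes / total


# Values reported in the abstract: Chi-square (Q) = 6.18 with 4 degrees of freedom.
print(round(i_squared(6.18, 4)))         # 35
print(round(success_rate(214, 229), 1))  # 93.4 (with silicone tubing)
print(round(success_rate(201, 218), 1))  # 92.2 (without silicone tubing)
print(round(success_rate(415, 447), 1))  # 92.8 (overall)
```

The recomputed values agree with the 35%, 93.4%, 92.2% and 92.8% stated in the abstract.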
Use of Statistical Analyses in the Ophthalmic Literature
Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.
2014-01-01
Purpose: To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design: Cross-sectional study. Methods: All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures: Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally, we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results: Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in retina and glaucoma subspecialties showed a tendency for using more complex analysis when compared to cornea. Conclusions: Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature.
The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977
Milic, Natasa M.; Masic, Srdjan; Milin-Lazovic, Jelena; Trajkovic, Goran; Bukumiric, Zoran; Savic, Marko; Milic, Nikola V.; Cirkovic, Andja; Gajic, Milan; Kostic, Mirjana; Ilic, Aleksandra; Stanisavljevic, Dejana
2016-01-01
Background: The scientific community increasingly is recognizing the need to bolster standards of data analysis given the widespread concern that basic mistakes in data analysis are contributing to the irreproducibility of many published research findings. The aim of this study was to investigate students’ attitudes towards statistics within a multi-site medical educational context, monitor their changes and impact on student achievement. In addition, we performed a systematic review to better support our future pedagogical decisions in teaching applied statistics to medical students. Methods: A validated Serbian Survey of Attitudes Towards Statistics (SATS-36) questionnaire was administered to medical students attending obligatory introductory courses in biostatistics from three medical universities in the Western Balkans. A systematic review of peer-reviewed publications was performed through searches of Scopus, Web of Science, Science Direct, Medline, and APA databases through 1994. A meta-analysis was performed for the correlation coefficients between SATS component scores and statistics achievement. Pooled estimates were calculated using random effects models. Results: SATS-36 was completed by 461 medical students. Most of the students held positive attitudes towards statistics. Ability in mathematics and grade point average were associated in a multivariate regression model with the Cognitive Competence score, after adjusting for age, gender and computer ability. The results of 90 paired data showed that Affect, Cognitive Competence, and Effort scores demonstrated significant positive changes. The Cognitive Competence score showed the largest increase (M = 0.48, SD = 0.95). The positive correlation found between the Cognitive Competence score and students’ achievement (r = 0.41; p<0.001), was also shown in the meta-analysis (r = 0.37; 95% CI 0.32–0.41).
Conclusion: Students' subjective attitudes regarding Cognitive Competence at the beginning of the biostatistics course, which were directly linked to mathematical knowledge, affected their attitudes at the end of the course that, in turn, influenced students' performance. This indicates the importance of positively changing not only students’ cognitive competency, but also their perceptions of gained competency during the biostatistics course. PMID:27764123
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both the biomedical engineer and the clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
NASA Astrophysics Data System (ADS)
Seaward, James Nicholas
International development organizations have recently ramped up efforts to promote the use of improved cookstoves (ICS) in developing countries, aiming to reduce the harmful environmental and public health impacts of the burning of biomass for cooking and heating. I hypothesize that ICS use also has additional benefits, both economic and social, that can contribute to women's economic empowerment in the developing world. To explore the relationship between ICS use and women's economic empowerment, I use Ordinary Least Squares and Logit models based on data from the India Human Development Survey (IHDS) to analyze differences between women living in households that use ICS and those living in homes that use traditional cookstoves. My regression results reveal that ICS use has a statistically significant and negative effect on the amount of time women and girls spend on fuel collection and a statistically significant and positive effect on the likelihood of women's participation in side businesses, but does not have a statistically significant effect on the likelihood of lost productivity. My analysis shows promise that in addition to health and environmental benefits, fuel-efficient cooking technologies can also have social and economic impacts that are especially beneficial to women. It is my hope that the analysis provided in this paper will be used to further the dialogue about the importance of women's access to modern energy services in the fight to improve women's living standards in the developing world.
Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.
2011-01-01
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
NASA Astrophysics Data System (ADS)
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an Android Statistics Data Analysis application that can be accessed through mobile devices, making it easier for users to access. The Statistics Data Analysis application includes various topics in basic statistics along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used by students, lecturers, and users who need the results of statistical calculations quickly and in an easily understood form. The Android application is developed using the Java programming language. The server is programmed in PHP with the CodeIgniter framework, and the database uses MySQL. The system development methodology used is the Waterfall methodology, with the stages of analysis, design, coding, testing, implementation, and system maintenance. This statistical data analysis application is expected to support statistics lecturing activities and make it easier for students to understand statistical analysis on mobile devices.
Piepho, H P
1995-03-01
The additive main effects multiplicative interaction model is frequently used in the analysis of multilocation trials. In the analysis of such data it is of interest to decide how many of the multiplicative interaction terms are significant. Several tests for this task are available, all of which assume that errors are normally distributed with a common variance. This paper investigates the robustness of several tests (Gollob, F_GH1, F_GH2, F_R) to departures from these assumptions. It is concluded that, because of its better robustness, the F_R test is preferable. If the other tests are to be used, preliminary tests for the validity of assumptions should be performed.
A Categorization of Dynamic Analyzers
NASA Technical Reports Server (NTRS)
Lujan, Michelle R.
1997-01-01
Program analysis techniques and tools are essential to the development process because of the support they provide in detecting errors and deficiencies at different phases of development. The types of information rendered through analysis include the following: statistical measurements of code, type checks, dataflow analysis, consistency checks, test data, verification of code, and debugging information. Analyzers can be broken into two major categories: dynamic and static. Static analyzers examine programs with respect to syntax errors and structural properties. This includes gathering statistical information on program content, such as the number of lines of executable code, source lines, and cyclomatic complexity. In addition, static analyzers provide the ability to check for the consistency of programs with respect to variables. Dynamic analyzers, in contrast, are dependent on input and the execution of a program, providing the ability to find errors that cannot be detected through the use of static analysis alone. Dynamic analysis provides information on the behavior of a program rather than on the syntax. Both types of analysis detect errors in a program, but dynamic analyzers accomplish this through run-time behavior. This paper focuses on the following broad classification of dynamic analyzers: 1) Metrics; 2) Models; and 3) Monitors. Metrics are those analyzers that provide measurement. The next category, models, captures those analyzers that present the state of the program to the user at specified points in time. The last category, monitors, checks specified code based on some criteria. The paper discusses each classification and the techniques that are included under them. In addition, the role of each technique in the software life cycle is discussed.
Familiarization with the tools that measure, model and monitor programs provides a framework for understanding the program's dynamic behavior from different perspectives through analysis of the input/output data.
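Two of the categories named in this abstract, metrics and monitors, can be illustrated with a small run-time instrumentation sketch. This is only an illustration under stated assumptions, not a tool described in the paper: the decorator names `metric` and `monitor` and the example function `absolute_difference` are hypothetical.

```python
import functools
import time
from collections import Counter

call_counts = Counter()   # metric: number of calls per function
elapsed = Counter()       # metric: accumulated run time in seconds per function


def metric(func):
    """Hypothetical 'metrics' analyzer: measures calls and run time at execution."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            call_counts[func.__name__] += 1
            elapsed[func.__name__] += time.perf_counter() - start
    return wrapper


def monitor(check):
    """Hypothetical 'monitor' analyzer: checks each result against a criterion."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not check(result):
                raise AssertionError(f"{func.__name__} returned {result!r}")
            return result
        return wrapper
    return decorate


@metric
@monitor(lambda r: r >= 0)
def absolute_difference(a, b):
    return abs(a - b)


absolute_difference(3, 10)
absolute_difference(5, 1)
print(call_counts["absolute_difference"])  # 2
```

Both decorators depend on actually executing the program with concrete inputs, which is the property that separates dynamic from static analysis in the abstract.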
van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B
2015-01-01
Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.
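The abstract combines per-study effect sizes with an inverse-variance random-effects model but does not say how the between-study variance was estimated. The sketch below assumes the DerSimonian-Laird estimator, a common choice; the function name and the input values are hypothetical and are not data from the study.

```python
import math


def random_effects_pool(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed here).

    Returns (pooled effect, standard error, tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                     # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical effect sizes for one gene in three experiments.
pooled, se, tau2 = random_effects_pool([0.8, 0.3, 0.6], [0.04, 0.09, 0.05])
print(pooled, se, tau2)
```

In a gene expression meta-analysis, a pooling step of this kind would be repeated per gene; multiple-testing correction is a separate step not shown here.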
6C.04: INTEGRATED SNP ANALYSIS AND METABOLOMIC PROFILES OF METABOLIC SYNDROME.
Marrachelli, V; Monleon, D; Morales, J M; Rentero, P; Martínez, F; Chaves, F J; Martin-Escudero, J C; Redon, J
2015-06-01
Metabolic syndrome (MS) has become a health and financial burden worldwide. Susceptibility of genetically determined metabotype of MS has not yet been investigated. We aimed to identify a distinctive metabolic profile of blood serum which might correlate with the early detection of the development of MS associated with genetic polymorphism. We applied high resolution NMR spectroscopy to profile blood serum from patients without MS (n = 945) or with MS (n = 291). Principal component analysis (PCA) and projection to latent structures for discriminant analysis (PLS-DA) were applied to NMR spectral datasets. Results were cross-validated using the Venetian Blinds approach. Additionally, five SNPs previously associated with MS were genotyped with SNPlex and tested for associations between the metabolic profiles and the genetic variants. Statistical analysis was performed using in-house MATLAB scripts and the PLS Toolbox statistical multivariate analysis library. Our analysis provided a PLS-DA Metabolic Syndrome discrimination model based on NMR metabolic profile (AUC = 0.86) with 84% sensitivity and 72% specificity. The model identified 11 metabolites differentially regulated in patients with MS. Among others, fatty acids, glucose, alanine, hydroxyisovalerate, acetone, trimethylamine, 2-phenylpropionate, isobutyrate and valine significantly contributed to the model. The combined analysis of metabolomics and SNP data revealed an association between the metabolic profile of MS and polymorphisms of genes involved in adiposity regulation and fatty acid metabolism: rs2272903_TT (TFAP2B), rs3803_TT (GATA2), rs174589_CC (FADS2) and rs174577_AA (FADS2). In addition, individuals with the rs2272903-TT genotype seem to develop MS earlier than the general population. Our study provides new insights on the metabolic alterations associated with a MS high-risk genotype.
These results could help in future development of risk assessment and predictive models for subclinical cardiovascular disease.
An Introduction to MAMA (Meta-Analysis of MicroArray data) System.
Zhang, Zhe; Fenstermacher, David
2005-01-01
Analyzing microarray data across multiple experiments has proven advantageous. To support this kind of analysis, we are developing a software system called MAMA (Meta-Analysis of MicroArray data). MAMA utilizes a client-server architecture with a relational database on the server side for the storage of microarray datasets collected from various resources. The client side is an application running on the end user's computer that allows the user to manipulate microarray data and analytical results locally. The MAMA implementation will integrate several analytical methods, including meta-analysis, within an open-source framework, offering other developers the flexibility to plug in additional statistical algorithms.
Using data warehousing and OLAP in public health care.
Hristovski, D.; Rogac, M.; Markota, M.
2000-01-01
The paper describes the possibilities of using data warehousing and OLAP technologies in public health care in general and then our own experience with these technologies gained during the implementation of a data warehouse of outpatient data at the national level. Such a data warehouse serves as a basis for advanced decision support systems based on statistical, OLAP or data mining methods. We used OLAP to enable interactive exploration and analysis of the data. We found out that data warehousing and OLAP are suitable for the domain of public health and that they enable new analytical possibilities in addition to the traditional statistical approaches. PMID:11079907
An analysis of tropical hardwood product importation and consumption in the United States
Paul M. Smith; Michael P. Haas; William G. Luppold
1995-01-01
The consumption of forest products emanating from tropical rainforests is an issue that is receiving increasing attention in the United States. This attention stems from concerns over the sustainability of tropical ecosystems. However, trade statistics show the United States imported only 4.0 percent of all tropical timber products traded globally in 1989. In addition...
"Hold the Phone!": Cell Phone Use and Partner Reaction among University Students
ERIC Educational Resources Information Center
Beaver, Tiffany; Knox, David; Zusman, Marty E.
2010-01-01
Analysis of survey data from 995 undergraduates at a large southeastern university revealed that 93% reported owning a cell phone and a statistically significant difference between women and men (95% versus 91.2%) and between Whites (95.1%) and Blacks (87.7%). In addition, Blacks were twice as likely as Whites to be bothered by their partner's use…
Statistical, Graphical, and Learning Methods for Sensing, Surveillance, and Navigation Systems
2016-06-28
harsh propagation environments. Conventional filtering techniques fail to provide satisfactory performance in many important nonlinear or non...Gaussian scenarios. In addition, there is a lack of a unified methodology for the design and analysis of different filtering techniques. To address...these problems, we have proposed a new filtering methodology called belief condensation (BC) DISTRIBUTION A: Distribution approved for public release
ERIC Educational Resources Information Center
Strolin-Goltzman, Jessica
2008-01-01
This comparison study analyzes the commonalities, similarities, and differences on supervisory and organizational factors between a group of high turnover systems and a group of low turnover systems. Significant differences on organizational factors, but not on supervisory factors, emerged from the statistical analysis. Additionally, this study…
ERIC Educational Resources Information Center
Mercado, Claudia
2012-01-01
The purpose of this study was to learn more about the Hispanic students attending Northeastern Illinois University, a four-year institution in Chicago, IL, and their student success. Little is known descriptively and statistically about this population at NEIU, which serves as a Hispanic-Serving Institution. In addition, little is known about…
A Heat Vulnerability Index and Adaptation Solutions for Pittsburgh, Pennsylvania.
Bradford, Kathryn; Abrahams, Leslie; Hegglin, Miriam; Klima, Kelly
2015-10-06
With increasing evidence of global warming, many cities have focused attention on response plans to address their populations' vulnerabilities. Despite expected increased frequency and intensity of heat waves, the health impacts of such events in urban areas can be minimized with careful policy and economic investments. We focus on Pittsburgh, Pennsylvania and ask two questions. First, what are the top factors contributing to heat vulnerability and how do these characteristics manifest geospatially throughout Pittsburgh? Second, assuming the City wishes to deploy additional cooling centers, what placement will optimally address the vulnerability of the at-risk populations? We use national census data, ArcGIS geospatial modeling, and statistical analysis to determine a range of heat vulnerability indices and optimal cooling center placement. We find that while different studies use different data and statistical calculations, all methods tested locate additional cooling centers at the confluence of the three rivers (Downtown), the northeast side of Pittsburgh (Shadyside/Highland Park), and the southeast side of Pittsburgh (Squirrel Hill). This suggests that for Pittsburgh, a researcher could apply the same factor analysis procedure to compare data sets for different locations and times; factor analyses for heat vulnerability are more robust than previously thought.
A nonlinear isobologram model with Box-Cox transformation to both sides for chemical mixtures.
Chen, D G; Pounds, J G
1998-12-01
The linear logistical isobologram is a commonly used and powerful graphical and statistical tool for analyzing the combined effects of simple chemical mixtures. In this paper a nonlinear isobologram model is proposed to analyze the joint action of chemical mixtures for quantitative dose-response relationships. This nonlinear isobologram model incorporates two additional new parameters, Ymin and Ymax, to facilitate analysis of response data that are not constrained between 0 and 1, where parameters Ymin and Ymax represent the minimal and the maximal observed toxic response. This nonlinear isobologram model for binary mixtures can be expressed as [formula: see text] In addition, a Box-Cox transformation to both sides is introduced to improve the goodness of fit and to provide a more robust model for achieving homogeneity and normality of the residuals. Finally, a confidence band is proposed for selected isobols, e.g., the median effective dose, to facilitate graphical and statistical analysis of the isobologram. The versatility of this approach is demonstrated using published data describing the toxicity of the binary mixtures of citrinin and ochratoxin as well as new experimental data from our laboratory for mixtures of mercury and cadmium.
A nonlinear isobologram model with Box-Cox transformation to both sides for chemical mixtures.
Chen, D G; Pounds, J G
1998-01-01
The linear logistical isobologram is a commonly used and powerful graphical and statistical tool for analyzing the combined effects of simple chemical mixtures. In this paper a nonlinear isobologram model is proposed to analyze the joint action of chemical mixtures for quantitative dose-response relationships. This nonlinear isobologram model incorporates two additional new parameters, Ymin and Ymax, to facilitate analysis of response data that are not constrained between 0 and 1, where parameters Ymin and Ymax represent the minimal and the maximal observed toxic response. This nonlinear isobologram model for binary mixtures can be expressed as [formula: see text] In addition, a Box-Cox transformation to both sides is introduced to improve the goodness of fit and to provide a more robust model for achieving homogeneity and normality of the residuals. Finally, a confidence band is proposed for selected isobols, e.g., the median effective dose, to facilitate graphical and statistical analysis of the isobologram. The versatility of this approach is demonstrated using published data describing the toxicity of the binary mixtures of citrinin and ochratoxin as well as new experimental data from our laboratory for mixtures of mercury and cadmium. PMID:9860894
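The transform-both-sides idea mentioned in this abstract can be sketched with the standard Box-Cox transformation, (y^λ - 1)/λ for λ ≠ 0 and log y for λ = 0, applied to both the observed response and the model prediction. The isobologram model itself is elided in the source, so the prediction below is a plain hypothetical number, and the function names are illustrative.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Standard Box-Cox transformation for a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def tbs_residual(observed: float, predicted: float, lam: float) -> float:
    """Residual after applying the same transformation to both sides."""
    return box_cox(observed, lam) - box_cox(predicted, lam)


# Hypothetical observed response and model prediction, with lambda = 0.5.
print(tbs_residual(4.0, 1.0, 0.5))  # 2.0
```

Because the same λ is applied to data and model, the model parameters keep their original interpretation while the residuals are reshaped toward constant variance and normality.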
A Heat Vulnerability Index and Adaptation Solutions for Pittsburgh, Pennsylvania
NASA Astrophysics Data System (ADS)
Klima, K.; Abrahams, L.; Bradford, K.; Hegglin, M.
2015-12-01
With increasing evidence of global warming, many cities have focused attention on response plans to address their populations' vulnerabilities. Despite expected increased frequency and intensity of heat waves, the health impacts of such events in urban areas can be minimized with careful policy and economic investments. We focus on Pittsburgh, Pennsylvania and ask two questions. First, what are the top factors contributing to heat vulnerability and how do these characteristics manifest geospatially throughout Pittsburgh? Second, assuming the City wishes to deploy additional cooling centers, what placement will optimally address the vulnerability of the at-risk populations? We use national census data, ArcGIS geospatial modeling, and statistical analysis to determine a range of heat vulnerability indices and optimal cooling center placement. We find that while different studies use different data and statistical calculations, all methods tested locate additional cooling centers at the confluence of the three rivers (Downtown), the northeast side of Pittsburgh (Shadyside/Highland Park), and the southeast side of Pittsburgh (Squirrel Hill). This suggests that for Pittsburgh, a researcher could apply the same factor analysis procedure to compare datasets for different locations and times; factor analyses for heat vulnerability are more robust than previously thought.
Buttigieg, Pier Luigi; Ramette, Alban
2014-12-01
The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.
Kirkpatrick, Robert M; McGue, Matt; Iacono, William G
2015-03-01
The present study of general cognitive ability attempts to replicate and extend previous investigations of a biometric moderator, family-of-origin socioeconomic status (SES), in a sample of 2,494 pairs of adolescent twins, non-twin biological siblings, and adoptive siblings assessed with individually administered IQ tests. We hypothesized that SES would covary positively with additive-genetic variance and negatively with shared-environmental variance. Important potential confounds unaddressed in some past studies, such as twin-specific effects, assortative mating, and differential heritability by trait level, were found to be negligible. In our main analysis, we compared models by their sample-size corrected AIC, and base our statistical inference on model-averaged point estimates and standard errors. Additive-genetic variance increased with SES-an effect that was statistically significant and robust to model specification. We found no evidence that SES moderated shared-environmental influence. We attempt to explain the inconsistent replication record of these effects, and provide suggestions for future research.
Kirkpatrick, Robert M.; McGue, Matt; Iacono, William G.
2015-01-01
The present study of general cognitive ability attempts to replicate and extend previous investigations of a biometric moderator, family-of-origin socioeconomic status (SES), in a sample of 2,494 pairs of adolescent twins, non-twin biological siblings, and adoptive siblings assessed with individually administered IQ tests. We hypothesized that SES would covary positively with additive-genetic variance and negatively with shared-environmental variance. Important potential confounds unaddressed in some past studies, such as twin-specific effects, assortative mating, and differential heritability by trait level, were found to be negligible. In our main analysis, we compared models by their sample-size corrected AIC, and base our statistical inference on model-averaged point estimates and standard errors. Additive-genetic variance increased with SES—an effect that was statistically significant and robust to model specification. We found no evidence that SES moderated shared-environmental influence. We attempt to explain the inconsistent replication record of these effects, and provide suggestions for future research. PMID:25539975
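The model-comparison step described above (sample-size corrected AIC followed by model averaging) can be sketched as follows. This is a generic illustration using the standard AICc and Akaike-weight formulas, not the authors' code; the function names are hypothetical, and model-averaged standard errors are omitted.

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for k free parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Convert a list of AICc scores into weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

def model_average(estimates, weights):
    """Weighted average of per-model point estimates."""
    return sum(w * e for w, e in zip(weights, estimates))
```

Models with lower AICc receive larger weights, so the averaged estimate is dominated by the better-supported models.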
NASA Astrophysics Data System (ADS)
Akmaev, R. A.
1999-04-01
In Part 1 of this work ([Akmaev, 1999]), an overview of the theory of optimal interpolation (OI) ([Gandin, 1963]) and related techniques of data assimilation based on linear optimal estimation ([Liebelt, 1967]; [Catlin, 1989]; [Mendel, 1995]) is presented. The approach relies on additional statistical information in the data analysis, in the form of statistical moments, e.g., the mean and covariance (correlation). The a priori statistical characteristics, if available, make it possible to constrain expected errors and obtain estimates of the true state that are optimal in some sense from a set of observations in a given domain in space and/or time. The primary objective of OI is to provide estimates away from the observations, i.e., to fill in data voids in the domain under consideration. Additionally, OI performs smoothing, suppressing the noise, i.e., the spectral components that are presumably not present in the true signal. Usually, the criterion of optimality is minimum variance of the expected errors, and the whole approach may be considered constrained least squares or least squares with a priori information. Obviously, data assimilation techniques capable of incorporating any additional information are potentially superior to techniques that have no access to such information as, for example, the conventional least squares (e.g., [Liebelt, 1967]; [Weisberg, 1985]; [Press et al., 1992]; [Mendel, 1995]).
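As a minimal illustration of the minimum-variance idea, the sketch below treats the simplest possible case: one observation is used to estimate the state at a different, unobserved location. The a priori mean, background variance, correlation, and observation-error variance are all inputs; the function name and argument names are assumptions made for this example only.

```python
def oi_single_observation(mean_target, mean_obs, y_obs,
                          var_background, correlation, var_obs_error):
    """Minimum-variance estimate at a target point from one observation.

    Returns (analysis, analysis_error_variance).
    """
    # Covariance between the target point and the observed point.
    cov = correlation * var_background
    # Optimal weight given to the observation-minus-mean increment.
    weight = cov / (var_background + var_obs_error)
    analysis = mean_target + weight * (y_obs - mean_obs)
    error_variance = var_background - weight * cov
    return analysis, error_variance
```

With zero correlation the estimate falls back to the a priori mean, and with a noisy observation the increment is damped, which is the smoothing behavior described above.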
Xu, Tong-kai; Sun, Zhi-hui; Jiang, Yong
2012-03-01
To evaluate the dimensional stability and detail reproduction of five additional silicone impression materials after autoclave sterilization. Impressions were made on the ISO 4823 standard mold containing several marking lines, in five kinds of additional silicone. All the impressions were sterilized by high temperature and pressure (135 °C, 212.8 kPa) for 25 min. Linear measurements of pre-sterilization and post-sterilization were made with a measuring microscope. Statistical analysis utilized single-factor analysis with pair-wise comparison of mean values when appropriate. Hypothesis testing was conducted at alpha = 0.05. No significant difference was found between the pre-sterilization and post-sterilization conditions for all locations, and all absolute values of the linear rate of change were less than 8%. Autoclave sterilization did not affect the surface detail reproduction of the 5 impression materials. The dimensional stability and detail reproduction of the five additional silicone impression materials in the study was unaffected by autoclave sterilization.
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis
Steele, Joe; Bastola, Dhundy
2014-01-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base–base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel–Ziv techniques from data compression. PMID:23904502
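The word-frequency approach surveyed above can be illustrated with the D2 statistic in its basic form: the inner product of the k-mer count vectors of two sequences. This is a minimal sketch with hypothetical function names; normalized variants of D2 are not covered.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def d2_statistic(seq_a, seq_b, k):
    """Sum over all words of (count in seq_a) * (count in seq_b)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b.
    return sum(count * counts_b[word] for word, count in counts_a.items())
```

No alignment is computed at any point; the cost is linear in the sequence lengths for a fixed k.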
Preparing and Presenting Effective Research Posters
Miller, Jane E
2007-01-01
Objectives: Posters are a common way to present results of a statistical analysis, program evaluation, or other project at professional conferences. Often, researchers fail to recognize the unique nature of the format, which is a hybrid of a published paper and an oral presentation. This methods note demonstrates how to design research posters to convey study objectives, methods, findings, and implications effectively to varied professional audiences. Methods: A review of existing literature on research communication and poster design is used to identify and demonstrate important considerations for poster content and layout. Guidelines on how to write about statistical methods, results, and statistical significance are illustrated with samples of ineffective writing annotated to point out weaknesses, accompanied by concrete examples and explanations of improved presentation. A comparison of the content and format of papers, speeches, and posters is also provided. Findings: Each component of a research poster about a quantitative analysis should be adapted to the audience and format, with complex statistical results translated into simplified charts, tables, and bulleted text to convey findings as part of a clear, focused story line. Conclusions: Effective research posters should be designed around two or three key findings with accompanying handouts and narrative description to supply additional technical detail and encourage dialog with poster viewers. PMID:17355594
Supaporn, Pansuwan; Yeom, Sung Ho
2018-04-30
This study investigated the biological conversion of crude glycerol generated from a commercial biodiesel production plant as a by-product to 1,3-propanediol (1,3-PD). Statistical analysis was employed to derive a statistical model for the individual and interactive effects of glycerol, (NH4)2SO4, trace elements, pH, and cultivation time on the four objectives: 1,3-PD concentration, yield, selectivity, and productivity. Optimum conditions for each objective with its maximum value were predicted by statistical optimization, and experiments under the optimum conditions verified the predictions. In addition, by systematic analysis of the values of four objectives, optimum conditions for 1,3-PD concentration (49.8 g/L initial glycerol, 4.0 g/L of (NH4)2SO4, 2.0 mL/L of trace element, pH 7.5, and 11.2 h of cultivation time) were determined to be the global optimum culture conditions for 1,3-PD production. Under these conditions, we could achieve high 1,3-PD yield (47.4%), 1,3-PD selectivity (88.8%), and 1,3-PD productivity (2.1 g/L/h) as well as high 1,3-PD concentration (23.6 g/L).
Toumi, Héla; Boumaiza, Moncef; Millet, Maurice; Radetski, Claudemir Marcos; Camara, Baba Issa; Felten, Vincent; Masfaraud, Jean-François; Férard, Jean-François
2018-04-19
We studied the combined acute effect (i.e., after 48 h) of deltamethrin (a pyrethroid insecticide) and malathion (an organophosphate insecticide) on Daphnia magna. Two approaches were used to examine the potential interaction effects of eight mixtures of deltamethrin and malathion: (i) calculation of mixture toxicity index (MTI) and safety factor index (SFI) and (ii) response surface methodology coupled with isobole-based statistical model (using generalized linear model). According to the calculation of MTI and SFI, one tested mixture was found additive while the two other tested mixtures were found non-additive (MTI) or antagonistic (SFI), but these differences between index responses are only due to differences in terminology related to these two indexes. Through the surface response approach and isobologram analysis, we concluded that there was a significant antagonistic effect of the binary mixtures of deltamethrin and malathion that occurs on D. magna immobilization, after 48 h of exposure. Index approaches and surface response approach with isobologram analysis are complementary. Calculation of mixture toxicity index and safety factor index allows the type of interaction to be identified for individual tested mixtures, while the surface response approach with isobologram analysis integrates all the data, providing a global outcome about the type of interactive effect. Only the surface response approach and isobologram analysis allowed the statistical assessment of the ecotoxicological interaction. Nevertheless, we recommend the use of both approaches (i) to identify the combined effects of contaminants and (ii) to improve risk assessment and environmental management.
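The isobole reasoning behind such index approaches can be sketched with the sum of toxic units under concentration addition. The sketch assumes the supplied concentrations are those at which the mixture produces the 50% effect, and it uses an arbitrary tolerance band in place of a statistical test. It is not the MTI or SFI calculation from the study, and all names are hypothetical.

```python
def sum_toxic_units(mixture_concentrations, single_ec50s):
    """Sum of c_i / EC50_i over the components of a mixture."""
    return sum(c / e for c, e in zip(mixture_concentrations, single_ec50s))

def classify_interaction(tu_sum, tolerance=0.2):
    """Label a mixture from its toxic-unit sum (tolerance is an arbitrary assumption)."""
    if abs(tu_sum - 1.0) <= tolerance:
        return "additive"
    # More than one toxic unit was needed to reach the effect: less than additive.
    return "antagonistic" if tu_sum > 1.0 else "synergistic"
```

A point-by-point label of this kind carries no uncertainty estimate, which is why the abstract pairs the indices with a response surface model.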
It's all relative: ranking the diversity of aquatic bacterial communities.
Shaw, Allison K; Halpern, Aaron L; Beeson, Karen; Tran, Bao; Venter, J Craig; Martiny, Jennifer B H
2008-09-01
The study of microbial diversity patterns is hampered by the enormous diversity of microbial communities and the lack of resources to sample them exhaustively. For many questions about richness and evenness, however, one only needs to know the relative order of diversity among samples rather than total diversity. We used 16S libraries from the Global Ocean Survey to investigate the ability of 10 diversity statistics (including rarefaction, non-parametric, parametric, curve extrapolation and diversity indices) to assess the relative diversity of six aquatic bacterial communities. Overall, we found that the statistics yielded remarkably similar rankings of the samples for a given sequence similarity cut-off. This correspondence, despite the different underlying assumptions of the statistics, suggests that diversity statistics are a useful tool for ranking samples of microbial diversity. In addition, sequence similarity cut-off influenced the diversity ranking of the samples, demonstrating that diversity statistics can also be used to detect differences in phylogenetic structure among microbial communities. Finally, a subsampling analysis suggests that further sequencing from these particular clone libraries would not have substantially changed the richness rankings of the samples.
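The ranking idea above can be sketched with two of the statistic families mentioned: a diversity index (Shannon) and a non-parametric richness estimator (bias-corrected Chao1). The sample data layout (a dict mapping sample name to a list of per-taxon counts) and the function names are assumptions for illustration.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singleton and doubleton counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))

def rank_samples(samples, statistic):
    """Sample names ordered from most to least diverse under the given statistic."""
    return sorted(samples, key=lambda name: statistic(samples[name]), reverse=True)
```

Calling `rank_samples` once per statistic and comparing the resulting orders is the kind of check the abstract describes.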
Application of Multivariate Statistical Analysis to Biomarkers in SE-Turkey Crude Oils
NASA Astrophysics Data System (ADS)
Gürgey, K.; Canbolat, S.
2017-11-01
Twenty-four crude oil samples were collected from the 24 oil fields distributed in different districts of SE-Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, hence the number of source rocks present in SE-Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to a data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, subsurface stratigraphy as well as geology of the area.
A SIGNIFICANCE TEST FOR THE LASSO
Lockhart, Richard; Taylor, Jonathan; Tibshirani, Ryan J.; Tibshirani, Robert
2014-01-01
In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p > n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a $\chi^2_1$ distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than $\chi^2_1$ under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the $\ell_1$ penalty. 
Therefore, the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties—adaptivity and shrinkage—and its null distribution is tractable and asymptotically Exp(1). PMID:25574062
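A rough sketch of the statistic in its simplest setting may help. It assumes an orthonormal predictor matrix and a known noise variance, in which case the lasso knots are, to our understanding, the sorted absolute inner products |X_j^T y| and the covariance statistic at step k reduces to λ_k(λ_k − λ_{k+1})/σ². That reduction is stated here as an assumption rather than taken from the abstract, and the function name and inputs are hypothetical.

```python
import math

def covariance_test_orthonormal(inner_products, sigma2, step=1):
    """Covariance test statistic and Exp(1) p-value, orthonormal-X case only.

    inner_products: the values X_j^T y for each predictor (assumed precomputed).
    sigma2: known noise variance (assumed).
    """
    knots = sorted((abs(u) for u in inner_products), reverse=True)
    if not 1 <= step < len(knots):
        raise ValueError("step must be between 1 and len(inner_products) - 1")
    lam_k, lam_next = knots[step - 1], knots[step]
    statistic = lam_k * (lam_k - lam_next) / sigma2
    p_value = math.exp(-statistic)  # survival function of Exp(1)
    return statistic, p_value
```

A large gap between consecutive knots gives a large statistic and a small p-value for the entering predictor.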
NASA Astrophysics Data System (ADS)
Sogaro, Francesca; Poole, Robert; Dennis, David
2014-11-01
High-speed stereoscopic particle image velocimetry has been performed in fully developed turbulent pipe flow at moderate Reynolds numbers with and without a drag-reducing additive (an aqueous solution of high molecular weight polyacrylamide). Three-dimensional large and very large-scale motions (LSM and VLSM) are extracted from the flow fields by a detection algorithm and the characteristics for each case are statistically compared. The results show that the three-dimensional extent of VLSMs in drag reduced (DR) flow appears to increase significantly compared to their Newtonian counterparts. A statistical increase in azimuthal extent of DR VLSM is observed by means of two-point spatial autocorrelation of the streamwise velocity fluctuation in the radial-azimuthal plane. Furthermore, a remarkable increase in length of these structures is observed by three-dimensional two-point spatial autocorrelation. These results are accompanied by an analysis of the swirling strength in the flow field that shows a significant reduction in strength and number of the vortices for the DR flow. The findings suggest that the damping of the small scales due to polymer addition results in the undisturbed development of longer flow structures.
Guisan, Antoine; Edwards, T.C.; Hastie, T.
2002-01-01
An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.
Similarity of markers identified from cancer gene expression studies: observations from GEO.
Shi, Xingjie; Shen, Shihao; Liu, Jin; Huang, Jian; Zhou, Yong; Ma, Shuangge
2014-09-01
Gene expression profiling has been extensively conducted in cancer research. The analysis of multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct multi-dataset analysis and are interested in evaluating the similarity of cancer-associated genes identified from different datasets. The first objective of this study is to briefly review some statistical methods that can be used for such evaluation. Both marginal analysis and joint analysis methods are reviewed. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancers. Our analysis suggests that for the same cancer, the marker identification results may vary significantly across datasets, and different datasets share few common genes. In addition, datasets on different cancers share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Henden, Lyndal; Lee, Stuart; Mueller, Ivo; Barry, Alyssa; Bahlo, Melanie
2018-05-01
Identification of genomic regions that are identical by descent (IBD) has proven useful for human genetic studies where analyses have led to the discovery of familial relatedness and fine-mapping of disease critical regions. Unfortunately however, IBD analyses have been underutilized in analysis of other organisms, including human pathogens. This is in part due to the lack of statistical methodologies for non-diploid genomes in addition to the added complexity of multiclonal infections. As such, we have developed an IBD methodology, called isoRelate, for analysis of haploid recombining microorganisms in the presence of multiclonal infections. Using the inferred IBD status at genomic locations, we have also developed a novel statistic for identifying loci under positive selection and propose relatedness networks as a means of exploring shared haplotypes within populations. We evaluate the performance of our methodologies for detecting IBD and selection, including comparisons with existing tools, then perform an exploratory analysis of whole genome sequencing data from a global Plasmodium falciparum dataset of more than 2500 genomes. This analysis identifies Southeast Asia as having many highly related isolates, possibly as a result of both reduced transmission from intensified control efforts and population bottlenecks following the emergence of antimalarial drug resistance. Many signals of selection are also identified, most of which overlap genes that are known to be associated with drug resistance, in addition to two novel signals observed in multiple countries that have yet to be explored in detail. Additionally, we investigate relatedness networks over the selected loci and determine that one of these sweeps has spread between continents while the other has arisen independently in different countries. 
IBD analysis of microorganisms using isoRelate can be used for exploring population structure, positive selection and haplotype distributions, and will be a valuable tool for monitoring disease control and elimination efforts of many diseases.
Big-Data RHEED analysis for understanding epitaxial film growth processes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P
Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in-situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the dataset are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of RHEED image sequences. This approach is illustrated for growth of LaxCa1-xMnO3 films grown on etched (001) SrTiO3 substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.
Multivariate meta-analysis: Potential and promise
Jackson, Dan; Riley, Richard; White, Ian R
2011-01-01
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
Thompson, Cheryl Bagley
2009-01-01
This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.
Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi
2014-09-18
Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.
PERSEUS QC: preparing statistic data sets
NASA Astrophysics Data System (ADS)
Belokopytov, Vladimir; Khaliulin, Alexey; Ingerov, Andrey; Zhuk, Elena; Gertman, Isaac; Zodiatis, George; Nikolaidis, Marios; Nikolaidis, Andreas; Stylianou, Stavros
2017-09-01
The Desktop Oceanographic Data Processing Module was developed for visual analysis of interdisciplinary cruise measurements. The program provides the possibility of data selection based on different criteria, map plotting, sea horizontal sections, and sea depth vertical profiles. The data selection in the area of interest can be specified according to a set of different physical and chemical parameters complemented by additional parameters, such as the cruise number, ship name, and time period. The visual analysis of a set of vertical profiles in the selected area makes it possible to determine the quality of the data, their location and the time of the in-situ measurements, and to exclude any questionable data from the statistical analysis. For each selected set of profiles, the average vertical profile, the minimal and maximal values of the parameter under examination and the root mean square (r.m.s.) are estimated. These estimates are compared with the parameter ranges, set for each sub-region by MEDAR/MEDATLAS-II and SeaDataNet2 projects. In the framework of the PERSEUS project, certain parameters which lacked a range were calculated from scratch, while some of the previously used ranges were re-defined using more comprehensive data sets based on SeaDataNet2, SESAME and PERSEUS projects. In some cases we have used additional sub-regions to redefine the ranges more precisely. The recalculated ranges are used to improve the PERSEUS Data Quality Control.
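The range-based check described above can be sketched in a few lines. The sketch assumes that "r.m.s." means the root-mean-square deviation from the mean, and that a sub-region range is a simple (low, high) pair. Both are assumptions, as are the function names; none of this is the PERSEUS software's actual interface.

```python
import math

def profile_summary(values):
    """Mean, minimum, maximum and r.m.s. deviation of one parameter at one level."""
    n = len(values)
    mean = sum(values) / n
    rms_deviation = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return {"mean": mean, "min": min(values), "max": max(values),
            "rms_deviation": rms_deviation}

def flag_out_of_range(values, low, high):
    """Indices of values that fall outside the accepted [low, high] range."""
    return [i for i, v in enumerate(values) if not (low <= v <= high)]
```

Flagged indices would then be reviewed visually before exclusion, in line with the workflow in the abstract.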
Statistics of high-level scene context.
Greene, Michelle R
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information.
Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition.
Statistical mechanics of economics I
NASA Astrophysics Data System (ADS)
Kusmartsev, F. V.
2011-02-01
We show that statistical mechanics is useful in the description of financial crises and economics. Taking a large number of instant snapshots of a market over an interval of time, we construct their ensembles and study their statistical interference. This results in a probability description of the market and gives capital, money, income, wealth and debt distributions, which in most cases take the form of the Bose-Einstein distribution. In addition, statistical mechanics provides the main market equations and laws which govern the correlations between the amount of money, debt, product, prices and number of retailers. We applied the relations found to a study of the evolution of the economy of the USA between 1996 and 2008 and observe that over that time the income of the majority of the population is well described by the Bose-Einstein distribution, whose parameters are different for each year. Each financial crisis corresponds to a peak in the absolute activity coefficient. The analysis correctly indicates the past crises and predicts the future one.
Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell
2012-01-01
Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.
Application of the Statistical ICA Technique in the DANCE Data Analysis
NASA Astrophysics Data System (ADS)
Baramsai, Bayarbadrakh; Jandel, M.; Bredeweg, T. A.; Rusev, G.; Walker, C. L.; Couture, A.; Mosby, S.; Ullmann, J. L.; Dance Collaboration
2015-10-01
The Detector for Advanced Neutron Capture Experiments (DANCE) at the Los Alamos Neutron Science Center is used to improve our understanding of the neutron capture reaction. DANCE is a highly efficient 4π γ-ray detector array consisting of 160 BaF2 crystals, which makes it an ideal tool for neutron capture experiments. The (n, γ) reaction Q-value equals the sum energy of all γ-rays emitted in the de-excitation cascades from the excited capture state to the ground state. The total γ-ray energy is used to identify reactions on different isotopes as well as the background. However, it is challenging to identify contributions in the Esum spectra from different isotopes with similar Q-values. Recently we have tested the applicability of modern statistical methods such as Independent Component Analysis (ICA) to identify and separate (n, γ) reaction yields on different isotopes that are present in the target material. ICA is a recently developed computational tool for separating multidimensional data into statistically independent additive subcomponents. In this conference talk, we present some results of the application of ICA algorithms and their modification for the DANCE experimental data analysis. This research is supported by the U. S. Department of Energy, Office of Science, Nuclear Physics under the Early Career Award No. LANL20135009.
Rice straw addition as sawdust substitution in oyster mushroom (Pleurotus ostreatus) planted media
NASA Astrophysics Data System (ADS)
Utami, Christine Pamardining; Susilawati, Puspita Ratna
2017-08-01
Oyster mushroom is popular because of its high nutrient content. Oyster mushroom cultivation usually uses sawdust, but sawdust has become difficult to find, which hampers mushroom cultivation. Rice straw, an agricultural waste, can be used as planted media for oyster mushroom because it contains many of the nutrients needed for mushroom growth. The aims of this research were to analyze the influence of rice straw addition in a baglog as planted media and to determine the concentration of rice straw addition that can substitute for sawdust in the planted media of oyster mushroom. This research used 4 treatments of sawdust to rice straw ratio: K = 75% : 0%, P1 = 60% : 15%, P2 = 40% : 35%, P3 = 15% : 60%. The remaining material composition was the same for all baglogs: bran 20%, chalk 5%, and water 70%. The parameters used in this research were wet weight, dry weight, moisture content and number of mushroom fruit bodies. Data were analyzed using a one-factor ANOVA test. The statistical analysis showed that rice straw addition in the planted media had no influence on oyster mushroom growth. A sawdust to rice straw ratio of 15% : 60% was the concentration of rice straw addition that can substitute for sawdust in the planted media of oyster mushroom.
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
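As a rough illustration of how a K-M estimator built for right-censored data can be reused for left-censored concentrations, the sketch below negates the values (so that "less than" becomes "greater than") and then applies the usual product-limit formula. This is a minimal sketch in Python, not the authors' S-language software; the function name `km_left_censored` and the sample data are hypothetical.

```python
def km_left_censored(values, censored):
    """Product-limit estimate of P(X < x) at each detected value x.

    values   -- measured value, or the detection limit for a non-detect
    censored -- True where the observation is a non-detect ("< value")
    """
    # Negating the data turns left-censoring into right-censoring;
    # only the ordering matters for the product-limit estimator.
    flipped = [-v for v in values]
    event_times = sorted({y for y, c in zip(flipped, censored) if not c})
    survival = 1.0
    estimate = []
    for t in event_times:
        # Everything at or beyond t on the flipped scale is still at risk,
        # including non-detects whose limit equals the detected value.
        at_risk = sum(1 for y in flipped if y >= t)
        events = sum(1 for y, c in zip(flipped, censored) if y == t and not c)
        survival *= 1.0 - events / at_risk
        estimate.append((-t, survival))  # flip back to the original scale
    return estimate


# Hypothetical sample with two detection limits: <1, 2, <2, 3, 5
ecdf = km_left_censored([1, 2, 2, 3, 5], [True, False, True, False, False])
for x, p in ecdf:
    print(f"P(X < {x}) = {p:.2f}")
```

The estimate is defined only at detected values, which mirrors the point made in the abstract that K-M methods neither extrapolate nor interpolate.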
Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore
2014-01-01
Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided. PMID:24834325
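A simulation-based prospective power analysis of the kind described can be sketched as follows. This is a minimal sketch, not the supplementary simulation model: it assumes a zero-inflated Poisson distribution for the counts, a completely randomized two-group design, and a simple large-sample z-test on the difference in means. All function names and parameter values are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def zip_count(lam, p_zero, rng):
    """Zero-inflated Poisson: an excess zero with probability p_zero."""
    return 0 if rng.random() < p_zero else poisson(lam, rng)


def z_stat(a, b):
    """Large-sample z statistic for the difference in group means."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(va / len(a) + vb / len(b))
    return (mb - ma) / se if se > 0 else 0.0


def power(lam_control, ratio, p_zero, n, sims, seed=1):
    """Fraction of simulated trials in which the difference test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        control = [zip_count(lam_control, p_zero, rng) for _ in range(n)]
        treated = [zip_count(lam_control * ratio, p_zero, rng) for _ in range(n)]
        if abs(z_stat(control, treated)) > 1.96:
            rejections += 1
    return rejections / sims


power_effect = power(lam_control=5.0, ratio=2.0, p_zero=0.2, n=30, sims=200)
power_null = power(lam_control=5.0, ratio=1.0, p_zero=0.2, n=30, sims=200)
print(power_effect, power_null)
```

Running `power` with `ratio=1.0` estimates the type I error rate of the chosen test, which is a useful sanity check before varying the effect size or the number of plots.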
NASA Technical Reports Server (NTRS)
Manning, Robert M.
1996-01-01
The purpose of the propagation studies within the ACTS Project Office is to acquire 20 and 30 GHz rain fade statistics using the ACTS beacon links received at the NGS (NASA Ground Station) in Cleveland. Other than the raw, statistically unprocessed rain fade events that occur in real time, relevant rain fade statistics derived from such events are the cumulative rain fade statistics as well as fade duration statistics (beyond given fade thresholds) over monthly and yearly time intervals. Concurrent with the data logging exercise, monthly maximum rainfall levels recorded at the US Weather Service at Hopkins Airport are appended to the database to facilitate comparison of observed fade statistics with those predicted by the ACTS Rain Attenuation Model. Also, the raw fade data will be in a format, complete with documentation, for use by other investigators who require realistic fade event evolution in time for simulation purposes or further analysis for comparisons with other rain fade prediction models, etc. The raw time series data from the 20 and 30 GHz beacon signals is purged of non relevant data intervals where no rain fading has occurred. All other data intervals which contain rain fade events are archived with the accompanying time stamps. The definition of just what constitutes a rain fade event will be discussed later. The archived data serves two purposes. First, all rain fade event data is recombined into a contiguous data series every month and every year; this will represent an uninterrupted record of the actual (i.e., not statistically processed) temporal evolution of rain fade at 20 and 30 GHz at the location of the NGS. The second purpose of the data in such a format is to enable a statistical analysis of prevailing propagation parameters such as cumulative distributions of attenuation on a monthly and yearly basis as well as fade duration probabilities below given fade thresholds, also on a monthly and yearly basis. 
In addition, various subsidiary statistics such as attenuation rate probabilities are derived. The purged raw rain fade data as well as the results of the analyzed data will be made available for use by parties in the private sector upon their request. The process which will be followed in this dissemination is outlined in this paper.
Sullivan, Thomas R; Yelland, Lisa N; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-08-01
After completion of a randomised controlled trial, an extended follow-up period may be initiated to learn about longer term impacts of the intervention. Since extended follow-up studies often involve additional eligibility restrictions and consent processes for participation, and a longer duration of follow-up entails a greater risk of participant attrition, missing data can be a considerable threat in this setting. As a potential source of bias, it is critical that missing data are appropriately handled in the statistical analysis, yet little is known about the treatment of missing data in extended follow-up studies. The aims of this review were to summarise the extent of missing data in extended follow-up studies and the use of statistical approaches to address this potentially serious problem. We performed a systematic literature search in PubMed to identify extended follow-up studies published from January to June 2015. Studies were eligible for inclusion if the original randomised controlled trial results were also published and if the main objective of extended follow-up was to compare the original randomised groups. We recorded information on the extent of missing data and the approach used to treat missing data in the statistical analysis of the primary outcome of the extended follow-up study. Of the 81 studies included in the review, 36 (44%) reported additional eligibility restrictions and 24 (30%) consent processes for entry into extended follow-up. Data were collected at a median of 7 years after randomisation. Excluding 28 studies with a time to event primary outcome, 51/53 studies (96%) reported missing data on the primary outcome. The median percentage of randomised participants with complete data on the primary outcome was just 66% in these studies. The most common statistical approach to address missing data was complete case analysis (51% of studies), while likelihood-based analyses were also well represented (25%). 
Sensitivity analyses around the missing data mechanism were rarely performed (25% of studies), and when they were, they often involved unrealistic assumptions about the mechanism. Despite missing data being a serious problem in extended follow-up studies, statistical approaches to addressing missing data were often inadequate. We recommend researchers clearly specify all sources of missing data in follow-up studies and use statistical methods that are valid under a plausible assumption about the missing data mechanism. Sensitivity analyses should also be undertaken to assess the robustness of findings to assumptions about the missing data mechanism.
Fatal falls in the US construction industry, 1990 to 1999.
Derr, J; Forst, L; Chen, H Y; Conroy, L
2001-10-01
The Occupational Safety and Health Administration's (OSHA's) Integrated Management Information System (IMIS) database allows for the detailed analysis of risk factors surrounding fatal occupational events. This study used IMIS data to (1) perform a risk factor analysis of fatal construction falls, and (2) assess the impact of the February 1995 29 CFR Part 1926 Subpart M OSHA fall protection regulations for construction by calculating trends in fatal fall rates. In addition, IMIS data on fatal construction falls were compared with data from other occupational fatality surveillance systems. For falls in construction, the study identified several demographic factors that may indicate increased risk. A statistically significant downward trend in fatal falls was evident in all construction and within several construction categories during the decade. Although the study failed to show a statistically significant intervention effect from the new OSHA regulations, it may have lacked the power to do so.
Adaptive distributed source coding.
Varodayan, David; Lin, Yao-Chung; Girod, Bernd
2012-05-01
We consider distributed source coding in the presence of hidden variables that parameterize the statistical dependence among sources. We derive the Slepian-Wolf bound and devise coding algorithms for a block-candidate model of this problem. The encoder sends, in addition to syndrome bits, a portion of the source to the decoder uncoded as doping bits. The decoder uses the sum-product algorithm to simultaneously recover the source symbols and the hidden statistical dependence variables. We also develop novel techniques based on density evolution (DE) to analyze the coding algorithms. We experimentally confirm that our DE analysis closely approximates practical performance. This result allows us to efficiently optimize parameters of the algorithms. In particular, we show that the system performs close to the Slepian-Wolf bound when an appropriate doping rate is selected. We then apply our coding and analysis techniques to a reduced-reference video quality monitoring system and show a bit rate saving of about 75% compared with fixed-length coding.
Performance analysis of different tuning rules for an isothermal CSTR using integrated EPC and SPC
NASA Astrophysics Data System (ADS)
Roslan, A. H.; Karim, S. F. Abd; Hamzah, N.
2018-03-01
This paper demonstrates the integration of Engineering Process Control (EPC) and Statistical Process Control (SPC) for the control of product concentration of an isothermal CSTR. The objectives of this study are to evaluate the performance of the Ziegler-Nichols (Z-N), Direct Synthesis (DS), and Internal Model Control (IMC) tuning methods and to determine the most effective method for this process. The simulation model was obtained from past literature and reconstructed in MATLAB Simulink to evaluate the process response. Additionally, the process stability, capability and normality were analyzed using Process Capability Sixpack reports in Minitab. Based on the results, DS displays the best response, having the smallest rise time, settling time, overshoot, undershoot, Integral Time Absolute Error (ITAE) and Integral Square Error (ISE). Also, based on statistical analysis, DS emerges as the best tuning method as it exhibits the highest process stability and capability.
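The two integral criteria named above are ISE = ∫ e(t)² dt and ITAE = ∫ t·|e(t)| dt, where e(t) is the control error. The sketch below computes both from a uniformly sampled error signal with the trapezoidal rule. It is a minimal sketch in Python rather than the study's Simulink model, and the decaying error signal is a made-up stand-in for a closed-loop response.

```python
import math


def trapz(ys, dt):
    """Trapezoidal rule for uniformly spaced samples."""
    return dt * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def ise_itae(errors, dt):
    """Return (ISE, ITAE) for an error signal sampled every dt seconds."""
    ise = trapz([e * e for e in errors], dt)
    itae = trapz([i * dt * abs(e) for i, e in enumerate(errors)], dt)
    return ise, itae


# Hypothetical error signal e(t) = exp(-t), sampled for 20 s
dt = 0.001
errors = [math.exp(-i * dt) for i in range(20001)]
ise, itae = ise_itae(errors, dt)
print(f"ISE = {ise:.4f}, ITAE = {itae:.4f}")
```

For this test signal the analytic values are 0.5 and very nearly 1, which gives a quick check of the integration before the function is applied to exported simulation data. ITAE weights late errors more heavily, so it penalizes slow settling more than ISE does.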
Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu
2018-01-01
Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only a few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to assess the solid cancer mortality risk from LDIR in the nuclear industry using standardized mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects by IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality. PMID:29725540
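For a single cohort, an SMR is the ratio of observed to expected deaths, and an approximate 95% confidence interval can be obtained by treating the observed count as Poisson. The sketch below uses Byar's approximation for the interval. The abstract does not say which interval method the included studies used, so that choice is an assumption, and the counts are hypothetical.

```python
import math


def smr_with_ci(observed, expected, z=1.96):
    """SMR and an approximate 95% CI (Byar's approximation, Poisson counts)."""
    smr = observed / expected
    o = observed
    lower = o * (1 - 1 / (9 * o) - z / (3 * math.sqrt(o))) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return smr, lower, upper


# Hypothetical cohort: 100 observed deaths against 125 expected
smr, lower, upper = smr_with_ci(100, 125.0)
print(f"SMR = {smr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that lies entirely below 1, as in this made-up example, is the pattern that makes an SMR statistically significant while still indicating lower mortality than the reference population.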
A Meta-Analysis of Hypnotherapeutic Techniques in the Treatment of PTSD Symptoms.
O'Toole, Siobhan K; Solomon, Shelby L; Bergdahl, Stephen A
2016-02-01
The efficacy of hypnotherapeutic techniques as treatment for symptoms of posttraumatic stress disorder (PTSD) was explored through meta-analytic methods. Studies were selected through a search of 29 databases. Altogether, 81 studies discussing hypnotherapy and PTSD were reviewed for inclusion criteria. The outcomes of 6 studies representing 391 participants were analyzed using meta-analysis. Evaluation of effect sizes related to avoidance and intrusion, in addition to overall PTSD symptoms after hypnotherapy treatment, revealed that all studies showed that hypnotherapy had a positive effect on PTSD symptoms. The overall Cohen's d was large (-1.18) and statistically significant (p < .001). Effect sizes varied based on study quality; however, they were large and statistically significant. Using the classic fail-safe N to assess for publication bias, it was determined it would take 290 nonsignificant studies to nullify these findings. Copyright © 2016 International Society for Traumatic Stress Studies.
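The classic fail-safe N mentioned above is commonly computed with Rosenthal's formula, N = (ΣZ)² / z_α² − k, where the Z values are the standard normal deviates of the k included studies and z_α = 1.645 for a one-tailed α of .05. The sketch below assumes that formula. The per-study Z values are hypothetical and are not taken from the meta-analysis.

```python
def failsafe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: number of null studies needed to bring
    the combined one-tailed result down to the significance threshold."""
    k = len(z_scores)
    return sum(z_scores) ** 2 / z_alpha ** 2 - k


# Hypothetical example: six studies, each with Z = 2.0
n_fs = failsafe_n([2.0] * 6)
print(round(n_fs, 1))
```

Because the statistic depends only on the summed Z values and the number of studies, it grows quickly when a few studies report very large effects.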
Singular Spectrum Analysis for Astronomical Time Series: Constructing a Parsimonious Hypothesis Test
NASA Astrophysics Data System (ADS)
Greco, G.; Kondrashov, D.; Kobayashi, S.; Ghil, M.; Branchesi, M.; Guidorzi, C.; Stratta, G.; Ciszak, M.; Marino, F.; Ortolan, A.
We present a data-adaptive spectral method, Monte Carlo Singular Spectrum Analysis (MC-SSA), and its modification to tackle astrophysical problems. Through numerical simulations we show the ability of MC-SSA to deal with 1/f^β power-law noise affected by photon counting statistics. Such a noise process is simulated by a first-order autoregressive, AR(1), process corrupted by intrinsic Poisson noise. In doing so, we statistically estimate a basic stochastic variation of the source and the corresponding fluctuations due to the quantum nature of light. In addition, the MC-SSA test retains its effectiveness even when a significant percentage of the signal falls below a certain level of detection, e.g., caused by the instrument sensitivity. The parsimonious approach presented here may be broadly applied, from the search for extrasolar planets to the extraction of low-intensity coherent phenomena probably hidden in high energy transients.
Mallette, Jennifer R; Casale, John F; Jordan, James; Morello, David R; Beyer, Paul M
2016-03-23
Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (²H and ¹⁸O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions.
NASA Astrophysics Data System (ADS)
Provo, Judy; Lamar, Carlton; Newby, Timothy
2002-01-01
A cross section was used to enhance three-dimensional knowledge of anatomy of the canine head. All veterinary students in two successive classes (n = 124) dissected the head; experimental groups also identified structures on a cross section of the head. A test assessing spatial knowledge of the head generated 10 dependent variables from two administrations. The test had content validity and statistically significant interrater and test-retest reliability. A live-dog examination generated one additional dependent variable. Analysis of covariance controlling for performance on course examinations and quizzes revealed no treatment effect. Including spatial skill as a third covariate revealed a statistically significant effect of spatial skill on three dependent variables. Men initially had greater spatial skill than women, but spatial skills were equal after 8 months. A qualitative analysis showed the positive impact of this experience on participants. Suggestions for improvement and future research are discussed.
Barbie, Dana L.; Wehmeyer, Loren L.
2012-01-01
Trends in selected streamflow statistics during 1922-2009 were evaluated at 19 long-term streamflow-gaging stations considered indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico. The U.S. Geological Survey, in cooperation with the Texas Water Development Board, evaluated streamflow data from streamflow-gaging stations with more than 50 years of record that were active as of 2009. The outflows into Arkansas and Louisiana were represented by 3 streamflow-gaging stations, and outflows into the Gulf of Mexico, including Galveston Bay, were represented by 16 streamflow-gaging stations. Monotonic trend analyses were done using the following three streamflow statistics generated from daily mean values of streamflow: (1) annual mean daily discharge, (2) annual maximum daily discharge, and (3) annual minimum daily discharge. The trend analyses were based on the nonparametric Kendall's Tau test, which is useful for the detection of monotonic upward or downward trends with time. A total of 69 trend analyses by Kendall's Tau were computed - 19 periods of streamflow multiplied by the 3 streamflow statistics plus 12 additional trend analyses because the periods of record for 2 streamflow-gaging stations were divided into periods representing pre- and post-reservoir impoundment. Unless otherwise described, each trend analysis used the entire period of record for each streamflow-gaging station. The monotonic trend analysis detected 11 statistically significant downward trends, 37 instances of no trend, and 21 statistically significant upward trends. One general region studied, which seemingly has relatively more upward trends for many of the streamflow statistics analyzed, includes the rivers and associated creeks and bayous to Galveston Bay in the Houston metropolitan area. 
Lastly, the most western river basins considered (the Nueces and Rio Grande) had statistically significant downward trends for many of the streamflow statistics analyzed.
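A monotonic trend test based on Kendall's Tau compares every pair of annual values and counts how often the later value is larger or smaller than the earlier one. The sketch below is a minimal Mann-Kendall style implementation in Python. It assumes no tied values (so no tie correction is applied) and uses the normal approximation with a continuity correction. The discharge series is invented for illustration.

```python
import math


def mann_kendall(series):
    """Return (S, tau, z) for a time-ordered series without ties."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # +1 for an increase, -1 for a decrease
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, tau, z


# Hypothetical annual mean daily discharge, steadily increasing
s, tau, z = mann_kendall([10, 12, 13, 15, 18, 21, 22, 25, 27, 30])
print(s, tau, round(z, 2))

# Bookkeeping from the abstract: 19 periods x 3 statistics + 12 extra analyses
total_analyses = 19 * 3 + 12
print(total_analyses, 11 + 37 + 21)
```

A |z| above 1.96 would indicate a statistically significant trend at the 5% level under these assumptions. The sign of S gives the direction.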
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (IBM VERSION)
NASA Technical Reports Server (NTRS)
Manteufel, R.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
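The keyword file and statistical weight file described above can be pictured as two lookup tables: one classifies statements (and marks them executable or not), the other assigns each statistic a weight that feeds a single figure of complexity. The sketch below is a loose Python illustration of that idea, not SAP itself. The keyword table, the weights, and the first-token classification rule are all hypothetical simplifications.

```python
import re
from collections import Counter

# Hypothetical keyword table: keyword -> True if the statement is executable
KEYWORDS = {"DO": True, "IF": True, "GOTO": True, "CALL": True,
            "CONTINUE": True, "INTEGER": False, "REAL": False}
# Hypothetical weights used to build the figure of complexity
WEIGHTS = {"DO": 3.0, "IF": 2.0, "GOTO": 5.0, "CALL": 1.0}


def analyze(source):
    """Count statements by leading keyword and return (counts, complexity)."""
    counts = Counter()
    for line in source.splitlines():
        # Fixed-form FORTRAN 77: blank lines and lines starting with C or * are skipped
        if not line.strip() or line[0] in "Cc*":
            continue
        # Drop leading blanks and any numeric statement label
        body = line.lstrip().lstrip("0123456789").lstrip()
        match = re.match(r"[A-Za-z]+", body)
        token = match.group(0).upper() if match else ""
        counts[token if token in KEYWORDS else "OTHER"] += 1
    complexity = sum(WEIGHTS.get(name, 0.0) * n for name, n in counts.items())
    return counts, complexity


SAMPLE = "\n".join([
    "C     sample module",
    "      INTEGER I",
    "      DO 10 I = 1, 5",
    "      IF (I .GT. 3) GOTO 10",
    "      CALL WORK(I)",
    "   10 CONTINUE",
    "      X = 1.0",
])
counts, complexity = analyze(SAMPLE)
executable = sum(n for name, n in counts.items() if KEYWORDS.get(name, True))
print(dict(counts), complexity, executable)
```

Only the first keyword on each line is counted here, so the `GOTO` inside the logical `IF` is not tallied separately. A real analyzer would need a proper statement parser. Keeping the weights in a separate table mirrors the flexibility the abstract attributes to the weight file.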
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (DEC VAX VERSION)
NASA Technical Reports Server (NTRS)
Merwarth, P. D.
1994-01-01
The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
Goyal, Ravi; De Gruttola, Victor
2018-01-30
Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among network statistics, and knowledge of these relationships can aid in addressing the concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Nuhriawangsa, A. M. P.; Hertanto, B. S.; Kartikasari, L. R.; Swastike, W.; Cahyadi, M.; Rasid, S.
2018-01-01
The objective of this study was to evaluate the effect of the level of Biduri latex extract on the meat quality of laying hens. The materials of this research were Biduri latex and thigh meat from Lohman strain hens. The latex was tapped from young stem tissue and centrifuged to obtain its supernatant. Meat samples were smeared with latex, punctured and incubated for 30 minutes. Latex concentrations were 0, 3, 6 and 9% of the weight of the meat (w/w). The variables were water, dissolved protein and crude fat content, tenderness and microstructure of the meat. Data were analyzed using ANOVA and, where means differed, the Duncan test. Descriptive analysis was used for the microstructure of the meat by comparing its hydrolysis conditions. The study showed that fat content differed significantly (P < 0.05) and that dissolved protein and tenderness differed highly significantly (P < 0.01). Descriptive analysis showed differences in the composition of the meat microstructure. The fat content increased with the addition of 3% latex. The value of dissolved protein increased but tenderness decreased with the addition of 6% latex extract. The addition of Biduri latex extract produced hydrolysis in the microstructure of the meat. The addition of 6% latex gave the best meat quality.
ERIC Educational Resources Information Center
Radtke, Oliver; Yuan, Xin
2011-01-01
This paper deals with Chinglish as Chinese-English translations found on public bilingual signage in the People's Republic of China. After a short review of the existing literature, this study attempts to establish a typology of Chinglish with corpus-based research. Additionally, the corpus serves for geographical and statistical analysis. This…
Iarosh, O A; Iarosh, A A
1991-01-01
As many as 300 patients of different age groups underwent a probability statistical analysis of cytosis and CSF protein depending on the outcome of bacterial meningoencephalitis. The clinical and CSF interrelations discovered reflect the function of the blood-brain barrier and can be used as an additional test for predicting the disease outcome.
Modeling and Recovery of Iron (Fe) from Red Mud by Coal Reduction
NASA Astrophysics Data System (ADS)
Zhao, Xiancong; Li, Hongxu; Wang, Lei; Zhang, Lifeng
Recovery of Fe from red mud has been studied using statistically designed experiments. The effects of three factors, namely: reduction temperature, reduction time and proportion of additive on recovery of Fe have been investigated. Experiments have been carried out using orthogonal central composite design and factorial design methods. A model has been obtained through variance analysis at 92.5% confidence level.
Moser, V C; Casey, M; Hamm, A; Carter, W H; Simmons, J E; Gennings, C
2005-07-01
Environmental exposures generally involve chemical mixtures instead of single chemicals. Statistical models such as the fixed-ratio ray design, wherein the mixing ratio (proportions) of the chemicals is fixed across increasing mixture doses, allow for the detection and characterization of interactions among the chemicals. In this study, we tested for interaction(s) in a mixture of five organophosphorus (OP) pesticides (chlorpyrifos, diazinon, dimethoate, acephate, and malathion). The ratio of the five pesticides (full ray) reflected the relative dietary exposure estimates of the general population as projected by the US EPA Dietary Exposure Evaluation Model (DEEM). A second mixture was tested using the same dose levels of all pesticides, but excluding malathion (reduced ray). The experimental approach first required characterization of dose-response curves for the individual OPs to build a dose-additivity model. A series of behavioral measures were evaluated in adult male Long-Evans rats at the time of peak effect following a single oral dose, and then tissues were collected for measurement of cholinesterase (ChE) activity. Neurochemical (blood and brain cholinesterase [ChE] activity) and behavioral (motor activity, gait score, tail-pinch response score) endpoints were evaluated statistically for evidence of additivity. The additivity model constructed from the single chemical data was used to predict the effects of the pesticide mixture along the full ray (10-450 mg/kg) and the reduced ray (1.75-78.8 mg/kg). The experimental mixture data were also modeled and statistically compared to the additivity models. Analysis of the 5-OP mixture (the full ray) revealed significant deviation from additivity for all endpoints except tail-pinch response. Greater-than-additive responses (synergism) were observed at the lower doses of the 5-OP mixture, which contained non-effective dose levels of each of the components. 
The predicted effective doses (ED20, ED50) were about half that predicted by additivity, and for brain ChE and motor activity, there was a threshold shift in the dose-response curves. For the brain ChE and motor activity, there was no difference between the full (5-OP mixture) and reduced (4-OP mixture) rays, indicating that malathion did not influence the non-additivity. While the reduced ray for blood ChE showed greater deviation from additivity without malathion in the mixture, the non-additivity observed for the gait score was reversed when malathion was removed. Thus, greater-than-additive interactions were detected for both the full and reduced ray mixtures, and the role of malathion in the interactions varied depending on the endpoint. In all cases, the deviations from additivity occurred at the lower end of the dose-response curves.
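The idea of predicting a mixture's effective dose from single-chemical data can be sketched with the generic dose-addition relation for a fixed-ratio ray: if chemical i makes up proportion a_i of the mixture and has effective dose ED_i on its own, the additive prediction for the total mixture dose is 1 / sum(a_i / ED_i). This is a minimal sketch of that textbook relation only; the function names and the numbers are hypothetical, and the study's actual additivity model (fitted dose-response curves per endpoint) is not specified in the abstract.

```python
def dose_additive_ed(proportions, component_eds):
    """Predicted mixture effective dose under dose addition along a fixed ray.

    proportions   -- mixing fractions a_i of each chemical (must sum to 1)
    component_eds -- effective dose ED_i of each chemical given alone
    """
    if len(proportions) != len(component_eds):
        raise ValueError("proportions and component_eds must have equal length")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    if any(ed <= 0 for ed in component_eds):
        raise ValueError("effective doses must be positive")
    return 1.0 / sum(a / ed for a, ed in zip(proportions, component_eds))

def interaction_ratio(observed_ed, predicted_ed):
    """Below 1: greater-than-additive; above 1: less-than-additive."""
    return observed_ed / predicted_ed

# Hypothetical two-chemical mixture, equal parts, single-chemical ED50s of 10 and 30.
predicted = dose_additive_ed([0.5, 0.5], [10.0, 30.0])
print(predicted)
print(interaction_ratio(7.5, predicted))
```

In this convention, an observed effective dose of about half the additive prediction gives a ratio near 0.5, which is the pattern a greater-than-additive interaction would produce.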
Exercise reduces depressive symptoms in adults with arthritis: Evidential value.
Kelley, George A; Kelley, Kristi S
2016-07-12
To determine whether evidential value exists that exercise reduces depression in adults with arthritis and other rheumatic conditions. Utilizing data derived from a prior meta-analysis of 29 randomized controlled trials comprising 2449 participants (1470 exercise, 979 control) with fibromyalgia, osteoarthritis, rheumatoid arthritis or systemic lupus erythematosus, a new method, P-curve, was utilized to assess for evidentiary worth as well as dismiss the possibility of discriminating reporting of statistically significant results regarding exercise and depression in adults with arthritis and other rheumatic conditions. Using the method of Stouffer, Z-scores were calculated to examine selective-reporting bias. An alpha (P) value < 0.05 was deemed statistically significant. In addition, average power of the tests included in P-curve, adjusted for publication bias, was calculated. Fifteen of 29 studies (51.7%) with exercise and depression results were statistically significant (P < 0.05) while none of the results were statistically significant with respect to exercise increasing depression in adults with arthritis and other rheumatic conditions. Right-skew to dismiss selective reporting was identified (Z = -5.28, P < 0.0001). In addition, the included studies did not lack evidential value (Z = 2.39, P = 0.99), nor did they lack evidential value and were P-hacked (Z = 5.28, P > 0.99). The relative frequencies of P-values were 66.7% at 0.01, 6.7% each at 0.02 and 0.03, 13.3% at 0.04 and 6.7% at 0.05. The average power of the tests included in P-curve, corrected for publication bias, was 69%. Diagnostic plot results revealed that the observed power estimate was a better fit than the alternatives. Evidential value results provide additional support that exercise reduces depression in adults with arthritis and other rheumatic conditions.
Exercise reduces depressive symptoms in adults with arthritis: Evidential value
Kelley, George A; Kelley, Kristi S
2016-01-01
AIM To determine whether evidential value exists that exercise reduces depression in adults with arthritis and other rheumatic conditions. METHODS Utilizing data derived from a prior meta-analysis of 29 randomized controlled trials comprising 2449 participants (1470 exercise, 979 control) with fibromyalgia, osteoarthritis, rheumatoid arthritis or systemic lupus erythematosus, a new method, P-curve, was utilized to assess for evidentiary worth as well as dismiss the possibility of discriminating reporting of statistically significant results regarding exercise and depression in adults with arthritis and other rheumatic conditions. Using the method of Stouffer, Z-scores were calculated to examine selective-reporting bias. An alpha (P) value < 0.05 was deemed statistically significant. In addition, average power of the tests included in P-curve, adjusted for publication bias, was calculated. RESULTS Fifteen of 29 studies (51.7%) with exercise and depression results were statistically significant (P < 0.05) while none of the results were statistically significant with respect to exercise increasing depression in adults with arthritis and other rheumatic conditions. Right-skew to dismiss selective reporting was identified (Z = −5.28, P < 0.0001). In addition, the included studies did not lack evidential value (Z = 2.39, P = 0.99), nor did they lack evidential value and were P-hacked (Z = 5.28, P > 0.99). The relative frequencies of P-values were 66.7% at 0.01, 6.7% each at 0.02 and 0.03, 13.3% at 0.04 and 6.7% at 0.05. The average power of the tests included in P-curve, corrected for publication bias, was 69%. Diagnostic plot results revealed that the observed power estimate was a better fit than the alternatives. CONCLUSION Evidential value results provide additional support that exercise reduces depression in adults with arthritis and other rheumatic conditions. PMID:27489782
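The method of Stouffer mentioned in the two records above combines per-study evidence into a single Z-score by summing the individual z-values and dividing by the square root of their count. The following is a minimal sketch of that generic combination, assuming one-sided p-values as input; it does not reproduce the P-curve procedure itself (which first converts each significant p-value into a conditional "pp-value" and uses its own sign convention), and the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def stouffer_z(p_values):
    """Combine one-sided p-values with Stouffer's method.

    Returns (combined_z, combined_one_sided_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    std_normal = NormalDist()
    # Each p-value maps to the z-score with that upper-tail probability.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p

# Illustrative inputs only, not values from the meta-analysis.
z, p = stouffer_z([0.025, 0.025])
print(round(z, 4), p < 0.05)
```

Two individually borderline results combine into a stronger joint statistic, which is the property that makes the method useful for testing a whole set of studies at once.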
Solanki, Neeraj; Kumar, Anuj; Awasthi, Neha; Kundu, Anjali; Mathur, Suveet; Bidhumadhav, Suresh
2016-06-01
Dental problems are an additional burden on children with special health care needs (CSHCN) because of the additional hospitalization pressure they face for the treatment of various serious medical problems. These patients have a higher incidence of dental caries due to the increased quantity of sugar involved in drug therapies and lower salivary flow in the oral cavity. Such patients are difficult to treat with local anesthesia or inhaled sedatives. Single-sitting dental treatment is possible in these patients with general anesthesia. Therefore, we conducted this retrospective analysis of the oral health status of CSHCN receiving various dental treatments in a given population. A total of 200 CSHCN aged 14 years or less who reported to the pediatric wing of the general hospital from 2005 to 2014 and underwent comprehensive dental treatment under general anesthesia were included in the study. Patients with a history of any additional systemic illness, any malignancy, any known drug allergy, or any previous dental treatment were excluded from the study. Complete mouth rehabilitation was done in these patients under general anesthesia following standard protocols. Data regarding the type, duration, and severity of each patient's disability were collected and analyzed. All the results were analyzed by Statistical Package for the Social Sciences (SPSS) software. Chi-square test, Student's t-test, and one-way analysis of variance were used to assess the level of significance. Statistically significant results were obtained while analyzing the subjects' decayed missing filled/decayed extracted filled teeth indices divided based on age. Significant difference was observed only in cases where patients underwent complete crown placement even when divided based on type of disability. While analyzing the prevalence, statistically significant results were observed in patients when divided based on their age. 
In CSHCN, dental pathologies and caries indices are increased regardless of the type or extent of disability. Children with special health care needs should be given special oral health care, and regular dental checkup should be conducted as they are more prone to have dental problems.
13C NMR spectroscopic analysis of poly(electrolyte) cement liquids.
Watts, D C
1979-05-01
13C NMR spectroscopy has been applied to the analysis of carboxylic poly-acid cement liquids. Monomer incorporation, composition ratio, sequence statistics, and stereochemical configuration have been considered theoretically, and determined experimentally, from the spectra. Conventionally polymerized poly(acrylic acid) has an approximately random configuration, but other varieties may be synthesized. Two commercial glass-ionomer cement liquids both contain tartaric acid as a chelating additive, but the compositions of their poly-acids are different. Itaconic acid units, distributed randomly, constitute 21% of the repeating units in one of these polyelectrolytes.
Ground-Based Telescope Parametric Cost Model
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes
2004-01-01
A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis. The model includes both engineering and performance parameters. While diameter continues to be the dominant cost driver, other significant factors include primary mirror radius of curvature and diffraction limited wavelength. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived. This analysis indicates that recent mirror technology advances have indeed reduced the historical telescope cost curve.
Blattmann, Peter; Heusel, Moritz; Aebersold, Ruedi
2016-01-01
SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. For downstream statistical and quantitative analysis there exist different tools such as MSstats, mapDIA and aLFQ. However, the transfer of data from OpenSWATH to the downstream statistical tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats allows annotation, analyzing the variation and the reproducibility of the measurements, FDR estimation, and advanced filtering before submitting the processed data to downstream tools. These functionalities are important to quickly analyze the quality of the SWATH-MS data. Hence, SWATH2stats is a new open-source tool that summarizes several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.
A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.
Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S
2016-01-01
Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings are validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. 
GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.
Statistical principle and methodology in the NISAN system.
Asano, C
1979-01-01
The NISAN system is a new interactive statistical analysis program package constructed by an organization of Japanese statisticians. The package is applicable to both kinds of statistical situation, confirmatory analysis and exploratory analysis, and is designed to capture statistical wisdom and to choose an optimal process of statistical analysis for senior statisticians. PMID:540594
Alsaggaf, Rotana; O'Hara, Lyndsay M; Stafford, Kristen A; Leekha, Surbhi; Harris, Anthony D
2018-02-01
OBJECTIVE A systematic review of quasi-experimental studies in the field of infectious diseases was published in 2005. The aim of this study was to assess improvements in the design and reporting of quasi-experiments 10 years after the initial review. We also aimed to report the statistical methods used to analyze quasi-experimental data. DESIGN Systematic review of articles published from January 1, 2013, to December 31, 2014, in 4 major infectious disease journals. METHODS Quasi-experimental studies focused on infection control and antibiotic resistance were identified and classified based on 4 criteria: (1) type of quasi-experimental design used, (2) justification of the use of the design, (3) use of correct nomenclature to describe the design, and (4) statistical methods used. RESULTS Of 2,600 articles, 173 (7%) featured a quasi-experimental design, compared to 73 of 2,320 articles (3%) in the previous review (P<.01). Moreover, 21 articles (12%) utilized a study design with a control group; 6 (3.5%) justified the use of a quasi-experimental design; and 68 (39%) identified their design using the correct nomenclature. In addition, 2-group statistical tests were used in 75 studies (43%); 58 studies (34%) used standard regression analysis; 18 (10%) used segmented regression analysis; 7 (4%) used standard time-series analysis; 5 (3%) used segmented time-series analysis; and 10 (6%) did not utilize statistical methods for comparisons. CONCLUSIONS While some progress occurred over the decade, it is crucial to continue improving the design and reporting of quasi-experimental studies in the fields of infection control and antibiotic resistance to better evaluate the effectiveness of important interventions. Infect Control Hosp Epidemiol 2018;39:170-176.
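The comparison of the two review periods (173 of 2,600 articles versus 73 of 2,320) is the sort of 2-group comparison the abstract lists among commonly used methods. A minimal sketch of a pooled two-proportion z-test applied to those reported counts follows; the abstract does not say which test the authors used to obtain P<.01, so the choice of test and the function name `two_proportion_z` are assumptions made for illustration.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two_sided_p)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    # Standard error under the null hypothesis of equal proportions.
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts as reported in the abstract above.
z, p_value = two_proportion_z(173, 2600, 73, 2320)
print(round(z, 2), p_value < 0.01)
```

Under this assumed test the difference in proportions is significant at the .01 level, which agrees with the direction of the reported result.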
Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.
Festing, M F
2001-01-01
In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
Middleton, David A; Hughes, Eleri; Madine, Jillian
2004-08-11
We describe an NMR approach for detecting the interactions between phospholipid membranes and proteins, peptides, or small molecules. First, 1H-13C dipolar coupling profiles are obtained from hydrated lipid samples at natural isotope abundance using cross-polarization magic-angle spinning NMR methods. Principal component analysis of dipolar coupling profiles for synthetic lipid membranes in the presence of a range of biologically active additives reveals clusters that relate to different modes of interaction of the additives with the lipid bilayer. Finally, by representing profiles from multiple samples in the form of contour plots, it is possible to reveal statistically significant changes in dipolar couplings, which reflect perturbations in the lipid molecules at the membrane surface or within the hydrophobic interior.
Allison, D B; Faith, M S
1996-06-01
I. Kirsch, G. Montgomery, and G. Sapirstein (1995) meta-analyzed 6 weight-loss studies comparing the efficacy of cognitive-behavior therapy (CBT) alone to CBT plus hypnotherapy and concluded that "the addition of hypnosis substantially enhanced treatment outcome" (p.214). Kirsch reported a mean effect size (expressed as d) of 1.96. After correcting several transcription and computational inaccuracies in the original meta-analysis, these 6 studies yield a smaller mean effect size (.26). Moreover, if 1 questionable study is removed from the analysis, the effect sizes become more homogeneous and the mean (.21) is no longer statistically significant. It is concluded that the addition of hypnosis to CBT for weight loss results in, at most, a small enhancement of treatment outcome.
NASA Technical Reports Server (NTRS)
Bauer, M. E.; Cary, T. K.; Davis, B. J.; Swain, P. H.
1975-01-01
The results of classifications and experiments for the crop identification technology assessment for remote sensing are summarized. Using two analysis procedures, 15 data sets were classified. One procedure used class weights while the other assumed equal probabilities of occurrence for all classes. Additionally, 20 data sets were classified using training statistics from another segment or date. The classification and proportion estimation results of the local and nonlocal classifications are reported. Data also describe several other experiments to provide additional understanding of the results of the crop identification technology assessment for remote sensing. These experiments investigated alternative analysis procedures, training set selection and size, effects of multitemporal registration, spectral discriminability of corn, soybeans, and other, and analyses of aircraft multispectral data.
SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit
Chu, Annie; Cui, Jenny; Dinov, Ivo D.
2011-01-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. 
The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models. PMID:21546994
Tsallis q-triplet, intermittent turbulence and Portevin-Le Chatelier effect
NASA Astrophysics Data System (ADS)
Iliopoulos, A. C.; Aifantis, E. C.
2018-05-01
In this paper, we extend a previous study concerning the Portevin-Le Chatelier (PLC) effect and Tsallis statistics (Iliopoulos et al., 2015). In particular, we estimate Tsallis' q-triplet, namely {qstat, qsens, qrel}, for two sets of stress serration time series concerning the deformation of Cu-15%Al alloy corresponding to different deformation temperatures and thus types (A and B) of PLC bands. The results concerning the stress serrations analysis reveal that the Tsallis q-triplet attains values different from unity ({qstat, qsens, qrel} ≠ {1,1,1}). In particular, PLC type A bands' serrations were found to follow Tsallis super-q-Gaussian, non-extensive, sub-additive, multifractal statistics indicating that the underlying dynamics are at the edge of chaos, characterized by global long range correlations and power law scaling. For PLC type B bands' serrations, the results revealed a Tsallis sub-q-Gaussian, non-extensive, super-additive, multifractal statistical profile. In addition, our results also reveal significant differences in statistical and dynamical features, indicating important variations of the stress field dynamics in terms of rate of entropy production, relaxation dynamics and non-equilibrium meta-stable stationary states. We also estimate parameters commonly used for characterizing fully developed turbulence, such as structure functions and flatness coefficient (F), in order to provide further information about the dynamics underlying jerky flow. Finally, we use two multifractal models developed to describe turbulence, namely the Arimitsu and Arimitsu (A&A) [2000, 2001] theoretical model, which is based on Tsallis statistics, and the p-model, to estimate theoretical multifractal spectrums f(a). Furthermore, we estimate the flatness coefficient (F) using a theoretical formula based on Tsallis statistics. The theoretical results are compared with the experimental ones, showing a remarkable agreement between modeling and experiment. 
Finally, the results of this study verify and extend previous studies which stated that the dynamics underlying type B and type A PLC bands are connected with distinct dynamical behavior, namely chaotic behavior for the former and self-organized critical (SOC) behavior for the latter, while shedding new light on the turbulent character of the PLC jerky flow.
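The flatness coefficient (F) used in turbulence analysis is conventionally the ratio of the fourth-order moment of the signal increments to the square of the second-order moment, with F = 3 for Gaussian increments and larger values indicating intermittency. A minimal sketch of that conventional empirical estimate follows, assuming increments are taken at a fixed lag and are not mean-subtracted; the function names and the test series are hypothetical, and the Tsallis-based theoretical formula used in the paper is not reproduced here.

```python
import random

def increments(series, lag=1):
    """Differences x[t + lag] - x[t] of a time-ordered series."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return [series[t + lag] - series[t] for t in range(len(series) - lag)]

def flatness(values):
    """Empirical flatness <x^4> / <x^2>^2 (no mean subtraction)."""
    n = len(values)
    second = sum(v ** 2 for v in values) / n
    fourth = sum(v ** 4 for v in values) / n
    if second == 0:
        raise ValueError("flatness is undefined for an all-zero series")
    return fourth / second ** 2

# A square-wave-like series: its lag-1 increments are +1 and -1, so F is exactly 1.
print(flatness(increments([0, 1, 0, 1, 0], lag=1)))

# Synthetic Gaussian sample (illustrative only): F should be close to 3.
random.seed(0)
gaussian = [random.gauss(0.0, 1.0) for _ in range(100_000)]
print(round(flatness(gaussian), 2))
```

Evaluating `flatness(increments(series, lag))` over a range of lags gives the scale dependence of F that is usually inspected for signs of intermittency.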
Multiscale hidden Markov models for photon-limited imaging
NASA Astrophysics Data System (ADS)
Nowak, Robert D.
1999-06-01
Photon-limited image analysis is often hindered by low signal-to-noise ratios. A novel Bayesian multiscale modeling and analysis method is developed in this paper to assist in these challenging situations. In addition to providing a very natural and useful framework for modeling and processing images, Bayesian multiscale analysis is often much less computationally demanding compared to classical Markov random field models. This paper focuses on a probabilistic graph model called the multiscale hidden Markov model (MHMM), which captures the key inter-scale dependencies present in natural image intensities. The MHMM framework presented here is specifically designed for photon-limited imaging applications involving Poisson statistics, and applications to image intensity analysis are examined.
Sabourin, Jeremy; Nobel, Andrew B.; Valdar, William
2014-01-01
Genomewide association studies sometimes identify loci at which both the number and identities of the underlying causal variants are ambiguous. In such cases, statistical methods that model effects of multiple SNPs simultaneously can help disentangle the observed patterns of association and provide information about how those SNPs could be prioritized for follow-up studies. Current multi-SNP methods, however, tend to assume that SNP effects are well captured by additive genetics; yet when genetic dominance is present, this assumption translates to reduced power and faulty prioritizations. We describe a statistical procedure for prioritizing SNPs at GWAS loci that efficiently models both additive and dominance effects. Our method, LLARRMA-dawg, combines a group LASSO procedure for sparse modeling of multiple SNP effects with a resampling procedure based on fractional observation weights; it estimates for each SNP the robustness of association with the phenotype both to sampling variation and to competing explanations from other SNPs. In producing a SNP prioritization that best identifies underlying true signals, we show that: our method easily outperforms a single marker analysis; when additive-only signals are present, our joint model for additive and dominance is equivalent to or only slightly less powerful than modeling additive-only effects; and, when dominance signals are present, even in combination with substantial additive effects, our joint model is unequivocally more powerful than a model assuming additivity. We also describe how performance can be improved through calibrated randomized penalization, and discuss how dominance in ungenotyped SNPs can be incorporated through either heterozygote dosage or multiple imputation. PMID:25417853
NASA Astrophysics Data System (ADS)
Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver
2013-04-01
With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, the application and verification of a statistical methodology for the climate downscaling of wind fields at surface level is presented in this work. This procedure is based on the combination of the Monte Carlo and the Principal Component Analysis (PCA) statistical methods. Firstly the Monte Carlo method is used to create a huge number of daily-based annual time series, so called climate representative years, by the stratified sampling of a 33-year-long time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Secondly the representative years are evaluated such that the best set is chosen according to its capability to recreate the Sea Level Pressure (SLP) temporal and spatial fields from the R-2 data set. The measure of this correspondence is based on the Euclidean distance between the Empirical Orthogonal Functions (EOF) spaces generated by the PCA (Principal Component Analysis) decomposition of the SLP fields from both the long-term and the representative year data sets. The methodology was verified by comparing the selected 365-days period against a 9-year period of wind fields generated by dynamical downscaling the Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. These results showed that, compared to the traditional method of dynamical downscaling any random 365-days period, the error in the average wind velocity by the PCA's representative year was reduced by almost 30%. Moreover the Mean Absolute Errors (MAE) in the monthly and daily wind profiles were also reduced by almost 25% along all SKIRON grid points. These results showed also that the methodology presented maximum error values in the wind speed mean of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. 
Besides the bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as the velocity and directional frequency. Additional repetitions were performed to prove the reliability and robustness of this kind of statistical-dynamical downscaling method.
Statistical methods to detect novel genetic variants using publicly available GWAS summary data.
Guo, Bin; Wu, Baolin
2018-03-01
We propose statistical methods to detect novel genetic variants using only genome-wide association studies (GWAS) summary data without access to raw genotype and phenotype data. With more and more summary data being posted for public access in the post-GWAS era, the proposed methods are practically very useful to identify additional interesting genetic variants and shed light on the underlying disease mechanism. We illustrate the utility of our proposed methods with application to GWAS meta-analysis results of fasting glucose from the international MAGIC consortium. We found several novel genome-wide significant loci that are worth further study. Copyright © 2018 Elsevier Ltd. All rights reserved.
Pitoia, Fabián; Jerkovich, Fernando; Smulever, Anabella; Brenta, Gabriela; Bueno, Fernanda; Cross, Graciela
2017-07-01
To evaluate the influence of age at diagnosis on the frequency of structural incomplete response (SIR) according to the modified risk of recurrence (RR) staging system from the American Thyroid Association guidelines. We performed a retrospective analysis of 268 patients with differentiated thyroid cancer (DTC) followed up for at least 3 years after initial treatment (total thyroidectomy and remnant ablation). The median follow-up in the whole cohort was 74.3 months (range: 36.1-317.9) and the median age at diagnosis was 45.9 years (range: 18-87). The association between age at diagnosis and the initial and final response to treatment was assessed with analysis of variance (ANOVA). Patients were also divided into several groups considering age younger and older than 40, 50, and 60 years. Age at diagnosis was not associated with either an initial or final statistically significant different SIR to treatment (p = 0.14 and p = 0.58, respectively). Additionally, we did not find any statistically significant differences when the percentages of SIR considering the classification of RR were compared between different groups of patients by using several age cutoffs. When patients are correctly risk stratified, it seems that age at diagnosis is not involved in the frequency of having a SIR at the initial evaluation or at the final follow-up, so it should not be included as an additional variable to be considered in the RR classifications.
Cantarero, Samuel; Zafra-Gómez, Alberto; Ballesteros, Oscar; Navalón, Alberto; Reis, Marco S; Saraiva, Pedro M; Vílchez, José L
2011-01-01
In this work we present a monitoring study of linear alkylbenzene sulfonates (LAS) and insoluble soap performed on Spanish sewage sludge samples. This work focuses on finding statistical relations between LAS concentrations and insoluble soap in sewage sludge samples and variables related to wastewater treatment plants such as water hardness, population and treatment type. In total, 38 samples, collected from different Spanish regions, were studied. The statistical tool we used was Principal Component Analysis (PCA), in order to reduce the number of response variables. The analysis of variance (ANOVA) test and a non-parametric test, the Kruskal-Wallis test, were also applied, using the p-value (the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming that the null hypothesis is true) to study possible relations between the concentration of both analytes and the rest of the variables. We also compared LAS and insoluble soap behaviors. In addition, the results obtained for LAS (mean value) were compared with the limit value proposed by the future Directive entitled "Working Document on Sludge". According to the results, the means obtained for soap and LAS were 26.49 g kg(-1) and 6.15 g kg(-1), respectively. It is worth noting that the LAS mean was significantly higher than the limit value (2.6 g kg(-1)). In addition, LAS and soap concentrations depend largely on water hardness. However, only LAS concentration depends on treatment type.
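To illustrate the Kruskal-Wallis test named in the abstract above, here is a minimal sketch of its H statistic, computed from average ranks and without the tie correction. The three groups and their concentrations are hypothetical values chosen for the example, not data from the study.

```python
def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_j**2 / n_j) - 3 (N + 1), no tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * acc - 3.0 * (n_total + 1)

# Hypothetical LAS concentrations (g/kg) grouped by water hardness.
soft = [2.1, 3.4, 2.9]
medium = [5.0, 4.4, 6.2]
hard = [7.5, 8.1, 9.3]
print(round(kruskal_wallis_h([soft, medium, hard]), 2))  # 7.2
```

Under the null hypothesis H is approximately chi-square distributed with (number of groups - 1) degrees of freedom, which is how a p-value of the kind defined in the abstract would be obtained.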
Evaluation of peak-picking algorithms for protein mass spectrometry.
Bauer, Chris; Cramer, Rainer; Schuchhardt, Johannes
2011-01-01
Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves.
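As a rough illustration of the signal-to-noise approach to peak picking mentioned above, the sketch below keeps local maxima whose height above a baseline exceeds a multiple of a noise estimate. The median baseline, the median-absolute-deviation noise estimate, the threshold of 3, and the toy spectrum are all assumptions made for this example; the paper's actual procedure is not described in the abstract.

```python
import statistics

def pick_peaks_snr(intensities, snr_threshold=3.0):
    """Return indices of local maxima whose signal-to-noise ratio passes the threshold."""
    baseline = statistics.median(intensities)
    # Noise estimate: median absolute deviation around the baseline (assumption).
    noise = statistics.median([abs(y - baseline) for y in intensities])
    if noise == 0:
        noise = 1e-12  # avoid division by zero on perfectly flat input
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        is_local_max = y > intensities[i - 1] and y >= intensities[i + 1]
        if is_local_max and (y - baseline) / noise >= snr_threshold:
            peaks.append(i)
    return peaks

# Hypothetical intensity trace with two clear peaks at indices 4 and 9.
spectrum = [1.0, 1.2, 0.9, 1.1, 9.0, 1.0, 0.8, 1.1, 1.3, 6.5, 1.2, 1.0]
print(pick_peaks_snr(spectrum))  # [4, 9]
```

Sensitivity and specificity against a reference peak list, as in the abstract, could then be computed by comparing the returned indices with manually annotated ones.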
Space biology initiative program definition review. Trade study 4: Design modularity and commonality
NASA Technical Reports Server (NTRS)
Jackson, L. Neal; Crenshaw, John, Sr.; Davidson, William L.; Herbert, Frank J.; Bilodeau, James W.; Stoval, J. Michael; Sutton, Terry
1989-01-01
The relative cost impacts (up or down) of developing Space Biology hardware using design modularity and commonality is studied. Recommendations for how the hardware development should be accomplished to meet optimum design modularity requirements for Life Science investigation hardware will be provided. In addition, the relative cost impacts of implementing commonality of hardware for all Space Biology hardware are defined. Cost analysis and supporting recommendations for levels of modularity and commonality are presented. A mathematical or statistical cost analysis method with the capability to support development of production design modularity and commonality impacts to parametric cost analysis is provided.
Evaluating the decision accuracy and speed of clinical data visualizations.
Pieczkiewicz, David S; Finkelstein, Stanley M
2010-01-01
Clinicians face an increasing volume of biomedical data. Assessing the efficacy of systems that enable accurate and timely clinical decision making merits corresponding attention. This paper discusses the multiple-reader multiple-case (MRMC) experimental design and linear mixed models as means of assessing and comparing decision accuracy and latency (time) for decision tasks in which clinician readers must interpret visual displays of data. These experimental and statistical techniques, used extensively in radiology imaging studies, offer a number of practical and analytic advantages over more traditional quantitative methods such as percent-correct measurements and ANOVAs, and are recommended for their statistical efficiency and generalizability. An example analysis using readily available, free, and commercial statistical software is provided as an appendix. While these techniques are not appropriate for all evaluation questions, they can provide a valuable addition to the evaluative toolkit of medical informatics research.
NASA Technical Reports Server (NTRS)
Hoffer, R. M. (Principal Investigator); Knowlton, D. J.; Dean, M. E.
1981-01-01
A set of training statistics for the 30 meter resolution simulated thematic mapper MSS data was generated based on land use/land cover classes. In addition to this supervised data set, a nonsupervised multicluster block of training statistics is being defined in order to compare the classification results and evaluate the effect of the different training selection methods on classification performance. Two test data sets, defined using a stratified sampling procedure incorporating a grid system with dimensions of 50 lines by 50 columns, and another set based on an analyst supervised set of test fields were used to evaluate the classifications of the TMS data. The supervised training data set generated training statistics, and a per point Gaussian maximum likelihood classification of the 1979 TMS data was obtained. The August 1980 MSS data was radiometrically adjusted. The SAR data was redigitized and the SAR imagery was qualitatively analyzed.
Patton, Charles J.; Gilroy, Edward J.
1999-01-01
Data on which this report is based, including nutrient concentrations in synthetic reference samples determined concurrently with those in real samples, are extensive (greater than 20,000 determinations) and have been published separately. In addition to confirming the well-documented instability of nitrite in acidified samples, this study also demonstrates that when biota are removed from samples at collection sites by 0.45-micrometer membrane filtration, subsequent preservation with sulfuric acid or mercury (II) provides no statistically significant improvement in nutrient concentration stability during storage at 4 degrees Celsius for 30 days. Biocide preservation had no statistically significant effect on the 30-day stability of phosphorus concentrations in whole-water splits from any of the 15 stations, but did stabilize Kjeldahl nitrogen concentrations in whole-water splits from three data-collection stations where ammonium accounted for at least half of the measured Kjeldahl nitrogen.
Vejai Vekaash, Chitra Janardhanan; Kumar Reddy, Tripuravaram Vinay; Venkatesh, Kondas Vijay
2017-01-01
Aim: This study aims to evaluate the color change in human enamel bleached with three different concentrations of hydrogen peroxide, containing pineapple extract as an additive in two different timings, using reflectance spectrophotometer. Background: The study aimed to investigate the bleaching efficacy on natural teeth using natural enzymes. Materials and Methods: Baseline color values of 10 randomly selected artificially stained incisors were obtained. The specimens were divided into three groups of 20 teeth each: Group I – 30% hydrogen peroxide, Group II – 20% hydrogen peroxide, and Group III – 10% hydrogen peroxide. One half of the tooth was bleached with hydrogen peroxide, and other was bleached with hydrogen peroxide and pineapple extract for 20 min (Subgroup A) and 10 min (Subgroup B). Statistical Analysis: The results were statistically analyzed using Student's t-test. Results: The mean ΔE values of Group IA (31.62 ± 0.9), Group IIA (29.85 ± 1.2), and Group IIIA (28.65 ± 1.2) showed statistically significant higher values when compared to the mean ΔE values of Group 1A (25.02 ± 1.2), Group IIA (22.86 ± 1.1), and Group IIIA (16.56 ± 1.1). Identical results were obtained in Subgroup B. Conclusion: The addition of pineapple extract to hydrogen peroxide resulted in effective bleaching. PMID:29386782
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.
Shaikh, Masood Ali
2017-09-01
Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes help determine research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP), in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and use of statistical software programme SPSS to be the most common study design, inferential statistical analysis, and statistical analysis software programmes, respectively. These results echo previously published assessment of these two journals for the year 2014.
AGR-1 Thermocouple Data Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Einerson
2012-05-01
This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods to further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, AGR-1 test configuration and test procedure, overview of AGR-1 measured data, and overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases, in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulted from the combined analysis are identified.
This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of simulation results (Chapter 3). The statistics-based simulation-aided experimental control procedure described for the future AGR tests is developed and demonstrated in Chapter 4. The procedure for controlling the target fuel temperature (capsule peak or average) is based on regression functions of thermocouple readings and other relevant parameters and accounting for possible changes in both physical and thermal conditions and in instrument performance.
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.
Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy
2014-11-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
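Among the word-frequency measures listed in the review above, the D2 statistic is simple enough to sketch: it is the sum, over all words of length k, of the product of that word's counts in the two sequences. The sequences and the choice k = 2 below are hypothetical, and the normalized variants of D2 are not shown.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def d2_statistic(seq_a, seq_b, k):
    """D2 = sum over words w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Words absent from seq_b contribute zero (Counter returns 0 for missing keys).
    return sum(n * counts_b[w] for w, n in counts_a.items())

# Hypothetical sequences: they share the words AC, CG and GT.
print(d2_statistic("ACGTACGT", "ACGTTTTT", 2))  # 6
```

No alignment is computed at any point, which is why such measures scale to large sequence sets; a larger D2 suggests more shared word content between the two sequences.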
Moon, Andres; Smith, Geoffrey H; Kong, Jun; Rogers, Thomas E; Ellis, Carla L; Farris, Alton B Brad
2018-02-01
Renal allograft rejection diagnosis depends on assessment of parameters such as interstitial inflammation; however, studies have shown interobserver variability regarding interstitial inflammation assessment. Since automated image analysis quantitation can be reproducible, we devised customized analysis methods for CD3+ T-cell staining density as a measure of rejection severity and compared them with established commercial methods along with visual assessment. Renal biopsy CD3 immunohistochemistry slides (n = 45), including renal allografts with various degrees of acute cellular rejection (ACR) were scanned for whole slide images (WSIs). Inflammation was quantitated in the WSIs using pathologist visual assessment, commercial algorithms (Aperio nuclear algorithm for CD3+ cells/mm 2 and Aperio positive pixel count algorithm), and customized open source algorithms developed in ImageJ with thresholding/positive pixel counting (custom CD3+%) and identification of pixels fulfilling "maxima" criteria for CD3 expression (custom CD3+ cells/mm 2 ). Based on visual inspections of "markup" images, CD3 quantitation algorithms produced adequate accuracy. Additionally, CD3 quantitation algorithms correlated between each other and also with visual assessment in a statistically significant manner (r = 0.44 to 0.94, p = 0.003 to < 0.0001). Methods for assessing inflammation suggested a progression through the tubulointerstitial ACR grades, with statistically different results in borderline versus other ACR types, in all but the custom methods. Assessment of CD3-stained slides using various open source image analysis algorithms presents salient correlations with established methods of CD3 quantitation. These analysis techniques are promising and highly customizable, providing a form of on-slide "flow cytometry" that can facilitate additional diagnostic accuracy in tissue-based assessments.
Evaluation of redundancy analysis to identify signatures of local adaptation.
Capblancq, Thibaut; Luu, Keurcien; Blum, Michael G B; Bazin, Eric
2018-05-26
Ordination is a common tool in ecology that aims at representing complex biological information in a reduced space. In landscape genetics, ordination methods such as principal component analysis (PCA) have been used to detect adaptive variation based on genomic data. Taking advantage of environmental data in addition to genotype data, redundancy analysis (RDA) is another ordination approach that is useful to detect adaptive variation. This paper aims at proposing a test statistic based on RDA to search for loci under selection. We compare redundancy analysis to pcadapt, which is a nonconstrained ordination method, and to a latent factor mixed model (LFMM), which is a univariate genotype-environment association method. Individual-based simulations identify evolutionary scenarios where RDA genome scans have a greater statistical power than genome scans based on PCA. By constraining the analysis with environmental variables, RDA performs better than PCA in identifying adaptive variation when selection gradients are weakly correlated with population structure. Additionally, we show that if RDA and LFMM have a similar power to identify genetic markers associated with environmental variables, the RDA-based procedure has the advantage to identify the main selective gradients as a combination of environmental variables. To give a concrete illustration of RDA in population genomics, we apply this method to the detection of outliers and selective gradients on an SNP data set of Populus trichocarpa (Geraldes et al., 2013). The RDA-based approach identifies the main selective gradient contrasting southern and coastal populations to northern and continental populations in the northwestern American coast. This article is protected by copyright. All rights reserved.
NASA Astrophysics Data System (ADS)
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Sacks, David B.; Yu, Yi-Kuo
2018-06-01
Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo
2018-06-05
Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
ERIC Educational Resources Information Center
Kriston, Levente; Melchior, Hanne; Hergert, Anika; Bergelt, Corinna; Watzke, Birgit; Schulz, Holger; von Wolff, Alessa
2011-01-01
The aim of our study was to develop a graphical tool that can be used in addition to standard statistical criteria to support decisions on the number of classes in explorative categorical latent variable modeling for rehabilitation research. Data from two rehabilitation research projects were used. In the first study, a latent profile analysis was…
Nonindependence and sensitivity analyses in ecological and evolutionary meta-analyses.
Noble, Daniel W A; Lagisz, Malgorzata; O'dea, Rose E; Nakagawa, Shinichi
2017-05-01
Meta-analysis is an important tool for synthesizing research on a variety of topics in ecology and evolution, including molecular ecology, but can be susceptible to nonindependence. Nonindependence can affect two major interrelated components of a meta-analysis: (i) the calculation of effect size statistics and (ii) the estimation of overall meta-analytic estimates and their uncertainty. While some solutions to nonindependence exist at the statistical analysis stages, there is little advice on what to do when complex analyses are not possible, or when studies with nonindependent experimental designs exist in the data. Here we argue that exploring the effects of procedural decisions in a meta-analysis (e.g. inclusion of different quality data, choice of effect size) and statistical assumptions (e.g. assuming no phylogenetic covariance) using sensitivity analyses are extremely important in assessing the impact of nonindependence. Sensitivity analyses can provide greater confidence in results and highlight important limitations of empirical work (e.g. impact of study design on overall effects). Despite their importance, sensitivity analyses are seldom applied to problems of nonindependence. To encourage better practice for dealing with nonindependence in meta-analytic studies, we present accessible examples demonstrating the impact that ignoring nonindependence can have on meta-analytic estimates. We also provide pragmatic solutions for dealing with nonindependent study designs, and for analysing dependent effect sizes. Additionally, we offer reporting guidelines that will facilitate disclosure of the sources of nonindependence in meta-analyses, leading to greater transparency and more robust conclusions. © 2017 John Wiley & Sons Ltd.
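One simple form of sensitivity analysis is to drop each effect size in turn and recompute the pooled estimate. The sketch below uses a fixed-effect, inverse-variance weighted mean, which is an assumption made for brevity; the models discussed in the paper (for example multilevel or phylogenetic ones) are more elaborate, and the effect sizes here are hypothetical.

```python
def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance weighted mean and its sampling variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    return mean, 1.0 / total

def leave_one_out(effects, variances):
    """Pooled mean recomputed with each effect size removed in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results.append(pooled_effect(kept_effects, kept_variances)[0])
    return results

# Hypothetical effect sizes; the last one is an outlier.
effects = [0.20, 0.30, 0.25, 1.50]
variances = [0.04, 0.04, 0.04, 0.04]

print(pooled_effect(effects, variances)[0])
for dropped, estimate in enumerate(leave_one_out(effects, variances)):
    print(dropped, estimate)
```

A large shift when one entry is removed flags that entry as influential. Dropping all effect sizes from one study at a time, rather than one effect size, would be the analogous check when several effects come from the same study.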
The Relationship Between Surface Curvature and Abdominal Aortic Aneurysm Wall Stress.
de Galarreta, Sergio Ruiz; Cazón, Aitor; Antón, Raúl; Finol, Ender A
2017-08-01
The maximum diameter (MD) criterion is the most important factor when predicting risk of rupture of abdominal aortic aneurysms (AAAs). An elevated wall stress has also been linked to a high risk of aneurysm rupture, yet it is uncommon in clinical practice to compute AAA wall stress. The purpose of this study is to assess whether other characteristics of the AAA geometry are statistically correlated with wall stress. Using in-house segmentation and meshing algorithms, 30 patient-specific AAA models were generated for finite element analysis (FEA). These models were subsequently used to estimate wall stress and maximum diameter and to evaluate the spatial distributions of wall thickness, cross-sectional diameter, mean curvature, and Gaussian curvature. Data analysis consisted of statistical correlations of the aforementioned geometry metrics with wall stress for the 30 AAA inner and outer wall surfaces. In addition, a linear regression analysis was performed with all the AAA wall surfaces to quantify the relationship of the geometric indices with wall stress. These analyses indicated that while all the geometry metrics have statistically significant correlations with wall stress, the local mean curvature (LMC) exhibits the highest average Pearson's correlation coefficient for both inner and outer wall surfaces. The linear regression analysis revealed coefficients of determination for the outer and inner wall surfaces of 0.712 and 0.516, respectively, with LMC having the largest effect on the linear regression equation with wall stress. This work underscores the importance of evaluating AAA mean wall curvature as a potential surrogate for wall stress.
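The two summary quantities reported above, Pearson's correlation coefficient and the coefficient of determination, can be sketched for a single predictor as follows. With one predictor and an intercept, R^2 equals the square of Pearson's r; the study's regression appears to involve several geometric indices, so this is a simplification, and the curvature and stress values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared_simple(x, y):
    """Coefficient of determination of a one-predictor least-squares fit."""
    return pearson_r(x, y) ** 2

# Hypothetical local mean curvature and wall stress values.
curvature = [0.10, 0.20, 0.30, 0.40, 0.50]
stress = [12.0, 19.0, 33.0, 41.0, 48.0]

print(pearson_r(curvature, stress))
print(r_squared_simple(curvature, stress))
```

Read this way, the reported R^2 of 0.712 for the outer wall means that about 71% of the variance in wall stress is accounted for by the fitted linear combination of geometric indices.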
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ingram, Jani Cheri; Lehman, Richard Michael; Bauer, William Francis
We report the use of a surface analysis approach, static secondary ion mass spectrometry (SIMS) equipped with a molecular (ReO4-) ion primary beam, to analyze the surface of intact microbial cells. SIMS spectra of 28 microorganisms were compared to fatty acid profiles determined by gas chromatographic analysis of transesterified fatty acids extracted from the same organisms. The results indicate that surface bombardment using the molecular primary beam cleaved the ester linkage characteristic of bacteria at the glycerophosphate backbone of the phospholipid components of the cell membrane. This cleavage enables direct detection of the fatty acid conjugate base of intact microorganisms by static SIMS. The limit of detection for this approach is approximately 10^7 bacterial cells/cm^2. Multivariate statistical methods were applied in a graded approach to the SIMS microbial data. The results showed that the full data set could initially be statistically grouped based upon major differences in biochemical composition of the cell wall. The gram-positive bacteria were further statistically analyzed, followed by final analysis of a specific bacterial genus that was successfully grouped by species. Additionally, the use of SIMS to detect microbes on mineral surfaces is demonstrated by an analysis of Shewanella oneidensis on crushed hematite. The results of this study provide evidence for the potential of static SIMS to rapidly detect bacterial species based on ion fragments originating from cell membrane lipids directly from sample surfaces.
Medial Tibial Stress Shielding: A Limitation of Cobalt Chromium Tibial Baseplates.
Martin, J Ryan; Watts, Chad D; Levy, Daniel L; Kim, Raymond H
2017-02-01
Stress shielding is a well-recognized complication associated with total knee arthroplasty. However, this phenomenon has not been thoroughly described. Specifically, no study to our knowledge has evaluated the radiographic impact of utilizing various tibial component compositions on tibial stress shielding. We retrospectively reviewed 3 cohorts of 50 patients that had a preoperative varus deformity and were implanted with a titanium, cobalt chromium (CoCr), or an all polyethylene tibial implant. A radiographic comparative analysis was performed to evaluate the amount of medial tibial bone loss in each cohort. In addition, a clinical outcomes analysis was performed on the 3 cohorts. The CoCr was noted to have a statistically significant increase in medial tibial bone loss compared with the other 2 cohorts. The all polyethylene cohort had a statistically significantly higher final Knee Society Score and was associated with the least amount of stress shielding. The CoCr tray is the most rigid of 3 implants that were compared in this study. Interestingly, this cohort had the highest amount of medial tibial bone loss. In addition, 1 patient in the CoCr cohort had medial soft tissue irritation which was attributed to a prominent medial tibial tray which required revision surgery to mitigate the symptoms. Copyright © 2016 Elsevier Inc. All rights reserved.
The modern Japanese color lexicon.
Kuriki, Ichiro; Lange, Ryan; Muto, Yumiko; Brown, Angela M; Fukuda, Kazuho; Tokunaga, Rumi; Lindsey, Delwin T; Uchikawa, Keiji; Shioiri, Satoshi
2017-03-01
Despite numerous prior studies, important questions about the Japanese color lexicon persist, particularly about the number of Japanese basic color terms and their deployment across color space. Here, 57 native Japanese speakers provided monolexemic terms for 320 chromatic and 10 achromatic Munsell color samples. Through k-means cluster analysis we revealed 16 statistically distinct Japanese chromatic categories. These included eight chromatic basic color terms (aka/red, ki/yellow, midori/green, ao/blue, pink, orange, cha/brown, and murasaki/purple) plus eight additional terms: mizu ("water")/light blue, hada ("skin tone")/peach, kon ("indigo")/dark blue, matcha ("green tea")/yellow-green, enji/maroon, oudo ("sand or mud")/mustard, yamabuki ("globeflower")/gold, and cream. Of these additional terms, mizu was used by 98% of informants, and emerged as a strong candidate for a 12th Japanese basic color term. Japanese and American English color-naming systems were broadly similar, except for color categories in one language (mizu, kon, teal, lavender, magenta, lime) that had no equivalent in the other. Our analysis revealed two statistically distinct Japanese motifs (or color-naming systems), which differed mainly in the extension of mizu across our color palette. Comparison of the present data with an earlier study by Uchikawa & Boynton (1987) suggests that some changes in the Japanese color lexicon have occurred over the last 30 years.
Urbanowicz, Ryan J.; Granizo-Mackenzie, Ambrose; Moore, Jason H.
2014-01-01
Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms which distribute the learned solution over a sizable population of rules. However, their application to complex real world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes call for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes, and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data. PMID:25431544
Sullivan, Kristynn J; Shadish, William R; Steiner, Peter M
2015-03-01
Single-case designs (SCDs) are short time series that assess intervention effects by measuring units repeatedly over time in both the presence and absence of treatment. This article introduces a statistical technique for analyzing SCD data that has not been much used in psychological and educational research: generalized additive models (GAMs). In parametric regression, the researcher must choose a functional form to impose on the data, for example, that trend over time is linear. GAMs reverse this process by letting the data inform the choice of functional form. In this article we review the problem that trend poses in SCDs, discuss how current SCD analytic methods approach trend, describe GAMs as a possible solution, suggest a GAM model testing procedure for examining the presence of trend in SCDs, present a small simulation to show the statistical properties of GAMs, and illustrate the procedure on 3 examples of different lengths. Results suggest that GAMs may be very useful both as a form of sensitivity analysis for checking the plausibility of assumptions about trend and as a primary data analysis strategy for testing treatment effects. We conclude with a discussion of some problems with GAMs and some future directions for research on the application of GAMs to SCDs. (c) 2015 APA, all rights reserved.
NASA Astrophysics Data System (ADS)
Ramkilowan, A.; Griffith, D. J.
2017-10-01
Surveillance modelling in terms of the standard Detect, Recognise and Identify (DRI) thresholds remains a key requirement for determining the effectiveness of surveillance sensors. With readily available computational resources it has become feasible to perform statistically representative evaluations of the effectiveness of these sensors. A new capability for performing this Monte-Carlo type analysis is demonstrated in the MORTICIA (Monte-Carlo Optical Rendering for Theatre Investigations of Capability under the Influence of the Atmosphere) software package developed at the Council for Scientific and Industrial Research (CSIR). This first-generation, Python-based, open-source integrated software package, currently in the alpha stage of development, aims to provide all the functionality required to perform statistical investigations of the effectiveness of optical surveillance systems in specific or generic deployment theatres. This includes modelling of the mathematical and physical processes that govern, amongst other components of a surveillance system, a sensor's detector and optical components, a target and its background, as well as the intervening atmospheric influences. In this paper we discuss integral aspects of the bespoke framework that are critical to the longevity of all subsequent modelling efforts. Additionally, some preliminary results are presented.
"Magnitude-based inference": a statistical review.
Welsh, Alan H; Knight, Emma J
2015-04-01
We consider "magnitude-based inference" and its interpretation by examining in detail its use in the problem of comparing two means. We extract from the spreadsheets, which are provided to users of the analysis (http://www.sportsci.org/), a precise description of how "magnitude-based inference" is implemented. We compare the implemented version of the method with general descriptions of it and interpret the method in familiar statistical terms. We show that "magnitude-based inference" is not a progressive improvement on modern statistics. The additional probabilities introduced are not directly related to the confidence interval but, rather, are interpretable either as P values for two different nonstandard tests (for different null hypotheses) or as approximate Bayesian calculations, which also lead to a type of test. We also discuss sample size calculations associated with "magnitude-based inference" and show that the substantial reduction in sample sizes claimed for the method (30% of the sample size obtained from standard frequentist calculations) is not justifiable so the sample size calculations should not be used. Rather than using "magnitude-based inference," a better solution is to be realistic about the limitations of the data and use either confidence intervals or a fully Bayesian analysis.
Inactive Hepatitis B Carrier and Pregnancy Outcomes: A Systematic Review and Meta-analysis.
Keramat, Afsaneh; Younesian, Masud; Gholami Fesharaki, Mohammad; Hasani, Maryam; Mirzaei, Samaneh; Ebrahimi, Elham; Alavian, Seyed Moaed; Mohammadi, Fatemeh
2017-04-01
We aimed to explore whether maternal asymptomatic hepatitis B (HB) infection affects preterm rupture of membranes (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension, or antepartum hemorrhage. We searched PubMed, Scopus, and ISI Web of Science from 1990 to Feb 2015. In addition, electronic literature searches were supplemented by searching the gray literature (e.g., conference abstracts, theses, and technical reports) and scanning the reference lists of included studies and relevant systematic reviews. We explored statistical heterogeneity using the I^2 and tau-squared (Tau^2) statistics. Eighteen studies were included. Preterm rupture of membranes (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension, and antepartum hemorrhage were the outcomes considered in this survey. The results showed no significant association between inactive HB and these complications in pregnancy. The small P values of the chi-square test and the large I^2 values suggested probable heterogeneity, which we tried to address with statistical methods such as subgroup analysis. Inactive HB infection did not increase the risk of the adverse outcomes considered in this study. Further well-designed studies should be performed to confirm the results.
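The heterogeneity statistics named in the abstract above can be illustrated with a minimal sketch. It assumes the usual inverse-variance definitions of Cochran's Q and I^2 and the DerSimonian-Laird moment estimator of tau^2; the abstract does not say which estimator the authors used, and the function name and the input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, I^2 (in percent), and DerSimonian-Laird tau^2.

    effects   -- per-study effect estimates (e.g. log odds ratios)
    variances -- per-study sampling variances of those estimates
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird scaling constant
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


# Hypothetical log odds ratios and variances, for illustration only
q, i2, tau2 = heterogeneity([0.10, 0.35, -0.20, 0.50], [0.04, 0.09, 0.05, 0.12])
print(f"Q = {q:.2f}, I^2 = {i2:.1f}%, tau^2 = {tau2:.3f}")
```

A large Q relative to its degrees of freedom, and hence a large I^2, is the pattern the abstract describes as probable heterogeneity.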
Using radar imagery for crop discrimination: a statistical and conditional probability study
Haralick, R.M.; Caspall, F.; Simonett, D.S.
1970-01-01
A number of the constraints with which remote sensing must contend in crop studies are outlined. They include sensor, identification accuracy, and congruencing constraints; the nature of the answers demanded of the sensor system; and the complex temporal variances of crops in large areas. Attention is then focused on several methods which may be used in the statistical analysis of multidimensional remote sensing data. Crop discrimination for radar K-band imagery is investigated by three methods. The first one uses a Bayes decision rule, the second a nearest-neighbor spatial conditional probability approach, and the third the standard statistical techniques of cluster analysis and principal axes representation. Results indicate that crop type and percent of cover significantly affect the strength of the radar return signal. Sugar beets, corn, and very bare ground are easily distinguishable; sorghum, alfalfa, and young wheat are harder to distinguish. Distinguishability will be improved if the imagery is examined in time sequence so that changes between times of planting, maturation, and harvest provide additional discriminant tools. A comparison between radar and photography indicates that radar performed surprisingly well in crop discrimination in western Kansas and warrants further study.
Highton, R
1993-12-01
An analysis of the relationship between the number of loci utilized in an electrophoretic study of genetic relationships and the statistical support for the topology of UPGMA trees is reported for two published data sets. These are Highton and Larson (Syst. Zool. 28:579-599, 1979), an analysis of the relationships of 28 species of plethodonine salamanders, and Hedges (Syst. Zool. 35:1-21, 1986), a similar study of 30 taxa of Holarctic hylid frogs. As the number of loci increases, the statistical support for the topology at each node in UPGMA trees was determined by both the bootstrap and jackknife methods. The results show that the bootstrap and jackknife probabilities supporting the topology at some nodes of UPGMA trees increase as the number of loci utilized in a study is increased, as expected for nodes that have groupings that reflect phylogenetic relationships. The pattern of increase varies and is especially rapid in the case of groups with no close relatives. At nodes that likely do not represent correct phylogenetic relationships, the bootstrap probabilities do not increase and often decline with the addition of more loci.
A study of correlations between crude oil spot and futures markets: A rolling sample test
NASA Astrophysics Data System (ADS)
Liu, Li; Wan, Jieqiu
2011-10-01
In this article, we investigate the asymmetries of exceedance correlations and cross-correlations between West Texas Intermediate (WTI) spot and futures markets. First, employing the test statistic proposed by Hong et al. [Asymmetries in stock returns: statistical tests and economic evaluation, Review of Financial Studies 20 (2007) 1547-1581], we find that the exceedance correlations were overall symmetric. However, the results from rolling windows show that some occasional events could induce significant asymmetries in the exceedance correlations. Second, employing the test statistic proposed by Podobnik et al. [Quantifying cross-correlations using local and global detrending approaches, European Physical Journal B 71 (2009) 243-250], we find that the cross-correlations were significant even for large lagged orders. Using the detrended cross-correlation analysis proposed by Podobnik and Stanley [Detrended cross-correlation analysis: a new method for analyzing two nonstationary time series, Physical Review Letters 100 (2008) 084102], we find that the cross-correlations were weakly persistent and were stronger between spot and futures contracts with longer maturity. Our results from the rolling sample test also show the apparent effects of exogenous events. Additionally, we discuss the obtained evidence.
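The rolling-sample idea in the abstract above can be sketched in a few lines. This is a deliberately simplified illustration: it computes an ordinary Pearson correlation in each window, not the exceedance-correlation test of Hong et al. or the detrended cross-correlation analysis of Podobnik and Stanley, and the function names and return series are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant window
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(x, y, window):
    """Correlation of x and y in each consecutive window of the given length."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    if window < 2 or window > len(x):
        raise ValueError("window must be between 2 and the series length")
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]


# Hypothetical daily returns for a spot and a futures series
spot = [0.010, -0.020, 0.015, 0.003, -0.007, 0.012, -0.004, 0.008]
futures = [0.012, -0.018, 0.010, 0.004, -0.009, 0.011, -0.001, 0.006]
for start, rho in enumerate(rolling_correlation(spot, futures, window=5)):
    print(f"window starting at t={start}: rho = {rho:.3f}")
```

Plotting the windowed values over time is one way to see whether particular events coincide with shifts in the spot-futures relationship.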
Eckmann, Christian; Wasserman, Matthew; Latif, Faisal; Roberts, Graeme; Beriot-Mathiot, Axelle
2013-10-01
Hospital-onset Clostridium difficile infection (CDI) places a significant burden on health care systems throughout Europe, estimated at around €3 billion per annum. This burden is shared between national payers and hospitals that support additional bed days for patients diagnosed with CDI while in hospital or patients re-admitted from a previous hospitalisation. This study was performed to quantify additional hospital stay attributable to CDI in four countries, England, Germany, Spain, and The Netherlands, by analysing nationwide hospital-episode data. We focused upon patients at increased risk of CDI: with chronic obstructive pulmonary disease, heart failure, diabetes, or chronic kidney disease, and aged 50 years or over. Multivariate regression and propensity score matching models were developed to investigate the impact of CDI on additional length of hospital stay, controlling for confounding factors such as underlying disease severity. Patients in England had the longest additional hospital stay attributable to CDI at 16.09 days, followed by Germany at 15.47 days, Spain at 13.56 days, and The Netherlands at 12.58 days, derived using regression analysis. Propensity score matching indicated a higher attributable length of stay of 32.42 days in England, 15.31 days in Spain, and 18.64 days in The Netherlands. Outputs from this study consistently demonstrate that in European countries, for patients whose hospitalisation is complicated by CDI, the infection causes a statistically significant increase in hospital length of stay. This has implications for optimising resource allocation and budget setting at both the national and hospital level to ensure that levels of CDI-complicated hospitalisations are minimised.
Expression Profiling of Nonpolar Lipids in Meibum From Patients With Dry Eye: A Pilot Study
Chen, Jianzhong; Keirsey, Jeremy K.; Green, Kari B.; Nichols, Kelly K.
2017-01-01
Purpose: The purpose of this investigation was to characterize differentially expressed lipids in meibum samples from patients with dry eye disease (DED) in order to better understand the underlying pathologic mechanisms. Methods: Meibum samples were collected from postmenopausal women with DED (PW-DED; n = 5) and a control group of postmenopausal women without DED (n = 4). Lipid profiles were analyzed by direct infusion full-scan electrospray ionization mass spectrometry (ESI-MS). An initial analysis of 145 representative peaks from four classes of lipids in PW-DED samples revealed that additional manual corrections for peak overlap and isotopes only slightly affected the statistical analysis. Therefore, analysis of uncorrected data, which can be applied to a greater number of peaks, was used to compare more than 500 lipid peaks common to PW-DED and control samples. Statistical analysis of peak intensities identified several lipid species that differed significantly between the two groups. Data from contact lens wearers with DED (CL-DED; n = 5) were also analyzed. Results: Many species of the two types of diesters (DE) and very long chain wax esters (WE) were decreased by ∼20% in PW-DED, whereas levels of triacylglycerols were increased by an average of 39% ± 3% in meibum from PW-DED compared to that in the control group. Approximately the same reduction (20%) of similar DE and WE was observed for CL-DED. Conclusions: Statistical analysis of peak intensities from direct infusion ESI-MS results identified differentially expressed lipids in meibum from dry eye patients. Further studies are warranted to support these findings. PMID:28426869
Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P; Kalinin, Sergei V
2014-10-28
Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the data set are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of a RHEED image sequence. This approach is illustrated for growth of La(x)Ca(1-x)MnO(3) films grown on etched (001) SrTiO(3) substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.
Wu, Wenjing; Wang, Yan; Xu, Lulu
2015-10-01
It is unclear whether epipolis-laser in situ keratomileusis (Epi-LASIK) has any significant advantage over photorefractive keratectomy (PRK) for correcting myopia. We undertook this meta-analysis of randomized controlled trials and cohort studies to examine possible differences in efficacy, predictability, and side effects between Epi-LASIK and PRK for correcting myopia. A systematic literature review was conducted in PubMed, the Cochrane Library, and EMBASE. The statistical analysis was performed with RevMan 5.0 software. The outcomes included efficacy (percentage of eyes with 20/20 uncorrected visual acuity post-treatment), predictability (proportion of eyes within ±0.5 D of the target refraction), epithelial healing time, the incidence of significant haze, and pain scores after surgery. Seven articles with a total of 987 eyes were suitable for the meta-analysis. There was no statistically significant difference in predictability between Epi-LASIK and PRK (risk ratio [RR] 1.03, 95% confidence interval [CI] [0.92, 1.16], p = 0.18), nor in efficacy (odds ratio 1.43, 95% CI [0.85, 2.40], p = 0.56). Epithelial healing time, pain scores, and the incidence of significant haze did not differ significantly between the two techniques, although more pain was reported with Epi-LASIK than with PRK in the early postoperative period. According to this analysis, Epi-LASIK has efficacy and predictability comparable to PRK. In addition, both techniques have low pain scores and a low incidence of significant haze.
Automated SEM Modal Analysis Applied to the Diogenites
NASA Technical Reports Server (NTRS)
Bowman, L. E.; Spilde, M. N.; Papike, James J.
1996-01-01
Analysis of volume proportions of minerals, or modal analysis, is routinely accomplished by point counting on an optical microscope, but the process, particularly on brecciated samples such as the diogenite meteorites, is tedious and prone to error by misidentification of very small fragments, which may make up a significant volume of the sample. Precise volume percentage data can be gathered on a scanning electron microscope (SEM) utilizing digital imaging and an energy dispersive spectrometer (EDS). This form of automated phase analysis reduces error, and at the same time provides more information than could be gathered using simple point counting alone, such as particle morphology statistics and chemical analyses. We have previously studied major, minor, and trace-element chemistry of orthopyroxene from a suite of diogenites. This abstract describes the method applied to determine the modes on this same suite of meteorites and the results of that research. The modal abundances thus determined provide additional information on the petrogenesis of the diogenites. In addition, low-abundance phases such as spinels were located for further analysis by this method.
Single-digit arithmetic processing—anatomical evidence from statistical voxel-based lesion analysis
Mihulowicz, Urszula; Willmes, Klaus; Karnath, Hans-Otto; Klein, Elise
2014-01-01
Different specific mechanisms have been suggested for solving single-digit arithmetic operations. However, the neural correlates underlying basic arithmetic (multiplication, addition, subtraction) are still under debate. In the present study, we systematically assessed single-digit arithmetic in a group of acute stroke patients (n = 45) with circumscribed left- or right-hemispheric brain lesions. Lesion sites significantly related to impaired performance were found only in the left-hemisphere damaged (LHD) group. Deficits in multiplication and addition were related to subcortical/white matter brain regions differing from those for subtraction tasks, corroborating the notion of distinct processing pathways for different arithmetic tasks. Additionally, our results further point to the importance of investigating fiber pathways in numerical cognition. PMID:24847238
Qu, Xin; Hall, Alex; DeAngelis, Anthony M.; ...
2018-01-11
Differences among climate models in equilibrium climate sensitivity (ECS; the equilibrium surface temperature response to a doubling of atmospheric CO2) remain a significant barrier to the accurate assessment of societally important impacts of climate change. Relationships between ECS and observable metrics of the current climate in model ensembles, so-called emergent constraints, have been used to constrain ECS. Here a statistical method (including a backward selection process) is employed to achieve a better statistical understanding of the connections between four recently proposed emergent constraint metrics and individual feedbacks influencing ECS. The relationship between each metric and ECS is largely attributable to a statistical connection with shortwave low cloud feedback, the leading cause of intermodel ECS spread. This result bolsters confidence in some of the metrics, which had assumed such a connection in the first place. Additional analysis is conducted with a few thousand artificial metrics that are randomly generated but are well correlated with ECS. The relationships between the contrived metrics and ECS can also be linked statistically to shortwave cloud feedback. Thus, any proposed or forthcoming ECS constraint based on the current generation of climate models should be viewed as a potential constraint on shortwave cloud feedback, and physical links with that feedback should be investigated to verify that the constraint is real. Additionally, any proposed ECS constraint should not be taken at face value since other factors influencing ECS besides shortwave cloud feedback could be systematically biased in the models.
NASA Astrophysics Data System (ADS)
Shirasaki, Masato; Nishimichi, Takahiro; Li, Baojiu; Higuchi, Yuichi
2017-04-01
We investigate the information content of various cosmic shear statistics on the theory of gravity. Focusing on the Hu-Sawicki-type f(R) model, we perform a set of ray-tracing simulations and measure the convergence bispectrum, peak counts and Minkowski functionals. We first show that while the convergence power spectrum does have sensitivity to the current value of extra scalar degree of freedom |f_R0|, it is largely compensated by a change in the present density amplitude parameter σ_8 and the matter density parameter Ω_m0. With accurate covariance matrices obtained from 1000 lensing simulations, we then examine the constraining power of the three additional statistics. We find that these probes are indeed helpful to break the parameter degeneracy, which cannot be resolved from the power spectrum alone. We show that especially the peak counts and Minkowski functionals have the potential to rigorously (marginally) detect the signature of modified gravity with the parameter |f_R0| as small as 10^-5 (10^-6) if we can properly model them on small (~1 arcmin) scale in a future survey with a sky coverage of 1500 deg^2. We also show that the signal level is similar among the additional three statistics and all of them provide complementary information to the power spectrum. These findings indicate the importance of combining multiple probes beyond the standard power spectrum analysis to detect possible modifications to general relativity.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
ERIC Educational Resources Information Center
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia
2016-01-01
Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis. PMID:27792763
Quantifying Safety Margin Using the Risk-Informed Safety Margin Characterization (RISMC)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grabaskas, David; Bucknor, Matthew; Brunett, Acacia
2015-04-26
The Risk-Informed Safety Margin Characterization (RISMC), developed by Idaho National Laboratory as part of the Light-Water Reactor Sustainability Project, utilizes a probabilistic safety margin comparison between a load and capacity distribution, rather than a deterministic comparison between two values, as is usually done in best-estimate plus uncertainty analyses. The goal is to determine the failure probability, or in other words, the probability of the system load equaling or exceeding the system capacity. While this method has been used in pilot studies, there has been little work conducted investigating the statistical significance of the resulting failure probability. In particular, it is difficult to determine how many simulations are necessary to properly characterize the failure probability. This work uses classical (frequentist) statistics and confidence intervals to examine the impact in statistical accuracy when the number of simulations is varied. Two methods are proposed to establish confidence intervals related to the failure probability established using a RISMC analysis. The confidence interval provides information about the statistical accuracy of the method utilized to explore the uncertainty space, and offers a quantitative method to gauge the increase in statistical accuracy due to performing additional simulations.
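The idea in the abstract above, estimating the probability that load equals or exceeds capacity and watching the confidence interval tighten as simulations are added, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the normal load and capacity distributions, their parameters, and the use of the Wilson score interval, which is one common frequentist choice and not necessarily either of the two methods the report proposes.

```python
import math
import random


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = failures / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def estimate_failure_probability(n, seed=0):
    """Count simulated cases where load >= capacity and return (failures, interval)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        load = rng.gauss(100.0, 10.0)      # hypothetical load distribution
        capacity = rng.gauss(130.0, 10.0)  # hypothetical capacity distribution
        if load >= capacity:
            failures += 1
    return failures, wilson_interval(failures, n)


if __name__ == "__main__":
    # More simulations should give a narrower interval around the estimate.
    for n in (100, 1_000, 10_000):
        failures, (low, high) = estimate_failure_probability(n)
        print(f"n={n:>6}  p_hat={failures / n:.4f}  95% CI=({low:.4f}, {high:.4f})")
```

The interval width shrinks roughly with the square root of the number of simulations, which is the kind of quantitative gauge of "accuracy gained per additional simulation" the abstract describes.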
Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun
2018-01-01
The aim was to implement an online statistical analysis function in the information system for air pollution and health impact monitoring, and to obtain data analysis results in real time. Descriptive statistics, time-series analysis and multivariate regression analysis were implemented online on top of the database software, using SQL and visualization tools. The system generates basic statistical tables and summary tables of air pollution exposure and health impact data online; generates trend charts for each part of the data online, with interactive connections to the database; and generates export sheets online that can be loaded directly into R, SAS and SPSS. The information system for air pollution and health impact monitoring thus implements statistical analysis online and can provide real-time analysis results to its users.
Salvatore, Stefania; Bramness, Jørgen Gustav; Reid, Malcolm J; Thomas, Kevin Victor; Harman, Christopher; Røislien, Jo
2015-01-01
Wastewater-based epidemiology (WBE) is a new methodology for estimating the drug load in a population. Simple summary statistics and specification tests have typically been used to analyze WBE data, comparing differences between weekday and weekend loads. Such standard statistical methods may, however, overlook important nuanced information in the data. In this study, we apply functional data analysis (FDA) to WBE data and compare the results to those obtained from more traditional summary measures. We analysed temporal WBE data from 42 European cities, using sewage samples collected daily for one week in March 2013. For each city, the main temporal features of two selected drugs were extracted using functional principal component (FPC) analysis, along with simpler measures such as the area under the curve (AUC). The individual cities' scores on each of the temporal FPCs were then used as outcome variables in multiple linear regression analysis with various city and country characteristics as predictors. The results were compared to those of functional analysis of variance (FANOVA). The first three FPCs explained more than 99% of the temporal variation. The first component (FPC1) represented the level of the drug load, while the second and third temporal components represented the level and the timing of a weekend peak. AUC was highly correlated with FPC1, but other temporal characteristics were not captured by the simple summary measures. FANOVA was less flexible than the FPCA-based regression, and even showed concordant results. Geographical location was the main predictor for the general level of the drug load. FDA of WBE data extracts more detailed information about drug load patterns during the week which are not identified by more traditional statistical methods. Results also suggest that regression based on FPC results is a valuable addition to FANOVA for estimating associations between temporal patterns and covariate information.
Topographic ERP analyses: a step-by-step tutorial review.
Murray, Micah M; Brunet, Denis; Michel, Christoph M
2008-06-01
In this tutorial review, we detail both the rationale for as well as the implementation of a set of analyses of surface-recorded event-related potentials (ERPs) that uses the reference-free spatial (i.e. topographic) information available from high-density electrode montages to render statistical information concerning modulations in response strength, latency, and topography both between and within experimental conditions. In these and other ways these topographic analysis methods allow the experimenter to glean additional information and neurophysiologic interpretability beyond what is available from canonical waveform analyses. In this tutorial we present the example of somatosensory evoked potentials (SEPs) in response to stimulation of each hand to illustrate these points. For each step of these analyses, we provide the reader with both a conceptual and mathematical description of how the analysis is carried out, what it yields, and how to interpret its statistical outcome. We show that these topographic analysis methods are intuitive and easy-to-use approaches that can remove much of the guesswork often confronting ERP researchers and also assist in identifying the information contained within high-density ERP datasets.
Influence of flavor solvent on flavor release and perception in sugar-free chewing gum.
Potineni, Rajesh V; Peterson, Devin G
2008-05-14
The influence of flavor solvent [triacetin (TA), propylene glycol (PG), medium chained triglycerides (MCT), or no flavor solvent (NFS)] on the flavor release profile, the textural properties, and the sensory perception of a sugar-free chewing gum was investigated. Time course analysis of the exhaled breath and saliva during chewing gum mastication indicated that flavor solvent addition or type did not influence the aroma release profile; however, the sorbitol release rate was statistically lower for the TA formulated sample in comparison to those with PG, MCT, or NFS. Sensory time-intensity analysis also indicated that the TA formulated sample was statistically lower in perceived sweetness intensity, in comparison with the other chewing gum samples, and also had lower cinnamon-like aroma intensity, presumably due to an interaction between sweetness intensity on aroma perception. Measurement of the chewing gum macroscopic texture by compression analysis during consumption was not correlated to the unique flavor release properties of the TA-chewing gum. However, a relationship between gum base plasticity and retention of sugar alcohol during mastication was proposed to explain the different flavor properties of the TA sample.
[Changes in cerebrospinal fluid in patients with tuberculosis of the central nervous system].
Jedrychowski, Michał; Garlicki, Aleksander
2008-01-01
The aim of the study was to analyze the parameters of the cerebrospinal fluid (CSF) in patients with tuberculosis of the central nervous system (CNS) confirmed by culture or molecular methods, in comparison to patients without such confirmation. The medical documentation of 13 patients with CNS tuberculosis (10 male and 3 female) who were hospitalized at the Clinic of Infectious Diseases in Kraków in the years 2001-2006 was analyzed. The following CSF parameters were taken into account in both groups of patients: cytologic analysis and protein, glucose and chloride concentrations. Statistical analysis was done using the non-parametric Mann-Whitney U test. The only parameter for which a statistically significant difference between the two groups of patients was found was the level of glucose in CSF (p<0.05). Lower glucose concentration was observed in the group with etiologically confirmed CNS tuberculosis. Moreover, additional localisation of tuberculosis was observed in this group of patients. The introduction of molecular biology methods in diagnosis allowed the etiologic factor to be detected more often.
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Baltzer, Pascal Andreas Thomas; Renz, Diane M; Kullnig, Petra E; Gajda, Mieczyslaw; Camara, Oumar; Kaiser, Werner A
2009-04-01
The identification of the most suspect enhancing part of a lesion is regarded as a major diagnostic criterion in dynamic magnetic resonance mammography. Computer-aided diagnosis (CAD) software allows the semi-automatic analysis of the kinetic characteristics of complete enhancing lesions, providing additional information about lesion vasculature. The diagnostic value of this information has not yet been quantified. Consecutive patients from routine diagnostic studies (1.5 T, 0.1 mmol gadopentetate dimeglumine, dynamic gradient-echo sequences at 1-minute intervals) were analyzed prospectively using CAD. Dynamic sequences were processed and reduced to a parametric map. Curve types were classified by initial signal increase (not significant, intermediate, and strong) and the delayed time course of signal intensity (continuous, plateau, and washout). Lesion enhancement was measured using CAD. The most suspect curve, the curve-type distribution percentage, and combined dynamic data were compared. Statistical analysis included logistic regression analysis and receiver-operating characteristic analysis. Fifty-one patients with 46 malignant and 44 benign lesions were enrolled. On receiver-operating characteristic analysis, the most suspect curve showed diagnostic accuracy of 76.7 +/- 5%. In comparison, the curve-type distribution percentage demonstrated accuracy of 80.2 +/- 4.9%. Combined dynamic data had the highest diagnostic accuracy (84.3 +/- 4.2%). These differences did not achieve statistical significance. With appropriate cutoff values, sensitivity and specificity, respectively, were found to be 80.4% and 72.7% for the most suspect curve, 76.1% and 83.6% for the curve-type distribution percentage, and 78.3% and 84.5% for both parameters. The integration of whole-lesion dynamic data tends to improve specificity. However, no statistical significance backs up this finding.
Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu
2015-09-21
Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations, making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Plymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
Ankle plantarflexion strength in rearfoot and forefoot runners: a novel clusteranalytic approach.
Liebl, Dominik; Willwacher, Steffen; Hamill, Joseph; Brüggemann, Gert-Peter
2014-06-01
The purpose of the present study was to test for differences in ankle plantarflexion strengths of habitually rearfoot and forefoot runners. In order to approach this issue, we revisit the problem of classifying different footfall patterns in human runners. A dataset of 119 subjects running shod and barefoot (speed 3.5 m/s) was analyzed. The footfall patterns were clustered by a novel statistical approach, which is motivated by advances in the statistical literature on functional data analysis. We explain the novel statistical approach in detail and compare it to the classically used strike index of Cavanagh and Lafortune (1980). The two groups found by the new cluster approach are readily interpretable as a forefoot and a rearfoot footfall group. The subsequent comparison study of the clustered subjects reveals that runners with a forefoot footfall pattern are capable of producing significantly higher joint moments in a maximum voluntary contraction (MVC) of their ankle plantarflexor muscle-tendon units; difference in means: 0.28 Nm/kg. This effect remains significant after controlling for an additional gender effect and for differences in training levels. Our analysis confirms the hypothesis that forefoot runners have a higher mean MVC plantarflexion strength than rearfoot runners. Furthermore, we demonstrate that our proposed stochastic cluster analysis provides a robust and useful framework for clustering foot strikes. Copyright © 2014 Elsevier B.V. All rights reserved.
WINPEPI updated: computer programs for epidemiologists, and their teaching potential
2011-01-01
Background: The WINPEPI computer programs for epidemiologists are designed for use in practice and research in the health field and as learning or teaching aids. The programs are free, and can be downloaded from the Internet. Numerous additions have been made in recent years. Implementation: There are now seven WINPEPI programs: DESCRIBE, for use in descriptive epidemiology; COMPARE2, for use in comparisons of two independent groups or samples; PAIRSetc, for use in comparisons of paired and other matched observations; LOGISTIC, for logistic regression analysis; POISSON, for Poisson regression analysis; WHATIS, a "ready reckoner" utility program; and ETCETERA, for miscellaneous other procedures. The programs now contain 122 modules, each of which provides a number, sometimes a large number, of statistical procedures. The programs are accompanied by a Finder that indicates which modules are appropriate for different purposes. The manuals explain the uses, limitations and applicability of the procedures, and furnish formulae and references. Conclusions: WINPEPI is a handy resource for a wide variety of statistical routines used by epidemiologists. Because of its ready availability, portability, ease of use, and versatility, WINPEPI has a considerable potential as a learning and teaching aid, both with respect to practical procedures in the planning and analysis of epidemiological studies, and with respect to important epidemiological concepts. It can also be used as an aid in the teaching of general basic statistics. PMID:21288353
Specifying the ISS Plasma Environment
NASA Technical Reports Server (NTRS)
Minow, Joseph I.; Diekmann, Anne; Neergaard, Linda; Bui, Them; Mikatarian, Ronald; Barsamian, Hagop; Koontz, Steven
2002-01-01
Quantifying the spacecraft charging risks and corresponding hazards for the International Space Station (ISS) requires a plasma environment specification describing the natural variability of ionospheric temperature (Te) and density (Ne). Empirical ionospheric specification and forecast models such as the International Reference Ionosphere (IRI) model typically only provide estimates of long term (seasonal) mean Te and Ne values for the low Earth orbit environment. Knowledge of the Te and Ne variability as well as the likelihood of extreme deviations from the mean values is required to estimate both the magnitude and frequency of occurrence of potentially hazardous spacecraft charging environments for a given ISS construction stage and flight configuration. This paper describes the statistical analysis of historical ionospheric low Earth orbit plasma measurements used to estimate Ne, Te variability in the ISS flight environment. The statistical variability analysis of Ne and Te enables calculation of the expected frequency of occurrence of any particular values of Ne and Te, especially those that correspond to possibly hazardous spacecraft charging environments. The database used in the original analysis included measurements from the AE-C, AE-D, and DE-2 satellites. Recent work has added further satellites to the database, as well as ground-based incoherent scatter radar observations. Deviations of the data values from the IRI estimated Ne, Te parameters for each data point provide a statistical basis for modeling the deviations of the plasma environment from the IRI model output.
Carmichael, Mary C.; St. Clair, Candace; Edwards, Andrea M.; Barrett, Peter; McFerrin, Harris; Davenport, Ian; Awad, Mohamed; Kundu, Anup; Ireland, Shubha Kale
2016-01-01
Xavier University of Louisiana leads the nation in awarding BS degrees in the biological sciences to African-American students. In this multiyear study with ∼5500 participants, data-driven interventions were adopted to improve student academic performance in a freshman-level general biology course. The three hour-long exams were common and administered concurrently to all students. New exam questions were developed using Bloom’s taxonomy, and exam results were analyzed statistically with validated assessment tools. All but the comprehensive final exam were returned to students for self-evaluation and remediation. Among other approaches, course rigor was monitored by using an identical set of 60 questions on the final exam across 10 semesters. Analysis of the identical sets of 60 final exam questions revealed that overall averages increased from 72.9% (2010) to 83.5% (2015). Regression analysis demonstrated a statistically significant correlation between high-risk students and their averages on the 60 questions. Additional analysis demonstrated statistically significant improvements for at least one letter grade from midterm to final and a 20% increase in the course pass rates over time, also for the high-risk population. These results support the hypothesis that our data-driven interventions and assessment techniques are successful in improving student retention, particularly for our academically at-risk students. PMID:27543637
Non-extensivity and complexity in the earthquake activity at the West Corinth rift (Greece)
NASA Astrophysics Data System (ADS)
Michas, Georgios; Vallianatos, Filippos; Sammonds, Peter
2013-04-01
Earthquakes exhibit complex phenomenology that is revealed from the fractal structure in space, time and magnitude. For that reason, tools other than simple Poissonian statistics seem more appropriate to describe the statistical properties of the phenomenon. Here we use Non-Extensive Statistical Physics [NESP] to investigate the inter-event time distribution of the earthquake activity at the west Corinth rift (central Greece). This area is one of the most seismotectonically active areas in Europe, with an important continental N-S extension and high seismicity rates. The NESP concept refers to the non-additive Tsallis entropy Sq, which includes Boltzmann-Gibbs entropy as a particular case. This concept has been successfully used for the analysis of a variety of complex dynamic systems including earthquakes, where fractality and long-range interactions are important. The analysis indicates that the cumulative inter-event time distribution can be successfully described with NESP, implying the complexity that characterizes the temporal occurrences of earthquakes. Further on, we use the Tsallis entropy (Sq) and the Fisher Information Measure (FIM) to investigate the complexity that characterizes the inter-event time distribution through different time windows along the evolution of the seismic activity at the West Corinth rift. The results of this analysis reveal a different level of organization and clustering of the seismic activity in time. Acknowledgments. GM wishes to acknowledge the partial support of the Greek State Scholarships Foundation (IKY).
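For readers unfamiliar with the Tsallis entropy mentioned above, a minimal sketch of its standard textbook definition for a discrete distribution may help: S_q = (1 - Σ p_i^q) / (q - 1) with the Boltzmann constant set to 1, which reduces to the Boltzmann-Gibbs (Shannon) form as q approaches 1. The function name and the example distributions are illustrative assumptions and are not taken from the study.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q of a discrete distribution (k = 1).

    For q == 1 the Boltzmann-Gibbs/Shannon limit -sum(p * ln p) is returned.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p**q for p in probs if p > 0)) / (q - 1.0)


if __name__ == "__main__":
    q = 1.5
    a = [0.5, 0.5]  # illustrative distribution for system A
    b = [0.2, 0.8]  # illustrative distribution for system B
    joint = [pa * pb for pa in a for pb in b]  # independent joint system

    s_a, s_b = tsallis_entropy(a, q), tsallis_entropy(b, q)
    s_joint = tsallis_entropy(joint, q)

    # Non-additivity: S_q(A+B) = S_q(A) + S_q(B) + (1 - q) * S_q(A) * S_q(B)
    print(s_joint, s_a + s_b + (1 - q) * s_a * s_b)
```

The final line shows the sense in which the entropy is "non-additive": for independent systems the joint entropy differs from the plain sum by a cross term proportional to (1 - q), and that term vanishes in the q = 1 limit.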
Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application
Cantor, Rita M.; Lange, Kenneth; Sinsheimer, Janet S.
2010-01-01
Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach. PMID:20074509
Entropy Production in Collisionless Systems. II. Arbitrary Phase-space Occupation Numbers
NASA Astrophysics Data System (ADS)
Barnes, Eric I.; Williams, Liliya L. R.
2012-04-01
We present an analysis of two thermodynamic techniques for determining equilibria of self-gravitating systems. One is the Lynden-Bell (LB) entropy maximization analysis that introduced violent relaxation. Since we do not use the Stirling approximation, which is invalid at small occupation numbers, our systems have finite mass, unlike LB's isothermal spheres. (Instead of Stirling, we utilize a very accurate smooth approximation for ln x!.) The second analysis extends entropy production extremization to self-gravitating systems, also without the use of the Stirling approximation. In addition to the LB statistical family characterized by the exclusion principle in phase space, and designed to treat collisionless systems, we also apply the two approaches to the Maxwell-Boltzmann (MB) families, which have no exclusion principle and hence represent collisional systems. We implicitly assume that all of the phase space is equally accessible. We derive entropy production expressions for both families and give the extremum conditions for entropy production. Surprisingly, our analysis indicates that extremizing entropy production rate results in systems that have maximum entropy, in both LB and MB statistics. In other words, both thermodynamic approaches lead to the same equilibrium structures.
Analysis of Parasite and Other Skewed Counts
Alexander, Neal
2012-01-01
Objective: To review methods for the statistical analysis of parasite and other skewed count data. Methods: Statistical methods for skewed count data are described and compared, with reference to those used over a ten-year period of Tropical Medicine and International Health. Two parasitological datasets are used for illustration. Results: Ninety papers were identified, 89 with descriptive and 60 with inferential analysis. A lack of clarity is noted in identifying measures of location, in particular the Williams and geometric mean. The different measures are compared, emphasizing the legitimacy of the arithmetic mean for skewed data. In the published papers, the t test and related methods were often used on untransformed data, which is likely to be invalid. Several approaches to inferential analysis are described, emphasizing 1) non-parametric methods, while noting that they are not simply comparisons of medians, and 2) generalized linear modelling, in particular with the negative binomial distribution. Additional methods, such as the bootstrap, with potential for greater use are described. Conclusions: Clarity is recommended when describing transformations and measures of location. It is suggested that non-parametric methods and generalized linear models are likely to be sufficient for most analyses. PMID:22943299
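The measures of location named in the abstract above can be confused with one another, so a small sketch may help. It uses the usual definitions: the geometric mean is the exponential of the mean log count (undefined when any count is zero), and the Williams mean is the geometric mean of (count + 1), minus 1. The function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def arithmetic_mean(counts):
    """Ordinary average of the counts."""
    return sum(counts) / len(counts)


def geometric_mean(counts):
    """exp(mean(log x)); undefined if any count is zero or negative."""
    if any(x <= 0 for x in counts):
        raise ValueError("geometric mean requires strictly positive counts")
    return math.exp(sum(math.log(x) for x in counts) / len(counts))


def williams_mean(counts):
    """Geometric mean of (x + 1), minus 1; defined when zeros are present."""
    return math.expm1(sum(math.log1p(x) for x in counts) / len(counts))


if __name__ == "__main__":
    counts = [0, 0, 1, 3, 7, 120]  # hypothetical skewed parasite counts
    print("arithmetic:", arithmetic_mean(counts))
    print("Williams:  ", williams_mean(counts))
    # geometric_mean(counts) would raise ValueError here because of the zeros
```

On skewed data like this the Williams mean sits far below the arithmetic mean, which is why the abstract asks authors to state clearly which measure they report.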
Influence of red wine fermentation oenological additives on inoculated strain implantation.
Duarte, Filomena L; Alves, Ana Claudia; Alemão, Maria Filomena; Baleiras-Couto, M Margarida
2013-06-01
Pure selected cultures of Saccharomyces cerevisiae starters are regularly used in the wine industry. A survey of S. cerevisiae populations during red wine fermentations was performed in order to evaluate the influence of oenological additives on the implantation of the inoculated strain. Pilot scale fermentations (500 L) were conducted with active dry yeast (ADY) and other commercial oenological additives, namely two commercial fermentation activators and two commercial tannins. Six microsatellite markers were used to type S. cerevisiae strains. The methodology proved to be very discriminating as a great diversity of wild strains (48 genotypes) was detected. Statistical analysis confirmed a high detection of the inoculated commercial strain, and for half the samples an effective implantation of ADY (over 80 %) was achieved. At late fermentation time, ADY strain implantation in fermentations conducted with commercial additives was lower than in the control. These results question the efficacy of ADY addition in the presence of other additives, indicating that further studies are needed to improve knowledge on oenological additives' use.
NASA Astrophysics Data System (ADS)
Zhao, Xuemei; Li, Rui; Chen, Yu; Sia, Sheau Fung; Li, Donghai; Zhang, Yu; Liu, Aihua
2017-04-01
Additional hemodynamic parameters are highly desirable in the clinical management of intracranial aneurysm rupture as static medical images cannot demonstrate the blood flow within aneurysms. There are two ways of obtaining the hemodynamic information—by phase-contrast magnetic resonance imaging (PCMRI) and computational fluid dynamics (CFD). In this paper, we compared PCMRI and CFD in the analysis of a stable patient's specific aneurysm. The results showed that PCMRI and CFD are in good agreement with each other. An additional CFD study of two stable and two ruptured aneurysms revealed that ruptured aneurysms have a higher statistical average blood velocity, wall shear stress, and oscillatory shear index (OSI) within the aneurysm sac compared to those of stable aneurysms. Furthermore, for ruptured aneurysms, the OSI divides the positive and negative wall shear stress divergence at the aneurysm sac.
Bremer, Peer-Timo; Weber, Gunther; Tierny, Julien; Pascucci, Valerio; Day, Marcus S; Bell, John B
2011-09-01
Large-scale simulations are increasingly being used to study complex scientific and engineering phenomena. As a result, advanced visualization and data analysis are also becoming an integral part of the scientific process. Often, a key step in extracting insight from these large simulations involves the definition, extraction, and evaluation of features in the space and time coordinates of the solution. However, in many applications, these features involve a range of parameters and decisions that will affect the quality and direction of the analysis. Examples include particular level sets of a specific scalar field, or local inequalities between derived quantities. A critical step in the analysis is to understand how these arbitrary parameters/decisions impact the statistical properties of the features, since such a characterization will help to evaluate the conclusions of the analysis as a whole. We present a new topological framework that, in a single pass, extracts and encodes entire families of possible feature definitions as well as their statistical properties. For each time step we construct a hierarchical merge tree, a highly compact yet flexible feature representation. While this data structure is more than two orders of magnitude smaller than the raw simulation data, it allows us to extract a set of features for any given parameter selection in a postprocessing step. Furthermore, we augment the trees with additional attributes, making it possible to gather a large number of useful global, local, as well as conditional statistics that would otherwise be extremely difficult to compile. We also use this representation to create tracking graphs that describe the temporal evolution of the features over time. Our system provides a linked-view interface to explore the time-evolution of the graph interactively alongside the segmentation, thus making it possible to perform extensive data analysis in a very efficient manner. We demonstrate our framework by extracting and analyzing burning cells from a large-scale turbulent combustion simulation. In particular, we show how the statistical analysis enabled by our techniques provides new insight into the combustion process.
EEG analysis using wavelet-based information tools.
Rosso, O A; Martin, M T; Figliola, A; Keller, K; Plastino, A
2006-06-15
Wavelet-based informational tools for quantitative electroencephalogram (EEG) record analysis are reviewed. Relative wavelet energies, wavelet entropies and wavelet statistical complexities are used in the characterization of scalp EEG records corresponding to secondary generalized tonic-clonic epileptic seizures. In particular, we show that the epileptic recruitment rhythm observed during seizure development is well described in terms of the relative wavelet energies. In addition, during the concomitant time-period the entropy diminishes while complexity grows. This is construed as evidence supporting the conjecture that an epileptic focus, for this kind of seizures, triggers a self-organized brain state characterized by both order and maximal complexity.
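The relative wavelet energies and wavelet entropy named in this abstract can be illustrated with a minimal sketch. It assumes a plain Haar decomposition (chosen here only because it needs no external library; the abstract does not say which wavelet the authors used) and defines the relative energy of level j as p_j = E_j / E_total and the entropy as -sum(p_j ln p_j). The function names are hypothetical, and the statistical complexity measure is not covered.

```python
import math


def haar_detail_energies(signal):
    """Energy of the Haar detail coefficients at each resolution level.

    Assumes len(signal) is a power of two; for other lengths the
    unpaired trailing sample at a level is simply dropped.
    """
    approx = list(signal)
    energies = []
    while len(approx) >= 2:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


def relative_wavelet_energies(signal):
    """Fraction of the total detail energy found at each level."""
    energies = haar_detail_energies(signal)
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has no detail energy")
    return [e / total for e in energies]


def wavelet_entropy(signal):
    """Shannon entropy of the relative wavelet energies (natural log)."""
    p = relative_wavelet_energies(signal)
    return -sum(pj * math.log(pj) for pj in p if pj > 0)
```

A signal whose energy sits in a single level gives an entropy of zero, while energy spread evenly over the levels gives the maximum, which is the sense in which a falling entropy indicates a more ordered signal.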
A Handbook of Sound and Vibration Parameters
1978-09-18
fixed in space. (Reference 1.) no motion atay node Static Divergence: (See Divergence.) Statistical Energy Analysis (SEA): Statistical energy analysis is...parameters of the circuits come from statistics of the vibrational characteristics of the structure. Statistical energy analysis is uniquely successful
Comparative Analysis Between Computed and Conventional Inferior Alveolar Nerve Block Techniques.
Araújo, Gabriela Madeira; Barbalho, Jimmy Charles Melo; Dias, Tasiana Guedes de Souza; Santos, Thiago de Santana; Vasconcellos, Ricardo José de Holanda; de Morais, Hécio Henrique Araújo
2015-11-01
The aim of this randomized, double-blind, controlled trial was to compare the computed and conventional inferior alveolar nerve block techniques in symmetrically positioned inferior third molars. Both computed and conventional anesthetic techniques were performed in 29 healthy patients (58 surgeries) aged between 18 and 40 years. The anesthetic of choice was 2% lidocaine with 1:200,000 epinephrine. The Visual Analogue Scale assessed the pain variable after anesthetic infiltration. Patient satisfaction was evaluated using the Likert Scale. Heart and respiratory rates, mean time to perform technique, and the need for additional anesthesia were also evaluated. Pain variable means were higher for the conventional technique than for the computed technique (3.45 ± 2.73 and 2.86 ± 1.96, respectively), but no statistically significant differences were found (P > 0.05). Patient satisfaction showed no statistically significant differences. The mean times to perform the computed and conventional techniques were 3.85 and 1.61 minutes, respectively, a statistically significant difference (P < 0.001). The computed anesthetic technique showed lower mean pain perception, but did not show statistically significant differences when contrasted to the conventional technique.
Conceptual and statistical problems associated with the use of diversity indices in ecology.
Barrantes, Gilbert; Sandoval, Luis
2009-09-01
Diversity indices, particularly the Shannon-Wiener index, have been used extensively in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no single statistical method that retains all information needed to answer even a simple question. However, multivariate analyses, such as cluster analyses or multiple regressions, could be used instead of diversity indices. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on environmental variables associated with the presence and abundance of the species in a community. In addition, particular hypotheses associated with changes in species richness across localities, or changes in abundance of one species or a group of species, can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved to be robust for standardizing all samples to a common size. Even a method as simple as reporting the number of species per taxonomic category possibly provides more information than a diversity index value.
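The two quantities contrasted in this abstract can be sketched as follows. The Shannon-Wiener index is H' = -sum(p_i ln p_i) over species proportions, and rarefaction is shown here in its common hypergeometric form (expected number of species in a random subsample of n individuals). The function names are hypothetical, and the abstract does not state which rarefaction variant the authors have in mind, so this is an illustration rather than their procedure.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H' from per-species abundance counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefied_richness(counts, n):
    """Expected number of species in a random subsample of n individuals.

    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the
    total number of individuals and N_i the abundance of species i.
    Requires Python 3.8+ for math.comb.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total count")
    denom = math.comb(total, n)
    return sum(1 - math.comb(total - c, n) / denom for c in counts if c > 0)
```

Evaluating `rarefied_richness` for two communities at the same n compares their richness at a common sample size, which is the standardization the abstract refers to.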
Carter, Laura; Wilson, Stephen; Tumer, Erwin G
2010-01-01
The purpose of this retrospective chart review was to document sedation and analgesic medications administered preoperatively, intraoperatively, and during postanesthesia care for children undergoing dental rehabilitation using general anesthesia (GA). Patient gender, age, procedure type performed, and ASA status were recorded from the medical charts of children undergoing GA for dental rehabilitation. The sedative and analgesic drugs administered pre-, intra-, and postoperatively were recorded. Statistical analysis included descriptive statistics and cross-tabulation. A sample of 115 patients with a mean age of 64 (+/-30) months was studied; 47% were females, and 71% were healthy. Over 80% of the patients were administered medications primarily during pre- and intraoperative phases, with fewer than 25% receiving medications postoperatively. Morphine and fentanyl were the most frequently administered agents intraoperatively. The procedure type, gender, and health status were not statistically associated with the number of agents administered. Younger patients, however, were statistically more likely to receive additional analgesic medications. Our study suggests that a minority of patients have postoperative discomfort in the postanesthesia care unit; mild to moderate analgesics were administered during intraoperative phases of dental rehabilitation.
Sb2Te3 and Its Superlattices: Optimization by Statistical Design.
Behera, Jitendra K; Zhou, Xilin; Ranjan, Alok; Simpson, Robert E
2018-05-02
The objective of this work is to demonstrate the usefulness of fractional factorial design for optimizing the crystal quality of chalcogenide van der Waals (vdW) crystals. We statistically analyze the growth parameters of highly c-axis oriented Sb2Te3 crystals and Sb2Te3-GeTe phase change vdW heterostructured superlattices. The statistical significance of the growth parameters of temperature, pressure, power, buffer materials, and buffer layer thickness was found by fractional factorial design and response surface analysis. Temperature, pressure, power, and their second-order interactions are the major factors that significantly influence the quality of the crystals. Additionally, using tungsten rather than molybdenum as a buffer layer significantly enhances the crystal quality. Fractional factorial design minimizes the number of experiments that are necessary to find the optimal growth conditions, resulting in an order of magnitude improvement in the crystal quality. We highlight that statistical design of experiment methods, which is more commonly used in product design, should be considered more broadly by those designing and optimizing materials.
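How a fractional factorial design reduces the number of runs can be shown with the smallest textbook case: a two-level half fraction for three factors, where the third factor's level is set by the generator C = A x B. This is only an illustration of the principle; the abstract does not give the design the authors actually used for their five growth parameters, and the factor labels below are borrowed from it purely as placeholders.

```python
from itertools import product


def half_fraction_design():
    """Two-level 2^(3-1) design: 4 runs instead of the 8 of a full factorial.

    Levels are coded as -1 (low) and +1 (high). The third column is
    generated as the product of the first two (generator C = A*B).
    """
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


# Placeholder factor names, taken from the abstract for illustration only.
FACTORS = ("temperature", "pressure", "power")

if __name__ == "__main__":
    for run in half_fraction_design():
        print(dict(zip(FACTORS, run)))
```

The price of halving the run count is aliasing: in this particular design the main effect of the third factor cannot be separated from the interaction of the first two, so larger designs with higher resolution are needed when second-order interactions matter.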
NASA Technical Reports Server (NTRS)
Rino, C. L.; Livingston, R. C.; Whitney, H. E.
1976-01-01
This paper presents an analysis of ionospheric scintillation data which shows that the underlying statistical structure of the signal can be accurately modeled by the additive complex Gaussian perturbation predicted by the Born approximation in conjunction with an application of the central limit theorem. By making use of this fact, it is possible to estimate the in-phase, phase quadrature, and cophased scattered power by curve fitting to measured intensity histograms. By using this procedure, it is found that typically more than 80% of the scattered power is in phase quadrature with the undeviated signal component. Thus, the signal is modeled by a Gaussian, but highly non-Rician process. From simultaneous UHF and VHF data, only a weak dependence of this statistical structure on changes in the Fresnel radius is deduced. The signal variance is found to have a nonquadratic wavelength dependence. It is hypothesized that this latter effect is a subtle manifestation of locally homogeneous irregularity structures, a mathematical model proposed by Kolmogorov (1941) in his early studies of incompressible fluid turbulence.
An advanced probabilistic structural analysis method for implicit performance functions
NASA Technical Reports Server (NTRS)
Wu, Y.-T.; Millwater, H. R.; Cruse, T. A.
1989-01-01
In probabilistic structural analysis, the performance or response functions usually are implicitly defined and must be solved by numerical analysis methods such as finite element methods. In such cases, the most commonly used probabilistic analysis tool is the mean-based, second-moment method which provides only the first two statistical moments. This paper presents a generalized advanced mean value (AMV) method which is capable of establishing the distributions to provide additional information for reliability design. The method requires slightly more computations than the second-moment method but is highly efficient relative to the other alternative methods. In particular, the examples show that the AMV method can be used to solve problems involving non-monotonic functions that result in truncated distributions.
Xiao, Hanqiong; Li, Wei; Ma, Ruixia; Gong, Zhengpeng; Shi, Haibo; Li, Huawei; Chen, Bing; Jiang, Ye; Dai, Chunfu
2015-06-01
To describe the regional differences in factors that affect early cochlear implantation in prelingually deaf children between the eastern and western regions of China. The charts of 113 children who received a cochlear implant after 24 months of age were reviewed and analyzed. Forty-five of them came from the eastern region (Jiangsu, Zhejiang or Shanghai), while 68 came from the western region (Ningxia or Guizhou). Parental interviews were conducted to collect information regarding the factors that affect early cochlear implantation. Result: Based on the univariate logistic regression analysis, the odds ratio (OR) for universal newborn hearing screening (UNHS) was 5.481, which indicated that the correlation of UNHS with early cochlear implantation is significant. There was a statistical difference between the 2 groups (P < 0.01). For the financial burden, the OR was 3.521 (strong correlation), and there was a statistical difference between the 2 groups (P < 0.01). For communication barriers and community location, the OR was 0.566 and 1.128, respectively, and there was no statistical difference between the 2 groups (P > 0.05). The multivariate analysis indicated that UNHS and financial burden are statistically different between the eastern and western regions (P = 0.00 and 0.040, respectively). UNHS and financial burden are statistically different between the eastern and western regions, and UNHS should be reinforced in the western region. In addition, the government and society should provide strong policy and more financial support in the western region of China. Innovation in the management system would also help early cochlear implantation.
Macfarlane, Sarah B.
2005-01-01
Efforts to strengthen health information systems in low- and middle-income countries should include forging links with systems in other social and economic sectors. Governments are seeking comprehensive socioeconomic data on the basis of which to implement strategies for poverty reduction and to monitor achievement of the Millennium Development Goals. The health sector is looking to take action on the social factors that determine health outcomes. But there are duplications and inconsistencies between sectors in the collection, reporting, storage and analysis of socioeconomic data. National offices of statistics give higher priority to collection and analysis of economic than to social statistics. The Report of the Commission for Africa has estimated that an additional US$ 60 million a year is needed to improve systems to collect and analyse statistics in Africa. Some donors recognize that such systems have been weakened by numerous international demands for indicators, and have pledged support for national initiatives to strengthen statistical systems, as well as sectoral information systems such as those in health and education. Many governments are working to coordinate information systems to monitor and evaluate poverty reduction strategies. There is therefore an opportunity for the health sector to collaborate with other sectors to lever international resources to rationalize definition and measurement of indicators common to several sectors; streamline the content, frequency and timing of household surveys; and harmonize national and subnational databases that store socioeconomic data. Without long-term commitment to improve training and build career structures for statisticians and information technicians working in the health and other sectors, improvements in information and statistical systems cannot be sustained. PMID:16184278
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Bassari, Jinous; Triantafyllopoulos, Spiros
1984-01-01
The University of Southwestern Louisiana (USL) NASA PC R and D statistical analysis support package is designed to be a three-level package to allow statistical analysis for a variety of applications within the USL Data Base Management System (DBMS) contract work. The design addresses usage of the statistical facilities as a library package, as an interactive statistical analysis system, and as a batch processing package.
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018. The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean
Analysis of conditional genetic effects and variance components in developmental genetics.
Zhu, J
1995-12-01
A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects.
Analysis of Conditional Genetic Effects and Variance Components in Developmental Genetics
Zhu, J.
1995-01-01
A genetic model with additive-dominance effects and genotype X environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at previous time (t - 1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects. PMID:8601500
Theory of Financial Risk and Derivative Pricing
NASA Astrophysics Data System (ADS)
Bouchaud, Jean-Philippe; Potters, Marc
2009-01-01
Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.
Theory of Financial Risk and Derivative Pricing - 2nd Edition
NASA Astrophysics Data System (ADS)
Bouchaud, Jean-Philippe; Potters, Marc
2003-12-01
Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.
Level statistics of words: Finding keywords in literary texts and symbolic sequences
NASA Astrophysics Data System (ADS)
Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.
2009-03-01
Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
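The clustering idea in this abstract can be sketched by looking at the spacings between successive occurrences of a word: a clustered word has many short gaps and a few very long ones, so the spread of its gaps is large relative to their mean. The sketch below uses the ratio of standard deviation to mean of the gaps as the score. This is a simplified stand-in, and the authors' exact statistic and normalization may differ; the function name is hypothetical.

```python
import statistics


def spacing_sigma(tokens, word):
    """Standard deviation of the gaps between occurrences of `word`,
    divided by the mean gap.

    Returns None when the word occurs fewer than three times, because
    fewer than two gaps give no meaningful spread.
    """
    positions = [i for i, token in enumerate(tokens) if token == word]
    gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return None
    return statistics.pstdev(gaps) / statistics.mean(gaps)
```

Perfectly regular spacing gives a score of zero, while a word that appears in tight bursts separated by long stretches gives a score well above one. Ranking the words of a single text by this score needs no reference corpus, which matches the property the abstract emphasizes.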
Hickey, Graeme L; Blackstone, Eugene H
2016-08-01
Clinical risk-prediction models serve an important role in healthcare. They are used for clinical decision-making and measuring the performance of healthcare providers. To establish confidence in a model, external model validation is imperative. When designing such an external model validation study, thought must be given to patient selection, risk factor and outcome definitions, missing data, and the transparent reporting of the analysis. In addition, there are a number of statistical methods available for external model validation. Execution of a rigorous external validation study rests in proper study design, application of suitable statistical methods, and transparent reporting. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
Mihic, Marko M; Todorovic, Marija Lj; Obradovic, Vladimir Lj; Mitrovic, Zorica M
2016-01-01
Social services aimed at the elderly are facing great challenges caused by progressive aging of the global population but also by the constant pressure to spend funds in a rational manner. This paper focuses on analyzing the investments into human resources aimed at enhancing home care for the elderly since many countries have recorded progress in the area over the past years. The goal of this paper is to stress the significance of performing an economic analysis of the investment. This paper combines statistical analysis methods such as correlation and regression analysis, methods of economic analysis, and scenario method. The economic analysis of investing in human resources for home care service in Serbia showed that both scenarios, investing in either additional home care hours or more beneficiaries, are cost-efficient. However, the optimal solution with the positive (and the highest) value of economic net present value criterion is to invest in human resources to boost the number of home care hours from 6 to 8 hours per week and increase the number of beneficiaries to 33%. This paper shows how the statistical and economic analysis results can be used to evaluate different scenarios and enable quality decision-making based on exact data in order to improve health and quality of life of the elderly and spend funds in a rational manner.
Characterization of Low-Molecular-Weight Heparins by Strong Anion-Exchange Chromatography.
Sadowski, Radosław; Gadzała-Kopciuch, Renata; Kowalkowski, Tomasz; Widomski, Paweł; Jujeczka, Ludwik; Buszewski, Bogusław
2017-11-01
Currently, detailed structural characterization of low-molecular-weight heparin (LMWH) products is an analytical subject of great interest. In this work, we carried out a comprehensive structural analysis of LMWHs and applied a modified pharmacopeial method, as well as methods developed by other researchers, to the analysis of novel biosimilar LMWH products; and, for the first time, compared the qualitative and quantitative composition of commercially available drugs (enoxaparin, nadroparin, and dalteparin). For this purpose, we used strong anion-exchange (SAX) chromatography with spectrophotometric detection because this method is more helpful, easier, and faster than other separation techniques for the detailed disaccharide analysis of new LMWH drugs. In addition, we subjected the obtained results to statistical analysis (factor analysis, t-test, and Newman-Keuls post hoc test).
Analysis of BSRT Profiles in the LHC at Injection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fitterer, M.; Stancari, G.; Papadopoulou, S.
The beam synchrotron radiation telescope (BSRT) at the LHC makes it possible to take profiles of the transverse beam distribution, which can provide useful additional insight into the evolution of the transverse beam distribution. A Python class has been developed [1] that reads in the BSRT profiles, usually stored in binary format, runs different analysis tools, and generates plots of the statistical parameters and profiles as well as videos of the profiles. The detailed analysis will be described in this note. The analysis is based on the data obtained at injection energy (450 GeV) during MD1217 [2] and MD1415 [3], which will also be used as an illustrative example. A similar approach is also taken with a MATLAB-based analysis described in [4].
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kelly, Elizabeth J.; Daugherty, William L.; Hackney, Elizabeth R.
During surveillance of the 9975 shipping package at the Savannah River Site K-Area Complex, several package dimensions are recorded. The analysis described in this report shows that, based on the current data analysis, two of these measurements, Upper Assembly Outer Diameter (UAOD) and Upper Assembly Inside Height (UAIH), do not have statistically significant aging trends regardless of wattage levels. In contrast, this analysis indicates that the measurement of Air Shield Gap (ASGap) does show a significant increase with age. It appears that the increase is greater for high wattage containers, but this result is dominated by two measurements from high-wattage containers. For all three indicators, additional high-wattage, older containers need to be examined before any definitive conclusions can be reached. In addition, the current analysis indicates that ASGap measurements for low and medium wattage containers are increasing slowly over time. To reduce uncertainties and better capture the aging trend for these containers, additional low and medium wattage older containers should also be examined. Based on this analysis, surveillance guidance is to augment surveillance containers resulting from 3013 surveillance with 9975-focused sampling that targets older, high wattage containers and also includes some older, low and medium wattage containers. This focused sampling began in 2015 and will continue in 2016. The UAOD, UAIH and ASGap data are highly variable. It is possible that additional factors such as seasonal variation and packaging site location might reduce variability and be useful for focusing surveillance and predicting aging.
Bayesian analyses of time-interval data for environmental radiation monitoring.
Luo, Peng; Sharp, Julia L; DeVol, Timothy A
2013-01-01
Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and a conventional frequentist analysis of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a similar detection probability as Bayesian analysis of count information, but the authors were able to make a decision with fewer pulses at relatively higher radiation levels. In addition, for the cases with very short presence of the source (< count time), time-interval information is more sensitive to detect a change than count information since the source data is averaged by the background data over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
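Why time-interval data can support a decision after only a few pulses can be illustrated with a conjugate update. For a Poisson source the intervals between pulses are exponentially distributed, so a Gamma prior on the count rate stays Gamma after each observed interval. The sketch below assumes such a Gamma(shape, rate) prior; the abstract does not describe the authors' prior or decision rule, and their algorithms were written in R, so this is an illustration of the principle rather than their method. Function names are hypothetical.

```python
def update_rate_posterior(shape, rate, intervals):
    """Update a Gamma(shape, rate) prior on the pulse rate.

    Each observed interval is treated as an exponential waiting time,
    so the posterior is Gamma(shape + n, rate + sum(intervals)).
    """
    return shape + len(intervals), rate + sum(intervals)


def posterior_mean(shape, rate):
    """Posterior mean of the pulse rate (pulses per unit time)."""
    return shape / rate
```

Because the posterior is refreshed after every pulse rather than at the end of a fixed count time, a burst of short intervals moves the estimate immediately, which is consistent with the abstract's remark about sources present for less than one count time.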
XU, Chen; CHEN, Shiwen; YUAN, Lutao; JING, Yao
2016-01-01
There is controversy among neurosurgeons regarding whether irrigation or drainage is necessary for achieving a lower revision rate for the treatment of chronic subdural hematoma (CSDH) using burr-hole craniostomy (BHC). Therefore, we performed a meta-analysis of all available published reports. Multiple electronic health databases were searched to identify all studies published between 1989 and June 2012 that compared irrigation and drainage. Data were processed by using Review Manager 5.1.6. Effect sizes are expressed as pooled odds ratio (OR) estimates. Due to heterogeneity between studies, we used the random effect of the inverse variance weighted method to perform the meta-analysis. Thirteen published reports were selected for this meta-analysis. The comprehensive results indicated that there were no statistically significant differences in mortality or complication rates between drainage and no drainage (P > 0.05). Additionally, there were no differences in recurrence between irrigation and no irrigation (P > 0.05). However, the difference between drainage and no drainage in recurrence rate reached statistical significance (P < 0.01). The results from this meta-analysis suggest that burr-hole surgery with closed-system drainage can reduce the recurrence of CSDH; however, irrigation is not necessary for every patient. PMID:26377830
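The pooling step described in this abstract, a random-effects, inverse-variance-weighted odds ratio, can be sketched as below. The sketch assumes the DerSimonian-Laird estimate of the between-study variance, a common choice for this kind of analysis; the abstract does not state the estimator, and the study counts in the usage note are invented for illustration. It also assumes no zero cells in any 2x2 table (no continuity correction is applied).

```python
import math


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its variance from one study's 2x2 counts."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_odds_ratio(studies):
    """Random-effects pooled OR with a 95% confidence interval.

    `studies` is a list of (events_t, n_t, events_c, n_c) tuples.
    Returns (pooled_or, ci_low, ci_high, tau_squared).
    """
    effects = [log_odds_ratio(*study) for study in studies]
    y = [effect for effect, _ in effects]
    v = [variance for _, variance in effects]
    w = [1 / vi for vi in v]  # fixed-effect inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    scale = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / scale) if scale > 0 else 0.0
    w_star = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), tau2)
```

For example, `pooled_odds_ratio([(10, 100, 20, 100), (8, 90, 15, 95)])` pools two hypothetical studies. When the studies agree closely the between-study variance is estimated as zero and the result coincides with the fixed-effect estimate.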
Liu, Zechang; Wang, Liping; Liu, Yumei
2018-01-18
Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.
Combined proportional and additive residual error models in population pharmacokinetic modelling.
Proost, Johannes H
2017-11-15
In pharmacokinetic modelling, a combined proportional and additive residual error model is often preferred over a proportional or additive residual error model. Different approaches have been proposed, but a comparison between approaches is still lacking. The theoretical background of the methods is described. Method VAR assumes that the variance of the residual error is the sum of the statistically independent proportional and additive components; this method can be coded in three ways. Method SD assumes that the standard deviation of the residual error is the sum of the proportional and additive components. Using datasets from literature and simulations based on these datasets, the methods are compared using NONMEM. The different codings of method VAR yield identical results. Using method SD, the values of the parameters describing residual error are lower than for method VAR, but the values of the structural parameters and their inter-individual variability are hardly affected by the choice of the method. Both methods are valid approaches in combined proportional and additive residual error modelling, and selection may be based on OFV. When the result of an analysis is used for simulation purposes, it is essential that the simulation tool uses the same method as used during analysis.
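The difference between the two methods can be written out directly. In the sketch below, `pred` is the model-predicted concentration, `prop` the proportional coefficient, and `add` the additive standard deviation; these names are assumptions for illustration, and the code is plain Python rather than NONMEM control-stream syntax.

```python
import math


def residual_sd_var(pred, prop, add):
    """Method VAR: variances of the two components add.

    SD = sqrt((prop * pred)^2 + add^2)
    """
    return math.sqrt((prop * pred) ** 2 + add ** 2)


def residual_sd_sd(pred, prop, add):
    """Method SD: standard deviations of the two components add.

    SD = prop * pred + add
    """
    return prop * pred + add
```

For the same non-negative parameter values, method SD always implies a residual standard deviation at least as large as method VAR. That plausibly explains why the fitted error parameters come out lower under method SD, and it shows why a simulation must use the same formula as the analysis that produced the estimates.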
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
This document comprises Pacific Northwest National Laboratory's report for Fiscal Year 1996 on research and development programs. The document contains 161 project summaries in 16 areas of research and development. The 16 areas of research and development reported on are: atmospheric sciences, biotechnology, chemical instrumentation and analysis, computer and information science, ecological science, electronics and sensors, health protection and dosimetry, hydrological and geologic sciences, marine sciences, materials science and engineering, molecular science, process science and engineering, risk and safety analysis, socio-technical systems analysis, statistics and applied mathematics, and thermal and energy systems. In addition, this report provides an overview of the research and development program, program management, program funding, and Fiscal Year 1997 projects.
Analysis of long-term ionizing radiation effects in bipolar transistors
NASA Technical Reports Server (NTRS)
Stanley, A. G.; Martin, K. E.
1978-01-01
The ionizing radiation effects of electrons on bipolar transistors have been analyzed using the database from the Voyager project. The data were subjected to statistical analysis, leading to a quantitative characterization of the product and to data on confidence limits which will be useful for circuit design purposes. These newly developed methods may form the basis for a radiation hardness assurance system. In addition, an attempt was made to identify the causes of the large variations in the sensitivity observed on different product lines. This included a limited construction analysis and a determination of significant design and process variables, as well as suggested remedies for improving the tolerance of the devices to radiation.
2015-08-01
the nine questions. The Statistical Package for the Social Sciences (SPSS) [11] was used to conduct statistical analysis on the sample. Two types...constructs. SPSS was again used to conduct statistical analysis on the sample. This time factor analysis was conducted. Factor analysis attempts to...Business Research Methods and Statistics using SPSS. P432. 11 IBM SPSS Statistics. (2012) 12 Burns, R.B., Burns, R.A. (2008) ‘Business Research
An instrument to assess the statistical intensity of medical research papers.
Nieminen, Pentti; Virtanen, Jorma I; Vähänikkilä, Hannu
2017-01-01
There is widespread evidence that statistical methods play an important role in original research articles, especially in medical research. The evaluation of statistical methods and reporting in journals suffers from a lack of standardized methods for assessing the use of statistics. The objective of this study was to develop and evaluate an instrument to assess the statistical intensity in research articles in a standardized way. A checklist-type measure scale was developed by selecting and refining items from previous reports about the statistical contents of medical journal articles and from published guidelines for statistical reporting. A total of 840 original medical research articles that were published between 2007 and 2015 in 16 journals were evaluated to test the scoring instrument. The total sum of all items was used to assess the intensity between sub-fields and journals. Inter-rater agreement was examined using a random sample of 40 articles. Four raters read and evaluated the selected articles using the developed instrument. The scale consisted of 66 items. The total summary score adequately discriminated between research articles according to their study design characteristics. The new instrument could also discriminate between journals according to their statistical intensity. The inter-observer agreement measured by the intraclass correlation coefficient (ICC) was 0.88 between all four raters. Individual item analysis showed very high agreement between the rater pairs; the percentage agreement ranged from 91.7% to 95.2%. A reliable and applicable instrument for evaluating the statistical intensity in research papers was developed. It is a helpful tool for comparing the statistical intensity between sub-fields and journals. The novel instrument may be applied in manuscript peer review to identify papers in need of additional statistical review.
2017 Annual Disability Statistics Supplement
ERIC Educational Resources Information Center
Lauer, E. A; Houtenville, A. J.
2018-01-01
The "Annual Disability Statistics Supplement" is a companion report to the "Annual Disability Statistics Compendium." The "Supplement" presents statistics on the same topics as the "Compendium," with additional categorizations by demographic characteristics including age, gender and race/ethnicity. In…
A quantitative study of nanoparticle skin penetration with interactive segmentation.
Lee, Onseok; Lee, See Hyun; Jeong, Sang Hoon; Kim, Jaeyoung; Ryu, Hwa Jung; Oh, Chilhwan; Son, Sang Wook
2016-10-01
In the last decade, the application of nanotechnology techniques has expanded within diverse areas such as pharmacology, medicine, and optical science. Despite such wide-ranging possibilities for implementation into practice, the mechanisms behind nanoparticle skin absorption remain unknown. Moreover, the main mode of investigation has been qualitative analysis. Using interactive segmentation, this study suggests a method of objectively and quantitatively analyzing the mechanisms underlying the skin absorption of nanoparticles. Silica nanoparticles (SNPs) were assessed using transmission electron microscopy and applied to the human skin equivalent model. Captured fluorescence images of this model were used to evaluate degrees of skin penetration. These images underwent interactive segmentation and image processing in addition to statistical quantitative analyses of calculated image parameters including the mean, integrated density, skewness, kurtosis, and area fraction. In images from both groups, the distribution area and intensity of fluorescent silica gradually increased in proportion to time. Since statistical significance was achieved after 2 days in the negative charge group and after 4 days in the positive charge group, there is a periodic difference. Furthermore, the quantity of silica per unit area showed a dramatic change after 6 days in the negative charge group. Although this quantitative result is identical to results obtained by qualitative assessment, it is meaningful in that it was proven by statistical analysis with quantitation by using image processing. The present study suggests that the surface charge of SNPs could play an important role in the percutaneous absorption of NPs. These findings can help achieve a better understanding of the percutaneous transport of NPs. In addition, these results provide important guidance for the design of NPs for biomedical applications.
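The image parameters named in the abstract above (mean, integrated density, skewness, kurtosis, and area fraction) can be computed from a pixel grid as in the sketch below. The definitions are assumptions chosen for illustration, since the abstract does not give them: integrated density is taken as the sum of pixel values, skewness and kurtosis as the standardized third and fourth population moments (kurtosis not reduced by 3), and area fraction as the share of pixels above a hypothetical `threshold`.

```python
import math


def image_parameters(pixels, threshold=0.0):
    """Summary parameters of a 2D grid of intensities (list of rows)."""
    values = [float(v) for row in pixels for v in row]
    n = len(values)
    if n == 0:
        raise ValueError("image has no pixels")
    mean = sum(values) / n
    # Central moments (population form, dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "integrated_density": sum(values),
        "skewness": m3 / sd ** 3 if sd > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "area_fraction": sum(1 for v in values if v > threshold) / n,
    }
```

Computing these per time point and per charge group would give the kind of table on which the reported significance tests could then be run.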
NASA Technical Reports Server (NTRS)
1980-01-01
The MATHPAC image-analysis library is a collection of general-purpose mathematical and statistical routines and special-purpose data-analysis and pattern-recognition routines for image analysis. The MATHPAC library consists of Linear Algebra, Optimization, Statistical-Summary, Densities and Distribution, Regression, and Statistical-Test packages.
Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V
2018-04-01
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
Fast and accurate imputation of summary statistics enhances evidence of functional enrichment
Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.
2014-01-01
Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. 
We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
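The core idea of Gaussian imputation from summary statistics can be sketched as a conditional expectation: the z-score at an untyped SNP is predicted from the z-scores at typed SNPs using linkage disequilibrium (LD) correlations from a reference panel. The sketch below is an assumption-laden illustration rather than the authors' released software: the function names are hypothetical, and the `ridge` term added to the diagonal is only a simple stand-in for the adjustment for limited reference-panel size that the abstract mentions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a list of rows)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def impute_z(ld_it, ld_tt, z_typed, ridge=0.1):
    """Conditional mean of the untyped z-score given typed z-scores:
    z_i = ld_it . (ld_tt + ridge * I)^-1 . z_typed

    ld_it   -- LD correlations between the untyped SNP and each typed SNP
    ld_tt   -- LD correlation matrix among the typed SNPs
    z_typed -- association z-scores at the typed SNPs
    """
    n = len(z_typed)
    reg = [[ld_tt[i][j] + (ridge if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    weights = solve(reg, z_typed)
    return sum(ld_it[j] * weights[j] for j in range(n))
```

Because only LD correlations and z-scores enter the calculation, no individual-level genotypes are needed, which is the property that distinguishes this approach from the HMM-based methods discussed in the abstract.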
Optimization of Multilocus Sequence Analysis for Identification of Species in the Genus Vibrio
Gabriel, Michael W.; Matsui, George Y.; Friedman, Robert
2014-01-01
Multilocus sequence analysis (MLSA) is an important method for identification of taxa that are not well differentiated by 16S rRNA gene sequences alone. In this procedure, concatenated sequences of selected genes are constructed and then analyzed. The effects that the number and the order of genes used in MLSA have on reconstruction of phylogenetic relationships were examined. The recA, rpoA, gapA, 16S rRNA gene, gyrB, and ftsZ sequences from 56 species of the genus Vibrio were used to construct molecular phylogenies, and these were evaluated individually and using various gene combinations. Phylogenies from two-gene sequences employing recA and rpoA in both possible gene orders were different. The addition of the gapA gene sequence, producing all six possible concatenated sequences, reduced the differences in phylogenies to degrees of statistical (bootstrap) support for some nodes. The overall statistical support for the phylogenetic tree, assayed on the basis of a reliability score (calculated from the number of nodes having bootstrap values of ≥80 divided by the total number of nodes) increased with increasing numbers of genes used, up to a maximum of four. No further improvement was observed from addition of the fifth gene sequence (ftsZ), and addition of the sixth gene (gyrB) resulted in lower proportions of strongly supported nodes. Reductions in the numbers of strongly supported nodes were also observed when maximum parsimony was employed for tree construction. Use of a small number of gene sequences in MLSA resulted in accurate identification of Vibrio species. PMID:24951781
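The reliability score defined in the abstract above is the number of nodes with bootstrap values of at least 80 divided by the total number of nodes. A minimal sketch follows; the function name and the list-of-numbers input format are assumptions made for illustration.

```python
def reliability_score(bootstrap_values, cutoff=80):
    """Fraction of tree nodes whose bootstrap support is >= cutoff."""
    if not bootstrap_values:
        raise ValueError("tree has no nodes with bootstrap values")
    strong = sum(1 for b in bootstrap_values if b >= cutoff)
    return strong / len(bootstrap_values)
```

Comparing this score across trees built from two, three, four, five, and six concatenated genes reproduces the kind of comparison the study reports.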
Pitoia, Fabián; Jerkovich, Fernando; Smulever, Anabella; Brenta, Gabriela; Bueno, Fernanda; Cross, Graciela
2017-01-01
Objective: To evaluate the influence of age at diagnosis on the frequency of structural incomplete response (SIR) according to the modified risk of recurrence (RR) staging system from the American Thyroid Association guidelines. Patients and Methods: We performed a retrospective analysis of 268 patients with differentiated thyroid cancer (DTC) followed up for at least 3 years after initial treatment (total thyroidectomy and remnant ablation). The median follow-up in the whole cohort was 74.3 months (range: 36.1-317.9) and the median age at diagnosis was 45.9 years (range: 18-87). The association between age at diagnosis and the initial and final response to treatment was assessed with analysis of variance (ANOVA). Patients were also divided into several groups considering age younger and older than 40, 50, and 60 years. Results: Age at diagnosis was not significantly associated with either the initial or the final SIR to treatment (p = 0.14 and p = 0.58, respectively). Additionally, we did not find any statistically significant differences when the percentages of SIR within each RR category were compared between groups of patients defined by several age cutoffs. Conclusions: When patients are correctly risk stratified, age at diagnosis does not appear to influence the frequency of SIR at the initial evaluation or at the final follow-up, so it should not be included as an additional variable in the RR classifications. PMID:28785543
Trimetazidine improves exercise tolerance in patients with ischemic heart disease : A meta-analysis.
Zhao, Y; Peng, L; Luo, Y; Li, S; Zheng, Z; Dong, R; Zhu, J; Liu, J
2016-09-01
This study aimed to evaluate the effect of trimetazidine (TMZ) in addition to standard treatment on exercise tolerance in patients with ischemic heart disease (IHD). Studies were identified via a systematic search of PubMed, Embase, Cochrane Library, and the Chinese CNKI databases from January 1978 to January 2015. Data extraction, synthesis, and statistical analysis were performed by standard meta-analysis methods. Random or fixed effects models were used to estimate pooled mean differences in total exercise duration (TED), peak oxygen uptake (pVO2), metabolic equivalent system (METS), and 6-minute walking test (6-MWT). In all, 16 randomized controlled trials (RCTs) consisting of 2,004 participants were included. Pooled results showed that TMZ treatment significantly improved TED (WMD: 37.35, 95 % CI: 25.58-49.13, p < 0.00001), pVO2 (WMD: 2.41, 95 % CI: 1.76-3.06, p < 0.00001), METS (WMD: 1.33, 95 % CI: 0.38-2.28, p = 0.006), and 6-MWT (WMD: 62.46, 95 % CI: 35.86-89.05, p < 0.001) in all patients with IHD. Subgroup analysis showed that TMZ significantly increased TED in nondiabetic participants (WMD: 34.77, 95 % CI: 22.28-47.25, p < 0.001), but not in diabetic participants (WMD: 40.36, 95 % CI: -18.76-99.48, p = 0.18). Subgroup analysis of TED by intervention duration suggested that there is no statistically significant difference between the 3-month and 6-month periods (WMD: 35.47, 95 % CI: 18.35-52.60, p < 0.0001 and WMD: 49.94, 95 % CI: 44.69-55.19, p < 0.00001). In addition, TMZ improved TED in IHD patients with and without heart failure (HF) (WMD: 50.01, 95 % CI: 44.77-55.25 and WMD: 24.20, 95 % CI: 12.72-35.68, respectively). Addition of TMZ to standard treatment significantly improved exercise tolerance in patients with IHD, and IHD patients with HF may experience even more benefits. However, there is insufficient evidence to show that TMZ has beneficial effects in participants with diabetes.
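The pooled weighted mean differences (WMD) reported above come from standard meta-analysis methods. The fixed-effect, inverse-variance version can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, standard errors are recovered from symmetric 95 % confidence intervals, and no random-effects adjustment is applied.

```python
import math

Z95 = 1.959964  # two-sided 95 % normal quantile


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95 % confidence interval."""
    return (upper - lower) / (2 * Z95)


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted mean difference with its SE and 95 % CI."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - Z95 * pooled_se, pooled + Z95 * pooled_se)
    return pooled, pooled_se, ci
```

Studies with narrower confidence intervals receive larger weights, which is why a single precise trial can dominate a pooled estimate.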
Fang, H; Han, M; Li, Q-L; Cao, C Y; Xia, R; Zhang, Z-H
2016-08-01
Scaling and root planing are widely considered as effective methods for treating chronic periodontitis. A meta-analysis published in 2008 showed no statistically significant differences between full-mouth disinfection (FMD) or full-mouth scaling and root planing (FMS) and quadrant scaling and root planing (Q-SRP). The FMD approach only resulted in modest additional improvements in several indices. Whether differences exist between these two approaches requires further validation. Accordingly, a study was conducted to further validate whether FMD with antiseptics or FMS without the use of antiseptics within 24 h provides greater clinical improvement than Q-SRP in patients with chronic periodontitis. Medline (via OVID), EMBASE (via OVID), PubMed and CENTRAL databases were searched up to 27 January 2015. Randomized controlled trials comparing FMD or FMS with Q-SRP after at least 3 mo were included. Meta-analysis was performed to obtain the weighted mean difference (WMD), together with the corresponding 95% confidence intervals. Thirteen articles were included in the meta-analysis. The WMD of probing pocket depth reduction was 0.25 mm (p < 0.05) for FMD vs. Q-SRP in single-rooted teeth with moderate pockets, and clinical attachment level gain in single- and multirooted teeth with moderate pockets was 0.33 mm (p < 0.05) for FMD vs. Q-SRP. Except for those, no statistically significant differences were found in the other subanalyses of FMD vs. Q-SRP, FMS vs. Q-SRP and FMD vs. FMS. Therefore, the meta-analysis results showed that FMD was better than Q-SRP for achieving probing pocket depth reduction and clinical attachment level gain in moderate pockets. Additionally, regardless of the treatment, no serious complications were observed. FMD, FMS and Q-SRP are all effective for the treatment of adult chronic periodontitis, and they do not lead to any obvious discomfort among patients. 
Moreover, FMD had modest additional clinical benefits over Q-SRP, so we prefer to recommend FMD as the first choice for the treatment of adult chronic periodontitis. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
SU-F-I-10: Spatially Local Statistics for Adaptive Image Filtering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iliopoulos, AS; Sun, X; Floros, D
Purpose: To facilitate adaptive image filtering operations, addressing spatial variations in both noise and signal. Such issues are prevalent in cone-beam projections, where physical effects such as X-ray scattering result in spatially variant noise, violating common assumptions of homogeneous noise and challenging conventional filtering approaches to signal extraction and noise suppression. Methods: We present a computational mechanism for probing into and quantifying the spatial variance of noise throughout an image. The mechanism builds a pyramid of local statistics at multiple spatial scales; local statistical information at each scale includes (weighted) mean, median, standard deviation, median absolute deviation, as well as histogram or dynamic range after local mean/median shifting. Based on inter-scale differences of local statistics, the spatial scope of distinguishable noise variation is detected in a semi- or un-supervised manner. Additionally, we propose and demonstrate the incorporation of such information in globally parametrized (i.e., non-adaptive) filters, effectively transforming the latter into spatially adaptive filters. The multi-scale mechanism is materialized by efficient algorithms and implemented in parallel CPU/GPU architectures. Results: We demonstrate the impact of local statistics for adaptive image processing and analysis using cone-beam projections of a Catphan phantom, fitted within an annulus to increase X-ray scattering. The effective spatial scope of local statistics calculations is shown to vary throughout the image domain, necessitating multi-scale noise and signal structure analysis. Filtering results with and without spatial filter adaptation are compared visually, illustrating improvements in imaging signal extraction and noise suppression, and in preserving information in low-contrast regions. 
Conclusion: Local image statistics can be incorporated in filtering operations to equip them with spatial adaptivity to spatial signal/noise variations. An efficient multi-scale computational mechanism is developed to curtail processing latency. Spatially adaptive filtering may impact subsequent processing tasks such as reconstruction and numerical gradient computations for deformable registration. NIH Grant No. R01-184173.
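The pyramid of local statistics described in the Methods can be illustrated for a single pixel as below. This is a simplified sketch under stated assumptions: it uses unweighted square windows clipped at the image border, the function names and the `radii` choices are hypothetical, and the weighted statistics, histograms, and parallel CPU/GPU implementation mentioned in the abstract are not reproduced.

```python
import statistics


def local_stats(image, row, col, radius):
    """Mean, median, SD and median absolute deviation (MAD) of the
    square window of the given radius centred on (row, col)."""
    height, width = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    median = statistics.median(window)
    return {
        "mean": statistics.mean(window),
        "median": median,
        "sd": statistics.pstdev(window),
        "mad": statistics.median([abs(v - median) for v in window]),
    }


def stats_pyramid(image, row, col, radii=(1, 2, 4)):
    """Local statistics at several spatial scales for one pixel."""
    return {radius: local_stats(image, row, col, radius) for radius in radii}
```

Differences between the entries at successive radii are the kind of inter-scale information from which a spatially varying filter parameter could be derived.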
Automated Tracking of Cell Migration with Rapid Data Analysis.
DuChez, Brian J
2017-09-01
Cell migration is essential for many biological processes including development, wound healing, and metastasis. However, studying cell migration often requires the time-consuming and labor-intensive task of manually tracking cells. To accelerate the task of obtaining coordinate positions of migrating cells, we have developed a graphical user interface (GUI) capable of automating the tracking of fluorescently labeled nuclei. This GUI provides an intuitive user interface that makes automated tracking accessible to researchers with no image-processing experience or familiarity with particle-tracking approaches. Using this GUI, users can interactively determine a minimum of four parameters to identify fluorescently labeled cells and automate acquisition of cell trajectories. Additional features allow for batch processing of numerous time-lapse images, curation of unwanted tracks, and subsequent statistical analysis of tracked cells. Statistical outputs allow users to evaluate migratory phenotypes, including cell speed, distance, displacement, and persistence, as well as measures of directional movement, such as forward migration index (FMI) and angular displacement. © 2017 by John Wiley & Sons, Inc.
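The migratory statistics listed above can be computed from a single cell trajectory as in the following sketch. The definitions are common conventions assumed here for illustration, not taken from the GUI itself: persistence is net displacement divided by path length, and FMI is the displacement along a chosen axis divided by path length. The function name, the `dt` frame interval, and the `axis` direction are hypothetical parameters.

```python
import math


def migration_metrics(track, dt=1.0, axis=(0.0, 1.0)):
    """Summary metrics for one track given as a list of (x, y) positions
    sampled every dt time units; axis is the direction used for FMI."""
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    distance = sum(steps)  # total path length
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    displacement = math.hypot(dx, dy)  # straight-line start-to-end distance
    # Net movement projected onto the (normalised) axis of interest.
    forward = (dx * axis[0] + dy * axis[1]) / math.hypot(axis[0], axis[1])
    return {
        "distance": distance,
        "displacement": displacement,
        "speed": distance / (dt * (len(track) - 1)),
        "persistence": displacement / distance if distance else 0.0,
        "fmi": forward / distance if distance else 0.0,
    }
```

A cell moving in a straight line along the axis has persistence and FMI both equal to 1, while a cell wandering back to its starting point has values near 0.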
Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta
2017-04-01
Background: Clinical datasets for epithelial ovarian cancer brain metastatic patients are usually small in size. When adequate case numbers are lacking, resulting estimates of regression coefficients may demonstrate bias. One of the direct approaches to reduce such sparse-data bias is based on penalized estimation. Methods: A re-analysis of formerly reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package providing additional software codes for a statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weak informative priors and data augmentation are the safest approaches to shrink sparse data artefacts frequently occurring in epidemiological research. Creative Commons Attribution License
Increased frequencies of aberrant sperm as indicators of mutagenic damage in mice.
Soares, E R; Sheridan, W; Haseman, J K; Segall, M
1979-02-01
We have tested the effects of TEM in 3 strains of mice using the sperm morphology assay. In addition, we have made an attempt to evaluate this test system with respect to experimental design, statistical problems and possible interlaboratory differences. Treatment with TEM results in significant increases in the percent of abnormally shaped sperm. These increases are readily detectable in sperm treated at the spermatocyte and spermatogonial stages. Our data indicate possible problems associated with inter-laboratory variation in slide analysis. We have found that despite the introduction of such sources of variation, our data were consistent with respect to the effects of TEM. Another area of concern in the sperm morphology test is the presence of "outlier" animals. In our study, such animals comprised 4% of the total number of animals considered. Statistical analysis of the slides from these animals has shown that this problem can be dealt with and that, when recognized as such, "outliers" do not affect the outcome of the sperm morphology assay.
NASA Astrophysics Data System (ADS)
Ohsawa, A.
2011-06-01
This paper presents a statistical analysis of 153 accidents attributable to static electricity in Japanese industry over the last 50 years. A more thorough understanding of their causes could help prevent similar incidents and identify hazards that could assist in the task of risk assessment. Most of the incidents occurred during operations performed by workers. In addition, more than 70% of the flammable atmospheres resulted from the presence of vapours. A noteworthy finding is that at least 70% of the ignitions were caused by isolated conductors including operators' bodies leading to spark discharges, which could have easily been prevented with earthing. These tendencies indicate that, when operators handle flammable liquids with any conductors, the ignition risk is significantly high. A serious lack of information regarding fundamental countermeasures for static electricity seems to be the main cause of such hazards. Only organised management, including education and risk communication, would prevent them.
Mallette, Jennifer R.; Casale, John F.; Jordan, James; Morello, David R.; Beyer, Paul M.
2016-01-01
Previously, geo-sourcing to five major coca growing regions within South America was accomplished. However, the expansion of coca cultivation throughout South America made sub-regional origin determinations increasingly difficult. The former methodology was recently enhanced with additional stable isotope analyses (2H and 18O) to fully characterize cocaine due to the varying environmental conditions in which the coca was grown. An improved data analysis method was implemented with the combination of machine learning and multivariate statistical analysis methods to provide further partitioning between growing regions. Here, we show how the combination of trace cocaine alkaloids, stable isotopes, and multivariate statistical analyses can be used to classify illicit cocaine as originating from one of 19 growing regions within South America. The data obtained through this approach can be used to describe current coca cultivation and production trends, highlight trafficking routes, as well as identify new coca growing regions. PMID:27006288
Striking changes in tea metabolites due to elevational effects.
Kfoury, Nicole; Morimoto, Joshua; Kern, Amanda; Scott, Eric R; Orians, Colin M; Ahmed, Selena; Griffin, Timothy; Cash, Sean B; Stepp, John Richard; Xue, Dayuan; Long, Chunlin; Robbat, Albert
2018-10-30
Climate effects on crop quality at the molecular level are not well understood. Gas and liquid chromatography-mass spectrometry were used to measure changes of hundreds of compounds in tea at different elevations in Yunnan Province, China. Some increased in concentration while others decreased by hundreds of percent. Orthogonal projection to latent structures-discriminant analysis revealed that compounds exhibiting analgesic, antianxiety, antibacterial, anticancer, antidepressant, antifungal, anti-inflammatory, antioxidant, anti-stress, and cardioprotective properties statistically (p = 0.003) differentiated high from low elevation tea. Also, sweet, floral, honey-like notes were higher in concentration in the former while the latter displayed grassy, hay-like aroma. In addition, multivariate analysis of variance showed low elevation tea had statistically (p = 0.0062) higher concentrations of caffeine, epicatechin gallate, gallocatechin, and catechin, all bitter compounds. Although volatiles represent a small fraction of the total mass, this is the first comprehensive report illustrating how normal variations in temperature, 5 °C, due to elevational effects impact tea quality. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Craft, R.; Dunn, C.; Mccord, J.; Simeone, L.
1980-01-01
A user guide and programmer documentation are provided for a system of PRIME 400 minicomputer programs. The system was designed to support loading analyses on the Tracking Data Relay Satellite System (TDRSS). The system is a scheduler for various types of data relays (including tape recorder dumps and real time relays) from orbiting payloads to the TDRSS. Several model options are available to statistically generate data relay requirements. TDRSS time lines (representing resources available for scheduling) and payload/TDRSS acquisition and loss of sight time lines are input to the scheduler from disk. Tabulated output from the interactive system includes a summary of the scheduler activities over time intervals specified by the user and an overall summary of scheduler input and output information. A history file, which records every event generated by the scheduler, is written to disk to allow further scheduling on remaining resources and to provide data for graphic displays or additional statistical analysis.
NASA Astrophysics Data System (ADS)
Watanabe, Kenichi; Minniti, Triestino; Kockelmann, Winfried; Dalgliesh, Robert; Burca, Genoveva; Tremsin, Anton S.
2017-07-01
The uncertainties and the stability of a neutron sensitive MCP/Timepix detector when operating in the event timing mode for quantitative image analysis at a pulsed neutron source were investigated. The dominant component to the uncertainty arises from the counting statistics. The contribution of the overlap correction to the uncertainty was concluded to be negligible from considerations based on the error propagation even if a pixel occupation probability is more than 50%. We, additionally, have taken into account the multiple counting effect in consideration of the counting statistics. Furthermore, the detection efficiency of this detector system changes under relatively high neutron fluxes due to the ageing effects of current Microchannel Plates. Since this efficiency change is position-dependent, it induces a memory image. The memory effect can be significantly reduced with correction procedures using the rate equations describing the permanent gain degradation and the scrubbing effect on the inner surfaces of the MCP pores.
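Since counting statistics are reported as the dominant uncertainty, a first-order estimate for a transmission image pixel can be sketched with standard Poisson error propagation. This is a generic illustration, not the authors' procedure: it assumes uncorrelated Poisson counts in the sample and open-beam measurements and ignores the overlap correction and multiple counting effect discussed in the abstract. The function and parameter names are hypothetical.

```python
import math


def transmission_with_uncertainty(sample_counts, open_beam_counts):
    """Transmission T = N / N0 and its standard deviation, assuming
    independent Poisson counts: sigma_T / T = sqrt(1/N + 1/N0)."""
    if sample_counts <= 0 or open_beam_counts <= 0:
        raise ValueError("counts must be positive")
    transmission = sample_counts / open_beam_counts
    relative = math.sqrt(1.0 / sample_counts + 1.0 / open_beam_counts)
    return transmission, transmission * relative
```

The relative uncertainty falls with the square root of the counts, so quadrupling the acquisition time roughly halves it.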
Statistics of high-level scene context
Greene, Michelle R.
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed “things” in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. 
Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition. PMID:24194723
Modular reweighting software for statistical mechanical analysis of biased equilibrium data
NASA Astrophysics Data System (ADS)
Sindhikara, Daniel J.
2012-07-01
Here a simple, useful, modular approach and software suite designed for statistical reweighting and analysis of equilibrium ensembles is presented. Statistical reweighting is useful and sometimes necessary for analysis of equilibrium enhanced sampling methods, such as umbrella sampling or replica exchange, and also in experimental cases where biasing factors are explicitly known. Essentially, statistical reweighting allows extrapolation of data from one or more equilibrium ensembles to another. Here, the fundamental separable steps of statistical reweighting are broken up into modules, allowing for application to the general case and avoiding the black-box nature of some “all-inclusive” reweighting programs. Additionally, the programs included are, by design, written with few dependencies. The compilers required are either pre-installed on most systems or freely available for download with minimal trouble. Examples of the use of this suite applied to umbrella sampling and replica exchange molecular dynamics simulations will be shown along with advice on how to apply it in the general case.
New version program summary
Program title: Modular reweighting version 2
Catalogue identifier: AEJH_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJH_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU General Public License, version 3
No. of lines in distributed program, including test data, etc.: 179 118
No. of bytes in distributed program, including test data, etc.: 8 518 178
Distribution format: tar.gz
Programming language: C++, Python 2.6+, Perl 5+
Computer: Any
Operating system: Any
RAM: 50-500 MB
Supplementary material: An updated version of the original manuscript (Comput. Phys. Commun. 182 (2011) 2227) is available
Classification: 4.13
Catalogue identifier of previous version: AEJH_v1_0
Journal reference of previous version: Comput. Phys. Commun. 182 (2011) 2227
Does the new version supersede the previous version?: Yes
Nature of problem: While equilibrium reweighting is ubiquitous, there are no public programs available to perform the reweighting in the general case. Further, specific programs often suffer from many library dependencies and numerical instability.
Solution method: This package is written in a modular format that allows for easy applicability of reweighting in the general case. Modules are small, numerically stable, and require minimal libraries.
Reasons for new version: Some minor bugs, some upgrades needed, error analysis added. analyzeweight.py/analyzeweight.py2 has been replaced by “multihist.py”. This new program performs all the functions of its predecessor while being versatile enough to handle other types of histograms and probability analysis. “bootstrap.py” was added. This script performs basic bootstrap resampling allowing for error analysis of data. “avg_dev_distribution.py” was added. This program computes the averages and standard deviations of multiple distributions, making error analysis (e.g. from bootstrap resampling) easier to visualize. WRE.cpp was slightly modified purely for cosmetic reasons. The manual was updated for clarity and to reflect version updates. Examples were removed from the manual in favor of online tutorials (packaged examples remain). Examples were updated to reflect the new format. An additional example is included to demonstrate error analysis.
Running time: Preprocessing scripts 1-5 minutes, WHAM engine <1 minute, postprocess script ∼1-5 minutes.
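The core idea of statistical reweighting described in this record can be illustrated with a minimal sketch for the simplest case: a single biased ensemble with a known bias energy per sample. This is not code from the Modular reweighting suite; the function name `reweighted_average` and its arguments are hypothetical, and the multi-ensemble (WHAM) case handled by the package is not covered.

```python
import math


def reweighted_average(observable, bias_energy, kT):
    """Estimate the unbiased average of `observable` from biased samples.

    Sample i was drawn from an ensemble with an extra bias energy
    bias_energy[i], so it is weighted by exp(+bias_energy[i] / kT)
    to undo the bias.
    """
    exponents = [v / kT for v in bias_energy]
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(exponents)
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / total


# Hypothetical data: three sampled values and the bias energy felt at each.
values = [1.0, 2.0, 3.0]
bias = [0.0, 0.3, 1.2]
estimate = reweighted_average(values, bias, kT=0.6)
```

With all bias energies equal to zero the weights are uniform and the function reduces to the ordinary sample mean, which is a convenient sanity check.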
A note on generalized Genome Scan Meta-Analysis statistics
Koziol, James A; Feng, Anne C
2005-01-01
Background: Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results: We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results. Conclusion: Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930
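The rank-based statistic described in this record can be sketched as follows, assuming the usual convention that each study ranks its bins by evidence for linkage (strongest evidence receives the highest rank) and that the statistic for a bin is the (optionally weighted) sum of its ranks across studies. The function names and the input scores are hypothetical, and ties are not handled.

```python
def rank_bins(scores):
    """Rank bins within one study; the highest score receives the highest rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, bin_index in enumerate(order, start=1):
        ranks[bin_index] = rank
    return ranks


def gsma_statistic(study_scores, weights=None):
    """Sum the within-study ranks of each bin across studies.

    study_scores: one list of per-bin scores per study (same bin order).
    weights: optional per-study weights; equal weights if omitted.
    """
    if weights is None:
        weights = [1.0] * len(study_scores)
    totals = [0.0] * len(study_scores[0])
    for scores, weight in zip(study_scores, weights):
        for bin_index, rank in enumerate(rank_bins(scores)):
            totals[bin_index] += weight * rank
    return totals


# Hypothetical linkage scores for three bins in two genome scans.
scans = [[0.5, 2.1, 1.0], [1.2, 3.0, 0.1]]
unweighted = gsma_statistic(scans)
weighted = gsma_statistic(scans, weights=[1.0, 2.0])
```

Assessing significance would require the null distribution of these sums, which is what the Edgeworth approximation in the abstract addresses; that step is not sketched here.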
HLA-linked rheumatoid arthritis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasstedt, S.J.; Clegg, D.O.; Ingles, L.
Twenty-eight pedigrees were ascertained through pairs of first-degree relatives diagnosed with rheumatoid arthritis (RA). RA was confirmed in 77 pedigree members including probands; the absence of disease was verified in an additional 261 pedigree members. Pedigree members were serologically typed for HLA. We used likelihood analysis to statistically characterize the HLA-linked RA susceptibility locus. The genetic model assumed tight linkage to HLA. The analysis supported the existence of an HLA-linked RA susceptibility locus and estimated the lifetime penetrance as 41% in male homozygotes and as 48% in female homozygotes. Inheritance was recessive in males and was nearly recessive in females. In addition, the analysis attributed 78% of the variance within genotypes to genetic or environmental effects shared by siblings. The genetic model inferred in this analysis is consistent with previous association, linkage, and familial aggregation studies of RA. The inferred HLA-linked RA susceptibility locus accounts for approximately one-fifth of the RA in the population. Although other genes may account for the remaining familial RA, a large portion of RA cases may occur sporadically. 79 refs., 9 tabs.
NASA Astrophysics Data System (ADS)
Dhakal, N.; Jain, S.
2013-12-01
Rare and unusually large events (such as hurricanes and floods) can create unusual and interesting trends in statistics. The Generalized Extreme Value (GEV) distribution is usually used to statistically describe extreme rainfall events. A number of recent studies have shown that the frequency of extreme rainfall events has increased over the last century and that, as a result, the parameters of the GEV distribution have changed with time (non-stationarity). But what impact does a single unusually large rainfall event (e.g., Hurricane Irene) have on the GEV parameters and consequently on the level of risk or the return periods used in designing civil infrastructure? In other words, if such a large event occurs today, how will it influence the level of risk (estimated from past rainfall records) for civil infrastructure? To answer these questions, we performed a sensitivity analysis of the GEV distribution parameters, as well as of the return periods, to unusually large outlier events. The long-term precipitation records over the period 1981-2010 from 12 USHCN stations across the state of Maine were used for analysis. For most of the stations, the addition of each outlier event caused an increase in the shape parameter with a large decrease in the corresponding return period. This is a key consideration for time-varying engineering design. These isolated extreme weather events should be considered alongside traditional statistical methodology for extreme events when designing civil infrastructure (such as dams, bridges, and culverts). Such analysis is also useful in understanding the statistical uncertainty of projecting extreme events into the future.
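The link between the GEV shape parameter and the return period reported above can be illustrated with a short sketch based on the standard GEV distribution function. The parameter values are hypothetical and are not taken from the Maine station records; fitting the parameters to data is not shown.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution with location mu, scale sigma, shape xi."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support of the distribution.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def return_period(x, mu, sigma, xi):
    """Average number of years between annual maxima exceeding level x."""
    exceedance = 1.0 - gev_cdf(x, mu, sigma, xi)
    return math.inf if exceedance == 0.0 else 1.0 / exceedance


def return_level(period, mu, sigma, xi):
    """Level exceeded on average once every `period` years."""
    y = -math.log(1.0 - 1.0 / period)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


# Hypothetical design rainfall of 150 mm with location 60 mm and scale 20 mm.
before = return_period(150.0, mu=60.0, sigma=20.0, xi=0.05)
after = return_period(150.0, mu=60.0, sigma=20.0, xi=0.20)
```

Holding location and scale fixed, the larger shape parameter gives a heavier upper tail, so the same design level is exceeded more often and `after` is shorter than `before`.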
Leming, Matthew; Steiner, Rachel; Styner, Martin
2016-02-27
Tract-based spatial statistics (TBSS) is a software pipeline widely employed in comparative analysis of white matter integrity from diffusion tensor imaging (DTI) datasets. In this study, we seek to evaluate the relationship between different methods of atlas registration for use with TBSS and different measurements of DTI (fractional anisotropy, FA; axial diffusivity, AD; radial diffusivity, RD; and mean diffusivity, MD). To do so, we have developed a novel tool that builds on existing diffusion atlas building software, integrating it into an adapted version of TBSS called DAB-TBSS (DTI Atlas Builder-Tract-Based Spatial Statistics) by using the advanced registration offered in DTI Atlas Builder. To compare the effectiveness of these two versions of TBSS, we also propose a framework for simulating population differences for diffusion tensor imaging data, providing a more substantive means of empirically comparing DTI group analysis programs such as TBSS. In this study, we used 33 diffusion tensor imaging datasets and simulated group-wise changes in these data by increasing, in three different simulations, the principal eigenvalue (directly altering AD), the second and third eigenvalues (RD), and all three eigenvalues (MD) in the genu, the right uncinate fasciculus, and the left IFO. Additionally, we assessed the benefits of comparing the tensors directly using a functional analysis of diffusion tensor tract statistics (FADTTS). Our results indicate comparable levels of FA-based detection between DAB-TBSS and TBSS, with standard TBSS registration reporting a higher rate of false positives in other measurements of DTI. Within the simulated changes investigated here, this study suggests that the use of DTI Atlas Builder's registration enhances TBSS group-based studies.
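The relationship between the tensor eigenvalues and the four DTI measurements named above can be made concrete with the standard definitions. The following is a minimal sketch; the function name `dti_scalars` and the eigenvalue magnitudes are illustrative assumptions and do not come from DAB-TBSS.

```python
import math


def dti_scalars(l1, l2, l3):
    """Return (FA, AD, RD, MD) for eigenvalues ordered so that l1 >= l2 >= l3."""
    ad = l1                      # axial diffusivity: principal eigenvalue
    rd = (l2 + l3) / 2.0         # radial diffusivity: mean of the other two
    md = (l1 + l2 + l3) / 3.0    # mean diffusivity: mean of all three
    numerator = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(0.5 * numerator / denominator) if denominator > 0 else 0.0
    return fa, ad, rd, md


# Illustrative eigenvalues in mm^2/s, then a simulated 10% increase of the
# principal eigenvalue, which changes AD (and FA, MD) but leaves RD untouched.
baseline = dti_scalars(1.7e-3, 0.3e-3, 0.2e-3)
altered = dti_scalars(1.7e-3 * 1.1, 0.3e-3, 0.2e-3)
```

Scaling only the second and third eigenvalues would change RD while leaving AD fixed, which mirrors the three simulations described in the abstract.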
Detecting most influencing courses on students grades using block PCA
NASA Astrophysics Data System (ADS)
Othman, Osama H.; Gebril, Rami Salah
2014-12-01
One of the modern solutions adopted for dealing with the problem of a large number of variables in statistical analyses is Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix Xn×p by selecting a smaller number of variables (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique of the original PCA. It involves the application of Cluster Analysis (CA) and variable selection through sub principal component scores (PCs). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA on each group of variables (established using cluster analysis) instead of involving the whole large pack of variables, which was proved to be unreliable. In this work, Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of the Grade Point Averages (GPA) of the students in 251 courses (variables) in the Faculty of Science at Benghazi University. In other words, we construct a smaller analytical data matrix of the students' GPAs with fewer variables containing most of the variation (statistical information) in the original database. By applying Block PCA, 12 courses were found to `absorb' most of the variation or influence from the original data matrix, and hence are worth keeping for future statistical exploration and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influential course on students' GPA among the 12 selected courses.
Mutz, Rüdiger; Daniel, Hans-Dieter
2013-06-01
It is often claimed that psychology students' attitudes towards research methods and statistics affect course enrollment, persistence, achievement, and course climate. However, inter-institutional variability has been widely neglected in the research on students' attitudes towards research methods and statistics, although it is important for didactic purposes (heterogeneity of the student population). The paper presents a scale based on findings of the social psychology of attitudes (polar and emotion-based concept) in conjunction with a method for capturing beginning university students' attitudes towards research methods and statistics and identifying the proportion of students having positive attitudes at the institutional level. The study was based on a re-analysis of a nationwide survey in Germany in August 2000 of all psychology students who enrolled in fall 1999/2000 (N = 1,490 students; N = 44 universities). Using multilevel latent-class analysis (MLLCA), the aim was to group students into different student attitude types and at the same time to obtain university segments based on the incidences of the different student attitude types. Four student latent clusters were found that can be ranked on a bipolar attitude dimension. Membership in a cluster was predicted by age, grade point average (GPA) on school-leaving exam, and personality traits. In addition, two university segments were found: universities with an average proportion of students with positive attitudes and universities with a high proportion of students with positive attitudes (excellent segment). As psychology students make up a very heterogeneous group, the use of multiple learning activities as opposed to the classical lecture course is required. © 2011 The British Psychological Society.
Ray, Michael E; Bae, Kyounghwa; Hussain, Maha H A; Hanks, Gerald E; Shipley, William U; Sandler, Howard M
2009-02-18
The identification of surrogate endpoints for prostate cancer-specific survival may shorten the length of clinical trials for prostate cancer. We evaluated distant metastasis and general clinical treatment failure as potential surrogates for prostate cancer-specific survival by use of data from the Radiation Therapy and Oncology Group 92-02 randomized trial. Patients (n = 1554 randomly assigned and 1521 evaluable for this analysis) with locally advanced prostate cancer had been treated with 4 months of neoadjuvant and concurrent androgen deprivation therapy with external beam radiation therapy and then randomly assigned to no additional therapy (control arm) or 24 additional months of androgen deprivation therapy (experimental arm). Data from landmark analyses at 3 and 5 years for general clinical treatment failure (defined as documented local disease progression, regional or distant metastasis, initiation of androgen deprivation therapy, or a prostate-specific antigen level of 25 ng/mL or higher after radiation therapy) and/or distant metastasis were tested as surrogate endpoints for prostate cancer-specific survival at 10 years by use of Prentice's four criteria. All statistical tests were two-sided. At 3 years, 1364 patients were alive and contributed data for analysis. Both distant metastasis and general clinical treatment failure at 3 years were consistent with all four of Prentice's criteria for being surrogate endpoints for prostate cancer-specific survival at 10 years. At 5 years, 1178 patients were alive and contributed data for analysis. Although prostate cancer-specific survival was not statistically significantly different between treatment arms at 5 years (P = .08), both endpoints were consistent with Prentice's remaining criteria. Distant metastasis and general clinical treatment failure at 3 years may be candidate surrogate endpoints for prostate cancer-specific survival at 10 years. These endpoints, however, must be validated in other datasets.
Huang, Jian Wen; Mu, Jia Gui; Li, Yun Wei; Gan, Xiu Guo; Song, Lu Jie; Gu, Bao Jun; Fu, Qiang; Xu, Yue Min; An, Rui Hua
2014-11-30
To evaluate the clinical value of fluorescence in situ hybridization (FISH) for diagnosis and surveillance of bladder urothelial carcinoma (BUC). Between November 2010 and December 2013, patients suspected of having BUC were examined using urine cytology and FISH assay. Based on histopathological examination results, FISH results were compared with urine cytology. In addition, patients with a history of non-muscle invasive BUC were also examined using urine cytology and FISH assay at the first visit and then monitored with cystoscopy during the follow-up period. A total of 162 patients were included in this study, and 12 patients were excluded due to uninformative FISH assays. The remaining 150 patients consisted of 108 patients suspected of having BUC and 42 patients with a history of non-muscle invasive BUC. The sensitivities of FISH analysis and urine cytology were 72.8% and 27.2%, respectively, and the difference was statistically significant (P < .05). The difference between the specificity of urine cytology (100%) and that of FISH assay (85%) was not statistically significant (P > .05). At the first visit, of 42 patients, one patient had positive cystoscopy, and FISH assay was positive in 26 of 41 patients with negative cystoscopy. During the follow-up period (mean, 29.5 months), 18 of 26 patients developed recurrence, and recurrence occurred in only one of 15 patients with negative FISH analysis. Our results suggest that FISH analysis can be used as a non-invasive diagnostic tool for patients suspected of having new BUC. In addition, FISH analysis may provide important prognostic information to better define the individual risk for BUC recurrence.
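The follow-up counts stated above (18 recurrences among 26 FISH-positive patients, 1 recurrence among 15 FISH-negative patients) are enough to fill a 2×2 table for FISH as a predictor of recurrence. The sketch below only does that arithmetic; the derived percentages are not reported in the abstract and are shown purely as a worked example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures from cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # recurrences flagged by the test
        "specificity": tn / (tn + fp),  # non-recurrences correctly cleared
        "ppv": tp / (tp + fp),          # positive tests that recurred
        "npv": tn / (tn + fn),          # negative tests that did not recur
    }


# 41 patients with negative cystoscopy at the first visit:
# FISH-positive: 26 (18 recurred, 8 did not);
# FISH-negative: 15 (1 recurred, 14 did not).
followup = diagnostic_metrics(tp=18, fp=8, fn=1, tn=14)
```

With so few patients the estimates carry wide uncertainty, so a confidence interval would be needed before comparing them with other tests.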
ERIC Educational Resources Information Center
Hahs-Vaughn, Debbie L.; Acquaye, Hannah; Griffith, Matthew D.; Jo, Hang; Matthews, Ken; Acharya, Parul
2017-01-01
Statistical literacy refers to understanding fundamental statistical concepts. Assessment of statistical literacy can take the forms of tasks that require students to identify, translate, compute, read, and interpret data. In addition, statistical instruction can take many forms encompassing course delivery format such as face-to-face, hybrid,…
Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group
2013-01-01
The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction. PMID:24280020
Kim, Yong Wook; Kim, Hyoung Seop; An, Young-Sil; Im, Sang Hee
2010-10-01
Permanent vegetative state is defined as an impaired level of consciousness lasting longer than 12 months after traumatic causes and 3 months after non-traumatic causes of brain injury. Although many studies have assessed cerebral metabolism in patients with acute and persistent vegetative state after brain injury, few have investigated cerebral metabolism in patients with permanent vegetative state. In this study, we performed a voxel-based analysis of cerebral glucose metabolism and investigated the relationship between regional cerebral glucose metabolism and the severity of impaired consciousness in patients with permanent vegetative state after acquired brain injury. We compared regional cerebral glucose metabolism, as demonstrated by F-18 fluorodeoxyglucose positron emission tomography, in 12 patients with permanent vegetative state after acquired brain injury with that in 12 control subjects. Additionally, covariance analysis was performed to identify regions where decreases in regional cerebral glucose metabolism significantly correlated with a decrease in the level of consciousness measured by the JFK Coma Recovery Scale. Statistical analysis was performed using statistical parametric mapping. Compared with controls, patients with permanent vegetative state demonstrated decreased cerebral glucose metabolism in the left precuneus, both posterior cingulate cortices, and the left superior parietal lobule (P(corrected) < 0.001), and increased cerebral glucose metabolism in the bilateral cerebellum and the right supramarginal cortices (P(corrected) < 0.001). In the covariance analysis, a decrease in the level of consciousness was significantly correlated with decreased cerebral glucose metabolism in both posterior cingulate cortices (P(uncorrected) < 0.005). Our findings suggest that the posteromedial parietal cortex, which is part of the neural network for consciousness, may be a relevant structure for the pathophysiological mechanism in patients with permanent vegetative state after acquired brain injury.
Albrecht, Jessica; Kopietz, Rainer; Frasnelli, Johannes; Wiesmann, Martin; Hummel, Thomas; Lundström, Johan N.
2009-01-01
Almost every odor we encounter in daily life has the capacity to produce a trigeminal sensation. Surprisingly, few functional imaging studies exploring human neuronal correlates of intranasal trigeminal function exist, and results are to some degree inconsistent. We utilized activation likelihood estimation (ALE), a quantitative voxel-based meta-analysis tool, to analyze functional imaging data (fMRI/PET) following intranasal trigeminal stimulation with carbon dioxide (CO2), a stimulus known to exclusively activate the trigeminal system. Meta-analysis tools are able to identify activations common across studies, thereby enabling activation mapping with higher certainty. Activation foci of nine studies utilizing trigeminal stimulation were included in the meta-analysis. We found significant ALE scores, indicating consistent activation across studies, in the brainstem, ventrolateral posterior thalamic nucleus, anterior cingulate cortex, insula, precentral gyrus, and primary and secondary somatosensory cortices, a network known for the processing of intranasal nociceptive stimuli. Significant ALE values were also observed in the piriform cortex, insula, and the orbitofrontal cortex, areas known to process chemosensory stimuli, and in association cortices. Additionally, the trigeminal ALE statistics were directly compared with ALE statistics originating from olfactory stimulation, demonstrating considerable overlap in activation. In conclusion, the results of this meta-analysis map the human neuronal correlates of intranasal trigeminal stimulation with high statistical certainty and demonstrate that the cortical areas recruited during the processing of intranasal CO2 stimuli include those outside traditional trigeminal areas. Moreover, by illustrating the considerable overlap between brain areas that process trigeminal and olfactory information, these results demonstrate the interconnectivity of flavor processing. PMID:19913573
Advanced Gear Alloys for Ultra High Strength Applications
NASA Technical Reports Server (NTRS)
Shen, Tony; Krantz, Timothy; Sebastian, Jason
2011-01-01
Single tooth bending fatigue (STBF) test data of UHS Ferrium C61 and C64 alloys are presented in comparison with historical test data of conventional gear steels (9310 and Pyrowear 53) with comparable statistical analysis methods. Pitting and scoring tests of C61 and C64 are works in progress. Boeing statistical analysis of STBF test data for the four gear steels (C61, C64, 9310 and Pyrowear 53) indicates that the UHS grades exhibit increases in fatigue strength in the low cycle fatigue (LCF) regime. In the high cycle fatigue (HCF) regime, the UHS steels exhibit better mean fatigue strength endurance limit behavior (particularly as compared to Pyrowear 53). However, due to considerable scatter in the UHS test data, the anticipated overall benefits of the UHS grades in bending fatigue have not been fully demonstrated. Based on all the test data and on Boeing's analysis, C61 has been selected by Boeing as the gear steel for the final ERDS demonstrator test gearboxes. In terms of potential follow-up work, detailed physics-based, micromechanical analysis and modeling of the fatigue data would allow for a better understanding of the causes of the experimental scatter, and of the transition from high-stress LCF (surface-dominated) to low-stress HCF (subsurface-dominated) fatigue failure. Additional STBF test data and failure analysis work, particularly in the HCF regime and around the endurance limit stress, could allow for better statistical confidence and could reduce the observed effects of experimental test scatter. Finally, the need for further optimization of the residual compressive stress profiles of the UHS steels (resulting from carburization and peening) is noted, particularly for the case of the higher hardness C64 material.
Kawalec, Paweł; Moćko, Pawel; Pilc, Andrzej; Radziwon-Zalewska, Maria; Malinowska-Lipień, Iwona
2016-08-01
The increasing prevalence of Crohn disease (CD) underscores the need to identify new effective drugs, which is particularly important for patients who do not respond to or do not tolerate standard biologic therapies. The purpose of this analysis was to compare the efficacy and safety of vedolizumab and certolizumab pegol in patients with active moderate to severe CD. This analysis was prepared according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. A systematic literature search of Medline (PubMed), Embase, and the Cochrane Library was conducted through March 5, 2016. Studies included were randomized controlled trials (RCTs) that enrolled patients treated for CD with vedolizumab or certolizumab pegol. All studies were critically appraised; indirect comparison was performed with the Bucher method. Eight RCTs were identified, and four were homogeneous enough to be included in the indirect comparison of the induction phase of treatment. No statistically significant differences were found in clinical response (relative risk [RR] 1.23, 95% confidence interval [CI] 0.81-1.88) or remission (RR 1.35, 95% CI 0.89-2.07) between vedolizumab and certolizumab pegol in the overall population. Similarly, no statistically significant differences in response or remission were noted in a subgroup analysis of anti-tumor necrosis factor-naive patients (RR 1.10, 95% CI 0.72-1.66 and RR 1.98, 95% CI 0.95-4.11, respectively). In addition, there were no statistically significant differences in safety profiles. This indirect comparison analysis demonstrated no statistically significant differences in efficacy and safety between vedolizumab and certolizumab pegol. © 2016 Pharmacotherapy Publications, Inc.
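The Bucher method mentioned above compares two treatments through a common comparator by subtracting log relative risks and adding their variances. The following is a minimal sketch of that calculation; the input values are hypothetical placeholders rather than the trial results pooled in this study, and the function names are illustrative.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def log_rr_and_se(rr, ci_low, ci_high):
    """Recover log(RR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)
    return math.log(rr), se


def bucher_indirect(rr_a, ci_a, rr_b, ci_b):
    """Indirect RR of A versus B, each given as RR versus a common comparator.

    Returns (rr, ci_low, ci_high) for A versus B.
    """
    log_a, se_a = log_rr_and_se(rr_a, *ci_a)
    log_b, se_b = log_rr_and_se(rr_b, *ci_b)
    log_ab = log_a - log_b
    se_ab = math.sqrt(se_a ** 2 + se_b ** 2)  # variances add
    return (
        math.exp(log_ab),
        math.exp(log_ab - Z95 * se_ab),
        math.exp(log_ab + Z95 * se_ab),
    )


# Hypothetical placebo-controlled results for drugs A and B.
rr, low, high = bucher_indirect(1.60, (1.20, 2.13), 1.30, (0.95, 1.78))
```

Because the variances add, the indirect interval is always wider than either direct interval, which is one reason indirect comparisons often fail to reach statistical significance.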
Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J
2017-05-01
Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.
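The abstract does not state how the effective number of tests is estimated, so the sketch below covers only the final step: given some estimate of the effective number of tests (the value used here is hypothetical), per-test significance thresholds for family-wise error rate control can be obtained with the standard Bonferroni and Šidák adjustments. This is a generic illustration, not the authors' R implementation.

```python
def bonferroni_threshold(alpha, m_eff):
    """Per-test threshold that bounds the family-wise error rate by alpha."""
    return alpha / m_eff


def sidak_threshold(alpha, m_eff):
    """Per-test threshold assuming m_eff independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


# Suppose 500 overlapping gene sets behave like roughly 180 independent tests
# (a hypothetical figure). Correcting for 180 rather than 500 tests gives a
# less stringent threshold.
naive = bonferroni_threshold(0.05, 500)
adjusted = bonferroni_threshold(0.05, 180.0)
adjusted_sidak = sidak_threshold(0.05, 180.0)
```

The effective number of tests need not be an integer, which is why both functions accept a float.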