Using R-Project for Free Statistical Analysis in Extension Research
ERIC Educational Resources Information Center
Mangiafico, Salvatore S.
2013-01-01
One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…
Comments on `A Cautionary Note on the Interpretation of EOFs'.
NASA Astrophysics Data System (ADS)
Behera, Swadhin K.; Rao, Suryachandra A.; Saji, Hameed N.; Yamagata, Toshio
2003-04-01
The misleading aspect of the statistical analyses used in Dommenget and Latif, which raises concerns about some of the reported climate modes, is demonstrated. Adopting simple statistical techniques, the physical existence of the Indian Ocean dipole mode is shown, and then the limitations of varimax and regression analyses in capturing this climate mode are discussed.
Using a Five-Step Procedure for Inferential Statistical Analyses
ERIC Educational Resources Information Center
Kamin, Lawrence F.
2010-01-01
Many statistics texts pose inferential statistical problems in a disjointed way. By using a simple five-step procedure as a template for statistical inference problems, the student can solve problems in an organized fashion. The problem and its solution will thus form a self-contained whole and a single unit of thought and effort. The…
Learning investment indicators through data extension
NASA Astrophysics Data System (ADS)
Dvořák, Marek
2017-07-01
Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. The method uses sliding windows to calculate several dozen new variables from simple statistics, such as first and second moments, as well as more complicated statistics, such as auto-regression coefficients and residual analysis, followed by an optional quadratic transformation used for further data extension. These served as explanatory variables in a regularized logistic LASSO regression that estimated the Buy-Sell Index (BSI) from real stock market data.
The Problem of Auto-Correlation in Parasitology
Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick
2012-01-01
Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics, and so the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, to control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data, in the hope that this will encourage better statistical practices in parasitology. PMID:22511865
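To make the residual-correlation problem concrete, here is an illustrative sketch (not the authors' analysis): a simulated within-host trajectory with AR(1) errors is fit by ordinary least squares, and the independence assumption is checked with a Durbin-Watson statistic. A mixed-effects model of the kind the authors advocate (e.g., statsmodels' MixedLM with a random intercept per host) would be one remedy.

```python
# Illustrative only: simulate autocorrelated residuals and detect them.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
t = np.arange(30.0)
e = np.zeros(30)
for i in range(1, 30):                  # AR(1) errors: correlated in time
    e[i] = 0.8 * e[i - 1] + rng.normal()
parasitaemia = 5 + 0.3 * t + e          # synthetic within-host trajectory

fit = sm.OLS(parasitaemia, sm.add_constant(t)).fit()
# Values near 2 indicate independent residuals; values well below 2
# indicate positive autocorrelation, violating the simple model's assumption.
print(durbin_watson(fit.resid))
```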
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. PMID:28747393
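As a hedged illustration of the two complementary approaches (synthetic data, not the authors' NGS sets; Deming regression, which they used for assay comparison, requires a dedicated routine and is not shown):

```python
# Sketch: Bland-Altman agreement plus a regression check for constant
# (intercept != 0) and proportional (slope != 1) error.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(5, 50, 40)                    # reference assay (synthetic)
y = 1.05 * x + 0.8 + rng.normal(0, 1.0, 40)   # test assay with both error types

# Bland-Altman: bias and 95% limits of agreement of the differences.
diff = y - x
bias, sd = diff.mean(), diff.std(ddof=1)
print(f"bias = {bias:.2f}, 95% limits of agreement = "
      f"({bias - 1.96 * sd:.2f}, {bias + 1.96 * sd:.2f})")

res = stats.linregress(x, y)
print(f"intercept = {res.intercept:.2f} (constant error), "
      f"slope = {res.slope:.3f} (proportional error)")
```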
ERIC Educational Resources Information Center
Brossart, Daniel F.; Parker, Richard I.; Olson, Elizabeth A.; Mahadevan, Lakshmi
2006-01-01
This study explored some practical issues for single-case researchers who rely on visual analysis of graphed data, but who also may consider supplemental use of promising statistical analysis techniques. The study sought to answer three major questions: (a) What is a typical range of effect sizes from these analytic techniques for data from…
ERIC Educational Resources Information Center
Spybrook, Jessaca; Hedges, Larry; Borenstein, Michael
2014-01-01
Research designs in which clusters are the unit of randomization are quite common in the social sciences. Given the multilevel nature of these studies, the power analyses for these studies are more complex than in a simple individually randomized trial. Tools are now available to help researchers conduct power analyses for cluster randomized…
An Exploratory Data Analysis System for Support in Medical Decision-Making
Copeland, J. A.; Hamel, B.; Bourne, J. R.
1979-01-01
An experimental system was developed to allow retrieval and analysis of data collected during a study of neurobehavioral correlates of renal disease. After retrieving data organized in a relational data base, simple parametric and nonparametric bivariate statistics could be computed. An “exploratory” mode, in which the system provided guidance in selecting appropriate statistical analyses, was also available to the user. The system traversed a decision tree using the inherent qualities of the data (e.g., the identity and number of patients, tests, and time epochs) to search for the appropriate analyses to employ.
Hood of the truck statistics for food animal practitioners.
Slenning, Barrett D
2006-03-01
This article offers some tips on working with statistics and develops four relatively simple procedures to deal with most kinds of data with which veterinarians work. The criterion for a procedure to be a "Hood of the Truck Statistics" (HOT Stats) technique is that it must be simple enough to be done with pencil, paper, and a calculator. The goal of HOT Stats is to have the tools available to run quick analyses in only a few minutes so that decisions can be made in a timely fashion. The discipline allows us to move away from the all-too-common guesswork about effects and differences we perceive following a change in treatment or management. The techniques allow us to move toward making more defensible, credible, and more quantifiably "risk-aware" real-time recommendations to our clients.
Tang, Qi-Yi; Zhang, Chuan-Xi
2013-04-01
A comprehensive but simple-to-use software package called DPS (Data Processing System) has been developed to execute a range of standard numerical analyses and operations used in experimental design, statistics and data mining. This program runs on standard Windows computers. Many of the functions are specific to entomological and other biological research and are not found in standard statistical software. This paper presents applications of DPS to experimental design, statistical analysis and data mining in entomology.
Whole-Range Assessment: A Simple Method for Analysing Allelopathic Dose-Response Data
An, Min; Pratley, J. E.; Haig, T.; Liu, D.L.
2005-01-01
Based on the typical biological responses of an organism to allelochemicals (hormesis), concepts of whole-range assessment and inhibition index were developed for improved analysis of allelopathic data. Examples of their application are presented using data drawn from the literature. The method is concise and comprehensive, and makes data grouping and multiple comparisons simple, logical, and possible. It improves data interpretation, enhances research outcomes, and is a statistically efficient summary of the plant response profiles. PMID:19330165
Conceptual and statistical problems associated with the use of diversity indices in ecology.
Barrantes, Gilbert; Sandoval, Luis
2009-09-01
Diversity indices, particularly the Shannon-Wiener index, have been used extensively in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no single statistical method that retains all the information needed to answer even a simple question. However, multivariate analyses, such as cluster analyses or multiple regressions, could be used instead of diversity indices. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on the environmental variables associated with the presence and abundance of the species in a community. In addition, particular hypotheses about changes in species richness across localities, or changes in the abundance of one or a group of species, can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved robust for standardizing all samples to a common size. Even the simplest method, such as reporting the number of species per taxonomic category, possibly provides more information than a diversity index value.
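As a small illustration of the rarefaction approach endorsed above (hypothetical abundances; this is the standard individual-based rarefaction formula, not code from the paper):

```python
# Expected species richness in a subsample of n individuals drawn from
# a community with the given per-species abundance counts.
from math import comb

def rarefied_richness(counts, n):
    N = sum(counts)
    # math.comb(a, b) is 0 when b > a, which handles rare species cleanly.
    return sum(1 - comb(N - Ni, n) / comb(N, n) for Ni in counts)

community = [50, 30, 10, 5, 3, 1, 1]   # hypothetical abundances per species
print(rarefied_richness(community, 20))
```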
Huang, Yuan-sheng; Yang, Zhi-rong; Zhan, Si-yan
2015-06-18
To investigate the use of simple pooling and the bivariate model in meta-analyses of diagnostic test accuracy (DTA) published in Chinese journals (January to November, 2014), compare the differences in results from these two models, and explore the impact of between-study variability of sensitivity and specificity on the differences. DTA meta-analyses were searched through the Chinese Biomedical Literature Database (January to November, 2014). Details of models and data for fourfold tables were extracted. Descriptive analysis was conducted to investigate the prevalence of the use of the simple pooling method and the bivariate model in the included literature. Data were re-analyzed with the two models respectively. Differences in the results were examined by the Wilcoxon signed rank test. How the differences in results were affected by between-study variability of sensitivity and specificity, expressed by I², was explored. In total, 55 systematic reviews, containing 58 DTA meta-analyses, were included, and 25 DTA meta-analyses were eligible for re-analysis. Simple pooling was used in 50 (90.9%) systematic reviews and the bivariate model in 1 (1.8%). The remaining 4 (7.3%) articles used other models for pooling sensitivity and specificity or pooled neither of them. Of the reviews simply pooling sensitivity and specificity, 41 (82.0%) were at risk of wrongly using the Meta-DiSc software. The differences in medians of sensitivity and specificity between the two models were both 0.011 (P<0.001 and P=0.031, respectively). Greater differences were found as the I² of sensitivity or specificity became larger, especially when I²>75%. Most DTA meta-analyses published in Chinese journals (January to November, 2014) combine sensitivity and specificity by simple pooling. The Meta-DiSc software can pool sensitivity and specificity only through a fixed-effects model, but a high proportion of authors think it can implement a random-effects model. Simple pooling tends to underestimate the results compared with the bivariate model; the greater the between-study variance, the larger the deviation of simple pooling. It is necessary to improve knowledge of statistical methods and software for meta-analyses of DTA data.
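For orientation, a minimal sketch of one common form of the "simple pooling" the review found to be predominant (hypothetical 2x2 counts; a proper bivariate model would jointly fit logit-sensitivity and logit-specificity with random effects, which requires a GLMM routine and is not shown):

```python
# Simple pooling collapses all studies into one 2x2 table, ignoring
# between-study variability -- the source of the bias discussed above.
import numpy as np

studies = np.array([          # per-study (TP, FP, FN, TN), hypothetical
    [45,  5, 10,  90],
    [30,  8,  5,  70],
    [60, 12, 15, 110],
])
tp, fp, fn, tn = studies.T

print("pooled sensitivity:", tp.sum() / (tp.sum() + fn.sum()))
print("pooled specificity:", tn.sum() / (tn.sum() + fp.sum()))
```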
Is simple nephrectomy truly simple? Comparison with the radical alternative.
Connolly, S S; O'Brien, M Frank; Kunni, I M; Phelan, E; Conroy, R; Thornhill, J A; Grainger, R
2011-03-01
The Oxford English Dictionary defines the term "simple" as "easily done" and "uncomplicated". We tested the validity of this terminology in relation to open nephrectomy surgery. Retrospective review of 215 patients undergoing open simple (n = 89) or radical (n = 126) nephrectomy in a single university-affiliated institution between 1998 and 2002. Operative time (OT), estimated blood loss (EBL), operative complications (OC) and length of stay in hospital (LOS) were analysed. Statistical analysis employed Fisher's exact test and Stata Release 8.2. Simple nephrectomy was associated with shorter OT (mean 126 vs. 144 min; p = 0.002), reduced EBL (mean 729 vs. 859 cc; p = 0.472), lower OC (9 vs. 17%; p = 0.087), and shorter LOS (mean 6 vs. 8 days; p < 0.001). All parameters suggest a favourable outcome for the simple nephrectomy group, supporting the use of this terminology. This implies that "simple" nephrectomies are truly easier to perform, with fewer complications, than their radical counterparts.
Chung, Sang M; Lee, David J; Hand, Austin; Young, Philip; Vaidyanathan, Jayabharathi; Sahajwalla, Chandrahas
2015-12-01
The study evaluated whether the rate of renal function decline per year with age in adults varies between two primary statistical analyses: cross-sectional (CS), using one observation per subject, and longitudinal (LT), using multiple observations per subject over time. A total of 16,628 records (3,946 subjects; age range 30-92 years) of creatinine clearance and relevant demographic data were used. On average, four samples per subject were collected over up to 2,364 days (mean: 793 days). A simple linear regression model and a random coefficient model were selected for the CS and LT analyses, respectively. The renal function decline rates were 1.33 and 0.95 ml/min/year for the CS and LT analyses, respectively, i.e., slower when the repeated individual measurements were considered. The study confirms that the rates differ depending on the statistical analysis, and that a statistically robust longitudinal model with a proper sampling design provides reliable individual as well as population estimates of the rate of renal function decline with age in adults. In conclusion, our findings indicate that one should be cautious in interpreting the renal function decline rate with aging, because its estimate is highly dependent on the statistical analysis. From our analyses, a population longitudinal analysis (e.g., a random coefficient model) is recommended if individualization is critical, such as a dose adjustment based on renal function during chronic therapy.
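A sketch of the contrast between the two analyses (the file name and columns subject_id, age, crcl are hypothetical; statsmodels' random coefficient model stands in for the LT analysis described above):

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("renal.csv")  # hypothetical long-format data

# Cross-sectional: one observation per subject, simple linear regression.
cs = df.sort_values("age").groupby("subject_id").first().reset_index()
cs_fit = smf.ols("crcl ~ age", data=cs).fit()

# Longitudinal: random coefficient model (random intercept and slope
# per subject), which uses all repeated measurements.
lt_fit = smf.mixedlm("crcl ~ age", data=df, groups=df["subject_id"],
                     re_formula="~age").fit()

print(cs_fit.params["age"], lt_fit.fe_params["age"])  # decline per year
```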
A Simple Method to Control Positive Baseline Trend within Data Nonoverlap
ERIC Educational Resources Information Center
Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.
2014-01-01
Nonoverlap is widely used as a statistical summary of data; however, these analyses rarely correct unwanted positive baseline trend. This article presents and validates the graph rotation for overlap and trend (GROT) technique, a hand calculation method for controlling positive baseline trend within an analysis of data nonoverlap. GROT is…
On "Rhyme, Language, and Children's Reading."
ERIC Educational Resources Information Center
Bowey, Judith A.
1990-01-01
Argues that the results detailed in Bryant, MacLean, and Bradley's 1990 article do not differ greatly with those reported earlier by Bowey and Patel, and suggests that discrepancies between the two statistical analyses reflect the relative size of simple correlations and are attributable to differences in the designs of the two studies. (30…
Valid randomization-based p-values for partially post hoc subgroup analyses.
Lee, Joseph J; Rubin, Donald B
2015-10-30
By 'partially post hoc' subgroup analyses, we mean analyses that compare existing data from a randomized experiment, from which a subgroup specification is derived, to new, subgroup-only experimental data. We describe a motivating example in which partially post hoc subgroup analyses instigated statistical debate about a medical device's efficacy. We clarify the source of such analyses' invalidity and then propose a randomization-based approach for generating valid posterior predictive p-values for such partially post hoc subgroups. Lastly, we investigate the approach's operating characteristics in a simple illustrative setting through a series of simulations, showing that it can have desirable properties under both null and alternative hypotheses.
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.
NASA Astrophysics Data System (ADS)
Pollard, David; Chang, Won; Haran, Murali; Applegate, Patrick; DeConto, Robert
2016-05-01
A 3-D hybrid ice-sheet model is applied to the last deglacial retreat of the West Antarctic Ice Sheet over the last ˜ 20 000 yr. A large ensemble of 625 model runs is used to calibrate the model to modern and geologic data, including reconstructed grounding lines, relative sea-level records, elevation-age data and uplift rates, with an aggregate score computed for each run that measures overall model-data misfit. Two types of statistical methods are used to analyze the large-ensemble results: simple averaging weighted by the aggregate score, and more advanced Bayesian techniques involving Gaussian process-based emulation and calibration, and Markov chain Monte Carlo. The analyses provide sea-level-rise envelopes with well-defined parametric uncertainty bounds, but the simple averaging method only provides robust results with full-factorial parameter sampling in the large ensemble. Results for best-fit parameter ranges and envelopes of equivalent sea-level rise with the simple averaging method agree well with the more advanced techniques. Best-fit parameter ranges confirm earlier values expected from prior model tuning, including large basal sliding coefficients on modern ocean beds.
4P: fast computing of population genetics statistics from large DNA polymorphism panels
Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio
2015-01-01
Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations. PMID:25628874
Wang, Xiaolong; Li, Lin; Zhao, Jiaxin; Li, Fangliang; Guo, Wei; Chen, Xia
2017-04-01
To evaluate the effects of different preservation methods (stored in a -20°C ice chest, preserved in liquid nitrogen and dried in silica gel) on inter simple sequence repeat (ISSR) or random amplified polymorphic DNA (RAPD) analyses in various botanical specimens (including broad-leaved plants, needle-leaved plants and succulent plants) for different times (three weeks and three years), we used a statistical analysis based on the number of bands, genetic index and cluster analysis. The results demonstrate that methods used to preserve samples can provide sufficient amounts of genomic DNA for ISSR and RAPD analyses; however, the effect of different preservation methods on these analyses vary significantly, and the preservation time has little effect on these analyses. Our results provide a reference for researchers to select the most suitable preservation method depending on their study subject for the analysis of molecular markers based on genomic DNA.
A Simple and Robust Method for Partially Matched Samples Using the P-Values Pooling Approach
Kuan, Pei Fen; Huang, Bo
2013-01-01
This paper focuses on statistical analyses in scenarios where some samples from the matched pairs design are missing, resulting in partially matched samples. Motivated by the idea of meta-analysis, we recast the partially matched samples as coming from two experimental designs, and propose a simple yet robust approach based on the weighted Z-test to integrate the p-values computed from these two designs. We show that the proposed approach achieves better operating characteristics in simulations and a case study, compared to existing methods for partially matched samples. PMID:23417968
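The weighted Z-test at the core of the approach is simple enough to sketch (the weights here, square roots of the two designs' sample sizes, are a common choice; all values hypothetical):

```python
# Combine a paired-analysis p-value and an unpaired-analysis p-value.
import numpy as np
from scipy.stats import norm

def weighted_z(p_values, weights):
    z = norm.isf(np.asarray(p_values))            # one-sided p -> z score
    z_combined = np.dot(weights, z) / np.linalg.norm(weights)
    return norm.sf(z_combined)                    # combined one-sided p

p_paired, p_unpaired = 0.04, 0.10                 # hypothetical inputs
n_pairs, n_unmatched = 30, 12
print(weighted_z([p_paired, p_unpaired], np.sqrt([n_pairs, n_unmatched])))
```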
Review of Statistical Methods for Analysing Healthcare Resources and Costs
Mihaylova, Borislava; Briggs, Andrew; O'Hagan, Anthony; Thompson, Simon G
2011-01-01
We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work. PMID:20799344
Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J
2017-12-01
qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values (Cq) and efficiencies of reactions (E). The Common Base Method keeps all calculations in the log scale as long as possible by working with log10(E)·Cq, which we call the efficiency-weighted Cq value; subsequent statistical analyses are then applied in the log scale. We show how efficiency-weighted Cq values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
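A minimal sketch of the efficiency-weighted Cq idea (all values hypothetical; the published method also supports well-specific efficiencies, multiple reference genes and blocking, none of which is shown here):

```python
# Working in log10 scale: the initial amount changes linearly with
# log10(E) * Cq, so ordinary statistics can be applied to these values.
import numpy as np
from scipy import stats

E_target, E_ref = 1.95, 1.98                      # amplification efficiencies
cq_target = np.array([24.1, 23.8, 24.5, 24.0])    # treated samples
cq_ref    = np.array([18.2, 18.0, 18.4, 18.1])    # reference gene, same samples

# Efficiency-weighted Cq, target normalized to reference, in log10 scale.
delta = np.log10(E_target) * cq_target - np.log10(E_ref) * cq_ref

control_delta = np.array([6.1, 6.0, 6.2, 6.15])   # hypothetical control group
print(stats.ttest_ind(delta, control_delta))      # unpaired design in log scale
```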
Statistical issues in quality control of proteomic analyses: good experimental design and planning.
Cairns, David A
2011-03-01
Quality control is becoming increasingly important in proteomic investigations as experiments become more multivariate and quantitative. Quality control applies to all stages of an investigation and statistics can play a key role. In this review, the role of statistical ideas in the design and planning of an investigation is described. This involves the design of unbiased experiments using key concepts from statistical experimental design, the understanding of the biological and analytical variation in a system using variance components analysis and the determination of a required sample size to perform a statistically powerful investigation. These concepts are described through simple examples and an example data set from a 2-D DIGE pilot experiment. Each of these concepts can prove useful in producing better and more reproducible data.
Statistical analysis of the determinations of the Sun's Galactocentric distance
NASA Astrophysics Data System (ADS)
Malkin, Zinovy
2013-02-01
Based on several tens of R0 measurements made during the past two decades, several studies have been performed to derive the best estimate of R0. Some used just simple averaging to derive a result, whereas others provided comprehensive analyses of possible errors in published results. In either case, detailed statistical analyses of data used were not performed. However, a computation of the best estimates of the Galactic rotation constants is not only an astronomical but also a metrological task. Here we perform an analysis of 53 R0 measurements (published in the past 20 years) to assess the consistency of the data. Our analysis shows that they are internally consistent. It is also shown that any trend in the R0 estimates from the last 20 years is statistically negligible, which renders the presence of a bandwagon effect doubtful. On the other hand, the formal errors in the published R0 estimates improve significantly with time.
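The metrological flavour of the task can be sketched with an inverse-variance weighted mean and a trend test (numbers illustrative, not the paper's 53 estimates):

```python
import numpy as np
from scipy import stats

year  = np.array([1995, 1999, 2003, 2007, 2011])
r0    = np.array([8.1, 7.9, 8.3, 8.0, 8.2])     # R0 estimates, kpc
sigma = np.array([0.4, 0.5, 0.3, 0.4, 0.35])    # quoted formal errors

w = 1.0 / sigma**2
r0_hat = np.sum(w * r0) / np.sum(w)             # inverse-variance weighted mean
print(f"R0 = {r0_hat:.2f} +/- {np.sqrt(1 / np.sum(w)):.2f} kpc")

# An insignificant slope over publication year argues against a bandwagon effect.
print(stats.linregress(year, r0))
```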
Advanced Statistics for Exotic Animal Practitioners.
Hodsoll, John; Hellier, Jennifer M; Ryan, Elizabeth G
2017-09-01
Correlation and regression assess the association between 2 or more variables. This article reviews the core knowledge needed to understand these analyses, moving from visual analysis in scatter plots through correlation, simple and multiple linear regression, and logistic regression. Correlation estimates the strength and direction of a relationship between 2 variables. Regression can be considered more general and quantifies the numerical relationships between an outcome and 1 or multiple variables in terms of a best-fit line, allowing predictions to be made. Each technique is discussed with examples and the statistical assumptions underlying their correct application.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Fantuzzo, J. A.; Mirabella, V. R.; Zahn, J. D.
2017-01-01
Synapse formation analyses can be performed by imaging and quantifying fluorescent signals of synaptic markers. Traditionally, these analyses are done using simple or multiple thresholding and segmentation approaches, or by labor-intensive manual analysis by a human observer. Here, we describe Intellicount, a high-throughput, fully automated synapse quantification program which applies a novel machine learning (ML)-based image processing algorithm to systematically improve region of interest (ROI) identification over simple thresholding techniques. Through processing large datasets from both human and mouse neurons, we demonstrate that this approach allows image processing to proceed independently of carefully set thresholds, thus reducing the need for human intervention. As a result, this method can efficiently and accurately process large image datasets with minimal interaction by the experimenter, making it less prone to bias and less liable to human error. Furthermore, Intellicount is integrated into an intuitive graphical user interface (GUI) that provides a set of valuable features, including automated and multifunctional figure generation, routine statistical analyses, and the ability to run full datasets through nested folders, greatly expediting the data analysis process. PMID:29218324
Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob
2016-08-01
In many areas of science it is customary to perform many, potentially millions, of tests simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria, and the results might depend on the chosen criteria. Methods that summarize, or aggregate, test statistics or p-values without relying on a priori criteria are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the FWER is controlled in the strong sense. Validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures relying on evaluation of test statistics individually. Moreover, by being agnostic and not relying on predefined selected regions, it might be a practical alternative to conventionally used methods of aggregating p-values over regions. The method is implemented in Python and freely available online (through GitHub; see the Supplementary information).
Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis
Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.
2006-01-01
In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
Evaluation of IOTA Simple Ultrasound Rules to Distinguish Benign and Malignant Ovarian Tumours.
Garg, Sugandha; Kaur, Amarjit; Mohi, Jaswinder Kaur; Sibia, Preet Kanwal; Kaur, Navkiran
2017-08-01
IOTA stands for the International Ovarian Tumour Analysis group. Ovarian cancer is one of the common cancers in women and is diagnosed at a later stage in the majority of cases. The limiting factor for early diagnosis is the lack of standardized terms and procedures in gynaecological sonography. The introduction of the IOTA rules has provided some consistency in defining morphological features of ovarian masses through a standardized examination technique. To evaluate the efficacy of the IOTA simple ultrasound rules in distinguishing benign and malignant ovarian tumours and establishing their use as a tool in the early diagnosis of ovarian malignancy. A hospital-based case-control prospective study was conducted. Patients with suspected ovarian pathology were evaluated using the IOTA ultrasound rules and designated as benign or malignant. Findings were correlated with histopathological findings. Collected data were statistically analysed using the chi-square test and the kappa statistic. Of the initial 55 patients, 50 who underwent surgery were included in the final analysis. The IOTA simple rules were applicable in 45 of these 50 patients (90%). The sensitivity for the detection of malignancy in cases where the IOTA simple rules were applicable was 91.66% and the specificity was 84.84%. Accuracy was 86.66%. Classifying inconclusive cases as malignant, the sensitivity and specificity were 93% and 80%, respectively. A high level of agreement was found between USG and histopathological diagnosis, with a kappa value of 0.323. The IOTA simple ultrasound rules were highly sensitive and specific in predicting ovarian malignancy preoperatively, while being reproducible and easy to train and use.
Using complexity metrics with R-R intervals and BPM heart rate measures.
Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie
2013-01-01
Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in the social and cognitive sciences, heart rate is increasingly employed as a measure of arousal and emotional engagement, and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate in mapping the temporal dynamics of heart rate, and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on variability of the data, different choices regarding the kind of measures can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R intervals) and beats-per-minute (BPM). As a proof of concept, we employ a simple rest-exercise-rest task and show that non-linear statistics, fractal (DFA) and recurrence (RQA) analyses, reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data employed: while R-R intervals are very susceptible to non-linear analyses, the success of non-linear methods for BPM data critically depends on their construction. Generally, "oversampled" BPM time series can be recommended, as they retain most of the information about non-linear aspects of heart beat dynamics.
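A compact, generic DFA sketch (synthetic R-R series; real analyses use more scales and careful preprocessing):

```python
import numpy as np

def dfa_alpha(x, scales=(4, 8, 16, 32, 64)):
    """Detrended fluctuation analysis scaling exponent (alpha)."""
    y = np.cumsum(x - np.mean(x))                # integrated profile
    flucts = []
    for s in scales:
        f2 = []
        for i in range(len(y) // s):
            seg = y[i * s:(i + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 1), t)  # local linear detrend
            f2.append(np.mean((seg - trend) ** 2))
        flucts.append(np.sqrt(np.mean(f2)))
    # alpha is the slope of log F(s) versus log s.
    return np.polyfit(np.log(scales), np.log(flucts), 1)[0]

rr = np.random.default_rng(2).normal(800, 50, 1024)  # synthetic R-R series (ms)
print(dfa_alpha(rr))   # ~0.5 for uncorrelated data; ~1.0 for 1/f dynamics
```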
Valid statistical approaches for analyzing Sholl data: mixed effects versus simple linear models.
Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P
2017-03-01
The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely.
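A simulation in the spirit of the paper's argument (everything synthetic; a cluster-robust standard error stands in here for the mixed-model correction, and there is no true group effect, so the naive standard error's optimism is exactly the bias described above):

```python
# Several neurons per animal share an animal-level effect; naive OLS
# treats them as independent and understates the standard error.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
rows = []
for group in (0, 1):                      # e.g., control vs. exposed
    for a in range(5):                    # animals per group
        animal_effect = rng.normal(0, 2)  # shared within an animal
        for _ in range(8):                # neurons sampled per animal
            rows.append((group, group * 5 + a,
                         10 + animal_effect + rng.normal(0, 1)))
g, animal, y = map(np.array, zip(*rows))

fit = sm.OLS(y, sm.add_constant(g.astype(float))).fit()
robust = fit.get_robustcov_results(cov_type="cluster", groups=animal)
print("naive SE:", fit.bse[1], "cluster-robust SE:", robust.bse[1])
```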
Statistical Selection of Biological Models for Genome-Wide Association Analyses.
Bi, Wenjian; Kang, Guolian; Pounds, Stanley B
2018-05-24
Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc.) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS study. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew.
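A sketch of the underlying idea, not the authors' implementation (synthetic genotypes; their method additionally controls the FDR across variants, and the co-dominant model would add a second genotype indicator):

```python
# Score competing genotype codings for one SNP by BIC (lower is better).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
g = rng.integers(0, 3, 500)                                 # genotypes 0/1/2
y = (rng.random(500) < 0.2 + 0.15 * (g > 0)).astype(float)  # true dominant effect

codings = {
    "null":      np.zeros(len(g)),
    "additive":  g.astype(float),
    "dominant":  (g > 0).astype(float),
    "recessive": (g == 2).astype(float),
}
for name, x in codings.items():
    X = sm.add_constant(x) if x.any() else np.ones((len(g), 1))
    print(name, round(sm.Logit(y, X).fit(disp=0).bic, 1))
```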
Liu, Yingchun; Sun, Guoxiang; Wang, Yan; Yang, Lanping; Yang, Fangliang
2015-06-01
Micellar electrokinetic chromatography fingerprinting combined with quantification was successfully developed and applied to monitor the quality consistency of Weibizhi tablets, which is a classical compound preparation used to treat gastric ulcers. A background electrolyte composed of 57 mmol/L sodium borate, 21 mmol/L sodium dodecylsulfate and 100 mmol/L sodium hydroxide was used to separate compounds. To optimize capillary electrophoresis conditions, multivariate statistical analyses were applied. First, the most important factors influencing sample electrophoretic behavior were identified as the background electrolyte concentrations. Then, a Box-Behnken design response surface strategy using the resolution index RF as an integrated response was set up to correlate factors with response. RF reflects the effective signal amount, resolution, and signal homogenization in an electropherogram; thus, it was regarded as an excellent indicator. In fingerprint assessments, a simple quantified ratio fingerprint method was established for comprehensive quality discrimination of traditional Chinese medicines/herbal medicines from qualitative and quantitative perspectives, by which the quality of 27 samples from the same manufacturer was well differentiated. In addition, the fingerprint-efficacy relationship between fingerprints and antioxidant activities was established using partial least squares regression, which provided important medicinal efficacy information for quality control. The present study offers an efficient means for monitoring Weibizhi tablet quality consistency.
Periodicity of microfilariae of human filariasis analysed by a trigonometric method (Aikat and Das).
Tanaka, H
1981-04-01
The microfilarial periodicity of human filariae was characterized statistically by fitting the observed change of microfilaria (mf) counts to the formula of a simple harmonic wave using two parameters, the peak hour (K) and periodicity index (D) (Sasa & Tanaka, 1972, 1974). Later, Aikat and Das (1976) proposed a simple calculation method using trigonometry (A-D method) to determine the peak hour (K) and periodicity index (P). All data on microfilarial periodicity analysed previously by the method of Sasa and Tanaka (S-T method) were recalculated by the A-D method in the present study to evaluate the latter method. The results of the calculations showed that P was not proportional to D, and the ratios of P/D were mostly smaller than expected, especially when P or D was small in less periodic forms. The peak hour calculated by the A-D method did not differ much from that calculated by the S-T method. Goodness of fit was improved slightly by the A-D method in two-thirds of the analysed data. The classification of human filariae with respect to the type of periodicity was, however, changed little by the results calculated by the A-D method.
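The harmonic form common to both methods is easy to fit directly (data and parameterization illustrative, not drawn from the paper):

```python
# Fit count(t) = m + a*cos(2*pi*(t - K)/24): K is the peak hour and
# 100*a/m serves as a periodicity index.
import numpy as np
from scipy.optimize import curve_fit

hours  = np.arange(0, 24, 2.0)
counts = np.array([5, 3, 2, 2, 4, 9, 18, 30, 35, 28, 16, 8])  # hypothetical mf counts

def harmonic(t, m, a, K):
    return m + a * np.cos(2 * np.pi * (t - K) / 24)

(m, a, K), _ = curve_fit(harmonic, hours, counts, p0=[15, 10, 16])
print(f"peak hour K = {K % 24:.1f} h, periodicity index ~ {100 * a / m:.0f}")
```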
Statistical universals reveal the structures and functions of human music.
Savage, Patrick E; Brown, Steven; Sakai, Emi; Currie, Thomas E
2015-07-21
Music has been called "the universal language of mankind." Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation.
Polygenic scores via penalized regression on summary statistics.
Mak, Timothy Shin Heng; Porsch, Robert Milan; Choi, Shing Wan; Zhou, Xueya; Sham, Pak Chung
2017-09-01
Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating PGS have been proposed, and recently there is much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can use LD information available elsewhere to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy that is comparable to using a dataset with validation phenotype and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and P-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149
Holmium laser enucleation versus laparoscopic simple prostatectomy for large adenomas.
Juaneda, R; Thanigasalam, R; Rizk, J; Perrot, E; Theveniaud, P E; Baumert, H
2016-01-01
The aim of this study is to compare Holmium laser enucleation of the prostate with another minimally invasive technique, laparoscopic simple prostatectomy. We compared outcomes of a series of 40 patients who underwent laparoscopic simple prostatectomy (n=20) or Holmium laser enucleation of the prostate (n=20) for large adenomas (>100 grams) at our institution. Study variables included operative time, catheterization time, hospital stay, pre- and post-operative International Prostate Symptom Score and maximum urinary flow rate, complications and an economic evaluation. Statistical analyses were performed using the Student t test and Fisher test. There were no significant differences in patient age, preoperative prostatic size, operating time or specimen weight between the 2 groups. Duration of catheterization (P=.0008) and hospital stay (P<.0001) were significantly shorter in the laser group. Both groups showed a statistically significant improvement in functional variables at 3 months post-operatively. The cost-utility analysis showed a per-case cost of 2589 euros for Holmium enucleation versus 4706 euros for the laparoscopic approach. In the laser arm, 4 patients (20%) experienced complications according to the modified Clavien classification system versus 5 (25%) in the laparoscopic group (P>.99). Holmium enucleation of the prostate has similar short-term functional results and complication rates compared to laparoscopic simple prostatectomy performed in large glands, with the advantages of shorter catheterization time, lower economic costs and a reduced hospital stay. Copyright © 2015 AEU. Published by Elsevier España, S.L.U. All rights reserved.
Estimating population diversity with CatchAll
Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.
2012-01-01
Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246
Oakes, J M; Feldman, H A
2001-02-01
Nonequivalent controlled pretest-posttest designs are central to evaluation science, yet no practical and unified approach for estimating power in the two most widely used analytic approaches to these designs exists. This article fills the gap by presenting and comparing useful, unified power formulas for ANCOVA and change-score analyses, indicating the implications of each on sample-size requirements. The authors close with practical recommendations for evaluators. Mathematical details and a simple spreadsheet approach are included in appendices.
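The key contrast the authors formalize can be illustrated with the standard large-sample error variances for the two analyses: ANCOVA shrinks the outcome variance by a factor (1 - rho^2), while the change-score analysis uses the variance of the post-minus-pre difference, 2*sigma^2*(1 - rho). A minimal sketch under these textbook approximations (not necessarily the article's exact formulas):

```python
from scipy.stats import norm

def n_per_group(delta, sigma, rho, method, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-group
    pretest-posttest design (large-sample normal approximation)."""
    if method == "ancova":
        var = sigma**2 * (1 - rho**2)   # residual variance after covariate adjustment
    elif method == "change":
        var = sigma**2 * 2 * (1 - rho)  # variance of post-minus-pre difference
    else:
        raise ValueError("method must be 'ancova' or 'change'")
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * var * (z / delta)**2

# with pretest-posttest correlation 0.6, ANCOVA needs fewer subjects
print(n_per_group(delta=5, sigma=10, rho=0.6, method="ancova"))  # ~40
print(n_per_group(delta=5, sigma=10, rho=0.6, method="change"))  # ~50
```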
Quantitative analysis of tympanic membrane perforation: a simple and reliable method.
Ibekwe, T S; Adeosun, A A; Nwaorgu, O G
2009-01-01
Accurate assessment of the features of tympanic membrane perforation, especially size, site, duration and aetiology, is important, as it enables optimum management. To describe a simple, cheap and effective method of quantitatively analysing tympanic membrane perforations. The system described comprises a video-otoscope (capable of generating still and video images of the tympanic membrane), adapted via a universal serial bus box to a computer screen, with images analysed using the Image J geometrical analysis software package. The reproducibility of results and their correlation with conventional otoscopic methods of estimation were tested statistically with the paired t-test and correlational tests, using the Statistical Package for the Social Sciences version 11 software. The following equation was generated: percentage perforation = (P/T) × 100%, where P is the area (in pixels²) of the tympanic membrane perforation and T is the total area (in pixels²) of the entire tympanic membrane (including the perforation). Illustrations are shown. Comparison of blinded data on tympanic membrane perforation area obtained independently from assessments by two trained otologists, of comparative years of experience, using the video-otoscopy system described, showed similar findings, with strong correlations devoid of inter-observer error (p = 0.000, r = 1). Comparison with conventional otoscopic assessment also indicated significant correlation, comparing results for two trained otologists, but some inter-observer variation was present (p = 0.000, r = 0.896). Correlation between the two methods for each of the otologists was also highly significant (p = 0.000). A computer-adapted video-otoscope, with images analysed by Image J software, represents a cheap, reliable, technology-driven, clinical method of quantitative analysis of tympanic membrane perforations and injuries.
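A minimal numpy sketch of the published equation, assuming the perforation and whole-membrane outlines have already been traced into binary masks (the mask names are illustrative; the study used Image J rather than Python):

```python
import numpy as np

def percentage_perforation(perforation_mask, membrane_mask):
    """Percentage perforation = (P/T) * 100, where P and T are areas
    in pixels^2 of the perforation and of the whole membrane
    (membrane_mask is assumed to include the perforation)."""
    P = np.count_nonzero(perforation_mask)
    T = np.count_nonzero(membrane_mask)
    return 100.0 * P / T

# toy masks: 40x40 membrane with a 10x10 perforation
membrane = np.ones((40, 40), dtype=bool)
perforation = np.zeros_like(membrane)
perforation[15:25, 15:25] = True
print(percentage_perforation(perforation, membrane))  # 6.25
```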
Willis, Brian H; Riley, Richard D
2017-09-20
An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice: does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
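The validation statistic Vn has a specific definition and null distribution derived in the paper; the sketch below only illustrates the underlying leave-one-out idea: pool all studies but one (fixed-effect, inverse-variance weights) and standardize each study's deviation from its leave-one-out estimate. It is an illustrative stand-in, not the authors' statistic.

```python
import numpy as np

def loo_discrepancies(y, se):
    """For each study i, pool the remaining studies (fixed effect,
    inverse-variance weights) and standardize the difference between
    study i and the leave-one-out pooled estimate."""
    y, se = np.asarray(y), np.asarray(se)
    w = 1.0 / se**2
    z = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        mu_loo = np.sum(w[keep] * y[keep]) / np.sum(w[keep])
        var_loo = 1.0 / np.sum(w[keep])
        # predictive variance for a new study with variance se[i]^2
        z[i] = (y[i] - mu_loo) / np.sqrt(se[i]**2 + var_loo)
    return z

y = [0.30, 0.25, 0.41, 0.18, 0.35]
se = [0.10, 0.12, 0.09, 0.15, 0.11]
print(loo_discrepancies(y, se))
```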
Statistical Emulation of Climate Model Projections Based on Precomputed GCM Runs*
Castruccio, Stefano; McInerney, David J.; Stein, Michael L.; ...
2014-02-24
The authors describe a new approach for emulating the output of a fully coupled climate model under arbitrary forcing scenarios that is based on a small set of precomputed runs from the model. Temperature and precipitation are expressed as simple functions of the past trajectory of atmospheric CO2 concentrations, and a statistical model is fit using a limited set of training runs. The approach is demonstrated to be a useful and computationally efficient alternative to pattern scaling and captures the nonlinear evolution of spatial patterns of climate anomalies inherent in transient climates. The approach does as well as pattern scaling in all circumstances and substantially better in many; it is not computationally demanding; and, once the statistical model is fit, it produces emulated climate output effectively instantaneously. In conclusion, it may therefore find wide application in climate impacts assessments and other policy analyses requiring rapid climate projections.
Tipping points in the arctic: eyeballing or statistical significance?
Carstensen, Jacob; Weydmann, Agata
2012-02-01
Arctic ecosystems have experienced and are projected to experience continued large increases in temperature and declines in sea ice cover. It has been hypothesized that small changes in ecosystem drivers can fundamentally alter ecosystem functioning, and that this might be particularly pronounced for Arctic ecosystems. We present a suite of simple statistical analyses to identify changes in the statistical properties of data, emphasizing that changes in the standard error should be considered in addition to changes in mean properties. The methods are exemplified using sea ice extent, and suggest that the loss rate of sea ice accelerated by a factor of ~5 in 1996, as reported in other studies, but that increases in random fluctuations, as an early warning signal, were observed as early as 1990. We recommend employing the proposed methods more systematically in tipping-point analyses to document effects of climate change in the Arctic.
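A minimal sketch of the kind of check the authors advocate, assuming a univariate time series: track a rolling mean alongside a rolling standard error, since rising random fluctuations can flag an approaching transition before the mean shifts. The window size and toy data are illustrative.

```python
import numpy as np
import pandas as pd

def rolling_properties(x, window=10):
    """Rolling mean and standard error; a sustained rise in the SE
    (increasing random fluctuations) can precede a shift in the mean."""
    s = pd.Series(x)
    mean = s.rolling(window).mean()
    se = s.rolling(window).std() / np.sqrt(window)
    return mean, se

# toy series: stable, then noisier, then a level shift
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(10, 0.3, 40),
                    rng.normal(10, 1.0, 20),   # variance rises first
                    rng.normal(7, 1.0, 40)])   # then the mean drops
mean, se = rolling_properties(x)
print(se.iloc[35:65].round(2))
```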
OdorMapComparer: an application for quantitative analyses and comparisons of fMRI brain odor maps.
Liu, Nian; Xu, Fuqiang; Miller, Perry L; Shepherd, Gordon M
2007-01-01
Brain odor maps are reconstructed flat images that describe the spatial activity patterns in the glomerular layer of the olfactory bulbs in animals exposed to different odor stimuli. We have developed a software application, OdorMapComparer, to carry out quantitative analyses and comparisons of the fMRI odor maps. The application is an open-source Windows program that first loads the two odor map images being compared. It allows image transformations including scaling, flipping, rotating, and warping so that the two images can be appropriately aligned to each other. It performs simple subtraction, addition, and averaging of signals in the two images. It also provides comparative statistics including the normalized correlation (NC) and spatial correlation coefficient. Experimental studies showed that the rodent fMRI odor maps for aliphatic aldehydes displayed spatial activity patterns that are similar in gross outlines but somewhat different in specific subregions. Analyses with OdorMapComparer indicate that the similarity between odor maps decreases with increasing difference in the length of carbon chains. For example, the map of butanal is more closely related to that of pentanal (NC = 0.617) than to that of octanal (NC = 0.082), which is consistent with animal behavioral studies. The study also indicates that fMRI odor maps are statistically odor-specific and repeatable across both intra- and intersubject trials. OdorMapComparer thus provides a tool for quantitative, statistical analyses and comparisons of fMRI odor maps in a fashion that is integrated with the overall odor mapping techniques.
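The normalized correlation the application reports is, in essence, a Pearson-style correlation over aligned map pixels; a sketch under that assumption (not the OdorMapComparer source):

```python
import numpy as np

def normalized_correlation(a, b):
    """Normalized correlation between two odor maps of identical
    shape (assumed already aligned)."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum()))

rng = np.random.default_rng(0)
map1 = rng.random((64, 64))
map2 = 0.7 * map1 + 0.3 * rng.random((64, 64))  # a similar map
print(normalized_correlation(map1, map2))
```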
NASA Astrophysics Data System (ADS)
Agus, M.; Penna, M. P.; Peró-Cebollero, M.; Guàrdia-Olmos, J.
2015-02-01
Numerous studies have examined students' difficulties in understanding some notions related to statistical problems. Some authors have observed that presenting distinct visual representations could enhance statistical reasoning, supporting the principle of graphical facilitation. Other researchers disagree with this viewpoint, emphasising the drawbacks of illustrations that could overload the cognitive system with irrelevant data. In this work we compare probabilistic statistical reasoning across two formats of problem presentation: graphical and verbal-numerical. We conceived and presented five pairs of homologous simple problems in verbal-numerical and graphical formats to 311 undergraduate psychology students (n=156 in Italy and n=155 in Spain) without statistical expertise. The purpose of our work was to evaluate the effect of graphical facilitation in probabilistic statistical reasoning. Each undergraduate solved every pair of problems in both formats, with presentation orders and sequences varied. Data analyses highlighted that the effect of graphical facilitation is infrequent in psychology undergraduates. This effect is related to many factors (such as knowledge, abilities, attitudes, and anxiety); moreover, it might be considered the result of an interaction between individual and task characteristics.
Applying the compound Poisson process model to the reporting of injury-related mortality rates.
Kegler, Scott R
2007-02-16
Injury-related mortality rate estimates are often analyzed under the assumption that case counts follow a Poisson distribution. Certain types of injury incidents occasionally involve multiple fatalities, however, resulting in dependencies between cases that are not reflected in the simple Poisson model and which can affect even basic statistical analyses. This paper explores the compound Poisson process model as an alternative, emphasizing adjustments to some commonly used interval estimators for population-based rates and rate ratios. The adjusted estimators involve relatively simple closed-form computations, which in the absence of multiple-case incidents reduce to familiar estimators based on the simpler Poisson model. Summary data from the National Violent Death Reporting System are referenced in several examples demonstrating application of the proposed methodology.
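As a hedged illustration of the adjustment's flavor (the paper's exact estimators differ in detail): under a compound Poisson process the variance of the total case count can be estimated by the sum of squared incident sizes, which collapses to the familiar Poisson variance (the case count itself) when every incident involves exactly one case.

```python
import numpy as np
from scipy.stats import norm

def rate_ci(incident_sizes, population, alpha=0.05):
    """Rate and normal-approximation CI when one incident can involve
    several cases.  Var(total cases) is estimated by the sum of
    squared incident sizes; with all sizes equal to 1 this reduces to
    the usual Poisson variance (= the case count)."""
    sizes = np.asarray(incident_sizes)
    cases = sizes.sum()
    var = (sizes**2).sum()
    rate = cases / population
    z = norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(var) / population
    return rate, (rate - half, rate + half)

# 3 single-fatality incidents and one incident involving 4 deaths
print(rate_ci([1, 1, 1, 4], population=100_000))
```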
New approach in the quantum statistical parton distribution
NASA Astrophysics Data System (ADS)
Sohaily, Sozha; Vaziri (Khamedi), Mohammad
2017-12-01
An attempt to find simple parton distribution functions (PDFs) based on a quantum statistical approach is presented. The PDFs described by the statistical model have very interesting physical properties which help to understand the structure of partons. The longitudinal portion of the distribution functions is given by applying the maximum entropy principle. An interesting and simple approach to determine the statistical variables exactly, without fitting and fixing parameters, is surveyed. Analytic expressions of the x-dependent PDFs are obtained in the whole x region [0, 1], and the computed distributions are consistent with the experimental observations. The agreement with experimental data provides a robust confirmation of the presented statistical model.
Cumulative sum control charts for assessing performance in arterial surgery.
Beiles, C Barry; Morton, Anthony P
2004-03-01
The Melbourne Vascular Surgical Association (Melbourne, Australia) undertakes surveillance of mortality following aortic aneurysm surgery, patency at discharge following infrainguinal bypass, and stroke and death following carotid endarterectomy. A quality improvement protocol employing the Deming cycle requires that the system for performing surgery first be analysed and optimized; process and outcome data are then collected and carefully analysed. There must be a mechanism so that the causes of unsatisfactory outcomes can be determined, and a good feedback mechanism must exist so that good performance is acknowledged and unsatisfactory performance corrected. A simple method for analysing these data that detects changes in average outcome rates is available using cumulative sum statistical control charts. Data have been analysed both retrospectively from 1999 to 2001 and prospectively during 2002 using cumulative sum control methods. A pathway to deal with control chart signals has been developed. The standard of arterial surgery in Victoria, Australia, is high. In one case a safe and satisfactory outcome was achieved by following the pathway developed by the audit committee. Cumulative sum control charts are a simple and effective tool for the identification of variations in performance standards in arterial surgery. The establishment of a pathway to manage problem performance is a vital part of audit activity.
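A minimal Bernoulli CUSUM sketch of the kind of chart described; the acceptable and unacceptable failure rates, decision limit, and reset rule below are illustrative choices, not the Melbourne audit's parameters.

```python
import numpy as np

def cusum_chart(outcomes, p0=0.05, p1=0.10, h=4.5):
    """Bernoulli CUSUM: accumulate log-likelihood-ratio weights,
    floor the sum at zero, and signal when it crosses the decision
    limit h.  p0 = acceptable failure rate, p1 = unacceptable rate."""
    s, path, signals = 0.0, [], []
    w_fail = np.log(p1 / p0)              # weight added on a failure
    w_ok = np.log((1 - p1) / (1 - p0))    # (negative) weight on a success
    for i, failed in enumerate(outcomes):
        s = max(0.0, s + (w_fail if failed else w_ok))
        path.append(s)
        if s > h:
            signals.append(i)
            s = 0.0                       # reset after a signal
    return path, signals

rng = np.random.default_rng(2)
outcomes = rng.random(200) < 0.08         # true rate between p0 and p1
print(cusum_chart(outcomes)[1])           # indices of any signals
```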
Bootstrap Methods: A Very Leisurely Look.
ERIC Educational Resources Information Center
Hinkle, Dennis E.; Winstead, Wayland H.
The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
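The record is truncated, but the bootstrap idea it illustrates is compact: resample the data with replacement, recompute the statistic, and take the standard deviation of the replicates as the standard error. A sketch in Python rather than the SAS routine the authors describe:

```python
import numpy as np

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Bootstrap standard error of an arbitrary statistic."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    reps = [stat(rng.choice(data, size=len(data), replace=True))
            for _ in range(n_boot)]
    return np.std(reps, ddof=1)

x = [2.1, 3.4, 1.9, 5.6, 4.4, 3.3, 2.8, 6.1]
print(bootstrap_se(x, np.median))
```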
Multi-Scale Modeling to Improve Single-Molecule, Single-Cell Experiments
NASA Astrophysics Data System (ADS)
Munsky, Brian; Shepherd, Douglas
2014-03-01
Single-cell, single-molecule experiments are producing an unprecedented amount of data to capture the dynamics of biological systems. When integrated with computational models, observations of spatial, temporal and stochastic fluctuations can yield powerful quantitative insight. We concentrate on experiments that localize and count individual molecules of mRNA. These high precision experiments have large imaging and computational processing costs, and we explore how improved computational analyses can dramatically reduce overall data requirements. In particular, we show how analyses of spatial, temporal and stochastic fluctuations can significantly enhance parameter estimation results for small, noisy data sets. We also show how full probability distribution analyses can constrain parameters with far less data than bulk analyses or statistical moment closures. Finally, we discuss how a systematic modeling progression from simple to more complex analyses can reduce total computational costs by orders of magnitude. We illustrate our approach using single-molecule, spatial mRNA measurements of Interleukin 1-alpha mRNA induction in human THP1 cells following stimulation. Our approach could improve the effectiveness of single-molecule gene regulation analyses for many other processes.
Methodological and Reporting Quality of Systematic Reviews and Meta-analyses in Endodontics.
Nagendrababu, Venkateshbabu; Pulikkotil, Shaju Jacob; Sultan, Omer Sheriff; Jayaraman, Jayakumar; Peters, Ove A
2018-06-01
The aim of this systematic review (SR) was to evaluate the quality of SRs and meta-analyses (MAs) in endodontics. A comprehensive literature search was conducted to identify relevant articles in the electronic databases from January 2000 to June 2017. Two reviewers independently assessed the articles for eligibility and data extraction. SRs and MAs on interventional studies with a minimum of 2 therapeutic strategies in endodontics were included in this SR. Methodologic and reporting quality were assessed using A Measurement Tool to Assess Systematic Reviews (AMSTAR) and Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA), respectively. The interobserver reliability was calculated using the Cohen kappa statistic. Statistical analysis with the level of significance at P < .05 was performed using Kruskal-Wallis tests and simple linear regression analysis. A total of 30 articles were selected for the current SR. Using AMSTAR, the item on using the scientific quality of included studies in formulating conclusions was adhered to by fewer than 40% of studies. Using PRISMA, 3 items (objectives, protocol registration, and funding) were reported by fewer than 40% of studies. No association was evident between quality and either the number of authors or country. Statistical significance was observed when quality was compared among journals, with studies published as Cochrane reviews superior to those published in other journals. AMSTAR and PRISMA scores were significantly related. SRs in endodontics showed variability in both methodologic and reporting quality. Copyright © 2018 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Angstman, Nicholas B; Frank, Hans-Georg; Schmitz, Christoph
2016-01-01
As a widely used and studied model organism, Caenorhabditis elegans offers the ability to investigate implications of behavioral change. Although investigation of C. elegans behavioral traits is well established, analysis is often narrowed down to measurements based on a single point, and thus cannot pick up subtle behavioral and morphological changes. In the present study, videos were captured of four different C. elegans strains grown in liquid cultures and transferred to NGM-agar plates with an E. coli lawn or with no lawn. Using advanced software, WormLab, the full skeleton and outline of worms were tracked to determine whether the presence of food affects behavioral traits. In all seven investigated parameters, statistically significant differences were found in worm behavior between those moving on NGM-agar plates with an E. coli lawn and those on NGM-agar plates with no lawn. Furthermore, multiple test groups showed differences in interaction between variables, as the parameters that correlated significantly with speed of locomotion varied. In the present study, we demonstrate the validity of a model to analyze C. elegans behavior beyond simple speed of locomotion. The need to account for a nested design while performing statistical analyses in similar studies is also demonstrated. With extended analyses, C. elegans behavioral change can be investigated with greater sensitivity, which could have wide utility in fields such as, but not limited to, toxicology, drug discovery, and RNAi screening.
Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses
Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy
2015-01-01
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
NASA Astrophysics Data System (ADS)
Lee, David S.; Longhurst, James W. S.
Precipitation chemistry data from a dense urban monitoring network in Greater Manchester, northwest England, were compared with interpolated values from the U.K. secondary national acid deposition monitoring network for the year 1988. Differences were found to be small. However, when data from individual sites from the Greater Manchester network were compared with data from the two nearest secondary national network sites, significant differences were found using simple and complex statistical analyses. Precipitation chemistry at rural sites could be similar to that at urban sites, but the sources of some ions were thought to be different. The synoptic-scale gradients of precipitation chemistry, as shown by the secondary national network, also accounted for some of the differences.
A simple program to measure and analyse tree rings using Excel, R and SigmaScan
Hietz, Peter
2011-01-01
I present a new software that links a program for image analysis (SigmaScan), one for spreadsheets (Excel) and one for statistical analysis (R) for applications of tree-ring analysis. The first macro measures ring width marked by the user on scanned images, stores raw and detrended data in Excel and calculates the distance to the pith and inter-series correlations. A second macro measures darkness along a defined path to identify latewood–earlywood transition in conifers, and a third shows the potential for automatic detection of boundaries. Written in Visual Basic for Applications, the code makes use of the advantages of existing programs and is consequently very economic and relatively simple to adjust to the requirements of specific projects or to expand making use of already available code. PMID:26109835
Why the Long Face? The Mechanics of Mandibular Symphysis Proportions in Crocodiles
Walmsley, Christopher W.; Smits, Peter D.; Quayle, Michelle R.; McCurry, Matthew R.; Richards, Heather S.; Oldfield, Christopher C.; Wroe, Stephen; Clausen, Phillip D.; McHenry, Colin R.
2013-01-01
Background Crocodilians exhibit a spectrum of rostral shape from long snouted (longirostrine), through to short snouted (brevirostrine) morphologies. The proportional length of the mandibular symphysis correlates consistently with rostral shape, forming as much as 50% of the mandible’s length in longirostrine forms, but 10% in brevirostrine crocodilians. Here we analyse the structural consequences of an elongate mandibular symphysis in relation to feeding behaviours. Methods/Principal Findings Simple beam and high resolution Finite Element (FE) models of seven species of crocodile were analysed under loads simulating biting, shaking and twisting. Using beam theory, we statistically compared multiple hypotheses of which morphological variables should control the biomechanical response. Brevi- and mesorostrine morphologies were found to consistently outperform longirostrine types when subject to equivalent biting, shaking and twisting loads. The best predictors of performance for biting and twisting loads in FE models were overall length and symphyseal length respectively; for shaking loads symphyseal length and a multivariate measurement of shape (PC1, which is strongly but not exclusively correlated with symphyseal length) were equally good predictors. Linear measurements were better predictors than multivariate measurements of shape in biting and twisting loads. For both biting and shaking loads but not for twisting, simple beam models agree with best performance predictors in FE models. Conclusions/Significance Combining beam and FE modelling allows a priori hypotheses about the importance of morphological traits on biomechanics to be statistically tested. Short mandibular symphyses perform well under loads used for feeding upon large prey, but elongate symphyses incur high strains under equivalent loads, underlining the structural constraints to prey size in the longirostrine morphotype. The biomechanics of the crocodilian mandible are largely consistent with beam theory and can be predicted from simple morphological measurements, suggesting that crocodilians are a useful model for investigating the palaeobiomechanics of other aquatic tetrapods. PMID:23342027
Frøslie, Kathrine Frey; Røislien, Jo; Qvigstad, Elisabeth; Godang, Kristin; Bollerslev, Jens; Voldner, Nanna; Henriksen, Tore; Veierød, Marit B
2013-01-17
Plasma glucose levels are important measures in medical care and research, and are often obtained from oral glucose tolerance tests (OGTT) with repeated measurements over 2-3 hours. It is common practice to use simple summary measures of OGTT curves. However, different OGTT curves can yield similar summary measures, and information of physiological or clinical interest may be lost. Our main aim was to extract information inherent in the shape of OGTT glucose curves, compare it with the information from simple summary measures, and explore the clinical usefulness of such information. OGTTs with five glucose measurements over two hours were recorded for 974 healthy pregnant women in their first trimester. For each woman, the five measurements were transformed into smooth OGTT glucose curves by functional data analysis (FDA), a collection of statistical methods developed specifically to analyse curve data. The essential modes of temporal variation between OGTT glucose curves were extracted by functional principal component analysis. The resultant functional principal component (FPC) scores were compared with commonly used simple summary measures: fasting and two-hour (2-h) values, area under the curve (AUC) and simple shape index (2-h minus 90-min values, or 90-min minus 60-min values). Clinical usefulness of FDA was explored by regression analyses of glucose tolerance later in pregnancy. Over 99% of the variation between individually fitted curves was expressed in the first three FPCs, interpreted physiologically as "general level" (FPC1), "time to peak" (FPC2) and "oscillations" (FPC3). FPC1 scores correlated strongly with AUC (r=0.999), but less with the other simple summary measures (-0.42≤r≤0.79). FPC2 scores gave shape information not captured by simple summary measures (-0.12≤r≤0.40). FPC2 scores, but not FPC1 nor the simple summary measures, discriminated between women who did and did not develop gestational diabetes later in pregnancy. FDA of OGTT glucose curves in early pregnancy extracted shape information that was not identified by commonly used simple summary measures. This information discriminated between women with and without gestational diabetes later in pregnancy.
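FDA toolchains typically smooth each subject's measurements into a curve and then run functional PCA; as a discretized stand-in (an assumption, not the study's pipeline), ordinary PCA applied to curves evaluated on a common time grid recovers the same kind of FPC scores:

```python
import numpy as np
from sklearn.decomposition import PCA

# rows = subjects, columns = glucose (mmol/L) on a common time grid;
# in the study, five OGTT measurements were first smoothed into curves
rng = np.random.default_rng(3)
t = np.linspace(0, 120, 25)                        # minutes
level = rng.normal(0, 0.5, (100, 1))               # "general level" mode
shift = rng.normal(0, 8, (100, 1))                 # "time to peak" mode
curves = (5 + level
          + 2 * np.exp(-((t - 40 - shift) / 30) ** 2)
          + rng.normal(0, 0.1, (100, t.size)))

pca = PCA(n_components=3)
scores = pca.fit_transform(curves)                 # per-subject FPC scores
print(pca.explained_variance_ratio_)               # variance per mode
```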
Advancements in RNASeqGUI towards a Reproducible Analysis of RNA-Seq Experiments
Russo, Francesco; Righelli, Dario
2016-01-01
We present the advancements and novelties recently introduced in RNASeqGUI, a graphical user interface that helps biologists to handle and analyse large data collected in RNA-Seq experiments. This work focuses on the concept of reproducible research and shows how it has been incorporated in RNASeqGUI to provide reproducible (computational) results. The novel version of RNASeqGUI combines graphical interfaces with tools for reproducible research, such as literate statistical programming, human readable report, parallel executions, caching, and interactive and web-explorable tables of results. These features allow the user to analyse big datasets in a fast, efficient, and reproducible way. Moreover, this paper represents a proof of concept, showing a simple way to develop computational tools for Life Science in the spirit of reproducible research. PMID:26977414
Dwivedi, Jaya; Namdev, Kuldeep K; Chilkoti, Deepak C; Verma, Surajpal; Sharma, Swapnil
2018-06-06
Therapeutic drug monitoring (TDM) of anti-epileptic drugs provides a valid clinical tool in optimization of overall therapy. However, TDM is challenging due to the high storage/shipment costs of biological samples (plasma/blood) and the limited availability of laboratories providing TDM services. Sampling in the form of dry plasma spot (DPS) or dry blood spot (DBS) is a suitable alternative to overcome these issues. An improved, simple, rapid, and stability-indicating method for quantification of pregabalin in human plasma and DPS has been developed and validated. Analyses were performed on a liquid chromatography tandem mass spectrometer under positive ionization mode of the electrospray interface. Pregabalin-d4 was used as internal standard, and the chromatographic separations were performed on a Poroshell 120 EC-C18 column using an isocratic mobile phase flow rate of 1 mL/min. Stability of pregabalin in DPS was evaluated under simulated real-time conditions. Extraction procedures from plasma and DPS samples were compared using statistical tests. The method was validated considering the FDA method validation guideline. The method was linear over the concentration range of 20-16000 ng/mL and 100-10000 ng/mL in plasma and DPS, respectively. DPS samples were found stable for only one week upon storage at room temperature and for at least four weeks at freezing temperature (-20 ± 5 °C). The method was applied for quantification of pregabalin in over 600 samples of a clinical study. Statistical analyses revealed that the two extraction procedures in plasma and DPS samples showed a statistically insignificant difference and can be used interchangeably without bias. The proposed method involves simple and rapid sample processing steps that do not require a pre- or post-column derivatization procedure. The method is suitable for routine pharmacokinetic analysis and therapeutic monitoring of pregabalin.
Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan
2006-07-15
ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and alternatively by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristics (ROC)-curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by a quantitative PCR (qPCR) and with the IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting the IgV(H) mutation status. Applying the KS test is reproducible, simple, straightforward, and overcomes a number of difficulties encountered in the Crespo-method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
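The statistical core of the approach, comparing the ZAP-70 fluorescence distribution of B cells against the internal T-cell reference with a two-sample KS test, is available off the shelf in scipy; the log-normal toy intensities below are an assumption, not patient data:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
t_cells = rng.lognormal(mean=1.0, sigma=0.4, size=500)   # ZAP-70 bright reference
b_cells = rng.lognormal(mean=0.6, sigma=0.4, size=500)   # CLL B cells

stat, p = ks_2samp(t_cells, b_cells)
print(f"KS D = {stat:.3f}, p = {p:.2e}")
```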
The power to detect linkage in complex disease by means of simple LOD-score analyses.
Greenberg, D A; Abreu, P; Hodge, S E
1998-01-01
Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are ≥80% of the ELOD under the true model when the ELOD for the true model is ≥3. Similarly, the power to reach a given LOD score was usually ≥80% that of the true model, when the power under the true model was ≥60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage. PMID:9718328
Using conventional F-statistics to study unconventional sex-chromosome differentiation.
Rodrigues, Nicolas; Dufresnes, Christophe
2017-01-01
Species with undifferentiated sex chromosomes emerge as key organisms to understand the astonishing diversity of sex-determination systems. Whereas new genomic methods are widening opportunities to study these systems, the difficulty of separately characterizing their X and Y homologous chromosomes poses limitations. Here we demonstrate that two simple F-statistics calculated from sex-linked genotypes, namely the genetic distance (Fst) between sexes and the inbreeding coefficient (Fis) in the heterogametic sex, can be used as reliable proxies to compare sex-chromosome differentiation between populations. We correlated these metrics using published microsatellite data from two frog species (Hyla arborea and Rana temporaria), and show that they intimately relate to the overall amount of X-Y differentiation in populations. However, the fits for individual loci appear highly variable, suggesting that a dense genetic coverage will be needed for inferring fine-scale patterns of differentiation along sex chromosomes. The application of these F-statistics, which imply minimal sampling requirements, significantly facilitates population analyses of sex chromosomes.
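A sketch of the two proxies computed from raw genotypes, using a simple Nei-style Fst = (Ht - Hs)/Ht with the sexes as the two groups and Fis = 1 - Ho/He. The estimator choices and the toy X/Y-like locus are illustrative assumptions; the paper's microsatellite analyses may use sample-size-corrected estimators.

```python
import numpy as np

def fis(genotypes):
    """F_is = 1 - Ho/He at one biallelic locus; genotypes are (a1, a2) pairs."""
    g = np.asarray(genotypes)
    ho = np.mean(g[:, 0] != g[:, 1])              # observed heterozygosity
    alleles = g.ravel()
    freqs = np.bincount(alleles, minlength=2) / alleles.size
    he = 1 - np.sum(freqs**2)                     # expected heterozygosity
    return 1 - ho / he

def fst_between_sexes(gen_m, gen_f):
    """Simple F_st = (Ht - Hs)/Ht with males and females as the two groups."""
    def het(g):
        alleles = np.asarray(g).ravel()
        freqs = np.bincount(alleles, minlength=2) / alleles.size
        return 1 - np.sum(freqs**2), freqs
    he_m, f_m = het(gen_m)
    he_f, f_f = het(gen_f)
    hs = (he_m + he_f) / 2                        # mean within-sex heterozygosity
    ht = 1 - np.sum(((f_m + f_f) / 2)**2)         # total heterozygosity
    return (ht - hs) / ht

# toy X/Y-like locus: allele 1 confined to the heterogametic males
males = [(0, 1)] * 20      # all heterozygous -> strongly negative F_is
females = [(0, 0)] * 20
print(fst_between_sexes(males, females), fis(males))  # ~0.33, -1.0
```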
Weighing Evidence "Steampunk" Style via the Meta-Analyser.
Bowden, Jack; Jackson, Chris
2016-10-01
The funnel plot is a graphical visualization of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modeling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the center of mass of a physical system. We used this analogy to explain the concepts of weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without recourse to precise definitions or statistical formulas and with a little help from Sherlock Holmes! Following on from the science fair, we have developed an interactive web-application (named the Meta-Analyser) to bring these ideas to a wider audience. We envisage that our application will be a useful tool for researchers when interpreting their data. First, to facilitate a simple understanding of fixed and random effects modeling approaches; second, to assess the importance of outliers; and third, to show the impact of adjusting for small study bias. This final aim is realized by introducing a novel graphical interpretation of the well-known method of Egger regression.
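The center-of-mass analogy maps directly onto the fixed-effect estimate: each study is a point mass located at its effect estimate, with mass equal to its inverse variance, and the pooled estimate is the balance point. A minimal sketch:

```python
import numpy as np

def fixed_effect(estimates, ses):
    """Inverse-variance weighted mean: the 'center of mass' of the
    studies, with weight = 1/se^2 playing the role of mass."""
    w = 1.0 / np.asarray(ses)**2
    y = np.asarray(estimates)
    mu = np.sum(w * y) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return mu, se

print(fixed_effect([0.4, 0.1, 0.35], [0.05, 0.20, 0.10]))
```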
A Simple Illustration for the Need of Multiple Comparison Procedures
ERIC Educational Resources Information Center
Carter, Rickey E.
2010-01-01
Statistical adjustments to accommodate multiple comparisons are routinely covered in introductory statistical courses. The fundamental rationale for such adjustments, however, may not be readily understood. This article presents a simple illustration to help remedy this.
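The usual arithmetic behind such illustrations, assuming k independent tests: the chance of at least one false positive grows as 1 - (1 - α)^k, which the Bonferroni correction counters by testing each comparison at α/k.

```python
# Familywise error rate for k independent tests at level alpha,
# and the Bonferroni-corrected per-test level.
alpha, k = 0.05, 10
fwer = 1 - (1 - alpha)**k
print(f"P(at least one false positive) = {fwer:.3f}")  # ~0.401
print(f"Bonferroni per-test level      = {alpha / k}") # 0.005
```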
NASA Technical Reports Server (NTRS)
Vangelder, B. H. W.
1978-01-01
Non-Bayesian statistics were used in simulation studies centered around laser range observations to LAGEOS. The capabilities of satellite laser ranging, especially in connection with relative station positioning, are evaluated. The satellite measurement system under investigation may fall short in precise determinations of the earth's orientation (precession and nutation) and earth's rotation, as opposed to systems such as very long baseline interferometry (VLBI) and lunar laser ranging (LLR). Relative station positioning, determination of (differential) polar motion, positioning of stations with respect to the earth's center of mass and determination of the earth's gravity field should be easily realized by satellite laser ranging (SLR). The last two features should be considered as best (or solely) determinable by SLR in contrast to VLBI and LLR.
A Simple Secure Hash Function Scheme Using Multiple Chaotic Maps
NASA Astrophysics Data System (ADS)
Ahmad, Musheer; Khurana, Shruti; Singh, Sushmita; AlSharari, Hamed D.
2017-06-01
Chaotic maps possess high parameter sensitivity, random-like behavior and one-way computations, which favor the construction of cryptographic hash functions. In this paper, we present a novel hash function scheme which uses multiple chaotic maps to generate efficient variable-sized hash functions. The message is divided into four parts; each part is processed by a different 1D chaotic map unit yielding an intermediate hash code. The four codes are concatenated into two blocks, and each block is processed through a 2D chaotic map unit separately. The final hash value is generated by combining the two partial hash codes. Simulation analyses such as distribution of hashes, statistical properties of confusion and diffusion, message and key sensitivity, collision resistance and flexibility are performed. The results reveal that the proposed hash scheme is simple, efficient and holds comparable capabilities when compared with some recent chaos-based hash algorithms.
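A deliberately simplified single-map toy in the same spirit (the paper's scheme splits the message across multiple 1D and 2D maps; this sketch is not that algorithm, and it is not cryptographically vetted):

```python
def chaotic_hash(message: bytes, digest_bits: int = 128) -> str:
    """Toy hash: perturb a logistic-map trajectory with the message
    bytes, then harvest digest bits from the trajectory.
    For illustration only -- NOT a vetted cryptographic hash."""
    x, r = 0.632, 3.99              # initial state, chaotic regime
    for b in message:
        # mix each byte into the state, keeping x inside (0, 1)
        x = (x + (b + 1) / 257.0) % 1.0 or 0.5
        for _ in range(8):          # iterate to diffuse the perturbation
            x = r * x * (1 - x)
    digest = 0
    for _ in range(digest_bits):
        x = r * x * (1 - x)
        digest = (digest << 1) | (x > 0.5)
    return f"{digest:0{digest_bits // 4}x}"

print(chaotic_hash(b"statistical universals"))
print(chaotic_hash(b"statistical universalt"))  # one-byte change, new digest
```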
van Rhee, Henk; Hak, Tony
2017-01-01
We present a new tool for meta‐analysis, Meta‐Essentials, which is free of charge and easy to use. In this paper, we introduce the tool and compare its features to other tools for meta‐analysis. We also provide detailed information on the validation of the tool. Although free of charge and simple, Meta‐Essentials automatically calculates effect sizes from a wide range of statistics and can be used for a wide range of meta‐analysis applications, including subgroup analysis, moderator analysis, and publication bias analyses. The confidence interval of the overall effect is automatically based on the Knapp‐Hartung adjustment of the DerSimonian‐Laird estimator. However, more advanced meta‐analysis methods such as meta‐analytical structural equation modelling and meta‐regression with multiple covariates are not available. In summary, Meta‐Essentials may prove a valuable resource for meta‐analysts, including researchers, teachers, and students. PMID:28801932
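The tool's default inference can be sketched directly: DerSimonian-Laird moment estimation of the between-study variance, followed by the Knapp-Hartung adjustment (a rescaled variance estimator with a t-distribution on k - 1 degrees of freedom). A minimal version under those standard formulas, not Meta-Essentials' validated code:

```python
import numpy as np
from scipy.stats import t

def random_effects_kh(y, se, alpha=0.05):
    """DerSimonian-Laird tau^2 with the Knapp-Hartung adjustment."""
    y, se = np.asarray(y), np.asarray(se)
    k = len(y)
    w = 1.0 / se**2
    mu_fe = np.sum(w * y) / np.sum(w)              # fixed-effect mean
    q = np.sum(w * (y - mu_fe)**2)                 # Cochran's Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)             # DL between-study variance
    w_re = 1.0 / (se**2 + tau2)
    mu = np.sum(w_re * y) / np.sum(w_re)
    # Knapp-Hartung variance estimator and t-based interval
    var_kh = np.sum(w_re * (y - mu)**2) / ((k - 1) * np.sum(w_re))
    half = t.ppf(1 - alpha / 2, df=k - 1) * np.sqrt(var_kh)
    return mu, (mu - half, mu + half), tau2

y = [0.10, 0.30, 0.35, 0.65, 0.45, 0.15]
se = [0.15, 0.20, 0.10, 0.25, 0.12, 0.18]
print(random_effects_kh(y, se))
```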
Heinrich, J; Putzar, H; Seidenschnur, G
1977-01-01
Report on an electronic data processing system for obstetrics and perinatology. The data document design and data flowchart that were developed are shown. The continuously accumulated data allowed a detailed interpretation record at any time. For all clinically treated patients, the computer printed out a final obstetrical report (physician's report). By means of this simple system the available information is improved, and the typewriting workload of the medical staff has been reduced. Cost and work expenditure for this system are limited in relation to our earlier manual procedure.
Ostberg, C.O.; Rodriguez, R.J.
2004-01-01
Eight polymerase chain reaction primer sets amplifying bi-parentally inherited species-specific markers were developed that differentiate between rainbow trout (Oncorhynchus mykiss) and various cutthroat trout (O. clarki) subspecies. The primers were tested within known F1 and first generation hybrid backcrosses and were shown to amplify codominantly within hybrids. Heterozygous individuals also amplified a slower migrating band that was a heteroduplex, caused by the annealing of polymerase chain reaction products from both species. These primer sets have numerous advantages for native cutthroat trout conservation including statistical genetic analyses of known crosses and simple hybrid identification.
NASA Astrophysics Data System (ADS)
Roberts, Michael J.; Braun, Noah O.; Sinclair, Thomas R.; Lobell, David B.; Schlenker, Wolfram
2017-09-01
We compare predictions of a simple process-based crop model (Soltani and Sinclair 2012), a simple statistical model (Schlenker and Roberts 2009), and a combination of both models to actual maize yields on a large, representative sample of farmer-managed fields in the Corn Belt region of the United States. After statistical post-model calibration, the process model (Simple Simulation Model, or SSM) predicts actual outcomes slightly better than the statistical model, but the combined model performs significantly better than either model. The SSM, statistical model and combined model all show similar relationships with precipitation, while the SSM better accounts for temporal patterns of precipitation, vapor pressure deficit and solar radiation. The statistical and combined models show a more negative impact associated with extreme heat for which the process model does not account. Due to the extreme heat effect, predicted impacts under uniform climate change scenarios are considerably more severe for the statistical and combined models than for the process-based model.
Modeling and replicating statistical topology and evidence for CMB nonhomogeneity
Agami, Sarit
2017-01-01
Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301
Investigating the Group-Level Impact of Advanced Dual-Echo fMRI Combinations
Kettinger, Ádám; Hill, Christopher; Vidnyánszky, Zoltán; Windischberger, Christian; Nagy, Zoltán
2016-01-01
Multi-echo fMRI data acquisition has been widely investigated and suggested to optimize sensitivity for detecting the BOLD signal. Several methods have also been proposed for the combination of data with different echo times. The aim of the present study was to investigate whether these advanced echo combination methods provide advantages over the simple averaging of echoes when state-of-the-art group-level random-effect analyses are performed. Both resting-state and task-based dual-echo fMRI data were collected from 27 healthy adult individuals (14 male, mean age = 25.75 years) using standard echo-planar acquisition methods at 3T. Both resting-state and task-based data were subjected to a standard image pre-processing pipeline. Subsequently the two echoes were combined as a weighted average, using four different strategies for calculating the weights: (1) simple arithmetic averaging, (2) BOLD sensitivity weighting, (3) temporal-signal-to-noise ratio weighting and (4) temporal BOLD sensitivity weighting. Our results clearly show that the simple averaging of data with the different echoes is sufficient. Advanced echo combination methods may provide advantages on a single-subject level but when considering random-effects group level statistics they provide no benefit regarding sensitivity (i.e., group-level t-values) compared to the simple echo-averaging approach. One possible reason for the lack of clear advantages may be that apart from increasing the average BOLD sensitivity at the single-subject level, the advanced weighted averaging methods also inflate the inter-subject variance. As the echo combination methods provide very similar results, the recommendation is to choose between them depending on the availability of time for collecting additional resting-state data or whether subject-level or group-level analyses are planned. PMID:28018165
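A toy sketch of two of the four echo-combination strategies listed above, strategies (1) and (3); the arrays, echo times, and exact weighting formula are illustrative assumptions, not the study's implementation.

```python
# Minimal sketch of dual-echo combination as a voxelwise weighted average.
# 'echo1' and 'echo2' stand in for hypothetical 4-D arrays (x, y, z, time).
import numpy as np

rng = np.random.default_rng(1)
echo1 = rng.normal(100, 5, (4, 4, 4, 200))
echo2 = rng.normal(60, 7, (4, 4, 4, 200))

# (1) simple arithmetic averaging
avg = 0.5 * (echo1 + echo2)

# (3) temporal-SNR weighting: w ~ tSNR of each echo, normalised voxelwise
tsnr1 = echo1.mean(axis=-1) / echo1.std(axis=-1)
tsnr2 = echo2.mean(axis=-1) / echo2.std(axis=-1)
w1 = tsnr1 / (tsnr1 + tsnr2)
w2 = 1.0 - w1
tsnr_weighted = w1[..., None] * echo1 + w2[..., None] * echo2
```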
gHRV: Heart rate variability analysis made easy.
Rodríguez-Liñares, L; Lado, M J; Vila, X A; Méndez, A J; Cuesta, P
2014-08-01
In this paper, the gHRV software tool is presented. It is a simple, free and portable tool developed in Python for analysing heart rate variability. It includes a graphical user interface and can import files in multiple formats, analyse time intervals in the signal, test statistical significance and export the results. This paper also contains, as an example of use, a clinical analysis performed with gHRV, namely to determine whether heart rate variability indexes change across different stages of sleep. Results from tests completed by researchers who have tried gHRV are also reported: in general, the application was positively valued, and the results reflect a high level of satisfaction. gHRV is in continuous development, and new versions will include suggestions made by testers.
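For flavour, here is a toy computation of two standard time-domain HRV indexes of the kind such a tool reports; the RR-interval series is made up, and this is an illustration of the indexes, not gHRV's actual code.

```python
# Two common time-domain HRV indexes from a hypothetical RR series (ms).
import numpy as np

rr_ms = np.array([812, 790, 805, 830, 795, 788, 810, 825, 801, 793], float)

sdnn = rr_ms.std(ddof=1)                       # SDNN: SD of RR intervals
rmssd = np.sqrt(np.mean(np.diff(rr_ms) ** 2))  # RMSSD: successive differences
print(f"SDNN = {sdnn:.1f} ms, RMSSD = {rmssd:.1f} ms")
```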
Small, J R
1993-01-01
This paper is a study of the effects of experimental error on the estimated values of flux control coefficients obtained using specific inhibitors. Two possible techniques for analysing the experimental data are compared: a simple extrapolation method (the so-called graph method) and a non-linear function fitting method. For these techniques, the sources of systematic errors are identified and the effects of systematic and random errors are quantified, using both statistical analysis and numerical computation. It is shown that the graph method is very sensitive to random errors and that, under all conditions studied, the fitting method outperformed the graph method, even under conditions where the assumptions underlying the fitted function did not hold. Possible ways of designing experiments to minimize the effects of experimental errors are analysed and discussed. PMID:8257434
Statistics for X-chromosome associations.
Özbek, Umut; Lin, Hui-Min; Lin, Yan; Weeks, Daniel E; Chen, Wei; Shaffer, John R; Purcell, Shaun M; Feingold, Eleanor
2018-06-13
In a genome-wide association study (GWAS), association between genotype and phenotype at autosomal loci is generally tested by regression models. However, X-chromosome data are often excluded from published analyses of autosomes because of the difference between males and females in number of X chromosomes. Failure to analyze X-chromosome data at all is obviously less than ideal, and can lead to missed discoveries. Even when X-chromosome data are included, they are often analyzed with suboptimal statistics. Several mathematically sensible statistics for X-chromosome association have been proposed. The optimality of these statistics, however, is based on very specific simple genetic models. In addition, while previous simulation studies of these statistics have been informative, they have focused on single-marker tests and have not considered the types of error that occur even under the null hypothesis when the entire X chromosome is scanned. In this study, we comprehensively tested several X-chromosome association statistics using simulation studies that include the entire chromosome. We also considered a wide range of trait models for sex differences and phenotypic effects of X inactivation. We found that models that do not incorporate a sex effect can have large type I error in some cases. We also found that many of the best statistics perform well even when there are modest deviations from assumptions, such as trait variance differences between the sexes or small sex differences in allele frequencies.
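A hedged sketch of one common flavour of X-chromosome association test, not necessarily among the specific statistics evaluated in the paper: code female genotypes 0/1/2 and male genotypes 0/2 (a coding motivated by X inactivation), and include sex as a covariate so that a sex effect is modelled rather than absorbed by the SNP term. All data below are simulated.

```python
# Sketch of an X-chromosome logistic regression test with a sex covariate.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
sex = rng.integers(0, 2, n)                  # 0 = female, 1 = male
g = np.where(sex == 1,
             2 * rng.binomial(1, 0.3, n),    # males: 0 or 2 (XCI-style coding)
             rng.binomial(2, 0.3, n))        # females: 0, 1, 2
logit_p = -1.0 + 0.25 * g + 0.3 * sex        # simulated trait model
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([g, sex]))
fit = sm.Logit(y, X).fit(disp=0)
print(fit.params, fit.pvalues)               # SNP term tests association
```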
2016-01-01
Purpose The aim of this study was to determine the influence of anatomical conditions on primary stability in models simulating the posterior maxilla. Methods Polyurethane blocks were designed to simulate monocortical (M) and bicortical (B) conditions. Each condition had four subgroups measuring 3 mm (M3, B3), 5 mm (M5, B5), 8 mm (M8, B8), and 12 mm (M12, B12) in residual bone height (RBH). After implant placement, the implant stability quotient (ISQ), Periotest value (PTV), insertion torque (IT), and reverse torque (RT) were measured. Two-factor ANOVA (two cortical conditions × four RBHs) and additional analyses for simple main effects were performed. Results A significant interaction between cortical condition and RBH was demonstrated for all methods measuring stability with two-factor ANOVA. In the analyses for simple main effects, ISQ and PTV were statistically higher in the bicortical groups than in the corresponding monocortical groups. In the monocortical group, ISQ and PTV showed a statistically significant rise with increasing RBH. Measurements of IT and RT showed a similar tendency, measuring highest in the M3 group, followed by the M8, the M5, and the M12 groups. In the bicortical group, all variables showed a similar tendency, with different degrees of rise and decline. The B8 group showed the highest values, followed by the B12, the B5, and the B3 groups. The highest correlation coefficient was observed between ISQ and PTV. Conclusions Primary stability was enhanced by the presence of bicortex and increased RBH, which may be better demonstrated by ISQ and PTV than by IT and RT. PMID:27588215
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
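A bare-bones sketch of the ABC rejection algorithm that such software builds on; BaySICS itself simulates coalescent genealogies, whereas this toy uses a one-parameter normal model to keep the idea visible.

```python
# Toy ABC rejection sampler: draw from the prior, simulate, keep draws
# whose simulated summary statistic lands close to the observed one.
import numpy as np

rng = np.random.default_rng(3)
observed = 4.2                     # observed summary statistic (hypothetical)
tolerance = 0.1
accepted = []

for _ in range(100_000):
    theta = rng.uniform(0, 10)     # draw parameter from the prior
    sim = rng.normal(theta, 1.0)   # simulate data and summarise
    if abs(sim - observed) < tolerance:
        accepted.append(theta)

post = np.array(accepted)          # approximate posterior sample
print(f"posterior mean ~ {post.mean():.2f}, n accepted = {post.size}")
```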
[Development of an Excel spreadsheet for meta-analysis of indirect and mixed treatment comparisons].
Tobías, Aurelio; Catalá-López, Ferrán; Roqué, Marta
2014-01-01
Meta-analyses in clinical research usually aim to evaluate treatment efficacy and safety in direct comparison with a single comparator. Indirect comparisons, using Bucher's method, can summarize primary data when information from direct comparisons is limited or nonexistent. Mixed comparisons allow combining estimates from direct and indirect comparisons, increasing statistical power. There is a need for simple applications for meta-analysis of indirect and mixed comparisons, which can easily be conducted using a Microsoft Office Excel spreadsheet. We developed a user-friendly spreadsheet for indirect and mixed comparisons aimed at clinical researchers interested in systematic reviews but unfamiliar with more advanced statistical packages. The proposed Excel spreadsheet for indirect and mixed comparisons can be of great use in clinical epidemiology, extending the knowledge provided by traditional meta-analysis when evidence from direct comparisons is limited or nonexistent.
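The arithmetic such a spreadsheet automates is short enough to sketch directly. Bucher's adjusted indirect comparison of A versus C via a common comparator B is the difference of the two direct estimates, with variances summed; a mixed estimate then combines direct and indirect evidence by inverse-variance weighting. The numbers below are hypothetical.

```python
# Bucher indirect comparison and inverse-variance mixed estimate.
# Effects are, e.g., log odds ratios with standard errors.
import math

d_ab, se_ab = -0.30, 0.12   # A vs common comparator B (direct)
d_cb, se_cb = -0.10, 0.15   # C vs common comparator B (direct)

# Indirect A vs C: difference of the two direct estimates
d_ac_ind = d_ab - d_cb
se_ac_ind = math.sqrt(se_ab**2 + se_cb**2)

# Mixed estimate: combine with a direct A vs C estimate by inverse variance
d_ac_dir, se_ac_dir = -0.25, 0.20
w_dir, w_ind = 1 / se_ac_dir**2, 1 / se_ac_ind**2
d_mixed = (w_dir * d_ac_dir + w_ind * d_ac_ind) / (w_dir + w_ind)
se_mixed = math.sqrt(1 / (w_dir + w_ind))
print(f"indirect: {d_ac_ind:.2f} (SE {se_ac_ind:.2f}); "
      f"mixed: {d_mixed:.2f} (SE {se_mixed:.2f})")
```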
Statistical complexity without explicit reference to underlying probabilities
NASA Astrophysics Data System (ADS)
Pennini, F.; Plastino, A.
2018-06-01
We show that extremely simple systems with a not too large number of particles can be simultaneously thermally stable and complex. To that end, we extend the notion of statistical complexity to simple configurations of non-interacting particles, without appeal to probabilities, and discuss configurational properties.
Cost analysis and outcomes of simple elbow dislocations
Panteli, Michalis; Pountos, Ippokratis; Kanakaris, Nikolaos K; Tosounidis, Theodoros H; Giannoudis, Peter V
2015-01-01
AIM: To evaluate the management, clinical outcome and cost implications of three different treatment regimes for simple elbow dislocations. METHODS: Following institutional board approval, we performed a retrospective review of all consecutive patients treated for simple elbow dislocations in a Level I trauma centre between January 2008 and December 2010. Based on the length of elbow immobilisation (LOI), patients were divided into three groups (Group I, < 2 wk; Group II, 2-3 wk; and Group III, > 3 wk). Outcome was considered satisfactory when a patient could achieve a pain-free range of motion ≥ 100° (from 30° to 130°). The associated direct medical costs for the treatment of each patient were then calculated and analysed. RESULTS: We identified 80 patients who met the inclusion criteria. Due to loss to follow-up, 13 patients were excluded, leaving 67 patients for the final analysis. The mean LOI was 14 d (median 15 d; range 3-43 d), with a mean duration of hospital engagement of 67 d (median 57 d; range 10-351 d). Group III (prolonged immobilisation) had a statistically significantly worse outcome in comparison to Groups I and II (P = 0.04 and P = 0.01, respectively); however, there was no significant difference in outcome between Groups I and II (P = 0.30). No statistically significant difference in the direct medical costs between the groups was identified. CONCLUSION: The length of elbow immobilisation does not influence the medical cost; however, immobilisation longer than three weeks is associated with persistent stiffness and a less satisfactory clinical outcome. PMID:26301180
Bluhmki, Tobias; Bramlage, Peter; Volk, Michael; Kaltheuner, Matthias; Danne, Thomas; Rathmann, Wolfgang; Beyersmann, Jan
2017-02-01
Complex longitudinal sampling and the observational structure of patient registers in health services research are associated with methodological challenges regarding data management and statistical evaluation. We exemplify common pitfalls and want to stimulate discussions on the design, development, and deployment of future longitudinal patient registers and register-based studies. For illustrative purposes, we use data from the prospective, observational, German DIabetes Versorgungs-Evaluation register. One aim was to explore predictors for the initiation of a basal insulin supported therapy in patients with type 2 diabetes initially prescribed glucose-lowering drugs alone. Major challenges are missing mortality information, time-dependent outcomes, delayed study entries, different follow-up times, and competing events. We show that time-to-event methodology is a valuable tool for improved statistical evaluation of register data and should be preferred to simple case-control approaches. Patient registers provide rich data sources for health services research. Analyses are accompanied by a trade-off between data availability, clinical plausibility, and statistical feasibility. Cox's proportional hazards model allows for the evaluation of the outcome-specific hazards, but prediction of outcome probabilities is compromised by missing mortality information.
Variation in reaction norms: Statistical considerations and biological interpretation.
Morrissey, Michael B; Liefting, Maartje
2016-09-01
Analysis of reaction norms, the functions by which the phenotype produced by a given genotype depends on the environment, is critical to studying many aspects of phenotypic evolution. Different techniques are available for quantifying different aspects of reaction norm variation. We examine what biological inferences can be drawn from some of the more readily applicable analyses for studying reaction norms. We adopt a strongly biologically motivated view, but draw on statistical theory to highlight strengths and drawbacks of different techniques. In particular, consideration of some formal statistical theory leads to revision of some recently, and forcefully, advocated opinions on reaction norm analysis. We clarify what simple analysis of the slope between mean phenotype in two environments can tell us about reaction norms, explore the conditions under which polynomial regression can provide robust inferences about reaction norm shape, and explore how different existing approaches may be used to draw inferences about variation in reaction norm shape. We show how mixed model-based approaches can provide more robust inferences than more commonly used multistep statistical approaches, and derive new metrics of the relative importance of variation in reaction norm intercepts, slopes, and curvatures.
Puente, Celso
1976-01-01
Water-level, springflow, and streamflow data were used to develop simple and multiple linear-regression equations for use in estimating water levels in wells and the flow of three major springs in the Edwards aquifer in the eastern San Antonio area. The equations provide daily, monthly, and annual estimates that compare very favorably with observed data. Analyses of geologic and hydrologic data indicate that the water discharged by the major springs is supplied primarily by regional underflow from the west and southwest and by local recharge in the infiltration area in northern Bexar, Comal, and Hays Counties.
Weighing Evidence “Steampunk” Style via the Meta-Analyser
Bowden, Jack; Jackson, Chris
2016-01-01
The funnel plot is a graphical visualization of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modeling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the center of mass of a physical system. We used this analogy to explain the concepts of weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without recourse to precise definitions or statistical formulas and with a little help from Sherlock Holmes! Following on from the science fair, we have developed an interactive web-application (named the Meta-Analyser) to bring these ideas to a wider audience. We envisage that our application will be a useful tool for researchers when interpreting their data. First, to facilitate a simple understanding of fixed and random effects modeling approaches; second, to assess the importance of outliers; and third, to show the impact of adjusting for small study bias. This final aim is realized by introducing a novel graphical interpretation of the well-known method of Egger regression. PMID:28003684
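For readers unfamiliar with Egger regression, a minimal sketch of the classic test (the paper's contribution is the graphical interpretation, not this code): regress standardised effects on precision; an intercept far from zero signals small-study (funnel-plot) asymmetry. The effect sizes below are hypothetical.

```python
# Egger regression test for small-study bias.
import numpy as np
import statsmodels.api as sm

effects = np.array([0.40, 0.32, 0.60, 0.15, 0.55, 0.48])  # hypothetical
ses = np.array([0.10, 0.12, 0.25, 0.08, 0.30, 0.20])

z = effects / ses            # standardised effects
precision = 1.0 / ses
fit = sm.OLS(z, sm.add_constant(precision)).fit()
print(f"Egger intercept = {fit.params[0]:.2f}, p = {fit.pvalues[0]:.3f}")
```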
NASA Astrophysics Data System (ADS)
Malmi, Lauri; Adawi, Tom; Curmi, Ronald; de Graaff, Erik; Duffy, Gavin; Kautz, Christian; Kinnunen, Päivi; Williams, Bill
2018-03-01
We investigated research processes applied in recent publications in the European Journal of Engineering Education (EJEE), exploring how papers link to theoretical work and how research processes have been designed and reported. We analysed all 155 papers published in EJEE in 2009, 2010 and 2013, classifying the papers using a taxonomy of research processes in engineering education research (EER) (Malmi et al. 2012). The majority of the papers presented either empirical work (59%) or were case reports (27%). Our main findings are as follows: (1) EJEE papers build moderately on a wide selection of theoretical work; (2) a great majority of papers have a clear research strategy, but data analysis methods are mostly simple descriptive statistics or simple/undocumented qualitative research methods; and (3) there are significant shortcomings in reporting research questions, methodology and limitations of studies. Our findings are consistent with and extend analyses of EER papers in other publishing venues; they help to build a clearer picture of the research currently published in EJEE and allow us to make recommendations for consideration by the editorial team of the journal. Our employed procedure also provides a framework that can be applied to monitor future global evolution of this and other EER journals.
Origin of the correlations between exit times in pedestrian flows through a bottleneck
NASA Astrophysics Data System (ADS)
Nicolas, Alexandre; Touloupas, Ioannis
2018-01-01
Robust statistical features have emerged from the microscopic analysis of dense pedestrian flows through a bottleneck, notably with respect to the time gaps between successive passages. We pinpoint the mechanisms at the origin of these features thanks to simple models that we develop and analyse quantitatively. We disprove the idea that anticorrelations between successive time gaps (i.e. an alternation between shorter ones and longer ones) are a hallmark of a zipper-like intercalation of pedestrian lines and show that they simply result from the possibility that pedestrians from distinct ‘lines’ or directions cross the bottleneck within a short time interval. A second feature concerns the bursts of escapes, i.e. egresses that come in fast succession. Despite the ubiquity of exponential distributions of burst sizes, entailed by a Poisson process, we argue that anomalous (power-law) statistics arise if the bottleneck is nearly congested, albeit only in a tiny portion of parameter space. The generality of the proposed mechanisms implies that similar statistical features should also be observed for other types of particulate flows.
A statistical and experimental approach for assessing the preservation of plant lipids in soil
NASA Astrophysics Data System (ADS)
Mueller, K. E.; Eissenstat, D. M.; Oleksyn, J.; Freeman, K. H.
2011-12-01
Plant-derived lipids contribute to stable soil organic matter, but further interpretations of their abundance in soils are limited because the factors that control lipid preservation are poorly understood. Using data from a long-term field experiment and simple statistical models, we provide novel constraints on several predictors of the concentration of hydrolyzable lipids in forest mineral soils. Focal lipids included common monomers of cutin, suberin, and plant waxes present in tree leaves and roots. Soil lipid concentrations were most strongly influenced by the concentrations of lipids in leaves and roots of the overlying trees, but were also affected by the type of lipid (e.g. alcohols vs. acids), lipid chain length, and whether lipids originated in leaves or roots. Collectively, these factors explained ~80% of the variation in soil lipid concentrations beneath 11 different tree species. In order to use soil lipid analyses to test and improve conceptual models of soil organic matter stabilization, additional studies that provide experimental and quantitative (i.e. statistical) constraints on plant lipid preservation are needed.
A simple rain attenuation model for earth-space radio links operating at 10-35 GHz
NASA Technical Reports Server (NTRS)
Stutzman, W. L.; Yon, K. M.
1986-01-01
The simple attenuation model has been improved from an earlier version and now includes the effect of wave polarization. The model is for the prediction of rain attenuation statistics on earth-space communication links operating in the 10-35 GHz band. Simple calculations produce attenuation values as a function of average rain rate. These together with rain rate statistics (either measured or predicted) can be used to predict annual rain attenuation statistics. In this paper model predictions are compared to measured data from a data base of 62 experiments performed in the U.S., Europe, and Japan. Comparisons are also made to predictions from other models.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for the joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
Landscape movements of Anopheles gambiae malaria vector mosquitoes in rural Gambia.
Thomas, Christopher J; Cross, Dónall E; Bøgh, Claus
2013-01-01
For malaria control in Africa it is crucial to characterise the dispersal of its most efficient vector, Anopheles gambiae, in order to target interventions and assess their impact spatially. Our study is, we believe, the first to present a statistical model of dispersal probability against distance from breeding habitat to human settlements for this important disease vector. We undertook post-hoc analyses of mosquito catches made in The Gambia to derive statistical dispersal functions for An. gambiae sensu lato collected in 48 villages at varying distances to alluvial larval habitat along the River Gambia. The proportion dispersing declined exponentially with distance, and we estimated that 90% of movements were within 1.7 km. Although a 'heavy-tailed' distribution is considered biologically more plausible due to active dispersal by mosquitoes seeking blood meals, there was no statistical basis for choosing it over a negative exponential distribution. Using a simple random walk model with daily survival and movements previously recorded in Burkina Faso, we were able to reproduce the dispersal probabilities observed in The Gambia. Our results provide an important quantification of the probability of An. gambiae s.l. dispersal in a rural African setting typical of many parts of the continent. However, dispersal will be landscape specific and in order to generalise to other spatial configurations of habitat and hosts it will be necessary to produce tractable models of mosquito movements for operational use. We show that simple random walk models have potential. Consequently, there is a pressing need for new empirical studies of An. gambiae survival and movements in different settings to drive this development.
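The negative exponential fit described above is simple enough to sketch end-to-end: the maximum-likelihood estimate of the exponential rate is the reciprocal of the mean distance, and the distance containing 90% of movements follows in closed form. The distances below are hypothetical, not the Gambian catch data.

```python
# Fit a negative exponential dispersal kernel to capture distances (km).
import numpy as np

distances_km = np.array([0.2, 0.5, 0.8, 1.1, 0.3, 1.9, 0.7, 0.4, 2.5, 0.9])

rate = 1.0 / distances_km.mean()   # MLE of the exponential rate
d90 = -np.log(0.10) / rate         # distance containing 90% of movements
print(f"rate = {rate:.2f} /km; 90% of dispersal within {d90:.1f} km")
```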
Malacarne, Mario; Nardin, Tiziana; Bertoldi, Daniela; Nicolini, Giorgio; Larcher, Roberto
2016-09-01
Commercial tannins from several botanical sources and with different chemical and technological characteristics are used in the food and winemaking industries. Different ways to check their botanical authenticity have been studied in the last few years, through investigation of different analytical parameters. This work proposes a new, effective approach based on the quantification of 6 carbohydrates, 7 polyalcohols, and 55 phenols. In total, 87 tannins from 12 different botanical sources were analysed following a very simple sample preparation procedure. Using Forward Stepwise Discriminant Analysis, 3 statistical models were created based on sugar content, phenol concentrations, and a combination of the two classes of compounds for the 8 most abundant categories (i.e. oak, grape seed, grape skin, gall, chestnut, quebracho, tea and acacia). The last approach provided good results in attributing tannins to the correct botanical origin. Validation, repeated 3 times on subsets of 10% of samples, confirmed the reliability of this model.
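A hedged sketch of the classification step: plain linear discriminant analysis on a feature matrix of compound concentrations, with simple hold-out validation. The paper used Forward Stepwise Discriminant Analysis (scikit-learn has no built-in stepwise variant), and the data below are random placeholders, so this only illustrates the workflow shape.

```python
# LDA on a samples x compounds matrix with a 10% hold-out, as a stand-in
# for the stepwise discriminant analysis used in the paper.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(87, 10))     # 87 tannin samples, 10 compound features
y = rng.integers(0, 8, 87)        # 8 botanical categories (labels made up)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
print(f"hold-out accuracy = {lda.score(X_te, y_te):.2f}")
```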
Suurmond, Robert; van Rhee, Henk; Hak, Tony
2017-12-01
We present a new tool for meta-analysis, Meta-Essentials, which is free of charge and easy to use. In this paper, we introduce the tool and compare its features to other tools for meta-analysis. We also provide detailed information on the validation of the tool. Although free of charge and simple, Meta-Essentials automatically calculates effect sizes from a wide range of statistics and can be used for a wide range of meta-analysis applications, including subgroup analysis, moderator analysis, and publication bias analyses. The confidence interval of the overall effect is automatically based on the Knapp-Hartung adjustment of the DerSimonian-Laird estimator. However, more advanced meta-analysis methods such as meta-analytical structural equation modelling and meta-regression with multiple covariates are not available. In summary, Meta-Essentials may prove a valuable resource for meta-analysts, including researchers, teachers, and students.
Køppe, Simo; Dammeyer, Jesper
2014-09-01
The evolution of developmental psychology has been characterized by the use of different quantitative and qualitative methods and procedures. But how does the use of methods and procedures change over time? This study explores the change and development of statistical methods used in articles published in Child Development from 1930 to 2010. The methods used in every article in the first issue of every volume were categorized into four categories. Until 1980, relatively simple statistical methods were used. During the last 30 years there has been explosive growth in the use of more advanced statistical methods, and articles using no statistical methods, or only simple ones, have virtually disappeared.
Jackson, Dan; Bowden, Jack
2016-09-07
Confidence intervals for the between-study variance are useful in random-effects meta-analyses because they quantify the uncertainty in the corresponding point estimates. Methods for calculating these confidence intervals have been developed that are based on inverting hypothesis tests using generalised heterogeneity statistics. Whilst, under the random effects model, these new methods furnish confidence intervals with the correct coverage, the resulting intervals are usually very wide, making them uninformative. We discuss a simple strategy for obtaining 95% confidence intervals for the between-study variance with a markedly reduced width, whilst retaining the nominal coverage probability. Specifically, we consider the possibility of using methods based on generalised heterogeneity statistics with unequal tail probabilities, where the tail probability used to compute the upper bound is greater than 2.5%. This idea is assessed using four real examples and a variety of simulation studies. Supporting analytical results are also obtained. Our results provide evidence that using unequal tail probabilities can result in shorter 95% confidence intervals for the between-study variance. We also show some further results for a real example that illustrate how shorter confidence intervals for the between-study variance can be useful when performing sensitivity analyses for the average effect, which is usually the parameter of primary interest. We conclude that using unequal tail probabilities when computing 95% confidence intervals for the between-study variance, when using methods based on generalised heterogeneity statistics, can result in shorter confidence intervals. We suggest that those who find the case for using unequal tail probabilities convincing should use the '1-4% split', where the greater tail probability is allocated to the upper confidence bound. The 'width-optimal' interval that we present deserves further investigation.
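A sketch of the unequal-tail idea in the special case of a Q-profile-type interval (the paper's generalised heterogeneity statistics are broader than this): at the true tau^2, the generalised Q statistic is referred to a chi-square distribution with k-1 degrees of freedom, so the bounds solve Q(tau^2) = chi-square quantiles, here with a 1% lower and 4% upper tail. Effects and variances below are made up.

```python
# Q-profile-style CI for tau^2 with the '1-4% split'. Illustrative only.
import numpy as np
from scipy import stats, optimize

y = np.array([0.30, 0.10, 0.55, 0.20, 0.45, 0.05])   # study effects
v = np.array([0.02, 0.03, 0.04, 0.02, 0.05, 0.03])   # within-study variances

def q_gen(tau2):
    w = 1.0 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

def bound(quantile):
    # Q(tau2) is decreasing in tau2; solve Q(tau2) = chi2 quantile,
    # truncating at zero when even tau2 = 0 lies below the target.
    target = stats.chi2.ppf(quantile, len(y) - 1)
    f = lambda t: q_gen(t) - target
    return 0.0 if f(0.0) < 0 else optimize.brentq(f, 0.0, 100.0)

lower = bound(0.99)   # 1% tail probability for the lower bound
upper = bound(0.04)   # 4% tail probability for the upper bound
print(f"95% CI for tau^2: ({lower:.3f}, {upper:.3f})")
```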
Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.
Festing, M F
2001-01-01
In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
Cavaco, Carina; Pereira, Jorge A M; Taunk, Khushman; Taware, Ravindra; Rapole, Srikanth; Nagarajaram, Hampapathalu; Câmara, José S
2018-05-07
Saliva is possibly the easiest biofluid to analyse and, despite its simple composition, contains relevant metabolic information. In this work, we explored the potential of the volatile composition of saliva samples as biosignatures for breast cancer (BC) non-invasive diagnosis. To achieve this, 106 saliva samples of BC patients and controls in two distinct geographic regions in Portugal and India were extracted and analysed using optimised headspace solid-phase microextraction gas chromatography mass spectrometry (HS-SPME/GC-MS, 2 mL acidified saliva containing 10% NaCl, stirred (800 rpm) for 45 min at 38 °C and using the CAR/PDMS SPME fibre) followed by multivariate statistical analysis (MVSA). Over 120 volatiles from distinct chemical classes, with significant variations among the groups, were identified. MVSA retrieved a limited number of volatiles, viz. 3-methyl-pentanoic acid, 4-methyl-pentanoic acid, phenol and p-tert-butyl-phenol (Portuguese samples) and acetic, propanoic, benzoic acids, 1,2-decanediol, 2-decanone, and decanal (Indian samples), statistically relevant for the discrimination of BC patients in the populations analysed. This work defines an experimental layout, HS-SPME/GC-MS followed by MVSA, suitable to characterise volatile fingerprints for saliva as putative biosignatures for BC non-invasive diagnosis. Here, it was applied to BC samples from geographically distant populations and good disease separation was obtained. Further studies using larger cohorts are therefore very pertinent to challenge and strengthen this proof-of-concept study.
NASA Astrophysics Data System (ADS)
Aligholi, Saeed; Lashkaripour, Gholam Reza; Ghafoori, Mohammad
2017-01-01
This paper sheds further light on the fundamental relationships between simple methods, rock strength, and brittleness of igneous rocks. In particular, the relationship between mechanical (point load strength index Is(50) and brittleness value S20), basic physical (dry density and porosity), and dynamic properties (P-wave velocity and Schmidt rebound values) for a wide range of Iranian igneous rocks is investigated. First, 30 statistical models (including simple and multiple linear regression analyses) were built to identify the relationships between mechanical properties and simple methods. The results imply that rocks with different Schmidt hardness (SH) rebound values have different physicomechanical properties or relations. Second, using these results, it was proved that dry density, P-wave velocity, and SH rebound value provide a fine complement to mechanical properties classification of rock materials. Further, a detailed investigation was conducted on the relationships between mechanical and simple tests, which are established with limited ranges of P-wave velocity and dry density. The results show that strength values decrease with the SH rebound value. In addition, there is a systematic trend between dry density, P-wave velocity, rebound hardness, and brittleness value of the studied rocks, and rocks with medium hardness have a higher brittleness value. Finally, a strength classification chart and a brittleness classification table are presented, providing reliable and low-cost methods for the classification of igneous rocks.
Rising air and stream-water temperatures in Chesapeake Bay region, USA
Rice, Karen C.; Jastram, John D.
2015-01-01
Monthly mean air temperature (AT) at 85 sites and instantaneous stream-water temperature (WT) at 129 sites for 1960–2010 are examined for the mid-Atlantic region, USA. Temperature anomalies for two periods, 1961–1985 and 1985–2010, relative to the climate normal period of 1971–2000, indicate that the latter period was statistically significantly warmer than the former for both mean AT and WT. Statistically significant temporal trends across the region of 0.023 °C per year for AT and 0.028 °C per year for WT are detected using simple linear regression. Sensitivity analyses show that the irregularly sampled WT data are appropriate for trend analyses, resulting in conservative estimates of trend magnitude. Relations between 190 landscape factors and significant trends in AT-WT relations are examined using principal components analysis. Measures of major dams and deciduous forest are correlated with WT increasing slower than AT, whereas agriculture in the absence of major dams is correlated with WT increasing faster than AT. Increasing WT trends are detected despite increasing trends in streamflow in the northern part of the study area. Continued warming of contributing streams to Chesapeake Bay likely will result in shifts in distributions of aquatic biota and contribute to worsened eutrophic conditions in the bay and its estuaries.
Deriving Vegetation Dynamics of Natural Terrestrial Ecosystems from MODIS NDVI/EVI Data over Turkey.
Evrendilek, Fatih; Gulbeyaz, Onder
2008-09-01
The 16-day composite MODIS vegetation indices (VIs) at 500-m resolution for the period from 2000 to 2007 were seasonally averaged on the basis of the estimated distribution of 16 potential natural terrestrial ecosystems (NTEs) across Turkey. Graphical and statistical analyses of the time-series VIs for the NTEs, spatially disaggregated in terms of biogeoclimate zones and land cover types, included descriptive statistics, correlations, discrete Fourier transform (DFT), time-series decomposition, and simple linear regression (SLR) models. Our spatio-temporal analyses revealed that both MODIS VIs, on average, depicted similar seasonal variations for the NTEs, with the NDVI values having higher mean and SD values. The seasonal VIs were most correlated in decreasing order for: barren/sparsely vegetated land > grassland > shrubland/woodland > forest; (sub)nival > warm temperate > alpine > cool temperate > boreal = Mediterranean; and summer > spring > autumn > winter. The most pronounced differences between the MODIS VI responses over Turkey occurred in boreal and Mediterranean climate zones and forests, and in winter (the senescence phase of the growing season). Our results showed the potential of the time-series MODIS VI datasets for the estimation and monitoring of seasonal and interannual ecosystem dynamics over Turkey, which needs to be further improved and refined through systematic and extensive field measurements and validations across various biomes.
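In the spirit of the DFT analysis mentioned above, a small sketch of extracting the annual cycle from a 16-day composite VI series (23 composites per year); the series below is synthetic, not MODIS data.

```python
# Extract the annual harmonic of a synthetic 8-year NDVI series via DFT.
import numpy as np

t = np.arange(8 * 23)                               # 16-day steps, 8 years
ndvi = 0.45 + 0.2 * np.sin(2 * np.pi * t / 23) \
            + np.random.default_rng(5).normal(0, 0.03, t.size)

spec = np.fft.rfft(ndvi - ndvi.mean())
freqs = np.fft.rfftfreq(t.size, d=1.0)              # cycles per composite
annual = np.argmin(np.abs(freqs - 1 / 23))          # index of annual harmonic
amp = 2 * np.abs(spec[annual]) / t.size             # recovers ~0.2
print(f"estimated annual amplitude ~ {amp:.2f} NDVI units")
```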
NASA Astrophysics Data System (ADS)
Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.
2013-04-01
Most of the measurement strategies suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring airborne particle concentrations (according to different metrics) in real time. Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature, ranging from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the search for an appropriate and robust method continues. In this context, this exploratory study investigates a statistical method to analyse time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, we used particle number concentration data from a workplace study that investigated the potential for exposure via inhalation during cleanout operations, by sandpapering, of a reactor producing nanocomposite thin films. In this workplace study, the background issue was addressed through the near-field and far-field approaches, and several size-integrated and time-resolved devices were used. The analysis presented here focuses only on data obtained with two handheld condensation particle counters: one measured at the source of the released particles while the other measured in parallel in the far field. The Bayesian probabilistic approach allows probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time-resolved data obtained at the source can be compared with those issuing from the time-resolved data obtained far-field, leading to a quantitative estimate of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of such results requires specific experience in statistics.
Nomogram for sample size calculation on a straightforward basis for the kappa statistic.
Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo
2014-09-01
Kappa is a widely used measure of agreement. However, it may not be straightforward to use in some situations, such as sample size calculation, due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation that the level of agreement under a certain marginal prevalence be considered in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and a goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. Sample size formulae using a simple proportion of agreement instead of a kappa statistic, and nomograms to eliminate the inconvenience of using a mathematical formula, were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures.
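The kappa paradox the nomogram works around is easy to demonstrate numerically: holding the simple proportion of agreement fixed, kappa collapses as the marginal prevalence becomes skewed. The sketch below assumes two raters with identical marginal prevalence, for simplicity.

```python
# Kappa paradox: same agreement, very different kappas as prevalence skews.
def kappa(p_agree, prevalence):
    # Chance agreement for two raters with identical marginal prevalence
    pe = prevalence**2 + (1 - prevalence)**2
    return (p_agree - pe) / (1 - pe)

for prev in (0.5, 0.8, 0.95):
    print(f"agreement 0.90, prevalence {prev:.2f}: kappa = {kappa(0.90, prev):.2f}")
# prevalence 0.50 -> kappa 0.80; 0.80 -> 0.69; 0.95 -> about -0.05
```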
Arroyo-Hernández, M; Mellado-Romero, M A; Páramo-Díaz, P; Martín-López, C M; Cano-Egea, J M; Vilá Y Rico, J
2015-01-01
The purpose of this study is to analyze whether there is any difference between arthroscopic repair of full-thickness supraspinatus tears with a single-row technique versus a suture-bridge technique. We performed a retrospective study of 123 patients with full-thickness supraspinatus tears treated between January 2009 and January 2013 in our hospital. There were 60 single-row repairs and 63 suture-bridge repairs. The mean age was 62.9 years in the single-row group and 63.3 years in the suture-bridge group. There were more women than men in both groups (67%). All patients were evaluated using the Constant test. The mean Constant score was 76.7 in the suture-bridge group and 72.4 in the single-row group. We also performed a statistical analysis of each Constant item. Strength was higher in the suture-bridge group, a statistically significant difference (p = 0.04). The range of movement was also greater in the suture-bridge group, but the difference was not statistically significant. The suture-bridge technique has better clinical results than single-row repair, but the overall difference is not statistically significant (p = 0.298).
Properties of nanocrystalline Si layers embedded in structure of solar cell
NASA Astrophysics Data System (ADS)
Jurečka, Stanislav; Imamura, Kentaro; Matsumoto, Taketoshi; Kobayashi, Hikaru
2017-12-01
Suppression of spectral reflectance from the surface of solar cell is necessary for achieving a high energy conversion efficiency. We developed a simple method for forming nanocrystalline layers with ultralow reflectance in a broad range of wavelengths. The method is based on metal assisted etching of the silicon surface. In this work, we prepared Si solar cell structures with embedded nanocrystalline layers. The microstructure of embedded layer depends on the etching conditions. We examined the microstructure of the etched layers by a transmission electron microscope and analysed the experimental images by statistical and Fourier methods. The obtained results provide information on the applied treatment operations and can be used to optimize the solar cell forming procedure.
Kyong, Jin Burm; Lee, Yelin; D’Souza, Malcolm John; Kevill, Dennis Neil
2012-01-01
The “parent” tertiary alkyl chloroformate, tert-butyl chloroformate, is unstable, but tert-butyl chlorothioformate (1) is of increased stability, and a kinetic investigation of its solvolyses is presented. Analyses in terms of the simple and extended Grunwald-Winstein equations are carried out. The original one-term equation satisfactorily correlates the data, with a sensitivity towards changes in solvent ionizing power of 0.73 ± 0.03. When the two-term equation is applied, the sensitivity towards changes in solvent nucleophilicity of 0.13 ± 0.09 is associated with a high (0.17) probability that the term it governs is not statistically significant. PMID:23538747
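The extended (two-term) Grunwald-Winstein correlation, log(k/k0) = l·N + m·Y + c, is an ordinary least-squares fit; a sketch follows. The rate constants and solvent N and Y values below are placeholders, not the paper's data.

```python
# Fit the extended Grunwald-Winstein equation by least squares.
import numpy as np
import statsmodels.api as sm

log_k_ratio = np.array([-1.2, -0.4, 0.3, 0.9, 1.6])  # log(k/k0), hypothetical
N = np.array([0.1, -0.2, 0.0, -0.4, -0.6])           # solvent nucleophilicity
Y = np.array([-1.5, -0.5, 0.5, 1.5, 2.5])            # solvent ionizing power

X = sm.add_constant(np.column_stack([N, Y]))
fit = sm.OLS(log_k_ratio, X).fit()
print(fit.params)    # c, l, m
print(fit.pvalues)   # the p-value on l tests whether the N term is needed
```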
Cloud Compute for Global Climate Station Summaries
NASA Astrophysics Data System (ADS)
Baldwin, R.; May, B.; Cogbill, P.
2017-12-01
Global Climate Station Summaries are simple indicators of observational normals which include climatic data summarizations and frequency distributions. These are typically statistical analyses of station data over 5-, 10-, 20-, 30-year or longer time periods. The summaries are computed from the global surface hourly dataset. This dataset, totaling over 500 gigabytes, comprises 40 different types of weather observations from 20,000 stations worldwide. NCEI and the U.S. Navy developed these value-added products in the form of hourly summaries from many of these observations. Enabling this compute functionality in the cloud is the focus of the project. An overview of the approach and challenges associated with transitioning the application to the cloud will be presented.
Detecting trends in tree growth: not so simple.
Bowman, David M J S; Brienen, Roel J W; Gloor, Emanuel; Phillips, Oliver L; Prior, Lynda D
2013-01-01
Tree biomass influences biogeochemical cycles, climate, and biodiversity across local to global scales. Understanding the environmental control of tree biomass demands consideration of the drivers of individual tree growth over their lifespan. This can be achieved by studies of tree growth in permanent sample plots (prospective studies) and tree ring analyses (retrospective studies). However, identification of growth trends and attribution of their drivers demands statistical control of the axiomatic co-variation of tree size and age, and avoiding sampling biases at the stand, forest, and regional scales. Tracking and predicting the effects of environmental change on tree biomass requires well-designed studies that address the issues that we have reviewed.
Current Status and Challenges of Atmospheric Data Assimilation
NASA Astrophysics Data System (ADS)
Atlas, R. M.; Gelaro, R.
2016-12-01
The issues of modern atmospheric data assimilation are fairly simple to comprehend but difficult to address, involving the combination of literally billions of model variables and tens of millions of observations daily. In addition to traditional meteorological variables such as wind, temperature, pressure and humidity, model state vectors are being expanded to include explicit representation of precipitation, clouds, aerosols and atmospheric trace gases. At the same time, model resolutions are approaching single-kilometer scales globally and new observation types have error characteristics that are increasingly non-Gaussian. This talk describes the current status and challenges of atmospheric data assimilation, including an overview of current methodologies, the difficulty of estimating error statistics, and progress toward coupled earth system analyses.
Multiplicative Process in Turbulent Velocity Statistics: A Simplified Analysis
NASA Astrophysics Data System (ADS)
Chillà, F.; Peinke, J.; Castaing, B.
1996-04-01
Many models of turbulence link the energy cascade process to intermittency, whose signature is the evolving shape of the probability density functions (pdfs) of longitudinal velocity increments. Using recent models and experimental results, we show that the flatness factor of these pdfs gives a simple and direct estimate of what is called the deepness of the cascade. We reanalyse in this way the published data of a Direct Numerical Simulation and show that the deepness of the cascade presents the same Reynolds number dependence as in laboratory experiments.
Salvatore, Stefania; Bramness, Jørgen Gustav; Reid, Malcolm J; Thomas, Kevin Victor; Harman, Christopher; Røislien, Jo
2015-01-01
Wastewater-based epidemiology (WBE) is a new methodology for estimating the drug load in a population. Simple summary statistics and specification tests have typically been used to analyze WBE data, comparing differences between weekday and weekend loads. Such standard statistical methods may, however, overlook important nuanced information in the data. In this study, we apply functional data analysis (FDA) to WBE data and compare the results to those obtained from more traditional summary measures. We analysed temporal WBE data from 42 European cities, using sewage samples collected daily for one week in March 2013. For each city, the main temporal features of two selected drugs were extracted using functional principal component (FPC) analysis, along with simpler measures such as the area under the curve (AUC). The individual cities' scores on each of the temporal FPCs were then used as outcome variables in multiple linear regression analysis with various city and country characteristics as predictors. The results were compared to those of functional analysis of variance (FANOVA). The first three FPCs explained more than 99% of the temporal variation. The first component (FPC1) represented the level of the drug load, while the second and third temporal components represented the level and the timing of a weekend peak. AUC was highly correlated with FPC1, but other temporal characteristics were not captured by the simple summary measures. FANOVA was less flexible than the FPCA-based regression, although the two approaches gave concordant results. Geographical location was the main predictor for the general level of the drug load. FDA of WBE data extracts more detailed information about drug load patterns during the week that is not identified by more traditional statistical methods. The results also suggest that regression based on FPC scores is a valuable addition to FANOVA for estimating associations between temporal patterns and covariate information.
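A rough sketch of the FPCA step under simplifying assumptions: treat each city's week of daily loads as a curve, centre across cities, and take an SVD; the right singular vectors play the role of the functional principal components and the projections give city scores. Real FDA work would first smooth the curves with a basis expansion, and the data below are random placeholders.

```python
# FPCA-style decomposition of a cities x days matrix via SVD.
import numpy as np

rng = np.random.default_rng(6)
loads = rng.gamma(5.0, 2.0, size=(42, 7))   # 42 cities x 7 days (synthetic)

centred = loads - loads.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)             # variance explained per component
scores = centred @ Vt.T                     # city scores on each component
print("variance explained by first 3 components:", np.round(explained[:3], 3))
```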
An Easy Tool to Predict Survival in Patients Receiving Radiation Therapy for Painful Bone Metastases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Westhoff, Paulien G., E-mail: p.g.westhoff@umcutrecht.nl; Graeff, Alexander de; Monninkhof, Evelyn M.
2014-11-15
Purpose: Patients with bone metastases have a widely varying survival. A reliable estimation of survival is needed for appropriate treatment strategies. Our goal was to assess the value of simple prognostic factors, namely, patient and tumor characteristics, Karnofsky performance status (KPS), and patient-reported scores of pain and quality of life, to predict survival in patients with painful bone metastases. Methods and Materials: In the Dutch Bone Metastasis Study, 1157 patients were treated with radiation therapy for painful bone metastases. At randomization, physicians determined the KPS; patients rated general health on a visual analogue scale (VAS-gh), valuation of life on a verbal rating scale (VRS-vl) and pain intensity. To assess the predictive value of the variables, we used multivariate Cox proportional hazard analyses and C-statistics for discriminative value. Of the final model, calibration was assessed. External validation was performed on a dataset of 934 patients who were treated with radiation therapy for vertebral metastases. Results: Patients had mainly breast (39%), prostate (23%), or lung cancer (25%). After a maximum of 142 weeks' follow-up, 74% of patients had died. The best predictive model included sex, primary tumor, visceral metastases, KPS, VAS-gh, and VRS-vl (C-statistic = 0.72, 95% CI = 0.70-0.74). A reduced model, with only KPS and primary tumor, showed comparable discriminative capacity (C-statistic = 0.71, 95% CI = 0.69-0.72). External validation showed a C-statistic of 0.72 (95% CI = 0.70-0.73). Calibration of the derivation and the validation dataset showed underestimation of survival. Conclusion: In predicting survival in patients with painful bone metastases, KPS combined with primary tumor was comparable to a more complex model. Considering the amount of variables in complex models and the additional burden on patients, the simple model is preferred for daily use. In addition, a risk table for survival is provided.
Perney, Pascal; Lehert, Philippe; Mason, Barbara J
2012-01-01
Sleep disturbance symptoms (SDS) are commonly reported in alcoholic patients. Polysomnography studies suggested that acamprosate decreased SDS. We assessed this hypothesis using data from a randomized controlled trial. As a secondary objective, we proposed and tested the validity of a simple measure of SDS based on a subset of the Hamilton depression and anxiety inventories. We re-analysed a multi-center study evaluating the efficacy of acamprosate compared with placebo in alcohol-dependent patients, concentrating on SDS change over time. The Sleep sum score index (SAEI) was built from checklists of adverse effects reported at each visit and constituted our main endpoint. We also tested the validity of the short sleep index (SSI) defined by the four sleep items of the Hamilton depression and anxiety scales. Statistical analyses were conducted on an intention-to-treat basis. A total of 592 patients were included, and 292 completed the 6-month trial. Compared with the SAEI, considered as our reference, the observed specificity and sensitivity of the SSI were 91.6 and 87.6%. The proportion of patients experiencing SDS decreased from 40.2% at baseline to 26.1% at month 6 in the placebo group and 19.5% in the acamprosate group (relative risk placebo/acamprosate = 1.49, 95% confidence interval 1.10, 1.98, P = 0.04). Treating alcoholic patients to enhance abstinence has a beneficial effect in reducing SDS, and the duration of abstinence during the treatment constitutes the main positive factor. An additional effect of acamprosate is conjectured from its effect on glutamatergic tone. The SSI constitutes a simple, reasonably sensitive and specific instrument for measuring SDS.
The expectancy-value muddle in the theory of planned behaviour - and some proposed solutions.
French, David P; Hankins, Matthew
2003-02-01
The authors of the Theories of Reasoned Action and Planned Behaviour recommended a method for statistically analysing the relationships between beliefs and the Attitude, Subjective Norm, and Perceived Behavioural Control constructs. This method has been used in the overwhelming majority of studies using these theories. However, there is a growing awareness that this method yields statistically uninterpretable results (Evans, 1991). Despite this, the use of this method is continuing, as is uninformed interpretation of this problematic research literature. This is probably due to the lack of a simple account of where the problem lies, and the large number of alternatives available. This paper therefore summarizes the problem as simply as possible, gives consideration to the conclusions that can be validly drawn from studies that contain this problem, and critically reviews the many alternatives that have been proposed to address this problem. Different techniques are identified as being suitable, according to the purpose of the specific research project.
Rosen, G D
2006-06-01
Meta-analysis is a vague descriptor used to encompass very diverse methods of data collection and analysis, ranging from simple averages to more complex statistical methods. Holo-analysis is a fully comprehensive statistical analysis of all available data and all available variables in a specified topic, with results expressed in a holistic factual empirical model. The objectives and applications of holo-analysis include software production for prediction of responses with confidence limits, translation of research conditions to praxis (field) circumstances, exposure of key missing variables, discovery of theoretically unpredictable variables and interactions, and planning of future research. Holo-analyses of the effects of exogenous phytases on broiler feed intake and live weight gain are cited as examples, accounting for 70% of the variation in responses in terms of 20 highly significant chronological, dietary, environmental, genetic, management, and nutrient variables. Future accounting of variation will be improved if and when authors of papers routinely provide key data for currently neglected variables, such as temperatures, complete feed formulations, and mortalities.
Environmental Justice Assessment for Transportation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mills, G.S.; Neuhauser, K.S.
1999-04-05
Application of Executive Order 12898 to risk assessment of highway or rail transport of hazardous materials has proven difficult; the location and conditions affecting the propagation of a plume of hazardous material released in a potential accident are, in general, unknown. Therefore, analyses have only been possible in a geographically broad or approximate manner. The advent of geographic information systems and the development of software enhancements at Sandia National Laboratories have made kilometer-by-kilometer analysis of populations tallied by U.S. Census Blocks along entire routes practicable. Tabulations of total, or racially/ethnically distinct, populations close to a route, its alternatives, or the broader surrounding area can then be compared and differences evaluated statistically. This paper presents methods of comparing populations and their racial/ethnic compositions using simple tabulations, histograms, and chi-squared tests for the statistical significance of differences found. Two examples of these methods are presented: comparison of two routes and comparison of a route with its surroundings.
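As an illustration of the kind of comparison described, here is a minimal sketch of a chi-squared test between two route corridors. The population tallies are hypothetical census-block counts, invented for illustration.

```python
# Minimal sketch: chi-squared test of demographic composition along two routes.
from scipy.stats import chi2_contingency

#           group A, group B, group C  (hypothetical population tallies)
route_1 = [12000, 3000, 1500]
route_2 = [9000, 4200, 1100]

chi2, p, dof, expected = chi2_contingency([route_1, route_2])
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.4f}")
```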
Multivariate model of female black bear habitat use for a Geographic Information System
Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.
1993-01-01
Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
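The core computation here is compact enough to sketch: the mean and covariance of habitat variables at animal locations define the "ideal", and each map cell is scored by its squared Mahalanobis distance from that ideal. All values below are hypothetical, and a real GIS application would loop over every grid cell.

```python
# Minimal sketch: Mahalanobis-distance habitat scoring (hypothetical data).
import numpy as np

# habitat variables (elevation m, slope deg, road distance m) at radio locations
used = np.array([[540, 12, 820],
                 [600, 15, 950],
                 [580, 10, 700],
                 [620, 18, 1150],
                 [560, 14, 900]], dtype=float)

mu = used.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(used, rowvar=False))

def mahalanobis_sq(cell):
    d = cell - mu
    return float(d @ cov_inv @ d)  # squared distance from the "ideal"

# score two candidate map cells; lower = more similar to used habitat
print(mahalanobis_sq(np.array([590.0, 13.0, 850.0])))
print(mahalanobis_sq(np.array([300.0, 2.0, 100.0])))
```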
Horng, Huann-Cheng; Yuan, Chiou-Chung; Chao, Kuan-Chong; Cheng, Ming-Huei; Wang, Peng-Hui
2007-06-01
To evaluate the efficacy and acceptability of the Port-A-Cath (PAC) insertion method with (conventional method, group II) and without (modified method, group I) the aid of intraoperative fluoroscopy or other localizing devices. A total of 158 women with various kinds of gynecological cancers warranting PAC insertion (n = 86 in group I and n = 72 in group II) were evaluated. Data for analyses included patient age, main disease, dislocation site, surgical time, complications, and catheter outcome. There was no statistical difference between the two groups in terms of age, main disease, complications, or catheter patency. However, the rate of appropriate positioning in the superior vena cava (SVC) (100% in group I and 82% in group II) showed a statistically significant difference between the two groups (P = 0.001). In addition, the surgical time in group I was statistically shorter than that in group II (P < 0.001). The modified method of inserting the PAC offered the following benefits: avoidance of X-ray exposure for both the operator and the patient, reliable positioning in the SVC, and shorter surgical time. (c) 2007 Wiley-Liss, Inc.
Statistics for NAEG: past efforts, new results, and future plans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, R.O.; Simpson, J.C.; Kinnison, R.R.
A brief review of Nevada Applied Ecology Group (NAEG) objectives is followed by a summary of past statistical analyses conducted by Pacific Northwest Laboratory for the NAEG. Estimates of spatial pattern of radionuclides and other statistical analyses at NS's 201, 219 and 221 are reviewed as background for new analyses presented in this paper. Suggested NAEG activities and statistical analyses needed for the projected termination date of NAEG studies in March 1986 are given.
Statistical Techniques to Analyze Pesticide Data Program Food Residue Observations.
Szarka, Arpad Z; Hayworth, Carol G; Ramanarayanan, Tharacad S; Joseph, Robert S I
2018-06-26
The U.S. EPA conducts dietary-risk assessments to ensure that levels of pesticides in the U.S. food supply are safe. Often these assessments utilize conservative residue estimates, such as maximum residue levels (MRLs) and high-end estimates derived from registrant-generated field-trial data sets. A more realistic estimate of consumers' pesticide exposure from food may be obtained by utilizing residues from food-monitoring programs, such as the Pesticide Data Program (PDP) of the U.S. Department of Agriculture. A substantial portion of food-residue concentrations in PDP monitoring programs are below the limits of detection (left-censored), which makes the comparison of regulatory-field-trial and PDP residue levels difficult. In this paper, we present a novel adaptation of established statistical techniques, the Kaplan-Meier estimator (K-M), robust regression on order statistics (ROS), and the maximum-likelihood estimator (MLE), to quantify pesticide-residue concentrations in the presence of heavily censored data sets. The examined statistical approaches include the most commonly used parametric and nonparametric methods for handling left-censored data in the fields of medical and environmental sciences. This work presents a case study in which data on thiamethoxam residue on bell pepper generated from registrant field trials were compared with PDP-monitoring residue values. The results from the statistical techniques were evaluated and compared with commonly used simple substitution methods for the determination of summary statistics. The MLE was found to be the most appropriate statistical method to analyze this residue data set. Using the MLE technique, the data analyses showed that the median and mean PDP bell pepper residue levels were approximately 19 and 7 times lower, respectively, than the corresponding statistics of the field-trial residues.
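To make the MLE approach concrete, here is a minimal sketch of maximum-likelihood estimation for left-censored data, assuming lognormal residues and a common limit of detection (LOD). The residue values, censored count, and LOD are hypothetical, not the thiamethoxam data.

```python
# Minimal sketch: MLE for left-censored lognormal residue data.
import numpy as np
from scipy import optimize, stats

obs = np.array([0.04, 0.11, 0.07, 0.25, 0.09])  # detected residues (ppm)
n_censored = 12                                  # samples below the LOD
lod = 0.01                                       # limit of detection (ppm)

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # keep sigma positive
    # Detected values contribute the lognormal density;
    # censored values contribute the CDF at the LOD.
    ll_det = stats.norm.logpdf(np.log(obs), mu, sigma).sum() - np.log(obs).sum()
    ll_cen = n_censored * stats.norm.logcdf(np.log(lod), mu, sigma)
    return -(ll_det + ll_cen)

res = optimize.minimize(neg_log_lik, x0=[np.log(obs).mean(), 0.0])
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print("MLE median:", np.exp(mu_hat))
print("MLE mean:", np.exp(mu_hat + sigma_hat**2 / 2))
```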
Reif, David M.; Israel, Mark A.; Moore, Jason H.
2007-01-01
The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666
Observation of eight ancient olive trees (Olea europaea L.) growing in the Garden of Gethsemane.
Petruccelli, Raffaella; Giordano, Cristiana; Salvatici, Maria Cristina; Capozzoli, Laura; Ciaccheri, Leonardo; Pazzini, Massimo; Lain, Orietta; Testolin, Raffaele; Cimato, Antonio
2014-05-01
For thousands of years, olive trees (Olea europaea L.) have been a significant presence and a symbol in the Garden of Gethsemane, a place located at the foot of the Mount of Olives, Jerusalem, remembered for the agony of Jesus Christ before his arrest. This investigation comprises the first morphological and genetic characterization of eight olive trees in the Garden of Gethsemane. Pomological traits, morphometric, and ultrastructural observations as well as SSR (Simple Sequence Repeat) analysis were performed to identify the olive trees. Statistical analyses were conducted to evaluate their morphological variability. The study revealed a low morphological variability and minimal dissimilarity among the olive trees. According to molecular analysis, these trees showed the same allelic profile at all microsatellite loci analyzed. Combining the results of the different analyses carried out in the frame of the present work, we could conclude that the eight olive trees of the Gethsemane Garden have been propagated from a single genotype. Copyright © 2014. Published by Elsevier SAS.
[Bayesian statistics in medicine -- part II: main applications and inference].
Montomoli, C; Nichelatti, M
2008-01-01
Bayesian statistics is not only used when dealing with 2-way tables; it can also be used for inferential purposes. Using the basic concepts presented in the first part, this paper aims to give a simple overview of Bayesian methods by introducing their foundation (Bayes' theorem) and then applying this rule to a very simple practical example; whenever possible, the elementary processes at the basis of the analysis are compared to those of frequentist (classical) statistical analysis. Bayesian reasoning is naturally connected to medical activity, since it appears to be quite similar to a diagnostic process.
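The diagnostic analogy mentioned here is easy to make concrete: Bayes' theorem turns a prior (prevalence) and a test's operating characteristics into a posterior probability of disease. The prevalence, sensitivity, and specificity below are hypothetical.

```python
# Minimal sketch: Bayes' theorem in a diagnostic setting (hypothetical values).
prevalence = 0.02   # P(disease)
sensitivity = 0.95  # P(test+ | disease)
specificity = 0.90  # P(test- | no disease)

p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
posterior = sensitivity * prevalence / p_pos  # P(disease | test+)
print(f"P(disease | positive test) = {posterior:.3f}")  # ~0.162
```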
Statistics without Tears: Complex Statistics with Simple Arithmetic
ERIC Educational Resources Information Center
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
Applied statistics in ecology: common pitfalls and simple solutions
E. Ashley Steel; Maureen C. Kennedy; Patrick G. Cunningham; John S. Stanovick
2013-01-01
The most common statistical pitfalls in ecological research are those associated with data exploration, the logic of sampling and design, and the interpretation of statistical results. Although one can find published errors in calculations, the majority of statistical pitfalls result from incorrect logic or interpretation despite correct numerical calculations. There...
Hicks, Amy; Fairhurst, Caroline; Torgerson, David J
2018-03-01
To perform a worked example of an approach that can be used to identify and remove potentially biased trials from meta-analyses via the analysis of baseline variables. True randomisation produces treatment groups that differ only by chance; therefore, a meta-analysis of a baseline measurement should produce no overall difference and zero heterogeneity. A meta-analysis from the British Medical Journal, known to contain significant heterogeneity and imbalance in baseline age, was chosen. Meta-analyses of baseline variables were performed and trials systematically removed, starting with those with the largest t-statistic, until the I² measure of heterogeneity became 0%; the outcome meta-analysis was then repeated with only the remaining trials as a sensitivity check. We argue that heterogeneity in a meta-analysis of baseline variables should not exist, and therefore removing trials that contribute to heterogeneity from a meta-analysis will produce a more valid result. In our example, none of the overall outcomes changed when studies contributing to heterogeneity were removed. We recommend routine use of this technique, using age and a second baseline variable predictive of outcome for the particular study chosen, to help eliminate potential bias in meta-analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
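The removal heuristic can be sketched in a few lines. The version below ranks trials by the standardized baseline effect (a stand-in for the t-statistic the authors describe) and drops them until I² reaches 0%. Effect sizes and standard errors are hypothetical.

```python
# Minimal sketch: drop trials from a baseline meta-analysis until I² = 0%.
import numpy as np

effects = np.array([0.8, -0.1, 2.5, 0.3, -0.4])  # baseline mean differences
se = np.array([0.5, 0.4, 0.6, 0.5, 0.45])        # their standard errors

def i_squared(effects, se):
    w = 1.0 / se**2
    pooled = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - pooled) ** 2)      # Cochran's Q
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

keep = np.ones(len(effects), dtype=bool)
while i_squared(effects[keep], se[keep]) > 0 and keep.sum() > 2:
    z = np.abs(effects[keep] / se[keep])         # largest standardized effect
    idx = np.flatnonzero(keep)[np.argmax(z)]
    keep[idx] = False                            # drop the most discrepant trial

print("Trials retained:", np.flatnonzero(keep))
```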
The Statistics of wood assays for preservative retention
Patricia K. Lebow; Scott W. Conklin
2011-01-01
This paper covers general statistical concepts that apply to interpreting wood assay retention values. In particular, since wood assays are typically obtained from a single composited sample, the statistical aspects, including advantages and disadvantages, of simple compositing are covered.
Accuracy of clinical coding from 1210 appendicectomies in a British district general hospital.
Bhangu, Aneel; Nepogodiev, Dmitri; Taylor, Caroline; Durkin, Natalie; Patel, Rajan
2012-01-01
The primary aim of this study was to assess the accuracy of clinical coding in identifying negative appendicectomies. The secondary aim was to analyse trends over time in rates of simple, complex (gangrenous or perforated) and negative appendicectomies. Retrospective review of 1210 patients undergoing emergency appendicectomy during a five-year period (2006-2010). Histopathology reports were taken as the gold standard for diagnosis and compared to clinical coding lists. Clinical coding is the process by which non-medical administrators apply standardised diagnostic codes to patients, based upon clinical notes at discharge. These codes then contribute to national databases. Statistical analysis included correlation studies and regression analyses. Clinical coding had only moderate correlation with histopathology, with an overall kappa of 0.421. Annual kappa values varied between 0.378 and 0.500. Overall, 14% of patients were incorrectly coded as having had appendicitis when in fact they had a histopathologically normal appendix (153/1107), whereas 4% were falsely coded as having received a negative appendicectomy when they had appendicitis (48/1107). There was an overall significant fall and then rise in the rate of simple appendicitis (B coefficient -0.239 (95% confidence interval -0.426, -0.051), p = 0.014) but no change in the rate of complex appendicitis (B coefficient 0.008 (-0.015, 0.031), p = 0.476). Clinical coding for negative appendicectomy was unreliable. Negative rates may be higher than suspected. This has implications for the validity of national database analyses. Using this form of data as a quality indicator for appendicitis should be reconsidered until its quality is improved. Copyright © 2012 Surgical Associates Ltd. Published by Elsevier Ltd. All rights reserved.
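The kappa statistic used here measures agreement beyond chance between two raters of the same cases. A minimal sketch, assuming scikit-learn is available; the per-patient labels below are hypothetical.

```python
# Minimal sketch: Cohen's kappa between clinical coding and histopathology.
from sklearn.metrics import cohen_kappa_score

# 1 = appendicitis, 0 = normal appendix, per patient (hypothetical)
histopathology = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
clinical_codes = [1, 1, 1, 1, 0, 1, 0, 0, 1, 1]

kappa = cohen_kappa_score(histopathology, clinical_codes)
print(f"kappa = {kappa:.3f}")  # ~0.52 here, i.e. moderate agreement
```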
McAfee, Stephanie A.; Pederson, Gregory T.; Woodhouse, Connie A.; McCabe, Gregory
2017-01-01
Water managers are increasingly interested in better understanding and planning for projected resource impacts from climate change. In this management-guided study, we use a very large suite of synthetic climate scenarios in a statistical modeling framework to simultaneously evaluate how (1) average temperature and precipitation changes, (2) initial basin conditions, and (3) temporal characteristics of the input climate data influence water-year flow in the Upper Colorado River. The results here suggest that existing studies may underestimate the degree of uncertainty in future streamflow, particularly under moderate temperature and precipitation changes. However, we also find that the relative severity of future flow projections within a given climate scenario can be estimated with simple metrics that characterize the input climate data and basin conditions. These results suggest that simple testing, like the analyses presented in this paper, may be helpful in understanding differences between existing studies or in identifying specific conditions for physically based mechanistic modeling. Both options could reduce overall cost and improve the efficiency of conducting climate change impacts studies.
LOD significance thresholds for QTL analysis in experimental populations of diploid species
Van Ooijen JW
1999-11-01
Linkage analysis with molecular genetic markers is a very powerful tool in the biological research of quantitative traits. The lack of an easy way to determine which areas of the genome can be designated as statistically significant for containing a gene affecting the quantitative trait of interest hampers the important prediction of the rate of false positives. In this paper, four tables, obtained by large-scale simulations, are presented that can be used with a simple formula to obtain the false-positive rate for analyses of the standard types of experimental populations with diploid species with any size of genome. A new definition of the term 'suggestive linkage' is proposed that allows a more objective comparison of results across species.
An instructive model of entropy
NASA Astrophysics Data System (ADS)
Zimmerman, Seth
2010-09-01
This article first notes the misinterpretation of a common thought experiment, and the misleading comment that 'systems tend to flow from less probable to more probable macrostates'. It analyses the experiment, generalizes it and introduces a new tool of investigation, the simplectic structure. A time-symmetric model is built upon this structure, yielding several non-intuitive results. The approach is combinatorial rather than statistical, and assumes that entropy is equivalent to 'missing information'. The intention of this article is not only to present interesting results, but also, by deliberately starting with a simple example and developing it through proof and computer simulation, to clarify the often confusing subject of entropy. The article should be particularly stimulating to students and instructors of discrete mathematics or undergraduate physics.
Osterberg, T; Norinder, U
2001-01-01
A method of modelling and predicting biopharmaceutical properties using simple theoretically computed molecular descriptors and multivariate statistics has been investigated for several data sets related to solubility, IAM chromatography, permeability across Caco-2 cell monolayers, human intestinal perfusion, brain-blood partitioning, and P-glycoprotein ATPase activity. The molecular descriptors (e.g. molar refractivity, molar volume, index of refraction, surface tension and density) and logP were computed with ACD/ChemSketch and ACD/logP, respectively. Good statistical models were derived that permit simple computational prediction of biopharmaceutical properties. All final models derived had R² values ranging from 0.73 to 0.95 and Q² values ranging from 0.69 to 0.86. The RMSEP values for the external test sets ranged from 0.24 to 0.85 (log scale).
Refined elasticity sampling for Monte Carlo-based identification of stabilizing network patterns.
Childs, Dorothee; Grimbs, Sergio; Selbig, Joachim
2015-06-15
Structural kinetic modelling (SKM) is a framework to analyse whether a metabolic steady state remains stable under perturbation, without requiring detailed knowledge about individual rate equations. It provides a representation of the system's Jacobian matrix that depends solely on the network structure, steady state measurements, and the elasticities at the steady state. For a measured steady state, stability criteria can be derived by generating a large number of SKMs with randomly sampled elasticities and evaluating the resulting Jacobian matrices. The elasticity space can be analysed statistically in order to detect network positions that contribute significantly to the perturbation response. Here, we extend this approach by examining the kinetic feasibility of the elasticity combinations created during Monte Carlo sampling. Using a set of small example systems, we show that the majority of sampled SKMs would yield negative kinetic parameters if they were translated back into kinetic models. To overcome this problem, a simple criterion is formulated that excludes such infeasible models. After evaluating the small example pathways, the methodology was used to study two steady states of the neuronal TCA cycle and the intrinsic mechanisms responsible for their stability or instability. The findings of the statistical elasticity analysis confirm that several elasticities are jointly coordinated to control stability and that the main source of potential instabilities is mutations in the enzyme alpha-ketoglutarate dehydrogenase. © The Author 2015. Published by Oxford University Press.
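The Monte Carlo screening idea can be illustrated with a toy system, not the paper's neuronal TCA cycle model: sample elasticities at random, assemble a Jacobian for a hypothetical two-metabolite pathway with positive feedback, and record how often the steady state is stable.

```python
# Minimal sketch: Monte Carlo stability screening in the spirit of SKM.
import numpy as np

rng = np.random.default_rng(0)
# Toy reactions: v1 (input, activated by X2), v2: X1 -> X2, v3: X2 -> out
N = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])  # metabolites x reactions (hypothetical)

stable, trials = 0, 10_000
for _ in range(trials):
    a, e1, e2 = rng.uniform(0, 1, size=3)  # sampled elasticities
    theta = np.array([[0.0, a],            # dv1/dX2: feedback activation
                      [e1, 0.0],           # dv2/dX1
                      [0.0, e2]])          # dv3/dX2
    J = N @ theta                          # normalized Jacobian
    if np.all(np.linalg.eigvals(J).real < 0):
        stable += 1

print(f"stable fraction: {stable / trials:.2%}")  # ~50% for this toy model
```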
Demonstrating microbial co-occurrence pattern analyses within and between ecosystems
Williams, Ryan J.; Howe, Adina; Hofmockel, Kirsten S.
2014-01-01
Co-occurrence patterns are used in ecology to explore interactions between organisms and environmental effects on coexistence within biological communities. Analysis of co-occurrence patterns among microbial communities has ranged from simple pairwise comparisons between all community members to direct hypothesis testing between focal species. However, co-occurrence patterns are rarely studied across multiple ecosystems or multiple scales of biological organization within the same study. Here we outline an approach to produce co-occurrence analyses that are focused on three different scales: co-occurrence patterns between ecosystems at the community scale, modules of co-occurring microorganisms within communities, and co-occurring pairs within modules that are nested within microbial communities. To demonstrate our co-occurrence analysis approach, we gathered publicly available 16S rRNA amplicon datasets to compare and contrast microbial co-occurrence at different taxonomic levels across different ecosystems. We found differences in community composition and co-occurrence that reflect environmental filtering at the community scale and consistent pairwise occurrences that may be used to infer ecological traits about poorly understood microbial taxa. However, we also found that conclusions derived from applying network statistics to microbial relationships can vary depending on the taxonomic level chosen and criteria used to build co-occurrence networks. We present our statistical analysis and code for public use in analysis of co-occurrence patterns across microbial communities. PMID:25101065
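The simplest pairwise building block of such analyses can be sketched directly: correlate taxon abundances across samples and keep strong, significant pairs as network edges. The abundance table and edge criteria below are hypothetical, and the criteria chosen will change the resulting network, as the authors caution.

```python
# Minimal sketch: pairwise co-occurrence edges from an abundance table.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
abundances = rng.poisson(10, size=(20, 6)).astype(float)  # 20 samples x 6 taxa
abundances[:, 1] = abundances[:, 0] + rng.poisson(2, 20)  # make taxa 0, 1 co-occur

edges = []
n_taxa = abundances.shape[1]
for i in range(n_taxa):
    for j in range(i + 1, n_taxa):
        rho, p = stats.spearmanr(abundances[:, i], abundances[:, j])
        if abs(rho) > 0.6 and p < 0.01:  # hypothetical edge criteria
            edges.append((i, j, round(rho, 2)))

print("co-occurring pairs:", edges)
```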
Modeling Cross-Situational Word–Referent Learning: Prior Questions
Yu, Chen; Smith, Linda B.
2013-01-01
Both adults and young children possess powerful statistical computation capabilities—they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of associative learning. This article describes a series of simulation studies and analyses designed to understand the different learning mechanisms posited by the 2 classes of models and their relation to each other. Variants of a hypothesis-testing model and a simple or dumb associative mechanism were examined under different specifications of information selection, computation, and decision. Critically, these 3 components of the models interact in complex ways. The models illustrate a fundamental tradeoff between amount of data input and powerful computations: With the selection of more information, dumb associative models can mimic the powerful learning that is accomplished by hypothesis-testing models with fewer data. However, because of the interactions among the component parts of the models, the associative model can mimic various hypothesis-testing models, producing the same learning patterns but through different internal components. The simulations argue for the importance of a compositional approach to human statistical learning: the experimental decomposition of the processes that contribute to statistical learning in human learners and models with the internal components that can be evaluated independently and together. PMID:22229490
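The "dumb" associative mechanism contrasted with hypothesis testing here is essentially co-occurrence counting, which a short sketch can make concrete. The scenes below are hypothetical; each is ambiguous on its own, yet the aggregated statistics disambiguate every word.

```python
# Minimal sketch: associative cross-situational word-referent learning.
from collections import defaultdict

scenes = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"], ["DOG", "CUP"]),
]

assoc = defaultdict(float)
for words, referents in scenes:
    for w in words:
        for r in referents:
            assoc[(w, r)] += 1.0  # simple co-occurrence increment

all_refs = {r for (_, r) in assoc}
for w in ["ball", "dog", "cup"]:
    best = max(all_refs, key=lambda r: assoc[(w, r)])
    print(w, "->", best)  # aggregated statistics pick the right referent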
The impacts of recent smoking control policies on individual smoking choice: the case of Japan
2013-01-01
This article comprehensively examines the impact of recent smoking control policies in Japan, increases in cigarette taxes and the enforcement of the Health Promotion Law, on individual smoking choice by using multi-year and nationwide individual survey data to overcome the analytical problems of previous Japanese studies. In the econometric analyses, I specify a simple binary choice model based on a random utility model to examine the effects of smoking control policies on individual smoking choice by employing the instrumental variable probit model to control for the endogeneity of cigarette prices. The empirical results show that an increase in cigarette prices statistically significantly reduces the smoking probability of males by 1.0 percent and that of females by 1.4 to 2.0 percent. The enforcement of the Health Promotion Law has a statistically significant effect on reducing the smoking probability of males by 15.2 percent and of females by 11.9 percent. Furthermore, an increase in cigarette prices has a statistically significant negative effect on the smoking probability of office workers, non-workers, male manual workers, and female unemployed people, and the enforcement of the Health Promotion Law has a statistically significant effect on decreasing the smoking probabilities of office workers, female manual workers, and male non-workers. JEL classification C25, C26, I18 PMID:23497490
Cox, Tony; Popken, Douglas; Ricci, Paolo F
2013-01-01
Exposures to fine particulate matter (PM2.5) in air (C) have been suspected of contributing causally to increased acute (e.g., same-day or next-day) human mortality rates (R). We tested this causal hypothesis in 100 United States cities using the publicly available NMMAPS database. Although a significant, approximately linear, statistical C-R association exists in simple statistical models, closer analysis suggests that it is not causal. Surprisingly, conditioning on other variables that have been extensively considered in previous analyses (usually using splines or other smoothers to approximate their effects), such as month of the year and mean daily temperature, suggests that they create strong, nonlinear confounding that explains the statistical association between PM2.5 and mortality rates in this data set. As this finding disagrees with conventional wisdom, we apply several different techniques to examine it. Conditional independence tests for potential causation, non-parametric classification tree analysis, Bayesian Model Averaging (BMA), and Granger-Sims causality testing, show no evidence that PM2.5 concentrations have any causal impact on increasing mortality rates. This apparent absence of a causal C-R relation, despite their statistical association, has potentially important implications for managing and communicating the uncertain health risks associated with, but not necessarily caused by, PM2.5 exposures. PMID:23983662
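Among the techniques listed, Granger causality testing is the easiest to demonstrate in a few lines. A minimal sketch using the statsmodels implementation follows; the two daily series are simulated independent noise (hypothetical, not the NMMAPS data), so no causal signal should be detected.

```python
# Minimal sketch: Granger causality test of PM2.5 vs. daily mortality counts.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(42)
n = 365
pm25 = rng.normal(12, 3, n)                 # daily PM2.5 (hypothetical)
deaths = rng.poisson(50, n).astype(float)   # daily mortality (hypothetical)

# statsmodels tests whether the 2nd column Granger-causes the 1st column
data = np.column_stack([deaths, pm25])
results = grangercausalitytests(data, maxlag=3)
```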
Morris, E Kathryn; Caruso, Tancredi; Buscot, François; Fischer, Markus; Hancock, Christine; Maier, Tanja S; Meiners, Torsten; Müller, Caroline; Obermaier, Elisabeth; Prati, Daniel; Socher, Stephanie A; Sonnemann, Ilja; Wäschke, Nicole; Wubet, Tesfaye; Wurst, Susanne; Rillig, Matthias C
2014-09-01
Biodiversity, a multidimensional property of natural systems, is difficult to quantify partly because of the multitude of indices proposed for this purpose. Indices aim to describe general properties of communities that allow us to compare different regions, taxa, and trophic levels. Therefore, they are of fundamental importance for environmental monitoring and conservation, although there is no consensus about which indices are more appropriate and informative. We tested several common diversity indices in a range of simple to complex statistical analyses in order to determine whether some were better suited for certain analyses than others. We used data collected around the focal plant Plantago lanceolata on 60 temperate grassland plots embedded in an agricultural landscape to explore relationships between the common diversity indices of species richness (S), Shannon's diversity (H'), Simpson's diversity (D1), Simpson's dominance (D2), Simpson's evenness (E), and Berger-Parker dominance (BP). We calculated each of these indices for herbaceous plants, arbuscular mycorrhizal fungi, aboveground arthropods, belowground insect larvae, and P. lanceolata molecular and chemical diversity. Including these trait-based measures of diversity allowed us to test whether or not they behaved similarly to the better studied species diversity. We used path analysis to determine whether compound indices detected more relationships between diversities of different organisms and traits than more basic indices. In the path models, more paths were significant when using H', even though all models except that with E were equally reliable. This demonstrates that while common diversity indices may appear interchangeable in simple analyses, when considering complex interactions, the choice of index can profoundly alter the interpretation of results. Data mining in order to identify the index producing the most significant results should be avoided, but simultaneously considering analyses using multiple indices can provide greater insight into the interactions in a system.
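All six indices compared in this study are one-liners once the relative abundances are in hand. A minimal sketch for a single hypothetical community, using the common textbook definitions (conventions for D1, D2, and E vary across the literature):

```python
# Minimal sketch: common diversity indices from one abundance vector.
import numpy as np

counts = np.array([50, 30, 10, 5, 3, 2], dtype=float)  # hypothetical community
p = counts / counts.sum()

S = (counts > 0).sum()        # species richness
H = -np.sum(p * np.log(p))    # Shannon's diversity H'
D2 = np.sum(p ** 2)           # Simpson's dominance
D1 = 1.0 - D2                 # Simpson's diversity
E = (1.0 / D2) / S            # Simpson's evenness
BP = p.max()                  # Berger-Parker dominance

print(f"S={S}, H'={H:.2f}, D1={D1:.2f}, D2={D2:.2f}, E={E:.2f}, BP={BP:.2f}")
```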
Sampling and sensitivity analyses tools (SaSAT) for computational modelling
Hoare, Alexander; Regan, David G; Wilson, David P
2008-01-01
SaSAT (Sampling and Sensitivity Analysis Tools) is a user-friendly software package for applying uncertainty and sensitivity analyses to mathematical and computational models of arbitrary complexity and context. The toolbox is built in Matlab®, a numerical mathematical software package, and utilises algorithms contained in the Matlab® Statistics Toolbox. However, Matlab® is not required to use SaSAT as the software package is provided as an executable file with all the necessary supplementary files. The SaSAT package is also designed to work seamlessly with Microsoft Excel but no functionality is forfeited if that software is not available. A comprehensive suite of tools is provided to enable the following tasks to be easily performed: efficient and equitable sampling of parameter space by various methodologies; calculation of correlation coefficients; regression analysis; factor prioritisation; and graphical output of results, including response surfaces, tornado plots, and scatterplots. Use of SaSAT is exemplified by application to a simple epidemic model. To our knowledge, a number of the methods available in SaSAT for performing sensitivity analyses have not previously been used in epidemiological modelling and their usefulness in this context is demonstrated. PMID:18304361
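Although SaSAT itself is a Matlab toolbox, the core workflow it supports is easy to sketch elsewhere: sample parameter space with a Latin hypercube, run the model at each sample, and rank parameters by correlation with the output. The toy "epidemic" output below is just the basic reproduction number, and all ranges are hypothetical.

```python
# Minimal sketch: Latin hypercube sampling + rank-correlation sensitivity.
import numpy as np
from scipy.stats import qmc, spearmanr

sampler = qmc.LatinHypercube(d=2, seed=0)
unit = sampler.random(n=200)
beta = 0.1 + unit[:, 0] * 0.4    # transmission rate in [0.1, 0.5]
gamma = 0.05 + unit[:, 1] * 0.2  # recovery rate in [0.05, 0.25]

r0 = beta / gamma                # model output: basic reproduction number
for name, param in [("beta", beta), ("gamma", gamma)]:
    rho, _ = spearmanr(param, r0)
    print(f"{name}: Spearman rho with R0 = {rho:+.2f}")
```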
Godoy-Ruiz, Paula; Rodas, Jamie; Talbot, Yves; Rouleau, Katherine
2016-09-01
In a global context of growing health inequities, international learning experiences have become a popular strategy for equipping health professionals with the skills, knowledge, and competencies required to work with the populations they serve. This study sought to analyse the Chilean Interprofessional Programme in Primary Health Care (CIPPHC), a 5-week international learning experience funded by the Ministry of Health in Chile, targeted at Chilean primary care providers and delivered in Toronto by the Department of Family and Community Medicine at the University of Toronto. The study focused on three cohorts of students (2010-2012). Anonymous programme evaluations were analysed and semi-structured interviews conducted with programme alumni. Simple descriptive statistics were gathered from the evaluations, and the interviews were analysed via thematic content analysis. The majority of participants reported high levels of satisfaction with the training programme and gains in knowledge, particularly regarding the Canadian model of primary care, and found the materials delivered to be applicable to their local context. The CIPPHC has proven to be a successful educational initiative and provides valuable lessons for other academic centres developing international interprofessional training programmes for primary care providers.
Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case
NASA Astrophysics Data System (ADS)
Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann
2017-04-01
Short-term ocean analyses for sea surface temperature (SST) in the Mediterranean Sea can be improved by a statistical post-processing technique called super-ensemble. This technique consists of a multi-linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset: a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques is capable of preventing overfitting problems, although the best performances are achieved when we add correlation to the super-ensemble structure using a simple spatial filter applied after the linear regression. Our results show that super-ensemble performance depends on the selection of an unbiased operator and the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE) evaluated with respect to observed satellite SST. Lower RMSE analysis estimates result from the following choices: a 15-day training period, an overconfident MMSE dataset (a subset with the higher-quality ensemble members), and the least-squares algorithm being filtered a posteriori.
Kim, Yong-Il; Im, Hyung-Jun; Paeng, Jin Chul; Lee, Jae Sung; Eo, Jae Seon; Kim, Dong Hyun; Kim, Euishin E; Kang, Keon Wook; Chung, June-Key; Lee, Dong Soo
2012-12-01
(18)F-FP-CIT positron emission tomography (PET) is an effective imaging method for dopamine transporters. In usual clinical practice, (18)F-FP-CIT PET is analyzed visually or quantified using manual delineation of a volume of interest (VOI) for the striatum. In this study, we proposed and validated two simple quantitative methods based on automatic VOI delineation using statistical probabilistic anatomical mapping (SPAM) and isocontour margin setting. Seventy-five (18)F-FP-CIT PET images acquired in routine clinical practice were used for this study. A study-specific image template was made and the subject images were normalized to the template. Afterwards, uptakes in the striatal regions and cerebellum were quantified using probabilistic VOIs based on SPAM. A quantitative parameter, QSPAM, was calculated to simulate binding potential. Additionally, the functional volume of each striatal region and its uptake were measured in automatically delineated VOIs using isocontour margin setting. The uptake-volume product (QUVP) was calculated for each striatal region. QSPAM and QUVP were compared with visual grading, and the influence of cerebral atrophy on the measurements was tested. Image analyses were successful in all cases. Both QSPAM and QUVP were significantly different according to visual grading (P < 0.001). The agreements of QUVP or QSPAM with visual grading were slight to fair for the caudate nucleus (κ = 0.421 and 0.291, respectively) and good to perfect for the putamen (κ = 0.663 and 0.607, respectively). Also, QSPAM and QUVP had a significant correlation with each other (P < 0.001). Cerebral atrophy made a significant difference in the QSPAM and QUVP of caudate nucleus regions with decreased (18)F-FP-CIT uptake. Simple quantitative measurements of QSPAM and QUVP showed acceptable agreement with visual grading. Although QSPAM in some groups may be influenced by cerebral atrophy, these simple methods are expected to be effective in the quantitative analysis of (18)F-FP-CIT PET in usual clinical practice.
A Role for Chunk Formation in Statistical Learning of Second Language Syntax
ERIC Educational Resources Information Center
Hamrick, Phillip
2014-01-01
Humans are remarkably sensitive to the statistical structure of language. However, different mechanisms have been proposed to account for such statistical sensitivities. The present study compared adult learning of syntax and the ability of two models of statistical learning to simulate human performance: Simple Recurrent Networks, which learn by…
Robust Combining of Disparate Classifiers Through Order Statistics
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep
2001-01-01
Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and, in general, the ith order statistic are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
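The combiners studied here are simple to state in code: sort the per-class scores across classifiers and combine with the median, the maximum, or a trimmed mean. The scores below are hypothetical, with one deliberately wrong classifier to show why the median and trimmed combiners are more robust than the max.

```python
# Minimal sketch: order-statistic combiners for classifier outputs.
import numpy as np

# rows: classifiers, columns: classes (posterior estimates for one input)
scores = np.array([[0.70, 0.20, 0.10],
                   [0.60, 0.30, 0.10],
                   [0.10, 0.80, 0.10],   # one badly wrong classifier
                   [0.65, 0.25, 0.10]])

med = np.median(scores, axis=0)                    # median combiner
mx = scores.max(axis=0)                            # max combiner
trim = np.sort(scores, axis=0)[1:-1].mean(axis=0)  # trimmed (drop min & max)

for name, comb in [("median", med), ("max", mx), ("trim", trim)]:
    print(name, "->", comb.round(3), "predict:", comb.argmax())
# median and trim predict class 0; max is fooled by the outlier classifier
```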
Counting statistics for genetic switches based on effective interaction approximation
NASA Astrophysics Data System (ADS)
Ohkubo, Jun
2012-09-01
Applicability of counting statistics to a system with an infinite number of states is investigated. Counting statistics has been studied extensively for systems with a finite number of states. While it is possible in principle to use the scheme to count specific transitions in a system with an infinite number of states, one obtains non-closed equations in general. A simple genetic switch can be described by a master equation with an infinite number of states, and we use counting statistics to count the number of transitions from the inactive to the active state of the gene. To avoid the non-closed equations, an effective interaction approximation is employed. As a result, it is shown that the switching problem can be treated approximately as a simple two-state model, which immediately indicates that the switching obeys non-Poisson statistics.
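The approximate two-state picture lends itself to a direct stochastic simulation: draw exponential waiting times for each switch and count the inactive-to-active transitions. The rates below are hypothetical.

```python
# Minimal sketch: counting activations of a two-state switch (Gillespie-style).
import numpy as np

rng = np.random.default_rng(7)
k_on, k_off = 0.5, 1.5   # activation / deactivation rates (hypothetical)
t, t_end, state = 0.0, 1000.0, 0
n_activations = 0

while t < t_end:
    rate = k_on if state == 0 else k_off
    t += rng.exponential(1.0 / rate)  # waiting time to the next switch
    if state == 0:
        n_activations += 1            # count inactive -> active transitions
    state = 1 - state

print("activations:", n_activations)
# Long-run activation rate is k_on*k_off/(k_on+k_off) = 0.375 per unit time,
# so roughly 375 activations are expected here.
```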
Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices
NASA Astrophysics Data System (ADS)
Passemier, Damien; McKay, Matthew R.; Chen, Yang
2015-07-01
Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.
Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas
2011-01-01
Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists that captures all the information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison shows that our boosting implementation performs well relative to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that for recent sweeps integrated haplotype homozygosity is very informative, whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556
Implementing self sustained quality control procedures in a clinical laboratory.
Khatri, Roshan; K C, Sanjay; Shrestha, Prabodh; Sinha, J N
2013-01-01
Quality control is an essential component of every clinical laboratory: it maintains the excellence of laboratory standards, contributes to proper disease diagnosis and patient care, and results in overall strengthening of the health care system. Numerous quality control schemes are available, with combinations of procedures, most of which are tedious, time consuming and can be "too technical", whereas commercially available quality control materials can be expensive, especially for laboratories in developing nations like Nepal. Here, we present a procedure performed at our centre with self-prepared control serum and the use of simple statistical tools for quality assurance. The pooled serum was prepared as per guidelines for the preparation of stabilized liquid quality control serum from human sera. Internal quality assessment was performed on this sample on a daily basis and included measurement of 12 routine biochemical parameters. The results were plotted on Levey-Jennings charts and analysed with quality control rules for a period of one month. The mean levels of biochemical analytes in the self-prepared control serum were within the normal physiological range. This serum was evaluated every day along with patients' samples. The results obtained were plotted on control charts and analysed using common quality control rules to identify possible systematic and random errors. Immediate mitigation measures were taken and the dispatch of erroneous reports was avoided. In this study we highlight a simple internal quality control procedure which can be performed by laboratories with minimal technology, expenditure, and expertise, improving the reliability and validity of test reports.
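The daily charting step reduces to computing z-scores of control values against the pooled serum's target mean and SD, then applying simple control rules. A minimal sketch with two Westgard-style rules (1-2s warning, 1-3s rejection); the analyte, target values, and daily results are hypothetical.

```python
# Minimal sketch: Levey-Jennings-style QC flags from daily control values.
import numpy as np

target_mean, target_sd = 5.2, 0.15  # mmol/L glucose (hypothetical pooled serum)
daily = np.array([5.25, 5.10, 5.34, 5.05, 5.55, 5.18, 4.70])

z = (daily - target_mean) / target_sd
for day, zi in enumerate(z, start=1):
    if abs(zi) > 3:
        flag = "1-3s REJECT"   # random or large systematic error
    elif abs(zi) > 2:
        flag = "1-2s warn"     # inspect before reporting
    else:
        flag = "ok"
    print(f"day {day}: z = {zi:+.2f}  {flag}")
```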
NASA Astrophysics Data System (ADS)
Baasch, Benjamin; Müller, Hendrik; von Dobeneck, Tilo; Oberle, Ferdinand K. J.
2017-05-01
The electric conductivity and magnetic susceptibility of sediments are fundamental parameters in environmental geophysics. Both can be derived from marine electromagnetic profiling, a novel, fast and non-invasive seafloor mapping technique. Here we present statistical evidence that electric conductivity and magnetic susceptibility can help to determine physical grain-size characteristics (size, sorting and mud content) of marine surficial sediments. Electromagnetic data acquired with the bottom-towed electromagnetic profiler MARUM NERIDIS III were analysed and compared with grain-size data from 33 samples across the NW Iberian continental shelf. A negative correlation between mean grain size and conductivity (R = -0.79) as well as between mean grain size and susceptibility (R = -0.78) was found. Simple and multiple linear regression analyses were carried out to predict mean grain size, mud content and the standard deviation of the grain-size distribution from conductivity and susceptibility. The comparison of the two approaches showed that multiple linear regression models predict the grain-size distribution characteristics better than the simple models. This exemplary study demonstrates that electromagnetic benthic profiling is capable of estimating the mean grain size, sorting and mud content of marine surficial sediments at a very high significance level. Transfer functions can be calibrated using grain-size data from a few reference samples and extrapolated along shelf-wide survey lines. This study suggests that electromagnetic benthic profiling should play a larger role in coastal zone management, seafloor contamination and sediment provenance studies in continental shelf systems worldwide.
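The simple-versus-multiple regression comparison is easy to sketch. The data below are simulated (33 hypothetical samples with a built-in negative dependence on both predictors, loosely mirroring the reported correlations), assuming scikit-learn is available.

```python
# Minimal sketch: simple vs. multiple linear regression for mean grain size.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
conductivity = rng.uniform(0.5, 2.0, 33)      # S/m (hypothetical)
susceptibility = rng.uniform(10, 200, 33)     # 1e-5 SI (hypothetical)
grain_size = (8 - 2.0 * conductivity - 0.01 * susceptibility
              + rng.normal(0, 0.3, 33))       # phi units (simulated)

X_simple = conductivity.reshape(-1, 1)
X_multi = np.column_stack([conductivity, susceptibility])

for name, X in [("simple", X_simple), ("multiple", X_multi)]:
    model = LinearRegression().fit(X, grain_size)
    print(name, "R^2 =", round(model.score(X, grain_size), 3))
```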
Association factor analysis between osteoporosis with cerebral artery disease: The STROBE study.
Jin, Eun-Sun; Jeong, Je Hoon; Lee, Bora; Im, Soo Bin
2017-03-01
The purpose of this study was to determine the clinical factors linking osteoporosis and cerebral artery disease in a Korean population. Two hundred nineteen postmenopausal women and men undergoing cerebral computed tomography angiography were enrolled in this cross-sectional study to evaluate cerebral artery disease. Cerebral artery disease was diagnosed if there was narrowing of 50% or more of the diameter in one or more cerebral arteries or the presence of vascular calcification. History of osteoporotic fracture was assessed using medical records and radiographic data such as simple radiography, MRI, and bone scans. Bone mineral density was measured by dual-energy x-ray absorptiometry. We reviewed clinical characteristics in all patients and also performed retrospective subgroup analyses for the total and the extracranial/intracranial cerebral artery disease groups. Statistical analysis was performed by means of the chi-square test or Fisher's exact test for categorical variables and Student's t-test or Wilcoxon's rank sum test for continuous variables. Univariate and multivariate logistic regression analyses were also conducted to assess the factors associated with the prevalence of cerebral artery disease. A two-tailed p-value of less than 0.05 was considered statistically significant. All statistical analyses were performed using R (version 3.1.3; The R Foundation for Statistical Computing, Vienna, Austria) and SPSS (version 14.0; SPSS, Inc, Chicago, Ill, USA). Of the 219 patients, 142 had cerebral artery disease. Vertebral fractures were observed in 29 (13.24%) patients. There was a significant difference in hip fracture according to the presence or absence of cerebral artery disease. In logistic regression analysis, osteoporotic hip fracture was significantly associated with extracranial cerebral artery disease after adjusting for multiple risk factors. In females, osteoporotic hip fracture was associated with total calcified cerebral artery disease. Clinical factors such as age, hypertension, osteoporotic hip fracture, smoking history, and anti-osteoporosis drug use were associated with cerebral artery disease.
NASA Astrophysics Data System (ADS)
Funk, C. C.; Shukla, S.; Hoerling, M. P.; Robertson, F. R.; Hoell, A.; Liebmann, B.
2013-12-01
During boreal spring, eastern portions of Kenya and Somalia have experienced more frequent droughts since 1999. Given the region's high levels of food insecurity, better predictions of these droughts could provide substantial humanitarian benefits. We show that dynamical-statistical seasonal climate forecasts, based on the latest generation of coupled atmosphere-ocean and uncoupled atmospheric models, effectively predict boreal spring rainfall in this area. Skill sources are assessed by comparing ensembles driven with full-ocean forcing with ensembles driven with ENSO-only sea surface temperatures (SSTs). Our analysis suggests that both ENSO and non-ENSO Indo-Pacific SST forcing have played an important role in the increase in drought frequencies. Over the past 30 years, La Niña drought teleconnections have strengthened, while non-ENSO Indo-Pacific convection patterns have also supported increased (decreased) Western Pacific (East African) rainfall. To further examine the relative contributions of ENSO, low-frequency warming, and the Pacific Decadal Oscillation, we present decompositions of ECHAM5, GFS, CAM4 and GMAO AMIP simulations. These decompositions suggest that rapid warming in the western Pacific and steeper western-to-central Pacific SST gradients have likely played an important role in the recent intensification of the Walker circulation and the associated increase in East African aridity. A linear combination of time series describing the Pacific Decadal Oscillation and the strength of Indo-Pacific warming is shown to track East African rainfall reasonably well. The talk concludes with a few thoughts on the potentially important interplay of attribution and prediction. At least for recent East African droughts, it appears that a characteristic Indo-Pacific SST and precipitation anomaly pattern can be linked statistically to support forecasts and attribution analyses. The combination of traditional AGCM attribution analyses with simple yet physically plausible statistical estimation procedures may help us better untangle some climate mysteries.
Winzer, Klaus-Jürgen; Buchholz, Anika; Schumacher, Martin; Sauerbrei, Willi
2016-01-01
Background: Prognostic factors and prognostic models play a key role in medical research and patient management. The Nottingham Prognostic Index (NPI) is a well-established prognostic classification scheme for patients with breast cancer. In a very simple way, it combines the information from tumor size, lymph node stage and tumor grade. For the resulting index, cutpoints are proposed to classify it into three to six groups with different prognoses. Because not all prognostic information from the three and other standard factors is used, we consider improving the prognostic ability using suitable analysis approaches. Methods and Findings: Reanalyzing overall survival data of 1560 patients from a clinical database using multivariable fractional polynomials and further modern statistical methods, we illustrate suitable multivariable modelling and methods to derive and assess the prognostic ability of an index. Using a REMARK-type profile, we summarize the relevant steps of the analysis. Adding the information from hormonal receptor status and using the full information from the three NPI components, specifically the number of positive lymph nodes, an extended NPI with improved prognostic ability is derived. Conclusions: The prognostic ability of even one of the best-established prognostic indices in medicine can be improved by using suitable statistical methodology to extract the full information from standard clinical data. This extended version of the NPI can serve as a benchmark to assess the added value of new information, ranging from a new single clinical marker to a derived index from omics data. An established benchmark would also help to harmonize the statistical analyses of such studies and protect against the propagation of many false promises concerning the prognostic value of new measurements. The statistical methods used are generally available and can be used for similar analyses in other diseases. PMID:26938061
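For context, the standard NPI combination is usually given as 0.2 x tumor size (cm) + lymph node stage (1-3) + tumor grade (1-3), with published cutpoints varying somewhat between groups. A minimal sketch of the calculation, using commonly cited cutpoints as an assumption:

```python
# Minimal sketch: the standard NPI formula (cutpoints vary across publications).
def npi(size_cm: float, node_stage: int, grade: int) -> float:
    return 0.2 * size_cm + node_stage + grade

value = npi(size_cm=2.2, node_stage=2, grade=3)
print(value)  # 5.44 -> "poor prognosis" under the commonly cited >5.4 cutpoint
```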
ERIC Educational Resources Information Center
Lu, Yonggang; Henning, Kevin S. S.
2013-01-01
Spurred by recent writings regarding statistical pragmatism, we propose a simple, practical approach to introducing students to a new style of statistical thinking that models nature through the lens of data-generating processes, not populations. (Contains 5 figures.)
Statistics Using Just One Formula
ERIC Educational Resources Information Center
Rosenthal, Jeffrey S.
2018-01-01
This article advocates that introductory statistics be taught by basing all calculations on a single simple margin-of-error formula and deriving all of the standard introductory statistical concepts (confidence intervals, significance tests, comparisons of means and proportions, etc) from that one formula. It is argued that this approach will…
Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul
2017-08-01
The six-item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version's validity. Composite reliability was tested to determine the scale's reliability. All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single-factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6), for the Malaysian population.
Data series embedding and scale invariant statistics.
Michieli, I; Medved, B; Ristov, S
2010-06-01
Data sequences acquired from bio-systems such as human gait data, heart rate interbeat data, or DNA sequences exhibit complex dynamics that are frequently described by a long-memory or power-law decay of the autocorrelation function. One way of characterizing that dynamics is through scale invariant statistics or "fractal-like" behavior. Several methods have been proposed for quantifying scale invariant parameters of physiological signals. Among them the most common are detrended fluctuation analysis, sample mean variance analyses, power spectral density analysis, R/S analysis, and, more recently in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in a high-dimensional pseudo-phase space reveals scale invariant statistics in a simple fashion. The procedure is applied to different stride interval data sets from human gait measurement time series (PhysioBank data library). Results show that the introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were analyzed and scale-free trends for limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with varying noise content. The possibility for the method to falsely detect long-range dependence in artificially generated short-range dependence series was investigated. (c) 2009 Elsevier B.V. All rights reserved.
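The core operation, mapping a scalar series into a high-dimensional pseudo-phase space, is standard delay embedding; a minimal sketch (the paper's specific embedding parameters are not given in the abstract):

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Map a scalar series into a dim-dimensional pseudo-phase space
    using time-delay coordinates with delay tau."""
    x = np.asarray(x)
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# e.g., embed a stride-interval-like series in 5 dimensions with delay 2
points = delay_embed(np.sin(0.1 * np.arange(500)), dim=5, tau=2)
```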
A scoring system for ascertainment of incident stroke; the Risk Index Score (RISc).
Kass-Hout, T A; Moyé, L A; Smith, M A; Morgenstern, L B
2006-01-01
The main objective of this study was to develop and validate a computer-based statistical algorithm that could be translated into a simple scoring system in order to ascertain incident stroke cases using hospital admission medical records data. The Risk Index Score (RISc) algorithm was developed using data collected prospectively by the Brain Attack Surveillance in Corpus Christi (BASIC) project, 2000. The validity of RISc was evaluated by estimating the concordance of scoring-system stroke ascertainment with stroke ascertainment by physician and/or abstractor review of hospital admission records. RISc was developed on 1718 randomly selected patients (training set) and then statistically validated on an independent sample of 858 patients (validation set). A multivariable logistic model was used to develop RISc and was subsequently evaluated by goodness-of-fit and receiver operating characteristic (ROC) analyses. The higher the value of RISc, the higher the patient's risk of potential stroke. The study showed RISc was well calibrated and discriminated those who had potential stroke from those who did not on initial screening. In this study we developed and validated a rapid, easy, efficient, and accurate method to ascertain incident stroke cases from routine hospital admission records for epidemiologic investigations. Validation of this scoring system was achieved statistically; however, clinical validation in a community hospital setting is warranted.
Effect of crowd size on patient volume at a large, multipurpose, indoor stadium.
De Lorenzo, R A; Gray, B C; Bennett, P C; Lamparella, V J
1989-01-01
A prediction of patient volume expected at "mass gatherings" is desirable in order to provide optimal on-site emergency medical care. While several methods of predicting patient loads have been suggested, a reliable technique has not been established. This study examines the frequency of medical emergencies at the Syracuse University Carrier Dome, a 50,500-seat indoor stadium. Patient volume and level of care at collegiate basketball and football games, as well as at rock concerts, over a 7-year period were examined and tabulated. This information was analyzed using simple regression and nonparametric statistical methods to determine the level of correlation between crowd size and patient volume. These analyses demonstrated no statistically significant increase in patient volume with increasing crowd size for basketball and football events. There was a small but statistically significant increase in patient volume with increasing crowd size for concerts. A comparison of similar crowd sizes for each of the three event types showed that patient frequency is greatest for concerts and smallest for basketball. The study suggests that crowd size alone has only a minor influence on patient volume at any given event. Structuring medical services based solely on expected crowd size, without considering other influences such as event type and duration, may give poor results.
easyGWAS: A Cloud-Based Platform for Comparing the Results of Genome-Wide Association Studies.
Grimm, Dominik G; Roqueiro, Damian; Salomé, Patrice A; Kleeberger, Stefan; Greshake, Bastian; Zhu, Wangsheng; Liu, Chang; Lippert, Christoph; Stegle, Oliver; Schölkopf, Bernhard; Weigel, Detlef; Borgwardt, Karsten M
2017-01-01
The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana, using flowering and growth-related traits. © 2016 American Society of Plant Biologists. All rights reserved.
Simple taper: Taper equations for the field forester
David R. Larsen
2017-01-01
"Simple taper" is set of linear equations that are based on stem taper rates; the intent is to provide taper equation functionality to field foresters. The equation parameters are two taper rates based on differences in diameter outside bark at two points on a tree. The simple taper equations are statistically equivalent to more complex equations. The linear...
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
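The analysis the activity asks of students is a single ordinary least-squares fit of concentration against time; a minimal sketch with placeholder values (not the archive data):

```python
import numpy as np

# Placeholder yearly values: year vs atmospheric CFC concentration (ppt)
year = np.array([1995, 1998, 2001, 2004, 2007, 2010])
cfc = np.array([270.0, 268.5, 266.0, 263.8, 261.5, 259.0])

slope, intercept = np.polyfit(year, cfc, 1)
print(f"trend: {slope:.2f} ppt per year")   # a negative slope after the Protocol
```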
Carter, Rebecca R; DiFeo, Analisa; Bogie, Kath; Zhang, Guo-Qiang; Sun, Jiayang
2014-01-01
Ovarian cancer is the most lethal gynecologic disease in the United States, with more women dying from this cancer than from all other gynecological cancers combined. Ovarian cancer has been termed the "silent killer" because some patients do not show clear symptoms at an early stage. Currently, there is a lack of approved and effective early diagnostic tools for ovarian cancer. There is also an apparent severe knowledge gap of ovarian cancer in general and of its indicative symptoms among both the public and many health professionals. These factors have significantly contributed to the late-stage diagnosis of most ovarian cancer patients (63% are diagnosed at Stage III or above), where the 5-year survival rate is less than 30%. The actual extent of knowledge concerning ovarian cancer in the United States has been unknown. The present investigation examined current public awareness and knowledge about ovarian cancer. The study implemented design strategies to develop an unbiased survey with quality-control measures, including the modern application of multiple statistical analyses. The survey assessed a reasonable proxy of the US population by crowdsourcing participants through the online task marketplace Amazon Mechanical Turk, at a fraction of the cost and time of traditional recruitment methods. Knowledge of ovarian cancer was compared to that of breast cancer using repeated measures, bias control and other quality-control measures in the survey design. Analyses included multinomial logistic regression and categorical data analysis procedures such as correspondence analysis, among other statistics. We confirmed the relatively poor public knowledge of ovarian cancer among the US population. The simple, yet novel design should set an example for designing surveys to obtain quality data via Amazon Mechanical Turk with the associated analyses.
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
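The adjustment the authors describe is a logistic regression with transcript length as a covariate; below is a sketch on synthetic genes in which length drives both "significance" and category membership, so that the unadjusted association is spurious (the paper's exact model specification and data differ):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
log_len = rng.normal(7.5, 1.0, n)                 # synthetic log transcript lengths
p_long = 1.0 / (1.0 + np.exp(-(log_len - 7.5)))
de_sig = rng.binomial(1, p_long)                  # longer genes test "significant" more often
member = rng.binomial(1, p_long)                  # a GO category enriched for long genes

# Unadjusted: membership looks associated with DE purely through length
X0 = sm.add_constant(member.astype(float))
print(sm.GLM(de_sig, X0, family=sm.families.Binomial()).fit().params)

# Adjusted: adding log length as a covariate removes the spurious association
X1 = sm.add_constant(np.column_stack([member, log_len]))
print(sm.GLM(de_sig, X1, family=sm.families.Binomial()).fit().params)
```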
Asymptotically Optimal and Private Statistical Estimation
NASA Astrophysics Data System (ADS)
Smith, Adam
Differential privacy is a definition of "privacy" for statistical databases. The definition is simple, yet it implies strong semantics even in the presence of an adversary with arbitrary auxiliary information about the database.
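The abstract is terse; the canonical example of a differentially private statistical release is the Laplace mechanism, sketched below as a generic illustration (not the specific estimators analyzed in the talk):

```python
import numpy as np

def dp_mean(x, lo, hi, epsilon, seed=None):
    """Release the mean of values clipped to [lo, hi] with epsilon-differential
    privacy, using the Laplace mechanism."""
    rng = np.random.default_rng(seed)
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    sensitivity = (hi - lo) / len(x)   # worst-case effect of changing one record
    return x.mean() + rng.laplace(0.0, sensitivity / epsilon)

print(dp_mean([23, 35, 41, 58, 62], lo=18, hi=90, epsilon=0.5))
```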
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Tutorial (ST) is designed as a follow-up to Statistical Analysis of Research Data (SARD), held in April 2018. The tutorial will apply the general principles of statistical analysis of research data, including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and the Chi-squared distribution.
Using Statistical Process Control to Make Data-Based Clinical Decisions.
ERIC Educational Resources Information Center
Pfadt, Al; Wheeler, Donald J.
1995-01-01
Statistical process control (SPC), which employs simple statistical tools and problem-solving techniques such as histograms, control charts, flow charts, and Pareto charts to implement continual product improvement procedures, can be incorporated into human service organizations. Examples illustrate use of SPC procedures to analyze behavioral data…
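Wheeler's SPC toolkit centers on control charts; a minimal sketch of individuals-chart (XmR) limits, one of the simple tools the digest mentions (a generic illustration, not taken from the article):

```python
import numpy as np

def xmr_limits(x):
    """Natural process limits for an individuals (XmR) control chart."""
    x = np.asarray(x, dtype=float)
    mr_bar = np.mean(np.abs(np.diff(x)))   # average moving range
    center = x.mean()
    # 2.66 = 3 / d2 with d2 = 1.128 for moving ranges of size 2
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

lcl, center, ucl = xmr_limits([7, 9, 8, 11, 10, 9, 12, 8])
print(lcl, center, ucl)   # points outside (lcl, ucl) signal special causes
```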
Atlas of susceptibility to pollution in marinas. Application to the Spanish coast.
Gómez, Aina G; Ondiviela, Bárbara; Fernández, María; Juanes, José A
2017-01-15
An atlas of susceptibility to pollution of 320 Spanish marinas is provided. Susceptibility is assessed through a simple, fast and low-cost empirical method estimating the flushing capacity of marinas. The Complexity Tidal Range Index (CTRI) was selected among eleven empirical methods. The CTRI method was selected by means of statistical analyses because it contributes to explaining the system's variance, it is highly correlated with numerical model results, and it is sensitive to marinas' location and typology. The process of implementation on the Spanish coast confirmed its usefulness, versatility and adaptability as a tool for the environmental management of marinas worldwide. The atlas of susceptibility, assessed through CTRI values, is an appropriate instrument for prioritizing environmental and planning strategies at a regional scale. Copyright © 2016 Elsevier Ltd. All rights reserved.
Self-organization of cosmic radiation pressure instability. II - One-dimensional simulations
NASA Technical Reports Server (NTRS)
Hogan, Craig J.; Woods, Jorden
1992-01-01
The clustering of statistically uniform discrete absorbing particles moving solely under the influence of radiation pressure from uniformly distributed emitters is studied in a simple one-dimensional model. Radiation pressure tends to amplify statistical clustering in the absorbers; the absorbing material is swept into empty bubbles, the biggest bubbles grow bigger almost as they would in a uniform medium, and the smaller ones get crushed and disappear. Numerical simulations of a one-dimensional system are used to support the conjecture that the system is self-organizing. Simple statistics indicate that a wide range of initial conditions produce structure approaching the same self-similar statistical distribution, whose scaling properties follow those of the attractor solution for an isolated bubble. The importance of the process for large-scale structuring of the interstellar medium is briefly discussed.
S-SPatt: simple statistics for patterns on Markov chains.
Nuel, Grégory
2005-07-01
S-SPatt allows the counting of pattern occurrences in text files and, assuming these texts are generated from a random Markovian source, the computation of the P-value of a given observation using a simple binomial approximation.
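Under that approximation the computation reduces to a binomial tail; a sketch with hypothetical inputs (S-SPatt itself derives the per-position probability from the fitted Markov model):

```python
from scipy.stats import binom

def pattern_pvalue(k_obs, n_positions, p_occ):
    """P-value of seeing at least k_obs occurrences of a pattern, given
    n_positions possible positions and per-position probability p_occ."""
    return binom.sf(k_obs - 1, n_positions, p_occ)

print(pattern_pvalue(12, 10_000, 5e-4))   # hypothetical values
```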
Birefringence of Cellotape: Jones Representation and Experimental Analysis
ERIC Educational Resources Information Center
Belendez, Augusto; Fernandez, Elena; Frances, Jorge; Neipp, Cristian
2010-01-01
In this paper, we analyse a simple experiment to study the effects of polarized light. A simple optical system composed of a polarizer, a retarder (cellotape) and an analyser is used to study the effect on the polarization state of the light which impinges on the setup. The optical system is characterized by means of a Jones matrix, and a simple…
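The Jones-matrix bookkeeping in such a setup is compact enough to script; a sketch under textbook conventions, with ideal elements and rotation matrices (the paper's exact configuration may differ):

```python
import numpy as np

def rot(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

def polarizer(theta):
    """Jones matrix of an ideal linear polarizer with axis at angle theta."""
    return rot(theta) @ np.array([[1, 0], [0, 0]]) @ rot(-theta)

def retarder(delta, theta):
    """Linear retarder with phase delta and fast axis at angle theta."""
    return rot(theta) @ np.array([[1, 0], [0, np.exp(1j * delta)]]) @ rot(-theta)

E_in = np.array([1.0, 0.0])   # horizontally polarized input light
# polarizer -> quarter-wave-like retarder at 45 degrees -> crossed analyser
E_out = polarizer(np.pi / 2) @ retarder(np.pi / 2, np.pi / 4) @ polarizer(0.0) @ E_in
print(np.sum(np.abs(E_out) ** 2))   # transmitted intensity (0.5 in this case)
```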
Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh
2014-01-01
Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of genome. A total of 534 complete chloroplast genome sequences (as on 18 September 2014) of Viridiplantae are available at NCBI organelle genome resource. It provides opportunity to mine these genomes for the detection of SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb which is a relational database developed using SQL server 2008 and accessed through ASP.NET. It provides information of all the three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with majority lying in non-coding region. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) being the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
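Perfect-SSR mining of the kind such a database is built on can be prototyped with a backreference regex; a sketch (ChloroSSRdb's actual mining tool is not specified here, and real pipelines typically use different minimum repeat numbers per motif length):

```python
import re

def find_ssrs(seq, min_copies=4):
    """Locate perfect SSRs: 1-6 nt motifs tandemly repeated >= min_copies times."""
    pat = re.compile(r'(([ACGT]{1,6}?)\2{%d,})' % (min_copies - 1))
    return [(m.start(), m.group(2), len(m.group(1)) // len(m.group(2)))
            for m in pat.finditer(seq)]

# (position, motif, copy number) for each perfect repeat found
print(find_ssrs("GGATATATATATCCAAGCTAGCTAGCTAGCTTT"))
```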
Using complexity metrics with R-R intervals and BPM heart rate measures
Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie
2013-01-01
Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in the social and cognitive sciences, heart rate is increasingly employed as a measure of arousal, emotional engagement and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate in mapping the temporal dynamics of heart rate, and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on the variability of the data, different choices regarding the kind of measure can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R interval) and beats-per-min (BPM). As a proof-of-concept, we employ a simple rest-exercise-rest task and show that non-linear statistics—fractal (DFA) and recurrence (RQA) analyses—reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data that is employed: while R-R intervals are very susceptible to non-linear analyses, the success of non-linear methods for BPM data critically depends on their construction. Generally, "oversampled" BPM time-series can be recommended as they retain most of the information about non-linear aspects of heart beat dynamics. PMID:23964244
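Of the two non-linear tools compared, DFA is easy to sketch; a minimal first-order DFA on placeholder data (a generic illustration, not the authors' pipeline):

```python
import numpy as np

def dfa_alpha(x, scales):
    """First-order detrended fluctuation analysis; returns the scaling exponent.
    alpha ~ 0.5 for uncorrelated noise, ~ 1.0 for 1/f-like long memory."""
    y = np.cumsum(np.asarray(x, dtype=float) - np.mean(x))   # integrated profile
    F = []
    for s in scales:
        n = len(y) // s
        segs = y[: n * s].reshape(n, s)
        t = np.arange(s)
        res = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segs]
        F.append(np.sqrt(np.mean(np.square(res))))           # RMS fluctuation F(s)
    return np.polyfit(np.log(scales), np.log(F), 1)[0]

rr = np.random.default_rng(0).standard_normal(2000)          # placeholder R-R series
print(dfa_alpha(rr, scales=[16, 32, 64, 128, 256]))          # ~0.5 for white noise
```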
Rules vs. Statistics: Insights from a Highly Inflected Language
Mirković, Jelena; Seidenberg, Mark S.; Joanisse, Marc F.
2011-01-01
Inflectional morphology has been taken as a paradigmatic example of rule-governed grammatical knowledge (Pinker, 1999). The plausibility of this claim may be related to the fact that it is mainly based on studies of English, which has a very simple inflectional system. We examined the representation of inflectional morphology in Serbian, which encodes number, gender and case for nouns. Linguists standardly characterize this system as a complex set of rules, with disagreements about their exact form. We present analyses of a large corpus of nouns which showed that, as in English, Serbian inflectional morphology is quasiregular: it exhibits numerous partial regularities creating neighborhoods that vary in size and consistency. We then asked whether a simple connectionist network could encode this statistical information in a manner that also supported generalization. A network trained on 3,244 Serbian nouns learned to produce correctly inflected phonological forms from a specification of a word’s lemma, gender, number and case, and generalized to untrained cases. The model’s performance was sensitive to variables that also influence human performance, including surface and lemma frequency. It was also influenced by inflectional neighborhood size, a novel measure of the consistency of meaning to form mapping. A word naming experiment with native Serbian speakers showed that this measure also affects human performance. The results suggest that, as in English, generating correctly inflected forms involves satisfying a small number of simultaneous probabilistic constraints relating form and meaning. Thus, common computational mechanisms may govern the representation and use of inflectional information across typologically diverse languages. PMID:21564267
Brusniak, Mi-Youn; Bodenmiller, Bernd; Campbell, David; Cooke, Kelly; Eddes, James; Garbutt, Andrew; Lau, Hollis; Letarte, Simon; Mueller, Lukas N; Sharma, Vagisha; Vitek, Olga; Zhang, Ning; Aebersold, Ruedi; Watts, Julian D
2008-01-01
Background Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics. Results We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling. Conclusion The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field. PMID:19087345
Statistical Approaches for Spatiotemporal Prediction of Low Flows
NASA Astrophysics Data System (ADS)
Fangmann, A.; Haberlandt, U.
2017-12-01
An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion in model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. The basis for all of them is multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three involves spatiotemporal prediction of an index value; method four, estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near-future period, but since it relies on a stationary distribution, its application for prediction of far-future changes may be problematic. Spatiotemporal prediction of L-moments appeared highly uncertain for higher-order moments, resulting in unrealistic future low flow values. All in all, the results promote the inclusion of simple statistical methods in climate change impact assessment.
Perel, Pablo; Edwards, Phil; Shakur, Haleema; Roberts, Ian
2008-11-06
Traumatic brain injury (TBI) is an important cause of acquired disability. In evaluating the effectiveness of clinical interventions for TBI it is important to measure disability accurately. The Glasgow Outcome Scale (GOS) is the most widely used outcome measure in randomised controlled trials (RCTs) in TBI patients. However, the GOS is generally measured 6 months after discharge, by which time loss to follow-up may have occurred. The objectives of this study were to evaluate the association and predictive validity between a simple disability scale at hospital discharge, the Oxford Handicap Scale (OHS), and the GOS at 6 months among TBI patients. The study was a secondary analysis of a randomised clinical trial among TBI patients (MRC CRASH Trial). A Spearman correlation was estimated to evaluate the association between the OHS and GOS. The validity of different dichotomies of the OHS for predicting GOS at 6 months was assessed by calculating sensitivity, specificity and the C statistic. Uni- and multivariate logistic regression models were fitted, including the OHS as an explanatory variable. For each model we analysed its discrimination and calibration. We found that the OHS is highly correlated with GOS at 6 months (Spearman correlation 0.75), with evidence of a linear relationship between the two scales. The OHS dichotomy that separates patients with severe dependency or death showed the greatest discrimination (C statistic: 84.3). Among survivors at hospital discharge the OHS showed very good discrimination (C statistic 0.78) and excellent calibration when used to predict GOS outcome at 6 months. We have shown that the OHS, a simple disability scale available at hospital discharge, can predict disability accurately, according to the GOS, at 6 months. The OHS could be used to improve the design and analysis of clinical trials in TBI patients and may also provide a valuable clinical tool for physicians to improve communication with patients and relatives when assessing a patient's prognosis at hospital discharge.
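The two headline statistics, the Spearman correlation between the scales and the C statistic of a dichotomized outcome, are one-liners with scipy and scikit-learn; a sketch on hypothetical scores (not the trial data):

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

# Hypothetical scores: OHS at discharge (0-5) and GOS at 6 months (1-5)
ohs = np.array([0, 1, 2, 3, 4, 5, 2, 3])
gos = np.array([5, 5, 4, 3, 2, 1, 4, 2])

rho, p = spearmanr(ohs, gos)           # association between the two scales
unfav = (gos <= 3).astype(int)         # e.g., death to severe disability
c_stat = roc_auc_score(unfav, ohs)     # discrimination (C statistic)
print(rho, c_stat)
```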
Kyek, Martin; Lindner, Robert
2017-01-01
This study focuses on the population trends of two widespread European anuran species: the common toad (Bufo bufo) and the common frog (Rana temporaria). The basis of this study is data gathered over two decades of amphibian fencing alongside roads in the Austrian state of Salzburg. Different statistical approaches were used to analyse the data. The overall average increase or decrease of each species was estimated by calculating a simple average locality index. In addition, the statistical software TRIM was used to verify these trends as well as to categorize the data based on the geographic location of each migration site. The results show differing overall trends for the two species: the common toad being stable and the common frog showing a substantial decline over the last two decades. Further analyses based on geographic categorization reveal the strongest decrease in the alpine range of the species. Drainage and agricultural intensification are still ongoing problems within alpine areas, not only in Salzburg. Particularly with respect to micro-climate and the availability of spawning places, these changes appear to have a greater impact on the habitats of the common frog than of the common toad. Therefore we consider habitat destruction to be the main potential reason behind this dramatic decline. We also conclude that the substantial loss of biomass of a widespread species such as the common frog must have a severe, and often overlooked, ecological impact. PMID:29121054
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
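The pattern, loading similarity-search hits into a relational table and then summarizing with GROUP BY, can be tried in a few lines with Python's built-in sqlite3. The schema here is a hypothetical miniature, not the published seqdb_demo/search_demo schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hits (query TEXT, subject TEXT, evalue REAL, taxon TEXT)")
con.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)",
                [("query1", "subjA", 1e-40, "Bacteria"),
                 ("query1", "subjB", 1e-12, "Eukaryota"),
                 ("query1", "subjC", 3e-20, "Archaea")])
# Best (lowest-E-value) hit per query within each taxonomic kingdom
for row in con.execute("""SELECT query, taxon, MIN(evalue)
                          FROM hits GROUP BY query, taxon"""):
    print(row)
```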
The Threshold Bias Model: A Mathematical Model for the Nomothetic Approach of Suicide
Folly, Walter Sydney Dutra
2011-01-01
Background Comparative and predictive analyses of suicide data from different countries are difficult to perform due to varying approaches and the lack of comparative parameters. Methodology/Principal Findings A simple model (the Threshold Bias Model) was tested for comparative and predictive analyses of suicide rates by age. The model comprises a six-parameter distribution that was applied to the USA suicide rates by age for the years 2001 and 2002. Subsequently, linear extrapolations of the parameter values obtained for these years were performed in order to estimate the values corresponding to the year 2003. The calculated distributions agreed reasonably well with the aggregate data. The model was also used to determine the age above which suicide rates become statistically observable in the USA, Brazil and Sri Lanka. Conclusions/Significance The Threshold Bias Model has considerable potential applications in demographic studies of suicide. Moreover, since the model can be used to predict the evolution of suicide rates based on information extracted from past data, it will be of great interest to suicidologists and other researchers in the field of mental health. PMID:21909431
Scaling and universality in urban economic diversification.
Youn, Hyejin; Bettencourt, Luís M A; Lobo, José; Strumsky, Deborah; Samaniego, Horacio; West, Geoffrey B
2016-01-01
Understanding cities is central to addressing major global challenges from climate change to economic resilience. Although increasingly perceived as fundamental socio-economic units, the detailed fabric of urban economic activities is only recently accessible to comprehensive analyses with the availability of large datasets. Here, we study abundances of business categories across US metropolitan statistical areas, and provide a framework for measuring the intrinsic diversity of economic activities that transcends scales of the classification scheme. A universal structure common to all cities is revealed, manifesting self-similarity in internal economic structure as well as aggregated metrics (GDP, patents, crime). We present a simple mathematical derivation of the universality, and provide a model, together with its economic implications of open-ended diversity created by urbanization, for understanding the observed empirical distribution. Given the universal distribution, scaling analyses for individual business categories enable us to determine their relative abundances as a function of city size. These results shed light on the processes of economic differentiation with scale, suggesting a general structure for the growth of national economies as integrated urban systems. © 2016 The Authors.
A novel health indicator for on-line lithium-ion batteries remaining useful life prediction
NASA Astrophysics Data System (ADS)
Zhou, Yapeng; Huang, Miaohua; Chen, Yupu; Tao, Ye
2016-07-01
Prediction of the remaining useful life (RUL) of lithium-ion batteries plays an important role in an intelligent battery management system. Capacity and internal resistance are often used as battery health indicators (HIs) for quantifying degradation and predicting RUL. However, on-line measurement of capacity and internal resistance is hardly realizable, because batteries are rarely fully charged and discharged in service and resistance measurement is extremely expensive. There is therefore a great need for an alternative. In this work, a novel HI is extracted from the operating parameters of lithium-ion batteries for degradation modeling and RUL prediction. Moreover, the Box-Cox transformation is employed to improve HI performance. Then Pearson and Spearman correlation analyses are utilized to evaluate the similarity between the real capacity and the estimated capacity derived from the HI. Next, both a simple statistical regression technique and an optimized relevance vector machine are employed to predict the RUL based on the presented HI. The correlation analyses and prediction results show the efficiency and effectiveness of the proposed HI for battery degradation modeling and RUL prediction.
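The HI-validation step, a Box-Cox transform followed by Pearson and Spearman correlation against capacity, maps directly onto scipy; a sketch with placeholder series (not the paper's data):

```python
import numpy as np
from scipy.stats import boxcox, pearsonr, spearmanr

# Placeholder per-cycle series: raw health indicator and measured capacity (Ah)
hi_raw = np.array([0.95, 0.93, 0.90, 0.86, 0.81, 0.75, 0.68, 0.60])
capacity = np.array([1.85, 1.82, 1.78, 1.71, 1.63, 1.52, 1.40, 1.27])

hi_bc, lam = boxcox(hi_raw)            # Box-Cox transform (positive data only)
print(pearsonr(hi_bc, capacity)[0])    # linear similarity
print(spearmanr(hi_bc, capacity)[0])   # monotonic similarity
```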
Kelley, George A.; Kelley, Kristi S.; Kohrt, Wendy M.
2013-01-01
Objective. Examine the effects of exercise on femoral neck (FN) and lumbar spine (LS) bone mineral density (BMD) in premenopausal women. Methods. Meta-analysis of randomized controlled exercise trials ≥24 weeks in premenopausal women. Standardized effect sizes (g) were calculated for each result and pooled using random-effects models, Z score alpha values, 95% confidence intervals (CIs), and number needed to treat (NNT). Heterogeneity was examined using Q and I². Moderator and predictor analyses using mixed-effects ANOVA and simple metaregression were conducted. Statistical significance was set at P ≤ 0.05. Results. Statistically significant improvements were found for both FN (7 g's, 466 participants, g = 0.342, 95% CI = 0.132, 0.553, P = 0.001, Q = 10.8, P = 0.22, I² = 25.7%, NNT = 5) and LS (6 g's, 402 participants, g = 0.201, 95% CI = 0.009, 0.394, P = 0.04, Q = 3.3, P = 0.65, I² = 0%, NNT = 9) BMD. A trend for greater benefits in FN BMD was observed for studies published in countries other than the United States and for those who participated in home versus facility-based exercise. Statistically significant, or a trend for statistically significant, associations were observed for 7 different moderators and predictors, 6 for FN BMD and 1 for LS BMD. Conclusions. Exercise benefits FN and LS BMD in premenopausal women. The observed moderators and predictors deserve further investigation in well-designed randomized controlled trials. PMID:23401684
Seismic Safety Of Simple Masonry Buildings
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guadagnuolo, Mariateresa; Faella, Giuseppe
2008-07-08
Several masonry buildings comply with the rules for simple buildings provided by seismic codes. For these buildings explicit safety verifications are not compulsory if specific code rules are fulfilled. In fact it is assumed that their fulfilment ensures a suitable seismic behaviour of buildings and thus adequate safety under earthquakes. Italian and European seismic codes differ in the requirements for simple masonry buildings, mostly concerning the building typology, the building geometry and the acceleration at site. Obviously, a wide percentage of buildings assumed simple by codes should satisfy the numerical safety verification, so that no confusion or uncertainty arises for designers who must use the codes. This paper aims at evaluating the seismic response of some simple unreinforced masonry buildings that comply with the provisions of the new Italian seismic code. Two-story buildings, having different geometry, are analysed, and results from nonlinear static analyses performed by varying the acceleration at site are presented and discussed. Indications on the congruence between code rules and results of numerical analyses performed according to the code itself are supplied and, in this context, the obtained results can provide a contribution to improving the seismic code requirements.
Using Data from Climate Science to Teach Introductory Statistics
ERIC Educational Resources Information Center
Witt, Gary
2013-01-01
This paper shows how the application of simple statistical methods can reveal to students important insights from climate data. While the popular press is filled with contradictory opinions about climate science, teachers can encourage students to use introductory-level statistics to analyze data for themselves on this important issue in public…
Using R in Introductory Statistics Courses with the pmg Graphical User Interface
ERIC Educational Resources Information Center
Verzani, John
2008-01-01
The pmg add-on package for the open source statistics software R is described. This package provides a simple to use graphical user interface (GUI) that allows introductory statistics students, without advanced computing skills, to quickly create the graphical and numeric summaries expected of them. (Contains 9 figures.)
A simple code for use in shielding and radiation dosage analyses
NASA Technical Reports Server (NTRS)
Wan, C. C.
1972-01-01
A simple code for use in analyses of gamma radiation effects in laminated materials is described. Simple, good geometry is assumed, so that all multiple collision and scattering events are excluded from consideration. The code is capable of handling laminates of up to six layers. However, for laminates of more than six layers, the same code may be used to incorporate two additional layers at a time, making use of punch-tape outputs from previous computations on all preceding layers. Spectra of attenuated radiation are obtained as both printed output and punch-tape output, as desired.
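Under the good-geometry (narrow-beam) assumption, the physics reduces to exponential attenuation multiplied across layers; a sketch with hypothetical coefficients (the report's materials and energy groups are not given here):

```python
import numpy as np

# Narrow-beam transmission through a laminate: each layer i has a hypothetical
# linear attenuation coefficient mu[i] (1/cm) and thickness x[i] (cm).
mu = np.array([0.20, 0.80, 0.15])
x = np.array([1.0, 0.5, 2.0])
transmission = np.exp(-np.sum(mu * x))   # I/I0 = exp(-sum(mu_i * x_i))
print(transmission)
```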
Identifying mechanisms for superdiffusive dynamics in cell trajectories
NASA Astrophysics Data System (ADS)
Passucci, Giuseppe; Brasch, Megan; Henderson, James; Manning, M. Lisa
Self-propelled particle (SPP) models have been used to explore features of active matter such as motility-induced phase separation, jamming, and flocking, and are often used to model biological cells. However, many cells exhibit super-diffusive trajectories, where displacements scale faster than t^(1/2) in all directions, and these are not captured by traditional SPP models. We extract cell trajectories from image stacks of mouse fibroblast cells moving on 2D substrates and find super-diffusive mean-squared displacements in all directions across varying densities. Two SPP model modifications have been proposed to capture super-diffusive dynamics: Levy walks and heterogeneous motility parameters. In mouse fibroblast cells, displacement probability distributions collapse when time is rescaled by a power greater than 1/2, which is consistent with Levy walks. We show that a simple SPP model with heterogeneous rotational noise can also generate a similar collapse. Furthermore, a close examination of statistics extracted directly from cell trajectories is consistent with a heterogeneous-mobility SPP model and inconsistent with a Levy walk model. Our work demonstrates that a simple set of analyses can distinguish between mechanisms for anomalous diffusion in active matter.
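The superdiffusion diagnostic is the scaling of the mean-squared displacement with time lag; a sketch on a placeholder random-walk trajectory, which should give an exponent near 1 (exponents above 1 signal superdiffusion):

```python
import numpy as np

def msd(traj, max_lag):
    """Mean-squared displacement of a trajectory (T x 2 array) versus time lag."""
    return np.array([np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
traj = np.cumsum(rng.standard_normal((5000, 2)), axis=0)   # placeholder random walk
lags = np.arange(1, 101)
beta = np.polyfit(np.log(lags), np.log(msd(traj, 100)), 1)[0]
print(beta)   # ~1 for ordinary diffusion; >1 indicates superdiffusion
```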
A Simple Exploration of Complexity at the Climate-Weather-Social-Conflict Nexus
NASA Astrophysics Data System (ADS)
Shaw, M.
2017-12-01
The conceptualization, exploration, and prediction of interplay between climate, weather, important resources, and social and economic (and thus political) human behavior is cast, and analyzed, in terms familiar from statistical physics and nonlinear dynamics. A simple threshold toy model is presented which emulates human tendencies either to actively engage in responses deriving, in part, from environmental circumstances or to maintain some semblance of the status quo. The model is formulated based on efforts drawn from the sociophysics literature, more specifically a model akin to spin-glass depictions of human behavior, with threshold switching of individual and collective dynamics influenced by relatively more detailed weather and land surface model (hydrological) analyses via a land data assimilation system (a custom rendition of the NASA GSFC Land Information System). Parameters relevant to human systems' sensitivity to hydroclimatology, e.g., individual and collective switching, are explored toward investigation of overall system behavior, i.e., fixed points/equilibria, oscillations, and bifurcations of systems composed of human interactions and responses to climate and weather through, e.g., agriculture. We discuss implications in terms of conceivable impacts of climate change and associated natural disasters on socioeconomics, politics, and power transfer, drawing from relatively recent literature concerning human conflict.
An automated approach to the design of decision tree classifiers
NASA Technical Reports Server (NTRS)
Argentiero, P.; Chin, R.; Beaudet, P.
1982-01-01
An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.
Li, Jie; Li, Rui; You, Leiming; Xu, Anlong; Fu, Yonggui; Huang, Shengfeng
2015-01-01
Switching between different alternative polyadenylation (APA) sites plays an important role in the fine-tuning of gene expression. New technologies for 3’-end-enriched RNA-seq allow genome-wide detection of the genes that exhibit significant APA site switching between different samples. Here, we show that the independence test gives better results than the linear trend test in detecting APA site-switching events. Further examination suggests that the discrepancy between these two statistical methods arises from complex APA site-switching events that cannot be represented by a simple change of average 3’-UTR length. In theory, the linear trend test is only effective in detecting these simple changes. We classify the switching events into four switching patterns: two simple patterns (3’-UTR shortening and lengthening) and two complex patterns. By comparing the results of the two statistical methods, we show that complex patterns account for 1/4 of all observed switching events that happen between normal and cancerous human breast cell lines. Because simple and complex switching patterns may convey different biological meanings, they merit separate study. We therefore propose to combine both the independence test and the linear trend test in practice. First, the independence test should be used to detect APA site switching; second, the linear trend test should be invoked to identify simple switching events; and third, those complex switching events that pass independence testing but fail linear trend testing can be identified. PMID:25875641
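The two competing tests can be compared on a toy read-count table; a sketch using scipy's chi-square independence test and a Cochran-Armitage-style trend statistic (not the authors' implementation; their tests operate on 3’-end read counts per APA site):

```python
import numpy as np
from scipy.stats import chi2_contingency, chi2

# Hypothetical read counts at three APA sites (ordered 5' to 3') in two samples
table = np.array([[120,  80,  40],    # sample 1
                  [ 60,  90, 110]])   # sample 2

chi2_ind, p_ind, dof, _ = chi2_contingency(table)   # independence test

# Trend test: correlate sample label with site position over all reads
scores = np.array([0, 1, 2])
labels = np.repeat([0, 1], table.sum(axis=1))
sites = np.concatenate([np.repeat(scores, row) for row in table])
r = np.corrcoef(labels, sites)[0, 1]
m2 = (table.sum() - 1) * r ** 2        # ~ chi-square with 1 df under H0
p_trend = chi2.sf(m2, df=1)
print(p_ind, p_trend)   # complex switches can pass the first test but not the second
```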
Techniques for estimating selected streamflow characteristics of rural unregulated streams in Ohio
Koltun, G.F.; Whitehead, Matthew T.
2002-01-01
This report provides equations for estimating mean annual streamflow, mean monthly streamflows, harmonic mean streamflow, and streamflow quartiles (the 25th-, 50th-, and 75th-percentile streamflows) as a function of selected basin characteristics for rural, unregulated streams in Ohio. The equations were developed from streamflow statistics and basin-characteristics data for as many as 219 active or discontinued streamflow-gaging stations on rural, unregulated streams in Ohio with 10 or more years of homogenous daily streamflow record. Streamflow statistics and basin-characteristics data for the 219 stations are presented in this report. Simple equations (based on drainage area only) and best-fit equations (based on drainage area and at least two other basin characteristics) were developed by means of ordinary least-squares regression techniques. Application of the best-fit equations generally involves quantification of basin characteristics that require or are facilitated by use of a geographic information system. In contrast, the simple equations can be used with information that can be obtained without use of a geographic information system; however, the simple equations have larger prediction errors than the best-fit equations and exhibit geographic biases for most streamflow statistics. The best-fit equations should be used instead of the simple equations whenever possible.
Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Udey, Ruth Norma
Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.
Humans make efficient use of natural image statistics when performing spatial interpolation.
D'Antona, Anthony D; Perry, Jeffrey S; Geisler, Wilson S
2013-12-16
Visual systems learn through evolution and experience over the lifespan to exploit the statistical structure of natural images when performing visual tasks. Understanding which aspects of this statistical structure are incorporated into the human nervous system is a fundamental goal in vision science. To address this goal, we measured human ability to estimate the intensity of missing image pixels in natural images. Human estimation accuracy is compared with various simple heuristics (e.g., local mean) and with optimal observers that have nearly complete knowledge of the local statistical structure of natural images. Human estimates are more accurate than those of simple heuristics, and they match the performance of an optimal observer that knows the local statistical structure of relative intensities (contrasts). This optimal observer predicts the detailed pattern of human estimation errors and hence the results place strong constraints on the underlying neural mechanisms. However, humans do not reach the performance of an optimal observer that knows the local statistical structure of the absolute intensities, which reflect both local relative intensities and local mean intensity. As predicted from a statistical analysis of natural images, human estimation accuracy is negligibly improved by expanding the context from a local patch to the whole image. Our results demonstrate that the human visual system exploits efficiently the statistical structure of natural images.
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
ERIC Educational Resources Information Center
Mirman, Daniel; Estes, Katharine Graf; Magnuson, James S.
2010-01-01
Statistical learning mechanisms play an important role in theories of language acquisition and processing. Recurrent neural network models have provided important insights into how these mechanisms might operate. We examined whether such networks capture two key findings in human statistical learning. In Simulation 1, a simple recurrent network…
ERIC Educational Resources Information Center
Kravchuk, Olena; Elliott, Antony; Bhandari, Bhesh
2005-01-01
A simple laboratory experiment, based on the Maillard reaction, served as a project in Introductory Statistics for undergraduates in Food Science and Technology. By using the principles of randomization and replication and reflecting on the sources of variation in the experimental data, students reinforced the statistical concepts and techniques…
Early Warning Signs of Suicide in Service Members Who Engage in Unauthorized Acts of Violence
2016-06-01
observable to military law enforcement personnel. Statistical analyses tested for differences in warning signs between cases of suicide, violence, or... indicators, (2) Behavioral Change indicators, (3) Social indicators, and (4) Occupational indicators. Statistical analyses were conducted to test for...
[Statistical analysis using freely-available "EZR (Easy R)" software].
Kanda, Yoshinobu
2015-10-01
Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical functionality of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.
Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs
2011-01-01
This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as a toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates reuse in other software packages.
Seat Integrated and Conventional Restraints: A Study of Crash Injury/Fatality Rates in Rollovers
Padmanaban, Jeya; Burnett, Roger A.
2008-01-01
This study used police-reported motor vehicle crash data from eleven states to determine ejection, fatality, and fatal/serious injury risks for belted drivers in vehicles with conventional seatbelts compared to belted drivers in vehicles with seat integrated restraint systems (SIRS). Risks were compared for 11,159 belted drivers involved in single- or multiple-vehicle rollover crashes. Simple driver ejection (partial and complete), fatality, and injury rates were derived, and logistic regression analyses were used to determine relative contribution of factors (including event calendar year, vehicle age, driver age/gender/alcohol use) that significantly influence the likelihood of fatality and fatal/serious injury to belted drivers in rollovers. Results show no statistically significant difference in driver ejection, fatality, or fatal/serious injury rates between vehicles with conventional belts and vehicles with SIRS. PMID:19026243
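A minimal sketch of the kind of logistic regression used here, on synthetic stand-in data (the variable names and effect sizes are invented; the simulated zero SIRS effect merely mirrors the reported null finding):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(10)
    n = 2000
    # Hypothetical driver/vehicle factors, not the study's data
    sirs = rng.integers(0, 2, n)              # 1 = seat integrated restraint
    driver_age = rng.uniform(16, 80, n)
    alcohol = rng.integers(0, 2, n)
    logit = -3.0 + 0.0 * sirs + 0.02 * (driver_age - 40) + 1.2 * alcohol
    fatal = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

    # Logistic regression for the relative contribution of each factor
    X = sm.add_constant(np.column_stack([sirs, driver_age, alcohol]))
    fit = sm.Logit(fatal, X).fit(disp=0)
    print(np.exp(fit.params[1]))   # odds ratio for SIRS vs conventional belts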
NASA Astrophysics Data System (ADS)
Prasanna, V.
2018-01-01
This study makes use of temperature and precipitation from CMIP5 climate model output for climate change application studies over the Indian region during the summer monsoon season (JJAS). Bias correction of temperature and precipitation from CMIP5 GCM simulation results with respect to observation is discussed in detail. The non-linear statistical bias correction is a suitable method for climate change data because it is simple and does not add artificial uncertainties to the impact assessment of climate change scenarios for climate change application studies (agricultural production changes) in the future. The simple statistical bias correction uses observational constraints on the GCM baseline, and the projected results are scaled with respect to the changing magnitude in future scenarios, varying from one model to the other. Two types of bias correction techniques are shown here: (1) a simple bias correction using a percentile-based quantile-mapping algorithm and (2) a simple but improved bias correction method, a cumulative distribution function (CDF; Weibull distribution function)-based quantile-mapping algorithm. This study shows that the percentile-based quantile mapping method gives results similar to the CDF (Weibull)-based quantile mapping method, and the two methods are comparable. The bias correction is applied to temperature and precipitation variables for present climate and future projected data, which are then used in a simple statistical model to understand the future changes in crop production over the Indian region during the summer monsoon season. In total, 12 CMIP5 models are used for Historical (1901-2005), RCP4.5 (2005-2100), and RCP8.5 (2005-2100) scenarios. The climate index from each CMIP5 model and the observed agricultural yield index over the Indian region are used in a regression model to project the changes in agricultural yield over India under the RCP4.5 and RCP8.5 scenarios. The results revealed a better convergence of model projections in the bias-corrected data compared to the uncorrected data. The study can be extended to localized regional domains aimed at understanding changes in agricultural productivity in the future with an agro-economic or a simple statistical model. The statistical model indicated that the total food grain yield will increase over the Indian region in the future: by approximately 50 kg/ha under the RCP4.5 scenario and by approximately 90 kg/ha under the RCP8.5 scenario, from 2001 until the end of 2100. Many studies have used bias correction techniques, but this study applies bias correction to future climate scenario data from CMIP5 models and uses the corrected data with crop statistics to estimate future crop yield changes over the Indian region.
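A minimal sketch of the percentile-based quantile mapping (method 1), using synthetic gamma-distributed series in place of the CMIP5 and observed data; the CDF-based variant would instead fit a Weibull distribution (e.g., scipy.stats.weibull_min.fit) to each series and map through the fitted CDFs:

    import numpy as np

    def quantile_map(model_hist, obs_hist, model_future):
        """Empirical (percentile-based) quantile mapping: replace each future
        model value by the observed value at the same percentile of the
        historical model distribution."""
        pct = np.searchsorted(np.sort(model_hist), model_future) / len(model_hist)
        pct = np.clip(pct, 0.0, 1.0)
        return np.quantile(obs_hist, pct)

    rng = np.random.default_rng(1)
    obs = rng.gamma(2.0, 3.0, 1000)          # "observed" precipitation
    mod = rng.gamma(2.0, 4.0, 1000)          # biased model, historical period
    fut = rng.gamma(2.2, 4.0, 1000)          # model, future scenario
    corrected = quantile_map(mod, obs, fut)
    print(obs.mean(), fut.mean(), corrected.mean())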
NASA Astrophysics Data System (ADS)
Mahmood, Ehab A.; Rana, Sohel; Hussin, Abdul Ghapor; Midi, Habshah
2016-06-01
The circular regression model may contain one or more data points that appear to be peculiar or inconsistent with the main part of the model. This may occur due to recording errors, sudden short events, sampling under abnormal conditions, etc. The existence of these "outliers" in the data set causes problems for the research results and conclusions. Therefore, we should identify them before applying statistical analysis. In this article, we propose a statistic to identify outliers in both the response and explanatory variables of the simple circular regression model. Our proposed statistic is the robust circular distance RCDxy, and its performance is justified by three robust measures: the proportion of outliers detected and the masking and swamping rates.
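The exact form of RCDxy is defined in the article; as a hedged illustration, its basic ingredient, a circular (shortest-arc) distance between observed and fitted directions on which an outlier rule can be built, might look like:

    import numpy as np

    def circular_distance(theta_obs, theta_fit):
        """Shortest angular distance (radians) between observed and fitted
        directions; always in [0, pi]."""
        d = np.abs(theta_obs - theta_fit) % (2 * np.pi)
        return np.minimum(d, 2 * np.pi - d)

    theta_obs = np.array([0.1, 1.2, 3.0])
    theta_fit = np.array([0.2, 1.0, 6.2])
    print(circular_distance(theta_obs, theta_fit))

Points whose circular distance exceeds a robust cutoff (for example, median + k * MAD of the distances) would be flagged as candidate outliers.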
Use of iPhone technology in improving acetabular component position in total hip arthroplasty.
Tay, Xiau Wei; Zhang, Benny Xu; Gayagay, George
2017-09-01
Improper acetabular cup positioning is associated with high risk of complications after total hip arthroplasty. The aim of our study is to objectively compare 3 methods, namely (1) freehand, (2) alignment jig (Sputnik), and (3) iPhone application, to identify an easy, reproducible, and accurate method of improving acetabular cup placement. We designed a simple setup and carried out a simple experiment (see Method section). Using statistical analysis, the difference in inclination angles using the iPhone application compared with the freehand method was found to be statistically significant (F[2, 51] = 4.17, P = .02) in the "untrained group". There was no statistical significance detected for the other groups. This suggests a potential role for iPhone applications in helping junior surgeons overcome the steep learning curve.
The importance of topographically corrected null models for analyzing ecological point processes.
McDowall, Philip; Lynch, Heather J
2017-07-01
Analyses of point process patterns and related techniques (e.g., MaxEnt) make use of the expected number of occurrences per unit area and second-order statistics based on the distance between occurrences. Ecologists working with point process data often assume that points exist on a two-dimensional x-y plane or within a three-dimensional volume, when in fact many observed point patterns are generated on a two-dimensional surface existing within three-dimensional space. For many surfaces, however, such as the topography of landscapes, the projection from the surface to the x-y plane preserves neither area nor distance. As such, when these point patterns are implicitly projected to and analyzed in the x-y plane, our expectations of the point pattern's statistical properties may not be met. When used in hypothesis testing, we find that the failure to account for the topography of the generating surface may bias statistical tests that incorrectly identify clustering and, furthermore, may bias coefficients in inhomogeneous point process models that incorporate slope as a covariate. We demonstrate the circumstances under which this bias is significant, and present simple methods that allow point processes to be simulated with corrections for topography. These point patterns can then be used to generate "topographically corrected" null models against which observed point processes can be compared. © 2017 by the Ecological Society of America.
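One simple correction of the kind described: a process that is uniform on the terrain surface has planimetric intensity proportional to the local area factor 1/cos(slope), so null patterns can be simulated by sampling cells with that weight. A sketch on a hypothetical slope grid:

    import numpy as np

    rng = np.random.default_rng(2)
    slope = np.deg2rad(rng.uniform(0, 50, size=(100, 100)))  # hypothetical slope grid

    # Each planimetric cell represents surface area proportional to 1/cos(slope),
    # so a process uniform on the *surface* is non-uniform in the x-y plane.
    weights = 1.0 / np.cos(slope)
    p = (weights / weights.sum()).ravel()

    # "Topographically corrected" null pattern of n points: sample cells with
    # probability proportional to their true surface area.
    n = 500
    cells = rng.choice(p.size, size=n, p=p)
    rows, cols = np.unravel_index(cells, slope.shape)
    print(rows[:5], cols[:5])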
MRI textures as outcome predictor for Gamma Knife radiosurgery on vestibular schwannoma
NASA Astrophysics Data System (ADS)
Langenhuizen, P. P. J. H.; Legters, M. J. W.; Zinger, S.; Verheul, H. B.; Leenstra, S.; de With, P. H. N.
2018-02-01
Vestibular schwannomas (VS) are benign brain tumors that can be treated with high-precision focused radiation with the Gamma Knife in order to stop tumor growth. Outcome prediction of Gamma Knife radiosurgery (GKRS) treatment can help in determining whether GKRS will be effective on an individual patient basis. However, at present, prognostic factors of tumor control after GKRS for VS are largely unknown, and only clinical factors, such as size of the tumor at treatment and pre-treatment growth rate of the tumor, have been considered thus far. This research aims at outcome prediction of GKRS by means of quantitative texture feature analysis on conventional MRI scans. We compute first-order statistics and features based on gray-level co-occurrence (GLCM) and run-length matrices (RLM), and employ support vector machines and decision trees for classification. In a clinical dataset, consisting of 20 tumors showing treatment failure and 20 tumors exhibiting treatment success, we have discovered that the second-order statistical metrics distilled from GLCM and RLM are suitable for describing texture, but are slightly outperformed by simple first-order statistics, like mean, standard deviation and median. The obtained prediction accuracy is about 85%, but a final choice of the best feature can only be made after performing more extensive analyses on larger datasets. In any case, this work provides suitable texture measures for successful prediction of GKRS treatment outcome for VS.
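A minimal sketch of the winning first-order feature set (mean, standard deviation, median) feeding a support vector machine, with random arrays standing in for the MRI patches and a synthetic class difference:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    # Hypothetical stand-ins for 40 tumor patches (20 failures, 20 successes)
    patches = [rng.normal(loc=cls, scale=1.0, size=(32, 32))
               for cls in (0, 1) for _ in range(20)]
    labels = np.repeat([0, 1], 20)

    # First-order statistics of each patch: mean, standard deviation, median
    X = np.array([[p.mean(), p.std(), np.median(p)] for p in patches])

    scores = cross_val_score(SVC(kernel="rbf"), X, labels, cv=5)
    print("CV accuracy:", scores.mean())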
NASA Astrophysics Data System (ADS)
Sellentin, Elena; Heavens, Alan F.
2018-01-01
We investigate whether a Gaussian likelihood, as routinely assumed in the analysis of cosmological data, is supported by simulated survey data. We define test statistics, based on a novel method that first destroys Gaussian correlations in a data set, and then measures the non-Gaussian correlations that remain. This procedure flags pairs of data points that depend on each other in a non-Gaussian fashion, and thereby identifies where the assumption of a Gaussian likelihood breaks down. Using this diagnosis, we find that non-Gaussian correlations in the CFHTLenS cosmic shear correlation functions are significant. With a simple exclusion of the most contaminated data points, the posterior for σ8 is shifted without broadening, but we find no significant reduction in the tension with σ8 derived from Planck cosmic microwave background data. However, we also show that the one-point distributions of the correlation statistics are noticeably skewed, such that sound weak-lensing data sets are intrinsically likely to lead to a systematically low lensing amplitude being inferred. The detected non-Gaussianities get larger with increasing angular scale such that for future wide-angle surveys such as Euclid or LSST, with their very small statistical errors, the large-scale modes are expected to be increasingly affected. The shifts in posteriors may then not be negligible and we recommend that these diagnostic tests be run as part of future analyses.
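The authors' test statistics are defined in the paper; as a hedged sketch of the general idea, whitening with the inverse Cholesky factor of the estimated covariance removes all Gaussian (linear) correlations, so any dependence that survives, for example correlation between squared whitened components, signals non-Gaussianity:

    import numpy as np

    rng = np.random.default_rng(4)
    # Hypothetical "simulated survey" realizations: 2000 draws of a 5-point data vector
    z = rng.normal(size=(2000, 5))
    x = np.column_stack([z[:, 0],
                         z[:, 0] ** 2 + 0.1 * z[:, 1],   # a non-Gaussian dependent pair
                         z[:, 2], z[:, 3], z[:, 4]])

    # Whitening y = L^{-1}(x - mean), with C = L L^T, removes all Gaussian
    # (linear) correlations between the data points.
    C = np.cov(x, rowvar=False)
    L = np.linalg.cholesky(C)
    y = np.linalg.solve(L, (x - x.mean(axis=0)).T).T

    # Dependence that survives whitening, e.g. correlation between squared
    # components, flags a non-Gaussian pair.
    print(np.corrcoef(y[:, 0] ** 2, y[:, 1] ** 2)[0, 1])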
The feasibility and utility of grocery receipt analyses for dietary assessment.
Martin, Sarah Levin; Howell, Teresa; Duan, Yan; Walters, Michele
2006-03-30
To establish the feasibility and utility of a simple data collection methodology for dietary assessment. Using a cross-sectional design, trained data collectors approached adults (approximately 20-40 years of age) at local grocery stores and asked whether they would volunteer their grocery receipts and answer a few questions for a small stipend ($1). The grocery data were divided into 3 categories: "fats, oils, and sweets," "processed foods," and "low-fat/low-calorie substitutions" as a percentage of the total food purchase price. The questions assessed the shopper's general eating habits (e.g., fast-food consumption) and a few demographic characteristics and health aspects (e.g., perception of body size). Statistical analyses performed: descriptive and analytic analyses using non-parametric tests were conducted in SAS. Forty-eight receipts and questionnaires were collected. Nearly every respondent reported eating fast food at least once per month; 27% ate out once or twice a day. Frequency of fast-food consumption was positively related to perceived body size of the respondent (p = 0.02). Overall, 30% of the food purchase price was for fats, oils, and sweets; 10% was for processed foods; and almost 6% was for low-fat/low-calorie substitutions. Households where no one was perceived to be overweight spent a smaller proportion of their food budget on fats, oils, and sweets than did households where at least one person was perceived to be overweight (p = 0.10); households where the spouse was not perceived to be overweight spent less on fats, oils, and sweets (p = 0.02) and more on low-fat/low-calorie substitutions (p = 0.09) than did households where the spouse was perceived to be overweight; and respondents who perceived themselves to be overweight spent more on processed foods than did respondents who did not perceive themselves to be overweight (p = 0.06). This simple dietary assessment method, although global in nature, may be a useful indicator of dietary practices, as evidenced by its association with perceived weight status.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
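A minimal worked example of the three quantities discussed (Pearson r, Spearman rho, and a simple linear regression fit), on synthetic data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    x = rng.uniform(0, 10, 40)                    # predictor
    y = 2.0 + 0.8 * x + rng.normal(0, 1.0, 40)    # outcome, linear + noise

    r, p_r = stats.pearsonr(x, y)       # linear association
    rho, p_rho = stats.spearmanr(x, y)  # monotone (rank) association
    fit = stats.linregress(x, y)        # simple linear regression
    print(r, rho, fit.slope, fit.intercept)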
2014-01-01
Background Thresholds for statistical significance are insufficiently demonstrated by 95% confidence intervals or P-values when assessing results from randomised clinical trials. First, a P-value only shows the probability of getting a result assuming that the null hypothesis is true and does not reflect the probability of getting a result assuming an alternative hypothesis to the null hypothesis is true. Second, a confidence interval or a P-value showing significance may be caused by multiplicity. Third, statistical significance does not necessarily result in clinical significance. Therefore, assessment of intervention effects in randomised clinical trials deserves more rigour in order to become more valid. Methods Several methodologies for assessing the statistical and clinical significance of intervention effects in randomised clinical trials were considered. Balancing simplicity and comprehensiveness, a simple five-step procedure was developed. Results For a more valid assessment of results from a randomised clinical trial we propose the following five-steps: (1) report the confidence intervals and the exact P-values; (2) report Bayes factor for the primary outcome, being the ratio of the probability that a given trial result is compatible with a ‘null’ effect (corresponding to the P-value) divided by the probability that the trial result is compatible with the intervention effect hypothesised in the sample size calculation; (3) adjust the confidence intervals and the statistical significance threshold if the trial is stopped early or if interim analyses have been conducted; (4) adjust the confidence intervals and the P-values for multiplicity due to number of outcome comparisons; and (5) assess clinical significance of the trial results. Conclusions If the proposed five-step procedure is followed, this may increase the validity of assessments of intervention effects in randomised clinical trials. PMID:24588900
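Step (2)'s Bayes factor, the probability of the observed result under the null divided by its probability under the effect hypothesised in the sample size calculation, can be computed from the trial's z-statistic; the numbers below are hypothetical:

    import numpy as np
    from scipy.stats import norm

    # Observed effect estimate and its standard error from a trial (hypothetical)
    est, se = 0.25, 0.12
    delta = 0.30            # effect hypothesised in the sample size calculation

    z = est / se
    # Bayes factor: density of the result under the null vs under the
    # hypothesised effect (both on the z scale)
    bf_null_vs_alt = norm.pdf(z, loc=0) / norm.pdf(z, loc=delta / se)
    print(bf_null_vs_alt)   # < 1 favours the hypothesised effect over the null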
Zhang, Harrison G; Ying, Gui-Shuang
2018-02-09
The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in the British Journal of Ophthalmology (BJO) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical analysis approaches for analysing the primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at the ocular level, one study (1%) analysed an overall summary of the ocular findings per individual and three (3%) studies used paired comparisons. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, compared with 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, compared with 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve in the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
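One standard way to account for the intereye correlation the authors describe is a GEE with an exchangeable working correlation, treating the patient as the cluster; a sketch on synthetic two-eyes-per-patient data (variable names invented):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(12)
    n = 100                                     # patients, two eyes each
    subject = np.repeat(np.arange(n), 2)
    person = rng.normal(0.0, 1.0, n)            # shared between-eye component
    treated = rng.integers(0, 2, 2 * n)
    iop = 16 + 2.0 * treated + person[subject] + rng.normal(0.0, 1.0, 2 * n)
    df = pd.DataFrame({"iop": iop, "treated": treated, "subject": subject})

    # GEE with an exchangeable working correlation treats the two eyes of a
    # patient as correlated instead of as independent observations.
    model = smf.gee("iop ~ treated", groups="subject", data=df,
                    cov_struct=sm.cov_struct.Exchangeable())
    res = model.fit()
    print(res.params["treated"], res.bse["treated"])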
Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T
2015-01-01
Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
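A minimal sketch of the numerical-integration route for a normal-normal random-effects meta-analysis with a log-normal prior on the between-study variance; the prior parameters below are illustrative placeholders, not the predictive distributions derived in the paper:

    import numpy as np
    from scipy.stats import norm, lognorm

    # Hypothetical study log odds-ratios and standard errors
    y = np.array([-0.30, -0.10, -0.45, 0.05])
    se = np.array([0.15, 0.25, 0.30, 0.20])

    # Grids for the overall effect mu and between-study variance tau^2;
    # the log-normal prior parameters are illustrative placeholders.
    mu = np.linspace(-1.5, 1.5, 301)
    tau2 = np.linspace(1e-4, 1.0, 300)
    prior_tau2 = lognorm.pdf(tau2, s=1.0, scale=np.exp(-2.0))

    # Joint posterior on the grid: flat prior on mu, log-normal prior on tau^2,
    # normal likelihoods y_i ~ N(mu, se_i^2 + tau^2).
    M, T = np.meshgrid(mu, tau2, indexing="ij")
    loglik = sum(norm.logpdf(yi, loc=M, scale=np.sqrt(si ** 2 + T))
                 for yi, si in zip(y, se))
    post = np.exp(loglik - loglik.max()) * prior_tau2[None, :]
    post /= post.sum()

    mu_marg = post.sum(axis=1)                 # marginal posterior of mu
    print("posterior mean of mu:", (mu * mu_marg).sum())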
10 CFR 436.23 - Estimated simple payback time.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Methodology and Procedures for Life Cycle Cost Analyses § 436.23 Estimated simple payback time. The estimated simple payback time is the number of years required for the cumulative value of energy or water cost savings less future non-fuel or non-water costs to equal the investment costs of the building energy or...
10 CFR 436.23 - Estimated simple payback time.
Code of Federal Regulations, 2014 CFR
2014-01-01
... Methodology and Procedures for Life Cycle Cost Analyses § 436.23 Estimated simple payback time. The estimated simple payback time is the number of years required for the cumulative value of energy or water cost savings less future non-fuel or non-water costs to equal the investment costs of the building energy or...
10 CFR 436.23 - Estimated simple payback time.
Code of Federal Regulations, 2011 CFR
2011-01-01
... Methodology and Procedures for Life Cycle Cost Analyses § 436.23 Estimated simple payback time. The estimated simple payback time is the number of years required for the cumulative value of energy or water cost savings less future non-fuel or non-water costs to equal the investment costs of the building energy or...
10 CFR 436.23 - Estimated simple payback time.
Code of Federal Regulations, 2013 CFR
2013-01-01
... Methodology and Procedures for Life Cycle Cost Analyses § 436.23 Estimated simple payback time. The estimated simple payback time is the number of years required for the cumulative value of energy or water cost savings less future non-fuel or non-water costs to equal the investment costs of the building energy or...
10 CFR 436.23 - Estimated simple payback time.
Code of Federal Regulations, 2012 CFR
2012-01-01
... Methodology and Procedures for Life Cycle Cost Analyses § 436.23 Estimated simple payback time. The estimated simple payback time is the number of years required for the cumulative value of energy or water cost savings less future non-fuel or non-water costs to equal the investment costs of the building energy or...
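Read as code, the § 436.23 definition is an undiscounted break-even calculation; the values below are hypothetical:

    def simple_payback_years(investment, annual_savings, annual_nonfuel_costs=0.0):
        """Years until cumulative (savings - non-fuel costs) equals the
        investment, per the estimated simple payback definition."""
        net = annual_savings - annual_nonfuel_costs
        if net <= 0:
            return float("inf")   # savings never recover the investment
        return investment / net

    # Hypothetical retrofit: $12,000 investment, $2,500/yr energy savings,
    # $300/yr added maintenance.
    print(simple_payback_years(12_000, 2_500, 300))  # ~5.45 years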
Entropy Is Simple, Qualitatively.
ERIC Educational Resources Information Center
Lambert, Frank L.
2002-01-01
Suggests that qualitatively, entropy is simple. Entropy increase from a macro viewpoint is a measure of the dispersal of energy from localized to spread out at a temperature T. Fundamentally based on statistical and quantum mechanics, this approach is superior to the non-fundamental "disorder" as a descriptor of entropy change. (MM)
Power-law ansatz in complex systems: Excessive loss of information.
Tsai, Sun-Ting; Chang, Chin-De; Chang, Ching-Hao; Tsai, Meng-Xue; Hsu, Nan-Jung; Hong, Tzay-Ming
2015-12-01
The ubiquity of power-law relations in empirical data reflects physicists' love of simple laws and of uncovering common causes among seemingly unrelated phenomena. However, many reported power laws lack statistical support and mechanistic backing, and discrepancies with real data are often explained away as corrections due to finite size or other variables. We propose a simple experiment and rigorous statistical procedures to look into these issues. Making use of the fact that the occurrence rate and pulse intensity of crumple sound obey a power law with an exponent that varies with material, we simulate a complex system with two driving mechanisms by crumpling two different sheets together. The probability function of the crumple sound is found to transit from two power-law terms to a bona fide power law as compaction increases. In addition to showing the vicinity of these two distributions in the phase space, this observation nicely demonstrates the effect of interactions to bring about a subtle change in macroscopic behavior and more information may be retrieved if the data are subject to sorting. Our analyses are based on the Akaike information criterion that is a direct measurement of information loss and emphasizes the need to strike a balance between model simplicity and goodness of fit. As a show of force, the Akaike information criterion also found the Gutenberg-Richter law for earthquakes and the scale-free model for a brain functional network, a two-dimensional sandpile, and solar flare intensity to suffer an excessive loss of information. They resemble more the crumpled-together ball at low compactions in that there appear to be two driving mechanisms that take turns occurring.
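A generic illustration of the AIC logic invoked here (AIC = 2k - 2 ln L, trading goodness of fit against parameter count), comparing a pure power law against a lognormal on synthetic data; the paper's actual candidate models differ:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    data = stats.lognorm.rvs(s=1.2, scale=2.0, size=2000, random_state=rng)

    def aic(loglik, k):
        return 2 * k - 2 * loglik

    # Candidate 1: pure power law (Pareto, lower bound fixed at the data minimum)
    b, loc, scale = stats.pareto.fit(data, floc=0.0, fscale=data.min())
    ll_pl = stats.pareto.logpdf(data, b, loc, scale).sum()

    # Candidate 2: lognormal
    s, loc2, scale2 = stats.lognorm.fit(data, floc=0.0)
    ll_ln = stats.lognorm.logpdf(data, s, loc2, scale2).sum()

    print("AIC power law:", aic(ll_pl, 1))   # one free parameter (exponent)
    print("AIC lognormal:", aic(ll_ln, 2))   # two free parameters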
The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research
Estivalet, Gustavo L.; Meunier, Fanny
2015-01-01
In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development and the specific characteristics of the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. PMID:26630138
Partl, Richard; Fastner, Gerd; Kaiser, Julia; Kronhuber, Elisabeth; Cetin-Strohmer, Klaudia; Steffal, Claudia; Böhmer-Breitfelder, Barbara; Mayer, Johannes; Avian, Alexander; Berghold, Andrea
2016-02-01
Low Karnofsky performance status (KPS) and elevated lactate dehydrogenase (LDH), a surrogate marker for tumor load and cell turnover, may identify patients with a very short life expectancy. To validate this finding and compare it to other indices, namely the recursive partitioning analysis (RPA) and the diagnosis-specific graded prognostic assessment (DS-GPA), a multicenter analysis was undertaken. A retrospective analysis of 234 metastatic melanoma patients uniformly treated with palliative whole brain radiotherapy (WBRT) was done. Univariate and multivariate analyses were used to determine the impact of patient-, tumor-, and treatment-related parameters on overall survival (OS). KPS and LDH emerged as independent factors predicting OS. By combining KPS and LDH values (KPS/LDH index), groups of patients with statistically significant differences in median OS (days; 95% CI) after onset of WBRT were identified: group 1 (KPS ≥ 70/normal LDH) 234 (96-372), group 2 (KPS ≥ 70/elevated LDH) 112 (69-155), group 3 (KPS < 70/normal LDH) 43 (12-74), and group 4 (KPS < 70/elevated LDH) 29 (17-41). Between all four groups, statistically significant differences were observed. The RPA and DS-GPA indices failed to distinguish significantly between good and moderate prognosis and were inferior in predicting a very unfavorable prognosis. The parameters KPS and LDH independently affected OS. The combination of both (KPS/LDH index) identified patients with a very short life expectancy, who might be better served by recommending best supportive care instead of WBRT. The KPS/LDH index is simple and effective in terms of time and cost as compared to other prognostic indices.
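The KPS/LDH index reduces to a four-way classification rule, transcribed directly from the abstract:

    def kps_ldh_group(kps, ldh_elevated):
        """KPS/LDH prognostic index as reported:
        1 = KPS >= 70 / normal LDH   (median OS 234 days)
        2 = KPS >= 70 / elevated LDH (112 days)
        3 = KPS <  70 / normal LDH   (43 days)
        4 = KPS <  70 / elevated LDH (29 days)"""
        if kps >= 70:
            return 2 if ldh_elevated else 1
        return 4 if ldh_elevated else 3

    print(kps_ldh_group(80, False))  # -> 1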
PhyloExplorer: a web server to validate, explore and query phylogenetic trees
Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent
2009-01-01
Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available online, and the source code can be downloaded from the project website. PMID:19450253
Anchorage loss due to Herbst mechanics-preventable through miniscrews?
Bremen, Julia von; Ludwig, Björn; Ruf, Sabine
2015-10-01
To assess whether mandibular incisor proclination and protrusion during treatment with the Herbst/multibracket appliance can be prevented through miniscrew (MI) anchorage. After a statistical power analysis, 12 Herbst patients with MIs (100% MI survival) ligated to the Herbst/multibracket appliance to reinforce anchorage were investigated. A control group matched for gender and skeletal maturity treated without MI anchorage was selected. Pre- and posttreatment cephalograms were analysed for overjet reduction, mandibular incisor proclination (IL/ML), protrusion (Ii-MLp) and intrusion (Ii-ML), as well as occlusal plane inclination (OP/ML), by a single blinded examiner. No statistically significant differences between the two groups were found concerning overjet reduction, incisor protrusion and intrusion, or occlusal plane tilt. Although the MI group generally showed less lower incisor proclination (4.8°) than the group without skeletal anchorage (6.5°), a large interindividual variation was observed. Interradicular MI anchorage cannot prevent anchorage loss during Herbst treatment. For the individual patient, the amount of incisor proclination and protrusion remains unpredictable. © The Author 2014. Published by Oxford University Press on behalf of the European Orthodontic Society. All rights reserved. For permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Fisher, W. P., Jr.; Elbaum, B.; Coulter, A.
2010-07-01
Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.
NASA Astrophysics Data System (ADS)
Synek, Petr; Zemánek, Miroslav; Kudrle, Vít; Hoder, Tomáš
2018-04-01
Electrical current measurements in corona or barrier microdischarges are a challenge as they require both high temporal resolution and a large dynamic range of the current probe used. In this article, we apply a simple self-assembled current probe and compare it to commercial ones. An analysis in the time and frequency domain is carried out. Moreover, an improved methodology is presented, enabling both temporal resolution in sub-nanosecond times and current sensitivity in the order of tens of micro-amperes. Combining this methodology with a high-tech oscilloscope and self-developed software, a unique statistical analysis of currents in volume barrier discharge driven in atmospheric-pressure air is made for over 80 consecutive periods of a 15 kHz applied voltage. We reveal the presence of repetitive sub-critical current pulses and conclude that these can be identified with the discharging of surface charge microdomains. Moreover, extremely low, long-lasting microsecond currents were detected which are caused by ion flow, and are analysed in detail. The statistical behaviour presented gives deeper insight into the discharge physics of these usually undetectable current signals.
NASA Technical Reports Server (NTRS)
Prabhakara, C.; Wang, I.; Chang, A. T. C.; Gloersen, P.
1982-01-01
Nimbus 7 Scanning Multichannel Microwave Radiometer (SMMR) brightness temperature measurements over the global oceans have been examined with the help of statistical and empirical techniques. Such analyses show that zonal averages of brightness temperature measured by SMMR over the oceans on a large scale are primarily influenced by the water vapor in the atmosphere. Liquid water in the clouds and rain, which has a much smaller spatial and temporal scale, contributes substantially to the variability of the SMMR measurements within the latitudinal zones. The surface wind not only increases the surface emissivity but, through its interactions with the atmosphere, produces correlations in the SMMR brightness temperature data that have significant meteorological implications. It is found that a simple meteorological model can explain the general characteristics of the SMMR data. With the help of this model, methods are developed to infer the surface temperature, the liquid water content in the atmosphere, and the surface wind speed over the global oceans. Monthly mean estimates of the sea surface temperature and surface winds are compared with the ship measurements. Estimates of liquid water content in the atmosphere are consistent with earlier satellite measurements.
Trends and associated uncertainty in the global mean temperature record
NASA Astrophysics Data System (ADS)
Poppick, A. N.; Moyer, E. J.; Stein, M.
2016-12-01
Physical models suggest that the Earth's mean temperature warms in response to changing CO2 concentrations (and hence increased radiative forcing); given physical uncertainties in this relationship, the historical temperature record is a source of empirical information about global warming. A persistent thread in many analyses of the historical temperature record, however, is the reliance on methods that appear to deemphasize both physical and statistical assumptions. Examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for natural variability in nonparametric rather than parametric ways. We show here that methods that deemphasize assumptions can limit the scope of analysis and can lead to misleading inferences, particularly in the setting considered where the data record is relatively short and the scale of temporal correlation is relatively long. A proposed model that is simple but physically informed provides a more reliable estimate of trends and allows a broader array of questions to be addressed. In accounting for uncertainty, we also illustrate how parametric statistical models that are attuned to the important characteristics of natural variability can be more reliable than ostensibly more flexible approaches.
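A sketch of the kind of simple, physically informed model argued for: regressing temperature on radiative forcing (not time) with a parametric AR(1) error model, here on synthetic data via statsmodels' GLSAR:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 150                                    # ~annual record length
    forcing = np.linspace(0.0, 2.5, n)         # synthetic radiative forcing (W/m^2)
    # Synthetic temperature: sensitivity 0.5 K per W/m^2 plus AR(1) noise
    eps = np.zeros(n)
    for t in range(1, n):
        eps[t] = 0.6 * eps[t - 1] + rng.normal(scale=0.1)
    temp = 0.5 * forcing + eps

    # Fit temperature ~ forcing with AR(1) errors (iterated feasible GLS)
    X = sm.add_constant(forcing)
    fit = sm.GLSAR(temp, X, rho=1).iterative_fit(maxiter=20)
    print(fit.params, fit.bse)                 # slope estimates the sensitivity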
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics
NASA Technical Reports Server (NTRS)
Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior, whereas the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
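The n-gram entropy measure is straightforward to reproduce on toy sequences (lower entropy means higher n-gram redundancy); this sketch is generic, not the authors' pipeline:

    import math
    from collections import Counter

    def ngram_entropy(seq, n):
        """Shannon entropy (bits) of the n-gram distribution of a sequence;
        the maximum for a 4-letter alphabet is 2n bits."""
        grams = [seq[i:i + n] for i in range(len(seq) - n + 1)]
        counts = Counter(grams)
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    repetitive = "ATATATATATATATATATAT" * 5
    mixed = "ATGCGTACCGTTAGCATCGA" * 5
    for n in (1, 2, 3):
        print(n, ngram_entropy(repetitive, n), ngram_entropy(mixed, n))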
Brief communication: Skeletal biology past and present: Are we moving in the right direction?
Hens, Samantha M; Godde, Kanya
2008-10-01
In 1982, Spencer's edited volume A History of American Physical Anthropology: 1930-1980 allowed numerous authors to document the state of our science, including a critical examination of skeletal biology. Some authors argued that the first 50 years of skeletal biology were characterized by the descriptive-historical approach with little regard for processual problems and that technological and statistical analyses were not rooted in theory. In an effort to determine whether Spencer's landmark volume impacted the field of skeletal biology, a content analysis was carried out for the American Journal of Physical Anthropology from 1980 to 2004. The percentage of skeletal biology articles is similar to that of previous decades. Analytical articles averaged only 32% and are defined by three criteria: statistical analysis, hypothesis testing, and broader explanatory context. However, when these criteria were scored individually, nearly 80% of papers attempted a broader theoretical explanation, 44% tested hypotheses, and 67% used advanced statistics, suggesting that the skeletal biology papers in the journal have an analytical emphasis. Considerable fluctuation exists between subfields; trends toward a more analytical approach are witnessed in the subfields of age/sex/stature/demography, skeletal maturation, anatomy, and nonhuman primate studies, which also increased in frequency, while paleontology and pathology were largely descriptive. Comparisons to the International Journal of Osteoarchaeology indicate that there are statistically significant differences between the two journals in terms of analytical criteria. These data indicate a positive shift in theoretical thinking, i.e., an attempt by most to explain processes rather than present a simple description of events.
Simple Statistics: - Summarized!
ERIC Educational Resources Information Center
Blai, Boris, Jr.
Statistics are an essential tool for making proper judgement decisions. It is concerned with probability distribution models, testing of hypotheses, significance tests and other means of determining the correctness of deductions and the most likely outcome of decisions. Measures of central tendency include the mean, median and mode. A second…
ERIC Educational Resources Information Center
Haans, Antal
2018-01-01
Contrast analysis is a relatively simple but effective statistical method for testing theoretical predictions about differences between group means against the empirical data. Despite its advantages, contrast analysis is hardly used to date, perhaps because it is not implemented in a convenient manner in many statistical software packages. This…
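A minimal worked contrast analysis: with zero-sum weights c over k group means, t = sum(c_i * mean_i) / sqrt(MSE * sum(c_i^2 / n_i)) on N - k error degrees of freedom; the data are hypothetical:

    import numpy as np
    from scipy import stats

    # Hypothetical groups; theory predicts group 3 exceeds the average of
    # groups 1 and 2, expressed by contrast weights c = (-0.5, -0.5, 1).
    groups = [np.array([4.1, 5.0, 4.6, 5.2]),
              np.array([4.8, 5.5, 5.1, 4.9]),
              np.array([6.0, 6.4, 5.8, 6.6])]
    c = np.array([-0.5, -0.5, 1.0])

    means = np.array([g.mean() for g in groups])
    ns = np.array([len(g) for g in groups])
    df_err = sum(ns) - len(groups)
    mse = sum(((g - g.mean()) ** 2).sum() for g in groups) / df_err

    t = (c @ means) / np.sqrt(mse * (c ** 2 / ns).sum())
    p = 2 * stats.t.sf(abs(t), df_err)
    print(f"t({df_err}) = {t:.2f}, p = {p:.4f}")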
Superordinate Shape Classification Using Natural Shape Statistics
ERIC Educational Resources Information Center
Wilder, John; Feldman, Jacob; Singh, Manish
2011-01-01
This paper investigates the classification of shapes into broad natural categories such as "animal" or "leaf". We asked whether such coarse classifications can be achieved by a simple statistical classification of the shape skeleton. We surveyed databases of natural shapes, extracting shape skeletons and tabulating their…
Rohrmeier, Martin A; Cross, Ian
2014-07-01
Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies. Copyright © 2014 Elsevier Inc. All rights reserved.
A statistical method for measuring activation of gene regulatory networks.
Esteves, Gustavo H; Reis, Luiz F L
2018-06-13
Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes and enabling studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution of the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measurement of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
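A hedged generic stand-in for the permutation strategy described (not the maigesPack model): score a gene set by its mean expression difference between conditions and build the null by permuting sample labels:

    import numpy as np

    def network_activation_test(expr, labels, genes, n_perm=5000, seed=0):
        """Permutation test for a network 'activation' score, here taken as
        the mean expression difference (condition 1 - condition 0) averaged
        over the network's genes."""
        rng = np.random.default_rng(seed)
        labels = np.asarray(labels)

        def score(lab):
            g1 = expr[np.ix_(genes, lab == 1)].mean()
            g0 = expr[np.ix_(genes, lab == 0)].mean()
            return g1 - g0

        observed = score(labels)
        null = np.array([score(rng.permutation(labels)) for _ in range(n_perm)])
        pval = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
        return observed, pval

    rng = np.random.default_rng(7)
    expr = rng.normal(size=(100, 24))          # 100 genes x 24 samples (synthetic)
    expr[:5, 12:] += 1.0                       # genes 0-4 up in condition 1
    labels = np.array([0] * 12 + [1] * 12)
    print(network_activation_test(expr, labels, genes=list(range(5))))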
Contingency and statistical laws in replicate microbial closed ecosystems.
Hekstra, Doeke R; Leibler, Stanislas
2012-05-25
Contingency, the persistent influence of past random events, pervades biology. To what extent, then, is each course of ecological or evolutionary dynamics unique, and to what extent are these dynamics subject to a common statistical structure? Addressing this question requires replicate measurements to search for emergent statistical laws. We establish a readily replicated microbial closed ecosystem (CES), sustaining its three species for years. We precisely measure the local population density of each species in many CES replicates, started from the same initial conditions and kept under constant light and temperature. The covariation among replicates of the three species densities acquires a stable structure, which could be decomposed into discrete eigenvectors, or "ecomodes." The largest ecomode dominates population density fluctuations around the replicate-average dynamics. These fluctuations follow simple power laws consistent with a geometric random walk. Thus, variability in ecological dynamics can be studied with CES replicates and described by simple statistical laws. Copyright © 2012 Elsevier Inc. All rights reserved.
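A sketch of the decomposition described: eigenvectors of the across-replicate covariance of the three species' density deviations play the role of "ecomodes", and the leading eigenvalue's share shows how strongly the first mode dominates; the data here are synthetic:

    import numpy as np

    rng = np.random.default_rng(8)
    n_reps = 50
    # Hypothetical deviations of three species' densities from the
    # replicate-average dynamics at one time point (n_reps x 3)
    common = rng.normal(size=(n_reps, 1))           # shared fluctuation
    dev = common * np.array([[1.0, 0.8, -0.5]]) + 0.2 * rng.normal(size=(n_reps, 3))

    # Eigendecomposition of the across-replicate covariance: the
    # eigenvectors are the "ecomodes".
    C = np.cov(dev, rowvar=False)
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(evals)[::-1]
    print("variance fractions:", evals[order] / evals.sum())
    print("leading ecomode:", evecs[:, order[0]])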
Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages.
Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart
2016-01-01
Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure (regularities arising in an ordered series of syllable timings), testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint.
Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages
Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart
2016-01-01
Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure—regularities arising in an ordered series of syllable timings—testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint. PMID:27994544
van 't Hoff, Marcel; Reuter, Marcel; Dryden, David T F; Oheim, Martin
2009-09-21
Bacteriophage lambda-DNA molecules are frequently used as a scaffold to characterize the action of single proteins unwinding, translocating, digesting or repairing DNA. However, scaling up such single-DNA-molecule experiments under identical conditions to attain statistically relevant sample sizes remains challenging. Additionally, the movies obtained are frequently noisy and difficult to analyse with any precision. We address these two problems here using, firstly, a novel variable-angle total internal reflection fluorescence (VA-TIRF) reflector composed of a minimal set of optical reflective elements, and secondly, singular value decomposition (SVD) to improve the signal-to-noise ratio prior to analysing time-lapse image stacks. As an example, we visualize under identical optical conditions hundreds of surface-tethered single lambda-DNA molecules, stained with the intercalating dye YOYO-1 iodide, and stretched out in a microcapillary flow. Another novelty of our approach is that we arrange on a mechanically driven stage several capillaries containing saline, calibration buffer and lambda-DNA, respectively, thus extending the approach to high-content, high-throughput screening of single molecules. Our length measurements of individual DNA molecules from noise-reduced kymograph images using SVD display a 6-fold enhanced precision compared to raw-data analysis, reaching approximately 1 kbp resolution. Combining these two methods, our approach provides a straightforward yet powerful way of collecting statistically relevant amounts of data in a semi-automated manner. We believe that our conceptually simple technique should be of interest for a broader range of single-molecule studies, well beyond the specific example of lambda-DNA shown here.
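The SVD denoising step lends itself to a compact sketch. The following R fragment is an assumption-laden toy, not the authors' pipeline: it truncates a noisy kymograph-like matrix to its leading singular components before any length measurement would be attempted.

    # Keep only the k leading singular components of a matrix.
    svd_denoise <- function(m, k = 2) {
      s <- svd(m)
      s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k) %*%
        t(s$v[, 1:k, drop = FALSE])
    }
    set.seed(2)
    signal <- outer(exp(-(1:100) / 40), rep(1, 60))  # toy low-rank structure
    noisy  <- signal + matrix(rnorm(100 * 60, sd = 0.2), 100, 60)
    cleaned <- svd_denoise(noisy, k = 1)             # much-reduced noise floor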
Laparoscopic repair of perforated peptic ulcer: simple closure versus omentopexy.
Lin, Being-Chuan; Liao, Chien-Hung; Wang, Shang-Yu; Hwang, Tsann-Long
2017-12-01
This report presents our experience with laparoscopic repair performed in 118 consecutive patients diagnosed with a perforated peptic ulcer (PPU). We compared the surgical outcomes of simple closure and modified Cellan-Jones omentopexy and report the safety and benefit of simple closure. From January 2010 to December 2014, 118 patients with PPU underwent laparoscopic repair with simple closure (n = 27) or omentopexy (n = 91). Charts were retrospectively reviewed for demographic characteristics and outcomes. The data were compared by Fisher's exact test, Mann-Whitney U test, Pearson's chi-square test, and the Kruskal-Wallis test. Results were considered statistically significant at P < 0.05. No patients died, although three developed leakage. After matching, the simple closure and omentopexy groups were similar in sex, systolic blood pressure, pulse rate, respiratory rate, Boey score, Charlson comorbidity index, Mannheim peritonitis index, and leakage. There were statistically significant differences in age, length of hospital stay, perforation size, and operating time. Comparison of the operating time in the ≤4.0 mm and 5.0-12 mm perforation groups revealed that simple closure took less time than omentopexy in both (≤4.0 mm, 76 versus 133 minutes, P < 0.0001; 5.0-12 mm, 97 versus 139.5 minutes, P = 0.006). Compared with omentopexy, laparoscopic simple closure is a safe procedure and shortens the operating time. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard
2013-09-06
Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample scarcity, duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, the standard t test, the moderated t test (also known as limma), and rank products, for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
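For orientation, the core rank product statistic (before the authors' missing-value improvement) is simple enough to state in R. Treating NAs with na.last = "keep" and na.rm = TRUE is one plausible handling, shown here as an assumption rather than the paper's algorithm:

    # fc: features x replicates matrix of log fold changes, possibly with NAs.
    rank_product <- function(fc) {
      ranks <- apply(fc, 2, rank, na.last = "keep")  # per-replicate ranks
      exp(rowMeans(log(ranks), na.rm = TRUE))        # geometric mean of ranks
    }
    set.seed(4)
    fc <- matrix(rnorm(3000), 1000, 3)
    fc[sample(length(fc), 500)] <- NA                # inject missing values
    head(order(rank_product(fc)), 10)  # consistently low-ranked (down) features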
NASA Astrophysics Data System (ADS)
Rich, Grayson Currie
The COHERENT Collaboration has produced the first-ever observation, with a significance of 6.7σ, of a process consistent with coherent, elastic neutrino-nucleus scattering (CEνNS) as first predicted and described by D.Z. Freedman in 1974. The physics of the CEνNS process is presented along with its relationship to future measurements in the arenas of nuclear physics, fundamental particle physics, and astroparticle physics, where the newly observed interaction presents a viable tool for investigations into numerous outstanding questions about the nature of the universe. To enable the CEνNS observation with a 14.6-kg CsI[Na] detector, new measurements of the response of CsI[Na] to low-energy nuclear recoils, the only mechanism by which CEνNS is detectable, were carried out at Triangle Universities Nuclear Laboratory; these measurements are detailed and an effective nuclear-recoil quenching factor of (8.78 ± 1.66)% is established for CsI[Na] in the recoil-energy range of 5-30 keV, based on new and literature data. Following separate analyses of the CEνNS-search data by groups at the University of Chicago and the Moscow Engineering and Physics Institute, information from simulations, calculations, and ancillary measurements was used to inform statistical analyses of the collected data. Based on input from the Chicago analysis, the number of CEνNS events expected from the Standard Model is 173 ± 48; interpretation as a simple counting experiment finds 136 ± 31 CEνNS counts in the data, while a two-dimensional profile likelihood fit yields 134 ± 22 CEνNS counts. Details of the simulations, calculations, and supporting measurements are discussed, in addition to the statistical procedures. Finally, potential improvements to the CsI[Na]-based CEνNS measurement are presented along with future possibilities for the COHERENT Collaboration, including new CEνNS detectors and measurement of the neutrino-induced neutron spallation process.
Proceedings, Seminar on Probabilistic Methods in Geotechnical Engineering
NASA Astrophysics Data System (ADS)
Hynes-Griffin, M. E.; Buege, L. L.
1983-09-01
Contents: Applications of Probabilistic Methods in Geotechnical Engineering; Probabilistic Seismic and Geotechnical Evaluation at a Dam Site; Probabilistic Slope Stability Methodology; Probability of Liquefaction in a 3-D Soil Deposit; Probabilistic Design of Flood Levees; Probabilistic and Statistical Methods for Determining Rock Mass Deformability Beneath Foundations: An Overview; Simple Statistical Methodology for Evaluating Rock Mechanics Exploration Data; New Developments in Statistical Techniques for Analyzing Rock Slope Stability.
1992-10-01
[Front-matter residue from a footwear testing report; the recoverable table listings are: Summary Statistics (N=8) and Results of 44 Statistical Analyses for Impact Test Performed on Forefoot of Unworn Footwear; Summary Statistics (N=8) and Results of Statistical Analyses for Impact Test Performed on Forefoot of Worn Footwear; Summary Statistics (N=4) and Results of 76 Statistical Analyses for Impact Tests.] …used tests to assess heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978). Later, the number of tests was…
2011-01-01
Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
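The standard (non-generalised) statistics discussed above reduce to a short computation; the sketch below, with made-up effect estimates, shows Cochran's Q and the I² it implies:

    # est: trial effect estimates; se: their standard errors.
    q_i2 <- function(est, se) {
      w  <- 1 / se^2
      mu <- sum(w * est) / sum(w)              # fixed-effect pooled estimate
      q  <- sum(w * (est - mu)^2)              # Cochran's Q
      i2 <- 100 * max(0, (q - (length(est) - 1)) / q)
      c(Q = q, I2.percent = i2)
    }
    q_i2(est = c(0.10, 0.30, -0.05, 0.25), se = c(0.08, 0.12, 0.10, 0.09))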
More Powerful Tests of Simple Interaction Contrasts in the Two-Way Factorial Design
ERIC Educational Resources Information Center
Hancock, Gregory R.; McNeish, Daniel M.
2017-01-01
For the two-way factorial design in analysis of variance, the current article explicates and compares three methods for controlling the Type I error rate for all possible simple interaction contrasts following a statistically significant interaction, including a proposed modification to the Bonferroni procedure that increases the power of…
Calibration of Response Data Using MIRT Models with Simple and Mixed Structures
ERIC Educational Resources Information Center
Zhang, Jinming
2012-01-01
It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter…
Quantitation & Case-Study-Driven Inquiry to Enhance Yeast Fermentation Studies
ERIC Educational Resources Information Center
Grammer, Robert T.
2012-01-01
We propose a procedure for the assay of fermentation in yeast in microcentrifuge tubes that is simple and rapid, permitting assay replicates, descriptive statistics, and the preparation of line graphs that indicate reproducibility. Using regression and simple derivatives to determine initial velocities, we suggest methods to compare the effects of…
A Simple Statistical Thermodynamics Experiment
ERIC Educational Resources Information Center
LoPresto, Michael C.
2010-01-01
Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
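A three-line R version of the dice demonstration, counting the multiplicity of each macrostate (the sum) over all 36 equally likely microstates:

    sums <- outer(1:6, 1:6, "+")   # every two-dice microstate
    table(sums)                    # multiplicity of each macrostate
    table(sums) / 36               # probability; 7 is the high-entropy state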
ERIC Educational Resources Information Center
Harris, Ronald M.
1978-01-01
Presents material dealing with an application of statistical thermodynamics to the diatomic solid I-2(s). The objective is to enhance the student's appreciation of the power of the statistical formulation of thermodynamics. The Simple Einstein Model is used. (Author/MA)
Gaskin, Cadeyrn J; Happell, Brenda
2014-05-01
To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
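The power figures reported above can be reproduced in spirit with base R's power.t.test; the per-group n of 50 below is a hypothetical illustration, not a value from the review:

    # Power of a two-sample t test (n = 50 per group) for Cohen's d of
    # 0.2 (small), 0.5 (medium), and 0.8 (large).
    sapply(c(small = 0.2, medium = 0.5, large = 0.8),
           function(d) power.t.test(n = 50, delta = d, sig.level = 0.05)$power)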
Statistical issues in the design and planning of proteomic profiling experiments.
Cairns, David A
2015-01-01
The statistical design of a clinical proteomics experiment is a critical part of a well-conducted investigation. Standard concepts from experimental design, such as randomization, replication, and blocking, should be applied in all experiments, and this is possible when the experimental conditions are well understood by the investigator. The large number of proteins simultaneously considered in proteomic discovery experiments means that determining the number of replicates required for a powerful experiment is more complicated than in simple experiments. However, by using information about the nature of an experiment and making simple assumptions, this is achievable for a variety of experiments useful for biomarker discovery and initial validation.
A simple statistical model for geomagnetic reversals
NASA Technical Reports Server (NTRS)
Constable, Catherine
1990-01-01
The diversity of paleomagnetic records of geomagnetic reversals now available indicates that the field configuration during transitions cannot be adequately described by simple zonal or standing field models. A new model described here is based on statistical properties inferred from the present field and is capable of simulating field transitions like those observed. Some insight is obtained into what one can hope to learn from paleomagnetic records. In particular, it is crucial that the effects of smoothing in the remanence acquisition process be separated from true geomagnetic field behavior. This might enable us to determine the time constants associated with the dominant field configuration during a reversal.
Statistical Properties of Online Auctions
NASA Astrophysics Data System (ADS)
Namazi, Alireza; Schadschneider, Andreas
We characterize the statistical properties of a large number of online auctions run on eBay. Both stationary and dynamic properties, like distributions of prices, number of bids etc., as well as relations between these quantities are studied. The analysis of the data reveals surprisingly simple distributions and relations, typically of power-law form. Based on these findings we introduce a simple method to identify suspicious auctions that could be influenced by a form of fraud known as shill bidding. Furthermore the influence of bidding strategies is discussed. The results indicate that the observed behavior is related to a mixture of agents using a variety of strategies.
Franc, Jeffrey Michael; Ingrassia, Pier Luigi; Verde, Manuela; Colombo, Davide; Della Corte, Francesco
2015-02-01
Surge capacity, or the ability to manage an extraordinary volume of patients, is fundamental for hospital management of mass-casualty incidents. However, quantification of surge capacity is difficult and no universal standard for its measurement has emerged, nor has a standardized statistical method been advocated. As mass-casualty incidents are rare, simulation may represent a viable alternative to measure surge capacity. Hypothesis/Problem: The objective of the current study was to develop a statistical method for the quantification of surge capacity using a combination of computer simulation and simple process-control statistical tools. Length-of-stay (LOS) and patient volume (PV) were used as metrics. The use of this method was then demonstrated on a subsequent computer simulation of an emergency department (ED) response to a mass-casualty incident. In the derivation phase, 357 participants in five countries performed 62 computer simulations of an ED response to a mass-casualty incident. Benchmarks for ED response were derived from these simulations, including LOS and PV metrics for triage, bed assignment, physician assessment, and disposition. In the application phase, 13 students of the European Master in Disaster Medicine (EMDM) program completed the same simulation scenario, and the results were compared to the standards obtained in the derivation phase. Patient-volume metrics included number of patients to be triaged, assigned to rooms, assessed by a physician, and disposed. Length-of-stay metrics included median time to triage, room assignment, physician assessment, and disposition. Simple graphical methods were used to compare the application phase group to the derived benchmarks using process-control statistical tools. The group in the application phase failed to meet the indicated standard for LOS from admission to disposition decision. This study demonstrates how simulation software can be used to derive values for objective benchmarks of ED surge capacity using PV and LOS metrics. These objective metrics can then be applied to other simulation groups using simple graphical process-control tools to provide a numeric measure of surge capacity. Repeated use in simulations of actual EDs may represent a potential means of objectively quantifying disaster management surge capacity. It is hoped that the described statistical method, which is simple and reusable, will be useful for investigators in this field to apply to their own research.
Zeng, Yi; Qi, Shulan; Meng, Xing; Chen, Yinyin
2018-03-12
By analysing the defects of control design in randomized controlled trials (RCTs) of acupuncture for simple obesity that used acupuncture as the control, we present the essential factors that should be taken into account when designing the control arm of a clinical trial, in order to further improve clinical research. Taking RCTs of acupuncture for simple obesity as the example, we searched for RCTs of acupuncture for simple obesity with an acupuncture control. According to the characteristics of acupuncture therapy, we sorted and analysed the control interventions with respect to acupoint selection, needle penetration, depth of insertion, etc., then counted the number of factors differing between the two groups and assessed their rationality. Of the 15 RCTs meeting the inclusion criteria, 7 were published in English and 8 in Chinese. Six trials (40%) differed between groups in more than one factor (4 published in English abroad, 2 in Chinese), while 9 (60%) differed in only one factor (3 published in English, 6 in Chinese). The control design of acupuncture in some clinical RCTs is unreasonable because the number of factors differing between the two groups is not considered.
DIY soundcard based temperature logging system. Part II: applications
NASA Astrophysics Data System (ADS)
Nunn, John
2016-11-01
This paper demonstrates some simple applications of how temperature logging systems may be used to monitor simple heat experiments, and how the data obtained can be analysed to get some additional insight into the physical processes.
Rudasingwa, Martin; Soeters, Robert; Bossuyt, Michel
2015-01-01
To strengthen health care delivery, the Burundian Government, in collaboration with international NGOs, piloted performance-based financing (PBF) in 2006. Health facilities were assigned, using a simple matching method, either to the PBF scheme or to continuation of the traditional input-based funding. Our objective was to analyse the effect of the PBF scheme on the quality of health services between 2006 and 2008. We conducted the analysis in 16 health facilities with the PBF scheme and 13 health facilities without it. We analysed the PBF effect using 58 composite quality indicators covering eight health services: care management, outpatient care, maternity care, prenatal care, family planning, laboratory services, medicines management, and materials management. The differences in quality improvement between the two groups of health facilities were assessed with descriptive statistics, a paired non-parametric Wilcoxon signed ranks test, and a simple difference-in-difference approach at a significance level of 5%. We found an improvement in the quality of care in the PBF group and a significant deterioration in the non-PBF group in the same four health services: care management, outpatient care, maternity care, and prenatal care. The findings suggest a PBF effect of between 38 and 66 percentage points (p<0.001) in the quality scores of care management, outpatient care, prenatal care, and maternal care. We found no PBF effect on clinical support services: laboratory services, medicines management, and materials management. The PBF scheme in Burundi contributed to the improvement of the health services that were strongly under the control of medical personnel (physicians and nurses) within a short time of two years. The clinical support services that did not significantly improve were strongly under the control of laboratory technicians, pharmacists, and non-medical personnel. PMID:25948432
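The difference-in-difference estimate used here is, at its simplest, one line of arithmetic on group-by-period means. A toy R sketch with simulated scores and hypothetical column names:

    set.seed(5)
    q <- expand.grid(group = c("PBF", "control"),
                     period = c("pre", "post"), facility = 1:15)
    q$score <- 50 + 10 * (q$group == "PBF" & q$period == "post") +
               rnorm(nrow(q), sd = 4)
    m <- tapply(q$score, list(q$group, q$period), mean)
    (m["PBF", "post"] - m["PBF", "pre"]) -
      (m["control", "post"] - m["control", "pre"])  # DiD effect, ~10 here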
Liu, Ying; Guo, Xin-biao; Li, Hai-rong; Yang, Lin-sheng
2006-07-01
To study factors affecting food poisoning and hygiene complaints in restaurants, which represent hidden dangers to public health. Data on food hygiene events from 2002 to 2004 in restaurants of 14 blocks located in the central city zone of Qingdao were collected and studied. Spatial distribution analysis was conducted by means of a Geographic Information System (GIS). Possible factors related to food hygiene events were investigated and analysed with the NCSS statistics software. Information on every block was marked on a digitized map with ARCVIEW 3.2a software to show clearly the spatial distribution of food hygiene events across different areas over the three years. Air temperature, humidity, and hours of sunlight were shown to be important factors in food poisoning. The average number of guests and the floating population were related to the level of sanitation administration; the level of sanitation administration, geographic location, and business status of restaurants were related to their food sanitation status. This study showed that analysing the food sanitation status of different areas with GIS is effective, simple, and intuitive.
Francisco, E; Martín Pendás, A; Blanco, M A
2009-09-28
We show in this article how for single-determinant wave functions the one-electron functions derived from the diagonalization of the Fermi hole, averaged over an arbitrary domain Ω of real space, and expressed in terms of the occupied canonical orbitals, describe coarse-grained statistically independent electrons. With these domain-averaged Fermi hole (DAFH) orbitals, the full electron number distribution function (EDF) is given by a simple product of one-electron events. This useful property follows from the simultaneous orthogonality of the DAFH orbitals in Ω, Ω′ = R³ − Ω, and R³. We also show how the interfragment (shared electron) delocalization index, δ(Ω,Ω′), transforms into a sum of one-electron DAFH contributions. Description of chemical bonding in terms of DAFH orbitals provides a vivid picture relating bonding and delocalization in real space. DAFH and EDF analyses are performed on several test systems to illustrate the close relationship between both concepts. Finally, these analyses clearly prove how DAFH orbitals well localized in Ω or Ω′ can be simply ignored in computing the EDFs and/or δ(Ω,Ω′), and thus do not contribute to the chemical bonding between the two fragments.
Yuan, L.L.; Pollard, A.I.; Carlisle, D.M.
2009-01-01
Analyses of observational data can provide insights into relationships between environmental conditions and biological responses across a broader range of natural conditions than experimental studies, potentially complementing insights gained from experiments. However, observational data must be analyzed carefully to minimize the likelihood that confounding variables bias observed relationships. Propensity scores provide a robust approach for controlling for the effects of measured confounding variables when analyzing observational data. Here, we use propensity scores to estimate changes in mean invertebrate taxon richness in streams that have experienced insecticide concentrations that exceed aquatic life use benchmark concentrations. A simple comparison of richness in sites exposed to elevated insecticides with those that were not exposed suggests that exposed sites had on average 6.8 fewer taxa compared to unexposed sites. The presence of potential confounding variables makes it difficult to assert a causal relationship from this simple comparison. After controlling for confounding factors using propensity scores, the difference in richness between exposed and unexposed sites was reduced to 4.1 taxa, a difference that was still statistically significant. Because the propensity score analysis controlled for the effects of a wide variety of possible confounding variables, we infer that the change in richness observed in the propensity score analysis was likely caused by insecticide exposure. © 2009 SETAC.
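A compact R sketch of propensity-score stratification in this spirit; the data frame, covariates, and effect sizes below are simulated stand-ins, not the study's variables:

    set.seed(7)
    n <- 500
    d <- data.frame(agr = runif(n), urban = runif(n), nutrients = runif(n))
    d$exposed  <- rbinom(n, 1, plogis(-2 + 3 * d$agr))  # confounded exposure
    d$richness <- 30 - 8 * d$agr - 4 * d$exposed + rnorm(n, sd = 3)
    # Naive comparison mixes the exposure effect with the confounder:
    diff(tapply(d$richness, d$exposed, mean))
    # Propensity scores, then stratified comparison within score quintiles:
    ps <- fitted(glm(exposed ~ agr + urban + nutrients, binomial, data = d))
    stratum <- cut(ps, quantile(ps, 0:5 / 5), include.lowest = TRUE)
    m <- tapply(d$richness, list(stratum, d$exposed), mean)
    mean(m[, "1"] - m[, "0"], na.rm = TRUE)             # closer to the true -4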
Eksborg, Staffan
2013-01-01
Pharmacokinetic studies are important for optimizing drug dosing, but they require proper validation of the pharmacokinetic procedures used. However, simple and reliable statistical methods suitable for evaluating the predictive performance of pharmacokinetic analyses are essentially lacking. The aim of the present study was to construct and evaluate a graphic procedure for quantifying the predictive performance of individual and population pharmacokinetic compartment analysis. Original data from previously published pharmacokinetic compartment analyses after intravenous, oral, and epidural administration, and digitized data obtained from published scatter plots of observed vs predicted drug concentrations from population pharmacokinetic studies using the NPEM algorithm, the NONMEM computer program, and Bayesian forecasting procedures, were used to estimate predictive performance according to the proposed graphical method and by the method of Sheiner and Beal. The graphical plot proposed in the present paper proved to be a useful tool for evaluating the predictive performance of both individual and population compartment pharmacokinetic analyses. The proposed method is simple to use and gives valuable information concerning time- and concentration-dependent inaccuracies that might occur in individual and population pharmacokinetic compartment analysis. Predictive performance can be quantified as the fraction of concentration ratios within arbitrarily specified ranges, e.g. within the range 0.8-1.2.
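The proposed quantification reduces to examining prediction/observation ratios; a hedged R sketch with invented concentrations:

    observed  <- c(12.1, 8.4, 5.9, 4.0, 2.7)  # hypothetical measured levels
    predicted <- c(11.3, 9.0, 5.6, 4.4, 2.4)  # model predictions
    ratio <- predicted / observed
    mean(ratio >= 0.8 & ratio <= 1.2)         # fraction within 0.8-1.2
    plot(observed, ratio, log = "y")          # concentration-dependent patterns
    abline(h = c(0.8, 1, 1.2), lty = 2)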
NASA Astrophysics Data System (ADS)
Kawakami, Shun; Sasaki, Toshihiko; Koashi, Masato
2017-07-01
An essential step in quantum key distribution is the estimation of parameters related to the leaked amount of information, which is usually done by sampling of the communication data. When the data size is finite, the final key rate depends on how the estimation process handles statistical fluctuations. Many of the present security analyses are based on the method with simple random sampling, where hypergeometric distribution or its known bounds are used for the estimation. Here we propose a concise method based on Bernoulli sampling, which is related to binomial distribution. Our method is suitable for the Bennett-Brassard 1984 (BB84) protocol with weak coherent pulses [C. H. Bennett and G. Brassard, Proceedings of the IEEE Conference on Computers, Systems and Signal Processing (IEEE, New York, 1984), Vol. 175], reducing the number of estimated parameters to achieve a higher key generation rate compared to the method with simple random sampling. We also apply the method to prove the security of the differential-quadrature-phase-shift (DQPS) protocol in the finite-key regime. The result indicates that the advantage of the DQPS protocol over the phase-encoding BB84 protocol in terms of the key rate, which was previously confirmed in the asymptotic regime, persists in the finite-key regime.
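The distinction between the two sampling models can be made concrete with base R's distribution functions; the counts below are arbitrary illustration values, not protocol parameters:

    # Probability of seeing at most 5 flagged events when 50 of 1000 signals
    # are 'bad' and 100 are sampled:
    phyper(5, m = 50, n = 950, k = 100) # simple random sampling (hypergeometric)
    pbinom(5, size = 100, prob = 0.05)  # Bernoulli sampling (binomial)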
NASA Technical Reports Server (NTRS)
Duncan, Jeff; Stow, D.; Franklin, J.; Hope, A.
1993-01-01
We assessed the statistical relations between Spectral Vegetation Indices (SVI's) derived from SPOT multi-spectral data and semi-arid shrub cover at the Jornada LTER site in New Mexico. Despite a limited range of shrub cover in the sample, the analyses resulted in r² values as high as 0.77. Greenness SVI's (e.g., Simple Ratio, NDVI, SAVI, PVI and an orthogonal Greenness index) were shown to be more sensitive to shrub type and phenology than brightness SVI's (e.g., green, red and near-infrared reflectances and a Brightness index). The results varied substantially with small-scale changes in plot size (60 m by 60 m to 100 m by 100 m) as a consequence of landscape heterogeneity. The results also indicated the potential for the spectral differentiation of shrub types, and shrubs from grass, using multi-temporal, multi-spectral analysis.
Kamal, S M Mostafa; Hassan, Che Hashim; Kabir, M A
2015-03-01
This study examines inequality in the use of skilled delivery assistance by rural women of Bangladesh using the 2007 Bangladesh Demographic and Health Survey data. Simple cross-tabulation and univariate and multivariate statistical analyses were employed. Overall, 56.1% of the women received at least one antenatal care visit, whereas only 13.2% of births were assisted by skilled personnel. Findings revealed apparent inequality in the use of skilled delivery assistance across socioeconomic strata. Birth order, women's education, religion, wealth index, region, and antenatal care are important determinants of seeking skilled assistance. To support the safe motherhood initiative, the government should pay special attention to reducing inequality in seeking skilled delivery assistance. A strong focus on community-based and regional interventions is important in order to increase the utilization of safe maternal health care services in rural Bangladesh. © 2013 APJPH.
Bulk and surface properties of liquid Al-Cr and Cr-Ni alloys.
Novakovic, R
2011-06-15
The energetics of mixing and the structural arrangement in liquid Al-Cr and Cr-Ni alloys have been analysed through the study of surface properties (surface tension and surface segregation), dynamic properties (chemical diffusion), and microscopic functions (concentration fluctuations in the long-wavelength limit and the chemical short-range order parameter) in the framework of statistical mechanical theory in conjunction with quasi-lattice theory. The Al-Cr phase diagram exhibits different intermetallic compounds in the solid state, while that of Cr-Ni is a simple eutectic-type phase diagram at high temperatures that includes a low-temperature peritectoid reaction near the CrNi₂ composition. Accordingly, the mixing behaviour of Al-Cr and Cr-Ni alloy melts was studied using the complex formation model in the weak interaction approximation, postulating Al₈Cr₅ and CrNi₂ chemical complexes, respectively, as energetically favoured.
Mitchem, Dorian G.; Zietsch, Brendan P.; Wright, Margaret J.; Martin, Nicholas G.; Hewitt, John K.; Keller, Matthew C.
2015-01-01
Theories in both evolutionary and social psychology suggest that a positive correlation should exist between facial attractiveness and general intelligence, and several empirical observations appear to corroborate this expectation. Using highly reliable measures of facial attractiveness and IQ in a large sample of identical and fraternal twins and their siblings, we found no evidence for a phenotypic correlation between these traits. Likewise, neither the genetic nor the environmental latent factor correlations were statistically significant. We supplemented our analyses of new data with a simple meta-analysis that found evidence of publication bias among past studies of the relationship between facial attractiveness and intelligence. In view of these results, we suggest that previously published reports may have overestimated the strength of the relationship and that the theoretical bases for the predicted attractiveness-intelligence correlation may need to be reconsidered. PMID:25937789
Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs
2011-01-01
This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates the reuse in other software packages. PMID:21253357
Application of the Probabilistic Dynamic Synthesis Method to the Analysis of a Realistic Structure
NASA Technical Reports Server (NTRS)
Brown, Andrew M.; Ferri, Aldo A.
1998-01-01
The Probabilistic Dynamic Synthesis method is a new technique for obtaining the statistics of a desired response engineering quantity for a structure with non-deterministic parameters. The method uses measured data from modal testing of the structure as the input random variables, rather than more "primitive" quantities like geometry or material variation. This modal information is much more comprehensive and easily measured than the "primitive" information. The probabilistic analysis is carried out using either response surface reliability methods or Monte Carlo simulation. A previous work verified the feasibility of the PDS method on a simple seven degree-of-freedom spring-mass system. In this paper, extensive issues involved with applying the method to a realistic three-substructure system are examined, and free and forced response analyses are performed. The results from using the method are promising, especially when the lack of alternatives for obtaining quantitative output for probabilistic structures is considered.
Application of the Probabilistic Dynamic Synthesis Method to Realistic Structures
NASA Technical Reports Server (NTRS)
Brown, Andrew M.; Ferri, Aldo A.
1998-01-01
The Probabilistic Dynamic Synthesis method is a technique for obtaining the statistics of a desired response engineering quantity for a structure with non-deterministic parameters. The method uses measured data from modal testing of the structure as the input random variables, rather than more "primitive" quantities like geometry or material variation. This modal information is much more comprehensive and easily measured than the "primitive" information. The probabilistic analysis is carried out using either response surface reliability methods or Monte Carlo simulation. In previous work, the feasibility of the PDS method applied to a simple seven degree-of-freedom spring-mass system was verified. In this paper, extensive issues involved with applying the method to a realistic three-substructure system are examined, and free and forced response analyses are performed. The results from using the method are promising, especially when the lack of alternatives for obtaining quantitative output for probabilistic structures is considered.
Timing in a Variable Interval Procedure: Evidence for a Memory Singularity
Matell, Matthew S.; Kim, Jung S.; Hartshorne, Loryn
2013-01-01
Rats were trained in either a 30s peak-interval procedure, or a 15–45s variable interval peak procedure with a uniform distribution (Exp 1) or a ramping probability distribution (Exp 2). Rats in all groups showed peak shaped response functions centered around 30s, with the uniform group having an earlier and broader peak response function and rats in the ramping group having a later peak function as compared to the single duration group. The changes in these mean functions, as well as the statistics from single trial analyses, can be better captured by a model of timing in which memory is represented by a single, average, delay to reinforcement compared to one in which all durations are stored as a distribution, such as the complete memory model of Scalar Expectancy Theory or a simple associative model. PMID:24012783
Moreau, Gaétan; Michaud, J-P
2017-01-01
LaMotte and Wells re-analyzed and criticized one of our articles, in which we proposed a novel statistical test for predicting postmortem interval from insect succession data. Using simple mathematical examples, we demonstrate that LaMotte and Wells erred because their analyses are based on an erroneous interpretation of the nature of probabilities that disregards more than 300 years of scientific literature on probability combination. We also argue that the methods presented in our article, specifically the use of degree-day-based logistic regression analysis to model succession, were a positive contribution to the fields of forensic entomology and carrion ecology, which LaMotte and Wells failed to acknowledge, focusing instead on issues that were either trivial or nonexistent. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Walsh, Linda; Dufey, Florian; Tschense, Annemarie; Schnelzer, Maria; Sogl, Marion; Kreuzer, Michaela
2012-01-01
A recent study and comprehensive literature review have indicated that mining could be protective against prostate cancer. This indication is explored further here by analysing prostate cancer mortality in the German 'Wismut' uranium miner cohort, which has detailed information on the number of days worked underground. An historical cohort study of 58,987 male mine workers with retrospective follow-up before 1999 and prospective follow-up since 1999. Uranium mine workers employed during the period 1970-1990 in the regions of Saxony and Thuringia, Germany, contributing 1.42 million person-years of follow-up ending in 2003. Simple standardised mortality ratio (SMR) analyses were applied to assess differences between the national and cohort prostate cancer mortality rates, complemented by refined analyses done entirely within the cohort. The internal comparisons applied a Poisson regression excess relative risk model for prostate cancer mortality, with background stratification by age and calendar year and a whole range of possible explanatory covariables, including days worked underground and years worked at high physical activity, with γ radiation treated as a confounder. The analysis is based on miner data for 263 prostate cancer deaths. The overall SMR was 0.85 (95% CI 0.75 to 0.95). A linear excess relative risk model with the number of years worked at high physical activity and the number of days worked underground as explanatory covariables provided a statistically significant fit when compared with the background model (p=0.039). Results (with 95% CIs) for the excess relative risk per day worked underground indicated a statistically significant (p=0.0096) small protective effect of -5.59 (-9.81 to -1.36) ×10⁻⁵. Evidence is provided from the German Wismut cohort in support of a protective effect of working underground on prostate cancer mortality risk.
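For reference, the headline SMR and its confidence interval follow from a simple Poisson calculation; the expected count below is back-derived from the reported SMR of 0.85 and is therefore approximate:

    obs <- 263              # observed prostate cancer deaths in the cohort
    expct <- obs / 0.85     # implied expected deaths (~309)
    ci <- poisson.test(obs, T = expct)$conf.int
    round(c(SMR = obs / expct, lower = ci[1], upper = ci[2]), 2)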
Manufacturing Squares: An Integrative Statistical Process Control Exercise
ERIC Educational Resources Information Center
Coy, Steven P.
2016-01-01
In the exercise, students in a junior-level operations management class are asked to manufacture a simple product. Given product specifications, they must design a production process, create roles and design jobs for each team member, and develop a statistical process control plan that efficiently and effectively controls quality during…
NASA Astrophysics Data System (ADS)
Zhang, Weijia; Fuller, Robert G.
1998-05-01
A demographic database for the 139 Nobel prize winners in physics from 1901 to 1990 has been created from a variety of sources. The results of our statistical study are discussed in the light of the implications for physics teaching.
Teaching Statistics with Minitab II.
ERIC Educational Resources Information Center
Ryan, T. A., Jr.; And Others
Minitab is a statistical computing system which uses simple language, produces clear output, and keeps track of bookkeeping automatically. Error checking with English diagnostics and inclusion of several default options help to facilitate use of the system by students. Minitab II is an improved and expanded version of the original Minitab which…
Applying Descriptive Statistics to Teaching the Regional Classification of Climate.
ERIC Educational Resources Information Center
Lindquist, Peter S.; Hammel, Daniel J.
1998-01-01
Describes an exercise for college and high school students that relates descriptive statistics to the regional climatic classification. The exercise introduces students to simple calculations of central tendency and dispersion, the construction and interpretation of scatterplots, and the definition of climatic regions. Forces students to engage…
An Experimental Approach to Teaching and Learning Elementary Statistical Mechanics
ERIC Educational Resources Information Center
Ellis, Frank B.; Ellis, David C.
2008-01-01
Introductory statistical mechanics is studied for a simple two-state system using an inexpensive and easily built apparatus. A large variety of demonstrations, suitable for students in high school and introductory university chemistry courses, are possible. This article details demonstrations for exothermic and endothermic reactions, the dynamic…
School District Enrollment Projections: A Comparison of Three Methods.
ERIC Educational Resources Information Center
Pettibone, Timothy J.; Bushan, Latha
This study assesses three methods of forecasting school enrollments: the cohort-survival method (grade progression), the statistical forecasting procedure developed by the Statistical Analysis System (SAS) Institute, and a simple ratio computation. The three methods were used to forecast school enrollments for kindergarten through grade 12 in a…
Application of Transformations in Parametric Inference
ERIC Educational Resources Information Center
Brownstein, Naomi; Pensky, Marianna
2008-01-01
The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…
40 CFR 91.512 - Request for public hearing.
Code of Federal Regulations, 2010 CFR
2010-07-01
... plans and statistical analyses have been properly applied (specifically, whether sampling procedures and statistical analyses specified in this subpart were followed and whether there exists a basis for... will be made available to the public during Agency business hours. ...
Approximate Expressions for the Period of a Simple Pendulum Using a Taylor Series Expansion
ERIC Educational Resources Information Center
Belendez, Augusto; Arribas, Enrique; Marquez, Andres; Ortuno, Manuel; Gallego, Sergi
2011-01-01
An approximate scheme for obtaining the period of a simple pendulum for large-amplitude oscillations is analysed and discussed. When students express the exact frequency or the period of a simple pendulum as a function of the oscillation amplitude, and they are told to expand this function in a Taylor series, they always do so using the…
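For reference, the standard expansion the students are led toward (to fourth order in the amplitude θ₀) is

    T \approx 2\pi\sqrt{\frac{L}{g}}
        \left(1 + \frac{1}{16}\,\theta_0^{2} + \frac{11}{3072}\,\theta_0^{4}\right),

which reduces to the familiar small-angle period as θ₀ → 0.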
Jin, Zhichao; Yu, Danghui; Zhang, Luoman; Meng, Hong; Lu, Jian; Gao, Qingbin; Cao, Yang; Ma, Xiuqiang; Wu, Cheng; He, Qian; Wang, Rui; He, Jia
2010-05-25
High quality clinical research requires not only advanced professional knowledge, but also sound study design and correct statistical analyses. The number of clinical research articles published in Chinese medical journals has increased immensely in the past decade, but study design quality and statistical analyses have remained suboptimal. The aim of this investigation was to gather evidence on the quality of study design and statistical analyses in clinical research conducted in China during the first decade of the new millennium. Ten (10) leading Chinese medical journals were selected and all original articles published in 1998 (N = 1,335) and 2008 (N = 1,578) were thoroughly categorized and reviewed. A well-defined and validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation. Main outcomes were the frequencies of different types of study design, error/defect proportions in design and statistical analyses, and implementation of CONSORT in randomized clinical trials. From 1998 to 2008, the error/defect proportion in statistical analyses decreased significantly (χ² = 12.03, p<0.001), from 59.8% (545/1,335) in 1998 to 52.2% (664/1,578) in 2008. The overall error/defect proportion in study design also decreased (χ² = 21.22, p<0.001), from 50.9% (680/1,335) to 42.4% (669/1,578). In 2008, randomized clinical trials remained in the single digits (3.8%, 60/1,578), with two-thirds showing poor results reporting (defects in 44 papers, 73.3%). Nearly half of the published studies were retrospective in nature: 49.3% (658/1,335) in 1998 and 48.2% (761/1,578) in 2008. Decreases in defect proportions were observed in both results presentation (χ² = 93.26, p<0.001), from 92.7% (945/1,019) to 78.2% (1,023/1,309), and interpretation (χ² = 27.26, p<0.001), from 9.7% (99/1,019) to 4.3% (56/1,309), although some serious defects persisted. Chinese medical research seems to have made significant progress regarding statistical analyses, but there remains ample room for improvement in study design. Retrospective clinical studies are the most commonly used design, whereas randomized clinical trials are rare and often show methodological weaknesses. Urgent implementation of the CONSORT statement is imperative.
ERIC Educational Resources Information Center
Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.
2010-01-01
This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…
Statistical fluctuations in pedestrian evacuation times and the effect of social contagion
NASA Astrophysics Data System (ADS)
Nicolas, Alexandre; Bouzat, Sebastián; Kuperman, Marcelo N.
2016-08-01
Mathematical models of pedestrian evacuation and the associated simulation software have become essential tools for the assessment of the safety of public facilities and buildings. While a variety of models is now available, their calibration and test against empirical data are generally restricted to global averaged quantities; the statistics compiled from the time series of individual escapes ("microscopic" statistics) measured in recent experiments are thus overlooked. In the same spirit, much research has primarily focused on the average global evacuation time, whereas the whole distribution of evacuation times over some set of realizations should matter. In the present paper we propose and discuss the validity of a simple relation between this distribution and the microscopic statistics, which is theoretically valid in the absence of correlations. To this purpose, we develop a minimal cellular automaton, with features that afford a semiquantitative reproduction of the experimental microscopic statistics. We then introduce a process of social contagion of impatient behavior in the model and show that the simple relation under test may dramatically fail at high contagion strengths, the latter being responsible for the emergence of strong correlations in the system. We conclude with comments on the potential practical relevance for safety science of calculations based on microscopic statistics.
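The relation under test can be stated as a one-line simulation: absent correlations, the distribution of the total evacuation time is the n-fold convolution of the individual escape-interval distribution. A toy R sketch (exponential intervals are an arbitrary choice, not the paper's measured statistics):

    set.seed(9)
    n_ped <- 50
    totals <- replicate(2000, sum(rexp(n_ped, rate = 1)))  # i.i.d. intervals
    hist(totals, breaks = 40, xlab = "total evacuation time", main = "")
    # Contagion-induced correlations between intervals would break this relation.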
Learning predictive statistics from temporal sequences: Dynamics and strategies.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-10-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics; that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
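A toy R version of the sequence families used here, from which both the zeroth-order frequencies and the context-conditional statistics can be read off (the transition values are invented):

    set.seed(3)
    trans <- matrix(c(0.8, 0.2,
                      0.3, 0.7), nrow = 2, byrow = TRUE)  # hypothetical chain
    sym <- integer(5000); sym[1] <- 1
    for (t in 2:5000) sym[t] <- sample(1:2, 1, prob = trans[sym[t - 1], ])
    table(sym) / length(sym)                         # frequency statistics
    prop.table(table(sym[-length(sym)], sym[-1]), 1) # context-based statistics

A "matcher" reproduces the conditional table, while a "maximizer" always selects the most probable symbol in each context.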
A simple blind placement of the left-sided double-lumen tubes.
Zong, Zhi Jun; Shen, Qi Ying; Lu, Yao; Li, Yuan Hai
2016-11-01
One-lung ventilation (OLV) is commonly provided using a double-lumen tube (DLT). Previous reports have indicated a high incidence of inappropriate DLT positioning with conventional maneuvers. After obtaining approval from the medical ethics committee of the First Affiliated Hospital of Anhui Medical University and written consent from patients, 88 adult patients of American Society of Anesthesiologists (ASA) physical status grade I or II, undergoing elective thoracic surgery requiring a left-sided DLT for OLV, were enrolled in this prospective, single-blind, randomized controlled study. Patients were randomly allocated to 1 of 2 groups: a simple maneuver group or a conventional maneuver group. The simple maneuver relies on partially inflating the bronchial balloon, recreating the effect of a carinal hook on the DLT to indicate orientation and depth. After the induction of anesthesia, the patients were intubated with a left-sided Robertshaw DLT using one of the 2 intubation techniques. After intubation of each DLT, an anesthesiologist used flexible bronchoscopy to evaluate its position while the patient lay supine. The number of optimal positions and the time required to place the DLT in the correct position were recorded. Intubation with the DLT took 100 ± 16.2 seconds (mean ± SD) in the simple maneuver group and 95.1 ± 20.8 seconds in the conventional maneuver group; the difference was not statistically significant (P = 0.221). Fiberoptic bronchoscopy (FOB) took 22 ± 4.8 seconds in the simple maneuver group, statistically faster than in the conventional maneuver group (43.6 ± 23.7 seconds, P < 0.001). Nearly 98% of the 44 intubations in the simple maneuver group were considered to be in optimal position, compared with only 52% of the 44 intubations in the conventional maneuver group, a statistically significant difference (P < 0.001). This simple maneuver positions left-sided DLTs more rapidly and more accurately, and it may substitute for FOB during positioning of a left-sided DLT when FOB is unavailable or inapplicable.
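The abstract does not state which test was used for the intubation-time comparison, but a two-sample t-test recomputed from the reported summary statistics reproduces the reported P = 0.221. A minimal check with scipy:

```python
from scipy import stats

# Reported summaries: intubation time (seconds), n = 44 per group.
res = stats.ttest_ind_from_stats(mean1=100.0, std1=16.2, nobs1=44,
                                 mean2=95.1, std2=20.8, nobs2=44,
                                 equal_var=False)  # Welch's t-test
print(res)  # p-value is close to the reported P = 0.221
```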
NASA Technical Reports Server (NTRS)
Hakkinen, Raimo J; Richardson, A S, Jr
1957-01-01
Sinusoidally oscillating downwash and lift produced on a simple rigid airfoil were measured and compared with calculated values. Statistically stationary random downwash and the corresponding lift on a simple rigid airfoil were also measured and the transfer functions between their power spectra determined. The random experimental values are compared with theoretically approximated values. Limitations of the experimental technique and the need for more extensive experimental data are discussed.
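A hedged sketch of the spectral procedure the report describes, i.e. determining a transfer function between a stationary random input and the response from their spectra. A Butterworth filter stands in for the unknown airfoil lift response; everything here is illustrative, not the report's data.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(1)
fs = 1000.0
x = rng.standard_normal(200_000)          # stationary random "downwash"
b, a = signal.butter(2, 50.0, fs=fs)      # stand-in linear lift response
y = signal.lfilter(b, a, x)               # corresponding "lift" record

f, Pxx = signal.welch(x, fs=fs, nperseg=1024)   # input power spectrum
_, Pxy = signal.csd(x, y, fs=fs, nperseg=1024)  # cross spectrum
H = Pxy / Pxx                                    # H1 transfer-function estimate

# The magnitude recovered from the spectra should match the known filter.
_, H_true = signal.freqz(b, a, worN=f, fs=fs)
print(np.max(np.abs(np.abs(H) - np.abs(H_true))))
```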
Algorithm for Identifying Erroneous Rain-Gauge Readings
NASA Technical Reports Server (NTRS)
Rickman, Doug
2005-01-01
An algorithm analyzes rain-gauge data to identify statistical outliers that could be deemed to be erroneous readings. Heretofore, analyses of this type have been performed in burdensome manual procedures that have involved subjective judgements. Sometimes, the analyses have included computational assistance for detecting values falling outside of arbitrary limits. The analyses have been performed without statistically valid knowledge of the spatial and temporal variations of precipitation within rain events. In contrast, the present algorithm makes it possible to automate such an analysis, makes the analysis objective, takes account of the spatial distribution of rain gauges in conjunction with the statistical nature of spatial variations in rainfall readings, and minimizes the use of arbitrary criteria. The algorithm implements an iterative process that involves nonparametric statistics.
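The sketch below is not NASA's algorithm; it only illustrates the general idea of an iterative, nonparametric (median/MAD-based) pass that flags readings far from the robust consensus of the remaining gauges. A real implementation would also weight gauges by distance and use the spatial correlation structure of rainfall.

```python
import numpy as np

def flag_outliers(readings, n_mad=4.0, max_iter=10):
    """Iteratively flag gauge readings far from the robust consensus.

    Illustrative only: distances between gauges and the spatial
    statistics of rain events are ignored here.
    """
    readings = np.asarray(readings, dtype=float)
    ok = np.ones(readings.size, dtype=bool)
    for _ in range(max_iter):
        med = np.median(readings[ok])
        mad = np.median(np.abs(readings[ok] - med)) or 1e-9
        new_ok = np.abs(readings - med) <= n_mad * 1.4826 * mad
        if np.array_equal(new_ok, ok):
            break
        ok = new_ok
    return ~ok  # True where a reading is flagged as suspicious

print(flag_outliers([3.1, 2.8, 3.4, 2.9, 25.0, 3.0]))
```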
Citation of previous meta-analyses on the same topic: a clue to perpetuation of incorrect methods?
Li, Tianjing; Dickersin, Kay
2013-06-01
Systematic reviews and meta-analyses serve as a basis for decision-making and clinical practice guidelines and should be carried out using appropriate methodology to avoid incorrect inferences. We describe the characteristics, statistical methods used for meta-analyses, and citation patterns of all 21 glaucoma systematic reviews we identified pertaining to the effectiveness of prostaglandin analog eye drops in treating primary open-angle glaucoma, published between December 2000 and February 2012. We abstracted data, assessed whether appropriate statistical methods were applied in meta-analyses, and examined citation patterns of included reviews. We identified two forms of problematic statistical analyses in 9 of the 21 systematic reviews examined. Except in 1 case, none of the 9 reviews that used incorrect statistical methods cited a previously published review that used appropriate methods. Reviews that used incorrect methods were cited 2.6 times more often than reviews that used appropriate statistical methods. We speculate that by emulating the statistical methodology of previous systematic reviews, systematic review authors may have perpetuated incorrect approaches to meta-analysis. The use of incorrect statistical methods, perhaps through emulating methods described in previous research, calls conclusions of systematic reviews into question and may lead to inappropriate patient care. We urge systematic review authors and journal editors to seek the advice of experienced statisticians before undertaking or accepting for publication a systematic review and meta-analysis. The author(s) have no proprietary or commercial interest in any materials discussed in this article. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
A simple model for calculating air pollution within street canyons
NASA Astrophysics Data System (ADS)
Venegas, Laura E.; Mazzeo, Nicolás A.; Dezzutti, Mariana C.
2014-04-01
This paper introduces the Semi-Empirical Urban Street (SEUS) model. SEUS is a simple mathematical model based on the scaling of air pollution concentration inside street canyons, employing the emission rate, the width of the canyon, the dispersive velocity scale and the background concentration. The dispersive velocity scale depends on turbulent motions related to wind and traffic. The parameterisations of these turbulent motions include two dimensionless empirical parameters, whose functional forms have been obtained from full-scale data measured in street canyons in four European cities. The sensitivity of the SEUS model is studied analytically. Results show that relative errors in the evaluation of the two dimensionless empirical parameters have less influence on model uncertainties than uncertainties in other input variables. The model estimates NO2 concentrations using a simple photochemistry scheme. SEUS is applied to estimate NOx and NO2 hourly concentrations in an irregular and busy street canyon in the city of Buenos Aires. The statistical evaluation of results shows good agreement between estimated and observed hourly concentrations (e.g., fractional bias is -10.3% for NOx and +7.8% for NO2). The agreement between estimated and observed values has also been analysed in terms of its dependence on wind speed and direction. The model performs better for wind speeds >2 m s-1 than for lower wind speeds, and for leeward situations than for others. No significant discrepancies have been found between the results of the proposed model and those of a widely used operational dispersion model (OSPM), both using the same input information.
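A minimal sketch of the scaling form implied by the abstract (background concentration plus emission rate divided by canyon width and dispersive velocity scale), together with the fractional-bias skill score. The functional form, parameter values, units, and the FB sign convention are all assumptions for illustration; SEUS's actual parameterisation of the dispersive velocity involves the empirical constants described above.

```python
import numpy as np

def canyon_concentration(Q, W, u_d, c_b):
    """Street-canyon concentration from the scaling described above:
    emission rate Q (ug m-1 s-1), canyon width W (m), dispersive
    velocity scale u_d (m s-1), background c_b (ug m-3).
    Illustrative form only."""
    return c_b + Q / (W * u_d)

def fractional_bias(obs, mod):
    # One common convention: FB > 0 means the model over-predicts.
    obs, mod = np.mean(obs), np.mean(mod)
    return 2.0 * (mod - obs) / (mod + obs)

obs = np.array([80.0, 95.0, 110.0])
mod = np.full(3, canyon_concentration(Q=100.0, W=20.0, u_d=0.08, c_b=40.0))
print(mod[0], fractional_bias(obs, mod))
```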
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
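The abstract's details of "reversed inverse regression" are not reproducible from the text, but the two standard approaches it contrasts can be sketched. The simulation below compares classical calibration (fit response on reference values, then invert) with inverse calibration (fit reference on response directly); all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
a, b, sd = 2.0, 0.5, 0.2            # true calibration line y = a + b*x
x_ref = np.linspace(0.0, 10.0, 20)  # reference standards
x0 = 8.0                            # unknown to be estimated

est_classical, est_inverse = [], []
for _ in range(10_000):
    y = a + b * x_ref + rng.normal(0.0, sd, x_ref.size)
    y0 = a + b * x0 + rng.normal(0.0, sd)
    # Classical: fit y on x, then invert the fitted line.
    bb, aa = np.polyfit(x_ref, y, 1)
    est_classical.append((y0 - aa) / bb)
    # Inverse: fit x on y, then predict directly.
    bi, ai = np.polyfit(y, x_ref, 1)
    est_inverse.append(ai + bi * y0)

for name, e in [("classical", est_classical), ("inverse", est_inverse)]:
    e = np.asarray(e)
    print(f"{name:9s} bias={e.mean() - x0:+.4f}  sd={e.std():.4f}")
```

Away from the centre of the calibration range, the inverse estimator shows the shrinkage bias that motivates the search for a more complete approach.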
Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume
2014-06-28
Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as the proportion of fulfilled criteria among the analyses relevant to that study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.
Statistical analyses of commercial vehicle accident factors. Volume 1 Part 1
DOT National Transportation Integrated Search
1978-02-01
Procedures for conducting statistical analyses of commercial vehicle accidents have been established and initially applied. A file of some 3,000 California Highway Patrol accident reports from two areas of California during a period of about one year...
40 CFR 90.712 - Request for public hearing.
Code of Federal Regulations, 2010 CFR
2010-07-01
... sampling plans and statistical analyses have been properly applied (specifically, whether sampling procedures and statistical analyses specified in this subpart were followed and whether there exists a basis... Clerk and will be made available to the public during Agency business hours. ...
Supercomputer use in orthopaedic biomechanics research: focus on functional adaptation of bone.
Hart, R T; Thongpreda, N; Van Buskirk, W C
1988-01-01
The authors describe two biomechanical analyses carried out using numerical methods. One is an analysis of the stress and strain in a human mandible, and the other involves modeling the adaptive response of a sheep bone to mechanical loading. The computing environment required for the two types of analyses is discussed. It is shown that a simple stress analysis of a geometrically complex mandible can be accomplished using a minicomputer. However, more sophisticated analyses of the same model with dynamic loading or nonlinear materials would require supercomputer capabilities. A supercomputer is also required for modeling the adaptive response of living bone, even when simple geometric and material models are used.
Bioreactivity: Studies on a Simple Brain Stem Reflex in Behaving Animals
1988-07-22
neuromodulation, or complex behavioral processes, such as arousal, is finding a simple system that will permit such analyses. The brain stem...systems important in neuromodulation and arousal. Initial pharmacologic studies showed that locally applied norepinephrine facilitated the reflex
Functional constraints on tooth morphology in carnivorous mammals
2012-01-01
Background The range of potential morphologies resulting from evolution is limited by complex interacting processes, ranging from development to function. Quantifying these interactions is important for understanding adaptation and convergent evolution. Using three-dimensional reconstructions of carnivoran and dasyuromorph tooth rows, we compared statistical models of the relationship between tooth row shape and the opposing tooth row, a static feature, as well as measures of mandibular motion during chewing (occlusion), which are kinetic features. This is a new approach to quantifying functional integration because we use measures of movement and displacement, such as the amount the mandible translates laterally during occlusion, as opposed to conventional morphological measures, such as mandible length and geometric landmarks. By sampling two distantly related groups of ecologically similar mammals, we study carnivorous mammals in general rather than a specific group of mammals. Results Statistical model comparisons demonstrate that the best performing models always include some measure of mandibular motion, indicating that functional and statistical models of tooth shape as purely a function of the opposing tooth row are too simple and that increased model complexity provides a better understanding of tooth form. The predictors of the best performing models always included the opposing tooth row shape and a relative linear measure of mandibular motion. Conclusions Our results provide quantitative support of long-standing hypotheses of tooth row shape as being influenced by mandibular motion in addition to the opposing tooth row. Additionally, this study illustrates the utility and necessity of including kinetic features in analyses of morphological integration. PMID:22899809
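The abstract's central move is statistical model comparison: models of tooth-row shape with and without a kinetic predictor. A minimal sketch of that comparison via AIC, with hypothetical variable names (opposing-row shape, lateral mandibular translation) and synthetic data standing in for the real measurements:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 120
# Hypothetical data: tooth-row shape score, opposing-row shape, and a
# kinetic measure (lateral translation of the mandible during occlusion).
df = pd.DataFrame({
    "opposing": rng.normal(0, 1, n),
    "lateral_shift": rng.normal(0, 1, n),
})
df["shape"] = 0.8 * df.opposing + 0.4 * df.lateral_shift + rng.normal(0, 0.5, n)

static = smf.ols("shape ~ opposing", data=df).fit()
kinetic = smf.ols("shape ~ opposing + lateral_shift", data=df).fit()
print(f"static-only AIC  {static.aic:.1f}")
print(f"with motion AIC  {kinetic.aic:.1f}  (lower is better)")
```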
Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.
2011-01-01
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
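GENE-counter's default statistic is an over-parameterized negative binomial test; the sketch below is not that test, only the generic idea of a negative-binomial tail probability for a count, using scipy's (n, p) parameterization derived from an assumed mean and dispersion.

```python
from scipy.stats import nbinom

def nb_tail_pvalue(count, mean, dispersion):
    """Upper-tail p-value of an observed count under a negative binomial
    null with the given mean and dispersion (var = mean + mean**2/dispersion).
    Illustrative of NB-based count testing, not GENE-counter's statistic."""
    n = dispersion
    p = dispersion / (dispersion + mean)
    return nbinom.sf(count - 1, n, p)  # P(X >= count)

# A gene averaging 50 reads under the null (dispersion 5), observed at 120.
print(nb_tail_pvalue(120, mean=50.0, dispersion=5.0))
```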
Supply Chain Collaboration: Information Sharing in a Tactical Operating Environment
2013-06-01
architecture, there are four tiers: Client (Web Application Clients), Presentation (Web-Server), Processing (Application-Server), Data (Database...organization in each period. These data will be collected for analysis. i) Analyses and Validation: we will run statistical tests on these data, Pareto ...notes, outstanding deliveries, and inventory. i) Analyses and Validation: we will run statistical tests on these data, Pareto analyses and confirmation
Research of Extension of the Life Cycle of Helicopter Rotor Blade in Hungary
2003-02-01
Radiography (DXR), and (iii) Vibration Diagnostics (VD) with Statistical Energy Analysis (SEA) were semi-simultaneously applied [1]. The used three...2.2. Vibration Diagnostics (VD): Parallel to the NDT measurements, the Statistical Energy Analysis (SEA) was applied as a vibration diagnostic tool...noises were analysed with a dual-channel real-time frequency analyser (BK2035). In addition to the Statistical Energy Analysis measurement a small
Model for neural signaling leap statistics
NASA Astrophysics Data System (ADS)
Chevrollier, Martine; Oriá, Marcos
2011-03-01
We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed a qualitative change between Normal statistics (with T = 37.5°C, awake regime) and Lévy statistics (T = 35.5°C, sleeping period), the latter characterized by rare events of long-range connections.
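The qualitative contrast between the two statistics can be illustrated by sampling step lengths from a Gaussian and from a heavy-tailed alpha-stable (Lévy) distribution and comparing their tails. The stability index alpha = 1.5 below is an arbitrary assumption, not a value from the paper.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(5)
n = 20_000
gauss = rng.normal(0.0, 1.0, n)                    # "awake" regime
levy = levy_stable.rvs(alpha=1.5, beta=0.0, size=n,
                       random_state=rng)           # "sleeping" regime

# Heavy tails show up as rare but non-negligible very long leaps.
for name, d in [("normal", gauss), ("levy", levy)]:
    print(f"{name:6s} P(|step| > 5) = {np.mean(np.abs(d) > 5):.4f}")
```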
NAUSEA and the Principle of Supplementarity of Damping and Isolation in Noise Control.
1980-02-01
New approaches and uses of the statistical energy analysis (NAUSEA) have been considered and developed in recent months. The advances were made...possible in that the requirement, in the older statistical energy analysis, that the dynamic systems be highly reverberant and the couplings between the...analytical consideration in terms of the statistical energy analysis (SEA). A brief discussion and simple examples that relate to these recent advances
Seasonal ENSO forecasting: Where does a simple model stand amongst other operational ENSO models?
NASA Astrophysics Data System (ADS)
Halide, Halmar
2017-01-01
We apply a simple linear multiple regression model called IndOzy to predict ENSO at up to 7 seasonal lead times. The model uses five predictors built from past seasonal Niño 3.4 ENSO indices, with lags derived from chaos theory, and it is rolling-validated to give one-step-ahead forecasts. Model skill was evaluated against data from the May-June-July (MJJ) 2003 season to the November-December-January (NDJ) 2015/2016 season. Three skill measures were used for forecast verification: Pearson correlation, RMSE, and Euclidean distance. The skill of this simple model was then compared with that of the combined statistical and dynamical models compiled on the IRI (International Research Institute) website. The simple model was capable of producing useful ENSO predictions only up to 3 seasonal leads, while the IRI statistical and dynamical models remained useful up to 4 and 6 seasonal leads, respectively. Even with its short-range prediction skill, however, the simple model still has the potential to provide ENSO-derived tailored products such as probabilistic measures of precipitation and air temperature, both of which affect the occurrence of wild-land fire hot-spots in Sumatera and Kalimantan. It is suggested that, to improve its long-range skill, the simple IndOzy model should incorporate a nonlinear model such as an artificial neural network.
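A hedged sketch of a lagged multiple regression with rolling one-step-ahead validation, in the spirit of the model described. The five predictors here are simply the five most recent values; the actual chaos-theory-derived lags and the real Niño 3.4 record are not reproduced, so a synthetic AR(1)-like index is used.

```python
import numpy as np

def rolling_forecast(index, n_lags=5, train_len=40):
    """One-step-ahead rolling forecast from lagged index values."""
    preds, truth = [], []
    for t in range(train_len + n_lags, len(index)):
        rows = range(n_lags, t)
        X = np.array([index[i - n_lags:i] for i in rows])
        y = index[n_lags:t]
        coef, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
        preds.append(coef[0] + coef[1:] @ index[t - n_lags:t])
        truth.append(index[t])
    preds, truth = np.asarray(preds), np.asarray(truth)
    rmse = np.sqrt(np.mean((preds - truth) ** 2))
    corr = np.corrcoef(preds, truth)[0, 1]
    return rmse, corr

# Synthetic seasonal index purely for demonstration.
rng = np.random.default_rng(2)
x = np.zeros(120)
for t in range(1, 120):
    x[t] = 0.8 * x[t - 1] + rng.normal(0, 0.3)
print(rolling_forecast(x))
```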
Hamel, Jean-Francois; Saulnier, Patrick; Pe, Madeline; Zikos, Efstathios; Musoro, Jammbe; Coens, Corneel; Bottomley, Andrew
2017-09-01
Over the last decades, Health-related Quality of Life (HRQoL) end-points have become an important outcome of the randomised controlled trials (RCTs). HRQoL methodology in RCTs has improved following international consensus recommendations. However, no international recommendations exist concerning the statistical analysis of such data. The aim of our study was to identify and characterise the quality of the statistical methods commonly used for analysing HRQoL data in cancer RCTs. Building on our recently published systematic review, we analysed a total of 33 published RCTs studying the HRQoL methods reported in RCTs since 1991. We focussed on the ability of the methods to deal with the three major problems commonly encountered when analysing HRQoL data: their multidimensional and longitudinal structure and the commonly high rate of missing data. All studies reported HRQoL being assessed repeatedly over time for a period ranging from 2 to 36 months. Missing data were common, with compliance rates ranging from 45% to 90%. From the 33 studies considered, 12 different statistical methods were identified. Twenty-nine studies analysed each of the questionnaire sub-dimensions without type I error adjustment. Thirteen studies repeated the HRQoL analysis at each assessment time again without type I error adjustment. Only 8 studies used methods suitable for repeated measurements. Our findings show a lack of consistency in statistical methods for analysing HRQoL data. Problems related to multiple comparisons were rarely considered leading to a high risk of false positive results. It is therefore critical that international recommendations for improving such statistical practices are developed. Copyright © 2017. Published by Elsevier Ltd.
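One standard remedy for the multiplicity problem the study highlights (testing many sub-dimensions and time points without type I error adjustment) is a family-wise correction such as Holm's. A minimal sketch with hypothetical p-values:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Raw p-values from testing, say, 15 HRQoL sub-dimensions separately.
raw_p = np.array([0.003, 0.011, 0.02, 0.04, 0.045, 0.06, 0.08, 0.12,
                  0.18, 0.25, 0.33, 0.41, 0.55, 0.71, 0.90])

reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
print(adj_p.round(3))
print(f"significant after Holm adjustment: {reject.sum()} of {raw_p.size}")
```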
Sunspot activity and influenza pandemics: a statistical assessment of the purported association.
Towers, S
2017-10-01
Since 1978, a series of papers in the literature has claimed to find a significant association between sunspot activity and the timing of influenza pandemics. This paper examines these analyses, and attempts to recreate the three most recent statistical analyses by Ertel (1994), Tapping et al. (2001), and Yeung (2006), all of which purported to find a significant relationship between sunspot numbers and pandemic influenza. As will be discussed, each analysis had errors in the data. In addition, in each analysis arbitrary selections or assumptions were made, and the authors did not assess the robustness of their analyses to changes in those arbitrary assumptions. Varying the arbitrary assumptions to other, equally valid, assumptions negates the claims of significance. Indeed, an arbitrary selection made in one of the analyses appears to have resulted in almost maximal apparent significance; changing it only slightly yields a null result. This analysis applies statistically rigorous methodology to examine the purported sunspot/pandemic link, using more statistically powerful un-binned analysis methods, rather than relying on arbitrarily binned data. The analyses are repeated using both the Wolf and Group sunspot numbers. In all cases, no statistically significant evidence of any association was found. However, while the focus of this particular analysis was the purported relationship of influenza pandemics to sunspot activity, the faults found in the past analyses are common pitfalls: inattention to analysis reproducibility and robustness assessment is a widespread problem in the sciences that is, unfortunately, noted too rarely in review.
Measuring Student and School Progress with the California API. CSE Technical Report.
ERIC Educational Resources Information Center
Thum, Yeow Meng
This paper focuses on interpreting the major conceptual features of California's Academic Performance Index (API) as a coherent set of statistical procedures. To facilitate a characterization of its statistical properties, the paper casts the index as a simple weighted average of the subjective worth of students' normative performance and presents…
Applying Statistics in the Undergraduate Chemistry Laboratory: Experiments with Food Dyes.
ERIC Educational Resources Information Center
Thomasson, Kathryn; Lofthus-Merschman, Sheila; Humbert, Michelle; Kulevsky, Norman
1998-01-01
Describes several experiments to teach different aspects of the statistical analysis of data using household substances and a simple analysis technique. Each experiment can be performed in three hours. Students learn about treatment of spurious data, application of a pooled variance, linear least-squares fitting, and simultaneous analysis of dyes…
A Critique of One-Tailed Hypothesis Test Procedures in Business and Economics Statistics Textbooks.
ERIC Educational Resources Information Center
Liu, Tung; Stone, Courtenay C.
1999-01-01
Surveys introductory business and economics statistics textbooks and finds that they differ over the best way to explain one-tailed hypothesis tests: the simple null-hypothesis approach or the composite null-hypothesis approach. Argues that the composite null-hypothesis approach contains methodological shortcomings that make it more difficult for…
"Using Power Tables to Compute Statistical Power in Multilevel Experimental Designs"
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2009-01-01
Power computations for one-level experimental designs that assume simple random samples are greatly facilitated by power tables such as those presented in Cohen's book about statistical power analysis. However, in education and the social sciences experimental designs have naturally nested structures and multilevel models are needed to compute the…
Simple Data Sets for Distinct Basic Summary Statistics
ERIC Educational Resources Information Center
Lesser, Lawrence M.
2011-01-01
It is important to avoid ambiguity with numbers because unfortunate choices of numbers can inadvertently make it possible for students to form misconceptions or make it difficult for teachers to tell if students obtained the right answer for the right reason. Therefore, it is important to make sure when introducing basic summary statistics that…
NASA Technical Reports Server (NTRS)
Staubert, R.
1985-01-01
Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.
Re-Thinking Statistics Education for Social Science Majors
ERIC Educational Resources Information Center
Reid, Howard M.; Mason, Susan E.
2008-01-01
Many college students majoring in the social sciences find the required statistics course to be dull as well as difficult. Further, they often do not retain much of the material, which limits their success in subsequent courses. We describe a few simple changes, the incorporation of which may enhance student learning. (Contains 1 table.)
Flipped Statistics Class Results: Better Performance than Lecture over One Year Later
ERIC Educational Resources Information Center
Winquist, Jennifer R.; Carlson, Keith A.
2014-01-01
In this paper, we compare an introductory statistics course taught using a flipped classroom approach to the same course taught using a traditional lecture based approach. In the lecture course, students listened to lecture, took notes, and completed homework assignments. In the flipped course, students read relatively simple chapters and answered…
Assistive Technologies for Second-Year Statistics Students Who Are Blind
ERIC Educational Resources Information Center
Erhardt, Robert J.; Shuman, Michael P.
2015-01-01
At Wake Forest University, a student who is blind enrolled in a second course in statistics. The course covered simple and multiple regression, model diagnostics, model selection, data visualization, and elementary logistic regression. These topics required that the student both interpret and produce three sets of materials: mathematical writing,…
Empirical Reference Distributions for Networks of Different Size
Smith, Anna; Calder, Catherine A.; Browning, Christopher R.
2016-01-01
Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556
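A minimal sketch of the "simple fully parameterized reference distribution" step described above: z-scoring an observed network statistic against Bernoulli (Erdős-Rényi) graphs with the same size and density. The paper's proposed mixture-model reference goes beyond this; the example network and statistic are arbitrary choices.

```python
import networkx as nx
import numpy as np

def adjusted_statistic(G, stat=nx.transitivity, n_ref=500, seed=0):
    """z-score a network statistic against an Erdos-Renyi reference
    with the same number of nodes and the same density."""
    n, p = G.number_of_nodes(), nx.density(G)
    rng = np.random.default_rng(seed)
    ref = [stat(nx.gnp_random_graph(n, p, seed=int(s)))
           for s in rng.integers(0, 2**31, n_ref)]
    return (stat(G) - np.mean(ref)) / np.std(ref)

print(adjusted_statistic(nx.karate_club_graph()))
```

Because the reference matches size and density, the resulting z-scores are more comparable across networks of different sizes than the raw statistics themselves.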
Network topology and functional connectivity disturbances precede the onset of Huntington’s disease
Harrington, Deborah L.; Rubinov, Mikail; Durgerian, Sally; Mourany, Lyla; Reece, Christine; Koenig, Katherine; Bullmore, Ed; Long, Jeffrey D.; Paulsen, Jane S.
2015-01-01
Cognitive, motor and psychiatric changes in prodromal Huntington’s disease have nurtured the emergent need for early interventions. Preventive clinical trials for Huntington’s disease, however, are limited by a shortage of suitable measures that could serve as surrogate outcomes. Measures of intrinsic functional connectivity from resting-state functional magnetic resonance imaging are of keen interest. Yet recent studies suggest circumscribed abnormalities in resting-state functional magnetic resonance imaging connectivity in prodromal Huntington’s disease, despite the spectrum of behavioural changes preceding a manifest diagnosis. The present study used two complementary analytical approaches to examine whole-brain resting-state functional magnetic resonance imaging connectivity in prodromal Huntington’s disease. Network topology was studied using graph theory and simple functional connectivity amongst brain regions was explored using the network-based statistic. Participants consisted of gene-negative controls (n = 16) and prodromal Huntington’s disease individuals (n = 48) with various stages of disease progression to examine the influence of disease burden on intrinsic connectivity. Graph theory analyses showed that global network interconnectivity approximated a random network topology as proximity to diagnosis neared and this was associated with decreased connectivity amongst highly-connected rich-club network hubs, which integrate processing from diverse brain regions. However, functional segregation within the global network (average clustering) was preserved. Functional segregation was also largely maintained at the local level, except for the notable decrease in the diversity of anterior insula intermodular-interconnections (participation coefficient), irrespective of disease burden. In contrast, network-based statistic analyses revealed patterns of weakened frontostriatal connections and strengthened frontal-posterior connections that evolved as disease burden increased. These disturbances were often related to long-range connections involving peripheral nodes and interhemispheric connections. A strong association was found between weaker connectivity and decreased rich-club organization, indicating that whole-brain simple connectivity partially expressed disturbances in the communication of highly-connected hubs. However, network topology and network-based statistic connectivity metrics did not correlate with key markers of executive dysfunction (Stroop Test, Trail Making Test) in prodromal Huntington’s disease, which instead were related to whole-brain connectivity disturbances in nodes (right inferior parietal, right thalamus, left anterior cingulate) that exhibited multiple aberrant connections and that mediate executive control. Altogether, our results show for the first time a largely disease burden-dependent functional reorganization of whole-brain networks in prodromal Huntington’s disease. Both analytic approaches provided a unique window into brain reorganization that was not related to brain atrophy or motor symptoms. Longitudinal studies currently in progress will chart the course of functional changes to determine the most sensitive markers of disease progression. PMID:26059655
Estimating trends in the global mean temperature record
NASA Astrophysics Data System (ADS)
Poppick, Andrew; Moyer, Elisabeth J.; Stein, Michael L.
2017-06-01
Given uncertainties in physical theory and numerical climate simulations, the historical temperature record is often used as a source of empirical information about climate change. Many historical trend analyses appear to de-emphasize physical and statistical assumptions: examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for internal variability in nonparametric rather than parametric ways. However, given a limited data record and the presence of internal variability, estimating radiatively forced temperature trends in the historical record necessarily requires some assumptions. Ostensibly empirical methods can also involve an inherent conflict in assumptions: they require data records that are short enough for naive trend models to be applicable, but long enough for long-timescale internal variability to be accounted for. In the context of global mean temperatures, empirical methods that appear to de-emphasize assumptions can therefore produce misleading inferences, because the trend over the twentieth century is complex and the scale of temporal correlation is long relative to the length of the data record. We illustrate here how a simple but physically motivated trend model can provide better-fitting and more broadly applicable trend estimates and can allow for a wider array of questions to be addressed. In particular, the model allows one to distinguish, within a single statistical framework, between uncertainties in the shorter-term vs. longer-term response to radiative forcing, with implications not only on historical trends but also on uncertainties in future projections. We also investigate the consequence on inferred uncertainties of the choice of a statistical description of internal variability. While nonparametric methods may seem to avoid making explicit assumptions, we demonstrate how even misspecified parametric statistical methods, if attuned to the important characteristics of internal variability, can result in more accurate uncertainty statements about trends.
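A hedged sketch of the physically motivated alternative the abstract advocates: regress temperature on radiative forcing rather than on time, while representing internal variability parametrically with AR(1) errors. The forcing history and temperature series below are synthetic stand-ins, and AR(1) is only one simple choice for the error structure.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
years = np.arange(1880, 2017)
# Stand-in radiative forcing history (smooth, accelerating); synthetic.
forcing = 2.5e-4 * (years - years[0]) ** 1.5
# Synthetic temperature: response to forcing plus AR(1) internal variability.
noise = np.zeros(years.size)
for t in range(1, years.size):
    noise[t] = 0.6 * noise[t - 1] + rng.normal(0, 0.08)
temp = 0.4 * forcing + noise

X = sm.add_constant(forcing)
fit = sm.GLSAR(temp, X, rho=1).iterative_fit(maxiter=20)  # AR(1) errors
print(fit.params, fit.bse)  # forcing sensitivity and its uncertainty
```

The same fit with a naive linear-in-time model and independent errors would understate the trend uncertainty, which is the paper's central caution.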
Davidson, P; Bigerelle, M; Bounichane, B; Giazzon, M; Anselme, K
2010-07-01
Contact guidance is generally evaluated by measuring the orientation angle of cells. However, statistical analyses are rarely performed on these parameters. Here we propose a statistical analysis based on a new parameter sigma, the orientation parameter, defined as the dispersion of the distribution of orientation angles. This parameter can be used to obtain a truncated Gaussian distribution that models the distribution of the data between -90° and +90°. We established a threshold value of the orientation parameter below which the data can be considered to be aligned within a 95% confidence interval. Applying our orientation parameter to cells on grooves and using a modelling approach, we established the relationship sigma = alpha_meas + (52° - alpha_meas)/(1 + C_GDE * R), where the parameter C_GDE represents the sensitivity of cells to groove depth and R the groove depth. The values of C_GDE obtained allowed us to compare the contact guidance of human osteoprogenitor (HOP) cells across experiments involving different groove depths, times in culture and inoculation densities. We demonstrate that HOP cells are able to identify and respond to the presence of grooves 30, 100, 200 and 500 nm deep and that the deeper the grooves, the higher the cell orientation. The evolution of the sensitivity (C_GDE) with culture time is roughly sigmoidal with an asymptote, which is a function of inoculation density. The sigma parameter defined here is a universal parameter that can be applied to all orientation measurements and does not require a mathematical background or knowledge of directional statistics. Copyright 2010 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
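The sensitivity relation above can be fitted directly by nonlinear least squares. The dispersions and groove depths below are hypothetical numbers chosen only to show the fit; the 52° constant comes from the stated relationship.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigma_model(R, alpha_meas, C_gde):
    """Orientation dispersion vs groove depth R, per the relation above."""
    return alpha_meas + (52.0 - alpha_meas) / (1.0 + C_gde * R)

# Hypothetical dispersions (degrees) measured at several groove depths (nm).
R = np.array([30.0, 100.0, 200.0, 500.0])
sigma = np.array([40.0, 28.0, 20.0, 12.0])

(alpha_hat, C_hat), _ = curve_fit(sigma_model, R, sigma, p0=(10.0, 0.01))
print(f"alpha_meas = {alpha_hat:.1f} deg, C_GDE = {C_hat:.4f} per nm")
```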
Prediction during statistical learning, and implications for the implicit/explicit divide
Dale, Rick; Duran, Nicholas D.; Morehead, J. Ryan
2012-01-01
Accounts of statistical learning, both implicit and explicit, often invoke predictive processes as central to learning, yet practically all experiments employ non-predictive measures during training. We argue that the common theoretical assumption of anticipation and prediction needs clearer, more direct evidence for it during learning. We offer a novel experimental context to explore prediction, and report results from a simple sequential learning task designed to promote predictive behaviors in participants as they responded to a short sequence of simple stimulus events. Predictive tendencies in participants were measured using their computer mouse, the trajectories of which served as a means of tapping into predictive behavior while participants were exposed to very short and simple sequences of events. A total of 143 participants were randomly assigned to stimulus sequences along a continuum of regularity. Analysis of computer-mouse trajectories revealed that (a) participants almost always anticipate events in some manner, (b) participants exhibit two stable patterns of behavior, either reacting to vs. predicting future events, (c) the extent to which participants predict relates to performance on a recall test, and (d) explicit reports of perceiving patterns in the brief sequence correlates with extent of prediction. We end with a discussion of implicit and explicit statistical learning and of the role prediction may play in both kinds of learning. PMID:22723817
Grading the neuroendocrine tumors of the lung: an evidence-based proposal.
Rindi, G; Klersy, C; Inzani, F; Fellegara, G; Ampollini, L; Ardizzoni, A; Campanini, N; Carbognani, P; De Pas, T M; Galetta, D; Granone, P L; Righi, L; Rusca, M; Spaggiari, L; Tiseo, M; Viale, G; Volante, M; Papotti, M; Pelosi, G
2014-02-01
Lung neuroendocrine tumors are catalogued in four categories by the World Health Organization (WHO 2004) classification, whose reproducibility and prognostic efficacy have been disputed. The WHO 2010 classification of digestive neuroendocrine neoplasms is based on Ki67 proliferation assessment and has proved prognostically effective. This study aims at comparing these two classifications and at defining a prognostic grading system for lung neuroendocrine tumors. The study included 399 patients who underwent surgery, with at least 1 year of follow-up, between 1989 and 2011. Data on 21 variables were collected, and the performance of grading systems and their components was compared by Cox regression and multivariable analyses. All statistical tests were two-sided. At Cox analysis, WHO 2004 stratified patients into three major groups with statistically significant survival differences (typical carcinoid vs atypical carcinoid (AC), P=0.021; AC vs large-cell/small-cell lung neuroendocrine carcinomas, P<0.001). Optimal discrimination into three groups was observed for Ki67% (cutoffs: G1 <4, G2 4-<25, G3 ≥25; G1 vs G2, P=0.021; G2 vs G3, P≤0.001), mitotic count (G1 ≤2, G2 >2-47, G3 >47; G1 vs G2, P≤0.001; G2 vs G3, P≤0.001), and presence of necrosis (G1 absent, G2 <10% of sample, G3 >10% of sample; G1 vs G2, P≤0.001; G2 vs G3, P≤0.001) at uni- and multivariable analyses. The combination of these three variables resulted in a simple and effective grading system. A three-tier grading system based on Ki67 index, mitotic count, and necrosis, with cutoffs specifically generated for lung neuroendocrine tumors, is prognostically effective and accurate.
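For illustration, the reported cutoffs translate directly into a small grading function. How the three component grades are combined, and the denominator for the mitotic count, are not given in the abstract, so both are explicitly assumptions here.

```python
def lung_net_grade(ki67_pct, mitotic_count, necrosis_pct):
    """Three-tier grade from the cutoffs reported above.

    Assumptions (not stated in the abstract): the mitotic-count
    denominator, and the rule combining the three component grades.
    """
    g_ki67 = 1 if ki67_pct < 4 else (2 if ki67_pct < 25 else 3)
    g_mit = 1 if mitotic_count <= 2 else (2 if mitotic_count <= 47 else 3)
    g_necr = 1 if necrosis_pct == 0 else (2 if necrosis_pct < 10 else 3)
    grades = (g_ki67, g_mit, g_necr)
    # Taking the highest component grade is one simple illustrative rule.
    return max(grades), grades

print(lung_net_grade(ki67_pct=10.0, mitotic_count=5, necrosis_pct=0.0))
```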
Use of simple models to determine wake vortex categories for new aircraft.
DOT National Transportation Integrated Search
2015-06-22
The paper describes how to use simple models and, if needed, sensitivity analyses to determine the wake vortex categories for new aircraft. The methodology provides a tool for the regulators to assess the relative risk of introducing new aircraft int...
Taimoory, S Maryamdokht; Sadraei, S Iraj; Fayoumi, Rose Anne; Nasri, Sarah; Revington, Matthew; Trant, John F
2018-04-20
The reaction between furans and maleimides has increasingly become a method of interest, as its reversibility makes it a useful tool for applications ranging from self-healing materials, to self-immolative polymers, to hydrogels for cell culture and bone repair. However, most of these applications have relied on simple monosubstituted furans and simple maleimides, and have not extensively evaluated the thermal tunability achievable through simple substrate modification. A small library of cycloadducts suitable for the above applications was prepared, and the temperature dependence of the retro-Diels-Alder processes was determined through in situ ¹H NMR analyses complemented by computational calculations. The practical range of the reported systems spans 40 to >110 °C. The cycloreversion reactions are more complex than would be expected from simple trends based on frontier molecular orbital analyses of the materials.
Jin, Zhichao; Yu, Danghui; Zhang, Luoman; Meng, Hong; Lu, Jian; Gao, Qingbin; Cao, Yang; Ma, Xiuqiang; Wu, Cheng; He, Qian; Wang, Rui; He, Jia
2010-01-01
Background: High quality clinical research not only requires advanced professional knowledge, but also needs sound study design and correct statistical analyses. The number of clinical research articles published in Chinese medical journals has increased immensely in the past decade, but study design quality and statistical analyses have remained suboptimal. The aim of this investigation was to gather evidence on the quality of study design and statistical analyses in clinical research conducted in China during the first decade of the new millennium. Methodology/Principal Findings: Ten leading Chinese medical journals were selected, and all original articles published in 1998 (N = 1,335) and 2008 (N = 1,578) were thoroughly categorized and reviewed. A well-defined and validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation. Main outcomes were the frequencies of different types of study design, error/defect proportions in design and statistical analyses, and implementation of CONSORT in randomized clinical trials. From 1998 to 2008: the error/defect proportion in statistical analyses decreased significantly (χ² = 12.03, p<0.001), 59.8% (545/1,335) in 1998 compared to 52.2% (664/1,578) in 2008. The overall error/defect proportion in study design also decreased (χ² = 21.22, p<0.001), 50.9% (680/1,335) compared to 42.4% (669/1,578). In 2008, randomized clinical trials remained in the single digits (3.8%, 60/1,578), with two-thirds showing poor reporting of results (defects in 44 papers, 73.3%). Nearly half of the published studies were retrospective in nature, 49.3% (658/1,335) in 1998 compared to 48.2% (761/1,578) in 2008. Decreases in defect proportions were observed in both results presentation (χ² = 93.26, p<0.001), 92.7% (945/1,019) compared to 78.2% (1023/1,309), and interpretation (χ² = 27.26, p<0.001), 9.7% (99/1,019) compared to 4.3% (56/1,309), although some serious defects persisted. Conclusions/Significance: Chinese medical research seems to have made significant progress regarding statistical analyses, but there remains ample room for improvement regarding study designs. Retrospective clinical studies are the most often used design, whereas randomized clinical trials are rare and often show methodological weaknesses. Urgent implementation of the CONSORT statement is imperative. PMID:20520824
Waardenberg, Ashley J; Basset, Samuel D; Bouveret, Romaric; Harvey, Richard P
2015-09-02
Gene ontology (GO) enrichment is commonly used for inferring biological meaning from systems biology experiments. However, determining differential GO and pathway enrichment between DNA-binding experiments, or using the GO structure to classify experiments, has received little attention. Herein, we present a bioinformatics tool, CompGO, for identifying Differentially Enriched Gene Ontologies (DiEGOs) and pathways through a z-score derivation of log odds ratios, and for visualizing these differences at the GO and pathway level. Through public experimental data focused on the cardiac transcription factor NKX2-5, we illustrate the problems associated with comparing GO enrichments between experiments using a simple overlap approach. We have developed an R/Bioconductor package, CompGO, which implements a statistic normally used in epidemiological studies for performing comparative GO analyses and for visualizing comparisons, accepting .BED data containing genomic coordinates as well as gene lists as inputs. We justify the statistic through inclusion of experimental data and compare it to the commonly used overlap method. CompGO is freely available as an R/Bioconductor package, enabling easy integration into existing pipelines, at http://www.bioconductor.org/packages/release/bioc/html/CompGO.html.
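The epidemiological statistic in question, comparing two log odds ratios by z-score, can be sketched generically. The 2x2-table construction and all counts below are hypothetical, and this is illustrative of the general statistic rather than CompGO's exact implementation.

```python
import numpy as np
from scipy.stats import norm

def log_or_and_se(hits, n_genes, term_hits, term_total):
    """Log odds ratio and its SE for a GO term's enrichment, from the
    2x2 table of gene list vs genome background (illustrative only)."""
    a = term_hits                       # in list, in term
    b = hits - term_hits                # in list, not in term
    c = term_total - term_hits          # in background, in term
    d = n_genes - hits - c              # in background, not in term
    return np.log((a * d) / (b * c)), np.sqrt(1/a + 1/b + 1/c + 1/d)

# Hypothetical counts for the same GO term in two experiments.
lo1, se1 = log_or_and_se(hits=400, n_genes=20000, term_hits=40, term_total=300)
lo2, se2 = log_or_and_se(hits=350, n_genes=20000, term_hits=15, term_total=300)
z = (lo1 - lo2) / np.sqrt(se1**2 + se2**2)
print(f"z = {z:.2f}, p = {2 * norm.sf(abs(z)):.3g}")
```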
NASA Astrophysics Data System (ADS)
Debski, Wojciech
2015-06-01
The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analysed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave-velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. Although estimating the location of earthquake foci is relatively simple, quantitative estimation of the location accuracy is a challenging task, even when a probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling, and a priori uncertainties. In this paper, we address this task when the statistics of observational and/or modelling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we propose an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on the uncertainties of the solution found.
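As a minimal sketch of the meta-characteristic in question: Shannon's entropy of a discretized a posteriori location distribution, which is lower for sharply peaked (well-constrained) posteriors than for diffuse ones. The Gaussian posteriors below are synthetic examples.

```python
import numpy as np

def shannon_entropy(posterior):
    """Shannon entropy (nats) of a discretized a posteriori pdf."""
    p = np.asarray(posterior, dtype=float).ravel()
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# A sharply peaked posterior on a grid has lower entropy (smaller
# location uncertainty) than a diffuse one.
x, y = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))
sharp = np.exp(-(x**2 + y**2) / (2 * 0.05**2))
diffuse = np.exp(-(x**2 + y**2) / (2 * 0.5**2))
print(shannon_entropy(sharp), shannon_entropy(diffuse))
```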
Takayama, Motoharu; Terui, Keita; Oiwa, Yoshitsugu
2012-10-01
Chronic subdural hematoma is common in elderly individuals, and the surgical procedures are simple. The recurrence rate of chronic subdural hematoma, however, varies from 9.2% to 26.5% after surgery. The authors studied factors in the recurrence using univariate and multivariate analyses in patients with chronic subdural hematoma. We retrospectively reviewed 239 consecutive cases of chronic subdural hematoma that received burr-hole surgery with irrigation and closed-system drainage. We analyzed the relationships between recurrence of chronic subdural hematoma and factors such as sex, age, laterality, bleeding tendency, other complicating diseases, density on CT, volume of the hematoma, residual air in the hematoma cavity, and use of artificial cerebrospinal fluid. Twenty-one patients (8.8%) experienced a recurrence of chronic subdural hematoma. Multiple logistic regression found that the recurrence rate was higher in patients with a large volume of residual air, and lower in patients given artificial cerebrospinal fluid. No statistical differences were found in bleeding tendency. Techniques to reduce the air in the hematoma cavity are important for a good outcome in surgery for chronic subdural hematoma. Also, the use of artificial cerebrospinal fluid reduces recurrence of chronic subdural hematoma. The surgical procedures can be the same for patients with bleeding tendencies.
Coral forests diversity in the outer shelf of the south Sardinian continental margin
NASA Astrophysics Data System (ADS)
Cau, Alessandro; Moccia, Davide; Follesa, Maria Cristina; Alvito, Andrea; Canese, Simonepietro; Angiolillo, Michela; Cuccu, Danila; Bo, Marzia; Cannas, Rita
2017-04-01
Ecological theory predicts that heterogeneous habitats allow more species to co-exist in a given area, but to date, knowledge on relationships between habitat heterogeneity and biodiversity of coral forests in the outer shelf and upper slope along continental margins is rather limited. We investigated biodiversity of coral forests from 8 sites spread over two different geomorphological settings (namely, pinnacles vs. canyons) in the outer shelf along Sardinian continental margin. Using a combination of multivariate statistical analyses, we show here that differences in the composition of coral assemblages among contrasting geomorphological settings were not statistically significant, whereas significant differences emerged among sites within similar geomorphologies (i.e. among pinnacles and among canyons). Our results reveal that environmental and bathymetric factors such as sediment coverage, slope of the substrate, terrain ruggedness, bathymetric positioning index and aspect were important drivers of the observed patterns of coral biodiversity, in both settings. Spatial variability of coral forests' biodiversity is affected by environmental factors that act at the scale of each geomorphological setting (i.e. within each pinnacle and canyon) rather than by the contrasting geomorphological settings themselves. This result allows us to suggest that simple categorization of benthic communities according topographically defined habitat is unlikely to be sufficient for addressing conservation purposes.
Yildirim, Aysegul; Akinci, Fevzi; Gozu, Hulya; Sargin, Haluk; Orbay, Ekrem; Sargin, Mehmet
2007-06-01
The aim of this study was to test the validity and reliability of the Turkish version of the diabetes quality of life (DQOL) questionnaire for use with patients with diabetes. The Turkish versions of the generic quality of life (QoL) scale 15D and the DQOL, together with socio-demographic and clinical parameter questionnaires, were administered to 150 patients with type 2 diabetes. Study participants were randomly sampled from the Endocrinology and Diabetes Outpatient Department of Dr. Lutfi Kirdar Kartal Education and Research Hospital in Istanbul, Turkey. The Cronbach alpha coefficient of the overall DQOL scale was 0.89; the Cronbach alpha coefficients of the subscales ranged from 0.80 to 0.94. Distress, discomfort and symptoms, depression, mobility, usual activities, and vitality on the 15D scale had statistically significant correlations with social/vocational worry and diabetes-related worry on the DQOL scale, indicating good convergent validity. Factor analysis identified four subscales: "satisfaction", "impact", "diabetes-related worry", and "social/vocational worry". Statistical analyses showed that the Turkish version of the DQOL is a valid and reliable instrument to measure disease-related QoL in patients with diabetes. It is a simple and quick screening tool, with an administration time of about 15 +/- 5.8 min, for measuring QoL in this population.
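The reliability coefficient used here is standard, so a small sketch of its computation may be useful: Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance of total score). The toy questionnaire data below are purely illustrative.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Toy 5-item questionnaire from 6 respondents (illustrative data only).
scores = np.array([[4, 5, 4, 4, 5],
                   [2, 2, 3, 2, 2],
                   [3, 3, 3, 4, 3],
                   [5, 5, 4, 5, 5],
                   [1, 2, 1, 1, 2],
                   [3, 4, 3, 3, 4]])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```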
NASA Astrophysics Data System (ADS)
Li, H. J.; Wei, F. S.; Feng, X. S.; Xie, Y. Q.
2008-09-01
This paper investigates methods to improve the predictions of Shock Arrival Time (SAT) of the original Shock Propagation Model (SPM). According to the classical blast wave theory adopted in the SPM, the shock propagation speed is determined by the total energy of the original explosion together with the background solar wind speed. Noting that there exists an intrinsic limit to the transit times computed by the SPM for a specified ambient solar wind, we present a statistical analysis of the forecasting capability of the SPM using this intrinsic property. Two facts about the SPM emerge: (1) the error in shock energy estimation is not the only cause of prediction error, and we should not expect the accuracy of the SPM to improve drastically with an exact shock energy input; and (2) there are systematic differences in prediction results both for strong shocks propagating into a slow ambient solar wind and for weak shocks propagating into a fast medium. The statistical analyses indicate the physical details of shock propagation and thus clearly point out directions for future improvement of the SPM. A simple modification is presented here, which shows that there is room for improvement and thus that the original SPM is worthy of further development.
Hutz, Janna E; Nelson, Thomas; Wu, Hua; McAllister, Gregory; Moutsatsos, Ioannis; Jaeger, Savina A; Bandyopadhyay, Somnath; Nigsch, Florian; Cornett, Ben; Jenkins, Jeremy L; Selinger, Douglas W
2013-04-01
Screens using high-throughput, information-rich technologies such as microarrays, high-content screening (HCS), and next-generation sequencing (NGS) have become increasingly widespread. Compared with single-readout assays, these methods produce a more comprehensive picture of the effects of screened treatments. However, interpreting such multidimensional readouts is challenging. Univariate statistics such as t-tests and Z-factors cannot easily be applied to multidimensional profiles, leaving no obvious way to answer common screening questions such as "Is treatment X active in this assay?" and "Is treatment X different from (or equivalent to) treatment Y?" We have developed a simple, straightforward metric, the multidimensional perturbation value (mp-value), which can be used to answer these questions. Here, we demonstrate application of the mp-value to three data sets: a multiplexed gene expression screen of compounds and genomic reagents, a microarray-based gene expression screen of compounds, and an HCS compound screen. In all data sets, active treatments were successfully identified using the mp-value, and simulations and follow-up analyses supported the mp-value's statistical and biological validity. We believe the mp-value represents a promising way to simplify the analysis of multidimensional data while taking full advantage of its richness.
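The published mp-value has its own specific construction; the sketch below is only a simplified illustration of the general idea of permutation-testing a multivariate treatment profile against a label-shuffling null, on synthetic data. It should not be read as the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbation_score(treated, control, n_perm=1000):
    """Permutation p-value for a multivariate treatment effect: distance
    between group mean profiles, compared to a label-shuffling null."""
    X = np.vstack([treated, control])
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # put features on one scale
    n_t = len(treated)

    def stat(X):
        return np.linalg.norm(X[:n_t].mean(axis=0) - X[n_t:].mean(axis=0))

    observed = stat(X)
    null = np.array([stat(rng.permutation(X, axis=0)) for _ in range(n_perm)])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)

treated = rng.normal(0.5, 1, size=(12, 20))   # 12 wells x 20 features, synthetic
control = rng.normal(0.0, 1, size=(12, 20))
print(perturbation_score(treated, control))
```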
Use of Statistical Analyses in the Ophthalmic Literature
Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.
2014-01-01
Purpose: To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design: Cross-sectional study. Methods: All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures: Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally, we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results: Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in the retina and glaucoma subspecialties showed a tendency to use more complex analyses when compared to cornea. Conclusions: Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature. The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977
Measurement and classification of heart and lung sounds by using LabView for educational use.
Altrabsheh, B
2010-01-01
This study presents the design, development and implementation of a simple low-cost method of phonocardiography signal detection. Human heart and lung signals are detected using a simple microphone connected to a personal computer; the signals are recorded and analysed using LabView software. Amplitude and frequency analyses are carried out for various phonocardiography pathological cases. Methods for the automatic classification of normal and abnormal heart sounds, murmurs and lung sounds are presented. Various cases of heart and lung sound measurement are recorded and analysed. The measurements can be saved for further analysis. The method in this study can be used by doctors as a detection aid and may be useful for teaching purposes at medical and nursing schools.
Statistical Mechanics of the US Supreme Court
NASA Astrophysics Data System (ADS)
Lee, Edward D.; Broedersz, Chase P.; Bialek, William
2015-07-01
We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The maximum entropy model consistent with the observed pairwise correlations among justices' votes, an Ising spin glass, agrees quantitatively with the data. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering the intuition that ideologically opposite justices negatively influence one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, organizing the voting patterns in a relatively simple "energy landscape." Besides unanimity, other energy minima in this landscape, or maxima in probability, correspond to prototypical voting states, such as the ideological split or a tightly correlated, conservative core. The model correctly predicts the correlation of justices with the majority and gives us a measure of their influence on the majority decision. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context.
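For orientation, the maximum entropy distribution consistent with observed means and pairwise correlations of binary votes takes the standard Ising form (this is the textbook result the abstract invokes, not a new derivation):

```latex
P(\sigma_1,\dots,\sigma_N) \;=\; \frac{1}{Z}\,
\exp\!\Big(\sum_i h_i \sigma_i \;+\; \sum_{i<j} J_{ij}\,\sigma_i \sigma_j\Big),
\qquad \sigma_i = \pm 1,
```

where each h_i is a local field (a justice's intrinsic bias), the J_ij are pairwise couplings, and Z is the normalization constant.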
Global atmospheric circulation statistics, 1000-1 mb
NASA Technical Reports Server (NTRS)
Randel, William J.
1992-01-01
The atlas presents atmospheric general circulation statistics derived from twelve years (1979-90) of daily National Meteorological Center (NMC) operational geopotential height analyses; it is an update of a prior atlas using data over 1979-1986. These global analyses are available on pressure levels covering 1000-1 mb (approximately 0-50 km). The geopotential grids are a combined product of the Climate Analysis Center (which produces analyses over 70-1 mb) and operational NMC analyses (over 1000-100 mb). Balance horizontal winds and hydrostatic temperatures are derived from the geopotential fields.
Development of the Statistical Reasoning in Biology Concept Inventory (SRBCI)
Deane, Thomas; Nomme, Kathy; Jeffery, Erica; Pollock, Carol; Birol, Gülnur
2016-01-01
We followed established best practices in concept inventory design and developed a 12-item inventory to assess student ability in statistical reasoning in biology (Statistical Reasoning in Biology Concept Inventory [SRBCI]). It is important to assess student thinking in this conceptual area, because it is a fundamental requirement of being statistically literate and associated skills are needed in almost all walks of life. Despite this, previous work shows that non–expert-like thinking in statistical reasoning is common, even after instruction. As science educators, our goal should be to move students along a novice-to-expert spectrum, which could be achieved with growing experience in statistical reasoning. We used item response theory analyses (the one-parameter Rasch model and associated analyses) to assess responses gathered from biology students in two populations at a large research university in Canada in order to test SRBCI’s robustness and sensitivity in capturing useful data relating to the students’ conceptual ability in statistical reasoning. Our analyses indicated that SRBCI is a unidimensional construct, with items that vary widely in difficulty and provide useful information about such student ability. SRBCI should be useful as a diagnostic tool in a variety of biology settings and as a means of measuring the success of teaching interventions designed to improve statistical reasoning skills. PMID:26903497
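The one-parameter Rasch model mentioned above has a standard form, shown here for reference (theta_i is person ability, b_j is item difficulty):

```latex
P(X_{ij} = 1 \mid \theta_i, b_j) \;=\; \frac{e^{\,\theta_i - b_j}}{1 + e^{\,\theta_i - b_j}},
```

so a student answers an item correctly with probability greater than one half exactly when their ability exceeds the item's difficulty.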
An Automated Statistical Process Control Study of Inline Mixing Using Spectrophotometric Detection
ERIC Educational Resources Information Center
Dickey, Michael D.; Stewart, Michael D.; Willson, C. Grant
2006-01-01
An experiment is described, which is designed for a junior-level chemical engineering "fundamentals of measurements and data analysis" course, where students are introduced to the concept of statistical process control (SPC) through a simple inline mixing experiment. The students learn how to create and analyze control charts in an effort to…
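A rough sketch of the control-chart calculation that such a lab exercise involves; the synthetic data and the simplified sigma estimate are illustrative assumptions (classical X-bar charts use range- or s-based control-chart constants rather than a pooled standard deviation).

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical inline-mixing data: 20 samples of 4 absorbance readings each.
samples = rng.normal(loc=0.50, scale=0.02, size=(20, 4))

xbar = samples.mean(axis=1)            # per-sample means
grand_mean = xbar.mean()
sigma = samples.std(ddof=1)            # simplified overall spread estimate
n = samples.shape[1]
ucl = grand_mean + 3 * sigma / np.sqrt(n)   # upper control limit
lcl = grand_mean - 3 * sigma / np.sqrt(n)   # lower control limit

out_of_control = np.where((xbar > ucl) | (xbar < lcl))[0]
print(f"UCL={ucl:.4f}, LCL={lcl:.4f}, flagged samples: {out_of_control}")
```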
Competitive agents in a market: Statistical physics of the minority game
NASA Astrophysics Data System (ADS)
Sherrington, David
2007-10-01
A brief review is presented of the minority game, a simple frustrated many-body system stimulated by considerations of a market of competitive speculative agents. Its cooperative behaviour exhibits phase transitions and both ergodic and non-ergodic regimes. It provides novel challenges to statistical physics, reminiscent of those of mean-field spin glasses.
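A minimal simulation of the minority game, assuming the standard setup (N agents, memory m, S random strategies per agent; whoever is on the minority side wins). The parameter values are arbitrary illustrations, not taken from the review.

```python
import numpy as np

rng = np.random.default_rng(42)
N, m, S, T = 301, 3, 2, 2000       # agents, memory, strategies per agent, rounds
P = 2 ** m                         # number of distinct histories

# Each strategy maps one of P histories to an action in {-1, +1}.
strategies = rng.choice([-1, 1], size=(N, S, P))
scores = np.zeros((N, S))
history = int(rng.integers(P))     # current history, encoded as an integer

attendance = []
for _ in range(T):
    best = scores.argmax(axis=1)                        # each agent's best strategy
    actions = strategies[np.arange(N), best, history]   # chosen actions
    A = actions.sum()                                   # aggregate "attendance"
    attendance.append(A)
    # Strategies that would have chosen the minority side gain a point.
    scores += np.where(strategies[:, :, history] * A < 0, 1, 0)
    winning_bit = 1 if A < 0 else 0                     # minority side updates history
    history = ((history << 1) | winning_bit) % P

print("std of attendance:", np.std(attendance))
```

The standard deviation of the attendance is the quantity whose dependence on P/N exhibits the phase transition between the ergodic and non-ergodic regimes discussed above.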
Secondary Analysis of National Longitudinal Transition Study 2 Data
ERIC Educational Resources Information Center
Hicks, Tyler A.; Knollman, Greg A.
2015-01-01
This review examines published secondary analyses of National Longitudinal Transition Study 2 (NLTS2) data, with a primary focus upon statistical objectives, paradigms, inferences, and methods. Its primary purpose was to determine which statistical techniques have been common in secondary analyses of NLTS2 data. The review begins with an…
A Nonparametric Geostatistical Method For Estimating Species Importance
Andrew J. Lister; Rachel Riemann; Michael Hoppus
2001-01-01
Parametric statistical methods are not always appropriate for conducting spatial analyses of forest inventory data. Parametric geostatistical methods such as variography and kriging are essentially averaging procedures, and thus can be affected by extreme values. Furthermore, non-normal distributions violate the assumptions of analyses in which test statistics are...
ERIC Educational Resources Information Center
Ellis, Barbara G.; Dick, Steven J.
1996-01-01
Employs the statistics-documentation portion of a word-processing program's grammar-check feature together with qualitative analyses to determine that Henry Watterson, long-time editor of the "Louisville Courier-Journal," was probably the South's famed Civil War correspondent "Shadow." (TB)
NASA Astrophysics Data System (ADS)
Ortega-Martinez, Antonio; Padilla-Martinez, Juan Pablo; Franco, Walfre
2016-04-01
The skin contains several fluorescent molecules, or fluorophores, that serve as markers of structure, function and composition. UV fluorescence excitation photography is a simple and effective way to image specific intrinsic fluorophores, such as the one ascribed to tryptophan, which emits at a wavelength of 345 nm upon excitation at 295 nm and is a marker of cellular proliferation. Earlier, we built a clinical UV photography system to image cellular proliferation. In some samples, the naturally low intensity of the fluorescence can make it difficult to separate the fluorescence of cells in higher proliferation states from background fluorescence and other imaging artifacts, like electronic noise. In this work, we describe a statistical image segmentation method to separate the fluorescence of interest. Statistical image segmentation is based on image averaging, background subtraction and pixel statistics. This method allows better quantification of the intensity and surface distributions of fluorescence, which in turn simplifies the detection of borders. Using this method, we delineated the borders of highly proliferative skin conditions and diseases, in particular allergic contact dermatitis, psoriatic lesions and basal cell carcinoma. Segmented images clearly define lesion borders. UV fluorescence excitation photography, along with statistical image segmentation, may serve as a quick and simple diagnostic tool for clinicians.
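A compact sketch of the averaging, background-subtraction and pixel-statistics pipeline described above, run on synthetic frames; the mean + k*std threshold is an assumed rule for illustration, not the authors' exact statistic.

```python
import numpy as np

def segment_fluorescence(frames, background, k=3.0):
    """Segment fluorescence of interest via frame averaging, background
    subtraction and a pixel-statistics threshold (mean + k*std)."""
    avg = frames.mean(axis=0)        # average repeated exposures to suppress noise
    corrected = avg - background     # remove background fluorescence
    corrected[corrected < 0] = 0
    threshold = corrected.mean() + k * corrected.std()
    return corrected > threshold     # boolean lesion mask

# Toy example: 10 noisy 64x64 frames with a bright square "lesion".
rng = np.random.default_rng(7)
truth = np.zeros((64, 64))
truth[20:40, 20:40] = 5.0
frames = truth + rng.normal(1.0, 0.5, size=(10, 64, 64))

mask = segment_fluorescence(frames, background=np.ones((64, 64)))
print("segmented pixels:", mask.sum())
```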
Evaluation of wind field statistics near and inside clouds using a coherent Doppler lidar
NASA Astrophysics Data System (ADS)
Lottman, Brian Todd
1998-09-01
This work proposes advanced techniques for measuring the spatial wind field statistics near and inside clouds using a vertically pointing solid-state coherent Doppler lidar on a fixed ground-based platform. The coherent Doppler lidar is an ideal instrument for high spatial and temporal resolution velocity estimates. The basic parameters of lidar are discussed, including a complete statistical description of the Doppler lidar signal. This description is extended to cases with simple functional forms for aerosol backscatter and velocity. An estimate of the mean velocity over a sensing volume is produced by estimating the mean spectra. There are many traditional spectral estimators, which are useful for conditions with slowly varying velocity and backscatter. A new class of estimators (termed "novel") is introduced that produces reliable velocity estimates for conditions with large variations in aerosol backscatter and velocity with range, such as cloud conditions. The performance of traditional and novel estimators is computed for a variety of deterministic atmospheric conditions using computer-simulated data. Wind field statistics are produced for actual data for a cloud deck and for multilayer clouds. Unique results include the detection of possible spectral signatures for rain, estimates of the structure function inside a cloud deck, reliable velocity estimation techniques near and inside thin clouds, and estimates of simple wind field statistics between cloud layers.
Learning predictive statistics from temporal sequences: Dynamics and strategies
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe
2017-01-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
NASA Technical Reports Server (NTRS)
Generazio, Edward R.
2014-01-01
Unknown risks are introduced into failure critical systems when probability of detection (POD) capabilities are accepted without a complete understanding of the statistical method applied and the interpretation of the statistical results. The presence of this risk in the nondestructive evaluation (NDE) community is revealed in common statements about POD. These statements are often interpreted in a variety of ways and therefore, the very existence of the statements identifies the need for a more comprehensive understanding of POD methodologies. Statistical methodologies have data requirements to be met, procedures to be followed, and requirements for validation or demonstration of adequacy of the POD estimates. Risks are further enhanced due to the wide range of statistical methodologies used for determining the POD capability. Receiver/Relative Operating Characteristics (ROC) Display, simple binomial, logistic regression, and Bayes' rule POD methodologies are widely used in determining POD capability. This work focuses on Hit-Miss data to reveal the framework of the interrelationships between Receiver/Relative Operating Characteristics Display, simple binomial, logistic regression, and Bayes' Rule methodologies as they are applied to POD. Knowledge of these interrelationships leads to an intuitive and global understanding of the statistical data, procedural and validation requirements for establishing credible POD estimates.
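As a hedged example of the logistic-regression POD methodology named above: fit hit/miss outcomes against log flaw size and read off POD(a). The data are invented, and a credible POD demonstration would also report confidence bounds (e.g., the a90/95 size), which this sketch omits.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical hit/miss data: flaw sizes (mm) and detection outcomes (1 = hit).
size = np.array([0.5, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0])
hit  = np.array([0,   0,   0,   1,   0,   1,   1,   1,   1,   1,   1,   1  ])

X = sm.add_constant(np.log(size))        # POD modelled against log flaw size
fit = sm.Logit(hit, X).fit(disp=False)

def pod(a):
    """Estimated probability of detection at flaw size a (mm)."""
    eta = fit.params[0] + fit.params[1] * np.log(a)
    return 1 / (1 + np.exp(-eta))

print(f"POD at 2 mm: {pod(2.0):.2f}")
```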
1993-08-01
subtitled "Simulation Data," consists of detailed infonrnation on the design parmneter variations tested, subsequent statistical analyses conducted...used with confidence during the design process. The data quality can be examined in various forms such as statistical analyses of measure of merit data...merit, such as time to capture or nmaximurn pitch rate, can be calculated from the simulation time history data. Statistical techniques are then used
On the (In)Validity of Tests of Simple Mediation: Threats and Solutions
Pek, Jolynn; Hoyle, Rick H.
2015-01-01
Mediation analysis is a popular framework for identifying underlying mechanisms in social psychology. In the context of simple mediation, we review and discuss the implications of three facets of mediation analysis: (a) conceptualization of the relations between the variables, (b) statistical approaches, and (c) relevant elements of design. We also highlight the issue of equivalent models that are inherent in simple mediation. The extent to which results are meaningful stem directly from choices regarding these three facets of mediation analysis. We conclude by discussing how mediation analysis can be better applied to examine causal processes, highlight the limits of simple mediation, and make recommendations for better practice. PMID:26985234
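For reference, simple mediation is usually written as two regressions whose coefficients decompose the total effect (standard notation, not specific to this paper):

```latex
M = i_1 + aX + e_1, \qquad
Y = i_2 + c'X + bM + e_2, \qquad
c = c' + ab,
```

where ab is the indirect (mediated) effect, c' the direct effect, and c the total effect of X on Y.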
NASA Astrophysics Data System (ADS)
Cornilleau-Wehrlin, Nicole; Bocchialini, Karine; Menvielle, Michel; Chambodut, Aude; Fontaine, Dominique; Grison, Benjamin; Marchaudon, Aurélie; Pick, Monique; Pitout, Frédéric; Schmieder, Brigitte; Régnier, Stéphane; Zouganelis, Yannis
2017-04-01
Taking the 32 sudden storm commencements (SSC) listed by the Observatori de l'Ebre / ISGI over the year 2002 (maximal solar activity) as a starting point, we performed a statistical analysis of the related solar sources, solar wind signatures, and terrestrial responses. For each event, we characterized and identified, as far as possible, (i) the sources on the Sun (coronal mass ejections, CMEs), with the help of a series of criteria (velocities, drag coefficient, radio waves, helicity), as well as (ii) the structure and properties in the interplanetary medium, at L1, of the event associated with the SSC: magnetic clouds (MC), non-MC interplanetary coronal mass ejections (ICME), co-rotating/stream interaction regions (SIR/CIR), shocks only, and unclear events that we call "miscellaneous" events. The observed Sun-to-Earth travel times are compared to those estimated using existing simple models of propagation in the interplanetary medium. This comparison is used to statistically assess the performance of the various models. The geoeffectiveness of the events, classified by category at L1, is analysed through their signatures in the Earth's ionized (magnetosphere and ionosphere) and neutral (thermosphere) environments, using a broad set of in situ, remote and ground-based instrumentation. The role of the presence of a unique or multiple source at the Sun, and of its nature (halo or non-halo CME), is also discussed. The set of observations is statistically analyzed so as to evaluate and compare the geoeffectiveness of the events. The results obtained for this set of geomagnetic storms started by SSCs are compared to the overall statistics of year 2002, relying on already published catalogues of events, allowing us to assess the relevance of our approach (for instance, all 12 well-identified magnetic clouds of 2002 gave rise to SSCs).
A Geometrical Approach to Bell's Theorem
NASA Technical Reports Server (NTRS)
Rubincam, David Parry
2000-01-01
Bell's theorem can be proved through simple geometrical reasoning, without the need for the Psi function, probability distributions, or calculus. The proof is based on N. David Mermin's explication of the Einstein-Podolsky-Rosen-Bohm experiment, which involves Stern-Gerlach detectors that flash red or green lights when detecting spin-up or spin-down. The statistics of local hidden variable theories for this experiment can be arranged in colored strips from which simple inequalities can be deduced. These inequalities lead to a demonstration of Bell's theorem. Moreover, all local hidden variable theories can be graphed in such a way as to enclose their statistics in a pyramid, with the quantum-mechanical result lying a finite distance beneath the base of the pyramid.
A simple test of association for contingency tables with multiple column responses.
Decady, Y J; Thomas, D R
2000-09-01
Loughin and Scherer (1998, Biometrics 54, 630-637) investigated tests of association in two-way tables when one of the categorical variables allows for multiple-category responses from individual respondents. Standard chi-squared tests are invalid in this case, and they developed a bootstrap test procedure that provides good control of test levels under the null hypothesis. This procedure and some others that have been proposed are computationally involved and are based on techniques that are relatively unfamiliar to many practitioners. In this paper, the methods introduced by Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) for analyzing complex survey data are used to develop a simple test based on a corrected chi-squared statistic.
Is complex allometry in field metabolic rates of mammals a statistical artifact?
Packard, Gary C
2017-01-01
Recent reports indicate that field metabolic rates (FMRs) of mammals conform to a pattern of complex allometry in which the exponent in a simple, two-parameter power equation increases steadily as a dependent function of body mass. The reports were based, however, on indirect analyses performed on logarithmic transformations of the original data. I re-examined values for FMR and body mass for 114 species of mammal by the conventional approach to allometric analysis (to illustrate why the approach is unreliable) and by linear and nonlinear regression on untransformed variables (to illustrate the power and versatility of newer analytical methods). The best of the regression models fitted directly to untransformed observations is a three-parameter power equation with multiplicative, lognormal, heteroscedastic error and an allometric exponent of 0.82. The mean function is a good fit to data in graphical display. The significant intercept in the model may simply have gone undetected in prior analyses because conventional allometry assumes implicitly that the intercept is zero; or the intercept may be a spurious finding resulting from bias introduced by the haphazard sampling that underlies "exploratory" analyses like the one reported here. The aforementioned issues can be resolved only by gathering new data specifically intended to address the question of scaling of FMR with body mass in mammals. However, there is no support for the concept of complex allometry in the relationship between FMR and body size in mammals. Copyright © 2016 Elsevier Inc. All rights reserved.
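A sketch of fitting the three-parameter power equation directly to untransformed data with scipy. The synthetic data and the plain least-squares fit are simplifications: the model preferred in the abstract has multiplicative, lognormal, heteroscedastic error, which strictly calls for likelihood-based or weighted fitting.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)

def power3(mass, a, b, c):
    """Three-parameter power equation: FMR = a + b * mass**c."""
    return a + b * mass ** c

# Hypothetical body masses (g) and FMRs (kJ/day), for illustration only.
mass = rng.uniform(10, 1e5, size=114)
fmr = power3(mass, 20.0, 5.0, 0.82) * rng.lognormal(0, 0.15, size=114)

popt, pcov = curve_fit(power3, mass, fmr, p0=(1.0, 1.0, 0.75), maxfev=10000)
print("a, b, c =", np.round(popt, 2))
```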
Maxwell, M; Howie, J G; Pryde, C J
1998-01-01
BACKGROUND: Prescribing matters (particularly budget setting and research into prescribing variation between doctors) have been handicapped by the absence of credible measures of the volume of drugs prescribed. AIM: To use the defined daily dose (DDD) method to study variation in the volume and cost of drugs prescribed across the seven main British National Formulary (BNF) chapters with a view to comparing different methods of setting prescribing budgets. METHOD: Study of one year of prescribing statistics from all 129 general practices in Lothian, covering 808,059 patients: analyses of prescribing statistics for 1995 to define volume and cost/volume of prescribing for one year for 10 groups of practices defined by the age and deprivation status of their patients, for seven BNF chapters; creation of prescribing budgets for 1996 for each individual practice based on the use of target volume and cost statistics; comparison of 1996 DDD-based budgets with those set using the conventional historical approach; and comparison of DDD-based budgets with budgets set using a capitation-based formula derived from local cost/patient information. RESULTS: The volume of drugs prescribed was affected by the age structure of the practices in BNF Chapters 1 (gastrointestinal), 2 (cardiovascular), and 6 (endocrine), and by deprivation structure for BNF Chapters 3 (respiratory) and 4 (central nervous system). Costs per DDD in the major BNF chapters were largely independent of age, deprivation structure, or fundholding status. Capitation and DDD-based budgets were similar to each other, but both differed substantially from historic budgets. One practice in seven gained or lost more than 100,000 Pounds per annum using DDD or capitation budgets compared with historic budgets. The DDD-based budget, but not the capitation-based budget, can be used to set volume-specific prescribing targets. CONCLUSIONS: DDD-based and capitation-based prescribing budgets can be set using a simple explanatory model and generalizable methods. In this study, both differed substantially from historic budgets. DDD budgets could be created to accommodate new prescribing strategies and raised or lowered to reflect local intentions to alter overall prescribing volume or cost targets. We recommend that future work on setting budgets and researching prescribing variations should be based on DDD statistics. PMID:10024703
DIY Soundcard Based Temperature Logging System. Part II: Applications
ERIC Educational Resources Information Center
Nunn, John
2016-01-01
This paper demonstrates some simple applications of how temperature logging systems may be used to monitor simple heat experiments, and how the data obtained can be analysed to get some additional insight into the physical processes. [For "DIY Soundcard Based Temperature Logging System. Part I: Design," see EJ1114124.
Cancer survival: an overview of measures, uses, and interpretation.
Mariotto, Angela B; Noone, Anne-Michelle; Howlader, Nadia; Cho, Hyunsoon; Keel, Gretchen E; Garshell, Jessica; Woloshin, Steven; Schwartz, Lisa M
2014-11-01
Survival statistics are of great interest to patients, clinicians, researchers, and policy makers. Although seemingly simple, survival can be confusing: there are many different survival measures with a plethora of names and statistical methods developed to answer different questions. This paper aims to describe and disseminate different survival measures and their interpretation in less technical language. In addition, we introduce templates to summarize cancer survival statistics organized by their specific purpose: research and policy versus prognosis and clinical decision making. Published by Oxford University Press 2014.
On real statistics of relaxation in gases
NASA Astrophysics Data System (ADS)
Kuzovlev, Yu. E.
2016-02-01
By the example of a particle interacting with an ideal gas, it is shown that the statistics of collisions in statistical mechanics, at any value of the gas rarefaction parameter, qualitatively differ from those associated with Boltzmann's hypothetical molecular chaos and kinetic equation. In reality, the collision probability of the particle is itself random. Because of this, the relaxation of the particle velocity acquires a power-law asymptotic behavior. An estimate of its exponent is suggested on the basis of simple kinematic reasoning.
Pope, Larry M.; Diaz, A.M.
1982-01-01
Quality-of-water data, collected October 21-23, 1980, and a statistical summary are presented for 42 coal-mined strip pits in Crawford and Cherokee Counties, Southeastern Kansas. The statistical summary includes minimum and maximum observed values, mean, and standard deviation. Simple linear regression equations relating specific conductance, dissolved solids, and acidity to concentrations of dissolved solids, sulfate, calcium, magnesium, potassium, aluminum, and iron are also presented. (USGS)
Goyal, Saumitra; Radi, Mohamed Abdel; Ramadan, Islam Karam-allah; Said, Hatem Galal
2016-01-01
Purpose: Arthroscopic skills training outside the operating room may decrease risks and errors by trainee surgeons. There is a need for a simple objective method of evaluating the proficiency and skill of arthroscopy trainees using a simple bench-model arthroscopic simulator. The aim of this study is to correlate motor task performance with level of prior arthroscopic experience and establish benchmarks for training modules. Methods: Twenty orthopaedic surgeons performed a set of tasks to assess a) arthroscopic triangulation, b) navigation, c) object handling and d) meniscus trimming using the SAWBONES "FAST" arthroscopy skills workstation. Time to completion and errors were computed. The subjects were divided into four levels ("Novice", "Beginner", "Intermediate" and "Advanced") based on previous arthroscopy experience, for analyses of performance. Results: Task performance under the transparent dome was not related to the experience of the surgeon, unlike the opaque dome, highlighting the importance of the hand-eye coordination required in arthroscopy. Median time to completion for each task improved as the level of experience increased, and this was statistically significant (p < .05); e.g., time for maze navigation (Novice – 166 s, Beginner – 135.5 s, Intermediate – 100 s, Advanced – 97.5 s), with similar results for all tasks. The majority (>85%) of subjects across all levels reported improvement in performance with sequential tasks. Conclusion: Use of the arthroscope requires visuo-spatial coordination, which is a skill that develops with practice. This simple box model can reliably differentiate arthroscopic skills based on experience and can be used to monitor the progression of trainees' skills in institutions. PMID:27801643
New early warning system for gravity-driven ruptures based on codetection of acoustic signal
NASA Astrophysics Data System (ADS)
Faillettaz, J.
2016-12-01
Gravity-driven rupture phenomena in natural media, e.g. landslides, rockfalls, snow or ice avalanches, represent an important class of natural hazards in mountainous regions. To protect the population against such events, a timely evacuation often constitutes the only effective way to secure the potentially endangered area. However, reliable prediction of the imminence of such failure events remains challenging due to the nonlinear and complex nature of geological material failure, hampered by inherent heterogeneity, unknown initial mechanical state, and complex load application (rainfall, temperature, etc.). Here, a simple method for real-time early warning that considers both the heterogeneity of natural media and the characteristics of acoustic emission attenuation is proposed. This new method capitalizes on the codetection of elastic waves emanating from microcracks by multiple, spatially separated sensors. Event codetection is considered a surrogate for large event size, with more frequent codetected events (i.e., detected concurrently on more than one sensor) marking the imminence of catastrophic failure. A simple numerical model based on a fiber bundle model, considering signal attenuation and hypothetical arrays of sensors, confirms the early warning potential of the codetection principle. Results suggest that although the statistical properties of attenuated signal amplitudes could lead to misleading results, monitoring the emergence of large events announcing impending failure is possible even with attenuated signals, depending on sensor network geometry and detection threshold. Preliminary application of the proposed method to acoustic emissions during failure of snow samples has confirmed the potential use of codetection as an indicator of imminent failure at lab scale. The applicability of such a simple and cheap early warning system is now being investigated at a larger scale (hillslope). First results of such a pilot field experiment are presented and analysed.
NASA Astrophysics Data System (ADS)
Lee, Dong-Sup; Cho, Dae-Seung; Kim, Kookhyun; Jeon, Jae-Jin; Jung, Woo-Jin; Kang, Myeng-Hwan; Kim, Jae-Ho
2015-01-01
Independent Component Analysis (ICA), one of the blind source separation methods, can extract unknown source signals from received signals alone. This is accomplished by finding statistical independence among signal mixtures, and it has been successfully applied in myriad fields such as medical science, image processing, and numerous others. Nevertheless, inherent problems have been reported when using this technique: instability and invalid ordering of the separated signals, particularly when a conventional ICA technique is used for vibratory source signal identification in complex structures. In this study, a simple iterative algorithm based on the conventional ICA is proposed to mitigate these problems. The proposed method extracts more stable source signals with a valid ordering by iteratively reordering the extracted mixing matrix to reconstruct the finally converged source signals, referring to the magnitudes of the correlation coefficients between the intermediately separated signals and signals measured on or near the sources. In order to review the problems of the conventional ICA technique and to validate the proposed method, numerical analyses have been carried out for a virtual response model and a 30 m class submarine model. Moreover, in order to investigate the applicability of the proposed method to real problems in complex structures, an experiment has been carried out on a scaled submarine mockup. The results show that the proposed method can resolve the inherent problems of the conventional ICA technique.
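A simplified, single-pass illustration of the reordering idea, using scikit-learn's FastICA and correlations with reference signals; the authors' method is iterative and more involved, and the reference signals here are the true sources themselves, purely for demonstration.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(5)
t = np.linspace(0, 10, 4000)
sources = np.c_[np.sin(7 * t),                 # sinusoidal source
                np.sign(np.sin(3 * t)),        # square-wave source
                rng.standard_normal(len(t))]   # noise source
X = sources @ rng.normal(size=(3, 3)).T        # unknown mixing

est = FastICA(n_components=3, random_state=0).fit_transform(X)

# Reorder the estimated components by the magnitude of their correlation
# with reference signals measured "on or near" the sources.
corr = np.abs(np.corrcoef(est.T, sources.T)[:3, 3:])
order = corr.argmax(axis=0)
est_ordered = est[:, order]
print("component-to-source |correlation|:\n", np.round(corr, 2))
```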
Fueyo-Díaz, Ricardo; Gascón-Santos, Santiago; Asensio-Martínez, Ángela; Sánchez-Calavera, María Antonia; Magallón-Botaya, Rosa
2016-03-01
A gluten-free diet is to date the only treatment available to celiac disease sufferers. However, systematic reviews indicate that, depending on the method of evaluation used, only 42% to 91% of patients adhere to the diet strictly. Transculturally adapted tools that evaluate adherence beyond simple self-reported questions or invasive analyses are, therefore, important. The aim is to obtain a Spanish transcultural adaptation and validation of Leffler's Celiac Dietary Adherence Test. A two-stage observational cross-sectional study: translation and back-translation by four qualified translators, followed by a validation stage in which the questionnaire was administered to 306 celiac disease patients aged between 12 and 72 years and resident in Aragon. Factorial structure, criterion validity and internal consistency were evaluated. The Spanish version maintained the 7 items in a 3-factor structure. Reliability was very high for all the questions answered, and the floor and ceiling effects were very low (4.3% and 1%, respectively). The Spearman correlations with the self-efficacy and quality-of-life scales and with the self-reported question were statistically significant (p < 0.01). According to the questionnaire criteria, adherence was 72.3%. The Spanish version of the Celiac Dietary Adherence Test shows appropriate psychometric properties and is, therefore, suitable for studying adherence to a gluten-free diet in clinical and research environments.
Using singing to nurture children's hearing? A pilot study.
Welch, Graham F; Saunders, Jo; Edwards, Sian; Palmer, Zoe; Himonides, Evangelos; Knight, Julian; Mahon, Merle; Griffin, Susanna; Vickers, Deborah A
2015-09-01
This article reports a pilot study of the potential benefits of a sustained programme of singing activities on the musical behaviours and hearing acuity of young children with hearing impairment (HI). Twenty-nine children (n=12 HI and n=17 normal-hearing, NH) aged between 5 and 7 years from an inner-city primary school in London participated, following appropriate ethical approval. The predominantly classroom-based programme was designed by colleagues from the UCL Institute of Education and UCL Ear Institute in collaboration with the multi-arts charity Creative Futures and delivered by an experienced early years music specialist weekly across two school terms. There was a particular emphasis on building a repertoire of simple songs with actions and allied vocal exploration. Musical learning was also supported by activities that drew on visual imagery for sound and that included simple notation and physical gesture. An overall impact assessment of the pilot programme embraced pre- and post-intervention measures of pitch discrimination, speech perception in noise and singing competency. Subsequent statistical analyses suggest that the programme had a positive impact on participating children's singing range, particularly (but not only) for HI children with hearing aids, and also on their singing skills. HI children's pitch perception also improved measurably over time. The findings imply that all children, including those with HI, can benefit from regular and sustained access to age-appropriate musical activities.
Statistical palaeomagnetic field modelling and dynamo numerical simulation
NASA Astrophysics Data System (ADS)
Bouligand, C.; Hulot, G.; Khokhlov, A.; Glatzmaier, G. A.
2005-06-01
By relying on two numerical dynamo simulations for which such investigations are possible, we test the validity and sensitivity of a statistical palaeomagnetic field modelling approach known as the giant gaussian process (GGP) modelling approach. This approach is currently used to analyse palaeomagnetic data at times of stable polarity and infer some information about the way the main magnetic field (MF) of the Earth has been behaving in the past and has possibly been influenced by core-mantle boundary (CMB) conditions. One simulation has been run with homogeneous CMB conditions, the other with more realistic non-homogeneous symmetry-breaking CMB conditions. In both simulations, it is found that, as required by the GGP approach, the field behaves as a short-term memory process. Some severe non-stationarity is however found in the non-homogeneous case, leading to very significant departures of the Gauss coefficients from a Gaussian distribution, in contradiction with the assumptions underlying the GGP approach. A similar but less severe non-stationarity is found in the case of the homogeneous simulation, which happens to display a more Earth-like temporal behaviour than the non-homogeneous case. This suggests that a GGP modelling approach could nevertheless be applied to try to estimate the mean μ and covariance matrix γ(τ) (first- and second-order statistical moments) of the field produced by the geodynamo. A detailed study of both simulations is carried out to assess the possibility of detecting statistical symmetry-breaking properties of the underlying dynamo process by inspection of estimates of μ and γ(τ). As expected (because of the role of the rotation of the Earth in the dynamo process), those estimates reveal spherical symmetry-breaking properties. Equatorial symmetry-breaking properties are also detected in both simulations, showing that such symmetry-breaking properties can occur spontaneously under homogeneous CMB conditions. By contrast, axial symmetry breaking is detected only in the non-homogeneous simulation, testifying to the constraints imposed by the CMB conditions. The signature of this axial symmetry breaking is however found to be much weaker than the signature of equatorial symmetry breaking. We note that this could be the reason why only equatorial symmetry-breaking properties (in the form of the well-known axial quadrupole term in the time-averaged field) have unambiguously been found so far by analysing the real data. However, this could also be because those analyses have all assumed too simple a form for γ(τ) when attempting to estimate μ. Suggestions are provided to make sure future attempts at GGP modelling with real data are carried out in a more consistent and perhaps more efficient way.
The analgesic effects of exogenous melatonin in humans.
Andersen, Lars Peter Holst
2016-10-01
The hormone melatonin is produced with a circadian rhythm by the pineal gland in humans. The melatonin rhythm provides an endogenous synchronizer, modulating e.g. blood pressure, body temperature, cortisol rhythm, sleep-wake cycle, immune function and anti-oxidative defence. Interestingly, a number of experimental animal studies demonstrate significant dose-dependent anti-nociceptive effects of exogenous melatonin. Similarly, recent experimental and clinical studies in humans indicate significant analgesic effects. In study I, we systematically reviewed all randomized studies investigating clinical effects of perioperative melatonin. Meta-analyses demonstrated significant analgesic and anxiolytic effects of melatonin in surgical patients, equating to reductions of 20 mm and 19 mm, respectively, on a VAS, compared with placebo. Profound heterogeneity between the included studies was, however, present. In study II, we aimed to investigate the analgesic, anti-hyperalgesic and anti-inflammatory effects of exogenous melatonin in a validated human inflammatory pain model, the human burn model. The study was performed as a randomized, double-blind, placebo-controlled crossover study. Primary outcomes were pain during the burn injury and areas of secondary hyperalgesia. No significant effects of exogenous melatonin were observed with respect to primary or secondary outcomes, compared to placebo. Studies III and IV estimated the pharmacokinetic variables of exogenous melatonin. Oral melatonin demonstrated a tmax value of 41 minutes. The bioavailability of oral melatonin was only 3%. The elimination t1/2 was approximately 45 minutes following both oral and intravenous administration. High-dose intravenous melatonin was not associated with increased sedation, in terms of simple reaction times, compared to placebo. Similarly, no other adverse effects were reported. In study V, we aimed to re-analyse data obtained from a randomized analgesic drug trial using a selection of standard statistical tests. Furthermore, we presented an integrated assessment method for longitudinally measured pain intensity and opioid consumption. Our analyses documented that the statistical method employed affected the statistical significance of post-operative analgesic outcomes. Furthermore, the novel integrated assessment method combines two interdependent outcomes, lowers the risk of type 2 errors, increases the statistical power, and provides a more accurate description of post-operative analgesic efficacy. Exogenous melatonin may offer an effective and safe analgesic drug. At this moment, however, the results of human studies have been contradictory. High-quality randomized experimental and clinical studies are still needed to establish a "genuine" analgesic effect of the drug in humans. Other perioperative effects of exogenous melatonin should also be investigated before melatonin can be introduced for routine clinical use in surgical patients. Despite promising experimental and clinical findings, several unanswered questions also relate to the optimal dosage, timing of administration and administration route of exogenous melatonin.
Detection of greenhouse-gas-induced climatic change. Progress report, July 1, 1994--July 31, 1995
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, P.D.; Wigley, T.M.L.
1995-07-21
The objective of this research is to assemble and analyze instrumental climate data and to develop and apply climate models as a basis for detecting greenhouse-gas-induced climatic change and validating General Circulation Models. In addition to changes due to variations in anthropogenic forcing, including greenhouse gas and aerosol concentration changes, the global climate system exhibits a high degree of internally generated and externally forced natural variability. To detect the anthropogenic effect, its signal must be isolated from the "noise" of this natural climatic variability. A high quality, spatially extensive data base is required to define the noise and its spatial characteristics. To facilitate this, available land and marine data bases will be updated and expanded. The data will be analyzed to determine the potential effects on climate of greenhouse gas and aerosol concentration changes and other factors. Analyses will be guided by a variety of models, from simple energy balance climate models to coupled atmosphere-ocean General Circulation Models. These analyses are oriented towards obtaining early evidence of anthropogenic climatic change that would lead either to confirmation, rejection or modification of model projections, and towards the statistical validation of General Circulation Model control runs and perturbation experiments.
Rowan, Alicia A; McDermott, Máirtín S; Allen, Mark S
2017-12-01
Intention stability is considered to be one of the key prerequisites for a strong association between intention and behaviour. It has been claimed, however, that studies examining the moderating impact of intention stability may be invalid, as they have relied on statistically inferior methods. Residual change scores have been suggested as a more appropriate method of measuring change (or lack thereof) in constructs. The aim of the current study, therefore, was to test whether intention stability, calculated using residual change scores, moderates the intention-physical activity behaviour association. A total of 163 participants (124 women, 39 men) completed questionnaires online at three time points separated by 14-day intervals. The moderating impact of intention stability was assessed using multiple linear regression, followed by simple slopes analyses to identify the direction of any effect. The interaction of intention and intention stability was found to significantly improve the overall model fit. Intentions had a stronger positive association with behaviour when intentions were more stable than when they were more unstable. However, sensitivity analyses revealed that the association was not robust and reduced to non-significance with the removal of potential multivariate outliers. Future research should use residual change scores as the preferred method of assessing intention stability.
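A hedged sketch of the residual change score approach and the subsequent moderation test; the file name and column names are hypothetical, and treating the absolute residual as the instability measure is one plausible operationalization, not necessarily the authors' exact coding.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("activity_study.csv")   # hypothetical file and columns

# Residual change score: regress Time-2 intention on Time-1 intention and
# keep the residuals; a large |residual| indicates an unstable intention.
stability_fit = smf.ols("intention_t2 ~ intention_t1", data=df).fit()
df["instability"] = stability_fit.resid.abs()

# Moderation: does the intention x instability interaction predict behaviour?
mod = smf.ols("behaviour_t3 ~ intention_t2 * instability", data=df).fit()
print(mod.summary())
```

A significant interaction term would then be probed with simple slopes, estimating the intention-behaviour slope at low and high values of instability.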
Teaching meta-analysis using MetaLight.
Thomas, James; Graziosi, Sergio; Higgins, Steve; Coe, Robert; Torgerson, Carole; Newman, Mark
2012-10-18
Meta-analysis is a statistical method for combining the results of primary studies. It is often used in systematic reviews and is increasingly a method and topic that appears in student dissertations. MetaLight is a freely available software application that runs simple meta-analyses and contains specific functionality to facilitate the teaching and learning of meta-analysis. While there are many courses and resources for meta-analysis available and numerous software applications to run meta-analyses, there are few pieces of software which are aimed specifically at helping those teaching and learning meta-analysis. Valuable teaching time can be spent learning the mechanics of a new software application, rather than on the principles and practices of meta-analysis. We discuss ways in which the MetaLight tool can be used to present some of the main issues involved in undertaking and interpreting a meta-analysis. While there are many software tools available for conducting meta-analysis, in the context of a teaching programme such software can require expenditure both in terms of money and in terms of the time it takes to learn how to use it. MetaLight was developed specifically as a tool to facilitate the teaching and learning of meta-analysis and we have presented here some of the ways it might be used in a training situation.
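For teaching purposes, the core of a simple fixed-effect meta-analysis is inverse-variance weighting; these are the standard formulas, not something specific to MetaLight. With study effect estimates y_i and variances v_i:

```latex
w_i = \frac{1}{v_i}, \qquad
\hat{\mu} = \frac{\sum_i w_i\, y_i}{\sum_i w_i}, \qquad
\operatorname{SE}(\hat{\mu}) = \Big(\sum_i w_i\Big)^{-1/2},
```

so more precise studies (smaller v_i) pull the pooled estimate toward themselves.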
Classification of viral zoonosis through receptor pattern analysis.
Bae, Se-Eun; Son, Hyeon Seok
2011-04-13
Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environmental factors. Viruses can be categorized according to genotype (ssDNA, dsDNA, ssRNA and dsRNA viruses). Among them, the RNA viruses exhibit particularly high mutation rates and are especially problematic for this reason. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species. In this study, we sought to predict zoonotic propensity through the analysis of receptor characteristics. We hypothesized that the major barrier to interspecies virus transmission is that receptor sequences vary among species--in other words, that the specific amino acid sequence of the receptor determines the ability of the viral envelope protein to attach to the cell. We analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species and used a statistical discriminant analysis to predict the likelihood of transmission among species. This study is an attempt to predict zoonosis through simple computational analysis of receptor sequence differences. Our method may be useful in predicting the zoonotic potential of newly discovered viral strains.
Inferential Statistics in "Language Teaching Research": A Review and Ways Forward
ERIC Educational Resources Information Center
Lindstromberg, Seth
2016-01-01
This article reviews all (quasi)experimental studies appearing in the first 19 volumes (1997-2015) of "Language Teaching Research" (LTR). Specifically, it provides an overview of how statistical analyses were conducted in these studies and of how the analyses were reported. The overall conclusion is that there has been a tight adherence…
The Effect of Cluster Sampling Design in Survey Research on the Standard Error Statistic.
ERIC Educational Resources Information Center
Wang, Lin; Fan, Xitao
Standard statistical methods are used to analyze data that is assumed to be collected using a simple random sampling scheme. These methods, however, tend to underestimate variance when the data is collected with a cluster design, which is often found in educational survey research. The purposes of this paper are to demonstrate how a cluster design…
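The underestimation described here is conventionally quantified by the design effect. With average cluster size m-bar and intraclass correlation rho, the standard relations are:

```latex
\mathrm{DEFF} = 1 + (\bar{m} - 1)\,\rho, \qquad
\operatorname{SE}_{\mathrm{cluster}} = \sqrt{\mathrm{DEFF}}\;\operatorname{SE}_{\mathrm{SRS}},
```

so even a modest intraclass correlation inflates the true standard error well above the simple-random-sampling value when clusters are large.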
ERIC Educational Resources Information Center
Arbeit, Caren A.; Staklis, Sandra; Horn, Laura
2016-01-01
Statistics in Brief publications present descriptive data in tabular formats to provide useful information to a broad audience, including members of the general public. They address simple and topical issues and questions. This Statistics in Brief profiles the demographic and enrollment characteristics of undergraduates who are immigrants or…
Three lessons for genetic toxicology from baseball analytics.
Dertinger, Stephen D
2017-07-01
In many respects the evolution of baseball statistics mirrors advances made in the field of genetic toxicology. From its inception, baseball and statistics have been inextricably linked. Generations of players and fans have used a number of relatively simple measurements to describe team and individual player's current performance, as well as for historical record-keeping purposes. Over the years, baseball analytics has progressed in several important ways. Early advances were based on deriving more meaningful metrics from simpler forerunners. Now, technological innovations are delivering much deeper insights. Videography, radar, and other advances that include automatic player recognition capabilities provide the means to measure more complex and useful factors. Fielders' reaction times, efficiency of the route taken to reach a batted ball, and pitch-framing effectiveness come to mind. With the current availability of complex measurements from multiple data streams, multifactorial analyses occurring via machine learning algorithms have become necessary to make sense of the terabytes of data that are now being captured in every Major League Baseball game. Collectively, these advances have transformed baseball statistics from being largely descriptive in nature to serving data-driven, predictive roles. Whereas genetic toxicology has charted a somewhat parallel course, a case can be made that greater utilization of baseball's mindset and strategies would serve our scientific field well. This paper describes three useful lessons for genetic toxicology, courtesy of the field of baseball analytics: seek objective knowledge; incorporate multiple data streams; and embrace machine learning. Environ. Mol. Mutagen. 58:390-397, 2017. © 2017 Wiley Periodicals, Inc.
Atan, Doğan; İkincioğulları, Aykut; Köseoğlu, Sabri; Özcan, Kürşat Murat; Çetin, Mehmet Ali; Ensari, Serdar; Dere, Hüseyin
2015-01-01
Background: Bell’s palsy is the most frequent cause of unilateral facial paralysis. Inflammation is thought to play an important role in the pathogenesis of Bell’s palsy. Aims: Neutrophil to lymphocyte ratio (NLR) and platelet to lymphocyte ratio (PLR) are simple and inexpensive tests which are indicative of inflammation and can be calculated by all physicians. The aim of this study was to reveal correlations of Bell’s palsy and degree of paralysis with NLR and PLR. Study Design: Case-control study. Methods: The retrospective study was performed January 2010 and December 2013. Ninety-nine patients diagnosed as Bell’s palsy were included in the Bell’s palsy group and ninety-nine healthy individuals with the same demographic characteristics as the Bell’s palsy group were included in the control group. As a result of analyses, NLR and PLR were calculated. Results: The mean NLR was 4.37 in the Bell’s palsy group and 1.89 in the control group with a statistically significant difference (p<0.001). The mean PLR was 137.5 in the Bell’s palsy group and 113.75 in the control group with a statistically significant difference (p=0.008). No statistically significant relation was detected between the degree of facial paralysis and NLR and PLR. Conclusion: The NLR and the PLR were significantly higher in patients with Bell’s palsy. This is the first study to reveal a relation between Bell’s palsy and PLR. NLR and PLR can be used as auxiliary parameters in the diagnosis of Bell’s palsy. PMID:26167340
Missing CD4+ cell response in randomized clinical trials of maraviroc and dolutegravir.
Cuffe, Robert; Barnett, Carly; Granier, Catherine; Machida, Mitsuaki; Wang, Cunshan; Roger, James
2015-10-01
Missing data can compromise inferences from clinical trials, yet the topic has received little attention in the clinical trial community. Shortcomings in methods commonly used to analyze studies with missing data (complete case, last- or baseline-observation carried forward) have been highlighted in a recent Food and Drug Administration-sponsored report. This report recommends how to mitigate the issues associated with missing data. We present an example of the proposed concepts using data from recent clinical trials. CD4+ cell count data from the previously reported SINGLE and MOTIVATE studies of dolutegravir and maraviroc were analyzed using a variety of statistical methods to explore the impact of missing data. Four methodologies were used: complete case analysis, simple imputation, mixed models for repeated measures, and multiple imputation. We compared the sensitivity of conclusions to the volume of missing data and to the assumptions underpinning each method. Rates of missing data were greater in the MOTIVATE studies (35%-68% premature withdrawal) than in SINGLE (12%-20%). The sensitivity of results to assumptions about missing data was related to volume of missing data. Estimates of treatment differences by various analysis methods ranged across a 61 cells/mm3 window in MOTIVATE and a 22 cells/mm3 window in SINGLE. Where missing data are anticipated, analyses require robust statistical and clinical debate of the necessary but unverifiable underlying statistical assumptions. Multiple imputation makes these assumptions transparent, can accommodate a broad range of scenarios, and is a natural analysis for clinical trials in HIV with missing data.
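To make the multiple-imputation idea concrete, here is a minimal sketch using scikit-learn's IterativeImputer with posterior sampling on a synthetic CD4+ layout; the data, the dropout mechanism, and pooling only the point estimates are illustrative assumptions, not the trial analyses.

```python
# A minimal sketch of multiple imputation for a longitudinal CD4+ outcome;
# not the trial's actual analysis code.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)
# Synthetic data: baseline, week-24, week-48 CD4 (cells/mm^3),
# with dropout producing missing final visits.
X = rng.normal([350, 420, 470], 60, size=(200, 3))
dropout = rng.random(200) < 0.3
X[dropout, 2] = np.nan                     # withdrew before week 48

# Draw m imputed datasets and pool the week-48 mean (Rubin's first step)
m = 20
estimates = []
for seed in range(m):
    imp = IterativeImputer(sample_posterior=True, random_state=seed)
    completed = imp.fit_transform(X)
    estimates.append(completed[:, 2].mean())

print(f"pooled week-48 mean: {np.mean(estimates):.1f} cells/mm^3 "
      f"(between-imputation SD {np.std(estimates, ddof=1):.2f})")
```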
Detecting trends in raptor counts: power and type I error rates of various statistical tests
Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.
1996-01-01
We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
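The simulation design lends itself to a compact sketch: the snippet below simulates exponential trends with lognormal sampling error (CV = 40%) and estimates the power of the log-scale regression t-test as a rejection rate. The autocorrelated variant and the other six tests are omitted, and all settings are illustrative.

```python
# Simplified re-creation of the simulation design: lognormal noise around
# an exponential trend, regression on log counts, power = rejection rate.
import numpy as np
from scipy import stats

def power(trend_pct, n_years, n_reps=1000, alpha=0.05, cv=0.4, seed=0):
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.log(1 + cv**2))   # lognormal sigma for a given CV
    years = np.arange(n_years)
    rejections = 0
    for _ in range(n_reps):
        mean_counts = 100 * (1 + trend_pct / 100) ** years
        counts = mean_counts * rng.lognormal(-sigma**2 / 2, sigma, n_years)
        res = stats.linregress(years, np.log(counts))
        if res.pvalue < alpha:           # two-tailed t-test on the slope
            rejections += 1
    return rejections / n_reps

for n in (10, 50):
    print(f"n = {n} years, 5%/yr trend: power ~ {power(5.0, n):.2f}")
```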
Winkler, Robert
2015-01-01
In biological mass spectrometry, crude instrumental data need to be converted into meaningful theoretical models. Several data processing and data evaluation steps are required to come to the final results. These operations are often difficult to reproduce because of overly specific computing platforms. This effect, known as 'workflow decay', can be diminished by using a standardized informatic infrastructure. Thus, we compiled an integrated platform, which contains ready-to-use tools and workflows for mass spectrometry data analysis. Apart from general unit operations, such as peak picking and identification of proteins and metabolites, we put a strong emphasis on the statistical validation of results and Data Mining. MASSyPup64 includes, e.g., the OpenMS/TOPPAS framework, the Trans-Proteomic-Pipeline programs, the ProteoWizard tools, X!Tandem, Comet and SpiderMass. The statistical computing language R is installed with packages for MS data analyses, such as XCMS/metaXCMS and MetabR. The R package Rattle provides user-friendly access to multiple Data Mining methods. Further, we added the non-conventional spreadsheet program teapot for editing large data sets and a command line tool for transposing large matrices. Individual programs, console commands and modules can be integrated using the Workflow Management System (WMS) Taverna. We explain the useful combination of the tools by practical examples: (1) a workflow for protein identification and validation, with subsequent Association Analysis of peptides, (2) cluster analysis and Data Mining in targeted Metabolomics, and (3) raw data processing, Data Mining and identification of metabolites in untargeted Metabolomics. Association Analyses reveal relationships between variables across different sample sets. We present its application for finding co-occurring peptides, which can be used for targeted proteomics, the discovery of alternative biomarkers and protein-protein interactions. Data Mining-derived models displayed higher robustness and accuracy for classifying sample groups in targeted Metabolomics than cluster analyses. Random Forest models not only provide predictive models, which can be deployed for new data sets, but also the variable importance. We demonstrate that the latter is especially useful for tracking down significant signals and affected pathways in untargeted Metabolomics. Thus, Random Forest modeling supports the unbiased search for relevant biological features in Metabolomics. Our results clearly manifest the importance of Data Mining methods to disclose non-obvious information in biological mass spectrometry. The application of a Workflow Management System and the integration of all required programs and data in a consistent platform make the presented data analysis strategies reproducible for non-expert users. The simple remastering process and the Open Source licenses of MASSyPup64 (http://www.bioprocess.org/massypup/) enable the continuous improvement of the system.
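As a hint of the Random Forest usage described above, this sketch fits a classifier to a synthetic metabolite matrix and ranks features by importance; it uses scikit-learn rather than the R/Rattle stack that MASSyPup64 actually ships, and the data are invented.

```python
# Toy Random Forest on a targeted-metabolomics-style matrix: classify two
# sample groups and rank metabolites by variable importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n_samples, n_metabolites = 60, 50
X = rng.normal(size=(n_samples, n_metabolites))
y = rng.integers(0, 2, n_samples)        # two sample groups
X[y == 1, 7] += 2.0                      # metabolite 7 separates the groups

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("most informative metabolites:", top)
```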
Contrasting Views of Complexity and Their Implications For Network-Centric Infrastructures
2010-07-04
problem) can be undecidable. While two gravitationally interacting bodies yield simple orbits, Poincaré showed that the motion of even three... statistical mechanics are valid only when the [billiard] balls are distributed, in their positions and motions, in a helter-skelter, i.e., a disorganized... Rube Goldberg, whose famous cartoons depict "comically involved complicated invention[s], laboriously contrived to perform a simple operation" [68
Black swans or dragon-kings? A simple test for deviations from the power law
NASA Astrophysics Data System (ADS)
Janczura, J.; Weron, R.
2012-05-01
We develop a simple test for deviations from power-law tails, and indeed from the tails of any distribution. We use this test - which is based on the asymptotic properties of the empirical distribution function - to answer the question of whether great natural disasters, financial crashes or electricity price spikes should be classified as dragon-kings or `only' as black swans.
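In the same spirit (though not the authors' exact EDF-based statistic), a generic tail check can be sketched as follows: fit a power-law exponent to threshold exceedances with the Hill estimator and test the fit with a Kolmogorov-Smirnov statistic. The threshold choice and the use of KS with an estimated exponent are simplifying assumptions.

```python
# A hedged sketch of a tail-deviation check, not the paper's exact test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = np.abs(rng.standard_cauchy(5000))    # a heavy-tailed sample

u = np.quantile(x, 0.95)                 # tail threshold
tail = x[x > u]
alpha_hat = len(tail) / np.sum(np.log(tail / u))   # Hill estimator

# Under a pure power law, exceedances are Pareto(alpha, scale=u).
# Note: estimating alpha from the same data biases the KS p-value.
ks_stat, p_value = stats.kstest(tail, "pareto", args=(alpha_hat, 0, u))
print(f"alpha ~ {alpha_hat:.2f}, KS p = {p_value:.3f} "
      "(small p would suggest dragon-king-like deviations)")
```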
Masquerade Detection Using a Taxonomy-Based Multinomial Modeling Approach in UNIX Systems
2008-08-25
primarily the modeling of statistical features, such as the frequency of events, the duration of events, the co-occurrence of multiple events... are identified, we can extract features representing such behavior while auditing the user's behavior. Figure 1: Taxonomy of Linux and Unix... achieved when the features are extracted just from simple commands. Method / Hit Rate / False Positive Rate: ocSVM using simple cmds (freq.-based
Rainfall runoff modelling of the Upper Ganga and Brahmaputra basins using PERSiST.
Futter, M N; Whitehead, P G; Sarkar, S; Rodda, H; Crossman, J
2015-06-01
There are ongoing discussions about the appropriate level of complexity and sources of uncertainty in rainfall runoff models. Simulations for operational hydrology, flood forecasting or nutrient transport all warrant different levels of complexity in the modelling approach. More complex model structures are appropriate for simulations of land-cover dependent nutrient transport while more parsimonious model structures may be adequate for runoff simulation. The appropriate level of complexity is also dependent on data availability. Here, we use PERSiST; a simple, semi-distributed dynamic rainfall-runoff modelling toolkit to simulate flows in the Upper Ganges and Brahmaputra rivers. We present two sets of simulations driven by single time series of daily precipitation and temperature using simple (A) and complex (B) model structures based on uniform and hydrochemically relevant land covers respectively. Models were compared based on ensembles of Bayesian Information Criterion (BIC) statistics. Equifinality was observed for parameters but not for model structures. Model performance was better for the more complex (B) structural representations than for parsimonious model structures. The results show that structural uncertainty is more important than parameter uncertainty. The ensembles of BIC statistics suggested that neither structural representation was preferable in a statistical sense. Simulations presented here confirm that relatively simple models with limited data requirements can be used to credibly simulate flows and water balance components needed for nutrient flux modelling in large, data-poor basins.
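A toy version of the BIC comparison reads as follows, assuming Gaussian errors and polynomial stand-ins for the parsimonious (A) and complex (B) structures; PERSiST itself compares full rainfall-runoff model structures, so everything here is illustrative.

```python
# Illustrative BIC comparison of a simple vs. a more complex model.
import numpy as np

def bic(y, y_hat, k):
    """Gaussian BIC: n*ln(RSS/n) + k*ln(n); lower is better."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(4)
t = np.linspace(0, 10, 200)
flow = 5 + 2 * np.sin(t) + rng.normal(0, 0.5, t.size)   # synthetic "flow"

model_a = np.polyval(np.polyfit(t, flow, 2), t)   # structure A: 3 params
model_b = np.polyval(np.polyfit(t, flow, 9), t)   # structure B: 10 params

print(f"BIC A = {bic(flow, model_a, 3):.1f}")
print(f"BIC B = {bic(flow, model_b, 10):.1f}")
```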
Martins, Kelly Vasconcelos Chaves; Gil, Daniela
2017-01-01
Introduction The registry of the P1 component of the cortical auditory evoked potential has been widely used to analyze the behavior of auditory pathways in response to cochlear implant stimulation. Objective To determine the influence of aural rehabilitation on the latency and amplitude parameters of the P1 cortical auditory evoked potential component elicited by simple auditory stimuli (tone burst) and complex stimuli (speech) in children with cochlear implants. Method The study included six individuals of both genders, aged 5 to 10 years, who had been cochlear implant users for at least 12 months and attended auditory rehabilitation with an aural rehabilitation therapy approach. Participants underwent cortical auditory evoked potential testing at the beginning of the study and after 3 months of aural rehabilitation. To elicit the responses, simple stimuli (tone burst) and complex stimuli (speech) were presented in free field at 70 dB HL. The results were statistically analyzed, and both evaluations were compared. Results There was no significant difference between the two types of eliciting stimulus for the latency or the amplitude of P1. There was a statistically significant difference in P1 latency between the evaluations for both stimuli, with reduced latency in the second evaluation after 3 months of auditory rehabilitation. There was no statistically significant difference in the amplitude of P1 between the two types of stimuli or between the two evaluations. Conclusion A decrease in the latency of the P1 component elicited by both simple and complex stimuli was observed within a three-month interval in children with cochlear implants undergoing aural rehabilitation. PMID:29018498
Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels
Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J.
2014-01-01
This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively “hiding” its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research. PMID:25505378
Kaplan, Metin; Erol, Fatih Serhat; Bozgeyik, Zülküf; Koparan, Mehmet
2007-07-01
In the present study, the clinical effectiveness of a surgical procedure in which no draining tubes are installed following simple burr hole drainage and saline irrigation is investigated. Ten patients who had undergone operative intervention for unilateral chronic subdural hemorrhage, with a clinical grade of 2 and a hemorrhage thickness of 2 cm, were included in the study. The cerebral blood flow rates of the middle cerebral artery (MCA) were evaluated bilaterally with Doppler before and after the surgery. All the cases underwent the operation using the simple burr hole drainage technique without a drain and with consequent saline irrigation. Statistical analysis was performed with the Wilcoxon signed-rank test (p<0.05). There was a pronounced decrease in the preoperative MCA blood flow in the hemisphere in which the hemorrhage had occurred (p=0.008), and the pulsatility index (PI) was elevated on the side of the hemorrhage (p=0.005). Postoperative MCA blood flow measurements showed a statistically significant improvement (p=0.005), and the PI value normalized (p<0.05). The paresis and the level of consciousness improved in all cases. The simple burr hole drainage technique is sufficient for the improvement of cerebral blood flow and clinical recovery in patients with chronic subdural hemorrhage.
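The paired pre/post comparison named above is straightforward to reproduce in outline; the sketch below runs a Wilcoxon signed-rank test on synthetic MCA flow velocities (the units, sample values, and effect size are invented).

```python
# Minimal sketch of a paired pre/post Wilcoxon signed-rank comparison.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
pre = rng.normal(45, 6, 10)            # pre-operative MCA flow (cm/s)
post = pre + rng.normal(8, 3, 10)      # post-operative improvement

stat, p = stats.wilcoxon(pre, post)    # paired, non-parametric
print(f"Wilcoxon signed-rank: W = {stat:.1f}, p = {p:.3f}")
```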
Jaryum, Kiri H; Okoye, Zebulon Sunday C; Stoecker, Barbara
2018-06-01
Nutritional deficiencies of trace elements are among the top ten causes of death in sub-Saharan Africa. In the Kanam Local Government Area of Nigeria, the problem is compounded by high levels of poverty and illiteracy. Abnormally low hair zinc levels are important, sensitive diagnostic biochemical indices of zinc deficiency. The purpose of this study was to assess the zinc status of children under 5 years in Kanam Local Government Area, north-central Nigeria, by measuring the zinc level in hair samples collected from 44 under-5 children across the area. A household survey was conducted by questionnaire to assess the pattern and frequency of consumption of zinc-rich foods. Hair samples were analysed for zinc content by inductively coupled plasma mass spectrometry (ICP-MS). The data were analysed statistically using Student's t test, the z test, and Pearson correlation, while questionnaire-captured data were analysed by simple arithmetic. The analyses showed that the average hair zinc level was 74.35 ± 48.05 μg/g. This was below the normal range of 130-140 μg/g for children under 5 years. Based on the results, 86.36% had hair zinc levels below the lower limit of the normal range of 130 μg/g. Between the genders, boys had higher hair zinc content than girls. Data from the questionnaire showed that 53.45% of the population studied had poor or inadequate intake of zinc-rich foods of animal origin, a dietary behaviour reported to predispose to micronutrient deficiency, including zinc deficiency.
Cusack, Rhodri; Vicente-Grabovetsky, Alejandro; Mitchell, Daniel J; Wild, Conor J; Auer, Tibor; Linke, Annika C; Peelle, Jonathan E
2014-01-01
Recent years have seen neuroimaging data sets becoming richer, with larger cohorts of participants, a greater variety of acquisition techniques, and increasingly complex analyses. These advances have made data analysis pipelines complicated to set up and run (increasing the risk of human error) and time consuming to execute (restricting what analyses are attempted). Here we present an open-source framework, automatic analysis (aa), to address these concerns. Human efficiency is increased by making code modular and reusable, and managing its execution with a processing engine that tracks what has been completed and what needs to be (re)done. Analysis is accelerated by optional parallel processing of independent tasks on cluster or cloud computing resources. A pipeline comprises a series of modules that each perform a specific task. The processing engine keeps track of the data, calculating a map of upstream and downstream dependencies for each module. Existing modules are available for many analysis tasks, such as SPM-based fMRI preprocessing, individual and group level statistics, voxel-based morphometry, tractography, and multi-voxel pattern analyses (MVPA). However, aa also allows for full customization, and encourages efficient management of code: new modules may be written with only a small code overhead. aa has been used by more than 50 researchers in hundreds of neuroimaging studies comprising thousands of subjects. It has been found to be robust, fast, and efficient, from simple single-subject studies up to multimodal pipelines on hundreds of subjects. It is attractive to both novice and experienced users. aa can reduce the amount of time neuroimaging laboratories spend performing analyses and reduce errors, expanding the range of scientific questions it is practical to address.
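The dependency-tracking idea at the core of such an engine can be sketched in a few lines: modules declare inputs and outputs, and completed stages are skipped on re-runs. This toy is far simpler than aa itself, which is a MATLAB framework; all names here are invented.

```python
# Toy processing engine: run each module once, skip completed stages.
from typing import Callable

done: dict[str, object] = {"raw": "raw_data"}

def run(pipeline: list[tuple[str, list[str], Callable]]) -> None:
    for output, inputs, fn in pipeline:      # assumed topologically sorted
        if output in done:
            print(f"skip {output} (already computed)")
            continue
        done[output] = fn(*[done[i] for i in inputs])
        print(f"ran  {output}")

pipeline = [
    ("preprocessed", ["raw"], lambda r: f"pp({r})"),
    ("first_level", ["preprocessed"], lambda p: f"glm({p})"),
    ("group_stats", ["first_level"], lambda g: f"group({g})"),
]
run(pipeline)   # runs every stage
run(pipeline)   # skips every completed stage
```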
Lie, Stein Atle; Eriksen, Hege R; Ursin, Holger; Hagen, Eli Molde
2008-05-01
Analysing and presenting data on different outcomes after sick-leave is challenging. The use of extended statistical methods supplies additional information and allows further exploitation of data. Four hundred and fifty-seven patients, sick-listed for 8-12 weeks for low back pain, were randomized to intervention (n=237) or control (n=220). Outcome was measured as "sick-listed", "returned to work", or "disability pension". The individuals shifted between the three states between one and 22 times (mean 6.4 times). In a multi-state model, shifting between the states was set up in a transition intensity matrix. The probability of being in any of the states was calculated as a transition probability matrix. The effects of the intervention were modelled using a non-parametric model. There was an effect of the intervention for leaving the state "sick-listed" and shifting to "returned to work" (relative risk (RR)=1.27, 95% confidence interval (CI) 1.09-1.47). The non-parametric estimates showed an effect of the intervention for leaving "sick-listed" and shifting to "returned to work" in the first 6 months. We found a protective effect of the intervention against shifting back to "sick-listed" between 6 and 18 months. The analyses showed that the probability of staying in the state "returned to work" was not different between the intervention and control groups at the end of the follow-up (3 years). We demonstrate that these alternative analyses give additional results and increase the strength of the analyses. The simple intervention did not decrease the probability of being on sick-leave in the long term; however, it decreased the time that individuals were on sick-leave.
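The multi-state machinery can be illustrated with an invented transition intensity matrix Q for the three states, converted to transition probabilities via the matrix exponential P(t) = exp(Qt); the rates below are placeholders, not estimates from the trial.

```python
# Sketch: transition intensities -> transition probabilities, P(t) = expm(Qt).
import numpy as np
from scipy.linalg import expm

# Rows/cols: 0 = sick-listed, 1 = returned to work, 2 = disability pension.
# Off-diagonal entries are monthly transition rates; each row sums to zero.
Q = np.array([[-0.25, 0.20, 0.05],
              [ 0.10, -0.10, 0.00],
              [ 0.00, 0.00, 0.00]])   # disability pension treated as absorbing

P6 = expm(Q * 6)                      # six-month transition probabilities
print("P(sick-listed -> returned to work within 6 months) ="
      f" {P6[0, 1]:.2f}")
```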
A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L
2014-01-01
We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.
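For reference, the usual between-groups d that the single-case statistic is designed to match can be written as a plain function; the paper's single-case estimator involves additional variance corrections that are not reproduced here, and the data below are invented.

```python
# The standard between-groups d with a pooled standard deviation.
import numpy as np

def cohens_d(x, y):
    """Standardised mean difference with pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

treatment = [12.1, 14.3, 13.8, 15.0, 12.9]
control = [10.2, 11.5, 9.8, 12.0, 10.9]
print(f"d = {cohens_d(treatment, control):.2f}")
```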
Aksoy, S; Erdil, I; Hocaoglu, E; Inci, E; Adas, G T; Kemik, O; Turkay, R
2018-02-01
Simple and hydatid cysts of the liver are a common health problem in Turkey. The aim of the study was to differentiate different types of hydatid cysts from simple cysts by using diffusion-weighted images. In total, 37 hydatid cysts and 36 simple cysts in the liver were diagnosed. We retrospectively reviewed the medical records of the patients who had undergone both ultrasonography and magnetic resonance imaging. We measured apparent diffusion coefficient (ADC) values of all the cysts and then compared the findings. There was no statistically significant difference between the ADC values of simple cysts and type 1 hydatid cysts. For the other types of hydatid cysts, however, it was possible to differentiate hydatid cysts from simple cysts using the ADC values. Although in our study we could not differentiate between type 1 hydatid cysts and simple cysts in the liver, diffusion-weighted images are very useful for differentiating the other types of hydatid cysts from simple cysts using the ADC values.
Motivation and Burnout in Professional Rugby Players
ERIC Educational Resources Information Center
Cresswell, Scott L.; Eklund, Robert C.
2005-01-01
The authors examined relationships among burnout and motivational types differing in self-determination, using simple correlational analyses as well as more sophisticated canonical correlation analyses allowing for the simultaneous examination of multiple independent and dependent variables. The authors hypothesized that: (a) motivation low in…
ERIC Educational Resources Information Center
Thompson, Bruce; Melancon, Janet G.
Effect sizes have been increasingly emphasized in research as more researchers have recognized that: (1) all parametric analyses (t-tests, analyses of variance, etc.) are correlational; (2) effect sizes have played an important role in meta-analytic work; and (3) statistical significance testing is limited in its capacity to inform scientific…
Harmonic and Anharmonic Behaviour of a Simple Oscillator
ERIC Educational Resources Information Center
O'Shea, Michael J.
2009-01-01
We consider a simple oscillator that exhibits harmonic and anharmonic regimes and analyse its behaviour over the complete range of possible amplitudes. The oscillator consists of a mass "m" fixed at the midpoint of a horizontal rope. For zero initial rope tension and small amplitude the period of oscillation, tau, varies as tau is approximately…
Validation of the Simple View of Reading in Hebrew--A Semitic Language
ERIC Educational Resources Information Center
Joshi, R. Malatesha; Ji, Xuejun Ryan; Breznitz, Zvia; Amiel, Meirav; Yulia, Astri
2015-01-01
The Simple View of Reading (SVR) in Hebrew was tested by administering decoding, listening comprehension, and reading comprehension measures to 1,002 students from Grades 2 to 10 in the northern part of Israel. Results from hierarchical regression analyses supported the SVR in Hebrew with decoding and listening comprehension measures explaining…
Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance
Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S
2009-01-01
In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455
NASA Astrophysics Data System (ADS)
Hudson, Ross D.; Treagust, David F.
2013-04-01
Background. This study developed from observations of apparent achievement differences between male and female chemistry performances in a state university entrance examination. Male students performed more strongly than female students, especially in higher scores. Apart from the gender of the students, two other important factors that might influence student performance were format of questions (short-answer or multiple-choice) and type of questions (recall or application). Purpose The research question addressed in this study was: Is there a relationship between performance in state university entrance examinations in chemistry and school chemistry examinations and student gender, format of questions - multiple-choice or short-answer, and conceptual level - recall or application? Sample The two sources of data were: (1) secondary analyses of five consecutive years' data published by the examining authority of chemistry examinations, and (2) tests conducted with 192 students, which provided information about all aspects of the three variables (question format, question type and gender) under consideration. Design and methods Both sources of data were analysed using ANOVA to compare means for the variables under consideration and the statistical significance of any differences. The data from the tests were also analysed using Rasch analysis to determine differences in gender performance. Results When overall mean data are considered, both male and female students performed better on multiple-choice questions and recall questions than on short-answer questions and application questions, respectively. When overall mean data are considered, male students outperformed female students in both the university entrance and school tests, particularly in the higher scores. When data were analysed with Rasch, there was no statistically significant difference in performance between males and females of equal ability. Conclusions Both male and female students generally perform better on multiple-choice questions than they do on short-answer questions. However, when the questions are matched in terms of difficulty (using Rasch analysis), the differences in performance between multiple-choice and short-answer are quite small. Rasch analysis showed that there was little difference in performance between males and females of equal ability. This study shows that a simple face-value score analysis of relative student performance - in this case, in chemistry - can be deceptive unless the actual abilities of the students concerned, as measured by a tool such as Rasch, are taken into consideration before reaching any conclusion.
Barbie, Dana L.; Wehmeyer, Loren L.
2012-01-01
Trends in selected streamflow statistics during 1922-2009 were evaluated at 19 long-term streamflow-gaging stations considered indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico. The U.S. Geological Survey, in cooperation with the Texas Water Development Board, evaluated streamflow data from streamflow-gaging stations with more than 50 years of record that were active as of 2009. The outflows into Arkansas and Louisiana were represented by 3 streamflow-gaging stations, and outflows into the Gulf of Mexico, including Galveston Bay, were represented by 16 streamflow-gaging stations. Monotonic trend analyses were done using the following three streamflow statistics generated from daily mean values of streamflow: (1) annual mean daily discharge, (2) annual maximum daily discharge, and (3) annual minimum daily discharge. The trend analyses were based on the nonparametric Kendall's Tau test, which is useful for the detection of monotonic upward or downward trends with time. A total of 69 trend analyses by Kendall's Tau were computed - 19 periods of streamflow multiplied by the 3 streamflow statistics, plus 12 additional trend analyses because the periods of record for 2 streamflow-gaging stations were divided into periods representing pre- and post-reservoir impoundment. Unless otherwise described, each trend analysis used the entire period of record for each streamflow-gaging station. The monotonic trend analysis detected 11 statistically significant downward trends, 37 instances of no trend, and 21 statistically significant upward trends. One general region with relatively more upward trends for many of the streamflow statistics analyzed comprises the rivers and associated creeks and bayous draining to Galveston Bay in the Houston metropolitan area. Lastly, the westernmost river basins considered (the Nueces and Rio Grande) had statistically significant downward trends for many of the streamflow statistics analyzed.
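A minimal version of the monotonic trend test used throughout the report looks like this: Kendall's tau between year and an annual streamflow statistic, here on synthetic values with an imposed upward trend.

```python
# Kendall's tau as a monotonic trend test on a synthetic annual series.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
years = np.arange(1922, 2010)
annual_mean_q = 100 + 0.3 * (years - 1922) + rng.normal(0, 8, years.size)

tau, p = stats.kendalltau(years, annual_mean_q)
trend = ("upward" if tau > 0 else "downward") if p < 0.05 else "no"
print(f"tau = {tau:.2f}, p = {p:.3g} -> {trend} trend")
```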
Analysing attitude data through ridit schemes.
El-rouby, M G
1994-12-02
The attitudes of individuals and populations on various issues are usually assessed through sample surveys. Responses to survey questions are then scaled and combined into a meaningful whole which defines the measured attitude. The applied scales may be of nominal, ordinal, interval, or ratio nature, depending upon the degree of sophistication the researcher wants to introduce into the measurement. This paper discusses methods of analysis for categorical variables of the type used in attitude and human behavior research, and recommends adoption of ridit analysis, a technique which has been successfully applied to epidemiological, clinical, laboratory, and microbiological data. The ridit methodology is described after reviewing some general attitude scaling methods and problems of analysis related to them. The ridit method is then applied to a recent study conducted to assess health care service quality in North Carolina. This technique is conceptually and computationally simpler than other conventional statistical methods, and is also distribution-free. Basic requirements and limitations on its use are indicated.
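Ridit scoring itself is simple enough to sketch: each ordered category's ridit is the proportion of the reference distribution below it plus half the proportion within it, and a comparison group is summarized by its mean ridit (values above 0.5 indicate a shift toward higher categories). The frequencies below are invented.

```python
# Ridit analysis in miniature: reference-based category scores.
import numpy as np

def ridits(freq):
    """Ridit of each ordered category relative to this distribution."""
    freq = np.asarray(freq, dtype=float)
    cum_below = np.concatenate(([0.0], np.cumsum(freq)[:-1]))
    return (cum_below + 0.5 * freq) / freq.sum()

reference = [10, 30, 40, 15, 5]      # e.g. a 5-point satisfaction scale
comparison = [5, 15, 35, 30, 15]

r = ridits(reference)
mean_ridit = np.average(r, weights=comparison)
print(f"mean ridit of comparison group = {mean_ridit:.3f}")
```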
Exploiting Fast-Variables to Understand Population Dynamics and Evolution
NASA Astrophysics Data System (ADS)
Constable, George W. A.; McKane, Alan J.
2018-07-01
We describe a continuous-time modelling framework for biological population dynamics that accounts for demographic noise. In the spirit of the methodology used by statistical physicists, transitions between the states of the system are caused by individual events while the dynamics are described in terms of the time-evolution of a probability density function. In general, the application of the diffusion approximation still leaves a description that is quite complex. However, in many biological applications one or more of the processes happen slowly relative to the system's other processes, and the dynamics can be approximated as occurring within a slow low-dimensional subspace. We review these time-scale separation arguments and analyse the simpler stochastic dynamics that result in a number of cases. We stress that it is important to retain the demographic noise derived in this way, and emphasise this point by showing that it can alter the direction of selection compared to the prediction made from an analysis of the corresponding deterministic model.
Predictors of transformational leadership of nurse managers.
Echevarria, Ilia M; Patterson, Barbara J; Krouse, Anne
2017-04-01
The aim of this study was to examine the relationships among education, leadership experience, emotional intelligence and transformational leadership of nurse managers. Nursing leadership research provides limited evidence of predictors of transformational leadership style in nurse managers. A predictive correlational design was used with a sample of nurse managers (n = 148) working in varied health care settings. Data were collected using the Genos Emotional Intelligence Inventory, the Multi-factor Leadership Questionnaire and a demographic questionnaire. Simple linear and multiple regression analyses were used to examine relationships. A statistically significant relationship was found between emotional intelligence and transformational leadership (r = 0.59, P < 0.001), explaining 34% of the variance in transformational leadership. Nurse managers should be well informed of the predictors of transformational leadership in order to pursue continuing education and development opportunities related to those predictors. The results of this study emphasise the need for emotional intelligence continuing education, leadership development and leader assessment programmes. © 2016 John Wiley & Sons Ltd.
Effects of Normal Aging on Visuo-Motor Plasticity
NASA Technical Reports Server (NTRS)
Roller, Carrie A.; Cohen, Helen S.; Kimball, Kay T.; Bloomberg, Jacob J.
2001-01-01
Normal aging is associated with declines in neurologic function. Uncompensated visual and vestibular problems may have dire consequences, including dangerous falls. Visuomotor plasticity is a form of behavioral neural plasticity which is important in the process of adapting to visual or vestibular alteration, including those changes due to pathology, pharmacotherapy, surgery or even entry into a microgravity or underwater environment. In order to determine the effects of aging on visuomotor plasticity, we chose the simple and easily measured paradigm of visual-motor re-arrangement created by using visual displacement prisms while throwing small balls at a target. Subjects threw balls before, during and after wearing a set of prisms that displaced the visual scene by twenty degrees to the right. Data obtained during adaptation were modeled using multilevel analyses for 73 subjects aged 20 to 80 years. We found no statistically significant difference in measures of visuomotor plasticity with advancing age. Further studies are underway examining variable practice training as a potential mechanism for enhancing this form of behavioral neural plasticity.
Towards the estimation of effect measures in studies using respondent-driven sampling.
Rotondi, Michael A
2014-06-01
Respondent-driven sampling (RDS) is an increasingly common sampling technique to recruit hidden populations. Statistical methods for RDS are not straightforward due to the correlation between individual outcomes and subject weighting; thus, analyses are typically limited to estimation of population proportions. This manuscript applies the method of variance estimates recovery (MOVER) to construct confidence intervals for effect measures such as risk difference (difference of proportions) or relative risk in studies using RDS. To illustrate the approach, MOVER is used to construct confidence intervals for differences in the prevalence of demographic characteristics between an RDS study and convenience study of injection drug users. MOVER is then applied to obtain a confidence interval for the relative risk between education levels and HIV seropositivity and current infection with syphilis, respectively. This approach provides a simple method to construct confidence intervals for effect measures in RDS studies. Since it only relies on a proportion and appropriate confidence limits, it can also be applied to previously published manuscripts.
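A sketch of MOVER for a risk difference is given below, assuming independent groups and Wilson score limits for each proportion (one common choice); in an RDS analysis the simple proportions and their limits would be replaced by the RDS-weighted versions, and all counts here are invented.

```python
# MOVER (square-and-add) confidence interval for a difference of proportions.
import numpy as np
from statsmodels.stats.proportion import proportion_confint

def mover_risk_difference(x1, n1, x2, n2, alpha=0.05):
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = proportion_confint(x1, n1, alpha=alpha, method="wilson")
    l2, u2 = proportion_confint(x2, n2, alpha=alpha, method="wilson")
    d = p1 - p2
    lower = d - np.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + np.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return d, lower, upper

d, lo, hi = mover_risk_difference(45, 120, 30, 150)
print(f"risk difference = {d:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```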
A conceptual weather-type classification procedure for the Philadelphia, Pennsylvania, area
McCabe, Gregory J.
1990-01-01
A simple method of weather-type classification, based on a conceptual model of pressure systems that pass through the Philadelphia, Pennsylvania, area, has been developed. The only inputs required for the procedure are daily mean wind direction and cloud cover, which are used to index the position of pressure systems and fronts relative to Philadelphia. Daily mean wind-direction and cloud-cover data recorded at Philadelphia, Pennsylvania, from January 1954 through August 1988 were used to categorize daily weather conditions. The conceptual weather types reflect changes in daily air and dew-point temperatures, and changes in monthly mean temperature and monthly and annual precipitation. The weather-type classification produced by using the conceptual model was similar to a classification produced by using a multivariate statistical classification procedure. Even though the conceptual weather types are derived from a small amount of data, they appear to account for the variability of daily weather patterns sufficiently to describe distinct weather conditions for use in environmental analyses of weather-sensitive processes.
Modeling the atmospheric chemistry of TICs
NASA Astrophysics Data System (ADS)
Henley, Michael V.; Burns, Douglas S.; Chynwat, Veeradej; Moore, William; Plitz, Angela; Rottmann, Shawn; Hearn, John
2009-05-01
An atmospheric chemistry model that describes the behavior and disposition of environmentally hazardous compounds discharged into the atmosphere was coupled with the transport and diffusion model, SCIPUFF. The atmospheric chemistry model was developed by reducing a detailed atmospheric chemistry mechanism to a simple empirical effective degradation rate term (keff) that is a function of important meteorological parameters such as solar flux, temperature, and cloud cover. Empirically derived keff functions that describe the degradation of target toxic industrial chemicals (TICs) were derived by statistically analyzing data generated from the detailed chemistry mechanism run over a wide range of (typical) atmospheric conditions. To assess and identify areas to improve the developed atmospheric chemistry model, sensitivity and uncertainty analyses were performed to (1) quantify the sensitivity of the model output (TIC concentrations) with respect to changes in the input parameters and (2) improve, where necessary, the quality of the input data based on sensitivity results. The model predictions were evaluated against experimental data. Chamber data were used to remove the complexities of dispersion in the atmosphere.
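The keff idea reduces to first-order decay with a meteorology-dependent rate, C(t) = C0 exp(-keff t). The sketch below uses an invented functional form and coefficients purely to illustrate the structure; the study's fitted keff functions are not reproduced.

```python
# Illustrative first-order TIC loss with an empirical k_eff; the functional
# form and all coefficients are placeholders, not the study's fits.
import numpy as np

def k_eff(solar_flux, temp_k, cloud_frac, a=2e-5, b=0.015, c=0.6):
    """Effective degradation rate (1/s); hypothetical empirical form."""
    return (a * np.exp(b * (temp_k - 298.0))
            * (1 + c * solar_flux) * (1 - 0.5 * cloud_frac))

t = np.linspace(0, 3600, 5)              # one hour, in seconds
c0 = 1.0                                 # normalised TIC concentration
k = k_eff(solar_flux=1.2, temp_k=303.0, cloud_frac=0.25)
print(np.round(c0 * np.exp(-k * t), 3))  # C(t) = C0 * exp(-k_eff * t)
```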
Lakatos, Eszter; Salehi-Reyhani, Ali; Barclay, Michael; Stumpf, Michael P H; Klug, David R
2017-01-01
We determine p53 protein abundances and cell-to-cell variation in two human cancer cell lines with single cell resolution, and show that the fractional width of the distributions is the same in both cases despite a large difference in average protein copy number. We developed a computational framework to identify dominant mechanisms controlling the variation of protein abundance in a simple model of gene expression from the summary statistics of single cell steady state protein expression distributions. Our results, based on single cell data analysed in a Bayesian framework, lend strong support to a model in which variation in the basal p53 protein abundance may be best explained by variations in the rate of p53 protein degradation. This is supported by measurements of the relative average levels of mRNA, which are very similar despite large variation in the level of protein.
Applications of rule-induction in the derivation of quantitative structure-activity relationships.
A-Razzak, M; Glen, R C
1992-08-01
Recently, methods have been developed in the field of Artificial Intelligence (AI), specifically in the expert systems area using rule-induction, designed to extract rules from data. We have applied these methods to the analysis of molecular series with the objective of generating rules which are predictive and reliable. The input to rule-induction consists of a number of examples with known outcomes (a training set) and the output is a tree-structured series of rules. Unlike most other analysis methods, the results of the analysis are in the form of simple statements which can be easily interpreted. These are readily applied to new data giving both a classification and a probability of correctness. Rule-induction has been applied to in-house generated and published QSAR datasets and the methodology, application and results of these analyses are discussed. The results imply that in some cases it would be advantageous to use rule-induction as a complementary technique in addition to conventional statistical and pattern-recognition methods.
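A present-day analogue of this workflow, using scikit-learn's CART trees rather than the original AI tools, fits a tree on a training set with known outcomes and prints the rule structure together with a class probability for a new case; the descriptors and activity classes below are synthetic.

```python
# Rule induction in miniature: tree-structured rules plus a probability.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 3))                  # e.g. 3 molecular descriptors
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # synthetic activity classes

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["logP", "MW", "charge"]))
print("P(active) for a new compound:",
      tree.predict_proba([[0.4, -1.0, 0.8]])[0, 1])
```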
On the properties of stochastic intermittency in rainfall processes.
Molini, A; La, Barbera P; Lanza, L G
2002-01-01
In this work we propose a mixed approach to the modelling of rainfall events, based on the analysis of geometrical and statistical properties of rain intermittency in time, combined with the predictive power derived from the analysis of the distribution of no-rain periods and from the binary decomposition of the rain signal. Some recent hypotheses on the nature of rain intermittency are also reviewed. In particular, the internal intermittent structure of a high-resolution pluviometric time series covering one decade and recorded at the tipping-bucket station of the University of Genova is analysed, by separating the internal intermittency of rainfall events from the inter-arrival process through a simple geometrical filtering procedure. In this way it is possible to associate no-rain intervals with a probability distribution both by virtue of their position within the event and of their percentage. From this analysis, an invariant probability distribution for the no-rain periods within the events is obtained at different aggregation levels, and its satisfactory agreement with a typical extreme value distribution is shown.
A Ricin Forensic Profiling Approach Based on a Complex Set of Biomarkers
Fredriksson, Sten-Ake; Wunschel, David S.; Lindstrom, Susanne Wiklund; ...
2018-03-28
A forensic method for the retrospective determination of preparation methods used for illicit ricin toxin production was developed. The method was based on a complex set of biomarkers, including carbohydrates, fatty acids, and seed storage proteins, in combination with data on ricin and Ricinus communis agglutinin. The analyses were performed on samples prepared from four castor bean plant (R. communis) cultivars by four different sample preparation methods (PM1 – PM4) ranging from simple disintegration of the castor beans to multi-step preparation methods including different protein precipitation methods. Comprehensive analytical data were collected by use of a range of analytical methods, and robust orthogonal partial least squares discriminant analysis (OPLS-DA) models were constructed based on the calibration set. By the use of a decision tree and two OPLS-DA models, the sample preparation methods of the test set samples were determined. The model statistics of the two models were good, and a 100% rate of correct predictions of the test set was achieved.
A novel reliable method of DNA extraction from olive oil suitable for molecular traceability.
Raieta, Katia; Muccillo, Livio; Colantuoni, Vittorio
2015-04-01
Extra virgin olive oil production has a worldwide economic impact. The use of this designation, however, is of great concern to institutions and private industry because of the increasing number of fraud and adulteration attempts in market products. Here, we present a novel, reliable and inexpensive method for extracting DNA from commercial virgin and extra virgin olive oils. The DNA is stable over time and amenable to molecular analyses; in fact, by carrying out simple sequence repeat (SSR) marker analysis, we characterise the genetic profile of monovarietal olive oils. By comparing the oil-derived pattern with that of the corresponding tree, we can unambiguously identify four cultivars from Samnium, a region of Southern Italy, and distinguish them from reference and more widely used varieties. Through a parentage statistical analysis, we also identify the putative pollinators, establishing an unprecedented and powerful tool for olive oil traceability. Copyright © 2014 Elsevier Ltd. All rights reserved.
Time-dependent scaling patterns in high frequency financial data
NASA Astrophysics Data System (ADS)
Nava, Noemi; Di Matteo, Tiziana; Aste, Tomaso
2016-10-01
We measure the influence of different time-scales on the intraday dynamics of financial markets. This is obtained by decomposing financial time series into simple oscillations associated with distinct time-scales. We propose two new time-varying measures of complexity: 1) an amplitude scaling exponent and 2) an entropy-like measure. We apply these measures to intraday, 30-second sampled prices of various stock market indices. Our results reveal intraday trends where different time-horizons contribute with variable relative amplitudes over the course of the trading day. Our findings indicate that the time series we analysed have a non-stationary multifractal nature with predominantly persistent behaviour at the middle of the trading session and anti-persistent behaviour at the opening and at the closing of the session. We demonstrate that these patterns are statistically significant, robust, reproducible and characteristic of each stock market. We argue that any modelling, analytics or trading strategy must take into account these non-stationary intraday scaling patterns.
Spatially coordinated dynamic gene transcription in living pituitary tissue
Featherstone, Karen; Hey, Kirsty; Momiji, Hiroshi; McNamara, Anne V; Patist, Amanda L; Woodburn, Joanna; Spiller, David G; Christian, Helen C; McNeilly, Alan S; Mullins, John J; Finkenstädt, Bärbel F; Rand, David A; White, Michael RH; Davis, Julian RE
2016-01-01
Transcription at individual genes in single cells is often pulsatile and stochastic. A key question emerges regarding how this behaviour contributes to tissue phenotype, but it has been a challenge to quantitatively analyse this in living cells over time, as opposed to studying snap-shots of gene expression state. We have used imaging of reporter gene expression to track transcription in living pituitary tissue. We integrated live-cell imaging data with statistical modelling for quantitative real-time estimation of the timing of switching between transcriptional states across a whole tissue. Multiple levels of transcription rate were identified, indicating that gene expression is not a simple binary ‘on-off’ process. Immature tissue displayed shorter durations of high-expressing states than the adult. In adult pituitary tissue, direct cell contacts involving gap junctions allowed local spatial coordination of prolactin gene expression. Our findings identify how heterogeneous transcriptional dynamics of single cells may contribute to overall tissue behaviour. DOI: http://dx.doi.org/10.7554/eLife.08494.001 PMID:26828110
AncestrySNPminer: A bioinformatics tool to retrieve and develop ancestry informative SNP panels
Amirisetty, Sushil; Khurana Hershey, Gurjit K.; Baye, Tesfaye M.
2012-01-01
A wealth of genomic information is available in public and private databases. However, this information is underutilized for uncovering population specific and functionally relevant markers underlying complex human traits. Given the huge amount of SNP data available from the annotation of human genetic variation, data mining is a faster and cost effective approach for investigating the number of SNPs that are informative for ancestry. In this study, we present AncestrySNPminer, the first web-based bioinformatics tool specifically designed to retrieve Ancestry Informative Markers (AIMs) from genomic data sets and link these informative markers to genes and ontological annotation classes. The tool includes an automated and simple “scripting at the click of a button” functionality that enables researchers to perform various population genomics statistical analyses methods with user friendly querying and filtering of data sets across various populations through a single web interface. AncestrySNPminer can be freely accessed at https://research.cchmc.org/mershalab/AncestrySNPminer/login.php. PMID:22584067
The Ṁ-M* relation of pre-main-sequence stars: a consequence of X-ray-driven disc evolution
NASA Astrophysics Data System (ADS)
Ercolano, B.; Mayr, D.; Owen, J. E.; Rosotti, G.; Manara, C. F.
2014-03-01
We analyse current measurements of accretion rates on to pre-main-sequence stars as a function of stellar mass, and conclude that the steep dependence of accretion rates on stellar mass is real and not driven by selection/detection thresholds, as has been previously feared. These conclusions are reached by means of statistical tests including a survival analysis which can account for upper limits. The power-law slope of the Ṁ-M* relation is found to be in the range of 1.6-1.9 for young stars with masses lower than 1 M⊙. The measured slopes and distributions can be easily reproduced by means of a simple disc model which includes viscous accretion and X-ray photoevaporation. We conclude that the Ṁ-M* relation in pre-main-sequence stars bears the signature of disc dispersal by X-ray photoevaporation, suggesting that the relation is a straightforward consequence of disc physics rather than an imprint of initial conditions.
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
Exploring Remote Sensing Products Online with Giovanni for Studying Urbanization
NASA Technical Reports Server (NTRS)
Shen, Suhung; Leptoukh, Gregory G.; Gerasimov, Irina; Kempler, Steve
2012-01-01
Recently, a large number of MODIS land products at multiple spatial resolutions have been integrated into the online system, Giovanni, to support studies on land cover and land use changes focused on the Northern Eurasia and Monsoon Asia regions. Giovanni (Goddard Interactive Online Visualization ANd aNalysis Infrastructure) is a Web-based application developed by the NASA Goddard Earth Sciences Data and Information Services Center (GES-DISC) providing a simple and intuitive way to visualize, analyze, and access Earth science remotely sensed and modeled data. The customized Giovanni Web portals (Giovanni-NEESPI and Giovanni-MAIRS) were created to integrate land, atmospheric, cryospheric, and social products, enabling researchers to do quick exploration and basic analyses of land surface changes and their relationships to climate at global and regional scales. This presentation documents the MODIS land surface products in the Giovanni system. As examples, images and statistical analysis results on land surface and local climate changes associated with urbanization over the Yangtze River Delta region, China, using data in Giovanni are shown.
Transverse excitations in liquid metals
NASA Astrophysics Data System (ADS)
Hosokawa, S.; Munejiri, S.; Inui, M.; Kajihara, Y.; Pilgrim, W.-C.; Baron, A. Q. R.; Shimojo, F.; Hoshino, K.
2013-02-01
The transverse acoustic excitation modes were detected by inelastic x-ray scattering in liquid Ga, Cu and Fe in the Q range around 10 nm-1 using a third-generation synchrotron radiation facility, SPring-8, although these liquid metals are mostly described as simple hard-sphere liquids. Ab initio molecular dynamics simulations clearly support this finding for liquid Ga. From detailed analyses of the S(Q,ω) spectra with good statistical quality, a lifetime of less than 1 ps and a propagation length of less than 1 nm can be estimated for the transverse acoustic phonon modes, which correspond to the lifetime and size of cages formed instantaneously in these liquid metals. The microscopic Poisson's ratio estimated from the dynamic velocities of sound is 0.42 for liquid Ga and about -0.2 for the liquid transition metals, indicating rubber-like soft and extremely hard elastic properties of the cage clusters, respectively. The origin of these microscopic elastic properties is discussed in detail.
A Ricin Forensic Profiling Approach Based on a Complex Set of Biomarkers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fredriksson, Sten-Ake; Wunschel, David S.; Lindstrom, Susanne Wiklund
A forensic method for the retrospective determination of preparation methods used for illicit ricin toxin production was developed. The method was based on a complex set of biomarkers, including carbohydrates, fatty acids and seed storage proteins, in combination with data on ricin and Ricinus communis agglutinin. The analyses were performed on samples prepared from four castor bean plant (R. communis) cultivars by four different sample preparation methods (PM1 - PM4), ranging from simple disintegration of the castor beans to multi-step preparation methods including different protein precipitation methods. Comprehensive analytical data were collected by use of a range of analytical methods, and robust orthogonal partial least squares-discriminant analysis (OPLS-DA) models were constructed based on the calibration set. By the use of a decision tree and two OPLS-DA models, the sample preparation methods of test set samples were determined. The model statistics of the two models were good and a 100% rate of correct predictions of the test set was achieved.
Applications of rule-induction in the derivation of quantitative structure-activity relationships
NASA Astrophysics Data System (ADS)
A-Razzak, Mohammed; Glen, Robert C.
1992-08-01
Recently, methods have been developed in the field of Artificial Intelligence (AI), specifically in the expert systems area using rule-induction, designed to extract rules from data. We have applied these methods to the analysis of molecular series with the objective of generating rules which are predictive and reliable. The input to rule-induction consists of a number of examples with known outcomes (a training set) and the output is a tree-structured series of rules. Unlike most other analysis methods, the results of the analysis are in the form of simple statements which can be easily interpreted. These are readily applied to new data giving both a classification and a probability of correctness. Rule-induction has been applied to in-house generated and published QSAR datasets and the methodology, application and results of these analyses are discussed. The results imply that in some cases it would be advantageous to use rule-induction as a complementary technique in addition to conventional statistical and pattern-recognition methods.
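The workflow sketched in this abstract (training set in, tree-structured rules out, predictions with a class and a probability of correctness) can be illustrated with CART-style trees via the rpart package in R. The molecular descriptors, cutoffs, and activity labels below are invented, and rpart stands in for the induction algorithm actually used by the authors.

```r
# A minimal illustration of tree-structured rule induction on a
# hypothetical QSAR-style training set.
library(rpart)

set.seed(42)
train <- data.frame(
  logP   = rnorm(100, 2, 1),               # hypothetical molecular descriptors
  pKa    = rnorm(100, 7, 2),
  mol_wt = rnorm(100, 300, 50)
)
train$active <- factor(ifelse(train$logP > 2 & train$mol_wt < 320, "yes", "no"))

fit <- rpart(active ~ logP + pKa + mol_wt, data = train, method = "class")
print(fit)                                 # the tree printed as readable split rules

# New compounds get both a classification and a probability of correctness
newdata <- data.frame(logP = 2.5, pKa = 6.8, mol_wt = 290)
predict(fit, newdata, type = "class")
predict(fit, newdata, type = "prob")
```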
Probabilistic risk analysis of building contamination.
Bolster, D T; Tartakovsky, D M
2008-10-01
We present a general framework for probabilistic risk assessment (PRA) of building contamination. PRA provides a powerful tool for the rigorous quantification of risk in contamination of building spaces. A typical PRA starts by identifying relevant components of a system (e.g. ventilation system components, potential sources of contaminants, remediation methods) and proceeds by using available information and statistical inference to estimate the probabilities of their failure. These probabilities are then combined by means of fault-tree analyses to yield probabilistic estimates of the risk of system failure (e.g. building contamination). A sensitivity study of PRAs can identify features and potential problems that need to be addressed with the most urgency. Often PRAs are amenable to approximations, which can significantly simplify the approach. All these features of PRA are presented in this paper via a simple illustrative example, which can be built upon in further studies. The tool presented here can be used to design and maintain adequate ventilation systems to minimize exposure of occupants to contaminants.
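The fault-tree combination step lends itself to a tiny worked example. Assuming independent basic events with invented probabilities, OR gates and AND gates combine component failure probabilities into an estimate for the top event:

```r
# Toy fault tree with invented failure probabilities, assuming independence.
p_filter_fails   <- 0.02                  # hypothetical component probabilities
p_fan_fails      <- 0.05
p_source_present <- 0.10

or_gate  <- function(p) 1 - prod(1 - p)   # at least one event occurs
and_gate <- function(p) prod(p)           # all events occur

# Contamination requires a contaminant source AND a ventilation failure,
# where ventilation fails if the filter OR the fan fails.
p_vent_fails   <- or_gate(c(p_filter_fails, p_fan_fails))
p_contaminated <- and_gate(c(p_source_present, p_vent_fails))
p_contaminated                            # probabilistic estimate of the top event
```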
Agnotology: learning from mistakes
NASA Astrophysics Data System (ADS)
Benestad, R. E.; Hygen, H. O.; van Dorland, R.; Cook, J.; Nuccitelli, D.
2013-05-01
Replication is an important part of science, and by repeating past analyses, we show that a number of papers in the scientific literature contain severe methodological flaws which can easily be identified through simple tests and demonstrations. In many cases, the shortcomings are related to a lack of robustness, leading to results that are not universally valid but rather an artifact of a particular experimental set-up. Some examples presented here have ignored data that do not fit the conclusions, and in several other cases, inappropriate statistical methods have been adopted or conclusions have been based on misconceived physics. These papers may serve as educational case studies of why certain analytical approaches are sometimes unsuitable for providing reliable answers. They also highlight the merit of replication. A lack of common replication has repercussions for the quality of the scientific literature, and may be a reason why some controversial questions remain unanswered even when ignorance could be reduced. Agnotology is the study of such ignorance. Free and open-source software is provided for demonstration purposes.
Madan, Christopher R
2014-01-01
When studying animal behaviour within an open environment, movement-related data are often important for behavioural analyses. Therefore, simple and efficient techniques are needed to present and analyze the data of such movements. However, it is challenging to present both spatial and temporal information of movements within a two-dimensional image representation. To address this challenge, we developed the spectral time-lapse (STL) algorithm that re-codes an animal’s position at every time point with a time-specific color, and overlays it with a reference frame of the video, to produce a summary image. We additionally incorporated automated motion tracking, such that the animal’s position can be extracted and summary statistics such as path length and duration can be calculated, as well as instantaneous velocity and acceleration. Here we describe the STL algorithm and offer a freely available MATLAB toolbox that implements the algorithm and allows for a large degree of end-user control and flexibility. PMID:25580219
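The core of the STL idea, recoding each position with a time-specific color and deriving summary statistics from the track, can be sketched in a few lines of R (the published toolbox itself is MATLAB; the track and frame interval below are simulated, and the overlay on a video reference frame is omitted):

```r
# Color an animal's track by time and compute simple path statistics.
set.seed(7)
n  <- 500                                  # hypothetical tracked positions
x  <- cumsum(rnorm(n)); y <- cumsum(rnorm(n))
dt <- 1/30                                 # assumed frame interval (s)

plot(x, y, col = rainbow(n, end = 0.7), pch = 16, cex = 0.5,
     asp = 1, main = "Track colored early (red) to late (blue)")

step <- sqrt(diff(x)^2 + diff(y)^2)        # distance between successive frames
c(path_length = sum(step),
  duration_s  = n * dt,
  mean_speed  = mean(step / dt))           # instantaneous speed averaged over frames
```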
Li, Lin-Qiu; Baibado, Joewel T; Shen, Qing; Cheung, Hon-Yeung
2017-12-01
Plastron is a nutritive and superior functional food. Owing to its limited supply and enormous demand, functional foods supposed to contain plastron may be forged with other substitutes. This paper reports a novel and simple method for determining the authenticity of plastron-derived functional foods based on comparison of the amino acid (AA) profiles of plastron and its possible substitutes. By applying micellar electrokinetic chromatography (MEKC), 18 common AAs along with 2 special AAs, hydroxyproline (Hyp) and hydroxylysine (Hyl), were detected in all plastron samples. Since chicken, egg, fish, milk, pork, nail and hair lacked Hyp and Hyl, plastron could be easily distinguished from them. For substitutes containing collagen, a statistical analysis technique, principal component analysis (PCA), was adopted, and plastron was again successfully distinguished. When the proposed method was applied to authenticate turtle shell glue on the market, fake products were commonly found. Copyright © 2017 Elsevier B.V. All rights reserved.
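As a hedged illustration of the PCA step, assuming a small table of invented amino acid profiles (rows are samples, columns are a few selected AAs; none of these values come from the paper), R's prcomp separates the two classes in the first components:

```r
# PCA on hypothetical AA profiles: plastron vs collagen-containing substitutes.
aa <- rbind(
  plastron1 = c(Hyp = 8.1, Hyl = 1.2, Gly = 30.0, Pro = 12.0),
  plastron2 = c(8.3, 1.1, 31.0, 12.5),
  plastron3 = c(8.0, 1.3, 30.5, 11.8),
  subst1    = c(6.5, 0.5, 33.0, 13.0),
  subst2    = c(6.7, 0.6, 32.5, 13.4),
  subst3    = c(6.4, 0.4, 33.5, 12.9)
)
pc <- prcomp(aa, scale. = TRUE)            # standardize each AA before PCA
summary(pc)
plot(pc$x[, 1], pc$x[, 2], type = "n", xlab = "PC1", ylab = "PC2")
text(pc$x[, 1], pc$x[, 2], labels = rownames(aa))  # classes separate along PC1
```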
Foldnes, Njål; Olsson, Ulf Henning
2016-01-01
We present and investigate a simple way to generate nonnormal data using linear combinations of independent generator (IG) variables. The simulated data have prespecified univariate skewness and kurtosis and a given covariance matrix. In contrast to the widely used Vale-Maurelli (VM) transform, the obtained data are shown to have a non-Gaussian copula. We analytically obtain asymptotic robustness conditions for the IG distribution. We show empirically that popular test statistics in covariance analysis tend to reject true models more often under the IG transform than under the VM transform. This implies that overly optimistic evaluations of estimators and fit statistics in covariance structure analysis may be tempered by including the IG transform for nonnormal data generation. We provide an implementation of the IG transform in the R environment.
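A rough R sketch of the general construction: independent, skewed generator variables are mixed through a matrix A with A A^T = Σ, so the output is nonnormal with the target covariance and a non-Gaussian copula. Calibrating the generator moments to hit prespecified skewness and kurtosis, which is the substance of the IG method, is omitted here.

```r
set.seed(11)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)   # target covariance matrix
A     <- t(chol(Sigma))                    # one valid mixing matrix: A %*% t(A) = Sigma

n <- 1e4
G <- cbind(rgamma(n, 2, 2) - 1,            # independent, skewed generators, mean 0
           rgamma(n, 2, 2) - 1)
G <- scale(G)                              # unit variance for each generator
X <- t(A %*% t(G))                         # nonnormal data with covariance ~ Sigma

round(cov(X), 2)                           # close to Sigma
```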
Zhao, Zhihua; Zheng, Zhiqin; Roux, Clément; Delmas, Céline; Marty, Jean-Daniel; Kahn, Myrtil L; Mingotaud, Christophe
2016-08-22
Analysis of nanoparticle size through a simple 2D plot is proposed in order to extract the correlation between length and width in a collection or a mixture of anisotropic particles. Compared to the usual statistics on length combined with a second, independent statistical analysis of width, this simple plot readily reveals the various types of nanoparticles and their (an)isotropy. For each class of nano-objects, the relationship between width and length (i.e., the strong or weak correlation between these two parameters) may provide information on the nucleation/growth processes. It also allows one to follow the effect of physical or chemical processes, such as simple ripening, on the shape and size distribution. Various electron microscopy pictures from the literature or from the authors' own syntheses are used as examples to demonstrate the efficiency and simplicity of the proposed 2D plot combined with a multivariate analysis. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
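The proposed plot is easy to reproduce in base R. With two invented classes, an elongated one whose length tracks its width and an isotropic one, the length-versus-width scatter separates the classes and exposes their different correlations:

```r
set.seed(5)
rods <- data.frame(width = rnorm(80, 5, 0.4))
rods$length <- 4 * rods$width + rnorm(80, 0, 1)   # correlated: aspect ratio ~ 4
dots <- data.frame(width = rnorm(80, 8, 0.6))
dots$length <- rnorm(80, 8.2, 0.6)                # isotropic: length ~ width, uncorrelated

plot(rods$width, rods$length, xlim = c(3, 10), ylim = c(5, 26),
     xlab = "width (nm)", ylab = "length (nm)")
points(dots$width, dots$length, pch = 2)
abline(0, 1, lty = 2)                             # isotropy line: length = width

cor(rods$width, rods$length)   # strong width-length coupling in the anisotropic class
cor(dots$width, dots$length)   # near zero for the isotropic class
```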
Accurate Modeling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model
NASA Astrophysics Data System (ADS)
Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron; Scoccimarro, Roman
2015-01-01
The large-scale distribution of galaxies can be explained fairly simply by assuming (i) a cosmological model, which determines the dark matter halo distribution, and (ii) a simple connection between galaxies and the halos they inhabit. This conceptually simple framework, called the halo model, has been remarkably successful at reproducing the clustering of galaxies on all scales, as observed in various galaxy redshift surveys. However, none of these previous studies have carefully modeled the systematics and thus truly tested the halo model in a statistically rigorous sense. We present a new accurate and fully numerical halo model framework and test it against clustering measurements from two luminosity samples of galaxies drawn from the SDSS DR7. We show that the simple ΛCDM cosmology + halo model is not able to simultaneously reproduce the galaxy projected correlation function and the group multiplicity function. In particular, the more luminous sample shows significant tension with theory. We discuss the implications of our findings and how this work paves the way for constraining galaxy formation by accurate simultaneous modeling of multiple galaxy clustering statistics.
Papageorgiou, Spyridon N; Kloukos, Dimitrios; Petridis, Haralampos; Pandis, Nikolaos
2015-10-01
To assess the hypothesis that there is excessive reporting of statistically significant studies published in prosthodontic and implantology journals, which could indicate selective publication. The last 30 issues of 9 journals in prosthodontics and implant dentistry were hand-searched for articles with statistical analyses. The percentages of significant and non-significant results were tabulated by parameter of interest. Univariable/multivariable logistic regression analyses were applied to identify possible predictors of reporting statistically significant findings. The results of this study were compared with those of similar studies in dentistry using random-effects meta-analyses. Of the 2323 included studies, 71% reported statistically significant results, with significant results ranging from 47% to 86%. Multivariable modeling identified geographical area and involvement of a statistician as predictors of statistically significant results. Compared to interventional studies, the odds that in vitro and observational studies would report statistically significant results were increased by 1.20 times (OR: 2.20, 95% CI: 1.66-2.92) and 0.35 times (OR: 1.35, 95% CI: 1.05-1.73), respectively. The probability of statistically significant results from randomized controlled trials was significantly lower compared to other study designs (difference: 30%, 95% CI: 11-49%). Likewise, the probability of statistically significant results in prosthodontics and implant dentistry was lower compared to other dental specialties, but this difference did not reach statistical significance (P>0.05). The majority of studies identified in the fields of prosthodontics and implant dentistry presented statistically significant results. The same trend existed in publications of other specialties in dentistry. Copyright © 2015 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Woo, Jennie H.; Velez, Erin Dunlop
2016-01-01
Statistics in Brief publications present descriptive data in tabular formats to provide useful information to a broad audience, including members of the general public. They address simple and topical issues and questions. Using data from 2011-12, this Statistics in Brief updates and expands on a previous National Center for Education Statistics…
Are V1 Simple Cells Optimized for Visual Occlusions? A Comparative Study
Bornschein, Jörg; Henniges, Marc; Lücke, Jörg
2013-01-01
Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here with optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study, therefore, suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex. PMID:23754938
Measuring and characterizing beat phenomena with a smartphone
NASA Astrophysics Data System (ADS)
Osorio, M.; Pereyra, C. J.; Gau, D. L.; Laguarda, A.
2018-03-01
Nowadays, smartphones are part of everyone's life. Apart from being excellent tools for work and communication, they can also be used to perform measurements of simple physical magnitudes, serving as a mobile and inexpensive laboratory, ideal for physics lectures in high schools or universities. In this article, we use a smartphone to analyse the acoustic beat phenomenon with a simple experimental setup that can complement lessons in the classroom. The beats were created by the superposition of the waves generated by two tuning forks, whose natural frequencies were previously characterized using different applications. After this characterization, we recorded the beats and analysed the oscillations in time and frequency.
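The underlying trigonometry is easy to reproduce numerically: the sum of two tones at f1 and f2 is a carrier at the mean frequency inside a slow envelope, and the loudness peaks arrive at the beat frequency |f1 - f2|. A short R sketch with assumed frequencies:

```r
# Synthesize a beat from two slightly detuned tones (frequencies assumed).
fs <- 8000; t <- seq(0, 1, by = 1/fs)     # 1 s at an assumed 8 kHz sampling rate
f1 <- 440; f2 <- 444                      # e.g. two slightly detuned tuning forks
s  <- sin(2*pi*f1*t) + sin(2*pi*f2*t)

plot(t, s, type = "l", xlab = "time (s)", ylab = "amplitude")
lines(t,  2*cos(pi*(f1 - f2)*t), lty = 2) # envelope from sin a + sin b identity
lines(t, -2*cos(pi*(f1 - f2)*t), lty = 2)
abs(f1 - f2)                              # beat frequency: 4 loudness peaks per second
```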
DESIGNING ENVIRONMENTAL MONITORING DATABASES FOR STATISTICAL ASSESSMENT
Databases designed for statistical analyses have characteristics that distinguish them from databases intended for general use. EMAP uses a probabilistic sampling design to collect data to produce statistical assessments of environmental conditions. In addition to supporting the ...
Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth
2015-10-01
Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.
Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V
2018-04-01
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
Short-Path Statistics and the Diffusion Approximation
NASA Astrophysics Data System (ADS)
Blanco, Stéphane; Fournier, Richard
2006-12-01
In the field of first return time statistics in bounded domains, short paths may be defined as those paths for which the diffusion approximation is inappropriate. This is at the origin of numerous open questions concerning the characterization of residence time distributions. We show here how general integral constraints can be derived that make it possible to address short-path statistics indirectly by application of the diffusion approximation to long paths. Application to the moments of the distribution at the low-Knudsen limit leads to simple practical results and novel physical pictures.
Information Entropy Production of Maximum Entropy Markov Chains from Spike Trains
NASA Astrophysics Data System (ADS)
Cofré, Rodrigo; Maldonado, Cesar
2018-01-01
We consider the maximum entropy Markov chain inference approach to characterize the collective statistics of neuronal spike trains, focusing on the statistical properties of the inferred model. We review large deviations techniques useful in this context to describe properties of accuracy and convergence in terms of sampling size. We use these results to study the statistical fluctuation of correlations, distinguishability and irreversibility of maximum entropy Markov chains. We illustrate these applications using simple examples where the large deviation rate function is explicitly obtained for maximum entropy models of relevance in this field.
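As a concrete, simple instance of the quantities discussed, the entropy production rate of a stationary Markov chain with transition matrix P and stationary distribution π is e_p = Σ_ij π_i P_ij log(P_ij / P_ji), which vanishes exactly when detailed balance holds. A toy R example with an invented 3-state chain:

```r
P <- matrix(c(0.1, 0.6, 0.3,
              0.2, 0.3, 0.5,
              0.6, 0.1, 0.3), nrow = 3, byrow = TRUE)

e    <- eigen(t(P))                        # stationary distribution: left eigenvector
p_st <- Re(e$vectors[, which.min(abs(e$values - 1))])
p_st <- p_st / sum(p_st)                   # normalize to a probability vector

ep <- sum((diag(p_st) %*% P) * log(P / t(P)))
ep                                         # > 0 here: the chain violates detailed balance
```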
Gain statistics of a fiber optical parametric amplifier with a temporally incoherent pump.
Xu, Y Q; Murdoch, S G
2010-03-15
We present an investigation of the statistics of the gain fluctuations of a fiber optical parametric amplifier pumped with a temporally incoherent pump. We derive a simple expression for the probability distribution of the gain of the amplified optical signal. The gain statistics are shown to be a strong function of the signal detuning and allow the possibility of generating optical gain distributions with controllable long-tails. Very good agreement is found between this theory and the experimentally measured gain distributions of an incoherently pumped amplifier.
USDA-ARS?s Scientific Manuscript database
Agronomic and Environmental research experiments result in data that are analyzed using statistical methods. These data are unavoidably accompanied by uncertainty. Decisions about hypotheses, based on statistical analyses of these data are therefore subject to error. This error is of three types,...
The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth
ERIC Educational Resources Information Center
Steyvers, Mark; Tenenbaum, Joshua B.
2005-01-01
We present statistical analyses of the large-scale structure of 3 types of semantic networks: word associations, WordNet, and Roget's Thesaurus. We show that they have a small-world structure, characterized by sparse connectivity, short average path lengths between words, and strong local clustering. In addition, the distributions of the number of…
NASA Astrophysics Data System (ADS)
Holzman, Burt
The E917 experiment studied Au+Au collisions at beam energies of 6, 8, and 10.8 GeV per nucleon at the Alternating Gradient Synchrotron at Brookhaven National Laboratory in New York. Hanbury-Brown and Twiss correlations between pion pairs were investigated in order to determine the spatiotemporal structure of the collision region which serves as the source for these produced particles. Three separate correlation analyses were carried out in this work. One-dimensional correlation radii and their dependence on beam energy are measured. No systematic trends with energy are observed, and the overall radius roughly corresponds to the geometric size of the collision zone. In a simple model, three-dimensional correlation radii which assume an azimuthally asymmetric source are determined at the full beam energy of 10.8 GeV/u. The radius transverse to the beam direction along the reaction plane is compared to the transverse radius orthogonal to the reaction plane. A small difference is observed between the two radii. In a state-of-the-art model, it is possible to determine the ``lengths of homogeneity'' of the collision zone without invoking the assumptions inherent in the simple model, six-dimensional correlation radii are determined at the full beam energy of 10.8 GeV/u, and consequently the in-plane length of homogeneity, out-of- plane length, longitudinal length, and emission duration of the source are determined. Additionally, the magnitude of the tilt of the collision region in the reaction plane is measured and found to be -31° +/- 32°. The measurement of the in-plane and out-of-plane homogeneity lengths is not sensitive enough to distinguish between an oblate or prolate source, but provides a blueprint for future statistics-rich analyses to follow.
Testing search strategies for systematic reviews in the Medline literature database through PubMed.
Volpato, Enilze S N; Betini, Marluci; El Dib, Regina
2014-04-01
A high-quality electronic search is essential to ensuring accuracy and completeness of the records retrieved for a systematic review. We analysed the available sample of search strategies to identify the best method for searching in Medline through PubMed, considering the use or not of parentheses, double quotation marks, truncation, and a simple search versus the search history tool. In our cross-sectional study of search strategies, we selected and analysed the available searches performed during evidence-based medicine classes and in systematic reviews conducted in the Botucatu Medical School, UNESP, Brazil. We analysed 120 search strategies. With regard to the use of phrase searches with parentheses, there was no difference between the results with and without parentheses, nor between simple searches and the search history tool, in 100% of the sample analysed (P = 1.0). The number of results retrieved was smaller using double quotation marks and using truncation compared with the standard strategy (P = 0.04 and P = 0.08, respectively). There is no need to use parentheses for phrase searching to retrieve studies; however, we recommend the use of double quotation marks when an investigator attempts to retrieve articles in which a term appears exactly as proposed in the search form. Furthermore, we do not recommend the use of truncation in search strategies in Medline via PubMed. Although the results of simple searches and the search history tool were the same, we recommend using the latter.
NASA Astrophysics Data System (ADS)
Chen, X.; Vierling, L. A.; Deering, D. W.
2004-12-01
Satellite data offer unique perspectives for monitoring and quantifying land cover change; however, the radiometric consistency among co-located multi-temporal images is difficult to maintain due to variations in sensors and atmosphere. To detect accurate landscape change using multi-temporal images, we developed a new relative radiometric normalization scheme: the temporally invariant cluster (TIC) method. Image data were acquired on 9 June 1990 (Landsat 4), 20 June 2000, and 26 August 2001 (Landsat 7) for analyses over boreal forests near the Siberian city of Krasnoyarsk. The Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), and Reduced Simple Ratio (RSR) were investigated in the normalization study. The temporally invariant cluster (TIC) centers were identified through a point density map of the base image and the target image, and a normalization regression line was created through all TIC centers. The target image digital data were then converted using the regression function so that the two images could be compared on the resulting common radiometric scale. We found that EVI was very sensitive to vegetation structure and could thus be used to separate conifer forests from deciduous forests and grass/crop lands. NDVI was a very effective vegetation index for reducing the influence of shadow, whereas EVI was very sensitive to shadowing. After normalization, correlations of NDVI and EVI with field-collected total Leaf Area Index (LAI) data in 2000 and 2001 were significantly improved; the r-square values in these regressions increased from 0.49 to 0.69 and from 0.46 to 0.61, respectively. An EVI "cancellation effect", in which EVI was positively related to understory greenness but negatively related to forest canopy coverage, was evident across a post-fire chronosequence. These findings indicate that the TIC method provides a simple, effective and repeatable method to create radiometrically comparable data sets for remote detection of landscape change. Compared with some previous relative normalization methods, this new method avoids subjective selection of a normalization regression line. It does not require high-level programming and statistical analyses, yet remains sensitive to landscape changes occurring over seasonal and inter-annual time scales. In addition, the TIC method maintains sensitivity to subtle changes in vegetation phenology and enables normalization even when invariant features are rare.
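The final normalization step reduces to fitting and applying a regression line. Assuming invariant pixel pairs have already been identified (the TIC-center detection via the point density map is omitted), a hedged R sketch looks like this:

```r
# Relative radiometric normalization: regress base values on target values at
# invariant locations, then rescale the whole target image (values invented).
set.seed(9)
base_inv   <- runif(200, 0.2, 0.8)                       # invariant-pixel NDVI, base image
target_inv <- 0.9 * base_inv - 0.05 + rnorm(200, 0, 0.02) # same pixels, target image

fit <- lm(base_inv ~ target_inv)           # the normalization regression line
coef(fit)

target_full <- runif(1e4, 0, 1)            # stand-in for the full target image
target_norm <- predict(fit, data.frame(target_inv = target_full))
# target_norm is now on the base image's radiometric scale
```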
Day, Andrew G; Pelland, Lucie; Pickett, William; Johnson, Ana P; Aiken, Alice; Pichora, David R; Brouwer, Brenda
2016-01-01
Objective: To assess the efficacy of a programme of supervised physiotherapy on the recovery of simple grade 1 and 2 ankle sprains.
Design: A randomised controlled trial of 503 participants followed for six months.
Setting: Participants were recruited from two tertiary acute care settings in Kingston, ON, Canada.
Participants: The broad inclusion criteria were patients aged ≥16 presenting for acute medical assessment and treatment of a simple grade 1 or 2 ankle sprain. Exclusions were patients with multiple injuries, other conditions limiting mobility, ankle injuries that required immobilisation, and those unable to accommodate the time-intensive study protocol.
Intervention: Participants received either usual care, consisting of written instructions regarding protection, rest, cryotherapy, compression, elevation, and graduated weight-bearing activities, or usual care enhanced with a supervised programme of physiotherapy.
Main outcome measures: The primary outcome of efficacy was the proportion of participants reporting excellent recovery, assessed with the foot and ankle outcome score (FAOS). Excellent recovery was defined as a score ≥450/500 at three months. A difference of at least a 15% increase in the absolute proportion of participants with excellent recovery was deemed clinically important. Secondary analyses included the assessment of excellent recovery at one and six months; change from baseline using continuous scores at one, three, and six months; and clinical and biomechanical measures of ankle function, assessed at one, three, and six months.
Results: The absolute proportion of patients achieving excellent recovery at three months was not significantly different between the physiotherapy (98/229, 43%) and usual care (79/214, 37%) arms (absolute difference 6%, 95% confidence interval −3% to 15%). The observed trend towards benefit with physiotherapy did not increase in the per protocol analysis and was in the opposite direction by six months. These trends remained similar and were never statistically or clinically important when the FAOS was analysed as a continuous change score.
Conclusions: In a general population of patients seeking hospital based acute care for simple ankle sprains, there is no evidence to support a clinically important improvement in outcome with the addition of supervised physiotherapy to usual care, as provided in this protocol.
Trial registration: ISRCTN 74033088 (www.isrctn.com/ISRCTN74033088) PMID:27852621
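The primary comparison can be reproduced approximately from the counts reported above with a two-proportion test in R:

```r
# 98/229 excellent recoveries with physiotherapy vs 79/214 with usual care.
prop.test(c(98, 79), c(229, 214), correct = FALSE)
# Point difference ~6% with a 95% CI of roughly -3% to 15%, matching the
# abstract and falling short of the prespecified 15% clinical-importance bar.
```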
Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.
Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W
2018-05-18
Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.
Error Analysis of Present Simple Tense in the Interlanguage of Adult Arab English Language Learners
ERIC Educational Resources Information Center
Muftah, Muneera; Rafik-Galea, Shameem
2013-01-01
The present study analyses errors on present simple tense among adult Arab English language learners. It focuses on the error on 3sg "-s" (the third person singular present tense agreement morpheme "-s"). The learners are undergraduate adult Arabic speakers learning English as a foreign language. The study gathered data from…
Hansen, John P
2003-01-01
Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.
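Two of the concepts covered here, probabilities under the standard normal distribution and a confidence interval computed from a sample, take a line each in R (the sample values are invented):

```r
# Standard normal: ~95% of the density lies within +/-1.96 standard deviations.
pnorm(1.96) - pnorm(-1.96)

# 95% confidence interval for a population mean from a small sample
# (population SD unknown, so the t distribution is used):
x <- c(5.1, 4.8, 5.6, 5.0, 5.3, 4.9, 5.2)   # invented measurements
mean(x) + c(-1, 1) * qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
```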
Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beggs, W.J.
1981-02-01
This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.
ERIC Educational Resources Information Center
Larzelere, Robert E.; Ferrer, Emilio; Kuhn, Brett R.; Danelia, Ketevan
2010-01-01
This study estimates the causal effects of six corrective actions for children's problem behaviors, comparing four types of longitudinal analyses that correct for pre-existing differences in a cohort of 1,464 4- and 5-year-olds from Canadian National Longitudinal Survey of Children and Youth (NLSCY) data. Analyses of residualized gain scores found…
NASA Technical Reports Server (NTRS)
Miller, R. D.; Rogers, J. T.
1975-01-01
General requirements for dynamic loads analyses are described. The indicial lift growth function unsteady subsonic aerodynamic representation is reviewed, and the FLEXSTAB CPS is evaluated with respect to these general requirements. The effects of residual flexibility techniques on dynamic loads analyses are also evaluated using a simple dynamic model.
NASA Astrophysics Data System (ADS)
Gregoire, Alexandre David
2011-07-01
The goal of this research was to accurately predict the ultimate compressive load of impact-damaged graphite/epoxy coupons using a Kohonen self-organizing map (SOM) neural network and multivariate statistical regression analysis (MSRA). An optimized use of these data treatment tools allowed the generation of a simple, physically understandable equation that predicts the ultimate failure load of an impact-damaged coupon based uniquely on the acoustic emissions it emits at low proof loads. Acoustic emission (AE) data were collected using two 150 kHz resonant transducers, which detected and recorded the AE activity given off during compression to failure of thirty-four impacted 24-ply bidirectional woven cloth laminate graphite/epoxy coupons. The AE quantification parameters duration, energy and amplitude for each AE hit were input to the Kohonen self-organizing map (SOM) neural network to accurately classify the material failure mechanisms present in the low proof load data. The numbers of failure mechanisms from the first 30% of the loading for twenty-four coupons were used to generate a linear prediction equation, which yielded a worst-case ultimate load prediction error of 16.17%, just outside the +/-15% B-basis allowables that were the goal for this research.
Okunade, Kehinde S; Sunmonu, Oyebola; Osanyin, Gbemisola E; Oluwole, Ayodeji A
2017-01-01
This study aimed to determine the knowledge and acceptability of the HPV vaccine among women attending the gynaecology clinics of the Lagos University Teaching Hospital (LUTH). This was a descriptive cross-sectional study involving 148 consecutively selected women attending the gynaecology clinic of LUTH. Relevant information was obtained from these women using an interviewer-administered questionnaire. The data were analysed and presented with simple descriptive statistics using tables and charts. Chi-square statistics were used to test the association between sociodemographic variables and acceptance of HPV vaccination. All significance values were reported at P < 0.05. The mean age of the respondents was 35.7 ± 9.7 years. The study showed that 36.5% of the respondents had heard about HPV infection, while only 18.9% knew of the existence of HPV vaccines. Overall, 81.8% of the respondents accepted that the vaccines could be administered to their teenage daughters, with the mothers' level of education being the major determinant of acceptability (P = 0.013). Awareness of HPV infection and of the existence of HPV vaccines is low. However, the acceptance of HPV vaccines is generally high. Efforts should be made to increase awareness about cervical cancer, its aetiologies, and its prevention via HPV vaccination.
Haile, Tariku Gebre
2017-01-01
Background. In many studies, compliance with standard precautions among healthcare workers was reported to be inadequate. Objective. The aim of this study was to assess compliance with standard precautions and associated factors among healthcare workers in northwest Ethiopia. Methods. An institution-based cross-sectional study was conducted from March 01 to April 30, 2014. Simple random sampling technique was used to select participants. Data were entered into Epi info 3.5.1 and were exported to SPSS version 20.0 for statistical analysis. Multivariate logistic regression analyses were computed and adjusted odds ratio with 95% confidence interval was calculated to identify associated factors. Results. The proportion of healthcare workers who always comply with standard precautions was found to be 12%. Being a female healthcare worker (AOR [95% CI] 2.18 [1.12–4.23]), higher infection risk perception (AOR [95% CI] 3.46 [1.67–7.18]), training on standard precautions (AOR [95% CI] 2.90 [1.20–7.02]), accessibility of personal protective equipment (AOR [95% CI] 2.87 [1.41–5.86]), and management support (AOR [95% CI] 2.23 [1.11–4.53]) were found to be statistically significant. Conclusion and Recommendation. Compliance with standard precautions among the healthcare workers is very low. Interventions which include training of healthcare workers on standard precautions and consistent management support are recommended. PMID:28191020
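The adjusted-odds-ratio analysis described here is a standard logistic regression; a hedged R sketch with invented data (simulated so that the true odds ratios match two of the reported AORs) shows the mechanics:

```r
# Binary compliance outcome modeled on training and risk perception via glm().
set.seed(21)
n <- 483                                   # invented sample size
d <- data.frame(
  trained = rbinom(n, 1, 0.4),             # received training on standard precautions
  high_rp = rbinom(n, 1, 0.5)              # high infection risk perception
)
lp <- -2.5 + log(2.90) * d$trained + log(3.46) * d$high_rp  # true ORs from the abstract
d$comply <- rbinom(n, 1, plogis(lp))

fit <- glm(comply ~ trained + high_rp, family = binomial, data = d)
exp(cbind(AOR = coef(fit), confint.default(fit)))  # odds ratios with Wald 95% CIs
```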
O'Neel, Shad; Larsen, Christopher F.; Rupert, Natalia; Hansen, Roger
2010-01-01
Since the installation of the Alaska Regional Seismic Network in the 1970s, data analysts have noted nontectonic seismic events thought to be related to glacier dynamics. While loose associations with the glaciers of the St. Elias Mountains have been made, no detailed study of the source locations has been undertaken. We performed a two-step investigation surrounding these events, beginning with manual locations that guided an automated detection and event sifting routine. Results from the manual investigation highlight characteristics of the seismic waveforms including single-peaked (narrowband) spectra, emergent onsets, lack of distinct phase arrivals, and a predominant cluster of locations near the calving termini of several neighboring tidewater glaciers. Through these locations, comparison with previous work, analyses of waveform characteristics, frequency-magnitude statistics and temporal patterns in seismicity, we suggest calving as a source for the seismicity. Statistical properties and time series analysis of the event catalog suggest a scale-invariant process that has no single or simple forcing. These results support the idea that calving is often a response to short-lived or localized stress perturbations. Our results demonstrate the utility of passive seismic instrumentation to monitor relative changes in the rate and magnitude of iceberg calving at tidewater glaciers that may be volatile or susceptible to ensuing rapid retreat, especially when existing seismic infrastructure can be used.
Meat Quality Assessment by Electronic Nose (Machine Olfaction Technology)
Ghasemi-Varnamkhasti, Mahdi; Mohtasebi, Seyed Saeid; Siadat, Maryam; Balasubramanian, Sundar
2009-01-01
Over the last twenty years, newly developed chemical sensor systems (so called “electronic noses”) have made odor analyses possible. These systems involve various types of electronic chemical gas sensors with partial specificity, as well as suitable statistical methods enabling the recognition of complex odors. As commercial instruments have become available, a substantial increase in research into the application of electronic noses in the evaluation of volatile compounds in food, cosmetic and other items of everyday life is observed. At present, the commercial gas sensor technologies comprise metal oxide semiconductors, metal oxide semiconductor field effect transistors, organic conducting polymers, and piezoelectric crystal sensors. Further sensors based on fibreoptic, electrochemical and bi-metal principles are still in the developmental stage. Statistical analysis techniques range from simple graphical evaluation to multivariate analysis such as artificial neural network and radial basis function. The introduction of electronic noses into the area of food is envisaged for quality control, process monitoring, freshness evaluation, shelf-life investigation and authenticity assessment. Considerable work has already been carried out on meat, grains, coffee, mushrooms, cheese, sugar, fish, beer and other beverages, as well as on the odor quality evaluation of food packaging material. This paper describes the applications of these systems for meat quality assessment, where fast detection methods are essential for appropriate product management. The results suggest the possibility of using this new technology in meat handling. PMID:22454572
Learning optimal features for visual pattern recognition
NASA Astrophysics Data System (ADS)
Labusch, Kai; Siewert, Udo; Martinetz, Thomas; Barth, Erhardt
2007-02-01
The optimal coding hypothesis proposes that the human visual system has adapted to the statistical properties of the environment by the use of relatively simple optimality criteria. We here (i) discuss how the properties of different models of image coding, i.e. sparseness, decorrelation, and statistical independence, are related to each other, (ii) propose to evaluate the different models by verifiable performance measures, and (iii) analyse the classification performance on images of handwritten digits (MNIST database). We first employ the SPARSENET algorithm (Olshausen, 1998) to derive a local filter basis (on 13 × 13 pixel windows). We then filter the images in the database (28 × 28 pixel images of digits) and reduce the dimensionality of the resulting feature space by selecting the locally maximal filter responses. We then train a support vector machine on a training set to classify the digits and report results obtained on a separate test set. Currently, the best state-of-the-art result on the MNIST database has an error rate of 0.4%. This result, however, has been obtained by using explicit knowledge that is specific to the data (an elastic distortion model for digits). We here obtain an error rate of 0.55%, which is second best but does not use explicit data-specific knowledge. In particular, it outperforms by far all methods that do not use data-specific knowledge.
Role of gynecologists in reproductive education of adolescent girls in Hungary.
Varga-Tóth, Andrea; Paulik, Edit
2015-05-01
The aim of this study was to assess whether the socioeconomic characteristics of adolescent girls, their knowledge about cervical cancer screening, and their sexual activity are associated with whether or not they have already visited a gynecologist. A self-administered questionnaire-based study was performed among secondary school girls (n = 589) who participated in professional education provided by a pediatric and adolescent gynecologist. The questionnaire covered sociodemographic characteristics, sexual activity, knowledge of contraceptive methods and cervical screening, and the sources of this knowledge. Simple descriptive statistics, χ2 and one-way ANOVA tests, multivariate logistic regression analysis, and Pearson correlation were applied. All statistical analyses were carried out using SPSS 17.0 for Windows. A total of 50.3% of the adolescent girls had already had sexual contact. Half of the sexually active participants had already visited a gynecologist, and most of them did so because of some kind of complaint. Overall knowledge about cervical screening was quite low; better knowledge was found among those who had visited a gynecologist. Adolescent girls' knowledge of cervical screening was improved by previous visits to a gynecologist. The participation of an expert, a gynecologist, in a comprehensive sexual education program for teenage girls is of high importance in Hungary. © 2014 The Authors. Journal of Obstetrics and Gynaecology Research © 2014 Japan Society of Obstetrics and Gynecology.
Sexual violence against female university students in Ethiopia.
Adinew, Yohannes Mehretie; Hagos, Mihiret Abreham
2017-07-24
Though many women suffer the consequences of sexual violence, only a few victims speak out, as the subject is sensitive and prone to stigma. This lack of data has made it difficult to get a full picture of the problem and to design proper interventions. Thus, the aim of this study was to assess the prevalence of, and factors associated with, sexual violence among female students of Wolaita Sodo University, south Ethiopia. An institution-based cross-sectional study was conducted among 462 regular female Wolaita Sodo University students on 7 April 2015. Participants were selected by simple random sampling. Data were collected by self-administered questionnaire. Data entry and analysis were done with the Epi Info and SPSS statistical packages, respectively. Descriptive statistics were computed. Moreover, bivariate and multivariate analyses were carried out to identify predictors of sexual violence. The age of respondents ranged from 18 to 26 years. Lifetime sexual violence was found to be 45.4%. However, 36.1% and 24.4% of respondents reported experiencing sexual violence since entering university and in the current academic year, respectively. Lifetime sexual violence was positively associated with witnessing inter-parental violence as a child, rural childhood residence, having a regular boyfriend, alcohol consumption, and having friends who drink regularly, while it was negatively associated with discussing sexual issues with parents. Sexual violence is a common phenomenon among the students. More detailed research has to be conducted to develop prevention and intervention strategies.
Padmavathi, P
2014-01-01
Premenstrual syndrome is the most common of gynaecologic complaints. It affects half of all female adolescents today and represents the leading cause of college/school absenteeism in that population. This study sought to assess the effectiveness of acupressure versus reflexology on premenstrual syndrome among adolescents. A two-group pre-test and post-test true experimental design was adopted. Forty adolescent girls with premenstrual syndrome from Government Girls Secondary School, Erode, fulfilling the inclusion criteria, were selected by simple random sampling. A pre-test was conducted using a premenstrual symptoms assessment scale. Immediately after the pre-test, acupressure or reflexology was given once a week for 6 weeks, and a post-test was then conducted to assess the effectiveness of treatment. The collected data were analysed using descriptive and inferential statistics. In the post-test, the mean score of experimental group I was 97.3 (SD = 2.5) and the group II mean score was 70.8 (SD = 10.71), with paired t values of 19.2 and 31.9. This showed that reflexology was more effective than acupressure in managing premenstrual syndrome in this sample. Statistically, no significant association was found between the post-test scores of the sample and their demographic variables.
Statistical Analyses of Raw Material Data for MTM45-1/CF7442A-36% RW: CMH Cure Cycle
NASA Technical Reports Server (NTRS)
Coroneos, Rula; Pai, Shantaram, S.; Murthy, Pappu
2013-01-01
This report describes statistical characterization of physical properties of the composite material system MTM45-1/CF7442A, which has been tested and is currently being considered for use on spacecraft structures. This composite system is made of 6K plain weave graphite fibers in a highly toughened resin system. This report summarizes the distribution types and statistical details of the tests and the conditions for the experimental data generated. These distributions will be used in multivariate regression analyses to help determine material and design allowables for similar material systems and to establish a procedure for other material systems. Additionally, these distributions will be used in future probabilistic analyses of spacecraft structures. The specific properties that are characterized are the ultimate strength, modulus, and Poisson's ratio, by using a commercially available statistical package. Results are displayed using graphical and semigraphical methods and are included in the accompanying appendixes.
Post Hoc Analyses of ApoE Genotype-Defined Subgroups in Clinical Trials.
Kennedy, Richard E; Cutter, Gary R; Wang, Guoqiao; Schneider, Lon S
2016-01-01
Many post hoc analyses of clinical trials in Alzheimer's disease (AD) and mild cognitive impairment (MCI) come from small Phase 2 trials. Subject heterogeneity may lead to statistically significant post hoc results that cannot be replicated in larger follow-up studies. We investigated the extent of this problem using simulation studies mimicking current trial methods with post hoc analyses based on ApoE4 carrier status. We used a meta-database of 24 studies, including 3,574 subjects with mild AD and 1,171 subjects with MCI/prodromal AD, to simulate clinical trial scenarios. Post hoc analyses examined if rates of progression on the Alzheimer's Disease Assessment Scale-cognitive (ADAS-cog) differed between ApoE4 carriers and non-carriers. Across studies, ApoE4 carriers were younger and had lower baseline scores, greater rates of progression, and greater variability on the ADAS-cog. Up to 18% of post hoc analyses for 18-month trials in AD showed greater rates of progression for ApoE4 non-carriers that were statistically significant but unlikely to be confirmed in follow-up studies. The frequency of erroneous conclusions dropped below 3% with trials of 100 subjects per arm. In MCI, rates of statistically significant differences with greater progression in ApoE4 non-carriers remained below 3% unless sample sizes were below 25 subjects per arm. Statistically significant differences for ApoE4 in post hoc analyses often reflect heterogeneity among small samples rather than true differential effect among ApoE4 subtypes. Such analyses must be viewed cautiously. ApoE genotype should be incorporated into the design stage to minimize erroneous conclusions.
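The paper's central caution can be demonstrated with a small simulation: even when there is no true carrier effect, small trials frequently yield large, spurious subgroup differences. The outcome scale and variability below are invented, not taken from the meta-database:

```r
set.seed(123)
sim_interaction <- function(n_per_arm) {
  d <- data.frame(
    change  = rnorm(2 * n_per_arm, mean = 4, sd = 8),  # outcome change; no true
    carrier = rbinom(2 * n_per_arm, 1, 0.6),           # carrier-by-arm interaction
    arm     = rep(0:1, each = n_per_arm)
  )
  coef(lm(change ~ arm * carrier, data = d))["arm:carrier"]
}
est25  <- replicate(2000, sim_interaction(25))
est100 <- replicate(2000, sim_interaction(100))
mean(abs(est25)  > 4)   # small arms: frequent large spurious subgroup effects
mean(abs(est100) > 4)   # far rarer with 100 subjects per arm
```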
Rao, Goutham; Lopez-Jimenez, Francisco; Boyd, Jack; D'Amico, Frank; Durant, Nefertiti H; Hlatky, Mark A; Howard, George; Kirley, Katherine; Masi, Christopher; Powell-Wiley, Tiffany M; Solomonides, Anthony E; West, Colin P; Wessel, Jennifer
2017-09-05
Meta-analyses are becoming increasingly popular, especially in the fields of cardiovascular disease prevention and treatment. They are often considered to be a reliable source of evidence for making healthcare decisions. Unfortunately, problems among meta-analyses such as the misapplication and misinterpretation of statistical methods and tests are long-standing and widespread. The purposes of this statement are to review key steps in the development of a meta-analysis and to provide recommendations that will be useful for carrying out meta-analyses and for readers and journal editors, who must interpret the findings and gauge methodological quality. To make the statement practical and accessible, detailed descriptions of statistical methods have been omitted. Based on a survey of cardiovascular meta-analyses, published literature on methodology, expert consultation, and consensus among the writing group, key recommendations are provided. Recommendations reinforce several current practices, including protocol registration; comprehensive search strategies; methods for data extraction and abstraction; methods for identifying, measuring, and dealing with heterogeneity; and statistical methods for pooling results. Other practices should be discontinued, including the use of levels of evidence and evidence hierarchies to gauge the value and impact of different study designs (including meta-analyses) and the use of structured tools to assess the quality of studies to be included in a meta-analysis. We also recommend choosing a pooling model for conventional meta-analyses (fixed effect or random effects) on the basis of clinical and methodological similarities among studies to be included, rather than the results of a test for statistical heterogeneity. © 2017 American Heart Association, Inc.
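The pooling step the statement discusses can be written out directly. A base-R sketch of fixed-effect and DerSimonian-Laird random-effects inverse-variance pooling, on invented per-study log odds ratios:

```r
yi <- c(0.25, 0.10, 0.42, -0.05, 0.31)    # hypothetical study effects (log OR)
vi <- c(0.04, 0.09, 0.02, 0.12, 0.05)     # their within-study variances

w_fe <- 1 / vi                            # fixed-effect (inverse-variance) weights
fe   <- sum(w_fe * yi) / sum(w_fe)

Q    <- sum(w_fe * (yi - fe)^2)           # Cochran's Q heterogeneity statistic
tau2 <- max(0, (Q - (length(yi) - 1)) /
               (sum(w_fe) - sum(w_fe^2) / sum(w_fe)))  # DerSimonian-Laird tau^2
w_re <- 1 / (vi + tau2)                   # random-effects weights
re   <- sum(w_re * yi) / sum(w_re)

c(fixed = fe, random = re, tau2 = tau2)
```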
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
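A minimal worked example of simple linear regression in R, with an invented clinical dataset; lm fits the least-squares line and the closed-form slope is verified by hand:

```r
# Invented example: systolic blood pressure as a function of age.
age <- c(35, 42, 50, 58, 63, 70, 76)
sbp <- c(118, 121, 129, 133, 140, 148, 155)

fit <- lm(sbp ~ age)                      # least-squares line: sbp = b0 + b1*age
summary(fit)$coefficients                 # estimates with inference tests
plot(age, sbp); abline(fit)

# The same slope from the least-squares formula b1 = Sxy / Sxx:
sum((age - mean(age)) * (sbp - mean(sbp))) / sum((age - mean(age))^2)
```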
Statistical mechanics of simple models of protein folding and design.
Pande, V S; Grosberg, A Y; Tanaka, T
1997-01-01
It is now believed that the primary equilibrium aspects of simple models of protein folding are understood theoretically. However, current theories often resort to rather heavy mathematics to overcome some technical difficulties inherent in the problem or start from a phenomenological model. In response, we take a new approach in this pedagogical review of the statistical mechanics of protein folding. The benefit of our approach is a drastic mathematical simplification of the theory, without resort to any new approximations or phenomenological prescriptions. Indeed, the results we obtain agree precisely with previous calculations. Because of this simplification, we are able to present here a thorough and self-contained treatment of the problem. Topics discussed include the statistical mechanics of the random energy model (REM), tests of the validity of REM as a model for heteropolymer freezing, freezing transition of random sequences, phase diagram of designed ("minimally frustrated") sequences, and the degree to which errors in the interactions employed in simulations of either folding or design can still lead to correct folding behavior. PMID:9414231
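The freezing behaviour of the random energy model (REM) mentioned here is easy to demonstrate numerically. The R sketch below, with an assumed Gaussian energy scale, shows a handful of low-energy states coming to dominate the Boltzmann measure as temperature drops.

    set.seed(7)
    N <- 16
    E <- rnorm(2^N, mean = 0, sd = sqrt(N))   # iid energies, assumed REM scale
    for (temp in c(2.0, 1.0, 0.5)) {
      w <- exp(-(E - min(E)) / temp)          # shift by min(E) for stability
      p <- w / sum(w)
      # participation ratio ~ 1 / (number of states carrying the weight);
      # it grows toward O(1) as a few states dominate at low temperature
      cat(sprintf("T = %.1f  participation ratio = %.4f\n", temp, sum(p^2)))
    }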
Statistical self-similarity of width function maxima with implications to floods
Veitzer, S.A.; Gupta, V.K.
2001-01-01
Recently a new theory of random self-similar river networks, called the RSN model, was introduced to explain empirical observations regarding the scaling properties of distributions of various topologic and geometric variables in natural basins. The RSN model predicts that such variables exhibit statistical simple scaling, when indexed by Horton-Strahler order. The average side tributary structure of RSN networks also exhibits Tokunaga-type self-similarity which is widely observed in nature. We examine the scaling structure of distributions of the maximum of the width function for RSNs for nested, complete Strahler basins by performing ensemble simulations. The maximum of the width function exhibits distributional simple scaling, when indexed by Horton-Strahler order, for both RSNs and natural river networks extracted from digital elevation models (DEMs). We also test a power-law relationship between Horton ratios for the maximum of the width function and drainage areas. These results represent first steps in formulating a comprehensive physical statistical theory of floods at multiple space-time scales for RSNs as discrete hierarchical branching structures. © 2001 Published by Elsevier Science Ltd.
NASA Technical Reports Server (NTRS)
Todling, Ricardo
2015-01-01
Recently, this author studied an approach to the estimation of system error based on combining observation residuals derived from a sequential filter and fixed lag-1 smoother. In extending the methodology to a variational formulation and experimenting with simple models to verify consistency between the sequential and variational formulations, the limitations of the residual-based approach came clearly to the surface. This note uses the sequential assimilation application to simple nonlinear dynamics to highlight the issue. Only when some of the underlying error statistics are assumed known is it possible to estimate the unknown component. In general, when considerable uncertainties exist in the underlying statistics as a whole, attempts to obtain separate estimates of the various error covariances are bound to lead to misrepresentation of errors. The conclusions are particularly relevant to present-day attempts to estimate observation-error correlations from observation residual statistics. A brief illustration of the issue is also provided by comparing estimates of error correlations derived from a quasi-operational assimilation system and a corresponding Observing System Simulation Experiments framework.
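The identifiability problem the note describes can be shown with a scalar toy example of my own construction (not the note's experiments): the observation-minus-forecast residual constrains only the sum of the background and observation error variances.

    set.seed(3)
    n  <- 1e5
    B  <- 0.5; R <- 1.5                 # assumed true error variances
    xt <- rnorm(n)                      # true state
    xf <- xt + rnorm(n, sd = sqrt(B))   # forecast with background error
    y  <- xt + rnorm(n, sd = sqrt(R))   # observation with observation error
    d  <- y - xf                        # observation-minus-forecast residual
    var(d)  # ~ B + R = 2.0: any (B, R) pair summing to 2 fits equally well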
NASA Astrophysics Data System (ADS)
Pearl, Judea
2000-03-01
Written by one of the pre-eminent researchers in the field, this book provides a comprehensive exposition of modern analysis of causation. It shows how causality has grown from a nebulous concept into a mathematical theory with significant applications in the fields of statistics, artificial intelligence, philosophy, cognitive science, and the health and social sciences. Pearl presents a unified account of the probabilistic, manipulative, counterfactual and structural approaches to causation, and devises simple mathematical tools for analyzing the relationships between causal connections, statistical associations, actions and observations. The book will open the way for including causal analysis in the standard curriculum of statistics, artificial intelligence, business, epidemiology, social science and economics. Students in these areas will find natural models, simple identification procedures, and precise mathematical definitions of causal concepts that traditional texts have tended to evade or make unduly complicated. This book will be of interest to professionals and students in a wide variety of fields. Anyone who wishes to elucidate meaningful relationships from data, predict effects of actions and policies, assess explanations of reported events, or form theories of causal understanding and causal speech will find this book stimulating and invaluable.
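A minimal simulated example in the spirit of the adjustment ideas the book formalizes (variable names and effect sizes are mine): conditioning on a common cause removes the bias in a naive regression estimate of a causal effect.

    set.seed(11)
    n <- 1e4
    z <- rnorm(n)                      # common cause (confounder)
    x <- 0.8 * z + rnorm(n)            # "treatment" influenced by z
    y <- 1.0 * x + 1.5 * z + rnorm(n)  # true causal effect of x on y is 1.0
    coef(lm(y ~ x))["x"]      # naive association: biased upward by z
    coef(lm(y ~ x + z))["x"]  # back-door adjustment for z: close to 1.0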
Plastino, A; Rocca, M C
2017-06-01
Appealing to the 1902 Gibbs formalism for classical statistical mechanics (SM), the first axiomatic SM theory to successfully explain equilibrium thermodynamics, we show that already at the classical level there is a strong correlation between Rényi's exponent α and the number of particles for very simple systems. No reference to heat baths is needed for such a purpose.
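For reference, the Rényi entropy that defines the exponent α takes one line of R (a generic helper of my own, not from the paper):

    # Renyi entropy of a probability vector p, for alpha != 1
    renyi <- function(p, alpha) log(sum(p^alpha)) / (1 - alpha)
    renyi(rep(1/4, 4), alpha = 2)  # equals log(4) for a uniform distribution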
Factors associated with quality of life in active elderly.
Alexandre, Tiago da Silva; Cordeiro, Renata Cereda; Ramos, Luiz Roberto
2009-08-01
To analyze whether quality of life in active, healthy elderly individuals is influenced by functional status and sociodemographic characteristics, as well as psychological parameters. Study conducted in a sample of 120 active elderly subjects recruited from two open universities of the third age in the cities of São Paulo and São José dos Campos (Southeastern Brazil) between May 2005 and April 2006. Quality of life was measured using the abbreviated Brazilian version of the World Health Organization Quality of Life (WHOQOL-bref) questionnaire. Sociodemographic, clinical and functional variables were measured using cross-culturally validated instruments: the Mini Mental State Examination, Geriatric Depression Scale, Functional Reach, One-Leg Balance Test, Timed Up and Go Test, Six-Minute Walk Test, Human Activity Profile, and a complementary questionnaire. Simple descriptive analyses, Pearson's correlation coefficient, Student's t-test for independent samples, analyses of variance, linear regression analyses and variance inflation factors were performed. The significance level for all statistical tests was set at 0.05. Linear regression analysis showed an independent correlation, without collinearity, between depressive symptoms measured by the Geriatric Depression Scale and four domains of the WHOQOL-bref. Not having a conjugal life was associated with better perceived quality of life in the social domain; engaging in leisure activities and having an income over five minimum wages were associated with better perceived quality of life in the environment domain. Functional status had no influence on quality of life in the analysis models for these active elderly. In contrast, psychological factors, as assessed by the Geriatric Depression Scale, and sociodemographic characteristics, such as marital status, income and leisure activities, had an impact on quality of life.
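A compact R sketch of the analytic steps reported here, with simulated stand-ins for the WHOQOL-bref domain score and its predictors, including a hand-computed variance inflation factor:

    set.seed(5)
    n <- 120
    gds     <- rnorm(n)            # depressive symptoms (GDS-like score)
    income  <- rnorm(n)
    leisure <- rbinom(n, 1, 0.5)
    qol <- 60 - 4 * gds + 2 * income + 3 * leisure + rnorm(n, sd = 5)
    fit <- lm(qol ~ gds + income + leisure)
    summary(fit)
    # variance inflation factor by hand: VIF_j = 1 / (1 - R^2_j), where
    # R^2_j comes from regressing predictor j on the other predictors
    vif_gds <- 1 / (1 - summary(lm(gds ~ income + leisure))$r.squared)
    vif_gds  # values near 1 indicate no problematic collinearity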
Perez-Rodriguez, M Mercedes; Garcia-Nieto, Rebeca; Fernandez-Navarro, Pablo; Galfalvy, Hanga; de Leon, Jose; Baca-Garcia, Enrique
2012-01-01
Objectives: To investigate the trends and correlations between gross domestic product (GDP) adjusted for purchasing power parity (PPP) per capita and suicide rates in 10 WHO regions during the past 30 years. Design: Analyses of databases of PPP-adjusted GDP per capita and suicide rates. Countries were grouped according to the Global Burden of Disease regional classification system. Data sources: The World Bank's official website and WHO's mortality database. Statistical analyses: After graphically displaying PPP-adjusted GDP per capita and suicide rates, mixed effect models were used to represent and analyse the clustered data. Results: Three different groups of countries, based on the correlation between PPP-adjusted GDP per capita and suicide rates, are reported: (1) positive correlation: developing (lower-middle and upper-middle income) Latin American and Caribbean countries, developing countries in the South East Asian Region including India, some countries in the Western Pacific Region (such as China and South Korea) and high-income Asian countries, including Japan; (2) negative correlation: high-income and developing European countries, Canada, Australia and New Zealand; and (3) no correlation, found in an African country. Conclusions: PPP-adjusted GDP per capita may offer a simple measure for designing the type of preventive interventions aimed at lowering suicide rates that can be used across countries. Public health interventions might be more suitable for developing countries. In high-income countries, however, preventive measures based on the medical model might prove more useful. PMID:22586285
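A hedged sketch of the clustered-data model in R, assuming the lme4 package is installed and using simulated country-level data in place of the WHO and World Bank series:

    library(lme4)  # assumed available
    set.seed(9)
    # simulated stand-in: 30 countries x 30 years, with country-specific
    # GDP-suicide slopes that can be positive or negative, echoing the
    # paper's three groups of countries
    df <- expand.grid(country = factor(1:30), year = 1:30)
    slope   <- rnorm(30)        # per-country effect of GDP on the rate
    df$gdp  <- rnorm(nrow(df))  # standardized PPP-adjusted GDP per capita
    df$rate <- 10 + slope[df$country] * df$gdp + rnorm(nrow(df))
    # random intercept and GDP slope for each country
    fit <- lmer(rate ~ gdp + (1 + gdp | country), data = df)
    summary(fit)  # fixed effect near 0, large slope variance across countries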
Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E
2016-03-11
Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.
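The following R sketch illustrates the general scoring idea (random forest importance aggregated over gene sets) using simulated data and the widely available randomForest package; it is not the GOexpress API, whose actual functions differ.

    library(randomForest)  # assumed available
    set.seed(2)
    expr <- matrix(rnorm(60 * 200), nrow = 60)   # 60 samples x 200 genes
    colnames(expr) <- paste0("g", 1:200)
    group <- factor(rep(c("ctrl", "infected"), each = 30))
    # make the first 10 genes informative for the infected group
    expr[group == "infected", 1:10] <- expr[group == "infected", 1:10] + 1.5
    rf  <- randomForest(expr, group, importance = TRUE)
    imp <- importance(rf, type = 2)[, 1]         # mean decrease in node impurity
    # toy "GO terms": random gene sets, each scored by mean member importance
    sets   <- replicate(20, sample(colnames(expr), 15), simplify = FALSE)
    scores <- sapply(sets, function(s) mean(imp[s]))
    order(scores, decreasing = TRUE)[1:5]        # top-ranked gene sets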
Youngstrom, Eric A
2014-03-01
To offer a practical demonstration of receiver operating characteristic (ROC) analyses, diagnostic efficiency statistics, and their application to clinical decision making using a popular parent checklist to assess for potential mood disorder. Secondary analyses of data from 589 families seeking outpatient mental health services, completing the Child Behavior Checklist and semi-structured diagnostic interviews. Internalizing Problems raw scores discriminated mood disorders significantly better than did age- and gender-normed T scores, or an Affective Problems score. Internalizing scores <8 had a diagnostic likelihood ratio <0.3, and scores >30 had a diagnostic likelihood ratio of 7.4. This study illustrates a series of steps in defining a clinical problem, operationalizing it, selecting a valid study design, and using ROC analyses to generate statistics that support clinical decisions. The ROC framework offers important advantages for clinical interpretation. Appendices include sample scripts using SPSS and R to check assumptions and conduct ROC analyses.
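In the same spirit as the appendices' R scripts, here is a self-contained sketch of an ROC curve and diagnostic likelihood ratios computed by hand; the simulated scores and the cutoff are invented for illustration, not the CBCL values.

    set.seed(8)
    n <- 589
    mood  <- rbinom(n, 1, 0.3)  # mood disorder present (1) or absent (0)
    score <- rnorm(n, mean = ifelse(mood == 1, 20, 12), sd = 6)
    # ROC by hand: sweep cutoffs, record sensitivity and specificity
    cuts <- sort(unique(score))
    sens <- sapply(cuts, function(k) mean(score[mood == 1] >= k))
    spec <- sapply(cuts, function(k) mean(score[mood == 0] <  k))
    plot(1 - spec, sens, type = "l")  # the ROC curve
    # AUC via the rank (Mann-Whitney) formulation
    mean(outer(score[mood == 1], score[mood == 0], ">"))
    # diagnostic likelihood ratio for a high-score band, e.g. score > 25
    sens_band <- mean(score[mood == 1] > 25)
    fpr_band  <- mean(score[mood == 0] > 25)
    sens_band / fpr_band  # DLR+: how much scores > 25 raise the odds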
Winer, E Samuel; Cervone, Daniel; Bryant, Jessica; McKinney, Cliff; Liu, Richard T; Nadorff, Michael R
2016-09-01
A popular way to attempt to discern causality in clinical psychology is through mediation analysis. However, mediation analysis is sometimes applied to research questions in clinical psychology when inferring causality is impossible. This practice may soon increase with new, readily available, and easy-to-use statistical advances. Thus, we here provide a heuristic to remind clinical psychological scientists of the assumptions of mediation analyses. We describe recent statistical advances and unpack assumptions of causality in mediation, underscoring the importance of time in understanding mediational hypotheses and analyses in clinical psychology. Example analyses demonstrate that statistical mediation can occur despite theoretical mediation being improbable. We propose a delineation of mediational effects derived from cross-sectional designs into the terms temporal and atemporal associations to emphasize time in conceptualizing process models in clinical psychology. The general implications for mediational hypotheses and the temporal frameworks from within which they may be drawn are discussed. © 2016 Wiley Periodicals, Inc.
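A small simulation makes the authors' warning tangible. Below, data are generated with no causal path from the mediator to the outcome, yet the product-of-coefficients indirect effect comes out clearly nonzero; the data-generating model is my own illustrative construction.

    set.seed(4)
    n <- 500
    u <- rnorm(n)                # unmodeled common cause of m and y
    x <- rnorm(n)
    m <- 0.5 * x + u + rnorm(n)  # mediator related to x and to u
    y <- u + rnorm(n)            # outcome caused by u only, never by m or x
    a <- coef(lm(m ~ x))["x"]          # a path
    b <- coef(lm(y ~ x + m))["m"]      # b path, biased by the confounder u
    unname(a * b)  # "indirect effect" is nonzero although m does not cause y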
Cancer Data and Statistics Tools
Analysing Simple Electric Motors in the Classroom
ERIC Educational Resources Information Center
Yap, Jeff; MacIsaac, Dan
2006-01-01
Electromagnetic phenomena and devices such as motors are typically unfamiliar to both teachers and students. To better visualize and illustrate the abstract concepts (such as magnetic fields) underlying electricity and magnetism, we suggest that students construct and analyse the operation of a simply constructed Johnson electric motor. In this…
Statistical methodologies for the control of dynamic remapping
NASA Technical Reports Server (NTRS)
Saltz, J. H.; Nicol, D. M.
1986-01-01
Following an initial mapping of a problem onto a multiprocessor machine or computer network, system performance often deteriorates with time. In order to maintain high performance, it may be necessary to remap the problem. The decision to remap must take into account measurements of performance deterioration, the cost of remapping, and the estimated benefits achieved by remapping. We examine the tradeoff between the costs and the benefits of remapping two qualitatively different kinds of problems. One problem assumes that performance deteriorates gradually, the other assumes that performance deteriorates suddenly. We consider a variety of policies for governing when to remap. In order to evaluate these policies, statistical models of problem behaviors are developed. Simulation results are presented which compare simple policies with computationally expensive optimal decision policies; these results demonstrate that for each problem type, the proposed simple policies are effective and robust.
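As a rough illustration of the gradual-deterioration case (with invented cost and degradation parameters, not the paper's models), the R sketch below compares a simple threshold remapping policy with never remapping:

    set.seed(6)
    steps <- 1000; base <- 1.0; remap_cost <- 25
    run <- function(threshold) {
      degrade <- 0; total <- 0
      for (i in seq_len(steps)) {
        degrade <- degrade + abs(rnorm(1, 0.02, 0.02))  # gradual deterioration
        if (degrade > threshold) {            # remap: pay the cost, reset
          total <- total + remap_cost
          degrade <- 0
        }
        total <- total + base + degrade       # per-step work grows with degradation
      }
      total
    }
    c(never = run(Inf), threshold = run(0.5))  # lower total time is better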
On a simple molecular–statistical model of a liquid-crystal suspension of anisometric particles
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zakhlevnykh, A. N., E-mail: anz@psu.ru; Lubnin, M. S.; Petrov, D. A.
2016-11-15
A molecular–statistical mean-field theory is constructed for suspensions of anisometric particles in nematic liquid crystals (NLCs). The spherical approximation, well known in the physics of ferromagnetic materials, is adopted; it allows one to obtain an analytic expression for the free energy and simple equations for the orientational state of a suspension that describe the temperature dependence of the order parameters of the suspension components. The transition temperature from the ordered to the isotropic state and the jumps in the order parameters at the phase-transition point are studied as functions of the anchoring energy of the dispersed particles to the matrix, the concentration of the impurity phase, and the size of the particles. The proposed approach allows one to generalize the model to the case of biaxial ordering.
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high-throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
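The following R sketch captures the spirit of the WFS scheme: test each structural feature for enrichment among toxic compounds, convert significance into a weight, and sum weights into an additive score. The data, feature definitions, and weighting are simulated assumptions, not the published implementation.

    set.seed(10)
    n_cmpd <- 400; n_feat <- 50
    feats  <- matrix(rbinom(n_cmpd * n_feat, 1, 0.2), ncol = n_feat)
    # assumed truth: the first 5 features drive toxicity
    toxic  <- rbinom(n_cmpd, 1,
                     plogis(feats %*% c(rep(1.2, 5), rep(0, 45)) - 1))
    # per-feature enrichment significance in toxic compounds (Fisher's test)
    pvals <- apply(feats, 2, function(f)
      fisher.test(table(factor(f, 0:1), factor(toxic, 0:1)),
                  alternative = "greater")$p.value)
    w     <- -log10(pvals)           # feature weights from significance
    score <- as.vector(feats %*% w)  # additive toxicity score per compound
    cor(score, toxic)                # scores should track toxicity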
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
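Two of the four DSAS rate methods are simple enough to sketch for a single transect; the R lines below use invented shoreline positions and are not DSAS code.

    year <- c(1930, 1955, 1978, 1997, 2005)
    pos  <- c(120, 111, 104, 95, 92)  # shoreline distance along transect, metres
    # (1) endpoint rate: net change over elapsed time
    epr <- (pos[length(pos)] - pos[1]) / (year[length(year)] - year[1])
    # (2) simple linear regression rate: slope of position on time
    fit <- lm(pos ~ year)
    slr <- coef(fit)["year"]
    confint(fit)["year", ]                       # confidence interval on the rate
    c(endpoint = epr, regression = unname(slr))  # metres/year (negative = erosion)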