Sample records for simple statistical tools

  1. Using Statistical Process Control to Make Data-Based Clinical Decisions.

    ERIC Educational Resources Information Center

    Pfadt, Al; Wheeler, Donald J.

    1995-01-01

    Statistical process control (SPC), which employs simple statistical tools and problem-solving techniques such as histograms, control charts, flow charts, and Pareto charts to implement continual product improvement procedures, can be incorporated into human service organizations. Examples illustrate use of SPC procedures to analyze behavioral data…

  2. A Simple Graphical Method for Quantification of Disaster Management Surge Capacity Using Computer Simulation and Process-control Tools.

    PubMed

    Franc, Jeffrey Michael; Ingrassia, Pier Luigi; Verde, Manuela; Colombo, Davide; Della Corte, Francesco

    2015-02-01

    Surge capacity, or the ability to manage an extraordinary volume of patients, is fundamental for hospital management of mass-casualty incidents. However, quantification of surge capacity is difficult and no universal standard for its measurement has emerged, nor has a standardized statistical method been advocated. As mass-casualty incidents are rare, simulation may represent a viable alternative to measure surge capacity. Hypothesis/Problem: The objective of the current study was to develop a statistical method for the quantification of surge capacity using a combination of computer simulation and simple process-control statistical tools. Length-of-stay (LOS) and patient volume (PV) were used as metrics. The use of this method was then demonstrated on a subsequent computer simulation of an emergency department (ED) response to a mass-casualty incident. In the derivation phase, 357 participants in five countries performed 62 computer simulations of an ED response to a mass-casualty incident. Benchmarks for ED response were derived from these simulations, including LOS and PV metrics for triage, bed assignment, physician assessment, and disposition. In the application phase, 13 students of the European Master in Disaster Medicine (EMDM) program completed the same simulation scenario, and the results were compared to the standards obtained in the derivation phase. Patient-volume metrics included number of patients to be triaged, assigned to rooms, assessed by a physician, and disposed. Length-of-stay metrics included median time to triage, room assignment, physician assessment, and disposition. Simple graphical methods were used to compare the application phase group to the derived benchmarks using process-control statistical tools. The group in the application phase failed to meet the indicated standard for LOS from admission to disposition decision. This study demonstrates how simulation software can be used to derive values for objective benchmarks of ED surge capacity using PV and LOS metrics. These objective metrics can then be applied to other simulation groups using simple graphical process-control tools to provide a numeric measure of surge capacity. Repeated use in simulations of actual EDs may represent a potential means of objectively quantifying disaster management surge capacity. It is hoped that the described statistical method, which is simple and reusable, will be useful for investigators in this field to apply to their own research.

  3. Simple Statistics: Summarized!

    ERIC Educational Resources Information Center

    Blai, Boris, Jr.

    Statistics are an essential tool for making sound judgements and decisions. The field is concerned with probability distribution models, testing of hypotheses, significance tests, and other means of determining the correctness of deductions and the most likely outcome of decisions. Measures of central tendency include the mean, median, and mode. A second…

  4. Application of Transformations in Parametric Inference

    ERIC Educational Resources Information Center

    Brownstein, Naomi; Pensky, Marianna

    2008-01-01

    The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate the performance of this powerful tool on examples of the construction of various estimation procedures, hypothesis testing, Bayes analysis, and statistical inference for stress-strength systems.…

  5. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. The method uses sliding windows to calculate several dozen new variables with simple statistical tools such as first and second moments, as well as more complicated statistics such as autoregression coefficients and residual analysis, followed by an optional quadratic transformation used for further data extension. These variables served as explanatory variables in a regularized logistic LASSO regression that was used to estimate the Buy-Sell Index (BSI) from real stock market data.
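
    The record stops at the method description; purely as an illustration (not the author's code), the sketch below reproduces the same pipeline shape on synthetic prices: log differences, sliding-window moment and autoregression features with a quadratic expansion, and an L1-regularized logistic regression standing in for the BSI model.

```python
# Hedged sketch (not the author's code): log-difference a synthetic price series,
# build sliding-window moment/autoregression features with a quadratic expansion,
# and fit an L1-regularized ("LASSO-style") logistic regression as a stand-in BSI model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))  # geometric random walk
logret = np.diff(np.log(prices))                             # simple preprocessing: log differences

window = 20
X, y = [], []
for t in range(window, len(logret) - 1):
    w = logret[t - window:t]
    ar1 = np.corrcoef(w[:-1], w[1:])[0, 1]                   # crude AR(1) proxy
    feats = [w.mean(), w.std(), ar1]
    quad = np.outer(feats, feats)[np.triu_indices(3)]        # optional quadratic extension
    X.append(feats + list(quad))
    y.append(int(logret[t + 1] > 0))                         # placeholder for a Buy-Sell Index label

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(np.array(X), np.array(y))
print("non-zero coefficients kept by the L1 penalty:", np.count_nonzero(clf.coef_))
```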

  6. Cancer Data and Statistics Tools

    MedlinePlus

  7. Why Are Shot Puts Thrown at 31°? Using Autograph for Applications of the Parabola

    ERIC Educational Resources Information Center

    Butler, Douglas

    2010-01-01

    Autograph is a two- and three-dimensional dynamic statistics and graphing utility, developed in England, that has grown out of direct classroom experience. A simple select-and-right-click interface, together with tools such as Autograph's unique Slow Plot, Scribble Tool, and dynamic Constant Controller, helps make the classroom experience…

  8. Development of a Statistical Validation Methodology for Fire Weather Indices

    Treesearch

    Brian E. Potter; Scott Goodrick; Tim Brown

    2003-01-01

    Fire managers and forecasters must have tools, such as fire indices, to summarize large amounts of complex information. These tools allow them to identify and plan for periods of elevated risk and/or wildfire potential. This need was once met using simple measures like relative humidity or maximum daily temperature (e.g., Gisborne, 1936) to describe fire weather, and...

  9. Introduction, comparison, and validation of Meta‐Essentials: A free and simple tool for meta‐analysis

    PubMed Central

    van Rhee, Henk; Hak, Tony

    2017-01-01

    We present a new tool for meta‐analysis, Meta‐Essentials, which is free of charge and easy to use. In this paper, we introduce the tool and compare its features to other tools for meta‐analysis. We also provide detailed information on the validation of the tool. Although free of charge and simple, Meta‐Essentials automatically calculates effect sizes from a wide range of statistics and can be used for a wide range of meta‐analysis applications, including subgroup analysis, moderator analysis, and publication bias analyses. The confidence interval of the overall effect is automatically based on the Knapp‐Hartung adjustment of the DerSimonian‐Laird estimator. However, more advanced meta‐analysis methods such as meta‐analytical structural equation modelling and meta‐regression with multiple covariates are not available. In summary, Meta‐Essentials may prove a valuable resource for meta‐analysts, including researchers, teachers, and students. PMID:28801932
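
    As a companion to the abstract (not taken from Meta-Essentials), the following sketch works through the named calculation on invented study data: a DerSimonian-Laird estimate of the between-study variance and a Knapp-Hartung adjusted confidence interval for the pooled effect.

```python
# Hedged sketch with invented study data: DerSimonian-Laird between-study variance
# and a Knapp-Hartung adjusted confidence interval for the pooled random-effects mean.
import numpy as np
from scipy import stats

yi = np.array([0.30, 0.10, 0.45, 0.25, 0.05])   # hypothetical effect sizes
vi = np.array([0.02, 0.03, 0.04, 0.02, 0.05])   # hypothetical within-study variances
k = len(yi)

w = 1.0 / vi
mu_fe = np.sum(w * yi) / np.sum(w)
Q = np.sum(w * (yi - mu_fe) ** 2)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))  # DerSimonian-Laird

ws = 1.0 / (vi + tau2)                           # random-effects weights
mu = np.sum(ws * yi) / np.sum(ws)
var_kh = np.sum(ws * (yi - mu) ** 2) / ((k - 1) * np.sum(ws))            # Knapp-Hartung variance
half = stats.t.ppf(0.975, k - 1) * np.sqrt(var_kh)                       # t-based interval, df = k-1
print(f"pooled effect {mu:.3f}, 95% CI [{mu - half:.3f}, {mu + half:.3f}], tau^2 = {tau2:.4f}")
```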

  10. OPTHYLIC: An Optimised Tool for Hybrid Limits Computation

    NASA Astrophysics Data System (ADS)

    Busato, Emmanuel; Calvet, David; Theveneaux-Pelzer, Timothée

    2018-05-01

    A software tool, computing observed and expected upper limits on Poissonian process rates using a hybrid frequentist-Bayesian CLs method, is presented. This tool can be used for simple counting experiments where only signal, background and observed yields are provided or for multi-bin experiments where binned distributions of discriminating variables are provided. It allows the combination of several channels and takes into account statistical and systematic uncertainties, as well as correlations of systematic uncertainties between channels. It has been validated against other software tools and analytical calculations, for several realistic cases.
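
    For orientation only, and not OPTHYLIC itself, here is a toy single-channel version of a hybrid frequentist-Bayesian CLs limit: the background uncertainty is marginalized by sampling, and the signal rate is scanned until CLs drops below 0.05. The yields and uncertainty are invented.

```python
# Illustrative toy (not OPTHYLIC): hybrid CLs upper limit for one counting channel.
# The background uncertainty acts as a nuisance parameter sampled from a Gaussian prior.
import numpy as np

rng = np.random.default_rng(1)
n_obs, b, db = 7, 5.0, 1.0          # assumed observed yield, background estimate, uncertainty
ntoys = 50_000

def cls(s):
    b_toys = np.clip(rng.normal(b, db, ntoys), 0, None)   # marginalize the background prior
    p_sb = np.mean(rng.poisson(b_toys + s) <= n_obs)      # CL_{s+b}
    p_b = np.mean(rng.poisson(b_toys) <= n_obs)           # CL_b
    return p_sb / p_b

# crude scan of the signal rate until CLs falls below 0.05
limit = next(s for s in np.linspace(0.0, 20.0, 201) if cls(s) < 0.05)
print(f"observed 95% CL upper limit on the signal rate: about {limit:.1f}")
```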

  11. SPARSKIT: A basic tool kit for sparse matrix computations

    NASA Technical Reports Server (NTRS)

    Saad, Youcef

    1990-01-01

    Presented here are the main features of a tool package for manipulating and working with sparse matrices. One of the goals of the package is to provide basic tools to facilitate the exchange of software and data between researchers in sparse matrix computations. The starting point is the Harwell/Boeing collection of matrices for which the authors provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, and performing linear algebra operations with sparse matrices.
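
    SPARSKIT itself is a Fortran tool kit; purely as a rough modern analogue of the operations listed (data-structure conversion, simple matrix statistics, basic linear algebra), here is a hedged sketch using scipy.sparse.

```python
# Illustration only (SPARSKIT is a Fortran package): analogous operations in scipy.sparse.
import numpy as np
import scipy.sparse as sp

A = sp.random(1000, 1000, density=0.002, format="coo", random_state=0)

A_csr = A.tocsr()                           # data-structure conversion (COO -> CSR)
row_nnz = np.diff(A_csr.indptr)             # nonzeros per row, read off the CSR index pointer
print(f"nnz = {A_csr.nnz}, density = {A_csr.nnz / np.prod(A_csr.shape):.4%}, "
      f"max nonzeros in a row = {row_nnz.max()}")        # simple matrix statistics

y = A_csr @ np.ones(A_csr.shape[1])         # basic linear algebra with the sparse matrix
print("norm of A @ 1 =", np.linalg.norm(y))
```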

  12. [Is there life beyond SPSS? Discover R].

    PubMed

    Elosua Oliden, Paula

    2009-11-01

    R is a GNU statistical and programming environment with very high graphical capabilities. It is very powerful for research purposes, but it is also an exceptional tool for teaching. R comprises more than 1400 packages, which allow it to be used for simple statistics as well as for applying the most complex and most recent formal models. Graphical interfaces such as the Rcommander package permit working in user-friendly environments similar to the graphical environment used by SPSS. This last characteristic allows non-statisticians to overcome the obstacle of accessibility, and it makes R the best tool for teaching. Is there anything better? Open, free, affordable, accessible and always on the cutting edge.

  13. Introduction, comparison, and validation of Meta-Essentials: A free and simple tool for meta-analysis.

    PubMed

    Suurmond, Robert; van Rhee, Henk; Hak, Tony

    2017-12-01

    We present a new tool for meta-analysis, Meta-Essentials, which is free of charge and easy to use. In this paper, we introduce the tool and compare its features to other tools for meta-analysis. We also provide detailed information on the validation of the tool. Although free of charge and simple, Meta-Essentials automatically calculates effect sizes from a wide range of statistics and can be used for a wide range of meta-analysis applications, including subgroup analysis, moderator analysis, and publication bias analyses. The confidence interval of the overall effect is automatically based on the Knapp-Hartung adjustment of the DerSimonian-Laird estimator. However, more advanced meta-analysis methods such as meta-analytical structural equation modelling and meta-regression with multiple covariates are not available. In summary, Meta-Essentials may prove a valuable resource for meta-analysts, including researchers, teachers, and students. © 2017 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.

  14. Hood of the truck statistics for food animal practitioners.

    PubMed

    Slenning, Barrett D

    2006-03-01

    This article offers some tips on working with statistics and develops four relatively simple procedures to deal with most kinds of data with which veterinarians work. The criterion for a procedure to be a "Hood of the Truck Statistics" (HOT Stats) technique is that it must be simple enough to be done with pencil, paper, and a calculator. The goal of HOT Stats is to have the tools available to run quick analyses in only a few minutes so that decisions can be made in a timely fashion. The discipline allows us to move away from the all-too-common guesswork about effects and differences we perceive following a change in treatment or management. The techniques allow us to move toward making more defensible, credible, and more quantifiably "risk-aware" real-time recommendations to our clients.

  15. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check whether our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. Finally, we apply the proposed methodology to an siRNA screen and a gene expression experiment.

  16. Implementation and Use of the Reference Analytics Module of LibAnswers

    ERIC Educational Resources Information Center

    Flatley, Robert; Jensen, Robert Bruce

    2012-01-01

    Academic libraries have traditionally collected reference statistics using hash marks on paper. Although efficient and simple, this method is not an effective way to capture the complexity of reference transactions. Several electronic tools are now available to assist libraries with collecting often elusive reference data--among them homegrown…

  17. The Web as an educational tool for/in learning/teaching bioinformatics statistics.

    PubMed

    Oliver, J; Pisano, M E; Alonso, T; Roca, P

    2005-12-01

    Statistics provides essential tools in bioinformatics for interpreting the results of a database search and for managing the enormous amounts of information produced by genomics, proteomics and metabolomics. The goal of this project was to develop a software tool that is as simple as possible for demonstrating the use of statistics in bioinformatics. Computer Simulation Methods (CSMs) developed in Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing, easy graphical representation, and general acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When tested on students with previous coursework under traditional statistical teaching methods, the general consensus was that Web-based instruction had numerous advantages, but that traditional methods with manual calculations were still needed for theory and practice. Once the basic statistical formulas had been mastered, Excel spreadsheets and graphics proved very useful for trying many parameters rapidly without tedious calculations. CSMs will be of great importance for the training of students and professionals in the field of bioinformatics, and for upcoming applications in self-directed learning and continuing education.

  18. MyPMFs: a simple tool for creating statistical potentials to assess protein structural models.

    PubMed

    Postic, Guillaume; Hamelryck, Thomas; Chomilier, Jacques; Stratmann, Dirk

    2018-05-29

    Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs. Copyright © 2018. Published by Elsevier B.V.

  19. Statistical Issues for Uncontrolled Reentry Hazards Empirical Tests of the Predicted Footprint for Uncontrolled Satellite Reentry Hazards

    NASA Technical Reports Server (NTRS)

    Matney, Mark

    2011-01-01

    A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, material, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. Because this information is used in making policy and engineering decisions, it is important that these assumptions be tested using empirical data. This study uses the latest database of known uncontrolled reentry locations measured by the United States Department of Defense. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors in the final stages of reentry - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and possibly change the probability of reentering over a given location. In this paper, the measured latitude and longitude distributions of these objects are directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.
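
    Not from the paper: the "simple Kepler orbit" baseline being tested implies a reentry latitude distribution that depends only on orbital inclination, f(phi) = cos(phi) / (pi * sqrt(sin^2 i - sin^2 phi)) for |phi| < i. The sketch below checks that formula by Monte Carlo for an assumed inclination.

```python
# Not the paper's code: Monte Carlo check of the simple-Kepler-orbit latitude density
#   f(phi) = cos(phi) / (pi * sqrt(sin(i)^2 - sin(phi)^2)),   |phi| < i,
# for an assumed circular orbit with uniform argument of latitude.
import numpy as np

inc = np.radians(51.6)                              # hypothetical inclination
u = np.random.default_rng(2).uniform(0, 2 * np.pi, 1_000_000)
lat = np.arcsin(np.sin(inc) * np.sin(u))            # reentry latitudes implied by the geometry

edges = np.linspace(-inc + 1e-3, inc - 1e-3, 21)
hist, _ = np.histogram(lat, bins=edges, density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
f = np.cos(mid) / (np.pi * np.sqrt(np.sin(inc) ** 2 - np.sin(mid) ** 2))
print(np.round(hist / f, 2))   # ratios near 1, except in the edge bins near the integrable singularity
```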

  20. Cloud-based solution to identify statistically significant MS peaks differentiating sample categories.

    PubMed

    Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B

    2013-03-23

    Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. To date, multiple challenges in protein MS data analysis remain: large-scale and complex data set management; MS peak identification and indexing; and high-dimensional peak differential analysis with false discovery rate (FDR) control based on concurrent statistical tests. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
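
    The portal's internals are not described in the record; as a generic stand-in for the concurrent-testing FDR step, here is a Benjamini-Hochberg sketch on simulated per-peak p-values.

```python
# Generic stand-in for the FDR step: Benjamini-Hochberg selection of significant peaks
# from simulated per-peak p-values (the portal's actual tests are not reproduced here).
import numpy as np

rng = np.random.default_rng(3)
pvals = np.concatenate([rng.uniform(0, 0.001, 20),   # 20 "truly different" peaks
                        rng.uniform(0, 1, 980)])     # 980 null peaks
q = 0.05                                             # target false discovery rate

order = np.argsort(pvals)
m = len(pvals)
passed = pvals[order] <= q * np.arange(1, m + 1) / m # step-up comparison p_(k) <= q*k/m
k = passed.nonzero()[0].max() + 1 if passed.any() else 0
significant = np.zeros(m, dtype=bool)
significant[order[:k]] = True                        # reject everything up to the largest passing rank
print("peaks declared significant at FDR 5%:", int(significant.sum()))
```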

  1. CAN'T MISS--conquer any number task by making important statistics simple. Part 2. Probability, populations, samples, and normal distributions.

    PubMed

    Hansen, John P

    2003-01-01

    Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.
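
    As a concrete companion to the series' goal of calculating confidence intervals from sample data (example values invented, not from the article):

```python
# Hedged example with invented data: a t-based 95% confidence interval for a sample mean,
# one of the inferential tools the series builds toward.
import numpy as np
from scipy import stats

sample = np.array([3.2, 4.1, 2.8, 5.0, 3.6, 4.4, 3.9, 2.5, 4.8, 3.3])  # hypothetical measurements
n = len(sample)
mean = sample.mean()
sem = sample.std(ddof=1) / np.sqrt(n)          # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)
print(f"mean {mean:.2f}, 95% CI [{mean - t_crit * sem:.2f}, {mean + t_crit * sem:.2f}]")
```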

  2. Novel simple and practical nutritional screening tool for cancer inpatients: a pilot study.

    PubMed

    Zekri, Jamal; Morganti, Julie; Rizvi, Azhar; Sadiq, Bakr Bin; Kerr, Ian; Aslam, Mohamed

    2014-05-01

    There is a lack of consensus on how nutritional screening and intervention should be provided to cancer patients, and nutritional screening and support of cancer patients are not well established in the Middle East. We report our systematic and practical experience, led by a qualified specialist dietician in a cancer inpatient setting, using a novel nutritional screening tool. Ninety-seven consecutive inpatients underwent nutritional screening and were categorised into three nutritional risk groups based on oral intake, gastrointestinal symptoms, body mass index (BMI) and weight loss. Nutritional support was introduced accordingly. Statistical tests used included ANOVA, Bonferroni post hoc, chi-square and log rank tests. Median age was 48 (range 19-87) years. Patients were categorised into three nutritional risk groups: 55% low, 37% intermediate and 8% high. Nutritional intervention was introduced for 36% of these patients. Individually, weight, BMI, oral intake, serum albumin on admission and weight loss significantly affected nutritional risk and nutritional intervention (all P values significant). Eighty-seven, 60 and 55% of patients admitted for chemotherapy, febrile neutropenia and other reasons, respectively, did not require specific nutritional intervention. There was a statistically significant relationship between nutritional risk and nutritional intervention (P=0.005). Significantly more patients were alive at 3 months in the low-risk (91%) than in the intermediate-risk (75%) and high-risk (37%) groups. About a third of cancer inpatients require nutritional intervention. The adopted nutritional risk assessment tool is simple and practical, and its validity is supported by its significant relation with known individual nutritional risk factors. This should be confirmed in a larger prospective study comparing this new tool with other established ones.

  3. A simple rapid process for semi-automated brain extraction from magnetic resonance images of the whole mouse head.

    PubMed

    Delora, Adam; Gonzales, Aaron; Medina, Christopher S; Mitchell, Adam; Mohed, Abdul Faheem; Jacobs, Russell E; Bearer, Elaine L

    2016-01-15

    Magnetic resonance imaging (MRI) is a well-developed technique in neuroscience. Limitations in applying MRI to rodent models of neuropsychiatric disorders include the large number of animals required to achieve statistical significance, and the paucity of automation tools for the critical early step in processing, brain extraction, which prepares brain images for alignment and voxel-wise statistics. This novel timesaving automation of template-based brain extraction ("skull-stripping") is capable of quickly and reliably extracting the brain from large numbers of whole head images in a single step. The method is simple to install and requires minimal user interaction. This method is equally applicable to different types of MR images. Results were evaluated with Dice and Jaccard similarity indices and compared in 3D surface projections with other stripping approaches. Statistical comparisons demonstrate that individual variation of brain volumes is preserved. A downloadable software package not otherwise available for extraction of brains from whole head images is included here. This software tool increases speed, can be used with an atlas or a template from within the dataset, and produces masks that need little further refinement. Our new automation can be applied to any MR dataset, since the starting point is a template mask generated specifically for that dataset. The method reliably and rapidly extracts brain images from whole head images, rendering them useable for subsequent analytical processing. This software tool will accelerate the exploitation of mouse models for the investigation of human brain disorders by MRI. Copyright © 2015 Elsevier B.V. All rights reserved.
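
    For reference, the two overlap indices named in the abstract are straightforward to compute from binary masks; a small sketch with made-up masks:

```python
# Small sketch with made-up masks: Dice and Jaccard overlap between two binary brain masks.
import numpy as np

rng = np.random.default_rng(4)
auto_mask = rng.random((64, 64, 64)) > 0.5     # stand-in for an automatically extracted mask
manual_mask = auto_mask.copy()
manual_mask[:2] = ~manual_mask[:2]             # perturb a slab to mimic disagreement

intersection = np.logical_and(auto_mask, manual_mask).sum()
dice = 2 * intersection / (auto_mask.sum() + manual_mask.sum())
jaccard = intersection / np.logical_or(auto_mask, manual_mask).sum()
print(f"Dice {dice:.3f}, Jaccard {jaccard:.3f}")
```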

  4. On-Line Analysis of Southern FIA Data

    Treesearch

    Michael P. Spinney; Paul C. Van Deusen; Francis A. Roesch

    2006-01-01

    The Southern On-Line Estimator (SOLE) is a web-based FIA database analysis tool designed with an emphasis on modularity. The Java-based user interface is simple and intuitive to use and the R-based analysis engine is fast and stable. Each component of the program (data retrieval, statistical analysis and output) can be individually modified to accommodate major...

  5. Empirical Reference Distributions for Networks of Different Size

    PubMed Central

    Smith, Anna; Calder, Catherine A.; Browning, Christopher R.

    2016-01-01

    Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556
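
    A hedged sketch of the baseline idea discussed (comparing an observed graph statistic to a reference distribution of simple Bernoulli random graphs of matching size and density); it uses networkx and a stand-in "observed" network rather than the survey data:

```python
# Hedged illustration of the baseline idea: compare an observed graph statistic to a
# reference distribution of Bernoulli (Erdos-Renyi) graphs with matching size and density.
import numpy as np
import networkx as nx

G_obs = nx.watts_strogatz_graph(80, 6, 0.1, seed=1)    # stand-in for an observed network
stat_obs = nx.transitivity(G_obs)
n, p = G_obs.number_of_nodes(), nx.density(G_obs)

ref = [nx.transitivity(nx.gnp_random_graph(n, p, seed=s)) for s in range(200)]
z = (stat_obs - np.mean(ref)) / np.std(ref)             # size-adjusted comparison
print(f"observed transitivity {stat_obs:.3f}, reference mean {np.mean(ref):.3f}, z = {z:.1f}")
```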

  6. An integrated user-friendly ArcMAP tool for bivariate statistical modeling in geoscience applications

    NASA Astrophysics Data System (ADS)

    Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusof, Z.; Tehrany, M. S.

    2014-10-01

    Modeling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modeling. Bivariate statistical analysis (BSA) assists in hazard modeling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, BSM (bivariate statistical modeler), for BSA technique is proposed. Three popular BSA techniques such as frequency ratio, weights-of-evidence, and evidential belief function models are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and is created by a simple graphical user interface, which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.

  7. An integrated user-friendly ArcMAP tool for bivariate statistical modelling in geoscience applications

    NASA Astrophysics Data System (ADS)

    Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusoff, Z. M.; Tehrany, M. S.

    2015-03-01

    Modelling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modelling. Bivariate statistical analysis (BSA) assists in hazard modelling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time-consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, bivariate statistical modeler (BSM), for BSA technique is proposed. Three popular BSA techniques, such as frequency ratio, weight-of-evidence (WoE), and evidential belief function (EBF) models, are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and created by a simple graphical user interface (GUI), which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve (AUC) is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.
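
    The frequency ratio model named in both records reduces to a ratio of proportions per factor class; the sketch below (not the ArcMAP tool, and with synthetic rasters) shows that calculation:

```python
# Hedged sketch (not the ArcMAP tool): the frequency-ratio calculation for one
# conditioning factor, using synthetic rasters for the factor classes and hazard inventory.
import numpy as np

rng = np.random.default_rng(6)
slope_class = rng.integers(0, 4, size=(200, 200))            # factor classes over the study area
hazard = rng.random((200, 200)) < 0.01 * (slope_class + 1)   # synthetic hazard occurrence map

fr = {}
for c in np.unique(slope_class):
    in_class = slope_class == c
    pct_hazard = hazard[in_class].sum() / hazard.sum()       # share of hazard cells in class c
    pct_area = in_class.sum() / slope_class.size             # share of the study area in class c
    fr[int(c)] = pct_hazard / pct_area
print(fr)   # FR > 1 flags classes where hazards are over-represented
```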

  8. Statistical fluctuations in pedestrian evacuation times and the effect of social contagion

    NASA Astrophysics Data System (ADS)

    Nicolas, Alexandre; Bouzat, Sebastián; Kuperman, Marcelo N.

    2016-08-01

    Mathematical models of pedestrian evacuation and the associated simulation software have become essential tools for the assessment of the safety of public facilities and buildings. While a variety of models is now available, their calibration and test against empirical data are generally restricted to global averaged quantities; the statistics compiled from the time series of individual escapes ("microscopic" statistics) measured in recent experiments are thus overlooked. In the same spirit, much research has primarily focused on the average global evacuation time, whereas the whole distribution of evacuation times over some set of realizations should matter. In the present paper we propose and discuss the validity of a simple relation between this distribution and the microscopic statistics, which is theoretically valid in the absence of correlations. To this purpose, we develop a minimal cellular automaton, with features that afford a semiquantitative reproduction of the experimental microscopic statistics. We then introduce a process of social contagion of impatient behavior in the model and show that the simple relation under test may dramatically fail at high contagion strengths, the latter being responsible for the emergence of strong correlations in the system. We conclude with comments on the potential practical relevance for safety science of calculations based on microscopic statistics.

  9. Issues around Creating a Reusable Learning Object to Support Statistics Teaching

    ERIC Educational Resources Information Center

    Gilchrist, Mollie

    2007-01-01

    Although our health professional students have some experience of simple charts, such as pie and bar charts, and some intuition about histograms, they do not appear to have much knowledge or understanding of box and whisker plots, their relation to the data they describe, or how they compare to histograms. The boxplot is a versatile charting tool, useful…

  10. Understanding Statistical Power in Cluster Randomized Trials: Challenges Posed by Differences in Notation and Terminology

    ERIC Educational Resources Information Center

    Spybrook, Jessaca; Hedges, Larry; Borenstein, Michael

    2014-01-01

    Research designs in which clusters are the unit of randomization are quite common in the social sciences. Given the multilevel nature of these studies, the power analyses for these studies are more complex than in a simple individually randomized trial. Tools are now available to help researchers conduct power analyses for cluster randomized…

  11. Bringing Data to Life into an Introductory Statistics Course with Gapminder

    ERIC Educational Resources Information Center

    Le, Dai-Trang

    2013-01-01

    "Gapminder" is a free and easy to use software for visualising real-world data in multiple dimensions. The simple format of the Cartesian coordinate system is used in a dynamic and interactive way to convey a great deal of information. This tool can be readily used to arouse students' natural curiosity regarding world events and to…

  12. Statistical Issues for Uncontrolled Reentry Hazards

    NASA Technical Reports Server (NTRS)

    Matney, Mark

    2008-01-01

    A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. This inevitably involves assumptions that simplify the problem and make it tractable, but it is often difficult to test the accuracy and applicability of these assumptions. This paper looks at a number of these theoretical assumptions, examining the mathematical basis for the hazard calculations, and outlining the conditions under which the simplifying assumptions hold. In addition, this paper will also outline some new tools for assessing ground hazard risk in useful ways. Also, this study is able to make use of a database of known uncontrolled reentry locations measured by the United States Department of Defense. By using data from objects that were in orbit more than 30 days before reentry, sufficient time is allowed for the orbital parameters to be randomized in the way the models are designed to compute. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and change the ground footprints. The measured latitude and longitude distributions of these objects provide data that can be directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.

  13. Dynamic Monitoring Reveals Motor Task Characteristics in Prehistoric Technical Gestures

    PubMed Central

    Pfleging, Johannes; Stücheli, Marius; Iovita, Radu; Buchli, Jonas

    2015-01-01

    Reconstructing ancient technical gestures associated with simple tool actions is crucial for understanding the co-evolution of the human forelimb and its associated control-related cognitive functions on the one hand, and of the human technological arsenal on the other hand. Although the topic of gesture is an old one in Paleolithic archaeology and in anthropology in general, very few studies have taken advantage of the new technologies from the science of kinematics in order to improve replicative experimental protocols. Recent work in paleoanthropology has shown the potential of monitored replicative experiments to reconstruct tool-use-related motions through the study of fossil bones, but so far comparatively little has been done to examine the dynamics of the tool itself. In this paper, we demonstrate that we can statistically differentiate gestures used in a simple scraping task through dynamic monitoring. Dynamics combines kinematics (position, orientation, and speed) with contact mechanical parameters (force and torque). Taken together, these parameters are important because they play a role in the formation of a visible archaeological signature, use-wear. We present our new affordable, yet precise methodology for measuring the dynamics of a simple hide-scraping task, carried out using a pull-to (PT) and a push-away (PA) gesture. A strain gage force sensor combined with a visual tag tracking system records force, torque, as well as position and orientation of hafted flint stone tools. The set-up allows switching between two tool configurations, one with distal and the other one with perpendicular hafting of the scrapers, to allow for ethnographically plausible reconstructions. The data show statistically significant differences between the two gestures: scraping away from the body (PA) generates higher shearing forces, but requires greater hand torque. Moreover, most benchmarks associated with the PA gesture are more highly variable than in the PT gesture. These results demonstrate that different gestures used in ‘common’ prehistoric tasks can be distinguished quantitatively based on their dynamic parameters. Future research needs to assess our ability to reconstruct these parameters from observed use-wear patterns. PMID:26284785

  14. The Statistical Consulting Center for Astronomy (SCCA)

    NASA Technical Reports Server (NTRS)

    Akritas, Michael

    2001-01-01

    The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi-squared minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools and matched these tools to specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression, density estimation and smoothing, general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997. It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.

  15. Statistical image segmentation for the detection of skin lesion borders in UV fluorescence excitation

    NASA Astrophysics Data System (ADS)

    Ortega-Martinez, Antonio; Padilla-Martinez, Juan Pablo; Franco, Walfre

    2016-04-01

    The skin contains several fluorescent molecules, or fluorophores, that serve as markers of structure, function and composition. UV fluorescence excitation photography is a simple and effective way to image specific intrinsic fluorophores, such as the one ascribed to tryptophan, which emits at a wavelength of 345 nm upon excitation at 295 nm and is a marker of cellular proliferation. Earlier, we built a clinical UV photography system to image cellular proliferation. In some samples, the naturally low intensity of the fluorescence can make it difficult to separate the fluorescence of cells in higher proliferation states from background fluorescence and other imaging artifacts, such as electronic noise. In this work, we describe a statistical image segmentation method to separate the fluorescence of interest. Statistical image segmentation is based on image averaging, background subtraction and pixel statistics. This method allows better quantification of the intensity and surface distributions of fluorescence, which in turn simplifies the detection of borders. Using this method we delineated the borders of highly proliferative skin conditions and diseases, in particular allergic contact dermatitis, psoriatic lesions and basal cell carcinoma. Segmented images clearly define lesion borders. UV fluorescence excitation photography, along with statistical image segmentation, may serve as a quick and simple diagnostic tool for clinicians.
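
    As an illustration of the three ingredients the abstract names (image averaging, background subtraction, pixel statistics), a minimal sketch on synthetic frames, not the authors' pipeline:

```python
# Minimal sketch of the named steps on synthetic frames: average repeated exposures,
# subtract a background estimate, and keep pixels that stand out statistically.
import numpy as np

rng = np.random.default_rng(7)
frames = rng.normal(10, 2, size=(16, 128, 128))   # stand-in for repeated UV exposures
frames[:, 40:70, 50:90] += 6                      # synthetic high-fluorescence "lesion"

avg = frames.mean(axis=0)                         # image averaging suppresses noise
residual = avg - np.median(avg)                   # crude background subtraction
mask = residual > 3 * residual.std()              # pixel-statistics threshold
print("segmented pixels:", int(mask.sum()), "of", mask.size)
```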

  16. Causality

    NASA Astrophysics Data System (ADS)

    Pearl, Judea

    2000-03-01

    Written by one of the pre-eminent researchers in the field, this book provides a comprehensive exposition of modern analysis of causation. It shows how causality has grown from a nebulous concept into a mathematical theory with significant applications in the fields of statistics, artificial intelligence, philosophy, cognitive science, and the health and social sciences. Pearl presents a unified account of the probabilistic, manipulative, counterfactual and structural approaches to causation, and devises simple mathematical tools for analyzing the relationships between causal connections, statistical associations, actions and observations. The book will open the way for including causal analysis in the standard curriculum of statistics, artificial intelligence, business, epidemiology, social science and economics. Students in these areas will find natural models, simple identification procedures, and precise mathematical definitions of causal concepts that traditional texts have tended to evade or make unduly complicated. This book will be of interest to professionals and students in a wide variety of fields. Anyone who wishes to elucidate meaningful relationships from data, predict effects of actions and policies, assess explanations of reported events, or form theories of causal understanding and causal speech will find this book stimulating and invaluable.

  17. ToNER: A tool for identifying nucleotide enrichment signals in feature-enriched RNA-seq data.

    PubMed

    Promworn, Yuttachon; Kaewprommal, Pavita; Shaw, Philip J; Intarapanich, Apichart; Tongsima, Sissades; Piriyapongsa, Jittima

    2017-01-01

    Biochemical methods are available for enriching 5' ends of RNAs in prokaryotes, which are employed in the differential RNA-seq (dRNA-seq) and the more recent Cappable-seq protocols. Computational methods are needed to locate RNA 5' ends from these data by statistical analysis of the enrichment. Although statistical-based analysis methods have been developed for dRNA-seq, they may not be suitable for Cappable-seq data. The more efficient enrichment method employed in Cappable-seq compared with dRNA-seq could affect data distribution and thus algorithm performance. We present Transformation of Nucleotide Enrichment Ratios (ToNER), a tool for statistical modeling of enrichment from RNA-seq data obtained from enriched and unenriched libraries. The tool calculates nucleotide enrichment scores and determines the global transformation for fitting to the normal distribution using the Box-Cox procedure. From the transformed distribution, sites of significant enrichment are identified. To increase power of detection, meta-analysis across experimental replicates is offered. We tested the tool on Cappable-seq and dRNA-seq data for identifying Escherichia coli transcript 5' ends and compared the results with those from the TSSAR tool, which is designed for analyzing dRNA-seq data. When combining results across Cappable-seq replicates, ToNER detects more known transcript 5' ends than TSSAR. In general, the transcript 5' ends detected by ToNER but not TSSAR occur in regions which cannot be locally modeled by TSSAR. ToNER uses a simple yet robust statistical modeling approach, which can be used for detecting RNA 5'ends from Cappable-seq data, in particular when combining information from experimental replicates. The ToNER tool could potentially be applied for analyzing other RNA-seq datasets in which enrichment for other structural features of RNA is employed. The program is freely available for download at ToNER webpage (http://www4a.biotec.or.th/GI/tools/toner) and GitHub repository (https://github.com/PavitaKae/ToNER).
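
    ToNER's exact pipeline is more involved, but the central step, a global Box-Cox transformation of enrichment ratios followed by tail tests, can be sketched as follows (simulated counts, illustrative thresholds):

```python
# Hedged sketch of the central step only: Box-Cox transform per-position enrichment
# ratios toward normality, then flag positions far out in the upper tail (simulated counts).
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
control = rng.poisson(50, 50_000) + 1             # stand-in read counts, unenriched library
enriched = rng.poisson(50, 50_000) + 1            # stand-in read counts, enriched library
enriched[:50] += 500                              # a few genuinely enriched positions

ratios = enriched / control                       # per-position enrichment ratios (positive)
transformed, lam = stats.boxcox(ratios)           # global Box-Cox transformation
z = (transformed - transformed.mean()) / transformed.std()
pvals = stats.norm.sf(z)                          # upper-tail p-values under normality
print(f"fitted lambda {lam:.2f}; positions with p < 1e-6: {(pvals < 1e-6).sum()}")
```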

  18. Comparison of four modeling tools for the prediction of potential distribution for non-indigenous weeds in the United States

    USGS Publications Warehouse

    Magarey, Roger; Newton, Leslie; Hong, Seung C.; Takeuchi, Yu; Christie, Dave; Jarnevich, Catherine S.; Kohl, Lisa; Damus, Martin; Higgins, Steven I.; Miller, Leah; Castro, Karen; West, Amanda; Hastings, John; Cook, Gericke; Kartesz, John; Koop, Anthony

    2018-01-01

    This study compares four models for predicting the potential distribution of non-indigenous weed species in the conterminous U.S. The comparison focused on evaluating modeling tools and protocols as currently used for weed risk assessment or for predicting the potential distribution of invasive weeds. We used six weed species (three highly invasive and three less invasive non-indigenous species) that have been established in the U.S. for more than 75 years. The experiment involved providing non-U.S. location data to users familiar with one of the four evaluated techniques, who then developed predictive models that were applied to the United States without knowing the identity of the species or its U.S. distribution. We compared a simple GIS climate matching technique known as Proto3, a simple climate matching tool CLIMEX Match Climates, the correlative model MaxEnt, and a process model known as the Thornley Transport Resistance (TTR) model. Two experienced users ran each modeling tool except TTR, which had one user. Models were trained with global species distribution data excluding any U.S. data, and then were evaluated using the current known U.S. distribution. The influence of weed species identity and modeling tool on prevalence and sensitivity effects was compared using a generalized linear mixed model. Each modeling tool itself had a low statistical significance, while weed species alone accounted for 69.1 and 48.5% of the variance for prevalence and sensitivity, respectively. These results suggest that simple modeling tools might perform as well as complex ones in the case of predicting potential distribution for a weed not yet present in the United States. Considerations of model accuracy should also be balanced with those of reproducibility and ease of use. More important than the choice of modeling tool is the construction of robust protocols and testing both new and experienced users under blind test conditions that approximate operational conditions.

  19. Automated clustering-based workload characterization

    NASA Technical Reports Server (NTRS)

    Pentakalos, Odysseas I.; Menasce, Daniel A.; Yesha, Yelena

    1996-01-01

    The demands placed on the mass storage systems at various federal agencies and national laboratories are continuously increasing in intensity. This forces system managers to constantly monitor the system, evaluate the demand placed on it, and tune it appropriately using either heuristics based on experience or analytic models. Performance models require an accurate workload characterization. This can be a laborious and time consuming process. It became evident from our experience that a tool is necessary to automate the workload characterization process. This paper presents the design and discusses the implementation of a tool for workload characterization of mass storage systems. The main features of the tool discussed here are: (1) Automatic support for peak-period determination. Histograms of system activity are generated and presented to the user for peak-period determination; (2) Automatic clustering analysis. The data collected from the mass storage system logs is clustered using clustering algorithms and tightness measures to limit the number of generated clusters; (3) Reporting of varied file statistics. The tool computes several statistics on file sizes such as average, standard deviation, minimum, maximum, and frequency, as well as average transfer time. These statistics are given on a per cluster basis; (4) Portability. The tool can easily be used to characterize the workload in mass storage systems of different vendors. The user needs to specify through a simple log description language how a specific log should be interpreted. The rest of this paper is organized as follows. Section two presents basic concepts in workload characterization as they apply to mass storage systems. Section three describes clustering algorithms and tightness measures. The following section presents the architecture of the tool. Section five presents some results of workload characterization using the tool. Finally, section six presents some concluding remarks.
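
    A hedged sketch of the clustering-analysis step on invented transfer records (k-means on log-scaled file size and transfer time, followed by per-cluster statistics); the real tool's algorithms and tightness measures are not reproduced here:

```python
# Hedged sketch of the clustering step on invented transfer records: k-means on
# log-scaled file size and transfer time, then per-cluster file statistics.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
size_mb = np.concatenate([rng.lognormal(1, 0.4, 500), rng.lognormal(5, 0.5, 200)])
xfer_s = size_mb / rng.uniform(5, 50, size_mb.size)          # crude transfer-time model
X = np.column_stack([np.log(size_mb), np.log(xfer_s)])       # log scale keeps clusters compact

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for c in range(2):
    s, t = size_mb[labels == c], xfer_s[labels == c]
    print(f"cluster {c}: n={s.size}, mean={s.mean():.1f} MB, min={s.min():.1f}, "
          f"max={s.max():.1f}, mean transfer={t.mean():.2f} s")
```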

  20. Thoth: Software for data visualization & statistics

    NASA Astrophysics Data System (ADS)

    Laher, R. R.

    2016-10-01

    Thoth is a standalone software application with a graphical user interface for making it easy to query, display, visualize, and analyze tabular data stored in relational databases and data files. From imported data tables, it can create pie charts, bar charts, scatter plots, and many other kinds of data graphs with simple menus and mouse clicks (no programming required), by leveraging the open-source JFreeChart library. It also computes useful table-column data statistics. A mature tool, having undergone development and testing over several years, it is written in the Java computer language, and hence can be run on any computing platform that has a Java Virtual Machine and graphical-display capability. It can be downloaded and used by anyone free of charge, and has general applicability in science, engineering, medical, business, and other fields. Special tools and features for common tasks in astronomy and astrophysical research are included in the software.

  1. Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

    PubMed Central

    McDermott, Josh H.; Simoncelli, Eero P.

    2014-01-01

    Rainstorms, insect swarms, and galloping horses produce “sound textures” – the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures. However, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation. PMID:21903084

  2. A method for obtaining a statistically stationary turbulent free shear flow

    NASA Technical Reports Server (NTRS)

    Timson, Stephen F.; Lele, S. K.; Moser, R. D.

    1994-01-01

    The long-term goal of the current research is the study of Large-Eddy Simulation (LES) as a tool for aeroacoustics. New algorithms and developments in computer hardware are making possible a new generation of tools for aeroacoustic predictions, which rely on the physics of the flow rather than empirical knowledge. LES, in conjunction with an acoustic analogy, holds the promise of predicting the statistics of noise radiated to the far-field of a turbulent flow. LES's predictive ability will be tested through extensive comparison of acoustic predictions based on a Direct Numerical Simulation (DNS) and LES of the same flow, as well as a priori testing of DNS results. The method presented here is aimed at allowing simulation of a turbulent flow field that is both simple and amenable to acoustic predictions. A free shear flow that is homogeneous in both the streamwise and spanwise directions and statistically stationary will be simulated using equations based on the Navier-Stokes equations with a small number of added terms. Studying a free shear flow eliminates the need to consider flow-surface interactions as an acoustic source. The homogeneous directions and the flow's statistically stationary nature greatly simplify the application of an acoustic analogy.

  3. Automated Reporting of DXA Studies Using a Custom-Built Computer Program.

    PubMed

    England, Joseph R; Colletti, Patrick M

    2018-06-01

    Dual-energy x-ray absorptiometry (DXA) scans are a critical population health tool and relatively simple to interpret but can be time consuming to report, often requiring manual transfer of bone mineral density and associated statistics into commercially available dictation systems. We describe here a custom-built computer program for automated reporting of DXA scans using Pydicom, an open-source package built in the Python computer language, and regular expressions to mine DICOM tags for patient information and bone mineral density statistics. This program, easy to emulate by any novice computer programmer, has doubled our efficiency at reporting DXA scans and has eliminated dictation errors.
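
    The authors' program is not published in this record; the sketch below shows, under assumed tag locations and file names, how pydicom plus a regular expression can pull patient details and a BMD value from a DICOM file:

```python
# Hedged sketch (not the authors' program): read a DXA DICOM file with pydicom and pull
# patient fields plus a BMD value out of free text with a regular expression.
# The file name and the tag assumed to carry the results text are illustrative guesses.
import re
import pydicom

ds = pydicom.dcmread("dxa_study.dcm")                  # hypothetical input file
patient = f"{ds.get('PatientName', 'unknown')}, DOB {ds.get('PatientBirthDate', 'unknown')}"

report_text = str(ds.get("ImageComments", ""))         # assumed location of the results text
match = re.search(r"BMD[^0-9]*([0-9]+\.[0-9]+)\s*g/cm", report_text)
if match:
    print(f"Automated draft: {patient}; BMD = {float(match.group(1))} g/cm2")
else:
    print(f"Automated draft: {patient}; BMD not found in the expected tag")
```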

  4. Development of the Concept of Energy Conservation using Simple Experiments for Grade 10 Students

    NASA Astrophysics Data System (ADS)

    Rachniyom, S.; Toedtanya, K.; Wuttiprom, S.

    2017-09-01

    The purpose of this research was to develop students’ concept of and retention rate in relation to energy conservation. Activities included simple and easy experiments that considered energy transformation from potential to kinetic energy. The participants were 30 purposively selected grade 10 students in the second semester of the 2016 academic year. The research tools consisted of learning lesson plans and a learning achievement test. Results showed that the experiments worked well and were appropriate as learning activities. The students’ achievement scores significantly increased at the statistical level of 05, the students’ retention rates were at a high level, and learning behaviour was at a good level. These simple experiments allowed students to learn to demonstrate to their peers and encouraged them to use familiar models to explain phenomena in daily life.

  5. Evaluation of IOTA Simple Ultrasound Rules to Distinguish Benign and Malignant Ovarian Tumours.

    PubMed

    Garg, Sugandha; Kaur, Amarjit; Mohi, Jaswinder Kaur; Sibia, Preet Kanwal; Kaur, Navkiran

    2017-08-01

    IOTA stands for International Ovarian Tumour Analysis group. Ovarian cancer is one of the common cancers in women and is diagnosed at a later stage in the majority of cases. The limiting factor for early diagnosis is the lack of standardized terms and procedures in gynaecological sonography. Introduction of the IOTA rules has provided some consistency in defining morphological features of ovarian masses through a standardized examination technique. To evaluate the efficacy of IOTA simple ultrasound rules in distinguishing benign and malignant ovarian tumours and establishing their use as a tool in early diagnosis of ovarian malignancy. A hospital based case control prospective study was conducted. Patients with suspected ovarian pathology were evaluated using IOTA ultrasound rules and designated as benign or malignant. Findings were correlated with histopathological findings. Collected data were statistically analysed using the chi-square test and the kappa statistic. Out of the initial 55 patients, 50 patients who underwent surgery were included in the final analysis. IOTA simple rules were applicable in 45 out of these 50 patients (90%). The sensitivity for the detection of malignancy in cases where IOTA simple rules were applicable was 91.66% and the specificity was 84.84%. Accuracy was 86.66%. Classifying inconclusive cases as malignant, the sensitivity and specificity were 93% and 80% respectively. A high level of agreement was found between USG and histopathological diagnosis, with a Kappa value of 0.323. IOTA simple ultrasound rules were highly sensitive and specific in predicting ovarian malignancy preoperatively while also being reproducible and easy to learn and use.
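
    For reference, the summary measures reported above (sensitivity, specificity, accuracy and Cohen's kappa) can all be derived from a 2x2 table of index-test calls against histopathology; the helper below is generic and not tied to the study data.

      # Diagnostic summary measures from a 2x2 table:
      # tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
      def diagnostic_summary(tp, fp, fn, tn):
          n = tp + fp + fn + tn
          sensitivity = tp / (tp + fn)
          specificity = tn / (tn + fp)
          accuracy = (tp + tn) / n
          # Cohen's kappa: observed agreement corrected for chance agreement.
          p_obs = accuracy
          p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
          kappa = (p_obs - p_chance) / (1 - p_chance)
          return sensitivity, specificity, accuracy, kappa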

  6. Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes.

    PubMed

    Margaria, Tiziana; Kubczak, Christian; Steffen, Bernhard

    2008-04-25

    With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout. Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution. As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way.

  7. ToNER: A tool for identifying nucleotide enrichment signals in feature-enriched RNA-seq data

    PubMed Central

    Promworn, Yuttachon; Kaewprommal, Pavita; Shaw, Philip J.; Intarapanich, Apichart; Tongsima, Sissades

    2017-01-01

    Background Biochemical methods are available for enriching 5′ ends of RNAs in prokaryotes, which are employed in the differential RNA-seq (dRNA-seq) and the more recent Cappable-seq protocols. Computational methods are needed to locate RNA 5′ ends from these data by statistical analysis of the enrichment. Although statistical-based analysis methods have been developed for dRNA-seq, they may not be suitable for Cappable-seq data. The more efficient enrichment method employed in Cappable-seq compared with dRNA-seq could affect data distribution and thus algorithm performance. Results We present Transformation of Nucleotide Enrichment Ratios (ToNER), a tool for statistical modeling of enrichment from RNA-seq data obtained from enriched and unenriched libraries. The tool calculates nucleotide enrichment scores and determines the global transformation for fitting to the normal distribution using the Box-Cox procedure. From the transformed distribution, sites of significant enrichment are identified. To increase power of detection, meta-analysis across experimental replicates is offered. We tested the tool on Cappable-seq and dRNA-seq data for identifying Escherichia coli transcript 5′ ends and compared the results with those from the TSSAR tool, which is designed for analyzing dRNA-seq data. When combining results across Cappable-seq replicates, ToNER detects more known transcript 5′ ends than TSSAR. In general, the transcript 5′ ends detected by ToNER but not TSSAR occur in regions which cannot be locally modeled by TSSAR. Conclusion ToNER uses a simple yet robust statistical modeling approach, which can be used for detecting RNA 5′ends from Cappable-seq data, in particular when combining information from experimental replicates. The ToNER tool could potentially be applied for analyzing other RNA-seq datasets in which enrichment for other structural features of RNA is employed. The program is freely available for download at ToNER webpage (http://www4a.biotec.or.th/GI/tools/toner) and GitHub repository (https://github.com/PavitaKae/ToNER). PMID:28542466
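
    A schematic of the general approach described (not the ToNER implementation): per-site enrichment ratios, a Box-Cox transform toward normality, and a normal-tail cutoff for calling significantly enriched positions.

      # Per-nucleotide enrichment ratios, Box-Cox transformed and thresholded
      # against the upper tail of a fitted normal distribution.
      import numpy as np
      from scipy import stats

      def enriched_sites(enriched_counts, control_counts, alpha=0.01, pseudocount=1.0):
          ratio = (enriched_counts + pseudocount) / (control_counts + pseudocount)
          transformed, lam = stats.boxcox(ratio)          # requires strictly positive ratios
          z = (transformed - transformed.mean()) / transformed.std(ddof=1)
          cutoff = stats.norm.ppf(1 - alpha)              # one-sided normal-tail threshold
          return np.where(z > cutoff)[0], lam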

  8. Six Sigma Quality Management System and Design of Risk-based Statistical Quality Control.

    PubMed

    Westgard, James O; Westgard, Sten A

    2017-03-01

    Six sigma concepts provide a quality management system (QMS) with many useful tools for managing quality in medical laboratories. This Six Sigma QMS is driven by the quality required for the intended use of a test. The most useful form for this quality requirement is the allowable total error. Calculation of a sigma-metric provides the best predictor of risk for an analytical examination process, as well as a design parameter for selecting the statistical quality control (SQC) procedure necessary to detect medically important errors. Simple point estimates of sigma at medical decision concentrations are sufficient for laboratory applications. Copyright © 2016 Elsevier Inc. All rights reserved.
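
    A minimal sketch of the usual sigma-metric point estimate at a medical decision concentration, sigma = (allowable total error - |bias|) / CV with all quantities in percent; the example figures are illustrative.

      # Sigma-metric point estimate from allowable total error, bias and imprecision.
      def sigma_metric(tea_pct, bias_pct, cv_pct):
          return (tea_pct - abs(bias_pct)) / cv_pct

      # e.g. TEa = 10%, bias = 1.5%, CV = 2% gives a 4.25-sigma process.
      print(sigma_metric(10.0, 1.5, 2.0))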

  9. Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation

    PubMed Central

    Ferguson, John; Wheeler, William; Fu, YiPing; Prokunina-Olsson, Ludmila; Zhao, Hongyu; Sampson, Joshua

    2013-01-01

    With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave). PMID:23092956
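
    A schematic of the quadratic-form view described (not the CRaVe implementation): a generalized score statistic as a weighted sum of single-variant statistics and their cross-products, Q = z'Wz, where the choice of weight matrix recovers familiar burden-style or variance-component-style tests.

      # Generalized score statistic as a quadratic form of per-variant scores.
      import numpy as np

      def generalized_score_statistic(z, W):
          """z: vector of single-variant score statistics; W: symmetric weight matrix."""
          z = np.asarray(z, dtype=float)
          return float(z @ W @ z)

      # A matrix of ones sums the statistics before squaring (burden-like),
      # while an identity matrix gives a sum of squared statistics (SKAT-like).
      z = np.array([1.2, -0.4, 2.1])
      print(generalized_score_statistic(z, np.ones((3, 3))))   # burden-like
      print(generalized_score_statistic(z, np.eye(3)))         # sum of squares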

  10. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865
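
    A minimal sketch of the kind of mixed-effects model advocated here for repeated-measures infection data, assuming a data frame with hypothetical columns parasitaemia, day and host; a random intercept per host keeps within-host residuals from being treated as independent.

      # Random-intercept mixed model for repeated measures within hosts.
      import pandas as pd
      import statsmodels.formula.api as smf

      def fit_infection_dynamics(df: pd.DataFrame):
          model = smf.mixedlm("parasitaemia ~ day", data=df, groups=df["host"])
          return model.fit()

      # result = fit_infection_dynamics(df); print(result.summary())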

  11. A comprehensive evaluation of assembly scaffolding tools

    PubMed Central

    2014-01-01

    Background Genome assembly is typically a two-stage process: contig assembly followed by the use of paired sequencing reads to join contigs into scaffolds. Scaffolds are usually the focus of reported assembly statistics; longer scaffolds greatly facilitate the use of genome sequences in downstream analyses, and it is appealing to present larger numbers as metrics of assembly performance. However, scaffolds are highly prone to errors, especially when generated using short reads, which can directly result in inflated assembly statistics. Results Here we provide the first independent evaluation of scaffolding tools for second-generation sequencing data. We find large variations in the quality of results depending on the tool and dataset used. Even extremely simple test cases of perfect input, constructed to elucidate the behaviour of each algorithm, produced some surprising results. We further dissect the performance of the scaffolders using real and simulated sequencing data derived from the genomes of Staphylococcus aureus, Rhodobacter sphaeroides, Plasmodium falciparum and Homo sapiens. The results from simulated data are of high quality, with several of the tools producing perfect output. However, at least 10% of joins remain unidentified when using real data. Conclusions The scaffolders vary in their usability, speed and number of correct and missed joins made between contigs. Results from real data highlight opportunities for further improvements of the tools. Overall, SGA, SOPRA and SSPACE generally outperform the other tools on our datasets. However, the quality of the results is highly dependent on the read mapper and genome complexity. PMID:24581555

  12. The t-test: An Influential Inferential Tool in Chaplaincy and Other Healthcare Research.

    PubMed

    Jankowski, Katherine R B; Flannelly, Kevin J; Flannelly, Laura T

    2018-01-01

    The t-test developed by William S. Gosset (also known as Student's t-test and the two-sample t-test) is commonly used to compare one sample mean on a measure with another sample mean on the same measure. The outcome of the t-test is used to draw inferences about how different the samples are from each other. It is probably one of the most frequently relied upon statistics in inferential research. It is easy to use: a researcher can calculate the statistic with three simple tools: paper, pen, and a calculator. A computer program can quickly calculate the t-test for large samples. The ease of use can result in the misuse of the t-test. This article discusses the development of the original t-test, basic principles of the t-test, two additional types of t-tests (the one-sample t-test and the paired t-test), and recommendations about what to consider when using the t-test to draw inferences in research.
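
    A small sketch of the two-sample (Student's) t statistic, computed by hand with a pooled variance and checked against a library implementation; the sample values are illustrative.

      # Two-sample t statistic: hand calculation with pooled variance,
      # checked against scipy's implementation.
      import math
      from scipy import stats

      def two_sample_t(a, b):
          na, nb = len(a), len(b)
          ma, mb = sum(a) / na, sum(b) / nb
          va = sum((x - ma) ** 2 for x in a) / (na - 1)
          vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
          sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)   # pooled variance
          return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

      a = [5.1, 4.8, 5.6, 5.0, 4.9]
      b = [4.4, 4.7, 4.3, 4.6, 4.5]
      print(two_sample_t(a, b))
      print(stats.ttest_ind(a, b, equal_var=True).statistic)   # should match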

  13. Development of the Workplace Health Savings Calculator: a practical tool to measure economic impact from reduced absenteeism and staff turnover in workplace health promotion.

    PubMed

    Baxter, Siyan; Campbell, Sharon; Sanderson, Kristy; Cazaly, Carl; Venn, Alison; Owen, Carole; Palmer, Andrew J

    2015-09-18

    Workplace health promotion is focussed on improving the health and wellbeing of workers. Although quantifiable effectiveness and economic evidence is variable, workplace health promotion is recognised by both government and business stakeholders as potentially beneficial for worker health and economic advantage. Despite the current debate on whether conclusive positive outcomes exist, governments are investing, and business engagement is necessary for value to be realised. Practical tools are needed to assist decision makers in developing the business case for workplace health promotion programs. Our primary objective was to develop an evidence-based, simple and easy-to-use resource (calculator) for Australian employers interested in workplace health investment figures. Three phases were undertaken to develop the calculator. First, evidence from a literature review located appropriate effectiveness measures. Second, a review of employer-facilitated programs aimed at improving the health and wellbeing of employees was utilised to identify change estimates surrounding these measures, and third, currently available online evaluation tools and models were investigated. We present a simple web-based calculator for use by employers who wish to estimate potential annual savings associated with implementing a successful workplace health promotion program. The calculator uses effectiveness measures (absenteeism and staff turnover rates) and change estimates sourced from 55 case studies to generate the annual savings an employer may potentially gain. Australian wage statistics were used to calculate replacement costs due to staff turnover. The calculator was named the Workplace Health Savings Calculator and adapted and reproduced on the Healthy Workers web portal by the Australian Commonwealth Government Department of Health and Ageing. The Workplace Health Savings Calculator is a simple online business tool that aims to engage employers and to assist participation, development and implementation of workplace health promotion programs.
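
    A rough sketch of the arithmetic such a calculator performs; the formula, parameter names and example figures below are illustrative assumptions, not the published Workplace Health Savings Calculator.

      # Annual savings estimate from reduced absenteeism and reduced staff turnover.
      def annual_savings(n_employees, avg_daily_wage, absence_days_per_employee,
                         absenteeism_reduction, annual_turnover_rate,
                         turnover_reduction, replacement_cost):
          absenteeism_saving = (n_employees * absence_days_per_employee *
                                absenteeism_reduction * avg_daily_wage)
          turnover_saving = (n_employees * annual_turnover_rate *
                             turnover_reduction * replacement_cost)
          return absenteeism_saving + turnover_saving

      # e.g. 200 staff, $250/day, 9 sick days each, a 10% reduction in absence,
      # 15% turnover reduced by 10%, $20,000 to replace each leaver:
      print(annual_savings(200, 250, 9, 0.10, 0.15, 0.10, 20000))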

  14. Evaluation of IOTA Simple Ultrasound Rules to Distinguish Benign and Malignant Ovarian Tumours

    PubMed Central

    Kaur, Amarjit; Mohi, Jaswinder Kaur; Sibia, Preet Kanwal; Kaur, Navkiran

    2017-01-01

    Introduction IOTA stands for International Ovarian Tumour Analysis group. Ovarian cancer is one of the common cancers in women and is diagnosed at a later stage in the majority of cases. The limiting factor for early diagnosis is the lack of standardized terms and procedures in gynaecological sonography. Introduction of the IOTA rules has provided some consistency in defining morphological features of ovarian masses through a standardized examination technique. Aim To evaluate the efficacy of IOTA simple ultrasound rules in distinguishing benign and malignant ovarian tumours and establishing their use as a tool in early diagnosis of ovarian malignancy. Materials and Methods A hospital based case control prospective study was conducted. Patients with suspected ovarian pathology were evaluated using IOTA ultrasound rules and designated as benign or malignant. Findings were correlated with histopathological findings. Collected data were statistically analysed using the chi-square test and the kappa statistic. Results Out of the initial 55 patients, 50 patients who underwent surgery were included in the final analysis. IOTA simple rules were applicable in 45 out of these 50 patients (90%). The sensitivity for the detection of malignancy in cases where IOTA simple rules were applicable was 91.66% and the specificity was 84.84%. Accuracy was 86.66%. Classifying inconclusive cases as malignant, the sensitivity and specificity were 93% and 80% respectively. A high level of agreement was found between USG and histopathological diagnosis, with a Kappa value of 0.323. Conclusion IOTA simple ultrasound rules were highly sensitive and specific in predicting ovarian malignancy preoperatively while also being reproducible and easy to learn and use. PMID:28969237

  15. gHRV: Heart rate variability analysis made easy.

    PubMed

    Rodríguez-Liñares, L; Lado, M J; Vila, X A; Méndez, A J; Cuesta, P

    2014-08-01

    In this paper, the gHRV software tool is presented. It is a simple, free and portable tool developed in python for analysing heart rate variability. It includes a graphical user interface and it can import files in multiple formats, analyse time intervals in the signal, test statistical significance and export the results. This paper also contains, as an example of use, a clinical analysis performed with the gHRV tool, namely to determine whether the heart rate variability indexes change across different stages of sleep. Results from tests completed by researchers who have tried gHRV are also explained: in general the application was positively valued and results reflect a high level of satisfaction. gHRV is in continuous development and new versions will include suggestions made by testers. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  16. Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes

    PubMed Central

    Margaria, Tiziana; Kubczak, Christian; Steffen, Bernhard

    2008-01-01

    Background With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout. Methods Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution. Conclusions As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way. PMID:18460173

  17. TableViewer for Herschel Data Processing

    NASA Astrophysics Data System (ADS)

    Zhang, L.; Schulz, B.

    2006-07-01

    The TableViewer utility is a GUI tool written in Java to support interactive data processing and analysis for the Herschel Space Observatory (Pilbratt et al. 2001). The idea was inherited from a prototype written in IDL (Schulz et al. 2005). It allows users to graphically view and analyze tabular data organized in columns with equal numbers of rows. It can be run either as a standalone application, where data access is restricted to FITS (FITS 1999) files only, or it can be run from the Quick Look Analysis (QLA) or Interactive Analysis (IA) command line, from where objects are also accessible. The graphic display is very versatile, allowing plots in either linear or log scales. Zooming, panning, and changing data columns are performed rapidly using a group of navigation buttons. Selecting and de-selecting fields of data points controls the input to simple analysis tasks like building a statistics table or generating power spectra. The binary data stored in a TableDataset^1, a Product or in FITS files can also be displayed as tabular data, where values in individual cells can be modified. TableViewer provides several processing utilities which, besides calculating statistics for all or selected channels and calculating power spectra, allow users to convert/repair datasets by changing the unit name of data columns and by modifying data values in columns with a simple calculator tool. Interactively selected data can be separated out, and modified data sets can be saved to FITS files. The tool will be very helpful especially in the early phases of Herschel data analysis, when quick access to the contents of data products is important. TableDataset and Product are Java classes defined in herschel.ia.dataset.

  18. Robust Strategy for Rocket Engine Health Monitoring

    NASA Technical Reports Server (NTRS)

    Santi, L. Michael

    2001-01-01

    Monitoring the health of rocket engine systems is essentially a two-phase process. The acquisition phase involves sensing physical conditions at selected locations, converting physical inputs to electrical signals, conditioning the signals as appropriate to establish scale or filter interference, and recording results in a form that is easy to interpret. The inference phase involves analysis of results from the acquisition phase, comparison of analysis results to established health measures, and assessment of health indications. A variety of analytical tools may be employed in the inference phase of health monitoring. These tools can be separated into three broad categories: statistical, rule based, and model based. Statistical methods can provide excellent comparative measures of engine operating health. They require well-characterized data from an ensemble of "typical" engines, or "golden" data from a specific test assumed to define the operating norm in order to establish reliable comparative measures. Statistical methods are generally suitable for real-time health monitoring because they do not deal with the physical complexities of engine operation. The utility of statistical methods in rocket engine health monitoring is hindered by practical limits on the quantity and quality of available data. This is due to the difficulty and high cost of data acquisition, the limited number of available test engines, and the problem of simulating flight conditions in ground test facilities. In addition, statistical methods incur a penalty for disregarding flow complexity and are therefore limited in their ability to define performance shift causality. Rule based methods infer the health state of the engine system based on comparison of individual measurements or combinations of measurements with defined health norms or rules. This does not mean that rule based methods are necessarily simple. Although binary yes-no health assessment can sometimes be established by relatively simple rules, the causality assignment needed for refined health monitoring often requires an exceptionally complex rule base involving complicated logical maps. Structuring the rule system to be clear and unambiguous can be difficult, and the expert input required to maintain a large logic network and associated rule base can be prohibitive.

  19. Laterality of Grooming and Tool Use in a Group of Captive Bonobos (Pan paniscus).

    PubMed

    Brand, Colin M; Marchant, Linda F; Boose, Klaree J; White, Frances J; Rood, Tabatha M; Meinelt, Audra

    2017-01-01

    Humans exhibit population level handedness for the right hand; however, the evolution of this behavioral phenotype is poorly understood. Here, we compared the laterality of a simple task (grooming) and a complex task (tool use) to investigate whether increasing task difficulty elicited individual hand preference among a group of captive bonobos (Pan paniscus). Subjects were 17 bonobos housed at the Columbus Zoo and Aquarium. Laterality of grooming was recorded using group scans; tool use was recorded using all-occurrence sampling. Grooming was characterized as unimanual or bimanual, and both tasks were scored as right-handed or left-handed. Most individuals did not exhibit significant hand preference for unimanual or bimanual (asymmetrical hand use) grooming, although 1 individual was lateralized for each. For the 8 subjects who engaged in termite fishing enough for statistical testing, 7 individuals exhibited significant laterality and strong individual hand preference. Four subjects preferred their left hand, 3 preferred their right, and 1 had no preference. Grooming, a simple behavior, was not lateralized in this group, yet a more complex behavior revealed a strong individual hand preference, and these results are congruent with other recent findings that demonstrate complex tasks elicit hand preference in bonobos. © 2017 S. Karger AG, Basel.

  20. CAN'T MISS--conquer any number task by making important statistics simple. Part 1. Types of variables, mean, median, variance, and standard deviation.

    PubMed

    Hansen, John P

    2003-01-01

    Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 1, presents basic information about data including a classification system that describes the four major types of variables: continuous quantitative variable, discrete quantitative variable, ordinal categorical variable (including the binomial variable), and nominal categorical variable. A histogram is a graph that displays the frequency distribution for a continuous variable. The article also demonstrates how to calculate the mean, median, standard deviation, and variance for a continuous variable.
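
    The descriptive statistics covered in this first part can be computed directly with the standard library; the values below are illustrative.

      # Mean, median, sample variance and standard deviation of a continuous variable.
      import statistics

      values = [12.0, 15.5, 11.2, 14.8, 13.9, 16.1, 12.7]
      print("mean    ", statistics.mean(values))
      print("median  ", statistics.median(values))
      print("variance", statistics.variance(values))   # sample variance (n - 1 denominator)
      print("std dev ", statistics.stdev(values))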

  1. Molecular Analysis of Date Palm Genetic Diversity Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeats (ISSRs).

    PubMed

    El Sharabasy, Sherif F; Soliman, Khaled A

    2017-01-01

    The date palm is an ancient domesticated plant with great diversity and has been cultivated in the Middle East and North Africa for at least 5000 years. Date palm cultivars are classified based on the fruit moisture content as dry, semidry, and soft dates. There are a number of biochemical and molecular techniques available for characterization of date palm variation. This chapter focuses on the DNA-based marker techniques random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats (ISSR), in addition to biochemical markers based on isozyme analysis. These techniques, coupled with appropriate statistical tools, proved useful for determining phylogenetic relationships among date palm cultivars and provide information resources for date palm gene banks.

  2. A quantitative assessment of alkaptonuria: testing the reliability of two disease severity scoring systems.

    PubMed

    Cox, Trevor F; Ranganath, Lakshminarayan

    2011-12-01

    Alkaptonuria (AKU) is caused by excessive accumulation of homogentisic acid (HGA) in body fluids owing to a lack of the enzyme homogentisate dioxygenase; this leads to varied clinical manifestations, mainly through conversion of HGA to a polymeric melanin-like pigment in a process known as ochronosis. A potential treatment, a drug called nitisinone, is available to decrease the formation of HGA. However, successful demonstration of its efficacy in modifying the natural history of AKU requires an effective quantitative assessment tool. We have described two potential tools that could be used to quantitate disease burden in AKU. One tool scores clinical features, including clinical assessments, investigations and questionnaires, in 15 patients with AKU. The second tool is a scoring system that only includes items obtained from questionnaires used in 44 people with AKU. Statistical analyses were carried out on the two patient datasets to assess the AKU tools; these included the calculation of Cronbach's alpha, multidimensional scaling and simple linear regression analysis. The conclusion was that there was good evidence that the tools could be adopted as AKU assessment tools, but perhaps with further refinement before being used in the practical setting of a clinical trial.
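
    For reference, Cronbach's alpha for a patient-by-item score matrix can be computed as follows (a generic helper, not the authors' analysis code).

      # Cronbach's alpha for a matrix with one row per patient and one column per item.
      import numpy as np

      def cronbach_alpha(scores):
          scores = np.asarray(scores, dtype=float)
          k = scores.shape[1]                              # number of items
          item_vars = scores.var(axis=0, ddof=1).sum()     # sum of item variances
          total_var = scores.sum(axis=1).var(ddof=1)       # variance of total scores
          return (k / (k - 1)) * (1 - item_vars / total_var)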

  3. A new tool to evaluate postgraduate training posts: the Job Evaluation Survey Tool (JEST).

    PubMed

    Wall, David; Goodyear, Helen; Singh, Baldev; Whitehouse, Andrew; Hughes, Elizabeth; Howes, Jonathan

    2014-10-02

    Three reports in 2013 about healthcare and patient safety in the UK, namely Berwick, Francis and Keogh have highlighted the need for junior doctors' views about their training experience to be heard. In the UK, the General Medical Council (GMC) quality assures medical training programmes and requires postgraduate deaneries to undertake quality management and monitoring of all training posts in their area. The aim of this study was to develop a simple trainee questionnaire for evaluation of postgraduate training posts based on the GMC, UK standards and to look at the reliability and validity including comparison with a well-established and internationally validated tool, the Postgraduate Hospital Educational Environment Measure (PHEEM). The Job Evaluation Survey Tool (JEST), a fifteen item job evaluation questionnaire was drawn up in 2006, piloted with Foundation doctors (2007), field tested with specialist paediatric registrars (2008) and used over a three year period (2008-11) by Foundation Doctors. Statistical analyses including descriptives, reliability, correlation and factor analysis were undertaken and JEST compared with PHEEM. The JEST had a reliability of 0.91 in the pilot study of 76 Foundation doctors, 0.88 in field testing of 173 Paediatric specialist registrars and 0.91 in three years of general use in foundation training with 3367 doctors completing JEST. Correlation of JEST with PHEEM was 0.80 (p < 0.001). Factor analysis showed two factors, a teaching factor and a social and lifestyle one. The JEST has proved to be a simple, valid and reliable evaluation tool in the monitoring and evaluation of postgraduate hospital training posts.

  4. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    PubMed Central

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is an important tool for understanding the distribution of simple sequence repeats (SSRs) across closely related genomes of an organism, which gives valuable information regarding genetic variation. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species, which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella genomes indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetra-nucleotide SSRs are underrepresented in Brucella genomes. From the data, it can be suggested that the overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for generating functional variation in the expressed proteins, which in turn may lead to differences in pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins across Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309

  5. Improved Analysis of Earth System Models and Observations using Simple Climate Models

    NASA Astrophysics Data System (ADS)

    Nadiga, B. T.; Urban, N. M.

    2016-12-01

    Earth system models (ESM) are the most comprehensive tools we have to study climate change and develop climate projections. However, the computational infrastructure required and the cost incurred in running such ESMs preclude direct use of such models in conjunction with a wide variety of tools that can further our understanding of climate. Here we are referring to tools that range from dynamical systems tools that give insight into underlying flow structure and topology, to tools from various applied mathematical and statistical techniques that are central to quantifying stability, sensitivity, uncertainty and predictability, to machine learning tools that are now being rapidly developed or improved. Our approach to facilitate the use of such models is to analyze output of ESM experiments (cf. CMIP) using a range of simpler models that consider integral balances of important quantities such as mass and/or energy in a Bayesian framework. We highlight the use of this approach in the context of the uptake of heat by the world oceans in the ongoing global warming. Indeed, since in excess of 90% of the anomalous radiative forcing due to greenhouse gas emissions is sequestered in the world oceans, the nature of ocean heat uptake crucially determines the surface warming that is realized (cf. climate sensitivity). Nevertheless, ESMs themselves are never run long enough to directly assess climate sensitivity. So, we consider a range of models based on integral balances--balances that have to be realized in all first-principles based models of the climate system, including the most detailed state-of-the-art climate simulations. The models range from simple models of energy balance to those that consider dynamically important ocean processes such as the conveyor-belt circulation (Meridional Overturning Circulation, MOC), North Atlantic Deep Water (NADW) formation, Antarctic Circumpolar Current (ACC) and eddy mixing. Results from Bayesian analysis of such models using both ESM experiments and actual observations are presented. One such result points to the importance of direct sequestration of heat below 700 m, a process that is not allowed for in the simple models that have been traditionally used to deduce climate sensitivity.

  6. Statistical models for causation: what inferential leverage do they provide?

    PubMed

    Freedman, David A

    2006-12-01

    Experiments offer more reliable evidence on causation than observational studies, which is not to gainsay the contribution to knowledge from observation. Experiments should be analyzed as experiments, not as observational studies. A simple comparison of rates might be just the right tool, with little value added by "sophisticated" models. This article discusses current models for causation, as applied to experimental and observational data. The intention-to-treat principle and the effect of treatment on the treated will also be discussed. Flaws in per-protocol and treatment-received estimates will be demonstrated.

  7. Regression Models for Identifying Noise Sources in Magnetic Resonance Images

    PubMed Central

    Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.

    2009-01-01

    Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478

  8. Prediction of N-nitrosodimethylamine (NDMA) formation as a disinfection by-product.

    PubMed

    Kim, Jongo; Clevenger, Thomas E

    2007-06-25

    This study investigated the possibility of applying a statistical model for the prediction of N-nitrosodimethylamine (NDMA) formation. NDMA formation was studied as a function of monochloramine concentration (0.001-5 mM) at fixed dimethylamine (DMA) concentrations of 0.01 mM or 0.05 mM. Excellent linear correlations were observed between the molar ratio of monochloramine to DMA and the NDMA formation on a log scale at pH 7 and 8. When the developed prediction equation was applied to a previously reported study, a good result was obtained. The statistical model appears to predict NDMA concentrations adequately if other NDMA precursors are excluded. Using the predictive tool, a simple and approximate calculation of NDMA formation can be obtained in drinking water systems.
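
    A schematic of the kind of prediction model described: a straight line fitted to log-transformed NDMA formation versus the log molar ratio of monochloramine to DMA; the arrays below are placeholders, not the measured data.

      # Log-log linear fit and a simple predictor derived from it.
      import numpy as np

      log_ratio = np.log10(np.array([0.1, 0.5, 1.0, 5.0, 10.0, 50.0]))   # placeholder ratios
      log_ndma = np.log10(np.array([0.02, 0.08, 0.15, 0.60, 1.10, 4.50]))  # placeholder responses

      slope, intercept = np.polyfit(log_ratio, log_ndma, 1)

      def predict_ndma(molar_ratio):
          return 10 ** (intercept + slope * np.log10(molar_ratio))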

  9. RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

    2007-01-01

    Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253

  10. En route Spacing Tool: Efficient Conflict-free Spacing to Flow-Restricted Airspace

    NASA Technical Reports Server (NTRS)

    Green, S.

    1999-01-01

    This paper describes the Air Traffic Management (ATM) problem within the U.S. of flow-restricted en route airspace, an assessment of its impact on airspace users, and a set of near-term tools and procedures to resolve the problem. The FAA is committed, over the next few years, to deploy the first generation of modern ATM decision support tool (DST) technology under the Free-Flight Phase-1 (FFP1) program. The associated en route tools include the User Request Evaluation Tool (URET) and the Traffic Management Advisor (TMA). URET is an initial conflict probe (ICP) capability that assists controllers with the detection and resolution of conflicts in en route airspace. TMA orchestrates arrivals transitioning into high-density terminal airspace by providing controllers with scheduled times of arrival (STA) and delay feedback advisories to assist with STA conformance. However, these FFP1 capabilities do not mitigate the en route Miles-In-Trail (MIT) restrictions that are dynamically applied to mitigate airspace congestion. National statistics indicate that en route facilities (Centers) apply MIT restrictions for approximately 5000 hours per month. Based on results from this study, an estimated 45,000 flights are impacted by these restrictions each month. Current-day practices for implementing these restrictions result in additional controller workload and an economic impact of which the fuel penalty alone may approach several hundred dollars per flight. To mitigate much of the impact of these restrictions on users and controller workload, a DST and procedures are presented. The DST is based on a simple derivative of FFP1 technology that is designed to introduce a set of simple tools for flow-rate (spacing) conformance and integrate them with conflict-probe capabilities. The tool and associated algorithms are described based on a concept prototype implemented within the CTAS baseline in 1995. A traffic scenario is used to illustrate the controller's use of the tool, and potential display options are presented for future controller evaluation.

  11. Advancements in RNASeqGUI towards a Reproducible Analysis of RNA-Seq Experiments

    PubMed Central

    Russo, Francesco; Righelli, Dario

    2016-01-01

    We present the advancements and novelties recently introduced in RNASeqGUI, a graphical user interface that helps biologists to handle and analyse large data collected in RNA-Seq experiments. This work focuses on the concept of reproducible research and shows how it has been incorporated in RNASeqGUI to provide reproducible (computational) results. The novel version of RNASeqGUI combines graphical interfaces with tools for reproducible research, such as literate statistical programming, human readable report, parallel executions, caching, and interactive and web-explorable tables of results. These features allow the user to analyse big datasets in a fast, efficient, and reproducible way. Moreover, this paper represents a proof of concept, showing a simple way to develop computational tools for Life Science in the spirit of reproducible research. PMID:26977414

  12. The ideas behind self-consistent expansion

    NASA Astrophysics Data System (ADS)

    Schwartz, Moshe; Katzav, Eytan

    2008-04-01

    In recent years we have witnessed a growing interest in various non-equilibrium systems described in terms of stochastic nonlinear field theories. In some of those systems, like KPZ and related models, the interesting behavior is in the strong coupling regime, which is inaccessible by traditional perturbative treatments such as the dynamical renormalization group (DRG). A useful tool in the study of such systems is the self-consistent expansion (SCE), which might be said to generate its own 'small parameter'. The SCE has the advantage that its structure is just that of a regular expansion; the only difference is that the simple system around which the expansion is performed is adjustable. The purpose of this paper is to present the method in a simple and understandable way that hopefully will make it accessible to a wider public working on non-equilibrium statistical physics.

  13. 4P: fast computing of population genetics statistics from large DNA polymorphism panels

    PubMed Central

    Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio

    2015-01-01

    Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations. PMID:25628874

  14. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most updated information and newly added models.

  15. LADES: a software for constructing and analyzing longitudinal designs in biomedical research.

    PubMed

    Vázquez-Alcocer, Alan; Garzón-Cortes, Daniel Ladislao; Sánchez-Casas, Rosa María

    2014-01-01

    One of the most important steps in biomedical longitudinal studies is choosing a good experimental design that can provide high accuracy in the analysis of results with a minimum sample size. Several methods for constructing efficient longitudinal designs have been developed based on power analysis and the statistical model used for analyzing the final results. However, development of this technology is not available to practitioners through user-friendly software. In this paper we introduce LADES (Longitudinal Analysis and Design of Experiments Software) as an alternative and easy-to-use tool for conducting longitudinal analysis and constructing efficient longitudinal designs. LADES incorporates methods for creating cost-efficient longitudinal designs, unequal longitudinal designs, and simple longitudinal designs. In addition, LADES includes different methods for analyzing longitudinal data such as linear mixed models, generalized estimating equations, among others. A study of European eels is reanalyzed in order to show LADES capabilities. Three treatments contained in three aquariums with five eels each were analyzed. Data were collected from 0 up to the 12th week post treatment for all the eels (complete design). The response under evaluation is sperm volume. A linear mixed model was fitted to the results using LADES. The complete design had a power of 88.7% using 15 eels. With LADES we propose the use of an unequal design with only 14 eels and 89.5% efficiency. LADES was developed as a powerful and simple tool to promote the use of statistical methods for analyzing and creating longitudinal experiments in biomedical research.

  16. CoNVaQ: a web tool for copy number variation-based association studies.

    PubMed

    Larsen, Simon Jonas; do Canto, Luisa Matos; Rogatto, Silvia Regina; Baumbach, Jan

    2018-05-18

    Copy number variations (CNVs) are large segments of the genome that are duplicated or deleted. Structural variations in the genome have been linked to many complex diseases. Similar to how genome-wide association studies (GWAS) have helped discover single-nucleotide polymorphisms linked to disease phenotypes, the extension of GWAS to CNVs has aided the discovery of structural variants associated with human traits and diseases. We present CoNVaQ, an easy-to-use web-based tool for CNV-based association studies. The web service allows users to upload two sets of CNV segments and search for genomic regions where the occurrence of CNVs is significantly associated with the phenotype. CoNVaQ provides two models: a simple statistical model using Fisher's exact test and a novel query-based model matching regions to user-defined queries. For each region, the method computes a global q-value statistic by repeated permutation of samples among the populations. We demonstrate our platform by using it to analyze a data set of HPV-positive and HPV-negative penile cancer patients. CoNVaQ provides a simple workflow for performing CNV-based association studies. It is made available as a web platform in order to provide a user-friendly workflow for biologists and clinicians to carry out CNV data analysis without installing any software. Through the web interface, users are also able to analyze their results to find overrepresented GO terms and pathways. In addition, our method is also available as a package for the R programming language. CoNVaQ is available at https://convaq.compbio.sdu.dk .
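
    A minimal sketch of the simple statistical model described: for a single genomic region, Fisher's exact test on CNV occurrence versus phenotype, with a permutation of sample labels standing in for the repeated-permutation step; this is not the CoNVaQ implementation.

      # Fisher's exact test for one region plus a label-permutation check.
      import numpy as np
      from scipy.stats import fisher_exact

      def region_association(has_cnv, phenotype, n_perm=1000, rng=None):
          """has_cnv, phenotype: boolean arrays, one entry per sample."""
          rng = rng or np.random.default_rng(0)
          has_cnv = np.asarray(has_cnv, bool)
          phenotype = np.asarray(phenotype, bool)

          def p_value(labels):
              table = [[np.sum(has_cnv & labels), np.sum(has_cnv & ~labels)],
                       [np.sum(~has_cnv & labels), np.sum(~has_cnv & ~labels)]]
              return fisher_exact(table)[1]

          observed = p_value(phenotype)
          perm = [p_value(rng.permutation(phenotype)) for _ in range(n_perm)]
          empirical = (1 + sum(p <= observed for p in perm)) / (n_perm + 1)
          return observed, empirical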

  17. Statistical process control: separating signal from noise in emergency department operations.

    PubMed

    Pimentel, Laura; Barrueto, Fermin

    2015-05-01

    Statistical process control (SPC) is a visually appealing and statistically rigorous methodology very suitable to the analysis of emergency department (ED) operations. We demonstrate that the control chart is the primary tool of SPC; it is constructed by plotting data measuring the key quality indicators of operational processes in rationally ordered subgroups such as units of time. Control limits are calculated using formulas reflecting the variation in the data points from one another and from the mean. SPC allows managers to determine whether operational processes are controlled and predictable. We review why the moving range chart is most appropriate for use in the complex ED milieu, how to apply SPC to ED operations, and how to determine when performance improvement is needed. SPC is an excellent tool for operational analysis and quality improvement for these reasons: 1) control charts make large data sets intuitively coherent by integrating statistical and visual descriptions; 2) SPC provides analysis of process stability and capability rather than simple comparison with a benchmark; 3) SPC allows distinction between special cause variation (signal), indicating an unstable process requiring action, and common cause variation (noise), reflecting a stable process; and 4) SPC keeps the focus of quality improvement on process rather than individual performance. Because data have no meaning apart from their context, and every process generates information that can be used to improve it, we contend that SPC should be seriously considered for driving quality improvement in emergency medicine. Copyright © 2015 Elsevier Inc. All rights reserved.
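
    A minimal sketch of the individuals/moving-range (XmR) limits underlying the moving range chart discussed here, using the conventional 2.66 and 3.267 chart constants; the example values are illustrative.

      # Individuals and moving-range (XmR) control limits; points outside the
      # limits signal special-cause variation.
      import numpy as np

      def xmr_limits(values):
          x = np.asarray(values, dtype=float)
          moving_range = np.abs(np.diff(x))
          x_bar, mr_bar = x.mean(), moving_range.mean()
          return {
              "x_center": x_bar,
              "x_ucl": x_bar + 2.66 * mr_bar,     # 3-sigma-equivalent limits for individuals
              "x_lcl": x_bar - 2.66 * mr_bar,
              "mr_ucl": 3.267 * mr_bar,           # upper limit for the moving range chart
          }

      # e.g. daily median ED length-of-stay values (minutes) as the key indicator:
      print(xmr_limits([182, 175, 190, 168, 201, 185, 179, 188, 173, 195]))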

  18. VennDIS: a JavaFX-based Venn and Euler diagram software to generate publication quality figures.

    PubMed

    Ignatchenko, Vladimir; Ignatchenko, Alexandr; Sinha, Ankit; Boutros, Paul C; Kislinger, Thomas

    2015-04-01

    Venn diagrams are graphical representations of the relationships among multiple sets of objects and are often used to illustrate similarities and differences among genomic and proteomic datasets. All currently existing tools for producing Venn diagrams evince one of two traits; they require expertise in specific statistical software packages (such as R), or lack the flexibility required to produce publication-quality figures. We describe a simple tool that addresses both shortcomings, Venn Diagram Interactive Software (VennDIS), a JavaFX-based solution for producing highly customizable, publication-quality Venn, and Euler diagrams of up to five sets. The strengths of VennDIS are its simple graphical user interface and its large array of customization options, including the ability to modify attributes such as font, style and position of the labels, background color, size of the circle/ellipse, and outline color. It is platform independent and provides real-time visualization of figure modifications. The created figures can be saved as XML files for future modification or exported as high-resolution images for direct use in publications. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Weighing Evidence "Steampunk" Style via the Meta-Analyser.

    PubMed

    Bowden, Jack; Jackson, Chris

    2016-10-01

    The funnel plot is a graphical visualization of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modeling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the center of mass of a physical system. We used this analogy to explain the concepts of weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without recourse to precise definitions or statistical formulas and with a little help from Sherlock Holmes! Following on from the science fair, we have developed an interactive web-application (named the Meta-Analyser) to bring these ideas to a wider audience. We envisage that our application will be a useful tool for researchers when interpreting their data. First, to facilitate a simple understanding of fixed and random effects modeling approaches; second, to assess the importance of outliers; and third, to show the impact of adjusting for small study bias. This final aim is realized by introducing a novel graphical interpretation of the well-known method of Egger regression.
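
    A small sketch of the centre-of-mass analogy: the fixed-effect pooled estimate is the inverse-variance weighted mean of the study estimates, with the weights playing the role of the masses; the example estimates are illustrative.

      # Fixed-effect meta-analysis as an inverse-variance weighted mean.
      import math

      def fixed_effect(estimates, standard_errors):
          weights = [1.0 / se ** 2 for se in standard_errors]
          pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
          pooled_se = math.sqrt(1.0 / sum(weights))
          return pooled, pooled_se

      # Three hypothetical study estimates (log odds ratios) and their standard errors:
      print(fixed_effect([0.30, 0.10, 0.45], [0.10, 0.20, 0.15]))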

  20. A novel methodology for building robust design rules by using design based metrology (DBM)

    NASA Astrophysics Data System (ADS)

    Lee, Myeongdong; Choi, Seiryung; Choi, Jinwoo; Kim, Jeahyun; Sung, Hyunju; Yeo, Hyunyoung; Shim, Myoungseob; Jin, Gyoyoung; Chung, Eunseung; Roh, Yonghan

    2013-03-01

    This paper addresses a methodology for building robust design rules by using design based metrology (DBM). The conventional method for building design rules has been to use a simulation tool and a simple pattern spider mask. At the early stage of device development, the estimates from the simulation tool are poor, and the evaluation of the simple pattern spider mask is rather subjective because it depends on the experiential judgment of an engineer. In this work, we designed a huge number of pattern situations including various 1D and 2D design structures. To overcome the difficulties of inspecting many types of patterns, we introduced the Design Based Metrology (DBM) of Nano Geometry Research, Inc., with which these mass patterns could be inspected at high speed. We also carried out quantitative analysis on PWQ silicon data to estimate process variability. Our methodology demonstrates high speed and accuracy for building design rules. All of the test patterns were inspected within a few hours. Mass silicon data were handled by statistical processing rather than personal judgment. From the results, robust design rules are successfully verified and extracted. Finally, we found that our methodology is appropriate for building robust design rules.

  1. Effectiveness of environmental-based educative program for disaster preparedness and burn management.

    PubMed

    Moghazy, Amr; Abdelrahman, Amira; Fahim, Ayman

    2012-01-01

    Preparedness is a necessity for the proper handling of emergencies and disasters, particularly in the Suez Canal and Sinai regions. To assure the best success rates, educative programs should be environmentally based. Burn and fire prevention educative programs were therefore tailored to the social and educational levels of the audience. In addition, common etiologies and the applicability of preventive measures, according to local resources and logistics, were considered. Presentations were the main educative tool; they were made as simple as possible to assure the best understanding. To assure continuous education, brochures and stickers containing the most common mistakes and questions were distributed after the sessions. The audience was classified according to level of knowledge into health professionals, students, a high-risk group, and lay people. For course efficacy evaluation, pre- and posttests were used immediately before and after the sessions. Correct answers in both tests were compared for statistical significance. Results showed significant acquisition of proper attitude and knowledge in all educated groups; the gain was highest among students and lowest among health professionals. Comprehensive, simple, environmentally based educative programs are ideal for rapid reform and community mobilization in our region. Activities should include direct contact, stickers and flyers, and audiovisual tools if possible.

  2. Navigating freely-available software tools for metabolomics analysis.

    PubMed

    Spicer, Rachel; Salek, Reza M; Moreno, Pablo; Cañueto, Daniel; Steinbeck, Christoph

    2017-01-01

    The field of metabolomics has expanded greatly over the past two decades, both as an experimental science with applications in many areas and with regard to data standards and bioinformatics software tools. The diversity of experimental designs and instrumental technologies used for metabolomics has led to the need for distinct data analysis methods and the development of many software tools. Our aim was to compile a comprehensive list of the most widely used freely available software and tools used primarily in metabolomics. The most widely used tools were selected for inclusion in the review by either ≥ 50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey. Tools were then categorised by the type of instrumental data (i.e. LC-MS, GC-MS or NMR) and the functionality (i.e. pre- and post-processing, statistical analysis, workflow and other functions) they are designed for. A comprehensive list of the most widely used tools was compiled. Each tool is discussed within the context of its application domain and in relation to comparable tools of the same domain. An extended list including additional tools is available at https://github.com/RASpicer/MetabolomicsTools which is classified and searchable via a simple controlled vocabulary. This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. As future work, we suggest a direct comparison of tools' abilities to perform specific data analysis tasks e.g. peak picking.

  3. SMART-COP: a tool for predicting the need for intensive respiratory or vasopressor support in community-acquired pneumonia.

    PubMed

    Charles, Patrick G P; Wolfe, Rory; Whitby, Michael; Fine, Michael J; Fuller, Andrew J; Stirling, Robert; Wright, Alistair A; Ramirez, Julio A; Christiansen, Keryn J; Waterer, Grant W; Pierce, Robert J; Armstrong, John G; Korman, Tony M; Holmes, Peter; Obrosky, D Scott; Peyrani, Paula; Johnson, Barbara; Hooy, Michelle; Grayson, M Lindsay

    2008-08-01

    Existing severity assessment tools, such as the pneumonia severity index (PSI) and CURB-65 (tool based on confusion, urea level, respiratory rate, blood pressure, and age ≥65 years), predict 30-day mortality in community-acquired pneumonia (CAP) and have limited ability to predict which patients will require intensive respiratory or vasopressor support (IRVS). The Australian CAP Study (ACAPS) was a prospective study of 882 episodes in which each patient had a detailed assessment of severity features, etiology, and treatment outcomes. Multivariate logistic regression was performed to identify features at initial assessment that were associated with receipt of IRVS. These results were converted into a simple points-based severity tool that was validated in 5 external databases, totaling 7464 patients. In ACAPS, 10.3% of patients received IRVS, and the 30-day mortality rate was 5.7%. The features statistically significantly associated with receipt of IRVS were low systolic blood pressure (2 points), multilobar chest radiography involvement (1 point), low albumin level (1 point), high respiratory rate (1 point), tachycardia (1 point), confusion (1 point), poor oxygenation (2 points), and low arterial pH (2 points): SMART-COP. A SMART-COP score of ≥3 points identified 92% of patients who received IRVS, including 84% of patients who did not need immediate admission to the intensive care unit. Accuracy was also high in the 5 validation databases. Sensitivities of PSI and CURB-65 for identifying the need for IRVS were 74% and 39%, respectively. SMART-COP is a simple, practical clinical tool for accurately predicting the need for IRVS that is likely to assist clinicians in determining CAP severity.
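
    The abstract gives the point values assigned to each feature, so the score itself is a short sum; the clinical thresholds defining each feature (e.g. what counts as "low" systolic blood pressure) are not reproduced here, so the sketch below simply takes the eight criteria as boolean flags. It illustrates the scoring arithmetic only and is not a validated clinical calculator.

      # SMART-COP point values as listed in the abstract; inputs are booleans
      # indicating whether each criterion is met for a given patient.
      SMART_COP_POINTS = {
          "low_systolic_bp": 2,
          "multilobar_involvement": 1,
          "low_albumin": 1,
          "high_respiratory_rate": 1,
          "tachycardia": 1,
          "confusion": 1,
          "poor_oxygenation": 2,
          "low_arterial_ph": 2,
      }

      def smart_cop_score(features: dict) -> int:
          """Sum the points for every criterion flagged True."""
          return sum(pts for name, pts in SMART_COP_POINTS.items() if features.get(name))

      # Example patient: low blood pressure, high respiratory rate, poor oxygenation.
      patient = {"low_systolic_bp": True, "high_respiratory_rate": True, "poor_oxygenation": True}
      score = smart_cop_score(patient)
      print(score, "-> meets the >=3 threshold for IRVS risk" if score >= 3 else "-> below threshold")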

  4. Statistical guides to estimating the number of undiscovered mineral deposits: an example with porphyry copper deposits

    USGS Publications Warehouse

    Singer, Donald A.; Menzie, W.D.; Cheng, Qiuming; Bonham-Carter, G. F.

    2005-01-01

    Estimating numbers of undiscovered mineral deposits is a fundamental part of assessing mineral resources. Some statistical tools can act as guides to low variance, unbiased estimates of the number of deposits. The primary guide is that the estimates must be consistent with the grade and tonnage models. Another statistical guide is the deposit density (i.e., the number of deposits per unit area of permissive rock in well-explored control areas). Preliminary estimates and confidence limits of the number of undiscovered deposits in a tract of given area may be calculated using linear regression and refined using frequency distributions with appropriate parameters. A Poisson distribution leads to estimates having lower relative variances than the regression estimates and implies a random distribution of deposits. Coefficients of variation are used to compare uncertainties of negative binomial, Poisson, or MARK3 empirical distributions that have the same expected number of deposits as the deposit density. Statistical guides presented here allow simple yet robust estimation of the number of undiscovered deposits in permissive terranes. 
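
    As a rough illustration of the deposit-density idea, the sketch below scales a hypothetical density (deposits per square kilometre of permissive rock in well-explored control areas) to a new tract and places a Poisson distribution on the count of undiscovered deposits; the numbers are invented, and the choice of distribution (Poisson versus negative binomial) is exactly the kind of decision the coefficients of variation discussed above are meant to guide.

      from scipy.stats import poisson

      # Hypothetical inputs: deposit density from control areas and the area of
      # permissive rock in the tract being assessed.
      density_per_km2 = 0.002      # deposits per km^2 (invented value)
      tract_area_km2 = 1500.0

      expected = density_per_km2 * tract_area_km2            # expected number of deposits
      lower, upper = poisson.ppf([0.05, 0.95], expected)     # 90% interval on the count

      print(f"Expected deposits: {expected:.1f}, 90% interval: {int(lower)}-{int(upper)}")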

  5. New approach in the quantum statistical parton distribution

    NASA Astrophysics Data System (ADS)

    Sohaily, Sozha; Vaziri (Khamedi), Mohammad

    2017-12-01

    An attempt to find simple parton distribution functions (PDFs) based on a quantum statistical approach is presented. The PDFs described by the statistical model have very interesting physical properties which help to understand the structure of partons. The longitudinal portions of the distribution functions are obtained by applying the maximum entropy principle. An interesting and simple approach to determine the statistical variables exactly, without fitting or fixing parameters, is surveyed. Analytic expressions for the x-dependent PDFs are obtained in the whole x region [0, 1], and the computed distributions are consistent with the experimental observations. The agreement with experimental data gives a robust confirmation of the presented simple statistical model.

  6. Rocker: Open source, easy-to-use tool for AUC and enrichment calculations and ROC visualization.

    PubMed

    Lätti, Sakari; Niinivehmas, Sanna; Pentikäinen, Olli T

    2016-01-01

    The receiver operating characteristic (ROC) curve, together with the calculation of the area under the curve (AUC), is a useful tool for evaluating the performance of biomedical and chemoinformatics data. For example, in virtual drug screening, ROC curves are very often used to visualize the efficiency of the application used to separate active ligands from inactive molecules. Unfortunately, most of the available tools for ROC analysis are implemented in commercially available software packages, or are plugins in statistical software, which are not always the easiest to use. Here, we present Rocker, a simple ROC curve visualization tool that can be used for the generation of publication-quality images. Rocker also includes automatic calculation of the AUC for the ROC curve and of the Boltzmann-enhanced discrimination of ROC (BEDROC). Furthermore, in virtual screening campaigns it is often important to understand the early enrichment of active ligand identification; for this, Rocker offers an automated calculation routine. To enable further development of Rocker, it is freely available (MIT-GPL license) for use and modification from our website (http://www.jyu.fi/rocker).
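
    For readers who want the arithmetic behind such a tool rather than the GUI, the sketch below computes a ROC curve and its AUC for a toy virtual-screening ranking using scikit-learn, plus a crude early-enrichment figure; it is a generic illustration, not Rocker's own implementation, and the BEDROC routine Rocker provides is not reproduced.

      import numpy as np
      from sklearn.metrics import roc_curve, roc_auc_score

      # Toy screening data: 1 = active ligand, 0 = inactive; higher score = ranked earlier.
      labels = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])
      scores = np.array([0.95, 0.80, 0.75, 0.70, 0.65, 0.50, 0.45, 0.30, 0.20, 0.10])

      fpr, tpr, _ = roc_curve(labels, scores)
      auc = roc_auc_score(labels, scores)

      # Simple early-enrichment figure: fraction of actives recovered in the top 20%.
      top = scores >= np.quantile(scores, 0.8)
      early_recall = labels[top].sum() / labels.sum()

      print(f"AUC = {auc:.2f}, actives recovered in top 20% = {early_recall:.2f}")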

  7. Ask-the-expert: Active Learning Based Knowledge Discovery Using the Expert

    NASA Technical Reports Server (NTRS)

    Das, Kamalika; Avrekh, Ilya; Matthews, Bryan; Sharma, Manali; Oza, Nikunj

    2017-01-01

    Often the manual review of large data sets, either for labeling unlabeled instances or for separating meaningful results from uninteresting (but statistically significant) ones, is extremely resource intensive, especially in terms of subject matter expert (SME) time. Use of active learning has been shown to diminish this review time significantly. However, since active learning is an iterative process of learning a classifier based on a small number of SME-provided labels at each iteration, the lack of an enabling tool can hinder the adoption of these technologies in real life, in spite of their labor-saving potential. In this demo we present ASK-the-Expert, an interactive tool that allows SMEs to review instances from a data set and provide labels within a single framework. ASK-the-Expert is powered by an active learning algorithm for training a classifier in the backend. We demonstrate this system in the context of an aviation safety application, but the tool can also be adapted to work as a simple review and labeling tool, without the use of active learning.

  8. Ask-the-Expert: Active Learning Based Knowledge Discovery Using the Expert

    NASA Technical Reports Server (NTRS)

    Das, Kamalika

    2017-01-01

    Often the manual review of large data sets, either for labeling unlabeled instances or for separating meaningful results from uninteresting (but statistically significant) ones, is extremely resource intensive, especially in terms of subject matter expert (SME) time. Use of active learning has been shown to diminish this review time significantly. However, since active learning is an iterative process of learning a classifier based on a small number of SME-provided labels at each iteration, the lack of an enabling tool can hinder the adoption of these technologies in real life, in spite of their labor-saving potential. In this demo we present ASK-the-Expert, an interactive tool that allows SMEs to review instances from a data set and provide labels within a single framework. ASK-the-Expert is powered by an active learning algorithm for training a classifier in the back end. We demonstrate this system in the context of an aviation safety application, but the tool can also be adapted to work as a simple review and labeling tool, without the use of active learning.
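
    The tool itself is demonstrated on aviation safety data, but the loop it automates is standard pool-based active learning with uncertainty sampling: train on the few labels available, query the expert for the instance the classifier is least sure about, and repeat. The sketch below shows that generic loop on synthetic data; it is an illustration of the idea, not the ASK-the-Expert code, and the "expert" here is simply the known label.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression

      X, y = make_classification(n_samples=300, n_features=10, random_state=0)

      # Seed the labeled set with a few instances from each class (the SME's initial labels).
      labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
      pool = [i for i in range(len(y)) if i not in labeled]

      for round_ in range(5):
          clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
          proba = clf.predict_proba(X[pool])[:, 1]
          # Uncertainty sampling: pick the pool instance closest to the decision boundary.
          query = pool[int(np.argmin(np.abs(proba - 0.5)))]
          labeled.append(query)            # in practice the SME would supply y[query]
          pool.remove(query)
          print(f"round {round_}: queried instance {query}, labeled set size {len(labeled)}")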

  9. Score tests for independence in semiparametric competing risks models.

    PubMed

    Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul

    2009-12-01

    A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

  10. Statistical approaches to account for missing values in accelerometer data: Applications to modeling physical activity.

    PubMed

    Yue Xu, Selene; Nelson, Sandahl; Kerr, Jacqueline; Godbole, Suneeta; Patterson, Ruth; Merchant, Gina; Abramson, Ian; Staudenmayer, John; Natarajan, Loki

    2018-04-01

    Physical inactivity is a recognized risk factor for many chronic diseases. Accelerometers are increasingly used as an objective means to measure daily physical activity. One challenge in using these devices is missing data due to device nonwear. We used a well-characterized cohort of 333 overweight postmenopausal breast cancer survivors to examine missing data patterns of accelerometer outputs over the day. Based on these observed missingness patterns, we created pseudo-simulated datasets with realistic missing data patterns. We developed statistical methods to design imputation and variance weighting algorithms to account for missing data effects when fitting regression models. Bias and precision of each method were evaluated and compared. Our results indicated that not accounting for missing data in the analysis yielded unstable estimates in the regression analysis. Incorporating variance weights and/or subject-level imputation improved precision by >50%, compared to ignoring missing data. We recommend that these simple, easy-to-implement statistical tools be used to improve analysis of accelerometer data.
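
    To make the general idea concrete, the sketch below imputes a subject's missing accelerometer epochs with that subject's own observed mean and then down-weights subjects with more missingness in a weighted least-squares fit. This is a minimal, assumed-form illustration of "subject-level imputation plus variance weights" on synthetic data, not the authors' actual algorithm; the covariate and weighting scheme are invented for the example.

      import numpy as np

      rng = np.random.default_rng(0)
      n_subjects, n_epochs = 50, 100

      # Synthetic activity counts with some epochs missing (NaN = device nonwear).
      activity = rng.normal(200, 40, size=(n_subjects, n_epochs))
      activity[rng.random(activity.shape) < 0.2] = np.nan

      # Subject-level imputation: replace missing epochs by that subject's own mean.
      subject_mean = np.nanmean(activity, axis=1, keepdims=True)
      imputed = np.where(np.isnan(activity), subject_mean, activity)
      daily_total = imputed.sum(axis=1)

      # Variance weights: subjects with more observed epochs get more weight.
      weights = (~np.isnan(activity)).sum(axis=1) / n_epochs

      # Weighted least squares of daily activity on a (synthetic) covariate.
      bmi = rng.normal(28, 4, n_subjects)
      X = np.column_stack([np.ones(n_subjects), bmi])
      W = np.diag(weights)
      beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ daily_total)
      print("weighted regression coefficients (intercept, BMI):", beta)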

  11. Bootstrap Methods: A Very Leisurely Look.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Winstead, Wayland H.

    The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
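
    The original illustration uses a SAS routine; a minimal modern equivalent of the same idea, estimating the standard error of a simple statistic by resampling with replacement, is shown below (Python used purely for illustration, not a transcription of the SAS code).

      import numpy as np

      rng = np.random.default_rng(42)
      sample = rng.normal(loc=10.0, scale=3.0, size=30)    # an observed sample

      boot_means = np.array([
          rng.choice(sample, size=sample.size, replace=True).mean()
          for _ in range(2000)                              # 2000 bootstrap resamples
      ])

      print(f"sample mean {sample.mean():.2f}, bootstrap SE {boot_means.std(ddof=1):.2f}")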

  12. Invariant approach to the character classification

    NASA Astrophysics Data System (ADS)

    Šariri, Kristina; Demoli, Nazif

    2008-04-01

    Image moment analysis is a very useful tool which allows image description invariant to translation, rotation, scale change, and some types of image distortion. The aim of this work was the development of a simple method for fast and reliable classification of characters using Hu's and affine moment invariants. Euclidean distance was used as the discrimination feature, with its statistical parameters estimated. The method was tested on classification of Times New Roman font letters as well as sets of handwritten characters. It is shown that using all of Hu's invariants and three affine invariants as the discrimination set improves the recognition rate by 30%.
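
    A minimal version of this pipeline, computing Hu's seven moment invariants for a binarized glyph with OpenCV and assigning a label by nearest Euclidean distance to stored class templates, might look like the sketch below; the log scaling of the invariants is a common practical choice and is an assumption here, as is the toy template set, neither of which is stated in the abstract.

      import cv2
      import numpy as np

      def hu_signature(binary_img: np.ndarray) -> np.ndarray:
          """Seven Hu moment invariants, log-scaled to compress their dynamic range."""
          hu = cv2.HuMoments(cv2.moments(binary_img)).flatten()
          return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

      def classify(img: np.ndarray, templates: dict) -> str:
          """Return the label of the template nearest in Euclidean distance."""
          sig = hu_signature(img)
          return min(templates, key=lambda label: np.linalg.norm(sig - templates[label]))

      # Toy example: two synthetic "characters" (a filled square and a bar) as templates.
      square = np.zeros((64, 64), np.uint8); square[16:48, 16:48] = 255
      bar = np.zeros((64, 64), np.uint8); bar[8:56, 28:36] = 255
      templates = {"square": hu_signature(square), "bar": hu_signature(bar)}

      print(classify(bar, templates))   # expected: "bar"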

  13. AncestrySNPminer: A bioinformatics tool to retrieve and develop ancestry informative SNP panels

    PubMed Central

    Amirisetty, Sushil; Khurana Hershey, Gurjit K.; Baye, Tesfaye M.

    2012-01-01

    A wealth of genomic information is available in public and private databases. However, this information is underutilized for uncovering population-specific and functionally relevant markers underlying complex human traits. Given the huge amount of SNP data available from the annotation of human genetic variation, data mining is a faster and cost-effective approach for investigating the number of SNPs that are informative for ancestry. In this study, we present AncestrySNPminer, the first web-based bioinformatics tool specifically designed to retrieve Ancestry Informative Markers (AIMs) from genomic data sets and link these informative markers to genes and ontological annotation classes. The tool includes an automated and simple “scripting at the click of a button” functionality that enables researchers to perform various population genomics statistical analysis methods with user-friendly querying and filtering of data sets across various populations through a single web interface. AncestrySNPminer can be freely accessed at https://research.cchmc.org/mershalab/AncestrySNPminer/login.php. PMID:22584067

  14. Simplified estimation of age-specific reference intervals for skewed data.

    PubMed

    Wright, E M; Royston, P

    1997-12-30

    Age-specific reference intervals are commonly used in medical screening and clinical practice, where interest lies in the detection of extreme values. Many different statistical approaches have been published on this topic. The advantages of a parametric method are that it necessarily produces smooth centile curves, the entire density is estimated, and an explicit formula is available for the centiles. The method proposed here is a simplified version of a recent approach proposed by Royston and Wright. Basic transformations of the data and multiple regression techniques are combined to model the mean, standard deviation and skewness. Using these simple tools, which are implemented in almost all statistical computer packages, age-specific reference intervals may be obtained. The scope of the method is illustrated by fitting models to several real data sets and assessing each model using goodness-of-fit techniques.
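
    A stripped-down version of the idea, regressing both the mean and the residual spread on age and reading off centiles under a normality assumption, can be written with ordinary regression tools; the sketch below is an illustrative simplification on simulated data and ignores the skewness modelling that the full Royston and Wright approach includes.

      import numpy as np
      from scipy.stats import norm

      rng = np.random.default_rng(1)
      age = rng.uniform(20, 80, 400)
      measurement = 5 + 0.04 * age + rng.normal(0, 0.5 + 0.01 * age)  # spread grows with age

      # Model the mean as a linear function of age.
      mean_coef = np.polyfit(age, measurement, 1)
      fitted_mean = np.polyval(mean_coef, age)

      # Model the SD by regressing scaled absolute residuals on age (E|Z| = sqrt(2/pi)).
      abs_resid = np.abs(measurement - fitted_mean) * np.sqrt(np.pi / 2)
      sd_coef = np.polyfit(age, abs_resid, 1)

      def reference_interval(a, coverage=0.95):
          z = norm.ppf(0.5 + coverage / 2)
          mu, sd = np.polyval(mean_coef, a), np.polyval(sd_coef, a)
          return mu - z * sd, mu + z * sd

      print("95% reference interval at age 40:", reference_interval(40))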

  15. An Example of an Improvable Rao-Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator.

    PubMed

    Galili, Tal; Meilijson, Isaac

    2016-01-02

    The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated.

  16. Does daily nurse staffing match ward workload variability? Three hospitals' experiences.

    PubMed

    Gabbay, Uri; Bukchin, Michael

    2009-01-01

    Nurse shortages and rising healthcare resource burdens mean that appropriate workforce use is imperative. This paper aims to evaluate whether daily nurse staffing meets ward workload needs. Nurse attendance and daily nurses' workload capacity in three hospitals were evaluated. Statistical process control was used to evaluate intra-ward nurse workload capacity and day-to-day variations. Statistical process control is a statistics-based method for process monitoring that uses charts with a predefined target measure and control limits. Standardization was performed for inter-ward analysis by converting ward-specific crude measures to ward-specific relative measures, dividing observed by expected values. Two charts, for acceptable and tolerable daily nurse workload intensity, were defined. Appropriate staffing indicators were defined as those exceeding predefined rates within acceptable and tolerable limits (50 percent and 80 percent respectively). A total of 42 percent of the overall days fell within acceptable control limits and 71 percent within tolerable control limits. Appropriate staffing indicators were met in only 33 percent of wards regarding acceptable nurse workload intensity and in only 45 percent of wards regarding tolerable workloads. The study did not differentiate crude nurse attendance, and it did not take into account patient severity since crude bed occupancy was used. Two statistical process control charts and particular staffing indicators were used, which is open to debate. Wards that met appropriate staffing indicators prove the method's feasibility; wards that did not prove the importance of, and the need for, process evaluation and monitoring. The methods presented for monitoring daily staffing appropriateness are simple to implement, either for intra-ward day-to-day variation using nurse workload capacity statistical process control charts or for inter-ward evaluation using a standardized measure of nurse workload intensity. The real challenge will be to develop planning systems and implement corrective interventions, such as dynamic and flexible daily staffing, which will face difficulties and barriers. The paper fulfils the need for workforce utilization evaluation. A simple method using available data for evaluating daily staffing appropriateness, which is easy to implement and operate, is presented. The statistical process control method enables intra-ward evaluation, while standardization by converting crude into relative measures enables inter-ward analysis. The staffing indicator definitions enable performance evaluation. This original study uses statistical process control to develop simple standardization methods and applies straightforward statistical tools. The method is not limited to crude measures; rather, it can use weighted workload measures such as nursing acuity or weighted nurse level (i.e. grade/band).
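
    The core computation, converting each ward-day's crude measure into an observed/expected ratio and then checking how many days fall inside control limits, is simple enough to sketch. The limits below use the usual mean ± 3 standard deviations of an individuals chart, which is an assumption on our part since the abstract does not spell out the exact control-limit formula used.

      import numpy as np

      rng = np.random.default_rng(7)

      # Hypothetical 60 days for one ward: observed nurse workload capacity vs. expected need.
      observed = rng.normal(loc=95, scale=8, size=60)
      expected = np.full(60, 100.0)
      ratio = observed / expected                     # standardized (relative) daily measure

      center = ratio.mean()
      sigma = ratio.std(ddof=1)
      lcl, ucl = center - 3 * sigma, center + 3 * sigma   # individuals-chart style limits

      in_control = np.mean((ratio >= lcl) & (ratio <= ucl))
      print(f"center {center:.2f}, limits [{lcl:.2f}, {ucl:.2f}], "
            f"{in_control:.0%} of days within limits")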

  17. Entropy generation in Gaussian quantum transformations: applying the replica method to continuous-variable quantum information theory

    NASA Astrophysics Data System (ADS)

    Gagatsos, Christos N.; Karanikas, Alexandros I.; Kordas, Georgios; Cerf, Nicolas J.

    2016-02-01

    In spite of their simple description in terms of rotations or symplectic transformations in phase space, quadratic Hamiltonians such as those modelling the most common Gaussian operations on bosonic modes remain poorly understood in terms of entropy production. For instance, determining the quantum entropy generated by a Bogoliubov transformation is notably a hard problem, with generally no known analytical solution, while it is vital to the characterisation of quantum communication via bosonic channels. Here we overcome this difficulty by adapting the replica method, a tool borrowed from statistical physics and quantum field theory. We exhibit a first application of this method to continuous-variable quantum information theory, where it enables accessing entropies in an optical parametric amplifier. As an illustration, we determine the entropy generated by amplifying a binary superposition of the vacuum and a Fock state, which yields a surprisingly simple, yet unknown analytical expression.

  18. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison

    PubMed Central

    2013-01-01

    Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892

  19. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison.

    PubMed

    Yang, Fang; Chia, Nicholas; White, Bryan A; Schook, Lawrence B

    2013-04-23

    Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets.
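
    The compression idea generalizes the normalized compression distance: if compressing two datasets together saves little space relative to compressing them separately, they share little information. A generic sketch with zlib is shown below; CBD's exact formula and choice of compressor are described in the paper itself and may differ from this illustration, and the toy sequences are invented.

      import zlib

      def c(data: bytes) -> int:
          """Compressed size in bytes."""
          return len(zlib.compress(data, 9))

      def compression_distance(a: bytes, b: bytes) -> float:
          """Normalized compression distance between two sequence collections."""
          cab = c(a + b)
          return (cab - min(c(a), c(b))) / max(c(a), c(b))

      # Toy "hypervariable tag" collections: two similar communities and one distinct.
      community1 = b"ACGTACGTTTGACC" * 200
      community2 = b"ACGTACGTTTGACA" * 200
      community3 = b"GGCATTTACCGGAA" * 200

      print(compression_distance(community1, community2))  # small: similar communities
      print(compression_distance(community1, community3))  # larger: dissimilar communities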

  20. Perspective: Sloppiness and emergent theories in physics, biology, and beyond.

    PubMed

    Transtrum, Mark K; Machta, Benjamin B; Brown, Kevin S; Daniels, Bryan C; Myers, Christopher R; Sethna, James P

    2015-07-07

    Large-scale models of physical phenomena demand the development of new statistical and computational tools in order to be effective. Many such models are "sloppy," i.e., exhibit behavior controlled by a relatively small number of parameter combinations. We review an information theoretic framework for analyzing sloppy models. This formalism is based on the Fisher information matrix, which is interpreted as a Riemannian metric on a parameterized space of models. Distance in this space is a measure of how distinguishable two models are based on their predictions. Sloppy model manifolds are bounded with a hierarchy of widths and extrinsic curvatures. The manifold boundary approximation can extract the simple, hidden theory from complicated sloppy models. We attribute the success of simple effective models in physics to the same phenomenon: they likewise emerge from complicated processes exhibiting a low effective dimensionality. We discuss the ramifications and consequences of sloppy models for biochemistry and science more generally. We suggest that our complex world is understandable for the same fundamental reason: simple theories of macroscopic behavior are hidden inside complicated microscopic processes.

  1. Using statistical process control to make data-based clinical decisions.

    PubMed

    Pfadt, A; Wheeler, D J

    1995-01-01

    Applied behavior analysis is based on an investigation of variability due to interrelationships among antecedents, behavior, and consequences. This permits testable hypotheses about the causes of behavior, as well as about the course of treatment, to be evaluated empirically. Such information provides corrective feedback for making data-based clinical decisions. This paper considers how a different approach to the analysis of variability, based on the writings of Walter Shewhart and W. Edwards Deming in the area of industrial quality control, helps to achieve similar objectives. Statistical process control (SPC) was developed to implement a process of continual product improvement while achieving compliance with production standards and other requirements for promoting customer satisfaction. SPC involves the use of simple statistical tools, such as histograms and control charts, as well as problem-solving techniques, such as flow charts, cause-and-effect diagrams, and Pareto charts, to implement Deming's management philosophy. These data-analytic procedures can be incorporated into a human service organization to help to achieve its stated objectives in a manner that leads to continuous improvement in the functioning of the clients who are its customers. Examples are provided to illustrate how SPC procedures can be used to analyze behavioral data. Issues related to the application of these tools for making data-based clinical decisions and for creating an organizational climate that promotes their routine use in applied settings are also considered.

  2. Agreement between calcaneal quantitative ultrasound and osteoporosis self-assessment tool for Asians in identifying individuals at risk of osteoporosis

    PubMed Central

    Chin, Kok-Yong; Low, Nie Yen; Kamaruddin, Alia Annessa Ain; Dewiputri, Wan Ilma; Soelaiman, Ima-Nirwana

    2017-01-01

    Background Calcaneal quantitative ultrasound (QUS) is a useful tool in osteoporosis screening. However, a QUS device may not be available in all primary health care settings. The osteoporosis self-assessment tool for Asians (OSTA) is a simple algorithm for osteoporosis screening that does not require any sophisticated instruments. This study explored the possibility of replacing QUS with OSTA by determining their agreement in identifying individuals at risk of osteoporosis. Methods A cross-sectional study was conducted to recruit Malaysian men and women aged ≥50 years. Their bone health status was measured using a calcaneal QUS device and OSTA. The association between OSTA and QUS was determined using Spearman’s correlation, and their agreement was assessed using Cohen's kappa and the receiver operating characteristic curve. Results All QUS indices correlated significantly with OSTA (p<0.05). The agreement between QUS and OSTA was minimal but statistically significant (p<0.05). The performance of OSTA in identifying subjects at risk of osteoporosis according to QUS was poor-to-fair in women (p<0.05), but not statistically significant for men (p>0.05). Changing the cut-off values improved the performance of OSTA in women but not in men. Conclusion The agreement between QUS and OSTA is minimal in categorizing individuals at risk of osteoporosis. Therefore, they cannot be used interchangeably in osteoporosis screening. PMID:29070951
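
    For readers unfamiliar with the two instruments being compared: OSTA is commonly computed as 0.2 × (body weight in kg − age in years) and banded into risk categories, and chance-corrected agreement between two categorical classifications can be quantified with Cohen's kappa. The sketch below illustrates only that arithmetic with invented data; the OSTA cut-offs shown are commonly cited values and are an assumption here, and the simulated "QUS" classification is a placeholder, not the study's measurements.

      import numpy as np
      from sklearn.metrics import cohen_kappa_score

      def osta_category(weight_kg: float, age_years: float) -> str:
          """OSTA index = 0.2 * (weight - age), banded with commonly cited cut-offs."""
          index = 0.2 * (weight_kg - age_years)
          if index > -1:
              return "low"
          return "moderate" if index >= -4 else "high"

      rng = np.random.default_rng(3)
      weight = rng.normal(65, 10, 200)
      age = rng.uniform(50, 85, 200)
      osta = np.array([osta_category(w, a) for w, a in zip(weight, age)])

      # Invented QUS-based classification of the same subjects (placeholder for real data).
      qus = np.where(rng.random(200) < 0.7, osta, rng.choice(["low", "moderate", "high"], 200))

      print("Cohen's kappa:", round(cohen_kappa_score(osta, qus), 2))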

  3. Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale.

    PubMed

    White, Sarah A; van den Broek, Nynke R

    2004-05-30

    Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) are then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that although the definition and properties of each of these methods are well established, inappropriate methods continue to be used in the medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale. Copyright 2004 John Wiley & Sons, Ltd.
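
    One of the methods listed, plotting differences against means and deriving limits of agreement, reduces to a few lines of arithmetic. The sketch below shows it with simulated paired measurements and is a generic illustration of the limits-of-agreement calculation, not the WHO Colour Scale analysis itself.

      import numpy as np

      rng = np.random.default_rng(5)
      true_hb = rng.uniform(6, 15, 120)                    # "true" haemoglobin, g/dL
      lab = true_hb + rng.normal(0, 0.3, 120)              # laboratory measurement
      scale = np.round(true_hb + rng.normal(0, 1.0, 120))  # coarser colour-scale estimate

      diff = scale - lab
      mean_pair = (scale + lab) / 2
      bias = diff.mean()
      loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))

      print(f"bias {bias:.2f} g/dL, 95% limits of agreement "
            f"{loa[0]:.2f} to {loa[1]:.2f} (plot diff vs. mean_pair to inspect trends)")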

  4. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity

    PubMed Central

    Agami, Sarit

    2017-01-01

    Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301

  5. 48 CFR 1852.223-76 - Federal Automotive Statistical Tool Reporting.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Statistical Tool Reporting. 1852.223-76 Section 1852.223-76 Federal Acquisition Regulations System NATIONAL... Provisions and Clauses 1852.223-76 Federal Automotive Statistical Tool Reporting. As prescribed at 1823.271 and 1851.205, insert the following clause: Federal Automotive Statistical Tool Reporting (JUL 2003) If...

  6. PhyloExplorer: a web server to validate, explore and query phylogenetic trees

    PubMed Central

    Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent

    2009-01-01

    Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: . PMID:19450253

  7. Bayesian statistics as a new tool for spectral analysis - I. Application for the determination of basic parameters of massive stars

    NASA Astrophysics Data System (ADS)

    Mugnes, J.-M.; Robert, C.

    2015-11-01

    Spectral analysis is a powerful tool to investigate stellar properties and it has been widely used for decades now. However, the methods considered to perform this kind of analysis are mostly based on iteration among a few diagnostic lines to determine the stellar parameters. While these methods are often simple and fast, they can lead to errors and large uncertainties due to the required assumptions. Here, we present a method based on Bayesian statistics to find simultaneously the best combination of effective temperature, surface gravity, projected rotational velocity, and microturbulence velocity, using all the available spectral lines. Different tests are discussed to demonstrate the strength of our method, which we apply to 54 mid-resolution spectra of field and cluster B stars obtained at the Observatoire du Mont-Mégantic. We compare our results with those found in the literature. Differences are seen which are well explained by the different methods used. We conclude that the B-star microturbulence velocities are often underestimated. We also confirm the trend that B stars in clusters are on average faster rotators than field B stars.

  8. A review of statistical updating methods for clinical prediction models.

    PubMed

    Su, Ting-Li; Jaki, Thomas; Hickey, Graeme L; Buchan, Iain; Sperrin, Matthew

    2018-01-01

    A clinical prediction model is a tool for predicting healthcare outcomes, usually within a specific population and context. A common approach is to develop a new clinical prediction model for each population and context; however, this wastes potentially useful historical information. A better approach is to update or incorporate the existing clinical prediction models already developed for use in similar contexts or populations. In addition, clinical prediction models commonly become miscalibrated over time, and need replacing or updating. In this article, we review a range of approaches for re-using and updating clinical prediction models; these fall into three main categories: simple coefficient updating, combining multiple previous clinical prediction models in a meta-model, and dynamic updating of models. We evaluated the performance (discrimination and calibration) of the different strategies using data on mortality following cardiac surgery in the United Kingdom. We found that no single strategy performed sufficiently well to be used to the exclusion of the others. In conclusion, useful tools exist for updating existing clinical prediction models to a new population or context, and these should be implemented rather than developing a new clinical prediction model from scratch, using a breadth of complementary statistical methods.
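
    The simplest of the three strategies, coefficient updating, often amounts to re-estimating a calibration intercept and slope for the old model's linear predictor in the new population. The sketch below shows that step for a logistic model on synthetic data; it is a minimal illustration of the general idea, with invented coefficients, and not the evaluation pipeline used in the article.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(2)

      # "Old" model coefficients (assumed known from a previous population).
      old_intercept, old_coefs = -2.0, np.array([0.8, 0.5, -0.3])

      # New population in which the old model is miscalibrated.
      X_new = rng.normal(size=(500, 3))
      true_lp = -1.2 + X_new @ np.array([1.0, 0.4, -0.5])
      y_new = rng.random(500) < 1 / (1 + np.exp(-true_lp))

      # Coefficient updating: refit only intercept and slope on the old linear predictor.
      old_lp = (old_intercept + X_new @ old_coefs).reshape(-1, 1)
      recal = LogisticRegression().fit(old_lp, y_new)
      print("calibration slope:", recal.coef_[0][0], "intercept:", recal.intercept_[0])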

  9. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861

  10. D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

    PubMed

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-07-27

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
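
    The core of any such tool is turning an aligned set of motif instances into a position weight matrix: count each base at each position, then convert counts to frequencies, usually with a pseudocount. The sketch below shows that construction in a few lines; it is a generic PWM builder, not the D-MATRIX source code, and the toy motif instances are invented rather than real sos-box sequences.

      import numpy as np

      BASES = "ACGT"

      def weight_matrix(aligned_motifs, pseudocount=1.0):
          """Position frequency matrix (rows A,C,G,T) from equal-length aligned motifs."""
          width = len(aligned_motifs[0])
          counts = np.full((4, width), pseudocount)
          for seq in aligned_motifs:
              for pos, base in enumerate(seq.upper()):
                  counts[BASES.index(base), pos] += 1
          return counts / counts.sum(axis=0)          # column-normalized frequencies

      # Toy aligned binding-site instances.
      motifs = ["TACTGTATAT", "TACTGTTTAT", "AACTGTATAT", "TACTGAATAT"]
      pwm = weight_matrix(motifs)
      print(np.round(pwm, 2))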

  11. Metrics for comparing neuronal tree shapes based on persistent homology.

    PubMed

    Li, Yanjie; Wang, Dingkang; Ascoli, Giorgio A; Mitra, Partha; Wang, Yusu

    2017-01-01

    As more and more neuroanatomical data are made available through efforts such as NeuroMorpho.Org and FlyCircuit.org, the need to develop computational tools to facilitate automatic knowledge discovery from such large datasets becomes more urgent. One fundamental question is how best to compare neuron structures, for instance to organize and classify large collections of neurons. We aim to develop a flexible yet powerful framework to support comparison and classification of large collections of neuron structures efficiently. Specifically we propose to use a topological persistence-based feature vectorization framework. Existing methods to vectorize a neuron (i.e., convert a neuron to a feature vector so as to support efficient comparison and/or searching) typically rely on statistics or summaries of morphometric information, such as the average or maximum local torque angle or partition asymmetry. These simple summaries have limited power in encoding global tree structures. Based on the concept of topological persistence recently developed in the field of computational topology, we vectorize each neuron structure into a simple yet informative summary. In particular, each type of information of interest can be represented as a descriptor function defined on the neuron tree, which is then mapped to a simple persistence-signature. Our framework can encode both local and global tree structure, as well as other information of interest (electrophysiological or dynamical measures), by considering multiple descriptor functions on the neuron. The resulting persistence-based signature is potentially more informative than simple statistical summaries (such as average/mean/max) of morphometric quantities. Indeed, we show that using a certain descriptor function will give a persistence-based signature containing strictly more information than the classical Sholl analysis. At the same time, our framework retains the efficiency associated with treating neurons as points in a simple Euclidean feature space, which would be important for constructing efficient searching or indexing structures over them. We present preliminary experimental results to demonstrate the effectiveness of our persistence-based neuronal feature vectorization framework.

  12. Metrics for comparing neuronal tree shapes based on persistent homology

    PubMed Central

    Li, Yanjie; Wang, Dingkang; Ascoli, Giorgio A.; Mitra, Partha

    2017-01-01

    As more and more neuroanatomical data are made available through efforts such as NeuroMorpho.Org and FlyCircuit.org, the need to develop computational tools to facilitate automatic knowledge discovery from such large datasets becomes more urgent. One fundamental question is how best to compare neuron structures, for instance to organize and classify large collections of neurons. We aim to develop a flexible yet powerful framework to support comparison and classification of large collections of neuron structures efficiently. Specifically we propose to use a topological persistence-based feature vectorization framework. Existing methods to vectorize a neuron (i.e., convert a neuron to a feature vector so as to support efficient comparison and/or searching) typically rely on statistics or summaries of morphometric information, such as the average or maximum local torque angle or partition asymmetry. These simple summaries have limited power in encoding global tree structures. Based on the concept of topological persistence recently developed in the field of computational topology, we vectorize each neuron structure into a simple yet informative summary. In particular, each type of information of interest can be represented as a descriptor function defined on the neuron tree, which is then mapped to a simple persistence-signature. Our framework can encode both local and global tree structure, as well as other information of interest (electrophysiological or dynamical measures), by considering multiple descriptor functions on the neuron. The resulting persistence-based signature is potentially more informative than simple statistical summaries (such as average/mean/max) of morphometric quantities. Indeed, we show that using a certain descriptor function will give a persistence-based signature containing strictly more information than the classical Sholl analysis. At the same time, our framework retains the efficiency associated with treating neurons as points in a simple Euclidean feature space, which would be important for constructing efficient searching or indexing structures over them. We present preliminary experimental results to demonstrate the effectiveness of our persistence-based neuronal feature vectorization framework. PMID:28809960

  13. COPD assessment test (CAT): simple tool for evaluating quality of life of chemical warfare patients with chronic obstructive pulmonary disease.

    PubMed

    Lari, Shahrzad M; Ghobadi, Hassan; Attaran, Davood; Mahmoodpour, Afsoun; Shadkam, Omid; Rostami, Maryam

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is one of the serious late pulmonary complications caused by sulphur mustard exposure. Health status evaluations of chemical warfare patients with COPD are important to the management of these patients. The aim of this study was to determine the efficacy of the COPD assessment test (CAT) in evaluating the health-related quality of life (HRQOL) of chemical warfare patients with COPD. Eighty-two consecutive patients with stable COPD were enrolled in this study. All subjects were visited by one physician, and the HRQOL was evaluated by the CAT and St. George Respiratory Questionnaires (SGRQs). In addition, a standard spirometry test, 6-min walk distance test and pulse oximetry were conducted. The severity of the COPD was determined using Global Initiative for Chronic Obstructive Lung Disease (GOLD) staging and the body mass index, obstruction, dyspnoea and exercise (BODE) index. The mean age of the patients was 47.30 ± 7.08 years. The mean CAT score was 26.03 ± 8.28. Thirty-five (43%) patients were in CAT stage 3. There were statistically significant correlations between the CAT and the SGRQ (r = 0.70, P = 0.001) and the BODE index (r = 0.70, P = 0.001). A statistically significant inverse correlation was found between the CAT score and the forced expiratory volume in 1 s (r = -0.30, P = 0.03). Our results demonstrated that the CAT is a simple and valid tool for assessment of HRQOL in chemical warfare patients with COPD and can be used in clinical practice. © 2013 John Wiley & Sons Ltd.

  14. SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

    PubMed

    Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B

    2016-02-04

    Many tools exist for the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.

  15. Theoretical predictor for candidate structure assignment from IMS data of biomolecule-related conformational space.

    PubMed

    Schenk, Emily R; Nau, Frederic; Fernandez-Lima, Francisco

    2015-06-01

    The ability to correlate experimental ion mobility data with candidate structures from theoretical modeling provides a powerful analytical and structural tool for the characterization of biomolecules. In the present paper, a theoretical workflow is described to generate and assign candidate structures for experimental trapped ion mobility and H/D exchange (HDX-TIMS-MS) data following molecular dynamics simulations and statistical filtering. The applicability of the theoretical predictor is illustrated for a peptide and protein example with multiple conformations and kinetic intermediates. The described methodology yields a low computational cost and a simple workflow by incorporating statistical filtering and molecular dynamics simulations. The workflow can be adapted to different IMS scenarios and CCS calculators for a more accurate description of the IMS experimental conditions. For the case of the HDX-TIMS-MS experiments, molecular dynamics in the "TIMS box" accounts for a better sampling of the molecular intermediates and local energy minima.

  16. WASP (Write a Scientific Paper) using Excel - 1: Data entry and validation.

    PubMed

    Grech, Victor

    2018-02-01

    Data collection for the purposes of analysis, after the planning and execution of a research study, commences with data input and validation. The process of data entry and analysis may appear daunting to the uninitiated, but as pointed out in the 1970s in a series of papers by British Medical Journal Deputy Editor TDV Swinscow, modern hardware and software (he was then referring to the availability of hand calculators) permit the performance of statistical testing outside a computer laboratory. In this day and age, modern software, such as the ubiquitous and almost universally familiar Microsoft Excel™, greatly facilitates this process. This paper is the first of a collection that will emulate Swinscow's series, in his own words "addressed to readers who want to start at the beginning, not to those who are already skilled statisticians." These papers will have less focus on the actual arithmetic, and more emphasis on how to actually implement simple statistics, step by step, using Excel, thereby constituting the equivalent of Swinscow's papers in the personal computer age. Data entry can be facilitated by several underutilised features in Excel. This paper will explain Excel's little-known form function, data validation implementation at the input stage, simple coding tips and data cleaning tools. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. Computation of fluid flow and pore-space properties estimation on micro-CT images of rock samples

    NASA Astrophysics Data System (ADS)

    Starnoni, M.; Pokrajac, D.; Neilson, J. E.

    2017-09-01

    Accurate determination of the petrophysical properties of rocks, namely REV, mean pore and grain size and absolute permeability, is essential for a broad range of engineering applications. Here, the petrophysical properties of rocks are calculated using an integrated approach comprising image processing, statistical correlation and numerical simulations. The Stokes equations of creeping flow for incompressible fluids are solved using the Finite-Volume SIMPLE algorithm. Simulations are then carried out on three-dimensional digital images obtained from micro-CT scanning of two rock formations: one sandstone and one carbonate. Permeability is predicted from the computed flow field using Darcy's law. It is shown that REV, REA and mean pore and grain size are effectively estimated using the two-point spatial correlation function. Homogeneity and anisotropy are also evaluated using the same statistical tools. A comparison of different absolute permeability estimates is also presented, revealing a good agreement between the numerical value and the experimentally determined one for the carbonate sample, but a large discrepancy for the sandstone. Finally, a new convergence criterion for the SIMPLE algorithm, and more generally for the family of pressure-correction methods, is presented. This criterion is based on satisfaction of bulk momentum balance, which makes it particularly useful for pore-scale modelling of reservoir rocks.
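
    The final permeability step is a direct application of Darcy's law: once the flow solver gives the volumetric flux through the sample for an imposed pressure drop, the permeability follows from k = Q·μ·L / (A·ΔP). A minimal sketch with invented values is shown below; in the study itself Q comes from the SIMPLE finite-volume solution on the micro-CT geometry rather than being assumed.

      # Darcy's law applied to a simulated core-scale flow; all numbers are illustrative.
      Q = 2.0e-10        # volumetric flow rate through the sample, m^3/s
      mu = 1.0e-3        # fluid dynamic viscosity, Pa*s (water)
      L = 1.0e-3         # sample length in the flow direction, m
      A = 1.0e-6         # cross-sectional area, m^2
      dP = 5.0e3         # imposed pressure drop, Pa

      k = Q * mu * L / (A * dP)          # permeability in m^2
      print(f"permeability: {k:.3e} m^2 ({k / 9.869e-13:.2f} darcy)")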

  18. A Simple Approach for Monitoring Business Service Time Variation

    PubMed Central

    2014-01-01

    Control charts are effective tools for signal detection in both manufacturing processes and service processes. Much of the data in service industries comes from processes having nonnormal or unknown distributions. The commonly used Shewhart variable control charts, which depend heavily on the normality assumption, are not appropriately used here. In this paper, we propose a new asymmetric EWMA variance chart (EWMA-AV chart) and an asymmetric EWMA mean chart (EWMA-AM chart) based on two simple statistics to monitor process variance and mean shifts simultaneously. Further, we explore the sampling properties of the new monitoring statistics and calculate the average run lengths when using both the EWMA-AV chart and the EWMA-AM chart. The performance of the EWMA-AV and EWMA-AM charts and that of some existing variance and mean charts are compared. A numerical example involving nonnormal service times from the service system of a bank branch in Taiwan is used to illustrate the applications of the EWMA-AV and EWMA-AM charts and to compare them with the existing variance (or standard deviation) and mean charts. The proposed EWMA-AV chart and EWMA-AM charts show superior detection performance compared to the existing variance and mean charts. The EWMA-AV chart and EWMA-AM chart are thus recommended. PMID:24895647

  19. A simple approach for monitoring business service time variation.

    PubMed

    Yang, Su-Fen; Arnold, Barry C

    2014-01-01

    Control charts are effective tools for signal detection in both manufacturing processes and service processes. Much of the data in service industries comes from processes having nonnormal or unknown distributions. The commonly used Shewhart variable control charts, which depend heavily on the normality assumption, are not appropriately used here. In this paper, we propose a new asymmetric EWMA variance chart (EWMA-AV chart) and an asymmetric EWMA mean chart (EWMA-AM chart) based on two simple statistics to monitor process variance and mean shifts simultaneously. Further, we explore the sampling properties of the new monitoring statistics and calculate the average run lengths when using both the EWMA-AV chart and the EWMA-AM chart. The performance of the EWMA-AV and EWMA-AM charts and that of some existing variance and mean charts are compared. A numerical example involving nonnormal service times from the service system of a bank branch in Taiwan is used to illustrate the applications of the EWMA-AV and EWMA-AM charts and to compare them with the existing variance (or standard deviation) and mean charts. The proposed EWMA-AV chart and EWMA-AM charts show superior detection performance compared to the existing variance and mean charts. The EWMA-AV chart and EWMA-AM chart are thus recommended.
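
    As a hedged illustration of the EWMA mechanics underlying such charts (not the authors' asymmetric EWMA-AV and EWMA-AM statistics), the sketch below computes a standard EWMA mean chart and its control limits on simulated, nonnormal (exponential) service times.

    ```python
    # Minimal EWMA mean-chart sketch on simulated service times; the smoothing
    # constant, control-limit width, and in-control estimates are assumptions.
    import numpy as np

    def ewma_chart(x, lam=0.2, L=3.0):
        x = np.asarray(x, dtype=float)
        mu0, sigma = x.mean(), x.std(ddof=1)   # in practice, from a phase-I sample
        z = np.empty_like(x)
        z_prev = mu0
        for i, xi in enumerate(x):
            z_prev = lam * xi + (1 - lam) * z_prev   # exponentially weighted average
            z[i] = z_prev
        i = np.arange(1, len(x) + 1)
        half_width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
        return z, mu0 - half_width, mu0 + half_width

    service_times = np.random.default_rng(0).exponential(2.0, 50)
    z, lcl, ucl = ewma_chart(service_times)
    print(np.where((z < lcl) | (z > ucl))[0])   # indices of out-of-control signals
    ```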

  20. Correlations between reaction product yields as a tool for probing heavy-ion reaction scenarios

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gawlikowicz, W.; Heavy-Ion Laboratory, Warsaw University, PL-02-093 Warsaw; Agnihotri, D. K.

    2010-01-15

Experimental multidimensional joint distributions of neutrons and charged reaction products were analyzed for 136Xe + 209Bi reactions at E/A=28, 40, and 62 MeV and were found to exhibit several different types of prominent correlation patterns. Some of these correlations have a simple explanation in terms of the system excitation energy and pose little challenge to most statistical decay theories. However, several other types of correlation patterns are difficult to reconcile with some, but not other, possible reaction scenarios. In this respect, correlations between the average atomic numbers of intermediate-mass fragments, on the one hand, and light particle multiplicities, on the other, are notable. This kind of multiparticle correlation provides a useful tool for probing reaction scenarios, which is different from the traditional approach of interpreting inclusive yields of individual reaction products.

  1. Assessing the Robustness of Complete Bacterial Genome Segmentations

    NASA Astrophysics Data System (ADS)

    Devillers, Hugo; Chiapello, Hélène; Schbath, Sophie; El Karoui, Meriem

Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The scores presented in this paper are simple to implement and our results show that they make it easy to discriminate between robust and non-robust bacterial genome segmentations when using aligners such as MAUVE and MGA.

  2. LocusExplorer: a user-friendly tool for integrated visualization of human genetic association data and biological annotations.

    PubMed

    Dadaev, Tokhir; Leongamornlert, Daniel A; Saunders, Edward J; Eeles, Rosalind; Kote-Jarai, Zsofia

    2016-03-15

In this article, we present LocusExplorer, a data visualization and exploration tool for genetic association data. LocusExplorer is written in R using the Shiny library, providing access to powerful R-based functions through a simple user interface. LocusExplorer allows users to simultaneously display genetic, statistical and biological data for humans in a single image and allows dynamic zooming and customization of the plot features. Publication quality plots may then be produced in a variety of file formats. LocusExplorer is open source and runs through R and a web browser. It is available at www.oncogenetics.icr.ac.uk/LocusExplorer/ or can be installed locally, with the source code available from https://github.com/oncogenetics/LocusExplorer. Contact: tokhir.dadaev@icr.ac.uk. © The Author 2015. Published by Oxford University Press.

  3. CrossQuery: a web tool for easy associative querying of transcriptome data.

    PubMed

    Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred

    2011-01-01

Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple-syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  4. Weighing Evidence “Steampunk” Style via the Meta-Analyser

    PubMed Central

    Bowden, Jack; Jackson, Chris

    2016-01-01

    ABSTRACT The funnel plot is a graphical visualization of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modeling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the center of mass of a physical system. We used this analogy to explain the concepts of weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without recourse to precise definitions or statistical formulas and with a little help from Sherlock Holmes! Following on from the science fair, we have developed an interactive web-application (named the Meta-Analyser) to bring these ideas to a wider audience. We envisage that our application will be a useful tool for researchers when interpreting their data. First, to facilitate a simple understanding of fixed and random effects modeling approaches; second, to assess the importance of outliers; and third, to show the impact of adjusting for small study bias. This final aim is realized by introducing a novel graphical interpretation of the well-known method of Egger regression. PMID:28003684
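
    The center-of-mass analogy can be made concrete in a few lines: in a fixed-effect meta-analysis the pooled estimate is the inverse-variance weighted mean of the study estimates, with each weight playing the role of a mass. The numbers below are invented for illustration.

    ```python
    # Fixed-effect pooled estimate as a weighted "center of mass" of study estimates.
    import numpy as np

    estimates = np.array([0.30, 0.10, 0.45, 0.22])   # study effect sizes (invented)
    std_errors = np.array([0.10, 0.05, 0.20, 0.08])

    weights = 1.0 / std_errors ** 2                  # each study's "mass"
    pooled = np.sum(weights * estimates) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    print(round(pooled, 3), round(pooled_se, 3))
    ```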

  5. IMAGINE: Interstellar MAGnetic field INference Engine

    NASA Astrophysics Data System (ADS)

    Steininger, Theo

    2018-03-01

    IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.

  6. Employee resourcing strategies and universities' corporate image: A survey dataset.

    PubMed

    Falola, Hezekiah Olubusayo; Oludayo, Olumuyiwa Akinrole; Olokundun, Maxwell Ayodele; Salau, Odunayo Paul; Ibidunni, Ayodotun Stephen; Igbinoba, Ebe

    2018-06-01

The data examined the effect of employee resourcing strategies on corporate image. The data were generated from a total of 500 copies of a questionnaire administered to the academic staff of six selected private universities in Southwest Nigeria, of which 443 were retrieved. Stratified and simple random sampling techniques were used to select the respondents for this study. Descriptive statistics and linear regression were used to present the data, with the mean score serving as the statistical measure of analysis. The data presented in this article are therefore made available to facilitate further and more comprehensive investigation of the subject matter.

  7. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
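
    BPM itself is a MATLAB/SPM toolbox; the sketch below only illustrates the voxel-wise idea in Python with tiny synthetic arrays: at each voxel, one imaging modality is regressed on another plus a covariate, yielding a map of coefficients. Modality names and data are invented.

    ```python
    # Hedged sketch of a voxel-wise general linear model with a second modality
    # as regressor (not the BPM toolbox itself).
    import numpy as np

    rng = np.random.default_rng(10)
    n_subj, shape = 20, (4, 4, 4)
    fmri = rng.normal(size=(n_subj, *shape))                     # modality used as regressor
    perfusion = 0.5 * fmri + rng.normal(size=(n_subj, *shape))   # modality of interest
    age = rng.uniform(20, 60, n_subj)                            # nuisance covariate

    betas = np.zeros(shape)
    for idx in np.ndindex(shape):                                # one GLM per voxel
        X = np.column_stack([np.ones(n_subj), fmri[(slice(None), *idx)], age])
        y = perfusion[(slice(None), *idx)]
        betas[idx] = np.linalg.lstsq(X, y, rcond=None)[0][1]     # coefficient of the other modality
    print(betas.mean().round(2))
    ```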

  8. Accounting for isotopic clustering in Fourier transform mass spectrometry data analysis for clinical diagnostic studies.

    PubMed

    Kakourou, Alexia; Vach, Werner; Nicolardi, Simone; van der Burgt, Yuri; Mertens, Bart

    2016-10-01

    Mass spectrometry based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growing field of research. In this paper we present simple, yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data, collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation will be based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.

  9. Time-to-event methodology improved statistical evaluation in register-based health services research.

    PubMed

    Bluhmki, Tobias; Bramlage, Peter; Volk, Michael; Kaltheuner, Matthias; Danne, Thomas; Rathmann, Wolfgang; Beyersmann, Jan

    2017-02-01

Complex longitudinal sampling and the observational structure of patient registers in health services research are associated with methodological challenges regarding data management and statistical evaluation. We exemplify common pitfalls and want to stimulate discussions on the design, development, and deployment of future longitudinal patient registers and register-based studies. For illustrative purposes, we use data from the prospective, observational, German DIabetes Versorgungs-Evaluation register. One aim was to explore predictors for the initiation of a basal insulin supported therapy in patients with type 2 diabetes initially prescribed glucose-lowering drugs alone. Major challenges are missing mortality information, time-dependent outcomes, delayed study entries, different follow-up times, and competing events. We show that time-to-event methodology is a valuable tool for improved statistical evaluation of register data and should be preferred to simple case-control approaches. Patient registers provide rich data sources for health services research. Analyses are accompanied by the trade-off between data availability, clinical plausibility, and statistical feasibility. The Cox proportional hazards model allows for the evaluation of the outcome-specific hazards, but prediction of outcome probabilities is compromised by missing mortality information. Copyright © 2016 Elsevier Inc. All rights reserved.
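
    A minimal sketch of why time-to-event methods are preferred: with censoring and unequal follow-up, a Kaplan-Meier estimator (hand-rolled below on invented data, ignoring competing risks and delayed entry) recovers the probability of starting insulin within a given time, which a simple case-control tabulation would misstate.

    ```python
    # Hand-rolled Kaplan-Meier sketch on invented register-like data.
    import numpy as np

    rng = np.random.default_rng(11)
    time_to_insulin = rng.exponential(36, 300)        # months until basal insulin start (invented)
    follow_up = rng.uniform(6, 48, 300)               # administrative censoring times
    observed = time_to_insulin <= follow_up
    t = np.where(observed, time_to_insulin, follow_up)

    order = np.argsort(t)
    t, observed = t[order], observed[order]
    at_risk = np.arange(len(t), 0, -1)
    surv = np.cumprod(1 - observed / at_risk)         # Kaplan-Meier survival curve
    print(round(1 - np.interp(24, t, surv), 2))       # estimated P(insulin within 24 months)
    ```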

  10. A Simple Illustration for the Need of Multiple Comparison Procedures

    ERIC Educational Resources Information Center

    Carter, Rickey E.

    2010-01-01

    Statistical adjustments to accommodate multiple comparisons are routinely covered in introductory statistical courses. The fundamental rationale for such adjustments, however, may not be readily understood. This article presents a simple illustration to help remedy this.
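
    The rationale can be reproduced in a few lines: with k independent tests at level alpha, the chance of at least one false positive grows quickly, and a Bonferroni adjustment restores it to roughly alpha. The numbers are a worked illustration, not taken from the article.

    ```python
    # Familywise error rate for k independent tests at level alpha,
    # and the Bonferroni-adjusted level that restores it to about alpha.
    alpha, k = 0.05, 10
    fwer = 1 - (1 - alpha) ** k            # P(at least one false positive), about 0.40
    fwer_bonf = 1 - (1 - alpha / k) ** k   # about 0.049 with the Bonferroni correction
    print(round(fwer, 3), round(fwer_bonf, 3))
    ```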

  11. Statistics of dislocation pinning at localized obstacles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dutta, A.; Bhattacharya, M., E-mail: mishreyee@vecc.gov.in; Barat, P.

    2014-10-14

Pinning of dislocations at nanosized obstacles like precipitates, voids, and bubbles is a crucial mechanism in the context of phenomena like hardening and creep. The interaction between such an obstacle and a dislocation is often studied at fundamental level by means of analytical tools, atomistic simulations, and finite element methods. Nevertheless, the information extracted from such studies cannot be utilized to its maximum extent on account of insufficient information about the underlying statistics of this process comprising a large number of dislocations and obstacles in a system. Here, we propose a new statistical approach, where the statistics of pinning of dislocations by idealized spherical obstacles is explored by taking into account the generalized size-distribution of the obstacles along with the dislocation density within a three-dimensional framework. Starting with a minimal set of material parameters, the framework employs the method of geometrical statistics with a few simple assumptions compatible with the real physical scenario. The application of this approach, in combination with the knowledge of fundamental dislocation-obstacle interactions, has successfully been demonstrated for dislocation pinning at nanovoids in neutron irradiated type 316-stainless steel in regard to the non-conservative motion of dislocations. An interesting phenomenon of transition from rare pinning to multiple pinning regimes with increasing irradiation temperature is revealed.

  12. Open-source Software for Demand Forecasting of Clinical Laboratory Test Volumes Using Time-series Analysis.

    PubMed

    Mohammed, Emad A; Naugler, Christopher

    2017-01-01

    Demand forecasting is the area of predictive analytics devoted to predicting future volumes of services or consumables. Fair understanding and estimation of how demand will vary facilitates the optimal utilization of resources. In a medical laboratory, accurate forecasting of future demand, that is, test volumes, can increase efficiency and facilitate long-term laboratory planning. Importantly, in an era of utilization management initiatives, accurately predicted volumes compared to the realized test volumes can form a precise way to evaluate utilization management initiatives. Laboratory test volumes are often highly amenable to forecasting by time-series models; however, the statistical software needed to do this is generally either expensive or highly technical. In this paper, we describe an open-source web-based software tool for time-series forecasting and explain how to use it as a demand forecasting tool in clinical laboratories to estimate test volumes. This tool has three different models, that is, Holt-Winters multiplicative, Holt-Winters additive, and simple linear regression. Moreover, these models are ranked and the best one is highlighted. This tool will allow anyone with historic test volume data to model future demand.

  13. Open-source Software for Demand Forecasting of Clinical Laboratory Test Volumes Using Time-series Analysis

    PubMed Central

    Mohammed, Emad A.; Naugler, Christopher

    2017-01-01

    Background: Demand forecasting is the area of predictive analytics devoted to predicting future volumes of services or consumables. Fair understanding and estimation of how demand will vary facilitates the optimal utilization of resources. In a medical laboratory, accurate forecasting of future demand, that is, test volumes, can increase efficiency and facilitate long-term laboratory planning. Importantly, in an era of utilization management initiatives, accurately predicted volumes compared to the realized test volumes can form a precise way to evaluate utilization management initiatives. Laboratory test volumes are often highly amenable to forecasting by time-series models; however, the statistical software needed to do this is generally either expensive or highly technical. Method: In this paper, we describe an open-source web-based software tool for time-series forecasting and explain how to use it as a demand forecasting tool in clinical laboratories to estimate test volumes. Results: This tool has three different models, that is, Holt-Winters multiplicative, Holt-Winters additive, and simple linear regression. Moreover, these models are ranked and the best one is highlighted. Conclusion: This tool will allow anyone with historic test volume data to model future demand. PMID:28400996
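
    A minimal sketch of the kind of forecast such a tool produces, using the Holt-Winters implementation in statsmodels on an invented monthly test-volume series; the web tool described above additionally fits the multiplicative variant and simple linear regression and ranks the three models.

    ```python
    # Hedged sketch: additive Holt-Winters forecast of monthly laboratory test volumes.
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    rng = np.random.default_rng(1)
    months = pd.date_range("2018-01-01", periods=48, freq="MS")
    volumes = pd.Series(1000 + 10 * np.arange(48)
                        + 100 * np.sin(2 * np.pi * np.arange(48) / 12)
                        + rng.normal(0, 25, 48), index=months)   # invented historic volumes

    model = ExponentialSmoothing(volumes, trend="add", seasonal="add",
                                 seasonal_periods=12).fit()
    print(model.forecast(6).round())   # next six months of predicted test volumes
    ```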

  14. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), in the hope of contributing to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models.

  15. Comparing and combining process-based crop models and statistical models with some implications for climate change

    NASA Astrophysics Data System (ADS)

    Roberts, Michael J.; Braun, Noah O.; Sinclair, Thomas R.; Lobell, David B.; Schlenker, Wolfram

    2017-09-01

    We compare predictions of a simple process-based crop model (Soltani and Sinclair 2012), a simple statistical model (Schlenker and Roberts 2009), and a combination of both models to actual maize yields on a large, representative sample of farmer-managed fields in the Corn Belt region of the United States. After statistical post-model calibration, the process model (Simple Simulation Model, or SSM) predicts actual outcomes slightly better than the statistical model, but the combined model performs significantly better than either model. The SSM, statistical model and combined model all show similar relationships with precipitation, while the SSM better accounts for temporal patterns of precipitation, vapor pressure deficit and solar radiation. The statistical and combined models show a more negative impact associated with extreme heat for which the process model does not account. Due to the extreme heat effect, predicted impacts under uniform climate change scenarios are considerably more severe for the statistical and combined models than for the process-based model.
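
    The combination idea can be sketched with a toy example: regress observed yields on the two model predictions to form a combined predictor and compare root mean square errors. The data and the simple stacking below are illustrative only, and are much cruder than the study's post-model calibration.

    ```python
    # Toy sketch of combining a process-model and a statistical-model prediction.
    import numpy as np

    rng = np.random.default_rng(12)
    true_yield = rng.normal(10, 1.5, 200)                   # t/ha, invented
    process_pred = true_yield + rng.normal(0, 1.0, 200)     # SSM-like prediction (invented)
    statistical_pred = true_yield + rng.normal(0, 1.1, 200) # regression-like prediction (invented)

    X = np.column_stack([np.ones(200), process_pred, statistical_pred])
    w, *_ = np.linalg.lstsq(X, true_yield, rcond=None)      # simple stacking weights
    combined = X @ w
    for name, pred in [("process", process_pred), ("statistical", statistical_pred),
                       ("combined", combined)]:
        print(name, round(np.sqrt(np.mean((pred - true_yield) ** 2)), 2))
    ```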

  16. minimega

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    David Fritz, John Floren

    2013-08-27

    Minimega is a simple emulytics platform for creating testbeds of networked devices. The platform consists of easily deployable tools to facilitate bringing up large networks of virtual machines including Windows, Linux, and Android. Minimega attempts to allow experiments to be brought up quickly with nearly no configuration. Minimega also includes tools for simple cluster management, as well as tools for creating Linux based virtual machine images.

  17. A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

    PubMed

    Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

    2012-01-01

High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer. Contact: chris.spencer@well.ox.ac.uk. Supplementary data are available at Bioinformatics online.
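
    A hedged sketch of the general idea (not the authors' clustering algorithm): flag samples whose genome-wide summary statistics, such as heterozygosity or missingness, are atypical, here using robust z-scores based on the median and median absolute deviation.

    ```python
    # Simple outlier flagging on per-sample summary statistics (illustrative only).
    import numpy as np

    def atypical(summary, threshold=4.0):
        summary = np.asarray(summary, dtype=float)     # samples x statistics
        med = np.median(summary, axis=0)
        mad = np.median(np.abs(summary - med), axis=0) * 1.4826   # robust scale
        z = np.abs(summary - med) / mad
        return np.where((z > threshold).any(axis=1))[0]

    rng = np.random.default_rng(2)
    stats = rng.normal([0.2, 0.02], [0.01, 0.005], size=(500, 2))  # heterozygosity, missingness
    stats[7] = [0.2, 0.10]                                          # a sample with excess missingness
    print(atypical(stats))                                          # -> [7]
    ```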

  18. Eutrophication risk assessment in coastal embayments using simple statistical models.

    PubMed

    Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G

    2003-09-01

A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) with the concentration of the limiting nutrient (usually nitrogen) and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea, and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.
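
    The form of model described can be sketched as an ordinary least-squares fit of Chl on total dissolved nitrogen and a renewal-rate surrogate; the data, coefficients and R² below are invented and only illustrate the model form, not the Gulf of Gera results.

    ```python
    # Illustrative multiple regression of Chl on nutrient level and renewal rate.
    import numpy as np

    rng = np.random.default_rng(3)
    n = 60
    tdn = rng.uniform(5, 40, n)             # total dissolved nitrogen (invented units)
    renewal = rng.uniform(0.05, 0.5, n)     # renewal-rate surrogate (invented)
    chl = 1.0 + 0.08 * tdn - 4.0 * renewal + rng.normal(0, 0.4, n)

    X = np.column_stack([np.ones(n), tdn, renewal])
    beta, *_ = np.linalg.lstsq(X, chl, rcond=None)
    resid = chl - X @ beta
    r2 = 1 - resid.var() / chl.var()
    print(beta.round(3), round(r2, 2))      # intercept, TDN and renewal coefficients, R^2
    ```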

  19. Development of a Simple Tool for Identifying Alcohol Use Disorder in Female Korean Drinkers from Previous Questionnaires.

    PubMed

    Seo, Yu Ri; Kim, Jong Sung; Kim, Sung Soo; Yoon, Seok Joon; Suh, Won Yoon; Youn, Kwangmi

    2016-01-01

This study aimed to develop a simple tool for identifying alcohol use disorders in female Korean drinkers from previous questionnaires. This research was conducted on 400 women who consumed at least one alcoholic drink during the past month and visited the health promotion center at Chungnam National University Hospital between June 2013 and May 2014. Drinking habits and alcohol use disorders were assessed by structured interviews using the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition diagnostic criteria. The subjects were also asked to answer the Alcohol Use Disorders Identification Test (AUDIT), AUDIT-Consumption, CAGE (Cut down, Annoyed, Guilty, Eye-opener), TWEAK (Tolerance, Worried, Eye-opener, Amnesia, Kut down), TACE (Tolerance, Annoyed, Cut down, Eye-opener), and NET (Normal drinker, Eye-opener, Tolerance) questionnaires. The area under the receiver operating characteristic curve (AUROC) of each question of the questionnaires on alcohol use disorders was assessed. The combination of the two questions with the largest AUROC was then compared to the previous questionnaires. Among the 400 subjects, 58 (14.5%) were identified as having an alcohol use disorder. The two questions with the largest AUROC were question no. 7 in AUDIT, "How often during the last year have you had a feeling of guilt or remorse after drinking?" and question no. 5 in AUDIT, "How often during the past year have you failed to do what was normally expected from you because of drinking?" with an AUROC (95% confidence interval [CI]) of 0.886 (0.850-0.915) and 0.862 (0.824-0.894), respectively. The AUROC (95% CI) of the combination of the two questions was 0.958 (0.934-0.976) with no significant difference as compared to the existing AUDIT with the largest AUROC. The above results suggest that the simple tool consisting of questions no. 5 and no. 7 in AUDIT is useful in identifying alcohol use disorders in Korean female drinkers.
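
    How the AUROC of a two-item score is obtained can be sketched as follows; the responses are simulated, and the two items merely stand in for AUDIT questions 5 and 7.

    ```python
    # Illustrative AUROC of a combined two-item score on simulated data.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(4)
    disorder = rng.binomial(1, 0.15, 400)               # 0/1 alcohol use disorder (simulated)
    q5 = np.clip(rng.poisson(1 + 2 * disorder), 0, 4)   # items scored 0-4, higher with disorder
    q7 = np.clip(rng.poisson(1 + 2 * disorder), 0, 4)
    print(round(roc_auc_score(disorder, q5 + q7), 3))   # AUROC of the combined score
    ```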

  20. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
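
    One of the simplest histogram-based statistics can be sketched in a few lines: build k-mer count vectors for two sequences and compute their cosine similarity. The survey covers many more statistics, including Earth Mover's distance, which are not shown here.

    ```python
    # Minimal alignment-free comparison via k-mer histograms and cosine similarity.
    from collections import Counter
    import math

    def kmer_counts(seq, k=4):
        return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

    def cosine(a, b):
        keys = set(a) | set(b)
        dot = sum(a[x] * b[x] for x in keys)
        return dot / (math.sqrt(sum(v * v for v in a.values())) *
                      math.sqrt(sum(v * v for v in b.values())))

    s1 = "ACGTACGTGGCTAACGTACGATCGATCGA"   # toy sequences
    s2 = "ACGTACGTGGCTTACGTACGATCGTTCGA"
    print(round(cosine(kmer_counts(s1), kmer_counts(s2)), 3))
    ```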

  1. KISS for STRAP: user extensions for a protein alignment editor.

    PubMed

    Gille, Christoph; Lorenzen, Stephan; Michalsky, Elke; Frömmel, Cornelius

    2003-12-12

    The Structural Alignment Program STRAP is a comfortable comprehensive editor and analyzing tool for protein alignments. A wide range of functions related to protein sequences and protein structures are accessible with an intuitive graphical interface. Recent features include mapping of mutations and polymorphisms onto structures and production of high quality figures for publication. Here we address the general problem of multi-purpose program packages to keep up with the rapid development of bioinformatical methods and the demand for specific program functions. STRAP was remade implementing a novel design which aims at Keeping Interfaces in STRAP Simple (KISS). KISS renders STRAP extendable to bio-scientists as well as to bio-informaticians. Scientists with basic computer skills are capable of implementing statistical methods or embedding existing bioinformatical tools in STRAP themselves. For bio-informaticians STRAP may serve as an environment for rapid prototyping and testing of complex algorithms such as automatic alignment algorithms or phylogenetic methods. Further, STRAP can be applied as an interactive web applet to present data related to a particular protein family and as a teaching tool. JAVA-1.4 or higher. http://www.charite.de/bioinf/strap/

  2. A Simple Evacuation Modeling and Simulation Tool for First Responders

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, Daniel B; Payne, Patricia W

    2015-01-01

Although modeling and simulation of mass evacuations during a natural or man-made disaster is an on-going and vigorous area of study, tool adoption by front-line first responders is uneven. Some of the factors that account for this situation include cost and complexity of the software. For several years, Oak Ridge National Laboratory has been actively developing the free Incident Management Preparedness and Coordination Toolkit (IMPACT) to address these issues. One of the components of IMPACT is a multi-agent simulation module for area-based and path-based evacuations. The user interface is designed so that anyone familiar with typical computer drawing tools can quickly author a geospatially-correct evacuation visualization suitable for table-top exercises. Since IMPACT is designed for use in the field where network communications may not be available, quick on-site evacuation alternatives can be evaluated to keep pace with a fluid threat situation. Realism is enhanced by incorporating collision avoidance into the simulation. Statistics are gathered as the simulation unfolds, including most importantly time-to-evacuate, to help first responders choose the best course of action.

  3. Identification of facilitators and barriers to residents' use of a clinical reasoning tool.

    PubMed

    DiNardo, Deborah; Tilstra, Sarah; McNeil, Melissa; Follansbee, William; Zimmer, Shanta; Farris, Coreen; Barnato, Amber E

    2018-03-28

While there is some experimental evidence to support the use of cognitive forcing strategies to reduce diagnostic error in residents, the potential usability of such strategies in the clinical setting has not been explored. We sought to test the effect of a clinical reasoning tool on diagnostic accuracy and to obtain feedback on its usability and acceptability. We conducted a randomized behavioral experiment testing the effect of this tool on diagnostic accuracy on written cases among post-graduate year 3 (PGY-3) residents at a single internal medicine residency program in 2014. Residents completed written clinical cases in a proctored setting with and without prompts to use the tool. The tool encouraged reflection on concordant and discordant aspects of each case. We used random effects regression to assess the effect of the tool on diagnostic accuracy of the independent case sets, controlling for case complexity. We then conducted audiotaped structured focus group debriefing sessions and reviewed the tapes for facilitators and barriers to use of the tool. Of 51 eligible PGY-3 residents, 34 (67%) participated in the study. The average diagnostic accuracy increased from 52% to 60% with the tool, a difference that just met the test for statistical significance in adjusted analyses (p=0.05). Residents reported that the tool was generally acceptable and understandable but did not recognize its utility for use with simple cases, suggesting the presence of overconfidence bias. A clinical reasoning tool improved residents' diagnostic accuracy on written cases. Overconfidence bias is a potential barrier to its use in the clinical setting.

  4. miRNA Temporal Analyzer (mirnaTA): a bioinformatics tool for identifying differentially expressed microRNAs in temporal studies using normal quantile transformation.

    PubMed

    Cer, Regina Z; Herrera-Galeano, J Enrique; Anderson, Joseph J; Bishop-Lilly, Kimberly A; Mokashi, Vishwesh P

    2014-01-01

Understanding the biological roles of microRNAs (miRNAs) is an active area of research that has produced a surge of publications in PubMed, particularly in cancer research. Along with this increasing interest, many open-source bioinformatics tools to identify existing and/or discover novel miRNAs in next-generation sequencing (NGS) reads have become available. While miRNA identification and discovery tools have improved significantly, the development of miRNA differential expression analysis tools, especially in temporal studies, remains substantially challenging. Further, installing currently available software is non-trivial, and testing with example datasets, running one's own dataset, and interpreting the results require notable expertise and time. Consequently, there is a strong need for a tool that allows scientists to normalize raw data, perform statistical analyses, and provide intuitive results without having to invest significant effort. We have developed miRNA Temporal Analyzer (mirnaTA), a bioinformatics package to identify differentially expressed miRNAs in temporal studies. mirnaTA is written in Perl and R (Version 2.13.0 or later) and can be run across multiple platforms, such as Linux, Mac and Windows. In the current version, mirnaTA requires users to provide a simple, tab-delimited, matrix file containing miRNA name and count data from a minimum of two to a maximum of 20 time points and three replicates. To recalibrate data and remove technical variability, raw data are normalized using normal quantile transformation (NQT), and a linear regression model is used to locate any miRNAs which are differentially expressed in a linear pattern. Subsequently, remaining miRNAs which do not fit a linear model are further analyzed with two different non-linear methods: 1) cumulative distribution function (CDF) or 2) analysis of variance (ANOVA). After both linear and non-linear analyses are completed, statistically significant miRNAs (P < 0.05) are plotted as heat maps using hierarchical cluster analysis and Euclidean distance matrix computation methods. mirnaTA is an open-source bioinformatics tool to aid scientists in identifying differentially expressed miRNAs which could be further mined for biological significance. It is expected to provide researchers with a means of moving from raw data to statistical summaries in a fast and intuitive manner.
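
    The normal quantile transformation step can be sketched as a rank-based inverse normal transform; the Blom-style rank offset below is an assumption, not necessarily the convention mirnaTA itself uses.

    ```python
    # Rank-based inverse normal (normal quantile) transformation of raw counts.
    import numpy as np
    from scipy.stats import norm, rankdata

    def normal_quantile_transform(counts):
        counts = np.asarray(counts, dtype=float)
        ranks = rankdata(counts)                            # average ranks for ties
        return norm.ppf((ranks - 0.375) / (len(counts) + 0.25))   # Blom offset (assumed)

    raw = np.array([5, 0, 12, 7, 150, 3, 22, 9])            # invented miRNA counts
    print(normal_quantile_transform(raw).round(2))
    ```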

  5. Using complexity metrics with R-R intervals and BPM heart rate measures.

    PubMed

    Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie

    2013-01-01

Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in social and cognitive sciences, heart rate is increasingly employed as a measure of arousal, emotional engagement and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate in mapping the temporal dynamics of heart rate and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on variability of the data, different choices regarding the kind of measures can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R interval) and beats-per-min (BPM). As a proof-of-concept, we employ a simple rest-exercise-rest task and show that non-linear statistics, fractal (DFA) and recurrence quantification (RQA) analyses, reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data that is employed: While R-R intervals are highly amenable to non-linear analyses, the success of non-linear methods for BPM data critically depends on their construction. Generally, "oversampled" BPM time-series can be recommended as they retain most of the information about non-linear aspects of heart beat dynamics.
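
    A compact sketch of one of the non-linear statistics mentioned, detrended fluctuation analysis (DFA), applied to an invented R-R series; real analyses use more scales, overlapping windows and careful preprocessing. For uncorrelated noise the scaling exponent should come out near 0.5.

    ```python
    # Minimal DFA sketch: integrate the series, detrend in windows, fit log-log slope.
    import numpy as np

    def dfa_alpha(x, scales=(4, 8, 16, 32, 64)):
        y = np.cumsum(x - np.mean(x))                  # integrated profile
        flucts = []
        for s in scales:
            n_win = len(y) // s
            f2 = []
            for w in range(n_win):
                seg = y[w * s:(w + 1) * s]
                t = np.arange(s)
                coef = np.polyfit(t, seg, 1)           # linear detrend per window
                f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
            flucts.append(np.sqrt(np.mean(f2)))
        slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
        return slope                                   # scaling exponent alpha

    rr = np.random.default_rng(5).normal(0.8, 0.05, 1024)   # white-noise-like R-R series
    print(round(dfa_alpha(rr), 2))
    ```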

  6. Constructing Space-Time Views from Fixed Size Statistical Data: Getting the Best of both Worlds

    NASA Technical Reports Server (NTRS)

    Schmidt, Melisa; Yan, Jerry C.

    1997-01-01

Many performance monitoring tools are currently available to the super-computing community. The performance data gathered and analyzed by these tools fall under two categories: statistics and event traces. Statistical data is much more compact but lacks the probative power event traces offer. Event traces, on the other hand, can easily fill up the entire file system during execution such that the instrumented execution may have to be terminated halfway through. In this paper, we propose an innovative methodology for performance data gathering and representation that offers a middle ground. The user can trade off tracing overhead and trace data size against data quality incrementally. In other words, the user will be able to limit the amount of trace collected and, at the same time, carry out some of the analysis event traces offer using space-time views for the entire execution. Two basic ideas are employed: the use of averages to replace recording data for each instance and formulae to represent sequences associated with communication and control flow. With the help of a few simple examples, we illustrate the use of these techniques in performance tuning and compare the quality of the traces we collected vs. event traces. We found that the trace files thus obtained are, indeed, small, bounded and predictable before program execution and that the quality of the space-time views generated from these statistical data is excellent. Furthermore, experimental results showed that the formulae proposed were able to capture 100% of all the sequences associated with 11 of the 15 applications tested. The performance of the formulae can be incrementally improved by allocating more memory at run-time to learn longer sequences.

  7. Constructing Space-Time Views from Fixed Size Statistical Data: Getting the Best of Both Worlds

    NASA Technical Reports Server (NTRS)

    Schmidt, Melisa; Yan, Jerry C.; Bailey, David (Technical Monitor)

    1996-01-01

Many performance monitoring tools are currently available to the super-computing community. The performance data gathered and analyzed by these tools fall under two categories: statistics and event traces. Statistical data is much more compact but lacks the probative power event traces offer. Event traces, on the other hand, can easily fill up the entire file system during execution such that the instrumented execution may have to be terminated halfway through. In this paper, we propose an innovative methodology for performance data gathering and representation that offers a middle ground. The user can trade off tracing overhead and trace data size against data quality incrementally. In other words, the user will be able to limit the amount of trace collected and, at the same time, carry out some of the analysis event traces offer using space-time views for the entire execution. Two basic ideas are employed: the use of averages to replace recording data for each instance and "formulae" to represent sequences associated with communication and control flow. With the help of a few simple examples, we illustrate the use of these techniques in performance tuning and compare the quality of the traces we collected vs. event traces. We found that the trace files thus obtained are, indeed, small, bounded and predictable before program execution and that the quality of the space-time views generated from these statistical data is excellent. Furthermore, experimental results showed that the formulae proposed were able to capture 100% of all the sequences associated with 11 of the 15 applications tested. The performance of the formulae can be incrementally improved by allocating more memory at run-time to learn longer sequences.

  8. Use of a Cutaneous Body Image (CBI) scale to evaluate self perception of body image in acne vulgaris.

    PubMed

    Amr, Mostafa; Kaliyadan, Feroze; Shams, Tarek

    2014-01-01

    Skin disorders such as acne, which have significant cosmetic implications, can affect the self-perception of cutaneous body image. There are many scales which measure self-perception of cutaneous body image. We evaluated the use of a simple Cutaneous Body Image (CBI) scale to assess self-perception of body image in a sample of young Arab patients affected with acne. A total of 70 patients with acne answered the CBI questionnaire. The CBI score was correlated with the severity of acne and acne scarring, gender, and history of retinoids use. There was no statistically significant correlation between CBI and the other parameters - gender, acne/acne scarring severity, and use of retinoids. Our study suggests that cutaneous body image perception in Arab patients with acne was not dependent on variables like gender and severity of acne or acne scarring. A simple CBI scale alone is not a sufficiently reliable tool to assess self-perception of body image in patients with acne vulgaris.

  9. Analyzing Hidden Semantics in Social Bookmarking of Open Educational Resources

    NASA Astrophysics Data System (ADS)

    Minguillón, Julià

Web 2.0 services such as social bookmarking allow users to manage and share the links they find interesting, adding their own tags for describing them. This is especially interesting in the field of open educational resources, as Delicious is a simple way to bridge the institutional point of view (i.e. learning object repositories) with the individual one (i.e. personal collections), thus promoting the discovery and sharing of such resources by other users. In this paper we propose a methodology for analyzing such tags in order to discover hidden semantics (i.e. taxonomies and vocabularies) that can be used to improve descriptions of learning objects and make learning object repositories more visible and discoverable. We propose the use of a simple statistical analysis tool such as principal component analysis to discover which tags create clusters that can be semantically interpreted. We will compare the obtained results with a collection of resources related to open educational resources, in order to better understand the real needs of people searching for open educational resources.
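
    The proposed analysis can be sketched with an invented tag-by-resource count matrix and a two-component PCA; tags whose coordinates group together would be candidates for a shared semantic cluster.

    ```python
    # Illustrative PCA of a small tag-by-resource co-occurrence matrix (data invented).
    import numpy as np
    from sklearn.decomposition import PCA

    tags = ["opencourseware", "physics", "video", "lecture", "creativecommons"]
    # rows = tags, columns = bookmarked resources (counts of tag use per resource)
    counts = np.array([[3, 0, 4, 1, 0, 2],
                       [0, 5, 0, 0, 4, 1],
                       [2, 1, 3, 0, 0, 2],
                       [1, 4, 1, 0, 3, 0],
                       [3, 0, 2, 2, 0, 1]], dtype=float)

    pca = PCA(n_components=2)
    coords = pca.fit_transform(counts)
    for tag, (pc1, pc2) in zip(tags, coords.round(2)):
        print(f"{tag:16s} {pc1:6.2f} {pc2:6.2f}")
    print(pca.explained_variance_ratio_.round(2))
    ```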

  10. Self-assessed performance improves statistical fusion of image labels

    PubMed Central

    Bryan, Frederick W.; Xu, Zhoubing; Asman, Andrew J.; Allen, Wade M.; Reich, Daniel S.; Landman, Bennett A.

    2014-01-01

    Purpose: Expert manual labeling is the gold standard for image segmentation, but this process is difficult, time-consuming, and prone to inter-individual differences. While fully automated methods have successfully targeted many anatomies, automated methods have not yet been developed for numerous essential structures (e.g., the internal structure of the spinal cord as seen on magnetic resonance imaging). Collaborative labeling is a new paradigm that offers a robust alternative that may realize both the throughput of automation and the guidance of experts. Yet, distributing manual labeling expertise across individuals and sites introduces potential human factors concerns (e.g., training, software usability) and statistical considerations (e.g., fusion of information, assessment of confidence, bias) that must be further explored. During the labeling process, it is simple to ask raters to self-assess the confidence of their labels, but this is rarely done and has not been previously quantitatively studied. Herein, the authors explore the utility of self-assessment in relation to automated assessment of rater performance in the context of statistical fusion. Methods: The authors conducted a study of 66 volumes manually labeled by 75 minimally trained human raters recruited from the university undergraduate population. Raters were given 15 min of training during which they were shown examples of correct segmentation, and the online segmentation tool was demonstrated. The volumes were labeled 2D slice-wise, and the slices were unordered. A self-assessed quality metric was produced by raters for each slice by marking a confidence bar superimposed on the slice. Volumes produced by both voting and statistical fusion algorithms were compared against a set of expert segmentations of the same volumes. Results: Labels for 8825 distinct slices were obtained. Simple majority voting resulted in statistically poorer performance than voting weighted by self-assessed performance. Statistical fusion resulted in statistically indistinguishable performance from self-assessed weighted voting. The authors developed a new theoretical basis for using self-assessed performance in the framework of statistical fusion and demonstrated that the combined sources of information (both statistical assessment and self-assessment) yielded statistically significant improvement over the methods considered separately. Conclusions: The authors present the first systematic characterization of self-assessed performance in manual labeling. The authors demonstrate that self-assessment and statistical fusion yield similar, but complementary, benefits for label fusion. Finally, the authors present a new theoretical basis for combining self-assessments with statistical label fusion. PMID:24593721
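
    The comparison at the heart of the study can be sketched with a toy binary labeling task: simple majority voting versus voting weighted by each rater's self-assessed confidence. All values below are invented, and the statistical fusion algorithm itself is not reproduced.

    ```python
    # Toy comparison of majority voting and confidence-weighted voting for label fusion.
    import numpy as np

    rng = np.random.default_rng(6)
    truth = rng.integers(0, 2, size=200)                   # binary reference label per pixel
    confidence = np.array([0.95, 0.85, 0.6, 0.55, 0.5])    # self-assessed, one per rater (invented)
    # in this toy setup, each rater's accuracy matches their self-assessment
    labels = np.array([np.where(rng.random(200) < c, truth, 1 - truth) for c in confidence])

    majority = (labels.mean(axis=0) > 0.5).astype(int)
    weighted = ((confidence[:, None] * labels).sum(axis=0) / confidence.sum() > 0.5).astype(int)
    print((majority == truth).mean(), (weighted == truth).mean())
    ```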

  11. Predicting the need for muscle flap salvage after open groin vascular procedures: a clinical assessment tool.

    PubMed

    Fischer, John P; Nelson, Jonas A; Shang, Eric K; Wink, Jason D; Wingate, Nicholas A; Woo, Edward Y; Jackson, Benjamin M; Kovach, Stephen J; Kanchwala, Suhail

    2014-12-01

Groin wound complications after open vascular surgery procedures are common, morbid, and costly. The purpose of this study was to generate a simple, validated, clinically usable risk assessment tool for predicting groin wound morbidity after infra-inguinal vascular surgery. A retrospective review of consecutive patients undergoing groin cutdowns for femoral access between 2005 and 2011 was performed. Patients necessitating salvage flaps were compared to those who did not, and a stepwise logistic regression was performed and validated using a bootstrap technique. Utilising this analysis, a simplified risk score was developed to predict the risk of developing a wound which would necessitate salvage. A total of 925 patients were included in the study. The salvage flap rate was 11.2% (n = 104). Predictors determined by logistic regression included prior groin surgery (OR = 4.0, p < 0.001), prosthetic graft (OR = 2.7, p < 0.001), coronary artery disease (OR = 1.8, p = 0.019), peripheral arterial disease (OR = 5.0, p < 0.001), and obesity (OR = 1.7, p = 0.039). Based upon the respective logistic coefficients, a simplified scoring system was developed to enable preoperative risk stratification regarding the likelihood of a significant complication which would require a salvage muscle flap. The c-statistic for the regression demonstrated excellent discrimination at 0.89. This study presents a simple, internally validated risk assessment tool that accurately predicts wound morbidity requiring flap salvage in open groin vascular surgery patients. The preoperatively high-risk patient can be identified and selectively targeted as a candidate for a prophylactic muscle flap.
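
    How a simplified score can be derived from logistic regression output can be sketched by assigning integer points proportional to each predictor's log odds ratio. The odds ratios are those quoted in the abstract, but the scaling and rounding are illustrative assumptions, not the authors' published scoring system.

    ```python
    # Illustrative conversion of reported odds ratios into integer risk-score points.
    import numpy as np

    odds_ratios = {"prior groin surgery": 4.0, "prosthetic graft": 2.7,
                   "coronary artery disease": 1.8, "peripheral arterial disease": 5.0,
                   "obesity": 1.7}
    log_or = {k: np.log(v) for k, v in odds_ratios.items()}
    smallest = min(log_or.values())
    points = {k: round(v / smallest) for k, v in log_or.items()}   # points per predictor
    print(points)

    patient = {"prior groin surgery": 1, "prosthetic graft": 0, "coronary artery disease": 1,
               "peripheral arterial disease": 1, "obesity": 0}     # hypothetical patient
    print("risk score:", sum(points[k] * patient[k] for k in points))
    ```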

  12. Characterizing pigments with hyperspectral imaging variable false-color composites

    NASA Astrophysics Data System (ADS)

    Hayem-Ghez, Anita; Ravaud, Elisabeth; Boust, Clotilde; Bastian, Gilles; Menu, Michel; Brodie-Linder, Nancy

    2015-11-01

Hyperspectral imaging has been used for pigment characterization on paintings for the last 10 years. It is a noninvasive technique, which mixes the power of spectrophotometry and that of imaging technologies. We have access to a visible and near-infrared hyperspectral camera, ranging from 400 to 1000 nm in 80-160 spectral bands. In order to treat the large amount of data that this imaging technique generates, one can use statistical tools such as principal component analysis (PCA). To conduct the characterization of pigments, researchers mostly use PCA, convex geometry algorithms and the comparison of resulting clusters to database spectra with a specific tolerance (like the Spectral Angle Mapper tool on the dedicated software ENVI). Our approach originates from false-color photography and aims at providing a simple tool to identify pigments thanks to imaging spectroscopy. It can be considered as a quick first analysis to see the principal pigments of a painting, before using a more complete multivariate statistical tool. We study pigment spectra for each kind of hue (blue, green, red and yellow) to identify the wavelengths maximizing spectral differences. The case of red pigments is the most interesting because our methodology can discriminate red pigments very well, even red lakes, which are always difficult to identify. For the yellow and blue categories, the method represents clear progress over infrared false-color (IRFC) photography for pigment discrimination. We apply our methodology to study the pigments on a painting by Eustache Le Sueur, a French painter of the seventeenth century. We compare the results to other noninvasive analyses such as X-ray fluorescence and optical microscopy. Finally, we draw conclusions about the advantages and limits of the variable false-color image method using hyperspectral imaging.
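
    The variable false-color idea can be sketched directly: pick three wavelengths from the hyperspectral cube and map them to the R, G and B channels, stretching each. The cube and the band choices below are invented placeholders, not those selected in the study.

    ```python
    # Build a variable false-color composite from three chosen hyperspectral bands.
    import numpy as np

    rng = np.random.default_rng(7)
    wavelengths = np.linspace(400, 1000, 120)           # nm, one per band (assumed sampling)
    cube = rng.random((256, 256, wavelengths.size))     # stand-in for a hyperspectral scan

    def false_color(cube, wavelengths, r_nm, g_nm, b_nm):
        idx = [np.argmin(np.abs(wavelengths - nm)) for nm in (r_nm, g_nm, b_nm)]
        rgb = cube[:, :, idx].astype(float)
        rgb -= rgb.min(axis=(0, 1))
        rgb /= rgb.max(axis=(0, 1))                     # stretch each channel to [0, 1]
        return rgb

    composite = false_color(cube, wavelengths, 900, 650, 550)   # an IRFC-like band choice
    print(composite.shape)
    ```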

  13. Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.

    PubMed

    Mi, Gu; Di, Yanming; Schafer, Daniel W

    2015-01-01

    This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
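
    A hedged sketch of a simulation-based goodness-of-fit check in the same spirit, far simpler than the paper's tests: fit an intercept-only negative binomial by the method of moments, simulate replicate data sets, and compare the observed dispersion index to its simulated distribution.

    ```python
    # Parametric-bootstrap style goodness-of-fit sketch for a negative binomial model.
    import numpy as np

    rng = np.random.default_rng(8)
    y = rng.negative_binomial(n=5, p=0.3, size=200)      # "observed" counts (simulated)

    mu, var = y.mean(), y.var(ddof=1)
    size = mu ** 2 / (var - mu)                          # NB dispersion, method of moments
    p = size / (size + mu)

    def discrepancy(x):
        return x.var(ddof=1) / x.mean()                  # dispersion index as the test statistic

    obs = discrepancy(y)
    sims = [discrepancy(rng.negative_binomial(size, p, y.size)) for _ in range(999)]
    p_value = (1 + sum(s >= obs for s in sims)) / 1000   # one-sided simulation p-value
    print(round(p_value, 3))
    ```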

  14. Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics

    PubMed Central

    Brusniak, Mi-Youn; Bodenmiller, Bernd; Campbell, David; Cooke, Kelly; Eddes, James; Garbutt, Andrew; Lau, Hollis; Letarte, Simon; Mueller, Lukas N; Sharma, Vagisha; Vitek, Olga; Zhang, Ning; Aebersold, Ruedi; Watts, Julian D

    2008-01-01

    Background Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics. Results We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling. Conclusion The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field. PMID:19087345

  15. Dynamic PET simulator via tomographic emission projection for kinetic modeling and parametric image studies.

    PubMed

    Häggström, Ida; Beattie, Bradley J; Schmidtlein, C Ross

    2016-06-01

    To develop and evaluate a fast and simple tool called dpetstep (Dynamic PET Simulator of Tracers via Emission Projection), for dynamic PET simulations as an alternative to Monte Carlo (MC), useful for educational purposes and evaluation of the effects of the clinical environment, postprocessing choices, etc., on dynamic and parametric images. The tool was developed in matlab using both new and previously reported modules of petstep (PET Simulator of Tracers via Emission Projection). Time activity curves are generated for each voxel of the input parametric image, whereby effects of imaging system blurring, counting noise, scatters, randoms, and attenuation are simulated for each frame. Each frame is then reconstructed into images according to the user specified method, settings, and corrections. Reconstructed images were compared to MC data, and simple Gaussian noised time activity curves (GAUSS). dpetstep was 8000 times faster than MC. Dynamic images from dpetstep had a root mean square error that was within 4% on average of that of MC images, whereas the GAUSS images were within 11%. The average bias in dpetstep and MC images was the same, while GAUSS differed by 3% points. Noise profiles in dpetstep images conformed well to MC images, confirmed visually by scatter plot histograms, and statistically by tumor region of interest histogram comparisons that showed no significant differences (p < 0.01). Compared to GAUSS, dpetstep images and noise properties agreed better with MC. The authors have developed a fast and easy one-stop solution for simulations of dynamic PET and parametric images, and demonstrated that it generates both images and subsequent parametric images with very similar noise properties to those of MC images, in a fraction of the time. They believe dpetstep to be very useful for generating fast, simple, and realistic results, however since it uses simple scatter and random models it may not be suitable for studies investigating these phenomena. dpetstep can be downloaded free of cost from https://github.com/CRossSchmidtlein/dPETSTEP.

  16. Statistical complexity without explicit reference to underlying probabilities

    NASA Astrophysics Data System (ADS)

    Pennini, F.; Plastino, A.

    2018-06-01

    We show that extremely simple systems with a not too large number of particles can be simultaneously thermally stable and complex. To such an end, we extend the notion of statistical complexity to simple configurations of non-interacting particles, without appeal to probabilities, and discuss configurational properties.

  17. A simple method for processing data with least square method

    NASA Astrophysics Data System (ADS)

    Wang, Chunyan; Qi, Liqun; Chen, Yongxiang; Pang, Guangning

    2017-08-01

    The least square method is widely used in data processing and error estimation. It has become an essential technique for parameter estimation, data processing, regression analysis and experimental data fitting, and a standard tool for statistical inference. In measurement data analysis, complex relationships are usually treated with the least square principle, i.e., matrices are used to solve for the final estimate and to improve its accuracy. In this paper, a new way of obtaining the least square solution is presented; it is based on algebraic computation and is relatively straightforward and easy to understand. The practicability of the method is illustrated with a concrete example.
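
    As a worked illustration of the standard matrix form of the least-square solution, the normal equations (X'X)b = X'y can be solved directly; the small dataset below is made up for demonstration.

      import numpy as np

      # Made-up measurements: fit y = a + b*x by least squares.
      x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
      y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

      X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
      beta = np.linalg.solve(X.T @ X, X.T @ y)       # normal equations (X'X) beta = X'y
      residuals = y - X @ beta
      sigma2 = residuals @ residuals / (len(y) - 2)  # unbiased residual variance

      print("intercept and slope:", beta)
      print("residual variance:", sigma2)
      # np.linalg.lstsq(X, y, rcond=None) gives the same estimate more stably.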

  18. Costo-iliac distance: a physical sign of understated importance.

    PubMed

    Barry, P J; O'Mahony, D

    2012-03-01

    Osteoporosis is a common condition, especially affecting the older female population. The ability to predict loss of lumbar height using simple anatomical measurements would be a useful tool. Forty subjects were recruited. Mean age was 72 years. Arm span (AS) and the costo-iliac distance (CID) were measured. The CID/AS ratio was calculated. The L1-L4 vertebral height of each patient was obtained from dual-energy X-ray absorptiometry (DEXA). There was a statistically significant correlation between the lumbar height and CID/AS ratio (R² = 0.79, p < 0.001). The CID/AS ratio may be a useful bedside test in identifying loss of lumbar vertebral height.

  19. Neurophysiological correlates of depressive symptoms in young adults: A quantitative EEG study.

    PubMed

    Lee, Poh Foong; Kan, Donica Pei Xin; Croarkin, Paul; Phang, Cheng Kar; Doruk, Deniz

    2018-01-01

    There is an unmet need for practical and reliable biomarkers for mood disorders in young adults. Identifying the brain activity associated with the early signs of depressive disorders could have important diagnostic and therapeutic implications. In this study we sought to investigate the EEG characteristics in young adults with newly identified depressive symptoms. Based on the initial screening, a total of 100 participants (n = 50 euthymic, n = 50 depressive) underwent 32-channel EEG acquisition. Simple logistic regression and C-statistic were used to explore if EEG power could be used to discriminate between the groups. The strongest EEG predictors of mood were then assessed using multivariate logistic regression models. Simple logistic regression analysis with subsequent C-statistics revealed that only high-alpha and beta power originating from the left central cortex (C3) have a reliable discriminative value (ROC curve >0.7 (70%)) for differentiating the depressive group from the euthymic group. Multivariate regression analysis showed that the single most significant predictor of group (depressive vs. euthymic) is the high-alpha power over C3 (p = 0.03). The present findings suggest that EEG is a useful tool in the identification of neurophysiological correlates of depressive symptoms in young adults with no previous psychiatric history. Our results could guide future studies investigating the early neurophysiological changes and surrogate outcomes in depression. Copyright © 2017 Elsevier Ltd. All rights reserved.
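
    A minimal sketch of the analysis pattern described above, using simulated band-power values rather than the study's EEG data: fit a simple logistic regression on two C3 features and report the C-statistic, i.e. the area under the ROC curve.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(0)
      n = 50   # subjects per group, as in the study design

      # Simulated high-alpha and beta power at C3 (arbitrary units).
      X_euthymic = rng.normal([10.0, 5.0], [2.0, 1.5], size=(n, 2))
      X_depressive = rng.normal([12.0, 6.0], [2.0, 1.5], size=(n, 2))
      X = np.vstack([X_euthymic, X_depressive])
      y = np.array([0] * n + [1] * n)                # 0 = euthymic, 1 = depressive

      model = LogisticRegression().fit(X, y)
      auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
      print(f"C-statistic (area under the ROC curve): {auc:.2f}")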

  20. Comparing Serum Follicle-Stimulating Hormone (FSH) Level with Vaginal PH in Women with Menopausal Symptoms.

    PubMed

    Vahidroodsari, Fatemeh; Ayati, Seddigheh; Yousefi, Zohreh; Saeed, Shohreh

    2010-01-01

    Despite the important implications for women's health and reproduction, very few studies have focused on vaginal pH for menopausal diagnosis. Recent studies have suggested vaginal pH as a simple, noninvasive and inexpensive method for this purpose. The aim of this study is to compare serum FSH level with vaginal pH in menopause. This is a cross-sectional, descriptive study, conducted on 103 women (aged 31-95 yrs) with menopausal symptoms who were referred to the Menopausal Clinic at Ghaem Hospital during 2006. Vaginal pH was measured using pH meter strips and serum FSH levels were measured using immunoassay methods. The data were analyzed using SPSS software (version 11.5) and results were evaluated statistically by the Chi-square and Kappa tests. p≤0.05 was considered statistically significant. According to this study, in the absence of vaginal infection, the average vaginal pH in these 103 menopausal women was 5.33±0.53. If the menopausal hallmark was considered as vaginal pH>4.5, and serum FSH as ≥20 mIU/ml, then the sensitivity of vaginal pH for menopausal diagnosis was 97%. The mean FSH level in this population was 80.79 mIU/ml. Vaginal pH is a simple, accurate, and cost-effective tool that can be suggested as a suitable alternative to serum FSH measurement for the diagnosis of menopause.
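
    The diagnostic-accuracy figures reduce to a 2x2 table. The counts below are hypothetical and only show how sensitivity and specificity are computed when vaginal pH > 4.5 is judged against FSH >= 20 mIU/ml as the reference standard.

      # Hypothetical 2x2 table (not the study's data).
      true_pos, false_neg = 64, 2     # menopausal by FSH: pH positive / pH negative
      false_pos, true_neg = 10, 27    # not menopausal by FSH: pH positive / pH negative

      sensitivity = true_pos / (true_pos + false_neg)
      specificity = true_neg / (true_neg + false_pos)
      print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")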

  1. Information Visualization Techniques for Effective Cross-Discipline Communication

    NASA Astrophysics Data System (ADS)

    Fisher, Ward

    2013-04-01

    Collaboration between research groups in different fields is a common occurrence, but it can often be frustrating due to the absence of a common vocabulary. This lack of a shared context can make expressing important concepts and discussing results difficult. This problem may be further exacerbated when communicating to an audience of laypeople. Without a clear frame of reference, simple concepts are often rendered difficult-to-understand at best, and unintelligible at worst. An easy way to alleviate this confusion is with the use of clear, well-designed visualizations to illustrate an idea, process or conclusion. There exist a number of well-described machine-learning and statistical techniques which can be used to illuminate the information present within complex high-dimensional datasets. Once the information has been separated from the data, clear communication becomes a matter of selecting an appropriate visualization. Ideally, the visualization is information-rich but data-scarce. Anything from a simple bar chart, to a line chart with confidence intervals, to an animated set of 3D point-clouds can be used to render a complex idea as an easily understood image. Several case studies will be presented in this work. In the first study, we will examine how a complex statistical analysis was applied to a high-dimensional dataset, and how the results were succinctly communicated to an audience of microbiologists and chemical engineers. Next, we will examine a technique used to illustrate the concept of the singular value decomposition, as used in the field of computer vision, to a lay audience of undergraduate students from mixed majors. We will then examine a case where a simple animated line plot was used to communicate an approach to signal decomposition, and will finish with a discussion of the tools available to create these visualizations.

  2. Modeling the milling tool wear by using an evolutionary SVM-based model from milling runs experimental data

    NASA Astrophysics Data System (ADS)

    Nieto, Paulino José García; García-Gonzalo, Esperanza; Vilán, José Antonio Vilán; Robleda, Abraham Segade

    2015-12-01

    The main aim of this research work is to build a new practical hybrid regression model to predict the milling tool wear in a regular cut as well as entry cut and exit cut of a milling tool. The model was based on Particle Swarm Optimization (PSO) in combination with support vector machines (SVMs). This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. Bearing this in mind, a PSO-SVM-based model, which is based on statistical learning theory, was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. To accomplish the objective of this study, the experimental dataset represents experiments from runs on a milling machine under various operating conditions. In this way, data sampled by three different types of sensors (acoustic emission sensor, vibration sensor and current sensor) were acquired at several positions. A second aim is to determine the factors with the greatest bearing on the milling tool flank wear with a view to proposing improvements to the milling machine. Firstly, this hybrid PSO-SVM-based regression model captures the main ideas of statistical learning theory in order to obtain a good prediction of the dependence between the flank wear (output variable) and the input variables (time, depth of cut, feed, etc.). Indeed, regression with optimal hyperparameters was performed and a determination coefficient of 0.95 was obtained. The agreement of this model with experimental data confirmed its good performance. Secondly, the main advantages of this PSO-SVM-based model are its capacity to produce a simple, easy-to-interpret model, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, the main conclusions of this study are presented.
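
    A minimal sketch of the PSO-SVM idea on synthetic data: a small particle swarm searches for the SVR hyperparameters (log10 C and log10 gamma) that maximize cross-validated R^2. The data, swarm settings and two-parameter search space are simplifying assumptions, not the authors' configuration.

      import numpy as np
      from sklearn.svm import SVR
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)

      # Synthetic stand-in for the milling data: flank wear vs. time, depth of cut, feed.
      n = 200
      X = rng.uniform([0.0, 0.5, 0.1], [60.0, 3.0, 1.0], size=(n, 3))
      y = 0.01 * X[:, 0] * (1 + 0.5 * X[:, 1] + 0.3 * X[:, 2]) + rng.normal(0, 0.02, n)

      def fitness(params):
          c, g = 10.0 ** params                      # particles live in log10 space
          return cross_val_score(SVR(C=c, gamma=g), X, y, cv=5, scoring="r2").mean()

      # Minimal particle swarm (12 particles, 20 iterations).
      pos = rng.uniform([-1, -3], [3, 1], size=(12, 2))
      vel = np.zeros_like(pos)
      pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
      gbest = pbest[np.argmax(pbest_val)]

      for _ in range(20):
          r1, r2 = rng.random((2, len(pos), 1))
          vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
          pos = pos + vel
          vals = np.array([fitness(p) for p in pos])
          better = vals > pbest_val
          pbest[better], pbest_val[better] = pos[better], vals[better]
          gbest = pbest[np.argmax(pbest_val)]

      print("best log10(C), log10(gamma):", gbest, "cross-validated R^2:", pbest_val.max())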

  3. Density-based empirical likelihood procedures for testing symmetry of data distributions and K-sample comparisons.

    PubMed

    Vexler, Albert; Tanajian, Hovig; Hutson, Alan D

    In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K-sample distributions. Recognizing that recent statistical software packages do not sufficiently address K-sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p-values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p-value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p-value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
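
    A generic sketch of method 1 above (a classical Monte Carlo p-value), with a simple between-group statistic standing in for the density-based empirical likelihood ratio computed by vxdbel; data for K = 3 groups are simulated.

      import numpy as np

      rng = np.random.default_rng(42)

      def k_sample_stat(groups):
          # Between-group sum of squares of the group means (illustrative stand-in).
          pooled = np.concatenate(groups)
          return sum(len(g) * (g.mean() - pooled.mean()) ** 2 for g in groups)

      groups = [rng.normal(0.0, 1, 30), rng.normal(0.3, 1, 30), rng.normal(0.6, 1, 30)]
      sizes = [len(g) for g in groups]
      observed = k_sample_stat(groups)

      # Monte Carlo p-value: permute group labels and recompute the statistic.
      pooled = np.concatenate(groups)
      n_mc, count = 5000, 0
      for _ in range(n_mc):
          resampled = np.split(rng.permutation(pooled), np.cumsum(sizes)[:-1])
          count += k_sample_stat(resampled) >= observed
      print(f"Monte Carlo p-value: {(count + 1) / (n_mc + 1):.4f}")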

  4. Evaluation of an existing screening tool for psoriatic arthritis in people with psoriasis and the development of a new instrument: the Psoriasis Epidemiology Screening Tool (PEST) questionnaire.

    PubMed

    Ibrahim, G H; Buch, M H; Lawson, C; Waxman, R; Helliwell, P S

    2009-01-01

    To evaluate an existing tool (the Swedish modification of the Psoriasis Assessment Questionnaire) and to develop a new instrument to screen for psoriatic arthritis in people with psoriasis. The starting point was a community-based survey of people with psoriasis using questionnaires developed from the literature. Selected respondents were examined and additional known cases of psoriatic arthritis were included in the analysis. The new instrument was developed using univariate statistics and a logistic regression model, comparing people with and without psoriatic arthritis. The instruments were compared using receiver operating characteristic (ROC) curve analysis. 168 questionnaires were returned (response rate 27%) and 93 people attended for examination (55% of questionnaire respondents). Of these 93, twelve were newly diagnosed with psoriatic arthritis during this study. These 12 were supplemented by 21 people with known psoriatic arthritis. Just 5 questions were found to be significant predictors of psoriatic arthritis in this population. Figures for sensitivity and specificity were 0.92 and 0.78 respectively, an improvement on the Alenius tool (sensitivity and specificity, 0.63 and 0.72 respectively). A new screening tool for identifying people with psoriatic arthritis has been developed. Five simple questions demonstrated good sensitivity and specificity in this population but further validation is required.

  5. Estimating population diversity with CatchAll

    PubMed Central

    Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.

    2012-01-01

    Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246

  6. Longitudinal study of microvascular involvement by nailfold capillaroscopy in children with Henoch-Schönlein purpura.

    PubMed

    Zampetti, Anna; Rigante, Donato; Bersani, Giulia; Rendeli, Claudia; Feliciani, Claudio; Stabile, Achille

    2009-09-01

    The aim of this study is to describe by video-nailfold capillaroscopy the microvascular involvement and capillary changes in children with Henoch-Schönlein purpura (HSp) and to establish a possible correlation with clinical outcome. Thirty-one patients underwent capillaroscopic evaluation through a videomicroscope during the acute phase and after 6 months. Twenty sex/age-matched controls were also examined. All capillaroscopic variables were statistically examined in combination with laboratory/clinical data. Architectural and morphological changes recorded during the acute phase were statistically significant in comparison to the controls (p < 0.01). At the follow-up, oedema was still observed in all patients, whereas morphological changes persisted in only two. There was no significant correlation between capillaroscopic changes, laboratory/clinical data, and outcome. Video-nailfold capillaroscopy can be a simple tool to evaluate microvascular abnormalities in the acute phase of HSp, and the persistence of oedema could suggest an incomplete disease resolution at a microvascular level.

  7. Fast Quantum Algorithm for Predicting Descriptive Statistics of Stochastic Processes

    NASA Technical Reports Server (NTRS)

    Williams, Colin P.

    1999-01-01

    Stochastic processes are used as a modeling tool in several sub-fields of physics, biology, and finance. Analytic understanding of the long-term behavior of such processes is only tractable for very simple types of stochastic processes such as Markovian processes. However, in real-world applications more complex stochastic processes often arise. In physics, the complicating factor might be nonlinearities; in biology it might be memory effects; and in finance it might be the non-random intentional behavior of participants in a market. In the absence of analytic insight, one is forced to understand these more complex stochastic processes via numerical simulation techniques. In this paper we present a quantum algorithm for performing such simulations. In particular, we show how a quantum algorithm can predict arbitrary descriptive statistics (moments) of N-step stochastic processes in just O(square root of N) time. That is, the quantum complexity is the square root of the classical complexity for performing such simulations. This is a significant speedup in comparison to the current state of the art.

  8. A simple rain attenuation model for earth-space radio links operating at 10-35 GHz

    NASA Technical Reports Server (NTRS)

    Stutzman, W. L.; Yon, K. M.

    1986-01-01

    The simple attenuation model has been improved from an earlier version and now includes the effect of wave polarization. The model is for the prediction of rain attenuation statistics on earth-space communication links operating in the 10-35 GHz band. Simple calculations produce attenuation values as a function of average rain rate. These together with rain rate statistics (either measured or predicted) can be used to predict annual rain attenuation statistics. In this paper model predictions are compared to measured data from a data base of 62 experiments performed in the U.S., Europe, and Japan. Comparisons are also made to predictions from other models.

  9. DABAM: an open-source database of X-ray mirrors metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele

    2016-04-20

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. Some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.

  10. DABAM: an open-source database of X-ray mirrors metrology

    PubMed Central

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele; Glass, Mark; Idir, Mourad; Metz, Jim; Raimondi, Lorenzo; Rebuffi, Luca; Reininger, Ruben; Shi, Xianbo; Siewert, Frank; Spielmann-Jaeggi, Sibylle; Takacs, Peter; Tomasset, Muriel; Tonnessen, Tom; Vivo, Amparo; Yashchuk, Valeriy

    2016-01-01

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. Some optics simulations are presented and discussed to illustrate the real use of the profiles from the database. PMID:27140145
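
    A small sketch of the "most usual statistical parameters" one would derive from a DABAM-style height profile (the profile below is synthetic): rms height error and rms slope error, with slopes obtained by numerical differentiation of the heights.

      import numpy as np

      rng = np.random.default_rng(3)
      x = np.arange(0.0, 0.5, 1e-3)                  # abscissa along the mirror [m]
      heights = 2e-9 * np.sin(2 * np.pi * x / 0.2) + rng.normal(0, 0.5e-9, x.size)

      slopes = np.gradient(heights, x)               # slope profile [rad]
      rms_height = np.sqrt(np.mean((heights - heights.mean()) ** 2))
      rms_slope = np.sqrt(np.mean((slopes - slopes.mean()) ** 2))

      print(f"rms height error: {rms_height * 1e9:.2f} nm")
      print(f"rms slope error:  {rms_slope * 1e6:.2f} microrad")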

  12. DABAM: An open-source database of X-ray mirrors metrology

    DOE PAGES

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele; ...

    2016-05-01

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. In conclusion, some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.

  15. Probabilistic risk analysis of building contamination.

    PubMed

    Bolster, D T; Tartakovsky, D M

    2008-10-01

    We present a general framework for probabilistic risk assessment (PRA) of building contamination. PRA provides a powerful tool for the rigorous quantification of risk in contamination of building spaces. A typical PRA starts by identifying relevant components of a system (e.g. ventilation system components, potential sources of contaminants, remediation methods) and proceeds by using available information and statistical inference to estimate the probabilities of their failure. These probabilities are then combined by means of fault-tree analyses to yield probabilistic estimates of the risk of system failure (e.g. building contamination). A sensitivity study of PRAs can identify features and potential problems that need to be addressed with the most urgency. Often PRAs are amenable to approximations, which can significantly simplify the approach. All these features of PRA are presented in this paper via a simple illustrative example, which can be built upon in further studies. The tool presented here can be used to design and maintain adequate ventilation systems to minimize exposure of occupants to contaminants.
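
    A toy illustration of how component probabilities are combined through OR/AND gates in a fault tree; the events and numbers below are invented for illustration, and a real PRA would estimate them from data and statistical inference.

      # Illustrative component probabilities (assumed independent).
      p_hvac_fail = 0.02      # ventilation component failure
      p_filter_fail = 0.05    # filtration failure
      p_source = 0.10         # a contaminant source is present

      # OR gate: the engineered barrier fails if either component fails.
      p_barrier_fail = 1 - (1 - p_hvac_fail) * (1 - p_filter_fail)
      # AND gate: contamination requires a source AND a failed barrier.
      p_contamination = p_source * p_barrier_fail
      print(f"P(building contamination) = {p_contamination:.4f}")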

  16. Assessment and improvement of statistical tools for comparative proteomics analysis of sparse data sets with few experimental replicates.

    PubMed

    Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard

    2013-09-06

    Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
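
    A simplified sketch of a rank product statistic adapted to missing values by taking the geometric mean over the available replicates; this is one possible adaptation shown for illustration, not necessarily the authors' improved algorithm, and significance would in practice be assessed by permutation.

      import numpy as np

      rng = np.random.default_rng(7)

      # Simulated log fold-changes: 1000 features x 3 replicates, roughly 50% missing.
      data = rng.normal(0, 1, size=(1000, 3))
      data[:20] += 2.0                                   # 20 truly up-regulated features
      data[rng.random(data.shape) < 0.5] = np.nan        # inject missing values

      # Rank within each replicate so strong up-regulation gets small ranks; skip NaNs.
      ranks = np.full(data.shape, np.nan)
      for j in range(data.shape[1]):
          col = data[:, j]
          ok = ~np.isnan(col)
          ranks[ok, j] = np.argsort(np.argsort(-col[ok])) + 1

      # Rank product as the geometric mean of the available ranks per feature.
      keep = ~np.isnan(ranks).all(axis=1)
      rp = np.full(data.shape[0], np.inf)
      rp[keep] = np.exp(np.nanmean(np.log(ranks[keep]), axis=1))

      print("features with the smallest rank products:", np.argsort(rp)[:10])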

  17. Quinoa - Adaptive Computational Fluid Dynamics, 0.2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bakosi, Jozsef; Gonzalez, Francisco; Rogers, Brandon

    Quinoa is a set of computational tools that enables research and numerical analysis in fluid dynamics. At this time it remains a test-bed to experiment with various algorithms using fully asynchronous runtime systems. Currently, Quinoa consists of the following tools: (1) Walker, a numerical integrator for systems of stochastic differential equations in time. It is a mathematical tool to analyze and design the behavior of stochastic differential equations. It allows the estimation of arbitrary coupled statistics and probability density functions and is currently used for the design of statistical moment approximations for multiple mixing materials in variable-density turbulence. (2) Inciter, an overdecomposition-aware finite element field solver for partial differential equations using 3D unstructured grids. Inciter is used to research asynchronous mesh-based algorithms and to experiment with coupling asynchronous to bulk-synchronous parallel code. Two planned new features of Inciter, compared to the previous release (LA-CC-16-015), to be implemented in 2017, are (a) a simple Navier-Stokes solver for ideal single-material compressible gases, and (b) solution-adaptive mesh refinement (AMR), which enables dynamically concentrating compute resources to regions with interesting physics. Using the NS-AMR problem we plan to explore how to scale such high-load-imbalance simulations, representative of large production multiphysics codes, to very large problems on very large computers using an asynchronous runtime system. (3) RNGTest, a test harness to subject random number generators to stringent statistical tests enabling quantitative ranking with respect to their quality and computational cost. (4) UnitTest, a unit test harness, running hundreds of tests per second, capable of testing serial, synchronous, and asynchronous functions. (5) MeshConv, a mesh file converter that can be used to convert 3D tetrahedron meshes from and to either of the following formats: Gmsh (http://www.geuz.org/gmsh), Netgen (http://sourceforge.net/apps/mediawiki/netgen-mesher), ExodusII (http://sourceforge.net/projects/exodusii), HyperMesh (http://www.altairhyperworks.com/product/HyperMesh).

  18. A novel risk score model for prediction of contrast-induced nephropathy after emergent percutaneous coronary intervention.

    PubMed

    Lin, Kai-Yang; Zheng, Wei-Ping; Bei, Wei-Jie; Chen, Shi-Qun; Islam, Sheikh Mohammed Shariful; Liu, Yong; Xue, Lin; Tan, Ning; Chen, Ji-Yan

    2017-03-01

    Few studies have developed a simple risk model for predicting CIN, which carries a poor prognosis, after emergent PCI. The study aimed to develop and validate a novel tool for predicting the risk of contrast-induced nephropathy (CIN) in patients undergoing emergent percutaneous coronary intervention (PCI). 692 consecutive patients undergoing emergent PCI between January 2010 and December 2013 were randomly (2:1) assigned to a development dataset (n=461) and a validation dataset (n=231). Multivariate logistic regression was applied to identify independent predictors of CIN and to establish a CIN prediction model, whose prognostic accuracy was assessed using the c-statistic for discrimination and the Hosmer-Lemeshow test for calibration. The overall incidence of CIN was 55 (7.9%). A total of 11 variables were analyzed; age >75 years, baseline serum creatinine (SCr) >1.5 mg/dl, hypotension and the use of an intra-aortic balloon pump (IABP) were identified to enter the risk score model (Chen). The incidence of CIN was 32 (6.9%) in the development dataset (low risk (score=0): 1.0%; moderate risk (score 1-2): 13.4%; high risk (score≥3): 90.0%). Compared to the classical Mehran and ACEF CIN risk score models, the risk score (Chen) across the subgroups of the study population exhibited similar discrimination and predictive ability for CIN (c-statistic: 0.828, 0.776, 0.853, respectively), in-hospital mortality, and 2- and 3-year mortality (c-statistic: 0.738, 0.750, 0.845, respectively) in the validation population. Our data showed that this simple risk model exhibited good discrimination and predictive ability for CIN, similar to the Mehran and ACEF scores, and even for long-term mortality after emergent PCI. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
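
    A sketch of how such a bedside score is applied; the one-point-per-factor weighting below is an assumption made for illustration only (the published model assigns its own weights), while the risk levels echo the figures quoted above.

      def cin_risk(age_over_75, scr_over_1_5_mg_dl, hypotension, iabp_used):
          # Hypothetical scoring: one point per present risk factor.
          score = sum([age_over_75, scr_over_1_5_mg_dl, hypotension, iabp_used])
          if score == 0:
              return score, "low risk (about 1% CIN in the development set)"
          if score <= 2:
              return score, "moderate risk (about 13% CIN)"
          return score, "high risk (up to 90% CIN)"

      print(cin_risk(age_over_75=True, scr_over_1_5_mg_dl=False,
                     hypotension=True, iabp_used=False))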

  19. Uncertainty Quantification Techniques for Population Density Estimates Derived from Sparse Open Source Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, Robert N; White, Devin A; Urban, Marie L

    2013-01-01

    The Population Density Tables (PDT) project at the Oak Ridge National Laboratory (www.ornl.gov) is developing population density estimates for specific human activities under normal patterns of life based largely on information available in open source. Currently, activity based density estimates are based on simple summary data statistics such as range and mean. Researchers are interested in improving activity estimation and uncertainty quantification by adopting a Bayesian framework that considers both data and sociocultural knowledge. Under a Bayesian approach knowledge about population density may be encoded through the process of expert elicitation. Due to the scale of the PDT effort, which considers over 250 countries, spans 40 human activity categories, and includes numerous contributors, an elicitation tool is required that can be operationalized within an enterprise data collection and reporting system. Such a method would ideally require that the contributor have minimal statistical knowledge, require minimal input by a statistician or facilitator, consider human difficulties in expressing qualitative knowledge in a quantitative setting, and provide methods by which the contributor can appraise whether their understanding and associated uncertainty was well captured. This paper introduces an algorithm that transforms answers to simple, non-statistical questions into a bivariate Gaussian distribution as the prior for the Beta distribution. Based on geometric properties of the Beta distribution parameter feasibility space and the bivariate Gaussian distribution, an automated method for encoding is developed that responds to these challenging enterprise requirements. Though created within the context of population density, this approach may be applicable to a wide array of problem domains requiring informative priors for the Beta distribution.

  20. Symmetric log-domain diffeomorphic Registration: a demons-based approach.

    PubMed

    Vercauteren, Tom; Pennec, Xavier; Perchant, Aymeric; Ayache, Nicholas

    2008-01-01

    Modern morphometric studies use non-linear image registration to compare anatomies and perform group analysis. Recently, log-Euclidean approaches have contributed to promote the use of such computational anatomy tools by permitting simple computations of statistics on a rather large class of invertible spatial transformations. In this work, we propose a non-linear registration algorithm perfectly fit for log-Euclidean statistics on diffeomorphisms. Our algorithm works completely in the log-domain, i.e. it uses a stationary velocity field. This implies that we guarantee the invertibility of the deformation and have access to the true inverse transformation. This also means that our output can be directly used for log-Euclidean statistics without relying on the heavy computation of the log of the spatial transformation. As it is often desirable, our algorithm is symmetric with respect to the order of the input images. Furthermore, we use an alternate optimization approach related to Thirion's demons algorithm to provide a fast non-linear registration algorithm. First results show that our algorithm outperforms both the demons algorithm and the recently proposed diffeomorphic demons algorithm in terms of accuracy of the transformation while remaining computationally efficient.

  1. The epistemological status of general circulation models

    NASA Astrophysics Data System (ADS)

    Loehle, Craig

    2018-03-01

    Forecasts of both likely anthropogenic effects on climate and consequent effects on nature and society are based on large, complex software tools called general circulation models (GCMs). Forecasts generated by GCMs have been used extensively in policy decisions related to climate change. However, the relation between underlying physical theories and results produced by GCMs is unclear. In the case of GCMs, many discretizations and approximations are made, and simulating Earth system processes is far from simple and currently leads to some results with unknown energy balance implications. Statistical testing of GCM forecasts for degree of agreement with data would facilitate assessment of fitness for use. If model results need to be put on an anomaly basis due to model bias, then both visual and quantitative measures of model fit depend strongly on the reference period used for normalization, making testing problematic. Epistemology is here applied to problems of statistical inference during testing, the relationship between the underlying physics and the models, the epistemic meaning of ensemble statistics, problems of spatial and temporal scale, the existence or not of an unforced null for climate fluctuations, the meaning of existing uncertainty estimates, and other issues. Rigorous reasoning entails carefully quantifying levels of uncertainty.

  2. Multivariate model of female black bear habitat use for a Geographic Information System

    USGS Publications Warehouse

    Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.

    1993-01-01

    Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
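
    A minimal sketch of the Mahalanobis distance calculation on hypothetical habitat variables (elevation, slope and distance to roads only): the mean and covariance of the variables at used locations define the "ideal", and every grid cell receives a squared distance from it, with small values indicating high use potential.

      import numpy as np

      rng = np.random.default_rng(5)

      # Habitat variables at bear radio locations (hypothetical): elevation [m],
      # slope [deg], distance to roads [m].
      use = rng.normal([450, 15, 800], [60, 5, 200], size=(120, 3))
      mean = use.mean(axis=0)
      cov_inv = np.linalg.inv(np.cov(use, rowvar=False))

      def mahalanobis_sq(cells):
          d = cells - mean
          return np.einsum("ij,jk,ik->i", d, cov_inv, d)

      # Every cell of a toy 100 x 100 grid gets a distance value.
      grid = rng.normal([420, 20, 600], [120, 10, 500], size=(100 * 100, 3))
      d2_map = mahalanobis_sq(grid).reshape(100, 100)
      # 7.81 is the 95th percentile of a chi-square with 3 degrees of freedom.
      print("fraction of cells flagged as high use potential:", np.mean(d2_map < 7.81))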

  3. An Easy Tool to Predict Survival in Patients Receiving Radiation Therapy for Painful Bone Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Westhoff, Paulien G., E-mail: p.g.westhoff@umcutrecht.nl; Graeff, Alexander de; Monninkhof, Evelyn M.

    2014-11-15

    Purpose: Patients with bone metastases have a widely varying survival. A reliable estimation of survival is needed for appropriate treatment strategies. Our goal was to assess the value of simple prognostic factors, namely, patient and tumor characteristics, Karnofsky performance status (KPS), and patient-reported scores of pain and quality of life, to predict survival in patients with painful bone metastases. Methods and Materials: In the Dutch Bone Metastasis Study, 1157 patients were treated with radiation therapy for painful bone metastases. At randomization, physicians determined the KPS; patients rated general health on a visual analogue scale (VAS-gh), valuation of life on a verbal rating scale (VRS-vl) and pain intensity. To assess the predictive value of the variables, we used multivariate Cox proportional hazard analyses and C-statistics for discriminative value. Of the final model, calibration was assessed. External validation was performed on a dataset of 934 patients who were treated with radiation therapy for vertebral metastases. Results: Patients had mainly breast (39%), prostate (23%), or lung cancer (25%). After a maximum of 142 weeks' follow-up, 74% of patients had died. The best predictive model included sex, primary tumor, visceral metastases, KPS, VAS-gh, and VRS-vl (C-statistic = 0.72, 95% CI = 0.70-0.74). A reduced model, with only KPS and primary tumor, showed comparable discriminative capacity (C-statistic = 0.71, 95% CI = 0.69-0.72). External validation showed a C-statistic of 0.72 (95% CI = 0.70-0.73). Calibration of the derivation and the validation dataset showed underestimation of survival. Conclusion: In predicting survival in patients with painful bone metastases, KPS combined with primary tumor was comparable to a more complex model. Considering the amount of variables in complex models and the additional burden on patients, the simple model is preferred for daily use. In addition, a risk table for survival is provided.

  4. Lessons learned developing a diagnostic tool for HIV-associated dementia feasible to implement in resource-limited settings: pilot testing in Kenya.

    PubMed

    Kwasa, Judith; Cettomai, Deanna; Lwanya, Edwin; Osiemo, Dennis; Oyaro, Patrick; Birbeck, Gretchen L; Price, Richard W; Bukusi, Elizabeth A; Cohen, Craig R; Meyer, Ana-Claire L

    2012-01-01

    To conduct a preliminary evaluation of the utility and reliability of a diagnostic tool for HIV-associated dementia (HAD) for use by primary health care workers (HCW) which would be feasible to implement in resource-limited settings. In resource-limited settings, HAD is an indication for anti-retroviral therapy regardless of CD4 T-cell count. Anti-retroviral therapy, the treatment for HAD, is now increasingly available in resource-limited settings. Nonetheless, HAD remains under-diagnosed likely because of limited clinical expertise and availability of diagnostic tests. Thus, a simple diagnostic tool which is practical to implement in resource-limited settings is an urgent need. A convenience sample of 30 HIV-infected outpatients was enrolled in Western Kenya. We assessed the sensitivity and specificity of a diagnostic tool for HAD as administered by a primary HCW. This was compared to an expert clinical assessment which included examination by a physician, neuropsychological testing, and in selected cases, brain imaging. Agreement between HCW and an expert examiner on certain tool components was measured using Kappa statistic. The sample was 57% male, mean age was 38.6 years, mean CD4 T-cell count was 323 cells/µL, and 54% had less than a secondary school education. Six (20%) of the subjects were diagnosed with HAD by expert clinical assessment. The diagnostic tool was 63% sensitive and 67% specific for HAD. Agreement between HCW and expert examiners was poor for many individual items of the diagnostic tool (K = .03-.65). This diagnostic tool had moderate sensitivity and specificity for HAD. However, reliability was poor, suggesting that substantial training and formal evaluations of training adequacy will be critical to enable HCW to reliably administer a brief diagnostic tool for HAD.
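
    Agreement between the health care worker and the expert examiner can be quantified with Cohen's kappa; the ratings below are hypothetical and only show the computation for a single binary tool item.

      from sklearn.metrics import cohen_kappa_score

      # Hypothetical ratings (0 = sign absent, 1 = present) on 30 patients.
      hcw =    [1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
      expert = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0]

      print(f"Cohen's kappa: {cohen_kappa_score(hcw, expert):.2f}")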

  5. Sampling and sensitivity analyses tools (SaSAT) for computational modelling

    PubMed Central

    Hoare, Alexander; Regan, David G; Wilson, David P

    2008-01-01

    SaSAT (Sampling and Sensitivity Analysis Tools) is a user-friendly software package for applying uncertainty and sensitivity analyses to mathematical and computational models of arbitrary complexity and context. The toolbox is built in Matlab®, a numerical mathematical software package, and utilises algorithms contained in the Matlab® Statistics Toolbox. However, Matlab® is not required to use SaSAT as the software package is provided as an executable file with all the necessary supplementary files. The SaSAT package is also designed to work seamlessly with Microsoft Excel but no functionality is forfeited if that software is not available. A comprehensive suite of tools is provided to enable the following tasks to be easily performed: efficient and equitable sampling of parameter space by various methodologies; calculation of correlation coefficients; regression analysis; factor prioritisation; and graphical output of results, including response surfaces, tornado plots, and scatterplots. Use of SaSAT is exemplified by application to a simple epidemic model. To our knowledge, a number of the methods available in SaSAT for performing sensitivity analyses have not previously been used in epidemiological modelling and their usefulness in this context is demonstrated. PMID:18304361
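
    SaSAT itself is a Matlab toolbox; the Python sketch below only illustrates the underlying idea of Latin hypercube sampling of parameter space followed by a rank-based sensitivity measure, on a toy model with assumed parameter ranges.

      import numpy as np
      from scipy.stats import qmc, spearmanr

      # Latin hypercube sample of two epidemic-model parameters (assumed ranges).
      sampler = qmc.LatinHypercube(d=2, seed=0)
      params = qmc.scale(sampler.random(n=500), l_bounds=[0.1, 0.05], u_bounds=[0.5, 0.2])

      # Toy model output: basic reproduction number R0 = beta / gamma.
      r0 = params[:, 0] / params[:, 1]

      # Rank correlation of the output with each sampled parameter.
      for name, col in zip(["beta", "gamma"], params.T):
          rho, _ = spearmanr(col, r0)
          print(f"Spearman rank correlation of R0 with {name}: {rho:+.2f}")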

  6. Megasite Management Tool (mmt): a Decision Support System Built Using Mapwindow Activex Control

    NASA Astrophysics Data System (ADS)

    Pulsani, B. R.

    2017-11-01

    Megasite Management Tool (MMT) is a planning and evaluation software package for contaminated sites. Using different statistical modules, MMT produces maps which help decision makers in rehabilitating contaminated sites. The input data used by MMT are geographic in nature and exist in shapefile and raster formats. As MMT is built as a simple Windows Forms application, the objective of the study was to find a way to visualize geographic data and to allow the user to edit its attribute information. Therefore, the application requirement was to find GIS libraries which offer capabilities such as (1) a map viewer with navigation tools, (2) a library to read/write geographic data, and (3) software which allows free distribution of the developed components. Research into these requirements led to the discovery of MapWindow ActiveX components which not only offered these capabilities but also provided free and open source licensing options for redistribution. Although a considerable number of reports and publications exist on MMT, the major contribution provided by the MapWindow libraries has been underplayed. The current study emphasises the contribution and advantages MapWindow ActiveX provides for incorporating GIS functionality into an existing application. Similar components for other languages have also been reviewed.

  7. Diagnostic tools for mixing models of stream water chemistry

    USGS Publications Warehouse

    Hooper, Richard P.

    2003-01-01

    Mixing models provide a useful null hypothesis against which to evaluate processes controlling stream water chemical data. Because conservative mixing of end‐members with constant concentration is a linear process, a number of simple mathematical and multivariate statistical methods can be applied to this problem. Although mixing models have been most typically used in the context of mixing soil and groundwater end‐members, an extension of the mathematics of mixing models is presented that assesses the “fit” of a multivariate data set to a lower dimensional mixing subspace without the need for explicitly identified end‐members. Diagnostic tools are developed to determine the approximate rank of the data set and to assess lack of fit of the data. This permits identification of processes that violate the assumptions of the mixing model and can suggest the dominant processes controlling stream water chemical variation. These same diagnostic tools can be used to assess the fit of the chemistry of one site into the mixing subspace of a different site, thereby permitting an assessment of the consistency of controlling end‐members across sites. This technique is applied to a number of sites at the Panola Mountain Research Watershed located near Atlanta, Georgia.
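
    A minimal sketch of the rank-assessment idea with synthetic chemistry generated from two end-members: standardize the solute concentrations, project them onto the leading principal components, and examine how much residual variance each candidate rank leaves (structure in the residuals, not shown here, would signal a violated mixing assumption).

      import numpy as np

      rng = np.random.default_rng(11)

      # Synthetic stream chemistry: 200 samples x 5 solutes from 2 end-members plus noise.
      end_members = np.array([[1.0, 0.2, 5.0, 0.1, 2.0],
                              [0.1, 1.5, 1.0, 0.8, 0.3]])
      f = rng.uniform(0, 1, size=(200, 1))                 # mixing fractions
      C = f @ end_members[:1] + (1 - f) @ end_members[1:] + rng.normal(0, 0.02, (200, 5))

      X = (C - C.mean(0)) / C.std(0)                       # standardized concentrations
      U, s, Vt = np.linalg.svd(X, full_matrices=False)

      for k in (1, 2, 3):
          Xk = (U[:, :k] * s[:k]) @ Vt[:k]                 # rank-k mixing subspace
          lack_of_fit = np.sum((X - Xk) ** 2) / np.sum(X ** 2)
          print(f"rank {k}: residual variance fraction = {lack_of_fit:.3f}")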

  8. Geena 2, improved automated analysis of MALDI/TOF mass spectra.

    PubMed

    Romano, Paolo; Profumo, Aldo; Rocco, Mattia; Mangerini, Rosa; Ferri, Fabio; Facchiano, Angelo

    2016-03-02

    Mass spectrometry (MS) is producing high volumes of data supporting oncological sciences, especially for translational research. Most of the related processing can be carried out by combining existing tools at different levels, but little is currently available for the automation of the fundamental steps. For the analysis of MALDI/TOF spectra, a number of pre-processing steps are required, including joining of isotopic abundances for a given molecular species, normalization of signals against an internal standard, background noise removal, averaging multiple spectra from the same sample, and aligning spectra from different samples. In this paper, we present Geena 2, a public software tool for the automated execution of these pre-processing steps for MALDI/TOF spectra. Geena 2 has been developed in a Linux-Apache-MySQL-PHP web development environment, with scripts in PHP and Perl. Input and output are managed as simple formats that can be consumed by any database system and spreadsheet software. Input data may also be stored in a MySQL database. Processing methods are based on original heuristic algorithms which are introduced in the paper. Three simple and intuitive web interfaces are available: the Standard Search Interface, which allows a complete control over all parameters, the Bright Search Interface, which leaves to the user the possibility to tune parameters for alignment of spectra, and the Quick Search Interface, which limits the number of parameters to a minimum by using default values for the majority of parameters. Geena 2 has been utilized, in conjunction with a statistical analysis tool, in three published experimental works: a proteomic study on the effects of long-term cryopreservation on the low molecular weight fraction of the serum proteome, and two retrospective serum proteomic studies, one on the risk of developing breast cancer in patients affected by gross cystic disease of the breast (GCDB) and the other for the identification of a predictor of breast cancer mortality following breast cancer surgery, whose results were validated by ELISA, a completely alternative method. Geena 2 is a public tool for the automated pre-processing of MS data originating from MALDI/TOF instruments, with a simple and intuitive web interface. It is now under active development for the inclusion of further filtering options and for the adoption of standard formats for MS spectra.
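
    A toy sketch of three of the pre-processing steps listed above (background removal, normalization against an internal standard, and averaging of replicates) on synthetic spectra; Geena 2's own heuristic algorithms are more elaborate, and the internal-standard position used here is an assumption.

      import numpy as np

      rng = np.random.default_rng(2)

      # Three replicate MALDI/TOF spectra of one sample on a common m/z axis.
      mz = np.linspace(1000, 5000, 4000)
      spectra = np.array([80 * np.exp(-((mz - 2500) / 5) ** 2) + rng.normal(10, 1, mz.size)
                          for _ in range(3)])

      # (1) Background removal: subtract a simple per-spectrum baseline estimate.
      spectra = spectra - np.median(spectra, axis=1, keepdims=True)

      # (2) Normalization against an internal standard assumed to sit near m/z 2500.
      window = (mz > 2490) & (mz < 2510)
      spectra = spectra / spectra[:, window].max(axis=1, keepdims=True)

      # (3) Averaging of the replicate spectra from the same sample.
      averaged = spectra.mean(axis=0)
      print("averaged spectrum ready for alignment, length:", averaged.size)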

  9. Exploratory Causal Analysis in Bivariate Time Series Data

    NASA Astrophysics Data System (ADS)

    McCracken, James M.

    Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments, and data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has led to the development of many different time series causality approaches and tools including transfer entropy, convergent cross-mapping (CCM), and Granger causality statistics. In this thesis, the existing time series causality method of CCM is extended by introducing a new method called pairwise asymmetric inference (PAI). It is found that CCM may provide counter-intuitive causal inferences for simple dynamics with strong intuitive notions of causality, and the CCM causal inference can be a function of physical parameters that are seemingly unrelated to the existence of a driving relationship in the system. For example, a CCM causal inference might alternate between "voltage drives current" and "current drives voltage" as the frequency of the voltage signal is changed in a series circuit with a single resistor and inductor. PAI is introduced to address both of these limitations. Many of the current approaches in the time series causality literature are not computationally straightforward to apply, do not follow directly from assumptions of probabilistic causality, depend on assumed models for the time series generating process, or rely on embedding procedures. A new approach, called causal leaning, is introduced in this work to avoid these issues. The leaning is found to provide causal inferences that agree with intuition for both simple systems and more complicated empirical examples, including space weather data sets. The leaning may provide a clearer interpretation of the results than those from existing time series causality tools. A practicing analyst can explore the literature to find many proposals for identifying drivers and causal connections in time series data sets, but little research exists on how these tools compare to each other in practice. This work introduces and defines exploratory causal analysis (ECA) to address this issue, along with the concept of data causality in the taxonomy of causal studies introduced in this work. The motivation is to provide a framework for exploring potential causal structures in time series data sets. ECA is used on several synthetic and empirical data sets, and it is found that all of the tested time series causality tools agree with each other (and intuitive notions of causality) for many simple systems but can provide conflicting causal inferences for more complicated systems. It is proposed that such disagreements between different time series causality tools during ECA might provide deeper insight into the data than could be found otherwise.

  10. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  11. Water Conservation Measures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ian Metzger, Jesse Dean

    2010-12-31

    This software requires inputs of simple water fixture inventory information and calculates the water/energy and cost benefits of various retrofit opportunities. This tool includes water conservation measures for: Low-flow Toilets, Low-flow Urinals, Low-flow Faucets, and Low-flow Showerheads. This tool calculates water savings, energy savings, demand reduction, cost savings, and building life cycle costs including: simple payback, discounted payback, net-present value, and savings to investment ratio. In addition, this tool displays the environmental benefits of a project.
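
    The record does not give the formulas the tool uses; the sketch below simply illustrates the standard life-cycle metrics it reports (simple payback, discounted payback, net present value, and savings-to-investment ratio) with hypothetical retrofit costs and savings.

      def life_cycle_metrics(install_cost, annual_savings, discount_rate, years):
          """Standard retrofit economics metrics (illustrative, constant annual savings)."""
          simple_payback = install_cost / annual_savings
          discounted = [annual_savings / (1 + discount_rate) ** t for t in range(1, years + 1)]
          npv = sum(discounted) - install_cost
          sir = sum(discounted) / install_cost            # savings-to-investment ratio
          cumulative, discounted_payback = 0.0, None
          for t, cash in enumerate(discounted, start=1):
              cumulative += cash
              if discounted_payback is None and cumulative >= install_cost:
                  discounted_payback = t
          return simple_payback, discounted_payback, npv, sir

      # Hypothetical example: a $1,200 low-flow fixture retrofit saving $300/year for 15 years at 3%.
      print(life_cycle_metrics(1200.0, 300.0, 0.03, 15))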

  12. The change and development of statistical methods used in research articles in child development 1930-2010.

    PubMed

    Køppe, Simo; Dammeyer, Jesper

    2014-09-01

    The evolution of developmental psychology has been characterized by the use of different quantitative and qualitative methods and procedures. But how does the use of methods and procedures change over time? This study explores the change and development of statistical methods used in articles published in Child Development from 1930 to 2010. The methods used in every article in the first issue of every volume were categorized into four categories. Until 1980, relatively simple statistical methods were used. During the last 30 years, the use of more advanced statistical methods has grown explosively, and the absence of statistical methods, or the use of only simple ones, has been all but eliminated.

  13. The Taguchi methodology as a statistical tool for biotechnological applications: a critical appraisal.

    PubMed

    Rao, Ravella Sreenivas; Kumar, C Ganesh; Prakasham, R Shetty; Hobbs, Phil J

    2008-04-01

    Success in experiments and/or technology mainly depends on a properly designed process or product. The traditional method of process optimization involves the study of one variable at a time, which requires a number of combinations of experiments that are time-, cost- and labor-intensive. The Taguchi method of design of experiments is a simple statistical tool involving a system of tabulated designs (arrays) that allows a maximum number of main effects to be estimated in an unbiased (orthogonal) fashion with a minimum number of experimental runs. It has been applied to identify the significant contribution of the design variable(s) and the optimum combination of variable levels by conducting experiments on a real-time basis. The modeling that is performed essentially relates the signal-to-noise ratio to the control variables in a 'main effect only' approach. This approach enables both multiple response and dynamic problems to be studied by handling noise factors. Taguchi principles and concepts have made extensive contributions to industry by bringing focused awareness to robustness, noise and quality. This methodology has been widely applied in many industrial sectors; however, its application in biological sciences has been limited. In the present review, the application and comparison of the Taguchi methodology have been emphasized with specific case studies in the field of biotechnology, particularly in diverse areas like fermentation, food processing, molecular biology, wastewater treatment and bioremediation.
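
    As a rough illustration of the machinery described above (not taken from the review), the sketch below computes the common "larger is better" signal-to-noise ratio for each run of a standard L4 orthogonal array and estimates factor main effects; the factor assignment and yield data are hypothetical.

      import numpy as np

      def sn_larger_is_better(replicates):
          """Taguchi 'larger is better' signal-to-noise ratio for one experimental run."""
          y = np.asarray(replicates, dtype=float)
          return -10.0 * np.log10(np.mean(1.0 / y**2))

      # Standard L4 orthogonal array: 4 runs, 3 two-level factors.
      L4 = np.array([[1, 1, 1],
                     [1, 2, 2],
                     [2, 1, 2],
                     [2, 2, 1]])
      yields = [[8.1, 8.3], [9.0, 8.8], [7.2, 7.5], [9.6, 9.4]]   # two replicates per run
      sn = np.array([sn_larger_is_better(r) for r in yields])

      # Main effect of each factor: mean S/N at level 2 minus mean S/N at level 1.
      for factor in range(L4.shape[1]):
          effect = sn[L4[:, factor] == 2].mean() - sn[L4[:, factor] == 1].mean()
          print(f"factor {factor + 1}: main effect = {effect:.2f} dB")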

  14. Hydrophobicity diversity in globular and nonglobular proteins measured with the Gini index.

    PubMed

    Carugo, Oliviero

    2017-12-01

    Amino acids and their properties are variably distributed in proteins, and different compositions determine all protein features, ranging from solubility to stability and functionality. The Gini index, a tool to estimate distribution uniformity, is widely used in macroeconomics and has numerous statistical applications. Here, the Gini index is used to analyze the distribution of hydrophobicity in proteins and to compare hydrophobicity distribution in globular and intrinsically disordered proteins. Based on the analysis of carefully selected high-quality data sets of proteins extracted from the Protein Data Bank (http://www.rcsb.org) and from the DisProt database (http://www.disprot.org/), it is observed that hydrophobicity is distributed in a more diverse way in intrinsically disordered proteins than in folded and soluble globular proteins. This correlates with the observation that the amino acid composition deviates from uniformity (estimated with the Shannon and Gini-Simpson indices) more in intrinsically disordered proteins than in globular and soluble proteins. Although statistical tools like the Gini index have received little attention in molecular biology, these results show that they allow one to estimate sequence diversity and that they are useful to delineate trends that can hardly be described, otherwise, in simple and concise ways. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
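
    The curated PDB/DisProt data sets analyzed in the paper are not reproduced here; the snippet below only shows the standard Gini index calculation applied to hypothetical per-residue hydrophobicity values, assumed to be shifted so that they are non-negative.

      import numpy as np

      def gini_index(values):
          """Gini index of non-negative values (0 = perfectly uniform, 1 = maximally diverse)."""
          x = np.sort(np.asarray(values, dtype=float))
          n = x.size
          ranks = np.arange(1, n + 1)
          return (2.0 * np.sum(ranks * x)) / (n * np.sum(x)) - (n + 1) / n

      # Hypothetical per-residue hydrophobicity scores for two short sequences.
      globular   = [0.8, 0.7, 0.9, 0.8, 0.7, 0.9, 0.8, 0.8]
      disordered = [0.1, 1.5, 0.2, 1.8, 0.1, 0.3, 1.7, 0.2]
      print(gini_index(globular), gini_index(disordered))   # the second set is more diverse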

  15. Uterine Cancer Statistics

    MedlinePlus

    ... the most commonly diagnosed gynecologic cancer. U.S. Cancer Statistics Data Visualizations Tool ...

  16. Design of Friction Stir Spot Welding Tools by Using a Novel Thermal-Mechanical Approach

    PubMed Central

    Su, Zheng-Ming; Qiu, Qi-Hong; Lin, Pai-Chen

    2016-01-01

    A simple thermal-mechanical model for friction stir spot welding (FSSW) was developed to obtain similar weld performance for different weld tools. Use of the thermal-mechanical model and a combined approach enabled the design of weld tools for various sizes but similar qualities. Three weld tools for weld radii of 4, 5, and 6 mm were made to join 6061-T6 aluminum sheets. Performance evaluations of the three weld tools compared fracture behavior, microstructure, micro-hardness distribution, and welding temperature of welds in lap-shear specimens. For welds made by the three weld tools under identical processing conditions, failure loads were approximately proportional to tool size. Failure modes, microstructures, and micro-hardness distributions were similar. Welding temperatures correlated with frictional heat generation rate densities. Because the three weld tools sufficiently met all design objectives, the proposed approach is considered a simple and feasible guideline for preliminary tool design. PMID:28773800

  17. Design of Friction Stir Spot Welding Tools by Using a Novel Thermal-Mechanical Approach.

    PubMed

    Su, Zheng-Ming; Qiu, Qi-Hong; Lin, Pai-Chen

    2016-08-09

    A simple thermal-mechanical model for friction stir spot welding (FSSW) was developed to obtain similar weld performance for different weld tools. Use of the thermal-mechanical model and a combined approach enabled the design of weld tools for various sizes but similar qualities. Three weld tools for weld radii of 4, 5, and 6 mm were made to join 6061-T6 aluminum sheets. Performance evaluations of the three weld tools compared fracture behavior, microstructure, micro-hardness distribution, and welding temperature of welds in lap-shear specimens. For welds made by the three weld tools under identical processing conditions, failure loads were approximately proportional to tool size. Failure modes, microstructures, and micro-hardness distributions were similar. Welding temperatures correlated with frictional heat generation rate densities. Because the three weld tools sufficiently met all design objectives, the proposed approach is considered a simple and feasible guideline for preliminary tool design.

  18. LOD significance thresholds for QTL analysis in experimental populations of diploid species

    PubMed

    Van Ooijen JW

    1999-11-01

    Linkage analysis with molecular genetic markers is a very powerful tool in the biological research of quantitative traits. The lack of an easy way to determine which regions of the genome can be declared statistically significant for containing a gene affecting the quantitative trait of interest hampers the important prediction of the false-positive rate. In this paper four tables, obtained by large-scale simulations, are presented that can be used with a simple formula to obtain the false-positive rate for analyses of the standard types of experimental populations with diploid species, with any size of genome. A new definition of the term 'suggestive linkage' is proposed that allows a more objective comparison of results across species.

  19. Validity criteria for Fermi's golden rule scattering rates applied to metallic nanowires.

    PubMed

    Moors, Kristof; Sorée, Bart; Magnus, Wim

    2016-09-14

    Fermi's golden rule underpins the investigation of mobile carriers propagating through various solids, being a standard tool to calculate their scattering rates. As such, it provides a perturbative estimate under the implicit assumption that the effect of the interaction Hamiltonian which causes the scattering events is sufficiently small. To check the validity of this assumption, we present a general framework to derive simple validity criteria in order to assess whether the scattering rates can be trusted for the system under consideration, given its statistical properties such as average size, electron density, impurity density et cetera. We derive concrete validity criteria for metallic nanowires with conduction electrons populating a single parabolic band subjected to different elastic scattering mechanisms: impurities, grain boundaries and surface roughness.

  20. An instructive model of entropy

    NASA Astrophysics Data System (ADS)

    Zimmerman, Seth

    2010-09-01

    This article first notes the misinterpretation of a common thought experiment, and the misleading comment that 'systems tend to flow from less probable to more probable macrostates'. It analyses the experiment, generalizes it and introduces a new tool of investigation, the simplectic structure. A time-symmetric model is built upon this structure, yielding several non-intuitive results. The approach is combinatorial rather than statistical, and assumes that entropy is equivalent to 'missing information'. The intention of this article is not only to present interesting results, but also, by deliberately starting with a simple example and developing it through proof and computer simulation, to clarify the often confusing subject of entropy. The article should be particularly stimulating to students and instructors of discrete mathematics or undergraduate physics.

  1. Time studies in A&E departments--a useful tool for management.

    PubMed

    Aharonson-Daniel, L; Fung, H; Hedley, A J

    1996-01-01

    A time and motion study was conducted in an accident and emergency (A&E) department in a Hong Kong Government hospital in order to suggest solutions for severe queuing problems found in A&E. The study provided useful information about the patterns of arrival and service; the throughput; and the factors that influence the length of the queue at the A&E department. Plans for building a computerized simulation model were dropped as new intelligence generated by the study enabled problem solving using simple statistical analysis and common sense. Demonstrates some potential benefits for management in applying operations research methods in busy clinical working environments. The implementation of the recommendations made by this study successfully eliminated queues in A&E.

  2. A Theorem on the Rank of a Product of Matrices with Illustration of Its Use in Goodness of Fit Testing.

    PubMed

    Satorra, Albert; Neudecker, Heinz

    2015-12-01

    This paper develops a theorem that facilitates computing the degrees of freedom of Wald-type chi-square tests for moment restrictions when there is rank deficiency of key matrices involved in the definition of the test. An if and only if (iff) condition is developed for a simple rule of difference of ranks to be used when computing the desired degrees of freedom of the test. The theorem is developed by exploiting basic tools of matrix algebra. The theorem is shown to play a key role in proving the asymptotic chi-squaredness of a goodness of fit test in moment structure analysis, and in finding the degrees of freedom of this chi-square statistic.

  3. BiDiBlast: comparative genomics pipeline for the PC.

    PubMed

    de Almeida, João M G C F

    2010-06-01

    Bi-directional BLAST is a simple approach to detect, annotate, and analyze candidate orthologous or paralogous sequences in a single go. This procedure is usually confined to the realm of customized Perl scripts, typically tuned for UNIX-like environments. Porting those scripts to other operating systems involves refactoring them, and also the installation of the Perl programming environment with the required libraries. To overcome these limitations, a data pipeline was implemented in Java. This application submits two batches of sequences to local versions of the NCBI BLAST tool, manages result lists, and refines both bi-directional and simple hits. GO Slim terms are attached to hits, several statistics are derived, and molecular evolution rates are estimated through PAML. The results are written to a set of delimited text tables intended for further analysis. The provided graphical user interface allows friendly interaction with this application, which is documented and available to download at http://moodle.fct.unl.pt/course/view.php?id=2079 or https://sourceforge.net/projects/bidiblast/ under the GNU GPL license. Copyright 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
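
    BiDiBlast itself is a Java pipeline wrapping local NCBI BLAST; the sketch below only illustrates the reciprocal best-hit logic at its core, applied to hypothetical best-hit tables of the kind that would normally be parsed from BLAST tabular output.

      def reciprocal_best_hits(hits_ab, hits_ba):
          """Candidate orthologs as reciprocal best BLAST hits.

          hits_ab : dict mapping each query in genome A to its best hit in genome B.
          hits_ba : dict mapping each query in genome B to its best hit in genome A.
          """
          return {(a, b) for a, b in hits_ab.items() if hits_ba.get(b) == a}

      # Hypothetical best-hit tables.
      hits_ab = {"geneA1": "geneB7", "geneA2": "geneB3", "geneA3": "geneB9"}
      hits_ba = {"geneB7": "geneA1", "geneB3": "geneA5", "geneB9": "geneA3"}
      print(reciprocal_best_hits(hits_ab, hits_ba))   # {('geneA1', 'geneB7'), ('geneA3', 'geneB9')}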

  4. Pulling My Gut out--Simple Tools for Engaging Students in Gross Anatomy Lectures

    ERIC Educational Resources Information Center

    Chan, Lap Ki

    2010-01-01

    A lecture is not necessarily a monologue, promoting only passive learning. If appropriate techniques are used, a lecture can stimulate active learning too. One such method is demonstration, which can engage learners' attention and increase the interaction between the lecturer and the learners. This article describes two simple and useful tools for…

  5. Nomogram for sample size calculation on a straightforward basis for the kappa statistic.

    PubMed

    Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo

    2014-09-01

    Kappa is a widely used measure of agreement. However, it may not be straightforward in some situations, such as sample size calculation, due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation to consider the level of agreement under a certain marginal prevalence in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and a goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. Sample size formulae using a simple proportion of agreement instead of a kappa statistic, together with nomograms that eliminate the inconvenience of using a mathematical formula, were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures. Copyright © 2014 Elsevier Inc. All rights reserved.
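
    The paper's sample size formula (common correlation model plus goodness-of-fit statistic) is not reproduced here; the sketch below only illustrates the kappa paradox that motivates it, showing how the same simple proportion of agreement maps to very different kappa values under different marginal prevalences (assuming two raters sharing one prevalence).

      def kappa_from_agreement(p_agree, prevalence):
          """Cohen's kappa implied by an observed proportion of agreement.

          Assumes two raters with the same marginal prevalence of a positive rating.
          """
          p_chance = prevalence**2 + (1 - prevalence) ** 2   # chance agreement
          return (p_agree - p_chance) / (1 - p_chance)

      # The kappa paradox: identical 90% agreement, very different kappa values.
      print(kappa_from_agreement(0.90, 0.50))   # about 0.80
      print(kappa_from_agreement(0.90, 0.95))   # close to zero despite the same agreement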

  6. [Comparative study of the repair of full thickness tear of the supraspinatus by means of "single row" or "suture bridge" techniques].

    PubMed

    Arroyo-Hernández, M; Mellado-Romero, M A; Páramo-Díaz, P; Martín-López, C M; Cano-Egea, J M; Vilá Y Rico, J

    2015-01-01

    The purpose of this study is to analyze whether there is any difference between arthroscopic repair of full-thickness supraspinatus tears with the single-row technique versus the suture-bridge technique. We conducted a retrospective study of 123 patients with full-thickness supraspinatus tears treated between January 2009 and January 2013 in our hospital. There were 60 single-row repairs and 63 suture-bridge repairs. The mean age in the single-row group was 62.9 years and in the suture-bridge group 63.3 years. There were more women than men in both groups (67%). All patients were evaluated using the Constant test. The mean Constant score in the suture-bridge group was 76.7 and in the single-row group 72.4. We also performed a statistical analysis of each Constant item. Strength was higher in the suture-bridge group, with a statistically significant difference (p = 0.04). The range of movement was also greater in the suture-bridge group, but the difference was not statistically significant. The suture-bridge technique shows better clinical results than single-row repair, but the overall difference is not statistically significant (p = 0.298).

  7. Creating Simple Admin Tools Using Info*Engine and Java

    NASA Technical Reports Server (NTRS)

    Jones, Corey; Kapatos, Dennis; Skradski, Cory; Felkins, J. D.

    2012-01-01

    PTC has provided a simple way to dynamically interact with Windchill using Info*Engine. This presentation will describe how to create simple Info*Engine tasks capable of relieving Windchill 10.0 administrators of tedious work.

  8. A note on the kappa statistic for clustered dichotomous data.

    PubMed

    Zhou, Ming; Yang, Zhao

    2014-06-30

    The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method to calculate the variance of the kappa statistic for clustered physician-patient dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For the clustered physician-patient dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾ 50). The new proposal and sampling-based delta method provide convenient tools for efficient computations and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research data set and two simulated clustered physician-patient dichotomous data sets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
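
    The semi-parametric delta-method estimator proposed in the paper is not reproduced here; the sketch below only illustrates the cluster bootstrap idea it is compared against, using an unweighted kappa and a hypothetical layout in which each physician (cluster) contributes several dichotomous patient ratings.

      import numpy as np

      def cohen_kappa(pairs):
          """Unweighted kappa for dichotomous ratings given as (rater1, rater2) pairs."""
          pairs = np.asarray(pairs)
          po = np.mean(pairs[:, 0] == pairs[:, 1])
          p1, p2 = pairs[:, 0].mean(), pairs[:, 1].mean()
          pe = p1 * p2 + (1 - p1) * (1 - p2)
          return (po - pe) / (1 - pe)

      def cluster_bootstrap_var(clusters, n_boot=2000, seed=1):
          """Bootstrap variance of kappa obtained by resampling whole clusters."""
          rng = np.random.default_rng(seed)
          stats = []
          for _ in range(n_boot):
              sample = [clusters[i] for i in rng.integers(0, len(clusters), len(clusters))]
              stats.append(cohen_kappa(np.vstack(sample)))
          return np.var(stats, ddof=1)

      # Hypothetical clustered data: each physician contributes several patient ratings.
      clusters = [np.array([[1, 1], [1, 0], [0, 0]]),
                  np.array([[1, 1], [0, 0], [0, 0], [1, 1]]),
                  np.array([[0, 1], [1, 1]])]
      print(cohen_kappa(np.vstack(clusters)), cluster_bootstrap_var(clusters))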

  9. How to Recognize Success and Failure: Practical Assessment of an Evolving, First-Semester Laboratory Program Using Simple, Outcome-Based Tools

    ERIC Educational Resources Information Center

    Gron, Liz U.; Bradley, Shelly B.; McKenzie, Jennifer R.; Shinn, Sara E.; Teague, M. Warfield

    2013-01-01

    This paper presents the use of simple, outcome-based assessment tools to design and evaluate the first semester of a new introductory laboratory program created to teach green analytical chemistry using environmental samples. This general chemistry laboratory program, like many introductory courses, has a wide array of stakeholders within and…

  10. A simple quality assurance test tool for the visual verification of light and radiation field congruent using electronic portal images device and computed radiography

    PubMed Central

    2012-01-01

    Background The radiation field on most megavoltage radiation therapy units is indicated by a light field projected through the collimator by a light source mounted inside the collimator. The light field is traditionally used for patient alignment. Hence it is imperative that the light field be congruent with the radiation field. Method A simple quality assurance tool has been designed for rapid and simple testing of light and radiation field congruence using an electronic portal imaging device (EPID) or computed radiography (CR). We tested this QA tool using Varian PortalVision and Elekta iViewGT EPID systems and a Kodak CR system. Results Both the single and double exposure techniques were evaluated, with the double exposure technique providing better visualization of the light-radiation field markers. Light and radiation field congruence could be verified to within 1 mm. This satisfies the American Association of Physicists in Medicine Task Group 142 recommendation of a 2 mm tolerance. Conclusion The QA tool can be used with either an EPID or CR to provide a simple and rapid method to verify light and radiation field congruence. PMID:22452821

  11. The Persistence of Mode 1 Technology in the Korean Late Paleolithic

    PubMed Central

    Lee, Hyeong Woo

    2013-01-01

    Ssangjungri (SJ), an open-air site with several Paleolithic horizons, was recently discovered in South Korea. Most of the identified artifacts are simple core and flake tools that indicate an expedient knapping strategy. Bifacially worked core tools, which might be considered non-classic bifaces, also have been found. The prolific horizons at the site were dated by accelerator mass spectrometry (AMS) to about 30 kya. Another newly discovered Paleolithic open-air site, Jeungsan (JS), shows a homogeneous lithic pattern during this period. The dominated artifact types and usage of raw materials are similar in character to those from SJ, although JS yielded a larger number of simple core and flake tools with non-classic bifaces. Chronometric analysis by AMS and optically stimulated luminescence (OSL) indicate that the prime stratigraphic levels at JS also date to approximately 30 kya, and the numerous conjoining pieces indicate that the layers were not seriously affected by post-depositional processes. Thus, it can be confirmed that simple core and flake tools were produced at temporally and culturally independent sites until after 30 kya, supporting the hypothesis of a wide and persistent use of simple technology into the Late Pleistocene. PMID:23724113

  12. Inter-rater reliability of the PIPES tool: validation of a surgical capacity index for use in resource-limited settings.

    PubMed

    Markin, Abraham; Barbero, Roxana; Leow, Jeffrey J; Groen, Reinou S; Perlman, Greg; Habermann, Elizabeth B; Apelgren, Keith N; Kushner, Adam L; Nwomeh, Benedict C

    2014-09-01

    In response to the need for simple, rapid means of quantifying surgical capacity in low resource settings, Surgeons OverSeas (SOS) developed the personnel, infrastructure, procedures, equipment and supplies (PIPES) tool. The present investigation assessed the inter-rater reliability of the PIPES tool. As part of a government assessment of surgical services in Santa Cruz, Bolivia, the PIPES tool was translated into Spanish and applied in interviews with physicians at 31 public hospitals. An additional interview was conducted with nurses at a convenience sample of 25 of these hospitals. Physician and nurse responses were then compared to generate an estimate of reliability. For dichotomous survey items, inter-rater reliability between physicians and nurses was assessed using the Cohen's kappa statistic and percent agreement. The Pearson correlation coefficient was used to assess agreement for continuous items. Cohen's kappa was 0.46 for infrastructure, 0.43 for procedures, 0.26 for equipment, and 0 for supplies sections. The median correlation coefficient was 0.91 for continuous items. Correlation was 0.79 for the PIPES index, and ranged from 0.32 to 0.98 for continuous response items. Reliability of the PIPES tool was moderate for the infrastructure and procedures sections, fair for the equipment section, and poor for supplies section when comparing surgeons' responses to nurses' responses-an extremely rigorous test of reliability. These results indicate that the PIPES tool is an effective measure of surgical capacity but that the equipment and supplies sections may need to be revised.

  13. Development of a New Data Tool for Computing Launch and Landing Availability with Respect to Surface Weather

    NASA Technical Reports Server (NTRS)

    Burns, K. Lee; Altino, Karen

    2008-01-01

    The Marshall Space Flight Center Natural Environments Branch has a long history of expertise in the modeling and computation of statistical launch availabilities with respect to weather conditions. Their existing data analysis product, the Atmospheric Parametric Risk Assessment (APRA) tool, computes launch availability given an input set of vehicle hardware and/or operational weather constraints by calculating the climatological probability of exceeding the specified constraint limits. APRA has been used extensively to provide the Space Shuttle program the ability to estimate impacts that various proposed design modifications would have on overall launch availability. The model accounts for both seasonal and diurnal variability at a single geographic location and provides output probabilities for a single arbitrary launch attempt. Recently, the Shuttle program has shown interest in having additional capabilities added to the APRA model, including analysis of humidity parameters, inclusion of landing site weather to produce landing availability, and concurrent analysis of multiple sites, to assist in operational landing site selection. In addition, the Constellation program has also expressed interest in the APRA tool, and has requested several additional capabilities to address some Constellation-specific issues, both in the specification and verification of design requirements and in the development of operations concepts. The combined scope of the requested capability enhancements suggests an evolution of the model beyond a simple revision process. Development has begun for a new data analysis tool that will satisfy the requests of both programs. This new tool, Probabilities of Atmospheric Conditions and Environmental Risk (PACER), will provide greater flexibility and significantly enhanced functionality compared to the currently existing tool.
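
    The APRA/PACER internals are not described in the record beyond the quantity they compute; the sketch below only illustrates that basic quantity, the climatological probability of exceeding a constraint limit binned by month to capture seasonal variability, using a synthetic peak-wind climatology.

      import numpy as np

      def exceedance_probability(values, months, limit):
          """Climatological probability of exceeding a constraint limit, by month."""
          values, months = np.asarray(values), np.asarray(months)
          return {m: np.mean(values[months == m] > limit) for m in range(1, 13)}

      # Synthetic peak-wind climatology that is windier in winter months.
      rng = np.random.default_rng(11)
      months = rng.integers(1, 13, size=20000)
      values = rng.weibull(2.0, size=20000) * (10 + 4 * np.cos(2 * np.pi * (months - 1) / 12))
      availability = {m: 1 - p for m, p in exceedance_probability(values, months, 18.0).items()}
      print(availability[1], availability[7])   # launch availability in January vs July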

  14. GIS based application tool -- history of East India Company

    NASA Astrophysics Data System (ADS)

    Phophaliya, Sudhir

    The emphasis of the thesis is to build an intuitive and robust Geographic Information Systems (GIS) tool that gives in-depth information on the history of the East India Company. The GIS tool also incorporates various achievements of the East India Company that helped it establish its business all over the world, especially in India. The user has the option to select these movements and acts by clicking on any of the marked states on the World map. The World map also incorporates key features of the East India Company, such as the landing of the East India Company in India, the Darjeeling Tea Establishment, and the East India Company Stock Redemption Act. The user can learn more about these features simply by clicking on each of them. The primary focus of the tool is to give the user a unique insight into the East India Company; for this, the tool has several HTML (Hypertext Markup Language) pages which the user can select. These HTML pages give information on various topics like the first Voyage, Trade with China, and the 1857 Revolt. The tool has been developed in Java. For the Indian map, MOJO (Map Objects Java Objects) is used. MOJO is developed by ESRI. The major features shown on the World map were designed using MOJO. MOJO made it easy to incorporate the statistical data with these features. The user interface was intentionally kept simple and easy to use. To keep the user engaged, key aspects are explained using HTML pages. The idea is that pictures will help the user garner interest in the history of the East India Company.

  15. Estimating time since infection in early homogeneous HIV-1 samples using a poisson model

    PubMed Central

    2010-01-01

    Background The occurrence of a genetic bottleneck in HIV sexual or mother-to-infant transmission has been well documented. This results in a majority of new infections being homogeneous, i.e., initiated by a single genetic strain. Early after infection, prior to the onset of the host immune response, the viral population grows exponentially. In this simple setting, an approach for estimating evolutionary and demographic parameters based on comparison of diversity measures is a feasible alternative to the existing Bayesian methods (e.g., BEAST), which are instead based on the simulation of genealogies. Results We have devised a web tool that analyzes genetic diversity in acutely infected HIV-1 patients by comparing it to a model of neutral growth. More specifically, we consider a homogeneous infection (i.e., initiated by a unique genetic strain) prior to the onset of host-induced selection, where we can assume a random accumulation of mutations. Previously, we have shown that such a model successfully describes about 80% of sexual HIV-1 transmissions provided the samples are drawn early enough in the infection. Violation of the model is an indicator of either heterogeneous infections or the initiation of selection. Conclusions When the underlying assumptions of our model (homogeneous infection prior to selection and fast exponential growth) are met, we are under a very particular scenario for which we can use a forward approach (instead of backwards in time as provided by coalescent methods). This allows for more computationally efficient methods to derive the time since the most recent common ancestor. Furthermore, the tool performs statistical tests on the Hamming distance frequency distribution, and outputs summary statistics (mean of the best fitting Poisson distribution, goodness of fit p-value, etc). The tool runs within minutes and can readily accommodate the tens of thousands of sequences generated through new ultradeep pyrosequencing technologies. The tool is available on the LANL website. PMID:20973976
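
    The LANL web tool itself is not shown here; the following is a minimal sketch of its central idea, fitting a Poisson model to the pairwise Hamming distances of a homogeneous early infection, using short hypothetical sequences. The real tool additionally performs formal goodness-of-fit tests.

      import numpy as np
      from itertools import combinations

      def hamming_distances(seqs):
          """All pairwise Hamming distances among aligned sequences of equal length."""
          return [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]

      # Hypothetical aligned sequences sampled early from a homogeneous infection.
      seqs = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAC", "ACGTACGTAC", "ACTTACGTAC"]
      d = np.array(hamming_distances(seqs))
      lam = d.mean()                     # mean of the best-fitting Poisson distribution
      dispersion = d.var(ddof=1) / lam   # should be near 1 under neutral star-like growth
      print(f"Poisson mean = {lam:.2f}, variance/mean ratio = {dispersion:.2f}")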

  16. Statistical inference for noisy nonlinear ecological dynamic systems.

    PubMed

    Wood, Simon N

    2010-08-26

    Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
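
    A generic, minimal sketch of the synthetic likelihood construction described above, with a toy autoregressive simulator standing in for the dynamic model; the summary statistics, simulator, and parameter values are illustrative assumptions, not those of the blowfly analysis.

      import numpy as np

      def synthetic_log_likelihood(theta, observed_stats, simulate, n_sims=200, seed=0):
          """Gaussian synthetic log-likelihood (up to a constant) of observed statistics."""
          rng = np.random.default_rng(seed)
          sims = np.array([simulate(theta, rng) for _ in range(n_sims)])   # (n_sims, n_stats)
          mu = sims.mean(axis=0)
          cov = np.cov(sims, rowvar=False) + 1e-9 * np.eye(sims.shape[1])  # regularized
          resid = observed_stats - mu
          _, logdet = np.linalg.slogdet(cov)
          return -0.5 * (resid @ np.linalg.solve(cov, resid) + logdet)

      # Toy stand-in for a noisy dynamic model: the summary statistics are the mean
      # and lag-1 autocorrelation of a simulated series driven by parameter theta.
      def simulate(theta, rng):
          x = np.zeros(100)
          for t in range(1, 100):
              x[t] = theta * x[t - 1] + rng.normal()
          return np.array([x.mean(), np.corrcoef(x[:-1], x[1:])[0, 1]])

      observed = simulate(0.7, np.random.default_rng(42))
      for theta in (0.3, 0.7, 0.9):
          print(theta, synthetic_log_likelihood(theta, observed, simulate))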

  17. Analysis of pre-service physics teacher skills designing simple physics experiments based technology

    NASA Astrophysics Data System (ADS)

    Susilawati; Huda, C.; Kurniawan, W.; Masturi; Khoiri, N.

    2018-03-01

    The ability of pre-service physics teachers to design simple experimental set-ups is important both for building students' conceptual understanding and for practicing scientific skills in the laboratory. This study describes the skills of physics students in designing technology-based simple experiments. The experimental design stages include simple tool design and sensor modification. The research method was descriptive, with a sample of 25 students and 5 variations of simple physics experimental design. Based on interviews and observations, the pre-service physics teachers' skill in designing technology-based simple physics experiments is good. The observations show that their skill in designing simple experiments is good, while their skill in modifying and applying sensors is still lacking. This suggests that pre-service physics teachers still need considerable practice in designing physics experiments that use sensor modifications. The interviews indicate that students are highly motivated to engage actively in laboratory activities and are curious to become skilled at building simple practical tools for physics experiments.

  18. Simplified aeroelastic modeling of horizontal axis wind turbines

    NASA Technical Reports Server (NTRS)

    Wendell, J. H.

    1982-01-01

    Certain aspects of the aeroelastic modeling and behavior of the horizontal axis wind turbine (HAWT) are examined. Two simple three degree of freedom models are described in this report, and tools are developed which allow other simple models to be derived. The first simple model developed is an equivalent hinge model to study the flap-lag-torsion aeroelastic stability of an isolated rotor blade. The model includes nonlinear effects, preconing, and noncoincident elastic axis, center of gravity, and aerodynamic center. A stability study is presented which examines the influence of key parameters on aeroelastic stability. Next, two general tools are developed to study the aeroelastic stability and response of a teetering rotor coupled to a flexible tower. The first of these tools is an aeroelastic model of a two-bladed rotor on a general flexible support. The second general tool is a harmonic balance solution method for the resulting second order system with periodic coefficients. The second simple model developed is a rotor-tower model which serves to demonstrate the general tools. This model includes nacelle yawing, nacelle pitching, and rotor teetering. Transient response time histories are calculated and compared to a similar model in the literature. Agreement between the two is very good, especially considering how few harmonics are used. Finally, a stability study is presented which examines the effects of support stiffness and damping, inflow angle, and preconing.

  19. Dynamic PET simulator via tomographic emission projection for kinetic modeling and parametric image studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Häggström, Ida, E-mail: haeggsti@mskcc.org; Beattie, Bradley J.; Schmidtlein, C. Ross

    2016-06-15

    Purpose: To develop and evaluate a fast and simple tool called dPETSTEP (Dynamic PET Simulator of Tracers via Emission Projection), for dynamic PET simulations as an alternative to Monte Carlo (MC), useful for educational purposes and evaluation of the effects of the clinical environment, postprocessing choices, etc., on dynamic and parametric images. Methods: The tool was developed in MATLAB using both new and previously reported modules of PETSTEP (PET Simulator of Tracers via Emission Projection). Time activity curves are generated for each voxel of the input parametric image, whereby effects of imaging system blurring, counting noise, scatters, randoms, and attenuation are simulated for each frame. Each frame is then reconstructed into images according to the user specified method, settings, and corrections. Reconstructed images were compared to MC data, and simple Gaussian noised time activity curves (GAUSS). Results: dPETSTEP was 8000 times faster than MC. Dynamic images from dPETSTEP had a root mean square error that was within 4% on average of that of MC images, whereas the GAUSS images were within 11%. The average bias in dPETSTEP and MC images was the same, while GAUSS differed by 3% points. Noise profiles in dPETSTEP images conformed well to MC images, confirmed visually by scatter plot histograms, and statistically by tumor region of interest histogram comparisons that showed no significant differences (p < 0.01). Compared to GAUSS, dPETSTEP images and noise properties agreed better with MC. Conclusions: The authors have developed a fast and easy one-stop solution for simulations of dynamic PET and parametric images, and demonstrated that it generates both images and subsequent parametric images with very similar noise properties to those of MC images, in a fraction of the time. They believe dPETSTEP to be very useful for generating fast, simple, and realistic results, however since it uses simple scatter and random models it may not be suitable for studies investigating these phenomena. dPETSTEP can be downloaded free of cost from https://github.com/CRossSchmidtlein/dPETSTEP.

  20. Exposure assessment in health assessments for hand-arm vibration syndrome.

    PubMed

    Mason, H J; Poole, K; Young, C

    2011-08-01

    Assessing past cumulative vibration exposure is part of assessing the risk of hand-arm vibration syndrome (HAVS) in workers exposed to hand-arm vibration and invariably forms part of a medical assessment of such workers. To investigate the strength of relationships between the presence and severity of HAVS and different cumulative exposure metrics obtained from a self-reporting questionnaire. Cumulative exposure metrics were constructed from a tool-based questionnaire applied in a group of HAVS referrals and workplace field studies. These metrics included simple years of vibration exposure, cumulative total hours of all tool use and differing combinations of acceleration magnitudes for specific tools and their daily use, including the current frequency-weighting method contained in ISO 5349-1:2001. Use of simple years of exposure is a weak predictor of HAVS or its increasing severity. The calculation of cumulative hours across all vibrating tools used is a more powerful predictor. More complex calculations based on involving likely acceleration data for specific classes of tools, either frequency weighted or not, did not offer a clear further advantage in this dataset. This may be due to the uncertainty associated with workers' recall of their past tool usage or the variability between tools in the magnitude of their vibration emission. Assessing years of exposure or 'latency' in a worker should be replaced by cumulative hours of tool use. This can be readily obtained using a tool-pictogram-based self-reporting questionnaire and a simple spreadsheet calculation.
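
    The questionnaire-derived metrics used in the study are not reproduced here; the sketch below shows two generic exposure measures of the kind discussed, a frequency-weighted daily exposure in the style of ISO 5349-1 and a crude cumulative-hours total, with hypothetical tool accelerations and usage times.

      import math

      T0 = 8.0   # reference period in hours

      def daily_a8(tool_use):
          """Daily exposure A(8) from (weighted acceleration in m/s^2, hours per day) pairs."""
          return math.sqrt(sum(a**2 * t for a, t in tool_use) / T0)

      def cumulative_hours(history):
          """Crude cumulative exposure: total hours of vibrating-tool use over a career."""
          return sum(hours_per_day * days_per_year * years
                     for hours_per_day, days_per_year, years in history)

      # Hypothetical worker: grinder at 4 m/s^2 for 2 h/day plus drill at 9 m/s^2 for 0.5 h/day,
      # with a career history of 3 h/day of tool use, 220 days/year, for 12 years.
      print(daily_a8([(4.0, 2.0), (9.0, 0.5)]))
      print(cumulative_hours([(3.0, 220, 12)]))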

  1. New Tools to Prepare ACE Cross-section Files for MCNP Analytic Test Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Forrest B.

    Monte Carlo calculations using one-group cross sections, multigroup cross sections, or simple continuous energy cross sections are often used to: (1) verify production codes against known analytical solutions, (2) verify new methods and algorithms that do not involve detailed collision physics, (3) compare Monte Carlo calculation methods with deterministic methods, and (4) teach fundamentals to students. In this work we describe 2 new tools for preparing the ACE cross-section files to be used by MCNP ® for these analytic test problems, simple_ace.pl and simple_ace_mg.pl.

  2. High-throughput electrical measurement and microfluidic sorting of semiconductor nanowires.

    PubMed

    Akin, Cevat; Feldman, Leonard C; Durand, Corentin; Hus, Saban M; Li, An-Ping; Hui, Ho Yee; Filler, Michael A; Yi, Jingang; Shan, Jerry W

    2016-05-24

    Existing nanowire electrical characterization tools not only are expensive and require sophisticated facilities, but are far too slow to enable statistical characterization of highly variable samples. They are also generally not compatible with further sorting and processing of nanowires. Here, we demonstrate a high-throughput, solution-based electro-orientation-spectroscopy (EOS) method, which is capable of automated electrical characterization of individual nanowires by direct optical visualization of their alignment behavior under spatially uniform electric fields of different frequencies. We demonstrate that EOS can quantitatively characterize the electrical conductivities of nanowires over a 6-order-of-magnitude range (10(-5) to 10 S m(-1), corresponding to typical carrier densities of 10(10)-10(16) cm(-3)), with different fluids used to suspend the nanowires. By implementing EOS in a simple microfluidic device, continuous electrical characterization is achieved, and the sorting of nanowires is demonstrated as a proof-of-concept. With measurement speeds two orders of magnitude faster than direct-contact methods, the automated EOS instrument enables for the first time the statistical characterization of highly variable 1D nanomaterials.

  3. Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis

    2004-01-01

    Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observations are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing / non-flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. A priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms); we explore the incremental benefit to such diagnosis provided by lightning observations.

  4. Nanocluster building blocks of artificial square spin ice: Stray-field studies of thermal dynamics

    NASA Astrophysics Data System (ADS)

    Pohlit, Merlin; Porrati, Fabrizio; Huth, Michael; Ohno, Yuzo; Ohno, Hideo; Müller, Jens

    2015-05-01

    We present measurements of the thermal dynamics of a Co-based single building block of an artificial square spin ice fabricated by focused electron-beam-induced deposition. We employ micro-Hall magnetometry, an ultra-sensitive tool to study the stray field emanating from magnetic nanostructures, as a new technique to access the dynamical properties during the magnetization reversal of the spin-ice nanocluster. The obtained hysteresis loop exhibits distinct steps, displaying a reduction of their "coercive field" with increasing temperature. Therefore, thermally unstable states could be repetitively prepared by relatively simple temperature and field protocols allowing one to investigate the statistics of their switching behavior within experimentally accessible timescales. For a selected switching event, we find a strong reduction of the so-prepared states' "survival time" with increasing temperature and magnetic field. Besides the possibility to control the lifetime of selected switching events at will, we find evidence for a more complex behavior caused by the special spin ice arrangement of the macrospins, i.e., that the magnetic reversal statistically follows distinct "paths" most likely driven by thermal perturbation.

  5. Characterizing and Addressing the Need for Statistical Adjustment of Global Climate Model Data

    NASA Astrophysics Data System (ADS)

    White, K. D.; Baker, B.; Mueller, C.; Villarini, G.; Foley, P.; Friedman, D.

    2017-12-01

    As part of its mission to research and measure the effects of the changing climate, the U. S. Army Corps of Engineers (USACE) regularly uses the World Climate Research Programme's Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model dataset. However, these data are generated at a global level and are not fine-tuned for specific watersheds. This often causes CMIP5 output to vary from locally observed patterns in the climate. Several downscaling methods have been developed to increase the resolution of the CMIP5 data and decrease systemic differences to support decision-makers as they evaluate results at the watershed scale. Evaluating preliminary comparisons of observed and projected flow frequency curves over the US revealed a simple framework for water resources decision makers to plan and design water resources management measures under changing conditions using standard tools. Using this framework as a basis, USACE has begun to explore the use of statistical adjustment to alter global climate model data to better match the locally observed patterns while preserving the general structure and behavior of the model data. When paired with careful measurement and hypothesis testing, statistical adjustment can be particularly effective at navigating the compromise between the locally observed patterns and the global climate model structures for decision makers.
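
    The record does not detail USACE's adjustment procedure; the sketch below shows one common form of statistical adjustment, empirical quantile mapping of model output onto a locally observed distribution, which matches local statistics while preserving the model's ordering of events. All flow values are synthetic.

      import numpy as np

      def quantile_map(model_hist, observed, model_future):
          """Empirical quantile mapping: move model values onto the observed distribution.

          Each model value is replaced by the observed quantile at the same
          non-exceedance probability estimated from the historical model run.
          """
          probs = np.linspace(0.01, 0.99, 99)
          model_q = np.quantile(model_hist, probs)
          obs_q = np.quantile(observed, probs)
          return np.interp(model_future, model_q, obs_q)

      # Hypothetical annual peak flows: the model runs low relative to the gauge record.
      rng = np.random.default_rng(3)
      observed = rng.gamma(shape=4.0, scale=250.0, size=60)
      model_hist = rng.gamma(shape=4.0, scale=180.0, size=60)
      model_future = rng.gamma(shape=4.0, scale=200.0, size=60)
      adjusted = quantile_map(model_hist, observed, model_future)
      print(model_future.mean(), adjusted.mean(), observed.mean())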

  6. Confirmatory Factor Analysis of the Malay Version of the Confusion, Hubbub and Order Scale (CHAOS-6) among Myocardial Infarction Survivors in a Malaysian Cardiac Healthcare Facility.

    PubMed

    Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul

    2017-08-01

    The six item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version's validity. Composite reliability was tested to determine the scale's reliability. All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6) , for the Malaysian population.

  7. Confirmatory Factor Analysis of the Malay Version of the Confusion, Hubbub and Order Scale (CHAOS-6) among Myocardial Infarction Survivors in a Malaysian Cardiac Healthcare Facility

    PubMed Central

    Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul

    2017-01-01

    Background The six item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. Methods The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version’s validity. Composite reliability was tested to determine the scale’s reliability. Results All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. Conclusion The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6), for the Malaysian population. PMID:28951688

  8. The Population Tracking Model: A Simple, Scalable Statistical Model for Neural Population Data

    PubMed Central

    O'Donnell, Cian; Gonçalves, J. Tiago; Whiteley, Nick; Portera-Cailliau, Carlos; Sejnowski, Terrence J.

    2017-01-01

    Our understanding of neural population coding has been limited by a lack of analysis methods to characterize spiking data from large populations. The biggest challenge comes from the fact that the number of possible network activity patterns scales exponentially with the number of neurons recorded (∼2^N for N neurons). Here we introduce a new statistical method for characterizing neural population activity that requires semi-independent fitting of only as many parameters as the square of the number of neurons, requiring drastically smaller data sets and minimal computation time. The model works by matching the population rate (the number of neurons synchronously active) and the probability that each individual neuron fires given the population rate. We found that this model can accurately fit synthetic data from up to 1000 neurons. We also found that the model could rapidly decode visual stimuli from neural population data from macaque primary visual cortex about 65 ms after stimulus onset. Finally, we used the model to estimate the entropy of neural population activity in developing mouse somatosensory cortex and, surprisingly, found that it first increases, and then decreases during development. This statistical model opens new options for interrogating neural population data and can bolster the use of modern large-scale in vivo Ca2+ and voltage imaging tools. PMID:27870612
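
    The published model involves normalization details not reproduced here; the sketch below only shows how the two ingredients named in the abstract, the population-rate distribution and each neuron's conditional firing probability given the population rate, could be estimated from a binary spike matrix. The synthetic data and function names are assumptions.

      import numpy as np

      def fit_population_tracking(spikes):
          """Estimate the two ingredients of a population-tracking description.

          spikes : (n_timebins, n_neurons) binary array.
          Returns p_k, the distribution of the population count K, and p_fire[k, i],
          the probability that neuron i is active given K = k neurons are active.
          """
          n_t, n = spikes.shape
          k = spikes.sum(axis=1).astype(int)                  # population rate per time bin
          p_k = np.bincount(k, minlength=n + 1) / n_t
          p_fire = np.full((n + 1, n), np.nan)
          for level in range(n + 1):
              rows = spikes[k == level]
              if len(rows):
                  p_fire[level] = rows.mean(axis=0)
          return p_k, p_fire

      # Synthetic data: 20 neurons correlated through a shared per-bin excitability.
      rng = np.random.default_rng(7)
      drive = rng.beta(2, 8, size=5000)
      spikes = (rng.random((5000, 20)) < drive[:, None]).astype(int)
      p_k, p_fire = fit_population_tracking(spikes)
      print(p_k[:5], np.nanmean(p_fire, axis=1)[:5])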

  9. easyGWAS: A Cloud-Based Platform for Comparing the Results of Genome-Wide Association Studies.

    PubMed

    Grimm, Dominik G; Roqueiro, Damian; Salomé, Patrice A; Kleeberger, Stefan; Greshake, Bastian; Zhu, Wangsheng; Liu, Chang; Lippert, Christoph; Stegle, Oliver; Schölkopf, Bernhard; Weigel, Detlef; Borgwardt, Karsten M

    2017-01-01

    The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana , using flowering and growth-related traits. © 2016 American Society of Plant Biologists. All rights reserved.

  10. Instruction of Statistics via Computer-Based Tools: Effects on Statistics' Anxiety, Attitude, and Achievement

    ERIC Educational Resources Information Center

    Ciftci, S. Koza; Karadag, Engin; Akdal, Pinar

    2014-01-01

    The purpose of this study was to determine the effect of statistics instruction using computer-based tools, on statistics anxiety, attitude, and achievement. This study was designed as quasi-experimental research and the pattern used was a matched pre-test/post-test with control group design. Data was collected using three scales: a Statistics…

  11. Validity criteria for Fermi’s golden rule scattering rates applied to metallic nanowires

    NASA Astrophysics Data System (ADS)

    Moors, Kristof; Sorée, Bart; Magnus, Wim

    2016-09-01

    Fermi’s golden rule underpins the investigation of mobile carriers propagating through various solids, being a standard tool to calculate their scattering rates. As such, it provides a perturbative estimate under the implicit assumption that the effect of the interaction Hamiltonian which causes the scattering events is sufficiently small. To check the validity of this assumption, we present a general framework to derive simple validity criteria in order to assess whether the scattering rates can be trusted for the system under consideration, given its statistical properties such as average size, electron density, impurity density et cetera. We derive concrete validity criteria for metallic nanowires with conduction electrons populating a single parabolic band subjected to different elastic scattering mechanisms: impurities, grain boundaries and surface roughness.

  12. Atlas of susceptibility to pollution in marinas. Application to the Spanish coast.

    PubMed

    Gómez, Aina G; Ondiviela, Bárbara; Fernández, María; Juanes, José A

    2017-01-15

    An atlas of susceptibility to pollution of 320 Spanish marinas is provided. Susceptibility is assessed through a simple, fast and low cost empirical method estimating the flushing capacity of marinas. The Complexity Tidal Range Index (CTRI) was selected among eleven empirical methods. The CTRI method was selected by means of statistical analyses because: it contributes to explain the system's variance; it is highly correlated to numerical model results; and, it is sensitive to marinas' location and typology. The process of implementation to the Spanish coast confirmed its usefulness, versatility and adaptability as a tool for the environmental management of marinas worldwide. The atlas of susceptibility, assessed through CTRI values, is an appropriate instrument to prioritize environmental and planning strategies at a regional scale. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Evaluating Lexical Coverage in Simple English Wikipedia Articles: A Corpus-Driven Study

    ERIC Educational Resources Information Center

    Hendry, Clinton; Sheepy, Emily

    2017-01-01

    Simple English Wikipedia is a user-contributed online encyclopedia intended for young readers and readers whose first language is not English. We compiled a corpus of the entirety of Simple English Wikipedia as of June 20th, 2017. We used lexical frequency profiling tools to investigate the vocabulary size needed to comprehend Simple English…

  14. A Bayesian approach to the characterization of electroencephalographic recordings in premature infants

    NASA Astrophysics Data System (ADS)

    Mitchell, Timothy J.

    Preterm infants are particularly susceptible to cerebral injury, and electroencephalographic (EEG) recordings provide an important diagnostic tool for determining cerebral health. However, interpreting these EEG recordings is challenging and requires the skills of a trained electroencephalographer. Because these EEG specialists are rare, an automated interpretation of newborn EEG recordings would increase access to an important diagnostic tool for physicians. To automate this procedure, we employ a novel Bayesian approach to compute the probability of EEG features (waveforms) including suppression, delta brushes, and delta waves. The power of this approach lies not only in its ability to closely mimic the techniques used by EEG specialists, but also its ability to be generalized to identify other waveforms that may be of interest for future work. The results of these calculations are used in a program designed to output simple statistics related to the presence or absence of such features. Direct comparison of the software with expert human readers has indicated satisfactory performance, and the algorithm has shown promise in its ability to distinguish between infants with normal neurodevelopmental outcome and those with poor neurodevelopmental outcome.

  15. [Bayesian statistics in medicine -- part II: main applications and inference].

    PubMed

    Montomoli, C; Nichelatti, M

    2008-01-01

    Bayesian statistics is not only used when one is dealing with 2-way tables, but it can be used for inferential purposes. Using the basic concepts presented in the first part, this paper aims to give a simple overview of Bayesian methods by introducing its foundation (Bayes' theorem) and then applying this rule to a very simple practical example; whenever possible, the elementary processes at the basis of analysis are compared to those of frequentist (classical) statistical analysis. The Bayesian reasoning is naturally connected to medical activity, since it appears to be quite similar to a diagnostic process.
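
    As a worked illustration of the diagnostic reasoning this record alludes to, here is a minimal Bayes' theorem calculation with invented prevalence, sensitivity and specificity values (not taken from the paper).

      # P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
      prevalence = 0.01       # prior probability of disease (illustrative value)
      sensitivity = 0.90      # P(test positive | disease)
      specificity = 0.95      # P(test negative | no disease)

      p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
      posterior = sensitivity * prevalence / p_positive
      print(f"P(disease | positive test) = {posterior:.3f}")   # about 0.154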

  16. Teaching meta-analysis using MetaLight.

    PubMed

    Thomas, James; Graziosi, Sergio; Higgins, Steve; Coe, Robert; Torgerson, Carole; Newman, Mark

    2012-10-18

    Meta-analysis is a statistical method for combining the results of primary studies. It is often used in systematic reviews and is increasingly a method and topic that appears in student dissertations. MetaLight is a freely available software application that runs simple meta-analyses and contains specific functionality to facilitate the teaching and learning of meta-analysis. While there are many courses and resources for meta-analysis available and numerous software applications to run meta-analyses, there are few pieces of software which are aimed specifically at helping those teaching and learning meta-analysis. Valuable teaching time can be spent learning the mechanics of a new software application, rather than on the principles and practices of meta-analysis. We discuss ways in which the MetaLight tool can be used to present some of the main issues involved in undertaking and interpreting a meta-analysis. While there are many software tools available for conducting meta-analysis, in the context of a teaching programme such software can require expenditure both in terms of money and in terms of the time it takes to learn how to use it. MetaLight was developed specifically as a tool to facilitate the teaching and learning of meta-analysis and we have presented here some of the ways it might be used in a training situation.
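
    For readers new to the mechanics that a teaching tool like MetaLight automates, a minimal inverse-variance fixed-effect pooling of study effect sizes is sketched below; the effect sizes and standard errors are invented and the code is not tied to MetaLight's interface.

      import math

      # (effect size, standard error) for a few hypothetical primary studies
      studies = [(0.30, 0.12), (0.10, 0.20), (0.45, 0.15)]

      weights = [1 / se**2 for _, se in studies]
      pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
      pooled_se = math.sqrt(1 / sum(weights))

      print(f"pooled effect = {pooled:.3f}, 95% CI = "
            f"({pooled - 1.96 * pooled_se:.3f}, {pooled + 1.96 * pooled_se:.3f})")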

  17. Risk assessment model for development of advanced age-related macular degeneration.

    PubMed

    Klein, Michael L; Francis, Peter J; Ferris, Frederick L; Hamon, Sara C; Clemons, Traci E

    2011-12-01

    To design a risk assessment model for development of advanced age-related macular degeneration (AMD) incorporating phenotypic, demographic, environmental, and genetic risk factors. We evaluated longitudinal data from 2846 participants in the Age-Related Eye Disease Study. At baseline, these individuals had all levels of AMD, ranging from none to unilateral advanced AMD (neovascular or geographic atrophy). Follow-up averaged 9.3 years. We performed a Cox proportional hazards analysis with demographic, environmental, phenotypic, and genetic covariates and constructed a risk assessment model for development of advanced AMD. Performance of the model was evaluated using the C statistic and the Brier score and externally validated in participants in the Complications of Age-Related Macular Degeneration Prevention Trial. The final model included the following independent variables: age, smoking history, family history of AMD (first-degree member), phenotype based on a modified Age-Related Eye Disease Study simple scale score, and genetic variants CFH Y402H and ARMS2 A69S. The model did well on performance measures, with very good discrimination (C statistic = 0.872) and excellent calibration and overall performance (Brier score at 5 years = 0.08). Successful external validation was performed, and a risk assessment tool was designed for use with or without the genetic component. We constructed a risk assessment model for development of advanced AMD. The model performed well on measures of discrimination, calibration, and overall performance and was successfully externally validated. This risk assessment tool is available for online use.

  18. Errors in patient specimen collection: application of statistical process control.

    PubMed

    Dzik, Walter Sunny; Beckman, Neil; Selleng, Kathleen; Heddle, Nancy; Szczepiorkowski, Zbigniew; Wendel, Silvano; Murphy, Michael

    2008-10-01

    Errors in the collection and labeling of blood samples for pretransfusion testing increase the risk of transfusion-associated patient morbidity and mortality. Statistical process control (SPC) is a recognized method to monitor the performance of a critical process. An easy-to-use SPC method was tested to determine its feasibility as a tool for monitoring quality in transfusion medicine. SPC control charts were adapted to a spreadsheet presentation. Data tabulating the frequency of mislabeled and miscollected blood samples from 10 hospitals in five countries from 2004 to 2006 were used to demonstrate the method. Control charts were produced to monitor process stability. The participating hospitals found the SPC spreadsheet very suitable to monitor the performance of the sample labeling and collection and applied SPC charts to suit their specific needs. One hospital monitored subcategories of sample error in detail. A large hospital monitored the number of wrong-blood-in-tube (WBIT) events. Four smaller-sized facilities, each following the same policy for sample collection, combined their data on WBIT samples into a single control chart. One hospital used the control chart to monitor the effect of an educational intervention. A simple SPC method is described that can monitor the process of sample collection and labeling in any hospital. SPC could be applied to other critical steps in the transfusion processes as a tool for biovigilance and could be used to develop regional or national performance standards for pretransfusion sample collection. A link is provided to download the spreadsheet for free.
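
    A minimal sketch of the kind of control chart described above, here a p-chart of monthly mislabeled-sample proportions with 3-sigma limits; the counts are synthetic, not the study data, and the spreadsheet method itself is not reproduced.

      import math

      # (mislabeled samples, total samples drawn) per month -- synthetic example data
      monthly = [(12, 4800), (9, 5100), (15, 4950), (11, 5000), (22, 5050), (10, 4900)]

      total_errors = sum(e for e, _ in monthly)
      total_samples = sum(n for _, n in monthly)
      p_bar = total_errors / total_samples              # centre line

      for month, (errors, n) in enumerate(monthly, start=1):
          sigma = math.sqrt(p_bar * (1 - p_bar) / n)
          ucl, lcl = p_bar + 3 * sigma, max(0.0, p_bar - 3 * sigma)
          p = errors / n
          flag = "OUT OF CONTROL" if not (lcl <= p <= ucl) else "in control"
          print(f"month {month}: p={p:.4f} (UCL={ucl:.4f}, LCL={lcl:.4f}) -> {flag}")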

  19. Statistics without Tears: Complex Statistics with Simple Arithmetic

    ERIC Educational Resources Information Center

    Smith, Brian

    2011-01-01

    One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…

  20. Applied statistics in ecology: common pitfalls and simple solutions

    Treesearch

    E. Ashley Steel; Maureen C. Kennedy; Patrick G. Cunningham; John S. Stanovick

    2013-01-01

    The most common statistical pitfalls in ecological research are those associated with data exploration, the logic of sampling and design, and the interpretation of statistical results. Although one can find published errors in calculations, the majority of statistical pitfalls result from incorrect logic or interpretation despite correct numerical calculations. There...

  1. The Statistics of wood assays for preservative retention

    Treesearch

    Patricia K. Lebow; Scott W. Conklin

    2011-01-01

    This paper covers general statistical concepts that apply to interpreting wood assay retention values. In particular, since wood assays are typically obtained from a single composited sample, the statistical aspects, including advantages and disadvantages, of simple compositing are covered.

  2. Prediction of drug transport processes using simple parameters and PLS statistics. The use of ACD/logP and ACD/ChemSketch descriptors.

    PubMed

    Osterberg, T; Norinder, U

    2001-01-01

    A method of modelling and predicting biopharmaceutical properties using simple theoretically computed molecular descriptors and multivariate statistics has been investigated for several data sets related to solubility, IAM chromatography, permeability across Caco-2 cell monolayers, human intestinal perfusion, brain-blood partitioning, and P-glycoprotein ATPase activity. The molecular descriptors (e.g. molar refractivity, molar volume, index of refraction, surface tension and density) and logP were computed with ACD/ChemSketch and ACD/logP, respectively. Good statistical models were derived that permit simple computational prediction of biopharmaceutical properties. All final models derived had R(2) values ranging from 0.73 to 0.95 and Q(2) values ranging from 0.69 to 0.86. The RMSEP values for the external test sets ranged from 0.24 to 0.85 (log scale).
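
    A hedged sketch of the modelling approach described (a PLS regression of a biopharmaceutical property on a small table of computed descriptors), using scikit-learn. The descriptor matrix and response are random placeholders, not the ACD-computed data from the paper.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_predict

      rng = np.random.default_rng(1)

      # Columns mimic simple descriptors (logP, molar refractivity, molar volume,
      # index of refraction, surface tension, density) -- synthetic values only.
      X = rng.normal(size=(40, 6))
      y = X @ np.array([0.8, 0.3, -0.2, 0.1, 0.0, 0.4]) + rng.normal(scale=0.3, size=40)

      pls = PLSRegression(n_components=2).fit(X, y)
      y_cv = cross_val_predict(pls, X, y, cv=5).ravel()

      ss_tot = ((y - y.mean()) ** 2).sum()
      r2 = 1 - ((y - pls.predict(X).ravel()) ** 2).sum() / ss_tot   # fit (R^2)
      q2 = 1 - ((y - y_cv) ** 2).sum() / ss_tot                     # cross-validated (Q^2)
      print(f"R2 = {r2:.2f}, Q2 = {q2:.2f}")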

  3. Critical appraisal of fundamental items in approved clinical trial research proposals in Mashhad University of Medical Sciences

    PubMed Central

    Shakeri, Mohammad-Taghi; Taghipour, Ali; Sadeghi, Masoumeh; Nezami, Hossein; Amirabadizadeh, Ali-Reza; Bonakchi, Hossein

    2017-01-01

    Background: Writing, designing, and conducting a clinical trial research proposal has an important role in achieving valid and reliable findings. Thus, this study aimed at critically appraising fundamental information in approved clinical trial research proposals in Mashhad University of Medical Sciences (MUMS) from 2008 to 2014. Methods: This cross-sectional study was conducted on all 935 approved clinical trial research proposals in MUMS from 2008 to 2014. A valid, reliable, comprehensive, simple, and usable checklist consisting of 11 main items, developed in sessions with biostatisticians and methodologists, was used as the research tool. The agreement rate between the reviewers of the proposals, who were responsible for data collection, was assessed during 3 sessions, and the Kappa statistic calculated at the last session was 97%. Results: More than 60% of the research proposals had a methodologist consultant; moreover, the type of study or study design had been specified in almost all of them (98%). Appropriateness of study aims with hypotheses was not observed in a significant number of research proposals (585 proposals, 62.6%). The required sample size for 66.8% of the approved proposals was based on a sample size formula; however, in 25% of the proposals, the sample size formula was not in accordance with the study design. The data collection tool was not selected appropriately in 55.2% of the approved research proposals. The type and method of randomization were unknown in 21% of the proposals, and dealing with missing data had not been described in most of them (98%). Inclusion and exclusion criteria were fully and adequately explained in most proposals (92%). Moreover, 44% and 31% of the research proposals were moderate and weak in rank, respectively, with respect to the correctness of the statistical analysis methods. Conclusion: Findings of the present study revealed that a large portion of the approved proposals were highly biased or ambiguous with respect to randomization, blinding, dealing with missing data, data collection tool, sampling methods, and statistical analysis. Thus, it is essential to consult and collaborate with a methodologist in all parts of a proposal to control the possible and specific biases in clinical trials. PMID:29445703

  5. Simple Example of Backtest Overfitting (SEBO)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    In the field of mathematical finance, a "backtest" is the use of historical market data to assess the performance of a proposed trading strategy. It is a relatively simple matter for a present-day computer system to explore thousands, millions or even billions of variations of a proposed strategy, and pick the best performing variant as the "optimal" strategy "in sample" (i.e., on the input dataset). Unfortunately, such an "optimal" strategy often performs very poorly "out of sample" (i.e. on another dataset), because the parameters of the investment strategy have been overfit to the in-sample data, a situation known as "backtest overfitting". While the mathematics of backtest overfitting has been examined in several recent theoretical studies, here we pursue a more tangible analysis of this problem, in the form of an online simulator tool. Given an input random walk time series, the tool develops an "optimal" variant of a simple strategy by exhaustively exploring all integer parameter values among a handful of parameters. That "optimal" strategy is overfit, since by definition a random walk is unpredictable. Then the tool tests the resulting "optimal" strategy on a second random walk time series. In most runs using our online tool, the "optimal" strategy derived from the first time series performs poorly on the second time series, demonstrating how hard it is not to overfit a backtest. We offer this online tool, "Simple Example of Backtest Overfitting (SEBO)", to facilitate further research in this area.
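
    A small sketch of the experiment the record describes: exhaustively tune a trivial moving-average rule on one random walk, then re-test the "optimal" parameter on a second, independent random walk. The rule, parameter range and scoring are illustrative stand-ins, not the SEBO tool's actual strategy.

      import numpy as np

      rng = np.random.default_rng(42)

      def score_of_ma_rule(prices, window):
          """Daily P&L of a long/flat moving-average rule, summarized as a Sharpe-like ratio."""
          ma = np.convolve(prices, np.ones(window) / window, mode="valid")
          signal = (prices[window - 1:-1] > ma[:-1]).astype(float)   # long when above its MA
          pnl = signal * np.diff(prices[window - 1:])
          return pnl.mean() / (pnl.std() + 1e-12)

      def random_walk(n=2000):
          return 100 + np.cumsum(rng.normal(size=n))

      in_sample, out_of_sample = random_walk(), random_walk()

      # "Optimize" the only parameter on the in-sample walk ...
      best_window = max(range(2, 200), key=lambda w: score_of_ma_rule(in_sample, w))
      # ... then see how the tuned rule fares on fresh, unpredictable data.
      print("in-sample score :", round(score_of_ma_rule(in_sample, best_window), 3))
      print("out-of-sample   :", round(score_of_ma_rule(out_of_sample, best_window), 3))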

  6. Publication bias in situ.

    PubMed

    Phillips, Carl V

    2004-08-05

    Publication bias, as typically defined, refers to the decreased likelihood of studies' results being published when they are near the null, not statistically significant, or otherwise "less interesting." But choices about how to analyze the data and which results to report create a publication bias within the published results, a bias I label "publication bias in situ" (PBIS). PBIS may create much greater bias in the literature than traditionally defined publication bias (the failure to publish any result from a study). The causes of PBIS are well known, consisting of various decisions about reporting that are influenced by the data. But its impact is not generally appreciated, and very little attention is devoted to it. What attention there is consists largely of rules for statistical analysis that are impractical and do not actually reduce the bias in reported estimates. PBIS cannot be reduced by statistical tools because it is not fundamentally a problem of statistics, but rather of non-statistical choices and plain language interpretations. PBIS should be recognized as a phenomenon worthy of study - it is extremely common and probably has a huge impact on results reported in the literature - and there should be greater systematic efforts to identify and reduce it. The paper presents examples, including results of a recent HIV vaccine trial, that show how easily PBIS can have a large impact on reported results, as well as how there can be no simple answer to it. PBIS is a major problem, worthy of substantially more attention than it receives. There are ways to reduce the bias, but they are very seldom employed because they are largely unrecognized.

  7. Using complexity metrics with R-R intervals and BPM heart rate measures

    PubMed Central

    Wallot, Sebastian; Fusaroli, Riccardo; Tylén, Kristian; Jegindø, Else-Marie

    2013-01-01

    Lately, growing attention in the health sciences has been paid to the dynamics of heart rate as an indicator of impending failures and for prognoses. Likewise, in social and cognitive sciences, heart rate is increasingly employed as a measure of arousal, emotional engagement and as a marker of interpersonal coordination. However, there is no consensus about which measurements and analytical tools are most appropriate in mapping the temporal dynamics of heart rate and quite different metrics are reported in the literature. As complexity metrics of heart rate variability depend critically on variability of the data, different choices regarding the kind of measures can have a substantial impact on the results. In this article we compare linear and non-linear statistics on two prominent types of heart beat data, beat-to-beat intervals (R-R interval) and beats-per-min (BPM). As a proof-of-concept, we employ a simple rest-exercise-rest task and show that non-linear statistics—fractal (DFA) and recurrence (RQA) analyses—reveal information about heart beat activity above and beyond the simple level of heart rate. Non-linear statistics unveil sustained post-exercise effects on heart rate dynamics, but their power to do so critically depends on the type of data that is employed: While R-R intervals are very susceptible to non-linear analyses, the success of non-linear methods for BPM data critically depends on their construction. Generally, "oversampled" BPM time-series can be recommended as they retain most of the information about non-linear aspects of heart beat dynamics. PMID:23964244

  8. Data Verification Tools for Minimizing Management Costs of Dense Air-Quality Monitoring Networks.

    PubMed

    Miskell, Georgia; Salmond, Jennifer; Alavi-Shoshtari, Maryam; Bart, Mark; Ainslie, Bruce; Grange, Stuart; McKendry, Ian G; Henshaw, Geoff S; Williams, David E

    2016-01-19

    Aiming at minimizing the costs, both of capital expenditure and maintenance, of an extensive air-quality measurement network, we present simple statistical methods that do not require extensive training data sets for automated real-time verification of the reliability of data delivered by a spatially dense hybrid network of both low-cost and reference ozone measurement instruments. Ozone is a pollutant that has a relatively smooth spatial spread over a large scale although there can be significant small-scale variations. We take advantage of these characteristics and demonstrate detection of instrument calibration drift within a few days using a rolling 72 h comparison of hourly averaged data from the test instrument with that from suitably defined proxies. We define the required characteristics of the proxy measurements by working from a definition of the network purpose and specification, in this case reliable determination of the proportion of hourly averaged ozone measurements that are above a threshold in any given day, and detection of calibration drift of greater than ±30% in slope or ±5 parts-per-billion in offset. By analyzing results of a study of an extensive deployment of low-cost instruments in the Lower Fraser Valley, we demonstrate that proxies can be established using land-use criteria and that simple statistical comparisons can identify low-cost instruments that are not stable and therefore need replacing. We propose that a minimal set of compliant reference instruments can be used to verify the reliability of data from a much more extensive network of low-cost devices.
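
    A simplified sketch of the rolling 72-hour comparison described above: regress hourly averages from a test instrument against a proxy series over a moving window and flag windows whose slope or offset drift exceeds the stated tolerances. The data, proxy construction and thresholds here are placeholders, not the study's deployment.

      import numpy as np

      def drift_flags(test, proxy, window=72, slope_tol=0.30, offset_tol=5.0):
          """Flag windows where the test-vs-proxy slope drifts >30% or the offset >5 ppb (illustrative)."""
          flags = []
          for start in range(0, len(test) - window + 1):
              x = proxy[start:start + window]
              y = test[start:start + window]
              slope, offset = np.polyfit(x, y, 1)
              flags.append(abs(slope - 1.0) > slope_tol or abs(offset) > offset_tol)
          return np.array(flags)

      rng = np.random.default_rng(3)
      proxy = 30 + 10 * np.sin(np.arange(500) * 2 * np.pi / 24) + rng.normal(0, 2, 500)
      test = proxy + rng.normal(0, 2, 500)
      test[300:] = 0.5 * proxy[300:] + rng.normal(0, 2, 200)   # simulate calibration drift

      flags = drift_flags(test, proxy)
      print("first flagged window starts at hour", int(np.argmax(flags)))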

  9. A simple epidural simulator: a blinded study assessing the 'feel' of loss of resistance in four fruits.

    PubMed

    Raj, Diana; Williamson, Roy M; Young, David; Russell, Douglas

    2013-07-01

    Complex epidural simulators are now available, but these are expensive and not widely available. Simple simulators using fruit have been described before. To ascertain which easily available fruit would best simulate the 'feel' of loss of resistance experienced in epidural insertion and be used as a teaching tool. A single blinded study using four different fruits housed in a purpose-built box to conceal the identities of the fruits. The fruits were labelled A, B, C and D. Two teaching hospitals in Glasgow, Scotland between 2006 and 2007. Fifty participants consisting of consultant anaesthetists, specialist registrars and senior house officers all with previous epidural experience. Insertion of a Tuohy needle into the four concealed fruits (orange, banana, kiwi and honeydew melon). Each participant then completed a questionnaire that included recording of the realism of the 'feel' of loss of resistance of each fruit. The 'feel' of loss of resistance for each fruit was scored on a 100-mm Visual Analogue Scale. A '0  mm' represented 'completely unrealistic feel' and '100  mm' represented 'indistinguishable feel from a real patient'. A total of 62.6% of participants recorded the banana as their first choice. This result was statistically significant after taking into account the grades of the participants, their years of experience, the needle gauge used and the participants' chosen technique. The banana is a cheap and easily available training tool to introduce novice anaesthetists to the feel of loss of resistance, which is best experienced before the first insertion of an epidural in a patient.

  10. The Multisensory Attentional Consequences of Tool Use: A Functional Magnetic Resonance Imaging Study

    PubMed Central

    Holmes, Nicholas P.; Spence, Charles; Hansen, Peter C.; Mackay, Clare E.; Calvert, Gemma A.

    2008-01-01

    Background Tool use in humans requires that multisensory information is integrated across different locations, from objects seen to be distant from the hand, but felt indirectly at the hand via the tool. We tested the hypothesis that using a simple tool to perceive vibrotactile stimuli results in the enhanced processing of visual stimuli presented at the distal, functional part of the tool. Such a finding would be consistent with a shift of spatial attention to the location where the tool is used. Methodology/Principal Findings We tested this hypothesis by scanning healthy human participants' brains using functional magnetic resonance imaging, while they used a simple tool to discriminate between target vibrations, accompanied by congruent or incongruent visual distractors, on the same or opposite side to the tool. The attentional hypothesis was supported: BOLD response in occipital cortex, particularly in the right hemisphere lingual gyrus, varied significantly as a function of tool position, increasing contralaterally, and decreasing ipsilaterally to the tool. Furthermore, these modulations occurred despite the fact that participants were repeatedly instructed to ignore the visual stimuli, to respond only to the vibrotactile stimuli, and to maintain visual fixation centrally. In addition, the magnitude of multisensory (visual-vibrotactile) interactions in participants' behavioural responses significantly predicted the BOLD response in occipital cortical areas that were also modulated as a function of both visual stimulus position and tool position. Conclusions/Significance These results show that using a simple tool to locate and to perceive vibrotactile stimuli is accompanied by a shift of spatial attention to the location where the functional part of the tool is used, resulting in enhanced processing of visual stimuli at that location, and decreased processing at other locations. This was most clearly observed in the right hemisphere lingual gyrus. Such modulations of visual processing may reflect the functional importance of visuospatial information during human tool use. PMID:18958150

  11. "Dear Fresher …"--How Online Questionnaires Can Improve Learning and Teaching Statistics

    ERIC Educational Resources Information Center

    Bebermeier, Sarah; Nussbeck, Fridtjof W.; Ontrup, Greta

    2015-01-01

    Lecturers teaching statistics are faced with several challenges supporting students' learning in appropriate ways. A variety of methods and tools exist to facilitate students' learning on statistics courses. The online questionnaires presented in this report are a new, slightly different computer-based tool: the central aim was to support students…

  12. A Role for Chunk Formation in Statistical Learning of Second Language Syntax

    ERIC Educational Resources Information Center

    Hamrick, Phillip

    2014-01-01

    Humans are remarkably sensitive to the statistical structure of language. However, different mechanisms have been proposed to account for such statistical sensitivities. The present study compared adult learning of syntax and the ability of two models of statistical learning to simulate human performance: Simple Recurrent Networks, which learn by…

  13. Collaboratively Conceived, Designed and Implemented: Matching Visualization Tools with Geoscience Data Collections and Geoscience Data Collections with Visualization Tools via the ToolMatch Service.

    NASA Astrophysics Data System (ADS)

    Hoebelheinrich, N. J.; Lynnes, C.; West, P.; Ferritto, M.

    2014-12-01

    Two problems common to many geoscience domains are the difficulties in finding tools to work with a given dataset collection, and conversely, the difficulties in finding data for a known tool. A collaborative team from the Earth Science Information Partnership (ESIP) has come together to design and create a web service, called ToolMatch, to address these problems. The team began their efforts by defining an initial, relatively simple conceptual model that addressed the two use cases briefly described above. The conceptual model is expressed as an ontology using OWL (Web Ontology Language) and DCterms (Dublin Core Terms), and utilizing standard ontologies such as DOAP (Description of a Project), FOAF (Friend of a Friend), SKOS (Simple Knowledge Organization System) and DCAT (Data Catalog Vocabulary). The ToolMatch service will be taking advantage of various Semantic Web and Web standards, such as OpenSearch, RESTful web services, SWRL (Semantic Web Rule Language) and SPARQL (Simple Protocol and RDF Query Language). The first version of the ToolMatch service was deployed in early fall 2014. While more complete testing is required, a number of communities besides ESIP member organizations have expressed interest in collaborating to create, test and use the service and incorporate it into their own web pages, tools and/or services, including the USGS Data Catalog service, DataONE, the Deep Carbon Observatory, Virtual Solar Terrestrial Observatory (VSTO), and the U.S. Global Change Research Program. In this session, presenters will discuss the inception and development of the ToolMatch service, the collaborative process used to design, refine, and test the service, and future plans for the service.

  14. Burden Calculator: a simple and open analytical tool for estimating the population burden of injuries.

    PubMed

    Bhalla, Kavi; Harrison, James E

    2016-04-01

    Burden of disease and injury methods can be used to summarise and compare the effects of conditions in terms of disability-adjusted life years (DALYs). Burden estimation methods are not inherently complex. However, as commonly implemented, the methods include complex modelling and estimation. To provide a simple and open-source software tool that allows estimation of incidence-DALYs due to injury, given data on incidence of deaths and non-fatal injuries. The tool includes a default set of estimation parameters, which can be replaced by users. The tool was written in Microsoft Excel. All calculations and values can be seen and altered by users. The parameter sets currently used in the tool are based on published sources. The tool is available without charge online at http://calculator.globalburdenofinjuries.org. To use the tool with the supplied parameter sets, users need to only paste a table of population and injury case data organised by age, sex and external cause of injury into a specified location in the tool. Estimated DALYs can be read or copied from tables and figures in another part of the tool. In some contexts, a simple and user-modifiable burden calculator may be preferable to undertaking a more complex study to estimate the burden of disease. The tool and the parameter sets required for its use can be improved by user innovation, by studies comparing DALYs estimates calculated in this way and in other ways, and by shared experience of its use. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
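
    A bare-bones sketch of the incidence-DALY arithmetic such a calculator implements (DALY = YLL + YLD), with invented case counts, durations and disability weights rather than the tool's supplied parameter sets.

      # DALYs = YLL (years of life lost) + YLD (years lived with disability)
      # All numbers below are illustrative placeholders, not the tool's default parameters.
      deaths = 120
      life_expectancy_at_death = 40.0          # average remaining life expectancy (years)

      nonfatal_cases = 3500
      disability_weight = 0.2                  # severity of the average non-fatal injury
      average_duration_years = 0.5             # average duration of disability

      yll = deaths * life_expectancy_at_death
      yld = nonfatal_cases * disability_weight * average_duration_years
      print(f"YLL={yll:.0f}, YLD={yld:.0f}, DALYs={yll + yld:.0f}")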

  15. The Value of Data and Metadata Standardization for Interoperability in Giovanni Or: Why Your Product's Metadata Causes Us Headaches!

    NASA Technical Reports Server (NTRS)

    Smit, Christine; Hegde, Mahabaleshwara; Strub, Richard; Bryant, Keith; Li, Angela; Petrenko, Maksym

    2017-01-01

    Giovanni is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization.

  16. State-space models’ dirty little secrets: even simple linear Gaussian models can have estimation problems

    NASA Astrophysics Data System (ADS)

    Auger-Méthé, Marie; Field, Chris; Albertsen, Christoffer M.; Derocher, Andrew E.; Lewis, Mark A.; Jonsen, Ian D.; Mills Flemming, Joanna

    2016-05-01

    State-space models (SSMs) are increasingly used in ecology to model time-series such as animal movement paths and population dynamics. This type of hierarchical model is often structured to account for two levels of variability: biological stochasticity and measurement error. SSMs are flexible. They can model linear and nonlinear processes using a variety of statistical distributions. Recent ecological SSMs are often complex, with a large number of parameters to estimate. Through a simulation study, we show that even simple linear Gaussian SSMs can suffer from parameter- and state-estimation problems. We demonstrate that these problems occur primarily when measurement error is larger than biological stochasticity, the condition that often drives ecologists to use SSMs. Using an animal movement example, we show how these estimation problems can affect ecological inference. Biased parameter estimates of a SSM describing the movement of polar bears (Ursus maritimus) result in overestimating their energy expenditure. We suggest potential solutions, but show that it often remains difficult to estimate parameters. While SSMs are powerful tools, they can give misleading results and we urge ecologists to assess whether the parameters can be estimated accurately before drawing ecological conclusions from their results.
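
    A compact re-creation of the kind of experiment described: simulate a one-dimensional linear Gaussian state-space model (random-walk state plus observation noise) and profile a Kalman-filter likelihood over a grid of variance parameters, which tends to be flat when measurement error dominates. This is an illustrative sketch, not the authors' code or their polar bear analysis.

      import numpy as np

      def loglik_local_level(y, q, r):
          """Kalman-filter log-likelihood of a random-walk state with process var q, obs var r."""
          m, p, ll = 0.0, 10.0, 0.0
          for obs in y:
              p = p + q                         # predict
              s = p + r                         # innovation variance
              ll += -0.5 * (np.log(2 * np.pi * s) + (obs - m) ** 2 / s)
              k = p / s                         # update
              m, p = m + k * (obs - m), (1 - k) * p
          return ll

      rng = np.random.default_rng(7)
      true_q, true_r = 0.1, 1.0                 # measurement error larger than process noise
      x = np.cumsum(rng.normal(0, np.sqrt(true_q), 200))
      y = x + rng.normal(0, np.sqrt(true_r), 200)

      grid = [0.01, 0.05, 0.1, 0.5, 1.0]
      for q in grid:
          print(f"q={q:<4}: max loglik over r grid = "
                f"{max(loglik_local_level(y, q, r) for r in grid):.1f}")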

  17. An Improved LC-ESI-MS/MS Method to Quantify Pregabalin in Human Plasma and Dry Plasma Spot for Therapeutic Monitoring and Pharmacokinetic Applications.

    PubMed

    Dwivedi, Jaya; Namdev, Kuldeep K; Chilkoti, Deepak C; Verma, Surajpal; Sharma, Swapnil

    2018-06-06

    Therapeutic drug monitoring (TDM) of anti-epileptic drugs provides a valid clinical tool in optimization of overall therapy. However, TDM is challenging due to the high storage/shipment costs of biological samples (plasma/blood) and the limited availability of laboratories providing TDM services. Sampling in the form of dry plasma spot (DPS) or dry blood spot (DBS) is a suitable alternative to overcome these issues. An improved, simple, rapid, and stability-indicating method for quantification of pregabalin in human plasma and DPS has been developed and validated. Analyses were performed on a liquid chromatography tandem mass spectrometer under positive ionization mode of the electrospray interface. Pregabalin-d4 was used as the internal standard, and the chromatographic separations were performed on a Poroshell 120 EC-C18 column using an isocratic mobile phase flow rate of 1 mL/min. Stability of pregabalin in DPS was evaluated under simulated real-time conditions. Extraction procedures from plasma and DPS samples were compared using statistical tests. The method was validated considering the FDA method validation guideline. The method was linear over the concentration range of 20-16000 ng/mL and 100-10000 ng/mL in plasma and DPS, respectively. DPS samples were found stable for only one week upon storage at room temperature and for at least four weeks at freezing temperature (-20 ± 5 °C). The method was applied for quantification of pregabalin in over 600 samples of a clinical study. Statistical analyses revealed that the two extraction procedures in plasma and DPS samples showed a statistically insignificant difference and can be used interchangeably without any bias. The proposed method involves simple and rapid steps of sample processing that do not require a pre- or post-column derivatization procedure. The method is suitable for routine pharmacokinetic analysis and therapeutic monitoring of pregabalin.

  18. Statistical Approaches for Spatiotemporal Prediction of Low Flows

    NASA Astrophysics Data System (ADS)

    Fangmann, A.; Haberlandt, U.

    2017-12-01

    An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. The basis for all of them is multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be problematic. Spatiotemporal prediction of L-moments appeared highly uncertain for higher-order moments, resulting in unrealistic future low flow values. All in all, the results support the inclusion of simple statistical methods in climate change impact assessment.

  19. Use of the Oxford Handicap Scale at hospital discharge to predict Glasgow Outcome Scale at 6 months in patients with traumatic brain injury.

    PubMed

    Perel, Pablo; Edwards, Phil; Shakur, Haleema; Roberts, Ian

    2008-11-06

    Traumatic brain injury (TBI) is an important cause of acquired disability. In evaluating the effectiveness of clinical interventions for TBI it is important to measure disability accurately. The Glasgow Outcome Scale (GOS) is the most widely used outcome measure in randomised controlled trials (RCTs) in TBI patients. However, GOS measurement is generally collected at 6 months after discharge, when loss to follow-up could have occurred. The objectives of this study were to evaluate the association and predictive validity between a simple disability scale at hospital discharge, the Oxford Handicap Scale (OHS), and the GOS at 6 months among TBI patients. The study was a secondary analysis of a randomised clinical trial among TBI patients (MRC CRASH Trial). A Spearman correlation was estimated to evaluate the association between the OHS and GOS. The validity of different dichotomies of the OHS for predicting GOS at 6 months was assessed by calculating sensitivity, specificity and the C statistic. Uni- and multivariate logistic regression models were fitted including OHS as the explanatory variable. For each model we analysed its discrimination and calibration. We found that the OHS is highly correlated with GOS at 6 months (Spearman correlation 0.75) with evidence of a linear relationship between the two scales. The OHS dichotomy that separates patients with severe dependency or death showed the greatest discrimination (C statistic: 84.3). Among survivors at hospital discharge the OHS showed a very good discrimination (C statistic 0.78) and excellent calibration when used to predict GOS outcome at 6 months. We have shown that the OHS, a simple disability scale available at hospital discharge, can predict disability accurately, according to the GOS, at 6 months. OHS could be used to improve the design and analysis of clinical trials in TBI patients and may also provide a valuable clinical tool for physicians to improve communication with patients and relatives when assessing a patient's prognosis at hospital discharge.
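
    A minimal sketch of how a C statistic (area under the ROC curve) of the kind reported above can be computed for a binary outcome and an ordinal predictor such as the OHS; the scores and outcomes below are invented toy data, not the trial data.

      from itertools import product

      # (OHS score at discharge, unfavourable GOS at 6 months?) -- invented toy data
      data = [(0, 0), (1, 0), (1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (4, 1), (5, 1), (5, 1)]

      cases = [s for s, y in data if y == 1]
      controls = [s for s, y in data if y == 0]

      pairs = list(product(cases, controls))
      concordant = sum(c > n for c, n in pairs)
      ties = sum(c == n for c, n in pairs)
      c_statistic = (concordant + 0.5 * ties) / len(pairs)
      print(f"C statistic = {c_statistic:.3f}")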

  20. Rocks in Our Pockets

    ERIC Educational Resources Information Center

    Plummer, Donna; Kuhlman, Wilma

    2005-01-01

    To introduce students to rocks and their characteristics, teacher can begin rock units with the activities described in this article. Students need the ability to make simple observations using their senses and simple tools.

  1. Definition of redox and pH influence in the AMD mine system using a fuzzy qualitative tool (Iberian Pyrite Belt, SW Spain).

    PubMed

    de la Torre, M L; Grande, J A; Valente, T; Perez-Ostalé, E; Santisteban, M; Aroba, J; Ramos, I

    2016-03-01

    Poderosa Mine is an abandoned pyrite mine located in the Iberian Pyrite Belt, which pours its acid mine drainage (AMD) waters into the Odiel river (South-West Spain). This work focuses on establishing possible reasons for interdependence between the redox potential and pH and the load of metals and sulfates, as well as a set of variables that define the physical chemistry of the water (conductivity, temperature, TDS, and dissolved oxygen) transported by a channel from Poderosa mine affected by acid mine drainage, through the use of techniques of artificial intelligence: fuzzy logic and data mining. The sampling campaign was carried out in May of 2012. There were a total of 16 sites, the first inside the tunnel and the last at the mouth of the river Odiel, with a distance of approximately 10 m between each pair of measuring stations. While the tools of classical statistics, which are widely used in this context, prove useful for defining proximity ratios between variables based on Pearson's correlations, in addition to making it easier to handle large volumes of data and producing easier-to-understand graphs, the use of fuzzy logic tools and data mining results in better definition of the variations produced by external stimuli on the set of variables. This tool is adaptable and can be extrapolated to any system polluted by acid mine drainage using simple, intuitive reasoning.

  2. [Psychoprophylaxis in elective paediatric general surgery: do audiovisual tools improve perioperative anxiety in children and their families?].

    PubMed

    Álvarez García, N; Gómez Palacio, V; Siles Hinojosa, A; Gracia Romero, J

    2017-10-25

    Surgery is considered a stressful experience for children and their families who undergo elective procedures. Different tools have been developed to improve perioperative anxiety. Our objective is to determine whether audiovisual psychoprophylaxis reduces anxiety linked to paediatric surgery. A randomized prospective case-control study was carried out in children aged 4-15 who underwent surgery in a Paediatric Surgery Department. We excluded patients with surgical backgrounds, severe illness or non-elective procedures. Simple randomization was performed and cases watched a video before being admitted, under medical supervision. Trait and state anxiety levels were measured using the STAI-Y2 and STAI-C tests, and a VAS in children under 6 years old, at admission and discharge. A total of 100 patients (50 cases/50 controls) were included; mean age at diagnosis was 7.98 and 7.32 years, respectively. Orchiopexy was the most frequent surgery performed in both groups. State anxiety levels in parents were lower in the case group (36.06 vs 39.93, p = 0.09 in fathers; 38.78 vs 40.34, p = 0.43 in mothers). At discharge, anxiety levels in children aged > 6 years were significantly lower among cases (26.84 vs 32.96, p < 0.05). The use of audiovisual psychoprophylaxis tools shows a clinically relevant improvement in perioperative anxiety, both in children and their parents. Our results are similar to those reported by other authors, supporting these tools as a beneficial strategy for the family.

  3. Robust Combining of Disparate Classifiers Through Order Statistics

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2001-01-01

    Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum, and, in general, the ith order statistic, are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
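
    A small sketch of the combining schemes discussed in this record: given posterior estimates from several classifiers for the same sample, combine them per class with the median, the maximum, or a trimmed mean of the ordered outputs. The output values are illustrative and the analysis in the paper is not reproduced.

      import numpy as np

      # Posterior estimates for 3 classes from 5 disparate classifiers (one row each).
      outputs = np.array([
          [0.70, 0.20, 0.10],
          [0.60, 0.30, 0.10],
          [0.10, 0.80, 0.10],    # a poorly performing classifier
          [0.65, 0.25, 0.10],
          [0.55, 0.35, 0.10],
      ])

      median_comb = np.median(outputs, axis=0)
      max_comb = np.max(outputs, axis=0)
      trimmed = np.mean(np.sort(outputs, axis=0)[1:-1], axis=0)   # drop lowest and highest

      for name, comb in [("median", median_comb), ("max", max_comb), ("trimmed", trimmed)]:
          print(f"{name:7s} -> class {int(np.argmax(comb))}  ({np.round(comb, 2)})")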

  4. Counting statistics for genetic switches based on effective interaction approximation

    NASA Astrophysics Data System (ADS)

    Ohkubo, Jun

    2012-09-01

    Applicability of counting statistics for a system with an infinite number of states is investigated. Counting statistics has been studied extensively for systems with a finite number of states. While it is possible in principle to use the scheme to count specific transitions in a system with an infinite number of states, the resulting equations are in general not closed. A simple genetic switch can be described by a master equation with an infinite number of states, and we use counting statistics to count the number of transitions from inactive to active states in the gene. To avoid the non-closed equations, an effective interaction approximation is employed. As a result, it is shown that the switching problem can be treated approximately as a simple two-state model, which immediately indicates that the switching obeys non-Poisson statistics.
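
    As a toy illustration of the quantity being counted, the sketch below simulates a two-state (inactive/active) switch with the Gillespie algorithm and counts inactive-to-active transitions in a fixed time window; the rates are arbitrary and the master-equation machinery of the paper is not reproduced.

      import numpy as np

      rng = np.random.default_rng(11)
      k_on, k_off = 0.2, 1.0        # illustrative switching rates (per unit time)

      def count_activations(t_max):
          t, state, activations = 0.0, 0, 0          # state 0 = inactive, 1 = active
          while True:
              rate = k_on if state == 0 else k_off
              t += rng.exponential(1.0 / rate)
              if t > t_max:
                  return activations
              if state == 0:
                  activations += 1                    # inactive -> active transition counted
              state = 1 - state

      counts = [count_activations(t_max=100.0) for _ in range(2000)]
      print(f"mean = {np.mean(counts):.2f}, variance = {np.var(counts):.2f}, "
            f"Fano factor = {np.var(counts) / np.mean(counts):.2f}")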

  5. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  6. Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics

    PubMed Central

    Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas

    2011-01-01

    Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists, which captures all information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has a high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison shows that our boosting implementation performs well relative to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that for recent sweeps integrated haplotype homozygosity is very informative whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556

  7. Nutrition screening tools: an analysis of the evidence.

    PubMed

    Skipper, Annalynn; Ferguson, Maree; Thompson, Kyle; Castellanos, Victoria H; Porcari, Judy

    2012-05-01

    In response to questions about tools for nutrition screening, an evidence analysis project was developed to identify the most valid and reliable nutrition screening tools for use in acute care and hospital-based ambulatory care settings. An oversight group defined nutrition screening and literature search criteria. A trained analyst conducted structured searches of the literature for studies of nutrition screening tools according to predetermined criteria. Eleven nutrition screening tools designed to detect undernutrition in patients in acute care and hospital-based ambulatory care were identified. Trained analysts evaluated articles for quality using criteria specified by the American Dietetic Association's Evidence Analysis Library. Members of the oversight group assigned quality grades to the tools based on the quality of the supporting evidence, including reliability and validity data. One tool, the NRS-2002, received a grade I, and 4 tools-the Simple Two-Part Tool, the Mini-Nutritional Assessment-Short Form (MNA-SF), the Malnutrition Screening Tool (MST), and Malnutrition Universal Screening Tool (MUST)-received a grade II. The MST was the only tool shown to be both valid and reliable for identifying undernutrition in the settings studied. Thus, validated nutrition screening tools that are simple and easy to use are available for application in acute care and hospital-based ambulatory care settings.

  8. Are Statisticians Cold-Blooded Bosses? A New Perspective on the "Old" Concept of Statistical Population

    ERIC Educational Resources Information Center

    Lu, Yonggang; Henning, Kevin S. S.

    2013-01-01

    Spurred by recent writings regarding statistical pragmatism, we propose a simple, practical approach to introducing students to a new style of statistical thinking that models nature through the lens of data-generating processes, not populations. (Contains 5 figures.)

  9. The Precision-Power-Gradient Theory for Teaching Basic Research Statistical Tools to Graduate Students.

    ERIC Educational Resources Information Center

    Cassel, Russell N.

    This paper relates educational and psychological statistics to certain "Research Statistical Tools" (RSTs) necessary to accomplish and understand general research in the behavioral sciences. Emphasis is placed on acquiring an effective understanding of the RSTs, and to this end they are ordered on a continuum scale in terms of individual…

  10. Syndromic surveillance of influenza activity in Sweden: an evaluation of three tools.

    PubMed

    Ma, T; Englund, H; Bjelkmar, P; Wallensten, A; Hulth, A

    2015-08-01

    An evaluation was conducted to determine which syndromic surveillance tools complement traditional surveillance by serving as earlier indicators of influenza activity in Sweden. Web queries, medical hotline statistics, and school absenteeism data were evaluated against two traditional surveillance tools. Cross-correlation calculations utilized aggregated weekly data for all-age, nationwide activity for four influenza seasons, from 2009/2010 to 2012/2013. The surveillance tool indicative of earlier influenza activity, by way of statistical and visual evidence, was identified. The web query algorithm and medical hotline statistics performed equally well as each other and to the traditional surveillance tools. School absenteeism data were not reliable resources for influenza surveillance. Overall, the syndromic surveillance tools did not perform with enough consistency in season lead nor in earlier timing of the peak week to be considered as early indicators. They do, however, capture incident cases before they have formally entered the primary healthcare system.
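
    A minimal sketch of the kind of lead/lag check described above: cross-correlate a weekly syndromic series against a traditional surveillance series at several lags and report the lag with maximum correlation. The series are synthetic and the study's actual indicators are not reproduced.

      import numpy as np

      rng = np.random.default_rng(5)
      weeks = np.arange(52)
      traditional = np.exp(-0.5 * ((weeks - 30) / 4.0) ** 2) + rng.normal(0, 0.05, 52)
      syndromic = np.roll(traditional, -1) + rng.normal(0, 0.05, 52)   # peaks one week earlier

      def corr_at_lag(x, y, lag):
          """Correlation of x[t] against y[t + lag]."""
          if lag > 0:
              return np.corrcoef(x[:-lag], y[lag:])[0, 1]
          if lag < 0:
              return np.corrcoef(x[-lag:], y[:lag])[0, 1]
          return np.corrcoef(x, y)[0, 1]

      lags = range(-4, 5)
      best = max(lags, key=lambda l: corr_at_lag(syndromic, traditional, l))
      print("best lag (weeks, positive = syndromic leads):", best)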

  11. 48 CFR 1852.223-76 - Federal Automotive Statistical Tool Reporting.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... data describing vehicle usage required by the Federal Automotive Statistical Tool (FAST) by October 15 of each year. FAST is accessed through http://fastweb.inel.gov/. (End of clause) [68 FR 43334, July...

  12. 48 CFR 1852.223-76 - Federal Automotive Statistical Tool Reporting.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... data describing vehicle usage required by the Federal Automotive Statistical Tool (FAST) by October 15 of each year. FAST is accessed through http://fastweb.inel.gov/. (End of clause) [68 FR 43334, July...

  13. 48 CFR 1852.223-76 - Federal Automotive Statistical Tool Reporting.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... data describing vehicle usage required by the Federal Automotive Statistical Tool (FAST) by October 15 of each year. FAST is accessed through http://fastweb.inel.gov/. (End of clause) [68 FR 43334, July...

  14. 48 CFR 1852.223-76 - Federal Automotive Statistical Tool Reporting.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... data describing vehicle usage required by the Federal Automotive Statistical Tool (FAST) by October 15 of each year. FAST is accessed through http://fastweb.inel.gov/. (End of clause) [68 FR 43334, July...

  15. Using a Five-Step Procedure for Inferential Statistical Analyses

    ERIC Educational Resources Information Center

    Kamin, Lawrence F.

    2010-01-01

    Many statistics texts pose inferential statistical problems in a disjointed way. By using a simple five-step procedure as a template for statistical inference problems, the student can solve problems in an organized fashion. The problem and its solution will thus be a stand-by-itself organic whole and a single unit of thought and effort. The…

  16. Comparison of Efficacy of Eye Movement, Desensitization and Reprocessing and Cognitive Behavioral Therapy Therapeutic Methods for Reducing Anxiety and Depression of Iranian Combatant Afflicted by Post Traumatic Stress Disorder

    NASA Astrophysics Data System (ADS)

    Narimani, M.; Sadeghieh Ahari, S.; Rajabi, S.

    This research aims to determine and compare the efficacy of two therapeutic methods, Eye Movement Desensitization and Reprocessing (EMDR) and Cognitive Behavioral Therapy (CBT), for reducing anxiety and depression in Iranian combatants afflicted with Post-Traumatic Stress Disorder (PTSD) after the imposed war. The statistical population of the study comprised combatants with PTSD who had been hospitalized in the Isar Hospital of Ardabil province or who lived in Ardabil. These participants were selected through simple random sampling and were randomly assigned to three groups. The method was an extended experimental test, and the study design was a multi-group test-retest design. The instrument used was the Hospital Anxiety and Depression Scale. The survey showed that EMDR and CBT both produced a significant reduction in anxiety and depression.

  17. Self-diffusion in periodic porous media: a comparison of numerical simulation and eigenvalue methods.

    PubMed

    Schwartz, L M; Bergman, D J; Dunn, K J; Mitra, P P

    1996-01-01

    Random walk computer simulations are an important tool in understanding magnetic resonance measurements in porous media. In this paper we focus on the description of pulsed field gradient spin echo (PGSE) experiments that measure the probability, P(R,t), that a diffusing water molecule will travel a distance R in a time t. Because PGSE simulations are often limited by statistical considerations, we will see that valuable insight can be gained by working with simple periodic geometries and comparing simulation data to the results of exact eigenvalue expansions. In this connection, our attention will be focused on (1) the wavevector, k, and time dependent magnetization, M(k, t); and (2) the normalized probability, Ps(delta R, t), that a diffusing particle will return to within delta R of the origin after time t.

  18. Complexity-entropy causality plane: A useful approach for distinguishing songs

    NASA Astrophysics Data System (ADS)

    Ribeiro, Haroldo V.; Zunino, Luciano; Mendes, Renio S.; Lenzi, Ervin K.

    2012-04-01

    Nowadays we are often faced with huge databases resulting from the rapid growth of data storage technologies. This is particularly true when dealing with music databases. In this context, it is essential to have techniques and tools able to discriminate properties from these massive sets. In this work, we report on a statistical analysis of more than ten thousand songs aiming to obtain a complexity hierarchy. Our approach is based on the estimation of the permutation entropy combined with an intensive complexity measure, building up the complexity-entropy causality plane. The results obtained indicate that this representation space is very promising to discriminate songs as well as to allow a relative quantitative comparison among songs. Additionally, we believe that the method reported here may be applied in practical situations since it is simple, robust and has a fast numerical implementation.
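
    The permutation entropy underlying this causality plane is straightforward to estimate; the sketch below computes a normalized permutation entropy for an arbitrary signal. The embedding dimension and the test signals are illustrative choices, not those used in the study.

        import math
        from collections import Counter
        import numpy as np

        def permutation_entropy(x, d=4):
            """Normalized Shannon entropy of the ordinal patterns of length d in x."""
            patterns = Counter(
                tuple(np.argsort(x[i:i + d])) for i in range(len(x) - d + 1)
            )
            total = sum(patterns.values())
            h = -sum((c / total) * math.log(c / total) for c in patterns.values())
            return h / math.log(math.factorial(d))   # 1.0 for white noise, lower for ordered signals

        rng = np.random.default_rng(1)
        print(permutation_entropy(rng.normal(size=5000)))             # close to 1
        print(permutation_entropy(np.sin(np.linspace(0, 60, 5000))))  # much smaller

    The complexity-entropy plane then pairs this entropy with a statistical complexity (disequilibrium) measure, so that signals with similar entropy but different structure can still be separated.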

  19. Plotting equation for gaussian percentiles and a spreadsheet program for generating probability plots

    USGS Publications Warehouse

    Balsillie, J.H.; Donoghue, J.F.; Butler, K.M.; Koch, J.L.

    2002-01-01

    Two-dimensional plotting tools can be of invaluable assistance in analytical scientific pursuits, and have been widely used in the analysis and interpretation of sedimentologic data. We consider, in this work, the use of arithmetic probability paper (APP). Most statistical computer applications do not allow for the generation of APP plots, because of apparent intractable nonlinearity of the percentile (or probability) axis of the plot. We have solved this problem by identifying an equation(s) for determining plotting positions of Gaussian percentiles (or probabilities), so that APP plots can easily be computer generated. An EXCEL example is presented, and a programmed, simple-to-use EXCEL application template is hereby made publicly available, whereby a complete granulometric analysis including data listing, moment measure calculations, and frequency and cumulative APP plots, is automatically produced.
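
    The core of the approach is that transforming cumulative percentages through the inverse Gaussian CDF (the probit) makes the probability axis linear, so normally distributed data plot as a straight line. The sketch below illustrates this with simulated grain-size data and SciPy's norm.ppf; it is not the published EXCEL template.

        import numpy as np
        from scipy.stats import norm

        grain_sizes = np.sort(np.random.default_rng(2).normal(loc=2.0, scale=0.5, size=200))
        cum_pct = (np.arange(1, grain_sizes.size + 1) - 0.5) / grain_sizes.size  # plotting positions
        probit_axis = norm.ppf(cum_pct)   # linearizes the percentile axis of the APP plot

        # A straight-line fit in (probit, grain size) space recovers the mean and standard deviation:
        slope, intercept = np.polyfit(probit_axis, grain_sizes, 1)
        print(f"estimated mean ~ {intercept:.2f}, standard deviation ~ {slope:.2f}")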

  20. Accelerating the weighted histogram analysis method by direct inversion in the iterative subspace.

    PubMed

    Zhang, Cheng; Lai, Chun-Liang; Pettitt, B Montgomery

    The weighted histogram analysis method (WHAM) for free energy calculations is a valuable tool to produce free energy differences with minimal errors. Given multiple simulations, WHAM obtains from the distribution overlaps the optimal statistical estimator of the density of states, from which the free energy differences can be computed. The WHAM equations are often solved by an iterative procedure. In this work, we use a well-known linear algebra algorithm which allows for more rapid convergence to the solution. We find that the computational complexity of the iterative solution to WHAM and the closely related multiple Bennett acceptance ratio (MBAR) method can be improved by using the method of direct inversion in the iterative subspace. We give examples from a lattice model, a simple liquid and an aqueous protein solution.
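
    For reference, the sketch below shows the plain self-consistent WHAM iteration that such acceleration schemes speed up; it is a minimal one-dimensional umbrella-sampling example with simulated data, not the DIIS-accelerated implementation described in the paper.

        import numpy as np

        rng = np.random.default_rng(3)
        kT = 1.0
        centers = np.linspace(-2, 2, 5)            # umbrella window centers
        k_spring = 10.0
        samples = [rng.normal(c, np.sqrt(kT / k_spring), 2000) for c in centers]

        edges = np.linspace(-3, 3, 61)
        mids = 0.5 * (edges[:-1] + edges[1:])
        counts = np.array([np.histogram(s, edges)[0] for s in samples])   # n_k(x)
        N = counts.sum(axis=1)                                            # samples per window
        bias = 0.5 * k_spring * (mids[None, :] - centers[:, None]) ** 2   # U_k(x)

        f = np.zeros(len(centers))
        for _ in range(2000):                       # plain fixed-point iteration (no DIIS)
            denom = (N[:, None] * np.exp((f[:, None] - bias) / kT)).sum(axis=0)
            p = counts.sum(axis=0) / denom          # unbiased distribution estimate
            f_new = -kT * np.log((p[None, :] * np.exp(-bias / kT)).sum(axis=1))
            f_new -= f_new[0]                       # fix the arbitrary offset
            if np.max(np.abs(f_new - f)) < 1e-8:
                break
            f = f_new

        nonzero = p > 0
        free_energy = -kT * np.log(p[nonzero] / p[nonzero].max())   # free-energy profile

    The DIIS idea is to replace the plain update of f with an extrapolation built from the history of previous iterates and their residuals, which typically reaches the same fixed point in far fewer iterations.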

  1. A Framework for Assessing High School Students' Statistical Reasoning.

    PubMed

    Chan, Shiau Wei; Ismail, Zaleha; Sumintono, Bambang

    2016-01-01

    Based on a synthesis of literature, earlier studies, analyses and observations on high school students, this study developed an initial framework for assessing students' statistical reasoning about descriptive statistics. Framework descriptors were established across five levels of statistical reasoning and four key constructs. The former consisted of idiosyncratic reasoning, verbal reasoning, transitional reasoning, procedural reasoning, and integrated process reasoning. The latter include describing data, organizing and reducing data, representing data, and analyzing and interpreting data. In contrast to earlier studies, this initial framework formulated a complete and coherent statistical reasoning framework. A statistical reasoning assessment tool was then constructed from this initial framework. The tool was administered to 10 tenth-grade students in a task-based interview. The initial framework was refined, and the statistical reasoning assessment tool was revised. The ten students then participated in the second task-based interview, and the data obtained were used to validate the framework. The findings showed that the students' statistical reasoning levels were consistent across the four constructs, and this result confirmed the framework's cohesion. Developed to contribute to statistics education, this newly developed statistical reasoning framework provides a guide for planning learning goals and designing instruction and assessments.

  2. A Framework for Assessing High School Students' Statistical Reasoning

    PubMed Central

    2016-01-01

    Based on a synthesis of literature, earlier studies, analyses and observations on high school students, this study developed an initial framework for assessing students’ statistical reasoning about descriptive statistics. Framework descriptors were established across five levels of statistical reasoning and four key constructs. The former consisted of idiosyncratic reasoning, verbal reasoning, transitional reasoning, procedural reasoning, and integrated process reasoning. The latter include describing data, organizing and reducing data, representing data, and analyzing and interpreting data. In contrast to earlier studies, this initial framework formulated a complete and coherent statistical reasoning framework. A statistical reasoning assessment tool was then constructed from this initial framework. The tool was administered to 10 tenth-grade students in a task-based interview. The initial framework was refined, and the statistical reasoning assessment tool was revised. The ten students then participated in the second task-based interview, and the data obtained were used to validate the framework. The findings showed that the students’ statistical reasoning levels were consistent across the four constructs, and this result confirmed the framework’s cohesion. Developed to contribute to statistics education, this newly developed statistical reasoning framework provides a guide for planning learning goals and designing instruction and assessments. PMID:27812091

  3. The Feasibility of Real-Time Intraoperative Performance Assessment With SIMPL (System for Improving and Measuring Procedural Learning): Early Experience From a Multi-institutional Trial.

    PubMed

    Bohnen, Jordan D; George, Brian C; Williams, Reed G; Schuller, Mary C; DaRosa, Debra A; Torbeck, Laura; Mullen, John T; Meyerson, Shari L; Auyang, Edward D; Chipman, Jeffrey G; Choi, Jennifer N; Choti, Michael A; Endean, Eric D; Foley, Eugene F; Mandell, Samuel P; Meier, Andreas H; Smink, Douglas S; Terhune, Kyla P; Wise, Paul E; Soper, Nathaniel J; Zwischenberger, Joseph B; Lillemoe, Keith D; Dunnington, Gary L; Fryer, Jonathan P

    Intraoperative performance assessment of residents is of growing interest to trainees, faculty, and accreditors. Current approaches to collect such assessments are limited by low participation rates and long delays between procedure and evaluation. We deployed an innovative, smartphone-based tool, SIMPL (System for Improving and Measuring Procedural Learning), to make real-time intraoperative performance assessment feasible for every case in which surgical trainees participate, and hypothesized that SIMPL could be feasibly integrated into surgical training programs. Between September 1, 2015 and February 29, 2016, 15 U.S. general surgery residency programs were enrolled in an institutional review board-approved trial. SIMPL was made available after 70% of faculty and residents completed a 1-hour training session. Descriptive and univariate statistics analyzed multiple dimensions of feasibility, including training rates, volume of assessments, response rates/times, and dictation rates. The 20 most active residents and attendings were evaluated in greater detail. A total of 90% of eligible users (1267/1412) completed training. Further, 13/15 programs began using SIMPL. Totally, 6024 assessments were completed by 254 categorical general surgery residents (n = 3555 assessments) and 259 attendings (n = 2469 assessments), and 3762 unique operations were assessed. There was significant heterogeneity in participation within and between programs. Mean percentage (range) of users who completed ≥1, 5, and 20 assessments were 62% (21%-96%), 34% (5%-75%), and 10% (0%-32%) across all programs, and 96%, 75%, and 32% in the most active program. Overall, response rate was 70%, dictation rate was 24%, and mean response time was 12 hours. Assessments increased from 357 (September 2015) to 1146 (February 2016). The 20 most active residents each received mean 46 assessments by 10 attendings for 20 different procedures. SIMPL can be feasibly integrated into surgical training programs to enhance the frequency and timeliness of intraoperative performance assessment. We believe SIMPL could help facilitate a national competency-based surgical training system, although local and systemic challenges still need to be addressed. Copyright © 2016. Published by Elsevier Inc.

  4. BOOK REVIEW: Critical Phenomena in Natural Sciences: Chaos, Fractals, Selforganization and Disorder: Concepts and Tools

    NASA Astrophysics Data System (ADS)

    Franz, S.

    2004-10-01

    Since the discovery of the renormalization group theory in statistical physics, the realm of applications of the concepts of scale invariance and criticality has pervaded several fields of natural and social sciences. This is the leitmotiv of Didier Sornette's book, who in Critical Phenomena in Natural Sciences reviews three decades of developments and applications of the concepts of criticality, scale invariance and power law behaviour from statistical physics, to earthquake prediction, ruptures, plate tectonics, modelling biological and economic systems and so on. This strongly interdisciplinary book addresses students and researchers in disciplines where concepts of criticality and scale invariance are appropriate: mainly geology from which most of the examples are taken, but also engineering, biology, medicine, economics, etc. A good preparation in quantitative science is assumed but the presentation of statistical physics principles, tools and models is self-contained, so that little background in this field is needed. The book is written in a simple informal style encouraging intuitive comprehension rather than stressing formal derivations. Together with the discussion of the main conceptual results of the discipline, great effort is devoted to providing applied scientists with the tools of data analysis and modelling necessary to analyse, understand, make predictions and simulate systems undergoing complex collective behaviour. The book starts from a purely descriptive approach, explaining basic probabilistic and geometrical tools to characterize power law behaviour and scale invariant sets. Probability theory is introduced by a detailed discussion of interpretative issues warning the reader on the use and misuse of probabilistic concepts when the emphasis is on prediction of low probability rare---and often catastrophic---events. Then, concepts that have proved useful in risk evaluation, extreme value statistics, large limit theorems for sums of independent variables with power law distribution, random walks, fractals and multifractal formalisms, etc, are discussed in an immediate and direct way so as to provide ready-to-use tools for analysing and representing power law behaviour in natural phenomena. The exposition then continues discussing the main developments, allowing the reader to understand theoretically and model strongly correlated behaviour. After a concise, but useful, introduction to the fundamentals of statistical physics a discussion of equilibrium critical phenomena and the renormalization group is proposed to the reader. With the centrality of the problem of non-equilibrium behaviour in mind, a discussion is devoted to tentative applications of the concept of temperature in the off-equilibrium context. Particular emphasis is given to the development of long range correlation and of precursors of phase transitions, and their role in the prediction of catastrophic events. Then, basic models such as percolation and rupture models are described. A central position in the book is occupied by a chapter on mechanisms for power laws and a subsequent one on self-organized criticality as a general paradigm for critical behaviour as proposed by P Bak and collaborators. The book concludes with a chapter on the prediction of fields generated by a random distribution of sources. The book maintains the promise of the title of providing concepts and tools to tackle criticality and self-organization. 
The second edition, while retaining the structure of the first edition, considerably extends the scope with new examples and applications of a research field which is constantly growing. Any scientific book has to solve the dichotomy between the depth of discussion, the pedagogical character of exposition and the quantity of material discussed. In general the book, which evolved from a graduate student course, favours these last two aspects at the expense of the first one. This makes the book very readable and means that, while complicated concepts are always explained by means of simple examples, important results are often mentioned but not derived or discussed in depth. Most of the time this style of exposition manages to successfully convey the essential information, other times unfortunately, e.g. in the case of the chapter on disordered systems, the presentation appears rather superficial. This is the price we pay for a book covering an impressively vast subject area and the huge bibliography (more than 1000 references) furnishes a necessary guide for acquiring the working knowledge of the subject covered. I would recommend it to teachers planning introductory courses on the field of complex systems and to researchers wanting to learn about an area of great contemporary interest.

  5. Statistics Using Just One Formula

    ERIC Educational Resources Information Center

    Rosenthal, Jeffrey S.

    2018-01-01

    This article advocates that introductory statistics be taught by basing all calculations on a single simple margin-of-error formula and deriving all of the standard introductory statistical concepts (confidence intervals, significance tests, comparisons of means and proportions, etc) from that one formula. It is argued that this approach will…
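
    The single formula the article builds everything on is a margin-of-error expression; a common conservative version for a sample proportion is roughly 1/sqrt(n), since 1.96·sqrt(p(1-p)/n) is at most 1/sqrt(n) for any p. Whether or not this is exactly the form the article uses, the sketch below shows the style of reasoning.

        import math

        def margin_of_error(n):
            """Conservative 95% margin of error for a proportion from a sample of size n."""
            return 1.0 / math.sqrt(n)

        # Confidence interval for a poll in which 520 of 1000 respondents say "yes".
        n, yes = 1000, 520
        p_hat, moe = yes / n, margin_of_error(n)
        print(f"95% CI for the true proportion: {p_hat - moe:.3f} to {p_hat + moe:.3f}")

    The same interval logic supports informal significance tests and comparisons of two proportions (intervals that do not overlap indicate a difference larger than chance would easily produce), which is how the one-formula approach reaches the standard introductory topics.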

  6. NIRS-SPM: statistical parametric mapping for near infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Tak, Sungho; Jang, Kwang Eun; Jung, Jinwook; Jang, Jaeduck; Jeong, Yong; Ye, Jong Chul

    2008-02-01

    Even though there exists a powerful statistical parametric mapping (SPM) tool for fMRI, similar public domain tools are not available for near infrared spectroscopy (NIRS). In this paper, we describe a new public domain statistical toolbox called NIRS-SPM for quantitative analysis of NIRS signals. Specifically, NIRS-SPM statistically analyzes the NIRS data using the GLM and makes inferences from the excursion probability of the random field that is interpolated from the sparse measurements. In order to obtain correct inference, NIRS-SPM offers pre-coloring and pre-whitening methods for temporal correlation estimation. For simultaneous recording of the NIRS signal with fMRI, the spatial mapping between the fMRI image and real coordinates from a 3-D digitizer is estimated using Horn's algorithm. These powerful tools allow super-resolution localization of brain activation, which is not possible using conventional NIRS analysis tools.
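
    At the heart of the GLM step is an ordinary least-squares fit of each channel's time series to a design matrix, followed by a t-statistic for a contrast of interest. The sketch below illustrates that step on simulated data; it is a generic GLM illustration, not the NIRS-SPM interface.

        import numpy as np

        rng = np.random.default_rng(4)
        n = 300
        task = (np.arange(n) % 60 < 30).astype(float)        # boxcar task regressor
        X = np.column_stack([task, np.ones(n)])              # design matrix (task + constant)
        y = 0.8 * task + rng.normal(0, 1.0, n)                # simulated channel signal

        beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
        dof = n - X.shape[1]
        sigma2 = res[0] / dof                                 # residual variance
        c = np.array([1.0, 0.0])                              # contrast: task effect
        var_c = sigma2 * c @ np.linalg.inv(X.T @ X) @ c
        t_stat = (c @ beta) / np.sqrt(var_c)
        print(f"task beta = {beta[0]:.2f}, t = {t_stat:.2f} on {dof} dof")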

  7. Simple taper: Taper equations for the field forester

    Treesearch

    David R. Larsen

    2017-01-01

    "Simple taper" is a set of linear equations that are based on stem taper rates; the intent is to provide taper equation functionality to field foresters. The equation parameters are two taper rates based on differences in diameter outside bark at two points on a tree. The simple taper equations are statistically equivalent to more complex equations. The linear...

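    A hedged sketch of the kind of calculation this implies: a taper rate derived from two outside-bark diameter measurements, then used to predict diameter at other heights. The simple linear form below is illustrative only, not the published equation set.

        def taper_rate(d_lower, h_lower, d_upper, h_upper):
            """Diameter change per unit height between two measurement points (cm per m)."""
            return (d_lower - d_upper) / (h_upper - h_lower)

        def diameter_at(h, d_lower, h_lower, rate):
            """Predicted outside-bark diameter at height h, assuming a constant taper rate."""
            return d_lower - rate * (h - h_lower)

        # Example: 32 cm at breast height (1.37 m) and 24 cm four metres higher.
        rate = taper_rate(32.0, 1.37, 24.0, 5.37)       # 2.0 cm per metre
        print(diameter_at(10.0, 32.0, 1.37, rate))      # about 14.7 cm at 10 m
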
  8. Using Simple Linear Regression to Assess the Success of the Montreal Protocol in Reducing Atmospheric Chlorofluorocarbons

    ERIC Educational Resources Information Center

    Nelson, Dean

    2009-01-01

    Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
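
    A minimal sketch of the classroom exercise this describes: fit a straight line to annual atmospheric CFC concentrations and examine the sign and significance of the slope. The numbers below are simulated placeholders, not the archival data used in the article.

        import numpy as np
        from scipy import stats

        years = np.arange(1995, 2010)
        cfc11_ppt = 268 - 0.9 * (years - 1995) + np.random.default_rng(5).normal(0, 0.5, years.size)

        result = stats.linregress(years, cfc11_ppt)
        print(f"slope = {result.slope:.2f} ppt/year, p = {result.pvalue:.2g}")
        # A significantly negative slope is the evidence of post-Protocol decline
        # that students are asked to evaluate.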

  9. The GenABEL Project for statistical genomics.

    PubMed

    Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.

  10. A simple and inexpensive external fixator.

    PubMed

    Noor, M A

    1988-11-01

    A simple and inexpensive external fixator has been designed. It is constructed of galvanized iron pipe and mild steel bolts and nuts. It can easily be manufactured in a hospital workshop with a minimum of tools.

  11. Tools for Interdisciplinary Data Assimilation and Sharing in Support of Hydrologic Science

    NASA Astrophysics Data System (ADS)

    Blodgett, D. L.; Walker, J.; Suftin, I.; Warren, M.; Kunicki, T.

    2013-12-01

    Information consumed and produced in hydrologic analyses is interdisciplinary and massive. These factors put a heavy information management burden on the hydrologic science community. The U.S. Geological Survey (USGS) Office of Water Information Center for Integrated Data Analytics (CIDA) seeks to assist hydrologic science investigators with all components of their scientific data management life cycle. Ongoing data publication and software development projects will be presented demonstrating publicly available data access services and manipulation tools being developed with support from two Department of the Interior initiatives. The USGS-led National Water Census seeks to provide both data and tools in support of nationally consistent water availability estimates. Newly available data include national coverages of radar-indicated precipitation, actual evapotranspiration, water use estimates aggregated by county, and South East region estimates of streamflow for 12-digit hydrologic unit code watersheds. Web services making these data available and applications to access them will be demonstrated. Web-available processing services able to provide numerous streamflow statistics for any USGS daily flow record or model result time series and other National Water Census processing tools will also be demonstrated. The National Climate Change and Wildlife Science Center is a USGS center leading DOI-funded academic global change adaptation research. It has a mission goal to ensure that data used and produced by funded projects are available via web services and tools that streamline data management tasks in interdisciplinary science. For example, collections of downscaled climate projections, typically large collections of files that must be downloaded to be accessed, are being published using web services that allow access to the entire dataset via simple web-service requests and numerous processing tools. Recent progress on this front includes data web services for Climate Model Intercomparison Phase 5 based downscaled climate projections, EPA's Integrated Climate and Land Use Scenarios projections of population and land cover metrics, and MODIS-derived land cover parameters from NASA's Land Processes Distributed Active Archive Center. These new services, and ways to discover others, will be presented through demonstration of a recently open-sourced project from a web application or scripted workflow. Development and public deployment of server-based processing tools to subset and summarize these and other data is ongoing at the CIDA with partner groups such as 52 Degrees North and Unidata. The latest progress on subsetting, spatial summarization to areas of interest, and temporal summarization via common statistical methods will be presented.

  12. What's Inside?

    ERIC Educational Resources Information Center

    Sigford, Ann; Nelson, Nancy

    1998-01-01

    Presents a program for elementary teachers to learn how to use hand tools and household appliances to teach the principles of physics. The lesson helps teachers become familiar with simple hand tools, combat the apprehension of mechanical devices, and develop an interest in tools and technology. Session involves disassembling appliances to…

  13. Development and Validation of the Texas Best Management Practice Evaluation Tool (TBET)

    USDA-ARS?s Scientific Manuscript database

    Conservation planners need simple yet accurate tools to predict sediment and nutrient losses from agricultural fields to guide conservation practice implementation and increase cost-effectiveness. The Texas Best management practice Evaluation Tool (TBET), which serves as an input/output interpreter...

  14. Asymptotically Optimal and Private Statistical Estimation

    NASA Astrophysics Data System (ADS)

    Smith, Adam

    Differential privacy is a definition of "privacy" for statistical databases. The definition is simple, yet it implies strong semantics even in the presence of an adversary with arbitrary auxiliary information about the database.
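
    For context, the simplest mechanism satisfying this definition is Laplace noise added to a statistic of bounded sensitivity; the sketch below releases a differentially private count. It illustrates the definition only, not the asymptotically optimal estimators discussed in the talk.

        import numpy as np

        def private_count(values, predicate, epsilon, rng):
            """Release a count with epsilon-differential privacy (sensitivity 1)."""
            true_count = sum(1 for v in values if predicate(v))
            return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

        rng = np.random.default_rng(6)
        ages = rng.integers(18, 90, size=10_000)
        print(private_count(ages, lambda a: a >= 65, epsilon=0.1, rng=rng))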

  15. Equivalent circuit models for interpreting impedance perturbation spectroscopy data

    NASA Astrophysics Data System (ADS)

    Smith, R. Lowell

    2004-07-01

    As in-situ structural integrity monitoring disciplines mature, there is a growing need to process sensor/actuator data efficiently in real time. Although smaller, faster embedded processors will contribute to this, it is also important to develop straightforward, robust methods to reduce the overall computational burden for practical applications of interest. This paper addresses the use of equivalent circuit modeling techniques for inferring structure attributes monitored using impedance perturbation spectroscopy. In pioneering work about ten years ago significant progress was associated with the development of simple impedance models derived from the piezoelectric equations. Using mathematical modeling tools currently available from research in ultrasonics and impedance spectroscopy is expected to provide additional synergistic benefits. For purposes of structural health monitoring the objective is to use impedance spectroscopy data to infer the physical condition of structures to which small piezoelectric actuators are bonded. Features of interest include stiffness changes, mass loading, and damping or mechanical losses. Equivalent circuit models are typically simple enough to facilitate the development of practical analytical models of the actuator-structure interaction. This type of parametric structure model allows raw impedance/admittance data to be interpreted optimally using standard multiple, nonlinear regression analysis. One potential long-term outcome is the possibility of cataloging measured viscoelastic properties of the mechanical subsystems of interest as simple lists of attributes and their statistical uncertainties, whose evolution can be followed in time. Equivalent circuit models are well suited for addressing calibration and self-consistency issues such as temperature corrections, Poisson mode coupling, and distributed relaxation processes.

  16. Sequence History Update Tool

    NASA Technical Reports Server (NTRS)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.

  17. Statistical Tutorial | Center for Cancer Research

    Cancer.gov

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Tutorial (ST) is designed as a follow-up to Statistical Analysis of Research Data (SARD), held in April 2018. The tutorial will apply the general principles of statistical analysis of research data, including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and the Chi-Squared distribution.
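
    As a small worked example of one of the listed topics, the sketch below runs a two-sample t-test of a mean difference; the measurements are simulated placeholders.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(7)
        control = rng.normal(10.0, 2.0, 30)
        treated = rng.normal(11.5, 2.0, 30)

        t, p = stats.ttest_ind(treated, control, equal_var=False)   # Welch's two-sample t-test
        print(f"t = {t:.2f}, p = {p:.3g}")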

  18. Benchmarking of Decision-Support Tools Used for Tiered Sustainable Remediation Appraisal.

    PubMed

    Smith, Jonathan W N; Kerrison, Gavin

    2013-01-01

    Sustainable remediation comprises soil and groundwater risk-management actions that are selected, designed, and operated to maximize net environmental, social, and economic benefit (while assuring protection of human health and safety). This paper describes a benchmarking exercise to comparatively assess potential differences in environmental management decision making resulting from application of different sustainability appraisal tools ranging from simple (qualitative) to more quantitative (multi-criteria and fully monetized cost-benefit analysis), as outlined in the SuRF-UK framework. The appraisal tools were used to rank remedial options for risk management of a subsurface petroleum release that occurred at a petrol filling station in central England. The remediation options were benchmarked using a consistent set of soil and groundwater data for each tier of sustainability appraisal. The ranking of remedial options was very similar in all three tiers, and an environmental management decision to select the most sustainable options at tier 1 would have been the same decision at tiers 2 and 3. The exercise showed that, for relatively simple remediation projects, a simple sustainability appraisal led to the same remediation option selection as more complex appraisal, and can be used to reliably inform environmental management decisions on other relatively simple land contamination projects.

  19. DMET-analyzer: automatic analysis of Affymetrix DMET data.

    PubMed

    Guzzi, Pietro Hiram; Agapito, Giuseppe; Di Martino, Maria Teresa; Arbitrio, Mariamena; Tassone, Pierfrancesco; Tagliaferri, Pierosandro; Cannataro, Mario

    2012-10-05

    Clinical Bioinformatics is currently growing and is based on the integration of clinical and omics data aiming at the development of personalized medicine. Thus the introduction of novel technologies able to investigate the relationship between clinical states and biological machineries may help the development of this field. For instance the Affymetrix DMET platform (drug metabolism enzymes and transporters) is able to study the relationship between variation in the genome of patients and drug metabolism, detecting SNPs (Single Nucleotide Polymorphisms) on genes related to drug metabolism. This may allow, for instance, the identification of genetic variants in patients who present different drug responses, in pharmacogenomics and clinical studies. Despite this, open-source algorithms and tools for the analysis of DMET data are currently lacking. Existing software tools for DMET data generally allow only the preprocessing of binary data (e.g. the DMET-Console provided by Affymetrix) and simple data analysis operations, but do not allow testing of the association between the presence of SNPs and the response to drugs. We developed DMET-Analyzer, a tool for the automatic association analysis between variation in patient genomes and the clinical conditions of patients, i.e. their different responses to drugs. The proposed system allows: (i) automation of the workflow of analysis of DMET-SNP data, avoiding the use of multiple tools; (ii) automatic annotation of DMET-SNP data and search in existing SNP databases (e.g. dbSNP); (iii) association of SNPs with pathways through searches in PharmaGKB, a major knowledge base for pharmacogenomic studies. DMET-Analyzer has a simple graphical user interface that allows users (doctors/biologists) to upload and analyse DMET files produced by the Affymetrix DMET-Console in an interactive way. The effectiveness and ease of use of DMET-Analyzer are demonstrated through different case studies regarding the analysis of clinical datasets produced in the University Hospital of Catanzaro, Italy. DMET-Analyzer is a novel tool able to automatically analyse data produced by the DMET platform in case-control association studies. Using such a tool, users avoid the manual execution of multiple statistical tests, reducing both possible errors and the amount of time needed for a whole experiment. Moreover, annotations and direct links to external databases may increase the biological knowledge extracted. The system is freely available for academic purposes at: https://sourceforge.net/projects/dmetanalyzer/files/

  20. The taxonomy statistic uncovers novel clinical patterns in a population of ischemic stroke patients.

    PubMed

    Tukiendorf, Andrzej; Kaźmierski, Radosław; Michalak, Sławomir

    2013-01-01

    In this paper, we describe a simple taxonomic approach for clinical data mining elaborated by Marczewski and Steinhaus (M-S), whose performance equals that of the advanced statistical methodology known as the expectation-maximization (E-M) algorithm. We tested these two methods on a cohort of ischemic stroke patients. The comparison of both methods revealed strong agreement. Direct agreement between M-S and E-M classifications reached 83%, while Cohen's coefficient of agreement was κ = 0.766 (P < 0.0001). The statistical analysis conducted and the outcomes obtained in this paper revealed novel clinical patterns in ischemic stroke patients. The aim of the study was to evaluate the clinical usefulness of Marczewski-Steinhaus' taxonomic approach as a tool for the detection of novel patterns of data in ischemic stroke patients and the prediction of disease outcome. Using rough patient characteristics, namely age, National Institutes of Health Stroke Scale (NIHSS) score, and diabetes mellitus (DM) status, four fairly frequent types of stroke patients are recognized that cannot be identified by means of routine clinical methods. Based on the taxonomic outcomes, a strong correlation is established between patients' health status at the moment of admission to the emergency department (ED) and their subsequent recovery. Moreover, popularization and simplification of the ideas of advanced mathematicians may provide an unconventional explorative platform for clinical problems.
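
    The agreement statistics quoted here are easy to reproduce for any pair of classifications; the sketch below computes direct agreement and Cohen's kappa. The labels are invented, not the study data.

        from collections import Counter

        def cohens_kappa(a, b):
            """Return (observed agreement, Cohen's kappa) for two label sequences."""
            n = len(a)
            observed = sum(x == y for x, y in zip(a, b)) / n
            pa, pb = Counter(a), Counter(b)
            expected = sum(pa[k] * pb.get(k, 0) for k in pa) / n ** 2
            return observed, (observed - expected) / (1 - expected)

        ms_labels = [1, 1, 2, 3, 2, 4, 4, 1, 3, 2, 2, 1]   # hypothetical M-S classification
        em_labels = [1, 1, 2, 3, 2, 4, 3, 1, 3, 2, 1, 1]   # hypothetical E-M classification
        agreement, kappa = cohens_kappa(ms_labels, em_labels)
        print(f"direct agreement = {agreement:.0%}, kappa = {kappa:.3f}")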

  1. Non-extensivity and complexity in the earthquake activity at the West Corinth rift (Greece)

    NASA Astrophysics Data System (ADS)

    Michas, Georgios; Vallianatos, Filippos; Sammonds, Peter

    2013-04-01

    Earthquakes exhibit complex phenomenology that is revealed by the fractal structure in space, time and magnitude. For that reason, tools other than simple Poissonian statistics seem more appropriate to describe the statistical properties of the phenomenon. Here we use Non-Extensive Statistical Physics [NESP] to investigate the inter-event time distribution of the earthquake activity at the west Corinth rift (central Greece). This area is one of the most seismotectonically active areas in Europe, with an important continental N-S extension and high seismicity rates. The NESP concept refers to the non-additive Tsallis entropy Sq, which includes the Boltzmann-Gibbs entropy as a particular case. This concept has been successfully used for the analysis of a variety of complex dynamic systems including earthquakes, where fractality and long-range interactions are important. The analysis indicates that the cumulative inter-event time distribution can be successfully described with NESP, implying the complexity that characterizes the temporal occurrences of earthquakes. Furthermore, we use the Tsallis entropy (Sq) and the Fisher Information Measure (FIM) to investigate the complexity that characterizes the inter-event time distribution through different time windows along the evolution of the seismic activity at the West Corinth rift. The results of this analysis reveal a different level of organization and clusterization of the seismic activity in time. Acknowledgments. GM wishes to acknowledge the partial support of the Greek State Scholarships Foundation (IKY).

  2. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.

    PubMed

    Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi

    2016-05-01

    Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However the underlying assumptions of the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use lattice protein model (LP) to benchmark those inverse statistical approaches. We build MSA of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSA. We find that inferred Potts Hamiltonians reproduce many important aspects of 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models used as protein Hamiltonian for design of new sequences are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. Those are remarkable results as the effective LP Hamiltonians used to generate MSA are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.

  3. Open-source platform to benchmark fingerprints for ligand-based virtual screening

    PubMed Central

    2013-01-01

    Similarity-search methods using molecular fingerprints are an important tool for ligand-based virtual screening. A huge variety of fingerprints exist and their performance, usually assessed in retrospective benchmarking studies using data sets with known actives and known or assumed inactives, depends largely on the validation data sets used and the similarity measure used. Comparing new methods to existing ones in any systematic way is rather difficult due to the lack of standard data sets and evaluation procedures. Here, we present a standard platform for the benchmarking of 2D fingerprints. The open-source platform contains all source code, structural data for the actives and inactives used (drawn from three publicly available collections of data sets), and lists of randomly selected query molecules to be used for statistically valid comparisons of methods. This allows the exact reproduction and comparison of results for future studies. The results for 12 standard fingerprints together with two simple baseline fingerprints assessed by seven evaluation methods are shown together with the correlations between methods. High correlations were found between the 12 fingerprints and a careful statistical analysis showed that only the two baseline fingerprints were different from the others in a statistically significant way. High correlations were also found between six of the seven evaluation methods, indicating that despite their seeming differences, many of these methods are similar to each other. PMID:23721588
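
    The similarity measure at the core of such benchmarks is usually the Tanimoto (Jaccard) coefficient between binary fingerprints; the sketch below shows the calculation on toy bit sets, not real molecular fingerprints.

        def tanimoto(fp_a, fp_b):
            """Tanimoto (Jaccard) similarity of two sets of 'on' bit positions."""
            a, b = set(fp_a), set(fp_b)
            return len(a & b) / len(a | b)

        query   = {3, 17, 42, 97, 150, 203}
        library = {3, 17, 55, 97, 150, 301, 402}
        print(f"Tanimoto = {tanimoto(query, library):.2f}")   # 4 shared of 9 distinct bits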

  4. Self-organization of cosmic radiation pressure instability. II - One-dimensional simulations

    NASA Technical Reports Server (NTRS)

    Hogan, Craig J.; Woods, Jorden

    1992-01-01

    The clustering of statistically uniform discrete absorbing particles moving solely under the influence of radiation pressure from uniformly distributed emitters is studied in a simple one-dimensional model. Radiation pressure tends to amplify statistical clustering in the absorbers; the absorbing material is swept into empty bubbles, the biggest bubbles grow bigger almost as they would in a uniform medium, and the smaller ones get crushed and disappear. Numerical simulations of a one-dimensional system are used to support the conjecture that the system is self-organizing. Simple statistics indicate that a wide range of initial conditions produce structure approaching the same self-similar statistical distribution, whose scaling properties follow those of the attractor solution for an isolated bubble. The importance of the process for large-scale structuring of the interstellar medium is briefly discussed.

  5. Twelve essential tools for living the life of whole person health care.

    PubMed

    Schlitz, Marilyn; Valentina, Elizabeth

    2013-01-01

    The integration of body, mind, and spirit has become a key dimension of health education and disease prevention and treatment; however, our health care system remains primarily disease centered. Finding simple steps to help each of us find our own balance can improve our lives, our work, and our relationships. On the basis of interviews with health care experts at the leading edge of the new model of medicine, this article identifies simple tools to improve the health of patients and caregivers.

  6. Why would we use the Sediment Isotope Tomography (SIT) model to establish a 210Pb-based chronology in recent-sediment cores?

    PubMed

    Abril Hernández, José-María

    2015-05-01

    After half a century, the use of unsupported (210)Pb ((210)Pbexc) is still far from being a well-established dating tool for recent sediments with widespread applicability. Recent results from the statistical analysis of time series of fluxes, mass sediment accumulation rates (SAR), and initial activities, derived from varved sediments, place serious constraints on the assumption of constant fluxes, which is widely used in dating models. The Sediment Isotope Tomography (SIT) model, under the assumption of no post-depositional redistribution, is used for dating recent sediments in scenarios in which fluxes and SAR are uncorrelated and both vary with time. By using a simple graphical analysis, this paper shows that under the above assumptions, any given (210)Pbexc profile, even with the restriction of a discrete set of reference points, is compatible with an infinite number of chronological lines, thus generating an infinite number of mathematically exact solutions for histories of initial activity concentrations, SAR and fluxes onto the SWI, with the last two ranging from zero up to infinity. In particular, SIT results, without additional assumptions, cannot contain any statistically significant difference with respect to the exact solutions consisting of intervals of constant SAR or constant fluxes (both being consistent with the reference points). Therefore, there is no benefit in its use as a dating tool without the explicit introduction of additional restrictive assumptions about fluxes, SAR and/or their interrelationship. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Interactive Model Visualization for NET-VISA

    NASA Astrophysics Data System (ADS)

    Kuzma, H. A.; Arora, N. S.

    2013-12-01

    NET-VISA is a probabilistic system developed for seismic network processing of data measured on the International Monitoring System (IMS) of the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO). NET-VISA is composed of a Generative Model (GM) and an Inference Algorithm (IA). The GM is an explicit mathematical description of the relationships between various factors in seismic network analysis. Some of the relationships inside the GM are deterministic and some are statistical. Statistical relationships are described by probability distributions, the exact parameters of which (such as mean and standard deviation) are found by training NET-VISA using recent data. The IA uses the GM to evaluate the probability of various events and associations, searching for the seismic bulletin which has the highest overall probability and is consistent with a given set of measured arrivals. An Interactive Model Visualization tool (IMV) has been developed which makes 'peeking into' the GM simple and intuitive through a web-based interface. For example, it is now possible to access the probability distributions for attributes of events and arrivals, such as the detection rate for each station for each of 14 phases. It also clarifies the assumptions and prior knowledge that are incorporated into NET-VISA's event determination. When NET-VISA is retrained, the IMV will be a visual quality-control tool, both for testing that the training has been accomplished correctly and for checking that the IMS network has not changed unexpectedly. A preview of the IMV will be shown at this poster presentation. (Figure: the IMV homepage, showing the current model file and a reference image.)

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    J.A. Krommes

    Fusion physics poses an extremely challenging, practically complex problem that does not yield readily to simple paradigms. Nevertheless, various of the theoretical tools and conceptual advances emphasized at the KaufmanFest 2007 have motivated and/or found application to the development of fusion-related plasma turbulence theory. A brief historical commentary is given on some aspects of that specialty, with emphasis on the role (and limitations) of Hamiltonian/symplectic approaches, variational methods, oscillation-center theory, and nonlinear dynamics. It is shown how to extract a renormalized ponderomotive force from the statistical equations of plasma turbulence, and the possibility of a renormalized K-χ theorem is discussed. An unusual application of quasilinear theory to the problem of plasma equilibria in the presence of stochastic magnetic fields is described. The modern problem of zonal-flow dynamics illustrates a confluence of several techniques, including (i) the application of nonlinear-dynamics methods, especially center-manifold theory, to the problem of the transition to plasma turbulence in the face of self-generated zonal flows; and (ii) the use of Hamiltonian formalism to determine the appropriate (Casimir) invariant to be used in a novel wave-kinetic analysis of systems of interacting zonal flows and drift waves. Recent progress in the theory of intermittent chaotic statistics and the generation of coherent structures from turbulence is mentioned, and an appeal is made for some new tools to cope with these interesting and difficult problems in nonlinear plasma physics. Finally, the important influence of the intellectually stimulating research environment fostered by Prof. Allan Kaufman on the author's thinking and teaching methodology is described.

  9. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance

    NASA Technical Reports Server (NTRS)

    Genuardi, Michael T.

    1993-01-01

    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  10. S-SPatt: simple statistics for patterns on Markov chains.

    PubMed

    Nuel, Grégory

    2005-07-01

    S-SPatt allows the counting of pattern occurrences in text files and, assuming these texts are generated from a random Markovian source, the computation of the P-value of a given observation using a simple binomial approximation.
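
    The binomial approximation in question is simple to reproduce: if a pattern has per-position probability p under the background model, the P-value of observing at least k occurrences in a text of length n is a binomial tail. The numbers below are illustrative.

        from scipy.stats import binom

        text_length = 100_000
        pattern_prob = 1 / 4 ** 6      # e.g. one specific 6-letter DNA word under a uniform background
        observed = 40

        p_value = binom.sf(observed - 1, text_length, pattern_prob)   # P(X >= observed)
        print(f"expected ~ {text_length * pattern_prob:.1f} occurrences, P-value = {p_value:.3g}")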

  11. Statistical Tools for Fitting Models of the Population Consequences of Acoustic Disturbance to Data from Marine Mammal Populations (PCAD Tools II)

    DTIC Science & Technology

    2014-09-30

    This project will develop statistical tools to allow mathematical models of the population consequences of acoustic disturbance to be fitted to data from marine mammal populations. We will work closely with Phase II of the ONR PCAD Working Group, and will provide…

  12. Introducing SONS, a tool for operational taxonomic unit-based comparisons of microbial community memberships and structures.

    PubMed

    Schloss, Patrick D; Handelsman, Jo

    2006-10-01

    The recent advent of tools enabling statistical inferences to be drawn from comparisons of microbial communities has enabled the focus of microbial ecology to move from characterizing biodiversity to describing the distribution of that biodiversity. Although statistical tools have been developed to compare community structures across a phylogenetic tree, we lack tools to compare the memberships and structures of two communities at a particular operational taxonomic unit (OTU) definition. Furthermore, current tests of community structure do not indicate the similarity of the communities but only report the probability of a statistical hypothesis. Here we present a computer program, SONS, which implements nonparametric estimators for the fraction and richness of OTUs shared between two communities.

  13. Software Used to Generate Cancer Statistics - SEER Cancer Statistics

    Cancer.gov

    Videos that highlight topics and trends in cancer statistics and definitions of statistical terms. Also software tools for analyzing and reporting cancer statistics, which are used to compile SEER's annual reports.

  14. Publication bias in situ

    PubMed Central

    Phillips, Carl V

    2004-01-01

    Background Publication bias, as typically defined, refers to the decreased likelihood of studies' results being published when they are near the null, not statistically significant, or otherwise "less interesting." But choices about how to analyze the data and which results to report create a publication bias within the published results, a bias I label "publication bias in situ" (PBIS). Discussion PBIS may create much greater bias in the literature than traditionally defined publication bias (the failure to publish any result from a study). The causes of PBIS are well known, consisting of various decisions about reporting that are influenced by the data. But its impact is not generally appreciated, and very little attention is devoted to it. What attention there is consists largely of rules for statistical analysis that are impractical and do not actually reduce the bias in reported estimates. PBIS cannot be reduced by statistical tools because it is not fundamentally a problem of statistics, but rather of non-statistical choices and plain language interpretations. PBIS should be recognized as a phenomenon worthy of study – it is extremely common and probably has a huge impact on results reported in the literature – and there should be greater systematic efforts to identify and reduce it. The paper presents examples, including results of a recent HIV vaccine trial, that show how easily PBIS can have a large impact on reported results, as well as how there can be no simple answer to it. Summary PBIS is a major problem, worthy of substantially more attention than it receives. There are ways to reduce the bias, but they are very seldom employed because they are largely unrecognized. PMID:15296515

  15. Monte Carlo isotopic inventory analysis for complex nuclear systems

    NASA Astrophysics Data System (ADS)

    Phruksarojanakun, Phiphat

    Monte Carlo Inventory Simulation Engine (MCise) is a newly developed method for calculating isotopic inventory of materials. It offers the promise of modeling materials with complex processes and irradiation histories, which pose challenges for current, deterministic tools, and has strong analogies to Monte Carlo (MC) neutral particle transport. The analog method, including considerations for simple, complex and loop flows, is fully developed. In addition, six variance reduction tools provide unique capabilities of MCise to improve statistical precision of MC simulations. Forced Reaction forces an atom to undergo a desired number of reactions in a given irradiation environment. Biased Reaction Branching primarily focuses on improving statistical results of the isotopes that are produced from rare reaction pathways. Biased Source Sampling aims at increasing frequencies of sampling rare initial isotopes as the starting particles. Reaction Path Splitting increases the population by splitting the atom at each reaction point, creating one new atom for each decay or transmutation product. Delta Tracking is recommended for high-frequency pulsing to reduce the computing time. Lastly, Weight Window is introduced as a strategy to decrease large deviations of weight due to the use of variance reduction techniques. A figure of merit is necessary to compare the efficiency of different variance reduction techniques. A number of possibilities for figure of merit are explored, two of which are robust and subsequently used. One is based on the relative error of a known target isotope (1/R_T²) and the other on the overall detection limit corrected by the relative error (1/(D_k R_T²)). An automated Adaptive Variance-reduction Adjustment (AVA) tool is developed to iteratively define parameters for some variance reduction techniques in a problem with a target isotope. Sample problems demonstrate that AVA improves both precision and accuracy of a target result in an efficient manner. Potential applications of MCise include molten salt fueled reactors and liquid breeders in fusion blankets. As an example, the inventory analysis of a liquid actinide fuel in the In-Zinerator, a sub-critical power reactor driven by a fusion source, is examined. The result reaffirms that MCise is a reliable tool for inventory analysis of complex nuclear systems.

  16. TSSAR: TSS annotation regime for dRNA-seq data.

    PubMed

    Amman, Fabian; Wolfinger, Michael T; Lorenz, Ronny; Hofacker, Ivo L; Stadler, Peter F; Findeiß, Sven

    2014-03-27

    Differential RNA sequencing (dRNA-seq) is a high-throughput screening technique designed to examine the architecture of bacterial operons in general and the precise position of transcription start sites (TSS) in particular. Hitherto, dRNA-seq data were analyzed by visualizing the sequencing reads mapped to the reference genome and manually annotating reliable positions. This is very labor intensive and, due to the subjectivity, biased. Here, we present TSSAR, a tool for automated de novo TSS annotation from dRNA-seq data that respects the statistics of dRNA-seq libraries. TSSAR uses the premise that the number of sequencing reads starting at a certain genomic position within a transcriptionally active region follows a Poisson distribution with a parameter that depends on the local strength of expression. The difference of two dRNA-seq library counts thus follows a Skellam distribution. This provides a statistical basis to identify significantly enriched primary transcripts. We assessed the performance by analyzing a publicly available dRNA-seq data set using TSSAR and two simple approaches that utilize user-defined score cutoffs. We evaluated the power of reproducing the manual TSS annotation. Furthermore, the same data set was used to reproduce 74 TSS in H. pylori that had been experimentally validated by reliable techniques such as RACE or primer extension. Both analyses showed that TSSAR outperforms the static cutoff-dependent approaches. Having an automated and efficient tool for analyzing dRNA-seq data facilitates the use of the dRNA-seq technique and promotes its application to more sophisticated analyses. For instance, monitoring the plasticity and dynamics of the transcriptomal architecture triggered by different stimuli and growth conditions becomes possible. The main asset of a novel tool for dRNA-seq analysis that reaches out to a broad user community is usability. As such, we provide TSSAR both as an intuitive RESTful Web service (http://rna.tbi.univie.ac.at/TSSAR), together with a set of post-processing and analysis tools, and as a stand-alone version for use in high-throughput dRNA-seq data analysis pipelines.
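
    The Poisson/Skellam reasoning above can be sketched in a few lines: if read starts in the enriched and background libraries are Poisson with local rates, their difference is Skellam-distributed, and a one-sided tail probability flags enriched positions. The parameter names, local rates and counts below are illustrative assumptions, not TSSAR's actual implementation.

      # Skellam-based enrichment test sketched from the description above
      # (illustrative rates/counts; not TSSAR's code).
      from scipy.stats import skellam

      def tss_enrichment_pvalue(count_enriched, count_background, lam_enriched, lam_background):
          """P(difference >= observed) under a Skellam null with local Poisson rates."""
          diff = count_enriched - count_background
          return skellam.sf(diff - 1, mu1=lam_enriched, mu2=lam_background)  # P(X >= diff)

      # Example: 25 vs. 4 read starts at one position, local background rate ~3 in both libraries.
      p = tss_enrichment_pvalue(25, 4, lam_enriched=3.0, lam_background=3.0)
      print(f"p-value = {p:.3g}")  # a small p-value marks a candidate TSS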

  17. Hand anthropometry of Indian women.

    PubMed

    Nag, Anjali; Nag, P K; Desai, Hina

    2003-06-01

    Data on the physical dimensions of the hands of Indian women are scanty. This information is necessary to ascertain human-machine compatibility in the design of manual systems for the bare and gloved hand, such as the design and sizing of hand tools, controls, knobs and other applications in different kinds of precision and power grips. The present study was undertaken to generate hand anthropometric data for 95 women working in informal industries (beedi, agarbatti and garment making). Fifty-one measurements of the right hand (lengths, breadths, circumferences, depths, spreads and clearances of hand and fingers) were taken using anthropometric sliding and spreading calipers, a measuring tape and a handgrip strength dynamometer. The data were statistically analyzed to determine the normality of the data and the percentile values of different hand dimensions, and simple and multiple regression analyses were done to determine the best predictors of hand length and grip strength. The hand breadths, circumferences and depths were approximately normally distributed, with some deviation in the case of the finger lengths. Hand length was significantly correlated with the fist, wrist and finger circumferences. The fist and wrist circumferences, in combination, were better predictors of hand length. The hand lengths, breadths and depths, including finger joints, of the Indian women studied were smaller than those of American, British and West Indian women. The hand circumferences of the Indian women were also smaller than those of the American women. Grip strengths of the Indian women (20.36 +/- 3.24 kg) were less than those of American, British and West Indian women. Grip strength was significantly associated with hand dimensions such as hand height perpendicular to the wrist crease (digit 5), proximal interphalangeal joint breadth (digit 3) and hand spread across wedge 1. The women who are forced to frequently use cutters, strippers and other tools, which are not optimally designed for their hand dimensions and strength range, might have a higher prevalence of clinical symptoms and disorders of the hand. In view of the human hand-tool interface requirements, the present data on Indian women would be useful for ergo-design applications of hand tools and devices.
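
    The regression step above, predicting hand length from fist and wrist circumference, can be sketched with ordinary least squares. The measurements below are made up for illustration and are not the study's data.

      # Multiple-regression sketch of the kind used above (made-up data, in cm).
      import numpy as np

      fist = np.array([24.1, 25.3, 23.8, 26.0, 24.7, 25.8])
      wrist = np.array([14.8, 15.5, 14.2, 16.1, 15.0, 15.7])
      hand_length = np.array([16.9, 17.6, 16.5, 18.0, 17.1, 17.8])

      X = np.column_stack([np.ones_like(fist), fist, wrist])       # intercept + two predictors
      coef, *_ = np.linalg.lstsq(X, hand_length, rcond=None)
      fitted = X @ coef
      r2 = 1 - ((hand_length - fitted) ** 2).sum() / ((hand_length - hand_length.mean()) ** 2).sum()
      print("coefficients:", np.round(coef, 3), "R^2:", round(r2, 3))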

  18. Peer Review of EPA's Draft BMDS Document: Exponential ...

    EPA Pesticide Factsheets

    BMDS is one of the Agency's premier tools for quantitative risk assessment; therefore, the validity and reliability of its statistical models are of paramount importance. This page provides links to peer reviews of the BMDS application and its models as they were developed and eventually released, documenting the rigorous review process undertaken to provide the best science tools available for statistical modeling.

  19. Back to BaySICS: a user-friendly program for Bayesian Statistical Inference from Coalescent Simulations.

    PubMed

    Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love

    2014-01-01

    Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis makes it difficult to implement, and programming skills are frequently required. In addition, few programs are able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although it provides specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
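
    The core ABC idea, drawing parameters from a prior, simulating data, and keeping draws whose summary statistics land close to the observed ones, can be sketched with a simple rejection sampler. The toy model (Poisson observations, flat prior, mean as the summary statistic) is an assumption for illustration; BaySICS uses coalescent simulations of DNA sequences instead.

      # Rejection-ABC sketch on a toy model (not BaySICS's coalescent simulator).
      import numpy as np

      rng = np.random.default_rng(0)
      observed = rng.poisson(lam=7.0, size=50)        # stand-in for an observed dataset
      obs_summary = observed.mean()                   # summary statistic

      n_draws, tolerance = 20_000, 0.2
      theta = rng.uniform(0.0, 20.0, size=n_draws)    # flat prior on the parameter
      sim_summaries = np.array([rng.poisson(lam=t, size=50).mean() for t in theta])
      accepted = theta[np.abs(sim_summaries - obs_summary) < tolerance]

      print(f"accepted {accepted.size} draws; approximate posterior mean = {accepted.mean():.2f}")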

  20. Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments

    NASA Astrophysics Data System (ADS)

    Fisher, W. P., Jr.; Elbaum, B.; Coulter, A.

    2010-07-01

    Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.

  1. A nonlinear isobologram model with Box-Cox transformation to both sides for chemical mixtures.

    PubMed

    Chen, D G; Pounds, J G

    1998-12-01

    The linear logistic isobologram is a commonly used and powerful graphical and statistical tool for analyzing the combined effects of simple chemical mixtures. In this paper a nonlinear isobologram model is proposed to analyze the joint action of chemical mixtures for quantitative dose-response relationships. This nonlinear isobologram model incorporates two additional parameters, Ymin and Ymax, to facilitate analysis of response data that are not constrained between 0 and 1, where Ymin and Ymax represent the minimal and the maximal observed toxic response. This nonlinear isobologram model for binary mixtures can be expressed as [formula: see text]. In addition, a Box-Cox transformation to both sides is introduced to improve the goodness of fit and to provide a more robust model for achieving homogeneity and normality of the residuals. Finally, a confidence band is proposed for selected isobols, e.g., the median effective dose, to facilitate graphical and statistical analysis of the isobologram. The versatility of this approach is demonstrated using published data describing the toxicity of binary mixtures of citrinin and ochratoxin, as well as new experimental data from our laboratory for mixtures of mercury and cadmium.
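
    The paper's model formula is not reproduced in this record ("[formula: see text]"), but the transform-both-sides idea can be sketched generically: apply the same Box-Cox transformation to the observed response and to the model prediction before least-squares fitting, estimating the transformation parameter along with the model parameters. The simple saturating two-dose model below is a placeholder assumption, not the published isobologram equation.

      # Generic Box-Cox "transform both sides" fit (placeholder dose-response model,
      # illustrative data; not the published isobologram formula).
      import numpy as np
      from scipy.optimize import minimize

      def boxcox(y, lam):
          return np.log(y) if abs(lam) < 1e-8 else (y ** lam - 1.0) / lam

      def model(dose_a, dose_b, y_min, y_max, slope_a, slope_b):
          effect = slope_a * dose_a + slope_b * dose_b
          return y_min + (y_max - y_min) * effect / (1.0 + effect)

      def tbs_loss(theta, dose_a, dose_b, y):
          y_min, y_max, slope_a, slope_b, lam = theta
          pred = model(dose_a, dose_b, y_min, y_max, slope_a, slope_b)
          if np.any(pred <= 0):                       # keep the Box-Cox transform defined
              return np.inf
          return np.sum((boxcox(y, lam) - boxcox(pred, lam)) ** 2)

      rng = np.random.default_rng(1)
      dose_a, dose_b = rng.uniform(0, 5, 40), rng.uniform(0, 5, 40)
      y = model(dose_a, dose_b, 0.1, 1.0, 0.8, 0.5) * rng.lognormal(0, 0.05, 40)

      fit = minimize(tbs_loss, x0=[0.05, 1.2, 1.0, 1.0, 1.0],
                     args=(dose_a, dose_b, y), method="Nelder-Mead")
      print("estimated (Ymin, Ymax, slope_a, slope_b, lambda):", np.round(fit.x, 3))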

  2. A nonlinear isobologram model with Box-Cox transformation to both sides for chemical mixtures.

    PubMed Central

    Chen, D G; Pounds, J G

    1998-01-01

    The linear logistic isobologram is a commonly used and powerful graphical and statistical tool for analyzing the combined effects of simple chemical mixtures. In this paper a nonlinear isobologram model is proposed to analyze the joint action of chemical mixtures for quantitative dose-response relationships. This nonlinear isobologram model incorporates two additional parameters, Ymin and Ymax, to facilitate analysis of response data that are not constrained between 0 and 1, where Ymin and Ymax represent the minimal and the maximal observed toxic response. This nonlinear isobologram model for binary mixtures can be expressed as [formula: see text]. In addition, a Box-Cox transformation to both sides is introduced to improve the goodness of fit and to provide a more robust model for achieving homogeneity and normality of the residuals. Finally, a confidence band is proposed for selected isobols, e.g., the median effective dose, to facilitate graphical and statistical analysis of the isobologram. The versatility of this approach is demonstrated using published data describing the toxicity of binary mixtures of citrinin and ochratoxin, as well as new experimental data from our laboratory for mixtures of mercury and cadmium. PMID:9860894

  3. A computational DFT study of structural transitions in textured solid-fluid interfaces

    NASA Astrophysics Data System (ADS)

    Yatsyshin, Petr; Parry, Andrew O.; Kalliadasis, Serafim

    2015-11-01

    Fluids adsorbed at walls, in capillary pores and slits, and in more exotic, sculpted geometries such as grooves and wedges can exhibit many new phase transitions, including wetting, pre-wetting, capillary condensation and filling, compared to their bulk counterparts. As well as being of fundamental interest to the modern statistical mechanical theory of inhomogeneous fluids, these are also relevant to nanofluidics and to chemical and biological engineering. In this talk we will show, using a microscopic Density Functional Theory (DFT) for fluids, how novel continuous interfacial transitions associated with the first-order prewetting line can occur on steps, in grooves and in wedges, and how these are sensitive to both the range of the intermolecular forces and interfacial fluctuation effects. These transitions compete with wetting, filling and condensation, producing very rich phase diagrams even for relatively simple geometries. We will also discuss practical aspects of DFT calculations, and demonstrate how this statistical-mechanical framework is capable of yielding complex fluid structure, interfacial tensions, and regions of thermodynamic stability of various fluid configurations. As a side note, this demonstrates that DFT is an excellent tool for the investigation of complex multiphase systems. We acknowledge financial support from the European Research Council via Advanced Grant No. 247031.

  4. Evaluating the performance of a fault detection and diagnostic system for vapor compression equipment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Breuker, M.S.; Braun, J.E.

    This paper presents a detailed evaluation of the performance of a statistical, rule-based fault detection and diagnostic (FDD) technique presented by Rossi and Braun (1997). Steady-state and transient tests were performed on a simple rooftop air conditioner over a range of conditions and fault levels. The steady-state data without faults were used to train models that predict outputs for normal operation. The transient data with faults were used to evaluate FDD performance. The effect of a number of design variables on FDD sensitivity for different faults was evaluated, and two prototype systems were specified for more complete evaluation. Good performance was achieved in detecting and diagnosing five faults using only six temperatures (two input and four output) and linear models. The performance improved by about a factor of two when ten measurements (three input and seven output) and higher order models were used. This approach for evaluating and optimizing the performance of the statistical, rule-based FDD technique could be used as a design and evaluation tool when applying this FDD method to other packaged air-conditioning systems. Furthermore, the approach could also be modified to evaluate the performance of other FDD methods.

  5. Multicategory Composite Least Squares Classifiers

    PubMed Central

    Park, Seo Young; Liu, Yufeng; Liu, Dacheng; Scholl, Paul

    2010-01-01

    Classification is a very useful statistical tool for information extraction. In particular, multicategory classification is commonly seen in various applications. Although binary classification problems are heavily studied, extensions to the multicategory case are much less so. In view of the increased complexity and volume of modern statistical problems, it is desirable to have multicategory classifiers that are able to handle problems with high dimensions and with a large number of classes. Moreover, it is necessary to have sound theoretical properties for the multicategory classifiers. In the literature, there exist several different versions of simultaneous multicategory Support Vector Machines (SVMs). However, the computation of the SVM can be difficult for large-scale problems, especially for problems with a large number of classes. Furthermore, the SVM cannot produce class probability estimates directly. In this article, we propose a novel, efficient multicategory composite least squares classifier (CLS classifier), which utilizes a new composite squared loss function. The proposed CLS classifier has several important merits: efficient computation for problems with a large number of classes, asymptotic consistency, the ability to handle high dimensional data, and simple conditional class probability estimation. Our simulated and real examples demonstrate competitive performance of the proposed approach. PMID:21218128

  6. Nanocluster building blocks of artificial square spin ice: Stray-field studies of thermal dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pohlit, Merlin, E-mail: pohlit@physik.uni-frankfurt.de; Porrati, Fabrizio; Huth, Michael

    We present measurements of the thermal dynamics of a Co-based single building block of an artificial square spin ice fabricated by focused electron-beam-induced deposition. We employ micro-Hall magnetometry, an ultra-sensitive tool to study the stray field emanating from magnetic nanostructures, as a new technique to access the dynamical properties during the magnetization reversal of the spin-ice nanocluster. The obtained hysteresis loop exhibits distinct steps, displaying a reduction of their “coercive field” with increasing temperature. Therefore, thermally unstable states could be repetitively prepared by relatively simple temperature and field protocols, allowing one to investigate the statistics of their switching behavior within experimentally accessible timescales. For a selected switching event, we find a strong reduction of the so-prepared states' “survival time” with increasing temperature and magnetic field. Besides the possibility to control the lifetime of selected switching events at will, we find evidence for a more complex behavior caused by the special spin ice arrangement of the macrospins, i.e., that the magnetic reversal statistically follows distinct “paths”, most likely driven by thermal perturbation.

  7. Mapping of epistatic quantitative trait loci in four-way crosses.

    PubMed

    He, Xiao-Hong; Qin, Hongde; Hu, Zhongli; Zhang, Tianzhen; Zhang, Yuan-Ming

    2011-01-01

    Four-way crosses (4WC) involving four different inbred lines often appear in plant and animal commercial breeding programs. Direct mapping of quantitative trait loci (QTL) in these commercial populations is both economical and practical. However, the existing statistical methods for mapping QTL in a 4WC population are built on the single-QTL genetic model. This simple genetic model fails to take into account QTL interactions, which play an important role in the genetic architecture of complex traits. In this paper, therefore, we attempted to develop a statistical method to detect epistatic QTL in 4WC populations. Conditional probabilities of QTL genotypes, computed by the multi-point single-locus method, were used to sample the genotypes of all putative QTL in the entire genome. The sampled genotypes were used to construct the design matrix for QTL effects. All QTL effects, including main and epistatic effects, were simultaneously estimated by the penalized maximum likelihood method. The proposed method was confirmed by a series of Monte Carlo simulation studies and a real data analysis of cotton. The new method will provide novel tools for the genetic dissection of complex traits, construction of QTL networks, and analysis of heterosis.

  8. Simplified, inverse, ejector design tool

    NASA Technical Reports Server (NTRS)

    Dechant, Lawrence J.

    1993-01-01

    A simple lumped-parameter-based inverse design tool has been developed which provides flow path geometry and entrainment estimates subject to operational, acoustic, and design constraints. These constraints are manifested through specification of primary mass flow rate or ejector thrust, fully-mixed exit velocity, and static pressure matching. Fundamentally, integral forms of the conservation equations coupled with the specified design constraints are combined to yield an easily invertible linear system in terms of the flow path cross-sectional areas. Entrainment is computed by back substitution. Initial comparisons with experimental data and analogous one-dimensional methods show good agreement. Thus, this simple inverse design code provides an analytically based, preliminary design tool with direct application to High Speed Civil Transport (HSCT) design studies.

  9. Simple statistical bias correction techniques greatly improve moderate resolution air quality forecast at station level

    NASA Astrophysics Data System (ADS)

    Curci, Gabriele; Falasca, Serena

    2017-04-01

    Deterministic air quality forecasting is routinely carried out at many local environmental agencies in Europe and throughout the world by means of Eulerian chemistry-transport models. The skill of these models in predicting the ground-level concentrations of relevant pollutants (ozone, nitrogen dioxide, particulate matter) a few days ahead has greatly improved in recent years, but it is not yet always compliant with the required quality level for decision making (e.g. the European Commission has set a maximum uncertainty of 50% on daily values of relevant pollutants). Post-processing of deterministic model output is thus still regarded as a useful tool to make the forecast more reliable. In this work, we test several bias correction techniques applied to a long-term dataset of air quality forecasts over Europe and Italy. We used the WRF-CHIMERE modelling system, which provides operational experimental chemical weather forecasts at CETEMPS (http://pumpkin.aquila.infn.it/forechem/), to simulate the years 2008-2012 at low resolution over Europe (0.5° x 0.5°) and moderate resolution over Italy (0.15° x 0.15°). We compared the simulated dataset with available observations from the European Environment Agency database (AirBase) and characterized model skill and compliance with EU legislation using the Delta tool from the FAIRMODE project (http://fairmode.jrc.ec.europa.eu/). The bias correction techniques adopted are, in order of complexity: (1) application of multiplicative factors calculated as the ratio of model-to-observed concentrations averaged over the previous days; (2) correction of the statistical distribution of model forecasts, in order to make it similar to that of the observations; (3) development and application of Model Output Statistics (MOS) regression equations. We illustrate the differences and advantages/disadvantages of the three approaches. All the methods are relatively easy to implement for other modelling systems.
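
    The first two correction techniques are simple enough to sketch directly: a multiplicative factor taken from the recent ratio of observed to modelled concentrations, and a quantile-mapping correction that reshapes the forecast distribution to match the observed one. The arrays, window length and the obs/model orientation of the ratio below are illustrative assumptions, not the operational CETEMPS post-processing code.

      # Sketches of bias-correction techniques (1) and (2) described above (made-up data).
      import numpy as np

      def multiplicative_factor(forecast_today, past_obs, past_forecast):
          """(1) Scale today's forecast by the mean observed/modelled ratio of recent days."""
          return forecast_today * past_obs.mean() / past_forecast.mean()

      def quantile_mapping(forecast_today, past_obs, past_forecast):
          """(2) Map the forecast onto the observed distribution at the same quantile."""
          q = np.searchsorted(np.sort(past_forecast), forecast_today) / len(past_forecast)
          return np.quantile(past_obs, np.clip(q, 0.0, 1.0))

      rng = np.random.default_rng(2)
      past_obs = rng.normal(90, 20, 30)                      # e.g. ozone, ug/m3, last 30 days
      past_forecast = past_obs * 0.8 + rng.normal(0, 5, 30)  # model biased low
      print(multiplicative_factor(70.0, past_obs, past_forecast))
      print(quantile_mapping(70.0, past_obs, past_forecast))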

  10. Remote Control and Data Acquisition: A Case Study

    NASA Technical Reports Server (NTRS)

    DeGennaro, Alfred J.; Wilkinson, R. Allen

    2000-01-01

    This paper details software tools developed to remotely command experimental apparatus, and to acquire and visualize the associated data in soft real time. The work was undertaken because commercial products failed to meet the needs. This work has identified six key factors intrinsic to development of quality research laboratory software. Capabilities include access to all new instrument functions without any programming or dependence on others to write drivers or virtual instruments, simple full screen text-based experiment configuration and control user interface, months of continuous experiment run-times, order of 1% CPU load for condensed matter physics experiment described here, very little imposition of software tool choices on remote users, and total remote control from anywhere in the world over the Internet or from home on a 56 Kb modem as if the user is sitting in the laboratory. This work yielded a set of simple robust tools that are highly reliable, resource conserving, extensible, and versatile, with a uniform simple interface.

  11. Analytical Tools Interface for Landscape Assessments

    EPA Science Inventory

    Environmental management practices are trending away from simple, local-scale assessments toward complex, multiple-stressor regional assessments. Landscape ecology provides the theory behind these assessments while geographic information systems (GIS) supply the tools to implemen...

  12. Development of a simplified urban water balance model (WABILA).

    PubMed

    Henrichs, M; Langner, J; Uhl, M

    2016-01-01

    During the last decade, water sensitive urban design (WSUD) has become more and more accepted. However, no simple tool is available to evaluate the influence of these measures on the local water balance. To counteract the impact of new settlements, planners focus on mitigating increases in runoff through the installation of infiltration systems. This leads to increased artificial groundwater recharge and decreased evapotranspiration. Simple software tools which evaluate or simulate the effect of WSUD on the local water balance are still needed. The authors developed a tool named WABILA (Wasserbilanz) that could support planners in optimal WSUD. WABILA is an easy-to-use planning tool that is based on simplified regression functions for established measures and land covers. Results show that WSUD has to be site-specific, based on climate conditions and the natural water balance.

  13. The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

    PubMed Central

    Estivalet, Gustavo L.; Meunier, Fanny

    2015-01-01

    In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development and the specific characteristics of the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output presents all entries found in the corpus that match the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. PMID:26630138

  14. The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research.

    PubMed

    Estivalet, Gustavo L; Meunier, Fanny

    2015-01-01

    In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development and the specific characteristics of the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional's corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output presents all entries found in the corpus that match the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior.

  15. Simple prognostic model for patients with advanced cancer based on performance status.

    PubMed

    Jang, Raymond W; Caraiscos, Valerie B; Swami, Nadia; Banerjee, Subrata; Mak, Ernie; Kaya, Ebru; Rodin, Gary; Bryson, John; Ridley, Julia Z; Le, Lisa W; Zimmermann, Camilla

    2014-09-01

    Providing survival estimates is important for decision making in oncology care. The purpose of this study was to provide survival estimates for outpatients with advanced cancer, using the Eastern Cooperative Oncology Group (ECOG), Palliative Performance Scale (PPS), and Karnofsky Performance Status (KPS) scales, and to compare their ability to predict survival. ECOG, PPS, and KPS were completed by physicians for each new patient attending the Princess Margaret Cancer Centre outpatient Oncology Palliative Care Clinic (OPCC) from April 2007 to February 2010. Survival analysis was performed using the Kaplan-Meier method. The log-rank test for trend was employed to test for differences in survival curves for each level of performance status (PS), and the concordance index (C-statistic) was used to test the predictive discriminatory ability of each PS measure. Measures were completed for 1,655 patients. PS delineated survival well for all three scales according to the log-rank test for trend (P < .001). Survival was approximately halved for each worsening performance level. Median survival times, in days, for each ECOG level were: ECOG 0, 293; ECOG 1, 197; ECOG 2, 104; ECOG 3, 55; and ECOG 4, 25.5. Median survival times, in days, for PPS (and KPS) were: PPS/KPS 80-100, 221 (215); PPS/KPS 60-70, 115 (119); PPS/KPS 40-50, 51 (49); and PPS/KPS 10-30, 22 (29). The C-statistic was similar for all three scales and ranged from 0.63 to 0.64. We present a simple tool that uses PS alone to prognosticate in advanced cancer, and has similar discriminatory ability to more complex models. Copyright © 2014 by American Society of Clinical Oncology.
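
    The Kaplan-Meier estimator behind these median survival times is simple enough to sketch by hand. The follow-up durations and event indicators below are made up for illustration and are not the clinic's data.

      # Kaplan-Meier sketch of the kind underlying the survival estimates above
      # (made-up durations in days; event=1 for death, 0 for censored).
      import numpy as np

      def kaplan_meier(durations, events):
          order = np.argsort(durations)
          durations, events = np.asarray(durations)[order], np.asarray(events)[order]
          at_risk, surv, times, probs = len(durations), 1.0, [], []
          for t in np.unique(durations):
              mask = durations == t
              deaths = events[mask].sum()
              if deaths:                      # the curve drops only at event times
                  surv *= 1.0 - deaths / at_risk
                  times.append(t)
                  probs.append(surv)
              at_risk -= mask.sum()           # remove events and censorings at time t
          return np.array(times), np.array(probs)

      durations = [30, 45, 60, 90, 104, 120, 150, 200, 210, 300]
      events =    [ 1,  1,  0,  1,   1,   0,   1,   1,   0,   1]
      times, surv = kaplan_meier(durations, events)
      median = times[np.argmax(surv <= 0.5)]  # first time the curve reaches 0.5 or below
      print("median survival ~", median, "days")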

  16. KPS/LDH index: a simple tool for identifying patients with metastatic melanoma who are unlikely to benefit from palliative whole brain radiotherapy.

    PubMed

    Partl, Richard; Fastner, Gerd; Kaiser, Julia; Kronhuber, Elisabeth; Cetin-Strohmer, Klaudia; Steffal, Claudia; Böhmer-Breitfelder, Barbara; Mayer, Johannes; Avian, Alexander; Berghold, Andrea

    2016-02-01

    Low Karnofsky performance status (KPS) and elevated lactate dehydrogenase (LDH), as a surrogate marker for tumor load and cell turnover, may identify patients with a very short life expectancy. To validate this finding and compare it to other indices, namely the recursive partitioning analysis (RPA) and the diagnosis-specific graded prognostic assessment (DS-GPA), a multicenter analysis was undertaken. A retrospective analysis of 234 metastatic melanoma patients uniformly treated with palliative whole brain radiotherapy (WBRT) was done. Univariate and multivariate analyses were used to determine the impact of patient-, tumor-, and treatment-related parameters on overall survival (OS). KPS and LDH emerged as independent factors predicting OS. By combining KPS and LDH values (KPS/LDH index), groups of patients with statistically significant differences in median OS (days; 95 % CI) after onset of WBRT were identified: group 1 (KPS ≥ 70/normal LDH) 234 (96-372), group 2 (KPS ≥ 70/elevated LDH) 112 (69-155), group 3 (KPS <70/normal LDH) 43 (12-74), and group 4 (KPS <70/elevated LDH) 29 (17-41). Between all four groups, statistically significant differences were observed. The RPA and DS-GPA indices failed to distinguish significantly between good and moderate prognosis and were inferior in predicting a very unfavorable prognosis. The parameters KPS and LDH independently impacted on OS. The combination of both (the KPS/LDH index) identified patients with a very short life expectancy, who might be better served by a recommendation of best supportive care instead of WBRT. The KPS/LDH index is simple and effective in terms of time and cost as compared to other prognostic indices.
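
    The index itself is just a cross-classification of KPS (≥70 vs. <70) and LDH (normal vs. elevated) into the four groups listed above; a minimal sketch is shown below. The function name and input encoding are illustrative assumptions.

      # KPS/LDH grouping as described above (group labels follow the abstract;
      # function name and encoding are illustrative).
      def kps_ldh_group(kps: int, ldh_elevated: bool) -> int:
          """Return prognostic group 1 (best) to 4 (worst)."""
          if kps >= 70:
              return 2 if ldh_elevated else 1
          return 4 if ldh_elevated else 3

      median_os_days = {1: 234, 2: 112, 3: 43, 4: 29}   # medians reported above
      group = kps_ldh_group(kps=60, ldh_elevated=True)
      print(f"group {group}, median OS in the study ~ {median_os_days[group]} days")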

  17. Construction of social value or utility-based health indices: the usefulness of factorial experimental design plans.

    PubMed

    Cadman, D; Goldsmith, C

    1986-01-01

    Global indices, which aggregate multiple health or function attributes into a single summary indicator, are useful measures in health research. Two key issues must be addressed in the initial stages of index construction: from the universe of possible health and function attributes, which ones should be included in a new index, and how simple can the statistical model be that combines the attributes into a single numeric index value? Factorial experimental designs were used in the initial stages of developing a function index for evaluating a program for the care of young handicapped children. Beginning with eight attributes judged important to the goals of the program by clinicians, social preference values for different function states were obtained from 32 parents of handicapped children and 32 members of the community. Using category rating methods, each rater scored 16 written multi-attribute case descriptions which contained information about a child's status for all eight attributes. Either a good or poor level of each function attribute, and age 3 or 5 years, were described in each case. Thus, 2^8 = 256 different cases were rated. Two factorial design plans were selected and used to allocate case descriptions to raters. Analysis of variance determined that seven of the eight clinician-selected attributes were required in a social-value-based index for handicapped children. Most importantly, the subsequent steps of index construction could be greatly simplified by the finding that a simple additive statistical model, without complex attribute interaction terms, was adequate for the index. We conclude that factorial experimental designs are an efficient, feasible and powerful tool for the initial stages of constructing a multi-attribute health index.
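
    The design space referred to above is the 2^8 = 256 combinations of attribute levels, and the key finding is that a purely additive (main-effects-only) model fits the ratings. A minimal sketch generating the full factorial and fitting such an additive model by least squares is given below; the simulated ratings and attribute weights are illustrative assumptions, not the study's data.

      # Full 2^8 factorial of good/poor attribute levels and an additive least-squares fit
      # (simulated ratings; the real study used factorial plans to allocate 16 cases per rater).
      import itertools
      import numpy as np

      n_attributes = 8
      levels = np.array(list(itertools.product([0, 1], repeat=n_attributes)))   # 256 x 8

      rng = np.random.default_rng(3)
      true_weights = rng.uniform(5, 15, n_attributes)        # hypothetical attribute importances
      ratings = levels @ true_weights + rng.normal(0, 3, len(levels))

      X = np.column_stack([np.ones(len(levels)), levels])    # intercept + 8 main effects
      coef, *_ = np.linalg.lstsq(X, ratings, rcond=None)
      print("estimated main effects:", np.round(coef[1:], 2))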

  18. GAPIT version 2: an enhanced integrated tool for genomic association and prediction

    USDA-ARS?s Scientific Manuscript database

    Most human diseases and agriculturally important traits are complex. Dissecting their genetic architecture requires continued development of innovative and powerful statistical methods. Corresponding advances in computing tools are critical to efficiently use these statistical innovations and to enh...

  19. Validation of a modified FRAX® tool for improving outpatient efficiency--part of the "Catch Before a Fall" initiative.

    PubMed

    Parker, Simon; Ciaccio, Maria; Cook, Erica; Davenport, Graham; Cooper, Alun; Grange, Simon; Smitham, Peter

    2015-01-01

    We have validated our touch-screen-modified FRAX® tool against the traditional healthcare professional-led questionnaire, demonstrating strong concordance between doctor- and patient-derived results. We will use this in outpatient clinics and general practice to increase our capture rate of at-risk patients, making valuable use of otherwise wasted patient waiting times. Outpatient clinics offer an opportunity to collect valuable health information from a captive population. We have previously developed a modified fracture risk assessment (FRAX®) tool, enabling patients to self-assess their osteoporotic fracture risk in a touch-screen computer format, and demonstrated its acceptability with patients. We aim to validate the accuracy of our tool against the traditional questionnaire. Fifty patients over 50 years of age within the fracture clinic independently completed a paper equivalent of our touch-screen-modified FRAX® questionnaire. Responses were analysed against the traditional healthcare professional (HCP)-led questionnaire, which was carried out afterwards. Agreement was assessed by sensitivity, specificity, Cohen's kappa statistic and Fisher's exact test for each potential FRAX® outcome of "treat", "measure BMD" and "lifestyle advice". The age range was 51-98 years. The FRAX® tool was completed by 88 % of patients; six patients lacked confidence in estimating either their height or weight. Following question adjustment according to patient response and feedback, our tool achieved >95 % sensitivity and specificity for the "treat" and "lifestyle advice" groups, and 79 % sensitivity and 100 % specificity in the "measure BMD" group. Cohen's kappa value ranged from 0.823 to 0.995 across all groups, demonstrating "very good" agreement for all. Fisher's exact test demonstrated significant concordance between doctor and patient decisions. Our modified tool provides a simple, accurate and reliable method for patients to self-report their own FRAX® score outside the clinical contact period, thus releasing the HCP from the time required to complete the questionnaire and potentially increasing our capture rate of at-risk patients.

  20. A Prototype Tool to Enable Farmers to Measure and Improve the Welfare Performance of the Farm Animal Enterprise: The Unified Field Index

    PubMed Central

    Colditz, Ian G.; Ferguson, Drewe M.; Collins, Teresa; Matthews, Lindsay; Hemsworth, Paul H.

    2014-01-01

    Simple Summary Benchmarking is a tool widely used in agricultural industries that harnesses the experience of farmers to generate knowledge of practices that lead to better on-farm productivity and performance. We propose, by analogy with production performance, a method for measuring the animal welfare performance of an enterprise and describe a tool for farmers to monitor and improve the animal welfare performance of their business. A general framework is outlined for assessing and monitoring risks to animal welfare based on measures of animals, the environment they are kept in and how they are managed. The tool would enable farmers to continually improve animal welfare. Abstract Schemes for the assessment of farm animal welfare and assurance of welfare standards have proliferated in recent years. An acknowledged shortcoming has been the lack of impact of these schemes on the welfare standards achieved on farm, due in part to sociological factors concerning their implementation. Here we propose the concept of welfare performance based on a broad set of performance attributes of an enterprise and describe a tool based on risk assessment and benchmarking methods for measuring and managing welfare performance. The tool, termed the Unified Field Index, is presented in a general form comprising three modules addressing animal, resource, and management factors. Domains within these modules accommodate the principal conceptual perspectives for welfare assessment: biological functioning, emotional states, and naturalness. Pan-enterprise analysis in any livestock sector could be used to benchmark the welfare performance of individual enterprises and also provide statistics of welfare performance for the livestock sector. An advantage of this concept of welfare performance is its use of continuous scales of measurement rather than traditional pass/fail measures. Through the feedback provided via benchmarking, the tool should help farmers better engage in on-going improvement of farm practices that affect animal welfare. PMID:26480317

  1. The GenABEL Project for statistical genomics

    PubMed Central

    Karssen, Lennart C.; van Duijn, Cornelia M.; Aulchenko, Yurii S.

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination. PMID:27347381

  2. Ultrasonic evaluation of the physical and mechanical properties of granites.

    PubMed

    Vasconcelos, G; Lourenço, P B; Alves, C A S; Pamplona, J

    2008-09-01

    Masonry is the oldest building material that has survived until today, being used all over the world and being present in the most impressive historical structures as evidence of the spirit of enterprise of ancient cultures. Conservation, rehabilitation and strengthening of the built heritage and protection of human lives are clear demands of modern societies. In this process, the use of nondestructive methods has become much more common in the diagnosis of the structural integrity of masonry elements. With respect to the evaluation of the stone condition, the ultrasonic pulse velocity is a simple and economical tool. Thus, the central issue of the present paper is the evaluation of the suitability of the ultrasonic pulse velocity method for describing the mechanical and physical properties of granites (grain size ranges of 0.1-4.0 mm and 0.3-16.5 mm) and for the assessment of their weathering state. The mechanical properties encompass the compressive and tensile strength and modulus of elasticity, and the physical properties include the density and porosity. For this purpose, measurements of the longitudinal ultrasonic pulse velocity with distinct natural frequencies of the transducers were carried out on specimens of different size and shape. A discussion of the factors that induce variations in the ultrasonic velocity is also provided. Additionally, statistical correlations between ultrasonic pulse velocity and the mechanical and physical properties of granites are presented and discussed. The major output of the work is the confirmation that ultrasonic pulse velocity can be effectively used as a simple and economical nondestructive method for a preliminary prediction of mechanical and physical properties, as well as a tool for the assessment of the weathering changes of granites that occur during their service life. This is of much interest due to the usual difficulties in removing specimens for mechanical characterization.

  3. Was it easy to use an Asthma Control Test (ACT) in different clinical practice settings in a tertiary hospital in Singapore?

    PubMed

    Prabhakaran, Lathy; Earnest, Arul; Abisheganaden, John; Chee, Jane

    2009-12-01

    The Asthma Control Test (ACT) is a 5-item self-administered tool designed to assess asthma control. It is said to be simple and easy for patients to complete quickly in the clinical practice setting. This stated benefit had yet to be demonstrated in our local clinical practice setting. The aim was to identify factors associated with difficulty in the administration of the ACT in different clinical practice settings in a tertiary hospital in Singapore. This is a prospective study performed from April to June 2008. All patients diagnosed with asthma and referred to an asthma nurse from the in-patient and out-patient clinical practice settings in Tan Tock Seng Hospital were enrolled. Four hundred and thirty-four patients were asked to complete the ACT tool. In the univariate model, we found that age, clinical setting and medical history were significantly associated with completion of the ACT. The odds of completion decreased by a factor of 0.92 (95% CI, 0.89 to 0.94) for every year's increase in age, and this was statistically significant (P <0.001). Similarly, the odds ratio of completion for those with more than 3 medical conditions by history was 0.59 (95% CI, 0.48 to 0.71) as compared to those with fewer than 3 medical conditions by history, and this was also significant (P <0.001). In the multivariate model, we found only age to be an independent and significant factor. After adjusting for age, none of the other variables initially significant in the univariate model remained significant. The results show that the ACT was simple and easy to administer in younger patients.

  4. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection

    PubMed Central

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-01-01

    With the number of scientific papers published in journals, conference proceedings, and the international literature ever increasing, authors and reviewers not only benefit from an abundance of information, but are unfortunately also continuously confronted with risks associated with the erroneous copying of another's material. In parallel, Information and Communication Technology (ICT) tools provide researchers with novel and increasingly effective ways to analyze and present their work. Software tools for statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewer's and the editor's perspective, it is now possible to check the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-by-step demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial on specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven plagiarism detection software packages. PMID:21487489

  5. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection.

    PubMed

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-12-01

    With the number of scientific papers published in journals, conference proceedings, and the international literature ever increasing, authors and reviewers not only benefit from an abundance of information, but are unfortunately also continuously confronted with risks associated with the erroneous copying of another's material. In parallel, Information and Communication Technology (ICT) tools provide researchers with novel and increasingly effective ways to analyze and present their work. Software tools for statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewer's and the editor's perspective, it is now possible to check the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-by-step demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial on specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven plagiarism detection software packages.

  6. Using R-Project for Free Statistical Analysis in Extension Research

    ERIC Educational Resources Information Center

    Mangiafico, Salvatore S.

    2013-01-01

    One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…

  7. Using Data from Climate Science to Teach Introductory Statistics

    ERIC Educational Resources Information Center

    Witt, Gary

    2013-01-01

    This paper shows how the application of simple statistical methods can reveal to students important insights from climate data. While the popular press is filled with contradictory opinions about climate science, teachers can encourage students to use introductory-level statistics to analyze data for themselves on this important issue in public…

  8. Using R in Introductory Statistics Courses with the pmg Graphical User Interface

    ERIC Educational Resources Information Center

    Verzani, John

    2008-01-01

    The pmg add-on package for the open source statistics software R is described. This package provides a simple to use graphical user interface (GUI) that allows introductory statistics students, without advanced computing skills, to quickly create the graphical and numeric summaries expected of them. (Contains 9 figures.)

  9. A simple rapid approach using coupled multivariate statistical methods, GIS and trajectory models to delineate areas of common oil spill risk

    NASA Astrophysics Data System (ADS)

    Guillen, George; Rainey, Gail; Morin, Michelle

    2004-04-01

    Currently, the Minerals Management Service uses the Oil Spill Risk Analysis model (OSRAM) to predict the movement of potential oil spills greater than 1000 bbl originating from offshore oil and gas facilities. OSRAM generates oil spill trajectories using meteorological and hydrological data input from either actual physical measurements or estimates generated by other hydrological models. OSRAM and many other models produce output matrices of average, maximum and minimum contact probabilities to specific landfall or target segments (columns) from oil spills at specific points (rows). Analysts and managers are often interested in identifying geographic areas or groups of facilities that would pose similar risks to specific targets or groups of targets if a spill occurred. Unfortunately, due to the potentially large matrix generated by many spill models, this question is difficult to answer without the use of data reduction and visualization methods. In our study we utilized a multivariate statistical method, cluster analysis, to group areas of similar risk based on the distribution of landfall target trajectory probabilities. We also utilized ArcView™ GIS to display the spill launch point groupings. The combination of GIS and multivariate statistical techniques in the post-processing of trajectory model output is a powerful tool for identifying and delineating areas of similar risk from multiple spill sources. We strongly encourage modelers and statistical and GIS software programmers to collaborate closely to produce a more seamless integration of these technologies and approaches to analyzing data. They are complementary methods that strengthen the overall assessment of spill risks.
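
    The post-processing step described above, clustering launch points by the similarity of their landfall probability profiles, can be sketched with hierarchical clustering. The probability matrix, linkage method and number of clusters below are illustrative assumptions, not OSRAM output.

      # Cluster launch points (rows) by their target-segment probability profiles (columns).
      # Made-up matrix; in practice the rows would come from the spill model's output.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster

      rng = np.random.default_rng(4)
      prob_matrix = rng.dirichlet(np.ones(12), size=30)       # 30 launch points x 12 targets

      labels = fcluster(linkage(prob_matrix, method="ward"), t=4, criterion="maxclust")
      for group in np.unique(labels):
          members = np.where(labels == group)[0].tolist()
          print(f"risk group {group}: launch points {members}")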

  10. Fukunaga-Koontz feature transformation for statistical structural damage detection and hierarchical neuro-fuzzy damage localisation

    NASA Astrophysics Data System (ADS)

    Hoell, Simon; Omenzetter, Piotr

    2017-07-01

    Considering jointly damage sensitive features (DSFs) of signals recorded by multiple sensors, applying advanced transformations to these DSFs and assessing systematically their contribution to damage detectability and localisation can significantly enhance the performance of structural health monitoring systems. This philosophy is explored here for partial autocorrelation coefficients (PACCs) of acceleration responses. They are interrogated with the help of the linear discriminant analysis based on the Fukunaga-Koontz transformation using datasets of the healthy and selected reference damage states. Then, a simple but efficient fast forward selection procedure is applied to rank the DSF components with respect to statistical distance measures specialised for either damage detection or localisation. For the damage detection task, the optimal feature subsets are identified based on the statistical hypothesis testing. For damage localisation, a hierarchical neuro-fuzzy tool is developed that uses the DSF ranking to establish its own optimal architecture. The proposed approaches are evaluated experimentally on data from non-destructively simulated damage in a laboratory scale wind turbine blade. The results support our claim of being able to enhance damage detectability and localisation performance by transforming and optimally selecting DSFs. It is demonstrated that the optimally selected PACCs from multiple sensors or their Fukunaga-Koontz transformed versions can not only improve the detectability of damage via statistical hypothesis testing but also increase the accuracy of damage localisation when used as inputs into a hierarchical neuro-fuzzy network. Furthermore, the computational effort of employing these advanced soft computing models for damage localisation can be significantly reduced by using transformed DSFs.

  11. Statistical properties of a utility measure of observer performance compared to area under the ROC curve

    NASA Astrophysics Data System (ADS)

    Abbey, Craig K.; Samuelson, Frank W.; Gallas, Brandon D.; Boone, John M.; Niklason, Loren T.

    2013-03-01

    The receiver operating characteristic (ROC) curve has become a common tool for evaluating diagnostic imaging technologies, and the primary endpoint of such evaluations is the area under the curve (AUC), which integrates sensitivity over the entire false positive range. An alternative figure of merit for ROC studies is expected utility (EU), which focuses on the relevant region of the ROC curve as defined by disease prevalence and the relative utility of the task. However, if this measure is to be used, it must also have desirable statistical properties to keep the burden of observer performance studies as low as possible. Here, we evaluate effect size and variability for EU and AUC. We use two observer performance studies recently submitted to the FDA to compare the EU and AUC endpoints. The studies were conducted using the multi-reader multi-case methodology in which all readers score all cases in all modalities. ROC curves from the studies were used to generate both the AUC and EU values for each reader and modality. The EU measure was computed assuming an iso-utility slope of 1.03. We find mean effect sizes, the reader-averaged difference between modalities, to be roughly 2.0 times as large for EU as for AUC. The standard deviation across readers is roughly 1.4 times as large, suggesting better statistical properties for the EU endpoint. In a simple power analysis of paired comparisons across readers, the utility measure required 36% fewer readers on average to achieve 80% statistical power compared to AUC.
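
    One common way to turn an empirical ROC curve into an expected-utility-style figure of merit is to take the best achievable value of TPF minus beta times FPF over all operating points, with beta the iso-utility slope (1.03 in the abstract). The sketch below uses that assumed formulation and simulated reader scores; the exact definition used in the paper may differ.

      # Expected-utility-style figure of merit from an empirical ROC curve
      # (assumed formulation: max over thresholds of TPF - beta * FPF, beta = 1.03).
      import numpy as np

      def roc_points(scores_diseased, scores_healthy):
          thresholds = np.unique(np.concatenate([scores_diseased, scores_healthy]))
          tpf = np.array([(scores_diseased >= t).mean() for t in thresholds])
          fpf = np.array([(scores_healthy >= t).mean() for t in thresholds])
          return fpf, tpf

      def expected_utility(scores_diseased, scores_healthy, beta=1.03):
          fpf, tpf = roc_points(scores_diseased, scores_healthy)
          return np.max(tpf - beta * fpf)

      rng = np.random.default_rng(5)
      healthy, diseased = rng.normal(0, 1, 200), rng.normal(1.2, 1, 200)
      print("EU-style figure of merit:", round(expected_utility(diseased, healthy), 3))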

  12. The transfer of analytical procedures.

    PubMed

    Ermer, J; Limberger, M; Lis, K; Wätzig, H

    2013-11-01

    Analytical method transfers are certainly among the most discussed topics in the GMP regulated sector. However, they are surprisingly little regulated in detail. General information is provided by USP, WHO, and ISPE in particular. Most recently, the EU emphasized the importance of analytical transfer by including it in their draft of the revised GMP Guideline. In this article, an overview and comparison of these guidelines is provided. The key to success for method transfers is excellent communication between the sending and receiving unit. In order to facilitate this communication, procedures, flow charts and checklists for responsibilities, success factors, transfer categories, the transfer plan and report, strategies in case of failed transfers, and tables with acceptance limits are provided here, together with a comprehensive glossary. Potential pitfalls are described such that they can be avoided. In order to assure an efficient and sustainable transfer of analytical procedures, a practically relevant and scientifically sound evaluation with corresponding acceptance criteria is crucial. Various strategies and statistical tools such as significance tests, absolute acceptance criteria, and equivalence tests are thoroughly described and compared in detail, with examples. Significance tests should be avoided. The success criterion is not statistical significance, but rather analytical relevance. Depending on a risk assessment of the analytical procedure in question, statistical equivalence tests are recommended, because they include both a practically relevant acceptance limit and a direct control of the statistical risks. However, for lower risk procedures, a simple comparison of the transfer performance parameters to absolute limits is also regarded as sufficient. Copyright © 2013 Elsevier B.V. All rights reserved.
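
    To make the recommended equivalence-testing strategy concrete, the sketch below runs a two-one-sided-tests (TOST) comparison of assay means from a sending and a receiving laboratory; the data, the ±2.0% acceptance limit, and the pooled-variance assumption are illustrative choices, not values from the article.

```python
# Hedged sketch of a TOST equivalence assessment between a sending and a receiving
# laboratory, as recommended for higher-risk procedures. All numbers are illustrative.
import numpy as np
from scipy import stats

sending = np.array([99.1, 99.6, 100.2, 99.8, 100.0, 99.4])     # assay results, % label claim
receiving = np.array([99.9, 100.4, 99.7, 100.8, 100.1, 100.3])

delta = 2.0                                    # practically relevant acceptance limit (assumed)
diff = receiving.mean() - sending.mean()
n1, n2 = len(sending), len(receiving)
sp2 = ((n1 - 1) * sending.var(ddof=1) + (n2 - 1) * receiving.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

t_lower = (diff + delta) / se                  # test of H0: diff <= -delta
t_upper = (diff - delta) / se                  # test of H0: diff >= +delta
p_tost = max(1 - stats.t.cdf(t_lower, df), stats.t.cdf(t_upper, df))

print(f"difference = {diff:.2f} %, TOST p-value = {p_tost:.4f}")
# Equivalence is concluded when p_tost < alpha, i.e. the difference lies inside +/-delta.
```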

  13. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  14. Evaluation of Two Statistical Methods Provides Insights into the Complex Patterns of Alternative Polyadenylation Site Switching

    PubMed Central

    Li, Jie; Li, Rui; You, Leiming; Xu, Anlong; Fu, Yonggui; Huang, Shengfeng

    2015-01-01

    Switching between different alternative polyadenylation (APA) sites plays an important role in the fine tuning of gene expression. New technologies for the execution of 3’-end enriched RNA-seq allow genome-wide detection of the genes that exhibit significant APA site switching between different samples. Here, we show that the independence test gives better results than the linear trend test in detecting APA site-switching events. Further examination suggests that the discrepancy between these two statistical methods arises from complex APA site-switching events that cannot be represented by a simple change of average 3’-UTR length. In theory, the linear trend test is only effective in detecting these simple changes. We classify the switching events into four switching patterns: two simple patterns (3’-UTR shortening and lengthening) and two complex patterns. By comparing the results of the two statistical methods, we show that complex patterns account for 1/4 of all observed switching events that happen between normal and cancerous human breast cell lines. Because simple and complex switching patterns may convey different biological meanings, they merit separate study. We therefore propose to combine both the independence test and the linear trend test in practice. First, the independence test should be used to detect APA site switching; second, the linear trend test should be invoked to identify simple switching events; and third, those complex switching events that pass independence testing but fail linear trend testing can be identified. PMID:25875641
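
    The sketch below contrasts the two statistical approaches on a toy 2 x 3 table of 3'-end read counts per APA site: a chi-squared test of independence, which responds to any switching pattern, and a Mantel-Haenszel-style linear trend test, which is only sensitive to a monotonic (simple) shift in site usage. The counts and site scores are illustrative, not data from the paper.

```python
import numpy as np
from scipy import stats

# counts of 3'-end reads per APA site (columns ordered proximal -> distal)
# rows: sample A (e.g. normal) and sample B (e.g. tumour); illustrative values
table = np.array([[120,  80,  40],
                  [ 60, 100,  90]])

# (1) independence test: detects any switching pattern, simple or complex
chi2, p_indep, dof, expected = stats.chi2_contingency(table)

# (2) linear-by-linear (Cochran-Armitage / Mantel-Haenszel style) trend test:
# only sensitive to a monotonic shift of usage along the 3'-UTR, i.e. a simple
# change of average 3'-UTR length
col_scores = np.arange(table.shape[1])        # site order along the 3'-UTR
row_scores = np.array([0, 1])
n = table.sum()
r_idx, c_idx = np.indices(table.shape)
x = row_scores[r_idx].ravel()
y = col_scores[c_idx].ravel()
w = table.ravel()
mx, my = np.average(x, weights=w), np.average(y, weights=w)
cov = np.average((x - mx) * (y - my), weights=w)
corr = cov / np.sqrt(np.average((x - mx) ** 2, weights=w) * np.average((y - my) ** 2, weights=w))
m2 = (n - 1) * corr ** 2                      # linear trend statistic, chi-squared with 1 df
p_trend = stats.chi2.sf(m2, df=1)

print(f"independence test p = {p_indep:.3g}, linear trend test p = {p_trend:.3g}")
```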

  15. Techniques for estimating selected streamflow characteristics of rural unregulated streams in Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Whitehead, Matthew T.

    2002-01-01

    This report provides equations for estimating mean annual streamflow, mean monthly streamflows, harmonic mean streamflow, and streamflow quartiles (the 25th-, 50th-, and 75th-percentile streamflows) as a function of selected basin characteristics for rural, unregulated streams in Ohio. The equations were developed from streamflow statistics and basin-characteristics data for as many as 219 active or discontinued streamflow-gaging stations on rural, unregulated streams in Ohio with 10 or more years of homogeneous daily streamflow record. Streamflow statistics and basin-characteristics data for the 219 stations are presented in this report. Simple equations (based on drainage area only) and best-fit equations (based on drainage area and at least two other basin characteristics) were developed by means of ordinary least-squares regression techniques. Application of the best-fit equations generally involves quantification of basin characteristics that require or are facilitated by use of a geographic information system. In contrast, the simple equations can be used with information that can be obtained without use of a geographic information system; however, the simple equations have larger prediction errors than the best-fit equations and exhibit geographic biases for most streamflow statistics. The best-fit equations should be used instead of the simple equations whenever possible.
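
    As an illustration of the "simple equation" approach, the sketch below fits a drainage-area-only relation for mean annual streamflow by ordinary least squares in log space; the basin data and the log-log functional form are assumptions for demonstration, not the report's actual regression.

```python
import numpy as np

# drainage areas (mi^2) and mean annual streamflows (ft^3/s) for gaged basins (made-up values)
area = np.array([12.0, 35.0, 58.0, 110.0, 240.0, 410.0, 760.0])
q_mean = np.array([14.0, 40.0, 66.0, 120.0, 250.0, 430.0, 820.0])

# fit log10(Q) = b0 + b1 * log10(A) by ordinary least squares
b1, b0 = np.polyfit(np.log10(area), np.log10(q_mean), deg=1)

def estimate_q(drainage_area_mi2):
    """Simple-equation estimate of mean annual streamflow from drainage area only."""
    return 10 ** (b0 + b1 * np.log10(drainage_area_mi2))

print(f"log10(Q) = {b0:.3f} + {b1:.3f} log10(A); Q(150 mi^2) ~= {estimate_q(150):.0f} ft^3/s")
```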

  16. STATWIZ - AN ELECTRONIC STATISTICAL TOOL (ABSTRACT)

    EPA Science Inventory

    StatWiz is a web-based, interactive, and dynamic statistical tool for researchers. It will allow researchers to input information and/or data and then receive experimental design options, or outputs from data analysis. StatWiz is envisioned as an expert system that will walk rese...

  17. Multicentre study for validation of the French addictovigilance network reports assessment tool

    PubMed Central

    Hardouin, Jean Benoit; Rousselet, Morgane; Gerardin, Marie; Guerlais, Marylène; Guillou, Morgane; Bronnec, Marie; Sébille, Véronique; Jolliet, Pascale

    2016-01-01

    Aims The French health authority (ANSM) is responsible for monitoring medicinal and other drug dependencies. To support these activities, the ANSM manages a network of 13 drug dependence evaluation and information centres (Centres d'Evaluation et d'Information sur la Pharmacodépendance ‐ Addictovigilance ‐ CEIP‐A) throughout France. In 2006, the Nantes CEIP‐A created a new tool called the EGAP (Echelle de GrAvité de la Pharmacodépendance ‐ drug dependence severity scale) based on DSM-IV criteria. This tool allows the creation of a substance use profile that enables the drug dependence severity to be homogeneously quantified by assigning a score to each substance indicated in the reports from health professionals. This article describes the validation and psychometric properties of the drug dependence severity score obtained from the scale (Clinicaltrials.gov NCT01052675). Method The construct validity of the EGAP, the concurrent validity and discriminative ability of the EGAP score, the consistency of answers to EGAP items, and the internal consistency and inter-rater reliability of the EGAP score were assessed using statistical methods that are generally used for psychometric tests. Results The total EGAP score was a reliable and precise measure for evaluating drug dependence (Cronbach alpha = 0.84; ASI correlation = 0.70; global ICC = 0.92). In addition to its good psychometric properties, the EGAP is a simple and efficient tool that can be easily specified on the official ANSM notification form. Conclusion The good psychometric properties of the total EGAP score justify its use for evaluating the severity of drug dependence. PMID:27302554
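
    A minimal sketch of one of the reliability statistics reported above, Cronbach's alpha, computed directly from an item-score matrix; the scores are fabricated for illustration and are unrelated to the EGAP data.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_reports, n_items) matrix of item scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1).sum()
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# fabricated item scores (0-2) for 6 reports on 7 items
scores = np.array([[2, 2, 1, 2, 2, 1, 2],
                   [0, 1, 0, 0, 1, 0, 0],
                   [1, 1, 1, 2, 1, 1, 1],
                   [2, 2, 2, 2, 2, 2, 1],
                   [0, 0, 1, 0, 0, 0, 0],
                   [1, 2, 1, 1, 1, 2, 1]])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```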

  18. Application of cloud database in the management of clinical data of patients with skin diseases.

    PubMed

    Mao, Xiao-fei; Liu, Rui; DU, Wei; Fan, Xue; Chen, Dian; Zuo, Ya-gang; Sun, Qiu-ning

    2015-04-01

    To evaluate the needs and applications of using a cloud database in the daily practice of a dermatology department. The cloud database was established for systemic scleroderma and localized scleroderma. Paper forms were used to record the original data including personal information, pictures, specimens, blood biochemical indicators, skin lesions, and scores of self-rating scales. The results were input into the cloud database. The applications of the cloud database in the dermatology department were summarized and analyzed. The personal and clinical information of 215 systemic scleroderma patients and 522 localized scleroderma patients was included and analyzed using the cloud database. The disease status, quality of life, and prognosis were obtained by statistical calculations. The cloud database can efficiently and rapidly store and manage the data of patients with skin diseases. As a simple, prompt, safe, and convenient tool, it can be used in patient information management, clinical decision-making, and scientific research.

  19. Investigation of magnetic resonance imaging texture analysis as an aid tool for characterization of refractory epilepsies.

    PubMed

    Baldissin, Maurício Martins; Souza, Edna Marina de

    2013-12-01

    Refractory epilepsies are syndromes for which therapies that employ two or more antiepileptic drugs, separately or in association, do not result in seizure control. Patients may present focal cortical dysplasia or diffuse dysplasia and/or hippocampal atrophic alterations that may not be detectable by simple visual analysis of magnetic resonance imaging (MRI). The aim of this study was to evaluate MRI texture in regions of interest located in the hippocampi, limbic association cortex and prefrontal cortex of 20 patients with refractory epilepsy and to compare them with the same areas in 20 healthy individuals, in order to determine whether the texture parameters could be related to the presence of the disease. Of the 11 texture parameters calculated, three indicated the existence of statistically significant differences between the studied groups. Such findings suggest the possibility of this technique contributing to studies of refractory epilepsies.

  20. SOBA: sequence ontology bioinformatics analysis.

    PubMed

    Moore, Barry; Fan, Guozhen; Eilbeck, Karen

    2010-07-01

    The advent of cheaper, faster sequencing technologies has pushed the task of sequence annotation from the exclusive domain of large-scale multi-national sequencing projects to that of research laboratories and small consortia. The bioinformatics burden placed on these laboratories, some with very little programming experience can be daunting. Fortunately, there exist software libraries and pipelines designed with these groups in mind, to ease the transition from an assembled genome to an annotated and accessible genome resource. We have developed the Sequence Ontology Bioinformatics Analysis (SOBA) tool to provide a simple statistical and graphical summary of an annotated genome. We envisage its use during annotation jamborees, genome comparison and for use by developers for rapid feedback during annotation software development and testing. SOBA also provides annotation consistency feedback to ensure correct use of terminology within annotations, and guides users to add new terms to the Sequence Ontology when required. SOBA is available at http://www.sequenceontology.org/cgi-bin/soba.cgi.

  1. CARES/Life Software for Designing More Reliable Ceramic Parts

    NASA Technical Reports Server (NTRS)

    Nemeth, Noel N.; Powers, Lynn M.; Baker, Eric H.

    1997-01-01

    Products made from advanced ceramics show great promise for revolutionizing aerospace and terrestrial propulsion and power generation. However, ceramic components are difficult to design because brittle materials in general have widely varying strength values. The CARES/Life software eases this task by providing a tool to optimize the design and manufacture of brittle material components using probabilistic reliability analysis techniques. Probabilistic component design involves predicting the probability of failure for a thermomechanically loaded component from specimen rupture data. Typically, these experiments are performed using many simple-geometry flexural or tensile test specimens. A static, dynamic, or cyclic load is applied to each specimen until fracture. Statistical strength and SCG (fatigue) parameters are then determined from these data. Using these parameters and the results obtained from a finite element analysis, the time-dependent reliability for a complex component geometry and loading is then predicted. Appropriate design changes are made until an acceptable probability of failure has been reached.
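
    The statistical ingredient of this kind of probabilistic design, fitting a Weibull strength distribution to specimen rupture data and reading off a failure probability, can be sketched as below; the strengths, the two-parameter Weibull assumption, and the single applied stress are illustrative, and the sketch omits the size scaling and multiaxial stress integration that CARES/Life actually performs.

```python
import numpy as np
from scipy import stats

# fracture strengths (MPa) from simple flexural specimens (illustrative values)
strengths = np.array([412., 455., 390., 470., 430., 448., 401., 465., 420., 440.])

# two-parameter Weibull fit (location fixed at zero), as commonly used for brittle materials
m, loc, sigma0 = stats.weibull_min.fit(strengths, floc=0)

applied_stress = 350.0  # MPa, assumed stress in a candidate component region
p_fail = stats.weibull_min.cdf(applied_stress, m, loc=loc, scale=sigma0)

print(f"Weibull modulus m ~= {m:.1f}, characteristic strength ~= {sigma0:.0f} MPa")
print(f"failure probability at {applied_stress:.0f} MPa ~= {p_fail:.3f}")
```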

  2. Electrochemical biosensors for Salmonella: State of the art and challenges in food safety assessment.

    PubMed

    Silva, Nádia F D; Magalhães, Júlia M C S; Freire, Cristina; Delerue-Matos, Cristina

    2018-01-15

    According to recent statistics, Salmonella remains an important public health issue worldwide. Legislated reference methods, based on plate-counting, are sensitive enough but are inadequate as an effective emergency response tool, and are far from being rapid devices that are simple to use outside the laboratory. An overview of the commercially available rapid methods for Salmonella detection is provided along with a critical discussion of their limitations, benefits and potential use in a real context. The distinctive potential of electrochemical biosensors for the development of rapid devices is highlighted. The state of the art and the newest technologic approaches in electrochemical biosensors for Salmonella detection are presented and a critical analysis of the literature is made in an attempt to identify the current challenges towards a complete solution for Salmonella detection in microbial food control based on electrochemical biosensors. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Fully stabilized mid-infrared frequency comb for high-precision molecular spectroscopy.

    PubMed

    Vainio, Markku; Karhu, Juho

    2017-02-20

    A fully stabilized mid-infrared optical frequency comb spanning from 2.9 to 3.4 µm is described in this article. The comb is based on half-harmonic generation in a femtosecond optical parametric oscillator, which transfers the high phase coherence of a fully stabilized near-infrared Er-doped fiber laser comb to the mid-infrared region. The method is simple, as no phase-locked loops or reference lasers are needed. Precise locking of optical frequencies of the mid-infrared comb to the pump comb is experimentally verified at sub-20 mHz level, which corresponds to a fractional statistical uncertainty of 2 × 10⁻¹⁶ at the center frequency of the mid-infrared comb. The fully stabilized mid-infrared comb is an ideal tool for high-precision molecular spectroscopy, as well as for optical frequency metrology in the mid-infrared region, which is difficult to access with other stabilized frequency comb techniques.

  4. Research Techniques Made Simple: Bioinformatics for Genome-Scale Biology.

    PubMed

    Foulkes, Amy C; Watson, David S; Griffiths, Christopher E M; Warren, Richard B; Huber, Wolfgang; Barnes, Michael R

    2017-09-01

    High-throughput biology presents unique opportunities and challenges for dermatological research. Drawing on a small handful of exemplary studies, we review some of the major lessons of these new technologies. We caution against several common errors and introduce helpful statistical concepts that may be unfamiliar to researchers without experience in bioinformatics. We recommend specific software tools that can aid dermatologists at varying levels of computational literacy, including platforms with command line and graphical user interfaces. The future of dermatology lies in integrative research, in which clinicians, laboratory scientists, and data analysts come together to plan, execute, and publish their work in open forums that promote critical discussion and reproducibility. In this article, we offer guidelines that we hope will steer researchers toward best practices for this new and dynamic era of data intensive dermatology. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  5. An effective visualization technique for depth perception in augmented reality-based surgical navigation.

    PubMed

    Choi, Hyunseok; Cho, Byunghyun; Masamune, Ken; Hashizume, Makoto; Hong, Jaesung

    2016-03-01

    Depth perception is a major issue in augmented reality (AR)-based surgical navigation. We propose an AR and virtual reality (VR) switchable visualization system with distance information, and evaluate its performance in a surgical navigation set-up. To improve depth perception, seamless switching from AR to VR was implemented. In addition, the minimum distance between the tip of the surgical tool and the nearest organ was provided in real time. To evaluate the proposed techniques, five physicians and 20 non-medical volunteers participated in experiments. Targeting error, time taken, and numbers of collisions were measured in simulation experiments. There was a statistically significant difference between a simple AR technique and the proposed technique. We confirmed that depth perception in AR could be improved by the proposed seamless switching between AR and VR, and providing an indication of the minimum distance also facilitated the surgical tasks. Copyright © 2015 John Wiley & Sons, Ltd.
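
    One way to provide the minimum tool-tip-to-organ distance in real time is a nearest-neighbour query against a point model of the organ surface, as sketched below; the point cloud, the millimetre units, and the use of a k-d tree are assumptions for illustration rather than the paper's implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

# organ surface points, e.g. from a segmented preoperative model (illustrative stand-in)
rng = np.random.default_rng(2)
organ_points = rng.uniform(0, 100, size=(5000, 3))     # mm
tree = cKDTree(organ_points)

def min_distance_to_organ(tool_tip_xyz):
    """Minimum distance (mm) from the tracked tool tip to the nearest organ point."""
    distance, _ = tree.query(tool_tip_xyz)
    return distance

# called once per tracking update to display the distance alongside the AR/VR view
print(f"closest organ point: {min_distance_to_organ([50.0, 50.0, 50.0]):.1f} mm")
```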

  6. Identification of the Species of Origin for Meat Products by Rapid Evaporative Ionization Mass Spectrometry.

    PubMed

    Balog, Julia; Perenyi, Dora; Guallar-Hoyas, Cristina; Egri, Attila; Pringle, Steven D; Stead, Sara; Chevallier, Olivier P; Elliott, Chris T; Takats, Zoltan

    2016-06-15

    Increasingly abundant food fraud cases have brought food authenticity and safety into major focus. This study presents a fast and effective way to identify meat products using rapid evaporative ionization mass spectrometry (REIMS). The experimental setup was demonstrated to be able to record a mass spectrometric profile of meat specimens in a time frame of <5 s. A multivariate statistical algorithm was developed and successfully tested for the identification of animal tissue with different anatomical origin, breed, and species with 100% accuracy at species and 97% accuracy at breed level. Detection of the presence of meat originating from a different species (horse, cattle, and venison) has also been demonstrated with high accuracy using mixed patties with a 5% detection limit. REIMS technology was found to be a promising tool in food safety applications providing a reliable and simple method for the rapid characterization of food products.
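
    The abstract does not specify the multivariate algorithm, so the sketch below uses a common choice for spectral profiles, PCA followed by linear discriminant analysis with cross-validation; the synthetic spectra and class structure are purely illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# toy "spectral profiles": 60 specimens x 500 m/z bins, three species labels (illustrative)
rng = np.random.default_rng(3)
X = rng.standard_normal((60, 500))
y = np.repeat([0, 1, 2], 20)
X[y == 1, :50] += 1.0        # give each class a slightly different profile
X[y == 2, 50:100] += 1.0

# dimensionality reduction followed by a linear discriminant classifier
model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
accuracy = cross_val_score(model, X, y, cv=5).mean()
print(f"cross-validated species-level accuracy ~= {accuracy:.2f}")
```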

  7. Earth System Modeling 2.0: A Blueprint for Models That Learn From Observations and Targeted High-Resolution Simulations

    NASA Astrophysics Data System (ADS)

    Schneider, Tapio; Lan, Shiwei; Stuart, Andrew; Teixeira, João.

    2017-12-01

    Climate projections continue to be marred by large uncertainties, which originate in processes that need to be parameterized, such as clouds, convection, and ecosystems. But rapid progress is now within reach. New computational tools and methods from data assimilation and machine learning make it possible to integrate global observations and local high-resolution simulations in an Earth system model (ESM) that systematically learns from both and quantifies uncertainties. Here we propose a blueprint for such an ESM. We outline how parameterization schemes can learn from global observations and targeted high-resolution simulations, for example, of clouds and convection, through matching low-order statistics between ESMs, observations, and high-resolution simulations. We illustrate learning algorithms for ESMs with a simple dynamical system that shares characteristics of the climate system; and we discuss the opportunities the proposed framework presents and the challenges that remain to realize it.

  8. ontologyX: a suite of R packages for working with ontological data.

    PubMed

    Greene, Daniel; Richardson, Sylvia; Turro, Ernest

    2017-04-01

    Ontologies are widely used constructs for encoding and analyzing biomedical data, but the absence of simple and consistent tools has made exploratory and systematic analysis of such data unnecessarily difficult. Here we present three packages which aim to simplify such procedures. The ontologyIndex package enables arbitrary ontologies to be read into R, supports representation of ontological objects by native R types, and provides a parsimonious set of performant functions for querying ontologies. ontologySimilarity and ontologyPlot extend ontologyIndex with functionality for straightforward visualization and semantic similarity calculations, including statistical routines. ontologyIndex, ontologyPlot and ontologySimilarity are all available on the Comprehensive R Archive Network website under https://cran.r-project.org/web/packages/. Daniel Greene dg333@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  9. Resource materials for a GIS spatial analysis course

    USGS Publications Warehouse

    Raines, Gary L.

    2001-01-01

    This report consists of materials prepared for a GIS spatial analysis course offered as part of the Geography curriculum at the University of Nevada, Reno and the University of California at Santa Barbara in the spring of 2000. The report is intended to share information with instructors preparing spatial-modeling training and scientists with advanced GIS expertise. The students taking this class had completed each university's GIS curriculum and had a foundation in statistics as part of a science major. This report is organized into chapters that contain the following: Slides used during lectures, Guidance on the use of Arcview, Introduction to filtering in Arcview, Conventional and spatial correlation in Arcview, Tools for fuzzification in Arcview, Data and instructions for using ArcSDM to create simple weights-of-evidence, fuzzy logic, and neural network models for Carlin-type gold deposits in central Nevada, Reading list on spatial modeling, and Selected student spatial-modeling posters from the laboratory exercises.

  10. DAnTE: a statistical tool for quantitative analysis of –omics data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Polpitiya, Ashoka D.; Qian, Weijun; Jaitly, Navdeep

    2008-05-03

    DAnTE (Data Analysis Tool Extension) is a statistical tool designed to address challenges unique to quantitative bottom-up, shotgun proteomics data. This tool has also been demonstrated for microarray data and can easily be extended to other high-throughput data types. DAnTE features selected normalization methods, missing value imputation algorithms, peptide to protein rollup methods, an extensive array of plotting functions, and a comprehensive ANOVA scheme that can handle unbalanced data and random effects. The Graphical User Interface (GUI) is designed to be very intuitive and user friendly.

  11. Evaluating statistical consistency in the ocean model component of the Community Earth System Model (pyCECT v2.0)

    NASA Astrophysics Data System (ADS)

    Baker, Allison H.; Hu, Yong; Hammerling, Dorit M.; Tseng, Yu-heng; Xu, Haiying; Huang, Xiaomeng; Bryan, Frank O.; Yang, Guangwen

    2016-07-01

    The Parallel Ocean Program (POP), the ocean model component of the Community Earth System Model (CESM), is widely used in climate research. Most current work in CESM-POP focuses on improving the model's efficiency or accuracy, such as improving numerical methods, advancing parameterization, porting to new architectures, or increasing parallelism. Since ocean dynamics are chaotic in nature, achieving bit-for-bit (BFB) identical results in ocean solutions cannot be guaranteed for even tiny code modifications, and determining whether modifications are admissible (i.e., statistically consistent with the original results) is non-trivial. In recent work, an ensemble-based statistical approach was shown to work well for software verification (i.e., quality assurance) on atmospheric model data. The general idea of the ensemble-based statistical consistency testing is to use a qualitative measurement of the variability of the ensemble of simulations as a metric with which to compare future simulations and make a determination of statistical distinguishability. The capability to determine consistency without BFB results boosts model confidence and provides the flexibility needed, for example, for more aggressive code optimizations and the use of heterogeneous execution environments. Since ocean and atmosphere models have differing characteristics in terms of dynamics, spatial variability, and timescales, we present a new statistical method to evaluate ocean model simulation data that requires the evaluation of ensemble means and deviations in a spatial manner. In particular, the statistical distribution from an ensemble of CESM-POP simulations is used to determine the standard score of any new model solution at each grid point. Then the percentage of points that have scores greater than a specified threshold indicates whether the new model simulation is statistically distinguishable from the ensemble simulations. Both ensemble size and composition are important. Our experiments indicate that the new POP ensemble consistency test (POP-ECT) tool is capable of distinguishing cases that should be statistically consistent with the ensemble and those that should not, as well as providing a simple, subjective and systematic way to detect errors in CESM-POP due to the hardware or software stack, positively contributing to quality assurance for the CESM-POP code.
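
    A stripped-down version of the grid-point scoring idea can be sketched as follows: compute the ensemble mean and standard deviation at every grid point, convert a new run to standard scores, and report the fraction of points exceeding a threshold. The ensemble, the threshold, and the pass/fail tolerance are illustrative stand-ins, not the POP-ECT defaults.

```python
import numpy as np

# ensemble of accepted model fields, shape (n_members, nlat, nlon), plus one new run
rng = np.random.default_rng(4)
ensemble = rng.standard_normal((40, 60, 90)) * 0.5 + 10.0
new_run = rng.standard_normal((60, 90)) * 0.5 + 10.0

mean = ensemble.mean(axis=0)
std = ensemble.std(axis=0, ddof=1)

z = np.abs(new_run - mean) / std            # standard score at every grid point
threshold = 3.0                             # illustrative score threshold
fail_fraction = np.mean(z > threshold)

print(f"{100 * fail_fraction:.2f}% of grid points exceed |z| > {threshold}")
# If this percentage exceeds a chosen tolerance, the new simulation would be flagged
# as statistically distinguishable from the accepted ensemble.
```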

  12. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR-based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable estimate of copy number with a prediction value. Despite recent progress in statistical analysis of real-time PCR, few publications have integrated these advancements in real-time PCR based transgene copy number determination. Three experimental designs and four data quality control integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially-diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two group T-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of transgene was compared with that of internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, the transgene copy number is compared with that of the reference gene without a standard curve, based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data based on two different approaches to amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods make real-time PCR-based transgene copy number estimation more reliable and precise. Proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
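
    A minimal sketch of the first experimental design, an external calibration curve from serially diluted templates followed by a two-group t-test on Ct values, is given below; the dilution series, the replicate Ct values, and the assumption of near-100% amplification efficiency in the delta-Ct conversion are illustrative.

```python
import numpy as np
from scipy import stats

# serial-dilution standard curve: log10(template copies) vs observed Ct (illustrative values)
log_copies = np.array([3, 4, 5, 6, 7], dtype=float)
ct_standard = np.array([31.2, 27.9, 24.5, 21.1, 17.8])
slope, intercept = np.polyfit(log_copies, ct_standard, 1)   # Ct = slope*log10(copies) + intercept
efficiency = 10 ** (-1 / slope) - 1
print(f"amplification efficiency ~= {efficiency:.2f}")

# replicate Ct values for a single-copy control event and a putative transgenic event
ct_control = np.array([24.6, 24.4, 24.7, 24.5])
ct_putative = np.array([23.5, 23.6, 23.4, 23.7])

t_stat, p_value = stats.ttest_ind(ct_control, ct_putative)
delta_ct = ct_control.mean() - ct_putative.mean()
copy_ratio = 2 ** delta_ct            # assumes roughly 100% efficiency for simplicity
print(f"delta Ct = {delta_ct:.2f}, estimated copy ratio ~= {copy_ratio:.1f}, t-test p = {p_value:.3g}")
```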

  13. External validation of a simple clinical tool used to predict falls in people with Parkinson disease

    PubMed Central

    Duncan, Ryan P.; Cavanaugh, James T.; Earhart, Gammon M.; Ellis, Terry D.; Ford, Matthew P.; Foreman, K. Bo; Leddy, Abigail L.; Paul, Serene S.; Canning, Colleen G.; Thackeray, Anne; Dibble, Leland E.

    2015-01-01

    Background: Assessment of fall risk in an individual with Parkinson disease (PD) is a critical yet often time consuming component of patient care. Recently a simple clinical prediction tool based only on fall history in the previous year, freezing of gait in the past month, and gait velocity <1.1 m/s was developed and accurately predicted future falls in a sample of individuals with PD. Methods: We sought to externally validate the utility of the tool by administering it to a different cohort of 171 individuals with PD. Falls were monitored prospectively for 6 months following predictor assessment. Results: The tool accurately discriminated future fallers from non-fallers (area under the curve [AUC] = 0.83; 95% CI 0.76-0.89), comparable to the developmental study. Conclusion: The results validated the utility of the tool for allowing clinicians to quickly and accurately identify an individual’s risk of an impending fall. PMID:26003412

  14. External validation of a simple clinical tool used to predict falls in people with Parkinson disease.

    PubMed

    Duncan, Ryan P; Cavanaugh, James T; Earhart, Gammon M; Ellis, Terry D; Ford, Matthew P; Foreman, K Bo; Leddy, Abigail L; Paul, Serene S; Canning, Colleen G; Thackeray, Anne; Dibble, Leland E

    2015-08-01

    Assessment of fall risk in an individual with Parkinson disease (PD) is a critical yet often time consuming component of patient care. Recently a simple clinical prediction tool based only on fall history in the previous year, freezing of gait in the past month, and gait velocity <1.1 m/s was developed and accurately predicted future falls in a sample of individuals with PD. We sought to externally validate the utility of the tool by administering it to a different cohort of 171 individuals with PD. Falls were monitored prospectively for 6 months following predictor assessment. The tool accurately discriminated future fallers from non-fallers (area under the curve [AUC] = 0.83; 95% CI 0.76-0.89), comparable to the developmental study. The results validated the utility of the tool for allowing clinicians to quickly and accurately identify an individual's risk of an impending fall. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Using Quality Management Tools to Enhance Feedback from Student Evaluations

    ERIC Educational Resources Information Center

    Jensen, John B.; Artz, Nancy

    2005-01-01

    Statistical tools found in the service quality assessment literature--the T² statistic combined with factor analysis--can enhance the feedback instructors receive from student ratings. T² examines variability across multiple sets of ratings to isolate individual respondents with aberrant response…

  16. Benefits and Pitfalls: Simple Guidelines for the Use of Social Networking Tools in K-12 Education

    ERIC Educational Resources Information Center

    Huffman, Stephanie

    2013-01-01

    The article will outline a framework for the use of social networking tools in K-12 education framed around four thought-provoking questions: 1) what are the benefits and pitfalls of using social networking tools in P-12 education, 2) how do we plan effectively for the use of social networking tools, 3) what role does professional development play…

  17. SimHap GUI: an intuitive graphical user interface for genetic association analysis.

    PubMed

    Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

    2008-12-25

    Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that would allow anyone but a professional statistician to utilise them effectively. We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis.

  18. The Simple Concurrent Online Processing System (SCOPS) - An open-source interface for remotely sensed data processing

    NASA Astrophysics Data System (ADS)

    Warren, M. A.; Goult, S.; Clewley, D.

    2018-06-01

    Advances in technology allow remotely sensed data to be acquired with increasingly higher spatial and spectral resolutions. These data may then be used to influence government decision making and solve a number of research and application driven questions. However, such large volumes of data can be difficult to handle on a single personal computer or on older machines with slower components. Often the software required to process data is varied and can be highly technical and too advanced for the novice user to fully understand. This paper describes an open-source tool, the Simple Concurrent Online Processing System (SCOPS), which forms part of an airborne hyperspectral data processing chain that allows users accessing the tool over a web interface to submit jobs and process data remotely. It is demonstrated using Natural Environment Research Council Airborne Research Facility (NERC-ARF) instruments together with other free- and open-source tools to take radiometrically corrected data from sensor geometry into geocorrected form and to generate simple or complex band ratio products. The final processed data products are acquired via an HTTP download. SCOPS can cut data processing times and introduce complex processing software to novice users by distributing jobs across a network using a simple to use web interface.
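
    As a small example of the "simple band ratio product" step in such a chain, the sketch below computes a normalised-difference ratio from two bands with NumPy; the band arrays and the NDVI-style formula are assumptions for illustration and are not part of SCOPS itself.

```python
import numpy as np

def band_ratio(numerator, denominator):
    """Simple band-ratio product, e.g. an NDVI-style normalised difference."""
    numerator = numerator.astype(float)
    denominator = denominator.astype(float)
    return (numerator - denominator) / (numerator + denominator + 1e-12)

# stand-ins for two geocorrected hyperspectral bands (rows x cols of reflectance)
rng = np.random.default_rng(5)
nir = rng.uniform(0.2, 0.6, size=(512, 512))
red = rng.uniform(0.05, 0.2, size=(512, 512))

ndvi = band_ratio(nir, red)
print(f"ratio product range: {ndvi.min():.2f} to {ndvi.max():.2f}")
```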

  19. A numerical tool for reproducing driver behaviour: experiments and predictive simulations.

    PubMed

    Casucci, M; Marchitto, M; Cacciabue, P C

    2010-03-01

    This paper presents the simulation tool called SDDRIVE (Simple Simulation of Driver performance), which is the numerical computerised implementation of the theoretical architecture describing Driver-Vehicle-Environment (DVE) interactions, contained in Cacciabue and Carsten [Cacciabue, P.C., Carsten, O. A simple model of driver behaviour to sustain design and safety assessment of automated systems in automotive environments, 2010]. Following a brief description of the basic algorithms that simulate the performance of drivers, the paper presents and discusses a set of experiments carried out in a Virtual Reality full scale simulator for validating the simulation. Then the predictive potentiality of the tool is shown by discussing two case studies of DVE interactions, performed in the presence of different driver attitudes in similar traffic conditions.

  20. Graphical correlation of gaging-station records

    USGS Publications Warehouse

    Searcy, James K.

    1960-01-01

    A gaging-station record is a sample of the rate of flow of a stream at a given site. This sample can be used to estimate the magnitude and distribution of future flows if the record is long enough to be representative of the long-term flow of the stream. The reliability of a short-term record for estimating future flow characteristics can be improved through correlation with a long-term record. Correlation can be either numerical or graphical, but graphical correlation of gaging-station records has several advantages. The graphical correlation method is described in a step-by-step procedure with an illustrative problem of simple correlation and three illustrative examples of multiple correlation: one on removing seasonal effect and two on correlating one record with two other records. Except in the problem on removal of seasonal effect, the same group of stations is used in the illustrative problems. The purpose of the problems is to illustrate the method--not to show the improvement that can result from multiple correlation as compared with simple correlation. Hydrologic factors determine whether a usable relation exists between gaging-station records. Statistics is only a tool for evaluating and using an existing relation, and the investigator must be guided by a knowledge of hydrology.

  1. Direct Detection of Singlet-Triplet Interconversion in OLED Magnetoelectroluminescence with a Metal-Free Fluorescence-Phosphorescence Dual Emitter

    NASA Astrophysics Data System (ADS)

    Ratzke, Wolfram; Bange, Sebastian; Lupton, John M.

    2018-05-01

    We demonstrate that a simple phenazine derivative can serve as a dual emitter for organic light-emitting diodes, showing simultaneous luminescence from the singlet and triplet excited states at room temperature without the need of heavy-atom substituents. Although devices made with this emitter achieve only low quantum efficiencies of <0.2 % , changes in fluorescence and phosphorescence intensity on the subpercent scale caused by an external magnetic field of up to 30 mT are clearly resolved with an ultra-low-noise optical imaging technique. The results demonstrate the concept of using simple reporter molecules, available commercially, to optically detect the spin of excited states formed in an organic light-emitting diode and thereby probe the underlying spin statistics of recombining electron-hole pairs. A clear anticorrelation of the magnetic-field dependence of singlet and triplet emission shows that it is the spin interconversion between singlet and triplet which dominates the magnetoluminescence response: the phosphorescence intensity decreases by the same amount as the fluorescence intensity increases. The concurrent detection of singlet and triplet emission as well as device resistance at cryogenic and room temperature constitute a useful tool to disentangle the effects of spin-dependent recombination from spin-dependent transport mechanisms.

  2. Audible handheld Doppler ultrasound determines reliable and inexpensive exclusion of significant peripheral arterial disease.

    PubMed

    Alavi, Afsaneh; Sibbald, R Gary; Nabavizadeh, Reza; Valaei, Farnaz; Coutts, Pat; Mayer, Dieter

    2015-12-01

    To determine the accuracy of audible arterial foot signals with an audible handheld Doppler ultrasound for identification of significant peripheral arterial disease as a simple, quick, and readily available bedside screening tool. Two hundred consecutive patients referred to an interprofessional wound care clinic underwent audible handheld Doppler ultrasound of both legs. As a control and comparator, a formal bilateral lower leg vascular study including the calculation of Ankle Brachial Pressure Index and toe pressure (TP) was performed at the vascular lab. Diagnostic reliability of audible handheld Doppler ultrasound was calculated versus Ankle Brachial Pressure Index as the gold standard test, yielding a sensitivity of 42.8%, a specificity of 97.5%, a negative predictive value of 94.10%, a positive predictive value of 65.22%, a positive likelihood ratio of 17.52, and a negative likelihood ratio of 0.59. The univariable logistic regression model had an area under the curve of 0.78. There was a statistically significant difference at the 5% level between the univariable and multivariable areas under the curve of the dorsalis pedis and posterior tibial models (p < 0.001). Audible handheld Doppler ultrasound proved to be a reliable, simple, rapid, and inexpensive bedside exclusion test of peripheral arterial disease in diabetic and nondiabetic patients. © The Author(s) 2015.
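
    The summary figures quoted above all follow from a 2 x 2 comparison against the gold standard; the sketch below shows the arithmetic on an invented table (the counts are not reconstructed from the study, so the resulting values differ from those reported).

```python
# Illustrative 2x2 table: rows = index test result, columns = gold standard result.
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values and likelihood ratios from counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return sens, spec, ppv, npv, lr_pos, lr_neg

sens, spec, ppv, npv, lr_pos, lr_neg = diagnostic_metrics(tp=15, fp=4, fn=20, tn=161)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} PPV={ppv:.1%} NPV={npv:.1%}")
print(f"LR+={lr_pos:.1f} LR-={lr_neg:.2f}")
```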

  3. Individual and population pharmacokinetic compartment analysis: a graphic procedure for quantification of predictive performance.

    PubMed

    Eksborg, Staffan

    2013-01-01

    Pharmacokinetic studies are important for optimizing drug dosing, but they require proper validation of the pharmacokinetic procedures used. However, simple and reliable statistical methods suitable for evaluation of the predictive performance of pharmacokinetic analysis are essentially lacking. The aim of the present study was to construct and evaluate a graphic procedure for quantification of predictive performance of individual and population pharmacokinetic compartment analysis. Original data from previously published pharmacokinetic compartment analyses after intravenous, oral, and epidural administration, and digitized data, obtained from published scatter plots of observed vs predicted drug concentrations from population pharmacokinetic studies using the NPEM algorithm and NONMEM computer program and Bayesian forecasting procedures, were used for estimating the predictive performance according to the proposed graphical method and by the method of Sheiner and Beal. The graphical plot proposed in the present paper proved to be a useful tool for evaluation of predictive performance of both individual and population compartment pharmacokinetic analysis. The proposed method is simple to use and gives valuable information concerning time- and concentration-dependent inaccuracies that might occur in individual and population pharmacokinetic compartment analysis. Predictive performance can be quantified by the fraction of concentration ratios within arbitrarily specified ranges, e.g. within the range 0.8-1.2.
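
    The quantification step described in the last sentence reduces to simple arithmetic on paired observed and predicted concentrations, as in the sketch below; the concentration values are invented for illustration.

```python
import numpy as np

# paired observed and model-predicted drug concentrations (illustrative values)
observed = np.array([1.8, 2.6, 3.1, 4.4, 5.0, 6.2, 7.9])
predicted = np.array([1.6, 2.9, 3.0, 4.9, 4.6, 6.0, 10.2])

ratio = predicted / observed
within = np.mean((ratio >= 0.8) & (ratio <= 1.2))
print(f"{100 * within:.0f}% of predicted/observed ratios fall within 0.8-1.2")
```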

  4. Cumulative sum control charts for assessing performance in arterial surgery.

    PubMed

    Beiles, C Barry; Morton, Anthony P

    2004-03-01

    The Melbourne Vascular Surgical Association (Melbourne, Australia) undertakes surveillance of mortality following aortic aneurysm surgery, patency at discharge following infrainguinal bypass and stroke and death following carotid endarterectomy. Quality improvement protocol employing the Deming cycle requires that the system for performing surgery first be analysed and optimized. Then process and outcome data are collected and these data require careful analysis. There must be a mechanism so that the causes of unsatisfactory outcomes can be determined and a good feedback mechanism must exist so that good performance is acknowledged and unsatisfactory performance corrected. A simple method for analysing these data that detects changes in average outcome rates is available using cumulative sum statistical control charts. Data have been analysed both retrospectively from 1999 to 2001, and prospectively during 2002 using cumulative sum control methods. A pathway to deal with control chart signals has been developed. The standard of arterial surgery in Victoria, Australia, is high. In one case a safe and satisfactory outcome was achieved by following the pathway developed by the audit committee. Cumulative sum control charts are a simple and effective tool for the identification of variations in performance standards in arterial surgery. The establishment of a pathway to manage problem performance is a vital part of audit activity.
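
    A basic observed-minus-expected CUSUM of the kind used for such surveillance can be sketched in a few lines; the outcome series, the accepted adverse-outcome rate, and the decision limit are illustrative placeholders (in practice, limits are derived from an agreed risk model).

```python
import numpy as np

# 1 = adverse outcome (e.g. death or graft failure), 0 = satisfactory outcome, in case order
outcomes = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
expected_rate = 0.10          # accepted average adverse-outcome rate (assumed)

# observed-minus-expected CUSUM: a sustained upward drift signals deteriorating performance
cusum = np.cumsum(outcomes - expected_rate)
signal_limit = 2.0            # illustrative decision limit
print(np.round(cusum, 2))
print("signal" if cusum.max() > signal_limit else "no signal")
```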

  5. Climate change impact assessment on food security in Indonesia

    NASA Astrophysics Data System (ADS)

    Ettema, Janneke; Aldrian, Edvin; de Bie, Kees; Jetten, Victor; Mannaerts, Chris

    2013-04-01

    As Indonesia is the world's fourth most populous country, food security is a persistent challenge. The potential impact of future climate change on the agricultural sector needs to be addressed in order to allow early implementation of mitigation strategies. The complex island topography and local sea-land-air interactions cannot be adequately represented in large-scale general circulation models (GCMs) nor resolved by TRMM. Downscaling is needed. Using meteorological observations and a simple statistical downscaling tool, local future projections are derived from state-of-the-art, large-scale GCM scenarios, provided by the CMIP5 project. To support the agriculture sector, providing information on especially rainfall and temperature variability is essential. Agricultural production forecasts are influenced by several rain and temperature factors, such as rainy and dry season onset, offset, and length, as well as by daily and monthly minimum and maximum temperatures and rainfall amounts. Both a simple and an advanced crop model will be used to address the sensitivity of different crops to temperature and rainfall variability, present-day and future. Java Island is chosen as the case study area: although it is only the fourth-largest island in Indonesia, it contains more than half of the nation's population and dominates the country politically and economically. The objective is to identify regions at agricultural risk due to changing patterns in precipitation and temperature.

  6. Humans make efficient use of natural image statistics when performing spatial interpolation.

    PubMed

    D'Antona, Anthony D; Perry, Jeffrey S; Geisler, Wilson S

    2013-12-16

    Visual systems learn through evolution and experience over the lifespan to exploit the statistical structure of natural images when performing visual tasks. Understanding which aspects of this statistical structure are incorporated into the human nervous system is a fundamental goal in vision science. To address this goal, we measured human ability to estimate the intensity of missing image pixels in natural images. Human estimation accuracy is compared with various simple heuristics (e.g., local mean) and with optimal observers that have nearly complete knowledge of the local statistical structure of natural images. Human estimates are more accurate than those of simple heuristics, and they match the performance of an optimal observer that knows the local statistical structure of relative intensities (contrasts). This optimal observer predicts the detailed pattern of human estimation errors and hence the results place strong constraints on the underlying neural mechanisms. However, humans do not reach the performance of an optimal observer that knows the local statistical structure of the absolute intensities, which reflect both local relative intensities and local mean intensity. As predicted from a statistical analysis of natural images, human estimation accuracy is negligibly improved by expanding the context from a local patch to the whole image. Our results demonstrate that the human visual system exploits efficiently the statistical structure of natural images.

  7. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  8. Diagnosis of Late-Stage, Early-Onset, Small-Fiber Polyneuropathy

    DTIC Science & Technology

    2016-10-01

    develop biotechnology tools for simple diagnosis (sweat testing and pupilometry), 3) identify gene polymorphisms to detect risk for SFPN. None...Goal 4) Specific Aim 2: To develop and evaluate simple biotechnology devices for diagnosing and monitoring longstanding eoSFPN based on

  9. Computational Modeling of Statistical Learning: Effects of Transitional Probability versus Frequency and Links to Word Learning

    ERIC Educational Resources Information Center

    Mirman, Daniel; Estes, Katharine Graf; Magnuson, James S.

    2010-01-01

    Statistical learning mechanisms play an important role in theories of language acquisition and processing. Recurrent neural network models have provided important insights into how these mechanisms might operate. We examined whether such networks capture two key findings in human statistical learning. In Simulation 1, a simple recurrent network…

  10. A Laboratory Experiment, Based on the Maillard Reaction, Conducted as a Project in Introductory Statistics

    ERIC Educational Resources Information Center

    Kravchuk, Olena; Elliott, Antony; Bhandari, Bhesh

    2005-01-01

    A simple laboratory experiment, based on the Maillard reaction, served as a project in Introductory Statistics for undergraduates in Food Science and Technology. By using the principles of randomization and replication and reflecting on the sources of variation in the experimental data, students reinforced the statistical concepts and techniques…

  11. Simple Nutrition Screening Tool for Pediatric Inpatients.

    PubMed

    White, Melinda; Lawson, Karen; Ramsey, Rebecca; Dennis, Nicole; Hutchinson, Zoe; Soh, Xin Ying; Matsuyama, Misa; Doolan, Annabel; Todd, Alwyn; Elliott, Aoife; Bell, Kristie; Littlewood, Robyn

    2016-03-01

    Pediatric nutrition risk screening tools are not routinely implemented throughout many hospitals, despite prevalence studies demonstrating malnutrition is common in hospitalized children. Existing tools lack the simplicity of those used to assess nutrition risk in the adult population. This study reports the accuracy of a new, quick, and simple pediatric nutrition screening tool (PNST) designed to be used for pediatric inpatients. The pediatric Subjective Global Nutrition Assessment (SGNA) and anthropometric measures were used to develop and assess the validity of 4 simple nutrition screening questions comprising the PNST. Participants were pediatric inpatients in 2 tertiary pediatric hospitals and 1 regional hospital. Two affirmative answers to the PNST questions were found to maximize the specificity and sensitivity to the pediatric SGNA and body mass index (BMI) z scores for malnutrition in 295 patients. The PNST identified 37.6% of patients as being at nutrition risk, whereas the pediatric SGNA identified 34.2%. The sensitivity and specificity of the PNST compared with the pediatric SGNA were 77.8% and 82.1%, respectively. The sensitivity of the PNST at detecting patients with a BMI z score of less than -2 was 89.3%, and the specificity was 66.2%. Both the PNST and pediatric SGNA were relatively poor at detecting patients who were stunted or overweight, with the sensitivity and specificity being less than 69%. The PNST provides a sensitive, valid, and simpler alternative to existing pediatric nutrition screening tools such as Screening Tool for the Assessment of Malnutrition in Pediatrics (STAMP), Screening Tool Risk on Nutritional status and Growth (STRONGkids), and Paediatric Yorkhill Malnutrition Score (PYMS) to ensure the early detection of hospitalized children at nutrition risk. © 2014 American Society for Parenteral and Enteral Nutrition.

  12. STATISTICAL TECHNIQUES FOR DETERMINATION AND PREDICTION OF FUNDAMENTAL FISH ASSEMBLAGES OF THE MID-ATLANTIC HIGHLANDS

    EPA Science Inventory

    A statistical software tool, Stream Fish Community Predictor (SFCP), based on EMAP stream sampling in the mid-Atlantic Highlands, was developed to predict stream fish communities using stream and watershed characteristics. Step one in the tool development was a cluster analysis t...

  13. A Statistical Project Control Tool for Engineering Managers

    NASA Technical Reports Server (NTRS)

    Bauch, Garland T.

    2001-01-01

    This slide presentation reviews the use of a Statistical Project Control Tool (SPCT) for managing engineering projects. A literature review pointed to a definition of project success (i.e., a project is successful when the cost, schedule, technical performance, and quality satisfy the customer), as well as to project success factors, traditional project control tools, and performance measures, which are detailed in the report. The essential problem is that with resources becoming more limited and the number of projects increasing, project failures are increasing; existing methods are limited, and systematic methods are required. The objective of the work is to provide a new statistical project control tool for project managers. Graphs using the SPCT method plotting results of 3 successful projects and 3 failed projects are reviewed, with success and failure being defined by the owner.

  14. Statistical bias correction method applied on CMIP5 datasets over the Indian region during the summer monsoon season for climate change applications

    NASA Astrophysics Data System (ADS)

    Prasanna, V.

    2018-01-01

    This study makes use of temperature and precipitation from CMIP5 climate model output for climate change application studies over the Indian region during the summer monsoon season (JJAS). Bias correction of temperature and precipitation from CMIP5 GCM simulation results with respect to observation is discussed in detail. The non-linear statistical bias correction is a suitable bias correction method for climate change data because it is simple and does not add artificial uncertainties to the impact assessment of climate change scenarios for climate change application studies (agricultural production changes) in the future. The simple statistical bias correction uses observational constraints on the GCM baseline, and the projected results are scaled with respect to the changing magnitude in future scenarios, varying from one model to the other. Two types of bias correction techniques are shown here: (1) a simple bias correction using a percentile-based quantile-mapping algorithm and (2) a simple but improved bias correction method, a cumulative distribution function (CDF; Weibull distribution function)-based quantile-mapping algorithm. This study shows that the percentile-based and CDF (Weibull)-based quantile-mapping methods give comparable results. The bias correction is applied to temperature and precipitation variables for present climate and future projected data to make use of it in a simple statistical model to understand the future changes in crop production over the Indian region during the summer monsoon season. In total, 12 CMIP5 models are used for Historical (1901-2005), RCP4.5 (2005-2100), and RCP8.5 (2005-2100) scenarios. The climate index from each CMIP5 model and the observed agricultural yield index over the Indian region are used in a regression model to project the changes in the agricultural yield over India from RCP4.5 and RCP8.5 scenarios. The results revealed a better convergence of model projections in the bias-corrected data compared to the uncorrected data. The study can be extended to localized regional domains aimed at understanding the changes in the agricultural productivity in the future with an agro-economy or a simple statistical model. The statistical model indicated that the total food grain yield is going to increase over the Indian region in the future: the increase is approximately 50 kg/ha for the RCP4.5 scenario from 2001 until the end of 2100, and approximately 90 kg/ha for the RCP8.5 scenario over the same period. There are many studies using bias correction techniques, but this study applies the bias correction technique to future climate scenario data from CMIP5 models and couples it with crop statistics to estimate future crop yield changes over the Indian region.
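
    The percentile-based quantile-mapping step can be sketched as below: each future model value is placed at its percentile within the historical model distribution and then mapped to the corresponding observed percentile. The gamma-distributed stand-in data are illustrative, and this basic form omits the CDF (Weibull) fitting and the scaling of the projected change discussed in the abstract.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_future):
    """Percentile-based quantile mapping: look up each future model value's percentile
    in the historical model distribution, then read off the observed value at that
    percentile."""
    percentiles = np.linspace(0, 100, 101)
    model_q = np.percentile(model_hist, percentiles)
    obs_q = np.percentile(obs_hist, percentiles)
    future_pct = np.interp(model_future, model_q, percentiles)
    return np.interp(future_pct, percentiles, obs_q)

rng = np.random.default_rng(7)
obs_hist = rng.gamma(shape=2.0, scale=5.0, size=3000)      # stand-in observed JJAS rainfall
model_hist = rng.gamma(shape=2.0, scale=4.0, size=3000)    # biased GCM baseline
model_future = rng.gamma(shape=2.0, scale=4.6, size=3000)  # projected values

corrected = quantile_map(model_hist, obs_hist, model_future)
print(f"raw future mean {model_future.mean():.2f}, bias-corrected mean {corrected.mean():.2f}")
```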

  15. Use of simple models to determine wake vortex categories for new aircraft.

    DOT National Transportation Integrated Search

    2015-06-22

    The paper describes how to use simple models and, if needed, sensitivity analyses to determine the wake vortex categories for new aircraft. The methodology provides a tool for the regulators to assess the relative risk of introducing new aircraft int...

  16. Detection of outliers in the response and explanatory variables of the simple circular regression model

    NASA Astrophysics Data System (ADS)

    Mahmood, Ehab A.; Rana, Sohel; Hussin, Abdul Ghapor; Midi, Habshah

    2016-06-01

    The circular regression model may contain one or more data points that appear to be peculiar or inconsistent with the main part of the model. This may occur due to recording errors, sudden short events, sampling under abnormal conditions, etc. The existence of these data points ("outliers") in the data set causes many problems in the research results and conclusions. Therefore, we should identify them before applying statistical analysis. In this article, we aim to propose a statistic to identify outliers in both the response and explanatory variables of the simple circular regression model. Our proposed statistic is the robust circular distance RCDxy, and it is justified by three robustness measures: the proportion of detected outliers and the masking and swamping rates.

  17. Use of iPhone technology in improving acetabular component position in total hip arthroplasty.

    PubMed

    Tay, Xiau Wei; Zhang, Benny Xu; Gayagay, George

    2017-09-01

    Improper acetabular cup positioning is associated with a high risk of complications after total hip arthroplasty. The aim of our study is to objectively compare 3 methods, namely (1) free hand, (2) alignment jig (Sputnik), and (3) iPhone application, to identify an easy, reproducible, and accurate method for improving acetabular cup placement. We designed a simple setup and carried out a simple experiment (see Method section). Using statistical analysis, the difference in inclination angles using the iPhone application compared with the freehand method was found to be statistically significant (F [2,51] = 4.17, P = .02) in the "untrained" group. No statistical significance was detected for the other groups. This suggests a potential role for iPhone applications in helping junior surgeons overcome the steep learning curve.
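
    The reported F [2,51] statistic corresponds to a one-way ANOVA across the three placement methods; a minimal sketch of such a comparison is shown below. The inclination-angle data and group sizes are entirely hypothetical (three groups of 18 trials give the same degrees of freedom).

        import numpy as np
        from scipy import stats

        # Hypothetical inclination-angle errors (degrees) for 18 trials per method
        rng = np.random.default_rng(1)
        freehand = rng.normal(8.0, 4.0, 18)
        alignment_jig = rng.normal(6.0, 4.0, 18)
        iphone_app = rng.normal(4.0, 4.0, 18)

        # One-way ANOVA: with 3 groups of 18 trials, df = (2, 51) as in the abstract
        f_stat, p_value = stats.f_oneway(freehand, alignment_jig, iphone_app)
        print(f"F(2, 51) = {f_stat:.2f}, p = {p_value:.3f}")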

  18. A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

    PubMed

    Kosinski, Andrzej S

    2013-03-15

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.

  19. A weighted generalized score statistic for comparison of predictive values of diagnostic tests

    PubMed Central

    Kosinski, Andrzej S.

    2013-01-01

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343

  20. The Shock and Vibration Digest. Volume 15, Number 7

    DTIC Science & Technology

    1983-07-01

    Fragments recovered from this digest issue read: "...systems noise -- for example, from a specific metal, chain driven, con..."; "...[impor]tant analytical tool, the statistical energy analysis method, has been the subject..."; "Experimental Determination of Vibration Parameters Required in the Statistical Energy Analysis Meth..."; 31. Dubowsky, S. and Morris, T.L., "An..."; "Coupling Loss Factors for Statistical Energy Analysis of Sound Trans..."; 55. Upton, R., "Sound Intensity - A Powerful New Measurement Tool," S/V, Sound...

  1. DECONV-TOOL: An IDL based deconvolution software package

    NASA Technical Reports Server (NTRS)

    Varosi, F.; Landsman, W. B.

    1992-01-01

    There are a variety of algorithms for deconvolution of blurred images, each having its own criteria or statistic to be optimized in order to estimate the original image data. Using the Interactive Data Language (IDL), we have implemented the Maximum Likelihood, Maximum Entropy, Maximum Residual Likelihood, and sigma-CLEAN algorithms in a unified environment called DeConv_Tool. Most of the algorithms have as their goal the optimization of statistics such as standard deviation and mean of residuals. Shannon entropy, log-likelihood, and chi-square of the residual auto-correlation are computed by DeConv_Tool for the purpose of determining the performance and convergence of any particular method and comparisons between methods. DeConv_Tool allows interactive monitoring of the statistics and the deconvolved image during computation. The final results, and optionally, the intermediate results, are stored in a structure convenient for comparison between methods and review of the deconvolution computation. The routines comprising DeConv_Tool are available via anonymous FTP through the IDL Astronomy User's Library.
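
    As an illustration of the maximum-likelihood approach implemented in DeConv_Tool: the package itself is written in IDL, so the sketch below is an independent Python analogue of the standard Richardson-Lucy iteration, with a residual statistic monitored at each step in the spirit described above. The function and variable names are hypothetical.

        import numpy as np
        from scipy.signal import fftconvolve

        def richardson_lucy(blurred, psf, n_iter=30):
            """Maximum-likelihood (Richardson-Lucy) deconvolution sketch.

            blurred: 2-D positive float array (the observed image).
            psf: 2-D normalized point spread function.
            """
            estimate = np.full(blurred.shape, blurred.mean(), dtype=float)
            psf_flipped = psf[::-1, ::-1]
            for i in range(n_iter):
                reblurred = fftconvolve(estimate, psf, mode="same")
                ratio = blurred / np.maximum(reblurred, 1e-12)
                estimate *= fftconvolve(ratio, psf_flipped, mode="same")
                # Monitor a residual statistic during iteration, as described above
                resid = blurred - reblurred
                print(f"iter {i:2d}: residual std = {resid.std():.4f}")
            return estimate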

  2. Correlation between cystatin C-based formulas, Schwartz formula and urinary creatinine clearance for glomerular filtration rate estimation in children with kidney disease.

    PubMed

    Safaei-Asl, Afshin; Enshaei, Mercede; Heydarzadeh, Abtin; Maleknejad, Shohreh

    2016-01-01

    Assessment of glomerular filtration rate (GFR) is an important tool for monitoring renal function. Given the limitations of the available methods, we aimed to calculate GFR with cystatin C (Cys C)-based formulas and to determine their correlation with current methods. We studied 72 children (38 boys and 34 girls) with renal disorders. The 24-hour urinary creatinine (Cr) clearance was the gold standard method. GFR was measured with the Schwartz formula and with Cys C-based formulas (Grubb, Hoek, Larsson and Simple). The correlations of these formulas with the gold standard were then determined. Using the Pearson correlation coefficient, a significant positive correlation between all formulas and the standard method was seen (R(2) for the Schwartz, Hoek, Larsson, Grubb and Simple formulas was 0.639, 0.722, 0.705, 0.712, and 0.722, respectively) (P<0.001). Cys C-based formulas could predict the variance of the standard method results with high power. These formulas correlated with the Schwartz formula with R(2) of 0.62-0.65 (intermediate correlation). Linear regression analysis (slope and y-intercept) revealed that the Larsson, Hoek and Grubb formulas can estimate GFR with no statistical difference compared with the standard method, whereas the Schwartz and Simple formulas overestimate GFR. This study shows that Cys C-based formulas have a strong relationship with 24-hour urinary Cr clearance. Hence, they can determine GFR in children with kidney injury more easily and with sufficient accuracy, helping the physician to diagnose renal disease at an early stage and improve the prognosis.
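
    For illustration only, a sketch of two of the estimating equations is given below; the coefficients quoted (the bedside Schwartz constant 0.413 and the Hoek formula 80.35/CysC - 4.32) are values commonly cited in the literature, not taken from this abstract, and should be verified against the original publications before any use. The patient values are hypothetical.

        def schwartz_gfr(height_cm: float, serum_creatinine_mg_dl: float, k: float = 0.413) -> float:
            """Bedside Schwartz estimate of GFR (mL/min/1.73 m^2); k as commonly cited."""
            return k * height_cm / serum_creatinine_mg_dl

        def hoek_gfr(cystatin_c_mg_l: float) -> float:
            """Hoek cystatin C-based estimate of GFR (mL/min/1.73 m^2), as commonly cited."""
            return -4.32 + 80.35 / cystatin_c_mg_l

        # Hypothetical child: height 120 cm, serum creatinine 0.6 mg/dL, cystatin C 0.9 mg/L
        print(schwartz_gfr(120, 0.6), hoek_gfr(0.9))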

  3. Global-Scale Hydrology: Simple Characterization of Complex Simulation

    NASA Technical Reports Server (NTRS)

    Koster, Randal D.

    1999-01-01

    Atmospheric general circulation models (AGCMS) are unique and valuable tools for the analysis of large-scale hydrology. AGCM simulations of climate provide tremendous amounts of hydrological data with a spatial and temporal coverage unmatched by observation systems. To the extent that the AGCM behaves realistically, these data can shed light on the nature of the real world's hydrological cycle. In the first part of the seminar, I will describe the hydrological cycle in a typical AGCM, with some emphasis on the validation of simulated precipitation against observations. The second part of the seminar will focus on a key goal in large-scale hydrology studies, namely the identification of simple, overarching controls on hydrological behavior hidden amidst the tremendous amounts of data produced by the highly complex AGCM parameterizations. In particular, I will show that a simple 50-year-old climatological relation (and a recent extension we made to it) successfully predicts, to first order, both the annual mean and the interannual variability of simulated evaporation and runoff fluxes. The seminar will conclude with an example of a practical application of global hydrology studies. The accurate prediction of weather statistics several months in advance would have tremendous societal benefits, and conventional wisdom today points at the use of coupled ocean-atmosphere-land models for such seasonal-to-interannual prediction. Understanding the hydrological cycle in AGCMs is critical to establishing the potential for such prediction. Our own studies show, among other things, that soil moisture retention can lead to significant precipitation predictability in many midlatitude and tropical regions.

  4. Rapid determination of Swiss cheese composition by Fourier transform infrared/attenuated total reflectance spectroscopy.

    PubMed

    Rodriguez-Saona, L E; Koca, N; Harper, W J; Alvarez, V B

    2006-05-01

    There is a need for rapid and simple techniques that can be used to predict the quality of cheese. The aim of this research was to develop a simple and rapid screening tool for monitoring Swiss cheese composition by using Fourier transform infrared spectroscopy. Twenty Swiss cheese samples from different manufacturers and of different degrees of maturity were evaluated. Direct measurements of Swiss cheese slices (approximately 0.5 g) were made using a MIRacle 3-reflection diamond attenuated total reflectance (ATR) accessory. Reference methods for moisture (vacuum oven), protein content (Kjeldahl), and fat (Babcock) were used. Calibration models were developed based on a cross-validated (leave-one-out approach) partial least squares regression. The information-rich infrared spectral ranges for Swiss cheese samples were from 3,000 to 2,800 cm(-1) and 1,800 to 900 cm(-1). The performance statistics for the cross-validated models gave estimates for the standard error of cross-validation of 0.45, 0.25, and 0.21% for moisture, protein, and fat, respectively, and correlation coefficients r > 0.96. Furthermore, the ATR infrared protocol allowed for the classification of cheeses according to manufacturer and aging based on unique spectral information, especially of carbonyl groups, probably due to their distinctive lipid composition. Attenuated total reflectance infrared spectroscopy allowed for the rapid (approximately 3-min analysis time) and accurate analysis of the composition of Swiss cheese. This technique could contribute to the development of simple and rapid protocols for monitoring complex biochemical changes and predicting the final quality of the cheese.
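
    A minimal sketch of a cross-validated partial least squares calibration of the kind described here, using scikit-learn with entirely synthetic "spectra" in place of the ATR measurements; the component count, matrix sizes, and variable names are assumptions for illustration.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import LeaveOneOut, cross_val_predict

        # Synthetic stand-ins: 20 "spectra" of 500 wavenumber points and a reference value
        rng = np.random.default_rng(0)
        X = rng.normal(size=(20, 500))
        y = X[:, :5].sum(axis=1) + rng.normal(scale=0.2, size=20)   # hypothetical fat (%)

        pls = PLSRegression(n_components=5)
        y_cv = cross_val_predict(pls, X, y, cv=LeaveOneOut()).ravel()

        secv = np.sqrt(np.mean((y - y_cv) ** 2))   # standard error of cross-validation
        r = np.corrcoef(y, y_cv)[0, 1]             # correlation coefficient
        print(f"SECV = {secv:.3f}, r = {r:.3f}")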

  5. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
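
    A short sketch of the statistics discussed in this tutorial, computed with SciPy on hypothetical paired measurements (the predictor and outcome values below are made up for illustration).

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(42)
        x = rng.uniform(0, 10, 30)                   # hypothetical predictor variable
        y = 2.0 + 0.8 * x + rng.normal(0, 1.5, 30)   # hypothetical outcome variable

        pearson_r, p_pearson = stats.pearsonr(x, y)       # linear association
        spearman_rho, p_spearman = stats.spearmanr(x, y)  # monotonic (rank) association
        fit = stats.linregress(x, y)                      # simple linear regression

        print(f"Pearson r = {pearson_r:.2f}, Spearman rho = {spearman_rho:.2f}")
        print(f"y = {fit.intercept:.2f} + {fit.slope:.2f} x, R^2 = {fit.rvalue**2:.2f}")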

  6. Technology: Presentations in the Cloud with a Twist

    ERIC Educational Resources Information Center

    Siegle, Del

    2011-01-01

    Technology tools have come a long way from early word processing applications and opportunities for students to engage in simple programming. Many tools now exist for students to develop and share products in a variety of formats and for a wide range of audiences. PowerPoint is probably the most ubiquitously used tool for student projects. In…

  7. Scratch as a Computational Modelling Tool for Teaching Physics

    ERIC Educational Resources Information Center

    Lopez, Victor; Hernandez, Maria Isabel

    2015-01-01

    The Scratch online authoring tool, which features a simple programming language that has been adapted to primary and secondary students, is being used more and more in schools as it offers students and teachers the opportunity to use a tool to build scientific models and evaluate their behaviour, just as can be done with computational modelling…

  8. Toward an International Classification of Functioning, Disability and Health clinical data collection tool: the Italian experience of developing simple, intuitive descriptions of the Rehabilitation Set categories.

    PubMed

    Selb, Melissa; Gimigliano, Francesca; Prodinger, Birgit; Stucki, Gerold; Pestelli, Germano; Iocco, Maurizio; Boldrini, Paolo

    2017-04-01

    As part of international efforts to develop and implement national models including the specification of ICF-based clinical data collection tools, the Italian rehabilitation community initiated a project to develop simple, intuitive descriptions of the ICF Rehabilitation Set, highlighting the core concept of each category in user-friendly language. This paper outlines the Italian experience in developing simple, intuitive descriptions of the ICF Rehabilitation Set as an ICF-based clinical data collection tool for Italy. Consensus process. Expert conference. Multidisciplinary group of rehabilitation professionals. The first of a two-stage consensus process involved developing an initial proposal for simple, intuitive descriptions of each ICF Rehabilitation Set category, based on descriptions generated in a similar process in China. Stage two involved a consensus conference. Divided into three working groups, participants discussed and voted (vote A) on whether the initially proposed description of each ICF Rehabilitation Set category was simple and intuitive enough for use in daily practice. Afterwards, the categories with descriptions considered ambiguous, i.e. not simple and intuitive enough, were divided among the working groups, which were asked to propose a new description for their allocated categories. These proposals were then voted on (vote B) in a plenary session. In the last step of the consensus conference, each working group developed a new proposal for the same remaining categories whose descriptions were still considered ambiguous. Participants then voted (final vote) for whichever of the three proposed descriptions they preferred. Nineteen clinicians from diverse rehabilitation disciplines and various regions of Italy participated in the consensus process. Three ICF categories already achieved consensus in vote A, while 20 ICF categories were accepted in vote B. The remaining 7 categories were decided in the final vote. The findings were discussed in light of current efforts toward developing strategies for ICF implementation, specifically for the application of an ICF-based clinical data collection tool, not only in Italy but also in the rest of Europe. The resulting descriptions are promising as minimal standards for monitoring the impact of interventions and for standardized reporting of functioning as a relevant outcome in rehabilitation.

  9. Risk assessment tools to identify women with increased risk of osteoporotic fracture: complexity or simplicity? A systematic review.

    PubMed

    Rubin, Katrine Hass; Friis-Holmberg, Teresa; Hermann, Anne Pernille; Abrahamsen, Bo; Brixen, Kim

    2013-08-01

    A huge number of risk assessment tools have been developed. Far from all have been validated in external studies, more of them have absence of methodological and transparent evidence, and few are integrated in national guidelines. Therefore, we performed a systematic review to provide an overview of existing valid and reliable risk assessment tools for prediction of osteoporotic fractures. Additionally, we aimed to determine if the performance of each tool was sufficient for practical use, and last, to examine whether the complexity of the tools influenced their discriminative power. We searched PubMed, Embase, and Cochrane databases for papers and evaluated these with respect to methodological quality using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS) checklist. A total of 48 tools were identified; 20 had been externally validated, however, only six tools had been tested more than once in a population-based setting with acceptable methodological quality. None of the tools performed consistently better than the others and simple tools (i.e., the Osteoporosis Self-assessment Tool [OST], Osteoporosis Risk Assessment Instrument [ORAI], and Garvan Fracture Risk Calculator [Garvan]) often did as well or better than more complex tools (i.e., Simple Calculated Risk Estimation Score [SCORE], WHO Fracture Risk Assessment Tool [FRAX], and Qfracture). No studies determined the effectiveness of tools in selecting patients for therapy and thus improving fracture outcomes. High-quality studies in randomized design with population-based cohorts with different case mixes are needed. Copyright © 2013 American Society for Bone and Mineral Research.

  10. The Math Problem: Advertising Students' Attitudes toward Statistics

    ERIC Educational Resources Information Center

    Fullerton, Jami A.; Kendrick, Alice

    2013-01-01

    This study used the Students' Attitudes toward Statistics Scale (STATS) to measure attitude toward statistics among a national sample of advertising students. A factor analysis revealed four underlying factors make up the attitude toward statistics construct--"Interest & Future Applicability," "Confidence," "Statistical Tools," and "Initiative."…

  11. A Step Beyond Simple Keyword Searches: Services Enabled by a Full Content Digital Journal Archive

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis J.

    2003-01-01

    The problems of managing and searching large archives of scientific journal articles can potentially be addressed through data mining and statistical techniques matured primarily for quantitative scientific data analysis. A journal paper could be represented by a multivariate descriptor, e.g., the occurrence counts of a number of key technical terms or phrases (keywords), perhaps derived from a controlled vocabulary (e.g., the American Meteorological Society's Glossary of Meteorology) or bootstrapped from the journal archive itself. With this representation, conventional statistical classification tools can be leveraged to address challenges faced by both scientists and professional societies in knowledge management. For example, cluster analyses can be used to find bundles of "most-related" papers and to address the issue of journal bifurcation (when is a new journal necessary, and what topics should it encompass). Similarly, neural networks can be trained to predict the optimal journal (within a society's collection) in which a newly submitted paper should be published. Comparable techniques could enable very powerful end-user tools for journal searches, all premised on the view of a paper as a data point in a multidimensional descriptor space, e.g.: "find papers most similar to the one I am reading", "build a personalized subscription service, based on the content of the papers I am interested in, rather than preselected keywords", "find suitable reviewers, based on the content of their own published works", etc. Such services may represent the next "quantum leap" beyond the rudimentary search interfaces currently provided to end-users, as well as a compelling value-added component needed to bridge the print-to-digital-medium gap and help stabilize professional societies' revenue streams during the print-to-digital transition.
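
    A minimal sketch of the multivariate-descriptor idea: each paper is reduced to keyword occurrence counts over a controlled vocabulary, and standard tools (cosine similarity for "most similar paper" queries, k-means for clustering) then operate on these vectors. The tiny corpus and vocabulary below are hypothetical.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.metrics.pairwise import cosine_similarity
        from sklearn.cluster import KMeans

        papers = [
            "lightning flash rate convection storm electrification",
            "convection storm precipitation radar reflectivity",
            "ozone stratosphere chemistry transport",
            "stratosphere chemistry aerosol transport ozone",
        ]
        vocabulary = ["lightning", "convection", "storm", "radar", "ozone",
                      "stratosphere", "chemistry", "aerosol", "transport"]

        # Occurrence-count descriptor for each paper over a controlled vocabulary
        vec = CountVectorizer(vocabulary=vocabulary)
        X = vec.fit_transform(papers)

        print(cosine_similarity(X))                    # "find papers most similar to..."
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print(labels)                                  # bundles of most-related papers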

  12. Determining Angle of Humeral Torsion Using Image Software Technique.

    PubMed

    Patil, Sachin; Sethi, Madhu; Vasudeva, Neelam

    2016-10-01

    Several studies on the measurement of the angle of humeral torsion have been carried out in different parts of the world. Previously described methods were complicated, not very accurate, cumbersome, or required sophisticated instruments. The present study was conducted with the aim of determining the angle of humeral torsion with a newer, simple technique using digital images and image tool software. A total of 250 dry, normal, adult human humeri were obtained from the bone bank of the Department of Anatomy. The length and mid-shaft circumference of each bone were measured with a measuring tape. The angle of humeral torsion was measured directly from digital images analysed with the Image Tool 3.0 software program. The data were analysed statistically with SPSS version 17 using the unpaired t-test and Spearman's rank order correlation coefficient. The mean angle of torsion was 64.57°±7.56°. On the right side it was 66.84°±9.69°, whereas on the left side it was 63.31°±9.50°. The mean humeral length was 31.6 cm on the right side and 30.33 cm on the left side. Mid-shaft circumference was 5.79 cm on the right side and 5.63 cm on the left side. No statistical differences were seen in the angles between right and left humeri (p>0.001). From our study, it was concluded that the circumference of the shaft is inversely proportional to the angle of humeral torsion, whereas the length and side of the humerus have no relation with humeral torsion. With the advancement of digital technology, it is preferable to use such image software for anatomical studies.

  13. Prevalence, pattern, and factors associated with work-related musculoskeletal disorders among pluckers in a tea plantation in Tamil Nadu, India

    PubMed Central

    Vasanth, Deepthi; Ramesh, Naveen; Fathima, Farah Naaz; Fernandez, Ria; Jennifer, Steffi; Joseph, Bobby

    2015-01-01

    Context: Musculoskeletal pain is common among tea leaf pluckers and is attributed to the load they carry, long working hours, the terrain, and insufficient job rotations. As a result, their health and work capacity are affected. Aims: To assess the prevalence, patterns, and factors associated with work-related musculoskeletal disorders (WRMDs) among pluckers in a tea plantation in Annamalai, Tamil Nadu, India. Settings and Design: This cross-sectional study surveyed 195 pluckers aged between 18 and 60 years, selected by simple random sampling. Materials and Methods: The interview schedule had four parts: sociodemographic details, the Standard Nordic Scale, a numeric and facial pain rating tool, and a tool to assess factors associated with WRMDs. Statistical Analysis Used: Statistical Package for the Social Sciences (SPSS) version 16. Results: The prevalence of musculoskeletal pain in the last 12 months and the last 7 days was 83.6% and 78.5%, respectively. The most common site for the last 12 months was the shoulder (59%) and for the last 7 days was the lower back (52.8%). An independent t-test revealed that the mean age of workers with pain was 6.59 years higher, and their mean duration of employment 1.38 years longer, than those of workers without pain. Increasing morbidity among workers was also significantly associated with an increase in WRMDs on the Chi-square test. Conclusions: The prevalence of musculoskeletal pain was high among tea pluckers; the most common sites during the last 12 months and the last 7 days were the shoulder and the lower back, respectively, and the pain was mostly mild in character. Increases in age and duration of employment were associated with WRMDs. PMID:26957816

  14. A pre-operative planning for endoprosthetic human tracheal implantation: a decision support system based on robust design of experiments.

    PubMed

    Trabelsi, O; Villalobos, J L López; Ginel, A; Cortes, E Barrot; Doblaré, M

    2014-05-01

    Swallowing depends on physiological variables that have a decisive influence on the swallowing capacity and on the tracheal stress distribution. Prosthetic implantation modifies these values and the overall performance of the trachea. The objective of this work was to develop a decision support system, based on experimental, numerical and statistical approaches with clinical verification, to help the thoracic surgeon decide the position and appropriate dimensions of a Dumon prosthesis for a specific patient in an optimal time and with sufficient robustness. A code for mesh adaptation to any tracheal geometry was implemented and used to develop a robust experimental design based on Taguchi's method and the analysis of variance. This design was able to establish the main factors influencing swallowing. The equations to fit the stress and the vertical displacement distributions were obtained, and the resulting fitted values were compared to those calculated directly by the finite element method (FEM). Finally, a check and clinical validation of the statistical study were performed by studying two cases of real patients. The vertical displacements and principal stress distribution obtained for the specific tracheal model were in agreement with those calculated by FE simulations, with maximum absolute errors of 1.2 mm and 0.17 MPa, respectively. It was concluded that the resulting decision support tool provides a fast, accurate and simple means for the thoracic surgeon to predict the stress state of the trachea and the reduction in the ability to swallow after implantation, and will thus help in making decisions during pre-operative planning of tracheal interventions.

  15. Parallel computing for automated model calibration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burke, John S.; Danielson, Gary R.; Schulz, Douglas A.

    2002-07-29

    Natural resources model calibration is a significant burden on computing and staff resources in modeling efforts. Most assessments must consider multiple calibration objectives (for example, the magnitude and timing of the stream flow peak). An automated calibration process that allows real-time updating of data/models, allowing scientists to focus effort on improving models, is needed. We are in the process of building a fully featured multi-objective calibration tool capable of processing multiple models cheaply and efficiently using null cycle computing. Our parallel processing and calibration software routines have been written generically, but our focus has been on natural resources model calibration. So far, the natural resources models have been friendly to parallel calibration efforts in that they require no inter-process communication, need only a small amount of input data, and output only a small amount of statistical information for each calibration run. A typical auto-calibration run might involve running a model 10,000 times with a variety of input parameters and summary statistical output. In the past, model calibration has been done against individual models for each data set. The individual model runs are relatively fast, ranging from seconds to minutes, and the process was run on a single computer using a simple iterative process. We have completed two auto-calibration prototypes and are currently designing a more feature-rich tool. Our prototypes have focused on running the calibration in a distributed, cross-platform computing environment. They allow incorporation of "smart" calibration parameter generation (using artificial intelligence processing techniques). Null cycle computing, similar to SETI@Home, has also been a focus of our efforts. This paper details the design of the latest prototype and discusses our plans for the next revision of the software.
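
    A minimal sketch of the embarrassingly parallel pattern described here: each calibration run takes a parameter vector, runs the model, and returns a small summary statistic, so runs can be distributed with no inter-process communication. The toy model, objective, and parameter ranges below are hypothetical stand-ins for a real natural-resources model.

        import numpy as np
        from multiprocessing import Pool

        OBSERVED_PEAK = 42.0   # hypothetical observed stream-flow peak

        def run_model(params):
            """Toy stand-in for one natural-resources model run; returns a fit statistic."""
            a, b = params
            simulated_peak = a * 10.0 + b          # a real model run would be far costlier
            return params, abs(simulated_peak - OBSERVED_PEAK)

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            candidates = [(a, b) for a, b in rng.uniform(0, 5, size=(10_000, 2))]

            with Pool() as pool:                   # distribute runs across local cores
                results = pool.map(run_model, candidates)

            best_params, best_error = min(results, key=lambda r: r[1])
            print("best parameters:", best_params, "error:", best_error)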

  16. Testing for voter rigging in small polling stations

    PubMed Central

    Jimenez, Raúl; Hidalgo, Manuel; Klimek, Peter

    2017-01-01

    Nowadays, a large number of countries combine formal democratic institutions with authoritarian practices. Although in these countries the ruling elites may receive considerable voter support, they often use several manipulation tools to control election outcomes. A common practice of these regimes is the coercion and mobilization of large numbers of voters. This electoral irregularity is known as voter rigging, distinguishing it from vote rigging, which involves ballot stuffing or stealing. We develop a statistical test to quantify the extent to which the results of a particular election display traces of voter rigging. Our key hypothesis is that small polling stations are more susceptible to voter rigging because it is easier to identify opposing individuals, there are fewer eyewitnesses, and interested parties might reasonably expect fewer visits from election observers. We devise a general statistical method for testing whether voting behavior in small polling stations is significantly different from the behavior in their neighbor stations in a way that is consistent with the widespread occurrence of voter rigging. On the basis of a comparative analysis, the method enables third parties to conclude that an explanation other than simple variability is needed to explain geographic heterogeneities in vote preferences. We analyze 21 elections in 10 countries and find significant statistical anomalies compatible with voter rigging in Russia from 2007 to 2011, in Venezuela from 2006 to 2013, and in Uganda in 2011. Particularly disturbing is the case of Venezuela, where the smallest polling stations were decisive to the outcome of the 2013 presidential elections. PMID:28695193

  17. Testing for voter rigging in small polling stations.

    PubMed

    Jimenez, Raúl; Hidalgo, Manuel; Klimek, Peter

    2017-06-01

    Nowadays, a large number of countries combine formal democratic institutions with authoritarian practices. Although in these countries the ruling elites may receive considerable voter support, they often use several manipulation tools to control election outcomes. A common practice of these regimes is the coercion and mobilization of large numbers of voters. This electoral irregularity is known as voter rigging, distinguishing it from vote rigging, which involves ballot stuffing or stealing. We develop a statistical test to quantify the extent to which the results of a particular election display traces of voter rigging. Our key hypothesis is that small polling stations are more susceptible to voter rigging because it is easier to identify opposing individuals, there are fewer eyewitnesses, and interested parties might reasonably expect fewer visits from election observers. We devise a general statistical method for testing whether voting behavior in small polling stations is significantly different from the behavior in their neighbor stations in a way that is consistent with the widespread occurrence of voter rigging. On the basis of a comparative analysis, the method enables third parties to conclude that an explanation other than simple variability is needed to explain geographic heterogeneities in vote preferences. We analyze 21 elections in 10 countries and find significant statistical anomalies compatible with voter rigging in Russia from 2007 to 2011, in Venezuela from 2006 to 2013, and in Uganda in 2011. Particularly disturbing is the case of Venezuela, where the smallest polling stations were decisive to the outcome of the 2013 presidential elections.
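
    A minimal sketch of the comparison at the heart of such a test: the incumbent's vote share in small polling stations is compared with that in neighbouring stations, here with a simple permutation test on synthetic data. This illustrates the general idea only and is not the authors' exact statistic; all vote shares below are made up.

        import numpy as np

        rng = np.random.default_rng(7)

        # Hypothetical incumbent vote shares: small stations vs. their neighbour stations
        small_stations = rng.normal(0.62, 0.08, 300)       # suspiciously shifted upwards
        neighbour_stations = rng.normal(0.55, 0.08, 300)

        observed_diff = small_stations.mean() - neighbour_stations.mean()

        # Permutation test: is the observed difference explainable by simple variability?
        pooled = np.concatenate([small_stations, neighbour_stations])
        n_small = len(small_stations)
        perm_diffs = np.empty(10_000)
        for i in range(10_000):
            rng.shuffle(pooled)
            perm_diffs[i] = pooled[:n_small].mean() - pooled[n_small:].mean()

        p_value = np.mean(perm_diffs >= observed_diff)
        print(f"observed difference = {observed_diff:.3f}, permutation p = {p_value:.4f}")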

  18. Methodological and Reporting Quality of Systematic Reviews and Meta-analyses in Endodontics.

    PubMed

    Nagendrababu, Venkateshbabu; Pulikkotil, Shaju Jacob; Sultan, Omer Sheriff; Jayaraman, Jayakumar; Peters, Ove A

    2018-06-01

    The aim of this systematic review (SR) was to evaluate the quality of SRs and meta-analyses (MAs) in endodontics. A comprehensive literature search was conducted to identify relevant articles in the electronic databases from January 2000 to June 2017. Two reviewers independently assessed the articles for eligibility and data extraction. SRs and MAs on interventional studies with a minimum of 2 therapeutic strategies in endodontics were included in this SR. Methodologic and reporting quality were assessed using A Measurement Tool to Assess Systematic Reviews (AMSTAR) and Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA), respectively. The interobserver reliability was calculated using the Cohen kappa statistic. Statistical analysis with the level of significance at P < .05 was performed using Kruskal-Wallis tests and simple linear regression analysis. A total of 30 articles were selected for the current SR. Using AMSTAR, the item related to the scientific quality of studies used in conclusion was adhered by less than 40% of studies. Using PRISMA, 3 items were reported by less than 40% of studies, which were on objectives, protocol registration, and funding. No association was evident comparing the number of authors and country with quality. Statistical significance was observed when quality was compared among journals, with studies published as Cochrane reviews superior to those published in other journals. AMSTAR and PRISMA scores were significantly related. SRs in endodontics showed variability in both methodologic and reporting quality. Copyright © 2018 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
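
    Interobserver reliability of the kind reported here (Cohen's kappa between two reviewers) can be computed directly; the item-level judgements below are hypothetical.

        from sklearn.metrics import cohen_kappa_score

        # Hypothetical yes/no judgements by two reviewers on 15 checklist items
        reviewer_1 = ["yes", "yes", "no", "yes", "no", "yes", "yes", "no", "yes", "yes",
                      "no", "yes", "no", "no", "yes"]
        reviewer_2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "no",
                      "no", "yes", "no", "yes", "yes"]

        kappa = cohen_kappa_score(reviewer_1, reviewer_2)
        print(f"Cohen's kappa = {kappa:.2f}")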

  19. Automated detection of hospital outbreaks: A systematic review of methods.

    PubMed

    Leclère, Brice; Buckeridge, David L; Boëlle, Pierre-Yves; Astagneau, Pascal; Lepelletier, Didier

    2017-01-01

    Several automated algorithms for epidemiological surveillance in hospitals have been proposed. However, the usefulness of these methods to detect nosocomial outbreaks remains unclear. The goal of this review was to describe outbreak detection algorithms that have been tested within hospitals, consider how they were evaluated, and synthesize their results. We developed a search query using keywords associated with hospital outbreak detection and searched the MEDLINE database. To ensure the highest sensitivity, no limitations were initially imposed on publication languages and dates, although we subsequently excluded studies published before 2000. Every study that described a method to detect outbreaks within hospitals was included, without any exclusion based on study design. Additional studies were identified through citations in retrieved studies. Twenty-nine studies were included. The detection algorithms were grouped into 5 categories: simple thresholds (n = 6), statistical process control (n = 12), scan statistics (n = 6), traditional statistical models (n = 6), and data mining methods (n = 4). The evaluation of the algorithms was often solely descriptive (n = 15), but more complex epidemiological criteria were also investigated (n = 10). The performance measures varied widely between studies: e.g., the sensitivity of an algorithm in a real world setting could vary between 17 and 100%. Even if outbreak detection algorithms are useful complementary tools for traditional surveillance, the heterogeneity in results among published studies does not support quantitative synthesis of their performance. A standardized framework should be followed when evaluating outbreak detection methods to allow comparison of algorithms across studies and synthesis of results.
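
    Two of the simpler algorithm families in this review, a fixed threshold and a Shewhart-type statistical process control chart, can be sketched in a few lines; the weekly infection counts, baseline window, and alarm thresholds below are hypothetical.

        import numpy as np

        # Hypothetical weekly counts of a nosocomial infection on one ward
        counts = np.array([2, 1, 3, 2, 0, 2, 1, 3, 2, 2, 1, 9, 8, 3, 2])
        baseline = counts[:10]                      # historical weeks, assumed outbreak-free

        # 1) Simple threshold: alarm when a count exceeds a fixed value
        simple_alarms = np.where(counts > 5)[0]

        # 2) Shewhart-type control chart: alarm when a count exceeds mean + 3 sigma
        ucl = baseline.mean() + 3 * baseline.std(ddof=1)
        spc_alarms = np.where(counts > ucl)[0]

        print("simple-threshold alarms at weeks:", simple_alarms)
        print(f"upper control limit = {ucl:.1f}; SPC alarms at weeks:", spc_alarms)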

  20. A Data Warehouse Architecture for DoD Healthcare Performance Measurements.

    DTIC Science & Technology

    1999-09-01

    This thesis defines a methodology to design, develop, implement, and apply statistical analysis and data mining tools to a Data Warehouse of DoD healthcare metrics.

  1. A Web-Based Learning Tool Improves Student Performance in Statistics: A Randomized Masked Trial

    ERIC Educational Resources Information Center

    Gonzalez, Jose A.; Jover, Lluis; Cobo, Erik; Munoz, Pilar

    2010-01-01

    Background: e-status is a web-based tool able to generate different statistical exercises and to provide immediate feedback to students' answers. Although the use of Information and Communication Technologies (ICTs) is becoming widespread in undergraduate education, there are few experimental studies evaluating its effects on learning. Method: All…

  2. Learning Axes and Bridging Tools in a Technology-Based Design for Statistics

    ERIC Educational Resources Information Center

    Abrahamson, Dor; Wilensky, Uri

    2007-01-01

    We introduce a design-based research framework, "learning axes and bridging tools," and demonstrate its application in the preparation and study of an implementation of a middle-school experimental computer-based unit on probability and statistics, "ProbLab" (Probability Laboratory, Abrahamson and Wilensky 2002 [Abrahamson, D., & Wilensky, U.…

  3. Statistical Physics in the Era of Big Data

    ERIC Educational Resources Information Center

    Wang, Dashun

    2013-01-01

    With the wealth of data provided by a wide range of high-throughout measurement tools and technologies, statistical physics of complex systems is entering a new phase, impacting in a meaningful fashion a wide range of fields, from cell biology to computer science to economics. In this dissertation, by applying tools and techniques developed in…

  4. Entropy Is Simple, Qualitatively.

    ERIC Educational Resources Information Center

    Lambert, Frank L.

    2002-01-01

    Suggests that qualitatively, entropy is simple. Entropy increase from a macro viewpoint is a measure of the dispersal of energy from localized to spread out at a temperature T. Fundamentally based on statistical and quantum mechanics, this approach is superior to the non-fundamental "disorder" as a descriptor of entropy change. (MM)

  5. Several steps/day indicators predict changes in anthropometric outcomes: HUB city steps

    USDA-ARS?s Scientific Manuscript database

    Walking for exercise remains the most frequently reported leisure-time activity, likely because it is simple, inexpensive, and easily incorporated into most people’s lifestyle. Pedometers are simple, convenient, and economical tools that can be used to quantify step-determined physical activity. F...

  6. Predicting Fish Densities in Lotic Systems: a Simple Modeling Approach

    EPA Science Inventory

    Fish density models are essential tools for fish ecologists and fisheries managers. However, applying these models can be difficult because of high levels of model complexity and the large number of parameters that must be estimated. We designed a simple fish density model and te...

  7. A Progression of Static Equilibrium Laboratory Exercises

    ERIC Educational Resources Information Center

    Kutzner, Mickey; Kutzner, Andrew

    2013-01-01

    Although simple architectural structures like bridges, catwalks, cantilevers, and Stonehenge have been integral in human societies for millennia, as have levers and other simple tools, modern students of introductory physics continue to grapple with Newton's conditions for static equilibrium. As formulated in typical introductory physics…

  8. Simple tool for planting acorns

    Treesearch

    William R. Beaufait

    1957-01-01

    A handy, inexpensive tool for planting acorns has been developed at the Delta Research Center of the Southern Forest Experiment Station and used successfully in experimental plantings. One of its merits is that it ensures a planting hole of eactly the desired depth.

  9. A simple biota removal algorithm for 35 GHz cloud radar measurements

    NASA Astrophysics Data System (ADS)

    Kalapureddy, Madhu Chandra R.; Sukanya, Patra; Das, Subrata K.; Deshpande, Sachin M.; Pandithurai, Govindan; Pazamany, Andrew L.; Ambuj K., Jha; Chakravarty, Kaustav; Kalekar, Prasad; Krishna Devisetty, Hari; Annam, Sreenivas

    2018-03-01

    Cloud radar reflectivity profiles can be an important measurement for the investigation of cloud vertical structure (CVS). However, extracting the intended meteorological cloud content from the measurement often demands an effective technique or algorithm that can reduce error and observational uncertainties in the recorded data. In this work, a technique is proposed to identify and separate cloud and non-hydrometeor echoes using the radar Doppler spectral moment profile measurements. The point and volume target-based theoretical radar sensitivity curves are used for removing the receiver noise floor, and the identified radar echoes are scrutinized according to the signal decorrelation period. Here, it is hypothesized that cloud echoes are temporally more coherent and homogeneous and have a longer correlation period than biota; this can be checked statistically using the ~4 s sliding mean and standard deviation of the reflectivity profiles. This step helps to screen out clouds critically by filtering out the biota. The final important step is the retrieval of cloud height. The proposed algorithm identifies cloud height solely through the systematic characterization of Z variability, using knowledge of the local atmospheric vertical structure in addition to the theoretical, statistical and echo-tracing tools. Thus, the characterization of high-resolution cloud radar reflectivity profile measurements has been carried out with theoretical echo sensitivity curves and observed echo statistics for true cloud height tracking (TEST). TEST showed superior performance in screening out clouds and filtering out isolated insects. TEST constrained with polarimetric measurements was found to be more promising under high-density biota, whereas TEST combined with the linear depolarization ratio and spectral width performs well in filtering out biota within the highly turbulent shallow cumulus clouds in the convective boundary layer (CBL). The TEST technique is simple in realization but powerful in performance, owing to its flexibility in constraining, identifying and filtering out the biota and screening out the true cloud content, especially the CBL clouds. Therefore, the TEST algorithm is superior for screening out the low-level clouds that are strongly linked to the rain-making mechanism associated with the Indian summer monsoon region's CVS.
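
    The coherence check described above (a short sliding mean and standard deviation of reflectivity used to separate temporally coherent cloud echoes from rapidly decorrelating biota) can be sketched as follows; the profile array layout, window length, and threshold value are hypothetical.

        import numpy as np

        def flag_incoherent_echoes(z_dbz, window=4, std_threshold=5.0):
            """Flag range gates whose reflectivity varies too much over a sliding window.

            z_dbz: 2-D reflectivity array (time x range), NaN where there is no echo.
            Returns a boolean mask that is True where the echo is considered biota-like.
            """
            n_time, n_range = z_dbz.shape
            mask = np.zeros_like(z_dbz, dtype=bool)
            for t in range(n_time - window + 1):
                chunk = z_dbz[t:t + window]              # ~4-sample sliding block
                local_std = np.nanstd(chunk, axis=0)
                # Cloud echoes are assumed coherent (small std); biota decorrelate quickly.
                mask[t:t + window] |= local_std > std_threshold
            return mask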

  10. Making Sense of 'Big Data' in Provenance Studies

    NASA Astrophysics Data System (ADS)

    Vermeesch, P.

    2014-12-01

    Huge online databases can be 'mined' to reveal previously hidden trends and relationships in society. One could argue that sedimentary geology has entered a similar era of 'Big Data', as modern provenance studies routinely apply multiple proxies to dozens of samples. Just like the Internet, sedimentary geology now requires specialised statistical tools to interpret such large datasets. These can be organised on three levels of progressively higher order. (1) A single sample: the most effective way to reveal the provenance information contained in a representative sample of detrital zircon U-Pb ages is through probability density estimators such as histograms and kernel density estimates. The widely popular 'probability density plots' implemented in IsoPlot and AgeDisplay compound analytical uncertainty with geological scatter and are therefore invalid. (2) Several samples: multi-panel diagrams comprising many detrital age distributions or compositional pie charts quickly become unwieldy and uninterpretable. For example, if there are N samples in a study, then the number of pairwise comparisons between samples increases quadratically as N(N-1)/2. This is simply too much information for the human eye to process. To solve this problem, it is necessary to (a) express the 'distance' between two samples as a simple scalar and (b) combine all N(N-1)/2 such values in a single two-dimensional 'map', grouping similar and pulling apart dissimilar samples. This can be easily achieved using simple statistics-based dissimilarity measures and a standard statistical method called Multidimensional Scaling (MDS). (3) Several methods: suppose that we use four provenance proxies: bulk petrography, chemistry, heavy minerals and detrital geochronology. This will result in four MDS maps, each of which will likely show slightly different trends and patterns. To deal with such cases, it may be useful to use a related technique called 'three-way multidimensional scaling'. This results in two graphical outputs: an MDS map, and a map of 'weights' showing to what extent the different provenance proxies influence the horizontal and vertical axes of the MDS map. Thus, detrital data can inform the user not only about the provenance of sediments, but also about the causal relationships between the mineralogy, geochronology and chemistry.
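
    A minimal sketch of the second level ("several samples"): a pairwise dissimilarity between detrital age distributions, here taken to be the two-sample Kolmogorov-Smirnov statistic, is assembled into a matrix and passed to multidimensional scaling. The synthetic age spectra and the choice of dissimilarity are assumptions for illustration.

        import numpy as np
        from scipy.stats import ks_2samp
        from sklearn.manifold import MDS

        # Hypothetical detrital zircon U-Pb age spectra (Ma) for five samples
        rng = np.random.default_rng(3)
        samples = [rng.normal(loc, 50, 120) for loc in (500, 520, 510, 1100, 1080)]

        # Pairwise dissimilarity: two-sample KS statistic between age distributions
        n = len(samples)
        dissimilarity = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                dissimilarity[i, j] = ks_2samp(samples[i], samples[j]).statistic

        # Two-dimensional MDS "map": similar samples plot close together
        mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
        coords = mds.fit_transform(dissimilarity)
        print(coords)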

  11. Students' perspectives on research and assessment of a model template designed to guide beginners in research in a medical school in Cameroon.

    PubMed

    Tambe, Joshua; Minkande, Jacqueline Ze; Moifo, Boniface; Mbu, Robinson; Ongolo-Zogo, Pierre; Gonsu, Joseph

    2014-12-21

    Research activities for medical students and residents (trainees) are expected to serve as a foundation for the acquisition of basic research skills. Some medical schools therefore recommend research work as a partial requirement for certification. However, medical trainees face many difficulties with research, for which reason potential remedial strategies need to be constantly developed and tested. The views of medical trainees were assessed, followed by their use and appraisal of a novel "self-help" tool designed for the purposes of this study, with potential for improvement and wider application. This study was a cross-sectional survey of volunteering final-year medical students and residents of a medical school in Cameroon. The study surveyed the opinions of a total of 120 volunteers, of whom 82 (68%) were medical students. Three out of 82 (4%) medical students reported that they had participated in research activities with a publication, versus 10 out of 38 residents (26%). The reported difficulties in research for these trainees included referencing of material (84%), writing a research proposal (79%), searching for literature (73%) and knowledge of applicable statistical tests (72%), amongst others. All participants declared that the "self-help" tool was simple to use and guided them to think through and better understand their research focus. Medical trainees require much assistance with research, and some "self-help" tools, such as the template used in this study, might be a useful adjunct to didactic lectures.

  12. Mr-Moose: An advanced SED-fitting tool for heterogeneous multi-wavelength datasets

    NASA Astrophysics Data System (ADS)

    Drouart, G.; Falkendal, T.

    2018-04-01

    We present the public release of Mr-Moose, a fitting procedure that is able to perform multi-wavelength and multi-object spectral energy distribution (SED) fitting in a Bayesian framework. This procedure is able to handle a large variety of cases, from an isolated source to blended multi-component sources from a heterogeneous dataset (i.e. a range of observation sensitivities and spectral/spatial resolutions). Furthermore, Mr-Moose handles upper limits during the fitting process in a continuous way, allowing models to become gradually less probable as upper limits are approached. The aim is to propose a simple-to-use, yet highly versatile fitting tool for handling increasing source complexity when combining multi-wavelength datasets with fully customisable filter/model databases. The complete control of the user is one advantage, which avoids the traditional problems related to the "black box" effect, where parameter or model tunings are impossible and can lead to overfitting and/or over-interpretation of the results. Also, while a basic knowledge of Python and statistics is required, the code aims to be sufficiently user-friendly for non-experts. We demonstrate the procedure on three cases: two artificially generated datasets and a previous result from the literature. In particular, the most complex case (inspired by a real source, combining Herschel, ALMA and VLA data) in the context of extragalactic SED fitting makes Mr-Moose a particularly attractive SED fitting tool when dealing with partially blended sources, without the need for data deconvolution.

  13. A study of the mechanical vibrations of a table-top extreme ultraviolet interference nanolithography tool.

    PubMed

    Prezioso, S; De Marco, P; Zuppella, P; Santucci, S; Ottaviano, L

    2010-04-01

    A prototype low-cost table-top extreme ultraviolet (EUV) laser source (1.5 ns pulse duration, lambda=46.9 nm) was successfully employed as a laboratory-scale interference nanolithography (INL) tool. Interference patterns were obtained with a simple Lloyd's mirror setup. Periodic structures on Polymethylmethacrylate/Si substrates were produced over large areas (8 mm(2)) with resolutions from 400 to 22.5 nm half pitch (the smallest resolution achieved so far with table-top EUV laser sources). The mechanical vibrations affecting both the laser source and the Lloyd's setup were studied to determine if and how they affect the lateral resolution of the lithographic system. The vibration dynamics was described by a statistical model based on the assumption that the instantaneous position of the vibrating mechanical parts follows a normal distribution. An algorithm was developed to simulate the process of sample irradiation under different vibrations. The comparison between simulations and experiments allowed the characteristic amplitude of vibrations to be estimated; it was deduced to be lower than 50 nm. The same algorithm was used to reproduce the expected pattern profiles at the lambda/4 half-pitch physical resolution limit. In that limit, a nonzero pattern modulation amplitude was obtained from the simulations, comparable to the peak-to-valley height (2-3 nm) measured for the 45 nm spaced fringes, indicating that the mechanical vibrations affecting the INL tool do not represent a limit in scaling down the resolution.

  14. MR-MOOSE: an advanced SED-fitting tool for heterogeneous multi-wavelength data sets

    NASA Astrophysics Data System (ADS)

    Drouart, G.; Falkendal, T.

    2018-07-01

    We present the public release of MR-MOOSE, a fitting procedure that is able to perform multi-wavelength and multi-object spectral energy distribution (SED) fitting in a Bayesian framework. This procedure is able to handle a large variety of cases, from an isolated source to blended multi-component sources from a heterogeneous data set (i.e. a range of observation sensitivities and spectral/spatial resolutions). Furthermore, MR-MOOSE handles upper limits during the fitting process in a continuous way allowing models to be gradually less probable as upper limits are approached. The aim is to propose a simple-to-use, yet highly versatile fitting tool for handling increasing source complexity when combining multi-wavelength data sets with fully customisable filter/model data bases. The complete control of the user is one advantage, which avoids the traditional problems related to the `black box' effect, where parameter or model tunings are impossible and can lead to overfitting and/or over-interpretation of the results. Also, while a basic knowledge of PYTHON and statistics is required, the code aims to be sufficiently user-friendly for non-experts. We demonstrate the procedure on three cases: two artificially generated data sets and a previous result from the literature. In particular, the most complex case (inspired by a real source, combining Herschel, ALMA, and VLA data) in the context of extragalactic SED fitting makes MR-MOOSE a particularly attractive SED fitting tool when dealing with partially blended sources, without the need for data deconvolution.

  15. GO Explorer: A gene-ontology tool to aid in the interpretation of shotgun proteomics data.

    PubMed

    Carvalho, Paulo C; Fischer, Juliana Sg; Chen, Emily I; Domont, Gilberto B; Carvalho, Maria Gc; Degrave, Wim M; Yates, John R; Barbosa, Valmir C

    2009-02-24

    Spectral counting is a shotgun proteomics approach comprising the identification and relative quantitation of thousands of proteins in complex mixtures. However, this strategy generates bewildering amounts of data whose biological interpretation is a challenge. Here we present a new algorithm, termed GO Explorer (GOEx), that leverages the gene ontology (GO) to aid in the interpretation of proteomic data. GOEx stands out because it combines data from protein fold changes with GO over-representation statistics to help draw conclusions. Moreover, it is tightly integrated within the PatternLab for Proteomics project and thus lies within a complete computational environment that provides parsers and pattern recognition tools designed for spectral counting. GOEx offers three independent methods to query data: an interactive directed acyclic graph, a specialist mode where key words can be searched, and an automatic search. Its usefulness is demonstrated by applying it to help interpret the effects of perillyl alcohol, a natural chemotherapeutic agent, on glioblastoma multiforme cell lines (A172). We used a new multi-surfactant shotgun proteomic strategy and identified more than 2600 proteins; GOEx pinpointed key sets of differentially expressed proteins related to the cell cycle, alcohol catabolism, the Ras pathway, apoptosis, and stress response, to name a few. GOEx facilitates organism-specific studies by leveraging GO and providing a rich graphical user interface. It is a simple-to-use tool, specialized for biologists who wish to analyze spectral counting data from shotgun proteomics. GOEx is available at http://pcarvalho.com/patternlab.
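
    The GO over-representation statistic that such a tool combines with fold changes is, in its simplest form, a hypergeometric test; a minimal sketch with hypothetical counts is shown below (the counts and the single-term test are illustrative only, not GOEx's exact procedure).

        from scipy.stats import hypergeom

        # Hypothetical counts for one GO term
        N = 2600   # proteins identified in the experiment (the "universe")
        K = 120    # universe proteins annotated with this GO term
        n = 200    # differentially expressed proteins
        k = 25     # differentially expressed proteins annotated with this GO term

        # P(observing k or more annotated proteins by chance) = over-representation p-value
        p_value = hypergeom.sf(k - 1, N, K, n)
        print(f"over-representation p = {p_value:.3e}")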

  16. Making Temporal Logic Calculational: A Tool for Unification and Discovery

    NASA Astrophysics Data System (ADS)

    Boute, Raymond

    In temporal logic, calculational proofs beyond simple cases are often seen as challenging. The situation is reversed by making temporal logic calculational, yielding shorter and clearer proofs than traditional ones, and serving as a (mental) tool for unification and discovery. A side-effect of unifying theories is easier access by practitioners. The starting point is a simple generic (software tool independent) Functional Temporal Calculus (FTC). Specific temporal logics are then captured via endosemantic functions. This concept reflects tacit conventions throughout mathematics and, once identified, is general and useful. FTC also yields a reasoning style that helps in discovering theorems by calculation rather than just proving given facts. This is illustrated by deriving various theorems, most related to liveness issues in TLA+, and finding strengthenings of known results. Educational issues are addressed in passing.

  17. A Simple Framework for Evaluating Authorial Contributions for Scientific Publications.

    PubMed

    Warrender, Jeffrey M

    2016-10-01

    A simple tool is provided to assist researchers in assessing contributions to a scientific publication, for ease in evaluating which contributors qualify for authorship, and in what order the authors should be listed. The tool identifies four phases of activity leading to a publication: Conception and Design, Data Acquisition, Analysis and Interpretation, and Manuscript Preparation. By comparing a project participant's contribution in a given phase to several specified thresholds, a score of up to five points can be assigned; the contributor's scores in all four phases are summed to yield a total "contribution score", which is compared to a threshold to determine which contributors merit authorship. This tool may be useful in a variety of contexts in which a systematic approach to authorial credit is desired.

  18. Seed: a user-friendly tool for exploring and visualizing microbial community data.

    PubMed

    Beck, Daniel; Dennis, Christopher; Foster, James A

    2015-02-15

    In this article we present Simple Exploration of Ecological Data (Seed), a data exploration tool for microbial communities. Seed is written in R using the Shiny library. This provides access to powerful R-based functions and libraries through a simple user interface. Seed allows users to explore ecological datasets using principal coordinate analyses, scatter plots, bar plots, hierarchical clustering and heatmaps. Seed is open source and available at https://github.com/danlbek/Seed. danlbek@gmail.com Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  19. PANDA-view: An easy-to-use tool for statistical analysis and visualization of quantitative proteomics data.

    PubMed

    Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping

    2018-05-22

    Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.
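
    As an illustration of the statistics behind an interactive volcano plot, the sketch below computes per-protein log2 fold changes and -log10 p-values from two replicate intensity matrices with Welch's t-test. It is a generic example, not PANDA-view's code; the array shapes and names are assumptions.

      import numpy as np
      from scipy import stats

      def volcano_table(group_a, group_b):
          """Log2 fold change and -log10 p-value per protein (rows = proteins, columns = replicates)."""
          log2fc = np.log2(group_b.mean(axis=1) / group_a.mean(axis=1))
          _, p = stats.ttest_ind(group_a, group_b, axis=1, equal_var=False)  # Welch's t-test per row
          return log2fc, -np.log10(p)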

  20. SimHap GUI: An intuitive graphical user interface for genetic association analysis

    PubMed Central

    Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

    2008-01-01

    Background Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools, such as the SimHap package for the R statistics language, provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that would allow anyone other than a professional statistician to utilise the tool effectively. Results We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. Conclusion SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis. PMID:19109877

  1. Rapid development of image analysis research tools: Bridging the gap between researcher and clinician with pyOsiriX.

    PubMed

    Blackledge, Matthew D; Collins, David J; Koh, Dow-Mu; Leach, Martin O

    2016-02-01

    We present pyOsiriX, a plugin built for the already popular DICOM viewer OsiriX that provides users the ability to extend the functionality of OsiriX through simple Python scripts. This approach allows users to integrate the many cutting-edge scientific/image-processing libraries created for Python into a powerful DICOM visualisation package that is intuitive to use and already familiar to many clinical researchers. Using pyOsiriX we hope to bridge the apparent gap between basic imaging scientists and clinical practice in a research setting and thus accelerate the development of advanced clinical image processing. We provide arguments for the use of Python as a robust scripting language for incorporation into larger software solutions, outline the structure of pyOsiriX and how it may be used to extend the functionality of OsiriX, and we provide three case studies that exemplify its utility. For our first case study we use pyOsiriX to provide a tool for smooth histogram display of voxel values within a user-defined region of interest (ROI) in OsiriX. We used a kernel density estimation (KDE) method available in Python using the scikit-learn library, where the total number of lines of Python code required to generate this tool was 22. Our second example presents a scheme for segmentation of the skeleton from CT datasets. We have demonstrated that good segmentation can be achieved for two example CT studies by using a combination of Python libraries including scikit-learn, scikit-image, SimpleITK and matplotlib. Furthermore, this segmentation method was incorporated into an automatic analysis of quantitative PET-CT in a patient with bone metastases from primary prostate cancer. This enabled repeatable statistical evaluation of PET uptake values for each lesion, before and after treatment, providing estimates of maximum and median standardised uptake values (SUVmax and SUVmed, respectively). Following treatment we observed a reduction in lesion volume, SUVmax and SUVmed for all lesions, in agreement with a reduction in concurrent measures of serum prostate-specific antigen (PSA). Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
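
    The ROI-histogram case study above rests on a Gaussian kernel density estimate from scikit-learn. A minimal stand-alone sketch of that estimate is given below; the bandwidth, grid size, and function name are illustrative, and the pyOsiriX plumbing is omitted.

      import numpy as np
      from sklearn.neighbors import KernelDensity

      def smooth_histogram(voxel_values, bandwidth=5.0, n_points=256):
          """Smooth histogram of ROI voxel values via Gaussian KDE."""
          x = np.asarray(voxel_values, dtype=float).reshape(-1, 1)
          kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(x)
          grid = np.linspace(x.min(), x.max(), n_points).reshape(-1, 1)
          density = np.exp(kde.score_samples(grid))  # score_samples returns the log-density
          return grid.ravel(), density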

  2. Pedagogical Utilization and Assessment of the Statistic Online Computational Resource in Introductory Probability and Statistics Courses.

    PubMed

    Dinov, Ivo D; Sanchez, Juana; Christou, Nicolas

    2008-01-01

    Technology-based instruction represents a recent pedagogical paradigm that is rooted in the realization that new generations are much more comfortable with, and excited about, new technologies. The rapid technological advancement over the past decade has fueled an enormous demand for the integration of modern networking, informational and computational tools with classical pedagogical instruments. Consequently, teaching with technology typically involves utilizing a variety of IT and multimedia resources for online learning, course management, electronic course materials, and novel tools of communication, engagement, experimental, critical thinking and assessment. The NSF-funded Statistics Online Computational Resource (SOCR) provides a number of interactive tools for enhancing instruction in various undergraduate and graduate courses in probability and statistics. These resources include online instructional materials, statistical calculators, interactive graphical user interfaces, computational and simulation applets, tools for data analysis and visualization. The tools provided as part of SOCR include conceptual simulations and statistical computing interfaces, which are designed to bridge between the introductory and the more advanced computational and applied probability and statistics courses. In this manuscript, we describe our designs for utilizing SOCR technology in instruction in a recent study. In addition, we present the results of the effectiveness of using SOCR tools at two different course intensity levels on three outcome measures: exam scores, student satisfaction and choice of technology to complete assignments. Learning styles assessment was completed at baseline. We have used three very different designs for three different undergraduate classes. Each course included a treatment group, using the SOCR resources, and a control group, using classical instruction techniques. Our findings include marginal effects of the SOCR treatment for individual classes; however, pooling the results across all courses and sections, SOCR effects on the treatment groups were exceptionally robust and significant. Coupling these findings with a clear decrease in the variance of the quantitative examination measures in the treatment groups indicates that employing technology, like SOCR, in a sound pedagogical and scientific manner enhances the students' overall understanding and suggests better long-term knowledge retention.

  3. Pedagogical Utilization and Assessment of the Statistic Online Computational Resource in Introductory Probability and Statistics Courses

    PubMed Central

    Dinov, Ivo D.; Sanchez, Juana; Christou, Nicolas

    2009-01-01

    Technology-based instruction represents a recent pedagogical paradigm that is rooted in the realization that new generations are much more comfortable with, and excited about, new technologies. The rapid technological advancement over the past decade has fueled an enormous demand for the integration of modern networking, informational and computational tools with classical pedagogical instruments. Consequently, teaching with technology typically involves utilizing a variety of IT and multimedia resources for online learning, course management, electronic course materials, and novel tools of communication, engagement, experimental, critical thinking and assessment. The NSF-funded Statistics Online Computational Resource (SOCR) provides a number of interactive tools for enhancing instruction in various undergraduate and graduate courses in probability and statistics. These resources include online instructional materials, statistical calculators, interactive graphical user interfaces, computational and simulation applets, tools for data analysis and visualization. The tools provided as part of SOCR include conceptual simulations and statistical computing interfaces, which are designed to bridge between the introductory and the more advanced computational and applied probability and statistics courses. In this manuscript, we describe our designs for utilizing SOCR technology in instruction in a recent study. In addition, we present the results of the effectiveness of using SOCR tools at two different course intensity levels on three outcome measures: exam scores, student satisfaction and choice of technology to complete assignments. Learning styles assessment was completed at baseline. We have used three very different designs for three different undergraduate classes. Each course included a treatment group, using the SOCR resources, and a control group, using classical instruction techniques. Our findings include marginal effects of the SOCR treatment for individual classes; however, pooling the results across all courses and sections, SOCR effects on the treatment groups were exceptionally robust and significant. Coupling these findings with a clear decrease in the variance of the quantitative examination measures in the treatment groups indicates that employing technology, like SOCR, in a sound pedagogical and scientific manner enhances the students’ overall understanding and suggests better long-term knowledge retention. PMID:19750185

  4. Creation of a simple natural language processing tool to support an imaging utilization quality dashboard.

    PubMed

    Swartz, Jordan; Koziatek, Christian; Theobald, Jason; Smith, Silas; Iturrate, Eduardo

    2017-05-01

    Testing for venous thromboembolism (VTE) is associated with cost and risk to patients (e.g. radiation). To assess the appropriateness of imaging utilization at the provider level, it is important to know that provider's diagnostic yield (percentage of tests positive for the diagnostic entity of interest). However, determining diagnostic yield typically requires either time-consuming, manual review of radiology reports or the use of complex and/or proprietary natural language processing software. The objectives of this study were twofold: 1) to develop and implement a simple, user-configurable, and open-source natural language processing tool to classify radiology reports with high accuracy and 2) to use the results of the tool to design a provider-specific VTE imaging dashboard, consisting of both utilization rate and diagnostic yield. Two physicians reviewed a training set of 400 lower extremity ultrasound (UTZ) and computed tomography pulmonary angiogram (CTPA) reports to understand the language used in VTE-positive and VTE-negative reports. The insights from this review informed the arguments to the five modifiable parameters of the NLP tool. A validation set of 2,000 studies was then independently classified by the reviewers and by the tool; the classifications were compared and the performance of the tool was calculated. The tool was highly accurate in classifying the presence and absence of VTE for both the UTZ (sensitivity 95.7%; 95% CI 91.5-99.8, specificity 100%; 95% CI 100-100) and CTPA reports (sensitivity 97.1%; 95% CI 94.3-99.9, specificity 98.6%; 95% CI 97.8-99.4). The diagnostic yield was then calculated at the individual provider level and the imaging dashboard was created. We have created a novel NLP tool designed for users without a background in computer programming, which has been used to classify venous thromboembolism reports with a high degree of accuracy. The tool is open-source and available for download at http://iturrate.com/simpleNLP. Results obtained using this tool can be applied to enhance quality by presenting information about utilization and yield to providers via an imaging dashboard. Copyright © 2017 Elsevier B.V. All rights reserved.
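
    A toy sketch of the rule-based idea behind such a tool is shown below: a handful of user-configurable positive and negation patterns classify a report, and sensitivity/specificity are computed against a manual reference. The patterns and function names are made up for illustration; they are not the published tool's parameters.

      import re

      POSITIVE_PATTERNS = [r"acute .*thromb", r"pulmonary embol", r"deep vein thromb"]
      NEGATION_PATTERNS = [r"no evidence of", r"negative for", r"no acute"]

      def classify_report(text):
          """Return True if the report looks VTE-positive under the toy rules above."""
          text = text.lower()
          positive = any(re.search(p, text) for p in POSITIVE_PATTERNS)
          negated = any(re.search(p, text) for p in NEGATION_PATTERNS)
          return positive and not negated

      def sensitivity_specificity(predicted, reference):
          """Compare tool labels with manual review labels (both sequences of bools)."""
          tp = sum(p and r for p, r in zip(predicted, reference))
          tn = sum(not p and not r for p, r in zip(predicted, reference))
          fp = sum(p and not r for p, r in zip(predicted, reference))
          fn = sum(not p and r for p, r in zip(predicted, reference))
          return tp / (tp + fn), tn / (tn + fp)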

  5. Contrast Analysis: A Tutorial

    ERIC Educational Resources Information Center

    Haans, Antal

    2018-01-01

    Contrast analysis is a relatively simple but effective statistical method for testing theoretical predictions about differences between group means against the empirical data. Despite its advantages, contrast analysis is hardly used to date, perhaps because it is not implemented in a convenient manner in many statistical software packages. This…

  6. Superordinate Shape Classification Using Natural Shape Statistics

    ERIC Educational Resources Information Center

    Wilder, John; Feldman, Jacob; Singh, Manish

    2011-01-01

    This paper investigates the classification of shapes into broad natural categories such as "animal" or "leaf". We asked whether such coarse classifications can be achieved by a simple statistical classification of the shape skeleton. We surveyed databases of natural shapes, extracting shape skeletons and tabulating their…

  7. Modelling unsupervised online-learning of artificial grammars: linking implicit and statistical learning.

    PubMed

    Rohrmeier, Martin A; Cross, Ian

    2014-07-01

    Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. A statistical method for measuring activation of gene regulatory networks.

    PubMed

    Esteves, Gustavo H; Reis, Luiz F L

    2018-06-13

    Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes and enabling studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measurement of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed using a public database, available through NCBI-GEO and presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
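
    The permutation evaluation of the test statistic can be sketched generically as below: the observed network statistic is compared with the same statistic computed on randomly drawn gene sets of equal size. This is an assumption-laden illustration of the idea, not the maigesPack implementation.

      import numpy as np

      def network_activation_pvalue(expr, in_network, statistic, n_perm=10000, seed=0):
          """Empirical p-value for a network-activation statistic by gene permutation.

          expr: genes x samples matrix; in_network: boolean mask of network genes;
          statistic: any function of the selected sub-matrix (e.g. mean expression).
          """
          rng = np.random.default_rng(seed)
          observed = statistic(expr[in_network])
          k = int(in_network.sum())
          null = np.array([statistic(expr[rng.choice(expr.shape[0], size=k, replace=False)])
                           for _ in range(n_perm)])
          # One-sided empirical p-value with the +1 correction.
          return (1 + np.sum(null >= observed)) / (n_perm + 1)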

  9. Contingency and statistical laws in replicate microbial closed ecosystems.

    PubMed

    Hekstra, Doeke R; Leibler, Stanislas

    2012-05-25

    Contingency, the persistent influence of past random events, pervades biology. To what extent, then, is each course of ecological or evolutionary dynamics unique, and to what extent are these dynamics subject to a common statistical structure? Addressing this question requires replicate measurements to search for emergent statistical laws. We establish a readily replicated microbial closed ecosystem (CES), sustaining its three species for years. We precisely measure the local population density of each species in many CES replicates, started from the same initial conditions and kept under constant light and temperature. The covariation among replicates of the three species densities acquires a stable structure, which could be decomposed into discrete eigenvectors, or "ecomodes." The largest ecomode dominates population density fluctuations around the replicate-average dynamics. These fluctuations follow simple power laws consistent with a geometric random walk. Thus, variability in ecological dynamics can be studied with CES replicates and described by simple statistical laws. Copyright © 2012 Elsevier Inc. All rights reserved.
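
    The decomposition of replicate covariation into "ecomodes" is, in spirit, an eigen-decomposition of the covariance of species densities across replicates. The sketch below shows that generic computation; variable names and shapes are assumptions, not the authors' code.

      import numpy as np

      def ecomodes(densities):
          """Eigenvalues and eigenvectors of species-density covariation.

          densities: array of shape (n_replicates, n_species) at one time point.
          Returns eigenvalues (largest mode first) and the corresponding eigenvectors.
          """
          cov = np.cov(densities, rowvar=False)      # species x species covariance
          eigvals, eigvecs = np.linalg.eigh(cov)     # ascending order
          order = np.argsort(eigvals)[::-1]
          return eigvals[order], eigvecs[:, order]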

  10. A Simple Mechanical Model for the Isotropic Harmonic Oscillator

    ERIC Educational Resources Information Center

    Nita, Gelu M.

    2010-01-01

    A constrained elastic pendulum is proposed as a simple mechanical model for the isotropic harmonic oscillator. The conceptual and mathematical simplicity of this model recommends it as an effective pedagogical tool in teaching basic physics concepts at advanced high school and introductory undergraduate course levels. (Contains 2 figures.)

  11. The Simple Theory of Public Library Services.

    ERIC Educational Resources Information Center

    Newhouse, Joseph P.

    A simple normative theory applicable to public library services was developed as a tool to aid libraries in answering the question: which books should be bought by the library? Although developed for normative purposes, the theory generates testable predictions. It is relevant to measuring benefits from services which are provided publicly because…

  12. Safety in the Chemical Laboratory: Laboratory Air Quality: Part I. A Concentration Model.

    ERIC Educational Resources Information Center

    Butcher, Samuel S.; And Others

    1985-01-01

    Offers a simple model for estimating vapor concentrations in instructional laboratories. Three methods are described for measuring ventilation rates, and the results of measurements in six laboratories are presented. The model should provide a simple screening tool for evaluating worst-case personal exposures. (JN)

  13. Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

    USDA-ARS?s Scientific Manuscript database

    Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...

  14. Kangaroo – A pattern-matching program for biological sequences

    PubMed Central

    2002-01-01

    Background Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718

  15. Rating a Teacher Observation Tool: Five Ways to Ensure Classroom Observations are Focused and Rigorous

    ERIC Educational Resources Information Center

    New Teacher Project, 2011

    2011-01-01

    This "Rating a Teacher Observation Tool" identifies five simple questions and provides an easy-to-use scorecard to help policymakers decide whether an observation framework is likely to produce fair and accurate results. The five questions are: (1) Do the criteria and tools cover the classroom performance areas most connected to student outcomes?…

  16. Wire harness twisting aid

    NASA Technical Reports Server (NTRS)

    Casey, E. J.; Commadore, C. C.; Ingles, M. E.

    1980-01-01

    Long wire bundles twist into uniform spiral harnesses with help of simple apparatus. Wires pass through spacers and through hand-held tool with hole for each wire. Ends are attached to low speed bench motor. As motor turns, operator moves hand tool away forming smooth twists in wires between motor and tool. Technique produces harnesses that generate less radio-frequency interference than do irregularly twisted cables.

  17. Validation of the Canada Acute Coronary Syndrome Risk Score for Hospital Mortality in the Gulf Registry of Acute Coronary Events-2.

    PubMed

    AlFaleh, Hussam F; Alsheikh-Ali, Alawi A; Ullah, Anhar; AlHabib, Khalid F; Hersi, Ahmad; Suwaidi, Jassim Al; Sulaiman, Kadhim; Saif, Shukri Al; Almahmeed, Wael; Asaad, Nidal; Amin, Haitham; Al-Motarreb, Ahmed; Kashour, Tarek

    2015-09-01

    Several risk scores have been developed for acute coronary syndrome (ACS) patients, but their use is limited by their complexity. The new Canada Acute Coronary Syndrome (C-ACS) risk score is a simple risk-assessment tool for ACS patients. This study assessed the performance of the C-ACS risk score in predicting hospital mortality in a contemporary Middle Eastern ACS cohort, testing the hypothesis that the C-ACS score accurately predicts hospital mortality in ACS patients. The baseline risk of 7929 patients from 6 Arab countries who were enrolled in the Gulf RACE-2 registry was assessed using the C-ACS risk score. The score ranged from 0 to 4, with 1 point assigned for the presence of each of the following variables: age ≥75 years, Killip class >1, systolic blood pressure <100 mm Hg, and heart rate >100 bpm. The discriminative ability and calibration of the score were assessed using C statistics and goodness-of-fit tests, respectively. The C-ACS score demonstrated good predictive values for hospital mortality in all ACS patients with a C statistic of 0.77 (95% confidence interval [CI]: 0.74-0.80) and in ST-segment elevation myocardial infarction and non-ST-segment elevation acute coronary syndrome patients (C statistic: 0.76, 95% CI: 0.73-0.79; and C statistic: 0.80, 95% CI: 0.75-0.84, respectively). The discriminative ability of the score was moderate regardless of age category, nationality, and diabetic status. Overall, calibration was optimal in all subgroups. The new C-ACS score performed well in predicting hospital mortality in a contemporary ACS population outside North America. © 2015 Wiley Periodicals, Inc.
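
    Because the abstract gives the full scoring rule, the C-ACS score itself can be written down directly, as in the sketch below (one point each for age >=75 years, Killip class >1, systolic BP <100 mm Hg, and heart rate >100 bpm). This restates the published definition for illustration only and is not a substitute for the original validation.

      def c_acs_score(age, killip_class, systolic_bp, heart_rate):
          """Canada Acute Coronary Syndrome (C-ACS) risk score, range 0-4."""
          return (int(age >= 75) + int(killip_class > 1)
                  + int(systolic_bp < 100) + int(heart_rate > 100))

      # Example: a 78-year-old in Killip class 2 with SBP 95 mm Hg and HR 88 bpm scores 3.
      assert c_acs_score(78, 2, 95, 88) == 3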

  18. Equivalence between Step Selection Functions and Biased Correlated Random Walks for Statistical Inference on Animal Movement.

    PubMed

    Duchesne, Thierry; Fortin, Daniel; Rivest, Louis-Paul

    2015-01-01

    Animal movement has a fundamental impact on population and community structure and dynamics. Biased correlated random walks (BCRW) and step selection functions (SSF) are commonly used to study movements. Because no studies have contrasted the parameters and the statistical properties of their estimators for models constructed under these two Lagrangian approaches, it remains unclear whether or not they allow for similar inference. First, we used the Weak Law of Large Numbers to demonstrate that the log-likelihood function for estimating the parameters of BCRW models can be approximated by the log-likelihood of SSFs. Second, we illustrated the link between the two approaches by fitting BCRW with maximum likelihood and with SSF to simulated movement data in virtual environments and to the trajectory of bison (Bison bison L.) trails in natural landscapes. Using simulated and empirical data, we found that the parameters of a BCRW estimated directly from maximum likelihood and by fitting an SSF were remarkably similar. Movement analysis is increasingly used as a tool for understanding the influence of landscape properties on animal distribution. In the rapidly developing field of movement ecology, management and conservation biologists must decide which method they should implement to accurately assess the determinants of animal movement. We showed that BCRW and SSF can provide similar insights into the environmental features influencing animal movements. Both techniques have advantages. BCRW has already been extended to allow for multi-state modeling. Unlike BCRW, however, SSF can be estimated using most statistical packages, it can simultaneously evaluate habitat selection and movement biases, and can easily integrate a large number of movement taxes at multiple scales. SSF thus offers a simple, yet effective, statistical technique to identify movement taxis.

  19. Lagrangian statistics and flow topology in forced two-dimensional turbulence.

    PubMed

    Kadoch, B; Del-Castillo-Negrete, D; Bos, W J T; Schneider, K

    2011-03-01

    A study of the relationship between Lagrangian statistics and flow topology in fluid turbulence is presented. The topology is characterized using the Weiss criterion, which provides a conceptually simple tool to partition the flow into topologically different regions: elliptic (vortex dominated), hyperbolic (deformation dominated), and intermediate (turbulent background). The flow corresponds to forced two-dimensional Navier-Stokes turbulence in doubly periodic and circular bounded domains, the latter with no-slip boundary conditions. In the double periodic domain, the probability density function (pdf) of the Weiss field exhibits a negative skewness consistent with the fact that in periodic domains the flow is dominated by coherent vortex structures. On the other hand, in the circular domain, the elliptic and hyperbolic regions seem to be statistically similar. We follow a Lagrangian approach and obtain the statistics by tracking large ensembles of passively advected tracers. The pdfs of residence time in the topologically different regions are computed introducing the Lagrangian Weiss field, i.e., the Weiss field computed along the particles' trajectories. In elliptic and hyperbolic regions, the pdfs of the residence time have self-similar algebraic decaying tails. In contrast, in the intermediate regions the pdf has exponential decaying tails. The conditional pdfs (with respect to the flow topology) of the Lagrangian velocity exhibit Gaussian-like behavior in the periodic and in the bounded domains. In contrast to the freely decaying turbulence case, the conditional pdfs of the Lagrangian acceleration in forced turbulence show a comparable level of intermittency in both the periodic and the bounded domains. The conditional pdfs of the Lagrangian curvature are characterized, in all cases, by self-similar power-law behavior with a decay exponent of order -2.

  20. Serum albumin levels in burn people are associated to the total body surface burned and the length of hospital stay but not to the initiation of the oral/enteral nutrition

    PubMed Central

    Pérez-Guisado, Joaquín; de Haro-Padilla, Jesús M; Rioja, Luis F; DeRosier, Leo C; de la Torre, Jorge I

    2013-01-01

    Objective: Serum albumin levels have been used to evaluate the severity of the burns and the nutrition protein status in burn people, specifically in the response of the burn patient to the nutrition, although it has not been proven that all these associations are fully founded. The aim of this retrospective study was to determine the relationship of serum albumin levels at 3-7 days after the burn injury with the total body surface area burned (TBSA), the length of hospital stay (LHS) and the initiation of the oral/enteral nutrition (IOEN). Subject and methods: The study was carried out with the health records of patients that met the inclusion criteria and were admitted to the burn units at the University Hospital of Reina Sofia (Córdoba, Spain) and UAB Hospital at Birmingham (Alabama, USA) over a 10-year period, between January 2000 and December 2009. We studied the statistical association of serum albumin levels with the TBSA, LHS and IOEN by one-way ANOVA. The confidence interval chosen for statistical differences was 95%. Duncan’s test was used to determine the number of statistically significant groups. Results: Data were expressed as mean±standard deviation. We found an association of serum albumin levels with TBSA and LHS, with lower serum albumin levels associated with greater TBSA and longer LHS. We did not find a statistical association with IOEN. Conclusion: We conclude that serum albumin levels are not a nutritional marker in burn people, although they could be used as a simple clinical tool to identify the severity of the burn wounds, represented by the total body surface area burned and the length of hospital stay. PMID:23875122

  1. Serum albumin levels in burn people are associated to the total body surface burned and the length of hospital stay but not to the initiation of the oral/enteral nutrition.

    PubMed

    Pérez-Guisado, Joaquín; de Haro-Padilla, Jesús M; Rioja, Luis F; Derosier, Leo C; de la Torre, Jorge I

    2013-01-01

    Serum albumin levels have been used to evaluate the severity of the burns and the nutrition protein status in burn people, specifically in the response of the burn patient to the nutrition, although it has not been proven that all these associations are fully founded. The aim of this retrospective study was to determine the relationship of serum albumin levels at 3-7 days after the burn injury with the total body surface area burned (TBSA), the length of hospital stay (LHS) and the initiation of the oral/enteral nutrition (IOEN). The study was carried out with the health records of patients that met the inclusion criteria and were admitted to the burn units at the University Hospital of Reina Sofia (Córdoba, Spain) and UAB Hospital at Birmingham (Alabama, USA) over a 10-year period, between January 2000 and December 2009. We studied the statistical association of serum albumin levels with the TBSA, LHS and IOEN by one-way ANOVA. The confidence interval chosen for statistical differences was 95%. Duncan's test was used to determine the number of statistically significant groups. Data were expressed as mean±standard deviation. We found an association of serum albumin levels with TBSA and LHS, with lower serum albumin levels associated with greater TBSA and longer LHS. We did not find a statistical association with IOEN. We conclude that serum albumin levels are not a nutritional marker in burn people, although they could be used as a simple clinical tool to identify the severity of the burn wounds, represented by the total body surface area burned and the length of hospital stay.
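
    The core analysis, a one-way ANOVA of albumin levels across groups, is a one-liner in standard statistical software; the sketch below shows the scipy form. The group labels and values are placeholders for illustration, not the study data.

      import numpy as np
      from scipy import stats

      # Albumin (g/dL) grouped by TBSA category -- placeholder values only.
      albumin_by_tbsa = {
          "<20%":   np.array([3.4, 3.6, 3.1, 3.5]),
          "20-40%": np.array([2.9, 3.0, 2.7, 3.1]),
          ">40%":   np.array([2.2, 2.4, 2.1, 2.5]),
      }
      f_stat, p_value = stats.f_oneway(*albumin_by_tbsa.values())
      print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 suggests the group means differ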

  2. Simple Ontology Format (SOFT)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sorokine, Alexandre

    2011-10-01

    The Simple Ontology Format (SOFT) library and file format specification provide a set of simple tools for developing and maintaining ontologies. The library, implemented as a Perl module, supports parsing and verification of files in the SOFT format, operations on ontologies (adding, removing, or filtering entities), and conversion of ontologies into other formats. SOFT allows users to quickly create an ontology using only a basic text editor, verify it, and portray it in a graph layout system using customized styles.

  3. Laparoscopic repair of perforated peptic ulcer: simple closure versus omentopexy.

    PubMed

    Lin, Being-Chuan; Liao, Chien-Hung; Wang, Shang-Yu; Hwang, Tsann-Long

    2017-12-01

    This report presents our experience with laparoscopic repair performed in 118 consecutive patients diagnosed with a perforated peptic ulcer (PPU). We compared the surgical outcome of simple closure with modified Cellan-Jones omentopexy and report the safety and benefit of simple closure. From January 2010 to December 2014, 118 patients with PPU underwent laparoscopic repair with simple closure (n = 27) or omentopexy (n = 91). Charts were retrospectively reviewed for demographic characteristics and outcome. The data were compared by Fisher's exact test, Mann-Whitney U test, Pearson's chi-square test, and the Kruskal-Wallis test. The results were considered statistically significant if P < 0.05. No patients died, although three had leakage. After matching, the simple closure and omentopexy groups were similar in sex, systolic blood pressure, pulse rate, respiratory rate, Boey score, Charlson comorbidity index, Mannheim peritonitis index, and leakage. There were statistically significant differences in age, length of hospital stay, perforation size, and operating time. Comparison of the operating time in the ≤4.0 mm and 5.0-12 mm groups revealed that simple closure took less time than omentopexy in both groups (≤4.0 mm, 76 versus 133 minutes, P < 0.0001; 5.0-12 mm, 97 versus 139.5 minutes; P = 0.006). Compared with omentopexy, laparoscopic simple closure is a safe procedure and shortens the operating time. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  4. Evaluation of a simple method for the automatic assignment of MeSH descriptors to health resources in a French online catalogue.

    PubMed

    Névéol, Aurélie; Pereira, Suzanne; Kerdelhué, Gaetan; Dahamna, Badisse; Joubert, Michel; Darmoni, Stéfan J

    2007-01-01

    The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. The objective was to develop a simple automatic tool that retrieves MeSH descriptors from document titles. In parallel with research on advanced indexing methods, a bag-of-words tool was developed for timely inclusion in CISMeF's maintenance system. An evaluation was carried out on a corpus of 99 documents. The indexing sets retrieved by the automatic tool were compared to manual indexing based on the title and on the full text of resources. 58% of the major main headings were retrieved by the bag-of-words algorithm and the precision on main heading retrieval was 69%. Bag-of-words indexing has effectively been used on selected resources to be included in CISMeF since August 2006. Meanwhile, ongoing work aims to improve the current version of the tool.

  5. Calibrating the Difficulty of an Assessment Tool: The Blooming of a Statistics Examination

    ERIC Educational Resources Information Center

    Dunham, Bruce; Yapa, Gaitri; Yu, Eugenia

    2015-01-01

    Bloom's taxonomy is proposed as a tool by which to assess the level of complexity of assessment tasks in statistics. Guidelines are provided for how to locate tasks at each level of the taxonomy, along with descriptions and examples of suggested test questions. Through the "Blooming" of an examination--that is, locating its constituent…

  6. Analytics for Cyber Network Defense

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Plantenga, Todd; Kolda, Tamara Gibson

    2011-06-01

    This report provides a brief survey of analytics tools considered relevant to cyber network defense (CND). Ideas and tools come from fields such as statistics, data mining, and knowledge discovery. Some analytics are considered standard mathematical or statistical techniques, while others reflect current research directions. In all cases the report attempts to explain the relevance to CND with brief examples.

  7. On the blind use of statistical tools in the analysis of globular cluster stars

    NASA Astrophysics Data System (ADS)

    D'Antona, Francesca; Caloi, Vittoria; Tailo, Marco

    2018-04-01

    As with most data analysis methods, the Bayesian method must be handled with care. We show that its application to determine stellar evolution parameters within globular clusters can lead to paradoxical results if used without the necessary precautions. This is a cautionary tale on the use of statistical tools for big data analysis.

  8. A note on the correlation between circular and linear variables with an application to wind direction and air temperature data in a Mediterranean climate

    NASA Astrophysics Data System (ADS)

    Lototzis, M.; Papadopoulos, G. K.; Droulia, F.; Tseliou, A.; Tsiros, I. X.

    2018-04-01

    There are several cases where a circular variable is associated with a linear one. A typical example is wind direction, which is often associated with linear quantities such as air temperature and air humidity. A statistical relationship of this kind can be tested by the use of parametric and non-parametric methods, each of which has its own advantages and drawbacks. This work deals with correlation analysis using both the parametric and the non-parametric procedure on a small set of meteorological data of air temperature and wind direction during a summer period in a Mediterranean climate. Correlations were examined between hourly, daily and maximum-prevailing values, under typical and non-typical meteorological conditions. Both tests indicated a strong correlation between mean hourly wind directions and mean hourly air temperature, whereas mean daily wind direction and mean daily air temperature do not seem to be correlated. In some cases, however, the two procedures were found to give quite dissimilar significance levels for rejecting the null hypothesis of no correlation. The simple statistical analysis presented in this study, appropriately extended to large sets of meteorological data, may be a useful tool for estimating the effects of wind in local climate studies.
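
    For the parametric side of such an analysis, a standard circular-linear correlation coefficient can be computed from the linear correlations of the linear variable with the sine and cosine of the angle, as sketched below (the asymptotic test compares n*r^2 with a chi-square distribution with 2 degrees of freedom). This is a generic textbook statistic, not necessarily the exact procedure used in the study.

      import numpy as np
      from scipy import stats

      def circular_linear_correlation(theta, x):
          """Circular-linear correlation between angles theta (radians) and a linear variable x."""
          theta = np.asarray(theta, dtype=float)
          x = np.asarray(x, dtype=float)
          rxc = np.corrcoef(x, np.cos(theta))[0, 1]
          rxs = np.corrcoef(x, np.sin(theta))[0, 1]
          rcs = np.corrcoef(np.cos(theta), np.sin(theta))[0, 1]
          r2 = (rxc**2 + rxs**2 - 2 * rxc * rxs * rcs) / (1 - rcs**2)
          p = stats.chi2.sf(len(x) * r2, df=2)   # asymptotic p-value
          return np.sqrt(max(r2, 0.0)), p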

  9. To t-Test or Not to t-Test? A p-Values-Based Point of View in the Receiver Operating Characteristic Curve Framework.

    PubMed

    Vexler, Albert; Yu, Jihnhee

    2018-04-13

    A common statistical doctrine supported by many introductory courses and textbooks is that t-test type procedures based on normally distributed data points are anticipated to provide a standard in decision-making. In order to motivate scholars to examine this convention, we introduce a simple approach based on graphical tools of receiver operating characteristic (ROC) curve analysis, a well-established biostatistical methodology. In this context, we propose employing a p-values-based method, taking into account the stochastic nature of p-values. We focus on the modern statistical literature to address the expected p-value (EPV) as a measure of the performance of decision-making rules. During the course of our study, we extend the EPV concept to be considered in terms of the ROC curve technique. This provides expressive evaluations and visualizations of a wide spectrum of testing mechanisms' properties. We show that the conventional power characterization of tests is a partial aspect of the presented EPV/ROC technique. We hope that this explanation convinces researchers of the usefulness of the EPV/ROC approach for depicting different characteristics of decision-making procedures, in light of the growing interest in correct p-values-based applications.
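
    A simple way to make the expected p-value concrete is to estimate it by simulation under a chosen alternative, as in the sketch below: a smaller EPV indicates a better-performing test. This is a generic Monte Carlo illustration of the EPV idea, not the authors' EPV/ROC machinery.

      import numpy as np
      from scipy import stats

      def expected_p_value(effect_size, n_per_group, n_sim=10000, seed=0):
          """Monte Carlo estimate of the expected p-value of a two-sample t-test."""
          rng = np.random.default_rng(seed)
          pvals = np.empty(n_sim)
          for i in range(n_sim):
              a = rng.normal(0.0, 1.0, n_per_group)
              b = rng.normal(effect_size, 1.0, n_per_group)
              pvals[i] = stats.ttest_ind(a, b).pvalue
          return pvals.mean()

      # EPV shrinks as the effect size grows, mirroring increasing power:
      # expected_p_value(1.0, 30) < expected_p_value(0.5, 30)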

  10. Heart rate variability changes during stroop color and word test among genders.

    PubMed

    Satish, Priyanka; Muralikrishnan, Krishnan; Balasubramanian, Kabali; Shanmugapriya

    2015-01-01

    Stress is the reaction of the body to a change that requires physical, mental or emotional adjustments. Individual differences in stress reactivity are a potentially important risk factor for gender-specific health problems in men and women. Autonomic regulation of the cardiovascular system is most commonly affected by stress and is assessed by means of short-term heart rate variability (HRV). The present study was undertaken to investigate the difference in the cardiovascular autonomic nervous system response to mental stress between the genders, using HRV as a tool. We compared the mean RR interval, blood pressure and indices of HRV during the Stroop Color and Word Test (SCWT). Twenty-five male (age 19.52±0.714, BMI 22.73±2 kg/m2) and twenty-five female subjects (age 19.80±0.65, BMI 22.39±1.9) performed the SCWT for five minutes. Blood pressure (SBP p<0.01, DBP p<0.042) and mean HR (p<0.010) values showed statistically significant differences between the genders. HRV indices such as LFms2 (p<0.051), HF nu (p<0.029) and the LF/HF ratio (p<0.025, p<0.052) also showed statistically significant differences between the genders. The response of the cardiovascular system to a simple mental stressor thus differs between the genders.

  11. CompGO: an R package for comparing and visualizing Gene Ontology enrichment differences between DNA binding experiments.

    PubMed

    Waardenberg, Ashley J; Basset, Samuel D; Bouveret, Romaric; Harvey, Richard P

    2015-09-02

    Gene ontology (GO) enrichment is commonly used for inferring biological meaning from systems biology experiments. However, determining differential GO and pathway enrichment between DNA-binding experiments or using the GO structure to classify experiments has received little attention. Herein, we present a bioinformatics tool, CompGO, for identifying Differentially Enriched Gene Ontologies, called DiEGOs, and pathways, through the use of a z-score derivation of log odds ratios, and visualizing these differences at GO and pathway level. Through public experimental data focused on the cardiac transcription factor NKX2-5, we illustrate the problems associated with comparing GO enrichments between experiments using a simple overlap approach. We have developed an R/Bioconductor package, CompGO, which implements a new statistic normally used in epidemiological studies for performing comparative GO analyses and visualizing comparisons from .BED data containing genomic coordinates, as well as gene lists, as inputs. We justify the statistic through inclusion of experimental data and compare it to the commonly used overlap method. CompGO is freely available as an R/Bioconductor package, enabling easy integration into existing pipelines, at http://www.bioconductor.org/packages/release/bioc/html/CompGO.html.
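
    The kind of epidemiological statistic the abstract refers to, a z-score comparing two log odds ratios, can be sketched for two 2x2 enrichment tables as below. The Haldane 0.5 correction and the table layout are assumptions for illustration; this is not CompGO's exact code.

      import numpy as np
      from scipy import stats

      def compare_enrichment(table1, table2):
          """z-test for the difference of two log odds ratios.

          Each table is a 2x2 array: [[term & in list, term & not in list],
                                      [no term & in list, no term & not in list]].
          """
          def log_or_and_se(t):
              a, b, c, d = np.asarray(t, dtype=float).ravel() + 0.5  # Haldane correction
              return np.log((a * d) / (b * c)), np.sqrt(1/a + 1/b + 1/c + 1/d)

          lor1, se1 = log_or_and_se(table1)
          lor2, se2 = log_or_and_se(table2)
          z = (lor1 - lor2) / np.sqrt(se1**2 + se2**2)
          return z, 2 * stats.norm.sf(abs(z))  # two-sided p-value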

  12. Translation, cultural adaptation, cross-validation of the Turkish diabetes quality-of-life (DQOL) measure.

    PubMed

    Yildirim, Aysegul; Akinci, Fevzi; Gozu, Hulya; Sargin, Haluk; Orbay, Ekrem; Sargin, Mehmet

    2007-06-01

    The aim of this study was to test the validity and reliability of the Turkish version of the diabetes quality of life (DQOL) questionnaire for use with patients with diabetes. The Turkish versions of the generic quality of life (QoL) scale 15D and the DQOL, together with socio-demographic and clinical parameter questionnaires, were administered to 150 patients with type 2 diabetes. Study participants were randomly sampled from the Endocrinology and Diabetes Outpatient Department of Dr. Lutfi Kirdar Kartal Education and Research Hospital in Istanbul, Turkey. The Cronbach alpha coefficient of the overall DQOL scale was 0.89; the Cronbach alpha coefficient ranged from 0.80 to 0.94 for subscales. Distress, discomfort and its symptoms, depression, mobility, usual activities, and vitality on the 15D scale had statistically significant correlations with social/vocational worry and diabetes-related worry on the DQOL scale, indicating good convergent validity. Factor analysis identified four subscales: "satisfaction", "impact", "diabetes-related worry", and "social/vocational worry". Statistical analyses showed that the Turkish version of the DQOL is a valid and reliable instrument to measure disease-related QoL in patients with diabetes. It is a simple and quick screening tool with about 15 +/- 5.8 min administration time for measuring QoL in this population.

  13. The European Southern Observatory-MIDAS table file system

    NASA Technical Reports Server (NTRS)

    Peron, M.; Grosbol, P.

    1992-01-01

    The new and substantially upgraded version of the Table File System in MIDAS is presented as a scientific database system. MIDAS applications for performing database operations on tables are discussed, for instance, the exchange of the data to and from the TFS, the selection of objects, the uncertainty joins across tables, and the graphical representation of data. This upgraded version of the TFS is a full implementation of the binary table extension of the FITS format; in addition, it also supports arrays of strings. Different storage strategies for optimal access of very large data sets are implemented and are addressed in detail. As a simple relational database, the TFS may be used for the management of personal data files. This opens the way to intelligent pipeline processing of large amounts of data. One of the key features of the Table File System is to provide also an extensive set of tools for the analysis of the final results of a reduction process. Column operations using standard and special mathematical functions as well as statistical distributions can be carried out; commands for linear regression and model fitting using nonlinear least square methods and user-defined functions are available. Finally, statistical tests of hypothesis and multivariate methods can also operate on tables.

  14. Field-theoretic approach to fluctuation effects in neural networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buice, Michael A.; Cowan, Jack D.; Mathematics Department, University of Chicago, Chicago, Illinois 60637

    A well-defined stochastic theory for neural activity, which permits the calculation of arbitrary statistical moments and equations governing them, is a potentially valuable tool for theoretical neuroscience. We produce such a theory by analyzing the dynamics of neural activity using field theoretic methods for nonequilibrium statistical processes. Assuming that neural network activity is Markovian, we construct the effective spike model, which describes both neural fluctuations and response. This analysis leads to a systematic expansion of corrections to mean field theory, which for the effective spike model is a simple version of the Wilson-Cowan equation. We argue that neural activity governed by this model exhibits a dynamical phase transition which is in the universality class of directed percolation. More general models (which may incorporate refractoriness) can exhibit other universality classes, such as dynamic isotropic percolation. Because of the extremely high connectivity in typical networks, it is expected that higher-order terms in the systematic expansion are small for experimentally accessible measurements, and thus, consistent with measurements in neocortical slice preparations, we expect mean field exponents for the transition. We provide a quantitative criterion for the relative magnitude of each term in the systematic expansion, analogous to the Ginsburg criterion. Experimental identification of dynamic universality classes in vivo is an outstanding and important question for neuroscience.

  15. Proceedings, Seminar on Probabilistic Methods in Geotechnical Engineering

    NASA Astrophysics Data System (ADS)

    Hynes-Griffin, M. E.; Buege, L. L.

    1983-09-01

    Contents: Applications of Probabilistic Methods in Geotechnical Engineering; Probabilistic Seismic and Geotechnical Evaluation at a Dam Site; Probabilistic Slope Stability Methodology; Probability of Liquefaction in a 3-D Soil Deposit; Probabilistic Design of Flood Levees; Probabilistic and Statistical Methods for Determining Rock Mass Deformability Beneath Foundations: An Overview; Simple Statistical Methodology for Evaluating Rock Mechanics Exploration Data; New Developments in Statistical Techniques for Analyzing Rock Slope Stability.

  16. Cloud-Based Tools to Support High-Resolution Modeling (Invited)

    NASA Astrophysics Data System (ADS)

    Jones, N.; Nelson, J.; Swain, N.; Christensen, S.

    2013-12-01

    The majority of watershed models developed to support decision-making by water management agencies are simple, lumped-parameter models. Maturity in research codes and advances in the computational power from multi-core processors on desktop machines, commercial cloud-computing resources, and supercomputers with thousands of cores have created new opportunities for employing more accurate, high-resolution distributed models for routine use in decision support. The barriers to using such models on a more routine basis include the massive amounts of spatial data that must be processed for each new scenario and the lack of efficient visualization tools. In this presentation we will review a current NSF-funded project called CI-WATER that is intended to overcome many of these roadblocks associated with high-resolution modeling. We are developing a suite of tools that will make it possible to deploy customized web-based apps for running custom scenarios for high-resolution models with minimal effort. These tools are based on a software stack that includes 52 North, MapServer, PostGIS, HT Condor, CKAN, and Python. This open source stack provides a simple scripting environment for quickly configuring new custom applications for running high-resolution models as geoprocessing workflows. The HT Condor component facilitates simple access to local distributed computers or commercial cloud resources when necessary for stochastic simulations. The CKAN framework provides a powerful suite of tools for hosting such workflows in a web-based environment that includes visualization tools and storage of model simulations in a database for archiving, querying, and sharing of model results. Prototype applications, including land use change, snow melt, and burned area analysis, will be presented. This material is based upon work supported by the National Science Foundation under Grant No. 1135482.

  17. Development of a Clinical Tool to Predict Home Death of a Discharged Cancer Patient in Japan: a Case-Control Study.

    PubMed

    Fukui, Sakiko; Morita, Tatsuya; Yoshiuchi, Kazuhiro

    2017-08-01

    The aim of this study was to investigate the predictive value of a clinical tool to predict whether discharged cancer patients die at home, comparing a case group who died at home with a control group who died in hospitals or other facilities. We conducted a nationwide case-control study to identify the determinants of home death for a discharged cancer patient. We randomly selected nurses in charge of 2000 home-visit nursing agencies from all 5813 agencies in Japan by referring to the nationwide databases in January 2013. The nurses were asked to report their patients' place of death, patients' and caregivers' clinical statuses, and their preferences for home death. We used logistic regression analysis to develop a clinical tool that accurately predicts home death and investigated its predictive value. We identified 466 case and 478 control patients. Five predictive variables of home death were obtained: patients' and caregivers' preferences for home death [OR (95% CI) 2.66 (1.99-3.55)], availability of visiting physicians [2.13 (1.67-2.70)], 24-h contact between physicians and nurses [1.68 (1.30-2.18)], caregivers' experiences of deathwatch at home [1.41 (1.13-1.75)], and patients' insights as to their own prognosis [1.23 (1.02-1.50)]. We calculated scores predicting home death for each variable (total score range 6-28). When using a cutoff point of 16, home death was predicted with a sensitivity of 0.72 and a specificity of 0.81, with a Harrell's c-statistic of 0.84. This simple clinical tool for healthcare professionals can help predict whether a discharged patient is likely to die at home.

  18. Concept design theory and model for multi-use space facilities: Analysis of key system design parameters through variance of mission requirements

    NASA Astrophysics Data System (ADS)

    Reynerson, Charles Martin

    This research has been performed to create concept design and economic feasibility data for space business parks. A space business park is a commercially run multi-use space station facility designed for use by a wide variety of customers. Both space hardware and crew are considered as revenue-producing payloads. Examples of commercial markets may include biological and materials research, processing, and production, space tourism habitats, and satellite maintenance and resupply depots. This research develops a design methodology and an analytical tool to create feasible preliminary design information for space business parks. The design tool is validated against a number of real facility designs. Appropriate model variables are adjusted to ensure that statistical approximations are valid for subsequent analyses. The tool is used to analyze the effect of various payload requirements on the size, weight, and power of the facility. The approach for the analytical tool was to input potential payloads as simple requirements, such as volume, weight, power, crew size, and endurance. In creating the theory, basic principles are used and combined with parametric estimation of data when necessary. Key system parameters are identified for overall system design, and typical ranges for these parameters are identified based on real human spaceflight systems. To connect economics to design, a life-cycle cost model is created based upon facility mass. This rough cost model estimates potential return on investment, initial investment requirements, and the number of years to return on the initial investment. Example cases are analyzed for both performance- and cost-driven requirements for space hotels, microgravity processing facilities, and multi-use facilities. In combining engineering and economic models, a design-to-cost methodology is created for more accurately estimating the commercial viability of multiple space business park markets.
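
    A heavily simplified sketch of the kind of mass-driven design-to-cost check described above; every number (cost per kilogram, revenue, operating cost) is a placeholder for illustration, not a value from the dissertation.

    ```python
    # Hypothetical mass-based payback estimate; all constants are placeholders.
    def payback_years(facility_mass_kg: float,
                      cost_per_kg_usd: float = 50_000.0,
                      annual_revenue_usd: float = 150e6,
                      annual_operating_cost_usd: float = 60e6) -> float:
        """Years to recover the initial, mass-driven investment from net revenue."""
        initial_investment = facility_mass_kg * cost_per_kg_usd
        net_annual = annual_revenue_usd - annual_operating_cost_usd
        if net_annual <= 0:
            return float("inf")
        return initial_investment / net_annual

    print(f"{payback_years(100_000):.1f} years to return on investment")
    ```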

  19. A Novel Approach to Determining Violence Risk in Schizophrenia: Developing a Stepped Strategy in 13,806 Discharged Patients

    PubMed Central

    Singh, Jay P.; Grann, Martin; Lichtenstein, Paul; Långström, Niklas; Fazel, Seena

    2012-01-01

    Clinical guidelines recommend that violence risk be assessed in schizophrenia. Current approaches are resource-intensive as they employ detailed clinical assessments of dangerousness for most patients. An alternative approach would be to first screen out patients at very low risk of future violence prior to more costly and time-consuming assessments. In order to implement such a stepped strategy, we developed a simple tool to screen out individuals with schizophrenia at very low risk of violent offending. We merged high quality Swedish national registers containing information on psychiatric diagnoses, socio-demographic factors, and violent crime. A cohort of 13,806 individuals with hospital discharge diagnoses of schizophrenia was identified and followed for up to 33 years for violent crime. Cox regression was used to determine risk factors for violent crime and construct the screening tool, the predictive validity of which was measured using four outcome statistics. The instrument was calibrated on 6,903 participants and cross-validated using three independent replication samples of 2,301 participants each. Regression analyses resulted in a tool composed of five items: male sex, previous criminal conviction, young age at assessment, comorbid alcohol abuse, and comorbid drug abuse. At 5 years after discharge, the instrument had a negative predictive value of 0.99 (95% CI = 0.98–0.99), meaning that very few individuals whom the tool screened out (n = 2,359 out of original sample of 6,903) were subsequently convicted of a violent offence. Screening out patients who are at very low risk of violence prior to more detailed clinical assessment may assist the risk assessment process in schizophrenia. PMID:22359622
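
    A minimal check of the negative predictive value quoted above. The screened-out count comes from the abstract; the false-negative count in the example is a rough back-calculation for illustration, not a figure from the paper.

    ```python
    # NPV = TN / (TN + FN): among screened-out patients, the fraction who truly remain non-violent.
    def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
        return true_negatives / (true_negatives + false_negatives)

    # Example: of 2,359 screened-out patients, suppose 24 later offended.
    print(round(negative_predictive_value(2359 - 24, 24), 3))  # ~0.99
    ```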

  20. Evaluation of recently validated non-invasive formula using basic lung functions as new screening tool for pulmonary hypertension in idiopathic pulmonary fibrosis patients

    PubMed Central

    Ghanem, Maha K.; Makhlouf, Hoda A.; Agmy, Gamal R.; Imam, Hisham M. K.; Fouad, Doaa A.

    2009-01-01

    BACKGROUND: A prediction formula for mean pulmonary artery pressure (MPAP) using standard lung function measurements has been recently validated to screen for pulmonary hypertension (PH) in idiopathic pulmonary fibrosis (IPF) patients. OBJECTIVE: To test the usefulness of this formula as a new non-invasive screening tool for PH in IPF patients, and to study its correlation with patients' clinical data, pulmonary function tests, arterial blood gases (ABGs), and other commonly used screening methods for PH, including electrocardiogram (ECG), chest X-ray (CXR), trans-thoracic echocardiography (TTE), and computerized tomography pulmonary angiography (CTPA). MATERIALS AND METHODS: Cross-sectional study of 37 IPF patients from a tertiary hospital. The accuracy of MPAP estimation was assessed by examining the correlation between the formula-predicted MPAP and PH diagnosed by other screening tools and by patients' clinical signs of PH. RESULTS: There was no statistically significant difference in the prediction of PH using a cutoff point of 21 or 25 mm Hg (P = 0.24). Formula-predicted MPAP greater than 25 mm Hg correlated strongly in the expected direction with O2 saturation (r = −0.95, P < 0.000), partial arterial O2 tension (r = −0.71, P < 0.000), right ventricular systolic pressure measured by TTE (r = 0.6, P < 0.000), and hilar width on CXR (r = 0.31, P = 0.03). Chest symptoms, ECG, and CTPA signs of PH correlated poorly with the formula (P > 0.05). CONCLUSIONS: The prediction formula for MPAP using standard lung function measurements is a simple non-invasive tool that can be used, like TTE, to screen for PH in IPF patients and to select those who need right heart catheterization. PMID:19881164

  1. Effects of interactive patient smartphone support app on drug adherence and lifestyle changes in myocardial infarction patients: A randomized study.

    PubMed

    Johnston, Nina; Bodegard, Johan; Jerström, Susanna; Åkesson, Johanna; Brorsson, Hilja; Alfredsson, Joakim; Albertsson, Per A; Karlsson, Jan-Erik; Varenhorst, Christoph

    2016-08-01

    Patients with myocardial infarction (MI) seldom reach recommended targets for secondary prevention. This study evaluated a smartphone application ("app") aimed at improving treatment adherence and cardiovascular lifestyle in MI patients. Multicenter, randomized trial. A total of 174 ticagrelor-treated MI patients were randomized to either an interactive patient support tool (active group) or a simplified tool (control group) in addition to usual post-MI care. Primary end point was a composite nonadherence score measuring patient-registered ticagrelor adherence, defined as a combination of adherence failure events (2 missed doses registered in 7-day cycles) and treatment gaps (4 consecutive missed doses). Secondary end points included change in cardiovascular risk factors, quality of life (European Quality of Life-5 Dimensions), and patient device satisfaction (System Usability Scale). Patient mean age was 58 years, 81% were men, and 21% were current smokers. At 6 months, greater patient-registered drug adherence was achieved in the active vs the control group (nonadherence score: 16.6 vs 22.8 [P = .025]). Numerically, the active group was associated with a higher degree of smoking cessation, increased physical activity, and change in quality of life; however, this did not reach statistical significance. Patient satisfaction was significantly higher in the active vs the control group (system usability score: 87.3 vs 78.1 [P = .001]). In MI patients, use of an interactive patient support tool improved patient self-reported drug adherence and may be associated with a trend toward improved cardiovascular lifestyle changes and quality of life. Use of a disease-specific interactive patient support tool may be an appreciated, simple, and promising complement to standard secondary prevention. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  2. One Hundred Ways to be Non-Fickian - A Rigorous Multi-Variate Statistical Analysis of Pore-Scale Transport

    NASA Astrophysics Data System (ADS)

    Most, Sebastian; Nowak, Wolfgang; Bijeljic, Branko

    2015-04-01

    Fickian transport in groundwater flow is the exception rather than the rule. Transport in porous media is frequently simulated via particle methods (i.e. particle tracking random walk (PTRW) or continuous time random walk (CTRW)). These methods formulate transport as a stochastic process of particle position increments. At the pore scale, geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Hence, it is important to get a better understanding of the processes at the pore scale. For our analysis we track the positions of 10,000 particles migrating through the pore space over time. The data we use come from micro-CT scans of a homogeneous sandstone and encompass about 10 grain sizes. Based on those images we discretize the pore structure and simulate flow at the pore scale based on the Navier-Stokes equation. This flow field realistically describes flow inside the pore space, so we do not need to add artificial dispersion during the transport simulation. Next, we use particle tracking random walk and simulate pore-scale transport. Finally, we use the obtained particle trajectories to perform a multivariate statistical analysis of the particle motion at the pore scale. Our analysis is based on copulas. Every multivariate joint distribution is a combination of its univariate marginal distributions. The copula represents the dependence structure of those univariate marginals and is therefore useful for observing correlation and non-Gaussian interactions (i.e. non-Fickian transport). The first goal of this analysis is to better understand the validity regions of commonly made assumptions. We are investigating three different transport distances: 1) the distance where the statistical dependence between particle increments can be modelled as an order-one Markov process. This would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks starts. 2) The distance where bivariate statistical dependence simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW/CTRW). 3) The distance of complete statistical independence (validity of classical PTRW/CTRW). The second objective is to reveal the characteristic dependencies influencing transport the most. Those dependencies can be very complex. Copulas are highly capable of representing linear dependence as well as non-linear dependence. With that tool we are able to detect persistent characteristics dominating transport even across different scales. The results derived from our experimental data set suggest that there are many more non-Fickian aspects of pore-scale transport than are captured by the univariate statistics of longitudinal displacements. Non-Fickianity can also be found in transverse displacements, and in the relations between increments at different time steps. Also, the dependence we find is non-linear (i.e. beyond simple correlation) and persists over long distances. Thus, our results strongly support the further refinement of techniques like correlated PTRW or correlated CTRW towards non-linear statistical relations.
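
    A small exploratory sketch (not the authors' code) of the copula-scale idea: map consecutive displacement increments to rank-based pseudo-observations and compare linear correlation with rank dependence. The synthetic increments below only stand in for trajectory data.

    ```python
    # Rank-transform consecutive increments (empirical copula scale) and compare
    # linear vs. rank dependence on synthetic, heavy-tailed, serially dependent data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 10_000
    z = rng.standard_t(df=3, size=n + 1)
    increments = 0.6 * z[:-1] + z[1:]        # correlated, non-Gaussian increments
    x, y = increments[:-1], increments[1:]   # increment at step k vs. step k+1

    # Empirical copula: map each margin to uniform pseudo-observations via ranks.
    u = stats.rankdata(x) / (len(x) + 1)
    v = stats.rankdata(y) / (len(y) + 1)

    pearson, _ = stats.pearsonr(x, y)        # linear dependence only
    spearman, _ = stats.spearmanr(u, v)      # rank (copula-scale) dependence
    print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
    ```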

  3. Model weights and the foundations of multimodel inference

    USGS Publications Warehouse

    Link, W.A.; Barker, R.J.

    2006-01-01

    Statistical thinking in wildlife biology and ecology has been profoundly influenced by the introduction of AIC (Akaike's information criterion) as a tool for model selection and as a basis for model averaging. In this paper, we advocate the Bayesian paradigm as a broader framework for multimodel inference, one in which model averaging and model selection are naturally linked, and in which the performance of AIC-based tools is naturally evaluated. Prior model weights implicitly associated with the use of AIC are seen to highly favor complex models: in some cases, all but the most highly parameterized models in the model set are virtually ignored a priori. We suggest the usefulness of the weighted BIC (Bayesian information criterion) as a computationally simple alternative to AIC, based on explicit selection of prior model probabilities rather than acceptance of default priors associated with AIC. We note, however, that both procedures are only approximations to the use of exact Bayes factors. We discuss and illustrate technical difficulties associated with Bayes factors, and suggest approaches to avoiding these difficulties in the context of model selection for a logistic regression. Our example highlights the predisposition of AIC weighting to favor complex models and suggests a need for caution in using the BIC for computing approximate posterior model weights.
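
    A small sketch of the model-weight calculations discussed above: Akaike weights from AIC, and approximate posterior model probabilities from BIC under equal prior model weights. The AIC/BIC values are illustrative placeholders, not from the paper.

    ```python
    # w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j), with delta_i = IC_i - min(IC).
    import numpy as np

    def information_criterion_weights(ic_values):
        ic = np.asarray(ic_values, dtype=float)
        delta = ic - ic.min()
        w = np.exp(-0.5 * delta)
        return w / w.sum()

    aic = [100.0, 101.5, 104.2]   # hypothetical AIC values for three candidate models
    bic = [102.3, 101.0, 108.7]   # hypothetical BIC values for the same models

    print("Akaike weights:   ", np.round(information_criterion_weights(aic), 3))
    print("BIC-based weights:", np.round(information_criterion_weights(bic), 3))
    ```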

  4. A Simple Tool for Diet Evaluation in Primary Health Care: Validation of a 16-Item Food Intake Questionnaire

    PubMed Central

    Hemiö, Katri; Pölönen, Auli; Ahonen, Kirsti; Kosola, Mikko; Viitasalo, Katriina; Lindström, Jaana

    2014-01-01

    Our aim was to validate a 16-item food intake questionnaire (16-FIQ) and create an easy-to-use method to estimate patients' nutrient intake in primary health care. Participants (52 men, 25 women) completed a 7-day food record and a 16-FIQ. Food and nutrient intakes were calculated and compared using Spearman correlation. Further, nutrient intakes were compared using kappa statistics and exact and opposite agreement of intake tertiles. The results indicated that the 16-FIQ reliably categorized individuals according to their nutrient intakes. Methods to estimate nutrient intake based on the answers given in the 16-FIQ were created. In linear regression models, nutrient intake estimates from the food records were used as the dependent variables and sum variables derived from the 16-FIQ were used as the independent variables. Valid regression models were created for the energy proportion of fat, saturated fat, and sucrose and the amount of fibre (g), vitamin C (mg), iron (mg), and vitamin D (μg) intake. The 16-FIQ is a valid method for estimating nutrient intakes at the group level. In addition, the 16-FIQ could be a useful tool to facilitate identification of people in need of dietary counselling and to monitor the effect of counselling in primary health care. PMID:24599042
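
    A sketch of the agreement statistics named above, run on made-up data: Spearman correlation between questionnaire-based and food-record intakes, plus kappa and exact agreement of intake tertiles. Variable names and the simulated intakes are illustrative only.

    ```python
    import numpy as np
    import pandas as pd
    from scipy.stats import spearmanr
    from sklearn.metrics import cohen_kappa_score

    rng = np.random.default_rng(1)
    record_intake = rng.gamma(shape=4.0, scale=5.0, size=77)        # e.g. fibre (g) from a 7-day record
    fiq_estimate = record_intake + rng.normal(0.0, 4.0, size=77)    # noisy questionnaire-based estimate

    rho, p = spearmanr(record_intake, fiq_estimate)

    record_tertile = pd.qcut(record_intake, 3, labels=False)
    fiq_tertile = pd.qcut(fiq_estimate, 3, labels=False)
    kappa = cohen_kappa_score(record_tertile, fiq_tertile)
    exact = np.mean(record_tertile == fiq_tertile)

    print(f"Spearman rho = {rho:.2f} (p = {p:.3g}), kappa = {kappa:.2f}, exact agreement = {exact:.2f}")
    ```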

  5. Evolution of democracy in Europe

    NASA Astrophysics Data System (ADS)

    Oberoi, Mukesh K.

    The emphasis of this thesis is to build an intuitive and robust GIS (Geographic Information Systems) tool that surveys the evolution of democracy in European countries. Users can explore the democratic history of each country by clicking on it on the map. The information is provided in separate HTML pages that cover the start of revolution, the transition to democracy, the current legislature, women's status in the country, etc. There are two separate web pages for each country: one gives a detailed explanation of how democracy evolved in that country, and the other contains a timeline of the key events of that evolution. The tool has been developed in Java. For the European map, MOJO (Map Objects Java Objects) is used. MOJO is developed by ESRI. The major features shown on the European map were designed using MOJO, which made it easy to incorporate the statistical data with these features. The user interface, as well as the language, was intentionally kept simple and easy to use, to broaden the potential audience. To keep the user engaged, key aspects are explained using HTML pages. The idea is that users can view the timeline to get a quick overview and can go through the other HTML page to learn about things in more detail.

  6. Musical representation of dendritic spine distribution: a new exploratory tool.

    PubMed

    Toharia, Pablo; Morales, Juan; de Juan, Octavio; Fernaud, Isabel; Rodríguez, Angel; DeFelipe, Javier

    2014-04-01

    Dendritic spines are small protrusions along the dendrites of many types of neurons in the central nervous system and represent the major target of excitatory synapses. For this reason, numerous anatomical, physiological and computational studies have focused on these structures. In the cerebral cortex the most abundant and characteristic neuronal type is the pyramidal cell (about 85% of all neurons), whose dendritic spines are the main postsynaptic target of excitatory glutamatergic synapses. Thus, our understanding of the synaptic organization of the cerebral cortex largely depends on the knowledge regarding synaptic inputs to dendritic spines of pyramidal cells. Much of the structural data on dendritic spines produced by modern neuroscience involves the quantitative analysis of image stacks from light and electron microscopy, using standard statistical and mathematical tools and software developed to this end. Here, we present a new method with musical feedback for exploring dendritic spine morphology and distribution patterns in pyramidal neurons. We demonstrate that audio analysis of spiny dendrites with apparently similar morphology may "sound" quite different, revealing anatomical substrates that are not apparent from simple visual inspection. These morphological/musical translations may serve as a guide for further mathematical analysis of the design of pyramidal neurons and of spiny dendrites in general.

  7. Patient-Generated Subjective Global Assessment of nutritional status in pediatric patients with recent cancer diagnosis.

    PubMed

    Vázquez de la Torre, Mayra Jezabel; Stein, Katja; Vásquez Garibay, Edgar Manuel; Kumazawa Ichikawa, Miguel Roberto; Troyo Sanromán, Rogelio; Salcedo Flores, Alicia Guadalupe; Sánchez Zubieta, Fernando Antonio

    2017-10-24

    The subjective global assessment (SGA) is a simple, sensitive tool used to identify nutritional risk. It is widely used in the adult population, but there is little evidence on its effectiveness in children with cancer. This cross-sectional study was undertaken to demonstrate significant correlation between a simplified version of the Patient-Generated SGA (PG-SGA) and anthropometric assessment in identifying nutritional status in children recently diagnosed with cancer. The nutritional status of 70 pediatric cancer patients was assessed with the PG-SGA and anthropometric measurements. The relation between the assessments was tested with ANOVA, independent samples t-test, the kappa statistic, and non-parametric Spearman and Kendall correlation coefficients. The PG-SGA divided the patients into four groups: well nourished, and mildly, moderately, and severely malnourished. The prevalence of malnutrition according to the PG-SGA was 21.4%. The correlations (r ≥ 0.300, p < 0.001) and the concordance (k ≥ 0.327, p < 0.001) between the PG-SGA and anthropometric indicators were moderate and significant. The results indicate that the PG-SGA is a valid tool for assessing nutritional status in hospitalized children recently diagnosed with cancer. It is important to emphasize that the subjective assessment does not detect growth retardation, overweight or obesity.

  8. EEG Analytics for Early Detection of Autism Spectrum Disorder: A data-driven approach.

    PubMed

    Bosl, William J; Tager-Flusberg, Helen; Nelson, Charles A

    2018-05-01

    Autism spectrum disorder (ASD) is a complex and heterogeneous disorder, diagnosed on the basis of behavioral symptoms during the second year of life or later. Finding scalable biomarkers for early detection is challenging because of the variability in presentation of the disorder and the need for simple measurements that could be implemented routinely during well-baby checkups. EEG is a relatively easy-to-use, low-cost brain measurement tool that is being increasingly explored as a potential clinical tool for monitoring atypical brain development. EEG measurements were collected from 99 infants with an older sibling diagnosed with ASD, and 89 low-risk controls, beginning at 3 months of age and continuing until 36 months of age. Nonlinear features were computed from EEG signals and used as input to statistical learning methods. Prediction of the clinical diagnostic outcome of ASD or not ASD was highly accurate when using EEG measurements from as early as 3 months of age. Specificity, sensitivity and PPV were high, exceeding 95% at some ages. ADOS calibrated severity scores predicted for all infants in the study using only EEG data taken as early as 3 months of age were strongly correlated with the actual measured scores. This suggests that useful digital biomarkers might be extracted from EEG measurements.
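
    A minimal sketch of how the classification metrics reported above (sensitivity, specificity, PPV) follow from a confusion matrix; the labels here are illustrative, not study data.

    ```python
    import numpy as np
    from sklearn.metrics import confusion_matrix

    y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])   # 1 = ASD outcome, 0 = not ASD (illustrative)
    y_pred = np.array([1, 1, 0, 0, 0, 0, 1, 1, 0, 0])   # classifier output (illustrative)

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} PPV={ppv:.2f}")
    ```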

  9. US-SOMO HPLC-SAXS module: dealing with capillary fouling and extraction of pure component patterns from poorly resolved SEC-SAXS data

    PubMed Central

    Brookes, Emre; Vachette, Patrice; Rocco, Mattia; Pérez, Javier

    2016-01-01

    Size-exclusion chromatography coupled with SAXS (small-angle X-ray scattering), often performed using a flow-through capillary, should allow direct collection of monodisperse sample data. However, capillary fouling issues and non-baseline-resolved peaks can hamper its efficacy. The UltraScan solution modeler (US-SOMO) HPLC-SAXS (high-performance liquid chromatography coupled with SAXS) module provides a comprehensive framework to analyze such data, starting with a simple linear baseline correction and symmetrical Gaussian decomposition tools [Brookes, Pérez, Cardinali, Profumo, Vachette & Rocco (2013). J. Appl. Cryst. 46, 1823–1833]. In addition to several new features, substantial improvements to both routines have now been implemented, comprising the evaluation of outcomes by advanced statistical tools. The novel integral baseline-correction procedure is based on the more sound assumption that the effect of capillary fouling on scattering increases monotonically with the intensity scattered by the material within the X-ray beam. Overlapping peaks, often skewed because of sample interaction with the column matrix, can now be accurately decomposed using non-symmetrical modified Gaussian functions. As an example, the case of a polydisperse solution of aldolase is analyzed: from heavily convoluted peaks, individual SAXS profiles of tetramers, octamers and dodecamers are extracted and reliably modeled. PMID:27738419
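
    US-SOMO implements its own non-symmetrical modified Gaussian functions; the sketch below instead fits an exponentially modified Gaussian (EMG), one common skewed peak shape, to a synthetic chromatographic peak, just to illustrate the general idea of decomposing tailed peaks by least squares.

    ```python
    # Fit a skewed (exponentially modified Gaussian) peak to synthetic data with scipy.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.special import erfc

    def emg(t, area, mu, sigma, tau):
        """Exponentially modified Gaussian peak; tau controls the tailing."""
        arg = (sigma**2 - tau * (t - mu)) / (np.sqrt(2.0) * sigma * tau)
        return (area / (2.0 * tau)) * np.exp(sigma**2 / (2.0 * tau**2) - (t - mu) / tau) * erfc(arg)

    t = np.linspace(0, 60, 600)
    true_peak = emg(t, area=100.0, mu=25.0, sigma=2.0, tau=4.0)
    noisy = true_peak + np.random.default_rng(2).normal(0.0, 0.2, t.size)

    popt, _ = curve_fit(emg, t, noisy, p0=[80.0, 24.0, 1.5, 3.0],
                        bounds=([1.0, 0.0, 0.1, 0.1], [1000.0, 60.0, 10.0, 20.0]))
    print("fitted area, mu, sigma, tau:", np.round(popt, 2))
    ```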

  10. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
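
    A sketch of the two comparisons described above, on synthetic paired assay data: Bland-Altman bias with 95% limits of agreement, and Deming regression assuming equal error variance in both methods (lambda = 1). All data and parameter values are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    truth = rng.uniform(5, 50, 40)                                   # e.g. variant allele fractions (%)
    method_a = truth + rng.normal(0, 1.0, truth.size)                # reference assay
    method_b = 1.05 * truth + 0.5 + rng.normal(0, 1.0, truth.size)   # assay with proportional + constant error

    # Bland-Altman: bias and limits of agreement.
    diff = method_b - method_a
    bias = diff.mean()
    loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))

    # Deming regression (errors in both variables), lambda = ratio of error variances.
    lam = 1.0
    sxx = np.var(method_a, ddof=1)
    syy = np.var(method_b, ddof=1)
    sxy = np.cov(method_a, method_b, ddof=1)[0, 1]
    slope = (syy - lam * sxx + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    intercept = method_b.mean() - slope * method_a.mean()

    print(f"Bland-Altman bias={bias:.2f}, LoA={loa[0]:.2f} to {loa[1]:.2f}")
    print(f"Deming slope={slope:.3f}, intercept={intercept:.3f}")
    ```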

  11. Personalizing oncology treatments by predicting drug efficacy, side-effects, and improved therapy: mathematics, statistics, and their integration.

    PubMed

    Agur, Zvia; Elishmereni, Moran; Kheifetz, Yuri

    2014-01-01

    Despite its great promise, personalized oncology still faces many hurdles, and it is increasingly clear that targeted drugs and molecular biomarkers alone yield only modest clinical benefit. One reason is the complex relationships between biomarkers and the patient's response to drugs, which obscure the true weight of the biomarkers in the patient's overall response. This complexity can be disentangled by computational models that integrate the effects of personal biomarkers into a simulator of drug-patient dynamic interactions for predicting clinical outcomes. Several computational tools have been developed for personalized oncology, notably evidence-based tools for simulating pharmacokinetics, Bayesian-estimated tools for predicting survival, etc. We describe representative statistical and mathematical tools, and discuss their merits, shortcomings and preliminary clinical validation attesting to their potential. Yet, the individualization power of mathematical models alone, or statistical models alone, is limited. More accurate and versatile personalization tools can be constructed by a new application of the statistical/mathematical nonlinear mixed effects modeling (NLMEM) approach, which until recently has been used only in drug development. Using these advanced tools, clinical data from patient populations can be integrated with mechanistic models of disease and physiology to generate personal mathematical models. Upon more substantial validation in the clinic, this approach will hopefully be applied in personalized clinical trials (P-trials), hence aiding the establishment of personalized medicine within the mainstream of clinical oncology. © 2014 Wiley Periodicals, Inc.

  12. More Powerful Tests of Simple Interaction Contrasts in the Two-Way Factorial Design

    ERIC Educational Resources Information Center

    Hancock, Gregory R.; McNeish, Daniel M.

    2017-01-01

    For the two-way factorial design in analysis of variance, the current article explicates and compares three methods for controlling the Type I error rate for all possible simple interaction contrasts following a statistically significant interaction, including a proposed modification to the Bonferroni procedure that increases the power of…

  13. Calibration of Response Data Using MIRT Models with Simple and Mixed Structures

    ERIC Educational Resources Information Center

    Zhang, Jinming

    2012-01-01

    It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter…

  14. Quantitation & Case-Study-Driven Inquiry to Enhance Yeast Fermentation Studies

    ERIC Educational Resources Information Center

    Grammer, Robert T.

    2012-01-01

    We propose a procedure for the assay of fermentation in yeast in microcentrifuge tubes that is simple and rapid, permitting assay replicates, descriptive statistics, and the preparation of line graphs that indicate reproducibility. Using regression and simple derivatives to determine initial velocities, we suggest methods to compare the effects of…
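
    A small sketch of the "regression and simple derivatives" step mentioned above: fitting a line to the early, approximately linear part of a fermentation time course to estimate the initial velocity. The time points and gas-volume readings are made up.

    ```python
    import numpy as np

    time_min = np.array([0, 2, 4, 6, 8, 10], dtype=float)
    gas_volume_ul = np.array([0.0, 11.0, 23.0, 33.0, 46.0, 55.0])   # hypothetical readings

    slope, intercept = np.polyfit(time_min, gas_volume_ul, deg=1)   # slope ~ initial velocity
    print(f"initial velocity ~ {slope:.1f} uL/min (intercept {intercept:.1f} uL)")
    ```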

  15. A Simple Statistical Thermodynamics Experiment

    ERIC Educational Resources Information Center

    LoPresto, Michael C.

    2010-01-01

    Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
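
    A short sketch of the dice analogy described above: enumerate the microstates for two and three dice, count the multiplicity of each macrostate (the sum), and report an entropy-like quantity S = ln(multiplicity) for each, so that the most "disordered" sums carry the highest entropy.

    ```python
    import math
    from collections import Counter
    from itertools import product

    for n_dice in (2, 3):
        multiplicity = Counter(sum(roll) for roll in product(range(1, 7), repeat=n_dice))
        n_microstates = 6 ** n_dice
        print(f"{n_dice} dice:")
        for macrostate in sorted(multiplicity):
            omega = multiplicity[macrostate]
            print(f"  sum={macrostate:2d}  omega={omega:3d}  "
                  f"P={omega / n_microstates:.3f}  S=ln(omega)={math.log(omega):.2f}")
    ```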

  16. Standard Entropy of Crystalline Iodine from Vapor Pressure Measurements: A Physical Chemistry Experiment.

    ERIC Educational Resources Information Center

    Harris, Ronald M.

    1978-01-01

    Presents material dealing with an application of statistical thermodynamics to the diatomic solid I2(s). The objective is to enhance the student's appreciation of the power of the statistical formulation of thermodynamics. The simple Einstein model is used. (Author/MA)
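
    For reference, a minimal math note (not taken from the article) giving the textbook entropy of an Einstein solid of N atoms (3N oscillators), the form such an exercise typically rests on, with Einstein temperature θ_E.

    ```latex
    % Entropy of an Einstein solid; x = \theta_E / T.
    S = 3Nk\left[\frac{x}{e^{x}-1} - \ln\!\left(1 - e^{-x}\right)\right],
    \qquad x = \frac{\theta_E}{T}
    ```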

  17. "Hyperstat": an educational and working tool in epidemiology.

    PubMed

    Nicolosi, A

    1995-01-01

    The work of a researcher in epidemiology is based on studying literature, planning studies, gathering data, analyzing data, and writing results. The researcher therefore needs to perform more or less simple calculations, to consult or quote the literature, to consult textbooks about particular issues or procedures, and to look up specific formulas. There are no programs conceived as a workstation assisting these different aspects of the researcher's work in an integrated fashion. A hypertextual system was developed which supports the different stages of the epidemiologist's work. It combines database management, statistical analysis and planning, and literature searches. The software was developed on the Apple Macintosh using HyperCard 2.1 as a database and HyperTalk as the programming language. The program is structured in 7 "stacks" or files: Procedures; Statistical Tables; Graphs; References; Text; Formulas; Help. Each stack has its own management system with an automated table of contents. Stacks contain "cards" which make up the databases and carry executable programs. The programs are of four kinds: association; statistical procedure; formatting (input/output); and database management. The system performs general statistical procedures, procedures applicable to epidemiological studies only (follow-up and case-control), and procedures for clinical trials. All commands are given by clicking the mouse on self-explanatory "buttons". To perform calculations, the user only needs to enter the data into the appropriate cells and click on the selected procedure's button. The system has a hypertextual structure: the user can move from a procedure to other cards in any preferred order and according to built-in associations, and can access different levels of knowledge or information from any stack being consulted or operated. From every card, the user can go to a selected procedure to perform statistical calculations, to the reference database management system, to the textbook in which all procedures and issues are discussed in detail, to the database of statistical formulas with an automated table of contents, to statistical tables with an automated table of contents, or to the help module. The program has a very user-friendly interface and leaves the user free to use the same format he would use on paper. The interface does not require special skills; it reflects the Macintosh philosophy of windows, buttons, and mouse. This allows the user to perform complicated calculations without losing the "feel" of the data, to weigh alternatives, and to run simulations. The program shares many features with hypertexts: it has an underlying network database whose nodes consist of text, graphics, executable procedures, and combinations of these; the nodes correspond to windows on the screen; the links between nodes are visible as "active" text or icons in the windows; and the text is read by following links and opening new windows. The program is especially useful as an educational tool directed to medical and epidemiology students. The combination of computing capabilities with a textbook and databases of formulas and literature references makes the program versatile and attractive as a learning tool. The program is also helpful in work done at the desk, where the researcher examines results, consults the literature, explores different analytic approaches, plans new studies, or writes grant proposals and scientific articles.

  18. Statistical issues in the design and planning of proteomic profiling experiments.

    PubMed

    Cairns, David A

    2015-01-01

    The statistical design of a clinical proteomics experiment is a critical part of a well-undertaken investigation. Standard concepts from experimental design such as randomization, replication and blocking should be applied in all experiments, and this is possible when the experimental conditions are well understood by the investigator. The large number of proteins simultaneously considered in proteomic discovery experiments means that determining the number of replicates required for a powerful experiment is more complicated than in simple experiments. However, by using information about the nature of an experiment and making simple assumptions, this is achievable for a variety of experiments useful for biomarker discovery and initial validation.
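
    A sketch of one simple way to approach the replicate-number question discussed above: a two-sample t-test power analysis with a Bonferroni-adjusted significance level to account for the many proteins tested simultaneously. The effect size, protein count, and target power are illustrative assumptions, not recommendations from the chapter.

    ```python
    from statsmodels.stats.power import TTestIndPower

    n_proteins = 1000                      # proteins measured in the discovery experiment (assumed)
    alpha_per_protein = 0.05 / n_proteins  # Bonferroni-adjusted per-protein alpha
    effect_size = 1.0                      # standardized difference (Cohen's d) worth detecting (assumed)

    n_per_group = TTestIndPower().solve_power(
        effect_size=effect_size,
        alpha=alpha_per_protein,
        power=0.8,
        alternative="two-sided",
    )
    print(f"~{n_per_group:.1f} replicates per group")
    ```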

  19. A simple statistical model for geomagnetic reversals

    NASA Technical Reports Server (NTRS)

    Constable, Catherine

    1990-01-01

    The diversity of paleomagnetic records of geomagnetic reversals now available indicates that the field configuration during transitions cannot be adequately described by simple zonal or standing field models. A new model described here is based on statistical properties inferred from the present field and is capable of simulating field transitions like those observed. Some insight is obtained into what one can hope to learn from paleomagnetic records. In particular, it is crucial that the effects of smoothing in the remanence acquisition process be separated from true geomagnetic field behavior. This might enable us to determine the time constants associated with the dominant field configuration during a reversal.

  20. Statistical Properties of Online Auctions

    NASA Astrophysics Data System (ADS)

    Namazi, Alireza; Schadschneider, Andreas

    We characterize the statistical properties of a large number of online auctions run on eBay. Both stationary and dynamic properties, such as distributions of prices and numbers of bids, as well as relations between these quantities, are studied. The analysis of the data reveals surprisingly simple distributions and relations, typically of power-law form. Based on these findings we introduce a simple method to identify suspicious auctions that could be influenced by a form of fraud known as shill bidding. Furthermore, the influence of bidding strategies is discussed. The results indicate that the observed behavior is related to a mixture of agents using a variety of strategies.
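
    A small sketch of the kind of power-law check mentioned above: the maximum-likelihood (Hill-type) estimate of a tail exponent, alpha = 1 + n / sum(ln(x_i / x_min)), applied here to synthetic Pareto "price" data; real auction data would replace the simulated sample.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    x_min = 1.0
    true_alpha = 2.5
    # Draw Pareto-distributed samples whose density falls off as x**(-true_alpha).
    prices = x_min * (1.0 - rng.random(50_000)) ** (-1.0 / (true_alpha - 1.0))

    tail = prices[prices >= x_min]
    alpha_hat = 1.0 + tail.size / np.sum(np.log(tail / x_min))
    print(f"estimated tail exponent: {alpha_hat:.2f}")
    ```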
