Sample records for statistical analysis software

  1. An overview of the mathematical and statistical analysis component of RICIS

    NASA Technical Reports Server (NTRS)

    Hallum, Cecil R.

    1987-01-01

    Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.

  2. Analysis of Variance: What Is Your Statistical Software Actually Doing?

    ERIC Educational Resources Information Center

    Li, Jian; Lomax, Richard G.

    2011-01-01

    Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for three analysis of variance (ANOVA) designs: mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…

  3. Knowledge and utilization of computer-software for statistics among Nigerian dentists.

    PubMed

    Chukwuneke, F N; Anyanechi, C E; Obiakor, A O; Amobi, O; Onyejiaka, N; Alamba, I

    2013-01-01

    The use of computer software for statistical analysis has transformed health information and data into their simplest form for access, storage, retrieval and analysis in research. This survey was therefore carried out to assess the level of knowledge and utilization of computer software for statistical analysis among dental researchers in eastern Nigeria. Questionnaires on the use of computer software for statistical analysis were randomly distributed to 65 practicing dental surgeons with more than 5 years' experience in the tertiary academic hospitals in eastern Nigeria. The focus was on years of clinical experience; research experience; and knowledge and application of computer software for data processing and statistical analysis. Sixty-two (62/65; 95.4%) of the questionnaires were returned anonymously and used in the data analysis. Twenty-nine (29/62; 46.8%) respondents had 5-10 years of clinical experience, none of whom had completed the specialist training programme. Thirty-three (33/62; 53.2%) practitioners had more than 10 years of clinical experience, of whom 15 (15/33; 45.5%) were specialists, representing 24.2% (15/62) of all respondents. All 15 specialists were actively involved in research activities, and only five (5/15; 33.3%) could use statistical analysis software unaided. This study identified poor utilization of computer software for statistical analysis among dental researchers in eastern Nigeria, strongly associated with a lack of exposure to such software early enough, especially during undergraduate training. This calls for the introduction of a computer training programme into the dental curriculum to enable practitioners to develop the habit of using computer software in their research.

  4. Study designs, use of statistical tests, and statistical analysis software choice in 2015: Results from two Pakistani monthly Medline indexed journals.

    PubMed

    Shaikh, Masood Ali

    2017-09-01

    Assessment of research articles in terms of the study designs used, statistical tests applied and statistical analysis programmes employed helps determine the research activity profile and trends in a country. In this descriptive study, all original articles published by the Journal of Pakistan Medical Association (JPMA) and the Journal of the College of Physicians and Surgeons Pakistan (JCPSP) in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in 2015. The results indicate that the cross-sectional study design, bivariate inferential analysis entailing comparison between two variables/groups, and the software programme SPSS were the most common study design, inferential statistical analysis, and statistical analysis software, respectively. These results echo a previously published assessment of these two journals for the year 2014.

  5. Space station software reliability analysis based on failures observed during testing at the multisystem integration facility

    NASA Technical Reports Server (NTRS)

    Tamayo, Tak Chai

    1987-01-01

    The quality of software is vital not only to the successful operation of the space station; it is also an important factor in establishing testing requirements, the time needed for software verification and integration, and launch schedules for the space station. Defense of management decisions can be greatly strengthened by combining engineering judgment with statistical analysis. Unlike hardware, software exhibits no wearout and redundancy is costly, making traditional statistical analysis unsuitable for evaluating software reliability. A statistical model was developed to represent the number and types of failures that occur during software testing and verification. From this model, quantitative measures of software reliability based on failure history during testing are derived. Criteria for terminating testing based on reliability objectives and methods for estimating the expected number of fixes required are also presented.

  6. [Application of Stata software to test heterogeneity in meta-analysis method].

    PubMed

    Wang, Dan; Mou, Zhen-yun; Zhai, Jun-xia; Zong, Hong-xia; Zhao, Xiao-dong

    2008-07-01

    To introduce the application of Stata software to heterogeneity testing in meta-analysis. A data set was constructed from the example in the study, and the corresponding Stata 9 commands for each method were applied to it. The methods used were the Q-test and I2 statistic attached to the fixed-effect forest plot, the H statistic and the Galbraith plot. The existence of heterogeneity among studies could be detected with the Q-test and the H statistic, and its degree with the I2 statistic. The outliers that were the sources of heterogeneity could be spotted in the Galbraith plot. Heterogeneity testing in meta-analysis can be completed simply and quickly with these four methods in Stata. Among them, the H and I2 statistics are more robust, and the outliers driving heterogeneity are most clearly seen in the Galbraith plot.
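
    The quantities discussed in this record follow directly from the inverse-variance weights of a fixed-effect meta-analysis. The sketch below computes Cochran's Q, the I2 statistic and the H statistic in Python rather than Stata; the effect sizes and standard errors are invented for illustration.

    ```python
    # Heterogeneity statistics for a fixed-effect meta-analysis: Cochran's Q,
    # the I2 statistic and the H statistic, from per-study effect estimates
    # and standard errors (illustrative values, not taken from the paper).
    import numpy as np
    from scipy.stats import chi2

    effects = np.array([0.42, 0.30, 0.55, 0.18, 0.61])   # per-study effect estimates
    se = np.array([0.12, 0.15, 0.10, 0.20, 0.14])        # per-study standard errors

    w = 1.0 / se**2                            # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)   # fixed-effect pooled estimate
    Q = np.sum(w * (effects - pooled) ** 2)    # Cochran's Q
    df = len(effects) - 1
    p_value = chi2.sf(Q, df)                   # Q-test against chi-square(df)
    I2 = max(0.0, (Q - df) / Q) * 100.0        # % of variation due to heterogeneity
    H = np.sqrt(Q / df)                        # H > 1 suggests heterogeneity

    print(f"Q = {Q:.2f} (p = {p_value:.3f}), I2 = {I2:.1f}%, H = {H:.2f}")
    ```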

  7. GWAMA: software for genome-wide association meta-analysis.

    PubMed

    Mägi, Reedik; Morris, Andrew P

    2010-05-28

    Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way of improving the power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill-equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error-trapping facilities and provides a range of meta-analysis summary statistics. It is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. The software, with source files, documentation and example data files, is freely available online at http://www.well.ox.ac.uk/GWAMA.
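
    The core calculation such a tool repeats for every variant is an inverse-variance-weighted (fixed-effect) combination of the per-study effect estimates. The sketch below is a generic illustration of that step in Python, not GWAMA itself; the column names and numbers are made up.

    ```python
    # Fixed-effect, inverse-variance meta-analysis of per-SNP results from
    # several studies, the basic per-variant operation of a GWAS meta-analysis.
    import numpy as np
    import pandas as pd
    from scipy.stats import norm

    def fixed_effect_meta(betas, ses):
        """Pool per-study effect sizes (betas) given their standard errors (ses)."""
        w = 1.0 / np.asarray(ses) ** 2
        beta = np.sum(w * np.asarray(betas)) / np.sum(w)
        se = np.sqrt(1.0 / np.sum(w))
        z = beta / se
        return beta, se, z, 2 * norm.sf(abs(z))

    # Three hypothetical studies reporting beta and SE for the same SNP
    snp = pd.DataFrame({"beta": [0.031, 0.044, 0.025], "se": [0.012, 0.015, 0.010]})
    beta, se, z, p = fixed_effect_meta(snp["beta"], snp["se"])
    print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
    ```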

  8. Computer and Software Use in Teaching the Beginning Statistics Course.

    ERIC Educational Resources Information Center

    Bartz, Albert E.; Sabolik, Marisa A.

    2001-01-01

    Surveys the extent of computer usage in the beginning statistics course and the variety of statistics software used. Finds that 69% of the psychology departments used computers in beginning statistics courses and 90% used computer-assisted data analysis in statistics or other courses. (CMK)

  9. BrightStat.com: free statistics online.

    PubMed

    Stricker, Daniel

    2008-10-01

    Powerful software for statistical analysis is expensive. Here I present BrightStat, statistical software that runs on the Internet and is free of charge. BrightStat's goals and its main capabilities and functionalities are outlined. Three sample runs (a Friedman test, a chi-square test, and a stepwise multiple regression) are presented. The results obtained by BrightStat are compared with results computed by SPSS, one of the global leaders in statistical software, and VassarStats, a collection of scripts for data analysis running on the Internet. Elementary statistics is an inherent part of academic education, and BrightStat is an alternative to commercial products.

  10. Statistical software applications used in health services research: analysis of published studies in the U.S

    PubMed Central

    2011-01-01

    Background This study aims to identify the statistical software applications most commonly employed for data analysis in health services research (HSR) studies in the U.S. The study also examines the extent to which information describing the specific analytical software utilized is provided in published articles reporting on HSR studies. Methods Data were extracted from a sample of 1,139 articles (including 877 original research articles) published between 2007 and 2009 in three U.S. HSR journals considered representative of the field based upon a set of selection criteria. Descriptive analyses were conducted to categorize patterns in statistical software usage in those articles. The data were stratified by calendar year to detect trends in software use over time. Results Only 61.0% of original research articles in prominent U.S. HSR journals identified the particular type of statistical software application used for data analysis. Stata and SAS were overwhelmingly the most commonly used applications (46.0% and 42.6% of articles, respectively). However, SAS use grew considerably during the study period compared to other applications. Stratification of the data revealed that the type of statistical software used varied considerably by whether authors were from the U.S. or from other countries. Conclusions The findings highlight a need for HSR investigators to identify more consistently the specific analytical software used in their studies. Knowing that information is important because different software packages might produce varying results, owing to differences in their underlying estimation methods. PMID:21977990

  11. Data Processing System (DPS) software with experimental design, statistical analysis and data mining developed for use in entomological research.

    PubMed

    Tang, Qi-Yi; Zhang, Chuan-Xi

    2013-04-01

    A comprehensive but simple-to-use software package called DPS (Data Processing System) has been developed to execute a range of standard numerical analyses and operations used in experimental design, statistics and data mining. This program runs on standard Windows computers. Many of the functions are specific to entomological and other biological research and are not found in standard statistical software. This paper presents applications of DPS to experimental design, statistical analysis and data mining in entomology. © 2012 The Authors Insect Science © 2012 Institute of Zoology, Chinese Academy of Sciences.

  12. INTERFACING SAS TO ORACLE IN THE UNIX ENVIRONMENT

    EPA Science Inventory

    SAS is an EPA standard data and statistical analysis software package while ORACLE is EPA's standard database management system software package. ORACLE has the advantage over SAS in data retrieval and storage capabilities but has limited data and statistical analysis capability...

  13. Statistical Energy Analysis (SEA) and Energy Finite Element Analysis (EFEA) Predictions for a Floor-Equipped Composite Cylinder

    NASA Technical Reports Server (NTRS)

    Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.

    2011-01-01

    Comet Enflow is a commercially available, high-frequency vibroacoustic analysis software package founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). EFEA was validated on a floor-equipped composite cylinder by comparing its vibroacoustic response predictions with Statistical Energy Analysis (SEA) predictions and experimental results. The SEA predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.

  14. The Role of Data Analysis Software in Graduate Programs in Education and Post-Graduate Research

    ERIC Educational Resources Information Center

    Harwell, Michael

    2018-01-01

    The importance of data analysis software in graduate programs in education and post-graduate educational research is self-evident. However, the role of this software in facilitating supererogated statistical practice versus "cookbookery" is unclear. The need to rigorously document the role of data analysis software in students' graduate…

  15. Analyzing a Mature Software Inspection Process Using Statistical Process Control (SPC)

    NASA Technical Reports Server (NTRS)

    Barnard, Julie; Carleton, Anita; Stamper, Darrell E. (Technical Monitor)

    1999-01-01

    This paper presents a cooperative effort in which the Software Engineering Institute and the Space Shuttle Onboard Software Project experimented with applying Statistical Process Control (SPC) analysis to inspection activities. The topics include: 1) SPC Collaboration Overview; 2) SPC Collaboration Approach and Results; and 3) Lessons Learned.

  16. A comparison of InVivoStat with other statistical software packages for analysis of data generated from animal experiments.

    PubMed

    Clark, Robin A; Shoaib, Mohammed; Hewitt, Katherine N; Stanford, S Clare; Bate, Simon T

    2012-08-01

    InVivoStat is a free-to-use statistical software package for analysis of data generated from animal experiments. The package is designed specifically for researchers in the behavioural sciences, where exploiting the experimental design is crucial for reliable statistical analyses. This paper compares the analysis of three experiments conducted using InVivoStat with other widely used statistical packages: SPSS (V19), PRISM (V5), UniStat (V5.6) and Statistica (V9). We show that InVivoStat provides results that are similar to those from the other packages and, in some cases, are more advanced. This investigation provides evidence of further validation of InVivoStat and should strengthen users' confidence in this new software package.

  17. New software for statistical analysis of Cambridge Structural Database data

    PubMed Central

    Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.

    2011-01-01

    A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784
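
    Of the analysis features listed here, principal components analysis is the easiest to illustrate outside the Mercury framework. The sketch below runs a generic PCA on a table of geometric descriptors using NumPy; the data are random stand-ins, not CSD geometry.

    ```python
    # Generic principal components analysis of a fragments-by-descriptors table
    # (for example, bond lengths and angles retrieved for a search fragment).
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))              # 200 fragments x 4 descriptors (synthetic)

    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each descriptor
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per component
    scores = Xs @ Vt.T                         # fragment coordinates in PC space

    print("variance explained:", np.round(explained, 3))
    ```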

  18. Report: Scientific Software.

    ERIC Educational Resources Information Center

    Borman, Stuart A.

    1985-01-01

    Discusses various aspects of scientific software, including evaluation and selection of commercial software products; program exchanges, catalogs, and other information sources; major data analysis packages; statistics and chemometrics software; and artificial intelligence. (JN)

  19. Software for the Integration of Multiomics Experiments in Bioconductor.

    PubMed

    Ramos, Marcel; Schiffer, Lucas; Re, Angela; Azhar, Rimsha; Basunia, Azfar; Rodriguez, Carmen; Chan, Tiffany; Chapman, Phil; Davis, Sean R; Gomez-Cabrero, David; Culhane, Aedin C; Haibe-Kains, Benjamin; Hansen, Kasper D; Kodali, Hanish; Louis, Marie S; Mer, Arvind S; Riester, Markus; Morgan, Martin; Carey, Vince; Waldron, Levi

    2017-11-01

    Multiomics experiments are increasingly commonplace in biomedical research and add layers of complexity to experimental design, data integration, and analysis. R and Bioconductor provide a generic framework for statistical analysis and visualization, as well as specialized data classes for a variety of high-throughput data types, but methods are lacking for integrative analysis of multiomics experiments. The MultiAssayExperiment software package, implemented in R and leveraging Bioconductor software and design principles, provides for the coordinated representation of, storage of, and operation on multiple diverse genomics data. We provide the unrestricted multiple 'omics data for each cancer tissue in The Cancer Genome Atlas as ready-to-analyze MultiAssayExperiment objects and demonstrate in these and other datasets how the software simplifies data representation, statistical analysis, and visualization. The MultiAssayExperiment Bioconductor package reduces major obstacles to efficient, scalable, and reproducible statistical analysis of multiomics data and enhances data science applications of multiple omics datasets. Cancer Res; 77(21); e39-42. ©2017 AACR . ©2017 American Association for Cancer Research.

  20. Integrating Statistical Visualization Research into the Political Science Classroom

    ERIC Educational Resources Information Center

    Draper, Geoffrey M.; Liu, Baodong; Riesenfeld, Richard F.

    2011-01-01

    The use of computer software to facilitate learning in political science courses is well established. However, the statistical software packages used in many political science courses can be difficult to use and counter-intuitive. We describe the results of a preliminary user study suggesting that visually-oriented analysis software can help…

  1. Data Analysis and Graphing in an Introductory Physics Laboratory: Spreadsheet versus Statistics Suite

    ERIC Educational Resources Information Center

    Peterlin, Primoz

    2010-01-01

    Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared by analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…

  2. The analysis of the statistical and historical information gathered during the development of the Shuttle Orbiter Primary Flight Software

    NASA Technical Reports Server (NTRS)

    Simmons, D. B.; Marchbanks, M. P., Jr.; Quick, M. J.

    1982-01-01

    The results of an effort to thoroughly and objectively analyze the statistical and historical information gathered during the development of the Shuttle Orbiter Primary Flight Software are given. The particular areas of interest include the cost of the software, its reliability, its requirements, and how those requirements changed during development of the system. Data related to the current version of the software system produced some interesting results. Suggestions are made for saving additional data that will allow further investigation.

  3. Analysis of High-Throughput ELISA Microarray Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Amanda M.; Daly, Don S.; Zangar, Richard C.

    Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).

  4. Statistical Software and Artificial Intelligence: A Watershed in Applications Programming.

    ERIC Educational Resources Information Center

    Pickett, John C.

    1984-01-01

    AUTOBJ and AUTOBOX are revolutionary software programs which contain the first application of artificial intelligence to statistical procedures used in analysis of time series data. The artificial intelligence included in the programs and program features are discussed. (JN)

  5. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.

    1992-01-01

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  6. Demonstration of a software design and statistical analysis methodology with application to patient outcomes data sets

    PubMed Central

    Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard

    2013-01-01

    Purpose: With the emergence of clinical outcomes databases as tools utilized routinely within institutions comes the need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. Methods: A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code were evaluated using benchmark data sets. Results: The approach provides the data needed to evaluate combinations of statistical measurements for their ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operating characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. Conclusions: The work demonstrates the viability of the design approach and the software tool for analysis of large data sets. PMID:24320426

  7. Demonstration of a software design and statistical analysis methodology with application to patient outcomes data sets.

    PubMed

    Mayo, Charles; Conners, Steve; Warren, Christopher; Miller, Robert; Court, Laurence; Popple, Richard

    2013-11-01

    With the emergence of clinical outcomes databases as tools utilized routinely within institutions comes the need for software tools to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that addresses both issues. A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics, by combining use of C#.Net and R code. The accuracy and speed of the code were evaluated using benchmark data sets. The approach provides the data needed to evaluate combinations of statistical measurements for their ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operating characteristic curves to identify a threshold value and combines use of contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set to identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation. The work demonstrates the viability of the design approach and the software tool for analysis of large data sets.
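
    The statistical filter described in these two records can be sketched with standard SciPy tests. The snippet below illustrates that screening step for one candidate dosimetric variable; it is not the authors' C#/R application, and the data, variable names and threshold rule (a median stand-in for the ROC-derived cut point) are invented.

    ```python
    # Screen one dosimetric variable for association with a binary outcome using
    # a Welch t-test, a Kolmogorov-Smirnov test, and Fisher's exact test on a
    # dichotomized version of the variable.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    outcome = rng.integers(0, 2, size=120).astype(bool)       # e.g. toxicity yes/no
    dose_metric = rng.normal(20, 5, size=120) + 3 * outcome   # e.g. mean organ dose

    with_evt, without_evt = dose_metric[outcome], dose_metric[~outcome]
    t_stat, p_welch = stats.ttest_ind(with_evt, without_evt, equal_var=False)
    ks_stat, p_ks = stats.ks_2samp(with_evt, without_evt)

    threshold = np.median(dose_metric)        # stand-in for an ROC-derived cut point
    high = dose_metric >= threshold
    table = [[np.sum(high & outcome), np.sum(high & ~outcome)],
             [np.sum(~high & outcome), np.sum(~high & ~outcome)]]
    odds_ratio, p_fisher = stats.fisher_exact(table)

    print(f"Welch p = {p_welch:.3g}, KS p = {p_ks:.3g}, Fisher p = {p_fisher:.3g}")
    ```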

  8. CRAX. Cassandra Exoskeleton

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robinson, D.G.; Eubanks, L.

    1998-03-01

    This software assists the engineering designer in characterizing the statistical uncertainty in the performance of complex systems as a result of variations in manufacturing processes, material properties, system geometry or operating environment. The software is composed of a graphical user interface that provides the user with easy access to Cassandra uncertainty analysis routines. Together, this interface and the Cassandra routines are referred to as CRAX (CassandRA eXoskeleton). The software is flexible enough that, with minor modification, it can interface with large modeling and analysis codes such as heat transfer or finite element analysis software. The current version permits the user to manually input a performance function, the number of random variables and their associated statistical characteristics: density function, mean, and coefficients of variation. Additional uncertainty analysis modules are continuously being added to the Cassandra core.

  9. Cassandra Exoskeleton

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robinson, David G.

    1999-02-20

    This software assists the engineering designer in characterizing the statistical uncertainty in the performance of complex systems as a result of variations in manufacturing processes, material properties, system geometry or operating environment. The software is composed of a graphical user interface that provides the user with easy access to Cassandra uncertainty analysis routines. Together, this interface and the Cassandra routines are referred to as CRAX (CassandRA eXoskeleton). The software is flexible enough that, with minor modification, it can interface with large modeling and analysis codes such as heat transfer or finite element analysis software. The current version permits the user to manually input a performance function, the number of random variables and their associated statistical characteristics: density function, mean, and coefficients of variation. Additional uncertainty analysis modules are continuously being added to the Cassandra core.

  10. Software for Statistical Analysis of Weibull Distributions with Application to Gear Fatigue Data: User Manual with Verification

    NASA Technical Reports Server (NTRS)

    Krantz, Timothy L.

    2002-01-01

    The Weibull distribution has been widely adopted for the statistical description and inference of fatigue data. This document provides user instructions, examples, and verification for software to analyze gear fatigue test data. The software was developed presuming the data are adequately modeled using a two-parameter Weibull distribution. The calculations are based on likelihood methods, and the approach taken is valid for data that include type I censoring. The software was verified by reproducing results published by others.

  11. Software for Statistical Analysis of Weibull Distributions with Application to Gear Fatigue Data: User Manual with Verification

    NASA Technical Reports Server (NTRS)

    Krantz, Timothy L.

    2002-01-01

    The Weibull distribution has been widely adopted for the statistical description and inference of fatigue data. This document provides user instructions, examples, and verification for software to analyze gear fatigue test data. The software was developed presuming the data are adequately modeled using a two-parameter Weibull distribution. The calculations are based on likelihood methods, and the approach taken is valid for data that include type I censoring. The software was verified by reproducing results published by others.
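
    The likelihood described in these two records is straightforward to write down: failed specimens contribute the Weibull density and suspended (type I censored) specimens contribute the survival function. The sketch below fits a two-parameter Weibull by maximum likelihood in Python; it is a generic illustration with invented data, not the NASA program itself.

    ```python
    # Two-parameter Weibull MLE for fatigue lives with right (type I) censoring.
    import numpy as np
    from scipy.optimize import minimize

    lives = np.array([1.2e6, 2.3e6, 3.1e6, 4.4e6, 5.0e6, 5.0e6])  # cycles
    failed = np.array([True, True, True, True, False, False])     # False = suspended

    def neg_log_lik(params):
        log_beta, log_eta = params            # log scale keeps both parameters > 0
        beta, eta = np.exp(log_beta), np.exp(log_eta)
        z = (lives / eta) ** beta
        ll_fail = np.log(beta) - beta * np.log(eta) + (beta - 1) * np.log(lives) - z
        ll_cens = -z                          # log of the survival function
        return -np.sum(np.where(failed, ll_fail, ll_cens))

    res = minimize(neg_log_lik, x0=[0.0, np.log(lives.mean())], method="Nelder-Mead")
    beta_hat, eta_hat = np.exp(res.x)
    print(f"shape (beta) = {beta_hat:.2f}, scale (eta) = {eta_hat:.3g} cycles")
    ```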

  12. RAD-ADAPT: Software for modelling clonogenic assay data in radiation biology.

    PubMed

    Zhang, Yaping; Hu, Kaiqiang; Beumer, Jan H; Bakkenist, Christopher J; D'Argenio, David Z

    2017-04-01

    We present a comprehensive software program, RAD-ADAPT, for the quantitative analysis of clonogenic assays in radiation biology. Two commonly used models for clonogenic assay analysis, the linear-quadratic model and the single-hit multi-target model, are included in the software. RAD-ADAPT uses maximum likelihood estimation to obtain parameter estimates, under the assumption that cell colony count data follow a Poisson distribution. The program has an intuitive interface, generates model prediction plots, tabulates model parameter estimates, and allows automatic statistical comparison of parameters between different groups. The RAD-ADAPT interface is written using the statistical software R, and the underlying computations are accomplished by the ADAPT software system for pharmacokinetic/pharmacodynamic systems analysis. The use of RAD-ADAPT is demonstrated using an example that examines the impact of pharmacologic ATM and ATR kinase inhibition on the human lung cancer cell line A549 after ionizing radiation. Copyright © 2017 Elsevier B.V. All rights reserved.
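
    The modelling approach summarized above (a linear-quadratic survival curve fitted to colony counts by Poisson maximum likelihood) can be illustrated in a few lines. The sketch below is written in Python rather than the R/ADAPT stack the program uses, and the doses, cell numbers and counts are invented.

    ```python
    # Fit the linear-quadratic model S(D) = exp(-alpha*D - beta*D^2) to colony
    # counts by Poisson maximum likelihood (expected colonies = plated * PE * S).
    import numpy as np
    from scipy.optimize import minimize

    dose = np.array([0, 0, 2, 2, 4, 4, 6, 6], dtype=float)                 # Gy
    plated = np.array([200, 200, 400, 400, 1000, 1000, 4000, 4000], dtype=float)
    colonies = np.array([110, 98, 130, 121, 140, 155, 120, 131], dtype=float)

    def neg_log_lik(params):
        log_pe, alpha, beta = params
        pe = np.exp(log_pe)                                     # plating efficiency
        mu = plated * pe * np.exp(-alpha * dose - beta * dose**2)
        return -np.sum(colonies * np.log(mu) - mu)              # Poisson log-likelihood

    res = minimize(neg_log_lik, x0=[np.log(0.5), 0.1, 0.01], method="Nelder-Mead")
    log_pe, alpha, beta = res.x
    print(f"PE = {np.exp(log_pe):.2f}, alpha = {alpha:.3f}/Gy, beta = {beta:.4f}/Gy^2")
    ```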

  13. Quality assurance software inspections at NASA Ames: Metrics for feedback and modification

    NASA Technical Reports Server (NTRS)

    Wenneson, G.

    1985-01-01

    Software inspections, a set of formal technical review procedures held at selected key points during software development in order to find defects in software documents, are described in terms of history, participants, tools, procedures, statistics, and database analysis.

  14. OSPAR standard method and software for statistical analysis of beach litter data.

    PubMed

    Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit

    2017-09-15

    The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
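
    The trend tests named in this record are available in SciPy (or can be assembled from it): Kendall's tau underlies the Mann-Kendall test, and Theil-Sen slopes are provided directly. The sketch below applies both to an invented yearly litter series; it illustrates the statistics, not the Litter Analyst implementation.

    ```python
    # Monotonic trend test (via Kendall's tau) and Theil-Sen slope estimate for
    # a yearly beach-litter series.
    import numpy as np
    from scipy import stats

    years = np.arange(2009, 2015)
    counts = np.array([310, 290, 260, 275, 230, 215])   # items per survey (invented)

    tau, p_value = stats.kendalltau(years, counts)
    slope, intercept, lo, hi = stats.theilslopes(counts, years, alpha=0.95)

    print(f"Kendall tau = {tau:.2f} (p = {p_value:.3f})")
    print(f"Theil-Sen slope = {slope:.1f} items/yr (95% CI {lo:.1f} to {hi:.1f})")
    ```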

  15. CADDIS Volume 4. Data Analysis: Download Software

    EPA Pesticide Factsheets

    Overview of the data analysis tools available for download on CADDIS. Provides instructions for downloading and installing CADStat, access to a Microsoft Excel macro for computing SSDs, and a brief overview of command-line use of R, a statistical software environment.

  16. Calypso: a user-friendly web-server for mining and visualizing microbiome-environment interactions.

    PubMed

    Zakrzewski, Martha; Proietti, Carla; Ellis, Jonathan J; Hasan, Shihab; Brion, Marie-Jo; Berger, Bernard; Krause, Lutz

    2017-03-01

    Calypso is an easy-to-use online software suite that allows non-expert users to mine, interpret and compare taxonomic information from metagenomic or 16S rDNA datasets. Calypso has a focus on multivariate statistical approaches that can identify complex environment-microbiome associations. The software enables quantitative visualizations, statistical testing, multivariate analysis, supervised learning, factor analysis, multivariable regression, network analysis and diversity estimates. Comprehensive help pages, tutorials and videos are provided via a wiki page. The web-interface is accessible via http://cgenome.net/calypso/ . The software is programmed in Java, PERL and R and the source code is available from Zenodo ( https://zenodo.org/record/50931 ). The software is freely available for non-commercial users. l.krause@uq.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  17. Evaluation of Solid Rocket Motor Component Data Using a Commercially Available Statistical Software Package

    NASA Technical Reports Server (NTRS)

    Stefanski, Philip L.

    2015-01-01

    Commercially available software packages today allow users to quickly perform the routine evaluations of (1) descriptive statistics to numerically and graphically summarize both sample and population data, (2) inferential statistics that draw conclusions about a given population from samples taken of it, (3) probability determinations that can be used to generate estimates of reliability allowables, and finally (4) the setup of designed experiments and analysis of their data to identify significant material and process characteristics for application in both product manufacturing and performance enhancement. This paper presents examples of analysis and experimental design work that has been conducted using Statgraphics® statistical software to obtain useful information with regard to solid rocket motor propellants and internal insulation material. Data were obtained from a number of programs (Shuttle, Constellation, and Space Launch System) and sources that include solid propellant burn rate strands, tensile specimens, sub-scale test motors, full-scale operational motors, rubber insulation specimens, and sub-scale rubber insulation analog samples. Besides facilitating the experimental design process to yield meaningful results, statistical software has demonstrated its ability to quickly perform complex data analyses and yield significant findings that might otherwise have gone unnoticed. One caveat to these successes is that useful results derive not only from the inherent power of the software package, but also from the skill and understanding of the data analyst.

  18. Meta-Analysis in Stata Using Gllamm

    ERIC Educational Resources Information Center

    Bagos, Pantelis G.

    2015-01-01

    There are several user-written programs for performing meta-analysis in Stata (Stata Statistical Software: College Station, TX: Stata Corp LP). These include metan, metareg, mvmeta, and glst. However, there are several cases for which these programs do not suffice. For instance, there is no software for performing univariate meta-analysis with…

  19. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical functionality of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application, by point-and-click access, of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  20. The Missing Link: The Use of Link Words and Phrases as a Link to Manuscript Quality

    ERIC Educational Resources Information Center

    Onwuegbuzie, Anthony J.

    2016-01-01

    In this article, I provide a typology of transition words/phrases. This typology comprises 12 dimensions of link words/phrases that capture 277 link words/phrases. Using QDA Miner, WordStat, and SPSS--a computer-assisted mixed methods data analysis software, content analysis software, and statistical software, respectively--I analyzed 74…

  1. Compression Algorithm Analysis of In-Situ (S)TEM Video: Towards Automatic Event Detection and Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Teuton, Jeremy R.; Griswold, Richard L.; Mehdi, Beata L.

    Precise analysis of both (S)TEM images and video is a time- and labor-intensive process. As an example, determining when crystal growth and shrinkage occur during the dynamic process of Li dendrite deposition and stripping involves manually scanning through each frame in the video to extract a specific set of frames/images. For large numbers of images, this process can be very time-consuming, so a fast and accurate automated method is desirable. Given this need, we developed software that uses analysis of video compression statistics for detecting and characterizing events in large data sets. This software works by converting the data into a series of images, which it compresses into an MPEG-2 video using the open-source "avconv" utility [1]. The software does not use the video itself, but rather analyzes the video statistics from the first pass of the video encoding that avconv records in the log file. This file contains statistics for each frame of the video, including the frame quality, intra-texture and predicted-texture bits, and forward and backward motion vector resolution, among others. In all, avconv records 15 statistics for each frame. By combining different statistics, we have been able to detect events in various types of data. We have developed an interactive tool for exploring the data and the statistics that aids the analyst in selecting useful statistics for each analysis. Going forward, an algorithm for detecting and possibly describing events automatically can be written based on statistic(s) for each data type.
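
    One simple way to turn a per-frame encoder statistic into candidate events, in the spirit of the automated detection the authors propose, is to flag frames that deviate strongly from a rolling baseline. The sketch below assumes the log statistics have already been parsed into an array; the data, window length and threshold are invented, and this is not the authors' algorithm.

    ```python
    # Flag frames whose statistic departs sharply from a trailing rolling median.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    frame_bits = pd.Series(np.concatenate([rng.normal(1000, 30, 490),   # background
                                           rng.normal(1600, 30, 10)]))  # "event" frames

    baseline = frame_bits.rolling(window=50, min_periods=10).median()
    resid = frame_bits - baseline
    scale = 1.4826 * resid.abs().median()          # robust (MAD-based) scale estimate

    event_frames = resid.index[resid.abs() > 5 * scale].tolist()
    print(event_frames[:10])
    ```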

  2. AN EVALUATION OF FIVE COMMERCIAL IMMUNOASSAY DATA ANALYSIS SOFTWARE SYSTEMS

    EPA Science Inventory

    An evaluation of five commercial software systems used for immunoassay data analysis revealed numerous deficiencies. Often, the utility of statistical output was compromised by poor documentation. Several data sets were run through each system using a four-parameter calibration f...

  3. Acoustic Emission Analysis Applet (AEAA) Software

    NASA Technical Reports Server (NTRS)

    Nichols, Charles T.; Roth, Don J.

    2013-01-01

    NASA Glenn Research Center and NASA White Sands Test Facility have developed software supporting an automated pressure vessel structural health monitoring (SHM) system based on acoustic emissions (AE). The software, referred to as the Acoustic Emission Analysis Applet (AEAA), provides analysts with a tool that can interrogate data collected with Digital Wave Corp. and Physical Acoustics Corp. software using a wide spectrum of powerful filters and charts. This software can be made to work with any data once the data format is known. The applet will compute basic AE statistics, and statistics as a function of time and pressure. AEAA provides value added beyond the analysis provided by the respective vendors' analysis software. The software can handle data sets of unlimited size. A wide variety of government and commercial applications could benefit from this technology, notably requalification and usage tests for compressed gas and hydrogen-fueled vehicles. Future enhancements will add features similar to a "check engine" light on a vehicle. Once installed, the system will ultimately be used to alert International Space Station crewmembers to critical structural instabilities, but will otherwise have little impact on missions. Diagnostic information could then be transmitted to experienced technicians on the ground in a timely manner to determine whether pressure vessels have been impacted, are structurally unsound, or can be safely used to complete the mission.

  4. Software Reliability, Measurement, and Testing. Volume 2. Guidebook for Software Reliability Measurement and Testing

    DTIC Science & Technology

    1992-04-01

    contractor's existing data collection, analysis and corrective action system shall be utilized, with modification only as necessary to meet the... either from test or from analysis of field data. The procedures of MIL-STD-756B assume that the reliability of a... to generate sufficient data to report a statistically valid reliability figure for a class of software. Casual data gathering accumulates data more

  5. Orchestrating high-throughput genomic analysis with Bioconductor

    PubMed Central

    Huber, Wolfgang; Carey, Vincent J.; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S.; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D.; Irizarry, Rafael A.; Lawrence, Michael; Love, Michael I.; MacDonald, James; Obenchain, Valerie; Oleś, Andrzej K.; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K.; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin

    2015-01-01

    Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors. PMID:25633503

  6. Contrast Analysis: A Tutorial

    ERIC Educational Resources Information Center

    Haans, Antal

    2018-01-01

    Contrast analysis is a relatively simple but effective statistical method for testing theoretical predictions about differences between group means against the empirical data. Despite its advantages, contrast analysis is hardly used to date, perhaps because it is not implemented in a convenient manner in many statistical software packages. This…
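
    As a concrete illustration of what a planned contrast involves, the sketch below tests whether one group's mean differs from the average of two others using weights that sum to zero, computed directly in Python rather than through a statistics package. The group data are invented.

    ```python
    # Planned contrast: psi = sum(c_i * mean_i), SE = sqrt(MSE * sum(c_i^2 / n_i)),
    # tested against t with N - k degrees of freedom.
    import numpy as np
    from scipy import stats

    groups = [np.array([5.1, 4.8, 5.5, 5.0]),    # condition A
              np.array([4.2, 4.0, 4.5, 4.1]),    # condition B
              np.array([4.3, 4.4, 3.9, 4.2])]    # control
    c = np.array([1.0, -0.5, -0.5])              # contrast weights (sum to zero)

    means = np.array([g.mean() for g in groups])
    ns = np.array([len(g) for g in groups])
    df_error = ns.sum() - len(groups)
    mse = sum(((g - g.mean()) ** 2).sum() for g in groups) / df_error

    psi = np.sum(c * means)
    se = np.sqrt(mse * np.sum(c**2 / ns))
    t = psi / se
    p = 2 * stats.t.sf(abs(t), df_error)
    print(f"psi = {psi:.2f}, t({df_error}) = {t:.2f}, p = {p:.4f}")
    ```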

  7. Not so Fast My Friend: The Rush to R and the Need for Rigorous Evaluation of Data Analysis and Software in Education

    ERIC Educational Resources Information Center

    Harwell, Michael

    2014-01-01

    Commercial data analysis software has been a fixture of quantitative analyses in education for more than three decades. Despite its apparent widespread use there is no formal evidence cataloging what software is used in educational research and educational statistics classes, by whom and for what purpose, and whether some programs should be…

  8. P-MartCancer–Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Webb-Robertson, Bobbie-Jo M.; Bramer, Lisa M.; Jensen, Jeffrey L.

    P-MartCancer is a new interactive web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without requiring direct interaction with the data or with statistical software. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access to multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium (CPTAC) at the peptide, gene and protein levels. P-MartCancer is deployed using Azure technologies (http://pmart.labworks.org/cptac.html), the web service is alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/) and many statistical functions can be used directly from an R package available on GitHub (https://github.com/pmartR).

  9. Computer-assisted qualitative data analysis software.

    PubMed

    Cope, Diane G

    2014-05-01

    Advances in technology have provided new approaches for data collection methods and analysis for researchers. Data collection is no longer limited to paper-and-pencil format, and numerous methods are now available through Internet and electronic resources. With these techniques, researchers are not burdened with entering data manually, and data analysis is facilitated by software programs. Quantitative research is supported by computer software that eases the management of large data sets and enables rapid analysis with numeric statistical methods. New technologies are emerging to support qualitative research with the availability of computer-assisted qualitative data analysis software (CAQDAS). CAQDAS will be presented with a discussion of advantages, limitations, controversial issues, and recommendations for this type of software use.

  10. HDBStat!: a platform-independent software suite for statistical analysis of high dimensional biology data.

    PubMed

    Trivedi, Prinal; Edwards, Jode W; Wang, Jelai; Gadbury, Gary L; Srinivasasainagendra, Vinodh; Zakharkin, Stanislav O; Kim, Kyoungmi; Mehta, Tapan; Brand, Jacob P L; Patki, Amit; Page, Grier P; Allison, David B

    2005-04-06

    Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for the analysis of high-dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists a flexible and easy-to-use interface to analyze complex microarray data using a variety of methods for data preprocessing, quality control analysis and hypothesis testing. Results generated from data preprocessing methods, quality control analysis and hypothesis testing methods are output in the form of Excel CSV tables, graphs and an HTML report summarizing the data analysis. HDBStat! is platform-independent software that is freely available to academic institutions and non-profit organizations. It can be downloaded from our website http://www.soph.uab.edu/ssg_content.asp?id=1164.

  11. SWToolbox: A surface-water tool-box for statistical analysis of streamflow time series

    USGS Publications Warehouse

    Kiang, Julie E.; Flynn, Kate; Zhai, Tong; Hummel, Paul; Granato, Gregory

    2018-03-07

    This report is a user guide for the low-flow analysis methods provided with version 1.0 of the Surface Water Toolbox (SWToolbox) computer program. The software combines functionality from two software programs—U.S. Geological Survey (USGS) SWSTAT and U.S. Environmental Protection Agency (EPA) DFLOW. Both of these programs have been used primarily for computation of critical low-flow statistics. The main analysis methods are the computation of hydrologic frequency statistics such as the 7-day minimum flow that occurs on average only once every 10 years (7Q10), computation of design flows including biologically based flows, and computation of flow-duration curves and duration hydrographs. Other annual, monthly, and seasonal statistics can also be computed. The interface facilitates retrieval of streamflow discharge data from the USGS National Water Information System and outputs text reports for a record of the analysis. Tools for graphing data and screening tests are available to assist the analyst in conducting the analysis.
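
    Of the statistics mentioned in this record, the 7Q10 is a convenient one to sketch: the annual minimum 7-day mean flow with a 10-year recurrence interval. The Python sketch below computes annual 7-day minima from a daily series and estimates the 7Q10 with a method-of-moments log-Pearson Type III fit; it is a simplified illustration with synthetic data, not the SWToolbox algorithm.

    ```python
    # Estimate the 7Q10 low-flow statistic from a daily discharge series.
    import numpy as np
    import pandas as pd
    from scipy import stats

    rng = np.random.default_rng(7)
    dates = pd.date_range("1990-01-01", "2019-12-31", freq="D")
    flow = pd.Series(np.exp(rng.normal(3.0, 0.6, len(dates))), index=dates)  # synthetic cfs

    seven_day = flow.rolling(7).mean()                       # 7-day moving average
    annual_min = seven_day.groupby(seven_day.index.year).min().dropna()

    log_min = np.log10(annual_min.values)
    g = stats.skew(log_min, bias=False)                      # sample skew of the logs
    q7_10 = 10 ** stats.pearson3.ppf(0.1, g, loc=log_min.mean(),
                                     scale=log_min.std(ddof=1))  # 10% non-exceedance
    print(f"estimated 7Q10 = {q7_10:.1f} cfs")
    ```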

  12. Leveraging Code Comments to Improve Software Reliability

    ERIC Educational Resources Information Center

    Tan, Lin

    2009-01-01

    Commenting source code has long been a common practice in software development. This thesis, consisting of three pieces of work, made novel use of the code comments written in natural language to improve software reliability. Our solution combines Natural Language Processing (NLP), Machine Learning, Statistics, and Program Analysis techniques to…

  13. Network Meta-Analysis Using R: A Review of Currently Available Automated Packages

    PubMed Central

    Neupane, Binod; Richer, Danielle; Bonner, Ashley Joel; Kibret, Taddele; Beyene, Joseph

    2014-01-01

    Network meta-analysis (NMA) – a statistical technique that allows comparison of multiple treatments in the same meta-analysis simultaneously – has become increasingly popular in the medical literature in recent years. The statistical methodology underpinning this technique and software tools for implementing the methods are evolving. Both commercial and freely available statistical software packages have been developed to facilitate the statistical computations using NMA with varying degrees of functionality and ease of use. This paper aims to introduce the reader to three R packages, namely, gemtc, pcnetmeta, and netmeta, which are freely available software tools implemented in R. Each automates the process of performing NMA so that users can perform the analysis with minimal computational effort. We present, compare and contrast the availability and functionality of different important features of NMA in these three packages so that clinical investigators and researchers can determine which R packages to implement depending on their analysis needs. Four summary tables detailing (i) data input and network plotting, (ii) modeling options, (iii) assumption checking and diagnostic testing, and (iv) inference and reporting tools, are provided, along with an analysis of a previously published dataset to illustrate the outputs available from each package. We demonstrate that each of the three packages provides a useful set of tools, and combined provide users with nearly all functionality that might be desired when conducting a NMA. PMID:25541687

  14. Network meta-analysis using R: a review of currently available automated packages.

    PubMed

    Neupane, Binod; Richer, Danielle; Bonner, Ashley Joel; Kibret, Taddele; Beyene, Joseph

    2014-01-01

    Network meta-analysis (NMA)--a statistical technique that allows comparison of multiple treatments in the same meta-analysis simultaneously--has become increasingly popular in the medical literature in recent years. The statistical methodology underpinning this technique and software tools for implementing the methods are evolving. Both commercial and freely available statistical software packages have been developed to facilitate the statistical computations using NMA with varying degrees of functionality and ease of use. This paper aims to introduce the reader to three R packages, namely, gemtc, pcnetmeta, and netmeta, which are freely available software tools implemented in R. Each automates the process of performing NMA so that users can perform the analysis with minimal computational effort. We present, compare and contrast the availability and functionality of different important features of NMA in these three packages so that clinical investigators and researchers can determine which R packages to implement depending on their analysis needs. Four summary tables detailing (i) data input and network plotting, (ii) modeling options, (iii) assumption checking and diagnostic testing, and (iv) inference and reporting tools, are provided, along with an analysis of a previously published dataset to illustrate the outputs available from each package. We demonstrate that each of the three packages provides a useful set of tools, and combined provide users with nearly all functionality that might be desired when conducting a NMA.
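
    These two records describe full network meta-analysis packages in R; as a minimal illustration of the simplest building block of NMA, the sketch below computes a Bucher adjusted indirect comparison of two treatments through a common comparator in Python. The effect sizes are invented log odds ratios, and this is not what gemtc, pcnetmeta or netmeta do internally.

    ```python
    # Bucher adjusted indirect comparison: A vs C estimated through comparator B.
    import numpy as np
    from scipy import stats

    d_AB, se_AB = -0.35, 0.12     # A vs B (log odds ratio and SE), invented
    d_CB, se_CB = -0.10, 0.15     # C vs B (log odds ratio and SE), invented

    d_AC = d_AB - d_CB                       # indirect A vs C estimate
    se_AC = np.sqrt(se_AB**2 + se_CB**2)     # variances add for independent sources
    z = d_AC / se_AC
    p = 2 * stats.norm.sf(abs(z))
    print(f"indirect log OR (A vs C) = {d_AC:.2f} +/- {se_AC:.2f}, p = {p:.3f}")
    ```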

  15. Integrated Assessment and Improvement of the Quality Assurance System for the Cosworth Casting Process

    NASA Astrophysics Data System (ADS)

    Yousif, Dilon

    The purpose of this study was to improve the Quality Assurance (QA) System at the Nemak Windsor Aluminum Plant (WAP). The project used the Six Sigma method based on Define, Measure, Analyze, Improve, and Control (DMAIC). Analysis of in-process melt at WAP was based on chemical, thermal, and mechanical testing. The control limits for the W319 Al Alloy were statistically recalculated using the composition measured under stable conditions. The "Chemistry Viewer" software was developed for statistical analysis of alloy composition. This software features the Silicon Equivalency (SiBQ) developed by the IRC. The Melt Sampling Device (MSD) was designed and evaluated at WAP to overcome traditional sampling limitations. The Thermal Analysis "Filters" software was developed for cooling curve analysis of the 3XX Al Alloy(s) using IRC techniques. The impact of low melting point impurities on the start of melting was evaluated using the Universal Metallurgical Simulator and Analyzer (UMSA).

  16. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. ?? 2005 Elsevier Ltd. All rights reserved.
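
    As a deliberately simplified illustration of regression on order statistics, the sketch below handles a single detection limit: the logs of the detected values are regressed on normal quantiles of their plotting positions, and the censored observations are imputed from that line before computing summary statistics. The full Helsel-Cohn procedure implemented in the R library described here handles multiple detection limits and is the method to use in practice; the data below are invented.

    ```python
    # Simplified ROS for one detection limit: regress log(detects) on normal
    # quantiles, impute the censored values, then summarize the combined sample.
    import numpy as np
    from scipy import stats

    detected = np.array([0.8, 1.1, 1.6, 2.3, 3.0, 4.7])   # concentrations above the DL
    n_cens = 4                                            # number of "<0.5" results
    dl = 0.5

    n = len(detected) + n_cens
    p_below = n_cens / n                                  # crude estimate of P(X < DL)

    # Plotting positions of the detected values within the full sample
    j = np.arange(1, len(detected) + 1)
    pp_detect = p_below + (1 - p_below) * j / (len(detected) + 1)
    slope, intercept, *_ = stats.linregress(stats.norm.ppf(pp_detect),
                                            np.log(np.sort(detected)))

    # Impute censored values at plotting positions spread below the detection limit
    i = np.arange(1, n_cens + 1)
    pp_cens = p_below * i / (n_cens + 1)
    imputed = np.exp(intercept + slope * stats.norm.ppf(pp_cens))

    combined = np.concatenate([imputed, detected])
    print(f"ROS mean = {combined.mean():.2f}, std = {combined.std(ddof=1):.2f}")
    ```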

  17. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique for identifying statistically significant differences in RNA abundance, for genes or arbitrary features, between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system that shares DGE statistical test results and lets users locate and identify genes within those results with a very low barrier to entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on user friendliness that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales to very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enables inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  18. Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shear, Trevor Allan

    Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.

  19. Experimental analysis of computer system dependability

    NASA Technical Reports Server (NTRS)

    Iyer, Ravishankar K.; Tang, Dong

    1993-01-01

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
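
    Of the statistical techniques surveyed in this record, importance sampling is the easiest to illustrate in isolation. The sketch below is a generic base-R example with illustrative numbers only (it is not taken from the reviewed studies): it estimates a small exceedance probability by sampling from a shifted proposal distribution and reweighting, which is the basic idea used to accelerate Monte Carlo simulation of rare failure events.

      # Importance sampling sketch: estimate p = P(X > 4) for X ~ N(0, 1).
      set.seed(1)
      n <- 1e4
      true_p <- pnorm(4, lower.tail = FALSE)

      # Crude Monte Carlo: almost no samples fall in the rare-event region.
      x_mc <- rnorm(n)
      p_mc <- mean(x_mc > 4)

      # Importance sampling: draw from a proposal centred on the rare region
      # and weight each sample by the likelihood ratio f(x) / g(x).
      x_is <- rnorm(n, mean = 4)
      w    <- dnorm(x_is) / dnorm(x_is, mean = 4)
      p_is <- mean((x_is > 4) * w)

      c(true = true_p, crude = p_mc, importance = p_is)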

  20. Using Business Analysis Software in a Business Intelligence Course

    ERIC Educational Resources Information Center

    Elizondo, Juan; Parzinger, Monica J.; Welch, Orion J.

    2011-01-01

    This paper presents an example of a project used in an undergraduate business intelligence class which integrates concepts from statistics, marketing, and information systems disciplines. SAS Enterprise Miner software is used as the foundation for predictive analysis and data mining. The course culminates with a competition and the project is used…

  1. LV software support for supersonic flow analysis

    NASA Technical Reports Server (NTRS)

    Bell, W. A.; Lepicovsky, J.

    1992-01-01

    The software for configuring an LV counter processor system has been developed using structured design. The LV system includes up to three counter processors and a rotary encoder. The software for configuring and testing the LV system has been developed, tested, and included in an overall software package for data acquisition, analysis, and reduction. Error handling routines respond to both operator and instrument errors which often arise in the course of measuring complex, high-speed flows. The use of networking capabilities greatly facilitates the software development process by allowing software development and testing from a remote site. In addition, high-speed transfers allow graphics files or commands to provide viewing of the data from a remote site. Further advances in data analysis require corresponding advances in procedures for statistical and time series analysis of nonuniformly sampled data.

  2. LV software support for supersonic flow analysis

    NASA Technical Reports Server (NTRS)

    Bell, William A.

    1992-01-01

    The software for configuring a Laser Velocimeter (LV) counter processor system was developed using structured design. The LV system includes up to three counter processors and a rotary encoder. The software for configuring and testing the LV system was developed, tested, and included in an overall software package for data acquisition, analysis, and reduction. Error handling routines respond to both operator and instrument errors which often arise in the course of measuring complex, high-speed flows. The use of networking capabilities greatly facilitates the software development process by allowing software development and testing from a remote site. In addition, high-speed transfers allow graphics files or commands to provide viewing of the data from a remote site. Further advances in data analysis require corresponding advances in procedures for statistical and time series analysis of nonuniformly sampled data.

  3. Software for Data Analysis with Graphical Models

    NASA Technical Reports Server (NTRS)

    Buntine, Wray L.; Roy, H. Scott

    1994-01-01

    Probabilistic graphical models are being used widely in artificial intelligence and statistics, for instance, in diagnosis and expert systems, as a framework for representing and reasoning with probabilities and independencies. They come with corresponding algorithms for performing statistical inference. This offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper illustrates the framework with an example and then presents some basic techniques for the task: problem decomposition and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.

  4. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis

    PubMed Central

    Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. PMID:26053876

  5. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis.

    PubMed

    Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.

  6. A Didactic Experience of Statistical Analysis for the Determination of Glycine in a Nonaqueous Medium Using ANOVA and a Computer Program

    ERIC Educational Resources Information Center

    Santos-Delgado, M. J.; Larrea-Tarruella, L.

    2004-01-01

    The back-titration methods are compared statistically to determine glycine in a nonaqueous medium of acetic acid. Important variations in the mean values of glycine are observed due to the interaction effects between the analysis of variance (ANOVA) technique and a statistical study carried out with a computer program.

  7. Object-oriented productivity metrics

    NASA Technical Reports Server (NTRS)

    Connell, John L.; Eller, Nancy

    1992-01-01

    Software productivity metrics are useful for sizing and costing proposed software and for measuring development productivity. Estimating and measuring source lines of code (SLOC) has proven to be a bad idea because it encourages writing more lines of code and using lower level languages. Function Point Analysis is an improved software metric system, but it is not compatible with newer rapid prototyping and object-oriented approaches to software development. A process is presented here for counting object-oriented effort points, based on a preliminary object-oriented analysis. It is proposed that this approach is compatible with object-oriented analysis, design, programming, and rapid prototyping. Statistics gathered on actual projects are presented to validate the approach.

  8. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis.

    PubMed

    Neyeloff, Jeruza L; Fuchs, Sandra C; Moreira, Leila B

    2012-01-20

    Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software.
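
    As a companion to the spreadsheet guide, the same inverse-variance calculations can be written out directly. The base-R sketch below uses made-up effect sizes and standard errors and pools them under a fixed-effect model and a DerSimonian-Laird random-effects model, the two options the spreadsheet implements.

      # Inverse-variance meta-analysis sketch (hypothetical study estimates).
      yi <- c(0.12, 0.25, 0.08, 0.30, 0.18)   # effect sizes (e.g., log risk ratios)
      se <- c(0.10, 0.12, 0.09, 0.15, 0.11)   # their standard errors
      vi <- se^2

      # Fixed-effect pooling
      w_fe   <- 1 / vi
      est_fe <- sum(w_fe * yi) / sum(w_fe)
      se_fe  <- sqrt(1 / sum(w_fe))

      # DerSimonian-Laird between-study variance, then random-effects pooling
      k    <- length(yi)
      Q    <- sum(w_fe * (yi - est_fe)^2)
      tau2 <- max(0, (Q - (k - 1)) / (sum(w_fe) - sum(w_fe^2) / sum(w_fe)))
      w_re   <- 1 / (vi + tau2)
      est_re <- sum(w_re * yi) / sum(w_re)
      se_re  <- sqrt(1 / sum(w_re))

      rbind(fixed  = c(estimate = est_fe, lower = est_fe - 1.96 * se_fe,
                       upper = est_fe + 1.96 * se_fe),
            random = c(estimate = est_re, lower = est_re - 1.96 * se_re,
                       upper = est_re + 1.96 * se_re))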

  9. Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis

    PubMed Central

    2012-01-01

    Background Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software packages have limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. Findings We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. Conclusions It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software. PMID:22264277

  10. Crux: Rapid Open Source Protein Tandem Mass Spectrometry Analysis

    PubMed Central

    2015-01-01

    Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. PMID:25182276

  11. How Statistics "Excel" Online.

    ERIC Educational Resources Information Center

    Chao, Faith; Davis, James

    2000-01-01

    Discusses the use of Microsoft Excel software and provides examples of its use in an online statistics course at Golden Gate University in the areas of randomness and probability, sampling distributions, confidence intervals, and regression analysis. (LRW)

  12. SIMREL: Software for Coefficient Alpha and Its Confidence Intervals with Monte Carlo Studies

    ERIC Educational Resources Information Center

    Yurdugul, Halil

    2009-01-01

    This article describes SIMREL, a software program designed for the simulation of alpha coefficients and the estimation of their confidence intervals. SIMREL offers two alternatives. In the first, if SIMREL is run for a single data file, it performs descriptive statistics, principal components analysis, and variance analysis of the item scores…
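
    For readers without access to SIMREL, the alpha coefficient it simulates can be computed directly from an item-score matrix, and a bootstrap gives one simple (not the only) way to attach a confidence interval. The base-R sketch below uses randomly generated item scores purely for illustration.

      # Cronbach's coefficient alpha with a percentile bootstrap confidence interval.
      set.seed(42)
      n_person <- 200; n_item <- 8
      # Hypothetical item scores: a common factor plus item-specific noise.
      common <- rnorm(n_person)
      items  <- sapply(1:n_item, function(i) common + rnorm(n_person, sd = 1.2))

      alpha <- function(x) {
        k <- ncol(x)
        k / (k - 1) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
      }

      alpha(items)   # point estimate

      # Bootstrap CI: resample persons (rows) with replacement.
      boot_alpha <- replicate(2000, alpha(items[sample(n_person, replace = TRUE), ]))
      quantile(boot_alpha, c(0.025, 0.975))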

  13. Mathematical and Statistical Software Index.

    DTIC Science & Technology

    1986-08-01

    Excerpt from the index: routines are listed for the geometric mean, harmonic mean (HMEAN), median (MEDIAN), mode (MODE), quantiles (QUANT), distribution curve (OGIVE), interpercentile range (IQRNG), and range (RANGE), together with keyword entries such as multiphase pivoting algorithm, cross-classification, multiple discriminant analysis, cross-tabulation, multiple-objective model, and curve fitting, and named entries including RANGEX (Correct Correlations for Curtailment of Range) and RUMMAGE II (Analysis ...).

  14. Technological Tools in the Introductory Statistics Classroom: Effects on Student Understanding of Inferential Statistics

    ERIC Educational Resources Information Center

    Meletiou-Mavrotheris, Maria

    2004-01-01

    While technology has become an integral part of introductory statistics courses, the programs typically employed are professional packages designed primarily for data analysis rather than for learning. Findings from several studies suggest that use of such software in the introductory statistics classroom may not be very effective in helping…

  15. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection

    PubMed Central

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-01-01

    With the number of scientific papers published in journals, conference proceedings, and international literature ever increasing, authors and reviewers are not only facilitated with an abundance of information, but unfortunately continuously confronted with risks associated with the erroneous copy of another's material. In parallel, Information Communication Technology (ICT) tools provide researchers with novel and continuously more effective ways to analyze and present their work. Software tools for statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewer's and the editor's perspective, it is now possible to ensure the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-by-step demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial to specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven pieces of plagiarism detection software. PMID:21487489

  16. Revisiting Information Technology tools serving authorship and editorship: a case-guided tutorial to statistical analysis and plagiarism detection.

    PubMed

    Bamidis, P D; Lithari, C; Konstantinidis, S T

    2010-12-01

    With the number of scientific papers published in journals, conference proceedings, and international literature ever increasing, authors and reviewers are not only facilitated with an abundance of information, but unfortunately continuously confronted with risks associated with the erroneous copy of another's material. In parallel, Information Communication Technology (ICT) tools provide researchers with novel and continuously more effective ways to analyze and present their work. Software tools for statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewer's and the editor's perspective, it is now possible to ensure the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-by-step demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial to specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven pieces of plagiarism detection software.

  17. [Design and implementation of online statistical analysis function in information system of air pollution and health impact monitoring].

    PubMed

    Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun

    2018-01-01

    The aim was to implement an online statistical analysis function in the information system for air pollution and health impact monitoring, so that data analysis results can be obtained in real time. Descriptive statistics, time-series analysis, and multivariate regression analysis were combined with SQL and visualization tools to implement the online statistical analysis on top of the underlying database software. The system generates basic statistical tables and summary tables of air pollution exposure and health impact data online; generates trend charts for each part of the data online, with interactive connection to the database; and generates interface sheets that can be exported directly to R, SAS and SPSS online. The information system for air pollution and health impact monitoring thus implements the statistical analysis function online, and can provide real-time analysis results to its users.

  18. Novel features and enhancements in BioBin, a tool for the biologically inspired binning and association analysis of rare variants

    PubMed Central

    Byrska-Bishop, Marta; Wallace, John; Frase, Alexander T; Ritchie, Marylyn D

    2018-01-01

    Motivation: BioBin is an automated bioinformatics tool for the multi-level biological binning of sequence variants. Herein, we present a significant update to BioBin which expands the software to facilitate a comprehensive rare variant analysis and incorporates novel features and analysis enhancements. Results: In BioBin 2.3, we extend our software tool by implementing statistical association testing, updating the binning algorithm, as well as incorporating novel analysis features providing for a robust, highly customizable, and unified rare variant analysis tool. Availability and implementation: The BioBin software package is open source and freely available to users at http://www.ritchielab.com/software/biobin-download. Contact: mdritchie@geisinger.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28968757
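
    BioBin's own binning algorithm and association tests are described in the paper and its documentation; as a generic illustration of the kind of rare-variant burden test such tools perform, the base-R sketch below (hypothetical genotypes and phenotypes, not BioBin itself) collapses rare-variant counts into a per-sample bin score and tests it against case/control status with logistic regression.

      # Simplified rare-variant burden ("bin") association sketch.
      set.seed(7)
      n_sample <- 500; n_variant <- 30

      # Hypothetical rare-variant allele counts (0/1/2) for one biological bin.
      geno <- matrix(rbinom(n_sample * n_variant, 2, 0.01), nrow = n_sample)

      # Bin score: total number of rare alleles each sample carries in the bin.
      burden <- rowSums(geno)

      # Hypothetical phenotype weakly associated with the burden score.
      status <- rbinom(n_sample, 1, plogis(-1 + 0.4 * burden))

      # Association test of the collapsed bin against case/control status.
      summary(glm(status ~ burden, family = binomial))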

  19. P-MartCancer-Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets.

    PubMed

    Webb-Robertson, Bobbie-Jo M; Bramer, Lisa M; Jensen, Jeffrey L; Kobold, Markus A; Stratton, Kelly G; White, Amanda M; Rodland, Karin D

    2017-11-01

    P-MartCancer is an interactive web-based software environment that enables statistical analyses of peptide or protein data, quantitated from mass spectrometry-based global proteomics experiments, without requiring in-depth knowledge of statistical programming. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification, and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access and the capability to analyze multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium at the peptide, gene, and protein levels. P-MartCancer is deployed as a web service (https://pmart.labworks.org/cptac.html), alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/). Cancer Res; 77(21); e47-50. ©2017 American Association for Cancer Research (AACR).

  20. Compositional Solution Space Quantification for Probabilistic Software Analysis

    NASA Technical Reports Server (NTRS)

    Borges, Mateus; Pasareanu, Corina S.; Filieri, Antonio; d'Amorim, Marcelo; Visser, Willem

    2014-01-01

    Probabilistic software analysis aims at quantifying how likely a target event is to occur during program execution. Current approaches rely on symbolic execution to identify the conditions to reach the target event and try to quantify the fraction of the input domain satisfying these conditions. Precise quantification is usually limited to linear constraints, while only approximate solutions can be provided in general through statistical approaches. However, statistical approaches may fail to converge to an acceptable accuracy within a reasonable time. We present a compositional statistical approach for the efficient quantification of solution spaces for arbitrarily complex constraints over bounded floating-point domains. The approach leverages interval constraint propagation to improve the accuracy of the estimation by focusing the sampling on the regions of the input domain containing the sought solutions. Preliminary experiments show significant improvement on previous approaches both in results accuracy and analysis time.

  1. Application of econometric and ecology analysis methods in physics software

    NASA Astrophysics Data System (ADS)

    Han, Min Cheol; Hoff, Gabriela; Kim, Chan Hyeong; Kim, Sung Hun; Grazia Pia, Maria; Ronchieri, Elisabetta; Saracco, Paolo

    2017-10-01

    Some data analysis methods typically used in econometric studies and in ecology have been evaluated and applied in physics software environments. They concern the evolution of observables through objective identification of change points and trends, and measurements of inequality, diversity and evenness across a data set. Within each analysis area, various statistical tests and measures have been examined. This conference paper gives a brief overview of some of these methods.

  2. Identifying biologically relevant differences between metagenomic communities.

    PubMed

    Parks, Donovan H; Beiko, Robert G

    2010-03-15

    Metagenomics is the study of genetic material recovered directly from environmental samples. Taxonomic and functional differences between metagenomic samples can highlight the influence of ecological factors on patterns of microbial life in a wide range of habitats. Statistical hypothesis tests can help us distinguish ecological influences from sampling artifacts, but knowledge of only the P-value from a statistical hypothesis test is insufficient to make inferences about biological relevance. Current reporting practices for pairwise comparative metagenomics are inadequate, and better tools are needed for comparative metagenomic analysis. We have developed a new software package, STAMP, for comparative metagenomics that supports best practices in analysis and reporting. Examination of a pair of iron mine metagenomes demonstrates that deeper biological insights can be gained using statistical techniques available in our software. An analysis of the functional potential of 'Candidatus Accumulibacter phosphatis' in two enhanced biological phosphorus removal metagenomes identified several subsystems that differ between the A. phosphatis strains in these related communities, including phosphate metabolism, secretion and metal transport. Python source code and binaries are freely available from our website at http://kiwi.cs.dal.ca/Software/STAMP. Contact: beiko@cs.dal.ca. Supplementary data are available at Bioinformatics online.
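
    STAMP's central point, that a P-value alone says nothing about biological relevance, can be illustrated with base R: for a single feature compared between two metagenomes, report the difference in proportions and its confidence interval alongside the hypothesis test. The counts below are made up for illustration.

      # Two-sample comparison of one functional category between two metagenomes.
      hits  <- c(180, 120)      # reads assigned to the category in each sample
      total <- c(5000, 4800)    # total assigned reads in each sample

      test <- prop.test(hits, total)
      test$p.value                                  # statistical significance
      unname(test$estimate[1] - test$estimate[2])   # effect size: difference in proportions
      test$conf.int                                 # confidence interval for that difference
      # A tiny difference can be "significant" with deep sequencing, so the effect
      # size and its interval, not the P-value, convey biological relevance.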

  3. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data.

    PubMed

    Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs

    2011-01-01

    This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates the reuse in other software packages.

  4. Classification software technique assessment

    NASA Technical Reports Server (NTRS)

    Jayroe, R. R., Jr.; Atkinson, R.; Dasarathy, B. V.; Lybanon, M.; Ramapryian, H. K.

    1976-01-01

    A catalog of software options is presented for the use of local user communities to obtain software for analyzing remotely sensed multispectral imagery. The resources required to utilize a particular software program are described. Descriptions of how a particular program analyzes data and the performance of that program for an application and data set provided by the user are shown. An effort is made to establish a statistical performance base for various software programs with regard to different data sets and analysis applications, to determine the status of the state-of-the-art.

  5. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates.

    PubMed

    Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu

    2011-01-01

    The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.
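
    The local similarity score at the heart of LSA and eLSA can be sketched, in a deliberately simplified and unreplicated form that ignores the package's normalization, replicate handling, and permutation-based significance testing, as a running accumulation of products of the two standardized series within a maximum time delay.

      # Simplified local similarity score for two series (illustrative only;
      # the eLSA software adds normalization, replicates and significance testing).
      local_similarity <- function(x, y, max_delay = 3) {
        x <- scale(x)[, 1]; y <- scale(y)[, 1]
        n <- length(x)
        best <- 0
        for (d in -max_delay:max_delay) {          # allowed time delays
          P <- 0; N <- 0                            # positive / negative accumulators
          for (i in seq_len(n)) {
            j <- i + d
            if (j < 1 || j > n) next
            P <- max(0, P + x[i] * y[j])            # co-varying in the same direction
            N <- max(0, N - x[i] * y[j])            # co-varying in opposite directions
            best <- max(best, P, N)
          }
        }
        best / n                                    # normalized local similarity score
      }

      set.seed(3)
      a <- sin(seq(0, 4 * pi, length.out = 60)) + rnorm(60, sd = 0.3)
      b <- c(rep(0, 2), head(a, -2)) + rnorm(60, sd = 0.3)   # delayed copy of a
      local_similarity(a, b)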

  6. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates

    PubMed Central

    2011-01-01

    Background The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. Results We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. Conclusions The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa. PMID:22784572

  7. Software For Computing Reliability Of Other Software

    NASA Technical Reports Server (NTRS)

    Nikora, Allen; Antczak, Thomas M.; Lyu, Michael

    1995-01-01

    Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for the same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.

  8. Assessing the Lexico-Grammatical Characteristics of a Corpus of College-Level Statistics Textbooks: Implications for Instruction and Practice

    ERIC Educational Resources Information Center

    Wagler, Amy E.; Lesser, Lawrence M.; González, Ariel I.; Leal, Luis

    2015-01-01

    A corpus of current editions of statistics textbooks was assessed to compare aspects and levels of readability for the topics of "measures of center," "line of fit," "regression analysis," and "regression inference." Analysis with lexical software of these text selections revealed that the large corpus can…

  9. Using R-Project for Free Statistical Analysis in Extension Research

    ERIC Educational Resources Information Center

    Mangiafico, Salvatore S.

    2013-01-01

    One option for Extension professionals wishing to use free statistical software is to use online calculators, which are useful for common, simple analyses. A second option is to use a free computing environment capable of performing statistical analyses, like R-project. R-project is free, cross-platform, powerful, and respected, but may be…

  10. Probability and Statistics in Sensor Performance Modeling

    DTIC Science & Technology

    2010-12-01

    This report describes statistical analysis for measuring sensor performance, implemented in a software program called Environmental Awareness for Sensor and Emitter Employment (EASEE), and discusses some important numerical issues in the implementation. Probability concepts used include the cumulative distribution function (cdf) and complementary cumulative distribution function; the software is intended as a decision-support tool (DST).

  11. The open-source movement: an introduction for forestry professionals

    Treesearch

    Patrick Proctor; Paul C. Van Deusen; Linda S. Heath; Jeffrey H. Gove

    2005-01-01

    In recent years, the open-source movement has yielded a generous and powerful suite of software and utilities that rivals those developed by many commercial software companies. Open-source programs are available for many scientific needs: operating systems, databases, statistical analysis, Geographic Information System applications, and object-oriented programming....

  12. How to choose the right statistical software?-a method increasing the post-purchase satisfaction.

    PubMed

    Cavaliere, Roberto

    2015-12-01

    Nowadays, we live in the "data era", where the use of statistical or data analysis software is inevitable in any research field. This means that the choice of the right software tool or platform is a strategic issue for a research department. Nevertheless, in many cases decision makers do not pay enough attention to a comprehensive and appropriate evaluation of what the market offers. Indeed, the choice often still depends on a few factors such as the researcher's personal inclination, e.g., which software was used at university or is already known. This is not wrong in principle, but in some cases it is not sufficient and might lead to a "dead end" situation, typically after months or years of investment in the wrong software. This article, far from being a full and complete guide to statistical software evaluation, aims to illustrate some key points of the decision process and to introduce an extended range of factors which can help in making the right choice, at least potentially. There is little literature on this often underestimated topic, both in the traditional literature and in the so-called "gray literature", even if some documents or short pages can be found online, and there seems to be no common, established standpoint on the process of software evaluation from the end user's perspective. We suggest a multi-factor analysis leading to an evaluation matrix, intended as a flexible and customizable tool that provides a clearer picture of the available software alternatives, not in the abstract but in relation to the researcher's own context and needs. This method is the result of about twenty years of the author's experience in evaluating and using technical-computing software, and it partially arises from research on these topics carried out as part of a project funded by the European Commission under the Lifelong Learning Programme in 2011.
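
    The evaluation-matrix idea sketched in the abstract can be made concrete in a few lines of R. The criteria, weights, and scores below are entirely hypothetical placeholders; the point is only the mechanics of a weighted scoring matrix.

      # Weighted evaluation matrix sketch for comparing statistical software options.
      criteria <- c("licence cost", "learning curve", "methods coverage",
                    "support and documentation", "integration with existing tools")
      weights  <- c(0.15, 0.20, 0.30, 0.20, 0.15)      # must sum to 1

      # Scores from 1 (poor) to 5 (excellent), assigned by the evaluating team.
      scores <- rbind(
        "Package A" = c(5, 3, 4, 3, 4),
        "Package B" = c(2, 4, 5, 5, 3),
        "Package C" = c(4, 5, 3, 4, 5)
      )
      colnames(scores) <- criteria

      # Weighted total per candidate; higher is better for this set of weights.
      sort(drop(scores %*% weights), decreasing = TRUE)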

  13. Prototyping with Data Dictionaries for Requirements Analysis.

    DTIC Science & Technology

    1985-03-01

    statistical packages and software for screen layout. These items work at a higher level than another category of prototyping tool, program generators... Program generators are software packages which, when given specifications, produce source listings, usually in a high order language such as COBOL... with users, and this will not happen if he must stop to develop a detailed program. [Ref. 241] Hardware as well as software should be considered in...

  14. The Implication of Using NVivo Software in Qualitative Data Analysis: Evidence-Based Reflections.

    PubMed

    Zamawe, F C

    2015-03-01

    For a long time, electronic data analysis has been associated with quantitative methods. However, Computer Assisted Qualitative Data Analysis Software (CAQDAS) packages are increasingly being developed. Although CAQDAS has been available for decades, very few qualitative health researchers report using it. This may be due to the difficulties that one has to go through to master the software and the misconceptions associated with using CAQDAS. While the issue of mastering CAQDAS has received ample attention, little has been done to address the misconceptions associated with it. In this paper, the author reflects on his experience of interacting with one of the popular CAQDAS packages (NVivo) in order to provide evidence-based implications of using the software. The key message is that, unlike statistical software, the main function of CAQDAS is not to analyse data but rather to aid the analysis process, which the researcher must always remain in control of. In other words, researchers must equally understand that no software can analyse qualitative data. CAQDAS packages are basically data management tools that support the researcher during analysis.

  15. Statistical analysis of arsenic contamination in drinking water in a city of Iran and its modeling using GIS.

    PubMed

    Sadeghi, Fatemeh; Nasseri, Simin; Mosaferi, Mohammad; Nabizadeh, Ramin; Yunesian, Masud; Mesdaghinia, Alireza

    2017-05-01

    In this research, probable arsenic contamination in drinking water in the city of Ardabil was studied in 163 samples during four seasons. In each season, sampling was carried out randomly in the study area. Results were analyzed statistically using SPSS 19 software, and the data were also modeled with ArcGIS 10.1 software. The maximum permissible arsenic concentration in drinking water defined by the World Health Organization and the Iranian national standard is 10 μg/L. Statistical analysis showed 75, 88, 47, and 69% of samples in autumn, winter, spring, and summer, respectively, had concentrations higher than the national standard. The mean concentrations of arsenic in autumn, winter, spring, and summer were 19.89, 15.9, 10.87, and 14.6 μg/L, respectively, and the overall average in all samples through the year was 15.32 μg/L. Although GIS outputs indicated that the concentration distribution profiles changed in four consecutive seasons, variance analysis of the results showed that statistically there is no significant difference in arsenic levels among the four seasons.

  16. FieldTrip: Open Source Software for Advanced Analysis of MEG, EEG, and Invasive Electrophysiological Data

    PubMed Central

    Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs

    2011-01-01

    This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates the reuse in other software packages. PMID:21253357

  17. An Environmental Decision Support System for Spatial Assessment and Selective Remediation

    EPA Science Inventory

    Spatial Analysis and Decision Assistance (SADA) is a Windows freeware program that incorporates environmental assessment tools for effective problem-solving. The software integrates modules for GIS, visualization, geospatial analysis, statistical analysis, human health and ecolog...

  18. Analysis of half diallel mating designs I: a practical analysis procedure for ANOVA approximation.

    Treesearch

    G.R. Johnson; J.N. King

    1998-01-01

    Procedures to analyze half-diallel mating designs using the SAS statistical package are presented. The procedure requires two runs of PROC VARCOMP and results in estimates of additive and non-additive genetic variation. The procedures described can be modified to work on most statistical software packages which can compute variance component estimates. The...

  19. Database Performance Monitoring for the Photovoltaic Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klise, Katherine A.

    The Database Performance Monitoring (DPM) software (copyright in process) is being developed at Sandia National Laboratories to perform quality control analysis on time series data. The software loads time-indexed databases (currently csv format), performs a series of quality control tests defined by the user, and creates reports which include summary statistics, tables, and graphics. DPM can be set up to run on an automated schedule defined by the user. For example, the software can be run once per day to analyze data collected on the previous day. HTML formatted reports can be sent via email or hosted on a website. To compare performance of several databases, summary statistics and graphics can be gathered in a dashboard view which links to detailed reporting information for each database. The software can be customized for specific applications.

  20. A 20-year period of orthotopic liver transplantation activity in a single center: a time series analysis performed using the R Statistical Software.

    PubMed

    Santori, G; Andorno, E; Morelli, N; Casaccia, M; Bottino, G; Di Domenico, S; Valente, U

    2009-05-01

    In many Western countries a "minimum volume rule" policy has been adopted as a quality measure for complex surgical procedures. In Italy, the National Transplant Centre set the minimum number of orthotopic liver transplantation (OLT) procedures/y at 25/center. OLT procedures performed in a single center for a reasonably large period may be treated as a time series to evaluate trend, seasonal cycles, and nonsystematic fluctuations. Between January 1, 1987 and December 31, 2006, we performed 563 cadaveric donor OLTs to adult recipients. During 2007, there were another 28 procedures. The greatest numbers of OLTs/y were performed in 2001 (n = 51), 2005 (n = 50), and 2004 (n = 49). A time series analysis performed using the R Statistical Software (R Foundation for Statistical Computing, Vienna, Austria), a free software environment for statistical computing and graphics, showed an incremental trend after exponential smoothing as well as after seasonal decomposition. The predicted OLTs/mo for 2007, calculated with Holt-Winters exponential smoothing applied to the previous period (1987-2006), helped to identify the months with a major difference between predicted and performed procedures. The time series approach may be helpful to establish a minimum volume/y at a single-center level.
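
    The same kind of analysis can be reproduced in a few lines of base R. The sketch below uses simulated monthly transplant counts (not the centre's actual series) to show the seasonal decomposition and Holt-Winters forecasting steps the abstract refers to.

      # Time series sketch: seasonal decomposition and Holt-Winters forecasting.
      set.seed(10)
      # Simulated monthly procedure counts for 1987-2006 with trend and seasonality.
      months <- 20 * 12
      counts <- pmax(0, round(1 + 0.01 * seq_len(months) +
                              1.5 * sin(2 * pi * seq_len(months) / 12) +
                              rnorm(months, sd = 1)))
      olt <- ts(counts, start = c(1987, 1), frequency = 12)

      plot(decompose(olt))                 # trend, seasonal cycle, random fluctuations

      fit  <- HoltWinters(olt)             # exponential smoothing with trend and season
      pred <- predict(fit, n.ahead = 12)   # predicted monthly OLTs for 2007
      plot(fit, pred)                      # compare fitted series and forecast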

  1. Process air quality data

    NASA Technical Reports Server (NTRS)

    Butler, C. M.; Hogge, J. E.

    1978-01-01

    Air quality sampling was conducted. Data for air quality parameters, recorded on written forms, punched cards or magnetic tape, are available for 1972 through 1975. Computer software was developed to (1) calculate several daily statistical measures of location, (2) plot time histories of data or the calculated daily statistics, (3) calculate simple correlation coefficients, and (4) plot scatter diagrams. Computer software was developed for processing air quality data to include time series analysis and goodness of fit tests. Computer software was developed to (1) calculate a larger number of daily statistical measures of location, and a number of daily monthly and yearly measures of location, dispersion, skewness and kurtosis, (2) decompose the extended time series model and (3) perform some goodness of fit tests. The computer program is described, documented and illustrated by examples. Recommendations are made for continuation of the development of research on processing air quality data.

  2. Automatic Rock Detection and Mapping from HiRISE Imagery

    NASA Technical Reports Server (NTRS)

    Huertas, Andres; Adams, Douglas S.; Cheng, Yang

    2008-01-01

    This system includes a C-code software program and a set of MATLAB software tools for statistical analysis and rock distribution mapping. The major functions include rock detection and rock detection validation. The rock detection code has been evolved into a production tool that can be used by engineers and geologists with minor training.

  3. Proposing a Mathematical Software Tool in Physics Secondary Education

    ERIC Educational Resources Information Center

    Baltzis, Konstantinos B.

    2009-01-01

    MathCad® is a very popular software tool for mathematical and statistical analysis in science and engineering. Its low cost, ease of use, extensive function library, and worksheet-like user interface distinguish it among other commercial packages. Its features are also well suited to educational process. The use of natural mathematical notation…

  4. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

    PubMed

    Zheng, Qi; Wang, Xiu-Jie

    2008-07-01

    Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools with GO-related analysis functions have been developed, new tools are still needed to meet the requirements for data generated by newly developed technologies or for advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides a better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis for data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) one unique feature of GOEAST is that it allows cross-comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
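
    The statistical test behind most GO overrepresentation tools of this kind is typically a hypergeometric (equivalently, one-sided Fisher's exact) test. A minimal base-R sketch with made-up counts, not tied to GOEAST's internals, is shown below.

      # Hypergeometric test for overrepresentation of one GO term (illustrative counts).
      N <- 20000   # annotated genes in the genome (the "universe")
      K <- 400     # genes in the universe annotated with this GO term
      n <- 250     # genes in the submitted list
      k <- 15      # genes in the list annotated with this GO term

      # P(observing k or more annotated genes in a random draw of n genes)
      p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
      p

      # Equivalent one-sided Fisher's exact test on the 2 x 2 table
      fisher.test(matrix(c(k, n - k, K - k, N - K - (n - k)), nrow = 2),
                  alternative = "greater")$p.value

      # In practice, p-values are computed for every GO term and then adjusted
      # for multiple testing, e.g., with p.adjust(..., method = "BH").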

  5. Profile Of 'Original Articles' Published In 2016 By The Journal Of Ayub Medical College, Pakistan.

    PubMed

    Shaikh, Masood Ali

    2018-01-01

    The Journal of Ayub Medical College (JAMC) is the only Medline-indexed biomedical journal of Pakistan that is edited and published by a medical college. Assessing the trends in study designs employed, statistical methods used, and statistical analysis software used in the articles of medical journals helps in understanding the sophistication of the published research. The objective of this descriptive study was to assess all original articles published by JAMC in the year 2016. JAMC published 147 original articles in 2016. The most commonly used study design was the cross-sectional study, with 64 (43.5%) articles reporting its use. Statistical tests involving bivariate analysis were the most common, reported by 73 (49.6%) articles. Use of SPSS software was reported by 109 (74.1%) of the articles. Most of the original articles published, 138 (93.9%), were based on studies conducted in Pakistan. The number and sophistication of the analyses reported in JAMC increased from 2014 to 2016.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    The plpdfa software is a product of an LDRD project at LLNL entitled "Adaptive Sampling for Very High Throughput Data Streams" (tracking number 11-ERD-035). This software was developed by a graduate student summer intern, Chris Challis, who worked under project PI Dan Merl during the summer of 2011. The source code implements a statistical analysis technique for clustering and classification of text-valued data. The method had been previously published by the PI in the open literature.

  7. An Examination of the Demographic and Career Progression of Air Force Institute of Technology Cost Analysis Graduates.

    DTIC Science & Technology

    1997-09-01

    Graduates found the program very useful and beneficial to their careers, and the Automated Cost Estimating and Integrated Tools (ACEIT) software and training could not be praised enough. The main strengths of the program include the ACEIT software training and the combination of Department of Defense (DOD) application, regression, and statistics. The report also discusses the program's weaknesses and compares the AFIT GCA program with offerings at civilian institutions.

  8. Teaching meta-analysis using MetaLight.

    PubMed

    Thomas, James; Graziosi, Sergio; Higgins, Steve; Coe, Robert; Torgerson, Carole; Newman, Mark

    2012-10-18

    Meta-analysis is a statistical method for combining the results of primary studies. It is often used in systematic reviews and is increasingly a method and topic that appears in student dissertations. MetaLight is a freely available software application that runs simple meta-analyses and contains specific functionality to facilitate the teaching and learning of meta-analysis. While there are many courses and resources for meta-analysis available and numerous software applications to run meta-analyses, there are few pieces of software which are aimed specifically at helping those teaching and learning meta-analysis. Valuable teaching time can be spent learning the mechanics of a new software application, rather than on the principles and practices of meta-analysis. We discuss ways in which the MetaLight tool can be used to present some of the main issues involved in undertaking and interpreting a meta-analysis. While there are many software tools available for conducting meta-analysis, in the context of a teaching programme such software can require expenditure both in terms of money and in terms of the time it takes to learn how to use it. MetaLight was developed specifically as a tool to facilitate the teaching and learning of meta-analysis and we have presented here some of the ways it might be used in a training situation.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Amanda M.; Daly, Don S.; Willse, Alan R.

    The Automated Microarray Image Analysis (AMIA) Toolbox for MATLAB is a flexible, open-source microarray image analysis tool that allows the user to customize analysis of sets of microarray images. This tool provides several methods for identifying and quantifying spot statistics, as well as extensive diagnostic statistics and images to identify poor data quality or processing. The open nature of this software allows researchers to understand the algorithms used to provide intensity estimates and to modify them easily if desired.

  10. Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

    NASA Astrophysics Data System (ADS)

    McCray, Wilmon Wil L., Jr.

    The research was prompted by a need to assess the process improvement, quality management, and analytical techniques taught to undergraduate and graduate students in U.S. systems engineering and computing science (e.g., software engineering, computer science, and information technology) degree programs that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle needs to become familiar with the concepts of quantitative management, statistical thinking, and process improvement methods, and with how they relate to process performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI) models as process improvement frameworks to improve business process performance. High-maturity process areas in the CMMI model imply the use of analytical, statistical, and quantitative management techniques and of process performance modeling to identify and eliminate sources of variation, continually improve process performance, reduce cost, and predict future outcomes. The study identifies and discusses in detail the gap-analysis findings on the process improvement and quantitative analysis techniques taught in U.S. systems engineering and computing science degree programs, the gaps that exist in the literature, and a comparison analysis identifying the gaps between the SEI's "healthy ingredients" of a process performance model and the courses taught in U.S. degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization model and dashboard that demonstrates the use of statistical methods, statistical process control, sensitivity analysis, and quantitative and optimization techniques to establish a baseline and predict future customer satisfaction index scores (outcomes). The American Customer Satisfaction Index (ACSI) model and industry benchmarks were used as a framework for the simulation model.
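
    The record does not describe the simulation's internals; as a rough, hedged sketch of the general idea only (propagating uncertainty in satisfaction drivers through a weighted index to obtain a distribution of predicted scores), the following Python fragment uses entirely invented weights and distributions, not the ACSI model itself.

      import numpy as np

      rng = np.random.default_rng(42)
      n_trials = 10_000

      # Hypothetical drivers of a customer-satisfaction index (0-100 scale).
      # Means, spreads and weights are illustrative assumptions, not ACSI values.
      quality = rng.normal(loc=82, scale=4, size=n_trials)
      expectations = rng.normal(loc=78, scale=5, size=n_trials)
      value = rng.normal(loc=75, scale=6, size=n_trials)

      weights = np.array([0.5, 0.2, 0.3])
      index = np.clip(np.column_stack([quality, expectations, value]) @ weights, 0, 100)

      print(f"predicted index: mean={index.mean():.1f}, "
            f"90% interval=({np.percentile(index, 5):.1f}, {np.percentile(index, 95):.1f})")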

  11. RipleyGUI: software for analyzing spatial patterns in 3D cell distributions

    PubMed Central

    Hansson, Kristin; Jafari-Mamaghani, Mehrdad; Krieger, Patrik

    2013-01-01

    The true revolution in the age of digital neuroanatomy is the ability to extensively quantify anatomical structures and thus investigate structure-function relationships in great detail. To facilitate the quantification of neuronal cell patterns we have developed RipleyGUI, a MATLAB-based software that can be used to detect patterns in the 3D distribution of cells. RipleyGUI uses Ripley's K-function to analyze spatial distributions. In addition the software contains statistical tools to determine quantitative statistical differences, and tools for spatial transformations that are useful for analyzing non-stationary point patterns. The software has a graphical user interface making it easy to use without programming experience, and an extensive user manual explaining the basic concepts underlying the different statistical tools used to analyze spatial point patterns. The described analysis tool can be used for determining the spatial organization of neurons that is important for a detailed study of structure-function relationships. For example, neocortex that can be subdivided into six layers based on cell density and cell types can also be analyzed in terms of organizational principles distinguishing the layers. PMID:23658544
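
    RipleyGUI itself is MATLAB-based; purely as an illustration of the statistic it is built around, a naive 3D Ripley's K estimate (without the edge correction a real analysis requires) can be sketched in Python as follows.

      import numpy as np
      from scipy.spatial.distance import pdist

      def ripley_k_3d(points, radii, volume):
          """Naive 3D Ripley's K estimate without edge correction."""
          n = len(points)
          d = pdist(points)                      # all pairwise distances
          # each unordered pair counts twice in the double sum over i != j
          counts = np.array([2 * np.sum(d <= r) for r in radii])
          return volume * counts / (n * (n - 1))

      # toy example: complete spatial randomness in a unit cube, where K(r) is
      # roughly (4/3)*pi*r^3; the naive estimate is biased low near the faces
      rng = np.random.default_rng(0)
      pts = rng.random((500, 3))
      radii = np.array([0.05, 0.10, 0.15])
      print(ripley_k_3d(pts, radii, volume=1.0))
      print(4 / 3 * np.pi * radii**3)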

  12. Software testing

    NASA Astrophysics Data System (ADS)

    Price-Whelan, Adrian M.

    2016-01-01

    Now more than ever, scientific results are dependent on sophisticated software and analysis. Why should we trust code written by others? How do you ensure your own code produces sensible results? How do you make sure it continues to do so as you update, modify, and add functionality? Software testing is an integral part of code validation and writing tests should be a requirement for any software project. I will talk about Python-based tools that make managing and running tests much easier and explore some statistics for projects hosted on GitHub that contain tests.
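
    The abstract does not name specific tools; pytest is one widely used Python option, and a minimal test module (the function under test and its name are invented for illustration) might look like this.

      # test_stats.py -- run with `pytest` from the same directory
      import math
      import pytest

      def weighted_mean(values, weights):
          """Toy function under test: weighted arithmetic mean."""
          if len(values) != len(weights) or not values:
              raise ValueError("values and weights must be non-empty and of equal length")
          return sum(v * w for v, w in zip(values, weights)) / sum(weights)

      def test_weighted_mean_basic():
          assert math.isclose(weighted_mean([1, 2, 3], [1, 1, 1]), 2.0)

      def test_weighted_mean_rejects_mismatched_lengths():
          with pytest.raises(ValueError):
              weighted_mean([1, 2], [1])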

  13. ViPAR: a software platform for the Virtual Pooling and Analysis of Research Data.

    PubMed

    Carter, Kim W; Francis, Richard W; Carter, K W; Francis, R W; Bresnahan, M; Gissler, M; Grønborg, T K; Gross, R; Gunnes, N; Hammond, G; Hornig, M; Hultman, C M; Huttunen, J; Langridge, A; Leonard, H; Newman, S; Parner, E T; Petersson, G; Reichenberg, A; Sandin, S; Schendel, D E; Schalkwyk, L; Sourander, A; Steadman, C; Stoltenberg, C; Suominen, A; Surén, P; Susser, E; Sylvester Vethanayagam, A; Yusof, Z

    2016-04-01

    Research studies exploring the determinants of disease require sufficient statistical power to detect meaningful effects. Sample size is often increased through centralized pooling of disparately located datasets, though ethical, privacy and data ownership issues can often hamper this process. Methods that facilitate the sharing of research data that are sympathetic with these issues and which allow flexible and detailed statistical analyses are therefore in critical need. We have created a software platform for the Virtual Pooling and Analysis of Research data (ViPAR), which employs free and open source methods to provide researchers with a web-based platform to analyse datasets housed in disparate locations. Database federation permits controlled access to remotely located datasets from a central location. The Secure Shell protocol allows data to be securely exchanged between devices over an insecure network. ViPAR combines these free technologies into a solution that facilitates 'virtual pooling' where data can be temporarily pooled into computer memory and made available for analysis without the need for permanent central storage. Within the ViPAR infrastructure, remote sites manage their own harmonized research dataset in a database hosted at their site, while a central server hosts the data federation component and a secure analysis portal. When an analysis is initiated, requested data are retrieved from each remote site and virtually pooled at the central site. The data are then analysed by statistical software and, on completion, results of the analysis are returned to the user and the virtually pooled data are removed from memory. ViPAR is a secure, flexible and powerful analysis platform built on open source technology that is currently in use by large international consortia, and is made publicly available at [http://bioinformatics.childhealthresearch.org.au/software/vipar/]. © The Author 2015. Published by Oxford University Press on behalf of the International Epidemiological Association.

  14. HDX Workbench: Software for the Analysis of H/D Exchange MS Data

    NASA Astrophysics Data System (ADS)

    Pascal, Bruce D.; Willis, Scooter; Lauer, Janelle L.; Landgraf, Rachelle R.; West, Graham M.; Marciano, David; Novick, Scott; Goswami, Devrishi; Chalmers, Michael J.; Griffin, Patrick R.

    2012-09-01

    Hydrogen/deuterium exchange mass spectrometry (HDX-MS) is an established method for the interrogation of protein conformation and dynamics. While the data analysis challenge of HDX-MS has been addressed by a number of software packages, new computational tools are needed to keep pace with the improved methods and throughput of this technique. To address these needs, we report an integrated desktop program titled HDX Workbench, which facilitates automation, management, visualization, and statistical cross-comparison of large HDX data sets. Using the software, validated data analysis can be achieved at the rate of generation. The application is available at the project home page http://hdx.florida.scripps.edu.

  15. The Future of Statistical Software. Proceedings of a Forum--Panel on Guidelines for Statistical Software (Washington, D.C., February 22, 1991).

    ERIC Educational Resources Information Center

    National Academy of Sciences - National Research Council, Washington, DC.

    The Panel on Guidelines for Statistical Software was organized in 1990 to document, assess, and prioritize problem areas regarding quality and reliability of statistical software; present prototype guidelines in high priority areas; and make recommendations for further research and discussion. This document provides the following papers presented…

  16. Computing and software

    USGS Publications Warehouse

    White, Gary C.; Hines, J.E.

    2004-01-01

    The reality is that the statistical methods used for analysis of data depend upon the availability of software. Analysis of marked animal data is no different than the rest of the statistical field. The methods used for analysis are those that are available in reliable software packages. Thus, the critical importance of having reliable, up-to-date software available to biologists is obvious. Statisticians have continued to develop more robust models, ever expanding the suite of potential analysis methods available. But without software to implement these newer methods, they will languish in the abstract, and not be applied to the problems deserving them. In the Computers and Software Session, two new software packages are described, a comparison of implementations of methods for the estimation of nest survival is provided, and a more speculative paper about how the next generation of software might be structured is presented. Rotella et al. (2004) compare nest survival estimation with different software packages: SAS logistic regression, SAS non-linear mixed models, and Program MARK. Nests are assumed to be visited at various, possibly infrequent, intervals. All of the approaches described compute nest survival with the same likelihood, and require that the age of the nest is known to account for nests that eventually hatch. However, each approach offers advantages and disadvantages, explored by Rotella et al. (2004). Efford et al. (2004) present a new software package called DENSITY. The package computes population abundance and density from trapping arrays and other detection methods with a new and unique approach. DENSITY represents the first major addition to the analysis of trapping arrays in 20 years. Barker & White (2004) discuss how existing software such as Program MARK requires that each new model's likelihood be programmed specifically for that model. They wishfully think that future software might allow the user to combine pieces of likelihood functions together to generate estimates. The idea is interesting, and maybe some bright young statistician can work out the specifics to implement the procedure. Choquet et al. (2004) describe MSURGE, a software package that implements multistate capture-recapture models. The unique feature of MSURGE is that the design matrix is constructed with an interpreted language called GEMACO. Because MSURGE is limited to just multistate models, the special requirements of these likelihoods can be provided for. The software and methods presented in these papers give biologists and wildlife managers an expanding range of possibilities for data analysis. Although ease of use is generally getting better, it does not replace the need for understanding the requirements and structure of the models being computed. The internet provides access to many free software packages as well as user-discussion groups to share knowledge and ideas. (A starting point for wildlife-related applications is http://www.phidot.org.)

  17. mapDIA: Preprocessing and statistical analysis of quantitative proteomics data from data independent acquisition mass spectrometry.

    PubMed

    Teo, Guoshou; Kim, Sinae; Tsou, Chih-Chiang; Collins, Ben; Gingras, Anne-Claude; Nesvizhskii, Alexey I; Choi, Hyungwon

    2015-11-03

    Data independent acquisition (DIA) mass spectrometry is an emerging technique that offers more complete detection and quantification of peptides and proteins across multiple samples. DIA allows fragment-level quantification, which can be considered repeated measurements of the abundance of the corresponding peptides and proteins in the downstream statistical analysis. However, few statistical approaches are available for aggregating these complex fragment-level data into peptide- or protein-level statistical summaries. In this work, we describe a software package, mapDIA, for statistical analysis of differential protein expression using DIA fragment-level intensities. The workflow consists of three major steps: intensity normalization, peptide/fragment selection, and statistical analysis. First, mapDIA offers normalization of fragment-level intensities by total intensity sums as well as a novel alternative normalization by local intensity sums in retention time space. Second, mapDIA removes outlier observations and selects peptides/fragments that preserve the major quantitative patterns across all samples for each protein. Last, using the selected fragments and peptides, mapDIA performs model-based statistical significance analysis of protein-level differential expression between specified groups of samples. Using a comprehensive set of simulation datasets, we show that mapDIA detects differentially expressed proteins with accurate control of the false discovery rates. We also describe the analysis procedure in detail using two recently published DIA datasets generated for the 14-3-3β dynamic interaction network and the prostate cancer glycoproteome. The software was written in the C++ language and the source code is freely available through the SourceForge website http://sourceforge.net/projects/mapdia/. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.
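
    mapDIA itself is written in C++; to make the first workflow step concrete, a simplified total-intensity-sum normalization (a stand-in sketch, not mapDIA's exact procedure or its retention-time-local variant) can be written in Python.

      import numpy as np

      def normalize_by_total_intensity(intensity):
          """Scale each sample (column) so that all samples share the same total
          fragment intensity; a simplified stand-in for the normalization step."""
          intensity = np.asarray(intensity, dtype=float)
          totals = intensity.sum(axis=0)              # per-sample total intensity
          target = totals.mean()                      # common scale to normalize to
          return intensity * (target / totals)

      # rows = fragments, columns = samples (toy numbers)
      raw = np.array([[1000.0, 1500.0, 900.0],
                      [ 400.0,  650.0, 380.0],
                      [ 250.0,  300.0, 240.0]])
      norm = normalize_by_total_intensity(raw)
      print(norm.sum(axis=0))   # column sums are now equal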

  18. A Study of the NASS-CDS System for Injury/Fatality Rates of Occupants in Various Restraints and A Discussion of Alternative Presentation Methods

    PubMed Central

    Stucki, Sheldon Lee; Biss, David J.

    2000-01-01

    An analysis was performed using the National Automotive Sampling System Crashworthiness Data System (NASS-CDS) database to compare the injury/fatality rates of variously restrained driver occupants as compared to unrestrained driver occupants in the total database of drivers/frontals, and also by Delta-V. A structured search of the NASS-CDS was done using the SAS® statistical analysis software to extract the data for this analysis and the SUDAAN software package was used to arrive at statistical significance indicators. In addition, this paper goes on to investigate different methods for presenting results of accident database searches including significance results; a risk versus Delta-V format for specific exposures; and, a percent cumulative injury versus Delta-V format to characterize injury trends. These alternative analysis presentation methods are then discussed by example using the present study results. PMID:11558105

  19. Role of microstructure on twin nucleation and growth in HCP titanium: A statistical study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arul Kumar, M.; Wroński, M.; McCabe, Rodney James

    In this study, a detailed statistical analysis is performed using Electron Back Scatter Diffraction (EBSD) to establish the effect of microstructure on twin nucleation and growth in deformed commercial purity hexagonal close packed (HCP) titanium. Rolled titanium samples are compressed along rolling, transverse and normal directions to establish statistical correlations for {10–12}, {11–21}, and {11–22} twins. A recently developed automated EBSD-twinning analysis software is employed for the statistical analysis. Finally, the analysis provides the following key findings: (I) grain size and strain dependence is different for twin nucleation and growth; (II) twinning statistics can be generalized for the HCP metals magnesium, zirconium and titanium; and (III) complex microstructure, where grain shape and size distribution is heterogeneous, requires multi-point statistical correlations.

  20. Role of microstructure on twin nucleation and growth in HCP titanium: A statistical study

    DOE PAGES

    Arul Kumar, M.; Wroński, M.; McCabe, Rodney James; ...

    2018-02-01

    In this study, a detailed statistical analysis is performed using Electron Back Scatter Diffraction (EBSD) to establish the effect of microstructure on twin nucleation and growth in deformed commercial purity hexagonal close packed (HCP) titanium. Rolled titanium samples are compressed along rolling, transverse and normal directions to establish statistical correlations for {10–12}, {11–21}, and {11–22} twins. A recently developed automated EBSD-twinning analysis software is employed for the statistical analysis. Finally, the analysis provides the following key findings: (I) grain size and strain dependence is different for twin nucleation and growth; (II) twinning statistics can be generalized for the HCP metals magnesium, zirconium and titanium; and (III) complex microstructure, where grain shape and size distribution is heterogeneous, requires multi-point statistical correlations.

  1. Statistical modeling of software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1992-01-01

    This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.

  2. MORTICIA, a statistical analysis software package for determining optical surveillance system effectiveness.

    NASA Astrophysics Data System (ADS)

    Ramkilowan, A.; Griffith, D. J.

    2017-10-01

    Surveillance modelling in terms of the standard Detect, Recognise and Identify (DRI) thresholds remains a key requirement for determining the effectiveness of surveillance sensors. With readily available computational resources it has become feasible to perform statistically representative evaluations of the effectiveness of these sensors. A new capability for performing this Monte-Carlo type analysis is demonstrated in the MORTICIA (Monte-Carlo Optical Rendering for Theatre Investigations of Capability under the Influence of the Atmosphere) software package developed at the Council for Scientific and Industrial Research (CSIR). This first-generation, Python-based, open-source integrated software package, currently in the alpha stage of development, aims to provide all the functionality required to perform statistical investigations of the effectiveness of optical surveillance systems in specific or generic deployment theatres. This includes modelling of the mathematical and physical processes that govern, among other components of a surveillance system, a sensor's detector and optical components, a target and its background, as well as the intervening atmospheric influences. In this paper we discuss integral aspects of the bespoke framework that are critical to the longevity of all subsequent modelling efforts. Additionally, some preliminary results are presented.

  3. JCMT COADD: UKT14 continuum and photometry data reduction

    NASA Astrophysics Data System (ADS)

    Hughes, David; Oliveira, Firmin J.; Tilanus, Remo P. J.; Jenness, Tim

    2014-11-01

    COADD was used to reduce photometry and continuum data from the UKT14 instrument on the James Clerk Maxwell Telescope in the 1990s. The software can co-add multiple observations and perform sigma clipping and Kolmogorov-Smirnov statistical analysis. Additional information on the software is available in the JCMT Spring 1993 newsletter (large PDF).
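
    COADD is legacy JCMT software; the general idea of sigma-clipped co-addition can be illustrated with a short Python/NumPy sketch in which the clipping threshold, iteration limit and data are all illustrative.

      import numpy as np

      def sigma_clipped_coadd(obs, nsigma=3.0, max_iter=5):
          """Co-add repeated observations, iteratively rejecting points more than
          nsigma standard deviations from the current mean (simple sigma clipping)."""
          data = np.asarray(obs, dtype=float)
          mask = np.ones_like(data, dtype=bool)
          for _ in range(max_iter):
              mean, std = data[mask].mean(), data[mask].std()
              new_mask = np.abs(data - mean) <= nsigma * std
              if np.array_equal(new_mask, mask):
                  break
              mask = new_mask
          return data[mask].mean(), mask

      signal = np.random.default_rng(1).normal(1.2, 0.1, size=50)
      signal[7] = 9.0                      # a spike that should be clipped
      coadd, kept = sigma_clipped_coadd(signal)
      print(coadd, kept.sum(), "of", signal.size, "points kept")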

  4. Software Tools on the Peregrine System | High-Performance Computing | NREL

    Science.gov Websites

    Web page listing software tools available on NREL's Peregrine high-performance computing system, including debuggers and performance-analysis tools (e.g., Intel VTune) for understanding the behavior of MPI applications, an environment for statistical computing and graphics, and VirtualGL/TurboVNC for visualization and analytics.

  5. STATISTICAL TECHNIQUES FOR DETERMINATION AND PREDICTION OF FUNDAMENTAL FISH ASSEMBLAGES OF THE MID-ATLANTIC HIGHLANDS

    EPA Science Inventory

    A statistical software tool, Stream Fish Community Predictor (SFCP), based on EMAP stream sampling in the mid-Atlantic Highlands, was developed to predict stream fish communities using stream and watershed characteristics. Step one in the tool development was a cluster analysis t...

  6. Tree-space statistics and approximations for large-scale analysis of anatomical trees.

    PubMed

    Feragen, Aasa; Owen, Megan; Petersen, Jens; Wille, Mathilde M W; Thomsen, Laura H; Dirksen, Asger; de Bruijne, Marleen

    2013-01-01

    Statistical analysis of anatomical trees is hard to perform due to differences in the topological structure of the trees. In this paper we define statistical properties of leaf-labeled anatomical trees with geometric edge attributes by considering the anatomical trees as points in the geometric space of leaf-labeled trees. This tree-space is a geodesic metric space where any two trees are connected by a unique shortest path, which corresponds to a tree deformation. However, tree-space is not a manifold, and the usual strategy of performing statistical analysis in a tangent space and projecting onto tree-space is not available. Using tree-space and its shortest paths, a variety of statistical properties, such as mean, principal component, hypothesis testing and linear discriminant analysis can be defined. For some of these properties it is still an open problem how to compute them; others (like the mean) can be computed, but efficient alternatives are helpful in speeding up algorithms that use means iteratively, like hypothesis testing. In this paper, we take advantage of a very large dataset (N = 8016) to obtain computable approximations, under the assumption that the data trees parametrize the relevant parts of tree-space well. Using the developed approximate statistics, we illustrate how the structure and geometry of airway trees vary across a population and show that airway trees with Chronic Obstructive Pulmonary Disease come from a different distribution in tree-space than healthy ones. Software is available from http://image.diku.dk/aasa/software.php.

  7. Power Analysis Software for Educational Researchers

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Long, Haiying; Abaci, Serdar

    2012-01-01

    Given the importance of statistical power analysis in quantitative research and the repeated emphasis on it by American Educational Research Association/American Psychological Association journals, the authors examined the reporting practice of power analysis by the quantitative studies published in 12 education/psychology journals between 2005…

  8. cluster trials v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitchell, John; Castillo, Andrew

    2016-09-21

    This software contains a set of Python modules – input, search, cluster, and analysis – that read input files containing spatial coordinates and associated attributes and use them to perform nearest-neighbor searches (spatial indexing via a kd-tree), cluster analysis/identification, and calculation of spatial statistics.
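
    The module names above come from the record itself; as an illustration of the kd-tree-based nearest-neighbor step, SciPy's cKDTree could be used as follows (the coordinates are made up).

      import numpy as np
      from scipy.spatial import cKDTree

      rng = np.random.default_rng(7)
      coords = rng.random((1000, 3))          # spatial coordinates of 1000 points

      tree = cKDTree(coords)
      # distance to each point's nearest neighbour (k=2 because the closest
      # match of a point with itself is at distance 0)
      dist, idx = tree.query(coords, k=2)
      nn_dist = dist[:, 1]

      print(f"mean nearest-neighbour distance: {nn_dist.mean():.4f}")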

  9. Information Technology.

    ERIC Educational Resources Information Center

    Marcum, Deanna; Boss, Richard

    1983-01-01

    Relates office automation to its application in libraries, discussing computer software packages for microcomputers performing tasks involved in word processing, accounting, statistical analysis, electronic filing cabinets, and electronic mail systems. (EJS)

  10. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data-perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
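
    As a rough illustration of how right-censoring survival software can be reused for left-censored (nondetect) data via the standard flipping trick, the following Python sketch uses the lifelines package; the flip constant and the data are arbitrary, and this is not the S-language implementation described in the record.

      import numpy as np
      from lifelines import KaplanMeierFitter

      # concentrations; detected = False means "< detection limit" (left-censored)
      conc     = np.array([0.5, 1.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0, 5.0, 8.0])
      detected = np.array([False, True, False, True, True, False, True, True, True, True])

      # Flip: left-censored values become right-censored after subtracting from
      # a constant larger than the maximum observation.
      flip = conc.max() + 1.0
      kmf = KaplanMeierFitter()
      kmf.fit(flip - conc, event_observed=detected)

      # The survival function of the flipped data gives (approximately) the ECDF
      # of the original data: P(C <= c) ~ S_flipped(flip - c)
      query = np.array([1.0, 3.0, 5.0])
      print(kmf.survival_function_at_times(flip - query))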

  11. Secure software practices among Malaysian software practitioners: An exploratory study

    NASA Astrophysics Data System (ADS)

    Mohamed, Shafinah Farvin Packeer; Baharom, Fauziah; Deraman, Aziz; Yahya, Jamaiah; Mohd, Haslina

    2016-08-01

    Secure software practices are gaining increasing importance among software practitioners and researchers due to the rise of computer crimes in the software industry, and have become one of the determinant factors for producing high-quality software. Even though their importance has been recognized, their current practice in the software industry is still scarce, particularly in Malaysia. Thus, an exploratory study was conducted among software practitioners in Malaysia to study their experiences and practices in real-world projects. This paper discusses the findings from the study, which involved 93 software practitioners. A structured questionnaire was utilized for data collection, whilst statistical methods such as frequency, mean, and cross tabulation were used for data analysis. Outcomes from this study reveal that software practitioners are becoming increasingly aware of the importance of secure software practices; however, they lack appropriate implementation, which could affect the quality of the produced software.
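
    As a small illustration of the frequency and cross-tabulation summaries mentioned above, a pandas sketch on invented survey responses might look like this.

      import pandas as pd

      # toy survey responses (all values invented)
      df = pd.DataFrame({
          "experience": ["<5 yrs", "5-10 yrs", ">10 yrs", "5-10 yrs", "<5 yrs", ">10 yrs"],
          "applies_secure_practices": ["No", "Yes", "Yes", "No", "No", "Yes"],
      })

      # frequency table of one response variable
      print(df["applies_secure_practices"].value_counts())

      # cross tabulation of experience against practice adoption
      print(pd.crosstab(df["experience"], df["applies_secure_practices"]))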

  12. [Development of Hospital Equipment Maintenance Information System].

    PubMed

    Zhou, Zhixin

    2015-11-01

    Hospital equipment maintenance information systems play an important role in improving medical treatment quality and efficiency. Based on a requirements analysis of hospital equipment maintenance, the system function diagram was drawn. From an analysis of the input and output data, tables, and reports connected with the equipment maintenance process, the relationships between entities and attributes were identified, the E-R diagram was drawn, and the relational database tables were established. Software development should meet the actual process requirements of maintenance and have a friendly user interface and flexible operation. The software can analyze failure causes by statistical analysis.

  13. How to choose the right statistical software?—a method increasing the post-purchase satisfaction

    PubMed Central

    2015-01-01

    Nowadays, we live in the “data era”, where the use of statistical or data analysis software is inevitable in any research field. This means that the choice of the right software tool or platform is a strategic issue for a research department. Nevertheless, in many cases decision makers do not pay sufficient attention to a comprehensive and appropriate evaluation of what the market offers. Indeed, the choice often still depends on a few factors such as the researcher's personal inclination, e.g., which software was used at university or is already known. This is not wrong in principle, but in some cases it is not enough and might lead to a “dead end” situation, typically after months or years of investment already made in the wrong software. This article, far from being a full and complete guide to statistical software evaluation, aims to illustrate some key points of the decision process and to introduce an extended range of factors which can help in making the right choice, at least potentially. There is not enough literature on this topic, which is most of the time underestimated, both in the traditional literature and even in the so-called “gray literature”, even if some documents or short pages can be found online. In any case, there seems to be no common, well-known standpoint on the process of software evaluation from the final user's perspective. We suggest a multi-factor analysis leading to an evaluation matrix tool, intended as a flexible and customizable tool, aimed at providing a clearer picture of the available software alternatives, not in the abstract but in relation to the researcher's own context and needs. This method is the result of about twenty years of the author's experience in evaluating and using technical-computing software, and partially arises from research on such topics carried out as part of a project funded by the European Commission under the Lifelong Learning Programme 2011. PMID:26793368
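
    The article describes the evaluation matrix conceptually; a toy weighted-scoring version, with package names, criteria, scores and weights all invented, could be set up with pandas as follows.

      import pandas as pd

      # Rows = candidate statistical packages, columns = evaluation criteria.
      # All names, scores (1-5) and weights are purely illustrative.
      scores = pd.DataFrame(
          {"licence_cost": [5, 2, 3],
           "learning_curve": [3, 4, 4],
           "statistical_coverage": [4, 5, 3],
           "support_and_docs": [3, 5, 4]},
          index=["Package A", "Package B", "Package C"],
      )
      weights = pd.Series(
          {"licence_cost": 0.2, "learning_curve": 0.3,
           "statistical_coverage": 0.35, "support_and_docs": 0.15}
      )

      # weighted total score per package, highest first
      weighted_total = (scores * weights).sum(axis=1)
      print(weighted_total.sort_values(ascending=False))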

  14. ODM Data Analysis-A tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data.

    PubMed

    Brix, Tobias Johannes; Bruland, Philipp; Sarfraz, Saad; Ernsting, Jan; Neuhaus, Philipp; Storck, Michael; Doods, Justin; Ständer, Sonja; Dugas, Martin

    2018-01-01

    A required step in presenting the results of clinical studies is the declaration of participants' demographic and baseline characteristics, as required by FDAAA 801. The common workflow to accomplish this task is to export the clinical data from the electronic data capture system used and import it into statistical software such as SAS or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item. These expenditures may become an obstacle for small studies. The objective of this work was to design, implement and evaluate an open source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data. The system requires clinical data in the CDISC Operational Data Model format. After a file is uploaded, its syntax and the data-type conformity of the collected data are validated. The completeness of the study data is determined and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies have been used to evaluate the application's performance and functionality. The system is implemented as an open source web application (available at https://odmanalysis.uni-muenster.de) and also provided as a Docker image, which enables easy distribution and installation on local systems. Study data are only stored in the application as long as the calculations are performed, which is compliant with data protection requirements. Analysis times are below half an hour, even for larger studies with over 6000 subjects. Medical experts have confirmed the usefulness of this application for obtaining an overview of their collected study data for monitoring purposes and for generating descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analysis of statisticians, but it can be used as a starting point for their examination and reporting.

  15. Statistical correlation of structural mode shapes from test measurements and NASTRAN analytical values

    NASA Technical Reports Server (NTRS)

    Purves, L.; Strang, R. F.; Dube, M. P.; Alea, P.; Ferragut, N.; Hershfeld, D.

    1983-01-01

    The software and procedures of a system of programs used to generate a report of the statistical correlation between NASTRAN modal analysis results and physical tests results from modal surveys are described. Topics discussed include: a mathematical description of statistical correlation, a user's guide for generating a statistical correlation report, a programmer's guide describing the organization and functions of individual programs leading to a statistical correlation report, and a set of examples including complete listings of programs, and input and output data.

  16. Statistical Analysis of an Infrared Thermography Inspection of Reinforced Carbon-Carbon

    NASA Technical Reports Server (NTRS)

    Comeaux, Kayla

    2011-01-01

    Each piece of flight hardware being used on the shuttle must be analyzed and pass NASA requirements before the shuttle is ready for launch. One tool used to detect cracks that lie within flight hardware is Infrared Flash Thermography. This is a non-destructive testing technique which uses an intense flash of light to heat up the surface of a material after which an Infrared camera is used to record the cooling of the material. Since cracks within the material obstruct the natural heat flow through the material, they are visible when viewing the data from the Infrared camera. We used Ecotherm, a software program, to collect data pertaining to the delaminations and analyzed the data using Ecotherm and University of Dayton Log Logistic Probability of Detection (POD) Software. The goal was to reproduce the statistical analysis produced by the University of Dayton software, by using scatter plots, log transforms, and residuals to test the assumption of normality for the residuals.
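
    The abstract only names the techniques; a generic way to check residual normality in Python, with simulated data standing in for the POD regression residuals, is sketched below.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(3)
      # stand-in data: log-transformed signal response vs. log flaw size
      log_size = np.linspace(0.0, 2.0, 60)
      log_resp = 1.5 * log_size + 0.3 + rng.normal(0, 0.2, size=log_size.size)

      # ordinary least squares fit and its residuals
      slope, intercept, r, p, se = stats.linregress(log_size, log_resp)
      residuals = log_resp - (slope * log_size + intercept)

      # Shapiro-Wilk test for normality of the residuals
      w_stat, p_value = stats.shapiro(residuals)
      print(f"Shapiro-Wilk W={w_stat:.3f}, p={p_value:.3f}")
      # a probability plot, e.g. stats.probplot(residuals), gives a visual check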

  17. Status Quo and Outlook of the Studies of Entrepreneurship Education in China: Statistics and Analysis Based on Papers Indexed in CSSCI (2004-2013)

    ERIC Educational Resources Information Center

    Xia, Tian; Shumin, Zhang; Yifeng, Wu

    2016-01-01

    We utilized cross tabulation statistics, word frequency counts, and content analysis of research output to conduct a bibliometric study, and used CiteSpace software to depict a knowledge map for research on entrepreneurship education in China from 2004 to 2013. The study shows that, in this duration, the study of Chinese entrepreneurship education…

  18. pyLIMA : an open source microlensing software

    NASA Astrophysics Data System (ADS)

    Bachelet, Etienne

    2017-01-01

    Planetary microlensing is a unique tool to detect cold planets around low-mass stars, and is approaching a watershed in discoveries as near-future missions incorporate dedicated surveys. NASA and ESA have decided to complement WFIRST-AFTA and Euclid with microlensing programs to enrich our statistics about this planetary population. Of the many challenges inherent in these missions, the data analysis is of primary importance, yet it is often perceived as a time-consuming, complex and daunting barrier to participation in the field. We present the first open source modeling software to conduct a microlensing analysis. This software is written in Python and uses existing packages as much as possible.

  19. QGene 4.0, an extensible Java QTL-analysis platform.

    PubMed

    Joehanes, Roby; Nelson, James C

    2008-12-01

    Of the many statistical methods developed to date for quantitative trait locus (QTL) analysis, only a limited subset are available in public software allowing their exploration, comparison and practical application by researchers. We have developed QGene 4.0, a plug-in platform that allows execution and comparison of a variety of modern QTL-mapping methods and supports third-party addition of new ones. The software accommodates line-cross mating designs consisting of any arbitrary sequence of selfing, backcrossing, intercrossing and haploid-doubling steps; includes map, population, and trait simulators; and is scriptable. Software and documentation are available at http://coding.plantpath.ksu.edu/qgene. Source code is available on request.

  20. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls

    PubMed Central

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique LCT; Heeren, Ron MA; Sillevis Smitt, Peter A; Luider, Theo M

    2006-01-01

    Background Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integrating the mass spectrometry data with the results of other experiments, such as microarray analysis, and with information from other databases requires central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. Results A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transport Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end-stage disease, can be distinguished from those of control groups. Conclusion The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large datasets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles. PMID:16953879
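
    The application delegates the Wilcoxon-Mann-Whitney test to R; the equivalent test on a single peptide peak could just as well be run with SciPy, as in this toy example with invented intensities.

      import numpy as np
      from scipy.stats import mannwhitneyu

      # toy intensities of one peptide peak in patient vs. control profiles
      patients = np.array([10.2, 12.5, 15.1, 14.0, 11.8, 16.3])
      controls = np.array([ 8.1,  7.9,  9.4, 10.0,  8.8,  9.1])

      stat, p_value = mannwhitneyu(patients, controls, alternative="two-sided")
      print(f"U={stat:.1f}, p={p_value:.4f}")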

  1. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

    PubMed

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique L C T; Heeren, Ron M A; Sillevis Smitt, Peter A; Luider, Theo M

    2006-09-05

    Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integrating the mass spectrometry data with the results of other experiments, such as microarray analysis, and with information from other databases requires central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transport Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end-stage disease, can be distinguished from those of control groups. The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large datasets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles.

  2. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories.

    PubMed

    McGibbon, Robert T; Beauchamp, Kyle A; Harrigan, Matthew P; Klein, Christoph; Swails, Jason M; Hernández, Carlos X; Schwantes, Christian R; Wang, Lee-Ping; Lane, Thomas J; Pande, Vijay S

    2015-10-20

    As molecular dynamics (MD) simulations continue to evolve into powerful computational tools for studying complex biomolecular systems, the necessity of flexible and easy-to-use software tools for the analysis of these simulations is growing. We have developed MDTraj, a modern, lightweight, and fast software package for analyzing MD simulations. MDTraj reads and writes trajectory data in a wide variety of commonly used formats. It provides a large number of trajectory analysis capabilities including minimal root-mean-square-deviation calculations, secondary structure assignment, and the extraction of common order parameters. The package has a strong focus on interoperability with the wider scientific Python ecosystem, bridging the gap between MD data and the rapidly growing collection of industry-standard statistical analysis and visualization tools in Python. MDTraj is a powerful and user-friendly software package that simplifies the analysis of MD data and connects these datasets with the modern interactive data science software ecosystem in Python. Copyright © 2015 Biophysical Society. Published by Elsevier Inc. All rights reserved.
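
    A minimal usage sketch, assuming a trajectory file trajectory.xtc and a topology file topology.pdb exist (both file names are placeholders), might look like this.

      import mdtraj as md

      # load a trajectory (file names are placeholders)
      traj = md.load("trajectory.xtc", top="topology.pdb")

      # root-mean-square deviation of every frame to the first frame, in nm
      rmsd = md.rmsd(traj, traj, frame=0)

      # per-residue secondary structure assignment (DSSP)
      dssp = md.compute_dssp(traj)

      print(traj, rmsd[:5], dssp.shape)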

  3. Software phantom with realistic speckle modeling for validation of image analysis methods in echocardiography

    NASA Astrophysics Data System (ADS)

    Law, Yuen C.; Tenbrinck, Daniel; Jiang, Xiaoyi; Kuhlen, Torsten

    2014-03-01

    Computer-assisted processing and interpretation of medical ultrasound images is one of the most challenging tasks within image analysis. Physical phenomena in ultrasonographic images, e.g., the characteristic speckle noise and shadowing effects, make the majority of standard methods from image analysis non-optimal. Furthermore, validation of adapted computer vision methods proves to be difficult due to missing ground truth information. There is no widely accepted software phantom in the community, and existing software phantoms are not flexible enough to support the use of specific speckle models for different tissue types, e.g., muscle and fat tissue. In this work we propose an anatomical software phantom with a realistic speckle pattern simulation to fill this gap and provide a flexible tool for validation purposes in medical ultrasound image analysis. We discuss the generation of speckle patterns and perform statistical analysis of the simulated textures to obtain quantitative measures of the realism and accuracy of the resulting textures.
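
    The abstract does not specify the speckle model used; the simplest common approximation, multiplicative unit-mean noise applied to a synthetic echogenicity map, can be simulated in a few lines of Python (all parameters are illustrative and this is not the authors' phantom).

      import numpy as np

      rng = np.random.default_rng(11)

      # synthetic echogenicity map: bright circular "tissue" on a darker background
      h, w = 256, 256
      yy, xx = np.mgrid[0:h, 0:w]
      echo = np.where((yy - h / 2) ** 2 + (xx - w / 2) ** 2 < 60 ** 2, 0.8, 0.3)

      # multiplicative speckle: unit-mean gamma noise, smaller shape = grainier texture
      shape = 4.0
      speckle = rng.gamma(shape, 1.0 / shape, size=echo.shape)
      simulated = echo * speckle

      print(simulated.mean(), simulated.std())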

  4. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories

    PubMed Central

    McGibbon, Robert T.; Beauchamp, Kyle A.; Harrigan, Matthew P.; Klein, Christoph; Swails, Jason M.; Hernández, Carlos X.; Schwantes, Christian R.; Wang, Lee-Ping; Lane, Thomas J.; Pande, Vijay S.

    2015-01-01

    As molecular dynamics (MD) simulations continue to evolve into powerful computational tools for studying complex biomolecular systems, the necessity of flexible and easy-to-use software tools for the analysis of these simulations is growing. We have developed MDTraj, a modern, lightweight, and fast software package for analyzing MD simulations. MDTraj reads and writes trajectory data in a wide variety of commonly used formats. It provides a large number of trajectory analysis capabilities including minimal root-mean-square-deviation calculations, secondary structure assignment, and the extraction of common order parameters. The package has a strong focus on interoperability with the wider scientific Python ecosystem, bridging the gap between MD data and the rapidly growing collection of industry-standard statistical analysis and visualization tools in Python. MDTraj is a powerful and user-friendly software package that simplifies the analysis of MD data and connects these datasets with the modern interactive data science software ecosystem in Python. PMID:26488642

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Apte, A; Veeraraghavan, H; Oh, J

    Purpose: To present an open source and free platform to facilitate radiomics research: the “Radiomics toolbox” in CERR. Method: There is a scarcity of open source tools that support end-to-end modeling of image features to predict patient outcomes. The “Radiomics toolbox” strives to fill the need for such a software platform. The platform supports (1) import of various kinds of image modalities such as CT, PET, MR, SPECT and US; (2) contouring tools to delineate structures of interest; (3) extraction and storage of image-based features such as first-order statistics, gray-level co-occurrence and zone-size matrix based texture features, and shape features; and (4) statistical analysis. Statistical analysis of the extracted features is supported with basic functionality that includes univariate correlations and Kaplan-Meier curves, and advanced functionality that includes feature reduction and multivariate modeling. The graphical user interface and the data management are implemented in Matlab for ease of development and readability of the code and features for a wide audience. Open-source software developed in other programming languages is integrated to enhance various components of this toolbox, for example Java-based DCM4CHE for DICOM import and R for statistical analysis. Results: The Radiomics toolbox will be distributed as open source, GNU-copyrighted software. The toolbox was prototyped for modeling an oropharyngeal PET dataset at MSKCC; that analysis will be presented in a separate paper. Conclusion: The Radiomics toolbox provides an extensible platform for extracting and modeling image features. To emphasize new uses of CERR for radiomics and image-based research, we have changed the name from the “Computational Environment for Radiotherapy Research” to the “Computational Environment for Radiological Research”.
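
    The toolbox itself is Matlab-based; to make the first-order-statistics step concrete, a Python sketch computing a few such features inside a region of interest might look like the following, with a synthetic image and mask standing in for real data.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      image = rng.normal(100, 20, size=(64, 64, 16))      # synthetic 3D image
      mask = np.zeros_like(image, dtype=bool)
      mask[20:40, 20:40, 4:12] = True                      # synthetic ROI / structure

      voxels = image[mask]
      features = {
          "mean": voxels.mean(),
          "std": voxels.std(ddof=1),
          "min": voxels.min(),
          "max": voxels.max(),
          "skewness": stats.skew(voxels),
          "kurtosis": stats.kurtosis(voxels),
          "energy": np.sum(voxels.astype(float) ** 2),
      }
      for name, value in features.items():
          print(f"{name}: {value:.3f}")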

  6. IMPLEMENTATION AND VALIDATION OF STATISTICAL TESTS IN RESEARCH'S SOFTWARE HELPING DATA COLLECTION AND PROTOCOLS ANALYSIS IN SURGERY.

    PubMed

    Kuretzki, Carlos Henrique; Campos, Antônio Carlos Ligocki; Malafaia, Osvaldo; Soares, Sandramara Scandelari Kusano de Paula; Tenório, Sérgio Bernardo; Timi, Jorge Rufino Ribas

    2016-03-01

    The use of information technology is often applied in healthcare. With regard to scientific research, the SINPE(c) - Integrated Electronic Protocols - was created as a tool to support researchers by offering clinical data standardization. Until then, SINPE(c) lacked statistical tests obtained by automatic analysis. The objective was to add to SINPE(c) features for the automatic execution of the main statistical methods used in medicine. The study was divided into four topics: checking the interest of users in the implementation of the tests; surveying the frequency of their use in healthcare; carrying out the implementation; and validating the results with researchers and their protocols. It was applied to a group of users of this software working on their theses in stricto sensu master's and doctoral degrees in one postgraduate program in surgery. To assess the reliability of the statistics, the data obtained automatically by SINPE(c) were compared with those computed manually by a statistics professional experienced with this type of study. There was interest in the use of automatic statistical tests, with good acceptance. The chi-square, Mann-Whitney, Fisher exact and Student's t tests were considered the tests most frequently used by participants in medical studies. These methods were implemented and thereafter approved as expected. The incorporation of automatic statistical analysis into SINPE(c) was shown to be reliable and equivalent to the manual analysis, validating its use as a tool for medical research.
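
    The four tests named in the abstract are all available in SciPy; a compact illustration on made-up data (not the SINPE(c) implementation itself) follows.

      import numpy as np
      from scipy import stats

      # chi-square test of independence on a 2x3 contingency table
      table = np.array([[20, 15, 30],
                        [25, 10, 18]])
      chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

      # Fisher's exact test on a 2x2 table
      odds_ratio, p_fisher = stats.fisher_exact([[8, 2], [1, 5]])

      # Student's t test and Mann-Whitney test on two toy samples
      a = np.array([5.1, 4.9, 6.0, 5.5, 5.8, 5.2])
      b = np.array([4.2, 4.5, 4.1, 4.8, 4.4, 4.6])
      t_stat, p_t = stats.ttest_ind(a, b)
      u_stat, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")

      print(p_chi2, p_fisher, p_t, p_u)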

  7. FluxPyt: a Python-based free and open-source software for 13C-metabolic flux analyses.

    PubMed

    Desai, Trunil S; Srivastava, Shireesh

    2018-01-01

    13C-Metabolic flux analysis (MFA) is a powerful approach to estimate intracellular reaction rates which could be used in strain analysis and design. Processing and analysis of labeling data for calculation of fluxes and associated statistics is an essential part of MFA. However, various software currently available for data analysis employ proprietary platforms and thus limit accessibility. We developed FluxPyt, a Python-based truly open-source software package for conducting stationary 13C-MFA data analysis. The software is based on the efficient elementary metabolite unit framework. The standard deviations in the calculated fluxes are estimated using the Monte-Carlo analysis. FluxPyt also automatically creates flux maps based on a template for visualization of the MFA results. The flux distributions calculated by FluxPyt for two separate models: a small tricarboxylic acid cycle model and a larger Corynebacterium glutamicum model, were found to be in good agreement with those calculated by a previously published software. FluxPyt was tested in Microsoft™ Windows 7 and 10, as well as in Linux Mint 18.2. The availability of a free and open 13C-MFA software that works in various operating systems will enable more researchers to perform 13C-MFA and to further modify and develop the package.

  8. FluxPyt: a Python-based free and open-source software for 13C-metabolic flux analyses

    PubMed Central

    Desai, Trunil S.

    2018-01-01

    13C-Metabolic flux analysis (MFA) is a powerful approach to estimate intracellular reaction rates which could be used in strain analysis and design. Processing and analysis of labeling data for calculation of fluxes and associated statistics is an essential part of MFA. However, various software currently available for data analysis employ proprietary platforms and thus limit accessibility. We developed FluxPyt, a Python-based truly open-source software package for conducting stationary 13C-MFA data analysis. The software is based on the efficient elementary metabolite unit framework. The standard deviations in the calculated fluxes are estimated using the Monte-Carlo analysis. FluxPyt also automatically creates flux maps based on a template for visualization of the MFA results. The flux distributions calculated by FluxPyt for two separate models: a small tricarboxylic acid cycle model and a larger Corynebacterium glutamicum model, were found to be in good agreement with those calculated by a previously published software. FluxPyt was tested in Microsoft™ Windows 7 and 10, as well as in Linux Mint 18.2. The availability of a free and open 13C-MFA software that works in various operating systems will enable more researchers to perform 13C-MFA and to further modify and develop the package. PMID:29736347

  9. Learning Outcomes in a Laboratory Environment vs. Classroom for Statistics Instruction: An Alternative Approach Using Statistical Software

    ERIC Educational Resources Information Center

    McCulloch, Ryan Sterling

    2017-01-01

    The role of any statistics course is to increase the understanding and comprehension of statistical concepts and those goals can be achieved via both theoretical instruction and statistical software training. However, many introductory courses either forego advanced software usage, or leave its use to the student as a peripheral activity. The…

  10. Alterations/corrections to the BRASS Program

    NASA Technical Reports Server (NTRS)

    Brand, S. N.

    1985-01-01

    Corrections applied to statistical programs contained in two subroutines of the Bed Rest Analysis Software System (BRASS) are summarized. Two subroutines independently calculate significant values within the BRASS program.

  11. QPROT: Statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics.

    PubMed

    Choi, Hyungwon; Kim, Sinae; Fermin, Damian; Tsou, Chih-Chiang; Nesvizhskii, Alexey I

    2015-11-03

    We introduce QPROT, a statistical framework and computational tool for differential protein expression analysis using protein intensity data. QPROT is an extension of the QSPEC suite, originally developed for spectral count data, adapted for analysis using continuously measured protein-level intensity data. QPROT offers a new intensity normalization procedure and model-based differential expression analysis, both of which account for missing data. Determination of differential expression of each protein is based on the standardized Z-statistic derived from the posterior distribution of the log fold change parameter, guided by the false discovery rate estimated by a well-known Empirical Bayes method. We evaluated the classification performance of QPROT using the quantification calibration data from the clinical proteomic technology assessment for cancer (CPTAC) study and a recently published Escherichia coli benchmark dataset, with evaluation of FDR accuracy in the latter. QPROT is a statistical framework with a computational software tool for comparative quantitative proteomics analysis. It features various extensions of the QSPEC method originally built for spectral count data analysis, including probabilistic treatment of missing values in protein intensity data. With the increasing popularity of label-free quantitative proteomics data, the proposed method and accompanying software suite will be immediately useful for many proteomics laboratories. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.

  12. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  13. Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics

    PubMed Central

    2011-01-01

    Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
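
    To make the quantities concrete, the standard Cochran Q and I2 statistics can be computed from study-level effect estimates and standard errors as in the following sketch (the numbers are invented and this is not the authors' software). Roughly speaking, the generalised Q approaches referred to above replace the weights 1/se_i^2 with 1/(se_i^2 + tau^2) and solve for the tau that brings Q in line with its expectation; that refinement is omitted here.

        # Sketch: Cochran's Q and I^2 for a fixed-effect meta-analysis.
        import numpy as np

        y  = np.array([0.10, 0.30, -0.05, 0.25])   # hypothetical study effect estimates
        se = np.array([0.12, 0.15, 0.10, 0.20])    # their standard errors

        w     = 1.0 / se**2                        # inverse-variance weights
        mu_fe = np.sum(w * y) / np.sum(w)          # fixed-effect pooled estimate
        Q     = np.sum(w * (y - mu_fe) ** 2)       # Cochran's Q
        df    = len(y) - 1
        I2    = max(0.0, (Q - df) / Q) * 100.0     # I^2 as a percentage

        print(f"pooled estimate = {mu_fe:.3f}, Q = {Q:.2f} on {df} df, I^2 = {I2:.1f}%")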

  14. Incorporating Code-Based Software in an Introductory Statistics Course

    ERIC Educational Resources Information Center

    Doehler, Kirsten; Taylor, Laura

    2015-01-01

    This article is based on the experiences of two statistics professors who have taught students to write and effectively utilize code-based software in a college-level introductory statistics course. Advantages of using software and code-based software in this context are discussed. Suggestions are made on how to ease students into using code with…

  15. Astrophysics

    Science.gov Websites

    [Fragment of an arXiv subject-category listing: high-energy topics (microquasars, neutron stars, pulsars, black holes); astro-ph.IM - Instrumentation and Methods for Astrophysics (methods for data analysis, statistical methods, software, database design); astro-ph.SR - Solar and Stellar Astrophysics.]

  16. Software Used to Generate Cancer Statistics - SEER Cancer Statistics

    Cancer.gov

    Videos that highlight topics and trends in cancer statistics and definitions of statistical terms. Also software tools for analyzing and reporting cancer statistics, which are used to compile SEER's annual reports.

  17. Statistical modelling of software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1991-01-01

    During the six-month period from 1 April 1991 to 30 September 1991 the following research papers in statistical modeling of software reliability appeared: (1) A Nonparametric Software Reliability Growth Model; (2) On the Use and the Performance of Software Reliability Growth Models; (3) Research and Development Issues in Software Reliability Engineering; (4) Special Issues on Software; and (5) Software Reliability and Safety.

  18. MetaComp: comprehensive analysis software for comparative meta-omics including comparative metagenomics.

    PubMed

    Zhai, Peng; Yang, Longshu; Guo, Xiao; Wang, Zhe; Guo, Jiangtao; Wang, Xiaoqi; Zhu, Huaiqiu

    2017-10-02

    During the past decade, the development of high-throughput nucleic acid sequencing and mass spectrometry techniques has enabled the characterization of microbial communities through metagenomics, metatranscriptomics, metaproteomics and metabolomics data. To reveal the diversity of microbial communities and the interactions between living conditions and microbes, comparative analysis integrating all four types of data is needed. Comparative meta-omics, especially comparative metagenomics, has been established as a routine process for highlighting significant differences in taxon composition and functional gene abundance among microbiota samples. Meanwhile, biologists are increasingly concerned with the correlations between meta-omics features and environmental factors, which may further decipher the adaptation strategy of a microbial community. We developed a graphical comprehensive analysis software package named MetaComp, comprising a series of statistical analysis approaches with visualized results for metagenomics and other meta-omics data comparison. The software can read files generated by a variety of upstream programs. After data loading, it offers analyses such as multivariate statistics, hypothesis tests for two-sample, multi-sample and two-group designs, and a novel function, regression analysis of environmental factors. Here, regression analysis treats meta-omic features as independent variables and environmental factors as dependent variables. Moreover, MetaComp automatically chooses an appropriate two-group sample test based on the traits of the input abundance profiles. We further evaluate the performance of this choice and exhibit applications for metagenomics, metaproteomics and metabolomics samples. MetaComp, an integrative software package applicable to all meta-omics data, distills the influence of the living environment on a microbial community through regression analysis. Moreover, since its automatically chosen two-group sample test is verified to perform well, MetaComp is friendly to users without statistical training. These improvements aim to overcome the new challenges that the big-data era poses for all meta-omics data. MetaComp is available at: http://cqb.pku.edu.cn/ZhuLab/MetaComp/ and https://github.com/pzhaipku/MetaComp/ .
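
    The idea of choosing a two-group test automatically from the traits of the abundance profiles can be sketched as follows. The decision rule here (a Shapiro-Wilk normality screen followed by Welch's t-test or the Mann-Whitney U test) is a common heuristic and is not necessarily the rule MetaComp itself implements; the data are synthetic.

        # Sketch: pick a two-group test based on simple traits of the data.
        import numpy as np
        from scipy import stats

        def two_group_test(a, b, alpha=0.05):
            """Return (test_name, p_value) for comparing two abundance vectors."""
            a, b = np.asarray(a, float), np.asarray(b, float)
            normal = (stats.shapiro(a).pvalue > alpha) and (stats.shapiro(b).pvalue > alpha)
            if normal:
                return "welch_t", stats.ttest_ind(a, b, equal_var=False).pvalue
            return "mann_whitney_u", stats.mannwhitneyu(a, b, alternative="two-sided").pvalue

        rng = np.random.default_rng(1)
        group1 = rng.lognormal(mean=0.0, sigma=1.0, size=20)   # skewed abundance profiles
        group2 = rng.lognormal(mean=0.5, sigma=1.0, size=20)
        print(two_group_test(group1, group2))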

  19. Image analysis library software development

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr.; Bryant, J.

    1977-01-01

    The Image Analysis Library consists of a collection of general purpose mathematical/statistical routines and special purpose data analysis/pattern recognition routines basic to the development of image analysis techniques for support of current and future Earth Resources Programs. Work was done to provide a collection of computer routines and associated documentation which form a part of the Image Analysis Library.

  20. Powerlaw: a Python package for analysis of heavy-tailed distributions.

    PubMed

    Alstott, Jeff; Bullmore, Ed; Plenz, Dietmar

    2014-01-01

    Power laws are theoretically interesting probability distributions that are also frequently used to describe empirical data. In recent years, effective statistical methods for fitting power laws have been developed, but appropriate use of these techniques requires significant programming and statistical insight. In order to greatly decrease the barriers to using good statistical methods for fitting power law distributions, we developed the powerlaw Python package. This software package provides easy commands for basic fitting and statistical analysis of distributions. Notably, it also seeks to support a variety of user needs by being exhaustive in the options available to the user. The source code is publicly available and easily extensible.
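
    Typical use of the package follows the pattern below on a synthetic heavy-tailed sample. The calls shown (Fit, power_law.alpha, power_law.xmin and distribution_compare) reflect the package's documented interface, but treat this as a sketch and check the current documentation before relying on it.

        # Sketch: basic power-law fit and model comparison with the powerlaw package.
        import numpy as np
        import powerlaw

        rng = np.random.default_rng(0)
        data = (1.0 - rng.random(5000)) ** (-1.0 / 1.5)   # synthetic Pareto-like sample

        fit = powerlaw.Fit(data)                  # estimates xmin and the exponent alpha
        print("alpha =", fit.power_law.alpha)
        print("xmin  =", fit.power_law.xmin)

        # Likelihood-ratio comparison against an alternative heavy-tailed model.
        R, p = fit.distribution_compare('power_law', 'lognormal')
        print("loglikelihood ratio R =", R, "p =", p)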

  1. Gis-Based Spatial Statistical Analysis of College Graduates Employment

    NASA Astrophysics Data System (ADS)

    Tang, R.

    2012-07-01

    Awareness of the distribution and employment status of college graduates is urgently needed for proper allocation of human resources and overall planning of strategic industries. This study provides empirical evidence on the use of geocoding and spatial analysis to examine the distribution and employment status of college graduates, based on 2004-2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding in ArcGIS, and stepwise multiple linear regression in SPSS was used to predict employment and to identify spatially associated enterprise and professional demand in the future. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008 and tended to spread southeastward. Furthermore, the regression models suggest that the specialty graduates major in has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a promising tool for human resource development research.

  2. Statistical Validation for Clinical Measures: Repeatability and Agreement of Kinect™-Based Software.

    PubMed

    Lopez, Natalia; Perez, Elisa; Tello, Emanuel; Rodrigo, Alejandro; Valentinuzzi, Max E

    2018-01-01

    The rehabilitation process is a fundamental stage for recovery of people's capabilities. However, the evaluation of the process is performed by physiatrists and medical doctors, mostly based on their observations, that is, a subjective appreciation of the patient's evolution. This paper proposes a platform for tracking the movement of an individual's upper limb using Kinect sensor(s), to be applied to the patient during the rehabilitation process. The main contribution is the development of quantifying software and the statistical validation of its performance, repeatability, and clinical use in the rehabilitation process. The software determines joint angles and upper limb trajectories for the construction of a specific rehabilitation protocol and quantifies the treatment evolution. In turn, the information is presented via a graphical interface that allows the recording, storage, and report of the patient's data. For clinical purposes, the software information is statistically validated with three different methodologies, comparing the measures with a goniometer in terms of agreement and repeatability. The agreement of joint angles measured with the proposed software and goniometer is evaluated with Bland-Altman plots; all measurements fell well within the limits of agreement, meaning interchangeability of both techniques. Additionally, the results of Bland-Altman analysis of repeatability show 95% confidence. Finally, the physiotherapists' qualitative assessment shows encouraging results for the clinical use. The main conclusion is that the software is capable of offering a clinical history of the patient and is useful for quantification of the rehabilitation success. The simplicity, low cost, and visualization possibilities enhance the use of the Kinect-based software for rehabilitation and other applications, and the expert's opinion endorses the choice of our approach for clinical practice. Comparison of the new measurement technique with established goniometric methods shows that the proposed software agrees with them sufficiently to be used interchangeably.
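
    For reference, the Bland-Altman agreement analysis mentioned above reduces to the mean difference (bias) between the two methods and the 95% limits of agreement, roughly as in the sketch below. The angle values are invented; this is not the authors' code.

        # Sketch: Bland-Altman bias and 95% limits of agreement for two measurement methods.
        import numpy as np

        kinect     = np.array([88.0, 92.5, 75.1, 101.3, 95.7, 83.2])  # hypothetical joint angles (deg)
        goniometer = np.array([87.2, 93.0, 76.0, 100.1, 96.5, 82.0])

        diff = kinect - goniometer
        bias = diff.mean()
        sd   = diff.std(ddof=1)
        low, high = bias - 1.96 * sd, bias + 1.96 * sd

        print(f"bias = {bias:.2f} deg, 95% limits of agreement = [{low:.2f}, {high:.2f}] deg")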

  3. Mathematical and Statistical Software Index. Final Report.

    ERIC Educational Resources Information Center

    Black, Doris E., Comp.

    Brief descriptions are provided of general-purpose mathematical and statistical software, including 27 "stand-alone" programs, three subroutine systems, and two nationally recognized statistical packages, which are available in the Air Force Human Resources Laboratory (AFHRL) software library. This index was created to enable researchers…

  4. The Statistical Consulting Center for Astronomy (SCCA)

    NASA Technical Reports Server (NTRS)

    Akritas, Michael

    2001-01-01

    The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi(sup 2) minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools and with matching these tools to specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression, density estimation and smoothing, general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997. It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.

  5. The discounting model selector: Statistical software for delay discounting applications.

    PubMed

    Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A

    2017-05-01

    Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executed approximate Bayesian model selection methods from user-supplied temporal discounting data and computed the effective delay 50 (ED50) from the best performing model. Software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). The results of independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of model-selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-sourced software are discussed, and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.
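
    To make the ED50 quantity concrete: for the simple hyperbolic discounting model V = A / (1 + kD), the delay at which value falls to half of A is ED50 = 1/k. The sketch below fits that model by ordinary least squares to invented indifference points; it does not reproduce the approximate Bayesian model selection used by the Discounting Model Selector.

        # Sketch: fit a hyperbolic discounting model and report ED50 = 1/k.
        import numpy as np
        from scipy.optimize import curve_fit

        def hyperbolic(delay, k):
            # Mazur's hyperbolic model with the delayed amount normalized to 1.
            return 1.0 / (1.0 + k * delay)

        delays = np.array([1, 7, 30, 90, 180, 365], dtype=float)   # days
        values = np.array([0.95, 0.80, 0.55, 0.35, 0.22, 0.15])    # hypothetical indifference points

        (k_hat,), _ = curve_fit(hyperbolic, delays, values, p0=(0.01,))
        print(f"k = {k_hat:.4f}  ->  ED50 = {1.0 / k_hat:.1f} days")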

  6. The Development of Design Tools for Fault Tolerant Quantum Dot Cellular Automata Based Logic

    NASA Technical Reports Server (NTRS)

    Armstrong, Curtis D.; Humphreys, William M.

    2003-01-01

    We are developing software to explore the fault tolerance of quantum dot cellular automata gate architectures in the presence of manufacturing variations and device defects. The Topology Optimization Methodology using Applied Statistics (TOMAS) framework extends the capabilities of the A Quantum Interconnected Network Array Simulator (AQUINAS) by adding front-end and back-end software and creating an environment that integrates all of these components. The front-end tools establish all simulation parameters, configure the simulation system, automate the Monte Carlo generation of simulation files, and execute the simulation of these files. The back-end tools perform automated data parsing, statistical analysis and report generation.
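
    The front-end workflow described above, drawing random manufacturing variations and writing one simulation input file per draw, can be sketched generically as below. The file format, parameter names and tolerances are invented and do not reflect AQUINAS's actual input format.

        # Sketch: Monte-Carlo generation of simulation input files with randomized device parameters.
        import json
        import numpy as np
        from pathlib import Path

        rng = np.random.default_rng(42)
        nominal   = {"dot_spacing_nm": 20.0, "cell_spacing_nm": 60.0, "temperature_K": 1.0}
        tolerance = {"dot_spacing_nm": 0.5, "cell_spacing_nm": 2.0, "temperature_K": 0.05}

        outdir = Path("mc_runs")
        outdir.mkdir(exist_ok=True)
        for i in range(100):
            params = {k: float(rng.normal(v, tolerance[k])) for k, v in nominal.items()}
            (outdir / f"run_{i:03d}.json").write_text(json.dumps(params, indent=2))
        print("wrote", len(list(outdir.glob("run_*.json"))), "simulation input files")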

  7. Monitoring and Evaluation: Statistical Support for Life-cycle Studies, Annual Report 2003.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skalski, John

    2003-11-01

    The ongoing mission of this project is the development of statistical tools for analyzing fisheries tagging data in the most precise and appropriate manner possible. This mission also includes providing statistical guidance on the best ways to design large-scale tagging studies. This mission continues because the technologies for conducting fish tagging studies continuously evolve. In just the last decade, fisheries biologists have seen the evolution from freeze-brands and coded wire tags (CWT) to passive integrated transponder (PIT) tags, balloon-tags, radiotelemetry, and now, acoustic-tags. With each advance, the technology holds the promise of more detailed and precise information. However, the technology for analyzing and interpreting the data also becomes more complex as the tagging techniques become more sophisticated. The goal of the project is to develop the analytical tools in parallel with the technical advances in tagging studies, so that maximum information can be extracted on a timely basis. Associated with this mission is the transfer of these analytical capabilities to the field investigators to assure consistency and the highest levels of design and analysis throughout the fisheries community. Consequently, this project provides detailed technical assistance on the design and analysis of tagging studies to groups requesting assistance throughout the fisheries community. Ideally, each project and each investigator would invest in the statistical support needed for the successful completion of their study. However, this is an ideal that is rarely if ever attained. Furthermore, there is only a small pool of highly trained scientists in this specialized area of tag analysis here in the Northwest. Project 198910700 provides the financial support to sustain this local expertise on the statistical theory of tag analysis at the University of Washington and make it available to the fisheries community. Piecemeal and fragmented support from various agencies and organizations would be incapable of maintaining a center of expertise. The mission of the project is to help assure tagging studies are designed and analyzed from the onset to extract the best available information using state-of-the-art statistical methods. The overarching goal of the project is to assure statistically sound survival studies so that fish managers can focus on the management implications of their findings and not be distracted by concerns whether the studies are statistically reliable or not. Specific goals and objectives of the study include the following: (1) Provide consistent application of statistical methodologies for survival estimation across all salmon life cycle stages to assure comparable performance measures and assessment of results through time, to maximize learning and adaptive management opportunities, and to improve and maintain the ability to responsibly evaluate the success of implemented Columbia River FWP salmonid mitigation programs and identify future mitigation options. (2) Improve analytical capabilities to conduct research on survival processes of wild and hatchery chinook and steelhead during smolt outmigration, to improve monitoring and evaluation capabilities and assist in-season river management to optimize operational and fish passage strategies to maximize survival. (3) Extend statistical support to estimate ocean survival and in-river survival of returning adults. Provide statistical guidance in implementing a river-wide adult PIT-tag detection capability.
(4) Develop statistical methods for survival estimation for all potential users and make this information available through peer-reviewed publications, statistical software, and technology transfers to organizations such as NOAA Fisheries, the Fish Passage Center, US Fish and Wildlife Service, US Geological Survey (USGS), US Army Corps of Engineers (USACE), Public Utility Districts (PUDs), the Independent Scientific Advisory Board (ISAB), and other members of the Northwest fisheries community. (5) Provide and maintain statistical software for tag analysis and user support. (6) Provide improvements in statistical theory and software as requested by user groups. These improvements include extending software capabilities to address new research issues, adapting tagging techniques to new study designs, and extending the analysis capabilities to new technologies such as radio-tags and acoustic-tags.

  8. Is the spatial distribution of brain lesions associated with closed-head injury predictive of subsequent development of attention-deficit/hyperactivity disorder? Analysis with brain-image database

    NASA Technical Reports Server (NTRS)

    Herskovits, E. H.; Megalooikonomou, V.; Davatzikos, C.; Chen, A.; Bryan, R. N.; Gerring, J. P.

    1999-01-01

    PURPOSE: To determine whether there is an association between the spatial distribution of lesions detected at magnetic resonance (MR) imaging of the brain in children after closed-head injury and the development of secondary attention-deficit/hyperactivity disorder (ADHD). MATERIALS AND METHODS: Data obtained from 76 children without prior history of ADHD were analyzed. MR images were obtained 3 months after closed-head injury. After manual delineation of lesions, images were registered to the Talairach coordinate system. For each subject, registered images and secondary ADHD status were integrated into a brain-image database, which contains depiction (visualization) and statistical analysis software. Using this database, we assessed visually the spatial distributions of lesions and performed statistical analysis of image and clinical variables. RESULTS: Of the 76 children, 15 developed secondary ADHD. Depiction of the data suggested that children who developed secondary ADHD had more lesions in the right putamen than children who did not develop secondary ADHD; this impression was confirmed statistically. After Bonferroni correction, we could not demonstrate significant differences between secondary ADHD status and lesion burdens for the right caudate nucleus or the right globus pallidus. CONCLUSION: Closed-head injury-induced lesions in the right putamen in children are associated with subsequent development of secondary ADHD. Depiction software is useful in guiding statistical analysis of image data.
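
    The region-by-region association test with Bonferroni correction that the abstract describes can be illustrated schematically as follows. The lesion counts are fabricated (they only respect the reported totals of 76 children and 15 secondary-ADHD cases), and the study's actual procedure within its brain-image database may have differed.

        # Sketch: association between lesion presence per region and secondary ADHD,
        # with Bonferroni correction across regions.
        from scipy.stats import fisher_exact

        # Hypothetical 2x2 tables: [[lesion & ADHD, lesion & no ADHD],
        #                           [no lesion & ADHD, no lesion & no ADHD]]
        tables = {
            "right putamen":         [[8, 10], [7, 51]],
            "right caudate":         [[4, 12], [11, 49]],
            "right globus pallidus": [[3, 9],  [12, 52]],
        }

        alpha = 0.05 / len(tables)          # Bonferroni-corrected threshold
        for region, table in tables.items():
            _, p = fisher_exact(table)
            print(f"{region}: p = {p:.4f}  significant after correction: {p < alpha}")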

  9. Melanie II--a third-generation software package for analysis of two-dimensional electrophoresis images: I. Features and user interface.

    PubMed

    Appel, R D; Palagi, P M; Walther, D; Vargas, J R; Sanchez, J C; Ravier, F; Pasquali, C; Hochstrasser, D F

    1997-12-01

    Although two-dimensional electrophoresis (2-DE) computer analysis software packages have existed ever since 2-DE technology was developed, it is only now that the hardware and software technology allows large-scale studies to be performed on low-cost personal computers or workstations, and that setting up a 2-DE computer analysis system in a small laboratory is no longer considered a luxury. After a first attempt in the seventies and early eighties to develop 2-DE analysis software systems on hardware that had poor or even no graphical capabilities, followed in the late eighties by a wave of innovative software developments that were possible thanks to new graphical interface standards such as XWindows, a third generation of 2-DE analysis software packages has now come to maturity. It can be run on a variety of low-cost, general-purpose personal computers, thus making the purchase of a 2-DE analysis system easily attainable for even the smallest laboratory that is involved in proteome research. Melanie II 2-D PAGE, developed at the University Hospital of Geneva, is such a third-generation software system for 2-DE analysis. Based on unique image processing algorithms, this user-friendly object-oriented software package runs on multiple platforms, including Unix, MS-Windows 95 and NT, and Power Macintosh. It provides efficient spot detection and quantitation, state-of-the-art image comparison, statistical data analysis facilities, and is Internet-ready. Linked to proteome databases such as those available on the World Wide Web, it represents a valuable tool for the "Virtual Lab" of the post-genome area.

  10. Practicality of Elementary Statistics Module Based on CTL Completed by Instructions on Using Software R

    NASA Astrophysics Data System (ADS)

    Delyana, H.; Rismen, S.; Handayani, S.

    2018-04-01

    This research is development research using the 4-D design model (define, design, develop, and disseminate). The define stage comprised four needs analyses: syllabus analysis, textbook analysis, student characteristics analysis, and literature analysis. The textbook analysis showed that students still have difficulty understanding the two required textbooks, that the form of presentation does not help students learn concepts independently, and that the textbooks provide no guidance on data processing with the software R. The developed module was judged valid by experts. Field trials were then conducted to determine its practicality and effectiveness. The trial involved four randomly selected students of the Mathematics Education Study Program of STKIP PGRI who had not yet taken the Basic Statistics course. The practicality aspects considered were ease of use, time efficiency, ease of interpretation, and equivalence, with practicality scores of 3.7, 3.79, 3.7, and 3.78, respectively. Based on the trial, students considered the module very practical for use in learning, indicating that the developed module can be used by students in Elementary Statistics learning.

  11. A Survey of Popular R Packages for Cluster Analysis

    ERIC Educational Resources Information Center

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  12. Spotlight-8 Image Analysis Software

    NASA Technical Reports Server (NTRS)

    Klimek, Robert; Wright, Ted

    2006-01-01

    Spotlight is a cross-platform GUI-based software package designed to perform image analysis on sequences of images generated by combustion and fluid physics experiments run in a microgravity environment. Spotlight can perform analysis on a single image in an interactive mode or perform analysis on a sequence of images in an automated fashion. Image processing operations can be employed to enhance the image before various statistics and measurement operations are performed. An arbitrarily large number of objects can be analyzed simultaneously with independent areas of interest. Spotlight saves results in a text file that can be imported into other programs for graphing or further analysis. Spotlight can be run on Microsoft Windows, Linux, and Apple OS X platforms.

  13. Understanding software faults and their role in software reliability modeling

    NASA Technical Reports Server (NTRS)

    Munson, John C.

    1994-01-01

    This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. 
There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
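
    A minimal sketch of the principal-components step described above, mapping correlated raw metrics onto a few orthogonal domain scores, is shown below using standardized metrics and scikit-learn's PCA. The metric values are invented.

        # Sketch: principal components analysis of correlated software metrics.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.preprocessing import StandardScaler

        # Rows = program modules; columns = raw metrics (LOC, statements, cyclomatic complexity, operands).
        metrics = np.array([
            [120,  95, 10,  60],
            [450, 380, 35, 240],
            [ 80,  60,  6,  35],
            [600, 510, 48, 330],
            [230, 200, 18, 120],
        ], dtype=float)

        z = StandardScaler().fit_transform(metrics)   # put metrics on a common scale
        pca = PCA().fit(z)

        print("variance explained by each component:", pca.explained_variance_ratio_)
        scores = pca.transform(z)[:, :2]              # orthogonal scores usable as regression predictors
        print("first two component scores per module:\n", scores)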

  14. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  15. A statistical data analysis and plotting program for cloud microphysics experiments

    NASA Technical Reports Server (NTRS)

    Jordan, A. J.

    1981-01-01

    The analysis software developed for atmospheric cloud microphysics experiments conducted in the laboratory as well as aboard a KC-135 aircraft is described. A group of four programs was developed and implemented on a Hewlett Packard 1000 series F minicomputer running under HP's RTE-IVB operating system. The programs control and read data from a MEMODYNE Model 3765-8BV cassette recorder, format the data on the Hewlett Packard disk subsystem, and generate statistical data (mean, variance, standard deviation) and voltage and engineering unit plots on a user selected plotting device. The programs are written in HP FORTRAN IV and HP ASSEMBLY Language with the graphics software using the HP 1000 Graphics. The supported plotting devices are the HP 2647A graphics terminal, the HP 9872B four color pen plotter, and the HP 2608A matrix line printer.

  16. Incorporating Multi-criteria Optimization and Uncertainty Analysis in the Model-Based Systems Engineering of an Autonomous Surface Craft

    DTIC Science & Technology

    2009-09-01

    SAS Statistical Analysis Software SE Systems Engineering SEP Systems Engineering Process SHP Shaft Horsepower SIGINT Signals Intelligence......management occurs (OSD 2002). The Systems Engineering Process (SEP), displayed in Figure 2, is a comprehensive , iterative and recursive problem

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarocki, John Charles; Zage, David John; Fisher, Andrew N.

    LinkShop is a software tool for applying the method of Linkography to the analysis of time-sequence data. LinkShop provides command line, web, and application programming interfaces (APIs) for input and processing of time-sequence data, abstraction models, and ontologies. The software creates graph representations of the abstraction model, ontology, and derived linkograph. Finally, the tool allows the user to perform statistical measurements of the linkograph and refine the ontology through direct manipulation of the linkograph.

  18. Monte Carlo based statistical power analysis for mediation models: methods and software.

    PubMed

    Zhang, Zhiyong

    2014-12-01

    The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
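
    The general recipe the abstract describes (simulate data from a mediation model, test the indirect effect with a bootstrap confidence interval in each replication, and take power as the rejection rate) looks roughly like the sketch below. bmem itself is an R package; this Python version is only a schematic illustration, and the path coefficients, sample size and replication counts are arbitrary.

        # Sketch: Monte-Carlo power estimate for an indirect effect a*b,
        # using a percentile bootstrap test in each simulated data set.
        import numpy as np

        def indirect_effect(x, m, y):
            a = np.polyfit(x, m, 1)[0]                      # slope of m ~ x
            X = np.column_stack([np.ones_like(x), m, x])
            b = np.linalg.lstsq(X, y, rcond=None)[0][1]     # partial slope of m in y ~ m + x
            return a * b

        def simulate_power(a=0.3, b=0.3, n=100, n_rep=200, n_boot=300, seed=0):
            rng = np.random.default_rng(seed)
            rejections = 0
            for _ in range(n_rep):
                x = rng.normal(size=n)
                m = a * x + rng.normal(size=n)
                y = b * m + rng.normal(size=n)
                boot = np.empty(n_boot)
                for j in range(n_boot):
                    idx = rng.integers(0, n, n)
                    boot[j] = indirect_effect(x[idx], m[idx], y[idx])
                lo, hi = np.percentile(boot, [2.5, 97.5])
                rejections += (lo > 0.0) or (hi < 0.0)      # CI excludes zero
            return rejections / n_rep

        print("estimated power:", simulate_power())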

  19. Statistical Reference Datasets

    National Institute of Standards and Technology Data Gateway

    Statistical Reference Datasets (Web, free access)   The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.

  20. A statistical model of operational impacts on the framework of the bridge crane

    NASA Astrophysics Data System (ADS)

    Antsev, V. Yu; Tolokonnikov, A. S.; Gorynin, A. D.; Reutov, A. A.

    2017-02-01

    The technical regulations of the Customs Union demand a risk analysis of bridge crane operation at the design stage. A statistical model has been developed for performing randomized risk calculations, allowing possible operational influences on the bridge crane metal structure to be modeled in their various combinations. The statistical model is implemented in a software product for automated calculation of the risk of bridge crane failure.

  1. Analysis of key technologies for virtual instruments metrology

    NASA Astrophysics Data System (ADS)

    Liu, Guixiong; Xu, Qingui; Gao, Furong; Guan, Qiuju; Fang, Qiang

    2008-12-01

    Virtual instruments (VIs) require metrological verification when applied as measuring instruments. Owing to their software-centered architecture, metrological evaluation of VIs includes two aspects: measurement functions and software characteristics. The complexity of the software imposes difficulties on metrological testing of VIs. Key approaches and technologies for metrology evaluation of virtual instruments are investigated and analyzed in this paper. The principal issue is evaluation of measurement uncertainty. The nature and regularity of measurement uncertainty caused by software and algorithms can be evaluated by modeling, simulation, analysis, testing and statistics, supported by the computing power of the PC. Another concern is evaluation of software features like correctness, reliability, stability, security, and real-time behavior of VIs. Technologies from software engineering, software testing and the computer security domain can be used for these purposes. For example, a variety of black-box testing, white-box testing and modeling approaches can be used to evaluate the reliability of modules, components, applications and the whole VI software. The security of a VI can be assessed by methods like vulnerability scanning and penetration analysis. To enable metrology institutions to perform metrological verification of VIs efficiently, an automatic metrological tool for the above validation is essential. Based on technologies of numerical simulation, software testing and system benchmarking, a framework for the automatic tool is proposed in this paper. An investigation of existing automatic tools that perform measurement uncertainty calculation, software testing and security assessment demonstrates the feasibility of the proposed automatic framework.

  2. The Equivalence of Three Statistical Packages for Performing Hierarchical Cluster Analysis

    ERIC Educational Resources Information Center

    Blashfield, Roger

    1977-01-01

    Three different software programs which contain hierarchical agglomerative cluster analysis procedures were shown to generate different solutions on the same data set using apparently the same options. The basis for the differences in the solutions was the formulae used to calculate Euclidean distance. (Author/JKS)
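
    The point about distance formulae can be reproduced directly. Squaring the pairwise distances leaves single- and complete-linkage trees unchanged apart from their merge heights, but for average, centroid or Ward-type linkage the clustering itself can differ. A small synthetic illustration (not the three original packages):

        # Sketch: the same hierarchical clustering run on Euclidean vs. squared-Euclidean distances.
        import numpy as np
        from scipy.spatial.distance import pdist
        from scipy.cluster.hierarchy import linkage

        rng = np.random.default_rng(3)
        X = rng.normal(size=(10, 4))

        Z1 = linkage(pdist(X, metric="euclidean"),   method="average")
        Z2 = linkage(pdist(X, metric="sqeuclidean"), method="average")

        print("merge heights (euclidean):", np.round(Z1[:, 2], 3))
        print("merge heights (squared):  ", np.round(Z2[:, 2], 3))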

  3. Cellular Consequences of Telomere Shortening in Histologically Normal Breast Tissues

    DTIC Science & Technology

    2013-09-01

    using the open source, JAVA -based image analysis software package ImageJ (http://rsb.info.nih.gov/ij/) and a custom designed plugin (“Telometer...Tabulated data were stored in a MySQL (http://www.mysql.com) database and viewed through Microsoft Access (Microsoft Corp.). Statistical Analysis For

  4. GIS Tools For Improving Pedestrian & Bicycle Safety

    DOT National Transportation Integrated Search

    2000-07-01

    Geographic Information System (GIS) software turns statistical data, such as accidents, and geographic data, such as roads and crash locations, into meaningful information for spatial analysis and mapping. In this project, GIS-based analytical techni...

  5. Kepler Planet Detection Metrics: Statistical Bootstrap Test

    NASA Technical Reports Server (NTRS)

    Jenkins, Jon M.; Burke, Christopher J.

    2016-01-01

    This document describes the data produced by the Statistical Bootstrap Test over the final three Threshold Crossing Event (TCE) deliveries to NExScI: SOC 9.1 (Q1Q16) (Tenenbaum et al. 2014), SOC 9.2 (Q1Q17) aka DR24 (Seader et al. 2015), and SOC 9.3 (Q1Q17) aka DR25 (Twicken et al. 2016). The last few years have seen significant improvements in the SOC science data processing pipeline, leading to higher quality light curves and more sensitive transit searches. The statistical bootstrap analysis results presented here and the numerical results archived at NASA's Exoplanet Science Institute (NExScI) bear witness to these software improvements. This document attempts to introduce and describe the main features and differences between these three data sets as a consequence of the software changes.

  6. A basic analysis toolkit for biological sequences

    PubMed Central

    Giancarlo, Raffaele; Siragusa, Alessandro; Siragusa, Enrico; Utro, Filippo

    2007-01-01

    This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at under the GNU GPL. PMID:17877802

  7. Conservative Allowables Determined by a Tsai-Hill Equivalent Criterion for Design of Satellite Composite Parts

    NASA Astrophysics Data System (ADS)

    Pommatau, Gilles

    2014-06-01

    The present paper deals with the industrial application, via a software developed by Thales Alenia Space, of a new failure criterion named "Tsai-Hill equivalent criterion" for composite structural parts of satellites. The first part of the paper briefly describes the main hypothesis and the possibilities in terms of failure analysis of the software. The second parts reminds the quadratic and conservative nature of the new failure criterion, already presented in ESA conference in a previous paper. The third part presents the statistical calculation possibilities of the software, and the associated sensitivity analysis, via results obtained on different composites. Then a methodology, proposed to customers and agencies, is presented with its limitations and advantages. It is then conclude that this methodology is an efficient industrial way to perform mechanical analysis on quasi-isotropic composite parts.
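
    For orientation, the classical Tsai-Hill criterion for a unidirectional ply under in-plane stress predicts failure when the index below reaches 1. The paper's "Tsai-Hill equivalent criterion" and its statistical treatment go beyond this, so the sketch shows only the textbook form with invented stresses and strengths.

        # Sketch: classical Tsai-Hill failure index for a unidirectional ply (plane stress).
        def tsai_hill_index(sigma1, sigma2, tau12, X, Y, S):
            """Failure is predicted when the returned index reaches 1."""
            return ((sigma1 / X) ** 2 - (sigma1 * sigma2) / X ** 2
                    + (sigma2 / Y) ** 2 + (tau12 / S) ** 2)

        # Hypothetical ply stresses and strengths (MPa).
        index = tsai_hill_index(sigma1=600.0, sigma2=20.0, tau12=35.0,
                                X=1500.0, Y=40.0, S=70.0)
        print(f"Tsai-Hill index = {index:.2f}  (failure predicted: {index >= 1.0})")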

  8. An interactive environment for the analysis of large Earth observation and model data sets

    NASA Technical Reports Server (NTRS)

    Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.

    1993-01-01

    We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X DataSlice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.

  9. An interactive environment for the analysis of large Earth observation and model data sets

    NASA Technical Reports Server (NTRS)

    Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.

    1992-01-01

    We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X Data Slice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.

  10. SOCR: Statistics Online Computational Resource

    PubMed Central

    Dinov, Ivo D.

    2011-01-01

    The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build student’s intuition and enhance their learning. PMID:21451741

  11. LFSTAT - Low-Flow Analysis in R

    NASA Astrophysics Data System (ADS)

    Koffler, Daniel; Laaha, Gregor

    2013-04-01

    The calculation of characteristic stream flow during dry conditions is a basic requirement for many problems in hydrology, ecohydrology and water resources management. As opposed to floods, a number of different indices are used to characterise low flows and streamflow droughts. Although these indices and methods of calculation have been well documented in the WMO Manual on Low-flow Estimation and Prediction [1], comprehensive software enabling fast and standardized calculation of low-flow statistics was missing. We present the new software package lfstat to fill this gap. Our software package is based on the statistical open source software R, and expands it to analyse daily stream flow data records focusing on low-flows. As command-line based programs are not everyone's preference, we also offer a plug-in for the R-Commander, an easy-to-use graphical user interface (GUI) for R based on tcl/tk. The functionality of lfstat includes estimation methods for low-flow indices, extreme value statistics, deficit characteristics, and additional graphical methods to control the computation of complex indices and to illustrate the data. Besides the basic low-flow indices, the baseflow index and recession constants can be computed. For extreme value statistics, state-of-the-art methods for L-moment based local and regional frequency analysis (RFA) are available. The tools for deficit characteristics include various pooling and threshold selection methods to support the calculation of drought duration and deficit indices. The most common graphics for low flow analysis are available, and the plots can be modified according to the user preferences. Graphics include hydrographs for different periods, flexible streamflow deficit plots, baseflow visualisation, recession diagnostic, flow duration curves as well as double mass curves, and many more. From a technical point of view, the package uses an S3 class called lfobj (low-flow object). These objects are ordinary R data frames containing date, flow, hydrological year and, optionally, baseflow information. Once these objects are created, analysis can be performed by mouse-click and a script can be saved to make the analysis easily reproducible. At the moment we are offering implementation of all major methods proposed in the WMO Manual on Low-flow Estimation and Prediction [1]. Future plans include a dynamic low flow report in odt-file format using odf-weave which allows automatic updates if data or analysis change. We hope to offer a tool to ease and structure the analysis of stream flow data focusing on low-flows and to make analysis transparent and communicable. The package can also be used in teaching students the first steps in low-flow hydrology. The software package can be installed from CRAN (latest stable) and R-Forge: http://r-forge.r-project.org (development version). References: [1] Gustard, Alan; Demuth, Siegfried, (eds.) Manual on Low-flow Estimation and Prediction. Geneva, Switzerland, World Meteorological Organization, (Operational Hydrology Report No. 50, WMO-No. 1029).
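
    Two of the basic indices mentioned above, the Q95 low-flow quantile and the mean annual 7-day minimum flow (MAM7), have simple definitions that can be sketched outside R as follows. The daily series is synthetic; lfstat itself provides these and many more indices as R functions.

        # Sketch: two basic low-flow indices from a daily streamflow series.
        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(7)
        days = pd.date_range("2000-01-01", "2004-12-31", freq="D")
        flow = pd.Series(np.exp(rng.normal(1.0, 0.6, len(days))), index=days)  # synthetic discharge

        q95 = flow.quantile(0.05)                     # flow exceeded 95% of the time
        mam7 = (flow.rolling(7).mean()                # 7-day moving average ...
                    .groupby(flow.index.year).min()   # ... annual minima ...
                    .mean())                          # ... averaged over years (MAM7)
        print(f"Q95 = {q95:.3f}, MAM7 = {mam7:.3f}")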

  12. CORSSA: The Community Online Resource for Statistical Seismicity Analysis

    USGS Publications Warehouse

    Michael, Andrew J.; Wiemer, Stefan

    2010-01-01

    Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.

  13. An improved approach for flight readiness certification: Methodology for failure risk assessment and application examples. Volume 2: Software documentation

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with engineering analysis to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in engineering analyses of failure phenomena, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which engineering analysis models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes, These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. Conventional engineering analysis models currently employed for design of failure prediction are used in this methodology. The PFA methodology is described and examples of its application are presented. Conventional approaches to failure risk evaluation for spaceflight systems are discussed, and the rationale for the approach taken in the PFA methodology is presented. The statistical methods, engineering models, and computer software used in fatigue failure mode applications are thoroughly documented.
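
    The core move of the PFA methodology, starting from an analysis-based failure probability and updating it with observed test or flight experience, can be caricatured with a simple conjugate Bayesian update. The Beta-Binomial form below is only a schematic stand-in for the report's full statistical structure, and the prior and test numbers are invented.

        # Sketch: Bayesian update of a failure probability with test experience
        # (Beta prior from engineering analysis, Binomial test outcomes).
        from scipy import stats

        a0, b0 = 1.0, 99.0                   # prior roughly encoding ~1% failure probability

        n_tests, n_failures = 50, 0          # hypothetical test experience
        a1, b1 = a0 + n_failures, b0 + (n_tests - n_failures)

        posterior = stats.beta(a1, b1)
        print(f"posterior mean failure probability = {posterior.mean():.4f}")
        print(f"95% upper bound = {posterior.ppf(0.95):.4f}")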

  14. Statistics of software vulnerability detection in certification testing

    NASA Astrophysics Data System (ADS)

    Barabanov, A. V.; Markov, A. S.; Tsirlov, V. L.

    2018-05-01

    The paper discusses practical aspects of introduction of the methods to detect software vulnerability in the day-to-day activities of the accredited testing laboratory. It presents the approval results of the vulnerability detection methods as part of the study of the open source software and the software that is a test object of the certification tests under information security requirements, including software for communication networks. Results of the study showing the allocation of identified vulnerabilities by types of attacks, country of origin, programming languages used in the development, methods for detecting vulnerability, etc. are given. The experience of foreign information security certification systems related to the detection of certified software vulnerabilities is analyzed. The main conclusion based on the study is the need to implement practices for developing secure software in the development life cycle processes. The conclusions and recommendations for the testing laboratories on the implementation of the vulnerability analysis methods are laid down.

  15. Business Statistics Education: Content and Software in Undergraduate Business Statistics Courses.

    ERIC Educational Resources Information Center

    Tabatabai, Manouchehr; Gamble, Ralph

    1997-01-01

    Survey responses from 204 of 500 business schools identified the topics most often covered in Business Statistics I and II courses. The most popular software at both levels was Minitab. Most schools required both Statistics I and II. (SK)

  16. Metrics of Software Quality.

    DTIC Science & Technology

    1980-11-01

    Reference fragments excerpted from the report include: "...Systems: A Raytheon Project History," RADC-TR-77-188, Final Technical Report, June 1977; IBM Federal Systems Division, "Statistical Prediction of..."; W. D. Brooks and R. W. Motley, "Analysis of Discrete Software Reliability Models," IBM Corp., RADC-TR-80-84, RADC, New York, April 1980; and work by J. C. King of IBM (Reference 9) and Lori A. Clark (Reference 10) of the University of Massachusetts.

  17. Computer Administering of the Psychological Investigations: Set-Relational Representation

    NASA Astrophysics Data System (ADS)

    Yordzhev, Krasimir

    Computer administering of a psychological investigation is the computer representation of the entire procedure of psychological assessment: test construction, test implementation, results evaluation, and the storage, maintenance, statistical processing, analysis, and interpretation of the resulting database. This article discusses a mathematical description of psychological assessment with the aid of personality tests, using set theory and relational algebra. A relational data model needed to design a computer system for automating certain psychological assessments is given, and the finite sets, and relations on them, that are necessary for creating a personality psychological test are described. The described model could be used to develop real software for computer administering of any psychological test, with full automation of the whole process: test construction, test implementation, result evaluation, database storage, statistical processing, analysis, and interpretation. A software project for computer administering of personality psychological tests is suggested.

  18. Sex differences in discriminative power of volleyball game-related statistics.

    PubMed

    João, Paulo Vicente; Leite, Nuno; Mesquita, Isabel; Sampaio, Jaime

    2010-12-01

    To identify sex differences in volleyball game-related statistics, the game-related statistics of several World Championships in 2007 (N=132) were analyzed using the VIS software from the International Volleyball Federation. Discriminant analysis was used to identify the game-related statistics that best discriminated performance by sex. The analysis emphasized fault serves (SC = -.40), shot spikes (SC = .40), and reception digs (SC = .31). These results indicate considerable variability in the game-related statistics profiles: men's volleyball was better associated with terminal actions (errors of service), whereas women's volleyball was characterized by continuous actions (in defense and attack). These differences may be related to the anthropometric and physiological differences between women and men and their influence on performance profiles.
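
    As an illustration of the descriptive discriminant analysis reported above, the Python sketch below fits a two-group linear discriminant function and computes structure coefficients (the correlations between each variable and the discriminant scores). The data, group sizes, and variable set are simulated assumptions, not the study's data.

        # Hypothetical sketch: discriminant analysis of game-related statistics by
        # sex, reporting structure coefficients (SC). Data are simulated.
        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(0)
        n = 66  # matches per group (illustrative)
        # Columns: fault serves, shot spikes, reception digs (illustrative counts)
        men = rng.poisson(lam=[18, 52, 30], size=(n, 3))
        women = rng.poisson(lam=[14, 45, 36], size=(n, 3))
        X = np.vstack([men, women]).astype(float)
        y = np.array([0] * n + [1] * n)  # 0 = men, 1 = women

        lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
        scores = lda.transform(X).ravel()

        # Structure coefficient: correlation of each variable with the discriminant scores
        names = ["fault serves", "shot spikes", "reception digs"]
        for j, name in enumerate(names):
            sc = np.corrcoef(X[:, j], scores)[0, 1]
            print(f"SC({name}) = {sc:+.2f}")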

  19. The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    PubMed Central

    Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth

    2006-01-01

    Background Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497

  20. What Software to Use in the Teaching of Mathematical Subjects?

    ERIC Educational Resources Information Center

    Berežný, Štefan

    2015-01-01

    We can consider two basic views, when using mathematical software in the teaching of mathematical subjects. First: How to learn to use specific software for the specific tasks, e. g., software Statistica for the subjects of Applied statistics, probability and mathematical statistics, or financial mathematics. Second: How to learn to use the…

  1. STAPP: Spatiotemporal analysis of plantar pressure measurements using statistical parametric mapping.

    PubMed

    Booth, Brian G; Keijsers, Noël L W; Sijbers, Jan; Huysmans, Toon

    2018-05-03

    Pedobarography produces large sets of plantar pressure samples that are routinely subsampled (e.g. using regions of interest) or aggregated (e.g. center of pressure trajectories, peak pressure images) in order to simplify statistical analysis and provide intuitive clinical measures. We hypothesize that these data reductions discard gait information that can be used to differentiate between groups or conditions. To test the hypothesis of null information loss, we created an implementation of statistical parametric mapping (SPM) for dynamic plantar pressure datasets (i.e. plantar pressure videos). Our SPM software framework brings all plantar pressure videos into anatomical and temporal correspondence, then performs statistical tests at each sampling location in space and time. Novelly, we introduce non-linear temporal registration into the framework in order to normalize for timing differences within the stance phase. We refer to our software framework as STAPP: spatiotemporal analysis of plantar pressure measurements. Using STAPP, we tested our hypothesis on plantar pressure videos from 33 healthy subjects walking at different speeds. As walking speed increased, STAPP was able to identify significant decreases in plantar pressure at mid-stance from the heel through the lateral forefoot. The extent of these plantar pressure decreases has not previously been observed using existing plantar pressure analysis techniques. We therefore conclude that the subsampling of plantar pressure videos - a task which led to the discarding of gait information in our study - can be avoided using STAPP. Copyright © 2018 Elsevier B.V. All rights reserved.
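
    The core statistical idea, a test at every spatiotemporal sampling location, can be sketched in a few lines. This is a simplified stand-in for the SPM framework described above: spatial and temporal registration and multiple-comparison correction are omitted, and the array shapes and simulated pressure values are assumptions.

        # Mass-univariate sketch: a paired t-test at every (time, row, col) sample
        # of a plantar pressure video, across subjects. Data are simulated.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)
        subjects, frames, rows, cols = 33, 50, 32, 16
        slow = rng.gamma(shape=2.0, scale=10.0, size=(subjects, frames, rows, cols))
        fast = slow + rng.normal(-2.0, 5.0, size=slow.shape)  # simulated pressure change

        t_map, p_map = stats.ttest_rel(fast, slow, axis=0)    # one statistic per sample
        print("shape of the statistical map:", t_map.shape)
        print("fraction of samples with p < 0.05 (uncorrected):",
              float((p_map < 0.05).mean()))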

  2. The High Cost of Complexity in Experimental Design and Data Analysis: Type I and Type II Error Rates in Multiway ANOVA.

    ERIC Educational Resources Information Center

    Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.

    2002-01-01

    Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…

  3. Making statistical inferences about software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1988-01-01

    Failure times of software undergoing random debugging can be modelled as order statistics of independent but nonidentically distributed exponential random variables. Using this model inferences can be made about current reliability and, if debugging continues, future reliability. This model also shows the difficulty inherent in statistical verification of very highly reliable software such as that used by digital avionics in commercial aircraft.
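
    The model described above is easy to simulate: draw one exponential detection time per latent fault, with non-identical rates, and treat the sorted times as the observed failure history. In the sketch below the fault count and rate range are assumptions chosen only for illustration.

        # Simulate failure times as order statistics of independent, non-identically
        # distributed exponential detection times (one per latent fault).
        import numpy as np

        rng = np.random.default_rng(2)
        n_faults = 40
        rates = rng.uniform(0.01, 0.5, size=n_faults)      # per-fault detection rates
        detection_times = rng.exponential(1.0 / rates)     # independent, non-identical
        failure_times = np.sort(detection_times)           # what the tester observes
        print("first observed failure times:", np.round(failure_times[:5], 2))

        # After debugging until time T, the residual failure rate of the program is
        # the sum of the rates of the faults not yet detected (under this model).
        T = 20.0
        remaining = rates[detection_times > T]
        print(f"faults found by T={T}: {int(np.sum(detection_times <= T))}")
        print(f"residual failure rate: {remaining.sum():.3f} failures per unit time")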

  4. Low-cost digital image processing at the University of Oklahoma

    NASA Technical Reports Server (NTRS)

    Harrington, J. A., Jr.

    1981-01-01

    Computer-assisted instruction in remote sensing at the University of Oklahoma involves two separate approaches and is dependent upon initial preprocessing of a LANDSAT computer compatible tape using software developed for an IBM 370/158 computer. In-house generated preprocessing algorithms permit students or researchers to select a subset of a LANDSAT scene for subsequent analysis using either general purpose statistical packages or color graphic image processing software developed for Apple II microcomputers. Procedures for preprocessing the data and image analysis using either of the two approaches for low-cost LANDSAT data processing are described.

  5. CGO: utilizing and integrating gene expression microarray data in clinical research and data management.

    PubMed

    Bumm, Klaus; Zheng, Mingzhong; Bailey, Clyde; Zhan, Fenghuang; Chiriva-Internati, M; Eddlemon, Paul; Terry, Julian; Barlogie, Bart; Shaughnessy, John D

    2002-02-01

    Clinical GeneOrganizer (CGO) is a novel Windows-based archiving, organization, and data-mining software package for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL server version acts as a data mart and links microarray data with clinical parameters of any other existing database, and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.

  6. Statistical Analysis Software for the TRS-80 Microcomputer.

    DTIC Science & Technology

    1981-09-01

    Master's thesis: Statistical Analysis Software for the TRS-80 Microcomputer, by Paul Isbell, September 1981 (thesis advisor: Charles F. Taylor, Jr.). Approved for public release; distribution unlimited. The remainder of the record consists of OCR fragments of the report documentation page and of the TRS-80 BASIC program listing (e.g., a prompt asking "Is the variance known?").

  7. Determination of apparent coupling factors for adhesive bonded acrylic plates using SEAL approach

    NASA Astrophysics Data System (ADS)

    Pankaj, Achuthan. C.; Shivaprasad, M. V.; Murigendrappa, S. M.

    2018-04-01

    Apparent coupling loss factors (CLF) and velocity responses have been computed for two lap-jointed, adhesively bonded plates using a finite element and experimental statistical energy analysis (SEA)-like approach. A finite element model of the plates was created using ANSYS software. The statistical energy parameters were computed using the velocity responses obtained from a harmonic forced-excitation analysis. Experiments were carried out for two different cases of adhesively bonded joints, and the results were compared with the apparent coupling factors and velocity responses obtained from the finite element analysis. The results signify the importance of modeling adhesively bonded joints when computing the apparent coupling factors and, further, when computing energies and velocity responses using an SEA-like approach.

  8. Use of statistical study methods for the analysis of the results of the imitation modeling of radiation transfer

    NASA Astrophysics Data System (ADS)

    Alekseenko, M. A.; Gendrina, I. Yu.

    2017-11-01

    Recently, owing to the abundance of various types of observational data in systems for vision through the atmosphere and the need for their processing, the use of statistical research methods such as correlation-regression analysis, time series analysis, and analysis of variance has become relevant for studying such systems. We have attempted to apply elements of correlation-regression analysis to the study and subsequent prediction of the patterns of radiation transfer in these systems, as well as to the construction of radiation models of the atmosphere. In this paper, we present some results of statistical processing of the results of numerical simulation of the characteristics of vision systems through the atmosphere, obtained with the help of a special software package.

  9. CORSSA: Community Online Resource for Statistical Seismicity Analysis

    NASA Astrophysics Data System (ADS)

    Zechar, J. D.; Hardebeck, J. L.; Michael, A. J.; Naylor, M.; Steacy, S.; Wiemer, S.; Zhuang, J.

    2011-12-01

    Statistical seismology is critical to the understanding of seismicity, the evaluation of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA, www.corssa.org). We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each of which will contain between four and eight articles. CORSSA now includes seven articles, with an additional six in draft form, along with forums for discussion, a glossary, and news about upcoming meetings, special issues, and recent papers. Each article is peer-reviewed and presents a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles include introductions to both CORSSA and statistical seismology; basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. We have also begun curating a collection of statistical seismology software packages.

  10. A Meta-Analysis of Referential Communication Studies: A Computer Readable Literature Review.

    ERIC Educational Resources Information Center

    Dickson, W. Patrick; Moskoff, Mary

    A computer-assisted analysis of studies on referential communication (giving directions/explanations) located 66 reports involving 80 experiments, 114 referential tasks, and over 6,200 individuals. The studies were entered into a statistical software package system (SPSS) and analyzed for characteristics of the subjects and experimental designs,…

  11. Model Uncertainty and Robustness: A Computational Framework for Multimodel Analysis

    ERIC Educational Resources Information Center

    Young, Cristobal; Holsteen, Katherine

    2017-01-01

    Model uncertainty is pervasive in social science. A key question is how robust empirical results are to sensible changes in model specification. We present a new approach and applied statistical software for computational multimodel analysis. Our approach proceeds in two steps: First, we estimate the modeling distribution of estimates across all…

  12. A book review of Spatial data analysis in ecology and agriculture using R

    USDA-ARS?s Scientific Manuscript database

    Spatial Data Analysis in Ecology and Agriculture Using R is a valuable resource to assist agricultural and ecological researchers with spatial data analyses using the R statistical software(www.r-project.org). Special emphasis is on spatial data sets; how-ever, the text also provides ample guidance ...

  13. Modular Open-Source Software for Item Factor Analysis

    ERIC Educational Resources Information Center

    Pritikin, Joshua N.; Hunter, Micheal D.; Boker, Steven M.

    2015-01-01

    This article introduces an item factor analysis (IFA) module for "OpenMx," a free, open-source, and modular statistical modeling package that runs within the R programming environment on GNU/Linux, Mac OS X, and Microsoft Windows. The IFA module offers a novel model specification language that is well suited to programmatic generation…

  14. Interactive visual analysis promotes exploration of long-term ecological data

    Treesearch

    T.N. Pham; J.A. Jones; R. Metoyer; F.J. Swanson; R.J. Pabst

    2013-01-01

    Long-term ecological data are crucial in helping ecologists understand ecosystem function and environmental change. Nevertheless, these kinds of data sets are difficult to analyze because they are usually large, multivariate, and spatiotemporal. Although existing analysis tools such as statistical methods and spreadsheet software permit rigorous tests of pre-conceived...

  15. A new SAS program for behavioral analysis of Electrical Penetration Graph (EPG) data

    USDA-ARS?s Scientific Manuscript database

    A new program is introduced that uses SAS software to duplicate output of descriptive statistics from the Sarria Excel workbook for EPG waveform analysis. Not only are publishable means and standard errors or deviations output, the user also is guided through four relatively simple sub-programs for ...

  16. Accuracy of metric sex analysis of skeletal remains using Fordisc based on a recent skull collection.

    PubMed

    Ramsthaler, F; Kreutz, K; Verhoff, M A

    2007-11-01

    It has been generally accepted in skeletal sex determination that the use of metric methods is limited due to the population dependence of the multivariate algorithms. The aim of the study was to verify the applicability of software-based sex estimations outside the reference population group for which discriminant equations have been developed. We examined 98 skulls from recent forensic cases of known age, sex, and Caucasian ancestry from cranium collections in Frankfurt and Mainz (Germany) to determine the accuracy of sex determination using the statistical software solution Fordisc which derives its database and functions from the US American Forensic Database. In a comparison between metric analysis using Fordisc and morphological determination of sex, average accuracy for both sexes was 86 vs 94%, respectively, and males were identified more accurately than females. The ratio of the true test result rate to the false test result rate was not statistically different for the two methodological approaches at a significance level of 0.05 but was statistically different at a level of 0.10 (p=0.06). Possible explanations for this difference comprise different ancestry, age distribution, and socio-economic status compared to the Fordisc reference sample. It is likely that a discriminant function analysis on the basis of more similar European reference samples will lead to more valid and reliable sexing results. The use of Fordisc as a single method for the estimation of sex of recent skeletal remains in Europe cannot be recommended without additional morphological assessment and without a built-in software update based on modern European reference samples.

  17. Advances in Statistical Methods for Substance Abuse Prevention Research

    PubMed Central

    MacKinnon, David P.; Lockwood, Chondra M.

    2010-01-01

    The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467

  18. Statistical Validation for Clinical Measures: Repeatability and Agreement of Kinect™-Based Software

    PubMed Central

    Tello, Emanuel; Rodrigo, Alejandro; Valentinuzzi, Max E.

    2018-01-01

    Background The rehabilitation process is a fundamental stage for the recovery of people's capabilities. However, the evaluation of the process is performed by physiatrists and medical doctors, mostly based on their observations, that is, a subjective appreciation of the patient's evolution. This paper proposes a platform for tracking the movement of an individual's upper limb using Kinect sensor(s), to be applied to the patient during the rehabilitation process. The main contribution is the development of quantifying software and the statistical validation of its performance, repeatability, and clinical use in the rehabilitation process. Methods The software determines joint angles and upper limb trajectories for the construction of a specific rehabilitation protocol and quantifies the treatment evolution. In turn, the information is presented via a graphical interface that allows the recording, storage, and reporting of the patient's data. For clinical purposes, the software information is statistically validated with three different methodologies, comparing the measures with a goniometer in terms of agreement and repeatability. Results The agreement of joint angles measured with the proposed software and the goniometer is evaluated with Bland-Altman plots; all measurements fell well within the limits of agreement, meaning interchangeability of both techniques. Additionally, the results of Bland-Altman analysis of repeatability show 95% confidence. Finally, the physiotherapists' qualitative assessment shows encouraging results for clinical use. Conclusion The main conclusion is that the software is capable of offering a clinical history of the patient and is useful for quantification of the rehabilitation success. The simplicity, low cost, and visualization possibilities enhance the use of Kinect-based software for rehabilitation and other applications, and the expert's opinion endorses the choice of our approach for clinical practice. Comparison of the new measurement technique with established goniometric methods determines that the proposed software agrees sufficiently to be used interchangeably. PMID:29750166
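
    A minimal Bland-Altman agreement computation of the kind referred to above is sketched below. The paired joint-angle measurements are simulated, and the bias and noise levels are assumptions rather than the study's data.

        # Bland-Altman sketch: bias and 95% limits of agreement between two
        # measurement techniques (simulated goniometer vs. Kinect-based angles).
        import numpy as np

        rng = np.random.default_rng(3)
        goniometer = rng.uniform(10, 160, size=40)              # degrees
        kinect = goniometer + rng.normal(0.5, 2.0, size=40)     # small bias + noise

        diff = kinect - goniometer
        bias = diff.mean()
        loa = 1.96 * diff.std(ddof=1)                           # half-width of the limits

        print(f"bias = {bias:.2f} deg, "
              f"limits of agreement = [{bias - loa:.2f}, {bias + loa:.2f}] deg")
        print("all differences within the limits:",
              bool(np.all(np.abs(diff - bias) <= loa)))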

  19. An Example of the Informative Potential of Polar Coordinate Analysis: Sprint Tactics in Elite 1,500-m Track Events

    ERIC Educational Resources Information Center

    Aragón, Sonia; Lapresa, Daniel; Arana, Javier; Anguera, M. Teresa; Garzón, Belén

    2017-01-01

    Polar coordinate analysis is a powerful data reduction technique based on the Zsum statistic, which is calculated from adjusted residuals obtained by lag sequential analysis. Its use has been greatly simplified since the addition of a module in the free software program HOISAN for performing the necessary computations and producing…

  20. [Comparison among various software for LMS growth curve fitting methods].

    PubMed

    Han, Lin; Wu, Wenhong; Wei, Qiuxia

    2015-03-01

    To explore methods for fitting lambda-median-sigma (LMS; skewness-median-coefficient of variation) growth curves using different software, and to identify the best growth curve statistical approach for grass-roots child and adolescent health workers. Regular physical examination data on head circumference for normal infants aged 3, 6, 9, and 12 months in Baotou City were analyzed. The statistical packages SAS, R, STATA, and SPSS were used to fit the LMS growth curves, and the results were evaluated with respect to convenience of use, learning curve, user interface, forms of results display, software updates and maintenance, and so on. All packages produced the same curve-fitting results, and each had its own advantages and disadvantages. Taking all evaluation aspects into consideration, R excelled over the others in LMS growth curve fitting and has the advantage over the other packages for grass-roots child and adolescent health workers.

  1. Geostatistics and GIS: tools for characterizing environmental contamination.

    PubMed

    Henshaw, Shannon L; Curriero, Frank C; Shields, Timothy M; Glass, Gregory E; Strickland, Paul T; Breysse, Patrick N

    2004-08-01

    Geostatistics is a set of statistical techniques used in the analysis of georeferenced data that can be applied to environmental contamination and remediation studies. In this study, the 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene (DDE) contamination at a Superfund site in western Maryland is evaluated. Concern about the site and its future clean-up has triggered interest within the community because residential development surrounds the area. Spatial statistical methods, of which geostatistics is a subset, are becoming increasingly popular, in part due to the availability of geographic information system (GIS) software in a variety of application packages. In this article, the joint use of ArcGIS software and the R statistical computing environment is demonstrated as an approach for comprehensive geostatistical analyses. The spatial regression method, kriging, is used to provide predictions of DDE levels at unsampled locations both within the site and in the surrounding areas where residential development is ongoing.
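
    For readers unfamiliar with kriging, the sketch below shows the core computation of ordinary kriging in plain NumPy: solve the kriging system built from a covariance model and use the resulting weights to predict at an unsampled location. The exponential covariance model, its sill and range, and the simulated sample values are assumptions; in practice these parameters are fitted from an empirical variogram.

        # Ordinary kriging sketch with an assumed exponential covariance model.
        import numpy as np

        def exp_cov(h, sill=1.0, rng_param=300.0):
            # Covariance as a function of separation distance h (assumed model).
            return sill * np.exp(-h / rng_param)

        def ordinary_krige(coords, values, target, cov=exp_cov):
            n = len(values)
            d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
            A = np.ones((n + 1, n + 1))
            A[:n, :n] = cov(d)
            A[n, n] = 0.0                                   # Lagrange-multiplier block
            b = np.ones(n + 1)
            b[:n] = cov(np.linalg.norm(coords - target, axis=1))
            w = np.linalg.solve(A, b)
            return float(w[:n] @ values)                    # kriged prediction at target

        rng = np.random.default_rng(4)
        coords = rng.uniform(0, 1000, size=(50, 2))          # sampled locations (m)
        values = rng.lognormal(2.0, 1.0, size=50)            # e.g. DDE concentrations
        print("prediction at (500, 500):",
              round(ordinary_krige(coords, values, np.array([500.0, 500.0])), 2))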

  2. SandiaMRCR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2012-01-05

    SandiaMCR was developed to identify pure components and their concentrations from spectral data. This software efficiently implements multivariate curve resolution-alternating least squares (MCR-ALS), principal component analysis (PCA), and singular value decomposition (SVD). Version 3.37 also includes PARAFAC-ALS and Tucker-1 (for trilinear analysis) algorithms. The alternating least squares methods can be used to determine the composition with no, or incomplete, prior information on the constituents and their concentrations. The software allows the specification of numerous preprocessing, initialization, data selection, and compression options for the efficient processing of large data sets. These options include the definition of equality and non-negativity constraints to realistically restrict the solution set, various normalization or weighting options based on the statistics of the data, several initialization choices, and data compression. The software has been designed to provide a practicing spectroscopist the tools required to routinely analyze data in a reasonable time and without requiring expert intervention.
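
    The alternating least squares idea named above can be illustrated compactly: alternately solve for concentrations and spectra by least squares, clipping each to be non-negative. The sketch below is an illustrative toy version using simulated data; it is not the SandiaMCR implementation, which adds preprocessing, weighting, and constraint options.

        # Toy MCR-ALS: D (samples x channels) is approximated as C @ S with
        # non-negativity imposed on both factors. Data are simulated.
        import numpy as np

        def mcr_als(D, n_components, n_iter=200, seed=0):
            rng = np.random.default_rng(seed)
            C = rng.random((D.shape[0], n_components))           # concentrations
            for _ in range(n_iter):
                S, *_ = np.linalg.lstsq(C, D, rcond=None)        # spectra
                S = np.clip(S, 0.0, None)
                Ct, *_ = np.linalg.lstsq(S.T, D.T, rcond=None)
                C = np.clip(Ct.T, 0.0, None)
            return C, S

        # Two overlapping Gaussian "pure" spectra mixed with random concentrations.
        x = np.linspace(0.0, 1.0, 200)
        pure = np.vstack([np.exp(-((x - 0.4) / 0.05) ** 2),
                          np.exp(-((x - 0.6) / 0.05) ** 2)])
        rng = np.random.default_rng(5)
        conc = rng.random((30, 2))
        D = conc @ pure + 0.01 * rng.normal(size=(30, 200))
        C_hat, S_hat = mcr_als(D, n_components=2)
        print("reconstruction RMSE:", float(np.sqrt(((D - C_hat @ S_hat) ** 2).mean())))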

  3. MPTinR: analysis of multinomial processing tree models in R.

    PubMed

    Singmann, Henrik; Kellen, David

    2013-06-01

    We introduce MPTinR, a software package developed for the analysis of multinomial processing tree (MPT) models. MPT models represent a prominent class of cognitive measurement models for categorical data with applications in a wide variety of fields. MPTinR is the first software for the analysis of MPT models in the statistical programming language R, providing a modeling framework that is more flexible than standalone software packages. MPTinR also introduces important features such as (1) the ability to calculate the Fisher information approximation measure of model complexity for MPT models, (2) the ability to fit models for categorical data outside the MPT model class, such as signal detection models, (3) a function for model selection across a set of nested and nonnested candidate models (using several model selection indices), and (4) multicore fitting. MPTinR is available from the Comprehensive R Archive Network at http://cran.r-project.org/web/packages/MPTinR/ .

  4. DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data.

    PubMed

    Gaspar, John M; Hart, Ronald P

    2017-11-29

    DNA methylation is an epigenetic modification that is studied at single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce results that are biased or demand excessive computational resources. DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded minimal genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, despite the fact that they utilize the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software. Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux.
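
    The region-building step can be illustrated with a simplified stand-in for DMRfinder's modified single-linkage clustering: neighbouring methylation sites are grouped whenever the gap between consecutive genomic positions stays below a maximum distance. The gap threshold and positions below are illustrative assumptions, and the statistical testing of the resulting regions (beta-binomial modeling, Wald tests) is not shown.

        # Group sorted methylation-site positions into candidate regions by a
        # maximum allowed gap between consecutive sites (simplified illustration).
        def cluster_sites(positions, max_gap=100):
            positions = sorted(positions)
            regions, current = [], [positions[0]]
            for pos in positions[1:]:
                if pos - current[-1] <= max_gap:
                    current.append(pos)
                else:
                    regions.append((current[0], current[-1], len(current)))
                    current = [pos]
            regions.append((current[0], current[-1], len(current)))
            return regions  # (start, end, number of sites) per candidate region

        print(cluster_sites([100, 150, 180, 400, 420, 1000]))
        # [(100, 180, 3), (400, 420, 2), (1000, 1000, 1)]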

  5. STAMPS: Software Tool for Automated MRI Post-processing on a supercomputer.

    PubMed

    Bigler, Don C; Aksu, Yaman; Miller, David J; Yang, Qing X

    2009-08-01

    This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.

  6. Protecting Externally Supplied Software in Small Computers.

    DTIC Science & Technology

    1980-09-01

    ...small computer systems incorporating these security features requires careful analysis of a number of options in making tradeoffs among performance... alarming statistic is not representative of the market as a whole or that it is not indicative of the fate of sales of such software in the future. In ... market as well. Although the size of this market (in numbers of machines) may not approach that of personal computers, small business computers may

  7. Comparison of four software packages for CT lung volumetry in healthy individuals.

    PubMed

    Nemec, Stefan F; Molinari, Francesco; Dufresne, Valerie; Gosset, Natacha; Silva, Mario; Bankier, Alexander A

    2015-06-01

    To compare CT lung volumetry (CTLV) measurements provided by different software packages, and to provide normative data for lung densitometric measurements in healthy individuals. This retrospective study included 51 chest CTs of 17 volunteers (eight men and nine women; mean age, 30 ± 6 years), who underwent spirometrically monitored CT at total lung capacity (TLC), functional residual capacity (FRC), and mean inspiratory capacity (MIC). Volumetric differences assessed by four commercial software packages were compared with analysis of variance (ANOVA) for repeated measurements and benchmarked against the threshold for acceptable variability between spirometric measurements. Mean lung density (MLD) and parenchymal heterogeneity (MLD-SD) were also compared with ANOVA. Volumetric differences ranged from 12 to 213 ml (0.20 % to 6.45 %). Although 16/18 comparisons (among four software packages at TLC, MIC, and FRC) were statistically significant (P < 0.001 to P = 0.004), only 3/18 comparisons, one at MIC and two at FRC, exceeded the spirometry variability threshold. MLD and MLD-SD significantly increased with decreasing volumes, and were significantly larger in lower compared to upper lobes (P < 0.001). Lung volumetric differences provided by different software packages are small. These differences should not be interpreted based on statistical significance alone, but together with absolute volumetric differences. • Volumetric differences, assessed by different CTLV software, are small but statistically significant. • Volumetric differences are smaller at TLC than at MIC and FRC. • Volumetric differences rarely exceed spirometric repeatability thresholds at MIC and FRC. • Differences between CTLV measurements should be interpreted based on comparison of absolute differences. • MLD increases with decreasing volumes, and is larger in lower compared to upper lobes.

  8. Statistical Approaches to Adjusting Weights for Dependent Arms in Network Meta-analysis.

    PubMed

    Su, Yu-Xuan; Tu, Yu-Kang

    2018-05-22

    Network meta-analysis compares multiple treatments in terms of their efficacy and harm by including evidence from randomized controlled trials. Most clinical trials use a parallel design, where patients are randomly allocated to different treatments and receive only one treatment. However, some trials use within-person designs such as split-body, split-mouth, and cross-over designs, where each patient may receive more than one treatment. Data from treatment arms within these trials are no longer independent, so the correlations between dependent arms need to be accounted for within the statistical analyses. Ignoring these correlations may result in incorrect conclusions. The main objective of this study is to develop statistical approaches to adjusting weights for dependent arms within such specially designed trials. In this study, we demonstrate the following three approaches: the data augmentation approach, the adjusting variance approach, and the reducing weight approach. These three approaches can be readily applied in current statistical tools such as R and STATA. An example of periodontal regeneration was used to demonstrate how these approaches could be undertaken and implemented within statistical software packages, and to compare results from the different approaches. The adjusting variance approach can be implemented within the network package in STATA, while the reducing weight approach requires computer programming to set up the within-study variance-covariance matrix. This article is protected by copyright. All rights reserved.

  9. Association between obesity with disease-free survival and overall survival in triple-negative breast cancer: A meta-analysis.

    PubMed

    Mei, Lin; He, Lin; Song, Yuhua; Lv, Yang; Zhang, Lijiu; Hao, Fengxi; Xu, Mengmeng

    2018-05-01

    To investigate the relationship between obesity and disease-free survival (DFS) and overall survival (OS) in triple-negative breast cancer. Citations were searched in PubMed, the Cochrane Library, and Web of Science. A random-effects model meta-analysis was conducted using RevMan software version 5.0, and publication bias was evaluated using Egger regression with STATA software version 12. Nine studies (4412 patients) were included in the DFS meta-analysis, and 8 studies (4392 patients) were included in the OS meta-analysis. There was no statistically significant association between obesity and DFS (P = .60) or OS (P = .71) in triple-negative breast cancer (TNBC) patients. Obesity has no impact on DFS and OS in patients with TNBC.
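
    For readers unfamiliar with the pooling step such meta-analysis software performs, the sketch below shows a standard DerSimonian-Laird random-effects calculation. The study effect sizes and standard errors are invented for illustration and are not the data of this meta-analysis.

        # DerSimonian-Laird random-effects pooling of log hazard ratios (made-up data).
        import numpy as np

        yi = np.array([0.05, -0.10, 0.20, 0.00, 0.08])   # study log HRs (illustrative)
        se = np.array([0.15, 0.20, 0.25, 0.18, 0.22])    # their standard errors
        wi = 1.0 / se**2                                  # fixed-effect weights

        y_fe = np.sum(wi * yi) / np.sum(wi)
        Q = np.sum(wi * (yi - y_fe) ** 2)                 # Cochran's Q
        df = len(yi) - 1
        tau2 = max(0.0, (Q - df) / (np.sum(wi) - np.sum(wi**2) / np.sum(wi)))

        w_re = 1.0 / (se**2 + tau2)                       # random-effects weights
        y_re = np.sum(w_re * yi) / np.sum(w_re)
        se_re = np.sqrt(1.0 / np.sum(w_re))
        print(f"pooled log HR = {y_re:.3f} "
              f"(95% CI {y_re - 1.96 * se_re:.3f} to {y_re + 1.96 * se_re:.3f})")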

  10. Imperial College near infrared spectroscopy neuroimaging analysis framework.

    PubMed

    Orihuela-Espina, Felipe; Leff, Daniel R; James, David R C; Darzi, Ara W; Yang, Guang-Zhong

    2018-01-01

    This paper describes the Imperial College near infrared spectroscopy neuroimaging analysis (ICNNA) software tool for functional near infrared spectroscopy neuroimaging data. ICNNA is a MATLAB-based object-oriented framework encompassing an application programming interface and a graphical user interface. ICNNA incorporates reconstruction based on the modified Beer-Lambert law and basic processing and data validation capabilities. Emphasis is placed on the full experiment rather than individual neuroimages as the central element of analysis. The software offers three types of analyses including classical statistical methods based on comparison of changes in relative concentrations of hemoglobin between the task and baseline periods, graph theory-based metrics of connectivity and, distinctively, an analysis approach based on manifold embedding. This paper presents the different capabilities of ICNNA in its current version.

  11. Introducing 3D U-statistic method for separating anomaly from background in exploration geochemical data with associated software development

    NASA Astrophysics Data System (ADS)

    Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir

    2016-03-01

    The U-statistic method is one of the most important structural methods for separating anomaly from background. It considers the locations of samples, carries out the statistical analysis of the data without judging from a geochemical point of view, and tries to separate subpopulations and determine anomalous areas. In the present study, to use the U-statistic method in three-dimensional (3D) conditions, U-statistics were applied to the grades of two ideal test examples while also considering the sample Z values (elevation). This is the first time that the method has been applied in a 3D setting. To evaluate the performance of the 3D U-statistic method, and to compare U-statistics with a non-structural method, a threshold assessment based on the median and standard deviation (MSD method) was applied to the two test examples. Results show that the samples indicated as anomalous by the U-statistic method are more regular and involve less dispersion than those indicated by the MSD method, so that denser clusters of anomalous samples can be identified as promising zones. Moreover, results show that at a threshold of U = 0, the total error of misclassification for the U-statistic method is much smaller than the total error of the x̄ + n·s criterion. Finally, 3D models of the two test examples, separating anomaly from background using the 3D U-statistic method, are provided. The source code for a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all geochemical varieties and can be used in similar exploration projects.
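
    The non-structural comparison criterion mentioned above is simple enough to sketch directly: a sample is flagged as anomalous when its grade exceeds a centre value plus n standard deviations. The simulated grades and the cut-off n are illustrative assumptions; the U-statistic method itself is not reproduced here.

        # MSD-type threshold: flag grades above median + n * standard deviation.
        import numpy as np

        rng = np.random.default_rng(7)
        grades = np.concatenate([rng.lognormal(0.0, 0.4, 500),     # background
                                 rng.lognormal(1.5, 0.3, 20)])     # anomalous zone

        n = 2.0
        threshold = np.median(grades) + n * grades.std(ddof=1)
        anomalous = grades > threshold
        print(f"threshold = {threshold:.2f}, flagged samples = {int(anomalous.sum())}")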

  12. KMWin--a convenient tool for graphical presentation of results from Kaplan-Meier survival time analysis.

    PubMed

    Gross, Arnd; Ziepert, Marita; Scholz, Markus

    2012-01-01

    Analysis of clinical studies often necessitates multiple graphical representations of the results. Many professional software packages are available for this purpose. Most packages are either only commercially available or hard to use especially if one aims to generate or customize a huge number of similar graphical outputs. We developed a new, freely available software tool called KMWin (Kaplan-Meier for Windows) facilitating Kaplan-Meier survival time analysis. KMWin is based on the statistical software environment R and provides an easy to use graphical interface. Survival time data can be supplied as SPSS (sav), SAS export (xpt) or text file (dat), which is also a common export format of other applications such as Excel. Figures can directly be exported in any graphical file format supported by R. On the basis of a working example, we demonstrate how to use KMWin and present its main functions. We show how to control the interface, customize the graphical output, and analyse survival time data. A number of comparisons are performed between KMWin and SPSS regarding graphical output, statistical output, data management and development. Although the general functionality of SPSS is larger, KMWin comprises a number of features useful for survival time analysis in clinical trials and other applications. These are for example number of cases and number of cases under risk within the figure or provision of a queue system for repetitive analyses of updated data sets. Moreover, major adjustments of graphical settings can be performed easily on a single window. We conclude that our tool is well suited and convenient for repetitive analyses of survival time data. It can be used by non-statisticians and provides often used functions as well as functions which are not supplied by standard software packages. The software is routinely applied in several clinical study groups.

  13. KMWin – A Convenient Tool for Graphical Presentation of Results from Kaplan-Meier Survival Time Analysis

    PubMed Central

    Gross, Arnd; Ziepert, Marita; Scholz, Markus

    2012-01-01

    Background Analysis of clinical studies often necessitates multiple graphical representations of the results. Many professional software packages are available for this purpose. Most packages are either only commercially available or hard to use especially if one aims to generate or customize a huge number of similar graphical outputs. We developed a new, freely available software tool called KMWin (Kaplan-Meier for Windows) facilitating Kaplan-Meier survival time analysis. KMWin is based on the statistical software environment R and provides an easy to use graphical interface. Survival time data can be supplied as SPSS (sav), SAS export (xpt) or text file (dat), which is also a common export format of other applications such as Excel. Figures can directly be exported in any graphical file format supported by R. Results On the basis of a working example, we demonstrate how to use KMWin and present its main functions. We show how to control the interface, customize the graphical output, and analyse survival time data. A number of comparisons are performed between KMWin and SPSS regarding graphical output, statistical output, data management and development. Although the general functionality of SPSS is larger, KMWin comprises a number of features useful for survival time analysis in clinical trials and other applications. These are for example number of cases and number of cases under risk within the figure or provision of a queue system for repetitive analyses of updated data sets. Moreover, major adjustments of graphical settings can be performed easily on a single window. Conclusions We conclude that our tool is well suited and convenient for repetitive analyses of survival time data. It can be used by non-statisticians and provides often used functions as well as functions which are not supplied by standard software packages. The software is routinely applied in several clinical study groups. PMID:22723912
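
    The survival estimate that both records above describe can be computed by hand in a few lines; the sketch below shows the product-limit (Kaplan-Meier) calculation on a small invented data set with censoring.

        # Hand-rolled Kaplan-Meier estimate on made-up survival data.
        import numpy as np

        times = np.array([5, 8, 12, 12, 15, 21, 30, 42, 44, 60], dtype=float)
        event = np.array([1, 1, 1, 0, 1, 1, 0, 1, 0, 0])   # 1 = event, 0 = censored

        survival = 1.0
        for t in np.unique(times[event == 1]):
            at_risk = int(np.sum(times >= t))
            deaths = int(np.sum((times == t) & (event == 1)))
            survival *= 1.0 - deaths / at_risk
            print(f"t = {t:4.0f}  at risk = {at_risk:2d}  S(t) = {survival:.3f}")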

  14. A software platform for statistical evaluation of patient respiratory patterns in radiation therapy.

    PubMed

    Dunn, Leon; Kenny, John

    2017-10-01

    The aim of this work was to design and evaluate a software tool for analysis of a patient's respiration, with the goal of optimizing the effectiveness of motion management techniques during radiotherapy imaging and treatment. A software tool, called RespAnalysis, was developed to analyse patient respiratory data files (.vxp files) created by the Varian Real-Time Position Management (RPM) system. The software was created in MATLAB and provides four modules, one each for determining respiration characteristics, providing breathing coaching (biofeedback training), comparing pre- and post-training characteristics, and performing a fraction-by-fraction assessment. The modules analyse respiratory traces to determine signal characteristics and specifically use a Sample Entropy algorithm as the key means of quantifying breathing irregularity. Simulated respiratory signals, as well as 91 patient RPM traces, were analysed with RespAnalysis to test the viability of using Sample Entropy for predicting breathing regularity. Retrospective assessment of patient data demonstrated that the Sample Entropy metric was a predictor of periodic irregularity in respiration data; however, it was found to be insensitive to amplitude variation. Additional waveform statistics assessing the distribution of signal amplitudes over time, coupled with the Sample Entropy method, were found to be useful in assessing breathing regularity. The RespAnalysis software tool presented in this work uses the Sample Entropy method to analyse patient respiratory data recorded for motion management purposes in radiation therapy. This is applicable during treatment simulation and during subsequent treatment fractions, providing a way to quantify breathing irregularity as well as assess the need for breathing coaching. It was demonstrated that the Sample Entropy metric was correlated with the irregularity of the patient's respiratory motion in terms of periodicity, whilst other metrics, such as the percentage deviation of inhale/exhale peak positions, provided insight into respiratory amplitude regularity. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
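
    The irregularity metric named above, Sample Entropy, can be sketched compactly using the usual m = 2 templates and a tolerance of 0.2 standard deviations. The implementation below is a simplified illustration (it is not the RespAnalysis code), and the respiratory traces are simulated.

        # Compact Sample Entropy sketch: -ln(A/B), where B counts template matches
        # of length m and A counts matches of length m+1 (self-matches excluded).
        import numpy as np

        def sample_entropy(x, m=2, r_frac=0.2):
            x = np.asarray(x, dtype=float)
            r = r_frac * x.std(ddof=1)

            def count_matches(length):
                templates = np.array([x[i:i + length] for i in range(len(x) - length)])
                d = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
                return np.sum(d <= r) - len(templates)      # exclude self-matches

            B, A = count_matches(m), count_matches(m + 1)
            return -np.log(A / B) if A > 0 and B > 0 else np.inf

        t = np.arange(0, 60, 0.1)
        regular = np.sin(2 * np.pi * t / 4)                 # steady 4 s breathing cycle
        irregular = regular + 0.5 * np.random.default_rng(8).normal(size=t.size)
        print("SampEn, regular trace  :", round(sample_entropy(regular), 3))
        print("SampEn, irregular trace:", round(sample_entropy(irregular), 3))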

  15. Measurements and analysis in imaging for biomedical applications

    NASA Astrophysics Data System (ADS)

    Hoeller, Timothy L.

    2009-02-01

    A Total Quality Management (TQM) approach can be used to analyze data from biomedical optical and imaging platforms of tissues. A shift from individuals to teams, partnerships, and total participation are necessary from health care groups for improved prognostics using measurement analysis. Proprietary measurement analysis software is available for calibrated, pixel-to-pixel measurements of angles and distances in digital images. Feature size, count, and color are determinable on an absolute and comparative basis. Although changes in images of histomics are based on complex and numerous factors, the variation of changes in imaging analysis to correlations of time, extent, and progression of illness can be derived. Statistical methods are preferred. Applications of the proprietary measurement software are available for any imaging platform. Quantification of results provides improved categorization of illness towards better health. As health care practitioners try to use quantified measurement data for patient diagnosis, the techniques reported can be used to track and isolate causes better. Comparisons, norms, and trends are available from processing of measurement data which is obtained easily and quickly from Scientific Software and methods. Example results for the class actions of Preventative and Corrective Care in Ophthalmology and Dermatology, respectively, are provided. Improved and quantified diagnosis can lead to better health and lower costs associated with health care. Systems support improvements towards Lean and Six Sigma affecting all branches of biology and medicine. As an example for use of statistics, the major types of variation involving a study of Bone Mineral Density (BMD) are examined. Typically, special causes in medicine relate to illness and activities; whereas, common causes are known to be associated with gender, race, size, and genetic make-up. Such a strategy of Continuous Process Improvement (CPI) involves comparison of patient results to baseline data using F-statistics. Self-parings over time are also useful. Special and common causes are identified apart from aging in applying the statistical methods. In the future, implementation of imaging measurement methods by research staff, doctors, and concerned patient partners result in improved health diagnosis, reporting, and cause determination. The long-term prospects for quantified measurements are better quality in imaging analysis with applications of higher utility for heath care providers.

  16. Probabilistic Graphical Model Representation in Phylogenetics

    PubMed Central

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-01-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.] PMID:24951559

  17. Development of Software for Automatic Analysis of Intervention in the Field of Homeopathy.

    PubMed

    Jain, Rajesh Kumar; Goyal, Shagun; Bhat, Sushma N; Rao, Srinath; Sakthidharan, Vivek; Kumar, Prasanna; Sajan, Kannanaikal Rappayi; Jindal, Sameer Kumar; Jindal, Ghanshyam D

    2018-05-01

    To study the effect of homeopathic medicines (in higher potencies) in normal subjects, a Peripheral Pulse Analyzer (PPA) was used to record physiologic variability parameters before and after administration of the medicine/placebo in 210 normal subjects. Data were acquired in seven rounds; placebo was administered in rounds 1 and 2, and medicine in potencies 6, 30, 200, 1 M, and 10 M was administered in rounds 3 to 7, respectively. Five different medicines in the said potencies were given to groups of around 40 subjects each. Although processing of the data required human intervention, a software application has been developed to analyze the processed data and detect the responses, eliminating undue delay as well as human bias in subjective analysis. This utility, named Automatic Analysis of Intervention in the Field of Homeopathy, is run on the processed PPA data, and its outcome has been compared with the manual analysis. The application software uses an adaptive threshold based on statistics for detecting responses, in contrast to the fixed threshold used in manual analysis. The automatic analysis detected 12.96% more responses than the subjective analysis; these higher response rates were manually verified to be true positives, which indicates the robustness of the application software. The automatic analysis software was also run on a set of pulse harmonic parameters derived from the same data set to study cardiovascular susceptibility, and 385 responses were detected, in contrast to 272 for the variability parameters. It was observed that 65% of the subjects eliciting a response were common to both. This not only validates the consistency of the software utility but also reveals the certainty of the response. This development may lead to electronic proving of homeopathic medicines (e-proving).

  18. A Review of PROC IRT in SAS

    ERIC Educational Resources Information Center

    Choi, Jinnie

    2017-01-01

    This article reviews PROC IRT, which was added to Statistical Analysis Software in 2014. We provide an introductory overview of a free version of SAS, describe what PROC IRT offers for item response theory (IRT) analysis and how one can use PROC IRT, and discuss how other SAS macros and procedures may compensate the IRT functionalities of PROC IRT.

  19. Components of spatial information management in wildlife ecology: Software for statistical and modeling analysis [Chapter 14

    Treesearch

    Hawthorne L. Beyer; Jeff Jenness; Samuel A. Cushman

    2010-01-01

    Spatial information systems (SIS) is a term that describes a wide diversity of concepts, techniques, and technologies related to the capture, management, display and analysis of spatial information. It encompasses technologies such as geographic information systems (GIS), global positioning systems (GPS), remote sensing, and relational database management systems (...

  20. Sole: Online Analysis of Southern FIA Data

    Treesearch

    Michael P. Spinney; Paul C. Van Deusen; Francis A. Roesch

    2006-01-01

    The Southern On Line Estimator (SOLE) is a flexible modular software program for analyzing U.S. Department of Agriculture Forest Service Forest Inventory and Analysis data. SOLE produces statistical tables, figures, maps, and portable document format reports based on user selected area and variables. SOLE's Java-based graphical user interface is easy to use, and its R-...

  1. From proteomics to systems biology: MAPA, MASS WESTERN, PROMEX, and COVAIN as a user-oriented platform.

    PubMed

    Weckwerth, Wolfram; Wienkoop, Stefanie; Hoehenwarter, Wolfgang; Egelhofer, Volker; Sun, Xiaoliang

    2014-01-01

    Genome sequencing and systems biology are revolutionizing life sciences. Proteomics emerged as a fundamental technique of this novel research area as it is the basis for gene function analysis and modeling of dynamic protein networks. Here a complete proteomics platform suited for functional genomics and systems biology is presented. The strategy includes MAPA (mass accuracy precursor alignment; http://www.univie.ac.at/mosys/software.html ) as a rapid exploratory analysis step; MASS WESTERN for targeted proteomics; COVAIN ( http://www.univie.ac.at/mosys/software.html ) for multivariate statistical analysis, data integration, and data mining; and PROMEX ( http://www.univie.ac.at/mosys/databases.html ) as a database module for proteogenomics and proteotypic peptides for targeted analysis. Moreover, the presented platform can also be utilized to integrate metabolomics and transcriptomics data for the analysis of metabolite-protein-transcript correlations and time course analysis using COVAIN. Examples for the integration of MAPA and MASS WESTERN data, proteogenomic and metabolic modeling approaches for functional genomics, phosphoproteomics by integration of MOAC (metal-oxide affinity chromatography) with MAPA, and the integration of metabolomics, transcriptomics, proteomics, and physiological data using this platform are presented. All software and step-by-step tutorials for data processing and data mining can be downloaded from http://www.univie.ac.at/mosys/software.html.

  2. A new computer code for discrete fracture network modelling

    NASA Astrophysics Data System (ADS)

    Xu, Chaoshui; Dowd, Peter

    2010-03-01

    The authors describe a comprehensive software package for two- and three-dimensional stochastic rock fracture simulation using marked point processes. Fracture locations can be modelled by a Poisson, a non-homogeneous, a cluster or a Cox point process; fracture geometries and properties are modelled by their respective probability distributions. Virtual sampling tools such as plane, window and scanline sampling are included in the software together with a comprehensive set of statistical tools including histogram analysis, probability plots, rose diagrams and hemispherical projections. The paper describes in detail the theoretical basis of the implementation and provides a case study in rock fracture modelling to demonstrate the application of the software.
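
    As context for the marked point process approach, the sketch below is a minimal, hypothetical Python example (not the authors' code) of simulating 2-D fracture centres as a homogeneous Poisson point process and attaching marks (orientation and length) drawn from simple assumed distributions.

        import numpy as np

        rng = np.random.default_rng(42)

        def simulate_fractures(intensity, width, height):
            """Homogeneous Poisson point process with orientation and length marks.

            intensity -- expected number of fracture centres per unit area
            Returns an array of (x, y, orientation_deg, length) rows.
            """
            n = rng.poisson(intensity * width * height)           # Poisson number of fractures
            x = rng.uniform(0, width, n)                          # centres uniform in the window
            y = rng.uniform(0, height, n)
            theta = rng.uniform(0, 180, n)                        # orientation marks (degrees)
            length = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # length marks (assumed lognormal)
            return np.column_stack([x, y, theta, length])

        fractures = simulate_fractures(intensity=0.05, width=100, height=100)
        print(fractures.shape)    # roughly 500 fractures on average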

  3. Quick Overview Scout 2008 Version 1.0

    EPA Science Inventory

    The Scout 2008 version 1.0 statistical software package has been updated from past DOS and Windows versions to provide classical and robust univariate and multivariate graphical and statistical methods that are not typically available in commercial or freeware statistical softwar...

  4. ProteoSign: an end-user online differential proteomics statistical analysis platform.

    PubMed

    Efstathiou, Georgios; Antonakis, Andreas N; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Divanach, Peter; Trudgian, David C; Thomas, Benjamin; Papanikolaou, Nikolas; Aivaliotis, Michalis; Acuto, Oreste; Iliopoulos, Ioannis

    2017-07-03

    Profiling of proteome dynamics is crucial for understanding cellular behavior in response to intrinsic and extrinsic stimuli and maintenance of homeostasis. Over the last 20 years, mass spectrometry (MS) has emerged as the most powerful tool for large-scale identification and characterization of proteins. Bottom-up proteomics, the most common MS-based proteomics approach, has always been challenging in terms of data management, processing, analysis and visualization, with modern instruments capable of producing several gigabytes of data out of a single experiment. Here, we present ProteoSign, a freely available web application, dedicated to allowing users to perform proteomics differential expression/abundance analysis in a user-friendly and self-explanatory way. Although several non-commercial standalone tools have been developed for post-quantification statistical analysis of proteomics data, most of them are not appealing to end users, as they often require the installation of programming environments, third-party software packages and sometimes further scripting or computer programming. To avoid this bottleneck, we have developed a user-friendly software platform accessible via a web interface in order to enable proteomics laboratories and core facilities to statistically analyse quantitative proteomics data sets in a resource-efficient manner. ProteoSign is available at http://bioinformatics.med.uoc.gr/ProteoSign and the source code at https://github.com/yorgodillo/ProteoSign. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Efficacy and Physicochemical Evaluation of an Optimized Semisolid Formulation of Povidone Iodine Proposed by Extreme Vertices Statistical Design; a Practical Approach

    PubMed Central

    Lotfipour, Farzaneh; Valizadeh, Hadi; Shademan, Shahin; Monajjemzadeh, Farnaz

    2015-01-01

    One of the most significant issues in the pharmaceutical industry, prior to commercialization of a pharmaceutical preparation, is the "preformulation" stage. However, far too little attention has been paid to verification of software-assisted statistical designs in preformulation studies. The main aim of this study was to report a step-by-step preformulation approach for a semisolid preparation based on a statistical mixture design and to verify the predictions made by the software with an in-vitro efficacy bioassay. An extreme vertices mixture design (4 factors, 4 levels) was applied for preformulation of a semisolid povidone iodine preparation as a water-removable ointment using different polyethylene glycols. Software-assisted (Minitab) analysis was then performed using four practically assessed response values: available iodine, viscosity (N index and yield value) and water absorption capacity. Subsequently, mixture analysis was performed and, finally, an optimized formulation was proposed. The efficacy of this formulation was bioassayed using microbial tests in vitro, and MIC values were calculated for Escherichia coli, Pseudomonas aeruginosa, Staphylococcus aureus and Candida albicans. Results indicated acceptable conformity of the measured responses, so it can be concluded that the proposed design had adequate power to predict the responses in practice. Stability studies showed no significant change for the optimized formulation during the one-year study. Efficacy was acceptable against all tested species, and in the case of Staphylococcus aureus the prepared semisolid formulation was even more effective. PMID:26664368

  6. PANDA-view: An easy-to-use tool for statistical analysis and visualization of quantitative proteomics data.

    PubMed

    Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping

    2018-05-22

    Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.
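
    To make the statistical-testing and volcano-plot step concrete, here is a minimal Python sketch (not PANDA-view code) that computes, for each protein, a log2 fold change and a t-test p-value between two conditions, i.e. the two coordinates an interactive volcano plot displays; the data layout and thresholds are assumptions.

        import numpy as np
        from scipy import stats

        # Hypothetical intensity matrix: rows are proteins, columns are replicates
        # (3 control, 3 treatment), already log2-transformed and normalized.
        rng = np.random.default_rng(0)
        data = rng.normal(20, 1, size=(100, 6))
        control, treatment = data[:, :3], data[:, 3:]

        log2_fc = treatment.mean(axis=1) - control.mean(axis=1)      # log2 fold change
        t_stat, p_val = stats.ttest_ind(treatment, control, axis=1)  # per-protein t-test

        # Volcano-plot coordinates: x = log2 fold change, y = -log10(p-value)
        volcano_y = -np.log10(p_val)
        significant = (np.abs(log2_fc) > 1) & (p_val < 0.05)
        print(f"{significant.sum()} proteins pass the (assumed) thresholds")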

  7. Aerosol, a health hazard during ultrasonic scaling: A clinico-microbiological study.

    PubMed

    Singh, Akanksha; Shiva Manjunath, R G; Singla, Deepak; Bhattacharya, Hirak S; Sarkar, Arijit; Chandra, Neeraj

    2016-01-01

    Ultrasonic scaling is a routinely used treatment to remove plaque and calculus from tooth surfaces. These scalers use water as a coolant, which is splattered during the vibration of the tip. The splatter, when mixed with the patient's saliva and plaque, makes the aerosol highly infectious and acts as a major risk factor for transmission of disease. In spite of necessary protection, the operator might sometimes get infected because of the infectious nature of the splatter. The aim was to evaluate the aerosol contamination produced during ultrasonic scaling with the help of microbiological analysis. This clinico-microbiological study consisted of twenty patients. Two agar plates were used for each patient; the first was kept at the center of the operatory room 20 min before the treatment, while the second agar plate was kept 40 cm away from the patient's chest during the treatment. Both agar plates were sent for microbiological analysis. The statistical analysis was done with the help of Stata statistical software (StataCorp LP, College Station, Texas, USA), and a P < 0.001 was considered statistically significant. The results for bacterial count were highly significant when compared before and during the treatment. Gram staining showed the presence of Staphylococcus and Streptococcus species in high numbers. The aerosols and splatters produced during dental procedures have the potential to spread infection to dental personnel; therefore, proper precautions should be taken to minimize the risk of infection to the operator.

  8. SPSS and SAS programs for determining the number of components using parallel analysis and velicer's MAP test.

    PubMed

    O'Connor, B P

    2000-08-01

    Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
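
    For readers unfamiliar with the procedure, the sketch below is a minimal Python rendering of Horn's parallel analysis (the article itself provides SPSS and SAS programs): observed correlation-matrix eigenvalues are retained only while they exceed the mean eigenvalues obtained from random data of the same size. Using the mean (rather than, say, the 95th percentile) of the random eigenvalues is an assumption of this sketch.

        import numpy as np

        def parallel_analysis(data, n_sims=100, seed=0):
            """Number of components whose observed eigenvalue exceeds the average
            eigenvalue from simulated random data of the same shape."""
            rng = np.random.default_rng(seed)
            n, p = data.shape
            obs_eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
            rand_eig = np.zeros(p)
            for _ in range(n_sims):
                sim = rng.standard_normal((n, p))
                rand_eig += np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
            rand_eig /= n_sims
            n_comp = 0
            for observed, random_mean in zip(obs_eig, rand_eig):
                if observed <= random_mean:      # stop at the first non-exceeding component
                    break
                n_comp += 1
            return n_comp

        rng = np.random.default_rng(1)
        x = rng.standard_normal((300, 10))
        x[:, 1] += x[:, 0]            # induce one correlated pair
        print(parallel_analysis(x))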

  9. Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)

    NASA Astrophysics Data System (ADS)

    Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee

    2010-12-01

    Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review available statistical seismology software packages.

  10. The ImageJ ecosystem: an open platform for biomedical image analysis

    PubMed Central

    Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368

  11. The ImageJ ecosystem: An open platform for biomedical image analysis.

    PubMed

    Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.

  12. [Sem: a statistical software package adapted for research in oncology].

    PubMed

    Kwiatkowski, F; Girard, M; Hacene, K; Berlie, J

    2000-10-01

    Many software packages have been adapted for medical use; they rarely combine convenient data management with statistics. A recent cooperative effort produced a new program, Sem (Statistics Epidemiology Medicine), which allows data management of trials as well as statistical treatment of the data. Very convenient, it can be used by non-professionals in statistics (biologists, doctors, researchers, data managers), since usually (except with multivariate models) the software itself selects and performs the most appropriate test, after which complementary tests can be requested if needed. The Sem database manager (DBM) is not compatible with usual DBMs, which constitutes a first protection against loss of privacy. Other safeguards (passwords, encryption, etc.) strengthen data security, all the more necessary today since Sem can be run on computer networks. Data organization allows multiplicity: forms can be duplicated per patient. Dates are treated in a special but transparent manner (sorting, date and delay calculations, etc.). Sem communicates with common desktop applications, often with a simple copy/paste, so statistics can be easily performed on data stored in external spreadsheets, and slides can be produced by pasting graphs (survival curves, etc.) with a single mouse click. Already used in over fifty units in different hospitals for daily work, this product, combining data management and statistics, appears to be a convenient and innovative solution.

  13. New generation of exploration tools: interactive modeling software and microcomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krajewski, S.A.

    1986-08-01

    Software packages offering interactive modeling techniques are now available for use on microcomputer hardware systems. These packages are reasonably priced for both company and independent explorationists; they do not require users to have high levels of computer literacy; they are capable of rapidly completing complex ranges of sophisticated geologic and geophysical modeling tasks; and they can produce presentation-quality output for comparison with real-world data. For example, interactive packages are available for mapping, log analysis, seismic modeling, reservoir studies, and financial projects as well as for applying a variety of statistical and geostatistical techniques to analysis of exploration data. More importantly, these packages enable explorationists to directly apply their geologic expertise when developing and fine-tuning models for identifying new prospects and for extending producing fields. As a result of these features, microcomputers and interactive modeling software are becoming common tools in many exploration offices. Gravity and magnetics software programs illustrate some of the capabilities of such exploration tools.

  14. Statistical tools for analysis and modeling of cosmic populations and astronomical time series: CUDAHM and TSE

    NASA Astrophysics Data System (ADS)

    Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.

    2018-01-01

    This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of 10^6 objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of time series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.

  15. Analyzing longitudinal data with the linear mixed models procedure in SPSS.

    PubMed

    West, Brady T

    2009-09-01

    Many applied researchers analyzing longitudinal data share a common misconception: that specialized statistical software is necessary to fit hierarchical linear models (also known as linear mixed models [LMMs], or multilevel models) to longitudinal data sets. Although several specialized statistical software programs of high quality are available that allow researchers to fit these models to longitudinal data sets (e.g., HLM), rapid advances in general purpose statistical software packages have recently enabled analysts to fit these same models when using preferred packages that also enable other more common analyses. One of these general purpose statistical packages is SPSS, which includes a very flexible and powerful procedure for fitting LMMs to longitudinal data sets with continuous outcomes. This article aims to present readers with a practical discussion of how to analyze longitudinal data using the LMMs procedure in the SPSS statistical software package.
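
    The article concerns SPSS's linear mixed models procedure; as a language-neutral illustration of the same model class, here is a minimal sketch of fitting a linear mixed model with a random intercept and random slope per subject using Python's statsmodels package. The data and column names are hypothetical, and the sketch is not a substitute for the SPSS workflow the article describes.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Hypothetical long-format longitudinal data: one row per subject per occasion.
        rng = np.random.default_rng(0)
        n_subj, n_time = 50, 4
        df = pd.DataFrame({
            "subject": np.repeat(np.arange(n_subj), n_time),
            "time": np.tile(np.arange(n_time), n_subj),
        })
        subj_int = rng.normal(0, 2.0, n_subj)      # random intercepts
        subj_slope = rng.normal(0, 0.5, n_subj)    # random slopes for time
        df["y"] = (10 + subj_int[df["subject"]]
                   + (1.5 + subj_slope[df["subject"]]) * df["time"]
                   + rng.normal(0, 1.0, len(df)))

        # Random intercept and random slope for time, grouped by subject.
        model = smf.mixedlm("y ~ time", df, groups=df["subject"], re_formula="~time")
        result = model.fit()
        print(result.summary())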

  16. DnaSAM: Software to perform neutrality testing for large datasets with complex null models.

    PubMed

    Eckert, Andrew J; Liechty, John D; Tearse, Brandon R; Pande, Barnaly; Neale, David B

    2010-05-01

    Patterns of DNA sequence polymorphisms can be used to understand the processes of demography and adaptation within natural populations. High-throughput generation of DNA sequence data has historically been the bottleneck with respect to data processing and experimental inference. Advances in marker technologies have largely solved this problem. Currently, the limiting step is computational, with most molecular population genetic software allowing a gene-by-gene analysis through a graphical user interface. An easy-to-use analysis program that allows both high-throughput processing of multiple sequence alignments along with the flexibility to simulate data under complex demographic scenarios is currently lacking. We introduce a new program, named DnaSAM, which allows high-throughput estimation of DNA sequence diversity and neutrality statistics from experimental data along with the ability to test those statistics via Monte Carlo coalescent simulations. These simulations are conducted using the ms program, which is able to incorporate several genetic parameters (e.g. recombination) and demographic scenarios (e.g. population bottlenecks). The output is a set of diversity and neutrality statistics with associated probability values under a user-specified null model that are stored in an easy-to-manipulate text file. © 2009 Blackwell Publishing Ltd.
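
    DnaSAM itself wraps the ms coalescent simulator; purely to illustrate the logic of testing an observed statistic against a simulated null distribution, the sketch below computes a two-sided Monte Carlo p-value for a generic diversity/neutrality statistic. The simulation step is a user-supplied stand-in (here a toy Gaussian), not a real coalescent model.

        import random

        def monte_carlo_p_value(observed, simulate_null, n_sims=1000):
            """Two-sided Monte Carlo p-value: proportion of simulated statistics
            at least as extreme as the observed value under the null model."""
            sims = [simulate_null() for _ in range(n_sims)]
            mean_null = sum(sims) / n_sims
            extreme = sum(abs(s - mean_null) >= abs(observed - mean_null) for s in sims)
            return (extreme + 1) / (n_sims + 1)   # add-one correction avoids p = 0

        # Stand-in for one coalescent replicate summarized to a single statistic;
        # in practice this value would come from an ms simulation under the chosen null model.
        def simulate_null():
            return random.gauss(0.0, 1.0)

        print(monte_carlo_p_value(observed=-2.1, simulate_null=simulate_null))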

  17. [A Review on the Use of Effect Size in Nursing Research].

    PubMed

    Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae

    2015-10-01

    The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For the t-test, analysis of variance, correlation analysis and regression analysis, which are used frequently in nursing research, the generally accepted definitions of effect size are explained. Some formulae for calculating the effect size are described with several examples from nursing research. Furthermore, the authors present the required minimum sample size for each example using G*Power 3, the most widely used program for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and reliance on only one of them may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measures in order to make more balanced decisions in quantitative analyses.
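
    As a concrete companion to the discussion of effect size and sample size, here is a minimal Python sketch that computes Cohen's d for two independent groups and the per-group sample size needed to detect that effect with 80% power at alpha = 0.05, using statsmodels; the pilot data are made up, and this is not the G*Power workflow described in the article.

        import numpy as np
        from statsmodels.stats.power import TTestIndPower

        def cohens_d(group1, group2):
            """Cohen's d for two independent groups using the pooled standard deviation."""
            g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
            n1, n2 = len(g1), len(g2)
            pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
            return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)

        # Hypothetical pilot data.
        control = [12.1, 13.4, 11.8, 12.9, 13.0, 12.2]
        treated = [13.5, 14.2, 13.9, 14.8, 13.1, 14.0]
        d = cohens_d(treated, control)

        # Required sample size per group for 80% power, two-sided alpha = 0.05.
        n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.80)
        print(f"Cohen's d = {d:.2f}, required n per group ~ {int(np.ceil(n_per_group))}")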

  18. FabricS: A user-friendly, complete and robust software for particle shape-fabric analysis

    NASA Astrophysics Data System (ADS)

    Moreno Chávez, G.; Castillo Rivera, F.; Sarocchi, D.; Borselli, L.; Rodríguez-Sedano, L. A.

    2018-06-01

    Shape-fabric is a textural parameter related to the spatial arrangement of elongated particles in geological samples. Its usefulness spans a range from sedimentary petrology to igneous and metamorphic petrology. Independently of the process being studied, when a material flows, the elongated particles are oriented with the major axis in the direction of flow. In sedimentary petrology this information has been used for studies of paleo-flow direction of turbidites, the origin of quartz sediments, and locating ignimbrite vents, among others. In addition to flow direction and its polarity, the method enables flow rheology to be inferred. The use of shape-fabric has been limited due to the difficulties of automatically measuring particles and analyzing them with reliable circular statistics programs. This has dampened interest in the method for a long time. Shape-fabric measurement has increased in popularity since the 1980s thanks to the development of new image analysis techniques and circular statistics software. However, the programs currently available are unreliable, old and are incompatible with newer operating systems, or require programming skills. The goal of our work is to develop a user-friendly program, in the MATLAB environment, with a graphical user interface, that can process images and includes editing functions, and thresholds (elongation and size) for selecting a particle population and analyzing it with reliable circular statistics algorithms. Moreover, the method also has to produce rose diagrams, orientation vectors, and a complete series of statistical parameters. All these requirements are met by our new software. In this paper, we briefly explain the methodology from collection of oriented samples in the field to the minimum number of particles needed to obtain reliable fabric data. We obtained the data using specific statistical tests and taking into account the degree of iso-orientation of the samples and the required degree of reliability. The program has been verified by means of several simulations performed using appropriately designed features and by analyzing real samples.
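
    As a minimal illustration of the circular-statistics core of such an analysis (not the FabricS code), the sketch below computes the mean orientation and mean resultant length of axial data (particle long-axis orientations in 0-180 degrees) via the usual angle-doubling trick; the sample orientations are invented.

        import numpy as np

        def axial_mean_orientation(angles_deg):
            """Mean orientation and mean resultant length for axial (0-180 deg) data.

            Axial data are made directional by doubling the angles, averaging the
            unit vectors, then halving the resulting mean direction.
            """
            doubled = np.deg2rad(2.0 * np.asarray(angles_deg, float))
            c, s = np.cos(doubled).mean(), np.sin(doubled).mean()
            r = np.hypot(c, s)                      # 0 = no preferred orientation, 1 = perfectly aligned
            mean_dir = np.rad2deg(np.arctan2(s, c)) / 2.0 % 180.0
            return mean_dir, r

        # Hypothetical long-axis orientations of elongated particles (degrees).
        orientations = [172, 5, 178, 10, 2, 169, 8, 175, 12, 1]
        direction, strength = axial_mean_orientation(orientations)
        print(f"mean orientation ~ {direction:.1f} deg, iso-orientation strength R = {strength:.2f}")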

  19. Comparison of requirements and capabilities of major multipurpose software packages.

    PubMed

    Igo, Robert P; Schnell, Audrey H

    2012-01-01

    The aim of this chapter is to introduce the reader to commonly used software packages and illustrate their input requirements, analysis options, strengths, and limitations. We focus on packages that perform more than one function and include a program for quality control, linkage, and association analyses. Additional inclusion criteria were (1) programs that are free to academic users and (2) currently supported, maintained, and developed. Using those criteria, we chose to review three programs: Statistical Analysis for Genetic Epidemiology (S.A.G.E.), PLINK, and Merlin. We will describe the required input format and analysis options. We will not go into detail about every possible program in the packages, but we will give an overview of the packages requirements and capabilities.

  20. User's Guide, software for reduction and analysis of daily weather and surface-water data: Tools for time series analysis of precipitation, temperature, and streamflow data

    USGS Publications Warehouse

    Hereford, Richard

    2006-01-01

    The software described here is used to process and analyze daily weather and surface-water data. The programs are refinements of earlier versions that include minor corrections and routines to calculate frequencies above a threshold on an annual or seasonal basis. Earlier versions of this software were used successfully to analyze historical precipitation patterns of the Mojave Desert and the southern Colorado Plateau regions, ecosystem response to climate variation, and variation of sediment-runoff frequency related to climate (Hereford and others, 2003; 2004; in press; Griffiths and others, 2006). The main program described here (Day_Cli_Ann_v5.3) uses daily data to develop a time series of various statistics for a user specified accounting period such as a year or season. The statistics include averages and totals, but the emphasis is on the frequency of occurrence in days of relatively rare weather or runoff events. These statistics are indices of climate variation; for a discussion of climate indices, see the Climate Research Unit website of the University of East Anglia (http://www.cru.uea.ac.uk/projects/stardex/) and the Climate Change Indices web site (http://cccma.seos.uvic.ca/ETCCDMI/indices.html). Specifically, the indices computed with this software are the frequency of high intensity 24-hour rainfall, unusually warm temperature, and unusually high runoff. These rare, or extreme events, are those greater than the 90th percentile of precipitation, streamflow, or temperature computed for the period of record of weather or gaging stations. If they cluster in time over several decades, extreme events may produce detectable change in the physical landscape and ecosystem of a given region. Although the software has been tested on a variety of data, as with any software, the user should carefully evaluate the results with their data. The programs were designed for the range of precipitation, temperature, and streamflow measurements expected in the semiarid Southwest United States. The user is encouraged to review the examples provided with the software. The software is written in Fortran 90 with Fortran 95 extensions and was compiled with the Digital Visual Fortran compiler version 6.6. The executables run on Windows 2000 and XP, and they operate in a MS-DOS console window that has only very simple graphical options such as font size and color, background color, and size of the window. Error trapping was not written into the programs. Typically, when an error occurs, the console window closes without a message.
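
    To show the kind of index the program computes, here is a minimal pandas sketch (unrelated to the Fortran code described above) that counts, per year, the days whose precipitation exceeds the 90th percentile of the station's period of record; the column names, units, and synthetic data are assumptions.

        import numpy as np
        import pandas as pd

        # Hypothetical daily precipitation record.
        rng = np.random.default_rng(0)
        dates = pd.date_range("1950-01-01", "1999-12-31", freq="D")
        precip = rng.gamma(shape=0.3, scale=5.0, size=len(dates))   # mm/day, skewed like rainfall
        daily = pd.DataFrame({"date": dates, "precip_mm": precip})

        # Threshold from the full period of record (restricting to wet days is another common choice).
        threshold = daily["precip_mm"].quantile(0.90)

        # Annual frequency (in days) of high-intensity 24-hour rainfall.
        annual_extremes = (
            daily.assign(extreme=daily["precip_mm"] > threshold)
                 .groupby(daily["date"].dt.year)["extreme"]
                 .sum()
        )
        print(annual_extremes.head())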

  1. Software for computerised analysis of cardiotocographic traces.

    PubMed

    Romano, M; Bifulco, P; Ruffo, M; Improta, G; Clemente, F; Cesarelli, M

    2016-02-01

    Despite the widespread use of cardiotocography in foetal monitoring, the evaluation of foetal status suffers from considerable inter- and intra-observer variability. In order to overcome the main limitations of visual cardiotocographic assessment, computerised methods to analyse cardiotocographic recordings have recently been developed. In this study, a new software application for automated analysis of foetal heart rate is presented. It provides an automatic procedure for measuring the most relevant parameters derivable from cardiotocographic traces. Simulated and real cardiotocographic traces were analysed to test software reliability. In artificial traces, we simulated a set number of events (accelerations, decelerations and contractions) to be recognised. For real signals, results of the computerised analysis were instead compared with the visual assessment performed by 18 expert clinicians, and three performance indexes were computed to characterise the performance of the proposed software. On artificial signals the software performed satisfactorily, detecting all simulated events and thus fully matching the requirements. The performance indexes computed against the obstetricians' evaluations are, by contrast, less satisfactory: sensitivity equal to 93%, positive predictive value equal to 82% and accuracy equal to 77%. Very probably this arises from the high variability of the trace annotation carried out by the clinicians. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
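
    For reference, the three performance indexes quoted above are simple functions of the event-detection confusion counts; a minimal sketch with illustrative counts (not the study's data):

        def detection_metrics(tp, fp, fn, tn):
            """Sensitivity, positive predictive value and accuracy from confusion counts."""
            sensitivity = tp / (tp + fn)             # detected events / all true events
            ppv = tp / (tp + fp)                     # detected events / all detections
            accuracy = (tp + tn) / (tp + fp + fn + tn)
            return sensitivity, ppv, accuracy

        # Illustrative counts only.
        sens, ppv, acc = detection_metrics(tp=45, fp=10, fn=5, tn=20)
        print(f"sensitivity={sens:.0%}, PPV={ppv:.0%}, accuracy={acc:.0%}")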

  2. Semi-automatic computerized approach to radiological quantification in rheumatoid arthritis

    NASA Astrophysics Data System (ADS)

    Steiner, Wolfgang; Schoeffmann, Sylvia; Prommegger, Andrea; Boegl, Karl; Klinger, Thomas; Peloschek, Philipp; Kainberger, Franz

    2004-04-01

    Rheumatoid arthritis (RA) is a common systemic disease predominantly involving the joints. Precise diagnosis and follow-up therapy require objective quantification. For this purpose, radiological analyses using standardized scoring systems are considered to be the most appropriate method. The aim of our study is to develop semi-automatic image analysis software especially suited to the scoring of joints in rheumatic disorders. The X-Ray RheumaCoach software provides various scoring systems (Larsen score and Ratingen-Rau score) which can be applied by the scorer. In addition to the qualitative assessment of joints performed by the radiologist, a semi-automatic image analysis for joint detection and measurement of bone diameters and swollen tissue supports the image assessment process. More than 3000 radiographs from the hands and feet of more than 200 RA patients were collected, analyzed, and statistically evaluated. Radiographs were quantified using the conventional paper-based Larsen score and the X-Ray RheumaCoach software. The use of the software shortened the scoring time by about 25 percent and reduced the rate of erroneous scorings in all our studies. Compared to paper-based scoring methods, the X-Ray RheumaCoach software offers several advantages: (i) structured data analysis and input that minimizes variance by standardization, (ii) faster and more precise calculation of sum scores and indices, (iii) permanent data storage and fast access to the software's database, (iv) the possibility of cross-calculation to other scores, (v) semi-automatic assessment of images, and (vi) reliable documentation of results in the form of graphical printouts.

  3. ToxMiner Software Interface for Visualizing and Analyzing ToxCast Data

    EPA Science Inventory

    The ToxCast dataset represents a collection of assays and endpoints that will require both standard statistical approaches and customized data analysis workflows. To analyze this unique dataset, we have developed an integrated database with a Java-based interface called ToxMi...

  4. Computer program documentation for the pasture/range condition assessment processor

    NASA Technical Reports Server (NTRS)

    Mcintyre, K. S.; Miller, T. G. (Principal Investigator)

    1982-01-01

    The processor, which drives the RANGE software, allows the user to analyze LANDSAT data containing pasture and rangeland. Analysis includes mapping, generating statistics, calculating vegetative indexes, and plotting vegetative indexes. Routines for using the processor are given. A flow diagram is included.

  5. Bonneville Power Administration Communication Alarm Processor expert system:

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goeltz, R.; Purucker, S.; Tonn, B.

    This report describes the Communications Alarm Processor (CAP), a prototype expert system developed for the Bonneville Power Administration by Oak Ridge National Laboratory. The system is designed to receive and diagnose alarms from Bonneville's Microwave Communications System (MCS). The prototype encompasses one of seven branches of the communications network and a subset of alarm systems and alarm types from each system. The expert system employs a backward chaining approach to diagnosing alarms. Alarms are fed into the expert system directly from the communication system via RS232 ports and sophisticated alarm filtering and mailbox software. Alarm diagnoses are presented to operators for their review and concurrence before the diagnoses are archived. Statistical software is incorporated to allow analysis of archived data for report generation and maintenance studies. The delivered system resides on a Digital Equipment Corporation VAX 3200 workstation and utilizes Nexpert Object and SAS for the expert system and statistical analysis, respectively. 11 refs., 23 figs., 7 tabs.

  6. Impact of Requirements Quality on Project Success or Failure

    NASA Astrophysics Data System (ADS)

    Tamai, Tetsuo; Kamata, Mayumi Itakura

    We are interested in the relationship between the quality of the requirements specifications for software projects and the subsequent outcome of the projects. To examine this relationship, we investigated 32 projects started and completed between 2003 and 2005 by the software development division of a large company in Tokyo. The company has collected reliable data on requirements specification quality, as evaluated by software quality assurance teams, and overall project performance data relating to cost and time overruns. The data for requirements specification quality were first converted into a multiple-dimensional space, with each dimension corresponding to an item of the recommended structure for software requirements specifications (SRS) defined in IEEE Std. 830-1998. We applied various statistical analysis methods to the SRS quality data and project outcomes.

  7. Experimental control in software reliability certification

    NASA Technical Reports Server (NTRS)

    Trammell, Carmen J.; Poore, Jesse H.

    1994-01-01

    There is growing interest in software 'certification', i.e., confirmation that software has performed satisfactorily under a defined certification protocol. Regulatory agencies, customers, and prospective reusers all want assurance that a defined product standard has been met. In other industries, products are typically certified under protocols in which random samples of the product are drawn, tests characteristic of operational use are applied, analytical or statistical inferences are made, and products meeting a standard are 'certified' as fit for use. A warranty statement is often issued upon satisfactory completion of a certification protocol. This paper outlines specific engineering practices that must be used to preserve the validity of the statistical certification testing protocol. The assumptions associated with a statistical experiment are given, and their implications for statistical testing of software are described.

  8. Online Statistical Modeling (Regression Analysis) for Independent Responses

    NASA Astrophysics Data System (ADS)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed for analyzing quantitative data, especially for modelling the relationship between response and explanatory variables. Statistical models have been developed in various directions to handle diverse data types and complex relationships. A rich variety of advanced and recent statistical models is mostly available in open source software (one of them being R). However, these advanced modelling tools are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny) so that the most recent and advanced statistical models are readily available, accessible and applicable on the web. We have previously built interfaces in the form of e-tutorials for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes competing models easier to compare in order to find the most appropriate model for the data.

  9. Computer image analysis in obtaining characteristics of images: greenhouse tomatoes in the process of generating learning sets of artificial neural networks

    NASA Astrophysics Data System (ADS)

    Zaborowicz, M.; Przybył, J.; Koszela, K.; Boniecki, P.; Mueller, W.; Raba, B.; Lewicki, A.; Przybył, K.

    2014-04-01

    The aim of the project was to develop software that extracts the characteristics of a greenhouse tomato from its image. Data gathered during image analysis and processing were used to build learning sets for artificial neural networks. The program can process pictures in JPEG format, acquire statistical information about the picture and export it to an external file. The software is intended to batch-analyze the collected research material, with the obtained information saved as a CSV file. The program computes 33 independent parameters that together describe the tested image. The application is dedicated to the processing and image analysis of greenhouse tomatoes, but it can also be used to analyze other fruits and vegetables of spherical shape.

  10. Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.

    PubMed

    Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J

    2015-01-01

    The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, whereas chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating domain experts' knowledge into the selection process is needed to increase confidence in the final set of descriptors. In this paper, a software tool named Visual and Interactive DEscriptor ANalysis (VIDEAN) is proposed, which combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of the data, aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The capabilities of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets, or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way. It is therefore possible to conclude that the resulting tool allows the integration of a chemist's expertise into the descriptor selection process with low cognitive effort, in contrast with the alternative of an ad-hoc manual analysis of the selected descriptors. Graphical abstract: VIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.

  11. Two-sample statistics for testing the equality of survival functions against improper semi-parametric accelerated failure time alternatives: an application to the analysis of a breast cancer clinical trial.

    PubMed

    Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry

    2004-06-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.
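
    The score statistics proposed in the paper are not part of standard libraries; purely as a generic stand-in for a two-sample comparison of survival functions, here is a minimal sketch using the log-rank test from the lifelines Python package with made-up data. Note that the paper's tests additionally target short- and long-term effects separately, which a plain log-rank test does not do.

        import numpy as np
        from lifelines.statistics import logrank_test

        # Hypothetical trial arms: follow-up times (months) and event indicators (1 = event observed).
        rng = np.random.default_rng(0)
        time_a = rng.exponential(24, 80)
        time_b = rng.exponential(30, 80)
        event_a = rng.binomial(1, 0.7, 80)
        event_b = rng.binomial(1, 0.6, 80)

        result = logrank_test(time_a, time_b,
                              event_observed_A=event_a, event_observed_B=event_b)
        print(f"chi2 = {result.test_statistic:.2f}, p = {result.p_value:.3f}")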

  12. Senior Computational Scientist | Center for Cancer Research

    Cancer.gov

    The Basic Science Program (BSP) pursues independent, multidisciplinary research in basic and applied molecular biology, immunology, retrovirology, cancer biology, and human genetics. Research efforts and support are an integral part of the Center for Cancer Research (CCR) at the Frederick National Laboratory for Cancer Research (FNLCR). The Cancer & Inflammation Program (CIP), Basic Science Program, HLA Immunogenetics Section, under the leadership of Dr. Mary Carrington, studies the influence of human leukocyte antigens (HLA) and specific KIR/HLA genotypes on risk of and outcomes to infection, cancer, autoimmune disease, and maternal-fetal disease. Recent studies have focused on the impact of HLA gene expression in disease, the molecular mechanism regulating expression levels, and the functional basis for the effect of differential expression on disease outcome. The lab’s further focus is on the genetic basis for resistance/susceptibility to disease conferred by immunogenetic variation. Key roles/responsibilities: the Senior Computational Scientist will provide research support to the CIP-BSP-HLA Immunogenetics Section, performing biostatistical design, analysis and reporting of research projects conducted in the lab. This individual will be involved in the implementation of statistical models and data preparation. The successful candidate should have 5 or more years of competent, innovative biostatistics/bioinformatics research experience beyond doctoral training; considerable experience with statistical software such as SAS, R and S-Plus; sound knowledge and demonstrated experience of theoretical and applied statistics; the ability to write program code to analyze data using statistical analysis software; and the ability to contribute to the interpretation and publication of research results.

  13. A Matlab user interface for the statistically assisted fluid registration algorithm and tensor-based morphometry

    NASA Astrophysics Data System (ADS)

    Yepes-Calderon, Fernando; Brun, Caroline; Sant, Nishita; Thompson, Paul; Lepore, Natasha

    2015-01-01

    Tensor-Based Morphometry (TBM) is an increasingly popular method for group analysis of brain MRI data. The main steps in the analysis consist of a nonlinear registration to align each individual scan to a common space, and a subsequent statistical analysis to determine morphometric differences, or difference in fiber structure between groups. Recently, we implemented the Statistically-Assisted Fluid Registration Algorithm, or SAFIRA [1], which is designed for tracking morphometric differences among populations. To this end, SAFIRA allows the inclusion of statistical priors extracted from the populations being studied as regularizers in the registration. This flexibility and degree of sophistication limit the tool to expert use, even more so considering that SAFIRA was initially implemented in command line mode. Here, we introduce a new, intuitive, easy to use, Matlab-based graphical user interface for SAFIRA's multivariate TBM. The interface also generates different choices for the TBM statistics, including both the traditional univariate statistics on the Jacobian matrix, and comparison of the full deformation tensors [2]. This software will be freely disseminated to the neuroimaging research community.

  14. Finding the Root Causes of Statistical Inconsistency in Community Earth System Model Output

    NASA Astrophysics Data System (ADS)

    Milroy, D.; Hammerling, D.; Baker, A. H.

    2017-12-01

    Baker et al (2015) developed the Community Earth System Model Ensemble Consistency Test (CESM-ECT) to provide a metric for software quality assurance by determining statistical consistency between an ensemble of CESM outputs and new test runs. The test has proved useful for detecting statistical difference caused by compiler bugs and errors in physical modules. However, detection is only the necessary first step in finding the causes of statistical difference. The CESM is a vastly complex model comprised of millions of lines of code which is developed and maintained by a large community of software engineers and scientists. Any root cause analysis is correspondingly challenging. We propose a new capability for CESM-ECT: identifying the sections of code that cause statistical distinguishability. The first step is to discover CESM variables that cause CESM-ECT to classify new runs as statistically distinct, which we achieve via Randomized Logistic Regression. Next we use a tool developed to identify CESM components that define or compute the variables found in the first step. Finally, we employ the application Kernel GENerator (KGEN) created in Kim et al (2016) to detect fine-grained floating point differences. We demonstrate an example of the procedure and advance a plan to automate this process in our future work.
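
    The variable-discovery step can be pictured as fitting many L1-penalized logistic regressions on resampled data and keeping the output variables that are selected most often. The sketch below is only a hedged illustration of that general "randomized logistic regression" idea using scikit-learn and toy data; it is not the CESM-ECT implementation, and the subsample fraction and penalty strength are assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def selection_frequencies(X, y, n_rounds=100, subsample=0.75, C=0.1, seed=0):
            """Fraction of subsampled L1-logistic fits in which each feature gets a
            nonzero coefficient (features = model output variables, y = pass/fail label)."""
            rng = np.random.default_rng(seed)
            n, p = X.shape
            counts = np.zeros(p)
            for _ in range(n_rounds):
                idx = rng.choice(n, size=int(subsample * n), replace=False)
                clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
                clf.fit(X[idx], y[idx])
                counts += (np.abs(clf.coef_[0]) > 1e-8)
            return counts / n_rounds

        # Toy data: 200 runs, 20 variables, only the first two actually drive the label.
        rng = np.random.default_rng(1)
        X = rng.standard_normal((200, 20))
        y = (X[:, 0] + 0.8 * X[:, 1] + 0.3 * rng.standard_normal(200) > 0).astype(int)
        print(np.round(selection_frequencies(X, y), 2)[:5])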

  15. [Statistical analysis of German radiologic periodicals: developmental trends in the last 10 years].

    PubMed

    Golder, W

    1999-09-01

    To identify which statistical tests are applied in German radiological publications, to what extent their use has changed during the last decade, and which factors might be responsible for this development. The major articles published in "ROFO" and "DER RADIOLOGE" during 1988, 1993 and 1998 were reviewed for statistical content. The contributions were classified by principal focus and radiological subspecialty. The methods used were assigned to descriptive, basal and advanced statistics. Sample size, significance level and power were established. The use of experts' assistance was monitored. Finally, we calculated the so-called cumulative accessibility of the publications. 525 contributions were found to be eligible. In 1988, 87% used descriptive statistics only, 12.5% basal, and 0.5% advanced statistics. The corresponding figures in 1993 and 1998 are 62 and 49%, 32 and 41%, and 6 and 10%, respectively. Statistical techniques were most likely to be used in research on musculoskeletal imaging and articles dedicated to MRI. Six basic categories of statistical methods account for the complete statistical analysis appearing in 90% of the articles. ROC analysis is the single most common advanced technique. Authors make increasingly use of statistical experts' opinion and programs. During the last decade, the use of statistical methods in German radiological journals has fundamentally improved, both quantitatively and qualitatively. Presently, advanced techniques account for 20% of the pertinent statistical tests. This development seems to be promoted by the increasing availability of statistical analysis software.

  16. Multivariate statistical analysis software technologies for astrophysical research involving large data bases

    NASA Technical Reports Server (NTRS)

    Djorgovski, George

    1993-01-01

    The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.

  17. Multivariate statistical analysis software technologies for astrophysical research involving large data bases

    NASA Technical Reports Server (NTRS)

    Djorgovski, Stanislav

    1992-01-01

    The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multi parameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.

  18. pyLIMA : The first open source microlensing modeling software

    NASA Astrophysics Data System (ADS)

    Bachelet, Etienne; Street, Rachel; Bozza, Valerio

    2018-01-01

    Microlensing is highly sensitive to planets beyond the snowline distributed along the line of sight towards the Galactic Bulge. The WFIRST-AFTA mission should detect about 3000 of these planets and significantly improve our knowledge of planet formation and statistics, complementing results found by transit and radial velocity methods. However, the modeling of microlensing events is challenging in several respects, leading to a highly time-consuming analysis. After a quick summary of these challenges, I will present pyLIMA, the first open source microlensing modeling software. The goals of this software are to be flexible, powerful and user friendly. This presentation will focus on various cases and early results.

  19. Rapid processing of PET list-mode data for efficient uncertainty estimation and data analysis

    NASA Astrophysics Data System (ADS)

    Markiewicz, P. J.; Thielemans, K.; Schott, J. M.; Atkinson, D.; Arridge, S. R.; Hutton, B. F.; Ourselin, S.

    2016-07-01

    In this technical note we propose a rapid and scalable software solution for the processing of PET list-mode data, which allows the efficient integration of list mode data processing into the workflow of image reconstruction and analysis. All processing is performed on the graphics processing unit (GPU), making use of streamed and concurrent kernel execution together with data transfers between disk and CPU memory as well as CPU and GPU memory. This approach leads to fast generation of multiple bootstrap realisations, and when combined with fast image reconstruction and analysis, it enables assessment of uncertainties of any image statistic and of any component of the image generation process (e.g. random correction, image processing) within reasonable time frames (e.g. within five minutes per realisation). This is of particular value when handling complex chains of image generation and processing. The software outputs the following: (1) estimate of expected random event data for noise reduction; (2) dynamic prompt and random sinograms of span-1 and span-11 and (3) variance estimates based on multiple bootstrap realisations of (1) and (2) assuming reasonable count levels for acceptable accuracy. In addition, the software produces statistics and visualisations for immediate quality control and crude motion detection, such as: (1) count rate curves; (2) centre of mass plots of the radiodistribution for motion detection; (3) video of dynamic projection views for fast visual list-mode skimming and inspection; (4) full normalisation factor sinograms. To demonstrate the software, we present an example of the above processing for fast uncertainty estimation of regional SUVR (standard uptake value ratio) calculation for a single PET scan of 18F-florbetapir using the Siemens Biograph mMR scanner.
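
    The uncertainty-estimation idea described above is that of the nonparametric bootstrap applied to list-mode events; the sketch below is a generic Python illustration (not the GPU implementation) that bootstraps a simple statistic of a list of event values to obtain its standard error. The event values and the choice of statistic are placeholders.

        import numpy as np

        def bootstrap_se(events, statistic, n_boot=200, seed=0):
            """Standard error of `statistic` over bootstrap resamples of the event list.

            Each realisation resamples the events with replacement (as a list-mode
            bootstrap does with detected coincidences) and recomputes the statistic.
            """
            rng = np.random.default_rng(seed)
            events = np.asarray(events)
            stats = [statistic(rng.choice(events, size=len(events), replace=True))
                     for _ in range(n_boot)]
            return np.std(stats, ddof=1)

        # Toy stand-in for an image statistic (e.g. a regional uptake ratio).
        rng = np.random.default_rng(1)
        events = rng.gamma(2.0, 1.5, size=5000)
        print(f"mean = {events.mean():.3f} +/- {bootstrap_se(events, np.mean):.3f}")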

  20. GenomeGraphs: integrated genomic data visualization with R.

    PubMed

    Durinck, Steffen; Bullard, James; Spellman, Paul T; Dudoit, Sandrine

    2009-01-06

    Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses. We developed GenomeGraphs, as an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system. GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.

  1. Final Report: Quantification of Uncertainty in Extreme Scale Computations (QUEST)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef; Conrad, Patrick; Bigoni, Daniele

    QUEST (www.quest-scidac.org) is a SciDAC Institute that is focused on uncertainty quantification (UQ) in large-scale scientific computations. Our goals are to (1) advance the state of the art in UQ mathematics, algorithms, and software; and (2) provide modeling, algorithmic, and general UQ expertise, together with software tools, to other SciDAC projects, thereby enabling and guiding a broad range of UQ activities in their respective contexts. QUEST is a collaboration among six institutions (Sandia National Laboratories, Los Alamos National Laboratory, the University of Southern California, Massachusetts Institute of Technology, the University of Texas at Austin, and Duke University) with a history of joint UQ research. Our vision encompasses all aspects of UQ in leadership-class computing. This includes the well-founded setup of UQ problems; characterization of the input space given available data/information; local and global sensitivity analysis; adaptive dimensionality and order reduction; forward and inverse propagation of uncertainty; handling of application code failures, missing data, and hardware/software fault tolerance; and model inadequacy, comparison, validation, selection, and averaging. The nature of the UQ problem requires the seamless combination of data, models, and information across this landscape in a manner that provides a self-consistent quantification of requisite uncertainties in predictions from computational models. Accordingly, our UQ methods and tools span an interdisciplinary space across applied math, information theory, and statistics. The MIT QUEST effort centers on statistical inference and methods for surrogate or reduced-order modeling. MIT personnel have been responsible for the development of adaptive sampling methods, methods for approximating computationally intensive models, and software for both forward uncertainty propagation and statistical inverse problems. A key software product of the MIT QUEST effort is the MIT Uncertainty Quantification library, called MUQ (muq.mit.edu).

  2. Asthma in Adults Fact Sheet

    MedlinePlus

    ... American Lung Association Epidemiology and Statistics Unit using SPSS software. Centers for Disease Control and Prevention. ...

  3. Heuristic Identification of Biological Architectures for Simulating Complex Hierarchical Genetic Interactions

    PubMed Central

    Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C

    2015-01-01

    Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175
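
    A liability-threshold simulation with an explicit three-way interaction term, in the spirit of (but much simpler than) the HIBACHI models described above, might look like the following sketch. The allele frequency, effect sizes and case threshold are invented for illustration.

      # Minimal liability-threshold sketch with a three-way epistatic term;
      # allele frequencies and effect sizes are hypothetical.
      import numpy as np

      rng = np.random.default_rng(42)
      n = 5000
      maf = 0.3
      G = rng.binomial(2, maf, size=(n, 3))          # three biallelic variants coded 0/1/2

      # liability = small main effects + a three-way epistatic term + residual noise
      liability = (0.1 * G.sum(axis=1)
                   + 0.8 * (G[:, 0] * G[:, 1] * G[:, 2])
                   + rng.normal(0.0, 1.0, size=n))

      threshold = np.quantile(liability, 0.8)        # top 20% of liability become cases
      case = (liability > threshold).astype(int)
      print("prevalence:", case.mean())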

  4. Data analysis of the benefits of an electronic registry of information in a neonatal intensive care unit in Greece.

    PubMed

    Skouroliakou, Maria; Soloupis, George; Gounaris, Antonis; Charitou, Antonia; Papasarantopoulos, Petros; Markantonis, Sophia L; Golna, Christina; Souliotis, Kyriakos

    2008-07-28

    This study assesses the results of implementation of a software program that allows for input of admission/discharge summary data (including cost) in a neonatal intensive care unit (NICU) in Greece, based on the establishment of a baseline statistical database for infants treated in a NICU and the statistical analysis of epidemiological and resource utilization data thus collected. A software tool was designed, developed, and implemented between April 2004 and March 2005 in the NICU of the LITO private maternity hospital in Athens, Greece, to allow for the first time for step-by-step collection and management of summary treatment data. Data collected over this period were subsequently analyzed using defined indicators as a basis to extract results related to treatment options, treatment duration, and relative resource utilization. Data for 499 babies were entered in the tool and processed. Information on medical costs (e.g., mean total cost ± SD of treatment was €310.44 ± 249.17 and €6704.27 ± 4079.53 for babies weighing more than 2500 g and 1000-1500 g respectively), incidence of complications or disease (e.g., 4.3 percent and 14.3 percent of study babies weighing 1,000 to 1,500 g suffered from cerebral bleeding [grade I] and bronchopulmonary dysplasia, respectively, while overall 6.0 percent had microbial infections), and medical statistics (e.g., perinatal mortality was 6.8 percent) was obtained in a quick and robust manner. The software tool allowed for collection and analysis of data traditionally maintained in paper medical records in the NICU with greater ease and accuracy. Data codification and analysis led to significant findings at the epidemiological, medical resource utilization, and respective hospital cost levels that allowed comparisons with literature findings for the first time in Greece. The tool thus contributed to a clearer understanding of treatment practices in the NICU and set the baseline for the assessment of the impact of future interventions at the policy or hospital level.

  5. medplot: a web application for dynamic summary and analysis of longitudinal medical data based on R.

    PubMed

    Ahlin, Črt; Stupica, Daša; Strle, Franc; Lusa, Lara

    2015-01-01

    In biomedical studies the patients are often evaluated numerous times and a large number of variables are recorded at each time-point. Data entry and manipulation of longitudinal data can be performed using spreadsheet programs, which usually include some data plotting and analysis capabilities and are straightforward to use, but are not designed for the analyses of complex longitudinal data. Specialized statistical software offers more flexibility and capabilities, but first time users with biomedical background often find its use difficult. We developed medplot, an interactive web application that simplifies the exploration and analysis of longitudinal data. The application can be used to summarize, visualize and analyze data by researchers that are not familiar with statistical programs and whose knowledge of statistics is limited. The summary tools produce publication-ready tables and graphs. The analysis tools include features that are seldom available in spreadsheet software, such as correction for multiple testing, repeated measurement analyses and flexible non-linear modeling of the association of the numerical variables with the outcome. medplot is freely available and open source, it has an intuitive graphical user interface (GUI), it is accessible via the Internet and can be used within a web browser, without the need for installing and maintaining programs locally on the user's computer. This paper describes the application and gives detailed examples describing how to use the application on real data from a clinical study including patients with early Lyme borreliosis.

  6. Data processing has major impact on the outcome of quantitative label-free LC-MS analysis.

    PubMed

    Chawade, Aakash; Sandin, Marianne; Teleman, Johan; Malmström, Johan; Levander, Fredrik

    2015-02-06

    High-throughput multiplexed protein quantification using mass spectrometry is steadily increasing in popularity, with the two major techniques being data-dependent acquisition (DDA) and targeted acquisition using selected reaction monitoring (SRM). However, both techniques involve extensive data processing, which can be performed by a multitude of different software solutions. Analysis of quantitative LC-MS/MS data is mainly performed in three major steps: processing of raw data, normalization, and statistical analysis. To evaluate the impact of data processing steps, we developed two new benchmark data sets, one each for DDA and SRM, with samples consisting of a long-range dilution series of synthetic peptides spiked in a total cell protein digest. The generated data were processed by eight different software workflows and three postprocessing steps. The results show that the choice of the raw data processing software and the postprocessing steps play an important role in the final outcome. Also, the linear dynamic range of the DDA data could be extended by an order of magnitude through feature alignment and a charge state merging algorithm proposed here. Furthermore, the benchmark data sets are made publicly available for further benchmarking and software developments.

  7. Find Pairs: The Module for Protein Quantification of the PeakQuant Software Suite

    PubMed Central

    Eisenacher, Martin; Kohl, Michael; Wiese, Sebastian; Hebeler, Romano; Meyer, Helmut E.

    2012-01-01

    Abstract Accurate quantification of proteins is one of the major tasks in current proteomics research. To address this issue, a wide range of stable isotope labeling techniques have been developed, allowing one to quantitatively study thousands of proteins by means of mass spectrometry. In this article, the FindPairs module of the PeakQuant software suite is detailed. It facilitates the automatic determination of protein abundance ratios based on the automated analysis of stable isotope-coded mass spectrometric data. Furthermore, it implements statistical methods to determine outliers due to biological as well as technical variance of proteome data obtained in replicate experiments. This provides an important means to evaluate the significance in obtained protein expression data. For demonstrating the high applicability of FindPairs, we focused on the quantitative analysis of proteome data acquired in 14N/15N labeling experiments. We further provide a comprehensive overview of the features of the FindPairs software, and compare these with existing quantification packages. The software presented here supports a wide range of proteomics applications, allowing one to quantitatively assess data derived from different stable isotope labeling approaches, such as 14N/15N labeling, SILAC, and iTRAQ. The software is publicly available at http://www.medizinisches-proteom-center.de/software and free for academic use. PMID:22909347

  8. The microcomputer scientific software series 3: general linear model--analysis of variance.

    Treesearch

    Harold M. Rauscher

    1985-01-01

    A BASIC language set of programs, designed for use on microcomputers, is presented. This set of programs will perform the analysis of variance for any statistical model describing either balanced or unbalanced designs. The program computes and displays the degrees of freedom, Type I sum of squares, and the mean square for the overall model, the error, and each factor...

  9. How Can Single-Case Data Be Analyzed? Software Resources, Tutorial, and Reflections on Analysis.

    PubMed

    Manolov, Rumen; Moeyaert, Mariola

    2017-03-01

    The present article aims to present a series of software developments in the quantitative analysis of data obtained via single-case experimental designs (SCEDs), as well as the tutorial describing these developments. The tutorial focuses on software implementations based on freely available platforms such as R and aims to bring statistical advances closer to applied researchers and help them become autonomous agents in the data analysis stage of a study. The range of analyses dealt with in the tutorial is illustrated on a typical single-case dataset, relying heavily on graphical data representations. We illustrate how visual and quantitative analyses can be used jointly, giving complementary information and helping the researcher decide whether there is an intervention effect, how large it is, and whether it is practically significant. To help applied researchers in the use of the analyses, we have organized the data in the different ways required by the different analytical procedures and made these data available online. We also provide Internet links to all free software available, as well as all the main references to the analytical techniques. Finally, we suggest that appropriate and informative data analysis is likely to be a step forward in documenting and communicating results and also for increasing the scientific credibility of SCEDs.

  10. Effectiveness of Simulation in a Hybrid and Online Networking Course.

    ERIC Educational Resources Information Center

    Cameron, Brian H.

    2003-01-01

    Reports on a study that compares the performance of students enrolled in two sections of a Web-based computer networking course: one utilizing a simulation package and the second utilizing a static, graphical software package. Analysis shows statistically significant improvements in performance in the simulation group compared to the…

  11. AnClim and ProClimDB software for data quality control and homogenization of time series

    NASA Astrophysics Data System (ADS)

    Stepanek, Petr

    2015-04-01

    During the last decade, a software package consisting of AnClim, ProClimDB and LoadData for processing (mainly climatological) data has been created. This software offers a complete solution for the processing of climatological time series, starting from loading the data from a central database (e.g. Oracle, via the LoadData software), through data quality control and homogenization, to time series analysis, extreme value evaluation and RCM output verification and correction (ProClimDB and AnClim software). The detection of inhomogeneities is carried out on a monthly scale through the application of AnClim, or more recently by R functions called from ProClimDB, while quality control, the preparation of reference series and the correction of detected breaks are carried out by the ProClimDB software. The software combines many statistical tests, types of reference series and time scales (monthly, seasonal and annual, daily and sub-daily). These can be used to create an "ensemble" of solutions, which may be more reliable than any single method. The AnClim software is suitable for educational purposes, e.g. for students getting acquainted with methods used in climatology. Built-in graphical tools and the comparison of various statistical tests help in better understanding a given method. ProClimDB is, by contrast, a tool aimed at processing large climatological datasets. Recently, functions from R can be used within the software, making it more efficient in data processing and capable of easily including new methods (when available in R). An example of usage is the easy comparison of methods for the correction of inhomogeneities in daily data (HOM of Paul Della-Marta, the SPLIDHOM method of Olivier Mestre, DAP - the author's own method, QM of Xiaolan Wang, and others). The software is available, together with further information, at www.climahom.eu. Acknowledgement: this work was partially funded by the project "Building up a multidisciplinary scientific team focused on drought" No. CZ.1.07/2.3.00/20.0248.

  12. Bone histomorphometry using free and commonly available software.

    PubMed

    Egan, Kevin P; Brennan, Tracy A; Pignolo, Robert J

    2012-12-01

    Histomorphometric analysis is a widely used technique to assess changes in tissue structure and function. Commercially available programs that measure histomorphometric parameters can be cost-prohibitive. In this study, we compared an inexpensive method of histomorphometry to a current proprietary software program. Image J and Adobe Photoshop(®) were used to measure static and kinetic bone histomorphometric parameters. Photomicrographs of Goldner's trichrome-stained femurs were used to generate black-and-white image masks, representing bone and non-bone tissue, respectively, in Adobe Photoshop(®) . The masks were used to quantify histomorphometric parameters (bone volume, tissue volume, osteoid volume, mineralizing surface and interlabel width) in Image J. The resultant values obtained using Image J and the proprietary software were compared and differences found to be statistically non-significant. The wide-ranging use of histomorphometric analysis for assessing the basic morphology of tissue components makes it important to have affordable and accurate measurement options available for a diverse range of applications. Here we have developed and validated an approach to histomorphometry using commonly and freely available software that is comparable to a much more costly, commercially available software program. © 2012 Blackwell Publishing Limited.

  13. Bone histomorphometry using free and commonly available software

    PubMed Central

    Egan, Kevin P.; Brennan, Tracy A.; Pignolo, Robert J.

    2012-01-01

    Aims Histomorphometric analysis is a widely used technique to assess changes in tissue structure and function. Commercially-available programs that measure histomorphometric parameters can be cost prohibitive. In this study, we compared an inexpensive method of histomorphometry to a current proprietary software program. Methods and results Image J and Adobe Photoshop® were used to measure static and kinetic bone histomorphometric parameters. Photomicrographs of Goldner’s Trichrome stained femurs were used to generate black and white image masks, representing bone and non-bone tissue, respectively, in Adobe Photoshop®. The masks were used to quantify histomorphometric parameters (bone volume, tissue volume, osteoid volume, mineralizing surface, and interlabel width) in Image J. The resultant values obtained using Image J and the proprietary software were compared and found to be statistically non-significant. Conclusions The wide ranging use of histomorphometric analysis for assessing the basic morphology of tissue components makes it important to have affordable and accurate measurement options that are available for a diverse range of applications. Here we have developed and validated an approach to histomorphometry using commonly and freely available software that is comparable to a much more costly, commercially-available software program. PMID:22882309
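
    The mask-based measurement idea described in the two records above can be reduced to a few lines: given a binary bone mask and a tissue mask, the area (and hence 2D "volume") fractions follow from pixel counts and the image calibration. The masks and the 2.5 µm/pixel calibration in the sketch below are synthetic stand-ins, not the study's Photoshop/Image J workflow.

      # Sketch of mask-based histomorphometry: compute bone area, tissue area and
      # BV/TV from binary masks. Masks and calibration are hypothetical.
      import numpy as np

      rng = np.random.default_rng(1)
      tissue_mask = np.ones((512, 512), dtype=bool)          # whole tissue section
      bone_mask = rng.random((512, 512)) < 0.25              # hypothetical bone pixels

      pixel_area_um2 = 2.5 ** 2                              # hypothetical 2.5 um/pixel calibration
      bone_area = bone_mask.sum() * pixel_area_um2
      tissue_area = tissue_mask.sum() * pixel_area_um2

      print("Bone area (um^2):", bone_area)
      print("Tissue area (um^2):", tissue_area)
      print("BV/TV (%):", 100.0 * bone_area / tissue_area)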

  14. LADES: a software for constructing and analyzing longitudinal designs in biomedical research.

    PubMed

    Vázquez-Alcocer, Alan; Garzón-Cortes, Daniel Ladislao; Sánchez-Casas, Rosa María

    2014-01-01

    One of the most important steps in biomedical longitudinal studies is choosing a good experimental design that can provide high accuracy in the analysis of results with a minimum sample size. Several methods for constructing efficient longitudinal designs have been developed based on power analysis and the statistical model used for analyzing the final results. However, development of this technology is not available to practitioners through user-friendly software. In this paper we introduce LADES (Longitudinal Analysis and Design of Experiments Software) as an alternative and easy-to-use tool for conducting longitudinal analysis and constructing efficient longitudinal designs. LADES incorporates methods for creating cost-efficient longitudinal designs, unequal longitudinal designs, and simple longitudinal designs. In addition, LADES includes different methods for analyzing longitudinal data such as linear mixed models, generalized estimating equations, among others. A study of European eels is reanalyzed in order to show LADES capabilities. Three treatments contained in three aquariums with five eels each were analyzed. Data were collected from 0 up to the 12th week post treatment for all the eels (complete design). The response under evaluation is sperm volume. A linear mixed model was fitted to the results using LADES. The complete design had a power of 88.7% using 15 eels. With LADES we propose the use of an unequal design with only 14 eels and 89.5% efficiency. LADES was developed as a powerful and simple tool to promote the use of statistical methods for analyzing and creating longitudinal experiments in biomedical research.
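
    As a rough illustration of the kind of analysis LADES supports, the sketch below fits a random-intercept linear mixed model to a made-up longitudinal data set using statsmodels; the column names, treatment coding and simulated values are hypothetical and are not the eel study's data.

      # Hedged sketch of a linear mixed model for longitudinal data (random
      # intercept per subject, fixed effects for time and treatment).
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(7)
      eels, weeks = 15, np.arange(0, 13)
      df = pd.DataFrame([
          {"eel": i, "treatment": f"T{i % 3}", "week": w,
           "sperm_volume": 1.0 + 0.2 * w + rng.normal(0, 0.5)}
          for i in range(eels) for w in weeks
      ])

      model = smf.mixedlm("sperm_volume ~ week + treatment", df, groups=df["eel"])
      result = model.fit()
      print(result.summary())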

  15. Novel Assessment of Interstitial Lung Disease Using the "Computer-Aided Lung Informatics for Pathology Evaluation and Rating" (CALIPER) Software System in Idiopathic Inflammatory Myopathies.

    PubMed

    Ungprasert, Patompong; Wilton, Katelynn M; Ernste, Floranne C; Kalra, Sanjay; Crowson, Cynthia S; Rajagopalan, Srinivasan; Bartholmai, Brian J

    2017-10-01

    To evaluate the correlation between measurements from quantitative thoracic high-resolution CT (HRCT) analysis with "Computer-Aided Lung Informatics for Pathology Evaluation and Rating" (CALIPER) software and measurements from pulmonary function tests (PFTs) in patients with idiopathic inflammatory myopathies (IIM)-associated interstitial lung disease (ILD). A cohort of patients with IIM-associated ILD seen at Mayo Clinic was identified from medical record review. Retrospective analysis of HRCT data and PFTs at baseline and 1 year was performed. The abnormalities in HRCT were quantified using CALIPER software. A total of 110 patients were identified. At baseline, total interstitial abnormalities as measured by CALIPER, both by absolute volume and by percentage of total lung volume, had a significant negative correlation with diffusing capacity for carbon monoxide (DLCO), total lung capacity (TLC), and oxygen saturation. Analysis by subtype of interstitial abnormality revealed significant negative correlations between ground glass opacities (GGO) and reticular density (RD) with DLCO and TLC. At one year, changes of total interstitial abnormalities compared with baseline had a significant negative correlation with changes of TLC and oxygen saturation. A negative correlation between changes of total interstitial abnormalities and DLCO was also observed, but it was not statistically significant. Analysis by subtype of interstitial abnormality revealed negative correlations between changes of GGO and RD and changes of DLCO, TLC, and oxygen saturation, but most of the correlations did not achieve statistical significance. CALIPER measurements correlate well with functional measurements in patients with IIM-associated ILD.

  16. PIV Data Validation Software Package

    NASA Technical Reports Server (NTRS)

    Blackshire, James L.

    1997-01-01

    A PIV data validation and post-processing software package was developed to provide semi-automated data validation and data reduction capabilities for Particle Image Velocimetry data sets. The software provides three primary capabilities including (1) removal of spurious vector data, (2) filtering, smoothing, and interpolating of PIV data, and (3) calculations of out-of-plane vorticity, ensemble statistics, and turbulence statistics information. The software runs on an IBM PC/AT host computer working either under Microsoft Windows 3.1 or Windows 95 operating systems.
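
    The out-of-plane vorticity calculation mentioned above amounts to a finite-difference evaluation of omega_z = dv/dx - du/dy on the PIV grid. The sketch below shows one way to do this with NumPy on a synthetic velocity field; the grid spacing and velocity components are invented, and the original package's validation, filtering and smoothing steps are omitted.

      # Out-of-plane vorticity on a regular grid via finite differences.
      import numpy as np

      nx, ny, dx, dy = 64, 64, 1.0e-3, 1.0e-3          # hypothetical 1 mm grid spacing
      x = np.arange(nx) * dx
      y = np.arange(ny) * dy
      X, Y = np.meshgrid(x, y, indexing="xy")

      u = -np.sin(2 * np.pi * Y / y.max())             # synthetic velocity components
      v = np.sin(2 * np.pi * X / x.max())

      dv_dx = np.gradient(v, dx, axis=1)               # d/dx along columns
      du_dy = np.gradient(u, dy, axis=0)               # d/dy along rows
      omega_z = dv_dx - du_dy

      print("max |vorticity| (1/s):", np.abs(omega_z).max())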

  17. GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome.

    PubMed

    Simovski, Boris; Vodák, Daniel; Gundersen, Sveinung; Domanska, Diana; Azab, Abdulrahman; Holden, Lars; Holden, Marit; Grytten, Ivar; Rand, Knut; Drabløs, Finn; Johansen, Morten; Mora, Antonio; Lund-Andersen, Christin; Fromm, Bastian; Eskeland, Ragnhild; Gabrielsen, Odd Stokke; Ferkingstad, Egil; Nakken, Sigve; Bengtsen, Mads; Nederbragt, Alexander Johan; Thorarensen, Hildur Sif; Akse, Johannes Andreas; Glad, Ingrid; Hovig, Eivind; Sandve, Geir Kjetil

    2017-07-01

    Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation. We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been implemented in a comprehensive open-source software system, the GSuite HyperBrowser. To make the functionality accessible to biologists, and to facilitate reproducible analysis, we have also developed a web-based interface providing an expertly guided and customizable way of utilizing the methodology. With this system, many novel biological questions can flexibly be posed and rapidly answered. Through a combination of streamlined data acquisition, interoperable representation of dataset collections, and customizable statistical analysis with guided setup and interpretation, the GSuite HyperBrowser represents a first comprehensive solution for integrative analysis of track collections across the genome and epigenome. The software is available at: https://hyperbrowser.uio.no. © The Author 2017. Published by Oxford University Press.

  18. MNE software for processing MEG and EEG data

    PubMed Central

    Gramfort, A.; Luessi, M.; Larson, E.; Engemann, D.; Strohmeier, D.; Brodbeck, C.; Parkkonen, L.; Hämäläinen, M.

    2013-01-01

    Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals originating from neural currents in the brain. Using these signals to characterize and locate brain activity is a challenging task, as evidenced by several decades of methodological contributions. MNE, whose name stems from its capability to compute cortically-constrained minimum-norm current estimates from M/EEG data, is a software package that provides comprehensive analysis tools and workflows including preprocessing, source estimation, time–frequency analysis, statistical analysis, and several methods to estimate functional connectivity between distributed brain regions. The present paper gives detailed information about the MNE package and describes typical use cases while also warning about potential caveats in analysis. The MNE package is a collaborative effort of multiple institutes striving to implement and share best methods and to facilitate distribution of analysis pipelines to advance reproducibility of research. Full documentation is available at http://martinos.org/mne. PMID:24161808

  19. GHEP-ISFG collaborative exercise on mixture profiles (GHEP-MIX06). Reporting conclusions: Results and evaluation.

    PubMed

    Barrio, P A; Crespillo, M; Luque, J A; Aler, M; Baeza-Richer, C; Baldassarri, L; Carnevali, E; Coufalova, P; Flores, I; García, O; García, M A; González, R; Hernández, A; Inglés, V; Luque, G M; Mosquera-Miguel, A; Pedrosa, S; Pontes, M L; Porto, M J; Posada, Y; Ramella, M I; Ribeiro, T; Riego, E; Sala, A; Saragoni, V G; Serrano, A; Vannelli, S

    2018-07-01

    One of the main goals of the Spanish and Portuguese-Speaking Group of the International Society for Forensic Genetics (GHEP-ISFG) is to promote and contribute to the development and dissemination of scientific knowledge in the field of forensic genetics. To this end, GHEP-ISFG runs several working commissions that are set up to develop activities in scientific areas of general interest. One of them, the Mixture Commission of GHEP-ISFG, has organized annually, since 2009, a collaborative exercise on the analysis and interpretation of autosomal short tandem repeat (STR) mixture profiles. To date, six exercises have been organized. In the present edition (GHEP-MIX06), with 25 participating laboratories, the main aim of the exercise was to assess mixture-profile results reported from a proposed complex mock case. One of the conclusions drawn from this exercise is the increasing tendency of participating laboratories to validate DNA mixture profile analysis following international recommendations. However, the results have shown some differences among laboratories regarding the editing and also the interpretation of mixture profiles. Besides, although the last revision of ISO/IEC 17025:2017 gives indications of how results should be reported, not all laboratories strictly follow its recommendations. Regarding the statistical aspect, all laboratories that performed a statistical evaluation of the data employed the likelihood ratio (LR) as the parameter to evaluate statistical compatibility. However, the LR values obtained show a wide range of variation. This fact could not be attributed to the software employed, since the vast majority of laboratories that performed LR calculation employed the same software (LRmixStudio). Thus, the final allelic composition of the edited mixture profile and the parameters employed in the software could explain this data dispersion. This highlights the need for each laboratory to define, through internal validation, its criteria for editing and interpreting mixtures, and to train continuously in software handling. Copyright © 2018 Elsevier B.V. All rights reserved.

  20. The GenABEL Project for statistical genomics.

    PubMed

    Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from the formulation of methodological ideas to the application of the software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including the use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.

  1. Free and Open Source Software for land degradation vulnerability assessment

    NASA Astrophysics Data System (ADS)

    Imbrenda, Vito; Calamita, Giuseppe; Coluzzi, Rosa; D'Emilio, Mariagrazia; Lanfredi, Maria Teresa; Perrone, Angela; Ragosta, Maria; Simoniello, Tiziana

    2013-04-01

    Nowadays the role of FOSS software in scientific research is becoming increasingly important. Besides the important issues of reduced costs for licences, legality and security, there are many other reasons that make FOSS software attractive. Firstly, making the code open is a guarantee of quality, permitting thousands of developers around the world to check the code and fix bugs rather than relying on vendor claims. FOSS communities are usually enthusiastic about helping other users solve problems and about expanding or customizing software (flexibility). Most importantly for this study, interoperability allows the user-friendly QGIS to be combined with the powerful GRASS-GIS and the rich statistical methods of R in order to process remote sensing data and to perform geo-statistical analysis in a single environment. This study focuses on land degradation (i.e. the reduction in the capacity of the land to provide ecosystem goods and services and assure its functions) and in particular on the estimation of vulnerability levels, using the above-mentioned software, in order to suggest appropriate policy actions to reduce/halt land degradation impacts. The area investigated is the Basilicata Region (Southern Italy), where large natural areas are mixed with anthropized areas. To identify different levels of vulnerability we adopted the Environmentally Sensitive Areas (ESAs) model, based on the combination of indicators related to soil, climate, vegetation and anthropic stress. Such indicators were estimated using the following data sources: the Basilicata Region Geoportal to assess soil vulnerability; the DESERTNET2 project to evaluate potential vegetation vulnerability and climate vulnerability; NDVI-MODIS satellite time series (2000-2010) with 250 m resolution, available as 16-day composites from the NASA LP DAAC, to characterize the dynamic component of vegetation; and Agricultural Census data 2010, Corine Land Cover 2006 and morphological information to assess the vulnerability to anthropic factors mainly connected with agricultural and grazing management. To achieve the final ESAs Index, depicting the overall vulnerability to degradation of the investigated area, we applied the geometric mean to the cross-normalized indices related to each examined component. In this context QGIS was used to display data and to perform basic GIS calculations, whereas GRASS was used for map-algebra operations and image processing. Finally, R was used for statistical analysis (Principal Component Analysis) aimed at determining the relative importance of each adopted indicator. Our results show that the GRASS, QGIS and R software are suitable for mapping land degradation vulnerability and identifying highly vulnerable areas in which rehabilitation/recovery interventions are urgent. In addition, they allow us to highlight the most important drivers of degradation, thus supplying basic information for setting up intervention strategies. Ultimately, Free and Open Source Software offers a real opportunity for geoscientific investigations thanks to its high interoperability and flexibility, which preserve the accuracy of the data and reduce processing time. Moreover, the presence of several communities that steadily support users makes it possible to achieve high-quality results, making free and open source software a valuable and accessible alternative to conventional commercial software.
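
    The aggregation step described above (the geometric mean of normalized component indices) is simple to express in code. The sketch below combines four synthetic indicator maps into an overall vulnerability index pixel by pixel; the value range [1, 2] follows a convention commonly used for ESAs-type indices, and the arrays stand in for the GRASS/R rasters of the study.

      # Pixel-wise geometric mean of normalised indicator maps; data are synthetic.
      import numpy as np

      rng = np.random.default_rng(3)
      soil, climate, vegetation, management = [rng.uniform(1.0, 2.0, size=(100, 100))
                                               for _ in range(4)]

      stack = np.stack([soil, climate, vegetation, management])
      esai = np.exp(np.log(stack).mean(axis=0))      # geometric mean per pixel

      print("ESA index range: %.2f - %.2f" % (esai.min(), esai.max()))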

  2. GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices

    PubMed Central

    Munjal, Aarti; Sakhadeo, Uttara R.; Muller, Keith E.; Glueck, Deborah H.; Kreidler, Sarah M.

    2014-01-01

    Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688
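
    For a concrete sense of the calculation GLIMMPSE Lite performs for one-way ANOVA, the sketch below uses the statsmodels power module to solve for total sample size and for power; the effect size, alpha and power targets are arbitrary example values, and this is not the GLIMMPSE code base.

      # One-way ANOVA power / sample-size sketch with statsmodels.
      from statsmodels.stats.power import FTestAnovaPower

      analysis = FTestAnovaPower()
      # total sample size for a medium effect (Cohen's f = 0.25),
      # 3 groups, alpha = 0.05, target power = 0.80
      n_total = analysis.solve_power(effect_size=0.25, k_groups=3,
                                     alpha=0.05, power=0.80)
      print("total N required:", round(n_total))

      # conversely, the achieved power for a fixed total N of 90
      power = analysis.solve_power(effect_size=0.25, nobs=90, k_groups=3, alpha=0.05)
      print("power at N=90:", round(power, 3))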

  3. SEDA: A software package for the Statistical Earthquake Data Analysis

    NASA Astrophysics Data System (ADS)

    Lombardi, A. M.

    2017-03-01

    In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to easily interact with the application, and a computational core of Fortran codes, to guarantee maximum speed. The primary factor driving the development of SEDA is to guarantee research reproducibility, a growing movement among scientists that is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs. Less care has been taken over the graphical appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a consistent set of tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The specific features of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package.
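
    Since the package centres on ETAS modeling, a minimal evaluation of the standard ETAS conditional intensity, lambda(t | H_t) = mu + sum over past events of K * exp(alpha*(m_i - m0)) * (t - t_i + c)^(-p), is sketched below. The catalogue and parameter values are invented for demonstration, and the sketch is unrelated to SEDA's Fortran core.

      # Illustrative evaluation of the standard ETAS conditional intensity;
      # parameters and catalogue are hypothetical.
      import numpy as np

      mu, K, alpha, c, p, m0 = 0.2, 0.05, 1.0, 0.01, 1.1, 3.0
      event_times = np.array([1.0, 3.5, 3.6, 10.2])    # days
      event_mags = np.array([4.0, 5.2, 3.4, 4.1])

      def etas_intensity(t):
          past = event_times < t
          trig = (K * np.exp(alpha * (event_mags[past] - m0))
                  * (t - event_times[past] + c) ** (-p))
          return mu + trig.sum()

      for t in (2.0, 4.0, 11.0):
          print("lambda(%.1f) = %.4f events/day" % (t, etas_intensity(t)))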

  4. MuffinInfo: HTML5-Based Statistics Extractor from Next-Generation Sequencing Data.

    PubMed

    Alic, Andy S; Blanquer, Ignacio

    2016-09-01

    Usually, the information known a priori about a newly sequenced organism is limited. Even resequencing the same organism can generate unpredictable output. We introduce MuffinInfo, a FastQ/Fasta/SAM information extractor implemented in HTML5 and capable of offering insights into next-generation sequencing (NGS) data. Our new tool can run in any software or hardware environment, from the command line or graphically, and in the browser or standalone. It presents information such as average read length, base distribution, quality score distribution, k-mer histograms, and homopolymer analysis. MuffinInfo improves upon existing extractors by adding the ability to save and then reload the results obtained after a run as a navigable file (including saving pictures of the charts), by supporting custom statistics implemented by the user, and by offering user-adjustable parameters involved in the processing, all in one tool. At the moment, the extractor works with all base-space technologies such as Illumina, Roche, Ion Torrent, Pacific Biosciences, and Oxford Nanopore. Owing to HTML5, our software demonstrates the readiness of web technologies for the moderately intensive tasks encountered in bioinformatics.
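
    The basic statistics listed above (average read length, base composition) can be computed from a FastQ file in a few lines of plain Python, as in the sketch below; 'reads.fastq' is a placeholder path and this is not MuffinInfo's HTML5 implementation.

      # Tiny FastQ summary sketch: average read length and base composition.
      from collections import Counter

      lengths, bases = [], Counter()
      with open("reads.fastq") as fh:                  # placeholder input path
          for i, line in enumerate(fh):
              if i % 4 == 1:                           # sequence line of each 4-line record
                  seq = line.strip().upper()
                  lengths.append(len(seq))
                  bases.update(seq)

      total = sum(bases.values())
      print("reads:", len(lengths))
      print("average length:", sum(lengths) / max(len(lengths), 1))
      for b in "ACGTN":
          print(b, "%.2f%%" % (100.0 * bases[b] / total if total else 0.0))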

  5. SEDA: A software package for the Statistical Earthquake Data Analysis

    PubMed Central

    Lombardi, A. M.

    2017-01-01

    In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to easily interact with the application, and a computational core of Fortran codes, to guarantee maximum speed. The primary factor driving the development of SEDA is to guarantee research reproducibility, a growing movement among scientists that is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs. Less care has been taken over the graphical appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a consistent set of tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The specific features of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package. PMID:28290482

  6. Interactive Web Graphs with Fewer Restrictions

    NASA Technical Reports Server (NTRS)

    Fiedler, James

    2012-01-01

    There is growing popularity for interactive, statistical web graphs and programs to generate them. However, it seems that these programs tend to be somewhat restricted in which web browsers and statistical software are supported. For example, the software might use SVG (e.g., Protovis, gridSVG) or HTML canvas, both of which exclude most versions of Internet Explorer, or the software might be made specifically for R (gridSVG, CRanvas), thus excluding users of other stats software. There are more general tools (d3, RaphaëlJS) which are compatible with most browsers, but using one of these to make statistical graphs requires more coding than is probably desired, and requires learning a new tool. This talk will present a method for making interactive web graphs, which, by design, attempts to support as many browsers and as many statistical programs as possible, while also aiming to be relatively easy to use and relatively easy to extend.

  7. Two-Sample Statistics for Testing the Equality of Survival Functions Against Improper Semi-parametric Accelerated Failure Time Alternatives: An Application to the Analysis of a Breast Cancer Clinical Trial

    PubMed Central

    BROËT, PHILIPPE; TSODIKOV, ALEXANDER; DE RYCKE, YANN; MOREAU, THIERRY

    2010-01-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627

  8. Factors Influencing the Behavioural Intention to Use Statistical Software: The Perspective of the Slovenian Students of Social Sciences

    ERIC Educational Resources Information Center

    Brezavšcek, Alenka; Šparl, Petra; Žnidaršic, Anja

    2017-01-01

    The aim of the paper is to investigate the main factors influencing the adoption and continuous utilization of statistical software among university social sciences students in Slovenia. Based on the Technology Acceptance Model (TAM), a conceptual model was derived where five external variables were taken into account: statistical software…

  9. Analysis of Wood Structure Connections Using Cylindrical Steel and Carbon Fiber Dowel Pins

    NASA Astrophysics Data System (ADS)

    Vodiannikov, Mikhail A.; Kashevarova, Galina G., Dr.

    2017-06-01

    In this paper, the results of a statistical analysis of corrosion processes and moisture saturation of glued laminated timber structures and their joints in a corrosive environment are presented. The paper also includes calculation results for dowel connections of wood structures using steel and carbon-fiber-reinforced plastic cylindrical dowel pins, obtained in accordance with applicable regulatory documents by means of finite element analysis in ANSYS software, as well as experimental findings. Dependence diagrams are shown, and a comparative analysis of the results obtained is conducted.

  10. SCOPA and META-SCOPA: software for the analysis and aggregation of genome-wide association studies of multiple correlated phenotypes.

    PubMed

    Mägi, Reedik; Suleimanov, Yury V; Clarke, Geraldine M; Kaakinen, Marika; Fischer, Krista; Prokopenko, Inga; Morris, Andrew P

    2017-01-11

    Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite the fact that many diseases and quantitative traits are correlated with each other, and often measured in the same sample of individuals. Multivariate analyses of correlated phenotypes have been demonstrated, by simulation, to increase power to detect association with SNPs, and thus may enable improved detection of novel loci contributing to diseases and quantitative traits. We have developed the SCOPA software to enable GWAS analysis of multiple correlated phenotypes. The software implements "reverse regression" methodology, which treats the genotype of an individual at a SNP as the outcome and the phenotypes as predictors in a general linear model. SCOPA can be applied to quantitative traits and categorical phenotypes, and can accommodate imputed genotypes under a dosage model. The accompanying META-SCOPA software enables meta-analysis of association summary statistics from SCOPA across GWAS. Application of SCOPA to two GWAS of high- and low-density lipoprotein cholesterol, triglycerides and body mass index, and subsequent meta-analysis with META-SCOPA, highlighted stronger association signals than univariate phenotype analysis at established lipid and obesity loci. The META-SCOPA meta-analysis also revealed a novel signal of association at genome-wide significance for triglycerides mapping to GPC5 (lead SNP rs71427535, p = 1.1 × 10⁻⁸), which has not been reported in previous large-scale GWAS of lipid traits. The SCOPA and META-SCOPA software enable discovery and dissection of multiple phenotype association signals through implementation of a powerful reverse regression approach.
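
    A hedged sketch of the "reverse regression" idea is given below: regress the SNP genotype dosage on several correlated phenotypes jointly and use the model F-test as a joint association test. The data are simulated and the code is a plain statsmodels illustration, not the SCOPA implementation.

      # Reverse regression sketch: genotype dosage ~ phenotypes, on simulated data.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(11)
      n = 2000
      dosage = rng.binomial(2, 0.3, size=n).astype(float)      # SNP genotype (0/1/2)

      # correlated phenotypes, each weakly influenced by the genotype
      hdl = 1.4 - 0.05 * dosage + rng.normal(0, 0.3, n)
      tg = 1.2 + 0.07 * dosage + rng.normal(0, 0.4, n)
      bmi = 26.0 + 0.10 * dosage + rng.normal(0, 3.0, n)

      X = sm.add_constant(np.column_stack([hdl, tg, bmi]))
      fit = sm.OLS(dosage, X).fit()
      print(fit.summary())
      print("joint association p-value (model F-test):", fit.f_pvalue)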

  11. PSGMiner: A modular software for polysomnographic analysis.

    PubMed

    Umut, İlhan

    2016-06-01

    Sleep disorders affect a great percentage of the population. The diagnosis of these disorders is usually made by polysomnography. This paper details the development of new software to carry out feature extraction in order to perform robust analysis and classification of sleep events using polysomnographic data. The software, called PSGMiner, is a tool, which visualizes, processes and classifies bioelectrical data. The purpose of this program is to provide researchers with a platform with which to test new hypotheses by creating tests to check for correlations that are not available in commercially available software. The software is freely available under the GPL3 License. PSGMiner is composed of a number of diverse modules such as feature extraction, annotation, and machine learning modules, all of which are accessible from the main module. Using the software, it is possible to extract features of polysomnography using digital signal processing and statistical methods and to perform different analyses. The features can be classified through the use of five classification algorithms. PSGMiner offers an architecture designed for integrating new methods. Automatic scoring, which is available in almost all commercial PSG software, is not inherently available in this program, though it can be implemented by two different methodologies (machine learning and algorithms). While similar software focuses on a certain signal or event composed of a small number of modules with no expansion possibility, the software introduced here can handle all polysomnographic signals and events. The software simplifies the processing of polysomnographic signals for researchers and physicians that are not experts in computer programming. It can find correlations between different events which could help predict an oncoming event such as sleep apnea. The software could also be used for educational purposes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. An improved approach for flight readiness certification: Probabilistic models for flaw propagation and turbine blade failure. Volume 2: Software documentation

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with analytical modeling of failure phenomena to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in analytical modeling, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which analytical models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. State-of-the-art analytical models currently employed for design, failure prediction, or performance analysis are used in this methodology. The rationale for the statistical approach taken in the PFA methodology is discussed, the PFA methodology is described, and examples of its application to structural failure modes are presented. The engineering models and computer software used in fatigue crack growth and fatigue crack initiation applications are thoroughly documented.

  13. Identifying currents in the gene pool for bacterial populations using an integrative approach.

    PubMed

    Tang, Jing; Hanage, William P; Fraser, Christophe; Corander, Jukka

    2009-08-01

    The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis. The statistical tools introduced here are freely available in the BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.

  14. Age estimation by pulp-to-tooth area ratio using cone-beam computed tomography: A preliminary analysis

    PubMed Central

    Rai, Arpita; Acharya, Ashith B.; Naikmasur, Venkatesh G.

    2016-01-01

    Background: Age estimation of living or deceased individuals is an important aspect of forensic sciences. Conventionally, pulp-to-tooth area ratio (PTR) measured from periapical radiographs have been utilized as a nondestructive method of age estimation. Cone-beam computed tomography (CBCT) is a new method to acquire three-dimensional images of the teeth in living individuals. Aims: The present study investigated age estimation based on PTR of the maxillary canines measured in three planes obtained from CBCT image data. Settings and Design: Sixty subjects aged 20–85 years were included in the study. Materials and Methods: For each tooth, mid-sagittal, mid-coronal, and three axial sections—cementoenamel junction (CEJ), one-fourth root level from CEJ, and mid-root—were assessed. PTR was calculated using AutoCAD software after outlining the pulp and tooth. Statistical Analysis Used: All statistical analyses were performed using an SPSS 17.0 software program. Results and Conclusions: Linear regression analysis showed that only PTR in axial plane at CEJ had significant age correlation (r = 0.32; P < 0.05). This is probably because of clearer demarcation of pulp and tooth outline at this level. PMID:28123269
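
    The regression step reported above reduces to fitting age against PTR and inverting the fit for prediction. The sketch below does this with scipy.stats.linregress on simulated values; the PTR-age relationship and the example PTR of 0.09 are hypothetical, not the study's data.

      # Linear regression of age on pulp-to-tooth area ratio (PTR); simulated data.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      age = rng.uniform(20, 85, size=60)
      ptr = 0.12 - 0.0006 * age + rng.normal(0, 0.01, size=60)   # hypothetical weak trend

      res = stats.linregress(ptr, age)          # regress age on PTR
      print("r = %.2f, p = %.3f" % (res.rvalue, res.pvalue))
      print("estimated age at PTR=0.09: %.1f years" % (res.intercept + res.slope * 0.09))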

  15. Implementation errors in the GingerALE Software: Description and recommendations.

    PubMed

    Eickhoff, Simon B; Laird, Angela R; Fox, P Mickle; Lancaster, Jack L; Fox, Peter T

    2017-01-01

    Neuroscience imaging is a burgeoning, highly sophisticated field the growth of which has been fostered by grant-funded, freely distributed software libraries that perform voxel-wise analyses in anatomically standardized three-dimensional space on multi-subject, whole-brain, primary datasets. Despite the ongoing advances made using these non-commercial computational tools, the replicability of individual studies is an acknowledged limitation. Coordinate-based meta-analysis offers a practical solution to this limitation and, consequently, plays an important role in filtering and consolidating the enormous corpus of functional and structural neuroimaging results reported in the peer-reviewed literature. In both primary data and meta-analytic neuroimaging analyses, correction for multiple comparisons is a complex but critical step for ensuring statistical rigor. Reports of errors in multiple-comparison corrections in primary-data analyses have recently appeared. Here, we report two such errors in GingerALE, a widely used, US National Institutes of Health (NIH)-funded, freely distributed software package for coordinate-based meta-analysis. These errors have given rise to published reports with more liberal statistical inferences than were specified by the authors. The intent of this technical report is threefold. First, we inform authors who used GingerALE of these errors so that they can take appropriate actions including re-analyses and corrective publications. Second, we seek to exemplify and promote an open approach to error management. Third, we discuss the implications of these and similar errors in a scientific environment dependent on third-party software. Hum Brain Mapp 38:7-11, 2017. © 2016 Wiley Periodicals, Inc.

  16. Flexible Adaptive Paradigms for fMRI Using a Novel Software Package ‘Brain Analysis in Real-Time’ (BART)

    PubMed Central

    Hellrung, Lydia; Hollmann, Maurice; Zscheyge, Oliver; Schlumm, Torsten; Kalberlah, Christian; Roggenhofer, Elisabeth; Okon-Singer, Hadas; Villringer, Arno; Horstmann, Annette

    2015-01-01

    In this work we present a new open source software package offering a unified framework for the real-time adaptation of fMRI stimulation procedures. The software provides a straightforward setup and a highly flexible approach to adapt fMRI paradigms while the experiment is running. The general framework comprises the inclusion of parameters reflecting the subject’s compliance, such as directing gaze to visually presented stimuli, and physiological fluctuations, like blood pressure or pulse. Additionally, this approach yields possibilities to investigate complex scientific questions, for example the influence of EEG rhythms or of the fMRI signals themselves. To prove the concept of this approach, we used our software in a usability example for an fMRI experiment where the presentation of emotional pictures was dependent on the subject’s gaze position. This can have a significant impact on the results. So far, if this is taken into account during fMRI data analysis, it is commonly done by the post-hoc removal of erroneous trials. Here, we propose an a priori adaptation of the paradigm during the experiment’s runtime. Our fMRI findings clearly show the benefits of an adapted paradigm in terms of statistical power and higher effect sizes in emotion-related brain regions. This can be of special interest for all experiments with low statistical power due to a limited number of subjects, a limited amount of time, costs or available data to analyze, as is the case with real-time fMRI. PMID:25837719

  17. Statistical assessment of crosstalk enrichment between gene groups in biological networks.

    PubMed

    McCormack, Theodore; Frings, Oliver; Alexeyenko, Andrey; Sonnhammer, Erik L L

    2013-01-01

    Analyzing groups of functionally coupled genes or proteins in the context of global interaction networks has become an important aspect of bioinformatic investigations. Assessing the statistical significance of crosstalk enrichment between or within groups of genes can be a valuable tool for functional annotation of experimental gene sets. Here we present CrossTalkZ, a statistical method and software to assess the significance of crosstalk enrichment between pairs of gene or protein groups in large biological networks. We demonstrate that the standard z-score is generally an appropriate and unbiased statistic. We further evaluate the ability of four different methods to reliably recover crosstalk within known biological pathways. We conclude that the methods preserving the second-order topological network properties perform best. Finally, we show how CrossTalkZ can be used to annotate experimental gene sets using known pathway annotations and that its performance at this task is superior to gene enrichment analysis (GEA). CrossTalkZ (available at http://sonnhammer.sbc.su.se/download/software/CrossTalkZ/) is implemented in C++, easy to use, fast, accepts various input file formats, and produces a number of statistics. These include z-score, p-value, false discovery rate, and a test of normality for the null distributions.
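    For illustration, the crosstalk z-score idea can be sketched with a simple label-permutation null model. This is a simplification, not CrossTalkZ itself: the paper concludes that permutation schemes preserving second-order network topology perform best, whereas the toy network, group names and helper functions below are all hypothetical.

```python
import random
from statistics import mean, stdev

def crosstalk_links(edges, group_a, group_b):
    """Count edges with one endpoint in group_a and the other in group_b."""
    a, b = set(group_a), set(group_b)
    return sum(1 for u, v in edges
               if (u in a and v in b) or (u in b and v in a))

def crosstalk_z(edges, nodes, group_a, group_b, n_perm=1000, seed=0):
    """Z-score of observed crosstalk against a node-label permutation null."""
    rng = random.Random(seed)
    observed = crosstalk_links(edges, group_a, group_b)
    null = []
    for _ in range(n_perm):
        shuffled = list(nodes)
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        null.append(crosstalk_links(edges,
                                    [relabel[g] for g in group_a],
                                    [relabel[g] for g in group_b]))
    return (observed - mean(null)) / stdev(null)

# Example: a small hypothetical network and two gene groups.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("A", "E"), ("B", "E")]
nodes = ["A", "B", "C", "D", "E"]
print(crosstalk_z(edges, nodes, ["A", "B"], ["D", "E"]))
```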

  18. Environmental statistics with S-Plus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Millard, S.P.; Neerchal, N.K.

    1999-12-01

    The combination of easy-to-use software with easy access to a description of the statistical methods (definitions, concepts, etc.) makes this book an excellent resource. One of the major features of this book is the inclusion of general information on environmental statistical methods and examples of how to implement these methods using the statistical software package S-Plus and the add-in modules Environmental-Stats for S-Plus, S+SpatialStats, and S-Plus for ArcView.

  19. Enhancing interdisciplinary mathematics and biology education: a microarray data analysis course bridging these disciplines.

    PubMed

    Tra, Yolande V; Evans, Irene M

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course.

  20. Enhancing Interdisciplinary Mathematics and Biology Education: A Microarray Data Analysis Course Bridging These Disciplines

    PubMed Central

    Evans, Irene M.

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954

  1. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
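    The core idea, a voxel-wise general linear model in which values from a second, co-registered modality enter as regressors, can be sketched as follows. This is not the BPM/SPM implementation; the arrays, sample size and covariate are hypothetical, and no random-field inference is attempted.

```python
import numpy as np

# Hypothetical data: n subjects, v voxels for two co-registered modalities
# (e.g., fMRI contrast maps and a structural measure), plus one covariate.
rng = np.random.default_rng(0)
n_subjects, n_voxels = 20, 1000
fmri = rng.normal(size=(n_subjects, n_voxels))
structural = rng.normal(size=(n_subjects, n_voxels))
age = rng.uniform(20, 60, size=n_subjects)

betas = np.empty(n_voxels)
tvals = np.empty(n_voxels)
for v in range(n_voxels):
    # Voxel-wise GLM: regress the primary modality on the second modality
    # value at the same voxel plus the covariate.
    X = np.column_stack([np.ones(n_subjects), structural[:, v], age])
    beta, res, *_ = np.linalg.lstsq(X, fmri[:, v], rcond=None)
    dof = n_subjects - X.shape[1]
    sigma2 = res[0] / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    betas[v] = beta[1]
    tvals[v] = beta[1] / np.sqrt(cov[1, 1])

print("largest |t| across voxels:", np.abs(tvals).max())
```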

  2. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data

    PubMed Central

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar

    2012-01-01

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762

  3. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data.

    PubMed

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J; Yanes, Oscar

    2012-10-18

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.
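    The workflow described above, one univariate test per feature followed by a correction for multiple testing, can be sketched as follows. The intensity matrices are simulated, and Welch's t-test on log intensities plus the Benjamini-Hochberg correction are used as one representative choice among the options the paper discusses, not as its prescribed pipeline.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical intensity matrix: 12 samples per group, 500 LC/MS features
# (the paper's four real datasets are not reproduced here).
group_a = rng.lognormal(1.0, 0.4, size=(12, 500))
group_b = rng.lognormal(1.1, 0.4, size=(12, 500))

# One univariate test per feature, applied in parallel (a rank-based test
# could be substituted if normality or homoscedasticity is doubtful).
tvals, pvals = stats.ttest_ind(np.log(group_a), np.log(group_b),
                               axis=0, equal_var=False)

# Benjamini-Hochberg correction for multiple testing.
m = len(pvals)
order = np.argsort(pvals)
adjusted = np.minimum.accumulate(
    (pvals[order] * m / np.arange(1, m + 1))[::-1])[::-1]
qvals = np.empty(m)
qvals[order] = np.clip(adjusted, 0, 1)

print("features passing q < 0.05:", int((qvals < 0.05).sum()))
```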

  4. Tc-99m Labeled and VIP-Receptor Targeted Liposomes for Effective Imaging of Breast Cancer

    DTIC Science & Technology

    2006-09-01

    computer. The images (100,000 counts/image) were acquired and stored in a 512X512 matrix. Image Analysis: The Odyssey software program was used to...as well as between normal and tumor- were calculated. Statistical analysis was performed bearing rats for each of the formulations using...large signal-to-noise ratio, thereby rendering data analysis impractical. Moreover, -helicity of VIP associated with SSM is potentiated in the presence

  5. Assessment of ICount software, a precise and fast egg counting tool for the mosquito vector Aedes aegypti.

    PubMed

    Gaburro, Julie; Duchemin, Jean-Bernard; Paradkar, Prasad N; Nahavandi, Saeid; Bhatti, Asim

    2016-11-18

    Widespread in the tropics, the mosquito Aedes aegypti is an important vector of many viruses, posing a significant threat to human health. Vector monitoring often requires fecundity estimation by counting eggs laid by female mosquitoes. Traditionally, manual counting has been used, but this requires considerable effort and the method is prone to errors. An easy tool to assess the number of eggs laid would facilitate experimentation and vector control operations. This study introduces a software tool called ICount that allows automatic egg counting for the mosquito vector Aedes aegypti. ICount egg estimates are statistically equivalent to manual counts, making the software effective for automatic and semi-automatic data analysis. This technique also allows rapid analysis compared to manual methods. Finally, the software has been used to assess p-cresol oviposition choices under laboratory conditions in order to test the system with different egg densities. ICount is a powerful tool for fast and precise egg count analysis, freeing experimenters from manual data processing. Software access is free and its user-friendly interface allows easy use by non-experts. Its efficiency has been tested in our laboratory with oviposition dual choices of Aedes aegypti females. The next step will be the development of a mobile application, based on the ICount platform, for vector monitoring surveys in the field.

  6. Preliminary clinical evaluation of automated analysis of the sublingual microcirculation in the assessment of patients with septic shock: Comparison of automated versus semi-automated software.

    PubMed

    Sharawy, Nivin; Mukhtar, Ahmed; Islam, Sufia; Mahrous, Reham; Mohamed, Hassan; Ali, Mohamed; Hakeem, Amr A; Hossny, Osama; Refaa, Amera; Saka, Ahmed; Cerny, Vladimir; Whynot, Sara; George, Ronald B; Lehmann, Christian

    2017-01-01

    The outcome of patients in septic shock has been shown to be related to changes within the microcirculation. Modern imaging technologies are available to generate high resolution video recordings of the microcirculation in humans. However, evaluation of the microcirculation is not yet implemented in the routine clinical monitoring of critically ill patients. This is mainly due to the large amount of time and user interaction required by the current video analysis software. The aim of this study was to validate a newly developed automated method (CCTools®) for microcirculatory analysis of sublingual capillary perfusion in septic patients in comparison to standard semi-automated software (AVA3®). 204 videos from 47 patients were recorded using incident dark field (IDF) imaging. Total vessel density (TVD), proportion of perfused vessels (PPV), perfused vessel density (PVD), microvascular flow index (MFI) and heterogeneity index (HI) were measured using AVA3® and CCTools®. Significant differences between the numeric results obtained by the two different software packages were observed. The values for TVD, PVD and MFI were statistically related though. The automated software technique succeeds in showing septic shock-induced microcirculation alterations in near real time. However, we found wide limits of agreement between AVA3® and CCTools® values due to several technical factors that should be considered in future studies.

  7. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.

    PubMed

    Glaab, Enrico; Schneider, Reinhard

    2015-07-01

    High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web-service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Freely available at http://www.repexplore.tk. Contact: enrico.glaab@uni.lu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  8. Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains.

    PubMed

    Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu

    2015-09-21

    Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from studies based on next-generation sequencing technology. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps), in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations, making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Plymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.

  9. Processing of on-board recorded data for quick analysis of aircraft performance. [rotor systems research aircraft

    NASA Technical Reports Server (NTRS)

    Michaud, N. H.

    1979-01-01

    A system of independent computer programs for the processing of digitized pulse code modulated (PCM) and frequency modulated (FM) data is described. Information is stored in a set of random files and accessed to produce both statistical and graphical output. The software system is designed primarily to present these reports within a twenty-four hour period for quick analysis of the helicopter's performance.

  10. DCL System Research Using Advanced Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals

    DTIC Science & Technology

    2012-09-30

    recognition. Algorithm design and statistical analysis and feature analysis. Post-Doctoral Associate, Cornell University, Bioacoustics Research...short. The HPC-ADA was designed based on fielded systems [1-4, 6] that offer a variety of desirable attributes, specifically dynamic resource...The software package was designed to utilize parallel and distributed processing for running recognition and other advanced algorithms. DeLMA

  11. Statistical Software Engineering

    DTIC Science & Technology

    1998-04-13

    multiversion software subject to coincident errors. IEEE Trans. Software Eng. SE-11:1511-1517. Eckhardt, D.E., A.K Caglayan, J.C. Knight, L.D. Lee, D.F...J.C. and N.G. Leveson. 1986. Experimental evaluation of the assumption of independence in multiversion software. IEEE Trans. Software

  12. Exponential order statistic models of software reliability growth

    NASA Technical Reports Server (NTRS)

    Miller, D. R.

    1985-01-01

    Failure times of a software reliability growth process are modeled as order statistics of independent, nonidentically distributed exponential random variables. The Jelinski-Moranda, Goel-Okumoto, Littlewood, Musa-Okumoto Logarithmic, and Power Law models are all special cases of Exponential Order Statistic Models, but there are many additional examples also. Various characterizations, properties and examples of this class of models are developed and presented.
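    As a rough illustration of this model class, failure times can be simulated as the order statistics of independent exponential variables; the Jelinski-Moranda special case corresponds to equal per-fault rates. The fault count and rate below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)

def eos_failure_times(rates, rng):
    """Failure times modeled as order statistics of independent,
    non-identically distributed exponential random variables."""
    raw = rng.exponential(1.0 / np.asarray(rates))
    return np.sort(raw)

# Jelinski-Moranda-style special case: N faults, each detected at the same
# per-fault rate phi.
N, phi = 50, 0.02
times = eos_failure_times(np.full(N, phi), rng)
interfail = np.diff(np.concatenate(([0.0], times)))
print("mean time between the first ten failures:", interfail[:10].mean())
```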

  13. Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM

    NASA Astrophysics Data System (ADS)

    Köylü, Ü.; Geymen, A.

    2016-10-01

    Dams block the flow of rivers and create artificial water reservoirs which affect the climate and the land use characteristics of the river basin. In this research, the effect of the large water body created by Yamula Dam in the Kızılırmak Basin on the surrounding area's land use and climate is analysed. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW) and the Soil Conservation Service-Curve Number (SCS-CN) method are integrated for the spatial and temporal analysis of the research area. For this research, humidity, temperature, wind speed and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analyzed. This statistical information is then combined with GIS data over the years. An application was developed for the GIS analysis in the Python programming language and integrated with ArcGIS software. The statistical analyses were calculated in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed in climate and land use characteristics. In this study, we show the effect of large dams on the local climate in the semi-arid area around Yamula Dam.
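    The trend-detection step combines the Mann-Kendall test with the Theil-Sen slope estimator. A minimal sketch of that combination on a hypothetical station series (not the study's observations), using the normal approximation without a tie correction for Mann-Kendall:

```python
import numpy as np
from scipy import stats

def mann_kendall(y):
    """Mann-Kendall trend test (normal approximation, no tie correction)."""
    y = np.asarray(y)
    n = len(y)
    s = sum(np.sign(y[j] - y[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    p = 2 * (1 - stats.norm.cdf(abs(z)))
    return s, z, p

# Hypothetical annual mean temperatures for one station.
years = np.arange(1990, 2016)
temp = (11.0 + 0.03 * (years - 1990)
        + np.random.default_rng(3).normal(0, 0.3, len(years)))

s, z, p = mann_kendall(temp)
slope, intercept, lo, hi = stats.theilslopes(temp, years)
print(f"MK S={int(s)}, z={z:.2f}, p={p:.3f}; Theil-Sen slope={slope:.3f} degC/yr")
```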

  14. Creation of a virtual cutaneous tissue bank

    NASA Astrophysics Data System (ADS)

    LaFramboise, William A.; Shah, Sujal; Hoy, R. W.; Letbetter, D.; Petrosko, P.; Vennare, R.; Johnson, Peter C.

    2000-04-01

    Cellular and non-cellular constituents of skin contain fundamental morphometric features and structural patterns that correlate with tissue function. High resolution digital image acquisition is performed using an automated system and proprietary software to assemble adjacent images and create a contiguous, lossless, digital representation of individual microscope slide specimens. Serial extraction, evaluation and statistical analysis of cutaneous features is performed utilizing an automated analysis system to derive normal cutaneous parameters comprising essential structural skin components. Automated digital cutaneous analysis allows for fast extraction of microanatomic data with accuracy approximating manual measurement. The process provides rapid assessment of features both within individual specimens and across sample populations. The images, component data, and statistical analysis comprise a bioinformatics database to serve as an architectural blueprint for skin tissue engineering and as a diagnostic standard of comparison for pathologic specimens.

  15. New Tools for "New" History: Computers and the Teaching of Quantitative Historical Methods.

    ERIC Educational Resources Information Center

    Burton, Orville Vernon; Finnegan, Terence

    1989-01-01

    Explains the development of an instructional software package and accompanying workbook which teaches students to apply computerized statistical analysis to historical data, improving the study of social history. Concludes that the use of microcomputers and supercomputers to manipulate historical data enhances critical thinking skills and the use…

  16. Structure of Student Time Management Scale (STMS)

    ERIC Educational Resources Information Center

    Balamurugan, M.

    2013-01-01

    With the aim of constructing a Student Time Management Scale (STMS), the initial version was administered and data were collected from 523 standard eleventh students. (Mean age = 15.64). The data obtained were subjected to Reliability and Factor analysis using PASW Statistical software version 18. From 42 items 14 were dropped, resulting in the…

  17. A comparative approach to computer aided design model of a dog femur.

    PubMed

    Turamanlar, O; Verim, O; Karabulut, A

    2016-01-01

    Computer assisted technologies offer new opportunities in medical imaging and rapid prototyping in biomechanical engineering. Three dimensional (3D) modelling of soft tissues and bones is becoming more important. The accuracy of the analysis in modelling processes depends on the outline of the tissues derived from medical images. The aim of this study is the evaluation of the accuracy of 3D models of a dog femur derived from computed tomography data by using the point cloud method and the boundary line method in several modelling software packages. Solidworks, Rapidform and 3DSMax software were used to create 3D models and the outcomes were evaluated statistically. The most accurate 3D prototype of the dog femur was created with the stereolithography method using a rapid prototyping device. Furthermore, the linearity of the model volumes was investigated across the software packages and the constructed models. The differences between the software-derived and real models reflect the sensitivity of the software and the devices used.

  18. M.S.L.A.P. Modular Spectral Line Analysis Program documentation

    NASA Technical Reports Server (NTRS)

    Joseph, Charles L.; Jenkins, Edward B.

    1991-01-01

    MSLAP is software for analyzing spectra, providing the basic structure to identify spectral features, to make quantitative measurements of these features, and to store the measurements for convenient access. MSLAP can be used to measure not only the zeroth moment (equivalent width) of a profile, but also the first and second moments. Optical depths and the corresponding column densities across the profile can be measured as well for sufficiently high resolution data. The software was developed for an interactive, graphical analysis where the computer carries most of the computational and data organizational burden and the investigator is responsible only for judgement decisions. It employs sophisticated statistical techniques for determining the best polynomial fit to the continuum and for calculating the uncertainties.

  19. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.

    PubMed

    Davis, Sean; Meltzer, Paul S

    2007-07-15

    Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.

  20. D-Light on promoters: a client-server system for the analysis and visualization of cis-regulatory elements

    PubMed Central

    2013-01-01

    Background The binding of transcription factors to DNA plays an essential role in the regulation of gene expression. Numerous experiments elucidated binding sequences which subsequently have been used to derive statistical models for predicting potential transcription factor binding sites (TFBS). The rapidly increasing number of genome sequence data requires sophisticated computational approaches to manage and query experimental and predicted TFBS data in the context of other epigenetic factors and across different organisms. Results We have developed D-Light, a novel client-server software package to store and query large amounts of TFBS data for any number of genomes. Users can add small-scale data to the server database and query them in a large scale, genome-wide promoter context. The client is implemented in Java and provides simple graphical user interfaces and data visualization. Here we also performed a statistical analysis showing what a user can expect for certain parameter settings and we illustrate the usage of D-Light with the help of a microarray data set. Conclusions D-Light is an easy to use software tool to integrate, store and query annotation data for promoters. A public D-Light server, the client and server software for local installation and the source code under GNU GPL license are available at http://biwww.che.sbg.ac.at/dlight. PMID:23617301

  1. Bayesian Hierarchical Random Effects Models in Forensic Science.

    PubMed

    Aitken, Colin G G

    2018-01-01

    Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package, and references to SAILR are made as appropriate.

  2. Semi-automatic tracking, smoothing and segmentation of hyoid bone motion from videofluoroscopic swallowing study.

    PubMed

    Kim, Won-Seok; Zeng, Pengcheng; Shi, Jian Qing; Lee, Youngjo; Paik, Nam-Jong

    2017-01-01

    Motion analysis of the hyoid bone via videofluoroscopic study has been used in clinical research, but the classical manual tracking method is generally labor intensive and time consuming. Although some automatic tracking methods have been developed, masked points could not be tracked and smoothing and segmentation, which are necessary for functional motion analysis prior to registration, were not provided by the previous software. We developed software to track the hyoid bone motion semi-automatically. It works even in the situation where the hyoid bone is masked by the mandible and has been validated in dysphagia patients with stroke. In addition, we added the function of semi-automatic smoothing and segmentation. A total of 30 patients' data were used to develop the software, and data collected from 17 patients were used for validation, of which the trajectories of 8 patients were partly masked. Pearson correlation coefficients between the manual and automatic tracking are high and statistically significant (0.942 to 0.991, P-value<0.0001). Relative errors between automatic tracking and manual tracking in terms of the x-axis, y-axis and 2D range of hyoid bone excursion range from 3.3% to 9.2%. We also developed an automatic method to segment each hyoid bone trajectory into four phases (elevation phase, anterior movement phase, descending phase and returning phase). The semi-automatic hyoid bone tracking from VFSS data by our software is valid compared to the conventional manual tracking method. In addition, the ability of automatic indication to switch the automatic mode to manual mode in extreme cases and calibration without attaching the radiopaque object is convenient and useful for users. Semi-automatic smoothing and segmentation provide further information for functional motion analysis which is beneficial to further statistical analysis such as functional classification and prognostication for dysphagia. Therefore, this software could provide the researchers in the field of dysphagia with a convenient, useful, and all-in-one platform for analyzing the hyoid bone motion. Further development of our method to track the other swallowing related structures or objects such as epiglottis and bolus and to carry out the 2D curve registration may be needed for a more comprehensive functional data analysis for dysphagia with big data.

  3. Language workbench user interfaces for data analysis

    PubMed Central

    Benson, Victoria M.

    2015-01-01

    Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savvy individuals, such as investigators trained in bioinformatics, it also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Workflow systems such as Galaxy and Taverna have been developed to try to provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with workflows, and that do not require specialized user interfaces. However, some types of analyses can benefit from custom user interfaces. For instance, developing biomarker models from high-throughput data is a type of analysis that can be expressed more succinctly with specialized user interfaces. Here, we show how Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that analysts who develop biomarker models are familiar with. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide convenient means to configure a biomarker development project, to train models and view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with a LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/). PMID:25755929

  4. DNA Damage Analysis in Children with Non-syndromic Developmental Delay by Comet Assay.

    PubMed

    Susai, Surraj; Chand, Parkash; Ballambattu, Vishnu Bhat; Hanumanthappa, Nandeesha; Veeramani, Raveendranath

    2016-05-01

    The majority of developmental delays in children are non-syndromic and they are believed to have an underlying DNA damage, though this is not well substantiated. Hence the present study was carried out to find out if there is any increased DNA damage in children with non-syndromic developmental delay by using the comet assay. The present case-control study was undertaken to assess the level of DNA damage in children with non-syndromic developmental delay and compare it with that of age- and sex-matched controls using submarine gel electrophoresis (comet assay). The blood from clinically diagnosed children with non-syndromic developmental delay and controls was subjected to the alkaline version of the comet assay - single cell gel electrophoresis using lymphocytes isolated from the peripheral blood. The comets were observed under a bright field microscope, photocaptured and scored using the ImageJ image quantification software. Comet parameters were compared between the cases and controls, and statistical analysis and interpretation of results was done using the statistical software SPSS version 20. The mean comet tail length in cases and controls was 20.77±7.659 μm and 8.97±4.398 μm respectively, which was statistically significant (p<0.001). Other comet parameters like total comet length and % DNA in tail also showed a statistically significant difference (p < 0.001) between cases and controls. The current investigation revealed increased levels of DNA damage in children with non-syndromic developmental delay when compared to the controls.
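    The abstract does not state which SPSS procedure was used; one plausible reading of the case-control comparison of mean tail lengths is a two-sample test, sketched below on hypothetical tail-length values rather than the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical comet tail lengths (micrometres) for cases and controls.
cases = rng.normal(20.8, 7.7, 40)
controls = rng.normal(9.0, 4.4, 40)

# Welch's two-sample t-test on the tail lengths.
t, p = stats.ttest_ind(cases, controls, equal_var=False)
print(f"mean difference = {cases.mean() - controls.mean():.1f} um, p = {p:.2g}")
```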

  5. The GenABEL Project for statistical genomics

    PubMed Central

    Karssen, Lennart C.; van Duijn, Cornelia M.; Aulchenko, Yurii S.

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination. PMID:27347381

  6. Visual Data Analysis for Satellites

    NASA Technical Reports Server (NTRS)

    Lau, Yee; Bhate, Sachin; Fitzpatrick, Patrick

    2008-01-01

    The Visual Data Analysis Package is a collection of programs and scripts that facilitate visual analysis of data available from NASA and NOAA satellites, as well as dropsonde, buoy, and conventional in-situ observations. The package features utilities for data extraction, data quality control, statistical analysis, and data visualization. The Hierarchical Data Format (HDF) satellite data extraction routines from NASA's Jet Propulsion Laboratory were customized for specific spatial coverage and file input/output. Statistical analysis includes the calculation of the relative error, the absolute error, and the root mean square error. Other capabilities include curve fitting through the data points to fill in missing data points between satellite passes or where clouds obscure satellite data. For data visualization, the software provides customizable Generic Mapping Tools (GMT) scripts to generate difference maps, scatter plots, line plots, vector plots, histograms, time series, and color-fill images.
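    The three error statistics named above are straightforward to compute; a minimal sketch with hypothetical co-located satellite retrievals and buoy observations (not package output):

```python
import numpy as np

# Hypothetical co-located satellite retrievals and buoy observations.
satellite = np.array([28.1, 27.6, 29.0, 26.8, 27.9])
buoy      = np.array([27.8, 27.9, 28.6, 27.0, 28.3])

abs_err = np.abs(satellite - buoy)                 # absolute error
rel_err = abs_err / np.abs(buoy)                   # relative error
rmse = np.sqrt(np.mean((satellite - buoy) ** 2))   # root mean square error

print("mean absolute error:", abs_err.mean())
print("mean relative error:", rel_err.mean())
print("RMSE:", rmse)
```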

  7. The changing landscape of astrostatistics and astroinformatics

    NASA Astrophysics Data System (ADS)

    Feigelson, Eric D.

    2017-06-01

    The history and current status of the cross-disciplinary fields of astrostatistics and astroinformatics are reviewed. Astronomers need a wide range of statistical methods for both data reduction and science analysis. With the proliferation of high-throughput telescopes, efficient large scale computational methods are also becoming essential. However, astronomers receive only weak training in these fields during their formal education. Interest in the fields is rapidly growing with conferences organized by scholarly societies, textbooks and tutorial workshops, and research studies pushing the frontiers of methodology. R, the premier language of statistical computing, can provide an important software environment for the incorporation of advanced statistical and computational methodology into the astronomical community.

  8. Rule-based statistical data mining agents for an e-commerce application

    NASA Astrophysics Data System (ADS)

    Qin, Yi; Zhang, Yan-Qing; King, K. N.; Sunderraman, Rajshekhar

    2003-03-01

    Intelligent data mining techniques have useful e-Business applications. Because an e-Commerce application is related to multiple domains such as statistical analysis, market competition, price comparison, profit improvement and personal preferences, this paper presents a hybrid knowledge-based e-Commerce system fusing intelligent techniques, statistical data mining, and personal information to enhance QoS (Quality of Service) of e-Commerce. A Web-based e-Commerce application software system, eDVD Web Shopping Center, is successfully implemented using Java servlets and an Oracle8i database server. Simulation results have shown that the hybrid intelligent e-Commerce system is able to make smart decisions for different customers.

  9. Estimation of Apple Intake for the Exposure Assessment of Residual Chemicals Using Korea National Health and Nutrition Examination Survey Database

    PubMed Central

    2016-01-01

    The aims of this study were to develop strategies and algorithms for calculating food commodity intake suitable for the exposure assessment of residual chemicals by using the food intake database of the Korea National Health and Nutrition Examination Survey (KNHANES). In this study, apples and their processed food products were chosen as a model food for accurate calculation of food commodity intakes through the recently developed Korea food commodity intake calculation (KFCIC) software. The average daily intakes of total apples in Korea Health Statistics were 29.60 g in 2008, 32.40 g in 2009, 34.30 g in 2010, 28.10 g in 2011, and 24.60 g in 2012. The average daily intake of apples estimated by the KFCIC software was 2.65 g higher than that reported in Korea Health Statistics. The food intake data in Korea Health Statistics may have reflected the intake of apples from mixed and processed foods less completely than the KFCIC software does. These results can affect the outcome of risk assessment for residual chemicals in foods. Therefore, the accurate estimation of the average daily intake of food commodities is very important, and more data on food intakes and recipes have to be applied to improve the quality of data. Nevertheless, this study can contribute to the predictive estimation of exposure to possible residual chemicals and subsequent analysis of their potential risks. PMID:27152299

  10. Cleanroom certification model

    NASA Technical Reports Server (NTRS)

    Currit, P. A.

    1983-01-01

    The Cleanroom software development methodology is designed to take the gamble out of product releases for both suppliers and receivers of the software. The ingredients of this procedure are a life cycle of executable product increments, representative statistical testing, and a standard estimate of the MTTF (Mean Time To Failure) of the product at the time of its release. A statistical approach to software product testing using randomly selected samples of test cases is considered. A statistical model is defined for the certification process which uses the timing data recorded during test. A reasonableness argument for this model is provided that uses previously published data on software product execution. Also included is a derivation of the certification model estimators and a comparison of the proposed least squares technique with the more commonly used maximum likelihood estimators.

  11. Software engineering the mixed model for genome-wide association studies on large samples.

    PubMed

    Zhang, Zhiwu; Buckler, Edward S; Casstevens, Terry M; Bradbury, Peter J

    2009-11-01

    Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample size and number of markers used for GWAS is increasing dramatically, resulting in greater statistical power to detect those associations. The use of mixed models with increasingly large data sets depends on the availability of software for analyzing those models. While multiple software packages implement the mixed model method, no single package provides the best combination of fast computation, ability to handle large samples, flexible modeling and ease of use. Key elements of association analysis with mixed models are reviewed, including modeling phenotype-genotype associations using mixed models, population stratification, kinship and its estimation, variance component estimation, use of best linear unbiased predictors or residuals in place of raw phenotype, improving efficiency and software-user interaction. The available software packages are evaluated, and suggestions made for future software development.

  12. Twice random, once mixed: applying mixed models to simultaneously analyze random effects of language and participants.

    PubMed

    Janssen, Dirk P

    2012-03-01

    Psychologists, psycholinguists, and other researchers using language stimuli have been struggling for more than 30 years with the problem of how to analyze experimental data that contain two crossed random effects (items and participants). The classical analysis of variance does not apply; alternatives have been proposed but have failed to catch on, and a statistically unsatisfactory procedure of using two approximations (known as F(1) and F(2)) has become the standard. A simple and elegant solution using mixed model analysis has been available for 15 years, and recent improvements in statistical software have made mixed models analysis widely available. The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages. The article also introduces the DJMIXED: add-on package for SPSS, which makes entering the models and reporting their results as straightforward as possible.

  13. Progress toward openness, transparency, and reproducibility in cognitive neuroscience.

    PubMed

    Gilmore, Rick O; Diaz, Michele T; Wyble, Brad A; Yarkoni, Tal

    2017-05-01

    Accumulating evidence suggests that many findings in psychological science and cognitive neuroscience may prove difficult to reproduce; statistical power in brain imaging studies is low and has not improved recently; software errors in analysis tools are common and can go undetected for many years; and, a few large-scale studies notwithstanding, open sharing of data, code, and materials remain the rare exception. At the same time, there is a renewed focus on reproducibility, transparency, and openness as essential core values in cognitive neuroscience. The emergence and rapid growth of data archives, meta-analytic tools, software pipelines, and research groups devoted to improved methodology reflect this new sensibility. We review evidence that the field has begun to embrace new open research practices and illustrate how these can begin to address problems of reproducibility, statistical power, and transparency in ways that will ultimately accelerate discovery. © 2017 New York Academy of Sciences.

  14. Statistical evaluation of manual segmentation of a diffuse low-grade glioma MRI dataset.

    PubMed

    Ben Abdallah, Meriem; Blonski, Marie; Wantz-Mezieres, Sophie; Gaudeau, Yann; Taillandier, Luc; Moureaux, Jean-Marie

    2016-08-01

    Software-based manual segmentation is critical to the supervision of diffuse low-grade glioma patients and to the choice of optimal treatment. However, manual segmentation being time-consuming, it is difficult to include it in the clinical routine. An alternative to circumvent the time cost of manual segmentation could be to share the task among different practitioners, provided it can be reproduced. The goal of our work is to assess the reproducibility of manual segmentation of diffuse low-grade gliomas on MRI scans, with regard to the practitioners, their experience and field of expertise. A panel of 13 experts manually segmented 12 diffuse low-grade glioma clinical MRI datasets using the OSIRIX software. A statistical analysis gave promising results, as the practitioner factor, the medical specialty and the years of experience seem to have no significant impact on the average values of the tumor volume variable.

  15. [Propensity score matching in SPSS].

    PubMed

    Huang, Fuqiang; DU, Chunlin; Sun, Menghui; Ning, Bing; Luo, Ying; An, Shengli

    2015-11-01

    To realize propensity score matching in the PS Matching module of SPSS and interpret the analysis results. The R software, the plug-in linking R with the corresponding version of SPSS, and the propensity score matching package were installed. A PS Matching module was added to the SPSS interface, and its use was demonstrated with test data. Score estimation and nearest neighbor matching were achieved with the PS Matching module, and the results of qualitative and quantitative statistical description and evaluation were presented in graphical form for the matching. Propensity score matching can be accomplished conveniently using SPSS software.
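    Behind such a module, propensity score matching typically amounts to estimating scores with logistic regression and then pairing treated and control units by nearest neighbour on the score. The sketch below illustrates that generic recipe on simulated data; it is not the PS Matching module's own code, and the 0.05 caliper is an arbitrary assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
# Hypothetical covariates and treatment indicator.
n = 200
X = rng.normal(size=(n, 3))
treated = (rng.random(n) < 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1])))).astype(int)

# 1. Estimate propensity scores with logistic regression.
ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

# 2. Greedy 1:1 nearest-neighbour matching on the propensity score,
#    without replacement and with a caliper of 0.05.
controls = list(np.where(treated == 0)[0])
pairs = []
for t in np.where(treated == 1)[0]:
    if not controls:
        break
    j = min(controls, key=lambda c: abs(ps[c] - ps[t]))
    if abs(ps[j] - ps[t]) <= 0.05:
        pairs.append((t, j))
        controls.remove(j)

print(f"matched {len(pairs)} treated/control pairs")
```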

  16. Design of Student Information Management Database Application System for Office and Departmental Target Responsibility System

    NASA Astrophysics Data System (ADS)

    Zhou, Hui

    Carrying out an office and departmental target responsibility system is an inevitable outcome of higher education reform, and the statistical processing of student information is an important part of the student performance review within it. On the basis of an analysis of student evaluation, a student information management database application system is designed in this paper using relational database management system software. In order to implement the student information management function, the functional requirements, overall structure, data sheets and fields, data sheet associations and software code are designed in detail.

  17. Detailed analysis of complex single molecule FRET data with the software MASH

    NASA Astrophysics Data System (ADS)

    Hadzic, Mélodie C. A. S.; Kowerko, Danny; Börner, Richard; Zelger-Paulus, Susann; Sigel, Roland K. O.

    2016-04-01

    The processing and analysis of surface-immobilized single molecule FRET (Förster resonance energy transfer) data follows systematic steps (e.g. single molecule localization, clearance of different sources of noise, selection of the conformational and kinetic model, etc.) that require a solid knowledge in optics, photophysics, signal processing and statistics. The present proceeding aims at standardizing and facilitating procedures for single molecule detection by guiding the reader through an optimization protocol for a particular experimental data set. Relevant features were determined from single molecule movies (SMM) imaging Cy3- and Cy5-labeled Sc.ai5γ group II intron molecules synthetically recreated, to test the performances of four different detection algorithms. Up to 120 different parameterizations per method were routinely evaluated to finally establish an optimum detection procedure. The present protocol is adaptable to any movie displaying surface-immobilized molecules, and can be easily reproduced with our home-written software MASH (multifunctional analysis software for heterogeneous data) and script routines (both available in the download section of www.chem.uzh.ch/rna).

  18. The Digital Shoreline Analysis System (DSAS) Version 4.0 - An ArcGIS extension for calculating shoreline change

    USGS Publications Warehouse

    Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan

    2009-01-01

    The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
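    Of the four rate methods listed above, the end point rate and the simple linear regression rate are easy to illustrate. The transect dates and positions below are hypothetical, not DSAS output; the other two methods (weighted regression and least median of squares) are omitted from this sketch.

```python
import numpy as np
from scipy import stats

# Hypothetical shoreline positions (metres from the baseline) measured
# along one transect at several survey dates (decimal years).
years    = np.array([1930.5, 1958.2, 1978.7, 1998.3, 2007.6])
position = np.array([120.0, 112.5, 104.0, 96.5, 93.0])

# End point rate: net movement divided by the time elapsed between the
# oldest and most recent shorelines.
epr = (position[-1] - position[0]) / (years[-1] - years[0])

# Simple linear regression rate: slope of position against time, with the
# standard error and correlation coefficient also reported.
res = stats.linregress(years, position)

print(f"end point rate: {epr:.2f} m/yr")
print(f"linear regression rate: {res.slope:.2f} m/yr "
      f"(stderr {res.stderr:.2f}, r {res.rvalue:.2f})")
```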

  19. Principal component analysis of normalized full spectrum mass spectrometry data in multiMS-toolbox: An effective tool to identify important factors for classification of different metabolic patterns and bacterial strains.

    PubMed

    Cejnar, Pavel; Kuckova, Stepanka; Prochazka, Ales; Karamonova, Ludmila; Svobodova, Barbora

    2018-06-15

    Explorative statistical analysis of mass spectrometry data is still a time-consuming step. We analyzed critical factors for the application of principal component analysis (PCA) in mass spectrometry and focused on two whole spectrum based normalization techniques and their application in the analysis of registered peak data and, in comparison, in full spectrum data analysis. We used this technique to identify different metabolic patterns in the bacterial culture of Cronobacter sakazakii, an important foodborne pathogen. Two software utilities, the ms-alone, a python-based utility for mass spectrometry data preprocessing and peak extraction, and the multiMS-toolbox, an R software tool for advanced peak registration and detailed explorative statistical analysis, were implemented. The bacterial culture of Cronobacter sakazakii was cultivated on Enterobacter sakazakii Isolation Agar, Blood Agar Base and Tryptone Soya Agar for 24 h and 48 h and applied by the smear method on an Autoflex speed MALDI-TOF mass spectrometer. For the three tested cultivation media only two different metabolic patterns of Cronobacter sakazakii were identified using PCA applied on data normalized by two different normalization techniques. Results from matched peak data and subsequent detailed full spectrum analysis identified only two different metabolic patterns - cultivation on Enterobacter sakazakii Isolation Agar showed significant differences to cultivation on the other two tested media. The metabolic patterns for all tested cultivation media also proved the dependence on cultivation time. Both whole spectrum based normalization techniques together with the full spectrum PCA allow identification of important discriminative factors in experiments with several variable condition factors, avoiding any problems with improper identification of peaks or emphasis on below-threshold peak data. The amounts of processed data remain manageable. Both implemented software utilities are available free of charge from http://uprt.vscht.cz/ms. Copyright © 2018 John Wiley & Sons, Ltd.
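    The normalization-plus-PCA pipeline can be sketched as follows. Total ion current scaling stands in for one possible whole-spectrum normalization (the paper does not prescribe this particular one), and the spectra are simulated rather than taken from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical full-spectrum intensity matrix: rows are spectra from two
# cultivation conditions, columns are m/z bins.
spectra = np.vstack([rng.gamma(2.0, 1.0, (15, 2000)),
                     rng.gamma(2.0, 1.2, (15, 2000))])

# Whole-spectrum normalization (here: total ion current).
norm = spectra / spectra.sum(axis=1, keepdims=True)

# PCA on mean-centred data via singular value decomposition.
centred = norm - norm.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
scores = U * S                      # sample coordinates in PC space
explained = S**2 / (S**2).sum()     # proportion of variance per component

print("variance explained by PC1/PC2:", explained[:2].round(3))
```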

  20. Statistical analysis of Turbine Engine Diagnostic (TED) field test data

    NASA Astrophysics Data System (ADS)

    Taylor, Malcolm S.; Monyak, John T.

    1994-11-01

    During the summer of 1993, a field test of turbine engine diagnostic (TED) software, developed jointly by the U.S. Army Research Laboratory and the U.S. Army Ordnance Center and School, was conducted at Fort Stewart, GA. The data were collected in conformance with a cross-over design, some considerations of which are detailed. The initial analysis of the field test data was exploratory, followed by a more formal investigation. Technical aspects of the data analysis and the insights that were elicited are reported.

  1. Validation of a free software for unsupervised assessment of abdominal fat in MRI.

    PubMed

    Maddalo, Michele; Zorza, Ivan; Zubani, Stefano; Nocivelli, Giorgio; Calandra, Giulio; Soldini, Pierantonio; Mascaro, Lorella; Maroldi, Roberto

    2017-05-01

    To demonstrate the accuracy of unsupervised (fully automated) software for fat segmentation in magnetic resonance imaging. The proposed software is a freeware solution developed in ImageJ that enables the quantification of metabolically different adipose tissues in large cohort studies. The lumbar part of the abdomen (19 cm in the craniocaudal direction, centered at L3) of eleven healthy volunteers (age range: 21-46 years, BMI range: 21.7-31.6 kg/m²) was examined during a breath-hold on expiration with a GE T1 Dixon sequence. Single-slice and volumetric data were considered for each subject. The results of the visceral and subcutaneous adipose tissue assessments obtained by the unsupervised software were compared with supervised reference segmentations. The associated statistical analysis included Pearson correlations, Bland-Altman plots and volumetric differences (VD%). Values calculated by the unsupervised software significantly correlated with the corresponding supervised reference segmentations for both subcutaneous adipose tissue - SAT (R=0.9996, p<0.001) and visceral adipose tissue - VAT (R=0.995, p<0.001). Bland-Altman plots showed the absence of systematic errors and a limited spread of the differences. In the single-slice analysis, VD% were (1.6±2.9)% for SAT and (4.9±6.9)% for VAT. In the volumetric analysis, VD% were (1.3±0.9)% for SAT and (2.9±2.7)% for VAT. The developed software is capable of segmenting the metabolically different adipose tissues with a high degree of accuracy. This free ImageJ add-on can easily achieve widespread use and enable large-scale population studies of adipose tissue and its related diseases. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
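
    The following sketch reproduces the kind of agreement statistics reported above (Pearson correlation, Bland-Altman bias and limits of agreement, percentage volume difference) for hypothetical automated versus reference volumes; it is not the ImageJ plugin itself.

```python
# Sketch of the reported agreement statistics (not the ImageJ plugin itself):
# Pearson correlation, Bland-Altman bias/limits of agreement and percentage volume
# difference between automated and reference VAT volumes (hypothetical values, cm^3).
import numpy as np
from scipy import stats

auto = np.array([1510., 980., 2210., 1320., 1750., 860.])       # unsupervised software
reference = np.array([1480., 1005., 2150., 1300., 1790., 875.]) # supervised segmentation

r, p = stats.pearsonr(auto, reference)

diff = auto - reference
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)                   # Bland-Altman 95% limits of agreement

vd_percent = 100.0 * np.abs(diff) / reference   # volumetric difference per subject

print(f"Pearson r = {r:.4f} (p = {p:.3g})")
print(f"Bland-Altman bias = {bias:.1f} cm^3, limits = [{bias - loa:.1f}, {bias + loa:.1f}]")
print(f"VD% = {vd_percent.mean():.1f} +/- {vd_percent.std(ddof=1):.1f} %")
```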

  2. Evaluation of vocal acoustic and efficiency analysis parameters in medical students and academic teachers with use of iris and diagnoscope specialist software.

    PubMed

    Zielińska-Bliźniewska, Hanna; Sułkowski, Wiesław J; Pietkiewicz, Piotr; Miłoński, Jarosław; Mazurek, Agnieszka; Olszewski, Jurek

    2012-06-01

    The aim of this study was to compare the parameters of vocal acoustic and vocal efficiency analyses in medical students and academic teachers with use of the IRIS and DiagnoScope Specialist software and to evaluate their usefulness in prevention and certification of occupational disease. The study group comprised 40 women, including students and employees of the Military Medical Faculty, Medical University of Łódź. After informed consent had been obtained from the participating women, the primary medical history was taken, videolaryngoscopic and stroboscopic examinations were performed, and diagnostic vocal acoustic analysis was carried out with the use of the IRIS and DiagnoScope Specialist software. Based on the results of the performed measurements, the statistical analysis evidenced the compatibility between the two software programs, IRIS and DiagnoScope Specialist, with the only exception of the F4 formant. The mean values of vocal acoustic parameters in medical students and academic teachers, obtained by means of the IRIS software, can be used as standards for the female population, which have not yet been developed by the producer. When using the DiagnoScope Specialist software, some mean values were higher and some lower than the standards specified by the producer. The study evidenced the compatibility between the two measurement software programs, IRIS and DiagnoScope Specialist, except for the F4 formant. It should be noted that the latter has an advantage over the former, since the standard values of vocal acoustic parameters have been worked out by the producer. Moreover, they only slightly departed from the values obtained in our study and may be useful in diagnostics of occupational voice disorders.

  3. A framework for incorporating DTI Atlas Builder registration into Tract-Based Spatial Statistics and a simulated comparison to standard TBSS.

    PubMed

    Leming, Matthew; Steiner, Rachel; Styner, Martin

    2016-02-27

    Tract-based spatial statistics (TBSS) is a software pipeline widely employed in comparative analysis of white matter integrity from diffusion tensor imaging (DTI) datasets. In this study, we seek to evaluate the relationship between different methods of atlas registration for use with TBSS and different measurements of DTI (fractional anisotropy, FA; axial diffusivity, AD; radial diffusivity, RD; and mean diffusivity, MD). To do so, we have developed a novel tool that builds on existing diffusion atlas building software, integrating it into an adapted version of TBSS called DAB-TBSS (DTI Atlas Builder-Tract-Based Spatial Statistics) by using the advanced registration offered in DTI Atlas Builder. To compare the effectiveness of these two versions of TBSS, we also propose a framework for simulating population differences in diffusion tensor imaging data, providing a more substantive means of empirically comparing DTI group analysis programs such as TBSS. In this study, we used 33 diffusion tensor imaging datasets and simulated group-wise changes in these data by increasing, in three different simulations, the principal eigenvalue (directly altering AD), the second and third eigenvalues (RD), and all three eigenvalues (MD) in the genu, the right uncinate fasciculus, and the left IFO. Additionally, we assessed the benefits of comparing the tensors directly using functional analysis of diffusion tensor tract statistics (FADTTS). Our results indicate comparable levels of FA-based detection between DAB-TBSS and TBSS, with standard TBSS registration reporting a higher rate of false positives in other measurements of DTI. Within the simulated changes investigated here, this study suggests that the use of DTI Atlas Builder's registration enhances TBSS group-based studies.

  4. Multivariate statistical analysis software technologies for astrophysical research involving large data bases

    NASA Technical Reports Server (NTRS)

    Djorgovski, S. George

    1994-01-01

    We developed a package to process and analyze the data from the digital version of the Second Palomar Sky Survey. This system, called SKICAT, incorporates the latest in machine learning and expert systems software technology, in order to classify the detected objects objectively and uniformly, and facilitate handling of the enormous data sets from digital sky surveys and other sources. The system provides a powerful, integrated environment for the manipulation and scientific investigation of catalogs from virtually any source. It serves three principal functions: image catalog construction, catalog management, and catalog analysis. Through use of the GID3* Decision Tree artificial induction software, SKICAT automates the process of classifying objects within CCD and digitized plate images. To exploit these catalogs, the system also provides tools to merge them into a large, complete database which may be easily queried and modified when new data or better methods of calibrating or classifying become available. The most innovative feature of SKICAT is the facility it provides to experiment with and apply the latest in machine learning technology to the tasks of catalog construction and analysis. SKICAT provides a unique environment for implementing these tools for any number of future scientific purposes. Initial scientific verification and performance tests have been made using galaxy counts and measurements of galaxy clustering from small subsets of the survey data, and a search for very high redshift quasars. All of the tests were successful, and produced new and interesting scientific results. Attachments to this report give detailed accounts of the technical aspects of STATPROG, a package for multivariate statistical analysis of small and moderate-size data sets. The package was tested extensively on a number of real scientific applications and has produced real, published results.

  5. Analysis of lipid experiments (ALEX): a software framework for analysis of high-resolution shotgun lipidomics data.

    PubMed

    Husen, Peter; Tarasov, Kirill; Katafiasz, Maciej; Sokol, Elena; Vogt, Johannes; Baumgart, Jan; Nitsch, Robert; Ekroos, Kim; Ejsing, Christer S

    2013-01-01

    Global lipidomics analysis across large sample sizes produces high-content datasets that require dedicated software tools supporting lipid identification and quantification, efficient data management and lipidome visualization. Here we present a novel software-based platform for streamlined data processing, management and visualization of shotgun lipidomics data acquired using high-resolution Orbitrap mass spectrometry. The platform features the ALEX framework designed for automated identification and export of lipid species intensity directly from proprietary mass spectral data files, and an auxiliary workflow using database exploration tools for integration of sample information, computation of lipid abundance and lipidome visualization. A key feature of the platform is the organization of lipidomics data in "database table format" which provides the user with an unsurpassed flexibility for rapid lipidome navigation using selected features within the dataset. To demonstrate the efficacy of the platform, we present a comparative neurolipidomics study of cerebellum, hippocampus and somatosensory barrel cortex (S1BF) from wild-type and knockout mice devoid of the putative lipid phosphate phosphatase PRG-1 (plasticity related gene-1). The presented framework is generic, extendable to processing and integration of other lipidomic data structures, can be interfaced with post-processing protocols supporting statistical testing and multivariate analysis, and can serve as an avenue for disseminating lipidomics data within the scientific community. The ALEX software is available at www.msLipidomics.info.

  6. The Perseus computational platform for comprehensive analysis of (prote)omics data.

    PubMed

    Tyanova, Stefka; Temu, Tikira; Sinitcyn, Pavel; Carlson, Arthur; Hein, Marco Y; Geiger, Tamar; Mann, Matthias; Cox, Jürgen

    2016-09-01

    A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.

  7. Meta-analysis in Stata using gllamm.

    PubMed

    Bagos, Pantelis G

    2015-12-01

    There are several user-written programs for performing meta-analysis in Stata (Stata Statistical Software: College Station, TX: StataCorp LP). These include metan, metareg, mvmeta, and glst. However, there are several cases for which these programs do not suffice. For instance, there is no software for performing univariate meta-analysis with correlated estimates, for multilevel or hierarchical meta-analysis, or for meta-analysis of longitudinal data. In this work, we show with practical applications that many disparate models, including but not limited to the ones mentioned earlier, can be fitted using gllamm. The software is very versatile and can handle a wide variety of models with applications in a wide range of disciplines. The method presented here takes advantage of these modeling capabilities and makes use of appropriate transformations, based on the Cholesky decomposition of the inverse of the covariance matrix, known as generalized least squares, in order to handle correlated data. The models described earlier can be thought of as special instances of a general linear mixed-model formulation, but to the author's knowledge, a general exposition that incorporates all the available models for meta-analysis as special cases, together with the instructions to fit them in Stata, has not been presented so far. Source code is available at http://www.compgen.org/tools/gllamm. Copyright © 2015 John Wiley & Sons, Ltd.
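
    The generalized least squares step described above can be sketched in a few lines of NumPy: correlated effect estimates are whitened with the Cholesky factor of the inverse covariance matrix, after which an ordinary least squares fit returns the pooled estimate. The numbers are hypothetical and this is not gllamm.

```python
# NumPy sketch of the generalized least squares step described above (not gllamm):
# correlated effect estimates y with covariance V are whitened via the Cholesky
# factor of V^-1 so that an ordinary least squares fit yields the pooled estimate.
import numpy as np

y = np.array([0.30, 0.42, 0.25])                  # hypothetical correlated effect estimates
V = np.array([[0.020, 0.008, 0.004],              # hypothetical covariance matrix
              [0.008, 0.025, 0.006],
              [0.004, 0.006, 0.018]])
X = np.ones((3, 1))                               # intercept-only design: common effect

L = np.linalg.cholesky(np.linalg.inv(V))          # V^-1 = L L'
y_star, X_star = L.T @ y, L.T @ X                 # transformed (decorrelated) data

beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
se = np.sqrt(np.linalg.inv(X.T @ np.linalg.inv(V) @ X)[0, 0])

print(f"Pooled estimate = {beta[0]:.3f} (SE {se:.3f})")
```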

  8. Testing Software Development Project Productivity Model

    NASA Astrophysics Data System (ADS)

    Lipkin, Ilya

    Software development is an increasingly influential factor in today's business environment, and a major issue affecting software development is how an organization estimates projects. If the organization underestimates cost, schedule, and quality requirements, the end results will not meet customer needs. On the other hand, if the organization overestimates these criteria, resources that could have been used more profitably will be wasted. There is no accurate model or measure available to guide an organization in estimating software development effort, with existing estimation models often underestimating software development efforts by as much as 500 to 600 percent. To address this issue, existing models usually are calibrated using local data with a small sample size, with the resulting estimates not offering improved cost analysis. This study presents a conceptual model for accurately estimating software development, based on an extensive literature review and theoretical analysis grounded in Sociotechnical Systems (STS) theory. The conceptual model serves as a solution to bridge organizational and technological factors and is validated using an empirical dataset provided by the DoD. Practical implications of this study allow practitioners to concentrate on specific constructs of interest that provide the best value for the least amount of time. This study outlines the key contributing constructs that are unique for Software Size E-SLOC, Man-hours Spent, and Quality of the Product, these being the constructs with the largest contribution to project productivity. This study discusses customer characteristics and provides a framework for a simplified project analysis for source selection evaluation and audit task reviews for customers and suppliers. Theoretical contributions of this study provide an initial theory-based hypothesized project productivity model that can be used as a generic overall model across several application domains such as IT, Command and Control, and Simulation. This research validates findings from previous work concerning software project productivity and leverages those results in this study. The hypothesized project productivity model provides statistical support and validation of expert opinions used by practitioners in the field of software project estimation.

  9. VOIP for Telerehabilitation: A Risk Analysis for Privacy, Security and HIPAA Compliance: Part II

    PubMed Central

    Watzlaf, Valerie J.M.; Moeini, Sohrab; Matusow, Laura; Firouzan, Patti

    2011-01-01

    In a previous publication the authors developed a privacy and security checklist to evaluate Voice over Internet Protocol (VoIP) videoconferencing software used between patients and therapists to provide telerehabilitation (TR) therapy. In this paper, the privacy and security checklist that was previously developed is used to perform a risk analysis of the top ten VoIP videoconferencing software programs to determine if their policies provide answers to the privacy and security checklist. Sixty percent of the companies claimed they do not listen in on video-therapy calls unless maintenance is needed. Only 50% of the companies assessed use some form of encryption, and some did not specify what type of encryption was used. Seventy percent of the companies assessed did not specify any form of auditing on their servers. Statistically significant differences across company websites were found for sharing information outside of the country (p=0.010), encryption (p=0.006), and security evaluation (p=0.005). Healthcare providers considering use of VoIP software for TR services may consider using this privacy and security checklist before deciding to incorporate a VoIP software system for TR. Other videoconferencing software that is specific for TR with strong encryption, good access controls, and hardware that meets privacy and security standards should be considered for use with TR. PMID:25945177

  10. VOIP for Telerehabilitation: A Risk Analysis for Privacy, Security and HIPAA Compliance: Part II.

    PubMed

    Watzlaf, Valerie J M; Moeini, Sohrab; Matusow, Laura; Firouzan, Patti

    2011-01-01

    In a previous publication the authors developed a privacy and security checklist to evaluate Voice over Internet Protocol (VoIP) videoconferencing software used between patients and therapists to provide telerehabilitation (TR) therapy. In this paper, the privacy and security checklist that was previously developed is used to perform a risk analysis of the top ten VoIP videoconferencing software programs to determine if their policies provide answers to the privacy and security checklist. Sixty percent of the companies claimed they do not listen in on video-therapy calls unless maintenance is needed. Only 50% of the companies assessed use some form of encryption, and some did not specify what type of encryption was used. Seventy percent of the companies assessed did not specify any form of auditing on their servers. Statistically significant differences across company websites were found for sharing information outside of the country (p=0.010), encryption (p=0.006), and security evaluation (p=0.005). Healthcare providers considering use of VoIP software for TR services may consider using this privacy and security checklist before deciding to incorporate a VoIP software system for TR. Other videoconferencing software that is specific for TR with strong encryption, good access controls, and hardware that meets privacy and security standards should be considered for use with TR.

  11. Comprehensive analysis of yeast metabolite GC x GC-TOFMS data: combining discovery-mode and deconvolution chemometric software.

    PubMed

    Mohler, Rachel E; Dombek, Kenneth M; Hoggard, Jamin C; Pierce, Karisa M; Young, Elton T; Synovec, Robert E

    2007-08-01

    The first extensive study of yeast metabolite GC x GC-TOFMS data from cells grown under fermenting (R) and respiring (DR) conditions is reported. In this study, recently developed chemometric software for use with three-dimensional instrumentation data was implemented, using a statistically based Fisher ratio method. The Fisher ratio method is fully automated and rapidly reduces the data to pinpoint two-dimensional chromatographic peaks that differentiate sample types while utilizing all the mass channels. The effect of lowering the Fisher ratio threshold on peak identification was studied. At the lowest threshold (just above the noise level), 73 metabolite peaks were identified, nearly three-fold more than the number of previously reported metabolite peaks identified (26). In addition to the 73 identified metabolites, 81 unknown metabolites were also located. A Parallel Factor Analysis graphical user interface (PARAFAC GUI) was applied to selected mass channels to obtain a concentration ratio for each metabolite under the two growth conditions. Of the 73 known metabolites identified by the Fisher ratio method, 54 were statistically changing at the 95% confidence limit between the DR and R conditions according to the rigorous Student's t-test. PARAFAC determined the concentration ratio and provided a fully deconvoluted (i.e. mathematically resolved) mass spectrum for each of the metabolites. The combination of the Fisher ratio method with the PARAFAC GUI provides high-throughput software for discovery-based metabolomics research, and is novel for GC x GC-TOFMS data due to the use of the entire data set in the analysis (640 MB x 70 runs, double precision floating point).
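
    A minimal sketch of the Fisher-ratio screening idea follows, assuming the usual two-class definition (between-class variance over pooled within-class variance); the peak areas are hypothetical and this is not the authors' chemometric software.

```python
# Minimal sketch of the Fisher-ratio screening idea (not the authors' software):
# for each variable, the between-class variance is compared with the pooled
# within-class variance; large ratios flag signals that differ between classes.
import numpy as np

def fisher_ratio(class_a, class_b):
    """Between-class variance over pooled within-class variance for one variable."""
    groups = [np.asarray(class_a, dtype=float), np.asarray(class_b, dtype=float)]
    grand_mean = np.concatenate(groups).mean()
    n = np.array([g.size for g in groups])
    means = np.array([g.mean() for g in groups])
    variances = np.array([g.var(ddof=1) for g in groups])

    between = np.sum(n * (means - grand_mean) ** 2) / (len(groups) - 1)
    within = np.sum((n - 1) * variances) / (n.sum() - len(groups))
    return between / within

# Hypothetical peak areas of one metabolite under fermenting (R) and respiring (DR) growth
r_runs = [10.2, 11.1, 9.8, 10.5]
dr_runs = [14.9, 15.6, 14.2, 15.1]
print(f"Fisher ratio: {fisher_ratio(r_runs, dr_runs):.1f}")
```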

  12. Application of Maxent Multivariate Analysis to Define Climate-Change Effects on Species Distributions and Changes

    DTIC Science & Technology

    2014-09-01

    Applies Maxent statistical multivariate analysis to define the current and projected future range probability for species of interest to Army land managers. (Figure 4: RCW omission rate and predicted area as a function of the cumulative threshold.)

  13. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  14. CSAM Metrology Software Tool

    NASA Technical Reports Server (NTRS)

    Vu, Duc; Sandor, Michael; Agarwal, Shri

    2005-01-01

    CSAM Metrology Software Tool (CMeST) is a computer program for analysis of false-color CSAM images of plastic-encapsulated microcircuits. (CSAM signifies C-mode scanning acoustic microscopy.) The colors in the images indicate areas of delamination within the plastic packages. Heretofore, the images have been interpreted by human examiners. Hence, interpretations have not been entirely consistent and objective. CMeST processes the color information in image-data files to detect areas of delamination without incurring inconsistencies of subjective judgement. CMeST can be used to create a database of baseline images of packages acquired at given times for comparison with images of the same packages acquired at later times. Any area within an image can be selected for analysis, which can include examination of different delamination types by location. CMeST can also be used to perform statistical analyses of image data. Results of analyses are available in a spreadsheet format for further processing. The results can be exported to any data-base-processing software.

  15. Error correction and diversity analysis of population mixtures determined by NGS

    PubMed Central

    Burroughs, Nigel J.; Evans, David J.; Ryabov, Eugene V.

    2014-01-01

    The impetus for this work was the need to analyse nucleotide diversity in a viral mix taken from honeybees. The paper has two findings. First, a method for correction of next generation sequencing error in the distribution of nucleotides at a site is developed. Second, a package of methods for assessment of nucleotide diversity is assembled. The error correction method is statistically based and works at the level of the nucleotide distribution rather than the level of individual nucleotides. The method relies on an error model and a sample of known viral genotypes that is used for model calibration. A compendium of existing and new diversity analysis tools is also presented, allowing hypotheses about diversity and mean diversity to be tested and associated confidence intervals to be calculated. The methods are illustrated using honeybee viral samples. Software in both Excel and Matlab and a guide are available at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/, the Warwick University Systems Biology Centre software download site. PMID:25405074

  16. Navigating freely-available software tools for metabolomics analysis.

    PubMed

    Spicer, Rachel; Salek, Reza M; Moreno, Pablo; Cañueto, Daniel; Steinbeck, Christoph

    2017-01-01

    The field of metabolomics has expanded greatly over the past two decades, both as an experimental science with applications in many areas and with regard to data standards and bioinformatics software tools. The diversity of experimental designs and instrumental technologies used for metabolomics has led to the need for distinct data analysis methods and the development of many software tools. Our aim was to compile a comprehensive list of the most widely used freely available software and tools that are used primarily in metabolomics. The most widely used tools were selected for inclusion in the review by either ≥ 50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey. Tools were then categorised by the type of instrumental data (i.e. LC-MS, GC-MS or NMR) and the functionality (i.e. pre- and post-processing, statistical analysis, workflow and other functions) they are designed for. A comprehensive list of the most used tools was compiled. Each tool is discussed within the context of its application domain and in relation to comparable tools of the same domain. An extended list including additional tools is available at https://github.com/RASpicer/MetabolomicsTools which is classified and searchable via a simple controlled vocabulary. This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. As future work, we suggest a direct comparison of tools' abilities to perform specific data analysis tasks, e.g. peak picking.

  17. Cognition, comprehension and application of biostatistics in research by Indian postgraduate students in periodontics.

    PubMed

    Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar

    2014-01-01

    Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, data obtained is subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data, thus, forms the basis for all evidence-based conclusions. The aim of this study is to evaluate the cognition, comprehension and application of biostatistics in research among postgraduate students in periodontics in India. A total of 391 postgraduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge, understanding and its application in the design and conduct of the research protocol were collected using a dichotomous questionnaire. Descriptive statistics were used for data analysis. Nearly 79.2% of students were aware of the importance of biostatistics in research, 55-65% were familiar with the MS-EXCEL spreadsheet for graphical representation of data and with the statistical software available on the internet, 26.0% had biostatistics as a mandatory subject in their curriculum, 9.5% tried to perform statistical analysis on their own, while 3.0% were successful in performing statistical analysis of their studies on their own. Biostatistics should play a central role in the planning, conduct, interim analysis, final analysis and reporting of periodontal research, especially by postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research, but the level of understanding and application is still basic and needs to be addressed.

  18. Quantification of Operational Risk Using A Data Mining

    NASA Technical Reports Server (NTRS)

    Perera, J. Sebastian

    1999-01-01

    What is Data Mining? - Data Mining is the process of finding actionable information hidden in raw data. - Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data - Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process

  19. SU-F-I-43: A Software-Based Statistical Method to Compute Low Contrast Detectability in Computed Tomography Images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chacko, M; Aldoohan, S

    Purpose: The low contrast detectability (LCD) of a CT scanner is its ability to detect and display faint lesions. The current approach to quantify LCD is achieved using vendor-specific methods and phantoms, typically by subjectively observing the smallest size object at a contrast level above phantom background. However, this approach does not yield clinically applicable values for LCD. The current study proposes a statistical LCD metric using software tools not only to assess scanner performance, but also to quantify the key factors affecting LCD. This approach was developed using uniform QC phantoms, and its applicability was then extended under simulated clinical conditions. Methods: MATLAB software was developed to compute LCD using a uniform image of a QC phantom. For a given virtual object size, the software randomly samples the image within a selected area, and uses statistical analysis based on Student's t-distribution to compute the LCD as the minimal Hounsfield units that can be distinguished from the background at the 95% confidence level. Its validity was assessed by comparison with the behavior of a known QC phantom under various scan protocols and a tissue-mimicking phantom. The contributions of beam quality and scattered radiation to the computed LCD were quantified by using various external beam-hardening filters and phantom lengths. Results: As expected, the LCD was inversely related to object size under all scan conditions. The type of image reconstruction kernel filter and tissue/organ type strongly influenced the background noise characteristics and therefore the computed LCD for the associated image. Conclusion: The proposed metric and its associated software tools are vendor-independent and can be used to analyze the LCD performance of any scanner. Furthermore, the method employed can be used in conjunction with the relationships established in this study between LCD and tissue type to extend these concepts to patients' clinical CT images.
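
    The statistical idea can be sketched as below: ROI means of a chosen virtual object size are sampled at random from a synthetic uniform phantom image, and the t-distribution converts their spread into the smallest mean HU difference detectable at 95% confidence. The exact formulation used by the authors' MATLAB tool is not given in the abstract, so this is an assumed illustration only.

```python
# Sketch of the statistical idea only (the authors' MATLAB formulation is assumed, not shown):
# ROI means for a given virtual object size are sampled from a uniform phantom image, and
# the t-distribution gives the smallest mean HU difference distinguishable from background
# at 95% confidence.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
image = rng.normal(loc=0.0, scale=8.0, size=(512, 512))   # synthetic uniform phantom (HU)

def lcd_95(image, object_px=5, n_samples=500, rng=rng):
    """Minimum HU contrast detectable at 95% confidence for a square object of object_px."""
    h, w = image.shape
    means = np.empty(n_samples)
    for i in range(n_samples):
        r = rng.integers(0, h - object_px)
        c = rng.integers(0, w - object_px)
        means[i] = image[r:r + object_px, c:c + object_px].mean()
    t_crit = stats.t.ppf(0.95, df=n_samples - 1)
    return t_crit * means.std(ddof=1)

for size in (2, 5, 10):
    print(f"object size {size:2d} px -> LCD ~ {lcd_95(image, size):.2f} HU")
```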

  20. Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

    PubMed

    Austin, Peter C

    2010-04-22

    Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.

  1. Development of parallel line analysis criteria for recombinant adenovirus potency assay and definition of a unit of potency.

    PubMed

    Ogawa, Yasushi; Fawaz, Farah; Reyes, Candice; Lai, Julie; Pungor, Erno

    2007-01-01

    Parameter settings of a parallel line analysis procedure were defined by applying statistical analysis procedures to the absorbance data from a cell-based potency bioassay for a recombinant adenovirus, Adenovirus 5 Fibroblast Growth Factor-4 (Ad5FGF-4). The parallel line analysis was performed with a commercially available software, PLA 1.2. The software performs Dixon outlier test on replicates of the absorbance data, performs linear regression analysis to define linear region of the absorbance data, and tests parallelism between the linear regions of standard and sample. Width of Fiducial limit, expressed as a percent of the measured potency, was developed as a criterion for rejection of the assay data and to significantly improve the reliability of the assay results. With the linear range-finding criteria of the software set to a minimum of 5 consecutive dilutions and best statistical outcome, and in combination with the Fiducial limit width acceptance criterion of <135%, 13% of the assay results were rejected. With these criteria applied, the assay was found to be linear over the range of 0.25 to 4 relative potency units, defined as the potency of the sample normalized to the potency of Ad5FGF-4 standard containing 6 x 10^6 adenovirus particles/mL. The overall precision of the assay was estimated to be 52%. Without the application of Fiducial limit width criterion, the assay results were not linear over the range, and an overall precision of 76% was calculated from the data. An absolute unit of potency for the assay was defined by using the parallel line analysis procedure as the amount of Ad5FGF-4 that results in an absorbance value that is 121% of the average absorbance readings of the wells containing cells not infected with the adenovirus.
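
    The parallel-line principle itself can be sketched briefly: absorbance is regressed on log dose for standard and sample, the slopes are compared as a parallelism check, and the horizontal offset between the parallel lines gives the log relative potency. The sketch below uses hypothetical data and is not the PLA 1.2 procedure.

```python
# Sketch of the parallel-line principle (not the PLA 1.2 software): absorbance is
# regressed on log dose for standard and sample, slopes are compared, and the
# horizontal shift between the two parallel lines gives the log relative potency.
import numpy as np
from scipy import stats

log_dose = np.log10([1, 2, 4, 8, 16])                       # hypothetical dilution series
standard = np.array([0.21, 0.38, 0.55, 0.74, 0.90])         # hypothetical absorbances
sample = np.array([0.30, 0.47, 0.64, 0.82, 0.99])

fit_std = stats.linregress(log_dose, standard)
fit_smp = stats.linregress(log_dose, sample)
print(f"slopes: standard {fit_std.slope:.3f}, sample {fit_smp.slope:.3f} (parallelism check)")

# Under parallelism, use a common slope and read off the horizontal offset
common_slope = 0.5 * (fit_std.slope + fit_smp.slope)
log_rp = (fit_smp.intercept - fit_std.intercept) / common_slope
print(f"relative potency ~ {10 ** log_rp:.2f}")
```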

  2. Intrinsic noise analyzer: a software package for the exploration of stochastic biochemical kinetics using the system size expansion.

    PubMed

    Thomas, Philipp; Matuschek, Hannes; Grima, Ramon

    2012-01-01

    The accepted stochastic descriptions of biochemical dynamics under well-mixed conditions are given by the Chemical Master Equation and the Stochastic Simulation Algorithm, which are equivalent. The latter is a Monte-Carlo method, which, despite enjoying broad availability in a large number of existing software packages, is computationally expensive due to the huge amounts of ensemble averaging required for obtaining accurate statistical information. The former is a set of coupled differential-difference equations for the probability of the system being in any one of the possible mesoscopic states; these equations are typically computationally intractable because of the inherently large state space. Here we introduce the software package intrinsic Noise Analyzer (iNA), which allows for systematic analysis of stochastic biochemical kinetics by means of van Kampen's system size expansion of the Chemical Master Equation. iNA is platform independent and supports the popular SBML format natively. The present implementation is the first to adopt a complementary approach that combines state-of-the-art analysis tools using the computer algebra system GiNaC with traditional methods of stochastic simulation. iNA integrates two approximation methods based on the system size expansion, the Linear Noise Approximation and effective mesoscopic rate equations, which to date have not been available to non-expert users, into an easy-to-use graphical user interface. In particular, the present methods allow for quick approximate analysis of time-dependent mean concentrations, variances, covariances and correlation coefficients, which typically outperforms stochastic simulations. These analytical tools are complemented by automated multi-core stochastic simulations with direct statistical evaluation and visualization. We showcase iNA's performance by using it to explore the stochastic properties of cooperative and non-cooperative enzyme kinetics and a gene network associated with circadian rhythms. The software iNA is freely available as executable binaries for Linux, MacOSX and Microsoft Windows, as well as the full source code under an open source license.
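
    As a toy illustration of the system size expansion approach (not iNA itself), the sketch below integrates the rate equation and the Linear Noise Approximation variance for a linear birth-death process, for which the LNA result coincides with the exact Poisson statistics.

```python
# Toy sketch of the system-size-expansion idea (not iNA itself): for a linear
# birth-death process X -> X+1 at rate k0 and X -> X-1 at rate k1*X, the rate
# equation gives the mean and the Linear Noise Approximation gives the variance;
# for this linear system the LNA is exact (the stationary law is Poisson).
import numpy as np

k0, k1 = 50.0, 0.5          # birth and degradation rates
x, var = 0.0, 0.0           # macroscopic mean and LNA variance
dt = 1e-3

for _ in range(int(40 / dt)):                    # integrate to steady state (Euler scheme)
    dx = k0 - k1 * x                             # deterministic rate equation
    dvar = -2.0 * k1 * var + (k0 + k1 * x)       # LNA: dS/dt = 2*J*S + D, with J = -k1
    x += dx * dt
    var += dvar * dt

print(f"LNA steady state: mean = {x:.1f}, variance = {var:.1f}")
print(f"Exact (Poisson):  mean = {k0 / k1:.1f}, variance = {k0 / k1:.1f}")
```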

  3. Intrinsic Noise Analyzer: A Software Package for the Exploration of Stochastic Biochemical Kinetics Using the System Size Expansion

    PubMed Central

    Grima, Ramon

    2012-01-01

    The accepted stochastic descriptions of biochemical dynamics under well-mixed conditions are given by the Chemical Master Equation and the Stochastic Simulation Algorithm, which are equivalent. The latter is a Monte-Carlo method, which, despite enjoying broad availability in a large number of existing software packages, is computationally expensive due to the huge amounts of ensemble averaging required for obtaining accurate statistical information. The former is a set of coupled differential-difference equations for the probability of the system being in any one of the possible mesoscopic states; these equations are typically computationally intractable because of the inherently large state space. Here we introduce the software package intrinsic Noise Analyzer (iNA), which allows for systematic analysis of stochastic biochemical kinetics by means of van Kampen’s system size expansion of the Chemical Master Equation. iNA is platform independent and supports the popular SBML format natively. The present implementation is the first to adopt a complementary approach that combines state-of-the-art analysis tools using the computer algebra system GiNaC with traditional methods of stochastic simulation. iNA integrates two approximation methods based on the system size expansion, the Linear Noise Approximation and effective mesoscopic rate equations, which to date have not been available to non-expert users, into an easy-to-use graphical user interface. In particular, the present methods allow for quick approximate analysis of time-dependent mean concentrations, variances, covariances and correlation coefficients, which typically outperforms stochastic simulations. These analytical tools are complemented by automated multi-core stochastic simulations with direct statistical evaluation and visualization. We showcase iNA’s performance by using it to explore the stochastic properties of cooperative and non-cooperative enzyme kinetics and a gene network associated with circadian rhythms. The software iNA is freely available as executable binaries for Linux, MacOSX and Microsoft Windows, as well as the full source code under an open source license. PMID:22723865

  4. Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset.

    PubMed

    Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorssaeler, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

    2016-01-30

    Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed to finely assess their performances in terms of sensitivity and false discovery rate, by measuring the number of true and false-positive (respectively UPS1 or yeast background proteins found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performances of the data processing workflow. We provide here such a controlled standard dataset and used it to evaluate the performances of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Statistical models for the analysis and design of digital polymerase chain (dPCR) experiments

    USGS Publications Warehouse

    Dorazio, Robert; Hunter, Margaret

    2015-01-01

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
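
    For the intercept-only case, the model reduces to the familiar closed form: with copies Poisson-distributed among partitions, the complementary log-log of the positive fraction equals log(concentration) plus log(partition volume). The sketch below applies that closed form to hypothetical counts; fitting the full GLM with covariates rests on the same relationship.

```python
# Sketch of the underlying dPCR relationship (not the authors' code): with copies
# Poisson-distributed among partitions, cloglog(Pr(positive)) = log(lambda) + log(v),
# so the closed-form estimate below is the intercept-only case of the GLM described above.
import numpy as np

n_partitions = 20000          # hypothetical dPCR run
n_positive = 7300
v = 0.85e-3                   # hypothetical partition volume in microlitres

p_hat = n_positive / n_partitions
lam_hat = -np.log(1.0 - p_hat)          # mean copies per partition
conc = lam_hat / v                      # copies per microlitre

# Approximate 95% CI via the delta method on log(lambda)
se_p = np.sqrt(p_hat * (1 - p_hat) / n_partitions)
se_log_lam = se_p / ((1 - p_hat) * lam_hat)
ci = conc * np.exp(np.array([-1.96, 1.96]) * se_log_lam)

print(f"lambda = {lam_hat:.3f} copies/partition")
print(f"concentration = {conc:.1f} copies/uL (95% CI {ci[0]:.1f} - {ci[1]:.1f})")
```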

  6. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments.

    PubMed

    Dorazio, Robert M; Hunter, Margaret E

    2015-11-03

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

  7. Quantitative Measures for Software Independent Verification and Validation

    NASA Technical Reports Server (NTRS)

    Lee, Alice

    1996-01-01

    As software is maintained or reused, it undergoes an evolution which tends to increase the overall complexity of the code. To understand the effects of this, we brought in statistics experts and leading researchers in software complexity, reliability, and their interrelationships. These experts' project has resulted in our ability to statistically correlate specific code complexity attributes, in orthogonal domains, to errors found over time in the HAL/S flight software which flies in the Space Shuttle. Although only a prototype-tools experiment, the result of this research appears to be extendable to all other NASA software, given appropriate data similar to that logged for the Shuttle onboard software. Our research has demonstrated that a more complete domain coverage can be mathematically demonstrated with the approach we have applied, thereby ensuring full insight into the cause-and-effect relationship between the complexity of a software system and the fault density of that system. By applying the operational profile, we can characterize the dynamic effects of software path complexity under this same approach. We now have the ability to measure specific attributes which have been statistically demonstrated to correlate to increased error probability, and to know which actions to take for each complexity domain. Shuttle software verifiers can now monitor the changes in the software complexity, assess the added or decreased risk of software faults in modified code, and determine necessary corrections. The reports, tool documentation, user's guides, and new approach that have resulted from this research effort represent advances in the state of the art of software quality and reliability assurance. Details describing how to apply this technique to other NASA code are contained in this document.

  8. A study of 1177 odontogenic lesions in a South Kerala population

    PubMed Central

    Deepthi, PV; Beena, VT; Padmakumar, SK; Rajeev, R; Sivakumar, R

    2016-01-01

    Context: A study on odontogenic cysts and tumors. Aims: The aim of this study is to determine the frequency of odontogenic cysts and tumors and their distribution according to age, gender, site and histopathologic type among cases reported over the period 1998–2012 in a tertiary health care center in South Kerala. Settings and Design: The archives of the Department of Oral Pathology and Microbiology were retrospectively analyzed. Subjects and Methods: Archival records were reviewed and all the cases of odontogenic cysts and tumors were retrieved from 1998 to 2012. Statistical Analysis Used: Descriptive statistical analysis was performed using the Statistical Package for the Social Sciences (IBM SPSS Statistics, version 16). Results: Of 7117 oral biopsies, 4.29% were odontogenic tumors. Ameloblastoma was the most common odontogenic tumor, comprising 50.2% of cases, followed by keratocystic odontogenic tumor (24.3%). These tumors showed a male predilection (1.19:1). Odontogenic tumors occurred at a mean age of 33.7 ± 16.8 years. The mandible was the most commonly affected jaw (76.07%). Odontogenic cysts constituted 12.25% of all oral biopsies. Radicular cyst comprised 75.11% of odontogenic cysts, followed by dentigerous cyst (17.2%). Conclusions: This study showed similar as well as contradictory results compared to other studies, probably due to geographical and ethnic variations, which are yet to be corroborated. PMID:27601809

  9. A time series analysis performed on a 25-year period of kidney transplantation activity in a single center.

    PubMed

    Santori, G; Fontana, I; Bertocchi, M; Gasloli, G; Valente, U

    2010-05-01

    Following the example of many Western countries, where a "minimum volume rule" policy has been adopted as a quality parameter for complex surgical procedures, the Italian National Transplant Centre set the minimum number of kidney transplantation procedures per year at 30 per center. The number of procedures performed in a single center over a long period may be treated as a time series to evaluate trends, seasonal cycles, and nonsystematic fluctuations. Between January 1, 1983, and December 31, 2007, we performed 1376 procedures in adult or pediatric recipients from living or cadaveric donors. The greatest numbers of cases per year were performed in 1998 (n = 86), followed by 2004 (n = 82), 1996 (n = 75), and 2003 (n = 73). A time series analysis performed using the R statistical software (R Foundation for Statistical Computing, Vienna, Austria), a free software environment for statistical computing and graphics, showed an overall incremental trend after exponential smoothing as well as after seasonal decomposition. However, starting from 2005, we observed a decreasing trend in the series. The number of kidney transplants expected to be performed in 2008, obtained by applying Holt-Winters exponential smoothing to the period 1983 to 2007, was 58 procedures, while in that year there were 52. The time series approach may be helpful for establishing a minimum volume per year at the single-center level. Copyright (c) 2010 Elsevier Inc. All rights reserved.
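
    The same kind of Holt-Winters forecast can be sketched in Python (the authors worked in R); the yearly counts below are hypothetical.

```python
# Sketch of the same kind of forecast in Python (the authors used R): Holt-Winters
# exponential smoothing fitted to hypothetical yearly transplant counts, then used
# to forecast the next year's expected number of procedures.
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical yearly transplant counts (one value per year)
counts = pd.Series([38, 41, 45, 52, 49, 57, 61, 75, 70, 66, 73, 82, 68, 60, 55])

# Additive trend, no seasonal component (a single observation per year)
fit = ExponentialSmoothing(counts, trend="add", seasonal=None).fit()
next_year = float(np.asarray(fit.forecast(1))[0])
print(f"Expected procedures next year: {next_year:.0f}")
```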

  10. Sampling and sensitivity analyses tools (SaSAT) for computational modelling

    PubMed Central

    Hoare, Alexander; Regan, David G; Wilson, David P

    2008-01-01

    SaSAT (Sampling and Sensitivity Analysis Tools) is a user-friendly software package for applying uncertainty and sensitivity analyses to mathematical and computational models of arbitrary complexity and context. The toolbox is built in Matlab®, a numerical mathematical software package, and utilises algorithms contained in the Matlab® Statistics Toolbox. However, Matlab® is not required to use SaSAT as the software package is provided as an executable file with all the necessary supplementary files. The SaSAT package is also designed to work seamlessly with Microsoft Excel but no functionality is forfeited if that software is not available. A comprehensive suite of tools is provided to enable the following tasks to be easily performed: efficient and equitable sampling of parameter space by various methodologies; calculation of correlation coefficients; regression analysis; factor prioritisation; and graphical output of results, including response surfaces, tornado plots, and scatterplots. Use of SaSAT is exemplified by application to a simple epidemic model. To our knowledge, a number of the methods available in SaSAT for performing sensitivity analyses have not previously been used in epidemiological modelling and their usefulness in this context is demonstrated. PMID:18304361
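
    A minimal Python analogue of this workflow (SaSAT itself is a Matlab tool): Latin hypercube sampling of uncertain parameters, evaluation of a toy model output, and ranking of parameter influence by Spearman rank correlation.

```python
# Minimal Python analogue of the SaSAT workflow (SaSAT itself is a Matlab tool):
# Latin hypercube sampling of uncertain parameters, running a toy model, and
# ranking parameter influence by Spearman correlation with the output.
import numpy as np
from scipy.stats import qmc, spearmanr

sampler = qmc.LatinHypercube(d=3, seed=7)
unit = sampler.random(n=500)

# Scale samples to parameter ranges: transmission rate, recovery rate, contact rate
lower, upper = [0.1, 0.05, 2.0], [0.5, 0.30, 10.0]
params = qmc.scale(unit, lower, upper)
beta, gamma, contacts = params.T

# Toy model output: basic reproduction number R0
r0 = beta * contacts / gamma

for name, values in zip(["beta", "gamma", "contacts"], [beta, gamma, contacts]):
    rho, _ = spearmanr(values, r0)
    print(f"{name:9s} Spearman rho = {rho:+.2f}")
```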

  11. Gencrypt: one-way cryptographic hashes to detect overlapping individuals across samples

    PubMed Central

    Turchin, Michael C.; Hirschhorn, Joel N.

    2012-01-01

    Summary: Meta-analysis across genome-wide association studies is a common approach for discovering genetic associations. However, in some meta-analysis efforts, individual-level data cannot be broadly shared by study investigators due to privacy and Institutional Review Board concerns. In such cases, researchers cannot confirm that each study represents a unique group of people, leading to potentially inflated test statistics and false positives. To resolve this problem, we created a software tool, Gencrypt, which utilizes a security protocol known as one-way cryptographic hashes to allow overlapping participants to be identified without sharing individual-level data. Availability: Gencrypt is freely available under the GNU general public license v3 at http://www.broadinstitute.org/software/gencrypt/ Contact: joelh@broadinstitute.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22302573
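
    The one-way hash idea can be illustrated as follows (this is not Gencrypt's actual protocol): each participant's genotype calls at a fixed SNP panel are hashed, and only the digests are compared between studies.

```python
# Illustration of the one-way hash idea only (not Gencrypt's actual protocol):
# each participant's genotype calls at a fixed SNP panel are hashed with SHA-256,
# and studies compare digests, never raw genotypes, to spot likely overlaps.
import hashlib

def genotype_digest(genotypes):
    """One-way digest of an ordered tuple of genotype calls, e.g. ('AA', 'AG', ...)."""
    return hashlib.sha256("|".join(genotypes).encode()).hexdigest()

study_a = {"s1": ("AA", "AG", "GG", "CT"), "s2": ("AG", "AG", "GG", "CC")}
study_b = {"p9": ("AA", "AG", "GG", "CT"), "p7": ("AA", "AA", "GT", "CC")}

digests_a = {genotype_digest(g): pid for pid, g in study_a.items()}
digests_b = {genotype_digest(g): pid for pid, g in study_b.items()}

overlap = set(digests_a) & set(digests_b)
print(f"{len(overlap)} likely overlapping participant(s)")
```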

  12. ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines

    PubMed Central

    Wolfinger, Michael T.; Fallmann, Jörg; Eggenhofer, Florian; Amman, Fabian

    2015-01-01

    Recent achievements in next-generation sequencing (NGS) technologies lead to a high demand for reusable software components to easily compile customized analysis workflows for big genomics data. We present ViennaNGS, an integrated collection of Perl modules focused on building efficient pipelines for NGS data processing. It comes with functionality for extracting and converting features from common NGS file formats, computation and evaluation of read mapping statistics, as well as normalization of RNA abundance. Moreover, ViennaNGS provides software components for identification and characterization of splice junctions from RNA-seq data, parsing and condensing sequence motif data, automated construction of Assembly and Track Hubs for the UCSC genome browser, as well as wrapper routines for a set of commonly used NGS command line tools. PMID:26236465

  13. Integrality in cervical cancer care: evaluation of access

    PubMed Central

    Brito-Silva, Keila; Bezerra, Adriana Falangola Benjamin; Chaves, Lucieli Dias Pedreschi; Tanaka, Oswaldo Yoshimi

    2014-01-01

    OBJECTIVE To evaluate the integrality of access to uterine cervical cancer prevention, diagnosis and treatment services. METHODS The tracer condition was analyzed using a mixed quantitative and qualitative approach. The quantitative approach was based on secondary data from the analysis of cytology and biopsy exams performed between 2008 and 2010 on 25 to 59 year-old women in a municipality with a large population and with the necessary technological resources. Data were obtained from the Health Information System and the Regional Cervical Cancer Information System. Statistical analysis was performed using PASW Statistics 17.0 software. The qualitative approach involved semi-structured interviews with service managers, health care professionals and users. NVivo 9.0 software was used for the content analysis of the primary data. RESULTS Pap smear coverage was low, possibly due to insufficient screening and the difficulty of making appointments in primary care. The numbers of biopsies conducted are similar to those of abnormal cytologies, reflecting easy access to the specialized services. There was higher coverage among younger women. More serious diagnoses, for both cytologies and biopsies, were more prevalent in older women. CONCLUSIONS The insufficient coverage of cytologies reported by the interviewees reveals access difficulties in primary care, as well as the fragility of the screening strategies. PMID:24897045

  14. TASI: A software tool for spatial-temporal quantification of tumor spheroid dynamics.

    PubMed

    Hou, Yue; Konen, Jessica; Brat, Daniel J; Marcus, Adam I; Cooper, Lee A D

    2018-05-08

    Spheroid cultures derived from explanted cancer specimens are an increasingly utilized resource for studying complex biological processes like tumor cell invasion and metastasis, representing an important bridge between the simplicity and practicality of 2-dimensional monolayer cultures and the complexity and realism of in vivo animal models. Temporal imaging of spheroids can capture the dynamics of cell behaviors and microenvironments, and when combined with quantitative image analysis methods, enables deep interrogation of biological mechanisms. This paper presents a comprehensive open-source software framework for Temporal Analysis of Spheroid Imaging (TASI) that allows investigators to objectively characterize spheroid growth and invasion dynamics. TASI performs spatiotemporal segmentation of spheroid cultures, extraction of features describing spheroid morpho-phenotypes, mathematical modeling of spheroid dynamics, and statistical comparisons of experimental conditions. We demonstrate the utility of this tool in an analysis of non-small cell lung cancer spheroids that exhibit variability in metastatic and proliferative behaviors.

  15. Data management in large-scale collaborative toxicity studies: how to file experimental data for automated statistical analysis.

    PubMed

    Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette

    2013-06-01

    High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.
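
    A minimal sketch of the kind of standardisation step the paper advocates, assuming hypothetical file locations and column names: per-compound Excel files are pooled into one long-format table that a single analysis script can consume.

    ```python
    from pathlib import Path
    import pandas as pd

    REQUIRED = ["compound", "concentration", "response"]   # hypothetical standard columns

    frames = []
    for xlsx in sorted(Path("raw_data").glob("*.xlsx")):    # hypothetical data directory
        df = pd.read_excel(xlsx)                            # one sheet per experiment (assumed)
        missing = [c for c in REQUIRED if c not in df.columns]
        if missing:
            raise ValueError(f"{xlsx.name}: missing columns {missing}")
        df["source_file"] = xlsx.name                       # keep provenance for auditing
        frames.append(df[REQUIRED + ["source_file"]])

    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv("standardised_concentration_response.csv", index=False)
    ```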

  16. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data.

    PubMed

    Peterson, Elena S; McCue, Lee Ann; Schrimpe-Rutledge, Alexandra C; Jensen, Jeffrey L; Walker, Hyunjoo; Kobold, Markus A; Webb, Samantha R; Payne, Samuel H; Ansong, Charles; Adkins, Joshua N; Cannon, William R; Webb-Robertson, Bobbie-Jo M

    2012-04-05

    The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data are interrogated via searches linked to the genome visualizations to find regions with a high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to illustrate the rapid manner in which mis-annotations can be found and explored using either proteomics data alone or in combination with transcriptomic data. VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data are evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and the core programming requirements for visualization of these large heterogeneous datasets in a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php.

  17. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    PubMed Central

    2012-01-01

    Background The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data are interrogated via searches linked to the genome visualizations to find regions with a high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to illustrate the rapid manner in which mis-annotations can be found and explored using either proteomics data alone or in combination with transcriptomic data. Conclusions VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data are evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and the core programming requirements for visualization of these large heterogeneous datasets in a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php. PMID:22480257

  18. Statistics and Informatics in Space Astrophysics

    NASA Astrophysics Data System (ADS)

    Feigelson, E.

    2017-12-01

    The interest in statistical and computational methodology has seen rapid growth in space-based astrophysics, parallel to the growth seen in Earth remote sensing. There is widespread agreement that scientific interpretation of the cosmic microwave background, discovery of exoplanets, and classification of multiwavelength surveys are too complex to be accomplished with traditional techniques. NASA operates several well-functioning Science Archive Research Centers providing 0.5 PBy datasets to the research community. These databases are integrated with full-text journal articles in the NASA Astrophysics Data System (200K pageviews/day). Data products use interoperable formats and protocols established by the International Virtual Observatory Alliance. NASA supercomputers also support complex astrophysical models of systems such as accretion disks and planet formation. Academic researcher interest in methodology has grown significantly in areas such as Bayesian inference and machine learning, and statistical research is underway to treat problems such as irregularly spaced time series and astrophysical model uncertainties. Several scholarly societies have created interest groups in astrostatistics and astroinformatics. Improvements are needed on several fronts. Community education in advanced methodology is not sufficiently rapid to meet the research needs. Statistical procedures within NASA science analysis software are sometimes not optimal, and pipeline development may not use modern software engineering techniques. NASA offers few grant opportunities supporting research in astroinformatics and astrostatistics.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    P-Mart was designed specifically to allow cancer researchers to perform robust statistical processing of publicly available cancer proteomic datasets. To date, no other online statistical processing suite for proteomics exists. The P-Mart software allows statistical programmers to use these algorithms through packages in the R programming language, and it also offers a web-based interface built on Azure cloud technology. The Azure cloud technology also allows the software to be released via Docker containers.

  20. Anima: Modular Workflow System for Comprehensive Image Data Analysis

    PubMed Central

    Rantanen, Ville; Valori, Miko; Hautaniemi, Sampsa

    2014-01-01

    Modern microscopes produce vast amounts of image data, and computational methods are needed to analyze and interpret these data. Furthermore, a single image analysis project may require tens or hundreds of analysis steps, starting from data import and pre-processing, through segmentation and statistical analysis, and ending with visualization and reporting. To manage such large-scale image data analysis projects, we present here a modular workflow system called Anima. Anima is designed for comprehensive and efficient image data analysis development, and it contains several features that are crucial in high-throughput image data analysis: programming language independence, batch processing, easily customized data processing, interoperability with other software via application programming interfaces, and advanced multivariate statistical analysis. The utility of Anima is shown with two case studies focusing on testing different algorithms developed in different imaging platforms and an automated prediction of alive/dead C. elegans worms by integrating several analysis environments. Anima is fully open source and available with documentation at www.anduril.org/anima. PMID:25126541

  1. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

    PubMed Central

    Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.

    2014-01-01

    Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ2 association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
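
    The core idea of Gaussian imputation from summary statistics can be sketched as a conditional-mean calculation under an LD (correlation) matrix; the snippet below is a generic illustration, not the authors' released software (which also adjusts for the finite reference panel), and the toy LD values are hypothetical.

    ```python
    import numpy as np

    def impute_z(z_typed, sigma_tt, sigma_ut, ridge=0.1):
        """z_typed: observed z-scores at typed SNPs; sigma_tt: LD among typed SNPs;
        sigma_ut: LD between untyped and typed SNPs; ridge: small regulariser."""
        tt = sigma_tt + ridge * np.eye(len(z_typed))      # stabilise the inverse
        weights = sigma_ut @ np.linalg.inv(tt)            # conditional-mean weights
        z_imputed = weights @ z_typed
        # rough imputation-quality measure, useful for filtering poorly imputed SNPs
        info = np.einsum("ij,ij->i", weights, sigma_ut)
        return z_imputed, info

    # toy example: one untyped SNP in LD with two typed SNPs (values are hypothetical)
    sigma_tt = np.array([[1.0, 0.3],
                         [0.3, 1.0]])
    sigma_ut = np.array([[0.8, 0.4]])
    z_typed = np.array([4.5, 2.0])
    print(impute_z(z_typed, sigma_tt, sigma_ut))
    ```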

  2. Symposium on the Interface: Computing Science and Statistics (20th). Theme: Computationally Intensive Methods in Statistics Held in Reston, Virginia on April 20-23, 1988

    DTIC Science & Technology

    1988-08-20

    34 William A. Link, Patuxent Wildlife Research Center; "Increasing Reliability of Multiversion Fault-Tolerant Software Design by Modularization," Junryo Miyashita, Department of Computer Science, California State University at San Bernardino. ... They shall be referred to as "multiversion fault-tolerant software design". One problem of developing multiple versions of a program is the high cost

  3. The software and algorithms for hyperspectral data processing

    NASA Astrophysics Data System (ADS)

    Shyrayeva, Anhelina; Martinov, Anton; Ivanov, Victor; Katkovsky, Leonid

    2017-04-01

    Hyperspectral remote sensing techniques are widely used for collecting and processing information about the Earth's surface objects. Hyperspectral data are combined to form a three-dimensional (x, y, λ) data cube. The Department of Aerospace Research of the Institute of Applied Physical Problems of the Belarusian State University presents a general model of the software for hyperspectral image data analysis and processing. The software runs in Windows XP/7/8/8.1/10 environments on any personal computer. The complex has been written in C++ using the Qt framework and OpenGL for graphical data visualization. The software has a flexible structure that consists of a set of independent plugins. Each plugin is compiled as a Qt plugin and represents a Windows dynamic-link library (DLL). Plugins can be categorized in terms of data reading types, data visualization (3D, 2D, 1D) and data processing. The software has various built-in functions for statistical and mathematical analysis and signal processing, such as direct smoothing with a moving average, Savitzky-Golay smoothing, RGB correction, histogram transformation, and atmospheric correction. The software provides two of the authors' engineering techniques for the solution of the atmospheric correction problem: an iterative method for refining spectral albedo parameters using Libradtran, and an analytical least-squares method. The main advantages of these methods are a high processing rate (several minutes for 1 GB of data) and a low relative error in albedo retrieval (less than 15%). The software also supports work with spectral libraries, region of interest (ROI) selection, and spectral analysis such as cluster-type image classification and automatic comparison of hypercube spectra, by a similarity criterion, with spectra from spectral libraries, and vice versa. The software deals with different kinds of spectral information in order to identify and distinguish spectrally unique materials. Further advantages include fast and low-memory hypercube manipulation, a user-friendly interface, modularity, and expandability.
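
    Two of the spectral pre-processing functions mentioned above (moving-average and Savitzky-Golay smoothing) can be sketched briefly; the snippet below is a Python illustration on a synthetic hypercube, not the C++/Qt software itself.

    ```python
    import numpy as np
    from scipy.signal import savgol_filter

    rng = np.random.default_rng(1)
    cube = rng.normal(loc=1.0, scale=0.05, size=(64, 64, 120))   # toy (x, y, lambda) hypercube

    # moving average over 5 spectral bands
    kernel = np.ones(5) / 5
    moving_avg = np.apply_along_axis(lambda s: np.convolve(s, kernel, mode="same"),
                                     axis=2, arr=cube)

    # Savitzky-Golay smoothing (window of 11 bands, cubic polynomial) along the spectral axis
    sg_smoothed = savgol_filter(cube, window_length=11, polyorder=3, axis=2)

    print(moving_avg.shape, sg_smoothed.shape)   # both (64, 64, 120)
    ```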

  4. Characterization of Morphology using MAMA Software

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gravelle, Julie

    The MAMA (Morphological Analysis for Material Attribution) software was developed at Los Alamos National Laboratory, funded through the National Technical Nuclear Forensics Center in the Department of Homeland Security. The software allows images to be analysed and quantified. The largest project I worked on was to quantify images of plutonium oxides and ammonium diuranates prepared by the group with the software and provide analyses on the particles of each sample. Images were quantified through MAMA, with a color analysis, a lexicon description and powder X-ray diffraction. Through this we were able to visually see a difference between some of the syntheses. An additional project was to revise the manual for MAMA to help streamline training and provide useful tips so that users can more quickly become acclimated to using the software. The third project investigated expanding the scope of MAMA and finding a statistically relevant baseline for the particulates through the analysis of maps in the software, using known measurements to compare the error associated with the software. During this internship, I worked on several different projects dealing with the MAMA software. The revision of the user manual for the MAMA software was the first project I was able to work and collaborate on. I first learned how to use the software by getting instruction from a skilled user at the laboratory, Dan Schwartz, and by using the existing user manual and examples. After becoming accustomed to the program, I started to go over the manual to correct and change items that were not as useful or descriptive as they could have been. I also added in tips that I learned as I explored the software. The updated manual was also worked on by several others who have been developing the program. The goal of these revisions was to ensure that the most concise and simple directions to the software were available to future users. By incorporating tricks and shortcuts that I discovered and picked up from watching other users into the user guide, I believe that anyone who utilizes the software will be able to quickly understand the best way to analyze their image and use the tools the program offers to achieve useful results.

  5. Ensuring Positiveness of the Scaled Difference Chi-square Test Statistic.

    PubMed

    Satorra, Albert; Bentler, Peter M

    2010-06-01

    A scaled difference test statistic [Formula: see text] that can be computed from standard software of structural equation models (SEM) by hand calculations was proposed in Satorra and Bentler (2001). The statistic [Formula: see text] is asymptotically equivalent to the scaled difference test statistic T̄(d) introduced in Satorra (2000), which requires more involved computations beyond standard output of SEM software. The test statistic [Formula: see text] has been widely used in practice, but in some applications it is negative due to negativity of its associated scaling correction. Using the implicit function theorem, this note develops an improved scaling correction leading to a new scaled difference statistic T̄(d) that avoids negative chi-square values.
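
    As commonly described for the 2001 hand calculation (a sketch for orientation, not a restatement of this note's new correction): with nested models having normal-theory statistics T_0 and T_1, degrees of freedom d_0 > d_1, and scaling corrections c_0 and c_1,

    ```latex
    % Scaled difference test statistic (Satorra & Bentler, 2001), as commonly written:
    \bar{T}_d = \frac{T_0 - T_1}{\hat{c}_d},
    \qquad
    \hat{c}_d = \frac{d_0\,\hat{c}_0 - d_1\,\hat{c}_1}{d_0 - d_1},
    \qquad
    \bar{T}_d \;\dot{\sim}\; \chi^2_{\,d_0 - d_1}.
    % Negative values can arise when d_0 \hat{c}_0 < d_1 \hat{c}_1, i.e. when \hat{c}_d < 0;
    % the improved correction in this note is constructed to avoid this.
    ```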

  6. PRANAS: A New Platform for Retinal Analysis and Simulation.

    PubMed

    Cessac, Bruno; Kornprobst, Pierre; Kraria, Selim; Nasser, Hassan; Pamplona, Daniela; Portelli, Geoffrey; Viéville, Thierry

    2017-01-01

    The retina encodes visual scenes by trains of action potentials that are sent to the brain via the optic nerve. In this paper, we describe new, freely accessible end-user software that helps to better understand this coding. It is called PRANAS (https://pranas.inria.fr), standing for Platform for Retinal ANalysis And Simulation. PRANAS targets neuroscientists and modelers by providing a unique set of retina-related tools. PRANAS integrates a retina simulator allowing large-scale simulations while keeping a strong biological plausibility, and a toolbox for the analysis of spike train population statistics. The statistical method (entropy maximization under constraints) takes into account both spatial and temporal correlations as constraints, allowing analysis of the effects of memory on statistics. PRANAS also integrates a tool for computing and representing receptive fields in 3D (time-space). All these tools are accessible through a friendly graphical user interface. The most CPU-costly of them have been implemented to run in parallel.

  7. From fields to objects: A review of geographic boundary analysis

    NASA Astrophysics Data System (ADS)

    Jacquez, G. M.; Maruca, S.; Fortin, M.-J.

    Geographic boundary analysis is a relatively new approach unfamiliar to many spatial analysts. It is best viewed as a technique for defining objects - geographic boundaries - on spatial fields, and for evaluating the statistical significance of characteristics of those boundary objects. This is accomplished using null spatial models representative of the spatial processes expected in the absence of boundary-generating phenomena. Close ties to the object-field dialectic eminently suit boundary analysis to GIS data. The majority of existing spatial methods are field-based in that they describe, estimate, or predict how attributes (variables defining the field) vary through geographic space. Such methods are appropriate for field representations but not object representations. As the object-field paradigm gains currency in geographic information science, appropriate techniques for the statistical analysis of objects are required. The methods reviewed in this paper are a promising foundation. Geographic boundary analysis is clearly a valuable addition to the spatial statistical toolbox. This paper presents the philosophy of, and motivations for geographic boundary analysis. It defines commonly used statistics for quantifying boundaries and their characteristics, as well as simulation procedures for evaluating their significance. We review applications of these techniques, with the objective of making this promising approach accessible to the GIS-spatial analysis community. We also describe the implementation of these methods within geographic boundary analysis software: GEM.

  8. Sensory redundancy management: The development of a design methodology for determining threshold values through a statistical analysis of sensor output data

    NASA Technical Reports Server (NTRS)

    Scalzo, F.

    1983-01-01

    Sensor redundancy management (SRM) requires a system that will detect failures and reconfigure the avionics accordingly. A probability density function for determining false-alarm rates was generated using an algorithmic approach. Microcomputer software was developed that prints out tables of values for the cumulative probability of being in the domain of failure; system reliability; and the false-alarm probability, given that a signal is in the domain of failure. The microcomputer software was applied to the sensor output data for various AFTI F-16 flights and sensor parameters. Practical recommendations for further research were made.
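
    As a small illustration of the kind of tabulation described (not the original microcomputer software), the sketch below assumes a healthy sensor whose output error is zero-mean Gaussian and tabulates the probability of falling in the failure domain, i.e. the false-alarm probability, for several thresholds.

    ```python
    from math import erf, sqrt

    def prob_in_failure_domain(threshold, sigma):
        """P(|error| > threshold) for error ~ N(0, sigma^2), i.e. the false-alarm probability."""
        return 1.0 - erf(threshold / (sigma * sqrt(2.0)))

    sigma = 1.0   # hypothetical sensor noise standard deviation
    print(f"{'threshold':>10} {'P(false alarm)':>16}")
    for k in (1.0, 2.0, 3.0, 4.0):
        print(f"{k * sigma:>10.1f} {prob_in_failure_domain(k * sigma, sigma):>16.4e}")
    ```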

  9. QUEST/Ada (Query Utility Environment for Software Testing) of Ada: The development of a program analysis environment for Ada

    NASA Technical Reports Server (NTRS)

    Brown, David B.

    1988-01-01

    A history of the Query Utility Environment for Software Testing (QUEST)/Ada is presented. A fairly comprehensive literature review which is targeted toward issues of Ada testing is given. The definition of the system structure and the high level interfaces are then presented. The design of the three major components is described. The QUEST/Ada IORL System Specifications to this point in time are included in the Appendix. A paper is also included in the appendix which gives statistical evidence of the validity of the test case generation approach which is being integrated into QUEST/Ada.

  10. Teaching Probability with the Support of the R Statistical Software

    ERIC Educational Resources Information Center

    dos Santos Ferreira, Robson; Kataoka, Verônica Yumi; Karrer, Monica

    2014-01-01

    The objective of this paper is to discuss aspects of high school students' learning of probability in a context where they are supported by the statistical software R. We report on the application of a teaching experiment, constructed using the perspective of Gal's probabilistic literacy and Papert's constructionism. The results show improvement…

  11. Using Information Technology in Teaching of Business Statistics in Nigeria Business School

    ERIC Educational Resources Information Center

    Hamadu, Dallah; Adeleke, Ismaila; Ehie, Ike

    2011-01-01

    This paper discusses the use of Microsoft Excel software in the teaching of statistics in the Faculty of Business Administration at the University of Lagos, Nigeria. Problems associated with existing traditional methods are identified and a novel pedagogy using Excel is proposed. The advantages of using this software over other specialized…

  12. Virtual tool mark generation for efficient striation analysis in forensic science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ekstrand, Laura

    In 2009, a National Academy of Sciences report called for investigation into the scientific basis behind tool mark comparisons (National Academy of Sciences, 2009). Answering this call, Chumbley et al. (2010) attempted to prove or disprove the hypothesis that tool marks are unique to a single tool. They developed a statistical algorithm that could, in most cases, discern matching and non-matching tool marks made at different angles by sequentially numbered screwdriver tips. Moreover, in the cases where the algorithm misinterpreted a pair of marks, an experienced forensics examiner could discern the correct outcome. While this research served to confirm the basic assumptions behind tool mark analysis, it also suggested that statistical analysis software could help to reduce the examiner's workload. This led to a new tool mark analysis approach, introduced in this thesis, that relies on 3D scans of screwdriver tip and marked plate surfaces at the micrometer scale from an optical microscope. These scans are carefully cleaned to remove noise from the data acquisition process and assigned a coordinate system that mathematically defines angles and twists in a natural way. The marking process is then simulated by using a 3D graphics software package to impart rotations to the tip and take the projection of the tip's geometry in the direction of tool travel. The edge of this projection, retrieved from the 3D graphics software, becomes a virtual tool mark. Using this method, virtual marks are made at increments of 5 degrees and compared to a scan of the evidence mark. The previously developed statistical package from Chumbley et al. (2010) performs the comparison, comparing the similarity of the geometry of both marks to the similarity that would occur due to random chance. The resulting statistical measure of the likelihood of the match informs the examiner of the angle of the best matching virtual mark, allowing the examiner to focus his/her mark analysis on a smaller range of angles. Preliminary results are quite promising. In a study with both sides of 6 screwdriver tips and 34 corresponding marks, the method distinguished known matches from known non-matches with zero false positive matches and only two matches mistaken for non-matches. For matches, it could predict the correct marking angle within 5-10 degrees. Moreover, on a standard desktop computer, the virtual marking software is capable of cleaning 3D tip and plate scans in minutes and producing a virtual mark and comparing it to a real mark in seconds. These results support several of the professional conclusions of the tool mark analysis community, including the idea that marks produced by the same tool only match if they are made at similar angles. The method also displays the potential to automate part of the comparison process, freeing the examiner to focus on other tasks, which is important in busy, backlogged crime labs. Finally, the method offers the unique chance to directly link an evidence mark to the tool that produced it while reducing potential damage to the evidence.

  13. Federal Communications Commission (FCC) Transponder Loading Data Conversion Software. User's guide and software maintenance manual, version 1.2

    NASA Technical Reports Server (NTRS)

    Mallasch, Paul G.

    1993-01-01

    This volume contains the complete software system documentation for the Federal Communications Commission (FCC) Transponder Loading Data Conversion Software (FIX-FCC). This software was written to facilitate the formatting and conversion of FCC Transponder Occupancy (Loading) Data before it is loaded into the NASA Geosynchronous Satellite Orbital Statistics Database System (GSOSTATS). The information that FCC supplies NASA is in report form and must be converted into a form readable by the database management software used in the GSOSTATS application. Both the User's Guide and Software Maintenance Manual are contained in this document. This volume of documentation passed an independent quality assurance review and certification by the Product Assurance and Security Office of the Planning Research Corporation (PRC). The manuals were reviewed for format, content, and readability. The Software Management and Assurance Program (SMAP) life cycle and documentation standards were used in the development of this document. Accordingly, these standards were used in the review. Refer to the System/Software Test/Product Assurance Report for the Geosynchronous Satellite Orbital Statistics Database System (GSOSTATS) for additional information.

  14. Utilization of Solar Dynamics Observatory space weather digital image data for comparative analysis with application to Baryon Oscillation Spectroscopic Survey

    NASA Astrophysics Data System (ADS)

    Shekoyan, V.; Dehipawala, S.; Liu, Ernest; Tulsee, Vivek; Armendariz, R.; Tremberger, G.; Holden, T.; Marchese, P.; Cheung, T.

    2012-10-01

    Digital solar image data are available to users with access to standard, mass-market software. Many scientific projects utilize the Flexible Image Transport System (FITS) format, which requires specialized software typically used in astrophysical research. Data in the FITS format include photometric and spatial calibration information, which may not be useful to researchers working with self-calibrated, comparative approaches. This project examines the advantages of using mass-market software with readily downloadable image data from the Solar Dynamics Observatory for comparative analysis over the use of specialized software capable of reading data in the FITS format. Comparative analyses of brightness statistics that describe the solar disk in the study of magnetic energy, using algorithms included in mass-market software, have been shown to give results similar to analyses using FITS data. The entanglement of magnetic energy associated with solar eruptions, as well as the development of such eruptions, has been characterized successfully using mass-market software. The proposed algorithm would help to establish a publicly accessible computing network that could assist in exploratory studies of all FITS data. The advances in computer, cell phone and tablet technology could readily incorporate such an approach for the enhancement of high school and first-year college space weather education on a global scale. Application to ground-based data, such as that contained in the Baryon Oscillation Spectroscopic Survey, is discussed.
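
    For comparison, the specialized-software route is only a few lines in Python; the sketch below assumes a locally downloaded SDO image in FITS format (the filename is hypothetical) and computes simple whole-disk brightness statistics with astropy.

    ```python
    import numpy as np
    from astropy.io import fits

    data = fits.getdata("aia_171_example.fits").astype(float)   # 2-D image array (hypothetical file)
    disk = data[np.isfinite(data)]                               # drop blank/NaN pixels

    stats = {
        "mean": disk.mean(),
        "median": np.median(disk),
        "std": disk.std(),
        "p99": np.percentile(disk, 99),    # bright active-region tail
    }
    print(stats)
    ```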

  15. Comparison of classical statistical methods and artificial neural network in traffic noise prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nedic, Vladimir, E-mail: vnedic@kg.ac.rs; Despotovic, Danijela, E-mail: ddespotovic@kg.ac.rs; Cvetanovic, Slobodan, E-mail: slobodan.cvetanovic@eknfak.ni.ac.rs

    2014-11-15

    Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the proposed structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level in the given time period, L_eq. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise, using an originally developed, user-friendly software package. It is shown that artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate the equivalent noise level by means of classical methods, and a comparative analysis is given. The results clearly show that the ANN approach is superior in traffic noise level prediction to any other statistical method. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed an originally designed, user-friendly software package. • The results are compared with classical statistical methods. • The results show the much better predictive capabilities of the ANN model.
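
    A hedged sketch of the modelling idea with synthetic data (not the authors' dataset or software): an MLP regressor predicting the equivalent noise level L_eq from traffic-flow composition and average speed, compared against an ordinary linear regression baseline.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(42)
    n = 1000
    cars = rng.uniform(100, 2000, n)      # light vehicles per hour (synthetic)
    heavy = rng.uniform(0, 300, n)        # heavy vehicles per hour (synthetic)
    speed = rng.uniform(30, 90, n)        # average speed, km/h (synthetic)
    # synthetic, mildly non-linear "true" noise level plus measurement noise
    leq = 35 + 10 * np.log10(cars + 8 * heavy) + 0.05 * speed + rng.normal(0, 1, n)

    X = np.column_stack([cars, heavy, speed])
    X_tr, X_te, y_tr, y_te = train_test_split(X, leq, random_state=0)

    lin = LinearRegression().fit(X_tr, y_tr)
    ann = make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000,
                                     random_state=0)).fit(X_tr, y_tr)

    print("linear regression R2:", round(r2_score(y_te, lin.predict(X_te)), 3))
    print("neural network    R2:", round(r2_score(y_te, ann.predict(X_te)), 3))
    ```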

  16. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science).

    PubMed

    Zeng, Irene Sui Lan; Lumley, Thomas

    2018-01-01

    Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than features, and the use of Bayesian approaches when there is prior knowledge to be integrated, are also included in the commentary. For the completeness of the review, a table of currently available software and packages for omics from 23 publications is summarized in the appendix.

  17. Systematic on-site monitoring of compliance dust samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grayson, R.L.; Gandy, J.R.

    1996-12-31

    Maintaining compliance with U.S. respirable coal mine dust standards can be difficult on high-productivity longwall panels. Comprehensive and systematic analysis of compliance dust sample data, coupled with access to the U.S. Bureau of Mines (USBM) DUSTPRO, can yield important information for use in maintaining compliance. The objective of this study was to develop and apply customized software for the collection, storage, modification, and analysis of respirable dust data while providing for flexible export of data and linking with the USBM's expert advisory system on dust control. An executable, IBM-compatible software package was created and customized for use by the person in charge of collecting, submitting, analyzing, and monitoring respirable dust compliance samples. Both descriptive statistics and multiple regression analysis were incorporated. The software allows ASCII files to be exported and links directly with DUSTPRO. After development and validation of the software, longwall compliance data from two different mines were analyzed to evaluate the value of the software. Data included variables on respirable dust concentration, tons produced, the existence of roof/floor rock (dummy variable), and the sampling cycle (dummy variables). Because of confidentiality, specific data will not be presented, only the equations and ANOVA tables. The final regression models explained 83.8% and 61.1% of the variation in the data for the two panels. Important correlations among variables within sampling cycles showed the value of using dummy variables for sampling cycles. The software proved flexible and fast for its intended use. The insights obtained from its use improved the systematic monitoring of respirable dust compliance data, especially for pinpointing the most effective dust control methods during specific sampling cycles.

  18. Plateletpheresis efficiency and mathematical correction of software-derived platelet yield prediction: A linear regression and ROC modeling approach.

    PubMed

    Jaime-Pérez, José Carlos; Jiménez-Castillo, Raúl Alberto; Vázquez-Hernández, Karina Elizabeth; Salazar-Riojas, Rosario; Méndez-Ramírez, Nereida; Gómez-Almaguer, David

    2017-10-01

    Advances in automated cell separators have improved the efficiency of plateletpheresis and the possibility of obtaining double products (DP). We assessed cell processor accuracy of predicted platelet (PLT) yields with the goal of a better prediction of DP collections. This retrospective proof-of-concept study included 302 plateletpheresis procedures performed on a Trima Accel v6.0 at the apheresis unit of a hematology department. Donor variables, software-predicted yield and actual PLT yield were statistically evaluated. Software prediction was optimized by linear regression analysis and its optimal cut-off to obtain a DP assessed by receiver operating characteristic (ROC) curve modeling. Three hundred and two plateletpheresis procedures were performed; on 271 (89.7%) occasions donors were men and on 31 (10.3%) women. Pre-donation PLT count had the best direct correlation with actual PLT yield (r = 0.486, P < .001). Means of software-derived values differed significantly from actual PLT yield, 4.72 × 10^11 vs. 6.12 × 10^11, respectively (P < .001). The following equation was developed to adjust these values: actual PLT yield = 0.221 + (1.254 × theoretical platelet yield). The ROC curve model showed an optimal apheresis device software prediction cut-off of 4.65 × 10^11 to obtain a DP, with a sensitivity of 82.2%, specificity of 93.3%, and an area under the curve (AUC) of 0.909. Trima Accel v6.0 software consistently underestimated PLT yields. A simple correction derived from linear regression analysis accurately corrected this underestimation, and ROC analysis identified a precise cut-off to reliably predict a DP. © 2016 Wiley Periodicals, Inc.
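
    The reported correction and cut-off are simple enough to apply directly; the sketch below restates the published regression equation and the ROC-derived threshold of 4.65 × 10^11 as a small helper (the example yields are hypothetical).

    ```python
    def corrected_yield(software_predicted_yield_e11):
        """Apply the published correction; input and output are platelet yields in units of 10^11."""
        return 0.221 + 1.254 * software_predicted_yield_e11

    DP_CUTOFF_E11 = 4.65   # software-predicted yield above which a double product is expected

    for predicted in (3.0, 4.5, 4.7, 5.5):           # hypothetical software predictions
        corrected = corrected_yield(predicted)
        dp = "yes" if predicted >= DP_CUTOFF_E11 else "no"
        print(f"software {predicted:.2f}e11 -> corrected {corrected:.2f}e11, double product expected: {dp}")
    ```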

  19. Illinois Trauma Centers and Intimate Partner Violence: Are We Doing Our Share?

    ERIC Educational Resources Information Center

    Crandall, Marie; Schwab, Jennifer; Sheehan, Karen; Esposito, Thomas

    2009-01-01

    Intimate partner violence (IPV) is a major source of morbidity and mortality nationally. Trauma Centers can be very helpful for victims of IPV but there may be variability in IPV resource provision. A survey was mailed to each of the 65 Trauma Centers in Illinois. Stata and EZ-Text statistical software were used for analysis. Eighty-three percent…

  20. Exact and Monte carlo resampling procedures for the Wilcoxon-Mann-Whitney and Kruskal-Wallis tests.

    PubMed

    Berry, K J; Mielke, P W

    2000-12-01

    Exact and Monte Carlo resampling FORTRAN programs are described for the Wilcoxon-Mann-Whitney rank sum test and the Kruskal-Wallis one-way analysis of variance for ranks test. The program algorithms compensate for tied values and do not depend on asymptotic approximations for probability values, unlike most algorithms contained in PC-based statistical software packages.
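
    A Monte Carlo resampling version of the rank-sum test can be sketched in a few lines of Python rather than the original FORTRAN; as in the programs described, mid-ranks handle ties and the p-value comes from random permutations rather than an asymptotic approximation. The data below are hypothetical.

    ```python
    import numpy as np
    from scipy.stats import rankdata

    def mc_rank_sum_test(x, y, n_resamples=100_000, seed=0):
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([x, y])
        ranks = rankdata(pooled)                 # average (mid) ranks for tied values
        n_x = len(x)
        observed = ranks[:n_x].sum()
        expected = n_x * (len(pooled) + 1) / 2.0
        count = 0
        for _ in range(n_resamples):
            perm = rng.permutation(ranks)
            if abs(perm[:n_x].sum() - expected) >= abs(observed - expected):
                count += 1
        return observed, count / n_resamples     # rank sum and two-sided Monte Carlo p-value

    x = np.array([12.1, 14.3, 13.8, 15.0, 12.1])
    y = np.array([11.0, 12.1, 10.5, 11.9, 12.7, 11.4])
    print(mc_rank_sum_test(x, y, n_resamples=20_000))
    ```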

  1. Improving phylogenetic analyses by incorporating additional information from genetic sequence databases.

    PubMed

    Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A

    2009-10-01

    Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.

  2. IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics

    PubMed Central

    2016-01-01

    Background We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes. Objective To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs. Methods The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets. Results Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix. Conclusions IBMWA is a new alternative for data analytics software that automates descriptive, predictive, and visual analytics. This program is very user-friendly but requires data preprocessing, statistical conceptual understanding, and domain expertise. PMID:27729304

  3. IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics.

    PubMed

    Hoyt, Robert Eugene; Snider, Dallas; Thompson, Carla; Mantravadi, Sarita

    2016-10-11

    We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes. To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs. The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets. Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix. IBMWA is a new alternative for data analytics software that automates descriptive, predictive, and visual analytics. This program is very user-friendly but requires data preprocessing, statistical conceptual understanding, and domain expertise.

  4. Dynamic Weather Routes Architecture Overview

    NASA Technical Reports Server (NTRS)

    Eslami, Hassan; Eshow, Michelle

    2014-01-01

    Dynamic Weather Routes Architecture Overview presents the high-level software architecture of DWR, based on the CTAS software framework and the Direct-To automation tool. The document also covers external and internal data flows, the required datasets, changes to the Direct-To software for DWR, collection of software statistics, and the code structure.

  5. Automated system for the on-line monitoring of powder blending processes using near-infrared spectroscopy. Part I. System development and control.

    PubMed

    Hailey, P A; Doherty, P; Tapsell, P; Oliver, T; Aldridge, P K

    1996-03-01

    An automated system for the on-line monitoring of powder blending processes is described. The system employs near-infrared (NIR) spectroscopy using fibre-optics and a graphical user interface (GUI) developed in the LabVIEW environment. The complete supervisory control and data analysis (SCADA) software controls blender and spectrophotometer operation and performs statistical spectral data analysis in real time. A data analysis routine using standard deviation is described to demonstrate an approach to the real-time determination of blend homogeneity.
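
    The real-time homogeneity criterion described above can be sketched conceptually (synthetic spectra, not the original LabVIEW/SCADA system): the between-spectrum standard deviation over a moving block of consecutive NIR spectra decreases and then plateaus as the blend approaches homogeneity.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n_spectra, n_wavelengths, block = 60, 200, 5

    # synthetic blending run: spectrum-to-spectrum variability decays over time
    base = rng.random(n_wavelengths)
    spectra = np.array([base + np.exp(-t / 15.0) * rng.normal(0, 0.05, n_wavelengths)
                        for t in range(n_spectra)])

    for t in range(block, n_spectra + 1, block):
        window = spectra[t - block:t]
        # mean over wavelengths of the between-spectrum standard deviation in the block
        homogeneity_metric = window.std(axis=0).mean()
        print(f"spectra {t - block:2d}-{t - 1:2d}: SD metric = {homogeneity_metric:.4f}")
    ```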

  6. Isobio software: biological dose distribution and biological dose volume histogram from physical dose conversion using linear-quadratic-linear model.

    PubMed

    Jaikuna, Tanwiwat; Khadsiri, Phatchareewan; Chawapun, Nisa; Saekho, Suwit; Tharavichitkul, Ekkasit

    2017-02-01

    To develop an in-house software program that is able to calculate and generate the biological dose distribution and biological dose volume histogram by physical dose conversion using the linear-quadratic-linear (LQL) model. The Isobio software was developed using MATLAB version 2014b to calculate and generate the biological dose distribution and biological dose volume histograms. The physical dose from each voxel in the treatment plan was extracted through the Computational Environment for Radiotherapy Research (CERR), and the accuracy was verified by comparing the dose volume histogram from CERR with that from the treatment planning system. An equivalent dose in 2 Gy fractions (EQD2) was calculated using the biologically effective dose (BED) based on the LQL model. The software calculation and the manual calculation were compared to verify the EQD2, with paired t-test statistical analysis using IBM SPSS Statistics version 22 (64-bit). Two- and three-dimensional biological dose distributions and biological dose volume histograms were displayed correctly by the Isobio software. Different physical doses were found between CERR and the treatment planning system (TPS) in Oncentra, with 3.33% in the high-risk clinical target volume (HR-CTV) determined by D90%, 0.56% in the bladder and 1.74% in the rectum when determined by D2cc, and less than 1% in Pinnacle. The difference in EQD2 between the software calculation and the manual calculation was not significant (0.00%), with p-values of 0.820, 0.095, and 0.593 for external beam radiation therapy (EBRT) and 0.240, 0.320, and 0.849 for brachytherapy (BT) in the HR-CTV, bladder, and rectum, respectively. The Isobio software is a feasible tool for generating the biological dose distribution and biological dose volume histogram for treatment plan evaluation in both EBRT and BT.
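
    For orientation, the standard linear-quadratic conversion to EQD2 is shown below; this is a hedged sketch of the usual LQ formula only, since the Isobio software uses the linear-quadratic-linear extension, whose high-dose linear regime is not reproduced here. The α/β values and fractionation in the example are illustrative.

    ```python
    def eqd2(total_dose, dose_per_fraction, alpha_beta):
        """EQD2 = BED / (1 + 2/(alpha/beta)), with BED = D * (1 + d/(alpha/beta))."""
        bed = total_dose * (1.0 + dose_per_fraction / alpha_beta)
        return bed / (1.0 + 2.0 / alpha_beta)

    # e.g. a 7 Gy x 4 brachytherapy boost evaluated for tumour (alpha/beta = 10 Gy)
    # and rectum (alpha/beta = 3 Gy)
    print(round(eqd2(28.0, 7.0, 10.0), 1))   # ~39.7 Gy EQD2
    print(round(eqd2(28.0, 7.0, 3.0), 1))    # ~56.0 Gy EQD2
    ```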

  7. Modelling and analysis of FMS productivity variables by ISM, SEM and GTMA approach

    NASA Astrophysics Data System (ADS)

    Jain, Vineet; Raj, Tilak

    2014-09-01

    Productivity has often been cited as a key factor in flexible manufacturing system (FMS) performance, and actions to increase it are said to improve profitability and the wage-earning capacity of employees. Improving productivity is seen as a key issue for the long-term survival and success of a manufacturing system. The purpose of this paper is to model and analyse the productivity variables of FMS. This study was performed using different approaches, viz. interpretive structural modelling (ISM), structural equation modelling (SEM), graph theory and matrix approach (GTMA) and a cross-sectional survey within manufacturing firms in India. ISM has been used to develop a model of productivity variables, which has then been analysed. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are powerful statistical techniques; CFA is carried out by SEM. EFA is applied to extract the factors in FMS using the Statistical Package for the Social Sciences (SPSS 20) software, and these factors are confirmed by CFA through Analysis of Moment Structures (AMOS 20) software. Twenty productivity variables were identified through the literature, and four factors involved in FMS productivity were extracted: people, quality, machine and flexibility. SEM using AMOS 20 was used to perform the first-order four-factor structure. GTMA is a multiple attribute decision making (MADM) methodology used to find the intensity/quantification of productivity variables in an organization. The FMS productivity index is proposed to quantify the intensity of the factors which affect FMS.

  8. Building confidence and credibility amid growing model and computing complexity

    NASA Astrophysics Data System (ADS)

    Evans, K. J.; Mahajan, S.; Veneziani, C.; Kennedy, J. H.

    2017-12-01

    As global Earth system models are developed to answer an ever-wider range of science questions, software products that provide robust verification, validation, and evaluation must evolve in tandem. Measuring the degree to which these new models capture past behavior, predict the future, and provide the certainty of predictions is becoming ever more difficult for reasons that are generally well known, yet still challenging to address. Two specific and divergent needs for analysis of the Accelerated Climate Model for Energy (ACME) model - but with a similar software philosophy - are presented to show how a model-developer-based focus can address analysis needs during expansive model changes to provide greater fidelity and execute on multi-petascale computing facilities. A-PRIME is a Python script-based quick-look overview of a fully coupled global model configuration that determines quickly whether it captures specific behavior before significant computer time and expense is invested. EVE is an ensemble-based software framework that focuses on verification of performance-based ACME model development, such as compiler or machine settings, to determine the equivalence of relevant climate statistics. The challenges and solutions for analysis of multi-petabyte output data are highlighted from the perspective of the scientist using the software, with the aim of fostering discussion and further input from the community about improving developer confidence and community credibility.

  9. Performance of student software development teams: the influence of personality and identifying as team members

    NASA Astrophysics Data System (ADS)

    Monaghan, Conal; Bizumic, Boris; Reynolds, Katherine; Smithson, Michael; Johns-Boast, Lynette; van Rooy, Dirk

    2015-01-01

    One prominent approach in the exploration of the variations in project team performance has been to study two components of the aggregate personalities of the team members: conscientiousness and agreeableness. A second line of research, known as self-categorisation theory, argues that identifying as team members and the team's performance norms should substantially influence the team's performance. This paper explores the influence of both these perspectives in university software engineering project teams. Eighty students worked to complete a piece of software in small project teams during 2007 or 2008. To reduce limitations in statistical analysis, Monte Carlo simulation techniques were employed to extrapolate from the results of the original sample to a larger simulated sample (2043 cases, within 319 teams). The results emphasise the importance of taking into account personality (particularly conscientiousness), and both team identification and the team's norm of performance, in order to cultivate higher levels of performance in student software engineering project teams.

  10. Dynamic online surveys and experiments with the free open-source software dynQuest.

    PubMed

    Rademacher, Jens D M; Lippke, Sonia

    2007-08-01

    With computers and the World Wide Web widely available, collecting data through Web browsers is an attractive method utilized by the social sciences. In this article, conducting PC- and Web-based trials with the software package dynQuest is described. The software manages dynamic questionnaire-based trials over the Internet or on single computers, possibly as randomized control trials (RCT), if two or more groups are involved. The choice of follow-up questions can depend on previous responses, as needed for matched interventions. Data are collected in a simple text-based database that can be imported easily into other programs for postprocessing and statistical analysis. The software consists of platform-independent scripts written in the programming language PERL that use the common gateway interface between Web browser and server for submission of data through HTML forms. Advantages of dynQuest are parsimony, simplicity in use and installation, transparency, and reliability. The program is available as open-source freeware from the authors.

  11. Descriptive statistics and correlation analysis of agronomic traits in a maize recombinant inbred line population.

    PubMed

    Zhang, H M; Hui, G Q; Luo, Q; Sun, Y; Liu, X H

    2014-01-21

    Maize (Zea mays L.) is one of the most important crops in the world. In this study, 13 agronomic traits of a recombinant inbred line population derived from the cross between Mo17 and Huangzao4 were investigated in maize: ear diameter, ear length, ear axis diameter, ear weight, plant height, ear height, days to pollen shed (DPS), days to silking (DS), the interval between DPS and DS, 100-kernel weight, kernel test weight, ear kernel weight, and kernel rate. Furthermore, descriptive statistics and correlation analysis of the 13 traits were performed using the SPSS 11.5 software. These results provide the phenotypic data needed for quantitative trait locus mapping of these agronomic traits.
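    The study ran its descriptive statistics and correlations in SPSS 11.5; an equivalent minimal sketch in Python/pandas, assuming a hypothetical CSV file ril_traits.csv with one column per trait, would look like this:

```python
# Minimal sketch (the study used SPSS 11.5): descriptive statistics and a Pearson
# correlation matrix, assuming a hypothetical CSV with one column per trait.
import pandas as pd

traits = pd.read_csv("ril_traits.csv")            # hypothetical file of the 13 traits
print(traits.describe())                          # mean, std, min, max, quartiles
print(traits.corr(method="pearson").round(2))     # pairwise trait correlations
```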

  12. Causal network analysis of head and neck keloid tissue identifies potential master regulators.

    PubMed

    Garcia-Rodriguez, Laura; Jones, Lamont; Chen, Kang Mei; Datta, Indrani; Divine, George; Worsham, Maria J

    2016-10-01

    To generate novel insights and hypotheses in keloid development from potential master regulators. Prospective cohort. Six fresh keloid and six normal skin samples from 12 anonymous donors were used in a prospective cohort study. Genome-wide profiling was done previously on the cohort using the Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA). The 190 statistically significant CpG islands between keloid and normal tissue mapped to 152 genes (P < .05). The top 10 statistically significant genes (VAMP5, ACTR3C, GALNT3, KCNAB2, LRRC61, SCML4, SYNGR1, TNS1, PLEKHG5, PPP1R13-α, false discovery rate <.015) were uploaded into the Ingenuity Pathway Analysis software's Causal Network Analysis (QIAGEN, Redwood City, CA). To reflect expected gene expression direction in the context of methylation changes, the inverse of the methylation ratio from keloid versus normal tissue was used for the analysis. Causal Network Analysis identified disease-specific master regulator molecules based on downstream differentially expressed keloid-specific genes and expected directionality of expression (hypermethylated vs. hypomethylated). Causal Network Analysis software identified four hierarchical networks that included four master regulators (pyroxamide, tributyrin, PRKG2, and PENK) and 19 intermediate regulators. Causal Network Analysis of differentiated methylated gene data of keloid versus normal skin demonstrated four causal networks with four master regulators. These hierarchical networks suggest potential driver roles for their downstream keloid gene targets in the pathogenesis of the keloid phenotype, likely triggered due to perturbation/injury to normal tissue. NA Laryngoscope, 126:E319-E324, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  13. Quantitative analysis of tympanic membrane perforation: a simple and reliable method.

    PubMed

    Ibekwe, T S; Adeosun, A A; Nwaorgu, O G

    2009-01-01

    Accurate assessment of the features of tympanic membrane perforation, especially size, site, duration and aetiology, is important, as it enables optimum management. To describe a simple, cheap and effective method of quantitatively analysing tympanic membrane perforations. The system described comprises a video-otoscope (capable of generating still and video images of the tympanic membrane), adapted via a universal serial bus box to a computer screen, with images analysed using the Image J geometrical analysis software package. The reproducibility of results and their correlation with conventional otoscopic methods of estimation were tested statistically with the paired t-test and correlational tests, using the Statistical Package for the Social Sciences version 11 software. The following equation was generated: percentage perforation = (P/T) × 100 per cent, where P is the area (in pixels²) of the tympanic membrane perforation and T is the total area (in pixels²) for the entire tympanic membrane (including the perforation). Illustrations are shown. Comparison of blinded data on tympanic membrane perforation area obtained independently from assessments by two trained otologists, of comparative years of experience, using the video-otoscopy system described, showed similar findings, with strong correlations devoid of inter-observer error (p = 0.000, r = 1). Comparison with conventional otoscopic assessment also indicated significant correlation, comparing results for two trained otologists, but some inter-observer variation was present (p = 0.000, r = 0.896). Correlation between the two methods for each of the otologists was also highly significant (p = 0.000). A computer-adapted video-otoscope, with images analysed by Image J software, represents a cheap, reliable, technology-driven, clinical method of quantitative analysis of tympanic membrane perforations and injuries.
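    The reported formula is simple enough to sketch directly; the pixel areas below are illustrative placeholders, not measurements from the study.

```python
# Minimal sketch of the reported formula: percentage perforation = (P / T) x 100,
# where P and T are pixel areas traced in Image J (values here are illustrative).
def percentage_perforation(perforation_area_px, total_tm_area_px):
    if total_tm_area_px <= 0:
        raise ValueError("total tympanic membrane area must be positive")
    return 100.0 * perforation_area_px / total_tm_area_px

print(percentage_perforation(perforation_area_px=5400, total_tm_area_px=43200))  # 12.5
```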

  14. Using a Software Tool in Forecasting: a Case Study of Sales Forecasting Taking into Account Data Uncertainty

    NASA Astrophysics Data System (ADS)

    Fabianová, Jana; Kačmáry, Peter; Molnár, Vieroslav; Michalik, Peter

    2016-10-01

    Forecasting is one of the logistics activities, and a sales forecast is the starting point for the elaboration of business plans. Forecast accuracy affects business outcomes and ultimately may significantly affect the economic stability of the company. The accuracy of the prediction depends on the suitability of the forecasting methods used, experience, the quality of the input data, the time period and other factors. The input data are usually not deterministic but are often of a random nature, affected by uncertainties of the market environment and many other factors. By taking the input data uncertainty into account, the forecast error can be reduced. This article deals with the use of a software tool for incorporating data uncertainty into forecasting. A forecasting approach is proposed and the impact of uncertain input parameters on the target forecast value is simulated in a case-study model. Statistical analysis and risk analysis of the forecast results are carried out, including sensitivity analysis and variables impact analysis.
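    The article used a commercial risk-analysis tool; the sketch below shows the same general idea of propagating input uncertainty into a forecast by Monte Carlo simulation. The distributions and parameter values are illustrative assumptions, not the case-study data.

```python
# Minimal sketch of propagating input uncertainty into a sales forecast by Monte Carlo
# simulation; distributions and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
base_demand = rng.normal(loc=1000, scale=80, size=n)                 # uncertain demand level
growth = rng.triangular(left=0.00, mode=0.03, right=0.08, size=n)    # uncertain growth rate
price = rng.uniform(9.5, 10.5, size=n)                               # uncertain unit price

revenue_next_year = base_demand * (1 + growth) * price               # target forecast value

print("mean forecast:", round(float(revenue_next_year.mean())))
print("90% interval:", np.percentile(revenue_next_year, [5, 95]).round(0))
```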

  15. Data Analysis with Graphical Models: Software Tools

    NASA Technical Reports Server (NTRS)

    Buntine, Wray L.

    1994-01-01

    Probabilistic graphical models (directed and undirected Markov fields, and combined in chain graphs) are used widely in expert systems, image processing and other areas as a framework for representing and reasoning with probabilities. They come with corresponding algorithms for performing probabilistic inference. This paper discusses an extension to these models by Spiegelhalter and Gilks, plates, used to graphically model the notion of a sample. This offers a graphical specification language for representing data analysis problems. When combined with general methods for statistical inference, this also offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper outlines the framework and then presents some basic tools for the task: a graphical version of the Pitman-Koopman Theorem for the exponential family, problem decomposition, and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.

  16. Age estimation using exfoliative cytology and radiovisiography: A comparative study

    PubMed Central

    Nallamala, Shilpa; Guttikonda, Venkateswara Rao; Manchikatla, Praveen Kumar; Taneeru, Sravya

    2017-01-01

    Introduction: Age estimation is one of the essential factors in establishing the identity of an individual. Among various methods, exfoliative cytology (EC) is a unique, noninvasive technique involving simple and pain-free collection of intact cells from the oral cavity for microscopic examination. Objective: The study was undertaken with the aim of estimating the age of an individual from the average cell size of their buccal smears calculated using image analysis morphometric software and the pulp–tooth area ratio in the mandibular canine of the same individual using radiovisiography (RVG). Materials and Methods: Buccal smears were collected from 100 apparently healthy individuals. After fixation in 95% alcohol, the smears were stained using Papanicolaou stain. The average cell size was measured using image analysis software (Image-Pro Insight 8.0). The RVG images of mandibular canines were obtained, pulp and tooth areas were traced using AutoCAD 2010 software, and the area ratio was calculated. The estimated age was then calculated using regression analysis. Results: The paired t-test between chronological age and estimated age by cell size and pulp–tooth area ratio was statistically nonsignificant (P > 0.05). Conclusion: In the present study, age estimated by pulp–tooth area ratio and EC yielded good results. PMID:29657491

  17. Age estimation using exfoliative cytology and radiovisiography: A comparative study.

    PubMed

    Nallamala, Shilpa; Guttikonda, Venkateswara Rao; Manchikatla, Praveen Kumar; Taneeru, Sravya

    2017-01-01

    Age estimation is one of the essential factors in establishing the identity of an individual. Among various methods, exfoliative cytology (EC) is a unique, noninvasive technique involving simple and pain-free collection of intact cells from the oral cavity for microscopic examination. The study was undertaken with the aim of estimating the age of an individual from the average cell size of their buccal smears calculated using image analysis morphometric software and the pulp-tooth area ratio in the mandibular canine of the same individual using radiovisiography (RVG). Buccal smears were collected from 100 apparently healthy individuals. After fixation in 95% alcohol, the smears were stained using Papanicolaou stain. The average cell size was measured using image analysis software (Image-Pro Insight 8.0). The RVG images of mandibular canines were obtained, pulp and tooth areas were traced using AutoCAD 2010 software, and the area ratio was calculated. The estimated age was then calculated using regression analysis. The paired t-test between chronological age and estimated age by cell size and pulp-tooth area ratio was statistically nonsignificant (P > 0.05). In the present study, age estimated by pulp-tooth area ratio and EC yielded good results.
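    As a rough illustration of this kind of regression-based age estimation (synthetic data, not the study's 100 subjects or its published equations):

```python
# Rough illustration of regression-based age estimation from cell size and
# pulp-tooth area ratio; all numbers are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(7)
age = rng.uniform(18, 60, 100)                          # chronological age (years)
cell_size = 2400 - 15 * age + rng.normal(0, 60, 100)    # average buccal cell size (a.u.)
ratio = 0.35 - 0.003 * age + rng.normal(0, 0.01, 100)   # pulp-tooth area ratio

# Fit simple linear models relating age to each predictor
b1, a1 = np.polyfit(cell_size, age, 1)
b2, a2 = np.polyfit(ratio, age, 1)
est_from_cells = b1 * cell_size + a1
est_from_ratio = b2 * ratio + a2
print("MAE (cell size): %.1f years" % np.mean(np.abs(est_from_cells - age)))
print("MAE (pulp-tooth ratio): %.1f years" % np.mean(np.abs(est_from_ratio - age)))
```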

  18. HTAPP: High-Throughput Autonomous Proteomic Pipeline

    PubMed Central

    Yu, Kebing; Salomon, Arthur R.

    2011-01-01

    Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic datasets is critically important. The high-throughput autonomous proteomic pipeline (HTAPP) described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is comprised of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of HTAPP focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples. PMID:20336676

  19. CONAN: copy number variation analysis software for genome-wide association studies

    PubMed Central

    2010-01-01

    Background Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected associations between CNVs and certain phenotypes is constantly increasing. However, while several software packages support the determination of CNVs from SNP chip data, the downstream statistical inference of CNV-phenotype associations is still subject to complicated and inefficient in-house solutions, thus strongly limiting the performance of GWAS based on CNVs. Results CONAN is a freely available client-server software solution which provides an intuitive graphical user interface for categorizing, analyzing and associating CNVs with phenotypes. Moreover, CONAN assists the evaluation process by visualizing detected associations via Manhattan plots in order to enable a rapid identification of genome-wide significant CNV regions. Various file formats including the information on CNVs in population samples are supported as input data. Conclusions CONAN facilitates the performance of GWAS based on CNVs and the visual analysis of calculated results. CONAN provides a rapid, valid and straightforward software solution to identify genetic variation underlying the 'missing' heritability for complex traits that remains unexplained by recent GWAS. The freely available software can be downloaded at http://genepi-conan.i-med.ac.at. PMID:20546565

  20. HRVanalysis: A Free Software for Analyzing Cardiac Autonomic Activity

    PubMed Central

    Pichot, Vincent; Roche, Frédéric; Celle, Sébastien; Barthélémy, Jean-Claude; Chouchou, Florian

    2016-01-01

    Since the pioneering studies of the 1960s, heart rate variability (HRV) has become an increasingly used non-invasive tool for examining cardiac autonomic functions and dysfunctions in various populations and conditions. Many calculation methods have been developed to address these issues, each with their strengths and weaknesses. Although its interpretation may remain difficult, this technique provides, from a non-invasive approach, reliable physiological information that was previously inaccessible, in many fields including death and health prediction, training and overtraining, cardiac and respiratory rehabilitation, sleep-disordered breathing, large cohort follow-ups, children's autonomic status, anesthesia, or neurophysiological studies. In this context, we developed HRVanalysis, a software tool to analyse HRV, used and improved for over 20 years and, thus, designed to meet laboratory requirements. The main strength of HRVanalysis is its wide application scope. In addition to standard analysis over short and long periods of RR intervals, the software allows time-frequency analysis using wavelet transform as well as analysis of autonomic nervous system status on surrounding scored events and on preselected labeled areas. Moreover, the interface is designed for easy study of large cohorts, including batch mode signal processing to avoid running repetitive operations. Results are displayed as figures or saved in TXT files directly usable in statistical software. Recordings can arise from RR or EKG files produced by different types of devices, such as heart rate monitors, Holter EKG recorders, polygraphs, and data acquisition systems. HRVanalysis can be downloaded freely from the Web page at https://anslabtools.univ-st-etienne.fr. HRVanalysis is meticulously maintained and developed for in-house laboratory use. In this article, after a brief description of the context, we present an overall view of HRV analysis and we describe the methodological approach of the different techniques provided by the software. PMID:27920726
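    HRVanalysis itself covers time-domain, frequency-domain, and wavelet analyses; as a minimal illustration of the time-domain part only, standard indices can be computed from an RR-interval series as follows (synthetic RR data):

```python
# Minimal sketch of standard time-domain HRV indices from an RR-interval series
# (frequency-domain and wavelet analyses offered by HRVanalysis are not shown).
import numpy as np

def time_domain_hrv(rr_ms):
    """rr_ms: 1-D array of RR intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    return {
        "mean_RR_ms": rr.mean(),
        "SDNN_ms": rr.std(ddof=1),                      # overall variability
        "RMSSD_ms": np.sqrt(np.mean(diff ** 2)),        # short-term variability
        "pNN50_%": 100.0 * np.mean(np.abs(diff) > 50),  # successive differences > 50 ms
    }

rr = 1000 + 50 * np.random.default_rng(1).standard_normal(300)  # synthetic RR series
print(time_domain_hrv(rr))
```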

  1. Cognition, comprehension and application of biostatistics in research by Indian postgraduate students in periodontics

    PubMed Central

    Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar

    2014-01-01

    Background: Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, data obtained is subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data, thus, forms the basis for all evidence-based conclusions. Aim: The aim of this study is to evaluate the cognition, comprehension and application of biostatistics in research among post graduate students in Periodontics, in India. Materials and Methods: A total of 391 post graduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge, understanding and its application in the design and conduct of the research protocol was collected using a dichotomous questionnaire. Descriptive statistics were used for data analysis. Results: Nearly 79.2% of students were aware of the importance of biostatistics in research, 55-65% were familiar with the MS-EXCEL spreadsheet for graphical representation of data and with the statistical software available on the internet, 26.0% had biostatistics as a mandatory subject in their curriculum, 9.5% tried to perform statistical analysis on their own, while 3.0% were successful in performing statistical analysis of their studies on their own. Conclusion: Biostatistics should play a central role in planning, conduct, interim analysis, final analysis and reporting of periodontal research, especially by postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research, but the level of understanding and application is still basic and needs to be addressed. PMID:24744547

  2. Parallel processing of genomics data

    NASA Astrophysics Data System (ADS)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-10-01

    The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, and the analysis of this enormous flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face these issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze the data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the preprocessing and statistical analysis of genomics data that can handle high-dimensional data with good response times. The proposed system is able to find statistically significant biological markers that discriminate classes of patients who respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.

  3. Tolerancing aspheres based on manufacturing statistics

    NASA Astrophysics Data System (ADS)

    Wickenhagen, S.; Möhl, A.; Fuchs, U.

    2017-11-01

    A standard way of tolerancing optical elements or systems is to perform a Monte Carlo based analysis within a common optical design software package. Although different weightings and distributions are assumed, these analyses all rely on statistics, which usually means several hundred or thousand systems are required for reliable results. Thus, employing these methods for small batch sizes is unreliable, especially when aspheric surfaces are involved. The huge database of asphericon was used to investigate the correlation between the given tolerance values and measured data sets. The resulting probability distributions of the measured data were analyzed, aiming for a robust optical tolerancing process.

  4. A Gibbs sampler for Bayesian analysis of site-occupancy data

    USGS Publications Warehouse

    Dorazio, Robert M.; Rodriguez, Daniel Taylor

    2012-01-01

    1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
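    The authors provide R code for the full site-occupancy model; the sketch below is only a simplified Albert-Chib style Gibbs sampler for plain probit regression (latent-variable data augmentation plus a conjugate Gaussian update), with an assumed N(0, 100 I) prior and synthetic data, to show the core sampling step rather than the occupancy/detection structure.

```python
# Simplified probit-regression Gibbs sampler (Albert-Chib data augmentation);
# not the authors' R code, and the detection layer of the occupancy model is omitted.
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=2000, prior_var=100.0, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    V = np.linalg.inv(X.T @ X + np.eye(p) / prior_var)   # posterior covariance
    L = np.linalg.cholesky(V)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        mu = X @ beta
        # latent utilities truncated according to the observed 0/1 response
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        z = truncnorm.rvs(lo, hi, loc=mu, scale=1.0, random_state=rng)
        # conjugate Gaussian update for the regression coefficients
        beta = V @ (X.T @ z) + L @ rng.standard_normal(p)
        draws[t] = beta
    return draws

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # intercept + one covariate
true_beta = np.array([-0.5, 1.0])
y = (X @ true_beta + rng.normal(size=200) > 0).astype(int)
print(probit_gibbs(X, y)[500:].mean(axis=0))               # posterior means near true_beta
```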

  5. Automated ultrasound edge-tracking software comparable to established semi-automated reference software for carotid intima-media thickness analysis.

    PubMed

    Shenouda, Ninette; Proudfoot, Nicole A; Currie, Katharine D; Timmons, Brian W; MacDonald, Maureen J

    2018-05-01

    Many commercial ultrasound systems are now including automated analysis packages for the determination of carotid intima-media thickness (cIMT); however, details regarding their algorithms and methodology are not published. Few studies have compared their accuracy and reliability with previously established automated software, and those that have were in asymptomatic adults. Therefore, this study compared cIMT measures from a fully automated ultrasound edge-tracking software (EchoPAC PC, Version 110.0.2; GE Medical Systems, Horten, Norway) to an established semi-automated reference software (Artery Measurement System (AMS) II, Version 1.141; Gothenburg, Sweden) in 30 healthy preschool children (ages 3-5 years) and 27 adults with coronary artery disease (CAD; ages 48-81 years). For both groups, Bland-Altman plots revealed good agreement with a negligible mean cIMT difference of -0·03 mm. Software differences were statistically, but not clinically, significant for preschool images (P = 0·001) and were not significant for CAD images (P = 0·09). Intra- and interoperator repeatability was high and comparable between software for preschool images (ICC, 0·90-0·96; CV, 1·3-2·5%), but slightly higher with the automated ultrasound than the semi-automated reference software for CAD images (ICC, 0·98-0·99; CV, 1·4-2·0% versus ICC, 0·84-0·89; CV, 5·6-6·8%). These findings suggest that the automated ultrasound software produces valid cIMT values in healthy preschool children and adults with CAD. Automated ultrasound software may be useful for ensuring consistency among multisite research initiatives or large cohort studies involving repeated cIMT measures, particularly in adults with documented CAD. © 2017 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
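    A minimal sketch of the Bland-Altman agreement analysis used above (mean difference and 95% limits of agreement); the cIMT values below are synthetic placeholders, not the study's measurements.

```python
# Minimal Bland-Altman sketch: bias and 95% limits of agreement between two readings.
import numpy as np

def bland_altman(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

rng = np.random.default_rng(3)
cimt_auto = rng.normal(0.45, 0.05, 30)               # placeholder automated values (mm)
cimt_ref = cimt_auto + rng.normal(-0.03, 0.02, 30)   # placeholder reference values (mm)
bias, loa = bland_altman(cimt_auto, cimt_ref)
print(f"bias = {bias:.3f} mm, 95% LoA = ({loa[0]:.3f}, {loa[1]:.3f}) mm")
```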

  6. Closed-Loop Analysis of Soft Decisions for Serial Links

    NASA Technical Reports Server (NTRS)

    Lansdowne, Chatwin A.; Steele, Glen F.; Zucha, Joan P.; Schlesinger, Adam M.

    2013-01-01

    We describe the benefit of using closed-loop measurements for a radio receiver paired with a counterpart transmitter. We show that real-time analysis of the soft decision output of a receiver can provide rich and relevant insight far beyond the traditional hard-decision bit error rate (BER) test statistic. We describe a Soft Decision Analyzer (SDA) implementation for closed-loop measurements on single- or dual- (orthogonal) channel serial data communication links. The analyzer has been used to identify, quantify, and prioritize contributors to implementation loss in live-time during the development of software defined radios. This test technique gains importance as modern receivers are providing soft decision symbol synchronization as radio links are challenged to push more data and more protocol overhead through noisier channels, and software-defined radios (SDRs) use error-correction codes that approach Shannon's theoretical limit of performance.

  7. The Accuracy of GBM GRB Localizations

    NASA Astrophysics Data System (ADS)

    Briggs, Michael Stephen; Connaughton, V.; Meegan, C.; Hurley, K.

    2010-03-01

    We report a study of the accuracy of GBM GRB localizations, analyzing three types of localizations: those produced automatically by the GBM Flight Software on board GBM, those produced automatically with ground software in near real time, and localizations produced with human guidance. The two types of automatic locations are distributed in near real time via GCN Notices; the human-guided locations are distributed on timescales of many minutes or hours via GCN Circulars. This work uses a Bayesian analysis that models the distribution of the GBM total location error by comparing GBM locations to more accurate locations obtained with other instruments. Reference locations are obtained from Swift, Super-AGILE, the LAT, and the IPN. We model the GBM total location errors as having systematic errors in addition to the statistical errors and use the Bayesian analysis to constrain the systematic errors.
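    As a toy illustration of the idea (not the authors' model), one can treat each GBM-minus-reference offset as Rayleigh-distributed with a scale given by the statistical and systematic errors added in quadrature, and compute a grid posterior for the systematic term; all numbers below are hypothetical.

```python
# Toy grid posterior for a single systematic error added in quadrature to per-burst
# statistical errors, assuming Rayleigh-distributed total offsets (illustrative only).
import numpy as np

def systematic_error_posterior(offsets_deg, stat_err_deg, sys_grid_deg):
    post = np.ones_like(sys_grid_deg)                          # flat prior on the grid
    for r, s in zip(offsets_deg, stat_err_deg):
        scale2 = s ** 2 + sys_grid_deg ** 2
        post *= (r / scale2) * np.exp(-r ** 2 / (2 * scale2))  # Rayleigh likelihood
    dx = sys_grid_deg[1] - sys_grid_deg[0]
    return post / (post.sum() * dx)                            # normalize to a density

offsets = np.array([3.2, 5.1, 2.4, 7.8, 4.0])                  # hypothetical offsets (deg)
stat_err = np.array([2.0, 3.5, 1.5, 4.0, 2.5])                 # hypothetical statistical errors (deg)
grid = np.linspace(0.01, 15.0, 500)
post = systematic_error_posterior(offsets, stat_err, grid)
print("posterior-mean systematic error: %.1f deg" % ((grid * post).sum() * (grid[1] - grid[0])))
```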

  8. AnthropMMD: An R package with a graphical user interface for the mean measure of divergence.

    PubMed

    Santos, Frédéric

    2018-01-01

    The mean measure of divergence is a dissimilarity measure between groups of individuals described by dichotomous variables. It is well suited to datasets with many missing values, and it is generally used to compute distance matrices and represent phenograms. Although often used in biological anthropology and archaeozoology, this method suffers from a lack of implementation in common statistical software. A package for the R statistical software, AnthropMMD, is presented here. Offering a dynamic graphical user interface, it is the first one dedicated to Smith's mean measure of divergence. The package also provides facilities for graphical representations and the crucial step of trait selection, so that the entire analysis can be performed through the graphical user interface. Its use is demonstrated using an artificial dataset, and the impact of trait selection is discussed. Finally, AnthropMMD is compared to three other free tools available for calculating the mean measure of divergence, and is proven to be consistent with them. © 2017 Wiley Periodicals, Inc.

  9. Software and Algorithms for Biomedical Image Data Processing and Visualization

    NASA Technical Reports Server (NTRS)

    Talukder, Ashit; Lambert, James; Lam, Raymond

    2004-01-01

    A new software equipped with novel image processing algorithms and graphical-user-interface (GUI) tools has been designed for automated analysis and processing of large amounts of biomedical image data. The software, called PlaqTrak, has been specifically used for analysis of plaque on teeth of patients. New algorithms have been developed and implemented to segment teeth of interest from surrounding gum, and a real-time image-based morphing procedure is used to automatically overlay a grid onto each segmented tooth. Pattern recognition methods are used to classify plaque from surrounding gum and enamel, while ignoring glare effects due to the reflection of camera light and ambient light from enamel regions. The PlaqTrak system integrates these components into a single software suite with an easy-to-use GUI (see Figure 1) that allows users to do an end-to-end run of a patient's record, including tooth segmentation of all teeth, grid morphing of each segmented tooth, and plaque classification of each tooth image. The automated and accurate processing of the captured images to segment each tooth [see Figure 2(a)] and then detect plaque on a tooth-by-tooth basis is a critical component of the PlaqTrak system to do clinical trials and analysis with minimal human intervention. These features offer distinct advantages over other competing systems that analyze groups of teeth or synthetic teeth. PlaqTrak divides each segmented tooth into eight regions using an advanced graphics morphing procedure [see results on a chipped tooth in Figure 2(b)], and a pattern recognition classifier is then used to locate plaque [red regions in Figure 2(d)] and enamel regions. The morphing allows analysis within regions of teeth, thereby facilitating detailed statistical analysis such as the amount of plaque present on the biting surfaces on teeth. This software system is applicable to a host of biomedical applications, such as cell analysis and life detection, or robotic applications, such as product inspection or assembly of parts in space and industry.

  10. Implementing Software Safety in the NASA Environment

    NASA Technical Reports Server (NTRS)

    Wetherholt, Martha S.; Radley, Charles F.

    1994-01-01

    Until recently, NASA did not consider allowing computers total control of flight systems. Human operators, via hardware, have constituted the ultimate safety control. In an attempt to reduce costs, NASA has come to rely more and more heavily on computers and software to control space missions. (For example, software is now planned to control most of the operational functions of the International Space Station.) Thus the need for systematic software safety programs has become crucial for mission success. Concurrent engineering principles dictate that safety should be designed into software up front, not tested into the software after the fact. 'Cost of Quality' studies have statistics and metrics to prove the value of building quality and safety into the development cycle. Unfortunately, most software engineers are not familiar with designing for safety, and most safety engineers are not software experts. Software written to specifications which have not been safety analyzed is a major source of computer related accidents. Safer software is achieved step by step throughout the system and software life cycle. It is a process that includes requirements definition, hazard analyses, formal software inspections, safety analyses, testing, and maintenance. The greatest emphasis is placed on clearly and completely defining system and software requirements, including safety and reliability requirements. Unfortunately, development and review of requirements are the weakest link in the process. While some of the more academic methods, e.g. mathematical models, may help bring about safer software, this paper proposes the use of currently approved software methodologies, and sound software and assurance practices to show how, to a large degree, safety can be designed into software from the start. NASA's approach today is to first conduct a preliminary system hazard analysis (PHA) during the concept and planning phase of a project. This determines the overall hazard potential of the system to be built. Shortly thereafter, as the system requirements are being defined, the second iteration of hazard analyses takes place, the systems hazard analysis (SHA). During the systems requirements phase, decisions are made as to what functions of the system will be the responsibility of software. This is the most critical time to affect the safety of the software. From this point, software safety analyses as well as software engineering practices are the main focus for assuring safe software. While many of the steps proposed in this paper seem like just sound engineering practices, they are the best technical and most cost effective means to assure safe software within a safe system.

  11. A guide to understanding meta-analysis.

    PubMed

    Israel, Heidi; Richter, Randy R

    2011-07-01

    With the focus on evidence-based practice in healthcare, a well-conducted systematic review that includes a meta-analysis where indicated represents a high level of evidence for treatment effectiveness. The purpose of this commentary is to assist clinicians in understanding meta-analysis as a statistical tool using both published articles and explanations of components of the technique. We describe what meta-analysis is, what heterogeneity is, and how it affects meta-analysis, effect size, the modeling techniques of meta-analysis, and strengths and weaknesses of meta-analysis. Common components like forest plot interpretation, software that may be used, special cases for meta-analysis, such as subgroup analysis, individual patient data, and meta-regression, and a discussion of criticisms, are included.

  12. RGG: A general GUI Framework for R scripts

    PubMed Central

    Visne, Ilhami; Dilaveroglu, Erkan; Vierlinger, Klemens; Lauss, Martin; Yildiz, Ahmet; Weinhaeusel, Andreas; Noehammer, Christa; Leisch, Friedrich; Kriegner, Albert

    2009-01-01

    Background R is the leading open source statistics software with a vast number of biostatistical and bioinformatical analysis packages. To exploit the advantages of R, extensive scripting/programming skills are required. Results We have developed a software tool called R GUI Generator (RGG) which enables the easy generation of Graphical User Interfaces (GUIs) for the programming language R by adding a few Extensible Markup Language (XML) – tags. RGG consists of an XML-based GUI definition language and a Java-based GUI engine. GUIs are generated in runtime from defined GUI tags that are embedded into the R script. User-GUI input is returned to the R code and replaces the XML-tags. RGG files can be developed using any text editor. The current version of RGG is available as a stand-alone software (RGGRunner) and as a plug-in for JGR. Conclusion RGG is a general GUI framework for R that has the potential to introduce R statistics (R packages, built-in functions and scripts) to users with limited programming skills and helps to bridge the gap between R developers and GUI-dependent users. RGG aims to abstract the GUI development from individual GUI toolkits by using an XML-based GUI definition language. Thus RGG can be easily integrated in any software. The RGG project further includes the development of a web-based repository for RGG-GUIs. RGG is an open source project licensed under the Lesser General Public License (LGPL) and can be downloaded freely at PMID:19254356

  13. Holo-analysis.

    PubMed

    Rosen, G D

    2006-06-01

    Meta-analysis is a vague descriptor used to encompass very diverse methods of data collection analysis, ranging from simple averages to more complex statistical methods. Holo-analysis is a fully comprehensive statistical analysis of all available data and all available variables in a specified topic, with results expressed in a holistic factual empirical model. The objectives and applications of holo-analysis include software production for prediction of responses with confidence limits, translation of research conditions to praxis (field) circumstances, exposure of key missing variables, discovery of theoretically unpredictable variables and interactions, and planning future research. Holo-analyses are cited as examples of the effects on broiler feed intake and live weight gain of exogenous phytases, which account for 70% of variation in responses in terms of 20 highly significant chronological, dietary, environmental, genetic, managemental, and nutrient variables. Even better future accountancy of variation will be facilitated if and when authors of papers routinely provide key data for currently neglected variables, such as temperatures, complete feed formulations, and mortalities.

  14. Software Design Challenges in Time Series Prediction Systems Using Parallel Implementation of Artificial Neural Networks.

    PubMed

    Manikandan, Narayanan; Subha, Srinivasan

    2016-01-01

    Software development life cycle has been characterized by destructive disconnects between activities like planning, analysis, design, and programming. Particularly software developed with prediction based results is always a big challenge for designers. Time series data forecasting like currency exchange, stock prices, and weather report are some of the areas where an extensive research is going on for the last three decades. In the initial days, the problems with financial analysis and prediction were solved by statistical models and methods. For the last two decades, a large number of Artificial Neural Networks based learning models have been proposed to solve the problems of financial data and get accurate results in prediction of the future trends and prices. This paper addressed some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks. It provides an adaptive approach for predicting exchange rates and it can be called hybrid methodology for predicting exchange rates. This framework is tested for finding the accuracy and performance of parallel algorithms used.

  15. Software Design Challenges in Time Series Prediction Systems Using Parallel Implementation of Artificial Neural Networks

    PubMed Central

    Manikandan, Narayanan; Subha, Srinivasan

    2016-01-01

    Software development life cycle has been characterized by destructive disconnects between activities like planning, analysis, design, and programming. Particularly software developed with prediction based results is always a big challenge for designers. Time series data forecasting like currency exchange, stock prices, and weather report are some of the areas where an extensive research is going on for the last three decades. In the initial days, the problems with financial analysis and prediction were solved by statistical models and methods. For the last two decades, a large number of Artificial Neural Networks based learning models have been proposed to solve the problems of financial data and get accurate results in prediction of the future trends and prices. This paper addressed some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks. It provides an adaptive approach for predicting exchange rates and it can be called hybrid methodology for predicting exchange rates. This framework is tested for finding the accuracy and performance of parallel algorithms used. PMID:26881271

  16. A Microarray Tool Provides Pathway and GO Term Analysis.

    PubMed

    Koch, Martin; Royer, Hans-Dieter; Wiese, Michael

    2011-12-01

    Analysis of gene expression profiles is no longer exclusively a task for bioinformatic experts. However, gaining statistically significant results is challenging and requires both biological knowledge and computational know-how. Here we present a novel, user-friendly microarray reporting tool called maRt. The software provides access to bioinformatic resources, like gene ontology terms and biological pathways, by use of the DAVID and BioMart web services. Results are summarized in structured HTML reports, each presenting a different layer of information. In these reports, content from diverse sources is integrated and interlinked. To speed up processing, maRt takes advantage of the multi-core technology of modern desktop computers by using parallel processing. Since the software is built upon an RCP infrastructure, it may serve as a starting point for developers aiming to integrate novel R-based applications. Installer, documentation and various kinds of tutorials are available under the LGPL license at the website of our institute, http://www.pharma.uni-bonn.de/www/mart. This software is free for academic use. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Integrating SAS and GIS software to improve habitat-use estimates from radiotelemetry data

    USGS Publications Warehouse

    Kenow, K.P.; Wright, R.G.; Samuel, M.D.; Rasmussen, P.W.

    2001-01-01

    Radiotelemetry has been used commonly to remotely determine habitat use by a variety of wildlife species. However, habitat misclassification can occur because the true location of a radiomarked animal can only be estimated. Analytical methods that provide improved estimates of habitat use from radiotelemetry location data using a subsampling approach have been proposed previously. We developed software, based on these methods, to conduct improved habitat-use analyses. A Statistical Analysis System (SAS)-executable file generates a random subsample of points from the error distribution of an estimated animal location and formats the output into ARC/INFO-compatible coordinate and attribute files. An associated ARC/INFO Arc Macro Language (AML) creates a coverage of the random points, determines the habitat type at each random point from an existing habitat coverage, sums the number of subsample points by habitat type for each location, and outputs the results in ASCII format. The proportion and precision of habitat types used are calculated from the subsample of points generated for each radiotelemetry location. We illustrate the method and software by analysis of radiotelemetry data for a female wild turkey (Meleagris gallopavo).
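    The actual tool couples a SAS-executable file with ARC/INFO AML; the sketch below illustrates only the underlying subsampling idea for a single telemetry location, with a hypothetical habitat lookup standing in for the habitat coverage.

```python
# Minimal sketch of the subsampling idea: draw points from the bivariate-normal error
# distribution of one estimated location, look up each point's habitat type, and report
# habitat-use proportions (habitat map and numbers are hypothetical).
import numpy as np

def habitat_use_proportions(est_xy, cov, habitat_lookup, n_points=1000, seed=0):
    rng = np.random.default_rng(seed)
    pts = rng.multivariate_normal(est_xy, cov, size=n_points)  # telemetry error distribution
    types = np.array([habitat_lookup(x, y) for x, y in pts])
    labels, counts = np.unique(types, return_counts=True)
    return dict(zip(labels, counts / n_points))

# Hypothetical habitat map: forest west of x = 0, wetland to the east
habitat_lookup = lambda x, y: "forest" if x < 0 else "wetland"
print(habitat_use_proportions(est_xy=[10.0, 5.0], cov=[[900, 0], [0, 900]],
                              habitat_lookup=habitat_lookup))
```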

  18. A Biosequence-based Approach to Software Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.

    For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.

  19. Point of care use of a personal digital assistant for patient consultation management: experience of an intravenous resource nurse team in a major Canadian teaching hospital.

    PubMed

    Bosma, Laine; Balen, Robert M; Davidson, Erin; Jewesson, Peter J

    2003-01-01

    The development and integration of a personal digital assistant (PDA)-based point-of-care database into an intravenous resource nurse (IVRN) consultation service for the purposes of consultation management and service characterization are described. The IVRN team provides a consultation service 7 days a week in this 1000-bed tertiary adult care teaching hospital. No simple, reliable method for documenting IVRN patient care activity and facilitating IVRN-initiated patient follow-up evaluation was available. Implementation of a PDA database with exportability of data to statistical analysis software was undertaken in July 2001. A Palm IIIXE PDA was purchased and a three-table, 13-field database was developed using HanDBase software. During the 7-month period of data collection, the IVRN team recorded 4868 consultations for 40 patient care areas. Full analysis of service characteristics was conducted using SPSS 10.0 software. Team members adopted the new technology with few problems, and the authors now can efficiently track and analyze the services provided by their IVRN team.

  20. High-Performance Mixed Models Based Genome-Wide Association Analysis with omicABEL software

    PubMed Central

    Fabregat-Traver, Diego; Sharapov, Sodbo Zh.; Hayward, Caroline; Rudan, Igor; Campbell, Harry; Aulchenko, Yurii; Bientinesi, Paolo

    2014-01-01

    To raise the power of genome-wide association studies (GWAS) and avoid false-positive results in structured populations, one can rely on mixed model based tests. When large samples are used, and when multiple traits are to be studied in the ’omics’ context, this approach becomes computationally challenging. Here we consider the problem of mixed-model based GWAS for arbitrary number of traits, and demonstrate that for the analysis of single-trait and multiple-trait scenarios different computational algorithms are optimal. We implement these optimal algorithms in a high-performance computing framework that uses state-of-the-art linear algebra kernels, incorporates optimizations, and avoids redundant computations, increasing throughput while reducing memory usage and energy consumption. We show that, compared to existing libraries, our algorithms and software achieve considerable speed-ups. The OmicABEL software described in this manuscript is available under the GNU GPL v. 3 license as part of the GenABEL project for statistical genomics at http://www.genabel.org/packages/OmicABEL. PMID:25717363

  1. High-Performance Mixed Models Based Genome-Wide Association Analysis with omicABEL software.

    PubMed

    Fabregat-Traver, Diego; Sharapov, Sodbo Zh; Hayward, Caroline; Rudan, Igor; Campbell, Harry; Aulchenko, Yurii; Bientinesi, Paolo

    2014-01-01

    To raise the power of genome-wide association studies (GWAS) and avoid false-positive results in structured populations, one can rely on mixed model based tests. When large samples are used, and when multiple traits are to be studied in the 'omics' context, this approach becomes computationally challenging. Here we consider the problem of mixed-model based GWAS for arbitrary number of traits, and demonstrate that for the analysis of single-trait and multiple-trait scenarios different computational algorithms are optimal. We implement these optimal algorithms in a high-performance computing framework that uses state-of-the-art linear algebra kernels, incorporates optimizations, and avoids redundant computations, increasing throughput while reducing memory usage and energy consumption. We show that, compared to existing libraries, our algorithms and software achieve considerable speed-ups. The OmicABEL software described in this manuscript is available under the GNU GPL v. 3 license as part of the GenABEL project for statistical genomics at http://www.genabel.org/packages/OmicABEL.

  2. Evidence-based pathology in its second decade: toward probabilistic cognitive computing.

    PubMed

    Marchevsky, Alberto M; Walts, Ann E; Wick, Mark R

    2017-03-01

    Evidence-based pathology advocates using a combination of best available data ("evidence") from the literature and personal experience for the diagnosis, estimation of prognosis, and assessment of other variables that impact individual patient care. Evidence-based pathology relies on systematic reviews of the literature, evaluation of the quality of evidence as categorized by evidence levels and statistical tools such as meta-analyses, estimates of probabilities and odds, and others. However, it is well known that previously "statistically significant" information usually does not accurately forecast the future for individual patients. There is great interest in "cognitive computing" in which "data mining" is combined with "predictive analytics" designed to forecast future events and estimate the strength of those predictions. This study demonstrates the use of IBM Watson Analytics software to evaluate and predict the prognosis of 101 patients with typical and atypical pulmonary carcinoid tumors in which Ki-67 indices have been determined. The results obtained with this system are compared with those previously reported using "routine" statistical software and the help of a professional statistician. IBM Watson Analytics interactively provides statistical results that are comparable to those obtained with routine statistical tools but much more rapidly, with considerably less effort and with interactive graphics that are intuitively easy to apply. It also enables analysis of natural language variables and yields detailed survival predictions for patient subgroups selected by the user. Potential applications of this tool and basic concepts of cognitive computing are discussed. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Treatment of control data in lunar phototriangulation. [application of statistical procedures and development of mathematical and computer techniques

    NASA Technical Reports Server (NTRS)

    Wong, K. W.

    1974-01-01

    In lunar phototriangulation, there is a complete lack of accurate ground control points. The accuracy analysis of the results of lunar phototriangulation must, therefore, be completely dependent on statistical procedures. It was the objective of this investigation to examine the validity of the commonly used statistical procedures, and to develop both mathematical techniques and computer software for evaluating (1) the accuracy of lunar phototriangulation; (2) the contribution of the different types of photo support data to the accuracy of lunar phototriangulation; (3) the accuracy of absolute orientation as a function of the accuracy and distribution of both the ground and model points; and (4) the relative slope accuracy between any triangulated pass points.

  4. Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

    PubMed

    Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

    2013-08-08

    Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
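    A minimal sketch of the core idea, recovering a gene-level burden-type statistic from per-variant z-scores and the correlation matrix of those statistics; the weights, z-scores, and correlation values are hypothetical, and the authors' software implements a broader family of gene-level tests than this single example.

```python
# Minimal sketch (not the authors' full method): a gene-level burden-type test built
# from single-variant z-scores, accounting for the correlation of those statistics.
import numpy as np
from scipy.stats import norm

def gene_level_burden_test(z, R, w=None):
    """z: per-variant z-scores; R: correlation matrix of the statistics; w: weights."""
    z = np.asarray(z, float)
    w = np.ones_like(z) if w is None else np.asarray(w, float)
    z_gene = w @ z / np.sqrt(w @ R @ w)      # combined statistic is N(0, 1) under the null
    p = 2 * norm.sf(abs(z_gene))
    return z_gene, p

z = np.array([1.8, 2.1, -0.4, 1.2])          # hypothetical single-variant z-scores
R = np.array([[1.0, 0.3, 0.1, 0.0],
              [0.3, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.4],
              [0.0, 0.1, 0.4, 1.0]])         # hypothetical correlation of the statistics
print(gene_level_burden_test(z, R))
```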

  5. 45 CFR 153.350 - Risk adjustment data validation standards.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... implementation of any risk adjustment software and ensure proper validation of a statistically valid sample of... respect to implementation of risk adjustment software or as a result of data validation conducted pursuant... implementation of risk adjustment software or data validation. ...

  6. Analyzing Responses of Chemical Sensor Arrays

    NASA Technical Reports Server (NTRS)

    Zhou, Hanying

    2007-01-01

    NASA is developing a third-generation electronic nose (ENose) capable of continuous monitoring of the International Space Station's cabin atmosphere for specific, harmful airborne contaminants. Previous generations of the ENose have been described in prior NASA Tech Briefs issues. Sensor selection is critical in both (prefabrication) sensor material selection and (post-fabrication) data analysis of the ENose, which detects several analytes that are difficult to detect, or that are at very low concentration ranges. Existing sensor selection approaches usually include limited statistical measures, where selectivity is more important but reliability and sensitivity are not of concern. When reliability and sensitivity can be major limiting factors in detecting target compounds reliably, the existing approach is not able to provide meaningful selection that will actually improve data analysis results. The approach and software reported here consider more statistical measures (factors) than existing approaches for a similar purpose. The result is a more balanced and robust sensor selection from a less than ideal sensor array. The software offers quick, flexible, optimal sensor selection and weighting for a variety of purposes without a time-consuming, iterative search by performing sensor calibrations to a known linear or nonlinear model, evaluating the individual sensor's statistics, scoring the individual sensor's overall performance, finding the best sensor array size to maximize class separation, finding optimal weights for the remaining sensor array, estimating limits of detection for the target compounds, evaluating fingerprint distance between group pairs, and finding the best event-detecting sensors.
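    The exact statistical measures and weights used by the reported software are not spelled out in the abstract; the sketch below only illustrates the general pattern of normalizing several per-sensor measures and ranking sensors by a weighted overall score, with assumed criteria, weights, and values.

```python
# Illustrative multi-factor sensor scoring (criteria, weights, and values are assumptions,
# not the ENose calibration data): normalize each measure and rank by weighted score.
import numpy as np

criteria = ["sensitivity", "selectivity", "reliability"]     # assumed measures
weights = np.array([0.4, 0.4, 0.2])                          # assumed relative importance

# rows = sensors, columns = criteria (higher is better)
raw = np.array([[0.9, 0.2, 0.8],
                [0.5, 0.7, 0.9],
                [0.3, 0.9, 0.6]])

norm_scores = raw / raw.max(axis=0)                          # scale each criterion to [0, 1]
overall = norm_scores @ weights
for i in np.argsort(overall)[::-1]:
    print(f"sensor {i}: score = {overall[i]:.2f}")
```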

  7. A Tutorial for SPSS/PC+ Studentware. Study Guide for the Doctor of Arts in Computer-Based Learning.

    ERIC Educational Resources Information Center

    MacFarland, Thomas W.; Hou, Cheng-I

    The purpose of this tutorial is to provide the basic information needed for success with SPSS/PC+ Studentware, a student version of the statistical analysis software offered by SPSS, Inc., for the IBM PC+ and compatible computers. It is intended as a convenient summary of how to organize and conduct the most common computer-based statistical…

  8. Reservoir characterization using core, well log, and seismic data and intelligent software

    NASA Astrophysics Data System (ADS)

    Soto Becerra, Rodolfo

    We have developed intelligent software, Oilfield Intelligence (OI), as an engineering tool to improve the characterization of oil and gas reservoirs. OI integrates neural networks and multivariate statistical analysis. It is composed of five main subsystems: data input, preprocessing, architecture design, graphics design, and inference engine modules. More than 1,200 lines of code have been written as M-files in the MATLAB language. The degree of success of many oil and gas drilling, completion, and production activities depends upon the accuracy of the models used in a reservoir description. Neural networks have been applied to the identification of nonlinear systems in almost every scientific field, and solving reservoir characterization problems is no exception. Neural networks have a number of attractive features that help extract and recognize underlying patterns, structures, and relationships among data. However, before developing a neural network model, we must address the problem of dimensionality, that is, determining which variables are dominant and which are irrelevant. We can apply principal components and factor analysis to reduce the dimensionality and help the neural networks formulate more realistic models. We validated OI by obtaining reliable models in three different oil field problems: (1) a neural network in-situ stress model using lithology and gamma ray logs for the Travis Peak formation of east Texas; (2) a neural network permeability model using porosity and gamma ray logs, and a neural network pseudo-gamma-ray log model using 3D seismic attributes, for the reservoir VLE 196, Lamar field, located in Block V of south-central Lake Maracaibo (Venezuela); and (3) neural network primary ultimate oil recovery (PRUR), initial waterflooding ultimate oil recovery (IWUR), and infill drilling ultimate oil recovery (IDUR) models using reservoir parameters for the San Andres and Clearfork carbonate formations in west Texas. In all cases, we compared the results from the neural network models with those from statistical regression and non-parametric models. The results show that the integrated use of multivariate statistical analysis and neural networks in our intelligent software yields the highest cross-correlation coefficients between predicted and actual target variables and the lowest average absolute errors.
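
    The general workflow described here, dimensionality reduction followed by a neural network model, can be sketched as follows. This is a schematic Python analogue using scikit-learn and synthetic data, not the MATLAB-based Oilfield Intelligence code.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(42)
        X = rng.normal(size=(300, 12))                     # e.g. well-log and seismic attributes
        y = 2.0 * X[:, 0] - X[:, 3] + rng.normal(scale=0.1, size=300)   # e.g. a permeability proxy

        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=5),                           # keep the dominant components only
            MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
        )
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        print("cross-correlation (r):", round(float(np.corrcoef(pred, y_test)[0, 1]), 3))
        print("average absolute error:", round(float(np.abs(pred - y_test).mean()), 3))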

  9. National policies for technical change: Where are the increasing returns to economic research?

    PubMed Central

    Pavitt, Keith

    1996-01-01

    Improvements over the past 30 years in statistical data, analysis, and related theory have strengthened the basis for science and technology policy by confirming the importance of technical change in national economic performance. But two important features of scientific and technological activities in the Organization for Economic Cooperation and Development countries are still not addressed adequately in mainstream economics: (i) the justification of public funding for basic research and (ii) persistent international differences in investment in research and development and related activities. In addition, one major gap is now emerging in our systems of empirical measurement—the development of software technology, especially in the service sector. There are therefore dangers of diminishing returns to the usefulness of economic research, which continues to rely completely on established theory and established statistical sources. Alternative propositions that deserve serious consideration are: (i) the economic usefulness of basic research is in the provision of (mainly tacit) skills rather than codified and applicable information; (ii) in developing and exploiting technological opportunities, institutional competencies are just as important as the incentive structures that they face; and (iii) software technology developed in traditional service sectors may now be a more important locus of technical change than software technology developed in “high-tech” manufacturing. PMID:8917481

  10. Statistical Theory for the "RCT-YES" Software: Design-Based Causal Inference for RCTs. NCEE 2015-4011

    ERIC Educational Resources Information Center

    Schochet, Peter Z.

    2015-01-01

    This report presents the statistical theory underlying the "RCT-YES" software that estimates and reports impacts for RCTs for a wide range of designs used in social policy research. The report discusses a unified, non-parametric design-based approach for impact estimation using the building blocks of the Neyman-Rubin-Holland causal…

  11. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
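
    The following Python sketch illustrates the idea of training a logistic-regression trigger classifier on simulated navigation states; the features, thresholds, and deployment rule are invented for illustration and are not the EFT-1 values or NASA's ground-processing code.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(1)
        n = 5000
        true_altitude = rng.uniform(3000.0, 12000.0, n)               # metres (synthetic)
        vel_rel = rng.uniform(100.0, 500.0, n)                        # planet-relative velocity (synthetic)
        noisy_altitude = true_altitude + rng.normal(0.0, 800.0, n)    # degraded altitude knowledge

        # Ground-truth label: conditions suitable for drogue deployment (assumed rule)
        deploy_ok = ((true_altitude < 8000.0) & (vel_rel < 300.0)).astype(int)

        # Train on scaled, noisy measurements that would be available in flight
        X = np.column_stack([noisy_altitude / 1000.0, vel_rel / 100.0])
        clf = LogisticRegression().fit(X, deploy_ok)

        # The fitted coefficients could then be carried into flight software and
        # evaluated in real time as a dot product followed by a sigmoid.
        print("coefficients:", clf.coef_, "intercept:", clf.intercept_)
        print("training accuracy:", round(clf.score(X, deploy_ok), 3))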

  12. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  13. A survey of tools for the analysis of quantitative PCR (qPCR) data.

    PubMed

    Pabinger, Stephan; Rödiger, Stefan; Kriegner, Albert; Vierlinger, Klemens; Weinhäusel, Andreas

    2014-09-01

    Real-time quantitative polymerase chain reaction (qPCR) is a standard technique in most laboratories used for various applications in basic research. Analysis of qPCR data is a crucial part of the entire experiment, which has led to the development of a plethora of methods. The released tools either cover specific parts of the workflow or provide complete analysis solutions. Here, we surveyed 27 open-access software packages and tools for the analysis of qPCR data. The survey includes 8 Microsoft Windows, 5 web-based, 9 R-based and 5 tools from other platforms. Reviewed packages and tools support the analysis of different qPCR applications, such as RNA quantification, DNA methylation, genotyping, identification of copy number variations, and digital PCR. We report an overview of the functionality, features and specific requirements of the individual software tools, such as data exchange formats, availability of a graphical user interface, included procedures for graphical data presentation, and offered statistical methods. In addition, we provide an overview of quantification strategies, and report various applications of qPCR. Our comprehensive survey showed that most tools use their own file format and only a fraction of the currently existing tools support the standardized data exchange format RDML. To allow a more streamlined and comparable analysis of qPCR data, more vendors and tools need to adopt the standardized format to encourage the exchange of data between instrument software, analysis tools, and researchers.

  14. Artificial Intelligence in Mitral Valve Analysis

    PubMed Central

    Jeganathan, Jelliffe; Knio, Ziyad; Amador, Yannis; Hai, Ting; Khamooshian, Arash; Matyal, Robina; Khabbaz, Kamal R; Mahmood, Feroze

    2017-01-01

    Background: Echocardiographic analysis of mitral valve (MV) has become essential for diagnosis and management of patients with MV disease. Currently, the various software used for MV analysis require manual input and are prone to interobserver variability in the measurements. Aim: The aim of this study is to determine the interobserver variability in an automated software that uses artificial intelligence for MV analysis. Settings and Design: Retrospective analysis of intraoperative three-dimensional transesophageal echocardiography data acquired from four patients with normal MV undergoing coronary artery bypass graft surgery in a tertiary hospital. Materials and Methods: Echocardiographic data were analyzed using the eSie Valve Software (Siemens Healthcare, Mountain View, CA, USA). Three examiners analyzed three end-systolic (ES) frames from each of the four patients. A total of 36 ES frames were analyzed and included in the study. Statistical Analysis: A multiple mixed-effects ANOVA model was constructed to determine if the examiner, the patient, and the loop had a significant effect on the average value of each parameter. A Bonferroni correction was used to correct for multiple comparisons, and P = 0.0083 was considered to be significant. Results: Examiners did not have an effect on any of the six parameters tested. Patient and loop had an effect on the average parameter value for each of the six parameters as expected (P < 0.0083 for both). Conclusion: We were able to conclude that using automated analysis, it is possible to obtain results with good reproducibility, which only requires minimal user intervention. PMID:28393769
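
    A simplified sketch of this kind of analysis is shown below; it uses an ordinary fixed-effects ANOVA in Python (statsmodels) rather than the study's full mixed-effects model, with synthetic data and invented column names, to show how the examiner/patient/loop effects and the Bonferroni-corrected threshold of 0.05/6 = 0.0083 could be assessed.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf
        from statsmodels.stats.anova import anova_lm

        rng = np.random.default_rng(7)
        rows = []
        for patient in range(4):
            for loop in range(3):
                base = 25.0 + 2.0 * patient + 0.5 * loop          # patient and loop effects
                for examiner in range(3):
                    rows.append({"examiner": f"E{examiner}",
                                 "patient": f"P{patient}",
                                 "loop": f"L{loop}",
                                 "annulus_area": base + rng.normal(0.0, 0.3)})
        df = pd.DataFrame(rows)

        model = smf.ols("annulus_area ~ C(examiner) + C(patient) + C(loop)", data=df).fit()
        table = anova_lm(model, typ=2)
        alpha = 0.05 / 6                    # Bonferroni correction over six MV parameters
        print(table)
        print("significant at corrected level:", list(table.index[table["PR(>F)"] < alpha]))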

  15. Using R to implement spatial analysis in open source environment

    NASA Astrophysics Data System (ADS)

    Shao, Yixi; Chen, Dong; Zhao, Bo

    2007-06-01

    R is an open source (GPL) language and environment for spatial analysis, statistical computing and graphics which provides a wide variety of statistical and graphical techniques and is highly extensible. In the Open Source environment it plays an important role in spatial analysis. Implementing spatial analysis in the Open Source environment, which we call Open Source geocomputation, means using the R data analysis language integrated with GRASS GIS and MySQL or PostgreSQL. This paper explains the architecture of the Open Source GIS environment and emphasizes the role R plays in spatial analysis. Furthermore, an illustration of the functions of R is given through the project of constructing CZPGIS (Cheng Zhou Population GIS), supported by the Changzhou Government, China. In this project we use R to implement geostatistics in the Open Source GIS environment to evaluate the spatial correlation of land price and estimate it by kriging interpolation. We also use R integrated with MapServer and PHP to show how R and other Open Source software cooperate with each other in a WebGIS environment, which demonstrates the advantages of using R for spatial analysis in an Open Source GIS environment. Finally, we point out that the packages for spatial analysis in R are still scattered and that limited memory remains a bottleneck when a large number of clients connect at the same time. Further work is therefore to organize the extensive packages or design normative packages, and to make R cooperate better with commercial software such as ArcIMS. We also look forward to developing packages for land price evaluation.
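
    As a loose illustration of the kriging step (the project itself uses R with GRASS GIS; here Gaussian-process regression from scikit-learn stands in for kriging interpolation, and the land-price data are synthetic):

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(3)
        coords = rng.uniform(0.0, 10.0, size=(80, 2))                 # sampled locations (km)
        price = (np.sin(coords[:, 0] / 2.0) + 0.1 * coords[:, 1]      # spatially correlated signal
                 + rng.normal(0.0, 0.05, 80))

        kernel = 1.0 * RBF(length_scale=2.0) + WhiteKernel(noise_level=0.01)
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(coords, price)

        # Predict on a regular grid, with kriging-style standard errors
        gx, gy = np.meshgrid(np.linspace(0, 10, 25), np.linspace(0, 10, 25))
        grid = np.column_stack([gx.ravel(), gy.ravel()])
        mean, std = gp.predict(grid, return_std=True)
        print("predicted price range:", round(float(mean.min()), 2), "to", round(float(mean.max()), 2))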

  16. Evaluation of computer-aided diagnosis (CAD) software for the detection of lung nodules on multidetector row computed tomography (MDCT): JAFROC study for the improvement in radiologists' diagnostic accuracy.

    PubMed

    Hirose, Tomohiro; Nitta, Norihisa; Shiraishi, Junji; Nagatani, Yukihiro; Takahashi, Masashi; Murata, Kiyoshi

    2008-12-01

    The aim of this study was to evaluate the usefulness of computer-aided diagnosis (CAD) software for the detection of lung nodules on multidetector-row computed tomography (MDCT) in terms of improvement in radiologists' diagnostic accuracy in detecting lung nodules, using jackknife free-response receiver-operating characteristic (JAFROC) analysis. Twenty-one patients (6 without and 15 with lung nodules) were selected randomly from 120 consecutive thoracic computed tomographic examinations. The gold standard for the presence or absence of nodules in the observer study was determined by consensus of two radiologists. Six expert radiologists participated in a free-response receiver operating characteristic study for the detection of lung nodules on MDCT, in which cases were interpreted first without and then with the output of CAD software. Radiologists were asked to indicate the locations of lung nodule candidates on the monitor with their confidence ratings for the presence of lung nodules. The performance of the CAD software indicated that the sensitivity in detecting lung nodules was 71.4%, with 0.95 false-positive results per case. When radiologists used the CAD software, the average sensitivity improved from 39.5% to 81.0%, with an increase in the average number of false-positive results from 0.14 to 0.89 per case. The average figure-of-merit values for the six radiologists were 0.390 without and 0.845 with the output of the CAD software, and there was a statistically significant difference (P < .0001) using the JAFROC analysis. The CAD software for the detection of lung nodules on MDCT has the potential to assist radiologists by increasing their accuracy.
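
    Tallying the per-case quantities reported above (sensitivity and false positives per case) from marked nodule locations can be sketched as follows; the distance-based matching rule is an assumption, and this is not the JAFROC software.

        import numpy as np

        def score_case(true_nodules, marks, match_radius=15.0):
            """Return (true positives, false positives) for one case, matching each
            mark to the nearest unclaimed true nodule within match_radius pixels."""
            true_nodules = np.asarray(true_nodules, dtype=float).reshape(-1, 2)
            marks = np.asarray(marks, dtype=float).reshape(-1, 2)
            hit = np.zeros(len(true_nodules), dtype=bool)
            false_pos = 0
            for m in marks:
                if len(true_nodules):
                    d = np.linalg.norm(true_nodules - m, axis=1)
                    j = int(np.argmin(d))
                    if d[j] <= match_radius and not hit[j]:
                        hit[j] = True
                        continue
                false_pos += 1
            return int(hit.sum()), false_pos

        cases = [{"true": [(100, 120), (300, 310)], "marks": [(102, 118), (250, 400)]},
                 {"true": [],                       "marks": [(50, 60)]}]
        tp = fp = n_nodules = 0
        for c in cases:
            t, f = score_case(c["true"], c["marks"])
            tp, fp, n_nodules = tp + t, fp + f, n_nodules + len(c["true"])
        print("sensitivity:", tp / n_nodules, "  false positives per case:", fp / len(cases))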

  17. Integration of modern statistical tools for the analysis of climate extremes into the web-GIS “CLIMATE”

    NASA Astrophysics Data System (ADS)

    Ryazanova, A. A.; Okladnikov, I. G.; Gordov, E. P.

    2017-11-01

    The frequency of occurrence and magnitude of extreme precipitation and temperature events show positive trends in several geographical regions. These events must be analyzed and studied in order to better understand their impact on the environment, predict their occurrence, and mitigate their effects. For this purpose, we augmented the web-GIS “CLIMATE” with a dedicated statistical package developed in the R language. The web-GIS “CLIMATE” is a software platform for cloud storage, processing, and visualization of distributed archives of spatial datasets. It is based on a combined use of web and GIS technologies with reliable procedures for searching, extracting, processing, and visualizing spatial data archives. The system provides a set of thematic online tools for the complex analysis of current and future climate changes and their effects on the environment. The package includes powerful new methods for the time-dependent statistics of extremes, quantile regression, and a copula approach for the detailed analysis of various climate extreme events. In particular, the very promising copula approach allows one to obtain the structural connections between the extremes and various environmental characteristics. The new statistical methods integrated into the web-GIS “CLIMATE” can significantly facilitate and accelerate the complex analysis of climate extremes using only a desktop PC connected to the Internet.
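
    One of the methods named above, quantile regression, can be illustrated with a short Python sketch (statsmodels) on a synthetic series of annual precipitation maxima; this only demonstrates the statistic, not the R package integrated into the web-GIS “CLIMATE”.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(11)
        years = np.arange(1950, 2021)
        # Synthetic annual-maximum precipitation with a slowly increasing upper tail
        precip_max = 40.0 + 0.05 * (years - 1950) + rng.gamma(shape=2.0, scale=5.0, size=years.size)
        df = pd.DataFrame({"year": years, "precip_max": precip_max})

        # Trend in the 95th percentile of extremes, compared with the median trend
        q95 = smf.quantreg("precip_max ~ year", df).fit(q=0.95)
        q50 = smf.quantreg("precip_max ~ year", df).fit(q=0.50)
        print("95th-percentile trend (mm/yr):", round(q95.params["year"], 3))
        print("median trend (mm/yr):", round(q50.params["year"], 3))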

  18. The SRS-Viewer: A Software Tool for Displaying and Evaluation of Pyroshock Data

    NASA Astrophysics Data System (ADS)

    Eberl, Stefan

    2014-06-01

    For the evaluation of the success of a pyroshock test, the time domain and the corresponding Shock-Response-Spectra (SRS) have to be considered. The SRS-Viewer is an IABG-developed software tool [1] that reads data in Universal File format (*.unv) and either displays or plots, for each accelerometer, the time domain, the corresponding SRS, and the specified Reference-SRS with tolerances in the background. The software calculates the "Average (AVG)", "Maximum (MAX)" and "Minimum (MIN)" SRS of any selection of accelerometers. A statistical analysis calculates the percentages of measured SRS above the specified Reference-SRS level and the percentage within the tolerance bands for comparison with the specified success criteria. Overlay plots of single accelerometers from different test runs make it possible to monitor the repeatability of the shock input and the integrity of the specimen. Furthermore, the difference between the shock on a mass dummy and the real test unit can be examined.
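
    A minimal sketch of the reported statistics, assuming the SRS of each accelerometer has already been computed on a common frequency grid; the tolerance band and all numbers below are assumptions for illustration:

        import numpy as np

        freqs = np.logspace(2, 4, 50)                                   # 100 Hz .. 10 kHz
        srs = np.abs(np.random.default_rng(5).normal(1.0, 0.2, (6, freqs.size))) * 800.0
        reference = np.full(freqs.size, 700.0)                          # specified Reference-SRS (g)
        tol_low, tol_high = reference * 0.5, reference * 2.0            # assumed -6 dB / +6 dB band

        srs_avg, srs_max, srs_min = srs.mean(axis=0), srs.max(axis=0), srs.min(axis=0)

        pct_above_ref = 100.0 * np.mean(srs >= reference)
        pct_in_band = 100.0 * np.mean((srs >= tol_low) & (srs <= tol_high))
        print(f"{pct_above_ref:.1f}% of measured SRS points are above the reference")
        print(f"{pct_in_band:.1f}% are within the tolerance band")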

  19. Interactive graphics for the Macintosh: software review of FlexiGraphs.

    PubMed

    Antonak, R F

    1990-01-01

    While this product is clearly unique, its usefulness to individuals outside small business environments is somewhat limited. FlexiGraphs is, however, a reasonable first attempt to design a microcomputer software package that controls data through interactive editing within a graph. Although the graphics capabilities of mainframe programs such as MINITAB (Ryan, Joiner, & Ryan, 1981) and the graphic manipulations available through exploratory data analysis (e.g., Velleman & Hoaglin, 1981) will not be surpassed anytime soon by this program, a researcher may want to add it to a software library containing other Macintosh statistics, drawing, and graphics programs, if only for its easy-to-use curve-fitting and line-smoothing options. I welcome the opportunity to review the enhanced "scientific" version of FlexiGraphs that the author of the program indicates is currently under development. An MS-DOS version of the program should be available within the year.

  20. GLOBECOM '84 - Global Telecommunications Conference, Atlanta, GA, November 26-29, 1984, Conference Record. Volume 1

    NASA Astrophysics Data System (ADS)

    The subjects discussed are related to LSI/VLSI based subscriber transmission and customer access for the Integrated Services Digital Network (ISDN), special applications of fiber optics, ISDN and competitive telecommunication services, technical preparations for the Geostationary-Satellite Orbit Conference, high-capacity statistical switching fabrics, networking and distributed systems software, adaptive arrays and cancelers, synchronization and tracking, speech processing, advances in communication terminals, full-color videotex, and a performance analysis of protocols. Advances in data communications are considered along with transmission network plans and progress, direct broadcast satellite systems, packet radio system aspects, radio-new and developing technologies and applications, the management of software quality, and Open Systems Interconnection (OSI) aspects of telematic services. Attention is given to personal computers and OSI, the role of software reliability measurement in information systems, and an active array antenna for the next-generation direct broadcast satellite.

  1. Development of online NIR urine analyzing system based on AOTF

    NASA Astrophysics Data System (ADS)

    Wan, Feng; Sun, Zhendong; Li, Xiaoxia

    2006-09-01

    In this paper, some key techniques in the development of an on-line NIR urine analyzing system based on AOTF (Acousto-Optic Tunable Filter) are introduced. Problems in designing the optical system, including collimation of the incident light and the working distance (the shortest distance for separating the incident and diffracted light), are analyzed and researched. A microprocessor-controlled DDS (Direct Digital Synthesizer) is used to realize the wavelength scan. The experimental results show that this NIR urine analyzing system based on AOTF covers the 10,000-4,000 cm⁻¹ wavenumber range with a 0.3 ms wavelength switching time. Compared with the conventional Fourier-transform NIR spectrophotometer for analyzing multiple components in urine, this system features low cost, small volume, and an on-line measurement function. The Unscrambler software (multivariate statistical software by CAMO Inc., Norway) is selected for processing the data. This system can realize on-line quantitative analysis of protein, urea, and creatinine in urine.
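
    The multivariate calibration step is performed with the commercial Unscrambler package; as a rough open-source analogue, a partial least-squares (PLS) calibration of synthetic NIR spectra might look like the following sketch (all spectra and concentrations below are simulated):

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(8)
        wavenumbers = np.linspace(4000, 10000, 200)
        n_samples = 120
        conc = rng.uniform(0.1, 1.0, size=(n_samples, 3))          # protein, urea, creatinine

        # Synthetic spectra as linear mixtures of three Gaussian component spectra plus noise
        centres = [5000.0, 6500.0, 8200.0]
        components = np.array([np.exp(-((wavenumbers - c) / 300.0) ** 2) for c in centres])
        spectra = conc @ components + rng.normal(0.0, 0.01, size=(n_samples, wavenumbers.size))

        X_train, X_test, y_train, y_test = train_test_split(spectra, conc, random_state=0)
        pls = PLSRegression(n_components=5).fit(X_train, y_train)
        print("R^2 on held-out samples:", round(pls.score(X_test, y_test), 3))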

  2. A Business Analytics Software Tool for Monitoring and Predicting Radiology Throughput Performance.

    PubMed

    Jones, Stephen; Cournane, Seán; Sheehy, Niall; Hederman, Lucy

    2016-12-01

    Business analytics (BA) is increasingly being utilised by radiology departments to analyse and present data. It encompasses statistical analysis, forecasting and predictive modelling and is used as an umbrella term for decision support and business intelligence systems. The primary aim of this study was to determine whether utilising BA technologies could contribute towards improved decision support and resource management within radiology departments. A set of information technology requirements were identified with key stakeholders, and a prototype BA software tool was designed, developed and implemented. A qualitative evaluation of the tool was carried out through a series of semi-structured interviews with key stakeholders. Feedback was collated, and emergent themes were identified. The results indicated that BA software applications can provide visibility of radiology performance data across all time horizons. The study demonstrated that the tool could potentially assist with improving operational efficiencies and management of radiology resources.

  3. Database for the collection and analysis of clinical data and images of neoplasms of the sinonasal tract.

    PubMed

    Trimarchi, Matteo; Lund, Valerie J; Nicolai, Piero; Pini, Massimiliano; Senna, Massimo; Howard, David J

    2004-04-01

    The Neoplasms of the Sinonasal Tract software package (NSNT v 1.0) implements a complete visual database for patients with sinonasal neoplasia, facilitating standardization of data and statistical analysis. The software, which is compatible with the Macintosh and Windows platforms, provides a multiuser application with a dedicated server (on Windows NT or 2000, or Macintosh OS 9 or X, and a network of clients) together with web access, if required. The system hardware consists of an Apple Power Macintosh G4 500 MHz computer with PCI bus, 256 Mb of RAM plus a 60 Gb hard disk, or any IBM-compatible computer with a Pentium 2 processor. Image acquisition may be performed with different frame-grabber cards for analog or digital video input of different standards (PAL, SECAM, or NTSC) and levels of quality (VHS, S-VHS, Betacam, Mini DV, DV). The visual database is based on 4th Dimension by 4D Inc, and video compression is made in real-time MPEG format. Six sections have been developed: demographics, symptoms, extent of disease, radiology, treatment, and follow-up. Acquisition of data includes computed tomography and magnetic resonance imaging, histology, and endoscopy images, allowing sequential comparison. Statistical analysis integral to the program provides Kaplan-Meier survival curves. The development of a dedicated, user-friendly database for sinonasal neoplasia facilitates a multicenter network and has obvious clinical and research benefits.

  4. MEG and EEG data analysis with MNE-Python.

    PubMed

    Gramfort, Alexandre; Luessi, Martin; Larson, Eric; Engemann, Denis A; Strohmeier, Daniel; Brodbeck, Christian; Goj, Roman; Jas, Mainak; Brooks, Teon; Parkkonen, Lauri; Hämäläinen, Matti

    2013-12-26

    Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals generated by neuronal activity in the brain. Using these signals to characterize and locate neural activation in the brain is a challenge that requires expertise in physics, signal processing, statistics, and numerical methods. As part of the MNE software suite, MNE-Python is an open-source software package that addresses this challenge by providing state-of-the-art algorithms implemented in Python that cover multiple methods of data preprocessing, source localization, statistical analysis, and estimation of functional connectivity between distributed brain regions. All algorithms and utility functions are implemented in a consistent manner with well-documented interfaces, enabling users to create M/EEG data analysis pipelines by writing Python scripts. Moreover, MNE-Python is tightly integrated with the core Python libraries for scientific computation (NumPy, SciPy) and visualization (matplotlib and Mayavi), as well as the greater neuroimaging ecosystem in Python via the Nibabel package. The code is provided under the new BSD license allowing code reuse, even in commercial products. Although MNE-Python has only been under heavy development for a couple of years, it has rapidly evolved with expanded analysis capabilities and pedagogical tutorials because multiple labs have collaborated during code development to help share best practices. MNE-Python also gives easy access to preprocessed datasets, helping users to get started quickly and facilitating reproducibility of methods by other researchers. Full documentation, including dozens of examples, is available at http://martinos.org/mne.
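
    A short pipeline of the kind described, written against MNE-Python's bundled sample dataset, is sketched below; the filter band, event selection, and epoch window are illustrative choices rather than recommendations (the first call downloads the sample data).

        import os
        import mne

        # Load the bundled sample recording (downloaded on first use)
        data_path = mne.datasets.sample.data_path()
        raw_fname = os.path.join(str(data_path), "MEG", "sample", "sample_audvis_raw.fif")
        raw = mne.io.read_raw_fif(raw_fname, preload=True)

        # Preprocess: band-pass filter, then epoch around auditory events
        raw.filter(l_freq=1.0, h_freq=40.0)
        events = mne.find_events(raw, stim_channel="STI 014")
        epochs = mne.Epochs(raw, events, event_id={"auditory/left": 1},
                            tmin=-0.2, tmax=0.5, baseline=(None, 0), preload=True)

        # Average the epochs to obtain the evoked response
        evoked = epochs.average()
        print(evoked)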

  5. Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons

    PubMed Central

    Lagkouvardos, Ilias; Fischer, Sandra; Kumar, Neeraj; Clavel, Thomas

    2017-01-01

    The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments coupled with the steep reduction in sequencing costs led to a surge of microbial sequencing projects. The expanding crowd of scientists and clinicians wanting to make use of sequencing datasets can choose among a range of multipurpose software platforms, the use of which can be intimidating for non-expert users. Among available pipeline options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and increased flexibility, and the possibility to adhere to most recent best practices and to adjust to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs) tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state-of-the-art in microbial profiles analysis in the clear and comprehensive way allowed by the R language. Rhea scripts and documentation are freely available at https://lagkouvardos.github.io/Rhea. PMID:28097056

  6. Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons.

    PubMed

    Lagkouvardos, Ilias; Fischer, Sandra; Kumar, Neeraj; Clavel, Thomas

    2017-01-01

    The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments coupled with the steep reduction in sequencing costs led to a surge of microbial sequencing projects. The expanding crowd of scientists and clinicians wanting to make use of sequencing datasets can choose among a range of multipurpose software platforms, the use of which can be intimidating for non-expert users. Among available pipeline options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and increased flexibility, and the possibility to adhere to most recent best practices and to adjust to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs) tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state-of-the-art in microbial profiles analysis in the clear and comprehensive way allowed by the R language. Rhea scripts and documentation are freely available at https://lagkouvardos.github.io/Rhea.

  7. UFO (UnFold Operator) user guide

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kissel, L.; Biggs, F.; Marking, T.R.

    UFO is a collection of interactive utility programs for estimating unknown functions of one variable using a wide-ranging class of information as input, for miscellaneous data-analysis applications, for performing feasibility studies, and for supplementing our other software. Inverse problems, which include spectral unfolds, inverse heat-transfer problems, time-domain deconvolution, and unusual or difficult curve-fit problems, are classes of applications for which UFO is well suited. Extensive use of B-splines and (X,Y)-datasets is made to represent functions. The (X,Y)-dataset representation is unique in that it is not restricted to equally-spaced data. This feature is used, for example, in a table-generating algorithm that evaluates a function to a user-specified interpolation accuracy while minimizing the number of points stored in the corresponding dataset. UFO offers a variety of miscellaneous data-analysis options such as plotting, comparing, transforming, scaling, integrating; and adding, subtracting, multiplying, and dividing functions together. These options are often needed as intermediate steps in analyzing and solving difficult inverse problems, but they also find frequent use in other applications. Statistical options are available to calculate goodness-of-fit to measurements, specify error bands on solutions, give confidence limits on calculated quantities, and to point out the statistical consequences of operations such as smoothing. UFO is designed to do feasibility studies on a variety of engineering measurements. It is also tailored to supplement our Test Analysis and Design codes, SRAD Test-Data Archive software, and Digital Signal Analysis routines.
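
    The B-spline representation of noisy, unequally spaced (X,Y) data that UFO relies on can be approximated with SciPy's smoothing splines; the following sketch is a generic stand-in, not the UFO code, and the smoothing factor is an arbitrary choice.

        import numpy as np
        from scipy.interpolate import splev, splrep

        rng = np.random.default_rng(2)
        x = np.sort(rng.uniform(0.0, 10.0, 40))            # unequally spaced abscissas
        y = np.sin(x) + rng.normal(0.0, 0.1, x.size)       # noisy measurements

        tck = splrep(x, y, s=0.5)                          # smoothing B-spline representation
        x_fine = np.linspace(0.0, 10.0, 200)
        y_fit = splev(x_fine, tck)                         # evaluate the spline anywhere

        # Simple goodness-of-fit measure against the original measurements
        residuals = y - splev(x, tck)
        print("RMS residual:", round(float(np.sqrt(np.mean(residuals ** 2))), 3))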

  8. MEG and EEG data analysis with MNE-Python

    PubMed Central

    Gramfort, Alexandre; Luessi, Martin; Larson, Eric; Engemann, Denis A.; Strohmeier, Daniel; Brodbeck, Christian; Goj, Roman; Jas, Mainak; Brooks, Teon; Parkkonen, Lauri; Hämäläinen, Matti

    2013-01-01

    Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals generated by neuronal activity in the brain. Using these signals to characterize and locate neural activation in the brain is a challenge that requires expertise in physics, signal processing, statistics, and numerical methods. As part of the MNE software suite, MNE-Python is an open-source software package that addresses this challenge by providing state-of-the-art algorithms implemented in Python that cover multiple methods of data preprocessing, source localization, statistical analysis, and estimation of functional connectivity between distributed brain regions. All algorithms and utility functions are implemented in a consistent manner with well-documented interfaces, enabling users to create M/EEG data analysis pipelines by writing Python scripts. Moreover, MNE-Python is tightly integrated with the core Python libraries for scientific computation (NumPy, SciPy) and visualization (matplotlib and Mayavi), as well as the greater neuroimaging ecosystem in Python via the Nibabel package. The code is provided under the new BSD license allowing code reuse, even in commercial products. Although MNE-Python has only been under heavy development for a couple of years, it has rapidly evolved with expanded analysis capabilities and pedagogical tutorials because multiple labs have collaborated during code development to help share best practices. MNE-Python also gives easy access to preprocessed datasets, helping users to get started quickly and facilitating reproducibility of methods by other researchers. Full documentation, including dozens of examples, is available at http://martinos.org/mne. PMID:24431986

  9. dartr: An r package to facilitate analysis of SNP data generated from reduced representation genome sequencing.

    PubMed

    Gruber, Bernd; Unmack, Peter J; Berry, Oliver F; Georges, Arthur

    2018-05-01

    Although vast technological advances have been made and genetic software packages are growing in number, it is not a trivial task to analyse SNP data. We announce a new r package, dartr, enabling the analysis of single nucleotide polymorphism data for population genomic and phylogenomic applications. dartr provides user-friendly functions for data quality control and marker selection, and permits rigorous evaluations of conformation to Hardy-Weinberg equilibrium, gametic-phase disequilibrium and neutrality. The package reports standard descriptive statistics, permits exploration of patterns in the data through principal components analysis and conducts standard F-statistics, as well as basic phylogenetic analyses, population assignment, isolation by distance and exports data to a variety of commonly used downstream applications (e.g., newhybrids, faststructure and phylogeny applications) outside of the r environment. The package serves two main purposes: first, a user-friendly approach to lower the hurdle to analyse such data; therefore, the package comes with a detailed tutorial targeted to the r beginner to allow data analysis without requiring deep knowledge of r. Second, we use a single, well-established format (genlight from the adegenet package) as input for all our functions to avoid data reformatting. By strictly using the genlight format, we hope to facilitate this format as the de facto standard of future software developments and hence reduce the format jungle of genetic data sets. The dartr package is available via the r CRAN network and GitHub. © 2017 John Wiley & Sons Ltd.

  10. Automated analysis of calcium spiking profiles with CaSA software: two case studies from root-microbe symbioses.

    PubMed

    Russo, Giulia; Spinella, Salvatore; Sciacca, Eva; Bonfante, Paola; Genre, Andrea

    2013-12-26

    Repeated oscillations in intracellular calcium (Ca2+) concentration, known as Ca2+ spiking signals, have been described in plants for a limited number of cellular responses to biotic or abiotic stimuli, most notably the common symbiotic signaling pathway (CSSP), which mediates the recognition by their plant hosts of two endosymbiotic microbes, arbuscular mycorrhizal (AM) fungi and nitrogen-fixing rhizobia. The detailed analysis of the complexity and variability of the Ca2+ spiking patterns revealed in recent studies requires both extensive datasets and sophisticated statistical tools. As a contribution, we have developed automated Ca2+ spiking analysis (CaSA) software that performs (i) automated peak detection, (ii) statistical analyses based on the detected peaks, and (iii) autocorrelation analysis of peak-to-peak intervals to highlight major traits in the spiking pattern. We have evaluated CaSA in two experimental studies. In the first, CaSA highlighted unpredicted differences in the spiking patterns induced in Medicago truncatula root epidermal cells by exudates of the AM fungus Gigaspora margarita as a function of the phosphate concentration in the growth medium of both host and fungus. In the second study we compared the spiking patterns triggered by either AM fungal or rhizobial symbiotic signals. CaSA revealed the existence of different patterns in signal periodicity, which are thought to contribute to the so-called Ca2+ signature. We therefore propose CaSA as a useful tool for characterizing oscillatory biological phenomena such as Ca2+ spiking.
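
    The three steps listed above (peak detection, peak statistics, and autocorrelation of peak-to-peak intervals) can be mimicked on a synthetic Ca2+ trace with standard SciPy tools; this sketch is not the CaSA code, and the detection thresholds are arbitrary.

        import numpy as np
        from scipy.signal import find_peaks

        rng = np.random.default_rng(4)
        t = np.arange(0.0, 600.0, 0.5)                          # seconds
        trace = np.zeros_like(t)
        for ts in np.cumsum(rng.normal(60.0, 10.0, 9)):         # irregular spike times
            trace += np.exp(-((t - ts) ** 2) / 8.0)             # spike-shaped transients
        trace += rng.normal(0.0, 0.05, t.size)                  # measurement noise

        # i) automated peak detection
        peaks, _ = find_peaks(trace, height=0.5, distance=int(20 / 0.5))
        peak_times = t[peaks]

        # ii) statistics based on the detected peaks
        intervals = np.diff(peak_times)
        print("peaks:", len(peaks),
              " mean interval (s):", round(float(intervals.mean()), 1),
              " CV:", round(float(intervals.std() / intervals.mean()), 2))

        # iii) autocorrelation of the peak-to-peak intervals
        d = intervals - intervals.mean()
        acf = np.correlate(d, d, mode="full")[d.size - 1:]
        acf /= acf[0]
        print("lag-1 autocorrelation of intervals:", round(float(acf[1]), 2))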

  11. Analyzing and Predicting Effort Associated with Finding and Fixing Software Faults

    NASA Technical Reports Server (NTRS)

    Hamill, Maggie; Goseva-Popstojanova, Katerina

    2016-01-01

    Context: Software developers spend a significant amount of time fixing faults. However, not many papers have addressed the actual effort needed to fix software faults. Objective: The objective of this paper is twofold: (1) analysis of the effort needed to fix software faults and how it was affected by several factors and (2) prediction of the level of fix implementation effort based on the information provided in software change requests. Method: The work is based on data related to 1200 failures, extracted from the change tracking system of a large NASA mission. The analysis includes descriptive and inferential statistics. Predictions are made using three supervised machine learning algorithms and three sampling techniques aimed at addressing the imbalanced data problem. Results: Our results show that (1) 83% of the total fix implementation effort was associated with only 20% of failures. (2) Both safety critical failures and post-release failures required three times more effort to fix compared to non-critical and pre-release counterparts, respectively. (3) Failures with fixes spread across multiple components or across multiple types of software artifacts required more effort. The spread across artifacts was more costly than spread across components. (4) Surprisingly, some types of faults associated with later life-cycle activities did not require significant effort. (5) The level of fix implementation effort was predicted with 73% overall accuracy using the original, imbalanced data. Using oversampling techniques improved the overall accuracy up to 77%. More importantly, oversampling significantly improved the prediction of the high level effort, from 31% to around 85%. Conclusions: This paper shows the importance of tying software failures to changes made to fix all associated faults, in one or more software components and/or in one or more software artifacts, and the benefit of studying how the spread of faults and other factors affect the fix implementation effort.
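
    The prediction setup described above (a supervised classifier for effort level, with oversampling to counter class imbalance) can be sketched as follows; the features and data are synthetic stand-ins for change-request fields, and the classifier choice is an assumption.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score, recall_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(9)
        n = 1200
        X = np.column_stack([
            rng.integers(1, 6, n),     # e.g. number of components changed
            rng.integers(1, 4, n),     # e.g. number of artifact types changed
            rng.integers(0, 2, n),     # e.g. safety-critical flag
        ])
        y = (X[:, 0] + X[:, 1] + 2 * X[:, 2] + rng.normal(0, 1, n) > 7).astype(int)  # "high effort"

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        # Random oversampling of the minority (high-effort) class, training set only
        minority = np.where(y_tr == 1)[0]
        n_extra = max(int((y_tr == 0).sum()) - minority.size, 0)
        extra = rng.choice(minority, size=n_extra, replace=True)
        X_bal = np.vstack([X_tr, X_tr[extra]])
        y_bal = np.concatenate([y_tr, y_tr[extra]])

        clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
        pred = clf.predict(X_te)
        print("overall accuracy:", round(accuracy_score(y_te, pred), 2))
        print("recall on the high-effort class:", round(recall_score(y_te, pred), 2))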

  12. Multivariate Statistical Analysis Software Technologies for Astrophysical Research Involving Large Data Bases

    NASA Technical Reports Server (NTRS)

    Djorgovski, S. G.

    1994-01-01

    We developed a package to process and analyze the data from the digital version of the Second Palomar Sky Survey. This system, called SKICAT, incorporates the latest in machine learning and expert systems software technology, in order to classify the detected objects objectively and uniformly, and facilitate handling of the enormous data sets from digital sky surveys and other sources. The system provides a powerful, integrated environment for the manipulation and scientific investigation of catalogs from virtually any source. It serves three principal functions: image catalog construction, catalog management, and catalog analysis. Through use of the GID3* Decision Tree artificial induction software, SKICAT automates the process of classifying objects within CCD and digitized plate images. To exploit these catalogs, the system also provides tools to merge them into a large, complex database which may be easily queried and modified when new data or better methods of calibrating or classifying become available. The most innovative feature of SKICAT is the facility it provides to experiment with and apply the latest in machine learning technology to the tasks of catalog construction and analysis. SKICAT provides a unique environment for implementing these tools for any number of future scientific purposes. Initial scientific verification and performance tests have been made using galaxy counts and measurements of galaxy clustering from small subsets of the survey data, and a search for very high redshift quasars. All of the tests were successful and produced new and interesting scientific results. Attachments to this report give detailed accounts of the technical aspects of the SKICAT system, and of some of the scientific results achieved to date. We also developed a user-friendly package for multivariate statistical analysis of small and moderate-size data sets, called STATPROG. The package was tested extensively on a number of real scientific applications and has produced real, published results.
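
    SKICAT itself used the GID3* decision-tree induction system; as a generic modern stand-in, a decision tree classifying catalog objects (for example, star versus galaxy) from measured image attributes might look like the sketch below, with synthetic attributes and an invented labeling rule.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(6)
        n = 2000
        magnitude = rng.uniform(14.0, 22.0, n)
        ellipticity = rng.uniform(0.0, 0.8, n)
        fwhm = rng.normal(2.0, 0.3, n) + 1.5 * (ellipticity > 0.3)   # extended sources are broader
        is_galaxy = ((ellipticity > 0.3) | (fwhm > 2.6)).astype(int)

        X = np.column_stack([magnitude, ellipticity, fwhm])
        tree = DecisionTreeClassifier(max_depth=4, random_state=0)
        scores = cross_val_score(tree, X, is_galaxy, cv=5)
        print("cross-validated accuracy:", round(float(scores.mean()), 3))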

  13. Skin antiseptics in venous puncture site disinfection for preventing blood culture contamination: A Bayesian network meta-analysis of randomized controlled trials.

    PubMed

    Liu, Wenjie; Duan, Yuchen; Cui, Wenyao; Li, Li; Wang, Xia; Dai, Heling; You, Chao; Chen, Maojun

    2016-07-01

    To compare the efficacy of several antiseptics in decreasing the blood culture contamination rate. Network meta-analysis. Electronic searches of PubMed and Embase were conducted up to November 2015. Only randomized controlled trials or quasi-randomized controlled trials were eligible. We applied no language restriction. A comprehensive review of articles in the reference lists was also carried out to identify possible relevant studies. Relevant studies evaluating the efficacy of different antiseptics at the venous puncture site for decreasing the blood culture contamination rate were included. The data were extracted from the included randomized controlled trials by two authors independently. The risk of bias was evaluated using the Detsky scale by two authors independently. We used WinBUGS 1.4.3 software and the statistical model described by Chaimani to perform this network meta-analysis. Graphs of the WinBUGS statistical results were then generated using the 'networkplot', 'ifplot', 'netfunnel' and 'sucra' procedures in Stata 13.0. Odds ratios and 95% confidence intervals were assessed for dichotomous data. A probability (p) of less than 0.05 was considered statistically significant. Compared with ordinary meta-analyses, this network meta-analysis offered hierarchies for the efficacy of different antiseptics in decreasing the blood culture contamination rate. Seven randomized controlled trials involving 34,408 blood samples were eligible for the meta-analysis. No significant difference was found in blood culture contamination rate among the different antiseptics. No significant difference was found between non-alcoholic and alcoholic antiseptics, alcoholic chlorhexidine and povidone iodine, chlorhexidine and iodine compounds, or povidone iodine and iodine tincture, respectively. Different antiseptics may not affect the blood culture contamination rate. Different intervals between skin disinfection and venous puncture, different settings (emergency room, medical wards, and intensive care units), and the performance of the phlebotomy may affect the blood culture contamination rate. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Quantitative Analysis of Venus Radar Backscatter Data in ArcGIS

    NASA Technical Reports Server (NTRS)

    Long, S. M.; Grosfils, E. B.

    2005-01-01

    Ongoing mapping of the Ganiki Planitia (V14) quadrangle of Venus and definition of material units has involved an integrated but qualitative analysis of Magellan radar backscatter images and topography using standard geomorphological mapping techniques. However, such analyses do not take full advantage of the quantitative information contained within the images. Analysis of the backscatter coefficient allows a much more rigorous statistical comparison between mapped units, permitting first-order self-similarity tests of geographically separated materials assigned identical geomorphological labels. Such analyses cannot be performed directly on pixel (DN) values from Magellan backscatter images, because the pixels are scaled to the Muhleman law for radar echoes on Venus and are not corrected for latitudinal variations in incidence angle. Therefore, DN values must be converted, based on pixel latitude, back to their backscatter coefficient values before accurate statistical analysis can occur. Here we present a method for performing the conversions and analysis of Magellan backscatter data using commonly available ArcGIS software and illustrate the advantages of the process for geological mapping.

  15. A RESEARCH DATABASE FOR IMPROVED DATA MANAGEMENT AND ANALYSIS IN LONGITUDINAL STUDIES

    PubMed Central

    BIELEFELD, ROGER A.; YAMASHITA, TOYOKO S.; KEREKES, EDWARD F.; ERCANLI, EHAT; SINGER, LYNN T.

    2014-01-01

    We developed a research database for a five-year prospective investigation of the medical, social, and developmental correlates of chronic lung disease during the first three years of life. We used the Ingres database management system and the Statit statistical software package. The database includes records containing 1300 variables each, the results of 35 psychological tests, each repeated five times (providing longitudinal data on the child, the parents, and behavioral interactions), both raw and calculated variables, and both missing and deferred values. The four-layer menu-driven user interface incorporates automatic activation of complex functions to handle data verification, missing and deferred values, static and dynamic backup, determination of calculated values, display of database status, reports, bulk data extraction, and statistical analysis. PMID:7596250

  16. A New Paradigm to Analyze Data Completeness of Patient Data.

    PubMed

    Nasir, Ayan; Gurupur, Varadraj; Liu, Xinliang

    2016-08-03

    There is a need to develop a tool that will measure data completeness of patient records using sophisticated statistical metrics. Patient data integrity is important in providing timely and appropriate care. Completeness is an important step, with an emphasis on understanding the complex relationships between data fields and their relative importance in delivering care. This tool will not only help understand where data problems are but also help uncover the underlying issues behind them. Develop a tool that can be used alongside a variety of health care database software packages to determine the completeness of individual patient records as well as aggregate patient records across health care centers and subpopulations. The methodology of this project is encapsulated within the Data Completeness Analysis Package (DCAP) tool, with the major components including concept mapping, CSV parsing, and statistical analysis. The results from testing DCAP with Healthcare Cost and Utilization Project (HCUP) State Inpatient Database (SID) data show that this tool is successful in identifying relative data completeness at the patient, subpopulation, and database levels. These results also solidify a need for further analysis and call for hypothesis driven research to find underlying causes for data incompleteness. DCAP examines patient records and generates statistics that can be used to determine the completeness of individual patient data as well as the general thoroughness of record keeping in a medical database. DCAP uses a component that is customized to the settings of the software package used for storing patient data as well as a Comma Separated Values (CSV) file parser to determine the appropriate measurements. DCAP itself is assessed through a proof of concept exercise using hypothetical data as well as available HCUP SID patient data.
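
    A minimal sketch of the kind of completeness statistics described, not the DCAP implementation itself: per-field and per-patient completeness computed with pandas, with invented column names and assumed field weights.

        import numpy as np
        import pandas as pd

        # In practice this frame would come from parsing a CSV export of patient records;
        # the columns and weights below are invented for illustration.
        df = pd.DataFrame({
            "patient_id":       [1, 2, 3, 4],
            "age":              [54, np.nan, 71, 33],
            "diagnosis_code":   ["I21", "J18", None, "E11"],
            "discharge_status": [None, "home", "home", None],
        })

        # Per-field completeness: fraction of non-missing values in each column
        field_completeness = df.notna().mean()

        # Per-patient completeness, weighting fields by an assumed relative importance
        weights = {"diagnosis_code": 3.0, "discharge_status": 2.0}      # unlisted fields weigh 1.0
        w = pd.Series({c: weights.get(c, 1.0) for c in df.columns})
        patient_completeness = (df.notna() * w).sum(axis=1) / w.sum()

        print(field_completeness)
        print("patient-level completeness:", patient_completeness.round(2).tolist())
        print("database-level completeness:", round(float(patient_completeness.mean()), 2))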

  17. A New Paradigm to Analyze Data Completeness of Patient Data

    PubMed Central

    Nasir, Ayan; Liu, Xinliang

    2016-01-01

    Summary Background There is a need to develop a tool that will measure data completeness of patient records using sophisticated statistical metrics. Patient data integrity is important in providing timely and appropriate care. Completeness is an important step, with an emphasis on understanding the complex relationships between data fields and their relative importance in delivering care. This tool will not only help understand where data problems are but also help uncover the underlying issues behind them. Objectives Develop a tool that can be used alongside a variety of health care database software packages to determine the completeness of individual patient records as well as aggregate patient records across health care centers and subpopulations. Methods The methodology of this project is encapsulated within the Data Completeness Analysis Package (DCAP) tool, with the major components including concept mapping, CSV parsing, and statistical analysis. Results The results from testing DCAP with Healthcare Cost and Utilization Project (HCUP) State Inpatient Database (SID) data show that this tool is successful in identifying relative data completeness at the patient, subpopulation, and database levels. These results also solidify a need for further analysis and call for hypothesis driven research to find underlying causes for data incompleteness. Conclusion DCAP examines patient records and generates statistics that can be used to determine the completeness of individual patient data as well as the general thoroughness of record keeping in a medical database. DCAP uses a component that is customized to the settings of the software package used for storing patient data as well as a Comma Separated Values (CSV) file parser to determine the appropriate measurements. DCAP itself is assessed through a proof of concept exercise using hypothetical data as well as available HCUP SID patient data. PMID:27484918

  18. Neurophysiological analytics for all! Free open-source software tools for documenting, analyzing, visualizing, and sharing using electronic notebooks.

    PubMed

    Rosenberg, David M; Horn, Charles C

    2016-08-01

    Neurophysiology requires an extensive workflow of information analysis routines, which often includes incompatible proprietary software, introducing limitations based on financial costs, transfer of data between platforms, and the ability to share. An ecosystem of free open-source software exists to fill these gaps, including thousands of analysis and plotting packages written in Python and R, which can be implemented in a sharable and reproducible format, such as the Jupyter electronic notebook. This tool chain can largely replace current routines by importing data, producing analyses, and generating publication-quality graphics. An electronic notebook like Jupyter allows these analyses, along with documentation of procedures, to display locally or remotely in an internet browser, which can be saved as an HTML, PDF, or other file format for sharing with team members and the scientific community. The present report illustrates these methods using data from electrophysiological recordings of the musk shrew vagus, a model system to investigate gut-brain communication, for example in cancer chemotherapy-induced emesis. We show methods for spike sorting (including statistical validation), spike train analysis, and analysis of compound action potentials in notebooks. Raw data and code are available from notebooks in data supplements or from an executable online version, which replicates all analyses without installing software, an implementation of reproducible research. This demonstrates the promise of combining disparate analyses into one platform, along with the ease of sharing this work. In an age of diverse, high-throughput computational workflows, this methodology can increase efficiency, transparency, and the collaborative potential of neurophysiological research. Copyright © 2016 the American Physiological Society.

  19. Software Framework for Development of Web-GIS Systems for Analysis of Georeferenced Geophysical Data

    NASA Astrophysics Data System (ADS)

    Okladnikov, I.; Gordov, E. P.; Titov, A. G.

    2011-12-01

    Georeferenced datasets (meteorological databases, modeling and reanalysis results, remote sensing products, etc.) are currently actively used in numerous applications, including modeling, interpretation, and forecasting of climatic and ecosystem changes at various spatial and temporal scales. Because of the inherent heterogeneity of environmental datasets, as well as their size, which may reach tens of terabytes for a single dataset, present studies in the area of climate and environmental change require special software support. A dedicated software framework has been created for the rapid development of information-computational systems, based on Web-GIS technologies, that provide such support. The software framework consists of three basic parts: a computational kernel developed using the ITT VIS Interactive Data Language (IDL), a set of PHP controllers run within a specialized web portal, and a JavaScript class library for the development of typical components of a web-mapping application graphical user interface (GUI) based on AJAX technology. The computational kernel comprises a number of modules for dataset access, mathematical and statistical data analysis, and visualization of results. The specialized web portal consists of the Apache web server, the OGC-compliant GeoServer software, which is used as a base for presenting cartographical information over the Web, and a set of PHP controllers implementing the web-mapping application logic and governing the computational kernel. The JavaScript library for graphical user interface development is based on the GeoExt library, combining the ExtJS framework and OpenLayers software. Based on this software framework, an information-computational system for complex analysis of large georeferenced data archives was developed. Structured environmental datasets currently available for processing include two editions of the NCEP/NCAR Reanalysis, the JMA/CRIEPI JRA-25 Reanalysis, the ECMWF ERA-40 Reanalysis, the ECMWF ERA Interim Reanalysis, the MRI/JMA APHRODITE's Water Resources Project Reanalysis, meteorological observational data for the territory of the former USSR for the 20th century, and others. The current version of the system is already being used in scientific research; in particular, it was recently applied successfully to the analysis of climate changes in Siberia and their regional impact. The software framework presented here allows rapid development of Web-GIS systems for geophysical data analysis, thus providing specialists involved in multidisciplinary research projects with reliable and practical instruments for complex analysis of climate and ecosystem changes on global and regional scales. This work is partially supported by RFBR grants #10-07-00547, #11-05-01190, and SB RAS projects 4.31.1.5, 4.31.2.7, 4, 8, 9, 50 and 66.

  20. Web-based spatial analysis with the ILWIS open source GIS software and satellite images from GEONETCast

    NASA Astrophysics Data System (ADS)

    Lemmens, R.; Maathuis, B.; Mannaerts, C.; Foerster, T.; Schaeffer, B.; Wytzisk, A.

    2009-12-01

    This paper presents easily accessible, integrated web-based analysis of satellite images with plug-in-based open source software. The paper is targeted at both users and developers of geospatial software. Guided by a use-case scenario, we describe the ILWIS software and its toolbox for accessing satellite images through the GEONETCast broadcasting system. The last two decades have shown a major shift from stand-alone software systems to networked ones, often client/server applications using distributed geo-(web-)services. This allows organisations to combine, without much effort, their own data with remotely available data and processing functionality. Key to this integrated spatial data analysis is low-cost access to data from within user-friendly and flexible software. Web-based open source software solutions are increasingly a powerful option for developing countries. The Integrated Land and Water Information System (ILWIS) is a PC-based GIS and remote sensing software package, comprising a complete suite of image processing, spatial analysis, and digital mapping tools, and was developed as commercial software from the early nineties onwards. Recent project efforts have migrated ILWIS into a modular, plug-in-based open source software, and provide web-service support for OGC-based web mapping and processing. The core objective of the ILWIS Open source project is to provide a maintainable framework for researchers and software developers to implement training components, scientific toolboxes, and (web-)services. The latest plug-ins have been developed for multi-criteria decision making, water resources analysis, and spatial statistics analysis. The development of this framework has been carried out since 2007 in the context of 52°North, an open initiative that advances the development of cutting-edge open source geospatial software under the GPL license. GEONETCast, as part of the emerging Global Earth Observation System of Systems (GEOSS), puts essential environmental data at the fingertips of users around the globe. This user-friendly and low-cost information dissemination provides global information as a basis for decision-making in a number of critical areas, including public health, energy, agriculture, weather, water, climate, natural disasters, and ecosystems. GEONETCast makes satellite images available via Digital Video Broadcast (DVB) technology. An OGC WMS interface and plug-ins that convert GEONETCast data streams allow an ILWIS user to integrate various distributed data sources with data stored locally on his or her machine. Our paper describes a use case in which ILWIS is used with GEONETCast satellite imagery for decision-making processes in Ghana. We also explain how the ILWIS software can be extended with additional functionality by means of plug-ins, and outline our plans to implement other OGC standards, such as WCS and WPS, in the same context. The latter, especially, can be seen as a major step forward in terms of moving well-proven desktop-based processing functionality to the web. This enables the embedding of ILWIS functionality in Spatial Data Infrastructures or even execution in scalable, on-demand cloud computing environments.
