statistical analysis required: Topics by Science.gov

Sample records for statistical analysis required

The Content of Statistical Requirements for Authors in Biomedical Research Journals

PubMed Central

Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

2016-01-01

Background: Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Methods: Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines as well as particular requirements were summarized by frequency and grouped into design, method of analysis, and presentation. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Results: In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups update their statistical instructions for authors gradually as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including “address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation,” and “statistical methods and the reasons.” Conclusions: Statistical requirements for authors are becoming increasingly refined. Statistical requirements for authors remind researchers that they should give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would be beneficial in providing stronger evidence and making critical appraisal of evidence more accessible. PMID:27748343
The Content of Statistical Requirements for Authors in Biomedical Research Journals.

PubMed

Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

2016-10-20

Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above so that researchers are able to give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained from the editors directly via email. Then, we described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines as well as particular requirements were summarized by frequency and grouped into design, method of analysis, and presentation. Finally, updated statistical guidelines and particular requirements for improvement were summed up. In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups update their statistical instructions for authors gradually as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation," and "statistical methods and the reasons." Statistical requirements for authors are becoming increasingly refined. Statistical requirements for authors remind researchers that they should give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would be beneficial in providing stronger evidence and making critical appraisal of evidence more accessible.
How Much Math Do Students Need to Succeed in Business and Economics Statistics? An Ordered Probit Analysis

ERIC Educational Resources Information Center

Green, Jeffrey J.; Stone, Courtenay C.; Zegeye, Abera; Charles, Thomas A.

2009-01-01

Because statistical analysis requires the ability to use mathematics, students typically are required to take one or more prerequisite math courses prior to enrolling in the business statistics course. Despite these math prerequisites, however, many students find it difficult to learn business statistics. In this study, we use an ordered probit…
Background Information and User’s Guide for MIL-F-9490

DTIC Science & Technology

1975-01-01

requirements, although different analysis results will apply to each requirement. Basic differences between the two reliability requirements are: MIL-F-8785B...provides the rationale for establishing such limits. The specific risk analysis comprises the same data which formed the average risk analysis, except...statistical analysis will be based on statistical data taken using limited exposure times of components and equipment. The exposure times and resulting
Performance Analysis of Live-Virtual-Constructive and Distributed Virtual Simulations: Defining Requirements in Terms of Temporal Consistency

DTIC Science & Technology

2009-12-01

events. Work associated with aperiodic tasks have the same statistical behavior and the same timing requirements. The timing deadlines are soft. • Sporadic...answers, but it is possible to calculate how precise the estimates are. Simulation-based performance analysis of a model includes a statistical...to evaluate all possible states in a timely manner. This is the principal reason for resorting to simulation and statistical analysis to evaluate
Trial Sequential Analysis in systematic reviews with meta-analysis.

PubMed

Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

2017-03-06

Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentist approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D²) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that the Trial Sequential Analysis provides better control of type I errors and of type II errors than the traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.
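The relationship between diversity and the required information size described in this abstract can be sketched in a few lines. The sketch assumes the commonly cited definitions D² = (v_R − v_F) / v_R, where v_F and v_R are the variances of the pooled estimate under the fixed-effect and random-effects models, and a diversity-adjusted size equal to the unadjusted size divided by (1 − D²). The function names and all numbers are hypothetical, and the Lan-DeMets monitoring boundaries themselves are not implemented here.

```python
def diversity(v_fixed: float, v_random: float) -> float:
    """D^2: relative increase in the variance of the pooled estimate when
    moving from a fixed-effect to a random-effects model (assumed definition)."""
    if v_fixed <= 0 or v_random <= 0:
        raise ValueError("variances must be positive")
    # Clamp at 0 in case rounding makes the random-effects variance smaller.
    return max(0.0, (v_random - v_fixed) / v_random)


def diversity_adjusted_size(unadjusted_size: float, d2: float) -> float:
    """Inflate an unadjusted required information size by 1 / (1 - D^2)."""
    if not 0.0 <= d2 < 1.0:
        raise ValueError("D^2 must lie in [0, 1)")
    return unadjusted_size / (1.0 - d2)


if __name__ == "__main__":
    d2 = diversity(v_fixed=0.010, v_random=0.016)   # hypothetical variances
    required = diversity_adjusted_size(2000, d2)    # hypothetical unadjusted size
    accrued = 1500                                  # hypothetical participants so far
    print(f"D^2 = {d2:.3f}, required size = {required:.0f}, "
          f"reached = {accrued >= required}")
```

When the accrued number of participants is below the adjusted size, the abstract's argument is that conventional 95% intervals and 5% thresholds are too lenient and adjusted boundaries are needed.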
Statistical properties of filtered pseudorandom digital sequences formed from the sum of maximum-length sequences

NASA Technical Reports Server (NTRS)

Wallace, G. R.; Weathers, G. D.; Graf, E. R.

1973-01-01

The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.
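To make the construction concrete, the sketch below generates two short maximum-length sequences with linear feedback shift registers and forms their modulo-two sum. It assumes the primitive polynomials x³ + x + 1 and x⁴ + x + 1, chosen only for illustration and not taken from the report. The filtering step and the statistical quality factor described in the abstract are not reproduced.

```python
def m_sequence(degree: int, taps: tuple, length: int) -> list:
    """Bits from the recurrence a[n+degree] = XOR of a[n+t] for t in taps.

    With taps matching a primitive polynomial the period is 2**degree - 1.
    The register starts in the all-ones state (any nonzero state works).
    """
    state = [1] * degree
    out = []
    for _ in range(length):
        out.append(state[0])
        new_bit = 0
        for t in taps:
            new_bit ^= state[t]
        state = state[1:] + [new_bit]
    return out


def hybrid_sum(seq_a: list, seq_b: list) -> list:
    """Modulo-two (XOR) sum of two equally long bit sequences."""
    return [a ^ b for a, b in zip(seq_a, seq_b)]


def smallest_period(seq: list) -> int:
    """Smallest shift p for which the observed window repeats itself."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return len(seq)


if __name__ == "__main__":
    n = 210  # two full periods of the combined sequence
    a = m_sequence(3, (0, 1), n)  # x^3 + x + 1 (assumed example polynomial)
    b = m_sequence(4, (0, 1), n)  # x^4 + x + 1 (assumed example polynomial)
    h = hybrid_sum(a, b)
    print(smallest_period(a), smallest_period(b), smallest_period(h))
```

Because the two component periods are coprime, the summed sequence repeats only after their least common multiple, which is one reason such sums are attractive as longer noise sources.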
U.S. Marine Corps Study of Establishing Time Criteria for Logistics Tasks

DTIC Science & Technology

2004-09-30

STATISTICS FOR REQUESTS PER DAY FOR TWO BATTALIONS II-25 II-6 SUMMARY STATISTICS IN HOURS FOR RESOURCE REQUIREMENTS PER DAY FOR TWO BATTALIONS II-26 II-7...SUMMARY STATISTICS FOR INDIVIDUALS FOR RESOURCE REQUIREMENTS PER DAY FOR TWO BATTALIONS II-27 Study of Establishing Time Criteria for Logistics...developed and run to provide statistical information for analysis. In Task Four, the study team used Task Three findings to determine data requirements
Towards Solving the Mixing Problem in the Decomposition of Geophysical Time Series by Independent Component Analysis

NASA Technical Reports Server (NTRS)

Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

2000-01-01

The use of the Principal Component Analysis technique for the analysis of geophysical time series has been questioned in particular for its tendency to extract components that mix several physical phenomena even when the signal is just their linear sum. We demonstrate with a data simulation experiment that the Independent Component Analysis, a recently developed technique, is able to solve this problem. This new technique requires the statistical independence of components, a stronger constraint, that uses higher-order statistics, instead of the classical decorrelation a weaker constraint, that uses only second-order statistics. Furthermore, ICA does not require additional a priori information such as the localization constraint used in Rotational Techniques.
Public and patient involvement in quantitative health research: A statistical perspective.

PubMed

Hannigan, Ailish

2018-06-19

The majority of studies included in recent reviews of impact for public and patient involvement (PPI) in health research had a qualitative design. PPI in solely quantitative designs is underexplored, particularly its impact on statistical analysis. Statisticians in practice have a long history of working in both consultative (indirect) and collaborative (direct) roles in health research, yet their perspective on PPI in quantitative health research has never been explicitly examined. To explore the potential and challenges of PPI from a statistical perspective at distinct stages of quantitative research, that is sampling, measurement and statistical analysis, distinguishing between indirect and direct PPI. Statistical analysis is underpinned by having a representative sample, and a collaborative or direct approach to PPI may help achieve that by supporting access to and increasing participation of under-represented groups in the population. Acknowledging and valuing the role of lay knowledge of the context in statistical analysis and in deciding what variables to measure may support collective learning and advance scientific understanding, as evidenced by the use of participatory modelling in other disciplines. A recurring issue for quantitative researchers, which reflects quantitative sampling methods, is the selection and required number of PPI contributors, and this requires further methodological development. Direct approaches to PPI in quantitative health research may potentially increase its impact, but the facilitation and partnership skills required may require further training for all stakeholders, including statisticians. © 2018 The Authors Health Expectations published by John Wiley & Sons Ltd.
The Heuristics of Statistical Argumentation: Scaffolding at the Postsecondary Level

ERIC Educational Resources Information Center

Pardue, Teneal Messer

2017-01-01

Language plays a key role in statistics and, by extension, in statistics education. Enculturating students into the practice of statistics requires preparing them to communicate results of data analysis. Statistical argumentation is one way of providing structure to facilitate discourse in the statistics classroom. In this study, a teaching…
How Many Studies Do You Need? A Primer on Statistical Power for Meta-Analysis

ERIC Educational Resources Information Center

Valentine, Jeffrey C.; Pigott, Therese D.; Rothstein, Hannah R.

2010-01-01

In this article, the authors outline methods for using fixed and random effects power analysis in the context of meta-analysis. Like statistical power analysis for primary studies, power analysis for meta-analysis can be done either prospectively or retrospectively and requires assumptions about parameters that are unknown. The authors provide…
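A prospective power calculation of the kind this abstract describes might look like the sketch below. It makes several assumptions that are not in the record: the effect size is a standardized mean difference d, all k studies have the same per-group sample size, the within-study variance uses the usual large-sample approximation, and the random-effects case simply adds a between-study variance tau2 to each study's variance. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def meta_power(d: float, n_per_group: int, k: int,
               tau2: float = 0.0, alpha: float = 0.05) -> float:
    """Approximate two-sided power of the test of the pooled effect.

    tau2 = 0 gives the fixed-effect case; tau2 > 0 is a simple
    random-effects approximation (assumed, see the lead-in).
    """
    if n_per_group < 2 or k < 1 or tau2 < 0:
        raise ValueError("need n_per_group >= 2, k >= 1 and tau2 >= 0")
    n1 = n2 = n_per_group
    # Large-sample variance of a standardized mean difference in one study.
    v_study = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    # Variance of the pooled estimate across k equally sized studies.
    v_pooled = (v_study + tau2) / k
    lam = d / sqrt(v_pooled)                 # expected z value
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    return 1 - z.cdf(crit - lam) + z.cdf(-crit - lam)


if __name__ == "__main__":
    for k in (5, 10, 20):
        fixed = meta_power(0.3, 25, k)
        random_effects = meta_power(0.3, 25, k, tau2=0.05)
        print(f"k={k:2d}  fixed={fixed:.3f}  random={random_effects:.3f}")
```

Running the loop for several values of k shows how many studies would be needed to reach a target power under each set of assumed parameters, which is the question posed in the title.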
A new statistic for the analysis of circular data in gamma-ray astronomy

NASA Technical Reports Server (NTRS)

Protheroe, R. J.

1985-01-01

A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.
Noise Reduction in High-Throughput Gene Perturbation Screens

USDA-ARS's Scientific Manuscript database

Motivation: Accurate interpretation of perturbation screens is essential for a successful functional investigation. However, the screened phenotypes are often distorted by noise, and their analysis requires specialized statistical analysis tools. The number and scope of statistical methods available...
10 CFR 431.173 - Requirements applicable to all manufacturers.

Code of Federal Regulations, 2011 CFR

2011-01-01

... COMMERCIAL AND INDUSTRIAL EQUIPMENT Provisions for Commercial Heating, Ventilating, Air-Conditioning and... is based on engineering or statistical analysis, computer simulation or modeling, or other analytic... method or methods used; (B) The mathematical model, the engineering or statistical analysis, computer...
Implementation and evaluation of an efficient secure computation system using ‘R’ for healthcare statistics

PubMed Central

Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

2014-01-01

Background and objective While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Materials and methods Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software ‘R’ by effectively combining secret-sharing-based secure computation with original computation. Results Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50 000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. Discussion If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using ‘R’ that works interactively while secure computation protocols generally require a significant amount of processing time. Conclusions We propose a secure statistical analysis system using ‘R’ for medical data that effectively integrates secret-sharing-based secure computation and original computation. PMID:24763677
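The idea of computing a statistic without exposing individual records can be illustrated with additive secret sharing. This is a toy sketch and not the authors' system: the three-party setup, the modulus, and the function names are assumptions, only a mean of non-negative integers is computed, and the conditional expressions and comparisons mentioned in the abstract need considerably more elaborate protocols.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares (an assumed choice)


def share(value: int, n_parties: int = 3) -> list:
    """Split a non-negative integer into random shares that sum to it mod PRIME."""
    parts = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    parts.append((value - sum(parts)) % PRIME)
    return parts


def secure_sum(values: list, n_parties: int = 3) -> int:
    """Each party adds up only the shares it holds; totals are combined at the end."""
    totals = [0] * n_parties
    for v in values:
        for i, s in enumerate(share(v, n_parties)):
            totals[i] = (totals[i] + s) % PRIME
    # Only per-party totals are revealed, never an individual record.
    return sum(totals) % PRIME


def secure_mean(values: list, n_parties: int = 3) -> float:
    """Mean via secret-shared summation; the true sum must stay below PRIME."""
    if not values:
        raise ValueError("need at least one value")
    return secure_sum(values, n_parties) / len(values)


if __name__ == "__main__":
    claims = [120, 80, 100]  # dummy claim amounts
    print(secure_mean(claims))
```

Addition is cheap in this scheme because each party works locally on its own shares. That is consistent with the abstract's report that averages and variances finish in about a second while comparison-based statistics take much longer.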
Implementation and evaluation of an efficient secure computation system using 'R' for healthcare statistics.

PubMed

Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

2014-10-01

While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software 'R' by effectively combining secret-sharing-based secure computation with original computation. Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50,000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using 'R' that works interactively while secure computation protocols generally require a significant amount of processing time. We propose a secure statistical analysis system using 'R' for medical data that effectively integrates secret-sharing-based secure computation and original computation. Published by the BMJ Publishing Group Limited.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kogalovskii, M.R.

This paper presents a review of problems related to statistical database systems, which are widespread in various fields of activity. Statistical databases (SDB) are databases that consist of data used for statistical analysis. Topics under consideration are: SDB peculiarities, properties of data models adequate for SDB requirements, metadata functions, null-value problems, SDB compromise protection problems, stored data compression techniques, and statistical data representation means. Also examined is whether present Database Management Systems (DBMS) satisfy the SDB requirements. Some current research directions in SDB systems are considered.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems

NASA Technical Reports Server (NTRS)

He, Yuning; Davies, Misty Dawn

2014-01-01

The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
NAUSEA and the Principle of Supplementarity of Damping and Isolation in Noise Control.

DTIC Science & Technology

1980-02-01

New approaches and uses of the statistical energy analysis (NAUSEA) have been considered and developed in recent months. The advances were made...possible in that the requirement, in the olde statistical energy analysis, that the dynamic systems be highly reverberant and the couplings between the...analytical consideration in terms of the statistical energy analysis (SEA). A brief discussion and simple examples that relate to these recent advances
Systematic Review and Meta-Analysis of Studies Evaluating Diagnostic Test Accuracy: A Practical Review for Clinical Researchers-Part II. Statistical Methods of Meta-Analysis

PubMed Central

Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi

2015-01-01

Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that it requires the simultaneous analysis of a pair of outcome measures, such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and could be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models including the bivariate model and the hierarchical summary receiver operating characteristic model are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107
Limitations of Using Microsoft Excel Version 2016 (MS Excel 2016) for Statistical Analysis for Medical Research.

PubMed

Tanavalee, Chotetawan; Luksanapruksa, Panya; Singhatanadgige, Weerasak

2016-06-01

Microsoft Excel (MS Excel) is a commonly used program for data collection and statistical analysis in biomedical research. However, this program has many limitations, including fewer functions that can be used for analysis and a limited number of total cells compared with dedicated statistical programs. MS Excel cannot complete analyses with blank cells, and cells must be selected manually for analysis. In addition, it requires multiple steps of data transformation and formulas to plot survival analysis graphs, among others. The Megastat add-on program, which will be supported by MS Excel 2016 soon, would eliminate some limitations of using statistical formulas within MS Excel.
Ten Ways to Improve the Use of Statistical Mediation Analysis in the Practice of Child and Adolescent Treatment Research

ERIC Educational Resources Information Center

Maric, Marija; Wiers, Reinout W.; Prins, Pier J. M.

2012-01-01

Despite guidelines and repeated calls from the literature, statistical mediation analysis in youth treatment outcome research is rare. Even more concerning is that many studies that "have" reported mediation analyses do not fulfill basic requirements for mediation analysis, providing inconclusive data and clinical implications. As a result, after…
Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]

ERIC Educational Resources Information Center

Warner, Rebecca M.

2007-01-01

This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…
Naive Analysis of Variance

ERIC Educational Resources Information Center

Braun, W. John

2012-01-01

The Analysis of Variance is often taught in introductory statistics courses, but it is not clear that students really understand the method. This is because the derivation of the test statistic and p-value requires a relatively sophisticated mathematical background which may not be well-remembered or understood. Thus, the essential concept behind…
Statistics For Success Statistical Analysis Of Student Data Is A Lot Easier Than You Think And More Useful Than You Imagine.

ERIC Educational Resources Information Center

Kadel, Robert

2004-01-01

To her surprise, Ms. Logan had just conducted a statistical analysis of her 10th grade biology students' quiz scores. The results indicated that she needed to reinforce mitosis before the students took the high-school proficiency test in three weeks, as required by the state. "Oh! That's easy!" she exclaimed. Teachers like Ms. Logan are…
CADDIS Volume 4. Data Analysis: Biological and Environmental Data Requirements

EPA Pesticide Factsheets

Overview of PECBO Module, using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, methods for inferring environmental conditions, statistical scripts in module.
Role of microstructure on twin nucleation and growth in HCP titanium: A statistical study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arul Kumar, M.; Wroński, M.; McCabe, Rodney James

In this study, a detailed statistical analysis is performed using Electron Back Scatter Diffraction (EBSD) to establish the effect of microstructure on twin nucleation and growth in deformed commercial purity hexagonal close packed (HCP) titanium. Rolled titanium samples are compressed along rolling, transverse and normal directions to establish statistical correlations for {10–12}, {11–21}, and {11–22} twins. A recently developed automated EBSD-twinning analysis software is employed for the statistical analysis. Finally, the analysis provides the following key findings: (I) grain size and strain dependence is different for twin nucleation and growth; (II) twinning statistics can be generalized for the HCP metals magnesium, zirconium and titanium; and (III) complex microstructure, where grain shape and size distribution is heterogeneous, requires multi-point statistical correlations.
Role of microstructure on twin nucleation and growth in HCP titanium: A statistical study

DOE PAGES

Arul Kumar, M.; Wroński, M.; McCabe, Rodney James; ...

2018-02-01

In this study, a detailed statistical analysis is performed using Electron Back Scatter Diffraction (EBSD) to establish the effect of microstructure on twin nucleation and growth in deformed commercial purity hexagonal close packed (HCP) titanium. Rolled titanium samples are compressed along rolling, transverse and normal directions to establish statistical correlations for {10–12}, {11–21}, and {11–22} twins. A recently developed automated EBSD-twinning analysis software is employed for the statistical analysis. Finally, the analysis provides the following key findings: (I) grain size and strain dependence is different for twin nucleation and growth; (II) twinning statistics can be generalized for the HCP metals magnesium, zirconium and titanium; and (III) complex microstructure, where grain shape and size distribution is heterogeneous, requires multi-point statistical correlations.
Targeting Change: Assessing a Faculty Learning Community Focused on Increasing Statistics Content in Life Science Curricula

ERIC Educational Resources Information Center

Parker, Loran Carleton; Gleichsner, Alyssa M.; Adedokun, Omolola A.; Forney, James

2016-01-01

Transformation of research in all biological fields necessitates the design, analysis, and interpretation of large data sets. Preparing students with the requisite skills in experimental design, statistical analysis and interpretation, and mathematical reasoning will require both curricular reform and faculty who are willing and able to integrate…
Telecommunication market research processing

NASA Astrophysics Data System (ADS)

Dupont, J. F.

1983-06-01

The data processing in two telecommunication market investigations is described. One of the studies concerns the office applications of communication and the other the experiences with a videotex terminal. Statistical factorial analysis was performed on a large mass of data. A comparison between utilization intentions and effective utilization is made. Extensive rewriting of statistical analysis computer programs was required.
Pre-installation customer satisfaction survey

DOT National Transportation Integrated Search

1996-10-01

The National Center for Statistics and Analysis (NCSA) Information Services Branch (ISB) required a more effective method of receiving, tracking, and completing requests for data, statistics, and information. To enhance ISB's services, a new cus...
WASP (Write a Scientific Paper) using Excel 9: Analysis of variance.

PubMed

Grech, Victor

2018-06-01

Analysis of variance (ANOVA) may be required by researchers as an inferential statistical test when more than two means require comparison. This paper explains how to perform ANOVA in Microsoft Excel. Copyright © 2018 Elsevier B.V. All rights reserved.
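For readers working outside a spreadsheet, the F statistic behind a one-way ANOVA can be computed directly. The sketch below is written in Python rather than Excel and uses hypothetical data. It returns the F ratio and its degrees of freedom only; obtaining a p-value needs the F distribution, which is not part of the Python standard library.

```python
def one_way_anova(groups: list) -> tuple:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical measurements from three groups.
    f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F value can then be compared with a critical value from an F table for the reported degrees of freedom.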
Antecedents to Organizational Performance: Theoretical and Practical Implications for Aircraft Maintenance Officer Force Development

DTIC Science & Technology

2015-03-26

to my reader, Lieutenant Colonel Robert Overstreet, for helping solidify my research, coaching me through the statistical analysis, and positive...Descriptive Statistics...common-method bias requires careful assessment of potential sources of bias and implementing procedural and statistical control methods. Podsakoff
APA's Learning Objectives for Research Methods and Statistics in Practice: A Multimethod Analysis

ERIC Educational Resources Information Center

Tomcho, Thomas J.; Rice, Diana; Foels, Rob; Folmsbee, Leah; Vladescu, Jason; Lissman, Rachel; Matulewicz, Ryan; Bopp, Kara

2009-01-01

Research methods and statistics courses constitute a core undergraduate psychology requirement. We analyzed course syllabi and faculty self-reported coverage of both research methods and statistics course learning objectives to assess the concordance with APA's learning objectives (American Psychological Association, 2007). We obtained a sample of…
A Survey of Statistical Capstone Projects

ERIC Educational Resources Information Center

Martonosi, Susan E.; Williams, Talithia D.

2016-01-01

In this article, we highlight the advantages of incorporating a statistical capstone experience in the undergraduate curriculum, where students perform an in-depth analysis of real-world data. Capstone experiences develop statistical thinking by allowing students to engage in a consulting-like experience that requires skills outside the scope of…
Analysis and interpretation of cost data in randomised controlled trials: review of published studies

PubMed Central

Barber, Julie A; Thompson, Simon G

1998-01-01

Objective: To review critically the statistical methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the study. Design: Survey of published randomised trials including an economic evaluation with cost values suitable for statistical analysis; 45 such trials published in 1995 were identified from Medline. Main outcome measures: The use of statistical methods for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified. Results: Although all 45 trials reviewed apparently had cost data for each patient, only 9 (20%) reported adequate measures of variability for these data and only 25 (56%) gave results of statistical tests or a measure of precision for the comparison of costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented in the paper. No paper reported sample size calculations for costs. Conclusions: The analysis and interpretation of cost data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs of alternative therapies have often been reported in the absence of supporting statistical evidence. Improvements in the analysis and reporting of health economic assessments are urgently required. Health economic guidelines need to be revised to incorporate more detailed statistical advice. Key messages: Health economic evaluations required for important healthcare policy decisions are often carried out in randomised controlled trials. A review of such published economic evaluations assessed whether statistical methods for cost outcomes have been appropriately used and interpreted. Few publications presented adequate descriptive information for costs or performed appropriate statistical analyses. In at least two thirds of the papers, the main conclusions regarding costs were not justified. The analysis and reporting of health economic assessments within randomised controlled trials urgently need improving. PMID:9794854
[Review of research design and statistical methods in Chinese Journal of Cardiology].

PubMed

Zhang, Li-jun; Yu, Jin-ming

2009-07-01

To evaluate the research design and the use of statistical methods in the Chinese Journal of Cardiology, we reviewed the research design and statistical methods of all original papers published in the journal from December 2007 to November 2008. The most frequently used research designs were cross-sectional (34%), prospective (21%), and experimental (25%). Of all the articles, 49 (25%) used incorrect statistical methods, 29 (15%) lacked some form of statistical analysis, and 23 (12%) had inconsistencies in the description of methods. There were significant differences between different statistical methods (P < 0.001). The rate of correct use of multifactor analysis was low, and repeated-measurement data were not analyzed with repeated-measurement analysis. Many problems exist in the Chinese Journal of Cardiology. Better research design and correct use of statistical methods are still needed. Stricter review by statisticians and epidemiologists is also required to improve the quality of the literature.
Statistical analysis and interpolation of compositional data in materials science.

PubMed

Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M

2015-02-09

Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
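The subcomposition issue described above can be made concrete with a small sketch. The abstract does not name the specific tools the authors use; the centered log-ratio (clr) transform shown here is a standard device from the compositional data analysis literature and is used purely as an illustration. The function names and the three-part composition are hypothetical.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical three-element composition and the subcomposition of its first two parts.
full = closure([10.0, 30.0, 60.0])   # fractions 0.1, 0.3, 0.6
sub = closure(full[:2])              # fractions 0.25, 0.75 after re-closing

# The raw fractions change when an element is dropped, but log-ratios do not.
ratio_full = math.log(full[0] / full[1])
ratio_sub = math.log(sub[0] / sub[1])
print(full[0], sub[0], ratio_full, ratio_sub)
print(clr(full))
```

The raw fraction of the first element moves from 0.1 to 0.25 when the third element is left out, while the log-ratio between the first two elements is unchanged, which is the kind of invariance the abstract asks of a physically meaningful analysis.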
Space station software reliability analysis based on failures observed during testing at the multisystem integration facility

NASA Technical Reports Server (NTRS)

Tamayo, Tak Chai

1987-01-01

Quality of software is not only vital to the successful operation of the space station, it is also an important factor in establishing testing requirements, the time needed for software verification and integration, and launch schedules for the space station. Defense of management decisions can be greatly strengthened by combining engineering judgments with statistical analysis. Unlike hardware, software is characterized by no wearout and costly redundancies, which makes traditional statistical analysis unsuitable for evaluating software reliability. A statistical model was developed to represent the number as well as the types of failures that occur during software testing and verification. From this model, quantitative measures of software reliability based on failure history during testing are derived. Criteria to terminate testing based on reliability objectives and methods to estimate the expected number of fixes required are also presented.

STATISTICAL ANALYSIS OF SNAP 10A THERMOELECTRIC CONVERTER ELEMENT PROCESS DEVELOPMENT VARIABLES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fitch, S.H.; Morris, J.W.

1962-12-15

Statistical analysis, primarily analysis of variance, was applied to evaluate several factors involved in the development of suitable fabrication and processing techniques for the production of lead telluride thermoelectric elements for the SNAP 10A energy conversion system. The analysis methods are described as to their application for determining the effects of various processing steps, establishing the value of individual operations, and evaluating the significance of test results. The elimination of unnecessary or detrimental processing steps was accomplished and the number of required tests was substantially reduced by application of these statistical methods to the SNAP 10A production development effort. (auth)
Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

PubMed

Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

2013-08-08

Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Teaching statistics in biology: using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses.

PubMed

Metz, Anneke M

2008-01-01

There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study.
Analysis of half diallel mating designs I: a practical analysis procedure for ANOVA approximation.

Treesearch

G.R. Johnson; J.N. King

1998-01-01

Procedures to analyze half-diallel mating designs using the SAS statistical package are presented. The procedure requires two runs of PROC VARCOMP and results in estimates of additive and non-additive genetic variation. The procedures described can be modified to work on most statistical software packages which can compute variance component estimates. The...
Critical Views of 8th Grade Students toward Statistical Data in Newspaper Articles: Analysis in Light of Statistical Literacy

ERIC Educational Resources Information Center

Guler, Mustafa; Gursoy, Kadir; Guven, Bulent

2016-01-01

Understanding and interpreting biased data, decision-making in accordance with the data, and critically evaluating situations involving data are among the fundamental skills necessary in the modern world. To develop these required skills, emphasis on statistical literacy in school mathematics has been gradually increased in recent years. The…
Topical tranexamic acid in total knee replacement: a systematic review and meta-analysis.

PubMed

Panteli, Michalis; Papakostidis, Costas; Dahabreh, Ziad; Giannoudis, Peter V

2013-10-01

To examine the safety and efficacy of topical use of tranexamic acid (TA) in total knee arthroplasty (TKA). An electronic literature search of PubMed Medline; Ovid Medline; Embase; and the Cochrane Library was performed, identifying studies published in any language from 1966 to February 2013. The studies enrolled adults undergoing a primary TKA, where topical TA was used. The inverse variance statistical method and either a fixed or random effect model, depending on the absence or presence of statistical heterogeneity, were used; subgroup analysis was performed when possible. We identified a total of seven eligible reports for analysis. Our meta-analysis indicated that, compared with the control group, topical application of TA significantly limited postoperative drain output (mean difference=-268.36ml), total blood loss (mean difference=-220.08ml), and Hb drop (mean difference=-0.94g/dL) and lowered the risk of transfusion requirements (risk ratio=0.47, 95% CI=0.26-0.84), without increased risk of thromboembolic events. Sub-group analysis indicated that a higher dose of topical TA (>2g) significantly reduced transfusion requirements. Although the present meta-analysis proved a statistically significant reduction of postoperative blood loss and transfusion requirements with topical use of TA in TKA, the clinical importance of the respective estimates of effect size should be interpreted with caution. I, II. Copyright © 2013 Elsevier B.V. All rights reserved.
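The inverse variance method mentioned in this abstract can be sketched in a few lines. This is a minimal fixed-effect illustration only: the per-study mean differences and standard errors below are invented for the example and are not the data from the seven included reports, and the function name `inverse_variance_pool` is hypothetical. Cochran's Q is included because it is a common way to quantify the statistical heterogeneity that guides the choice between a fixed and a random effect model.

```python
import math


def inverse_variance_pool(effects, std_errors):
    """Fixed-effect pooled estimate, its standard error, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Q measures how far the study effects scatter around the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical mean differences in blood loss (ml) and their standard errors.
effects = [-200.0, -300.0, -250.0]
std_errors = [20.0, 20.0, 10.0]
pooled, pooled_se, q = inverse_variance_pool(effects, std_errors)
print(pooled, pooled_se, q)
```

More precise studies (smaller standard errors) receive larger weights, so the third hypothetical study contributes as much as the first two combined. A random effect model would add a between-study variance term to each study's variance before weighting.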
Teaching Statistics in Biology: Using Inquiry-based Learning to Strengthen Understanding of Statistical Analysis in Biology Laboratory Courses

PubMed Central

2008-01-01

There is an increasing need for students in the biological sciences to build a strong foundation in quantitative approaches to data analyses. Although most science, engineering, and math field majors are required to take at least one statistics course, statistical analysis is poorly integrated into undergraduate biology course work, particularly at the lower-division level. Elements of statistics were incorporated into an introductory biology course, including a review of statistics concepts and opportunity for students to perform statistical analysis in a biological context. Learning gains were measured with an 11-item statistics learning survey instrument developed for the course. Students showed a statistically significant 25% (p < 0.005) increase in statistics knowledge after completing introductory biology. Students improved their scores on the survey after completing introductory biology, even if they had previously completed an introductory statistics course (9% improvement, p < 0.005). Students retested 1 yr after completing introductory biology showed no loss of their statistics knowledge as measured by this instrument, suggesting that the use of statistics in biology course work may aid long-term retention of statistics knowledge. No statistically significant differences in learning were detected between male and female students in the study. PMID:18765754
Powerlaw: a Python package for analysis of heavy-tailed distributions.

PubMed

Alstott, Jeff; Bullmore, Ed; Plenz, Dietmar

2014-01-01

Power laws are theoretically interesting probability distributions that are also frequently used to describe empirical data. In recent years, effective statistical methods for fitting power laws have been developed, but appropriate use of these techniques requires significant programming and statistical insight. In order to greatly decrease the barriers to using good statistical methods for fitting power law distributions, we developed the powerlaw Python package. This software package provides easy commands for basic fitting and statistical analysis of distributions. Notably, it also seeks to support a variety of user needs by being exhaustive in the options available to the user. The source code is publicly available and easily extensible.
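To give a sense of the kind of fitting the package automates, here is a standard-library sketch of the usual maximum-likelihood estimate of the exponent of a continuous power law above a known lower cutoff. It deliberately does not use the powerlaw package's own interface; the function name `fit_alpha`, the synthetic data, and the fixed `xmin` are assumptions made for the example. Choosing `xmin` from the data and comparing against alternative distributions are steps this sketch leaves out.

```python
import math
import random


def fit_alpha(data, xmin):
    """MLE of the exponent alpha for p(x) ~ x**(-alpha), x >= xmin (continuous case)."""
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    sigma = (alpha - 1.0) / math.sqrt(n)  # approximate standard error
    return alpha, sigma


# Synthetic power-law sample drawn by inverse-transform sampling (hypothetical values).
rng = random.Random(0)
true_alpha, xmin = 2.5, 1.0
sample = [xmin * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
          for _ in range(20000)]

alpha_hat, sigma_hat = fit_alpha(sample, xmin)
print(alpha_hat, sigma_hat)
```

Fitting a straight line to a log-log histogram is a common shortcut that tends to give biased exponents, which is one reason a likelihood-based estimate like this is preferred.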
A Comparative Analysis of the Minuteman Education Programs as Currently Offered at Six SAC Bases.

DTIC Science & Technology

1980-06-01

Principles of Marketing 3 Business Statistics 3 Business Law 3 Management Total... Principles of Marketing 3 Mathematics Methods I Total prerequisite hours 26 Required Graduate Courses Policy Formulation and Administration 3 Management...Business and Economic Statistics 3 Intermediate Business and Economic Statistics 3 Principles of Management 3 Corporation Finance 3 Principles of Marketing
An analysis of I/O efficient order-statistic-based techniques for noise power estimation in the HRMS sky survey's operational system

NASA Technical Reports Server (NTRS)

Zimmerman, G. A.; Olsen, E. T.

1992-01-01

Noise power estimation in the High-Resolution Microwave Survey (HRMS) sky survey element is considered as an example of a constant false alarm rate (CFAR) signal detection problem. Order-statistic-based noise power estimators for CFAR detection are considered in terms of required estimator accuracy and estimator dynamic range. By limiting the dynamic range of the value to be estimated, the performance of an order-statistic estimator can be achieved by simpler techniques requiring only a single pass of the data. Simple threshold-and-count techniques are examined, and it is shown how several parallel threshold-and-count estimation devices can be used to expand the dynamic range to meet HRMS system requirements with minimal hardware complexity. An input/output (I/O) efficient limited-precision order-statistic estimator with wide but limited dynamic range is also examined.
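The contrast between an order-statistic estimator and a threshold-and-count estimator can be sketched as follows. This is not the HRMS design: it assumes, for illustration only, that the noise power samples are exponentially distributed (as for the squared magnitude of complex Gaussian noise), and the function names, threshold, and sample size are hypothetical.

```python
import math
import random


def median_estimate(samples):
    """Order-statistic estimate of mean noise power; needs the sorted samples."""
    ordered = sorted(samples)
    median = ordered[len(ordered) // 2]
    return median / math.log(2.0)  # exponential median = mean * ln 2


def threshold_count_estimate(samples, threshold):
    """Single-pass estimate from the fraction of samples above a fixed threshold."""
    above = sum(1 for x in samples if x > threshold)
    frac = above / len(samples)
    if frac <= 0.0 or frac >= 1.0:
        return None  # noise level is outside this threshold's usable range
    return -threshold / math.log(frac)  # from P(x > T) = exp(-T / mean)


rng = random.Random(1)
true_mean = 2.0
noise = [rng.expovariate(1.0 / true_mean) for _ in range(50000)]

est_median = median_estimate(noise)
est_count = threshold_count_estimate(noise, threshold=2.0)
print(est_median, est_count)
```

A single threshold only works while a usable fraction of samples falls on each side of it, which is why the abstract describes running several threshold-and-count devices in parallel to widen the dynamic range.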
The Design and Analysis of Transposon-Insertion Sequencing Experiments

PubMed Central

Chao, Michael C.; Abel, Sören; Davis, Brigid M.; Waldor, Matthew K.

2016-01-01

Preface: Transposon-insertion sequencing (TIS) is a powerful approach that can be widely applied to genome-wide definition of loci that are required for growth in diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. Here, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to computational analysis of TIS data. PMID:26775926
Method for data analysis in different institutions: example of image guidance of prostate cancer patients.

PubMed

Piotrowski, T; Rodrigues, G; Bajon, T; Yartsev, S

2014-03-01

Multi-institutional collaborations allow for more information to be analyzed but the data from different sources may vary in the subgroup sizes and/or conditions of measuring. Rigorous statistical analysis is required for pooling the data in a larger set. Careful comparison of all the components of the data acquisition is indispensable: identical conditions allow for enlargement of the database with improved statistical analysis, clearly defined differences provide opportunity for establishing a better practice. The optimal sequence of required normality, asymptotic normality, and independence tests is proposed. An example of analysis of six subgroups of position corrections in three directions obtained during image guidance procedures for 216 prostate cancer patients from two institutions is presented. Copyright © 2013 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
76 FR 27563 - Margin and Capital Requirements for Covered Swap Entities

Federal Register 2010, 2011, 2012, 2013, 2014

2011-05-11

.... Board: Sean D. Campbell, Deputy Associate Director, Division of Research and Statistics, (202) 452-3761, Michael Gibson, Senior Associate Director, Division of Research and Statistics, (202) 452-2495, or Jeremy..., DC 20429. FHFA: Robert Collender, Principal Policy Analyst, Office of Policy Analysis and Research...
Statistics 101 for Radiologists.

PubMed

Anvari, Arash; Halpern, Elkan F; Samir, Anthony E

2015-10-01

Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
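Several of the diagnostic measures listed in this abstract follow directly from a 2x2 table of test results against a reference standard. The sketch below uses their conventional definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the review.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)          # diseased cases correctly detected
    specificity = tn / (tn + fp)          # healthy cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts: 100 diseased and 100 healthy subjects.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
print(metrics)
```

With these counts, a positive result is about 4.5 times as likely in a diseased subject as in a healthy one. The likelihood ratios are undefined when specificity is exactly 0 or 1, a case this sketch does not guard against.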
Preparing for the first meeting with a statistician.

PubMed

De Muth, James E

2008-12-15

Practical statistical issues that should be considered when performing data collection and analysis are reviewed. The meeting with a statistician should take place early in the research development before any study data are collected. The process of statistical analysis involves establishing the research question, formulating a hypothesis, selecting an appropriate test, sampling correctly, collecting data, performing tests, and making decisions. Once the objectives are established, the researcher can determine the characteristics or demographics of the individuals required for the study, how to recruit volunteers, what type of data are needed to answer the research question(s), and the best methods for collecting the required information. There are two general types of statistics: descriptive and inferential. Presenting data in a more palatable format for the reader is called descriptive statistics. Inferential statistics involve making an inference or decision about a population based on results obtained from a sample of that population. In order for the results of a statistical test to be valid, the sample should be representative of the population from which it is drawn. When collecting information about volunteers, researchers should only collect information that is directly related to the study objectives. Important information that a statistician will require first is an understanding of the type of variables involved in the study and which variables can be controlled by researchers and which are beyond their control. Data can be presented in one of four different measurement scales: nominal, ordinal, interval, or ratio. Hypothesis testing involves two mutually exclusive and exhaustive statements related to the research question. Statisticians should not be replaced by computer software, and they should be consulted before any research data are collected. When preparing to meet with a statistician, the pharmacist researcher should be familiar with the steps of statistical analysis and consider several questions related to the study to be conducted.
Assessing the Kansas water-level monitoring program: An example of the application of classical statistics to a geological problem

USGS Publications Warehouse

Davis, J.C.

2000-01-01

Geologists may feel that geological data are not amenable to statistical analysis, or at best require specialized approaches such as nonparametric statistics and geostatistics. However, there are many circumstances, particularly in systematic studies conducted for environmental or regulatory purposes, where traditional parametric statistical procedures can be beneficial. An example is the application of analysis of variance to data collected in an annual program of measuring groundwater levels in Kansas. Influences such as well conditions, operator effects, and use of the water can be assessed and wells that yield less reliable measurements can be identified. Such statistical studies have resulted in yearly improvements in the quality and reliability of the collected hydrologic data. Similar benefits may be achieved in other geological studies by the appropriate use of classical statistical tools.
Statistical Analysis of speckle noise reduction techniques for echocardiographic Images

NASA Astrophysics Data System (ADS)

Saini, Kalpana; Dewal, M. L.; Rohit, Manojkumar

2011-12-01

Echocardiography is a safe, easy, and fast technology for diagnosing cardiac diseases. Like other ultrasound images, echocardiographic images contain speckle noise. In some cases this speckle noise is useful, such as in motion detection, but in general noise removal is required for better analysis of the image and proper diagnosis. Different adaptive and anisotropic filters are included in the statistical analysis. Statistical parameters such as Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Root Mean Square Error (RMSE) are calculated for performance measurement. Another important aspect is that blurring may occur during speckle noise removal, so it is preferred that the filter be able to enhance edges during noise removal.
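Two of the quality measures named above can be written out briefly. The abstract does not give the formulas the authors used, so this sketch assumes the conventional definitions of RMSE and PSNR for 8-bit images (peak value 255); the function names and the tiny 2x2 "images" are hypothetical.

```python
import math


def rmse(reference, filtered):
    """Root mean square error between two equally sized 2-D images (lists of rows)."""
    squared = [(r - f) ** 2
               for ref_row, fil_row in zip(reference, filtered)
               for r, f in zip(ref_row, fil_row)]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, filtered, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = rmse(reference, filtered)
    if error == 0.0:
        return math.inf
    return 20.0 * math.log10(peak / error)


# Hypothetical reference image and a filtered version that differs in two pixels.
reference = [[100, 100], [100, 100]]
filtered = [[110, 90], [100, 100]]
error = rmse(reference, filtered)
quality_db = psnr(reference, filtered)
print(error, quality_db)
```

Lower RMSE and higher PSNR indicate that a filter's output is closer to the reference, although neither measure captures the edge blurring that the abstract also flags as a concern.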
Application of statistical process control and process capability analysis procedures in orbiter processing activities at the Kennedy Space Center

NASA Technical Reports Server (NTRS)

Safford, Robert R.; Jackson, Andrew E.; Swart, William W.; Barth, Timothy S.

1994-01-01

Successful ground processing at KSC requires that flight hardware and ground support equipment conform to specifications at tens of thousands of checkpoints. Knowledge of conformance is an essential requirement for launch. That knowledge of conformance at every requisite point does not, however, enable identification of past problems with equipment, or potential problem areas. This paper describes how the introduction of Statistical Process Control and Process Capability Analysis identification procedures into existing shuttle processing procedures can enable identification of potential problem areas and candidates for improvements to increase processing performance measures. Results of a case study describing application of the analysis procedures to Thermal Protection System processing are used to illustrate the benefits of the approaches described in the paper.
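Process capability analysis goes beyond a pass/fail conformance check by comparing the spread of a measured characteristic with its specification limits. The sketch below uses the conventional Cp and Cpk indices; the measurements and specification limits are invented for illustration and do not come from the Thermal Protection System case study, and the function name `process_capability` is hypothetical.

```python
import statistics


def process_capability(measurements, lower_spec, upper_spec):
    """Return (Cp, Cpk) from the sample mean and sample standard deviation."""
    mean = statistics.mean(measurements)
    sigma = statistics.stdev(measurements)
    # Cp compares the tolerance width with the natural process spread.
    cp = (upper_spec - lower_spec) / (6.0 * sigma)
    # Cpk also penalizes a process that is off-center within the tolerance.
    cpk = min(upper_spec - mean, mean - lower_spec) / (3.0 * sigma)
    return cp, cpk


# Hypothetical measurements, all of which conform to the limits 9.5 to 11.0.
measurements = [9.8, 10.0, 10.2, 10.0, 10.0]
cp, cpk = process_capability(measurements, lower_spec=9.5, upper_spec=11.0)
print(cp, cpk)
```

Every measurement here is within specification, yet Cpk is noticeably lower than Cp because the process sits closer to the lower limit. That gap is the kind of potential problem area a conformance check alone would not reveal.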
Proceedings of the second annual Forest Inventory and Analysis symposium; Salt Lake City, UT. October 17-18, 2000

Treesearch

Gregory A. Reams; Ronald E. McRoberts; Paul C. van Deusen; [Editors]

2001-01-01

Documents progress in developing techniques in remote sensing, statistics, information management, and analysis required for full implementation of the national Forest Inventory and Analysis program's annual forest inventory system.
Study design and statistical analysis of data in human population studies with the micronucleus assay.

PubMed

Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

2011-01-01

The most common study design in population studies based on the micronucleus (MN) assay is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows possible errors in the dataset to be detected and the validity of assumptions required for more complex analyses to be checked. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even now only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.

ODM Data Analysis-A tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data.

PubMed

Brix, Tobias Johannes; Bruland, Philipp; Sarfraz, Saad; Ernsting, Jan; Neuhaus, Philipp; Storck, Michael; Doods, Justin; Ständer, Sonja; Dugas, Martin

2018-01-01

A required step for presenting results of clinical studies is the declaration of participants' demographic and baseline characteristics, as mandated by FDAAA 801. The common workflow to accomplish this task is to export the clinical data from the electronic data capture system in use and import it into statistical software such as SAS or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item. These expenditures may become an obstacle for small studies. The objective of this work is to design, implement, and evaluate an open source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data. The system requires clinical data in the CDISC Operational Data Model format. After the file is uploaded, its syntax and the data type conformity of the collected data are validated. The completeness of the study data is determined and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies have been used to evaluate the application's performance and functionality. The system is implemented as an open source web application (available at https://odmanalysis.uni-muenster.de) and also provided as a Docker image, which enables easy distribution and installation on local systems. Study data are stored in the application only while the calculations are performed, which is compliant with data protection efforts. Analysis times are below half an hour, even for larger studies with over 6000 subjects. Medical experts have confirmed the usefulness of this application to grant an overview of their collected study data for monitoring purposes and to generate descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analysis of statisticians, but it can be used as a starting point for their examination and reporting.
Deep learning for media analysis in defense scenarios: an evaluation of an open source framework for object detection in intelligence related image sets

DTIC Science & Technology

2017-06-01

Training time statistics from Jones’ thesis ... Table 2.2 Evaluation runtime statistics from Camp’s thesis for a single image ... Table 2.3 Training and evaluation runtime statistics from Sharpe’s thesis ... Table 2.4 Sharpe’s screenshot detector results for combinations of...training resources available and time required for each algorithm Jones [15] tested. Table 2.1. Training time statistics from Jones’ [15] thesis. Algorithm
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2012 CFR

2012-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2014 CFR

2014-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2013 CFR

2013-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

NASA Technical Reports Server (NTRS)

Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

2001-01-01

The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. Unlike other Rotation Techniques (RT), this rotation uses no localization criterion; only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.
A crash course on data analysis in asteroseismology

NASA Astrophysics Data System (ADS)

Appourchaux, Thierry

2014-02-01

In this course, I try to provide a few basics required for performing data analysis in asteroseismology. First, I address how one can properly treat time series: the sampling, the filtering effect, the use of the Fourier transform, the associated statistics. Second, I address how one can apply statistics for decision making and for parameter estimation in either a frequentist or a Bayesian framework. Last, I review how these basic principles have been applied (or not) in asteroseismology.
Data analysis of gravitational-wave signals from spinning neutron stars. III. Detection statistics and computational requirements

NASA Astrophysics Data System (ADS)

Jaranowski, Piotr; Królak, Andrzej

2000-03-01

We develop the analytic and numerical tools for data analysis of the continuous gravitational-wave signals from spinning neutron stars for ground-based laser interferometric detectors. The statistical data analysis method that we investigate is maximum likelihood detection, which for the case of Gaussian noise reduces to matched filtering. We study in detail the statistical properties of the optimum functional that needs to be calculated in order to detect the gravitational-wave signal and estimate its parameters. We find it particularly useful to divide the parameter space into elementary cells such that the values of the optimal functional are statistically independent in different cells. We derive formulas for false alarm and detection probabilities both for the optimal and the suboptimal filters. We assess the computational requirements needed to do the signal search. We compare a number of criteria to build sufficiently accurate templates for our data analysis scheme. We verify the validity of our concepts and formulas by means of Monte Carlo simulations. We present algorithms by which one can estimate the parameters of the continuous signals accurately. We find, confirming earlier work of other authors, that given 100 Gflops of computational power, an all-sky search for an observation time of 7 days and a directed search for an observation time of 120 days are possible, whereas an all-sky search for 120 days of observation time is computationally prohibitive.
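For orientation, the reduction of maximum likelihood detection to matched filtering mentioned above can be written in its standard textbook form. This is not quoted from the abstract, and the notation is assumed here: x is the detector data, h the signal template, S_h the one-sided noise spectral density, and a tilde denotes a Fourier transform.

```latex
\ln \Lambda = (x|h) - \tfrac{1}{2}\,(h|h),
\qquad
(x|h) = 4\,\mathrm{Re} \int_0^{\infty}
        \frac{\tilde{x}(f)\,\tilde{h}^{*}(f)}{S_h(f)}\,df
```

Maximizing the log likelihood ratio over the template parameters therefore amounts to correlating the data with noise-weighted templates, which is the matched filter.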
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

PubMed Central

Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

2013-01-01

We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
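The aggregation of study-specific score statistics described above can be sketched for the simplest case, a burden test under homogeneous effects. This is a minimal illustration, not the authors' software: the function name `meta_burden_test`, the input layout, and the example numbers are hypothetical, and the statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math

def meta_burden_test(studies, weights):
    """Combine per-study score vectors and covariance matrices (hypothetical layout).

    studies: list of (score_vector, covariance_matrix) pairs, one per study,
             each covering the same variants in the same order.
    weights: one burden weight per variant.
    Returns (statistic, p_value), assuming homogeneous effects across studies.
    """
    m = len(weights)
    total_score = [0.0] * m
    total_cov = [[0.0] * m for _ in range(m)]
    # Sum the summary statistics across studies; no individual-level data needed.
    for score, cov in studies:
        for j in range(m):
            total_score[j] += score[j]
            for k in range(m):
                total_cov[j][k] += cov[j][k]
    numerator = sum(w * s for w, s in zip(weights, total_score))
    denominator = sum(weights[j] * total_cov[j][k] * weights[k]
                      for j in range(m) for k in range(m))
    statistic = numerator ** 2 / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Made-up summary statistics for two studies and two variants.
example_studies = [
    ([2.0, 1.0], [[4.0, 1.0], [1.0, 3.0]]),
    ([1.0, 2.0], [[2.0, 0.0], [0.0, 3.0]]),
]
burden_stat, burden_p = meta_burden_test(example_studies, [1.0, 1.0])
```

Variance component tests and heterogeneous-effect models, which the abstract also covers, need more than this sum and are not shown.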
Bayesian Statistics in Educational Research: A Look at the Current State of Affairs

ERIC Educational Resources Information Center

König, Christoph; van de Schoot, Rens

2018-01-01

The ability of a scientific discipline to build cumulative knowledge depends on its predominant method of data analysis. A steady accumulation of knowledge requires approaches which allow researchers to consider results from comparable prior research. Bayesian statistics is especially relevant for establishing a cumulative scientific discipline,…
Improved analyses using function datasets and statistical modeling

Treesearch

John S. Hogland; Nathaniel M. Anderson

2014-01-01

Raster modeling is an integral component of spatial analysis. However, conventional raster modeling techniques can require a substantial amount of processing time and storage space and have limited statistical functionality and machine learning algorithms. To address this issue, we developed a new modeling framework using C# and ArcObjects and integrated that framework...
A PROPOSED CHEMICAL INFORMATION AND DATA SYSTEM. VOLUME I.

DTIC Science & Technology

CHEMICAL COMPOUNDS, *DATA PROCESSING, *INFORMATION RETRIEVAL, *CHEMICAL ANALYSIS, INPUT OUTPUT DEVICES, COMPUTER PROGRAMMING, CLASSIFICATION...CONFIGURATIONS, DATA STORAGE SYSTEMS, ATOMS, MOLECULES, PERFORMANCE (ENGINEERING), MAINTENANCE, SUBJECT INDEXING, MAGNETIC TAPE, AUTOMATIC, MILITARY REQUIREMENTS, TYPEWRITERS, OPTICS, TOPOLOGY, STATISTICAL ANALYSIS, FLOW CHARTING.
Statistical Relational Learning (SRL) as an Enabling Technology for Data Acquisition and Data Fusion in Video

DTIC Science & Technology

2013-05-02

...particular, it is important to reason about which portions of video require expensive analysis and storage. This project aims to make these...inferences using new and existing tools from Statistical Relational Learning (SRL). SRL is a recently emerging technology that enables the effective
The Role of Margin in Link Design and Optimization

NASA Technical Reports Server (NTRS)

Cheung, K.

2015-01-01

Link analysis is a system engineering process in the design, development, and operation of communication systems and networks. Link models are mathematical abstractions representing the useful signal power and the undesirable noise and attenuation effects (including weather effects if the signal path traverses the atmosphere); they are integrated into the link budget calculation, which provides estimates of signal power and noise power at the receiver. The link margin is then applied to counteract the fluctuations of the signal and noise power and ensure reliable data delivery from transmitter to receiver. (Link margin is dictated by the link margin policy or requirements.) A simple link budgeting approach assumes link parameters to be deterministic values and typically adopts a rule-of-thumb policy of 3 dB link margin. This policy works for most S- and X-band links due to their insensitivity to weather effects. But for higher frequency links like Ka-band, Ku-band, and optical communication links, it is unclear if a 3 dB link margin would guarantee link closure. Statistical link analysis that adopts the 2-sigma or 3-sigma link margin incorporates link uncertainties in the sigma calculation. (The Deep Space Network (DSN) link margin policies are 2-sigma for downlink and 3-sigma for uplink.) The link reliability can therefore be quantified statistically even for higher frequency links. However, in the current statistical link analysis approach, link reliability is only expressed as the likelihood of exceeding the signal-to-noise ratio (SNR) threshold that corresponds to a given bit-error-rate (BER) or frame-error-rate (FER) requirement. The method does not provide the true BER or FER estimate of the link with margin, or the required signal-to-noise ratio (SNR) that would meet the BER or FER requirement in the statistical sense.
In this paper, we perform in-depth analysis on the relationship between BER/FER requirement, operating SNR, and coding performance curve, in the case when the channel coherence time of link fluctuation is comparable to or larger than the time duration of a codeword. We compute the "true" SNR design point that would meet the BER/FER requirement by taking into account the fluctuation of signal power and noise power at the receiver, and the shape of the coding performance curve. This analysis yields a number of valuable insights on the design choices of coding scheme and link margin for the reliable data delivery of a communication system - space and ground. We illustrate the aforementioned analysis using a number of standard NASA error-correcting codes.
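The distinction drawn above, between the probability of exceeding an SNR threshold and the error rate actually experienced under a fluctuating SNR, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the received SNR is taken to be Gaussian in dB, the uncoded BPSK curve stands in for a real coding performance curve (the paper uses NASA error-correcting codes, which are not modeled), and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def closure_probability(n_sigma):
    """P(SNR >= threshold) when the mean SNR sits n_sigma standard
    deviations above the threshold and the SNR in dB is Gaussian."""
    return NormalDist().cdf(n_sigma)

def bpsk_ber(snr_db):
    """Bit error rate of uncoded BPSK at a given Eb/N0 in dB
    (stand-in for a coding performance curve)."""
    ebn0 = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))

def mean_ber(mean_snr_db, sigma_db, steps=2001, span=6.0):
    """Average the BER curve over a Gaussian SNR distribution (midpoint rule).
    Requires sigma_db > 0."""
    dist = NormalDist(mean_snr_db, sigma_db)
    low = mean_snr_db - span * sigma_db
    width = 2.0 * span * sigma_db / steps
    total = 0.0
    for i in range(steps):
        x = low + (i + 0.5) * width
        total += bpsk_ber(x) * dist.pdf(x) * width
    return total

p_two_sigma = closure_probability(2.0)    # about 0.977
p_three_sigma = closure_probability(3.0)  # about 0.9987
ber_at_mean = bpsk_ber(7.0)
ber_averaged = mean_ber(7.0, 1.0)
```

In this toy setting the averaged BER is larger than the BER evaluated at the mean SNR, because the curve is convex around the operating point; a design point chosen from the threshold-exceedance probability alone would not capture that.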
Statistical analysis of flight times for space shuttle ferry flights

NASA Technical Reports Server (NTRS)

Graves, M. E.; Perlmutter, M.

1974-01-01

Markov chain and Monte Carlo analysis techniques are applied to the simulated Space Shuttle Orbiter Ferry flights to obtain statistical distributions of flight time duration between Edwards Air Force Base and Kennedy Space Center. The two methods are compared, and are found to be in excellent agreement. The flights are subjected to certain operational and meteorological requirements, or constraints, which cause eastbound and westbound trips to yield different results. Persistence of events theory is applied to the occurrence of inclement conditions to find their effect upon the statistical flight time distribution. In a sensitivity test, some of the constraints are varied to observe the corresponding changes in the results.
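A toy version of the comparison between the two methods can be sketched as follows. The model is hypothetical and far simpler than the study's: weather is a two-state chain (good or bad) with day-to-day persistence, one flight leg is flown on each good day, and the transition probabilities and number of legs are invented for the example.

```python
import random

def expected_days_markov(legs, p_good_after_good, p_good_after_bad):
    """Expected trip duration in days from the Markov chain recursion,
    starting on a good-weather day with `legs` legs to fly."""
    expected_wait_in_bad = 1.0 / p_good_after_bad  # mean length of a bad spell
    expected = 1.0  # the final leg takes one good day
    for _ in range(legs - 1):
        # One flying day, plus a possible bad spell before the next leg.
        expected += 1.0 + (1.0 - p_good_after_good) * expected_wait_in_bad
    return expected

def simulate_days(legs, p_good_after_good, p_good_after_bad, rng):
    """One Monte Carlo realization of the same process."""
    days, remaining, good = 0, legs, True
    while True:
        days += 1
        if good:
            remaining -= 1
            if remaining == 0:
                return days
        p_good = p_good_after_good if good else p_good_after_bad
        good = rng.random() < p_good

def monte_carlo_mean(legs, p_good_after_good, p_good_after_bad,
                     trials=20000, seed=1):
    rng = random.Random(seed)
    total = sum(simulate_days(legs, p_good_after_good, p_good_after_bad, rng)
                for _ in range(trials))
    return total / trials

analytic_days = expected_days_markov(4, 0.8, 0.5)
simulated_days = monte_carlo_mean(4, 0.8, 0.5)
```

With these made-up parameters the recursion gives 5.2 days, and the simulated mean should agree with it to within sampling error, mirroring the agreement the abstract reports for the full model.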
Quasi-Experimental Analysis: A Mixture of Methods and Judgment.

ERIC Educational Resources Information Center

Cordray, David S.

1986-01-01

The role of human judgment in the development and synthesis of evidence has not been adequately developed or acknowledged within quasi-experimental analysis. Corrective solutions need to confront the fact that causal analysis within complex environments will require a more active assessment that entails reasoning and statistical modeling.…
Trial Sequential Methods for Meta-Analysis

ERIC Educational Resources Information Center

Kulinskaya, Elena; Wood, John

2014-01-01

Statistical methods for sequential meta-analysis have applications also for the design of new trials. Existing methods are based on group sequential methods developed for single trials and start with the calculation of a required information size. This works satisfactorily within the framework of fixed effects meta-analysis, but conceptual…
SPA - STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS

NASA Technical Reports Server (NTRS)

Brownlow, J. D.

1994-01-01

The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components.
Frequency components with periods longer than the data collection interval are removed by least-squares detrending. As many as ten channels of data may be analyzed at one time. Both tabular and plotted output may be generated by the SPA program. This program is written in FORTRAN IV and has been implemented on a CDC 6000 series computer with a central memory requirement of approximately 142K (octal) of 60 bit words. This core requirement can be reduced by segmentation of the program. The SPA program was developed in 1978.
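The time domain quantities and the least-squares detrending step listed above are simple enough to sketch. This is an illustration in Python, not the FORTRAN IV program; the function names are hypothetical, and using the n - 1 divisor for the variance is an assumption, since the abstract does not say which convention SPA follows.

```python
import math

def time_domain_stats(x):
    """Summary statistics of a time series treated as independent observations."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)  # assumed n - 1 divisor
    mean_square = sum(v * v for v in x) / n
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "mean_square": mean_square,
        "rms": math.sqrt(mean_square),
        "max": max(x),
        "min": min(x),
    }

def detrend_least_squares(x):
    """Remove the best-fit straight line, assuming equally spaced samples."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    slope = (sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return [v - (x_mean + slope * (i - t_mean)) for i, v in enumerate(x)]

stats = time_domain_stats([1.0, 2.0, 3.0, 4.0])
residual = detrend_least_squares([1.0, 2.0, 3.0, 4.0])
```

A pure linear ramp such as the one in the example detrends to zeros, which is a quick sanity check before applying the same step to real data.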
Multi-reader ROC studies with split-plot designs: a comparison of statistical methods.

PubMed

Obuchowski, Nancy A; Gallas, Brandon D; Hillis, Stephen L

2012-12-01

Multireader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of this design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper, the authors compare three methods of analysis for the split-plot design. Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean analysis-of-variance approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% confidence intervals falls close to the nominal coverage for small and large sample sizes. The split-plot multireader, multicase study design can be statistically efficient compared to the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal confidence interval coverage, are available for this study design. Copyright © 2012 AUR. All rights reserved.
[A Review on the Use of Effect Size in Nursing Research].

PubMed

Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae

2015-10-01

The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For t-test, analysis of variance, correlation analysis, regression analysis which are used frequently in nursing research, the generally accepted definitions of the effect size were explained. Some formulae for calculating the effect size are described with several examples in nursing research. Furthermore, the authors present the required minimum sample size for each example utilizing G*Power 3 software that is the most widely used program for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and the reliance on only one side may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measure in order to make more balanced decisions in quantitative analyses.
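As a concrete instance of the effect size and sample size calculations discussed above, the sketch below computes Cohen's d for two independent groups and an approximate sample size per group for a two-sided two-sample comparison. It is not the article's worked example: the function names and data are hypothetical, and the sample size uses the normal approximation, so it can come out slightly smaller than the t-distribution-based figure a program such as G*Power reports.

```python
import math
from statistics import NormalDist, mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (((na - 1) * variance(group_a) + (nb - 1) * variance(group_b))
                  / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / d) ** 2)

d_example = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
n_medium = n_per_group(0.5)
```

Reporting d alongside the p-value, as the abstract recommends, keeps the size of the difference visible even when a large sample makes a small difference statistically significant.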

SimHap GUI: an intuitive graphical user interface for genetic association analysis.

PubMed

Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

2008-12-25

Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that allows anyone but a professional statistician to effectively utilise the tool. We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis.
Exceedance statistics of accelerations resulting from thruster firings on the Apollo-Soyuz mission

NASA Technical Reports Server (NTRS)

Fichtl, G. H.; Holland, R. L.

1981-01-01

Spacecraft acceleration resulting from firings of vernier control system thrusters is an important consideration in the design, planning, execution and post-flight analysis of laboratory experiments in space. In particular, scientists and technologists involved with the development of experiments to be performed in space in many instances require statistical information on the magnitude and rate of occurrence of spacecraft accelerations. Typically, these accelerations are stochastic in nature, so that it is useful to characterize these accelerations in statistical terms. Statistics of spacecraft accelerations are summarized.
Progress of statistical analysis in biomedical research through the historical review of the development of the Framingham score.

PubMed

Ignjatović, Aleksandra; Stojanović, Miodrag; Milošević, Zoran; Anđelković Apostolović, Marija

2017-12-02

Developing risk models in medicine is not only appealing but also associated with many obstacles in different aspects of predictive model development. Initially, the association of one or more biomarkers with a specific outcome was proven by statistical significance, but novel and demanding questions required the development of new and more complex statistical techniques. Progress of statistical analysis in biomedical research can best be observed through the history of the Framingham study and the development of the Framingham score. Evaluation of predictive models comes from a combination of the facts which are results of several metrics. Using logistic regression and Cox proportional hazards regression analysis, the calibration test, and the ROC curve analysis should be mandatory and eliminatory, and the central place should be taken by some new statistical techniques. To obtain complete information about a new marker in the model, it has recently been recommended to use reclassification tables, calculating the net reclassification index and the integrated discrimination improvement. Decision curve analysis is a novel method for evaluating the clinical usefulness of a predictive model. It may be noted that customizing and fine-tuning of the Framingham risk score initiated the development of statistical analysis. A clinically applicable predictive model should be a trade-off between all abovementioned statistical metrics, a trade-off between calibration and discrimination, accuracy and decision-making, costs and benefits, and quality and quantity of patient's life.
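The net reclassification index mentioned above has a short categorical definition that a sketch makes concrete: the net proportion of events moved to a higher risk category plus the net proportion of non-events moved to a lower one. The function name, the integer coding of risk categories, and the six-subject example are all hypothetical.

```python
def net_reclassification_index(old_category, new_category, had_event):
    """Categorical NRI. Categories are integers, higher meaning higher risk;
    had_event holds True/False outcomes for the same subjects."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for old, new, event in zip(old_category, new_category, had_event):
        if event:
            n_event += 1
            up_event += new > old
            down_event += new < old
        else:
            n_nonevent += 1
            up_nonevent += new > old
            down_nonevent += new < old
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents

# Made-up data: three subjects with the event, three without.
nri_example = net_reclassification_index(
    old_category=[0, 0, 1, 1, 0, 1],
    new_category=[1, 0, 1, 0, 0, 0],
    had_event=[True, True, True, False, False, False],
)
```

In the example one of three events moves up and two of three non-events move down, so the two components are 1/3 and 2/3. Reporting the components separately is often more informative than their sum.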
The analysis of the statistical and historical information gathered during the development of the Shuttle Orbiter Primary Flight Software

NASA Technical Reports Server (NTRS)

Simmons, D. B.; Marchbanks, M. P., Jr.; Quick, M. J.

1982-01-01

The results of an effort to thoroughly and objectively analyze the statistical and historical information gathered during the development of the Shuttle Orbiter Primary Flight Software are given. The particular areas of interest include cost of the software, reliability of the software, requirements for the software and how the requirements changed during development of the system. Data related to the current version of the software system produced some interesting results. Suggestions are made for the saving of additional data which will allow additional investigation.
LP-search and its use in analysis of the accuracy of control systems with acoustical models

NASA Technical Reports Server (NTRS)

Sergeyev, V. I.; Sobol, I. M.; Statnikov, R. B.; Statnikov, I. N.

1973-01-01

The LP-search is proposed as an analog of the Monte Carlo method for finding values in nonlinear statistical systems. It is concluded that, to attain the required accuracy in solving the control problem for a statistical system, the LP-search requires considerably fewer tests than the Monte Carlo method. The LP-search also allows multiple repetitions of tests under identical conditions and observability of the output variables of the system.
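The general idea behind replacing random trial points with a deterministic, evenly spread sequence can be sketched as below. Note the substitution: the sketch uses a two-dimensional Halton sequence, not the LP sequences of the report, because the Halton construction fits in a few lines. The test function, the point count, and all names are hypothetical.

```python
import random

def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def halton_points(count):
    """First `count` points of the 2-D Halton sequence (bases 2 and 3)."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, count + 1)]

def estimate_mean(points, func):
    return sum(func(x, y) for x, y in points) / len(points)

def product(x, y):
    return x * y  # exact mean over the unit square is 0.25

n_points = 1024
quasi_estimate = estimate_mean(halton_points(n_points), product)

rng = random.Random(0)
random_points = [(rng.random(), rng.random()) for _ in range(n_points)]
random_estimate = estimate_mean(random_points, product)

quasi_error = abs(quasi_estimate - 0.25)
random_error = abs(random_estimate - 0.25)
```

Because the quasi-random points are deterministic, a run can be repeated under identical conditions, which is one of the properties the abstract attributes to the LP-search.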
Teaching Students to Use Summary Statistics and Graphics to Clean and Analyze Data

ERIC Educational Resources Information Center

Holcomb, John; Spalsbury, Angela

2005-01-01

Textbooks and websites today abound with real data. One neglected issue is that statistical investigations often require a good deal of "cleaning" to ready data for analysis. The purpose of this dataset and exercise is to teach students to use exploratory tools to identify erroneous observations. This article discusses the merits of such…
Analysis of USAREUR Family Housing.

DTIC Science & Technology

1985-04-01

Standard Installation/Division Personnel System SJA ... Staff Judge Advocate SPSS ... Statistical Package for the...for Projecting Family Housing Requirements. a. Attempts to define USAREUR’s programmable family housing deficit based on the FHS have caused anguish ...responses using the Statistical Package for the Social Sciences (SPSS) computer program. E-2 ANNEX E RESPONSE TO ESC HOUSING QUESTIONNAIRE Section Page I
Multiple linear regression analysis

NASA Technical Reports Server (NTRS)

Edwards, T. R.

1980-01-01

Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
The Effect of Using Case Studies in Business Statistics

ERIC Educational Resources Information Center

Pariseau, Susan E.; Kezim, Boualem

2007-01-01

The authors evaluated the effect on learning of using case studies in business statistics courses. The authors divided students into 3 groups: a control group, a group that completed 1 case study, and a group that completed 3 case studies. Results evidenced that, on average, students whom the authors required to complete a case analysis received…
77 FR 17460 - Proposed Collection; Comment Request

Federal Register 2010, 2011, 2012, 2013, 2014

2012-03-26

..., Associated Form, and OMB Control Number: The 2012 Post- Election Survey of State and Local Election Officials; OMB Control Number 0704-0125. Needs and Uses: The information collection requirement is necessary to.... 1973ff]). UOCAVA requires a statistical analysis report to the President and Congress on the...
Asymptotic modal analysis and statistical energy analysis

NASA Technical Reports Server (NTRS)

Dowell, Earl H.

1992-01-01

Asymptotic Modal Analysis (AMA) is a method which is used to model linear dynamical systems with many participating modes. The AMA method was originally developed to show the relationship between statistical energy analysis (SEA) and classical modal analysis (CMA). In the limit of a large number of modes of a vibrating system, the classical modal analysis result can be shown to be equivalent to the statistical energy analysis result. As the CMA result evolves into the SEA result, a number of systematic assumptions are made. Most of these assumptions are based upon the supposition that the number of modes approaches infinity. It is for this reason that the term 'asymptotic' is used. AMA is the asymptotic result of taking the limit of CMA as the number of modes approaches infinity. AMA refers to any of the intermediate results between CMA and SEA, as well as the SEA result which is derived from CMA. The main advantage of the AMA method is that individual modal characteristics are not required in the model or computations. By contrast, CMA requires that each modal parameter be evaluated at each frequency. In the latter, contributions from each mode are computed and the final answer is obtained by summing over all the modes in the particular band of interest. AMA evaluates modal parameters only at their center frequency and does not sum the individual contributions from each mode in order to obtain a final result. The method is similar to SEA in this respect. However, SEA is only capable of obtaining spatial averages or means, as it is a statistical method. Since AMA is systematically derived from CMA, it can obtain local spatial information as well.
Statistical analysis and application of quasi experiments to antimicrobial resistance intervention studies.

PubMed

Shardell, Michelle; Harris, Anthony D; El-Kamary, Samer S; Furuno, Jon P; Miller, Ram R; Perencevich, Eli N

2007-10-01

Quasi-experimental study designs are frequently used to assess interventions that aim to limit the emergence of antimicrobial-resistant pathogens. However, previous studies using these designs have often used suboptimal statistical methods, which may result in researchers making spurious conclusions. Methods used to analyze quasi-experimental data include 2-group tests, regression analysis, and time-series analysis, and they all have specific assumptions, data requirements, strengths, and limitations. An example of a hospital-based intervention to reduce methicillin-resistant Staphylococcus aureus infection rates and reduce overall length of stay is used to explore these methods.
Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

PubMed Central

Reif, David M.; Israel, Mark A.; Moore, Jason H.

2007-01-01

The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666
a Study of Women Engineering Students and Time to Completion of First-Year Required Courses at Texas A&M University

NASA Astrophysics Data System (ADS)

Kimball, Jorja; Cole, Bryan; Hobson, Margaret; Watson, Karan; Stanley, Christine

This paper reports findings on gender that were part of a larger study reviewing time to completion of course work that includes the first two semesters of calculus, chemistry, and physics, which are often considered the stumbling points or "barrier courses" to an engineering baccalaureate degree. Texas A&M University terms these courses core body of knowledge (CBK), and statistical analysis was conducted on two cohorts of first-year enrolling engineering students at the institution. Findings indicate that gender is statistically significantly related to completion of CBK with female engineering students completing required courses faster than males at the .01 level (p = 0.008). Statistical significance for gender and ethnicity was found between white male and white female students at the .01 level (p = 0.008). Descriptive analysis indicated that of the five majors studied (chemical, civil, computer, electrical, and mechanical engineering), women completed CBK faster than men, and African American and Hispanic women completed CBK faster than males of the same ethnicity.
Flight path control strategies and preliminary deltaV requirements for the 2007 Mars Phoenix (PHX) mission

NASA Technical Reports Server (NTRS)

Raofi, Behzad

2005-01-01

This paper describes the methods used to estimate the statistical deltaV requirements for the propulsive maneuvers that will deliver the spacecraft to its target landing site while satisfying planetary protection requirements. The paper presents flight path control analysis results for three different trajectories: the open, middle, and close of the launch period for the mission.
Visualization of the variability of 3D statistical shape models by animation.

PubMed

Lamecker, Hans; Seebass, Martin; Lange, Thomas; Hege, Hans-Christian; Deuflhard, Peter

2004-01-01

Models of the 3D shape of anatomical objects and knowledge about their statistical variability are of great benefit in many computer-assisted medical applications such as image analysis and therapy or surgery planning. Statistical models of shape have been applied successfully to automate the task of image segmentation. The generation of 3D statistical shape models requires the identification of corresponding points on two shapes. This remains a difficult problem, especially for shapes of complicated topology. In order to interpret and validate variations encoded in a statistical shape model, visual inspection is of great importance. This work describes the generation and interpretation of statistical shape models of the liver and the pelvic bone.
Exploratory study on a statistical method to analyse time resolved data obtained during nanomaterial exposure measurements

NASA Astrophysics Data System (ADS)

Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.

2013-04-01

Most of the measurement strategies that are suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring, in real time, airborne particle concentrations (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature. These range from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the search for an appropriate and robust method continues. In this context, this exploratory study investigates a statistical method to analyse time resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data have been used from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films. In this workplace study, the background issue has been addressed through the near-field and far-field approaches, and several size integrated and time resolved devices have been used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. While one was measuring at the source of the released particles, the other one was measuring in parallel far-field. The Bayesian probabilistic approach allows a probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time resolved data obtained at the source can be compared with the probability distributions issuing from the time resolved data obtained far-field, leading to a quantitative estimation of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of the results requires specific experience in statistics.
Statistical analysis of nonmonotonic dose-response relationships: research design and analysis of nasal cell proliferation in rats exposed to formaldehyde.

PubMed

Gaylor, David W; Lutz, Werner K; Conolly, Rory B

2004-01-01

Statistical analyses of nonmonotonic dose-response curves are proposed, experimental designs to detect low-dose effects of J-shaped curves are suggested, and sample sizes are provided. For quantal data such as cancer incidence rates, much larger numbers of animals are required than for continuous data such as biomarker measurements. For example, 155 animals per dose group are required to have at least an 80% chance of detecting a decrease from a 20% incidence in controls to an incidence of 10% at a low dose. For a continuous measurement, only 14 animals per group are required to have at least an 80% chance of detecting a change of the mean by one standard deviation of the control group. Experimental designs based on three dose groups plus controls are discussed to detect nonmonotonicity or to estimate the zero equivalent dose (ZED), i.e., the dose that produces a response equal to the average response in the controls. Cell proliferation data in the nasal respiratory epithelium of rats exposed to formaldehyde by inhalation are used to illustrate the statistical procedures. Statistically significant departures from a monotonic dose response were obtained for time-weighted average labeling indices with an estimated ZED at a formaldehyde dose of 5.4 ppm, with a lower 95% confidence limit of 2.7 ppm. It is concluded that demonstration of a statistically significant biphasic dose-response curve, together with estimation of the resulting ZED, could serve as a point of departure in establishing a reference dose for low-dose risk assessment.
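The sample sizes quoted in this abstract can be approximated with a textbook normal-approximation formula. The sketch below is not the authors' calculation: it assumes a one-sided test at a 5% significance level and an unpooled variance, neither of which the abstract states, and the function names are illustrative. Under those assumptions the quantal case gives the 155 animals per group quoted above, while the continuous case gives 13 rather than 14, so the abstract's figure may include a small-sample correction that is omitted here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p_control, p_treated, alpha=0.05, power=0.80):
    """Animals per group to detect a change in incidence (one-sided test).

    Normal approximation with unpooled variance; alpha and the one-sided
    direction are assumptions, not values taken from the abstract.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil(z_sum ** 2 * variance / (p_control - p_treated) ** 2)


def n_per_group_means(effect_in_sd, alpha=0.05, power=0.80):
    """Animals per group to detect a mean shift given in control-group SDs."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    return ceil(2 * z_sum ** 2 / effect_in_sd ** 2)


# Quantal endpoint: incidence falls from 20% in controls to 10% at a low dose.
print(n_per_group_proportions(0.20, 0.10))
# Continuous endpoint: mean changes by one control-group standard deviation.
print(n_per_group_means(1.0))
```

The roughly tenfold gap between the two results illustrates the abstract's point that quantal endpoints demand far more animals than continuous ones.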
Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data

PubMed Central

Kim, Sung-Min; Choi, Yosoon

2017-01-01

To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required. PMID:28629168
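The hot spot statistic and the four-group classification described above can be sketched in a few lines. This is an illustration rather than the authors' workflow: it uses the standard Getis-Ord Gi* z-score formula with binary distance-band weights, and the coordinates, values, radius, content cutoff (the sample mean) and z cutoff (1.96) are all assumptions chosen for the example.

```python
from math import dist, sqrt


def getis_ord_gi_star(coords, values, radius):
    """Gi* z-score per sample, with weight 1 for samples within `radius` (self included)."""
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for ci in coords:
        w = [1.0 if dist(ci, cj) <= radius else 0.0 for cj in coords]
        w_sum = sum(w)
        w_sq_sum = sum(x * x for x in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * w_sum
        denominator = s * sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        # Degenerate cases (constant values, or every sample a neighbour) score 0.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def classify(value, z, value_cutoff, z_cutoff=1.96):
    """Two-letter label: content (H/L) followed by z-score (H/L)."""
    return ("H" if value >= value_cutoff else "L") + ("H" if z >= z_cutoff else "L")


# Hypothetical transect of eight samples, one unit apart.
coords = [(float(i), 0.0) for i in range(8)]
values = [10, 10, 10, 1, 1, 1, 10, 1]
z_scores = getis_ord_gi_star(coords, values, radius=1.0)
cutoff = sum(values) / len(values)
labels = [classify(v, z, cutoff) for v, z in zip(values, z_scores)]
for c, v, z, lab in zip(coords, values, z_scores, labels):
    print(c, v, round(z, 2), lab)
```

In this toy transect the isolated high reading at position 6 is labelled HL, the kind of suspect sample for which the abstract recommends an additional survey, whereas the high reading at position 1, surrounded by other high readings, is labelled HH.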
Assessing Statistically Significant Heavy-Metal Concentrations in Abandoned Mine Areas via Hot Spot Analysis of Portable XRF Data.

PubMed

Kim, Sung-Min; Choi, Yosoon

2017-06-18

To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.

Three Tier Unified Process Model for Requirement Negotiations and Stakeholder Collaborations

NASA Astrophysics Data System (ADS)

Niazi, Muhammad Ashraf Khan; Abbas, Muhammad; Shahzad, Muhammad

2012-11-01

This research paper carries out a pragmatic qualitative analysis of various models and approaches of requirements negotiations (a sub-process of the requirements management plan, which is an output of scope management's collect requirements process) and studies stakeholder collaboration methodologies (i.e. from within the communication management knowledge area). The experiential analysis encompasses two tiers: the first tier refers to the weighted scoring model, while the second tier focuses on the development of SWOT matrices on the basis of the findings of the weighted scoring model for selecting an appropriate requirements negotiation model. Finally, the results are simulated with the help of statistical pie charts. On the basis of the simulated results for prevalent models and approaches of negotiations, a unified approach for requirements negotiations and stakeholder collaborations is proposed in which the collaboration methodologies are embedded into the selected requirements negotiation model as internal parameters of the proposed process, alongside some external required parameters such as MBTI, opportunity analysis etc.
Application of Turchin's method of statistical regularization

NASA Astrophysics Data System (ADS)

Zelenyi, Mikhail; Poliakova, Mariia; Nozik, Alexander; Khudyakov, Alexey

2018-04-01

During analysis of experimental data, one usually needs to restore a signal after it has been convolved with some kind of apparatus function. According to Hadamard's definition this problem is ill-posed and requires regularization to provide sensible results. In this article we describe an implementation of Turchin's method of statistical regularization, based on the Bayesian approach to the regularization strategy.
Preliminary results from a method to update timber resource statistics in North Carolina

Treesearch

Glenn P. Catts; Noel D. Cost; Raymond L. Czaplewski; Paul W. Snook

1987-01-01

Forest Inventory and Analysis units of the USDA Forest Service produce timber resource statistics every 8 to 10 years. Midcycle surveys are often performed to update inventory estimates. This requires timely identification of forest lands. There are several kinds of remotely sensed data that are suitable for this purpose. Medium scale color infrared aerial photography...
Detecting temporal change in freshwater fisheries surveys: statistical power and the important linkages between management questions and monitoring objectives

USGS Publications Warehouse

Wagner, Tyler; Irwin, Brian J.; Bence, James R.; Hayes, Daniel B.

2016-01-01

Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
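The link between monitoring duration and statistical power for a linear trend can be illustrated with a simple closed-form calculation. The sketch below is not the authors' analysis: it assumes one observation per year, independent normal errors with a known standard deviation, and a two-sided z-test on the regression slope, and the slope and noise values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def trend_power(slope, sigma, n_years, alpha=0.05):
    """Approximate power of a two-sided z-test for a linear trend.

    Assumes annual sampling, independent normal errors, and known sigma.
    """
    z = NormalDist()
    sxx = n_years * (n_years ** 2 - 1) / 12  # sum of squared deviations of years 1..n
    se = sigma / sqrt(sxx)                   # standard error of the fitted slope
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(slope) / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def years_needed(slope, sigma, target_power=0.80, alpha=0.05, max_years=100):
    """Smallest number of annual surveys reaching the target power, or None."""
    for n in range(3, max_years + 1):
        if trend_power(slope, sigma, n, alpha) >= target_power:
            return n
    return None


# Hypothetical index: 0.02 change per year against residual noise of 0.3.
print(round(trend_power(0.02, 0.3, 10), 3))
print(years_needed(0.02, 0.3))
```

With these illustrative numbers, a decade of annual surveys has roughly a one-in-ten chance of detecting the trend, which is consistent in spirit with the abstract's conclusion that monitoring programs generally have low power for linear trends.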
Which statistics should tropical biologists learn?

PubMed

Loaiza Velásquez, Natalia; González Lutz, María Isabel; Monge-Nájera, Julián

2011-09-01

Tropical biologists study the richest and most endangered biodiversity on the planet, and in these times of climate change and mega-extinctions, the need for efficient, good quality research is more pressing than in the past. However, the statistical component in research published by tropical authors sometimes suffers from poor quality in data collection, mediocre or bad experimental design, and a rigid and outdated view of data analysis. To suggest improvements in their statistical education, we listed all the statistical tests and other quantitative analyses used in two leading tropical journals, the Revista de Biología Tropical and Biotropica, during a year. The 12 most frequent tests in the articles were: Analysis of Variance (ANOVA), Chi-Square Test, Student's T Test, Linear Regression, Pearson's Correlation Coefficient, Mann-Whitney U Test, Kruskal-Wallis Test, Shannon's Diversity Index, Tukey's Test, Cluster Analysis, Spearman's Rank Correlation Test and Principal Component Analysis. We conclude that statistical education for tropical biologists must abandon the old syllabus based on the mathematical side of statistics and concentrate on the correct selection of these and other procedures and tests, on their biological interpretation and on the use of reliable and friendly freeware. We think that their time will be better spent understanding and protecting tropical ecosystems than trying to learn the mathematical foundations of statistics: in most cases, a well designed one-semester course should be enough for their basic requirements.
Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: a primer and applications.

PubMed

Shadish, William R; Hedges, Larry V; Pustejovsky, James E

2014-04-01

This article presents a d-statistic for single-case designs that is in the same metric as the d-statistic used in between-subjects designs such as randomized experiments and offers some reasons why such a statistic would be useful in SCD research. The d has a formal statistical development, is accompanied by appropriate power analyses, and can be estimated using user-friendly SPSS macros. We discuss both advantages and disadvantages of d compared to other approaches such as previous d-statistics, overlap statistics, and multilevel modeling. It requires at least three cases for computation and assumes normally distributed outcomes and stationarity, assumptions that are discussed in some detail. We also show how to test these assumptions. The core of the article then demonstrates in depth how to compute d for one study, including estimation of the autocorrelation and the ratio of between case variance to total variance (between case plus within case variance), how to compute power using a macro, and how to use the d to conduct a meta-analysis of studies using single-case designs in the free program R, including syntax in an appendix. This syntax includes how to read data, compute fixed and random effect average effect sizes, prepare a forest plot and a cumulative meta-analysis, estimate various influence statistics to identify studies contributing to heterogeneity and effect size, and do various kinds of publication bias analyses. This d may prove useful for both the analysis and meta-analysis of data from SCDs. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Exceedance statistics of accelerations resulting from thruster firings on the Apollo-Soyuz mission

NASA Technical Reports Server (NTRS)

Fichtl, G. H.; Holland, R. L.

1983-01-01

Spacecraft acceleration resulting from firings of vernier control system thrusters is an important consideration in the design, planning, execution and post-flight analysis of laboratory experiments in space. In particular, scientists and technologists involved with the development of experiments to be performed in space in many instances require statistical information on the magnitude and rate of occurrence of spacecraft accelerations. Typically, these accelerations are stochastic in nature, so that it is useful to characterize these accelerations in statistical terms. Statistics of spacecraft accelerations are summarized. Previously announced in STAR as N82-12127
Functional Relationships and Regression Analysis.

ERIC Educational Resources Information Center

Preece, Peter F. W.

1978-01-01

Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
General Nature of Multicollinearity in Multiple Regression Analysis.

ERIC Educational Resources Information Center

Liu, Richard

1981-01-01

Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
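One common diagnostic for the multicollinearity problem described here is the variance inflation factor (VIF). The abstract does not say which diagnostics or remedies the article covers, so the sketch below is offered only as an illustration. It handles the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the predictors; the data are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors: x2 nearly duplicates x1, x3 does not.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
x3 = [2, 5, 1, 6, 3, 4]

print(round(vif_two_predictors(x1, x2), 1))  # large: coefficients are poorly determined
print(round(vif_two_predictors(x1, x3), 2))  # near 1: little inflation
```

A frequently quoted rule of thumb treats VIF values above about 10 as a sign of problematic collinearity, although the appropriate threshold depends on the application.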
SimHap GUI: An intuitive graphical user interface for genetic association analysis

PubMed Central

Carter, Kim W; McCaskie, Pamela A; Palmer, Lyle J

2008-01-01

Background: Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that allows anyone but a professional statistician to effectively utilise the tool. Results: We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress. Conclusion: SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis. PMID:19109877
Comparison of requirements and capabilities of major multipurpose software packages.

PubMed

Igo, Robert P; Schnell, Audrey H

2012-01-01

The aim of this chapter is to introduce the reader to commonly used software packages and illustrate their input requirements, analysis options, strengths, and limitations. We focus on packages that perform more than one function and include a program for quality control, linkage, and association analyses. Additional inclusion criteria were that programs be (1) free to academic users and (2) currently supported, maintained, and developed. Using those criteria, we chose to review three programs: Statistical Analysis for Genetic Epidemiology (S.A.G.E.), PLINK, and Merlin. We will describe the required input format and analysis options. We will not go into detail about every possible program in the packages, but we will give an overview of the packages' requirements and capabilities.
[Utilization of feed energy by growing pigs. 3. Energy requirement for the growth and fattening of pigs].

PubMed

Hoffmann, L; Schiemann, R; Jentsch, W

1979-02-01

The test series on the energy consumption of growing pigs of the Large White and improved Landrace breeds, and of crosses of the two, comprising a total of 369 metabolism periods (described in the first two communications of this series: Hoffmann and others, 1977, and Jentsch and Hoffmann, 1977), were statistically analysed to derive the energy requirement for maintenance and the partial energy requirement for growth, in order to test whether factorial analysis can be used to derive energy requirement values for growing pigs. The dependence of the maintenance requirement of growing pigs on live weight can best be characterised by applying a power exponent of 0.61 or 0.62 to the live weight (investigations in the live weight range of 10 to 40 kg were made with boars, see the 1st communication; those in the live weight range of 30 to 120 kg were made with castrated boars, see the 2nd communication). A definition of the energetic maintenance requirement of productive livestock and laboratory animals as a conventional value is offered for discussion. The energy requirement values derived from the doubly factorial statistical analysis agree satisfactorily with the measured energy intake and the observed growth performance of the test animals. It is concluded that factorial analysis of the energy requirement (maintenance plus partial performances) gives a better estimate of the requirement of growing animals than an assessment based only on live weight and live weight gain, without characterising the energy requirement for partial performances. This is important for the further development and more exact definition of requirement norms.
New software for statistical analysis of Cambridge Structural Database data

PubMed Central

Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.

2011-01-01

A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784
Precipitate statistics in an Al-Mg-Si-Cu alloy from scanning precession electron diffraction data

NASA Astrophysics Data System (ADS)

Sunde, J. K.; Paulsen, Ø.; Wenner, S.; Holmestad, R.

2017-09-01

The key microstructural feature providing strength to age-hardenable Al alloys is nanoscale precipitates. Alloy development requires a reliable statistical assessment of these precipitates, in order to link the microstructure with material properties. Here, it is demonstrated that scanning precession electron diffraction combined with computational analysis enables the semi-automated extraction of precipitate statistics in an Al-Mg-Si-Cu alloy. Among the main findings is the precipitate number density, which agrees well with a conventional method based on manual counting and measurements. By virtue of its data analysis objectivity, our methodology is therefore seen as an advantageous alternative to existing routines, offering reproducibility and efficiency in alloy statistics. Additional results include improved qualitative information on phase distributions. The developed procedure is generic and applicable to any material containing nanoscale precipitates.
[Development of Hospital Equipment Maintenance Information System].

PubMed

Zhou, Zhixin

2015-11-01

A hospital equipment maintenance information system plays an important role in improving medical treatment quality and efficiency. Based on a requirement analysis of hospital equipment maintenance, the system function diagram is drawn. From an analysis of the input and output data, tables and reports connected with the equipment maintenance process, the relationships between entities and attributes are identified, the E-R diagram is drawn and the relational database tables are established. Software development should meet the actual process requirements of maintenance and provide a friendly user interface and flexible operation. The software can analyze failure causes by statistical analysis.
Long-term Results of an Analytical Assessment of Student Compounded Preparations

PubMed Central

Roark, Angie M.; Anksorus, Heidi N.

2014-01-01

Objective. To investigate the long-term (ie, 6-year) impact of a required remake vs an optional remake on student performance in a compounding laboratory course in which students’ compounded preparations were analyzed. Methods. The analysis data for several preparations made by students were compared for differences in the analyzed content of the active pharmaceutical ingredient (API) and the number of students who successfully compounded the preparation on the first attempt. Results. There was a consistent statistical difference in the API amount or concentration in 4 of the preparations (diphenhydramine, ketoprofen, metoprolol, and progesterone) in each optional remake year compared to the required remake year. As the analysis requirement was continued, the outcome for each preparation approached and/or attained the expected API result. Two preparations required more than 1 year to demonstrate a statistical difference. Conclusion. The analytical assessment resulted in a consistent, long-term improvement in student performance during the 5-year period after the optional remake policy was instituted. Our assumption is that investment in such an assessment would result in similar benefits at other colleges and schools of pharmacy. PMID:26056402
Long-term Results of an Analytical Assessment of Student Compounded Preparations.

PubMed

Roark, Angie M; Anksorus, Heidi N; Shrewsbury, Robert P

2014-11-15

To investigate the long-term (ie, 6-year) impact of a required remake vs an optional remake on student performance in a compounding laboratory course in which students' compounded preparations were analyzed. The analysis data for several preparations made by students were compared for differences in the analyzed content of the active pharmaceutical ingredient (API) and the number of students who successfully compounded the preparation on the first attempt. There was a consistent statistical difference in the API amount or concentration in 4 of the preparations (diphenhydramine, ketoprofen, metoprolol, and progesterone) in each optional remake year compared to the required remake year. As the analysis requirement was continued, the outcome for each preparation approached and/or attained the expected API result. Two preparations required more than 1 year to demonstrate a statistical difference. The analytical assessment resulted in a consistent, long-term improvement in student performance during the 5-year period after the optional remake policy was instituted. Our assumption is that investment in such an assessment would result in similar benefits at other colleges and schools of pharmacy.
Uncertainty Analysis of Inertial Model Attitude Sensor Calibration and Application with a Recommended New Calibration Method

NASA Technical Reports Server (NTRS)

Tripp, John S.; Tcheng, Ping

1999-01-01

Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.
DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts.

PubMed

Lee, Donghyung; Bigdeli, T Bernard; Williamson, Vernell S; Vladimirov, Vladimir I; Riley, Brien P; Fanous, Ayman H; Bacanu, Silviu-Alin

2015-10-01

To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever increasing size and ethnic diversity of both reference panels and cohorts makes genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which, unlike summary statistics provided by virtually all studies, is not publicly available. While there are much less demanding methods which avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of computational resources. DISTMIX software, its reference population data, and usage examples are publicly available at http://code.google.com/p/distmix. Contact: dlee4@vcu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Interpretation of correlations in clinical research.

PubMed

Hung, Man; Bounsanga, Jerry; Voss, Maren Wright

2017-11-01

Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
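Illustrative note (not part of the cited abstract): the alpha inflation mentioned above can be made concrete with a short calculation. The sketch below assumes the tests are independent, which real clinical comparisons often are not, and the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # Show how quickly the familywise rate grows with the number of tests.
    for m in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, m)
        print(f"{m:>2} tests: familywise rate {rate:.3f}, "
              f"Bonferroni threshold {bonferroni_alpha(0.05, m):.4f}")
```

Under these assumptions, ten uncorrected tests at the 0.05 level already carry roughly a 40% chance of at least one spurious finding.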

Statistical analysis of fNIRS data: a comprehensive review.

PubMed

Tak, Sungho; Ye, Jong Chul

2014-01-15

Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
Text grouping in patent analysis using adaptive K-means clustering algorithm

NASA Astrophysics Data System (ADS)

Shanie, Tiara; Suprijadi, Jadi; Zulhanif

2017-03-01

Patents are a form of intellectual property. Analyzing patents is one requirement for understanding the development of technology in each country and in the world. This study uses patent documents about green tea retrieved from the Espacenet server. Patent documents related to tea technology are still widely scattered, which makes information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into specific groups according to the related terms they contain. This study applies statistical text mining to the titles of green tea patents in two phases: data preparation and data analysis. The data preparation phase uses text mining methods, and the data analysis phase uses statistics, specifically cluster analysis with the Adaptive K-Means Clustering Algorithm. Based on the maximum Silhouette value, the analysis generated 87 clusters associated with fifteen terms, which can be used to support information retrieval.
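Illustrative note (not part of the cited abstract): choosing a clustering by maximum silhouette value can be sketched as follows. This is not the adaptive K-means algorithm of the study; it only scores candidate partitions that are assumed to be given, and it uses 1-D numbers as a stand-in for document vectors. All names are hypothetical.

```python
from collections import defaultdict


def mean_silhouette(points, labels):
    """Mean silhouette width for 1-D points; singleton clusters score 0."""
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_distance(i, members):
        others = [j for j in members if j != i]
        return sum(abs(points[i] - points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # convention: s(i) = 0 for a singleton cluster
        a = mean_distance(i, own)  # cohesion within the point's own cluster
        b = min(mean_distance(i, members)  # separation from nearest other cluster
                for other, members in clusters.items() if other != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_partition(points, candidates):
    """Return the name of the candidate labelling with the largest silhouette."""
    return max(candidates, key=lambda name: mean_silhouette(points, candidates[name]))


if __name__ == "__main__":
    toy_points = [0.0, 1.0, 10.0, 11.0]
    toy_candidates = {"balanced": [0, 0, 1, 1], "lopsided": [0, 1, 1, 1]}
    print(best_partition(toy_points, toy_candidates))
```

In practice the candidates would come from running the clustering algorithm for each cluster count under consideration.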
Improved processes for meeting the data requirements for implementing the Highway Safety Manual (HSM) and Safety Analyst in Florida.

DOT National Transportation Integrated Search

2014-03-01

Recent research in highway safety has focused on the more advanced and statistically proven techniques of highway safety analysis. This project focuses on the two most recent safety analysis tools, the Highway Safety Manual (HSM) and SafetyAnalys...
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics (which we discussed in [1]) where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
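Illustrative note (not part of the cited abstract): the map-reduce idea for contingency tables can be sketched in a few lines. This is a single-process toy, not the authors' open source implementation; the function names are hypothetical. Each chunk is counted separately and the partial tables are merged by adding cell counts, so the amount of data exchanged grows with the number of distinct cells.

```python
from collections import Counter


def partial_table(pairs):
    """Map step: count (x, y) co-occurrences within one chunk of data."""
    return Counter(pairs)


def merge_tables(tables):
    """Reduce step: contingency tables combine by adding cell counts."""
    merged = Counter()
    for table in tables:
        merged.update(table)  # Counter.update adds counts rather than replacing
    return merged


def chi_square(table):
    """Pearson chi-square independence statistic from a merged table."""
    n = sum(table.values())
    row, col = Counter(), Counter()
    for (x, y), count in table.items():
        row[x] += count
        col[y] += count
    stat = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            stat += (table.get((x, y), 0) - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    chunk_a = [("a", "u")] * 10 + [("a", "v")] * 20
    chunk_b = [("b", "u")] * 20 + [("b", "v")] * 10
    merged = merge_tables([partial_table(chunk_a), partial_table(chunk_b)])
    print(len(merged), "cells, chi-square =", round(chi_square(merged), 3))
```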
Understanding spatial organizations of chromosomes via statistical analysis of Hi-C data

PubMed Central

Hu, Ming; Deng, Ke; Qin, Zhaohui; Liu, Jun S.

2015-01-01

Understanding how chromosomes fold provides insights into the transcription regulation, hence, the functional state of the cell. Using the next generation sequencing technology, the recently developed Hi-C approach enables a global view of spatial chromatin organization in the nucleus, which substantially expands our knowledge about genome organization and function. However, due to multiple layers of biases, noises and uncertainties buried in the protocol of Hi-C experiments, analyzing and interpreting Hi-C data poses great challenges, and requires novel statistical methods to be developed. This article provides an overview of recent Hi-C studies and their impacts on biomedical research, describes major challenges in statistical analysis of Hi-C data, and discusses some perspectives for future research. PMID:26124977
Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves.

PubMed

Guyot, Patricia; Ades, A E; Ouwens, Mario J N M; Welton, Nicky J

2012-02-01

The results of Randomized Controlled Trials (RCTs) on time-to-event outcomes that are usually reported are median time to events and Cox Hazard Ratio. These do not constitute the sufficient statistics required for meta-analysis or cost-effectiveness analysis, and their use in secondary analyses requires strong assumptions that may not have been adequately tested. In order to enhance the quality of secondary data analyses, we propose a method which derives from the published Kaplan-Meier survival curves a close approximation to the original individual patient time-to-event data from which they were generated. We develop an algorithm that maps from digitised curves back to KM data by finding numerical solutions to the inverted KM equations, using where available information on number of events and numbers at risk. The reproducibility and accuracy of survival probabilities, median survival times and hazard ratios based on reconstructed KM data was assessed by comparing published statistics (survival probabilities, medians and hazard ratios) with statistics based on repeated reconstructions by multiple observers. The validation exercise established there was no material systematic error and that there was a high degree of reproducibility for all statistics. Accuracy was excellent for survival probabilities and medians; for hazard ratios, reasonable accuracy can only be obtained if at least numbers at risk or total number of events are reported. The algorithm is a reliable tool for meta-analysis and cost-effectiveness analyses of RCTs reporting time-to-event data. It is recommended that all RCTs should report information on numbers at risk and total number of events alongside KM curves.
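Illustrative note (not part of the cited abstract): the idea of inverting the Kaplan-Meier equations can be sketched under a strong simplifying assumption, namely that the number at risk is known at every event time. The published algorithm works from digitised curve coordinates and must also estimate censoring, which this toy does not attempt. Function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1=event, 0=censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def events_from_curve(survival_values, numbers_at_risk):
    """Invert S_k = S_{k-1} * (1 - d_k / n_k) to recover event counts d_k."""
    recovered = []
    previous = 1.0
    for s_k, n_k in zip(survival_values, numbers_at_risk):
        recovered.append(round(n_k * (1.0 - s_k / previous)))
        previous = s_k
    return recovered


if __name__ == "__main__":
    curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
    print(curve)
    print(events_from_curve([s for _, s in curve], [5, 4, 2]))
```

The inversion shows why reported numbers at risk matter: without them, the event counts behind each step of the curve are not identified.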
Consequences of common data analysis inaccuracies in CNS trauma injury basic research.

PubMed

Burke, Darlene A; Whittemore, Scott R; Magnuson, David S K

2013-05-15

The development of successful treatments for humans after traumatic brain or spinal cord injuries (TBI and SCI, respectively) requires animal research. This effort can be hampered when promising experimental results cannot be replicated because of incorrect data analysis procedures. To identify and hopefully avoid these errors in future studies, the articles in seven journals with the highest number of basic science central nervous system TBI and SCI animal research studies published in 2010 (N=125 articles) were reviewed for their data analysis procedures. After identifying the most common statistical errors, the implications of those findings were demonstrated by reanalyzing previously published data from our laboratories using the identified inappropriate statistical procedures, then comparing the two sets of results. Overall, 70% of the articles contained at least one type of inappropriate statistical procedure. The highest percentage involved incorrect post hoc t-tests (56.4%), followed by inappropriate parametric statistics (analysis of variance and t-test; 37.6%). Repeated Measures analysis was inappropriately missing in 52.0% of all articles and, among those with behavioral assessments, 58% were analyzed incorrectly. Reanalysis of our published data using the most common inappropriate statistical procedures resulted in a 14.1% average increase in significant effects compared to the original results. Specifically, an increase of 15.5% occurred with Independent t-tests and 11.1% after incorrect post hoc t-tests. Utilizing proper statistical procedures can allow more-definitive conclusions, facilitate replicability of research results, and enable more accurate translation of those results to the clinic.
Statistical process control methods allow the analysis and improvement of anesthesia care.

PubMed

Fasting, Sigurd; Gisvold, Sven E

2003-10-01

Quality aspects of the anesthetic process are reflected in the rate of intraoperative adverse events. The purpose of this report is to illustrate how the quality of the anesthesia process can be analyzed using statistical process control methods, and exemplify how this analysis can be used for quality improvement. We prospectively recorded anesthesia-related data from all anesthetics for five years. The data included intraoperative adverse events, which were graded into four levels, according to severity. We selected four adverse events, representing important quality and safety aspects, for statistical process control analysis. These were: inadequate regional anesthesia, difficult emergence from general anesthesia, intubation difficulties and drug errors. We analyzed the underlying process using 'p-charts' for statistical process control. In 65,170 anesthetics we recorded adverse events in 18.3%; mostly of lesser severity. Control charts were used to define statistically the predictable normal variation in problem rate, and then used as a basis for analysis of the selected problems with the following results: Inadequate plexus anesthesia: stable process, but unacceptably high failure rate; Difficult emergence: unstable process, because of quality improvement efforts; Intubation difficulties: stable process, rate acceptable; Medication errors: methodology not suited because of low rate of errors. By applying statistical process control methods to the analysis of adverse events, we have exemplified how this allows us to determine if a process is stable, whether an intervention is required, and if quality improvement efforts have the desired effect.
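Illustrative note (not part of the cited abstract): a p-chart compares each period's event proportion with limits placed three standard errors around the pooled proportion. The sketch below uses the standard binomial limits; the counts in the usage example are hypothetical and are not the study's data.

```python
import math


def p_chart(event_counts, sample_sizes, sigma=3.0):
    """Return the centre line and per-period (proportion, lcl, ucl, in_control)."""
    p_bar = sum(event_counts) / sum(sample_sizes)  # pooled proportion (centre line)
    rows = []
    for count, n in zip(event_counts, sample_sizes):
        half_width = sigma * math.sqrt(p_bar * (1.0 - p_bar) / n)
        lcl = max(0.0, p_bar - half_width)  # a proportion cannot fall below 0
        ucl = min(1.0, p_bar + half_width)  # or exceed 1
        p = count / n
        rows.append((p, lcl, ucl, lcl <= p <= ucl))
    return p_bar, rows


if __name__ == "__main__":
    # Hypothetical monthly adverse-event counts out of 100 anesthetics each.
    centre, rows = p_chart([18, 20, 17, 19, 40], [100] * 5)
    print(f"centre line: {centre:.3f}")
    for month, (p, lcl, ucl, ok) in enumerate(rows, start=1):
        flag = "in control" if ok else "OUT OF CONTROL"
        print(f"month {month}: p={p:.2f} limits=({lcl:.3f}, {ucl:.3f}) {flag}")
```

A point outside the limits suggests special-cause variation worth investigating; a stable process with an unacceptable centre line calls for changing the process itself.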
Temporal scaling and spatial statistical analyses of groundwater level fluctuations

NASA Astrophysics Data System (ADS)

Sun, H.; Yuan, L., Sr.; Zhang, Y.

2017-12-01

Natural dynamics such as groundwater level fluctuations can exhibit multifractionality and/or multifractality, likely due to multi-scale aquifer heterogeneity and controlling factors, whose statistics require efficient quantification methods. This study explores multifractionality and non-Gaussian properties in groundwater dynamics expressed by time series of daily level fluctuation at three wells located in the lower Mississippi valley, after removing the seasonal cycle in the temporal scaling and spatial statistical analysis. First, using the time-scale multifractional analysis, a systematic statistical method is developed to analyze groundwater level fluctuations quantified by the time-scale local Hurst exponent (TS-LHE). Results show that the TS-LHE does not remain constant, implying the fractal-scaling behavior changing with time and location. Hence, we can distinguish the potentially location-dependent scaling feature, which may characterize the hydrology dynamic system. Second, spatial statistical analysis shows that the increment of groundwater level fluctuations exhibits a heavy tailed, non-Gaussian distribution, which can be better quantified by a Lévy stable distribution. Monte Carlo simulations of the fluctuation process also show that the linear fractional stable motion model can well depict the transient dynamics (i.e., fractal non-Gaussian property) of groundwater level, while fractional Brownian motion is inadequate to describe natural processes with anomalous dynamics. Analysis of temporal scaling and spatial statistics therefore may provide useful information and quantification to understand further the nature of complex dynamics in hydrology.
NASA DOE POD NDE Capabilities Data Book

NASA Technical Reports Server (NTRS)

Generazio, Edward R.

2015-01-01

This data book contains the Directed Design of Experiments for Validating Probability of Detection (POD) Capability of NDE Systems (DOEPOD) analyses of the nondestructive inspection data presented in the NTIAC, Nondestructive Evaluation (NDE) Capabilities Data Book, 3rd ed., NTIAC DB-97-02. DOEPOD is designed as a decision support system to validate inspection system, personnel, and protocol demonstrating 0.90 POD with 95% confidence at critical flaw sizes, a90/95. The test methodology used in DOEPOD is based on the field of statistical sequential analysis founded by Abraham Wald. Sequential analysis is a method of statistical inference whose characteristic feature is that the number of observations required by the procedure is not determined in advance of the experiment. The decision to terminate the experiment depends, at each stage, on the results of the observations previously made. A merit of the sequential method, as applied to testing statistical hypotheses, is that test procedures can be constructed which require, on average, a substantially smaller number of observations than equally reliable test procedures based on a predetermined number of observations.
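Illustrative note (not part of the cited abstract): the stopping rule that characterizes Wald's sequential analysis can be sketched with a sequential probability ratio test on hit/miss outcomes. This is not the DOEPOD procedure; the hypotheses, error rates, and function name are assumptions chosen for illustration, and the thresholds use Wald's usual approximations.

```python
import math


def sprt_bernoulli(outcomes, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 versus H1: p = p1 on 0/1 outcomes.

    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1.0 - beta) / alpha)  # crossing above: accept H1
    lower = math.log(beta / (1.0 - alpha))  # crossing below: accept H0
    llr = 0.0  # running log-likelihood ratio
    for n, hit in enumerate(outcomes, start=1):
        if hit:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
    return "continue", len(outcomes)  # evidence so far is inconclusive


if __name__ == "__main__":
    print(sprt_bernoulli([1] * 10, p0=0.5, p1=0.9))
    print(sprt_bernoulli([0, 0, 1, 1], p0=0.5, p1=0.9))
```

The number of observations is not fixed in advance: the test stops as soon as the accumulated evidence crosses either threshold.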
Support Provided to the External Tank (ET) Project on the Use of Statistical Analysis for ET Certification Consultation Position Paper

NASA Technical Reports Server (NTRS)

Null, Cynthia H.

2009-01-01

In June 2004, the June Space Flight Leadership Council (SFLC) assigned an action to the NASA Engineering and Safety Center (NESC) and External Tank (ET) project jointly to characterize the available dataset [of defect sizes from dissections of foam], identify resultant limitations to statistical treatment of ET as-built foam as part of the overall thermal protection system (TPS) certification, and report to the Program Requirements Change Board (PRCB) and SFLC in September 2004. The NESC statistics team was formed to assist the ET statistics group in August 2004. The NESC's conclusions are presented in this report.
Tables of square-law signal detection statistics for Hann spectra with 50 percent overlap

NASA Technical Reports Server (NTRS)

Deans, Stanley R.; Cullers, D. Kent

1991-01-01

The Search for Extraterrestrial Intelligence, currently being planned by NASA, will require that an enormous amount of data be analyzed in real time by special purpose hardware. It is expected that overlapped Hann data windows will play an important role in this analysis. In order to understand the statistical implication of this approach, it has been necessary to compute detection statistics for overlapped Hann spectra. Tables of signal detection statistics are given for false alarm rates from 10^-14 to 10^-1 and signal detection probabilities from 0.50 to 0.99; the number of computed spectra ranges from 4 to 2000.
Challenges of Big Data Analysis.

PubMed

Fan, Jianqing; Han, Fang; Liu, Han

2014-06-01

Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require a new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
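Illustrative note (not part of the cited abstract): spurious correlation in high dimensions can be demonstrated with a small simulation. A response is compared against a growing pool of pure-noise predictors, and the largest absolute sample correlation is tracked; because the pools are nested, that maximum can only grow. Sample size, dimensions, and names are hypothetical choices.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_samples, dims, seed=0):
    """Largest |correlation| between a response and the first d noise predictors."""
    rng = random.Random(seed)
    response = [rng.gauss(0, 1) for _ in range(n_samples)]
    running_max, generated, results = 0.0, 0, {}
    for d in sorted(dims):
        while generated < d:  # extend the nested pool of independent predictors
            noise = [rng.gauss(0, 1) for _ in range(n_samples)]
            running_max = max(running_max, abs(pearson(response, noise)))
            generated += 1
        results[d] = running_max
    return results


if __name__ == "__main__":
    for d, r in max_spurious_correlation(30, [1, 10, 100, 1000]).items():
        print(f"{d:>5} noise predictors: max |r| = {r:.3f}")
```

None of the predictors is related to the response, so any large correlation printed here is an artifact of searching over many variables with few samples.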
Challenges of Big Data Analysis

PubMed Central

Fan, Jianqing; Han, Fang; Liu, Han

2014-01-01

Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require a new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469
Legal Environment v. Business Law Courses: A Distinction without a Difference?

ERIC Educational Resources Information Center

Miller, Carol J.; Crain, Susan J.

2011-01-01

The purpose of this article is to provide a content analysis and statistics on the law-related core course requirements in colleges of business to assist professors and administrators in making curriculum decisions. It examines the name of "undergraduate" law-based course requirements in the business core in 404 universities accredited by the…
UNITY: Confronting Supernova Cosmology's Statistical and Systematic Uncertainties in a Unified Bayesian Framework

NASA Astrophysics Data System (ADS)

Rubin, D.; Aldering, G.; Barbary, K.; Boone, K.; Chappell, G.; Currie, M.; Deustua, S.; Fagrelius, P.; Fruchter, A.; Hayden, B.; Lidman, C.; Nordin, J.; Perlmutter, S.; Saunders, C.; Sofiatti, C.; Supernova Cosmology Project, The

2015-11-01

While recent supernova (SN) cosmology research has benefited from improved measurements, current analysis approaches are not statistically optimal and will prove insufficient for future surveys. This paper discusses the limitations of current SN cosmological analyses in treating outliers, selection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to confront these effects. We apply the framework to real SN observations and demonstrate smaller statistical and systematic uncertainties. We verify earlier results that SNe Ia require nonlinear shape and color standardizations, but we now include these nonlinear relations in a statistically well-justified way. This analysis was primarily performed blinded, in that the basic framework was first validated on simulated data before transitioning to real data. We also discuss possible extensions of the method.
P-MartCancer-Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets.

PubMed

Webb-Robertson, Bobbie-Jo M; Bramer, Lisa M; Jensen, Jeffrey L; Kobold, Markus A; Stratton, Kelly G; White, Amanda M; Rodland, Karin D

2017-11-01

P-MartCancer is an interactive web-based software environment that enables statistical analyses of peptide or protein data, quantitated from mass spectrometry-based global proteomics experiments, without requiring in-depth knowledge of statistical programming. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification, and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access and the capability to analyze multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium at the peptide, gene, and protein levels. P-MartCancer is deployed as a web service (https://pmart.labworks.org/cptac.html), alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/). Cancer Res; 77(21); e47-50. ©2017 American Association for Cancer Research (AACR).
Experimental design of an interlaboratory study for trace metal analysis of liquid fluids. [for aerospace vehicles]

NASA Technical Reports Server (NTRS)

Greenbauer-Seng, L. A.

1983-01-01

The accurate determination of trace metals and fuels is an important requirement in much of the research into and development of alternative fuels for aerospace applications. Recognizing the detrimental effects of certain metals on fuel performance and fuel systems at the part per million and in some cases part per billion levels requires improved accuracy in determining these low concentration elements. Accurate analyses are also required to ensure interchangeability of analysis results between vendor, researcher, and end use for purposes of quality control. Previous interlaboratory studies have demonstrated the inability of different laboratories to agree on the results of metal analysis, particularly at low concentration levels, yet typically good precisions are reported within a laboratory. An interlaboratory study was designed to gain statistical information about the sources of variation in the reported concentrations. Five participant laboratories were used on a fee basis and were not informed of the purpose of the analyses. The effects of laboratory, analytical technique, concentration level, and ashing additive were studied in four fuel types for 20 elements of interest. The prescribed sample preparation schemes (variations of dry ashing) were used by all of the laboratories. The analytical data were statistically evaluated using a computer program for the analysis of variance technique.
Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments

USGS Publications Warehouse

Dorazio, Robert; Hunter, Margaret

2015-01-01

Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
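Illustrative note (not part of the cited abstract): the link between a binomial model with a complementary log-log link and a volume offset can be sketched for the simplest case of one sample with no covariates. The sketch assumes target molecules are Poisson-distributed across partitions, so that the probability of a positive partition is 1 - exp(-λv) and cloglog(p) = log(λ) + log(v). The partition volume and counts in the usage example are hypothetical, as are the function names.

```python
import math


def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def concentration_estimate(positive, total, partition_volume):
    """Maximum-likelihood concentration (copies per unit volume) for one sample."""
    if not 0 < positive < total:
        raise ValueError("estimate is undefined when no or all partitions are positive")
    p_hat = positive / total  # observed fraction of positive partitions
    return -math.log(1.0 - p_hat) / partition_volume


if __name__ == "__main__":
    volume_ul = 0.00085  # hypothetical partition volume in microlitres
    lam = concentration_estimate(10000, 20000, volume_ul)
    print(f"estimated concentration: {lam:.1f} copies per microlitre")
    # On the link scale the estimate splits into log concentration plus offset.
    print(cloglog(0.5), math.log(lam) + math.log(volume_ul))
```

With covariates, the same structure would be fitted as a generalized linear model in which log(v) enters as a fixed offset and the remaining terms describe log concentration.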
Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments.

PubMed

Dorazio, Robert M; Hunter, Margaret E

2015-11-03

Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

The space of ultrametric phylogenetic trees.

PubMed

Gavryushkin, Alex; Drummond, Alexei J

2016-08-21

The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
[Notes on vital statistics for the study of perinatal health].

PubMed

Juárez, Sol Pía

2014-01-01

Vital statistics, published by the National Statistics Institute in Spain, are a highly important source for the study of perinatal health nationwide. However, the process of data collection is not well-known and has implications both for the quality and interpretation of the epidemiological results derived from this source. The aim of this study was to present how the information is collected and some of the associated problems. This study is the result of an analysis of the methodological notes from the National Statistics Institute and first-hand information obtained from hospitals, the Central Civil Registry of Madrid, and the Madrid Institute for Statistics. Greater integration between these institutions is required to improve the quality of birth and stillbirth statistics. Copyright © 2014 SESPAS. Published by Elsevier Espana. All rights reserved.
From fields to objects: A review of geographic boundary analysis

NASA Astrophysics Data System (ADS)

Jacquez, G. M.; Maruca, S.; Fortin, M.-J.

Geographic boundary analysis is a relatively new approach unfamiliar to many spatial analysts. It is best viewed as a technique for defining objects - geographic boundaries - on spatial fields, and for evaluating the statistical significance of characteristics of those boundary objects. This is accomplished using null spatial models representative of the spatial processes expected in the absence of boundary-generating phenomena. Close ties to the object-field dialectic eminently suit boundary analysis to GIS data. The majority of existing spatial methods are field-based in that they describe, estimate, or predict how attributes (variables defining the field) vary through geographic space. Such methods are appropriate for field representations but not object representations. As the object-field paradigm gains currency in geographic information science, appropriate techniques for the statistical analysis of objects are required. The methods reviewed in this paper are a promising foundation. Geographic boundary analysis is clearly a valuable addition to the spatial statistical toolbox. This paper presents the philosophy of, and motivations for, geographic boundary analysis. It defines commonly used statistics for quantifying boundaries and their characteristics, as well as simulation procedures for evaluating their significance. We review applications of these techniques, with the objective of making this promising approach accessible to the GIS-spatial analysis community. We also describe the implementation of these methods within geographic boundary analysis software: GEM.
Propensity score to detect baseline imbalance in cluster randomized trials: the role of the c-statistic.

PubMed

Leyrat, Clémence; Caille, Agnès; Foucher, Yohann; Giraudeau, Bruno

2016-01-22

Despite randomization, baseline imbalance and confounding bias may occur in cluster randomized trials (CRTs). Covariate imbalance may jeopardize the validity of statistical inferences if it occurs on prognostic factors. Thus, the diagnosis of such an imbalance is essential to adjust the statistical analysis if required. We developed a tool based on the c-statistic of the propensity score (PS) model to detect global baseline covariate imbalance in CRTs and assess the risk of confounding bias. We performed a simulation study to assess the performance of the proposed tool and applied this method to analyze the data from 2 published CRTs. The proposed method had good performance for large sample sizes (n = 500 per arm) and when the number of unbalanced covariates was not too small as compared with the total number of baseline covariates (≥40% of unbalanced covariates). We also provide a strategy for preselection of the covariates that need to be included in the PS model to enhance imbalance detection. The proposed tool could be useful in deciding whether covariate adjustment is required before performing statistical analyses of CRTs.
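As a rough illustration of the quantity this tool relies on, the c-statistic of a propensity score model is the probability that a randomly chosen participant from one arm has a higher estimated score than a randomly chosen participant from the other arm. The sketch below assumes the propensity scores have already been estimated; the function name is hypothetical, and the paper's decision thresholds are not reproduced.

```python
def c_statistic(scores_arm_a, scores_arm_b):
    """Concordance probability between two arms; tied scores count one half."""
    concordant = 0.0
    for a in scores_arm_a:
        for b in scores_arm_b:
            if a > b:
                concordant += 1.0
            elif a == b:
                concordant += 0.5
    return concordant / (len(scores_arm_a) * len(scores_arm_b))

# Values near 0.5 suggest the covariates do not separate the arms;
# values well above 0.5 point to baseline imbalance.
balanced = c_statistic([0.5, 0.5], [0.5, 0.5])
separated = c_statistic([0.6, 0.7], [0.4, 0.5])
```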
A PERT/CPM of the Computer Assisted Completion of The Ministry September Report. Research Report.

ERIC Educational Resources Information Center

Feeney, J. D.

Using two statistical analysis techniques (the Program Evaluation and Review Technique and the Critical Path Method), this study analyzed procedures for compiling the required yearly report of the Metropolitan Separate School Board (Catholic) of Toronto, Canada. The computer-assisted analysis organized the process of completing the report more…
Quality Assessments of Long-Term Quantitative Proteomic Analysis of Breast Cancer Xenograft Tissues

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Jian-Ying; Chen, Lijun; Zhang, Bai

The identification of protein biomarkers requires large-scale analysis of human specimens to achieve statistical significance. In this study, we evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification) based quantitative proteomics strategy using one channel for universal normalization across all samples. A total of 307 liquid chromatography tandem mass spectrometric (LC-MS/MS) analyses were completed, generating 107 one-dimensional (1D) LC-MS/MS datasets and 8 offline two-dimensional (2D) LC-MS/MS datasets (25 fractions for each set) for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we developed a quantification confidence score based on the quality of each peptide-spectrum match (PSM) to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC-MS/MS datasets collected over a 16-month period.
Statistical analysis of experimental data for mathematical modeling of physical processes in the atmosphere

NASA Astrophysics Data System (ADS)

Karpushin, P. A.; Popov, Yu B.; Popova, A. I.; Popova, K. Yu; Krasnenko, N. P.; Lavrinenko, A. V.

2017-11-01

In this paper, the probabilities of faultless operation of aerologic stations are analyzed, the hypothesis of normality of the empirical data required for using the Kalman filter algorithms is tested, and the spatial correlation functions of distributions of meteorological parameters are determined. The results of a statistical analysis of two-term (0, 12 GMT) radiosonde observations of the temperature and wind velocity components at some preset altitude ranges in the troposphere in 2001-2016 are presented. These data can be used in mathematical modeling of physical processes in the atmosphere.
Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization

NASA Technical Reports Server (NTRS)

Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred

2014-01-01

In order to provide a complete description of a material's thermoelectric power factor, in addition to the measured nominal value, an uncertainty interval is required. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. The work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters; in addition to including uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.
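To show how an uncertainty interval on the power factor can follow from the two measured quantities, here is a minimal first-order propagation sketch. It assumes the power factor is S²/ρ (Seebeck coefficient squared over electrical resistivity) and that the two standard uncertainties are uncorrelated. It does not reproduce the systematic error sources analyzed in the report, and the numbers are hypothetical.

```python
import math

def power_factor_with_uncertainty(seebeck, u_seebeck, resistivity, u_resistivity):
    """Return (power factor, standard uncertainty) for PF = S**2 / rho."""
    power_factor = seebeck ** 2 / resistivity
    # S enters squared, so its relative uncertainty is doubled
    relative = math.sqrt((2.0 * u_seebeck / seebeck) ** 2
                         + (u_resistivity / resistivity) ** 2)
    return power_factor, power_factor * relative

# Hypothetical measurement: S = 200 uV/K (5 %), rho = 1e-5 ohm m (3 %)
pf, u_pf = power_factor_with_uncertainty(200e-6, 10e-6, 1e-5, 3e-7)
```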
Statistical methods for astronomical data with upper limits. I - Univariate distributions

NASA Technical Reports Server (NTRS)

Feigelson, E. D.; Nelson, P. I.

1985-01-01

The statistical treatment of univariate censored data is discussed. A heuristic derivation of the Kaplan-Meier maximum-likelihood estimator from first principles is presented which results in an expression amenable to analytic error analysis. Methods for comparing two or more censored samples are given along with simple computational examples, stressing the fact that most astronomical problems involve upper limits while the standard mathematical methods require lower limits. The application of univariate survival analysis to six data sets in the recent astrophysical literature is described, and various aspects of the use of survival analysis in astronomy, such as the limitations of various two-sample tests and the role of parametric modelling, are discussed.
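A compact sketch of the Kaplan-Meier product-limit estimator may help make the approach concrete. It is written for the standard case in which censored points are lower limits; data with upper limits can be brought into that form by subtracting every value from a constant, which is the usual device for bridging the mismatch the abstract mentions. Function names and the example data are illustrative only.

```python
def kaplan_meier(values, detected):
    """Product-limit estimate for data whose censored points are lower limits.

    detected[i] is True for an exact measurement and False for a censored one.
    Returns a list of (value, estimated P(X > value)) at each detected value.
    """
    data = sorted(zip(values, detected))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        tied = 0
        # Gather all observations sharing this value
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            tied += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= tied
    return curve

def flip_upper_limits(values, pivot):
    """Turn upper limits into lower limits by reflecting about a constant pivot."""
    return [pivot - v for v in values]

curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

After flipping, the estimated curve at a flipped value t describes P(X < pivot - t) for the original variable, so it reads as a cumulative distribution of the original measurements.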
Revised Perturbation Statistics for the Global Scale Atmospheric Model

NASA Technical Reports Server (NTRS)

Justus, C. G.; Woodrum, A.

1975-01-01

Magnitudes and scales of atmospheric perturbations about the monthly mean for the thermodynamic variables and wind components are presented by month at various latitudes. These perturbation statistics are a revision of the random perturbation data required for the global scale atmospheric model program and are from meteorological rocket network statistical summaries in the 22 to 65 km height range and NASA grenade and pitot tube data summaries in the region up to 90 km. The observed perturbations in the thermodynamic variables were adjusted to make them consistent with constraints required by the perfect gas law and the hydrostatic equation. Vertical scales were evaluated by Buell's depth of pressure system equation and from vertical structure function analysis. Tables of magnitudes and vertical scales are presented for each month at latitude 10, 30, 50, 70, and 90 degrees.
Multi-Reader ROC studies with Split-Plot Designs: A Comparison of Statistical Methods

PubMed Central

Obuchowski, Nancy A.; Gallas, Brandon D.; Hillis, Stephen L.

2012-01-01

Rationale and Objectives: Multi-reader imaging trials often use a factorial design, where study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of the design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper we compare three methods of analysis for the split-plot design. Materials and Methods: Three statistical methods are presented: Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean ANOVA approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power and confidence interval coverage of the three test statistics. Results: The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% CIs falls close to the nominal coverage for small and large sample sizes. Conclusions: The split-plot MRMC study design can be statistically efficient compared with the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rate, similar power, and nominal CI coverage, are available for this study design. PMID:23122570
[Statistical validity of the Mexican Food Security Scale and the Latin American and Caribbean Food Security Scale].

PubMed

Villagómez-Ornelas, Paloma; Hernández-López, Pedro; Carrasco-Enríquez, Brenda; Barrios-Sánchez, Karina; Pérez-Escamilla, Rafael; Melgar-Quiñónez, Hugo

2014-01-01

This article validates the statistical consistency of two food security scales: the Mexican Food Security Scale (EMSA) and the Latin American and Caribbean Food Security Scale (ELCSA). Validity tests were conducted in order to verify that both scales were consistent instruments, composed of independent, properly calibrated and adequately sorted items, arranged in a continuum of severity. The following tests were developed: sorting of items; Cronbach's alpha analysis; parallelism of prevalence curves; Rasch models; sensitivity analysis through a hypothesis test of mean differences. The tests showed that both scales meet the required attributes and are robust statistical instruments for food security measurement. This is relevant given that the lack of access to food indicator, included in multidimensional poverty measurement in Mexico, is calculated with EMSA.
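Of the tests listed, Cronbach's alpha is the easiest to illustrate. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The data layout (one row of item scores per respondent) and the function name are assumptions for illustration, not the article's implementation.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)

# Hypothetical yes/no answers (1 = affirmative) from five households on three items
alpha = cronbach_alpha([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]])
```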
Crux: Rapid Open Source Protein Tandem Mass Spectrometry Analysis

PubMed Central

2015-01-01

Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. PMID:25182276
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

PubMed

Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

2012-08-08

Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
Data management in large-scale collaborative toxicity studies: how to file experimental data for automated statistical analysis.

PubMed

Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette

2013-06-01

High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.
Statistical analysis of target acquisition sensor modeling experiments

NASA Astrophysics Data System (ADS)

Deaver, Dawne M.; Moyer, Steve

2015-05-01

The U.S. Army RDECOM CERDEC NVESD Modeling and Simulation Division is charged with the development and advancement of military target acquisition models to estimate expected soldier performance when using all types of imaging sensors. Two elements of sensor modeling are (1) laboratory-based psychophysical experiments used to measure task performance and calibrate the various models and (2) field-based experiments used to verify the model estimates for specific sensors. In both types of experiments, it is common practice to control or measure environmental, sensor, and target physical parameters in order to minimize uncertainty of the physics-based modeling. Predicting the minimum number of test subjects required to calibrate or validate the model should be, but is not always, done during test planning. The objective of this analysis is to develop guidelines for test planners which recommend the number and types of test samples required to yield a statistically significant result.
Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review.

PubMed

Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C

2018-03-07

Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. 
Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
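The abstract does not give the exact formulas, so the sketch below uses two commonly quoted rule-of-thumb forms as stand-ins. Both are assumptions here: the mean is approximated by the average of the lower quartile, median, and upper quartile, and the SD by one quarter of the range. The versions evaluated in the review may differ, for example by depending on sample size.

```python
def mean_from_quartiles(lower_quartile, median, upper_quartile):
    """Approximate a missing mean from three reported quantiles (assumed formula)."""
    return (lower_quartile + median + upper_quartile) / 3.0

def sd_from_range(minimum, maximum):
    """Approximate a missing SD as range / 4 (assumed rule of thumb)."""
    return (maximum - minimum) / 4.0

# Hypothetical trial reporting median 4 (IQR 2 to 9) and range 10 to 30
approx_mean = mean_from_quartiles(2, 4, 9)
approx_sd = sd_from_range(10, 30)
```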
Statistical flaws in design and analysis of fertility treatment studies on cryopreservation raise doubts on the conclusions

PubMed Central

van Gelder, P.H.A.J.M.; Nijs, M.

2011-01-01

Decisions about pharmacotherapy are being taken by medical doctors and authorities based on comparative studies on the use of medications. In studies on fertility treatments in particular, the methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some of the typical statistical flaws, illustrated by a number of example studies which have been published in peer-reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care. PMID:24753877
Statistical flaws in design and analysis of fertility treatment studies on cryopreservation raise doubts on the conclusions.

PubMed

van Gelder, P H A J M; Nijs, M

2011-01-01

Decisions about pharmacotherapy are being taken by medical doctors and authorities based on comparative studies on the use of medications. In studies on fertility treatments in particular, the methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some of the typical statistical flaws, illustrated by a number of example studies which have been published in peer-reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care.
Applying Monte Carlo Simulation to Launch Vehicle Design and Requirements Analysis

NASA Technical Reports Server (NTRS)

Hanson, J. M.; Beard, B. B.

2010-01-01

This Technical Publication (TP) is meant to address a number of topics related to the application of Monte Carlo simulation to launch vehicle design and requirements analysis. Although the focus is on a launch vehicle application, the methods may be applied to other complex systems as well. The TP is organized so that all the important topics are covered in the main text, and detailed derivations are in the appendices. The TP first introduces Monte Carlo simulation and the major topics to be discussed, including discussion of the input distributions for Monte Carlo runs, testing the simulation, how many runs are necessary for verification of requirements, what to do if results are desired for events that happen only rarely, and postprocessing, including analyzing any failed runs, examples of useful output products, and statistical information for generating desired results from the output data. Topics in the appendices include some tables for requirements verification, derivation of the number of runs required and generation of output probabilistic data with consumer risk included, derivation of launch vehicle models to include possible variations of assembled vehicles, minimization of a consumable to achieve a two-dimensional statistical result, recontact probability during staging, ensuring duplicated Monte Carlo random variations, and importance sampling.
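The question of how many runs are necessary can be illustrated with the simplest case, in which no failures are allowed. If the true success probability were only at the required level p, the chance of n successes in a row is p^n; choosing n so that this chance does not exceed the consumer risk gives the run count. This zero-failure sketch is an illustration under that assumption, not a reproduction of the TP's tables, and the function name is hypothetical.

```python
import math

def runs_for_zero_failures(required_success_prob, consumer_risk):
    """Smallest n with required_success_prob ** n <= consumer_risk."""
    if not (0.0 < required_success_prob < 1.0 and 0.0 < consumer_risk < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(consumer_risk) / math.log(required_success_prob))

# Hypothetical requirement: 99.73 % success demonstrated with 10 % consumer risk
n_runs = runs_for_zero_failures(0.9973, 0.10)
```

Allowing one or more failed runs raises the required count and needs the binomial distribution rather than this closed form.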

A Non-Intrusive Algorithm for Sensitivity Analysis of Chaotic Flow Simulations

NASA Technical Reports Server (NTRS)

Blonigan, Patrick J.; Wang, Qiqi; Nielsen, Eric J.; Diskin, Boris

2017-01-01

We demonstrate a novel algorithm for computing the sensitivity of statistics in chaotic flow simulations to parameter perturbations. The algorithm is non-intrusive but requires exposing an interface. Based on the principle of shadowing in dynamical systems, this algorithm is designed to reduce the effect of the sampling error in computing sensitivity of statistics in chaotic simulations. We compare the effectiveness of this method to that of the conventional finite difference method.
Measuring Efficiency and Tradeoffs in Attainment of EEO Goals.

DTIC Science & Technology

1982-02-01

in FY78 and FY79. i.e., These goals are based on undifferentiated Civilian Labor Force (CLF) ratios required for reporting by the Equal Employment...Lewis and R.J. Niehaus, "Design and Development of Equal Employment Opportunity Human Resources Planning Models," NPDRC TR79--141 (San Diego: Navy...Approach to Analysis of Tradeoffs Among Household Production Outputs," American Statistical Association 1979 Proceedings of the Social Statistics Section
Hierarchical models and bayesian analysis of bird survey information

Treesearch

John R. Sauer; William A. Link; J. Andrew Royle

2005-01-01

Summary of bird survey information is a critical component of conservation activities, but often our summaries rely on statistical methods that do not accommodate the limitations of the information. Prioritization of species requires ranking and analysis of species by magnitude of population trend, but often magnitude of trend is a misleading measure of actual decline...
Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

ERIC Educational Resources Information Center

Brusco, Michael J.; Singh, Renu; Steinley, Douglas

2009-01-01

The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…
[The concept "a case in outpatient treatment" in military policlinic activity].

PubMed

Vinogradov, S N; Vorob'ev, E G; Shklovskiĭ, B L

2014-04-01

The article substantiates the need for military polyclinics to move to a system of accounting for and evaluating their activity on the basis of completed cases of outpatient treatment. Only automation of data and statistical processes can solve this problem. On the basis of an analysis of the literature, the requirements of guidance documents, and observational results, it concludes that the existing concepts of medical statistics should first be revised (formalised) from the standpoint of the information environment in use, namely electronic databases. In this respect, the main features of the outpatient treatment case as a unit of medical-statistical record are specified and its definition is formulated.
An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature.

PubMed

Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth

2015-10-01

Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. 
These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.
Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation

NASA Technical Reports Server (NTRS)

DePriest, Douglas; Morgan, Carolyn

2003-01-01

The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models to accurately predict the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.
Application of microarray analysis on computer cluster and cloud platforms.

PubMed

Bernau, C; Boulesteix, A-L; Knaus, J

2013-01-01

Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
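The computational independence of permutation iterations can be made concrete with a small sketch. The work is split into chunks, each with its own random seed, so the chunks can run anywhere and only their hit counts need to be combined. The two-group mean-difference statistic, the function names, and the data are illustrative assumptions rather than the procedures used in the article.

```python
import random

def permutation_chunk(task):
    """Count permutations in one chunk whose |mean difference| reaches the observed one."""
    group_a, group_b, observed, n_permutations, seed = task
    rng = random.Random(seed)  # separate generator per chunk keeps chunks independent
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= observed:
            hits += 1
    return hits

def permutation_p_value(group_a, group_b, n_chunks=4, perms_per_chunk=500, mapper=map):
    """Two-sided permutation p-value; `mapper` decides where the chunks are executed."""
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    tasks = [(group_a, group_b, observed, perms_per_chunk, seed)
             for seed in range(n_chunks)]
    hits = sum(mapper(permutation_chunk, tasks))
    total = n_chunks * perms_per_chunk
    return (hits + 1) / (total + 1)  # add-one correction avoids a p-value of zero

p_value = permutation_p_value([10, 11, 12, 13, 14, 15], [0, 1, 2, 3, 4, 5])
```

The default `mapper=map` runs the chunks one after another. Passing the `map` method of a `multiprocessing.Pool` or a `concurrent.futures.ProcessPoolExecutor` instead would distribute the same chunks across processes, which is the kind of parallelization the abstract refers to.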
Statistical Learning Analysis in Neuroscience: Aiming for Transparency

PubMed Central

Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan

2009-01-01

Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270
Comparative analysis on the selection of number of clusters in community detection

NASA Astrophysics Data System (ADS)

Kawamoto, Tatsuro; Kabashima, Yoshiyuki

2018-02-01

We conduct a comparative analysis on various estimates of the number of clusters in community detection. An exhaustive comparison requires testing of all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, map equation, Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendency of overfit and underfit that the assessment criteria and algorithms have becomes apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful to determine the number of clusters.
15 CFR 30.51 - Statistical information required for import entries.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 15 Commerce and Foreign Trade 1 2011-01-01 2011-01-01 false Statistical information required for import entries. 30.51 Section 30.51 Commerce and Foreign Trade Regulations Relating to Commerce and... § 30.51 Statistical information required for import entries. The information required for statistical...
15 CFR 30.51 - Statistical information required for import entries.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Statistical information required for import entries. 30.51 Section 30.51 Commerce and Foreign Trade Regulations Relating to Commerce and... § 30.51 Statistical information required for import entries. The information required for statistical...
Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain

PubMed Central

Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young

2010-01-01

Background: Statistical analysis is essential in regard to obtaining objective reliability for medical research. However, medical researchers do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention to improve the statistical quality of the journal. Methods: All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results: One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions: We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only to applying statistical procedures but also to the review process to improve the value of the article. PMID:20552071
The Statistical Value of Raw Fluorescence Signal in Luminex xMAP Based Multiplex Immunoassays

PubMed Central

Breen, Edmond J.; Tan, Woei; Khan, Alamgir

2016-01-01

Tissue samples (plasma, saliva, serum or urine) from 169 patients classified as either normal or having one of seven possible diseases are analysed across three 96-well plates for the presence of 37 analytes using cytokine inflammation multiplexed immunoassay panels. Censoring for concentration data caused problems for analysis of the low abundant analytes. Using fluorescence analysis over concentration based analysis allowed analysis of these low abundant analytes. Mixed-effects analysis on the resulting fluorescence and concentration responses reveals that a combination of censoring and mapping the fluorescence responses to concentration values, through a 5PL curve, changed observed analyte concentrations. Simulation verifies this, by showing a dependence on the mean fluorescence response and its distribution on the observed analyte concentration levels. Differences from normality, in the fluorescence responses, can lead to differences in concentration estimates and unreliable probabilities for treatment effects. It is seen that when fluorescence responses are normally distributed, probabilities of treatment effects for fluorescence based t-tests have greater statistical power than the same probabilities from concentration based t-tests. We add evidence that the fluorescence response, unlike concentration values, doesn’t require censoring and we show with respect to differential analysis on the fluorescence responses that background correction is not required. PMID:27243383
Risk-based Methodology for Validation of Pharmaceutical Batch Processes.

PubMed

Wiles, Frederick

2013-01-01

In January 2011, the U.S. Food and Drug Administration published new process validation guidance for pharmaceutical processes. The new guidance debunks the long-held industry notion that three consecutive validation batches or runs are all that are required to demonstrate that a process is operating in a validated state. Instead, the new guidance now emphasizes that the level of monitoring and testing performed during process performance qualification (PPQ) studies must be sufficient to demonstrate statistical confidence both within and between batches. In some cases, three qualification runs may not be enough. Nearly two years after the guidance was first published, little has been written defining a statistical methodology for determining the number of samples and qualification runs required to satisfy Stage 2 requirements of the new guidance. This article proposes using a combination of risk assessment, control charting, and capability statistics to define the monitoring and testing scheme required to show that a pharmaceutical batch process is operating in a validated state. In this methodology, an assessment of process risk is performed through application of a process failure mode, effects, and criticality analysis (PFMECA). The output of PFMECA is used to select appropriate levels of statistical confidence and coverage which, in turn, are used in capability calculations to determine when significant Stage 2 (PPQ) milestones have been met. The achievement of Stage 2 milestones signals the release of batches for commercial distribution and the reduction of monitoring and testing to commercial production levels. Individuals, moving range, and range/sigma charts are used in conjunction with capability statistics to demonstrate that the commercial process is operating in a state of statistical control.
The new process validation guidance published by the U.S. Food and Drug Administration in January of 2011 indicates that the number of process validation batches or runs required to demonstrate that a pharmaceutical process is operating in a validated state should be based on sound statistical principles. The old rule of "three consecutive batches and you're done" is no longer sufficient. The guidance, however, does not provide any specific methodology for determining the number of runs required, and little has been published to augment this shortcoming. The paper titled "Risk-based Methodology for Validation of Pharmaceutical Batch Processes" describes a statistically sound methodology for determining when a statistically valid number of validation runs has been acquired based on risk assessment and calculation of process capability.
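The individuals chart and capability statistics named in this abstract can be sketched with the textbook formulas. This is a generic illustration, not the paper's procedure: the function names and the choice to estimate the within-process sigma from the average moving range are assumptions.

```python
from statistics import mean

D2_SPAN_2 = 1.128   # standard d2 constant for moving ranges of two consecutive points


def _average_moving_range(values):
    """Mean absolute difference between consecutive observations."""
    return mean(abs(b - a) for a, b in zip(values, values[1:]))


def individuals_chart_limits(values):
    """Lower limit, center line and upper limit of an individuals (I) chart."""
    center = mean(values)
    # 3 / d2 = 2.66 is the usual multiplier for the average moving range.
    half_width = 2.66 * _average_moving_range(values)
    return center - half_width, center, center + half_width


def cpk(values, lsl, usl):
    """Capability index Cpk using sigma estimated from the moving range."""
    sigma_within = _average_moving_range(values) / D2_SPAN_2
    center = mean(values)
    return min(usl - center, center - lsl) / (3 * sigma_within)
```

A point outside the returned limits would be a signal to investigate, and a Cpk target would be chosen by the user (in the abstract's approach, informed by the risk assessment); neither threshold is fixed by this sketch.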
The Outlook for Technological Change and Employment. Technology and the American Economy, Appendix Volume I.

ERIC Educational Resources Information Center

National Commission on Technology, Automation and Economic Progress, Washington, DC.

Findings of a study of the nation's manpower requirements to 1975 are presented. Part I, on the employment outlook, consists of a 10-year projection of manpower requirements by occupation and by industry prepared by the Bureau of Labor Statistics and an analysis of the growth prospects and the state of fiscal policy in the United States economy as…
MORTICIA, a statistical analysis software package for determining optical surveillance system effectiveness.

NASA Astrophysics Data System (ADS)

Ramkilowan, A.; Griffith, D. J.

2017-10-01

Surveillance modelling in terms of the standard Detect, Recognise and Identify (DRI) thresholds remains a key requirement for determining the effectiveness of surveillance sensors. With readily available computational resources it has become feasible to perform statistically representative evaluations of the effectiveness of these sensors. A new capability for performing this Monte-Carlo type analysis is demonstrated in the MORTICIA (Monte-Carlo Optical Rendering for Theatre Investigations of Capability under the Influence of the Atmosphere) software package developed at the Council for Scientific and Industrial Research (CSIR). This first generation, Python-based open-source integrated software package, currently in the alpha stage of development, aims to provide all the functionality required to perform statistical investigations of the effectiveness of optical surveillance systems in specific or generic deployment theatres. This includes modelling the mathematical and physical processes that govern, among other components of a surveillance system, a sensor's detector and optical components, a target and its background, and the intervening atmospheric influences. In this paper we discuss integral aspects of the bespoke framework that are critical to the longevity of all subsequent modelling efforts. Additionally, some preliminary results are presented.
Spatial Statistics for Tumor Cell Counting and Classification

NASA Astrophysics Data System (ADS)

Wirjadi, Oliver; Kim, Yoo-Jin; Breuel, Thomas

To count and classify cells in histological sections is a standard task in histology. One example is the grading of meningiomas, benign tumors of the meninges, which requires assessing the fraction of proliferating cells in an image. As this process is very time consuming when performed manually, automation is required. To address such problems, we propose a novel application of Markov point process methods in computer vision, leading to algorithms for computing the locations of circular objects in images. In contrast to previous algorithms using such spatial statistics methods in image analysis, the present one is fully trainable. This is achieved by combining point process methods with statistical classifiers. Using simulated data, the method proposed in this paper will be shown to be more accurate and more robust to noise than standard image processing methods. On the publicly available SIMCEP benchmark for cell image analysis algorithms, the cell count performance of the present paper is significantly more accurate than results published elsewhere, especially when cells form dense clusters. Furthermore, the proposed system performs as well as a state-of-the-art algorithm for the computer-aided histological grading of meningiomas when combined with a simple k-nearest neighbor classifier for identifying proliferating cells.
Validation tools for image segmentation

NASA Astrophysics Data System (ADS)

Padfield, Dirk; Ross, James

2009-02-01

A large variety of image analysis tasks require the segmentation of various regions in an image. For example, segmentation is required to generate accurate models of brain pathology that are important components of modern diagnosis and therapy. While the manual delineation of such structures gives accurate information, the automatic segmentation of regions such as the brain and tumors from such images greatly enhances the speed and repeatability of quantifying such structures. The ubiquitous need for such algorithms has led to a wide range of image segmentation algorithms with various assumptions, parameters, and robustness. The evaluation of such algorithms is an important step in determining their effectiveness. Therefore, rather than developing new segmentation algorithms, we here describe validation methods for segmentation algorithms. Using similarity metrics comparing the automatic to manual segmentations, we demonstrate methods for optimizing the parameter settings for individual cases and across a collection of datasets using the Design of Experiment framework. We then employ statistical analysis methods to compare the effectiveness of various algorithms. We investigate several region-growing algorithms from the Insight Toolkit and compare their accuracy to that of a separate statistical segmentation algorithm. The segmentation algorithms are used with their optimized parameters to automatically segment the brain and tumor regions in MRI images of 10 patients. The validation tools indicate that none of the ITK algorithms studied are able to outperform with statistical significance the statistical segmentation algorithm although they perform reasonably well considering their simplicity.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilbert, Richard O.

The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedure techniques, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Some statistical techniques given here are not commonly seen in statistics books. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and spatial-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.

The emergence of modern statistics in agricultural science: analysis of variance, experimental design and the reshaping of research at Rothamsted Experimental Station, 1919-1933.

PubMed

Parolini, Giuditta

2015-01-01

During the twentieth century statistical methods have transformed research in the experimental and social sciences. Qualitative evidence has largely been replaced by quantitative results and the tools of statistical inference have helped foster a new ideal of objectivity in scientific knowledge. The paper will investigate this transformation by considering the genesis of analysis of variance and experimental design, statistical methods nowadays taught in every elementary course of statistics for the experimental and social sciences. These methods were developed by the mathematician and geneticist R. A. Fisher during the 1920s, while he was working at Rothamsted Experimental Station, where agricultural research was in turn reshaped by Fisher's methods. Analysis of variance and experimental design required new practices and instruments in field and laboratory research, and imposed a redistribution of expertise among statisticians, experimental scientists and the farm staff. On the other hand the use of statistical methods in agricultural science called for a systematization of information management and made computing an activity integral to the experimental research done at Rothamsted, permanently integrating the statisticians' tools and expertise into the station research programme. Fisher's statistical methods did not remain confined within agricultural research and by the end of the 1950s they had come to stay in psychology, sociology, education, chemistry, medicine, engineering, economics, quality control, just to mention a few of the disciplines which adopted them.
Data handling and analysis for the 1971 corn blight watch experiment.

NASA Technical Reports Server (NTRS)

Anuta, P. E.; Phillips, T. L.; Landgrebe, D. A.

1972-01-01

Review of the data handling and analysis methods used in the near-operational test of remote sensing systems provided by the 1971 corn blight watch experiment. The general data analysis techniques and, particularly, the statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data are described. Some of the results obtained are examined, and the implications of the experiment for future data communication requirements of earth resource survey systems are discussed.
Anima: Modular Workflow System for Comprehensive Image Data Analysis

PubMed Central

Rantanen, Ville; Valori, Miko; Hautaniemi, Sampsa

2014-01-01

Modern microscopes produce vast amounts of image data, and computational methods are needed to analyze and interpret these data. Furthermore, a single image analysis project may require tens or hundreds of analysis steps, starting from data import and pre-processing, continuing to segmentation and statistical analysis, and ending with visualization and reporting. To manage such large-scale image data analysis projects, we present here a modular workflow system called Anima. Anima is designed for comprehensive and efficient image data analysis development, and it contains several features that are crucial in high-throughput image data analysis: programing language independence, batch processing, easily customized data processing, interoperability with other software via application programing interfaces, and advanced multivariate statistical analysis. The utility of Anima is shown with two case studies focusing on testing different algorithms developed in different imaging platforms and an automated prediction of alive/dead C. elegans worms by integrating several analysis environments. Anima is fully open source and available with documentation at www.anduril.org/anima. PMID:25126541
P-MartCancer–Interactive Online Software to Enable Analysis of Shotgun Cancer Proteomic Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Webb-Robertson, Bobbie-Jo M.; Bramer, Lisa M.; Jensen, Jeffrey L.

P-MartCancer is a new interactive web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without requiring direct interaction with the data or with statistical software. P-MartCancer offers a series of statistical modules associated with quality assessment, peptide and protein statistics, protein quantification and exploratory data analyses driven by the user via customized workflows and interactive visualization. Currently, P-MartCancer offers access to multiple cancer proteomic datasets generated through the Clinical Proteomics Tumor Analysis Consortium (CPTAC) at the peptide, gene and protein levels. P-MartCancer is deployed using Azure technologies (http://pmart.labworks.org/cptac.html); the web service is alternatively available via Docker Hub (https://hub.docker.com/r/pnnl/pmart-web/), and many statistical functions can be utilized directly from an R package available on GitHub (https://github.com/pmartR).
Statistical analysis of solid waste composition data: Arithmetic mean, standard deviation and correlation coefficients.

PubMed

Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard

2017-11-01

Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods other than classical statistics, which are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
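The abstract does not say which transformation the authors applied. As one common option for closed data, the sketch below shows closure to a fixed total and the centered log-ratio (clr) transform. The function names are illustrative, and the code assumes strictly positive parts; zero fractions would need separate treatment.

```python
import math


def closure(parts, total=100.0):
    """Rescale positive parts so that they sum to `total` (percentages by default)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(composition):
    """Centered log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in composition]   # requires every part > 0
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]
```

The clr values do not depend on whether the input is given in kilograms or percentages, so means, standard deviations, and correlations computed on them are not driven by the sum constraint in the way raw percentages are.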
Specifying the ISS Plasma Environment

NASA Technical Reports Server (NTRS)

Minow, Joseph I.; Diekmann, Anne; Neergaard, Linda; Bui, Them; Mikatarian, Ronald; Barsamian, Hagop; Koontz, Steven

2002-01-01

Quantifying the spacecraft charging risks and corresponding hazards for the International Space Station (ISS) requires a plasma environment specification describing the natural variability of ionospheric temperature (Te) and density (Ne). Empirical ionospheric specification and forecast models such as the International Reference Ionosphere (IRI) model typically only provide estimates of long term (seasonal) mean Te and Ne values for the low Earth orbit environment. Knowledge of the Te and Ne variability as well as the likelihood of extreme deviations from the mean values are required to estimate both the magnitude and frequency of occurrence of potentially hazardous spacecraft charging environments for a given ISS construction stage and flight configuration. This paper describes the statistical analysis of historical ionospheric low Earth orbit plasma measurements used to estimate Ne, Te variability in the ISS flight environment. The statistical variability analysis of Ne and Te enables calculation of the expected frequency of occurrence of any particular values of Ne and Te, especially those that correspond to possibly hazardous spacecraft charging environments. The database used in the original analysis included measurements from the AE-C, AE-D, and DE-2 satellites. Recent work on the database has added additional satellites to the database and ground based incoherent scatter radar observations as well. Deviations of the data values from the IRI estimated Ne, Te parameters for each data point provide a statistical basis for modeling the deviations of the plasma environment from the IRI model output.
Probability of identification: a statistical model for the validation of qualitative botanical identification methods.

PubMed

LaBudde, Robert A; Harnly, James M

2012-01-01

A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. The report describes the development and validation of studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis follow closely those of the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts for botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single collaborator and multicollaborative study examples are given.
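The basic statistic in this abstract, the proportion of replicates identified, can be computed directly. The abstract does not specify which confidence interval the authors use; the sketch below pairs the POI with a Wilson score interval as one reasonable assumption, and the function names are illustrative.

```python
import math


def poi(identified, replicates):
    """Probability of identification: fraction of replicates that returned 1."""
    return identified / replicates


def wilson_interval(identified, replicates, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = identified / replicates
    denom = 1 + z * z / replicates
    center = (p + z * z / (2 * replicates)) / denom
    half = z * math.sqrt(p * (1 - p) / replicates
                         + z * z / (4 * replicates ** 2)) / denom
    return center - half, center + half
```

Plotting the POI with its interval against, for example, the fraction of target material in the sample would give the kind of response curve the abstract mentions.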
Quantification of integrated HIV DNA by repetitive-sampling Alu-HIV PCR on the basis of poisson statistics.

PubMed

De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos

2014-06-01

Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool to monitor the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method on the basis of Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR by use of dilutions of an integration standard and on samples of 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need of a standard dilution curve. Implementation of the CI estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
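The core Poisson relationship behind this kind of replicate-based quantification is that, if template copies are distributed across reactions at random, the fraction of negative reactions equals exp(-λ), where λ is the mean number of detectable copies per reaction. The sketch below shows only that step; the function name is illustrative, and the published method may include further corrections that are not reproduced here.

```python
import math


def mean_copies_per_reaction(positive, replicates):
    """Poisson estimate of mean detectable copies per reaction.

    Uses P(negative) = exp(-lambda), so lambda = -ln(negative fraction).
    """
    if not 0 <= positive < replicates:
        # With no negative reactions the estimate is unbounded.
        raise ValueError("need at least one negative reaction")
    negative_fraction = (replicates - positive) / replicates
    return -math.log(negative_fraction)
```

For a 42-replicate run with 21 positive reactions, the estimate is ln 2, about 0.69 copies per reaction.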
A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.

PubMed

Tang, Liansheng Larry; Balakrishnan, N

2011-01-01

The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.
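For reference, the regular fixed-sample-size statistic that the proposed random-sum version generalizes can be written as a count over all pairs, with ties contributing one half. The sketch below implements only this standard form, not the authors' random-sum extension; the function names are illustrative.

```python
def mann_whitney_u(group_a, group_b):
    """Number of (a, b) pairs with a > b, counting ties as one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc(group_a, group_b):
    """U scaled to [0, 1]; equals the empirical area under the ROC curve."""
    return mann_whitney_u(group_a, group_b) / (len(group_a) * len(group_b))
```

The division in `auc` uses fixed group sizes, which is exactly the assumption the abstract says is violated when the number of usable observations is itself random.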
Statistical Methodologies to Integrate Experimental and Computational Research

NASA Technical Reports Server (NTRS)

Parker, P. A.; Johnson, R. T.; Montgomery, D. C.

2008-01-01

Development of advanced algorithms for simulating engine flow paths requires the integration of fundamental experiments with the validation of enhanced mathematical models. In this paper, we provide an overview of statistical methods to strategically and efficiently conduct experiments and computational model refinement. Moreover, the integration of experimental and computational research efforts is emphasized. With a statistical engineering perspective, scientific and engineering expertise is combined with statistical sciences to gain deeper insights into experimental phenomena and code development performance, supporting the overall research objectives. The particular statistical methods discussed are design of experiments, response surface methodology, and uncertainty analysis and planning. Their application is illustrated with a coaxial free jet experiment and a turbulence model refinement investigation. Our goal is to provide an overview, focusing on concepts rather than practice, to demonstrate the benefits of using statistical methods in research and development, thereby encouraging their broader and more systematic application.
FabricS: A user-friendly, complete and robust software for particle shape-fabric analysis

NASA Astrophysics Data System (ADS)

Moreno Chávez, G.; Castillo Rivera, F.; Sarocchi, D.; Borselli, L.; Rodríguez-Sedano, L. A.

2018-06-01

Shape-fabric is a textural parameter related to the spatial arrangement of elongated particles in geological samples. Its usefulness spans a range from sedimentary petrology to igneous and metamorphic petrology. Independently of the process being studied, when a material flows, the elongated particles are oriented with the major axis in the direction of flow. In sedimentary petrology this information has been used for studies of paleo-flow direction of turbidites, the origin of quartz sediments, and locating ignimbrite vents, among others. In addition to flow direction and its polarity, the method enables flow rheology to be inferred. The use of shape-fabric has been limited by the difficulty of measuring particles automatically and analyzing them with reliable circular statistics programs, which dampened interest in the method for a long time. Shape-fabric measurement has increased in popularity since the 1980s thanks to the development of new image analysis techniques and circular statistics software. However, the programs currently available are unreliable, are old and incompatible with newer operating systems, or require programming skills. The goal of our work is to develop a user-friendly program, in the MATLAB environment, with a graphical user interface, that can process images and includes editing functions and thresholds (elongation and size) for selecting a particle population and analyzing it with reliable circular statistics algorithms. Moreover, the method also has to produce rose diagrams, orientation vectors, and a complete series of statistical parameters. All these requirements are met by our new software. In this paper, we briefly explain the methodology from collection of oriented samples in the field to the minimum number of particles needed to obtain reliable fabric data. We obtained the data using specific statistical tests and taking into account the degree of iso-orientation of the samples and the required degree of reliability. The program has been verified by means of several simulations performed using appropriately designed features and by analyzing real samples.
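
The circular statistics mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB program: the function name is hypothetical, and the choices made here (treating long-axis orientations as axial data in the range 0-180 degrees, doubling the angles, and using the first-order Rayleigh approximation p ≈ exp(-Z)) are assumptions based on standard circular statistics practice, not details given in the abstract.

```python
import math

def axial_mean_direction(angles_deg):
    """Mean orientation, mean resultant length and approximate Rayleigh p-value.

    Hypothetical helper for axial data (orientations in 0-180 degrees).
    Axial data have no polarity, so each angle is doubled before averaging
    and the mean of the doubled angles is halved afterwards.
    """
    n = len(angles_deg)
    doubled = [math.radians(2.0 * a) for a in angles_deg]
    c = sum(math.cos(t) for t in doubled) / n
    s = sum(math.sin(t) for t in doubled) / n
    r_bar = math.hypot(c, s)                       # mean resultant length, 0..1
    mean_deg = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0
    rayleigh_z = n * r_bar ** 2                    # Rayleigh test statistic
    p_approx = math.exp(-rayleigh_z)               # first-order approximation only
    return mean_deg, r_bar, p_approx

if __name__ == "__main__":
    # Three particle long-axis orientations, in degrees from the reference direction
    print(axial_mean_direction([10.0, 20.0, 30.0]))
```

A mean resultant length close to 1 indicates strongly iso-oriented particles, while a value close to 0 indicates no preferred orientation, in which case the mean direction is not meaningful.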
Interpretation of statistical results.

PubMed

García Garmendia, J L; Maroto Monserrat, F

2018-02-21

The appropriate interpretation of statistical results is crucial to understanding advances in medical science. Statistical tools allow us to transform the uncertainty and apparent chaos in nature into measurable parameters that are applicable to our clinical practice. Understanding the meaning and actual extent of these instruments is essential for researchers, for the funders of research, and for professionals who require a permanent update based on good evidence and support for decision making. Various aspects of the designs, results and statistical analysis are reviewed, trying to facilitate their comprehension from the basics to what is most common but not well understood, and bringing a constructive, non-exhaustive but realistic look. Copyright © 2018 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
SDGs and Geospatial Frameworks: Data Integration in the United States

NASA Astrophysics Data System (ADS)

Trainor, T.

2016-12-01

Responding to the need to monitor a nation's progress towards meeting the Sustainable Development Goals (SDG) outlined in the 2030 U.N. Agenda requires the integration of earth observations with statistical information. The urban agenda proposed in SDG 11 challenges the global community to find a geospatial approach to monitor and measure inclusive, safe, resilient, and sustainable cities and communities. Target 11.7 identifies public safety, accessibility to green and public spaces, and the most vulnerable populations (i.e., women and children, older persons, and persons with disabilities) as the most important priorities of this goal. A challenge for both national statistical organizations and earth observation agencies in addressing SDG 11 is the requirement for detailed statistics at a sufficient spatial resolution to provide the basis for meaningful analysis of the urban population and city environments. Using an example for the city of Pittsburgh, this presentation proposes data and methods to illustrate how earth science and statistical data can be integrated to respond to Target 11.7. Finally, a preliminary series of data initiatives are proposed for extending this method to other global cities.
Geospatial methods and data analysis for assessing distribution of grazing livestock

USDA-ARS's Scientific Manuscript database

Free-ranging livestock research must begin with a well conceived problem statement and employ appropriate data acquisition tools and analytical techniques to accomplish the research objective. These requirements are especially critical in addressing animal distribution. Tools and statistics used t...
IDENTIFICATION OF REGIME SHIFTS IN TIME SERIES USING NEIGHBORHOOD STATISTICS

EPA Science Inventory

The identification of alternative dynamic regimes in ecological systems requires several lines of evidence. Previous work on time series analysis of dynamic regimes includes mainly model-fitting methods. We introduce two methods that do not use models. These approaches use state-...
Journal of Naval Science. Volume 2, Number 1

DTIC Science & Technology

1976-01-01

has defined a probability distribution function which fits this type of data and forms the basis for statistical analysis of test results (see...Conditions to Assess the Performance of Fire-Resistant Fluids’. Wear, 28 (1974) 29. J.N.S., Vol. 2, No. 1 APPENDIX A Analysis of Fatigue Test Data...used to produce the impulse response and the equipment required for the analysis is relatively simple. The methods that must be used to produce
Data handling and analysis for the 1971 corn blight watch experiment

NASA Technical Reports Server (NTRS)

Anuta, P. E.; Phillips, T. L.

1973-01-01

The overall corn blight watch experiment data flow is described and the organization of the LARS/Purdue data center is discussed. Data analysis techniques are discussed in general and the use of statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data is described. Some of the results obtained are discussed, along with the implications of the experiment for future data communication requirements for earth resource survey systems.
Statistical crystallography of surface micelle spacing

NASA Technical Reports Server (NTRS)

Noever, David A.

1992-01-01

The aggregation of the recently reported surface micelles of block polyelectrolytes is analyzed using techniques of statistical crystallography. A polygonal lattice (Voronoi mosaic) connects center-to-center points, yielding statistical agreement with crystallographic predictions; Aboav-Weaire's law and Lewis's law are verified. This protocol supplements the standard analysis of surface micelles leading to aggregation number determination and, when compared to numerical simulations, allows further insight into the random partitioning of surface films. In particular, agreement with Lewis's law has been linked to the geometric packing requirements of filling two-dimensional space which compete with (or balance) physical forces such as interfacial tension, electrostatic repulsion, and van der Waals attraction.
Proper joint analysis of summary association statistics requires the adjustment of heterogeneity in SNP coverage pattern.

PubMed

Zhang, Han; Wheeler, William; Song, Lei; Yu, Kai

2017-07-07

As meta-analysis results published by consortia of genome-wide association studies (GWASs) become increasingly available, many association summary statistics-based multi-locus tests have been developed to jointly evaluate multiple single-nucleotide polymorphisms (SNPs) to reveal novel genetic architectures of various complex traits. The validity of these approaches relies on the accurate estimate of z-score correlations at considered SNPs, which in turn requires knowledge on the set of SNPs assessed by each study participating in the meta-analysis. However, this exact SNP coverage information is usually unavailable from the meta-analysis results published by GWAS consortia. In the absence of the coverage information, researchers typically estimate the z-score correlations by making oversimplified coverage assumptions. We show through real studies that such a practice can generate highly inflated type I errors, and we demonstrate the proper way to incorporate correct coverage information into multi-locus analyses. We advocate that consortia should make SNP coverage information available when posting their meta-analysis results, and that investigators who develop analytic tools for joint analyses based on summary data should pay attention to the variation in SNP coverage and adjust for it appropriately. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
Statistical issues in quality control of proteomic analyses: good experimental design and planning.

PubMed

Cairns, David A

2011-03-01

Quality control is becoming increasingly important in proteomic investigations as experiments become more multivariate and quantitative. Quality control applies to all stages of an investigation and statistics can play a key role. In this review, the role of statistical ideas in the design and planning of an investigation is described. This involves the design of unbiased experiments using key concepts from statistical experimental design, the understanding of the biological and analytical variation in a system using variance components analysis and the determination of a required sample size to perform a statistically powerful investigation. These concepts are described through simple examples and an example data set from a 2-D DIGE pilot experiment. Each of these concepts can prove useful in producing better and more reproducible data. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

A Statistical Method for Synthesizing Mediation Analyses Using the Product of Coefficient Approach Across Multiple Trials

PubMed Central

Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks

2016-01-01

Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
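
The Monte Carlo confidence interval for a mediated effect can be sketched for the simplest case of a single trial. This is a simplified illustration and not the authors' R program or their multi-trial marginal likelihood method: the function name is hypothetical, and the sketch assumes that the estimates of paths a and b are independent and normally distributed around their point estimates.

```python
import random

def monte_carlo_mediated_ci(a, se_a, b, se_b, draws=20000, level=0.95, seed=0):
    """Point estimate and Monte Carlo interval for the product a*b.

    Hypothetical single-trial helper. Assumes a and b are independent and
    normally distributed with the given standard errors.
    """
    rng = random.Random(seed)                      # fixed seed for reproducibility
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(draws)
    )
    lo_idx = int((1.0 - level) / 2.0 * draws)      # index of the lower percentile
    hi_idx = int((1.0 + level) / 2.0 * draws) - 1  # index of the upper percentile
    return a * b, products[lo_idx], products[hi_idx]

if __name__ == "__main__":
    estimate, lower, upper = monte_carlo_mediated_ci(0.5, 0.1, 0.4, 0.1)
    print(f"a*b = {estimate:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

The interval is taken from the empirical percentiles of the simulated products, so it can be asymmetric around a*b, which a normal-theory interval for the product cannot be.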
Quantitative trait nucleotide analysis using Bayesian model selection.

PubMed

Blangero, John; Goring, Harald H H; Kent, Jack W; Williams, Jeff T; Peterson, Charles P; Almasy, Laura; Dyer, Thomas D

2005-10-01

Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.
Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

NASA Astrophysics Data System (ADS)

Kacprzak, T.; Kirk, D.; Friedrich, O.; Amara, A.; Refregier, A.; Marian, L.; Dietrich, J. P.; Suchyta, E.; Aleksić, J.; Bacon, D.; Becker, M. R.; Bonnett, C.; Bridle, S. L.; Chang, C.; Eifler, T. F.; Hartley, W. G.; Huff, E. M.; Krause, E.; MacCrann, N.; Melchior, P.; Nicola, A.; Samuroff, S.; Sheldon, E.; Troxel, M. A.; Weller, J.; Zuntz, J.; Abbott, T. M. C.; Abdalla, F. B.; Armstrong, R.; Benoit-Lévy, A.; Bernstein, G. M.; Bernstein, R. A.; Bertin, E.; Brooks, D.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Evrard, A. E.; Neto, A. Fausti; Flaugher, B.; Fosalba, P.; Frieman, J.; Gerdes, D. W.; Goldstein, D. A.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jarvis, M.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Miller, C. J.; Miquel, R.; Mohr, J. J.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Romer, A. K.; Roodman, A.; Rykoff, E. S.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Zhang, Y.; DES Collaboration

2016-12-01

Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg² field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range 0 < S/N < 4. [...] S/N > 4 would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. We discuss prospects for future peak statistics analysis with upcoming DES data.

Statistical plant set estimation using Schroeder-phased multisinusoidal input design

NASA Technical Reports Server (NTRS)

Bayard, D. S.

1992-01-01

A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
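
The input design described here can be illustrated with a short sketch of a Schroeder-phased multisine. The abstract does not give the phase formula; the sketch assumes the standard flat-spectrum Schroeder phases, phi_k = -pi k (k - 1) / N, and all function names are hypothetical. It shows the two properties of interest: energy appears only at the chosen DFT bins, and the peak amplitude stays well below that of a zero-phase sum of cosines.

```python
import cmath
import math

def schroeder_multisine(n_tones, n_samples):
    """One period of a unit-amplitude multisine with Schroeder phases.

    Tone k (k = 1..n_tones) sits exactly on DFT bin k and has phase
    phi_k = -pi * k * (k - 1) / n_tones (assumed standard formula).
    """
    phases = [-math.pi * k * (k - 1) / n_tones for k in range(1, n_tones + 1)]
    return [
        sum(
            math.cos(2.0 * math.pi * k * t / n_samples + phases[k - 1])
            for k in range(1, n_tones + 1)
        )
        for t in range(n_samples)
    ]

def crest_factor(x):
    """Peak absolute value divided by the RMS value."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms

def dft_magnitude(x, k):
    """Magnitude of DFT bin k, normalised by the number of samples."""
    n = len(x)
    total = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
    return abs(total) / n

if __name__ == "__main__":
    signal = schroeder_multisine(n_tones=8, n_samples=64)
    print("crest factor:", crest_factor(signal))
    print("bin 3 (excited):", dft_magnitude(signal, 3))
    print("bin 12 (not excited):", dft_magnitude(signal, 12))
```

For comparison, eight unit-amplitude cosines with zero phase would all peak together at the first sample and give a crest factor of sqrt(2 * 8) = 4.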

Prediction of the Electromagnetic Field Distribution in a Typical Aircraft Using the Statistical Energy Analysis

NASA Astrophysics Data System (ADS)

Kovalevsky, Louis; Langley, Robin S.; Caro, Stephane

2016-05-01

Due to the high cost of experimental EMI measurements, significant attention has been focused on numerical simulation. Classical methods such as the Method of Moments or Finite Difference Time Domain are not well suited for this type of problem, as they require a fine discretisation of space and fail to take uncertainties into account. In this paper, the authors show that Statistical Energy Analysis (SEA) is well suited for this type of application. SEA is a statistical approach employed to solve high frequency problems of electromagnetically reverberant cavities at a reduced computational cost. The key aspects of this approach are (i) to consider an ensemble of systems that share the same gross parameters, and (ii) to avoid solving Maxwell's equations inside the cavity, using the power balance principle. The output is an estimate of the field magnitude distribution in each cavity. The method is applied to a typical aircraft structure.

Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series

NASA Technical Reports Server (NTRS)

Vautard, R.; Ghil, M.

1989-01-01

Two dimensions of a dynamical system given by experimental time series are distinguished. Statistical dimension gives a theoretical upper bound for the minimal number of degrees of freedom required to describe the attractor up to the accuracy of the data, taking into account sampling and noise problems. The dynamical dimension is the intrinsic dimension of the attractor and does not depend on the quality of the data. Singular Spectrum Analysis (SSA) provides estimates of the statistical dimension. SSA also describes the main physical phenomena reflected by the data. It gives adaptive spectral filters associated with the dominant oscillations of the system and clarifies the noise characteristics of the data. SSA is applied to four paleoclimatic records. The principal climatic oscillations and the regime changes in their amplitude are detected. About 10 degrees of freedom are statistically significant in the data. Large noise and insufficient sample length do not allow reliable estimates of the dynamical dimension.

ToxMiner Software Interface for Visualizing and Analyzing ToxCast Data

EPA Science Inventory

The ToxCast dataset represents a collection of assays and endpoints that will require both standard statistical approaches as well as customized data analysis workflows. To analyze this unique dataset, we have developed an integrated database with Java-based interface called ToxMi...

Data Analysis and Instrumentation Requirements for Evaluating Rail Joints and Rail Fasteners in Urban Track

DOT National Transportation Integrated Search

1975-02-01

Rail fasteners for concrete ties and direct fixation and bolted rail joints have been identified as key components for improving track performance. However, the lack of statistical load data limits the development of improved design criteria and eval...

Analyzing Mixed-Dyadic Data Using Structural Equation Models

ERIC Educational Resources Information Center

Peugh, James L.; DiLillo, David; Panuzio, Jillian

2013-01-01

Mixed-dyadic data, collected from distinguishable (nonexchangeable) or indistinguishable (exchangeable) dyads, require statistical analysis techniques that model the variation within dyads and between dyads appropriately. The purpose of this article is to provide a tutorial for performing structural equation modeling analyses of cross-sectional…

Object Classification Based on Analysis of Spectral Characteristics of Seismic Signal Envelopes

NASA Astrophysics Data System (ADS)

Morozov, Yu. V.; Spektor, A. A.

2017-11-01

A method for classifying moving objects that have a seismic effect on the ground surface is proposed, based on statistical analysis of the envelopes of received signals. The values of the components of the amplitude spectrum of the envelopes, obtained by applying Hilbert and Fourier transforms, are used as classification criteria. Examples illustrating the statistical properties of spectra and the operation of the seismic classifier are given for an ensemble of objects of four classes (person, group of people, large animal, vehicle). It is shown that the computational procedures for processing seismic signals are quite simple and can therefore be used in real-time systems with modest requirements for computational resources.

Transfusion Indication Threshold Reduction (TITRe2) randomized controlled trial in cardiac surgery: statistical analysis plan.

PubMed

Pike, Katie; Nash, Rachel L; Murphy, Gavin J; Reeves, Barnaby C; Rogers, Chris A

2015-02-22

The Transfusion Indication Threshold Reduction (TITRe2) trial is the largest randomized controlled trial to date to compare red blood cell transfusion strategies following cardiac surgery. This update presents the statistical analysis plan, detailing how the study will be analyzed and presented. The statistical analysis plan has been written following recommendations from the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, prior to database lock and the final analysis of trial data. Outlined analyses are in line with the Consolidated Standards of Reporting Trials (CONSORT). The study aims to randomize 2000 patients from 17 UK centres. Patients are randomized to either a restrictive (transfuse if haemoglobin concentration <7.5 g/dl) or liberal (transfuse if haemoglobin concentration <9 g/dl) transfusion strategy. The primary outcome is a binary composite outcome of any serious infectious or ischaemic event in the first 3 months following randomization. The statistical analysis plan details how non-adherence with the intervention, withdrawals from the study, and the study population will be derived and dealt with in the analysis. The planned analyses of the trial primary and secondary outcome measures are described in detail, including approaches taken to deal with multiple testing, model assumptions not being met and missing data. Details of planned subgroup and sensitivity analyses and pre-specified ancillary analyses are given, along with potential issues that have been identified with such analyses and possible approaches to overcome such issues. ISRCTN70923932 .

Evaluation of Next-Generation Vision Testers for Aeromedical Certification of Aviation Personnel

DTIC Science & Technology

2009-07-01

measure distant, intermediate, and near acuity. The slides are essentially abbreviated versions of the Early Treatment for Diabetic Retinopathy Study...over, requiring intermediate vision testing and 12 were color deficient. Analysis was designed to detect statistically significant differences between...Vertical Phoria (Right & Left Hyperphoria) Test scores from each of the vision testers were collated and analyzed. Analysis was designed to detect

Defining the ecological hydrology of Taiwan Rivers using multivariate statistical methods

NASA Astrophysics Data System (ADS)

Chang, Fi-John; Wu, Tzu-Ching; Tsai, Wen-Ping; Herricks, Edwin E.

2009-09-01

Summary: The identification and verification of ecohydrologic flow indicators has found new support as the importance of ecological flow regimes is recognized in modern water resources management, particularly in river restoration and reservoir management. An ecohydrologic indicator system reflecting the unique characteristics of Taiwan's water resources and hydrology has been developed, the Taiwan ecohydrological indicator system (TEIS). A major challenge for the water resources community is using the TEIS to provide environmental flow rules that improve existing water resources management. This paper examines data from the extensive network of flow monitoring stations in Taiwan using TEIS statistics to define and refine environmental flow options in Taiwan. Multivariate statistical methods were used to examine TEIS statistics for 102 stations representing the geographic and land use diversity of Taiwan. The Pearson correlation coefficient showed high multicollinearity between the TEIS statistics. Watersheds were separated into upper and lower-watershed locations. An analysis of variance indicated significant differences between upstream, more natural, and downstream, more developed, locations in the same basin with hydrologic indicator redundancy in flow change and magnitude statistics. Issues of multicollinearity were examined using a Principal Component Analysis (PCA) with the first three components related to general flow and high/low flow statistics, frequency and time statistics, and quantity statistics. These principal components explain about 85% of the total variation. A major conclusion is that managers must be aware of differences among basins, as well as differences within basins that will require careful selection of management procedures to achieve needed flow regimes.

The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

PubMed

Christensen, G B; Knight, S; Camp, N J

2009-11-01

We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
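
The two statistics defined in this abstract can be written down in a few lines. The sketch below covers only the sums at a single locus and not the genome-shuffling permutation procedure. The function names are hypothetical, and the threshold for "nominally significant" is an assumption: it is taken as a one-sided p < 0.05, converted to a LOD score through the usual relation chi-square = 2 ln(10) × LOD, which gives roughly 0.588.

```python
import math
from statistics import NormalDist

# Assumption: nominal significance is one-sided p < 0.05, converted to a LOD
# threshold via chi-square = 2 * ln(10) * LOD (about 0.588).
NOMINAL_LOD = NormalDist().inv_cdf(0.95) ** 2 / (2.0 * math.log(10.0))

def sum_lod(pedigree_lods):
    """sumLOD: sum of the positive multipoint LOD scores at one locus."""
    return sum(lod for lod in pedigree_lods if lod > 0.0)

def sum_link(pedigree_lods, threshold=NOMINAL_LOD):
    """sumLINK: sum of LOD scores for pedigrees at or above the threshold."""
    return sum(lod for lod in pedigree_lods if lod >= threshold)

if __name__ == "__main__":
    # Hypothetical per-pedigree multipoint LOD scores at a single locus
    lods = [1.2, 0.3, -0.8, 0.7, 0.0]
    print("threshold:", round(NOMINAL_LOD, 3))
    print("sumLOD:", sum_lod(lods))
    print("sumLINK:", sum_link(lods))
```

Both sums ignore negative evidence from unlinked or uninformative pedigrees, which is why a permutation-based null distribution, rather than a standard LOD threshold, is needed to judge significance.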

Viewpoint: observations on scaled average bioequivalence.

PubMed

Patterson, Scott D; Jones, Byron

2012-01-01

The two one-sided test procedure (TOST) has been used for average bioequivalence testing since 1992 and is required when marketing new formulations of an approved drug. TOST is known to require comparatively large numbers of subjects to demonstrate bioequivalence for highly variable drugs, defined as those drugs having intra-subject coefficients of variation greater than 30%. However, TOST has been shown to protect public health when multiple generic formulations enter the marketplace following patent expiration. Recently, scaled average bioequivalence (SABE) has been proposed as an alternative statistical analysis procedure for such products by multiple regulatory agencies. SABE testing requires that a three-period partial replicate cross-over or full replicate cross-over design be used. Following a brief summary of SABE analysis methods applied to existing data, we will consider three statistical ramifications of the proposed additional decision rules and the potential impact of implementation of scaled average bioequivalence in the marketplace using simulation. It is found that a constraint being applied is biased, that bias may also result from the common problem of missing data and that the SABE methods allow for much greater changes in exposure when generic-generic switching occurs in the marketplace. Copyright © 2011 John Wiley & Sons, Ltd.
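
The decision rule behind TOST can be sketched as a confidence-interval check on the log scale. This is an illustrative simplification, not a regulatory implementation: the function name is hypothetical, the 0.80 to 1.25 acceptance limits are the conventional ones and are not stated in the abstract, and a normal quantile is used in place of the t quantile that a real cross-over analysis would take from its design-specific degrees of freedom.

```python
import math
from statistics import NormalDist

def tost_bioequivalent(log_ratio, se, alpha=0.05, lower=0.80, upper=1.25):
    """Average bioequivalence check via the 100*(1 - 2*alpha)% interval.

    log_ratio: estimated test/reference difference on the natural-log scale.
    se: standard error of that estimate.
    Assumes a large-sample normal approximation (hypothetical simplification).
    """
    z = NormalDist().inv_cdf(1.0 - alpha)          # 1.645 for alpha = 0.05
    ci_low = math.exp(log_ratio - z * se)          # back-transformed to a ratio
    ci_high = math.exp(log_ratio + z * se)
    equivalent = lower < ci_low and ci_high < upper
    return equivalent, (ci_low, ci_high)

if __name__ == "__main__":
    print(tost_bioequivalent(math.log(1.02), 0.05))   # low variability
    print(tost_bioequivalent(math.log(1.02), 0.20))   # high variability
```

With the same point estimate, the larger standard error widens the interval beyond the fixed limits, which illustrates why highly variable drugs need many more subjects under TOST.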

Reexamining Sample Size Requirements for Multivariate, Abundance-Based Community Research: When Resources are Limited, the Research Does Not Have to Be.

PubMed

Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F

2015-01-01

Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.

Web-TCGA: an online platform for integrated analysis of molecular cancer data sets.

PubMed

Deng, Mario; Brägelmann, Johannes; Schultze, Joachim L; Perner, Sven

2016-02-06

The Cancer Genome Atlas (TCGA) is a pool of molecular data sets publicly accessible and freely available to cancer researchers anywhere around the world. However, widespread use is limited since an advanced knowledge of statistics and statistical software is required. In order to improve accessibility we created Web-TCGA, a web-based, freely accessible online tool, which can also be run in a private instance, for integrated analysis of molecular cancer data sets provided by TCGA. In contrast to already available tools, Web-TCGA utilizes different methods for analysis and visualization of TCGA data, allowing users to generate global molecular profiles across different cancer entities simultaneously. In addition to global molecular profiles, Web-TCGA offers highly detailed gene and tumor entity centric analysis by providing interactive tables and views. As a supplement to other already available tools, such as cBioPortal (Sci Signal 6:pl1, 2013, Cancer Discov 2:401-4, 2012), Web-TCGA offers an analysis service, which does not require any installation or configuration, for molecular data sets available at the TCGA. Individual processing requests (queries) are generated by the user for mutation, methylation, expression and copy number variation (CNV) analyses. The user can focus analyses on results from single genes and cancer entities or perform a global analysis (multiple cancer entities and genes simultaneously).

Training in metabolomics research. II. Processing and statistical analysis of metabolomics data, metabolite identification, pathway analysis, applications of metabolomics and its future

PubMed Central

Barnes, Stephen; Benton, H. Paul; Casazza, Krista; Cooper, Sara; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H.; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K.; Renfrow, Matthew B.; Tiwari, Hemant K.

2017-01-01

Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites, and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. PMID:28239968

Debating Curricular Strategies for Teaching Statistics and Research Methods: What Does the Current Evidence Suggest?

ERIC Educational Resources Information Center

Barron, Kenneth E.; Apple, Kevin J.

2014-01-01

Coursework in statistics and research methods is a core requirement in most undergraduate psychology programs. However, is there an optimal way to structure and sequence methodology courses to facilitate student learning? For example, should statistics be required before research methods, should research methods be required before statistics, or…

Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.

PubMed

Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe

2017-12-27

The present study aimed to evaluate the characteristics and quality of statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of the quality of their statistical methodology. The four main concerns identified were i) use of nonparametric tests in the presence of large samples, coupled with lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient and the consequent lack of cluster-specific approach in data analysis. Owing to these concerns, statistical methodology was judged as inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required to obtain appropriate assessment of the efficacy of desensitizing agents.
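
Concern ii) above, the missing P-value adjustment for multiple comparisons, can be illustrated with a minimal sketch. The abstract does not name a specific correction; the Holm step-down procedure used here is an assumption chosen for illustration, and the function name `holm_adjust` and the input P-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P-values, in the input order.

    The smallest P-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each is
    capped at 1.0.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw P-values from three pairwise comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

With three comparisons, a raw P-value of 0.04 no longer falls below 0.05 after adjustment, which is the kind of change in conclusions the concern refers to.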

Accommodating the Spectrum of Individual Abilities. Clearinghouse Publication 81.

ERIC Educational Resources Information Center

Commission on Civil Rights, Washington, DC.

The monograph addresses legal issues involving discrimination against handicapped persons and the key legal requirement of reasonable accommodation. Four chapters in Part I examine background issues, including definitions and statistical overviews of handicaps; historical attitudes toward handicapped persons and an analysis of the extent of…

Real English Project Report.

ERIC Educational Resources Information Center

Cautin, Harvey; Regan, Edward

Requirements are discussed for an information retrieval language that enables users to employ natural language sentences in interaction with computer-stored files. Anticipated modes of operation of the system are outlined. These are: the search mode, the dictionary mode, the tables mode, and the statistical mode. Analysis of sample sentences…

Hazard Analysis and Safety Requirements for Small Drone Operations: To What Extent Do Popular Drones Embed Safety?

PubMed

Plioutsias, Anastasios; Karanikas, Nektarios; Chatzimihailidou, Maria Mikela

2018-03-01

Currently, published risk analyses for drones refer mainly to commercial systems, use data from civil aviation, and are based on probabilistic approaches without suggesting an inclusive list of hazards and respective requirements. Within this context, this article presents: (1) a set of safety requirements generated from the application of the systems theoretic process analysis (STPA) technique on a generic small drone system; (2) a gap analysis between the set of safety requirements and the ones met by 19 popular drone models; (3) the extent of the differences between those models, their manufacturers, and the countries of origin; and (4) the association of drone prices with the extent they meet the requirements derived by STPA. The application of STPA resulted in 70 safety requirements distributed across the authority, manufacturer, end user, or drone automation levels. A gap analysis showed high dissimilarities regarding the extent to which the 19 drones meet the same safety requirements. Statistical results suggested a positive correlation between drone prices and the extent that the 19 drones studied herein met the safety requirements generated by STPA, and significant differences were identified among the manufacturers. This work complements the existing risk assessment frameworks for small drones, and contributes to the establishment of a commonly endorsed international risk analysis framework. Such a framework will support the development of a holistic and methodologically justified standardization scheme for small drone flights. © 2017 Society for Risk Analysis.

Statistical Issues in Testing Conformance with the Quantitative Imaging Biomarker Alliance (QIBA) Profile Claims.

PubMed

Obuchowski, Nancy A; Buckler, Andrew; Kinahan, Paul; Chen-Mayer, Heather; Petrick, Nicholas; Barboriak, Daniel P; Bullen, Jennifer; Barnhart, Huiman; Sullivan, Daniel C

2016-04-01

A major initiative of the Quantitative Imaging Biomarker Alliance is to develop standards-based documents called "Profiles," which describe one or more technical performance claims for a given imaging modality. The term "actor" denotes any entity (device, software, or person) whose performance must meet certain specifications for the claim to be met. The objective of this paper is to present the statistical issues in testing actors' conformance with the specifications. In particular, we present the general rationale and interpretation of the claims, the minimum requirements for testing whether an actor achieves the performance requirements, the study designs used for testing conformity, and the statistical analysis plan. We use three examples to illustrate the process: apparent diffusion coefficient in solid tumors measured by MRI, change in Perc 15 as a biomarker for the progression of emphysema, and percent change in solid tumor volume by computed tomography as a biomarker for lung cancer progression. Copyright © 2016 The Association of University Radiologists. All rights reserved.

Compression technique for large statistical data bases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eggers, S.J.; Olken, F.; Shoshani, A.

1981-03-01

The compression of large statistical databases is explored, and techniques are proposed for organizing the compressed data such that the time required to access the data is logarithmic. The techniques exploit special characteristics of statistical databases, namely, variation in the space required for the natural encoding of integer attributes, a prevalence of a few repeating values or constants, and the clustering of both data of the same length and constants in long, separate series. The techniques are variations of run-length encoding, in which modified run-lengths for the series are extracted from the data stream and stored in a header, which is used to form the base level of a B-tree index into the database. The run-lengths are cumulative, and therefore the access time of the data is logarithmic in the size of the header. The details of the compression scheme and its implementation are discussed, several special cases are presented, and an analysis is given of the relative performance of the various versions.
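
The idea that cumulative run-lengths give logarithmic access can be sketched as follows. This is a simplified illustration, not the authors' scheme: it uses a flat sorted header searched with binary search rather than a B-tree, and the names `compress` and `lookup` are hypothetical.

```python
from bisect import bisect_right
from itertools import groupby


def compress(values):
    """Run-length encode `values` into (header, run_values).

    header[i] holds the cumulative number of items up to and including
    run i, so the header is sorted and can be binary-searched.
    """
    header, run_values, total = [], [], 0
    for value, group in groupby(values):
        total += sum(1 for _ in group)
        header.append(total)
        run_values.append(value)
    return header, run_values


def lookup(header, run_values, index):
    """Return the original item at position `index` in O(log runs) time."""
    size = header[-1] if header else 0
    if not 0 <= index < size:
        raise IndexError(index)
    # The first run whose cumulative count exceeds `index` contains the item.
    return run_values[bisect_right(header, index)]


if __name__ == "__main__":
    data = [0, 0, 0, 7, 7, 0, 0, 0, 0, 3]
    header, runs = compress(data)
    print(header, runs)
    print(lookup(header, runs, 4))
```

Because the header stores running totals rather than individual run lengths, no prefix sum has to be recomputed at query time, which is what makes the search logarithmic in the size of the header.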

Stationary statistical theory of two-surface multipactor regarding all impacts for efficient threshold analysis

NASA Astrophysics Data System (ADS)

Lin, Shu; Wang, Rui; Xia, Ning; Li, Yongdong; Liu, Chunliang

2018-01-01

Statistical multipactor theories are critical prediction approaches for multipactor breakdown determination. However, these approaches still require a negotiation between the calculation efficiency and accuracy. This paper presents an improved stationary statistical theory for efficient threshold analysis of two-surface multipactor. A general integral equation over the distribution function of the electron emission phase with both the single-sided and double-sided impacts considered is formulated. The modeling results indicate that the improved stationary statistical theory can not only obtain equally good accuracy of multipactor threshold calculation as the nonstationary statistical theory, but also achieve high calculation efficiency concurrently. By using this improved stationary statistical theory, the total time consumption in calculating full multipactor susceptibility zones of parallel plates can be decreased by as much as a factor of four relative to the nonstationary statistical theory. It also shows that the effect of single-sided impacts is indispensable for accurate multipactor prediction of coaxial lines and also more significant for the high order multipactor. Finally, the influence of secondary emission yield (SEY) properties on the multipactor threshold is further investigated. It is observed that the first cross energy and the energy range between the first cross and the SEY maximum both play a significant role in determining the multipactor threshold, which agrees with the numerical simulation results in the literature.

Missing Data and Multiple Imputation: An Unbiased Approach

NASA Technical Reports Server (NTRS)

Foy, M.; VanBaalen, M.; Wear, M.; Mendez, C.; Mason, S.; Meyers, V.; Alexander, D.; Law, J.

2014-01-01

The default method of dealing with missing data in statistical analyses is to only use the complete observations (complete case analysis), which can lead to unexpected bias when data do not meet the assumption of missing completely at random (MCAR). For the assumption of MCAR to be met, missingness cannot be related to either the observed or unobserved variables. A less stringent assumption, missing at random (MAR), requires that missingness not be associated with the value of the missing variable itself, but can be associated with the other observed variables. When data are truly MAR as opposed to MCAR, the default complete case analysis method can lead to biased results. There are statistical options available to adjust for data that are MAR, including multiple imputation (MI) which is consistent and efficient at estimating effects. Multiple imputation uses informing variables to determine statistical distributions for each piece of missing data. Then multiple datasets are created by randomly drawing on the distributions for each piece of missing data. Since MI is efficient, only a limited number, usually less than 20, of imputed datasets are required to get stable estimates. Each imputed dataset is analyzed using standard statistical techniques, and then results are combined to get overall estimates of effect. A simulation study will be demonstrated to show the results of using the default complete case analysis, and MI in a linear regression of MCAR and MAR simulated data. Further, MI was successfully applied to the association study of CO2 levels and headaches when initial analysis showed there may be an underlying association between missing CO2 levels and reported headaches. Through MI, we were able to show that there is a strong association between average CO2 levels and the risk of headaches. Each unit increase in CO2 (mmHg) resulted in a doubling in the odds of reported headaches.
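
The final step described above, combining the results from each imputed dataset into overall estimates, is commonly done with Rubin's rules. The abstract does not state which combining rule was used, so this is a minimal sketch under that assumption; the function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimate of one parameter from each imputed dataset.
    variances: squared standard error of that estimate in each dataset.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations of equal length")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # spread across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical regression slopes and squared standard errors, m = 3.
    est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, var)
```

The between-imputation term is what carries the extra uncertainty due to the missing values; analysing a single imputed dataset would omit it and understate the standard error.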

Calibration and Data Analysis of the MC-130 Air Balance

NASA Technical Reports Server (NTRS)

Booth, Dennis; Ulbrich, N.

2012-01-01

Design, calibration, calibration analysis, and intended use of the MC-130 air balance are discussed. The MC-130 balance is an 8.0 inch diameter force balance that has two separate internal air flow systems and one external bellows system. The manual calibration of the balance consisted of a total of 1854 data points with both unpressurized and pressurized air flowing through the balance. A subset of 1160 data points was chosen for the calibration data analysis. The regression analysis of the subset was performed using two fundamentally different analysis approaches. First, the data analysis was performed using a recently developed extension of the Iterative Method. This approach fits gage outputs as a function of both applied balance loads and bellows pressures while still allowing the application of the iteration scheme that is used with the Iterative Method. Then, for comparison, the axial force was also analyzed using the Non-Iterative Method. This alternate approach directly fits loads as a function of measured gage outputs and bellows pressures and does not require a load iteration. The regression models used by both the extended Iterative and Non-Iterative Method were constructed such that they met a set of widely accepted statistical quality requirements. These requirements lead to reliable regression models and prevent overfitting of data because they ensure that no hidden near-linear dependencies between regression model terms exist and that only statistically significant terms are included. Finally, a comparison of the axial force residuals was performed. Overall, axial force estimates obtained from both methods show excellent agreement as the differences of the standard deviation of the axial force residuals are on the order of 0.001 % of the axial force capacity.

IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

PubMed

Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben

2017-09-15

Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohn's Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS. Contact: zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Local image statistics: maximum-entropy constructions and perceptual salience

PubMed Central

Victor, Jonathan D.; Conte, Mary M.

2012-01-01

The space of visual signals is high-dimensional and natural visual images have a highly complex statistical structure. While many studies suggest that only a limited number of image statistics are used for perceptual judgments, a full understanding of visual function requires analysis not only of the impact of individual image statistics, but also, how they interact. In natural images, these statistical elements (luminance distributions, correlations of low and high order, edges, occlusions, etc.) are intermixed, and their effects are difficult to disentangle. Thus, there is a need for construction of stimuli in which one or more statistical elements are introduced in a controlled fashion, so that their individual and joint contributions can be analyzed. With this as motivation, we present algorithms to construct synthetic images in which local image statistics—including luminance distributions, pair-wise correlations, and higher-order correlations—are explicitly specified and all other statistics are determined implicitly by maximum-entropy. We then apply this approach to measure the sensitivity of the human visual system to local image statistics and to sample their interactions. PMID:22751397

Evidence for a Global Sampling Process in Extraction of Summary Statistics of Item Sizes in a Set.

PubMed

Tokita, Midori; Ueda, Sachiyo; Ishiguchi, Akira

2016-01-01

Several studies have shown that our visual system may construct a "summary statistical representation" over groups of visual objects. Although there is a general understanding that human observers can accurately represent sets of a variety of features, many questions on how summary statistics, such as an average, are computed remain unanswered. This study investigated sampling properties of visual information used by human observers to extract two types of summary statistics of item sets, average and variance. We presented three models of ideal observers to extract the summary statistics: a global sampling model without sampling noise, global sampling model with sampling noise, and limited sampling model. We compared the performance of an ideal observer of each model with that of human observers using statistical efficiency analysis. Results suggest that summary statistics of items in a set may be computed without representing individual items, which makes it possible to discard the limited sampling account. Moreover, the extraction of summary statistics may not necessarily require the representation of individual objects with focused attention when the sets of items are larger than 4.

MetaGenyo: a web tool for meta-analysis of genetic association studies.

PubMed

Martorell-Marugan, Jordi; Toro-Dominguez, Daniel; Alarcon-Riquelme, Marta E; Carmona-Saez, Pedro

2017-12-16

Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of this type of study has increased exponentially, but the results are not always reproducible due to experimental designs, low sample sizes and other methodological errors. In this field, meta-analysis techniques are becoming very popular tools to combine results across studies to increase statistical power and to resolve discrepancies in genetic association studies. A meta-analysis summarizes research findings, increases statistical power and enables the identification of genuine associations between genotypes and phenotypes. Meta-analysis techniques are increasingly used in GAS, but the number of published meta-analyses containing errors is also increasing. Although there are several software packages that implement meta-analysis, none of them are specifically designed for genetic association studies and in most cases their use requires advanced programming or scripting expertise. We have developed MetaGenyo, a web tool for meta-analysis in GAS. MetaGenyo implements a complete and comprehensive workflow that can be executed in an easy-to-use environment without programming knowledge. MetaGenyo has been developed to guide users through the main steps of a GAS meta-analysis, covering Hardy-Weinberg test, statistical association for different genetic models, analysis of heterogeneity, testing for publication bias, subgroup analysis and robustness testing of the results. MetaGenyo is a useful tool to conduct comprehensive genetic association meta-analysis. The application is freely available at http://bioinfo.genyo.es/metagenyo/.

Critical discussion of evaluation parameters for inter-observer variability in target definition for radiation therapy.

PubMed

Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D

2012-02-01

Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
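
Of the overlap measures named above, the Jaccard coefficient is the simplest to state: the size of the intersection of two contoured volumes divided by the size of their union. The sketch below assumes each contour has already been rasterized to a set of voxel coordinates; the function name `jaccard` and the toy voxel sets are hypothetical.

```python
def jaccard(contour_a, contour_b):
    """Jaccard coefficient of two contours given as iterables of voxels.

    Each voxel is any hashable coordinate, e.g. an (x, y, z) tuple.
    Returns |A intersect B| / |A union B|, a value in [0, 1].
    """
    a, b = set(contour_a), set(contour_b)
    union = a | b
    if not union:
        raise ValueError("both contours are empty")
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Two hypothetical observers delineating overlapping 2-D regions.
    observer_1 = {(0, 0), (0, 1), (1, 0)}
    observer_2 = {(0, 1), (1, 0), (1, 1)}
    print(jaccard(observer_1, observer_2))
```

A value of 1 means the two observers drew identical contours and 0 means no overlap at all; on its own it says nothing about where the disagreement lies, which is one reason the abstract recommends reporting it alongside descriptive statistics and a measure of agreement.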

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zdarek, J.; Pecinka, L.

Leak-before-break (LBB) analysis of WWER type reactors in the Czech and Slovak Republics is summarized in this paper. Legislative bases, required procedures, and validation and verification of procedures are discussed. A list of significant issues identified during the application of LBB analysis is presented. The results of statistical evaluation of crack length characteristics are presented and compared for the WWER 440 Type 230 and 213 reactors and for the WWER 1000 Type 302, 320 and 338 reactors.

Training Effectiveness Assessment. Volume II. Problems, Concepts, and Evaluation Alternatives.

DTIC Science & Technology

1976-12-01

information about areas where course improvement might be indicated. Percentiles, pretest and posttest scores, or other measures of amount...statistical sophistication. Interpretation of gain scores derived from pretests-posttests of trainees and other forms of trend analysis requires...CPM), computer-managed testing (CMI), time-series analysis, pretest/posttest design, and secondary analysis. Criterion-referenced measurement is

Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer

DTIC Science & Technology

2015-09-01

EPICOPY to obtain reliable copy number variation (CNV) data from the methylome array data, thereby decreasing the DNA requirements in half...in the R statistical environment. Samples were assessed for good performance on the array using detection p-values, a metric implemented by...Illumina to identify probes detected with confidence. Samples less than 90% of probes detected were removed from the analysis and probes undetected in any

Summary and Statistical Analysis of the First AIAA Sonic Boom Prediction Workshop

NASA Technical Reports Server (NTRS)

Park, Michael A.; Morgenstern, John M.

2014-01-01

A summary is provided for the First AIAA Sonic Boom Workshop held 11 January 2014 in conjunction with AIAA SciTech 2014. Near-field pressure signatures extracted from computational fluid dynamics solutions are gathered from nineteen participants representing three countries for the two required cases, an axisymmetric body and simple delta wing body. Structured multiblock, unstructured mixed-element, unstructured tetrahedral, overset, and Cartesian cut-cell methods are used by the participants. Participants provided signatures computed on participant-generated and solution-adapted grids. Signatures are also provided for a series of uniformly refined workshop-provided grids. These submissions are propagated to the ground and loudness measures are computed. This allows the grid convergence of a loudness measure and a validation metric (difference norm between computed and wind tunnel measured near-field signatures) to be studied for the first time. Statistical analysis is also presented for these measures. An optional configuration includes fuselage, wing, tail, flow-through nacelles, and blade sting. This full configuration exhibits more variation in eleven submissions than the sixty submissions provided for each required case. Recommendations are provided for potential improvements to the analysis methods and a possible subsequent workshop.

Silt fences: An economical technique for measuring hillslope soil erosion

Treesearch

Peter R. Robichaud; Robert E. Brown

2002-01-01

Measuring hillslope erosion has historically been a costly, time-consuming practice. An easy to install low-cost technique using silt fences (geotextile fabric) and tipping bucket rain gauges to measure onsite hillslope erosion was developed and tested. Equipment requirements, installation procedures, statistical design, and analysis methods for measuring hillslope...

Introducing Mathematics to Information Problem-Solving Tasks: Surface or Substance?

ERIC Educational Resources Information Center

Erickson, Ander

2017-01-01

This study employs a cross-case analysis in order to explore the demands and opportunities that arise when information problem-solving tasks are introduced into college mathematics classes. Professors at three universities collaborated with me to develop statistics-related activities that required students to engage in research outside the…

Bioassessment Tools for Stony Corals: Monitoring Approaches and Proposed Sampling Plan for the U.S. Virgin Islands

EPA Science Inventory

This document describes three general approaches to the design of a sampling plan for biological monitoring of coral reefs. Status assessment, trend detection and targeted monitoring each require a different approach to site selection and statistical analysis. For status assessm...

Joint QTL linkage mapping for multiple-cross mating design sharing one common parent

USDA-ARS's Scientific Manuscript database

Nested association mapping (NAM) is a novel genetic mating design that combines the advantages of linkage analysis and association mapping. This design provides opportunities to study the inheritance of complex traits, but also requires more advanced statistical methods. In this paper, we present th...

Collected Notes on the Workshop for Pattern Discovery in Large Databases

NASA Technical Reports Server (NTRS)

Buntine, Wray (Editor); Delalto, Martha (Editor)

1991-01-01

These collected notes are a record of material presented at the Workshop. Core data analysis tasks that have traditionally required statistical or pattern recognition techniques are addressed. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.

Developing and Assessing E-Learning Techniques for Teaching Forecasting

ERIC Educational Resources Information Center

Gel, Yulia R.; O'Hara Hines, R. Jeanette; Chen, He; Noguchi, Kimihiro; Schoner, Vivian

2014-01-01

In the modern business environment, managers are increasingly required to perform decision making and evaluate related risks based on quantitative information in the face of uncertainty, which in turn increases demand for business professionals with sound skills and hands-on experience with statistical data analysis. Computer-based training…

Teaching Graduate Business Students to Write Clearly about Technical Topics

ERIC Educational Resources Information Center

Dyrud, Marilyn A.; Worley, Rebecca B.; Jameson, Daphne

2006-01-01

Graduate programs in business emphasize technical analysis in finance, accounting, marketing, and other core courses. Important business decisions--what market to target, which products to offer, how to finance an acquisition, whether to lease or buy equipment--require mathematical and statistical problem solving. Management communication courses…

Simulating Ordinal Data

ERIC Educational Resources Information Center

Ferrari, Pier Alda; Barbiero, Alessandro

2012-01-01

The increasing use of ordinal variables in different fields has led to the introduction of new statistical methods for their analysis. The performance of these methods needs to be investigated under a number of experimental conditions. Procedures to simulate from ordinal variables are then required. In this article, we deal with simulation from…

Open Source Tools for Seismicity Analysis

NASA Astrophysics Data System (ADS)

Powers, P.

2010-12-01

The spatio-temporal analysis of seismicity plays an important role in earthquake forecasting and is integral to research on earthquake interactions and triggering. For instance, the third version of the Uniform California Earthquake Rupture Forecast (UCERF), currently under development, will use Epidemic Type Aftershock Sequences (ETAS) as a model for earthquake triggering. UCERF will be a "living" model and therefore requires robust, tested, and well-documented ETAS algorithms to ensure transparency and reproducibility. Likewise, as earthquake aftershock sequences unfold, real-time access to high quality hypocenter data makes it possible to monitor the temporal variability of statistical properties such as the parameters of the Omori Law and the Gutenberg-Richter b-value. Such statistical properties are valuable as they provide a measure of how much a particular sequence deviates from expected behavior and can be used when assigning probabilities of aftershock occurrence. To address these demands and provide public access to standard methods employed in statistical seismology, we present well-documented, open-source JavaScript and Java software libraries for the on- and off-line analysis of seismicity. The JavaScript classes facilitate web-based asynchronous access to earthquake catalog data and provide a framework for in-browser display, analysis, and manipulation of catalog statistics; implementations of this framework will be made available on the USGS Earthquake Hazards website. The Java classes, in addition to providing tools for seismicity analysis, provide tools for modeling seismicity and generating synthetic catalogs. These tools are extensible and will be released as part of the open-source OpenSHA Commons library.
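
As an illustration of one of the catalog statistics mentioned above, the Gutenberg-Richter b-value is often estimated with Aki's maximum-likelihood formula, b = log10(e) / (mean(M) - Mc), with Utsu's half-bin correction for binned magnitudes. The sketch below is written in Python rather than the Java or JavaScript of the libraries described, and it is not taken from them; the function name `b_value`, its parameters, and the magnitudes are hypothetical.

```python
import math


def b_value(magnitudes, completeness, bin_width=0.0):
    """Maximum-likelihood Gutenberg-Richter b-value (Aki estimator).

    magnitudes:   catalog magnitudes.
    completeness: magnitude of completeness Mc; smaller events are dropped.
    bin_width:    magnitude binning interval; Mc is shifted down by half a
                  bin (Utsu's correction). Use 0.0 for unbinned magnitudes.
    """
    mags = [m for m in magnitudes if m >= completeness]
    if not mags:
        raise ValueError("no events at or above the completeness magnitude")
    excess = sum(mags) / len(mags) - (completeness - bin_width / 2)
    if excess <= 0:
        raise ValueError("mean magnitude must exceed the corrected Mc")
    return math.log10(math.e) / excess


if __name__ == "__main__":
    # Hypothetical aftershock magnitudes with Mc = 2.0.
    print(b_value([2.0, 2.5, 3.0, 2.5, 1.2], completeness=2.0))
```

Recomputing this estimate over a sliding window of events is one way to monitor the temporal variability the abstract refers to, though the estimate is sensitive to the choice of Mc.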

Statistical assessment on a combined analysis of GRYN-ROMN-UCBN upland vegetation vital signs

USGS Publications Warehouse

Irvine, Kathryn M.; Rodhouse, Thomas J.

2014-01-01

As of 2013, Rocky Mountain and Upper Columbia Basin Inventory and Monitoring Networks have multiple years of vegetation data and Greater Yellowstone Network has three years of vegetation data and monitoring is ongoing in all three networks. Our primary objective is to assess whether a combined analysis of these data aimed at exploring correlations with climate and weather data is feasible. We summarize the core survey design elements across protocols and point out the major statistical challenges for a combined analysis at present. The dissimilarity in response designs between ROMN and UCBN-GRYN network protocols presents a statistical challenge that has not been resolved yet. However, the UCBN and GRYN data are compatible as they implement a similar response design; therefore, a combined analysis is feasible and will be pursued in future. When data collected by different networks are combined, the survey design describing the merged dataset is (likely) a complex survey design. A complex survey design is the result of combining datasets from different sampling designs. A complex survey design is characterized by unequal probability sampling, varying stratification, and clustering (see Lohr 2010 Chapter 7 for general overview). Statistical analysis of complex survey data requires modifications to standard methods, one of which is to include survey design weights within a statistical model. We focus on this issue for a combined analysis of upland vegetation from these networks, leaving other topics for future research. We conduct a simulation study on the possible effects of equal versus unequal probability selection of points on parameter estimates of temporal trend using available packages within the R statistical computing package. 
We find that, as written, using lmer or lm for trend detection in a continuous response and clm and clmm for visually estimated cover classes with “raw” GRTS design weights specified for the weight argument leads to substantially different results and/or computational instability. However, when only fixed effects are of interest, the survey package (svyglm and svyolr) may be suitable for a model-assisted analysis for trend. We provide possible directions for future research into combined analysis for ordinal and continuous vital sign indicators.

Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis.

PubMed

Neyeloff, Jeruza L; Fuchs, Sandra C; Moreira, Leila B

2012-01-20

Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software has limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software.
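The arithmetic behind fixed-effect and random-effects pooling can be sketched in a few lines of Python. This is only an illustration of inverse-variance weighting; the abstract does not say which between-study variance estimator the spreadsheet uses, so the DerSimonian-Laird estimator below is an assumption, as are the function name `pool_effects` and the example numbers.

```python
import math


def pool_effects(effects, variances):
    """Inverse-variance pooling of per-study effects (hypothetical helper).

    Returns fixed-effect and random-effects summaries. The between-study
    variance tau2 uses the DerSimonian-Laird moment estimator (an assumption).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_eff = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "q": q,
        "tau2": tau2,
        "random": random_eff,
        "se_random": math.sqrt(1.0 / sum_w_re),
    }


# Made-up study effects and variances, for illustration only
result = pool_effects([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

A forest plot then draws each study's effect with its confidence interval and the pooled estimate (here `result["fixed"]` or `result["random"]`, plus or minus 1.96 standard errors) as the summary row.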

Meta-analyses and Forest plots using a microsoft excel spreadsheet: step-by-step guide focusing on descriptive data analysis

PubMed Central

2012-01-01

Background: Meta-analyses are necessary to synthesize data obtained from primary research, and in many situations reviews of observational studies are the only available alternative. General purpose statistical packages can meta-analyze data, but usually require external macros or coding. Commercial specialist software is available, but may be expensive and focused on a particular type of primary data. Most available software has limitations in dealing with descriptive data, and the graphical display of summary statistics such as incidence and prevalence is unsatisfactory. Analyses can be conducted using Microsoft Excel, but there was no previous guide available. Findings: We constructed a step-by-step guide to perform a meta-analysis in a Microsoft Excel spreadsheet, using either fixed-effect or random-effects models. We have also developed a second spreadsheet capable of producing customized forest plots. Conclusions: It is possible to conduct a meta-analysis using only Microsoft Excel. More important, to our knowledge this is the first description of a method for producing a statistically adequate but graphically appealing forest plot summarizing descriptive data, using widely available software. PMID:22264277

The problem of pseudoreplication in neuroscientific studies: is it affecting your analysis?

PubMed Central

2010-01-01

Background: Pseudoreplication occurs when observations are not statistically independent, but treated as if they are. This can occur when there are multiple observations on the same subjects, when samples are nested or hierarchically organised, or when measurements are correlated in time or space. Analysis of such data without taking these dependencies into account can lead to meaningless results, and examples can easily be found in the neuroscience literature. Results: A single issue of Nature Neuroscience provided a number of examples and is used as a case study to highlight how pseudoreplication arises in neuroscientific studies and why the analyses in these papers are incorrect; appropriate analytical methods are also provided. 12% of papers had pseudoreplication and a further 36% were suspected of having pseudoreplication, but it was not possible to determine for certain because insufficient information was provided. Conclusions: Pseudoreplication can undermine the conclusions of a statistical analysis, and it would be easier to detect if the sample size, degrees of freedom, the test statistic, and precise p-values were reported. This information should be a requirement for all publications. PMID:20074371
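A small Python sketch can make the problem concrete. One common remedy, assumed here for illustration and not necessarily the method recommended in the paper, is to average repeated measurements within each subject before testing, so that the degrees of freedom reflect the number of subjects rather than the number of observations. The data and helper names are made up.

```python
import math
from statistics import mean, stdev


def subject_means(observations):
    """Collapse (subject_id, value) pairs to one mean per subject."""
    by_subject = {}
    for subject_id, value in observations:
        by_subject.setdefault(subject_id, []).append(value)
    return [mean(values) for values in by_subject.values()]


def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2


# Two subjects per group, two measurements per subject (illustrative values)
control = [("c1", 1.0), ("c1", 1.2), ("c2", 2.0), ("c2", 2.2)]
treated = [("t1", 2.0), ("t1", 2.2), ("t2", 3.0), ("t2", 3.2)]

# Pseudoreplicated analysis: every measurement treated as independent
t_naive, df_naive = pooled_t([v for _, v in control], [v for _, v in treated])

# Analysis on subject means: the subject is the unit of replication
t_subj, df_subj = pooled_t(subject_means(control), subject_means(treated))
```

In this toy data set the naive analysis reports 6 degrees of freedom and a larger test statistic, while the subject-level analysis has only 2 degrees of freedom, which is why reporting sample size and degrees of freedom helps readers detect the problem.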

2D Affine and Projective Shape Analysis.

PubMed

Bryner, Darshan; Klassen, Eric; Huiling Le; Srivastava, Anuj

2014-05-01

Current techniques for shape analysis tend to seek invariance to similarity transformations (rotation, translation, and scale), but certain imaging situations require invariance to larger groups, such as affine or projective groups. Here we present a general Riemannian framework for shape analysis of planar objects where metrics and related quantities are invariant to affine and projective groups. Highlighting two possibilities for representing object boundaries (ordered points, or landmarks, and parameterized curves), we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parameterized curves, we also achieve the desired goal of invariance to re-parameterizations. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition.

Predictive data modeling of human type II diabetes related statistics

NASA Astrophysics Data System (ADS)

Jaenisch, Kristina L.; Jaenisch, Holger M.; Handley, James W.; Albritton, Nathaniel G.

2009-04-01

During the course of routine Type II diabetes treatment of one of the authors, it was decided to derive predictive analytical Data Models of the daily sampled vital statistics (namely weight, blood pressure, and blood sugar) to determine whether the covariance among the observed variables could yield a descriptive equation-based model or, better still, a predictive analytical model that could forecast the expected future trend of the variables and possibly reduce the number of finger sticks required to monitor blood sugar levels. The personal history and analysis with resulting models are presented.

Clustangles: An Open Library for Clustering Angular Data.

PubMed

Sargsyan, Karen; Hua, Yun Hao; Lim, Carmay

2015-08-24

Dihedral angles are good descriptors of the numerous conformations visited by large, flexible systems, but their analysis requires directional statistics. A single package including the various multivariate statistical methods for angular data that accounts for the distinct topology of such data does not exist. Here, we present a lightweight standalone, operating-system independent package called Clustangles to fill this gap. Clustangles will be useful in analyzing the ever-increasing number of structures in the Protein Data Bank and clustering the copious conformations from increasingly long molecular dynamics simulations.
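To see why angular data need directional statistics, consider the mean of two dihedral angles on either side of the ±180° boundary. The Python sketch below computes a circular mean and the mean resultant length; it is a generic illustration and does not reproduce the Clustangles interface, and the function names are assumptions.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of angles in degrees, returned in (-180, 180]."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))


def mean_resultant_length(angles_deg):
    """Concentration measure in [0, 1]; values near 1 mean tightly clustered angles."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.hypot(s, c) / n


# 170 and -170 degrees are only 20 degrees apart on the circle
wrapped = circular_mean_deg([170.0, -170.0])   # close to 180 (or -180), not 0
near_zero = circular_mean_deg([350.0, 10.0])   # close to 0, not 180
spread = mean_resultant_length([0.0, 180.0])   # opposite directions cancel
```

The arithmetic mean of 170° and -170° is 0°, which points in the opposite direction from both observations; the circular mean avoids this by averaging unit vectors instead of raw angle values.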

Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

NASA Astrophysics Data System (ADS)

Müller, M. F.; Thompson, S. E.

2015-09-01

The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
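The Nash-Sutcliffe coefficient used as the performance measure above compares the squared error of a prediction against the variance of the observations. The following Python sketch shows the standard definition; how the study evaluates it on flow duration curves (for example, on which quantiles or with which transformation) is not stated in the abstract, so the inputs here are purely illustrative.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    obs_mean = sum(observed) / len(observed)
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sq_dev = sum((o - obs_mean) ** 2 for o in observed)
    if sq_dev == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - sq_err / sq_dev


# Made-up flow quantiles, for illustration only
obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe(obs, obs)
nse_mean = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])
```

A value above 0.80, as reported for 75 % of the tested catchments, means the squared prediction error is less than a fifth of the observed variance.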

Comparing statistical and process-based flow duration curve models in ungauged basins and changing rain regimes

NASA Astrophysics Data System (ADS)

Müller, M. F.; Thompson, S. E.

2016-02-01

The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.

Results of the Verification of the Statistical Distribution Model of Microseismicity Emission Characteristics

NASA Astrophysics Data System (ADS)

Cianciara, Aleksander

2016-09-01

The paper presents the results of research aimed at verifying the hypothesis that the Weibull distribution is an appropriate statistical distribution model of microseismicity emission characteristics, namely the energy of phenomena and the inter-event time. It is understood that the emission under consideration is induced by natural rock mass fracturing. Because the recorded emission contains noise, it is subjected to appropriate filtering. The study has been conducted using the method of statistical verification of the null hypothesis that the Weibull distribution fits the empirical cumulative distribution function. As the model describing the cumulative distribution function is given in an analytical form, its verification may be performed using the Kolmogorov-Smirnov goodness-of-fit test. Interpretations by means of probabilistic methods require specifying the correct model describing the statistical distribution of data, because these methods do not use the measurement data directly but rather their statistical distributions, e.g., in the method based on hazard analysis or in the one that uses maximum value statistics.
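The Kolmogorov-Smirnov statistic for a Weibull model can be computed directly from the analytical cumulative distribution function. The Python sketch below is a generic illustration, not the author's procedure; the shape and scale values, function names, and sample are assumptions.

```python
import math


def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF: F(x) = 1 - exp(-(x / scale) ** shape)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Synthetic sample placed at the mid-quantiles of a Weibull(shape=1, scale=1)
n = 5
sample = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]
d_stat = ks_statistic(sample, lambda x: weibull_cdf(x, 1.0, 1.0))
```

For large samples and a fully specified model, the 5% critical value is approximately 1.36 divided by the square root of the sample size. If the Weibull parameters are estimated from the same data, the standard critical values no longer apply and are too lenient.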

The impact of implementation of the requirements of Standard No. OHSAS 18001:2007 to reduce the number of injuries at work and financial costs in the Republic of Croatia.

PubMed

Palačić, Darko

2017-06-01

This article contains the results of research into the impact of implementation of the requirements mentioned in Standard No. OHSAS 18001:2007 to reduce the number of injuries at work and the financial costs incurred in this way. The study was conducted on a determined sample by a written questionnaire survey method in the Republic of Croatia. The objective of the empirical research is to determine the impact of implementation of the requirements of Standard No. OHSAS 18001:2007 to reduce the number of injuries at work and financial costs in Croatia in business organizations that implement these requirements. To provide a broader picture, the research included the collection and analysis of data on the impact of the Standard No. OHSAS 18001:2007 on accidents and fatalities at work. Research findings are based on statistical analysis of the collected data, using correlation and regression analysis.

Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

DOE PAGES

Kacprzak, T.; Kirk, D.; Friedrich, O.; ...

2016-08-19

Shear peak statistics has gained a lot of attention recently as a practical alternative to the two point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0<\mathcal S / \mathcal N<4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s=1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_{8}(\Omega_{\rm m}/0.3)^{0.6}=0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal S / \mathcal N>4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.

Design of point-of-care (POC) microfluidic medical diagnostic devices

NASA Astrophysics Data System (ADS)

Leary, James F.

2018-02-01

Design of inexpensive and portable hand-held microfluidic flow/image cytometry devices for initial medical diagnostics at the point of initial patient contact by emergency medical personnel in the field requires careful design in terms of power/weight requirements to allow for realistic portability as a hand-held, point-of-care medical diagnostics device. True portability also requires small micro-pumps for high-throughput capability. Weight/power requirements dictate use of super-bright LEDs and very small silicon photodiodes or nanophotonic sensors that can be powered by batteries. Signal-to-noise characteristics can be greatly improved by appropriately pulsing the LED excitation sources and sampling and subtracting noise in between excitation pulses. The requirements for basic computing, imaging, GPS and basic telecommunications can be simultaneously met by use of smartphone technologies, which become part of the overall device. Software for a user-interface system, limited real-time computing, real-time imaging, and offline data analysis can be accomplished through multi-platform software development systems that are well-suited to a variety of currently available cellphone technologies which already contain all of these capabilities. Microfluidic cytometry requires judicious use of small sample volumes and appropriate statistical sampling by microfluidic cytometry or imaging for adequate statistical significance to permit real-time (typically < 15 minutes) medical decisions for patients at the physician's office or real-time decision making in the field. One or two drops of blood obtained by pin-prick should be able to provide statistically meaningful results for use in making real-time medical decisions without the need for blood fractionation, which is not realistic in the field.

New robust statistical procedures for the polytomous logistic regression models.

PubMed

Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

2018-05-17

This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

Certification of medical librarians, 1949--1977 statistical analysis.

PubMed

Schmidt, D

1979-01-01

The Medical Library Association's Code for Training and Certification of Medical Librarians was in effect from 1949 to August 1977, a period during which 3,216 individuals were certified. Statistics on each type of certificate granted each year are provided. Because 54.5% of those granted certification were awarded it in the last three-year, two-month period of the code's existence, these applications are reviewed in greater detail. Statistics on MLA membership, sex, residence, library school, and method of meeting requirements are detailed. Questions relating to certification under the code now in existence are raised.

Certification of medical librarians, 1949--1977 statistical analysis.

PubMed Central

Schmidt, D

1979-01-01

The Medical Library Association's Code for Training and Certification of Medical Librarians was in effect from 1949 to August 1977, a period during which 3,216 individuals were certified. Statistics on each type of certificate granted each year are provided. Because 54.5% of those granted certification were awarded it in the last three-year, two-month period of the code's existence, these applications are reviewed in greater detail. Statistics on MLA membership, sex, residence, library school, and method of meeting requirements are detailed. Questions relating to certification under the code now in existence are raised. PMID:427287

A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls

PubMed Central

Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique LCT; Heeren, Ron MA; Sillevis Smitt, Peter A; Luider, Theo M

2006-01-01

Background: Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. Results: A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. Conclusion: The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from control groups using large datasets. 
The modular architecture of the application makes it possible to adapt the application to also handle large data sets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles. PMID:16953879
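The rank-sum comparison named above can be illustrated with a short Python sketch of the Mann-Whitney U statistic. The application itself performs this test in R; the code below is a separate, simplified illustration with made-up intensities and assumed function names, and its normal approximation omits the tie and continuity corrections.

```python
import math


def mann_whitney_u(x, y):
    """Return (U_x, U_y): U_x counts pairs with x_i > y_j, ties count one half."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


def normal_approx_z(u, n1, n2):
    """Large-sample z score for U, without tie or continuity correction."""
    mu = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mu) / sd


# Hypothetical peak intensities of one peptide in control and patient samples
controls = [1.0, 2.0, 3.0]
patients = [4.0, 5.0, 6.0]
u_controls, u_patients = mann_whitney_u(controls, patients)
z = normal_approx_z(u_controls, len(controls), len(patients))
```

Applied to one peptide mass at a time across the profile matrix, a test like this flags peaks whose intensities differ between groups; with many peptides tested, some correction for multiple comparisons is normally needed.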

A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

PubMed

Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique L C T; Heeren, Ron M A; Sillevis Smitt, Peter A; Luider, Theo M

2006-09-05

Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large data sets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. 
It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles.

The 1993 Mississippi river flood: A one hundred or a one thousand year event?

USGS Publications Warehouse

Malamud, B.D.; Turcotte, D.L.; Barton, C.C.

1996-01-01

Power-law (fractal) extreme-value statistics are applicable to many natural phenomena under a wide variety of circumstances. Data from a hydrologic station in Keokuk, Iowa, show that the great flood of the Mississippi River in 1993 has a recurrence interval on the order of 100 years using power-law statistics applied to partial-duration flood series and on the order of 1,000 years using a log-Pearson type 3 (LP3) distribution applied to annual series. The LP3 distribution is the federally adopted probability distribution for flood-frequency estimation of extreme events. We suggest that power-law statistics are preferable to LP3 analysis. As a further test of the power-law approach we consider paleoflood data from the Colorado River. We compare power-law and LP3 extrapolations of historical data with these paleofloods. The results are remarkably similar to those obtained for the Mississippi River: Recurrence intervals from power-law statistics applied to Lees Ferry discharge data are generally consistent with inferred 100- and 1,000-year paleofloods, whereas LP3 analysis gives recurrence intervals that are orders of magnitude longer. For both the Keokuk and Lees Ferry gauges, the use of an annual series introduces an artificial curvature in log-log space that leads to an underestimate of severe floods. Power-law statistics predict much shorter recurrence intervals than the federally adopted LP3 statistics. We suggest that if power-law behavior is applicable, then the likelihood of severe floods is much higher. More conservative dam designs and land-use restrictions may be required.

Upper Atmosphere Research Satellite (UARS) onboard attitude determination using a Kalman filter

NASA Technical Reports Server (NTRS)

Garrick, Joseph

1993-01-01

The Upper Atmosphere Research Satellite (UARS) requires a highly accurate knowledge of its attitude to accomplish its mission. Propagation of the attitude state using gyro measurements is not sufficient to meet the accuracy requirements, and must be supplemented by an observer/compensation process to correct for dynamics and observation anomalies. The process of amending the attitude state utilizes a well known method, the discrete Kalman Filter. This study is a sensitivity analysis of the discrete Kalman Filter as implemented in the UARS Onboard Computer (OBC). The stability of the Kalman Filter used in the normal on-orbit control mode within the OBC is investigated for the effects of corrupted observations and nonlinear errors. Also, a statistical analysis on the residuals of the Kalman Filter is performed. These analyses are based on simulations using the UARS Dynamics Simulator (UARSDSIM) and compared against attitude requirements as defined by General Electric (GE). An independent verification of expected accuracies is performed using the Attitude Determination Error Analysis System (ADEAS).
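The predict/update cycle of a discrete Kalman filter, and the residuals (innovations) on which a statistical analysis can be run, can be shown with a scalar Python sketch. This is a textbook one-dimensional filter with a constant-state model, not the UARS onboard implementation; all names and numbers are assumptions.

```python
def kalman_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar discrete Kalman filter for a constant-state model.

    Returns the state estimates and the innovations (measurement residuals).
    """
    x, p = x0, p0
    estimates, innovations = [], []
    for z in measurements:
        # Predict: the state is assumed constant, so only the variance grows
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain
        innovation = z - x
        gain = p / (p + meas_var)
        x = x + gain * innovation
        p = (1.0 - gain) * p
        estimates.append(x)
        innovations.append(innovation)
    return estimates, innovations


# Two identical measurements of a quantity whose true value is 1.0
estimates, innovations = kalman_1d([1.0, 1.0], x0=0.0, p0=1.0,
                                   process_var=0.0, meas_var=1.0)
```

For a well-tuned filter the innovations should look like zero-mean noise with the predicted variance; a persistent bias or excess spread in the residuals points to corrupted observations or unmodeled errors, which is the kind of effect a sensitivity study examines.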

Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

PubMed

Ma, Yan; Mazumdar, Madhu

2011-10-30

Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform as well as REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

Specification of the ISS Plasma Environment Variability

NASA Technical Reports Server (NTRS)

Minow, Joseph I.; Neergaard, Linda F.; Bui, Them H.; Mikatarian, Ronald R.; Barsamian, H.; Koontz, Steven L.

2002-01-01

Quantifying the spacecraft charging risks and corresponding hazards for the International Space Station (ISS) requires a plasma environment specification describing the natural variability of ionospheric temperature (Te) and density (Ne). Empirical ionospheric specification and forecast models such as the International Reference Ionosphere (IRI) model typically only provide estimates of long term (seasonal) mean Te and Ne values for the low Earth orbit environment. Knowledge of the Te and Ne variability as well as the likelihood of extreme deviations from the mean values is required to estimate both the magnitude and frequency of occurrence of potentially hazardous spacecraft charging environments for a given ISS construction stage and flight configuration. This paper describes the statistical analysis of historical ionospheric low Earth orbit plasma measurements used to estimate Ne, Te variability in the ISS flight environment. The statistical variability analysis of Ne and Te enables calculation of the expected frequency of occurrence of any particular values of Ne and Te, especially those that correspond to possibly hazardous spacecraft charging environments. The database used in the original analysis included measurements from the AE-C, AE-D, and DE-2 satellites. Recent work on the database has added additional satellites to the database and ground based incoherent scatter radar observations as well. Deviations of the data values from the IRI estimated Ne, Te parameters for each data point provide a statistical basis for modeling the deviations of the plasma environment from the IRI model output. This technique, while developed specifically for the Space Station analysis, can also be generalized to provide ionospheric plasma environment risk specification models for low Earth orbit over an altitude range of 200 km through approximately 1000 km.

Robust Strategy for Rocket Engine Health Monitoring

NASA Technical Reports Server (NTRS)

Santi, L. Michael

2001-01-01

Monitoring the health of rocket engine systems is essentially a two-phase process. The acquisition phase involves sensing physical conditions at selected locations, converting physical inputs to electrical signals, conditioning the signals as appropriate to establish scale or filter interference, and recording results in a form that is easy to interpret. The inference phase involves analysis of results from the acquisition phase, comparison of analysis results to established health measures, and assessment of health indications. A variety of analytical tools may be employed in the inference phase of health monitoring. These tools can be separated into three broad categories: statistical, rule based, and model based. Statistical methods can provide excellent comparative measures of engine operating health. They require well-characterized data from an ensemble of "typical" engines, or "golden" data from a specific test assumed to define the operating norm in order to establish reliable comparative measures. Statistical methods are generally suitable for real-time health monitoring because they do not deal with the physical complexities of engine operation. The utility of statistical methods in rocket engine health monitoring is hindered by practical limits on the quantity and quality of available data. This is due to the difficulty and high cost of data acquisition, the limited number of available test engines, and the problem of simulating flight conditions in ground test facilities. In addition, statistical methods incur a penalty for disregarding flow complexity and are therefore limited in their ability to define performance shift causality. Rule based methods infer the health state of the engine system based on comparison of individual measurements or combinations of measurements with defined health norms or rules. This does not mean that rule based methods are necessarily simple. 
Although binary yes-no health assessment can sometimes be established by relatively simple rules, the causality assignment needed for refined health monitoring often requires an exceptionally complex rule base involving complicated logical maps. Structuring the rule system to be clear and unambiguous can be difficult, and the expert input required to maintain a large logic network and associated rule base can be prohibitive.

Mission Analysis Program for Solar Electric Propulsion (MAPSEP). Volume 2: User's manual for earth orbital MAPSEP

NASA Technical Reports Server (NTRS)

1975-01-01

The trajectory simulation mode (SIMSEP) requires the namelist SIMSEP to follow TRAJ. The SIMSEP contains parameters which describe the scope of the simulation, expected dynamic errors, and cumulative statistics from previous SIMSEP runs. Following SIMSEP are a set of GUID namelists, one for each guidance correction maneuver. The GUID describes the strategy, knowledge or estimation uncertainties and cumulative statistics for that particular maneuver. The trajectory display mode (REFSEP) requires only the namelist TRAJ followed by scheduling cards, similar to those used in GODSEP. The fixed field schedule cards define: types of data displayed, span of interest, and frequency of printout. For those users who can vary the amount of blank common storage in their runs, a guideline to estimate the total MAPSEP core requirements is given. Blank common length is related directly to the dimension of the dynamic state (NDIM) used in transition matrix (STM) computation, and, the total augmented (knowledge) state (NAUG). The values of program and blank common must be added to compute the total decimal core for a CDC 6500. Other operating systems must scale these requirements appropriately.

A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

ERIC Educational Resources Information Center

Zwick, Rebecca

2012-01-01

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…

LENS: web-based lens for enrichment and network studies of human proteins

PubMed Central

2015-01-01

Background: Network analysis is a common approach for studying the genetic basis of diseases and biological pathways. When a set of genes is identified to be of interest in relation to a disease, say through a genome wide association study (GWAS) or a differential gene expression study, these genes are typically analyzed in the context of their protein-protein interaction (PPI) networks. Further analysis is carried out to compute the enrichment of known pathways and disease-associations in the network. Having tools for such analysis at the fingertips of biologists without the requirement for computer programming or curation of data would accelerate the characterization of genes of interest. Currently available tools do not integrate network and enrichment analysis and their visualizations, and most of them present results in formats that are not conducive to human cognition. Results: We developed the tool Lens for Enrichment and Network Studies of human proteins (LENS) that performs network, pathway, and disease enrichment analyses on genes of interest to users. The tool creates a visualization of the network, provides easy to read statistics on network connectivity, and displays Venn diagrams with statistical significance values of the network's association with drugs, diseases, pathways, and GWASs. We used the tool to analyze gene sets related to craniofacial development, autism, and schizophrenia. Conclusion: LENS is a web-based tool that does not require any download or plugins to use. The tool is free and does not require login for use, and is available at http://severus.dbmi.pitt.edu/LENS. PMID:26680011

45 CFR 309.170 - What statistical and narrative reporting requirements apply to Tribal IV-D programs?

Code of Federal Regulations, 2011 CFR

2011-10-01

... 45 Public Welfare 2 2011-10-01 2011-10-01 false What statistical and narrative reporting... (IV-D) PROGRAM Statistical and Narrative Reporting Requirements § 309.170 What statistical and... organizations must submit the following information and statistics for Tribal IV-D program activity and caseload...

45 CFR 309.170 - What statistical and narrative reporting requirements apply to Tribal IV-D programs?

Code of Federal Regulations, 2010 CFR

2010-10-01

... 45 Public Welfare 2 2010-10-01 2010-10-01 false What statistical and narrative reporting... (IV-D) PROGRAM Statistical and Narrative Reporting Requirements § 309.170 What statistical and... organizations must submit the following information and statistics for Tribal IV-D program activity and caseload...

Across-cohort QC analyses of GWAS summary statistics from complex traits.

PubMed

Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

2016-01-01

Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.

Across-cohort QC analyses of GWAS summary statistics from complex traits

PubMed Central

Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

2017-01-01

Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy. PMID:27552965

Statistical quality control through overall vibration analysis

NASA Astrophysics Data System (ADS)

Carnero, M.ª Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos

2010-05-01

The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, the quality results of the finished parts under different combinations of process variables are evaluated. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as the chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley, and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. 
This paper demonstrates the existence of predictive variables (high-frequency vibration displacements) that are sensitive to the process setup and the quality of the products obtained. Based on the result of this overall vibration analysis, a second paper will analyse self-induced vibration spectra in order to define limit vibration bands, controllable every cycle or connected to permanent vibration-monitoring systems able to adjust the sensitive process variables identified by ANOVA once the vibration readings exceed established quality limits.

Reliability and statistical power analysis of cortical and subcortical FreeSurfer metrics in a large sample of healthy elderly.

PubMed

Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz

2015-03-01

FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
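The a priori power analysis mentioned in this record can be illustrated with a minimal sketch. It uses the textbook normal approximation for a two-sided, two-group comparison, n per group = 2((z_{1-α/2} + z_{power}) / d)², and is not the tool the authors provide. The function name `n_per_group` and the standardized effect sizes passed to it are assumptions for illustration; translating a "10% difference" into a standardized effect size requires a variance estimate that the abstract does not give.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate subjects per group for a two-sided two-sample test.

    effect_size is a standardized mean difference (Cohen's d).
    Uses the normal approximation, which slightly underestimates
    the exact t-test answer for small samples.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, not taken from the paper.
for d in (0.3, 0.5, 0.8):
    print(d, n_per_group(d))
```

Larger effects or a more lenient α reduce the required sample size, which is the trade-off a sensitivity analysis explores.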

[Multilevel Analysis in Health Services Research in Healthcare Organizations: Benefits, Requirements and Implementation].

PubMed

Ansmann, L; Kuhr, K; Kowalski, C

2017-03-01

Multilevel analyses (MLA) are still rarely used in Health Services Research in Germany, though hierarchical data, e.g. from patients clustered in hospitals, are often present. MLA provide the valuable opportunity to study the health care context in health care organizations and the associations between context and health care outcomes. This article's aims are to introduce this particular method of data analysis, to discuss its benefits and its applicability, particularly for Health Services Research focusing on organizational characteristics, and to provide a concise guideline for performing the analysis. First, the benefits of and the necessity for MLA compared to ordinary correlation analyses in the case of hierarchical data are discussed. Furthermore, the statistical requirements and key decisions for the performance of MLA are illustrated. © Georg Thieme Verlag KG Stuttgart · New York.

Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques

NASA Technical Reports Server (NTRS)

Kuan, Gary M

2008-01-01

The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so an optical power margin can be book kept. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
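A statistical power budget of the kind described here can be sketched as follows. This is an illustration under stated assumptions, not the method of the paper: each loss term is assumed to be independent and Gaussian in dB, so the total loss is Gaussian with summed means and summed variances. The function name `margin_confidence` and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_confidence(source_dbm, losses_db, required_dbm):
    """Probability that received power meets the requirement.

    losses_db is a list of (mean_loss_dB, sigma_dB) pairs, assumed
    independent and Gaussian in dB (an assumption of this sketch).
    At least one sigma must be non-zero.
    """
    mean_loss = sum(mean for mean, _ in losses_db)
    sigma = sqrt(sum(s * s for _, s in losses_db))
    received = NormalDist(source_dbm - mean_loss, sigma)
    return 1 - received.cdf(required_dbm)


# Hypothetical budget: 0 dBm source, two loss mechanisms, -8 dBm required.
losses = [(3.0, 0.3), (4.0, 0.4)]
print(margin_confidence(0.0, losses, -8.0))
```

The returned value plays the role of the "numerical confidence level" in the abstract: the chance that the delivered power stays above the requirement given the assumed spread of each attenuation source.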

Improving information retrieval in functional analysis.

PubMed

Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A

2016-12-01

Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.

77 FR 16228 - Key Hyundai of Manchester, LLC; Analysis of Proposed Consent Order To Aid Public Comment

Federal Register 2010, 2011, 2012, 2013, 2014

2012-03-20

... Social Security number, date of birth, driver's license number or other state identification number or... information such as costs, sales statistics, inventories, formulas, patterns, devices, manufacturing processes... consumer credit. It also requires that if any finance charge is advertised, the rate be stated as an...

Understanding the behavioral linkages needed for designing effective interventions to increase fruit and vegetable intake in diverse populations

USDA-ARS's Scientific Manuscript database

The design of interventions to increase fruit and vegetable consumption in a population (e.g. all men, all elementary school students) requires an underlying model that organizes the relevant literatures and provides an audience. The mediating-moderating variable model is a statistical analysis tech...

Statistical analysis of CSP plants by simulating extensive meteorological series

NASA Astrophysics Data System (ADS)

Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana

2017-06-01

The feasibility analysis of any power plant project needs an estimate of the amount of energy it will be able to deliver to the grid during its lifetime. To achieve this, the feasibility study requires precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability of exceedance scenarios for the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by simulating the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for producing DNI time series representative of a given probability of exceedance of annual DNI.
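To make the probability of exceedance scenarios concrete, here is a minimal sketch of how a P50 or P90 value could be read from a series of annual values. It assumes a simple empirical quantile with linear interpolation; the function name `exceedance_value` and the sample yields are hypothetical, and the paper's own procedure may differ.

```python
def exceedance_value(samples, prob):
    """Value exceeded with probability `prob` (e.g. 0.9 for P90).

    The value exceeded with probability p is the (1 - p) quantile,
    estimated here by linear interpolation between order statistics.
    """
    xs = sorted(samples)
    pos = (1 - prob) * (len(xs) - 1)   # fractional index into sorted data
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


# Hypothetical annual energy yields in GWh, not data from the study.
annual_yield = [182.0, 175.5, 190.2, 168.9, 185.3, 179.8, 172.4, 188.1]
p50 = exceedance_value(annual_yield, 0.5)
p90 = exceedance_value(annual_yield, 0.9)
print(p50, p90)
```

By construction the P90 value never exceeds the P50 value, since it is the more conservative scenario.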

A study of two statistical methods as applied to shuttle solid rocket booster expenditures

NASA Technical Reports Server (NTRS)

Perlmutter, M.; Huang, Y.; Graves, M.

1974-01-01

The state probability technique and the Monte Carlo technique are applied to finding shuttle solid rocket booster expenditure statistics. For a given attrition rate per launch, the probable number of boosters needed for a given mission of 440 launches is calculated. Several cases are considered, including the elimination of the booster after a maximum of 20 consecutive launches. Also considered is the case where the booster is composed of replaceable components with independent attrition rates. A simple cost analysis is carried out to indicate the number of boosters to build initially, depending on booster costs. Two statistical methods were applied in the analysis: (1) state probability method which consists of defining an appropriate state space for the outcome of the random trials, and (2) model simulation method or the Monte Carlo technique. It was found that the model simulation method was easier to formulate while the state probability method required less computing time and was more accurate.
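The model simulation (Monte Carlo) approach in this record can be sketched in a few lines. The sketch assumes a single booster unit per launch, a constant per-launch attrition probability, and retirement after a fixed maximum number of launches; the function names and the 5% attrition rate are hypothetical and not taken from the report.

```python
import random


def boosters_needed(n_launches, attrition, max_uses, rng):
    """Simulate one mission and return how many boosters were built."""
    built = 0
    uses_left = 0                      # remaining launches on current booster
    for _ in range(n_launches):
        if uses_left == 0:             # no usable booster: build a new one
            built += 1
            uses_left = max_uses
        uses_left -= 1
        if rng.random() < attrition:   # booster lost on this launch
            uses_left = 0
    return built


def mean_boosters(n_trials, n_launches, attrition, max_uses, seed=0):
    """Average the simulated booster count over many trials."""
    rng = random.Random(seed)
    total = sum(boosters_needed(n_launches, attrition, max_uses, rng)
                for _ in range(n_trials))
    return total / n_trials


# 440 launches and a 20-launch limit come from the abstract;
# the 5% attrition rate is an assumed value.
print(mean_boosters(2000, 440, 0.05, 20))
```

With zero attrition the count reduces to the deterministic bound of 440 / 20 = 22 boosters, which gives a quick sanity check on the simulation.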

Spatial statistical analysis of tree deaths using airborne digital imagery

NASA Astrophysics Data System (ADS)

Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael

2013-04-01

High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).

UNCERTAINTY ON RADIATION DOSES ESTIMATED BY BIOLOGICAL AND RETROSPECTIVE PHYSICAL METHODS.

PubMed

Ainsbury, Elizabeth A; Samaga, Daniel; Della Monaca, Sara; Marrale, Maurizio; Bassinet, Celine; Burbidge, Christopher I; Correcher, Virgilio; Discher, Michael; Eakins, Jon; Fattibene, Paola; Güçlü, Inci; Higueras, Manuel; Lund, Eva; Maltar-Strmecki, Nadica; McKeever, Stephen; Rääf, Christopher L; Sholom, Sergey; Veronese, Ivan; Wieser, Albrecht; Woda, Clemens; Trompier, Francois

2018-03-01

Biological and physical retrospective dosimetry are recognised as key techniques to provide individual estimates of dose following unplanned exposures to ionising radiation. Whilst there has been a relatively large amount of recent development in the biological and physical procedures, development of statistical analysis techniques has failed to keep pace. The aim of this paper is to review the current state of the art in uncertainty analysis techniques across the 'EURADOS Working Group 10-Retrospective dosimetry' members, to give concrete examples of implementation of the techniques recommended in the international standards, and to further promote the use of Monte Carlo techniques to support characterisation of uncertainties. It is concluded that sufficient techniques are available and in use by most laboratories for acute, whole body exposures to highly penetrating radiation, but further work will be required to ensure that statistical analysis is always wholly sufficient for the more complex exposure scenarios.

Parametric distribution approach for flow availability in small hydro potential analysis

NASA Astrophysics Data System (ADS)

Abdullah, Samizee; Basri, Mohd Juhari Mat; Jamaluddin, Zahrul Zamri; Azrulhisham, Engku Ahmad; Othman, Jamel

2016-10-01

Small hydro systems are an important source of renewable energy and have been recognized worldwide as a clean energy source. Small hydropower generation, which uses the potential energy in flowing water to produce electricity, is often questioned because the power generated is inconsistent and intermittent. Potential analysis of a small hydro system, which depends mainly on the availability of water, requires knowledge of the water flow or stream flow distribution. This paper presents the possibility of applying the Pearson system to approximate the stream flow availability distribution in a small hydro system. Considering the stochastic nature of stream flow, the Pearson parametric distribution approximation was computed based on the significant characteristic of the Pearson system, applying a direct correlation between the first four statistical moments of the distribution. The advantage of applying several statistical moments in small hydro potential analysis is the ability to analyze the varying shapes of the stream flow distribution.
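The first four moments that drive a Pearson-system fit can be computed directly from a flow record. The sketch below returns the mean, the variance, and the two shape parameters conventionally used to select a Pearson type, β1 (squared skewness) and β2 (kurtosis). The function name `pearson_betas` and the flow values are hypothetical, and the paper's fitting procedure is not reproduced here.

```python
def pearson_betas(flows):
    """Return (mean, variance, beta1, beta2) from sample central moments.

    beta1 = m3**2 / m2**3 (squared skewness), beta2 = m4 / m2**2 (kurtosis).
    Population (divide by n) moments are used for simplicity.
    """
    n = len(flows)
    mean = sum(flows) / n
    m2 = sum((x - mean) ** 2 for x in flows) / n
    m3 = sum((x - mean) ** 3 for x in flows) / n
    m4 = sum((x - mean) ** 4 for x in flows) / n
    return mean, m2, m3 ** 2 / m2 ** 3, m4 / m2 ** 2


# Hypothetical daily stream flows in m^3/s, not data from the study.
flows = [2.1, 2.4, 1.9, 3.8, 5.6, 2.2, 2.0, 7.3, 2.6, 2.3]
mean, var, beta1, beta2 = pearson_betas(flows)
print(mean, var, beta1, beta2)
```

For reference, a normal distribution has β1 = 0 and β2 = 3, so departures from those values indicate the skewed or heavy-tailed shapes that stream flow data often show.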

Basic Student Charges at Postsecondary Institutions: Academic Year 1994-95. Tuition and Required Fees and Room and Board Charges at 4-Year, 2-Year, and Public Less-Than-2-Year Institutions. Statistical Analysis Report.

ERIC Educational Resources Information Center

Barbett, Samuel F.; And Others

This document lists the typical tuition and required fees and room and board charges assessed to college students in 1994-95 based on a national "Institutional Characteristics" survey which is part of the Integrated Postsecondary Education Data System. The data were collected from over 5,000 of the 5,775 4-year, 2-year, and public…

Restoration of MRI data for intensity non-uniformities using local high order intensity statistics

PubMed Central

Hadjidemetriou, Stathis; Studholme, Colin; Mueller, Susanne; Weiner, Michael; Schuff, Norbert

2008-01-01

MRI at high magnetic fields (>3.0 T) is complicated by strong inhomogeneous radio-frequency fields, sometimes termed the “bias field”. These lead to non-biological intensity non-uniformities across the image. They can complicate further image analysis such as registration and tissue segmentation. Existing methods for intensity uniformity restoration have been optimized for 1.5 T, but they are less effective for 3.0 T MRI, and not at all satisfactory for higher fields. Also, many of the existing restoration algorithms require a brain template or use a prior atlas, which can restrict their practicalities. In this study an effective intensity uniformity restoration algorithm has been developed based on non-parametric statistics of high order local intensity co-occurrences. These statistics are restored with a non-stationary Wiener filter. The algorithm also assumes a smooth non-uniformity and is stable. It does not require a prior atlas and is robust to variations in anatomy. In geriatric brain imaging it is robust to variations such as enlarged ventricles and low contrast to noise ratio. The co-occurrence statistics improve robustness to whole head images with pronounced non-uniformities present in high field acquisitions. Its significantly improved performance and lower time requirements have been demonstrated by comparing it to the very commonly used N3 algorithm on BrainWeb MR simulator images as well as on real 4 T human head images. PMID:18621568

Does Topical Ozone Therapy Improve Patient Comfort After Surgical Removal of Impacted Mandibular Third Molar? A Randomized Controlled Trial.

PubMed

Sivalingam, Varun P; Panneerselvam, Elavenil; Raja, Krishnakumar V B; Gopi, Gayathri

2017-01-01

To assess the influence of topical ozone administration on patient comfort after third molar surgery. A single-blinded randomized controlled clinical trial was designed involving patients who required removal of bilateral impacted mandibular third molars. The predictor variable was the postoperative medication used after third molar surgery. Using the split-mouth design, the study group received topical ozone without postoperative systemic antibiotics, whereas the control group did not receive ozone but only systemic antibiotics. The 2 groups were prescribed analgesics for 2 days. The assessing surgeon was blinded to treatment assignment. The primary outcome variables were postoperative mouth opening, pain, and swelling. The secondary outcome variable was the number of analgesic doses required by each group on postoperative days 3 to 5. Data analysis involved descriptive statistics, paired t tests, and 2-way analysis of variance with repeated measures (P < .05). SPSS 20.0 was used for data analysis. The study sample included 33 patients (n = 33 in each group). The study group showed statistically relevant decreases in postoperative pain, swelling, and trismus. Further, the number of analgesics required was smaller than in the control group. No adverse effects of ozone gel were observed in any patient. Ozone gel was found to be an effective topical agent that considerably improves patient comfort postoperatively and can be considered a substitute of postoperative systemic antibiotics. Copyright © 2016 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.

Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.

PubMed

Simovski, Boris; Kanduri, Chakravarthi; Gundersen, Sveinung; Titov, Dmytro; Domanska, Diana; Bock, Christoph; Bossini-Castillo, Lara; Chikina, Maria; Favorov, Alexander; Layer, Ryan M; Mironov, Andrey A; Quinlan, Aaron R; Sheffield, Nathan C; Trynka, Gosia; Sandve, Geir K

2018-06-05

Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kacprzak, T.; Kirk, D.; Friedrich, O.

Shear peak statistics has gained a lot of attention recently as a practical alternative to the two point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0<\mathcal S / \mathcal N<4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s=1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_{8}(\Omega_{\rm m}/0.3)^{0.6}=0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal S / \mathcal N>4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.

Training in metabolomics research. II. Processing and statistical analysis of metabolomics data, metabolite identification, pathway analysis, applications of metabolomics and its future.

PubMed

Barnes, Stephen; Benton, H Paul; Casazza, Krista; Cooper, Sara J; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K; Renfrow, Matthew B; Tiwari, Hemant K

2016-08-01

Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. Copyright © 2016 John Wiley & Sons, Ltd.

Site Suitability Analysis for Beekeeping via Analythical Hyrearchy Process, Konya Example

NASA Astrophysics Data System (ADS)

Sarı, F.; Ceylan, D. A.

2017-11-01

Over the past decade, the importance of beekeeping activities has been emphasized in the fields of biodiversity, ecosystems, agriculture and human health. Efficient management and sound decisions about beekeeping activities therefore seem essential to maintain and improve productivity and efficiency. Because of this importance, and considering the economic contribution to rural areas, a need for suitability analysis has emerged. At this point, the integration of Multi Criteria Decision Analysis (MCDA) and Geographical Information Systems (GIS) provides efficient solutions to the complex decision-making process for beekeeping activities. In this study, site suitability analysis via Analytical Hierarchy Process (AHP) was carried out for Konya city in Turkey. Slope, elevation, aspect, distance to water resources, roads and settlements, precipitation and flora criteria are included to determine suitability. The requirements, expectations and limitations of beekeeping activities are specified with the participation of experts and stakeholders. The final suitability map was validated with 117 existing beekeeping locations and Turkish Statistical Institute 2016 beekeeping statistics for Konya province.
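
As a rough illustration of the AHP step named in this abstract, the sketch below derives criterion weights from a pairwise comparison matrix by power iteration and reports Saaty's consistency ratio. The three criteria and the judgment values are invented for illustration and are not taken from the study; the random-index table holds the commonly quoted Saaty values.

```python
# Minimal AHP sketch: weights from a reciprocal pairwise comparison matrix.
# The matrix below is hypothetical; the study's actual judgments are not given.

# Commonly quoted Saaty random-index values for matrix sizes 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max) for a pairwise comparison matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # One power-iteration step, renormalised so the weights sum to 1.
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n):
    """Consistency index divided by the random index for size n."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


if __name__ == "__main__":
    # Hypothetical judgments: flora vs. distance to water vs. slope.
    judgments = [
        [1.0, 2.0, 6.0],
        [1.0 / 2.0, 1.0, 3.0],
        [1.0 / 6.0, 1.0 / 3.0, 1.0],
    ]
    weights, lam = ahp_weights(judgments)
    print(weights, consistency_ratio(lam, len(judgments)))
```

A consistency ratio below about 0.1 is the conventional rule of thumb for accepting a set of judgments; the resulting weights would then be applied to the GIS criterion layers.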

Multi-response permutation procedure as an alternative to the analysis of variance: an SPSS implementation.

PubMed

Cai, Li

2006-02-01

A permutation test typically requires fewer assumptions than does a comparable parametric counterpart. The multi-response permutation procedure (MRPP) is a class of multivariate permutation tests of group difference useful for the analysis of experimental data. However, psychologists seldom make use of the MRPP in data analysis, in part because the MRPP is not implemented in popular statistical packages that psychologists use. A set of SPSS macros implementing the MRPP test is provided in this article. The use of the macros is illustrated by analyzing example data sets.
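
The idea behind the MRPP can be sketched outside SPSS as well. The Python below is not the article's macro code; it is a minimal Monte Carlo version that assumes Euclidean distances and group weights proportional to group size, two common choices that the abstract does not specify.

```python
# Minimal MRPP sketch (Monte Carlo), not the SPSS macros from the article.
# Assumptions: Euclidean distance, group weights n_g / N.
import itertools
import math
import random


def mrpp_delta(points, labels):
    """Weighted mean of within-group average pairwise distances."""
    n_total = len(points)
    delta = 0.0
    for group in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == group]
        pairs = list(itertools.combinations(members, 2))
        if not pairs:
            continue  # a singleton group has no pairwise distance
        mean_dist = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        delta += (len(members) / n_total) * mean_dist
    return delta


def mrpp_test(points, labels, n_perm=2000, seed=0):
    """Return (observed delta, permutation p-value)."""
    rng = random.Random(seed)
    observed = mrpp_delta(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small deltas mean tight groups, so count deltas at or below observed.
        if mrpp_delta(points, shuffled) <= observed + 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `mrpp_test(points, labels)` with a list of coordinate tuples and a parallel list of group labels returns the observed statistic and an approximate p-value; a small p-value indicates that observations cluster within groups more tightly than random relabelling would produce.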

From Biophysics to Evolutionary Genetics: Statistical Aspects of Gene Regulation

NASA Astrophysics Data System (ADS)

Lässig, Michael

Genomic functions often cannot be understood at the level of single genes but require the study of gene networks. This systems biology credo is nearly commonplace by now. Evidence comes from the comparative analysis of entire genomes: current estimates put, for example, the number of human genes at around 22,000, hardly more than the 14,000 of the fruit fly, and not even an order of magnitude higher than the 6,000 of baker's yeast. The complexity and diversity of higher animals, therefore, cannot be explained in terms of their gene numbers. If, however, a biological function requires the concerted action of several genes, and conversely, a gene takes part in several functional contexts, an organism may be defined less by its individual genes but by their interactions. The emerging picture of the genome as a strongly interacting system with many degrees of freedom brings new challenges for experiment and theory, many of which are of a statistical nature. And indeed, this picture continues to make the subject attractive to a growing number of statistical physicists.

A Prototype System for Retrieval of Gene Functional Information

PubMed Central

Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

2003-01-01

Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346

Data Treatment for LC-MS Untargeted Analysis.

PubMed

Riccadonna, Samantha; Franceschi, Pietro

2018-01-01

Liquid chromatography-mass spectrometry (LC-MS) untargeted experiments require complex chemometrics strategies to extract information from the experimental data. Here we discuss "data preprocessing", the set of procedures performed on the raw data to produce a data matrix which will be the starting point for the subsequent statistical analysis. Data preprocessing is a crucial step on the path to knowledge extraction, which should be carefully controlled and optimized in order to maximize the output of any untargeted metabolomics investigation.

An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

PubMed

Ng'andu, N H

1997-03-30

In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.
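
For orientation, the time-dependent covariate test mentioned in this abstract is usually set up as follows. The notation and the choice $g(t) = \log t$ are assumptions for illustration; the abstract does not state which form the study used.

```latex
% Cox proportional hazards model for a single covariate x
h(t \mid x) = h_0(t)\, \exp(\beta x)

% Extended model with a time-dependent term (g(t) = \log t is an assumed choice)
h(t \mid x) = h_0(t)\, \exp\bigl(\beta x + \gamma\, x\, g(t)\bigr)

% Hazard ratio for a one-unit increase in x under the extended model
\mathrm{HR}(t) = \exp\bigl(\beta + \gamma\, g(t)\bigr)

% Proportional hazards holds when the ratio does not depend on t
H_0 : \gamma = 0
```

Under this setup, rejecting $H_0$ suggests that the hazard ratio changes over time, which is the departure from proportionality that the compared tests aim to detect.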

Sparse approximation of currents for statistics on curves and surfaces.

PubMed

Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas

2008-01-01

Computing, processing, and visualizing statistics on shapes such as curves or surfaces is a real challenge with many applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids both feature-based approaches and point-correspondence methods. This framework has proved powerful for registering brain surfaces and for measuring geometrical invariants. However, while state-of-the-art methods perform pairwise registrations efficiently, new numerical schemes are required to process groupwise statistics because complexity increases as the database grows. Statistics such as mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We therefore propose to find an adapted basis on which mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.

Proposal for a biometrics of the cortical surface: a statistical method for relative surface distance metrics

NASA Astrophysics Data System (ADS)

Bookstein, Fred L.

1995-08-01

Recent advances in computational geometry have greatly extended the range of neuroanatomical questions that can be approached by rigorous quantitative methods. One of the major current challenges in this area is to describe the variability of human cortical surface form and its implications for individual differences in neurophysiological functioning. Existing techniques for representation of stochastically invaginated surfaces do not conduce to the necessary parametric statistical summaries. In this paper, following a hint from David Van Essen and Heather Drury, I sketch a statistical method customized for the constraints of this complex data type. Cortical surface form is represented by its Riemannian metric tensor and averaged according to parameters of a smooth averaged surface. Sulci are represented by integral trajectories of the smaller principal strains of this metric, and their statistics follow the statistics of that relative metric. The diagrams visualizing this tensor analysis look like alligator leather but summarize all aspects of cortical surface form in between the principal sulci, the reliable ones; no flattening is required.

The LaueUtil toolkit for Laue photocrystallography. II. Spot finding and integration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalinowski, Jaroslaw A.; Fournier, Bertrand; Makal, Anna

2015-10-15

A spot-integration method is described which does not require prior indexing of the reflections. It is based on statistical analysis of the values from each of the pixels on successive frames, followed for each frame by morphological analysis to identify clusters of high value pixels which form an appropriate mask corresponding to a reflection peak. The method does not require prior assumptions such as fitting of a profile or definition of an integration box. The results are compared with those of the seed-skewness method which is based on minimizing the skewness of the intensity distribution within a peak's integration box. Applications in Laue photocrystallography are presented.

45 CFR 160.536 - Statistical sampling.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 45 Public Welfare 1 2010-10-01 2010-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...

45 CFR 160.536 - Statistical sampling.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 45 Public Welfare 1 2011-10-01 2011-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...

24 CFR 1710.13 - Metropolitan Statistical Area (MSA) exemption.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 24 Housing and Urban Development 5 2011-04-01 2011-04-01 false Metropolitan Statistical Area (MSA... Requirements § 1710.13 Metropolitan Statistical Area (MSA) exemption. (a) Eligibility requirements. The sale of... since April 28, 1969. (2) The lot is located within a Metropolitan Statistical Area (MSA) as defined by...

24 CFR 1710.13 - Metropolitan Statistical Area (MSA) exemption.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 24 Housing and Urban Development 5 2010-04-01 2010-04-01 false Metropolitan Statistical Area (MSA... Requirements § 1710.13 Metropolitan Statistical Area (MSA) exemption. (a) Eligibility requirements. The sale of... since April 28, 1969. (2) The lot is located within a Metropolitan Statistical Area (MSA) as defined by...

Multivariate Meta-Analysis of Heterogeneous Studies Using Only Summary Statistics: Efficiency and Robustness

PubMed Central

Liu, Dungang; Liu, Regina; Xie, Minge

2014-01-01

Meta-analysis has been widely used to synthesize evidence from multiple studies for common hypotheses or parameters of interest. However, it has not yet been fully developed for incorporating heterogeneous studies, which arise often in applications due to different study designs, populations or outcomes. For heterogeneous studies, the parameter of interest may not be estimable for certain studies, and in such a case, these studies are typically excluded from conventional meta-analysis. The exclusion of part of the studies can lead to a non-negligible loss of information. This paper introduces a metaanalysis for heterogeneous studies by combining the confidence density functions derived from the summary statistics of individual studies, hence referred to as the CD approach. It includes all the studies in the analysis and makes use of all information, direct as well as indirect. Under a general likelihood inference framework, this new approach is shown to have several desirable properties, including: i) it is asymptotically as efficient as the maximum likelihood approach using individual participant data (IPD) from all studies; ii) unlike the IPD analysis, it suffices to use summary statistics to carry out the CD approach. Individual-level data are not required; and iii) it is robust against misspecification of the working covariance structure of the parameter estimates. Besides its own theoretical significance, the last property also substantially broadens the applicability of the CD approach. All the properties of the CD approach are further confirmed by data simulated from a randomized clinical trials setting as well as by real data on aircraft landing performance. Overall, one obtains an unifying approach for combining summary statistics, subsuming many of the existing meta-analysis methods as special cases. PMID:26190875
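
The abstract notes that the CD approach subsumes many existing meta-analysis methods as special cases. The sketch below shows only one such familiar method, scalar fixed-effect inverse-variance pooling of summary statistics; it is an assumed illustration of combining study-level estimates without individual-level data, not the paper's multivariate CD procedure.

```python
# Fixed-effect inverse-variance pooling of study-level summary statistics.
# Illustrative special case only; not the CD approach described in the paper.

def combine_fixed_effect(estimates, variances):
    """Return (pooled estimate, pooled variance) from per-study summaries."""
    weights = [1.0 / v for v in variances]  # more precise studies weigh more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total


if __name__ == "__main__":
    # Hypothetical effect estimates and their variances from three studies.
    print(combine_fixed_effect([0.20, 0.35, 0.10], [0.04, 0.09, 0.02]))
```

Only each study's point estimate and variance enter the calculation, which is the sense in which summary statistics suffice in this simple case.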

Statistical analysis of modal properties of a cable-stayed bridge through long-term structural health monitoring with wireless smart sensor networks

NASA Astrophysics Data System (ADS)

Asadollahi, Parisa; Li, Jian

2016-04-01

Understanding the dynamic behavior of complex structures such as long-span bridges requires dense deployment of sensors. Traditional wired sensor systems are generally expensive and time-consuming to install due to cabling. With wireless communication and on-board computation capabilities, wireless smart sensor networks have the advantages of being low cost, easy to deploy and maintain and therefore facilitate dense instrumentation for structural health monitoring. A long-term monitoring project was recently carried out for a cable-stayed bridge in South Korea with a dense array of 113 smart sensors, which feature the world's largest wireless smart sensor network for civil structural monitoring. This paper presents a comprehensive statistical analysis of the modal properties including natural frequencies, damping ratios and mode shapes of the monitored cable-stayed bridge. Data analyzed in this paper is composed of structural vibration signals monitored during a 12-month period under ambient excitations. The correlation between environmental temperature and the modal frequencies is also investigated. The results showed the long-term statistical structural behavior of the bridge, which serves as the basis for Bayesian statistical updating for the numerical model.

A statistical study of EMIC waves observed by Cluster: 1. Wave properties

NASA Astrophysics Data System (ADS)

Allen, R. C.; Zhang, J.-C.; Kistler, L. M.; Spence, H. E.; Lin, R.-L.; Klecker, B.; Dunlop, M. W.; André, M.; Jordanova, V. K.

2015-07-01

Electromagnetic ion cyclotron (EMIC) waves are an important mechanism for particle energization and losses inside the magnetosphere. In order to better understand the effects of these waves on particle dynamics, detailed information about the occurrence rate, wave power, ellipticity, normal angle, energy propagation angle distributions, and local plasma parameters are required. Previous statistical studies have used in situ observations to investigate the distribution of these parameters in the magnetic local time versus L-shell (MLT-L) frame within a limited magnetic latitude (MLAT) range. In this study, we present a statistical analysis of EMIC wave properties using 10 years (2001-2010) of data from Cluster, totaling 25,431 min of wave activity. Due to the polar orbit of Cluster, we are able to investigate EMIC waves at all MLATs and MLTs. This allows us to further investigate the MLAT dependence of various wave properties inside different MLT sectors and further explore the effects of Shabansky orbits on EMIC wave generation and propagation. The statistical analysis is presented in two papers. This paper focuses on the wave occurrence distribution as well as the distribution of wave properties. The companion paper focuses on local plasma parameters during wave observations as well as wave generation proxies.

A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments

PubMed Central

Avalappampatty Sivasamy, Aneetha; Sundan, Bose

2015-01-01

The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
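
The T-square distance at the core of this approach can be sketched in a few lines. The code below is a generic illustration, not the authors' system: the feature vectors are hypothetical, and the decision threshold (which the abstract derives from the central limit theorem) is left to the caller.

```python
# Hotelling-style T-square distance of one observation from a baseline profile.
# Generic sketch; feature choice and threshold are assumptions, not the paper's.

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def t_squared(x, mu, cov):
    """Compute (x - mu)^T cov^{-1} (x - mu) without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)
    return sum(d * zi for d, zi in zip(diff, z))


def is_attack(x, baseline_rows, threshold):
    """Flag x when its distance from the normal-traffic profile is too large."""
    mu = mean_vector(baseline_rows)
    cov = covariance_matrix(baseline_rows)
    return t_squared(x, mu, cov) > threshold
```

Here `baseline_rows` would hold preprocessed feature vectors from traffic assumed to be normal; an observation is classified by comparing its distance with the chosen threshold.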

A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments.

PubMed

Sivasamy, Aneetha Avalappampatty; Sundan, Bose

2015-01-01

The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T(2) method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T(2) statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better.

An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pebay, Philippe Pierre; Bennett, Janine Camille

2015-11-01

In this report, we propose a framework for the design and implementation of in-situ analyses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model, which we had initially developed for parallel statistical analysis using MPI [PTBM11], from a SPMD to an AMT model. To this end, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation of the statistics engines that requires explicit communication. We then evaluate this proposed scheme in a shared memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.

A graph theory approach to identify resonant and non-resonant transmission paths in statistical modal energy distribution analysis

NASA Astrophysics Data System (ADS)

Aragonès, Àngels; Maxit, Laurent; Guasch, Oriol

2015-08-01

Statistical modal energy distribution analysis (SmEdA) extends classical statistical energy analysis (SEA) to the mid frequency range by establishing power balance equations between modes in different subsystems. This circumvents the SEA requirement of modal energy equipartition and enables applying SmEdA to the cases of low modal overlap, locally excited subsystems and to deal with complex heterogeneous subsystems as well. Yet, widening the range of application of SEA is done at a price with large models because the number of modes per subsystem can become considerable when the frequency increases. Therefore, it would be worthwhile to have at one's disposal tools for a quick identification and ranking of the resonant and non-resonant paths involved in modal energy transmission between subsystems. It will be shown that previously developed graph theory algorithms for transmission path analysis (TPA) in SEA can be adapted to SmEdA and prove useful for that purpose. The case of airborne transmission between two cavities separated apart by homogeneous and ribbed plates will be first addressed to illustrate the potential of the graph approach. A more complex case representing transmission between non-contiguous cavities in a shipbuilding structure will be also presented.

Volcanic hazard assessment for the Canary Islands (Spain) using extreme value theory, and the recent volcanic eruption of El Hierro

NASA Astrophysics Data System (ADS)

Sobradelo, R.; Martí, J.; Mendoza-Rosas, A. T.; Gómez, G.

2012-04-01

The Canary Islands are an active volcanic region densely populated and visited by several millions of tourists every year. Nearly twenty eruptions have been reported through written chronicles in the last 600 years, suggesting that the probability of a new eruption in the near future is far from zero. This shows the importance of assessing and monitoring the volcanic hazard of the region in order to reduce and manage its potential volcanic risk, and ultimately contribute to the design of appropriate preparedness plans. Hence, the probabilistic analysis of the volcanic eruption time series for the Canary Islands is an essential step for the assessment of volcanic hazard and risk in the area. Such a series describes complex processes involving different types of eruptions over different time scales. Here we propose a statistical method for calculating the probabilities of future eruptions which is most appropriate given the nature of the documented historical eruptive data. We first characterise the eruptions by their magnitudes, and then carry out a preliminary analysis of the data to establish the requirements for the statistical method. Past studies in eruptive time series used conventional statistics and treated the series as a homogeneous process. In this paper, we will use a method that accounts for the time-dependence of the series and includes rare or extreme events, in the form of few data of large eruptions, since these data require special methods of analysis. Hence, we will use a statistical method from extreme value theory. In particular, we will apply a non-homogeneous Poisson process to the historical eruptive data of the Canary Islands to estimate the probability of having at least one volcanic event of a magnitude greater than one in the upcoming years.
Shortly after the publication of this method an eruption in the island of El Hierro took place for the first time in historical times, supporting our method and contributing towards the validation of our results.
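
The "at least one event" probability under a non-homogeneous Poisson process follows from the integrated intensity: P = 1 - exp(-∫ λ(s) ds) over the forecast window. The sketch below evaluates this numerically for any intensity function; the intensity used in the example is a made-up constant rate, not the fitted model from the paper.

```python
# Probability of at least one event in a time window under a
# non-homogeneous Poisson process with intensity lambda(t).
# The intensity passed in is an assumption supplied by the caller.
import math


def prob_at_least_one(intensity, t_start, t_end, steps=1000):
    """Return 1 - exp(-integral of intensity over [t_start, t_end])."""
    h = (t_end - t_start) / steps
    # Trapezoidal rule for the expected number of events in the window.
    total = 0.5 * (intensity(t_start) + intensity(t_end))
    for k in range(1, steps):
        total += intensity(t_start + k * h)
    expected_events = total * h
    return 1.0 - math.exp(-expected_events)


if __name__ == "__main__":
    # Hypothetical constant rate of 0.03 events per year over 20 years.
    print(prob_at_least_one(lambda t: 0.03, 0.0, 20.0))
```

With a constant intensity the result reduces to the homogeneous Poisson case; a time-varying intensity is what lets the method reflect the time-dependence the abstract emphasizes.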

Effect of Flexible Duty Hour Policies on Length of Stay for Complex Intra-Abdominal Operations: A Flexibility in Duty Hour Requirements for Surgical Trainees (FIRST) Trial Analysis.

PubMed

Stulberg, Jonah J; Pavey, Emily S; Cohen, Mark E; Ko, Clifford Y; Hoyt, David B; Bilimoria, Karl Y

2017-02-01

Changes to resident duty hour policies in the Flexibility in Duty Hour Requirements for Surgical Trainees (FIRST) trial could impact hospitalized patients' length of stay (LOS) by altering care coordination. Length of stay can also serve as a reflection of all complications, particularly those not captured in the FIRST trial (e.g., pneumothorax from central line). Programs were randomized to either maintaining current ACGME duty hour policies (Standard arm) or more flexible policies waiving rules on maximum shift lengths and time off between shifts (Flexible arm). Our objective was to determine whether flexibility in resident duty hours affected LOS in patients undergoing high-risk surgical operations. Patients were identified who underwent hepatectomy, pancreatectomy, laparoscopic colectomy, open colectomy, or ventral hernia repair (2014-2015 academic year) at 154 hospitals participating in the FIRST trial. Two procedure-stratified evaluations of LOS were undertaken: multivariable negative binomial regression analysis on LOS and a multivariable logistic regression analysis on the likelihood of a prolonged LOS (>75th percentile). Before any adjustments, there was no statistically significant difference in overall mean LOS between study arms (Flexible Policy: mean [SD] LOS 6.03 [5.78] days vs Standard Policy: mean LOS 6.21 [5.82] days; p = 0.74). In adjusted analyses, there was no statistically significant difference in LOS between study arms overall (incidence rate ratio for Flexible vs Standard: 0.982; 95% CI, 0.939-1.026; p = 0.41) or for any individual procedures. In addition, there was no statistically significant difference in the proportion of patients with prolonged LOS between study arms overall (Flexible vs Standard: odds ratio = 1.028; 95% CI, 0.871-1.212) or for any individual procedures. Duty hour flexibility had no statistically significant effect on LOS in patients undergoing complex intra-abdominal operations.
Copyright © 2016 American College of Surgeons. Published by Elsevier Inc. All rights reserved.

Development of guidance for states transitioning to new safety analysis tools

NASA Astrophysics Data System (ADS)

Alluri, Priyanka

With about 125 people dying on US roads each day, the US Department of Transportation heightened the awareness of critical safety issues with the passage of SAFETEA-LU (Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users) legislation in 2005. The legislation required each of the states to develop a Strategic Highway Safety Plan (SHSP) and incorporate data-driven approaches to prioritize and evaluate program outcomes; failure to do so resulted in funding sanctions. In conjunction with the legislation, research efforts have also been progressing toward the development of new safety analysis tools such as IHSDM (Interactive Highway Safety Design Model), SafetyAnalyst, and HSM (Highway Safety Manual). These software and analysis tools are comparatively more advanced in statistical theory and level of accuracy, and tend to be more data intensive. A review of the 2009 five-percent reports and excerpts from the nationwide survey revealed astonishing facts about the continuing use of traditional methods, including crash frequencies and rates, for site selection and prioritization. The intense data requirements and statistical complexity of advanced safety tools are considered a hindrance to their adoption. In this context, this research aims at identifying the data requirements and data availability for SafetyAnalyst and HSM by working with both tools. This research sets the stage for working with the Empirical Bayes approach by highlighting some of the biases and issues associated with the traditional methods of selecting projects, such as greater emphasis on traffic volume and regression-to-mean phenomena. Further, the not-so-obvious issue with shorter segment lengths, which affect the results independent of the methods used, is also discussed. 
The more reliable and statistically acceptable Empirical Bayes methodology requires safety performance functions (SPFs), regression equations predicting the relation between crashes and exposure for a subset of the roadway network. These SPFs, specific to a region and the analysis period, are often unavailable. Calibration of already existing default national SPFs to the state's data could be a feasible solution, but how well the state's data is represented is a legitimate question. With this background, SPFs were generated for various classifications of segments in Georgia and compared against the national default SPFs used in SafetyAnalyst calibrated to Georgia data. Delving deeper into the development of SPFs, the influence of actual and estimated traffic data on the fit of the equations is also studied, questioning the accuracy and reliability of traffic estimations. In addition to SafetyAnalyst, HSM aims at performing quantitative safety analysis. Applying HSM methodology to two-way two-lane rural roads, the effect of using multiple CMFs (Crash Modification Factors) is studied. Lastly, data requirements, methodology, constraints, and results are compared between SafetyAnalyst and HSM.
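
As an illustration of how an SPF prediction enters the Empirical Bayes calculation, the sketch below uses the weighted-average form commonly given for this method: the estimate is a blend of the SPF prediction and the observed crash count, with a weight that depends on the SPF's overdispersion parameter. The function name and the numbers are hypothetical and are not taken from the record above; observed and predicted counts are assumed to cover the same site and period.

```python
def eb_expected_crashes(observed: float, predicted: float, overdispersion: float) -> float:
    """Empirical Bayes estimate of expected crashes at one site.

    observed       -- crashes counted at the site over the study period
    predicted      -- crashes predicted by the SPF for the same period
    overdispersion -- overdispersion parameter of the SPF (assumed known)
    """
    if predicted < 0 or overdispersion < 0:
        raise ValueError("predicted and overdispersion must be non-negative")
    # Weight on the SPF prediction shrinks as the prediction or the
    # overdispersion grows, so noisy SPFs are trusted less.
    weight = 1.0 / (1.0 + overdispersion * predicted)
    return weight * predicted + (1.0 - weight) * observed


# Hypothetical site: 10 observed crashes, SPF predicts 4, overdispersion 0.5.
print(eb_expected_crashes(10, 4, 0.5))
```

Because the estimate is pulled toward the SPF prediction, a site with an unusually high count in one period is not ranked on that count alone, which is how the approach counters regression to the mean.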

Statistical issues on the analysis of change in follow-up studies in dental research.

PubMed

Blance, Andrew; Tu, Yu-Kang; Baelum, Vibeke; Gilthorpe, Mark S

2007-12-01

To provide an overview of the problems in study design and associated analyses of follow-up studies in dental research, particularly addressing three issues: treatment-baseline interactions; statistical power; and nonrandomization. Our previous work has shown that many studies purport an interaction between change (from baseline) and baseline values, which is often based on inappropriate statistical analyses. A priori power calculations are essential for randomized controlled trials (RCTs), but in the pre-test/post-test RCT design it is not well known to dental researchers that the choice of statistical method affects power, and that power is affected by treatment-baseline interactions. A common (good) practice in the analysis of RCT data is to adjust for baseline outcome values using ANCOVA, thereby increasing statistical power. However, an important requirement for ANCOVA is that there be no interaction between the groups and baseline outcome (i.e. effective randomization); the patient-selection process should not cause differences in mean baseline values across groups. This assumption is often violated for nonrandomized (observational) studies and the use of ANCOVA is thus problematic, potentially giving biased estimates, invoking Lord's paradox and leading to difficulties in the interpretation of results. Baseline interaction issues can be overcome by use of statistical methods not widely practiced in dental research: Oldham's method and multilevel modelling; the latter is preferred for its greater flexibility to deal with more than one follow-up occasion as well as additional covariates. To illustrate these three key issues, hypothetical examples are considered from the fields of periodontology, orthodontics, and oral implantology. Caution needs to be exercised when considering the design and analysis of follow-up studies. ANCOVA is generally inappropriate for nonrandomized studies and causal inferences from observational data should be avoided.

Algorithm for computing descriptive statistics for very large data sets and the exa-scale era

NASA Astrophysics Data System (ADS)

Beekman, Izaak

2017-11-01

An algorithm for Single-point, Parallel, Online, Converging Statistics (SPOCS) is presented. It is suited for in situ analysis that traditionally would be relegated to post-processing, and can be used to monitor the statistical convergence and estimate the error/residual in the quantity, which is also useful for uncertainty quantification. Today, data may be generated at an overwhelming rate by numerical simulations and proliferating sensing apparatuses in experiments and engineering applications. Monitoring descriptive statistics in real time lets costly computations and experiments be gracefully aborted if an error has occurred, and monitoring the level of statistical convergence allows them to be run for the shortest amount of time required to obtain good results. This algorithm extends work by Pébay (Sandia Report SAND2008-6212). Pébay's algorithms are recast into a converging delta formulation, with provably favorable properties. The mean, variance, covariances and arbitrary higher order statistical moments are computed in one pass. The algorithm is tested using Sillero, Jiménez, & Moser's (2013, 2014) publicly available UPM high Reynolds number turbulent boundary layer data set, demonstrating numerical robustness, efficiency and other favorable properties.
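
The SPOCS formulation itself is not given in this record. As a rough illustration of what a one-pass, mergeable computation of mean and variance looks like, the sketch below uses the well-known Welford-style update together with the pairwise combination formula for partial results. The class and method names are assumptions made for this example, and higher-order moments and covariances are omitted.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """One-pass mean and variance (Welford-style update)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def push(self, x: float) -> None:
        """Fold one new sample into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        """Combine two partial results, e.g. from parallel workers."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        """Unbiased sample variance; undefined for fewer than two samples."""
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


# Two workers each see half of the stream, then their results are merged.
left, right = RunningStats(), RunningStats()
for x in [2.0, 4.0, 4.0, 4.0]:
    left.push(x)
for x in [5.0, 5.0, 7.0, 9.0]:
    right.push(x)
combined = left.merge(right)
print(combined.mean, combined.variance)
```

The size of each `delta / n` correction shrinks as samples accumulate, which is the kind of quantity one could watch to judge convergence in situ.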

Scripts for TRUMP data analyses. Part II (HLA-related data): statistical analyses specific for hematopoietic stem cell transplantation.

PubMed

Kanda, Junya

2016-01-01

The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.

Working Performance Analysis of Rolling Bearings Used in Mining Electric Excavator Crowd Reducer

NASA Astrophysics Data System (ADS)

Zhang, Y. H.; Hou, G.; Chen, G.; Liang, J. F.; Zheng, Y. M.

2017-12-01

Using statistical load data from the digging process and building on a simulation analysis of the crowd reducer system dynamics, a working performance simulation analysis is completed for the rolling bearings used in the crowd reducer of a large mining electric excavator. The simulation analysis covers the internal load distribution, the contact stresses of the rolling elements, and the fatigue life of the rolling bearings. The internal load characteristics of rolling elements in cylindrical roller bearings are obtained. The results of this study show that all rolling bearings satisfy the requirements of contact strength and fatigue life. The rationality of the bearing selection and arrangement is also verified.

A Census of Statistics Requirements at U.S. Journalism Programs and a Model for a "Statistics for Journalism" Course

ERIC Educational Resources Information Center

Martin, Justin D.

2017-01-01

This essay presents data from a census of statistics requirements and offerings at all 4-year journalism programs in the United States (N = 369) and proposes a model of a potential course in statistics for journalism majors. The author proposes that three philosophies underlie a statistics course for journalism students. Such a course should (a)…

Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

PubMed

Bennett, Bradley C; Husby, Chad E

2008-03-28

Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
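
The authors' binomial model is not reproduced in this record. A minimal sketch of the general idea, assuming each family is tested with an exact two-sided binomial test against the flora-wide proportion of medicinal species, might look as follows. The family size, the medicinal count and the flora-wide proportion are invented for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binom_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed one under the null proportion p."""
    observed = binom_pmf(k, n, p)
    # Small relative tolerance so ties are not lost to rounding.
    cutoff = observed * (1.0 + 1e-7)
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1)) if q <= cutoff)
    return min(1.0, total)


# Hypothetical family: 40 species, 18 used medicinally, while 20% of the
# whole flora is medicinal. A small p-value suggests non-random selection.
print(binom_test_two_sided(18, 40, 0.20))
```

With 115 families tested, some correction for multiple comparisons would normally be considered; the record does not say which, if any, the authors applied.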

78 FR 49403 - Approval and Promulgation of Air Quality Implementation Plans; Pennsylvania; Determination of...

Federal Register 2010, 2011, 2012, 2013, 2014

2013-08-14

... requirement for one or more quarters during 2010-2012 monitoring period. EPA has addressed missing data from... recorded values are substituted for the missing data, and the resulting 24-hour design value is compared to... missing data from the Greensburg monitor by performing a statistical analysis of the data, in which a...

Ozone Contamination in Aircraft Cabins. Appendix B: Overview papers. Flight 8 planning to avoid high ozone

NASA Technical Reports Server (NTRS)

Belmont, A. D.

1979-01-01

The problem of preventing cabin ozone from exceeding a given standard was investigated. Statistical analysis of vertical distribution of ozone is summarized. The cost, logistics, maintenance, ability to forecast ozone, and avoiding high ozone concentrations are presented. Filtering approaches and the requirements to remove ozone toxicity are discussed.

Energy resources

NASA Technical Reports Server (NTRS)

1973-01-01

A statistical analysis of the availability of fossil fuels for energy and non-energy production is presented. The cumulative requirements for petroleum, natural gas, and coal are discussed. Alternate forms of energy are described and the advantages and limitations are analyzed. Emphasis is placed on solar energy availability and methods for conversion. The Federal energy research and development funding for energy sources is tabulated.

HAZARDS SUMMARY REPORT FOR A TWO WATT PROMETHIUM-147 FUELED THERMOELECTRIC GENERATOR

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

1959-06-01

Discussions are included of the APU design, vehicle integration, Pm-147 properties, shielding requirements, hazards design criteria, statistical analysis for impact, and radiation protection. The use of Pm-147 makes possible the fabrication of an auxiliary power unit which has applications for low power space missions of <10 watts (electrical). (B.O.G.)

Training in the Food and Beverages Sector in the United Kingdom. Report for the FORCE Programme. First Edition.

ERIC Educational Resources Information Center

Burns, Jim A.; King, Richard

An international team of researchers studied the following aspects of training in the United Kingdom's food and beverage sector: structure and characteristics, business and social context, training and recruitment, and future training requirements. Data were collected from an analysis of social and labor/employment statistics, literature review,…

Mars Exploration Rover Six-Degree-Of-Freedom Entry Trajectory Analysis

NASA Technical Reports Server (NTRS)

Desai, Prasun N.; Schoenenberger, Mark; Cheatwood, F. M.

2003-01-01

The Mars Exploration Rover mission will be the next opportunity for surface exploration of Mars in January 2004. Two rovers will be delivered to the surface of Mars using the same entry, descent, and landing scenario that was developed and successfully implemented by Mars Pathfinder. This investigation describes the trajectory analysis that was performed for the hypersonic portion of the MER entry. In this analysis, a six-degree-of-freedom trajectory simulation of the entry is performed to determine the entry characteristics of the capsules. In addition, a Monte Carlo analysis is performed to statistically assess the robustness of the entry design to off-nominal conditions to assure that all entry requirements are satisfied. The results show that the attitude at peak heating and parachute deployment are well within entry limits. In addition, the parachute deployment dynamic pressure and Mach number are also well within the design requirements.

Volcanic hazard assessment for the Canary Islands (Spain) using extreme value theory

NASA Astrophysics Data System (ADS)

Sobradelo, R.; Martí, J.; Mendoza-Rosas, A. T.; Gómez, G.

2011-10-01

The Canary Islands are an active volcanic region that is densely populated and visited by several million tourists every year. Nearly twenty eruptions have been reported through written chronicles in the last 600 yr, suggesting that the probability of a new eruption in the near future is far from zero. This shows the importance of assessing and monitoring the volcanic hazard of the region in order to reduce and manage its potential volcanic risk, and ultimately contribute to the design of appropriate preparedness plans. Hence, the probabilistic analysis of the volcanic eruption time series for the Canary Islands is an essential step for the assessment of volcanic hazard and risk in the area. Such a series describes complex processes involving different types of eruptions over different time scales. Here we propose a statistical method for calculating the probabilities of future eruptions which is most appropriate given the nature of the documented historical eruptive data. We first characterize the eruptions by their magnitudes, and then carry out a preliminary analysis of the data to establish the requirements for the statistical method. Past studies of eruptive time series used conventional statistics and treated the series as a homogeneous process. In this paper, we will use a method that accounts for the time-dependence of the series and includes rare or extreme events, in the form of few data of large eruptions, since these data require special methods of analysis. Hence, we will use a statistical method from extreme value theory. In particular, we will apply a non-homogeneous Poisson process to the historical eruptive data of the Canary Islands to estimate the probability of having at least one volcanic event of a magnitude greater than one in the upcoming years. This is done in three steps: First, we analyze the historical eruptive series to assess independence and homogeneity of the process. 
Second, we perform a Weibull analysis of the distribution of repose time between successive eruptions. Third, we analyze the non-homogeneous Poisson process with a generalized Pareto distribution as the intensity function.
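
For a non-homogeneous Poisson process, the number of events in a time window is Poisson distributed with mean equal to the integral of the intensity over that window, so the probability of at least one event is one minus the exponential of minus that integral. The sketch below shows only this last step, with a simple trapezoidal integration. The intensity function used in the example is a made-up constant rate, not the generalized Pareto form fitted by the authors.

```python
import math
from typing import Callable


def prob_at_least_one(intensity: Callable[[float], float],
                      t0: float, t1: float, steps: int = 1000) -> float:
    """P(at least one event in [t0, t1]) for a Poisson process with the
    given time-dependent intensity (events per unit time)."""
    if t1 < t0 or steps < 1:
        raise ValueError("need t1 >= t0 and steps >= 1")
    h = (t1 - t0) / steps
    # Trapezoidal rule for the expected number of events in the window.
    total = 0.5 * (intensity(t0) + intensity(t1))
    for i in range(1, steps):
        total += intensity(t0 + i * h)
    expected_events = total * h
    return 1.0 - math.exp(-expected_events)


# Hypothetical constant rate of 0.03 eruptions per year over 20 years.
print(prob_at_least_one(lambda t: 0.03, 0.0, 20.0))
```

A fitted, time-varying intensity can be passed in place of the constant rate without changing the function.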

MetaboLyzer: A Novel Statistical Workflow for Analyzing Post-Processed LC/MS Metabolomics Data

PubMed Central

Mak, Tytus D.; Laiakis, Evagelia C.; Goudarzi, Maryam; Fornace, Albert J.

2014-01-01

Metabolomics, the global study of small molecules in a particular system, has in the last few years risen to become a primary –omics platform for the study of metabolic processes. With the ever-increasing pool of quantitative data yielded from metabolomic research, specialized methods and tools with which to analyze and extract meaningful conclusions from these data are becoming more and more crucial. Furthermore, the depth of knowledge and expertise required to undertake a metabolomics oriented study is a daunting obstacle to investigators new to the field. As such, we have created a new statistical analysis workflow, MetaboLyzer, which aims to both simplify analysis for investigators new to metabolomics, as well as provide experienced investigators the flexibility to conduct sophisticated analysis. MetaboLyzer’s workflow is specifically tailored to the unique characteristics and idiosyncrasies of postprocessed liquid chromatography/mass spectrometry (LC/MS) based metabolomic datasets. It utilizes a wide gamut of statistical tests, procedures, and methodologies that belong to classical biostatistics, as well as several novel statistical techniques that we have developed specifically for metabolomics data. Furthermore, MetaboLyzer conducts rapid putative ion identification and putative biologically relevant analysis via incorporation of four major small molecule databases: KEGG, HMDB, Lipid Maps, and BioCyc. MetaboLyzer incorporates these aspects into a comprehensive workflow that outputs easy to understand statistically significant and potentially biologically relevant information in the form of heatmaps, volcano plots, 3D visualization plots, correlation maps, and metabolic pathway hit histograms. For demonstration purposes, a urine metabolomics data set from a previously reported radiobiology study in which samples were collected from mice exposed to gamma radiation was analyzed. 
MetaboLyzer was able to identify 243 statistically significant ions out of a total of 1942. Numerous putative metabolites and pathways were found to be biologically significant from the putative ion identification workflow. PMID:24266674

Statistical inference for noisy nonlinear ecological dynamic systems.

PubMed

Wood, Simon N

2010-08-26

Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
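
The construction described above can be sketched in a few lines: simulate the summary statistics many times at a given parameter value, estimate their mean and covariance, and score the observed statistics under a multivariate normal density with those moments. The normal form is an assumption of this sketch (the abstract does not spell out the density), and `simulate_stats` is a hypothetical user-supplied function that runs the model once and returns its summary statistics.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def cholesky(a: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def synthetic_loglik(simulate_stats: Callable[[float, random.Random], Sequence[float]],
                     theta: float, observed: Sequence[float],
                     n_sims: int = 200, rng: Optional[random.Random] = None) -> float:
    """Gaussian log-density of the observed statistics, with mean and
    covariance estimated from n_sims simulations at parameter theta."""
    rng = rng or random.Random(0)
    sims = [simulate_stats(theta, rng) for _ in range(n_sims)]
    d = len(observed)
    mu = [sum(s[i] for s in sims) / n_sims for i in range(d)]
    cov = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in sims) / (n_sims - 1)
            for j in range(d)] for i in range(d)]
    low = cholesky(cov)
    # Forward substitution: solve low * z = observed - mu.
    z: List[float] = []
    for i in range(d):
        r = observed[i] - mu[i] - sum(low[i][k] * z[k] for k in range(i))
        z.append(r / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(d))
    return -0.5 * (sum(v * v for v in z) + log_det + d * math.log(2.0 * math.pi))
```

This value can then be fed to any sampler or optimiser that needs a log-likelihood. Because it is itself estimated by simulation, it is noisy, and the number of simulations trades cost against that noise.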

PIVOT: platform for interactive analysis and visualization of transcriptomics data.

PubMed

Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong

2018-01-05

Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.

WASP (Write a Scientific Paper) using Excel - 2: Pivot tables.

PubMed

Grech, Victor

2018-02-01

Data analysis at the descriptive stage and the eventual presentation of results require the tabulation and summarisation of data. This exercise should always precede inferential statistics. Pivot tables and pivot charts are among Excel's most powerful and underutilised features, with tabulation functions that immensely facilitate descriptive statistics. Pivot tables permit users to dynamically summarise and cross-tabulate data, create tables in several dimensions, offer a range of summary statistics and can be modified interactively with instant outputs. Large and detailed datasets are thereby easily manipulated, making pivot tables arguably the best way to explore, summarise and present data from many different angles. This second paper in the WASP series in Early Human Development provides pointers for pivot table manipulation in Excel™. Copyright © 2018 Elsevier B.V. All rights reserved.

Point process statistics in atom probe tomography.

PubMed

Philippe, T; Duguay, S; Grancher, G; Blavette, D

2013-09-01

We present a review of spatial point processes as statistical models that we have designed for the analysis and treatment of atom probe tomography (APT) data. As a major advantage, these methods do not require sampling. The mean distance to nearest neighbour is an attractive approach to exhibit a non-random atomic distribution. A χ² test based on distance distributions to nearest neighbour has been developed to detect deviation from randomness. Best-fit methods based on first nearest neighbour distance (1 NN method) and pair correlation function are presented and compared to assess the chemical composition of tiny clusters. Delaunay tessellation for cluster selection has also been illustrated. These statistical tools have been applied to APT experiments on microelectronics materials. Copyright © 2012 Elsevier B.V. All rights reserved.

Issues in Quantitative Analysis of Ultraviolet Imager (UV) Data: Airglow

NASA Technical Reports Server (NTRS)

Germany, G. A.; Richards, P. G.; Spann, J. F.; Brittnacher, M. J.; Parks, G. K.

1999-01-01

The GGS Ultraviolet Imager (UVI) has proven to be especially valuable in correlative substorm, auroral morphology, and extended statistical studies of the auroral regions. Such studies are based on knowledge of the location, spatial, and temporal behavior of auroral emissions. More quantitative studies, based on absolute radiometric intensities from UVI images, require a more intimate knowledge of the instrument behavior and data processing requirements and are inherently more difficult than studies based on relative knowledge of the oval location. In this study, UVI airglow observations are analyzed and compared with model predictions to illustrate issues that arise in quantitative analysis of UVI images. These issues include instrument calibration, long term changes in sensitivity, and imager flat field response as well as proper background correction. Airglow emissions are chosen for this study because of their relatively straightforward modeling requirements and because of their implications for thermospheric compositional studies. The analysis issues discussed here, however, are identical to those faced in quantitative auroral studies.

Statistical wind analysis for near-space applications

NASA Astrophysics Data System (ADS)

Roney, Jason A.

2007-09-01

Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18-30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20-22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
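
Once a two-parameter Weibull distribution has been fitted, percentile winds such as the 50%, 95% and 99% values follow directly from inverting its cumulative function. The sketch below shows that inversion. The shape and scale values in the example are placeholders chosen for illustration and are not the parameters fitted in the paper.

```python
import math


def weibull_cdf(speed: float, shape: float, scale: float) -> float:
    """Probability that the wind speed does not exceed `speed`."""
    if speed <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((speed / scale) ** shape))


def weibull_quantile(p: float, shape: float, scale: float) -> float:
    """Wind speed that is not exceeded with probability p (0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Placeholder parameters: shape 2.0, scale 12.0 m/s.
for p in (0.50, 0.95, 0.99):
    print(p, round(weibull_quantile(p, 2.0, 12.0), 2))
```

Designing for the 95% or 99% wind rather than the mean plus a standard deviation reflects the heavy upper tail of the distribution, which is the point made in the final sentence of the abstract.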

Analysis of spontaneous MEG activity in mild cognitive impairment and Alzheimer's disease using spectral entropies and statistical complexity measures

NASA Astrophysics Data System (ADS)

Bruña, Ricardo; Poza, Jesús; Gómez, Carlos; García, María; Fernández, Alberto; Hornero, Roberto

2012-06-01

Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on Euclidean distance ED) and three statistical complexities (based on Lopez Ruiz-Mancini-Calbet complexity LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p < 0.01). In addition, statistically significant differences between MCI subjects and controls were achieved by ED and LMC (p < 0.05). In order to assess the diagnostic ability of the parameters, a linear discriminant analysis with a leave-one-out cross-validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
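
The three entropies named above can all be computed from a power spectrum once it is normalised to sum to one, so that it can be treated as a probability distribution. The sketch below uses the textbook definitions. The normalisation conventions, the entropic index `q` and the example spectrum are assumptions of this illustration and may differ from those used in the study.

```python
import math
from typing import List, Sequence


def normalise(power: Sequence[float]) -> List[float]:
    """Scale a non-negative power spectrum so that it sums to one."""
    total = sum(power)
    if total <= 0.0:
        raise ValueError("spectrum must have positive total power")
    return [p / total for p in power]


def shannon_entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability bins contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def tsallis_entropy(p: Sequence[float], q: float) -> float:
    """Tsallis entropy of index q; reduces to Shannon entropy as q -> 1."""
    if q == 1.0:
        return shannon_entropy(p)
    return (1.0 - sum(x**q for x in p if x > 0.0)) / (q - 1.0)


def renyi_entropy(p: Sequence[float], q: float) -> float:
    """Renyi entropy of index q; reduces to Shannon entropy as q -> 1."""
    if q == 1.0:
        return shannon_entropy(p)
    return math.log(sum(x**q for x in p if x > 0.0)) / (1.0 - q)


# Hypothetical four-bin spectrum with power concentrated in one bin.
spectrum = normalise([8.0, 1.0, 0.5, 0.5])
print(shannon_entropy(spectrum), tsallis_entropy(spectrum, 2.0), renyi_entropy(spectrum, 2.0))
```

A flat spectrum gives the maximum value of each measure, while a spectrum dominated by a single frequency gives a value near zero, which is why these quantities are read as measures of irregularity.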

User's manual for the Simulated Life Analysis of Vehicle Elements (SLAVE) model

NASA Technical Reports Server (NTRS)

Paul, D. D., Jr.

1972-01-01

The simulated life analysis of vehicle elements model was designed to perform statistical simulation studies for any constant loss rate. The outputs of the model consist of the total number of stages required, stages successfully completing their lifetime, and average stage flight life. This report contains a complete description of the model. Users' instructions and interpretation of input and output data are presented such that a user with little or no prior programming knowledge can successfully implement the program.

Comments on: blood product transfusion in emergency department patients: a case control study of practice patterns and impact on outcome.

PubMed

Karami, Manoochehr; Khazaei, Salman

2017-12-06

Clinical decision making based on study results requires valid and correct data collection and analysis. However, there are some common methodological and statistical issues which may be ignored by authors. In an individually matched case-control design, bias arises from using an unconditional analysis instead of a conditional analysis. Using unconditional logistic regression for matched data imposes a large number of nuisance parameters, which may result in seriously biased estimates.
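
For the simplest case of 1:1 matching with a binary exposure, the conditional analysis has a closed form: the conditional maximum likelihood estimate of the odds ratio is the ratio of the two kinds of discordant pairs, and concordant pairs carry no information. The sketch below computes that estimate. The pair counts are invented for illustration and do not come from the study being commented on.

```python
def matched_pair_odds_ratio(case_only_exposed: int, control_only_exposed: int) -> float:
    """Conditional odds ratio estimate for 1:1 matched case-control pairs.

    case_only_exposed    -- pairs where only the case was exposed
    control_only_exposed -- pairs where only the control was exposed
    """
    if case_only_exposed < 0 or control_only_exposed < 0:
        raise ValueError("pair counts must be non-negative")
    if control_only_exposed == 0:
        raise ValueError("estimate undefined without control-only-exposed pairs")
    return case_only_exposed / control_only_exposed


# Hypothetical study: 30 pairs with only the case exposed, 10 with only
# the control exposed.
print(matched_pair_odds_ratio(30, 10))
```

A standard textbook result for this setting is that an unconditional logistic model with a separate intercept for every pair tends toward the square of this ratio, which illustrates how severe the bias from nuisance parameters can be.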

Proteomic Workflows for Biomarker Identification Using Mass Spectrometry — Technical and Statistical Considerations during Initial Discovery

PubMed Central

Orton, Dennis J.; Doucette, Alan A.

2013-01-01

Identification of biomarkers capable of differentiating between pathophysiological states of an individual is a laudable goal in the field of proteomics. Protein biomarker discovery generally employs high throughput sample characterization by mass spectrometry (MS), being capable of identifying and quantifying thousands of proteins per sample. While MS-based technologies have rapidly matured, the identification of truly informative biomarkers remains elusive, with only a handful of clinically applicable tests stemming from proteomic workflows. This underlying lack of progress is attributed in large part to erroneous experimental design, biased sample handling, as well as improper statistical analysis of the resulting data. This review will discuss in detail the importance of experimental design and provide some insight into the overall workflow required for biomarker identification experiments. Proper balance between the degree of biological vs. technical replication is required for confident biomarker identification. PMID:28250400

Coronal emission-line polarization from the statistical equilibrium of magnetic sublevels. II. Fe XIV 5303 A

DOE Office of Scientific and Technical Information (OSTI.GOV)

House, L.L.; Querfeld, C.W.; Rees, D.E.

1982-04-15

Coronal magnetic fields influence the intensity and linear polarization of light scattered by coronal Fe XIV ions. Interpreting polarization measurements of Fe XIV 5303 A coronal emission requires a detailed understanding of the dependence of the emitted Stokes vector on coronal magnetic field direction, electron density, and temperature, and on height of origin. The required dependence is included in the solutions of statistical equilibrium for the ion, which are solved explicitly for 34 magnetic sublevels in both the ground and four excited terms. The full solutions are reduced to equivalent simple analytic forms which clearly show the required dependence on coronal conditions. The analytic forms of the reduced solutions are suitable for routine analysis of 5303 green line polarimetric data obtained at Pic du Midi and from the Solar Maximum Mission Coronagraph/Polarimeter.

Assessment of statistic analysis in non-radioisotopic local lymph node assay (non-RI-LLNA) with alpha-hexylcinnamic aldehyde as an example.

PubMed

Takeyoshi, Masahiro; Sawaki, Masakuni; Yamasaki, Kanji; Kimber, Ian

2003-09-30

The murine local lymph node assay (LLNA) is used for the identification of chemicals that have the potential to cause skin sensitization. However, it requires specific facilities and handling procedures to accommodate a radioisotopic (RI) endpoint. We have developed a non-radioisotopic (non-RI) endpoint for the LLNA based on BrdU incorporation to avoid the use of RI. Although this alternative method appears viable in principle, it is somewhat less sensitive than the standard assay. In this study, we report investigations to determine whether statistical analysis can improve the sensitivity of a non-RI LLNA procedure with alpha-hexylcinnamic aldehyde (HCA) in two separate experiments. The alternative non-RI method required HCA concentrations of greater than 25% to elicit a positive response based on the criterion for classification as a skin sensitizer in the standard LLNA. Nevertheless, dose responses to HCA in the alternative method were consistent in both experiments, and we examined whether the use of an endpoint based upon the statistical significance of induced changes in LNC turnover, rather than an SI of 3 or greater, might provide additional sensitivity. The results reported here demonstrate that, with HCA at least, significant responses were recorded in each of two experiments following exposure of mice to 25% HCA. These data suggest that this approach may be more satisfactory, at least when BrdU incorporation is measured. However, this modification of the LLNA remains rather less sensitive than the standard method, even when a statistical endpoint is employed. Taken together, the data reported here suggest that a modified LLNA in which BrdU is used in place of radioisotope incorporation shows some promise, but that in its present form, even with the use of a statistical endpoint, it lacks some of the sensitivity of the standard method. The challenge is to develop strategies for further refinement of this approach.

Visualizing statistical significance of disease clusters using cartograms.

PubMed

Kronenfeld, Barry J; Wong, David W S

2017-05-15

Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, we do not have existing guidelines for visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster is determinate from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference of aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions. The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. 
Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.

Local sensitivity analysis for inverse problems solved by singular value decomposition

USGS Publications Warehouse

Hill, M.C.; Nolan, B.T.

2010-01-01

Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. 
All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) CSS/PCC can be more awkward to use because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
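The distinction this abstract draws between individual sensitivity (CSS) and interdependence (PCC) can be made concrete with a two-parameter sketch. The code below is an illustration under stated assumptions, not the PEST or USGS implementation: it uses the commonly cited definition of CSS as the root mean square of sensitivities scaled by the parameter value and the square root of the observation weight (for native, not log-transformed, parameters), and it takes the PCC from the inverse of the weighted normal matrix, written out by hand for the 2x2 case. The Jacobian values and function names are made up.

```python
import math


def composite_scaled_sensitivity(jacobian, params, weights):
    """CSS_j = sqrt(mean_i (J_ij * b_j * sqrt(w_i))^2) for each parameter j."""
    n_obs = len(jacobian)
    css = []
    for j, b in enumerate(params):
        total = sum(
            (row[j] * b * math.sqrt(w)) ** 2 for row, w in zip(jacobian, weights)
        )
        css.append(math.sqrt(total / n_obs))
    return css


def parameter_correlation_2(jacobian, weights):
    """Correlation between two parameters from the inverse of J^T W J.

    For a symmetric 2x2 matrix [[a11, a12], [a12, a22]] the off-diagonal
    correlation of the inverse simplifies to -a12 / sqrt(a11 * a22).
    """
    a11 = sum(w * row[0] * row[0] for row, w in zip(jacobian, weights))
    a12 = sum(w * row[0] * row[1] for row, w in zip(jacobian, weights))
    a22 = sum(w * row[1] * row[1] for row, w in zip(jacobian, weights))
    return -a12 / math.sqrt(a11 * a22)


# Made-up sensitivities: both columns respond almost identically, so each
# parameter is individually sensitive but the pair is strongly correlated.
jacobian = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9]]
print(composite_scaled_sensitivity(jacobian, [1.0, 1.0], [1.0, 1.0, 1.0]))
print(parameter_correlation_2(jacobian, [1.0, 1.0, 1.0]))
```

A pair like this is the situation the abstract describes for BD1 and WFC1: a parameter can look important in a combined statistic because of its correlation with another parameter rather than its own sensitivity.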

Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study.

PubMed

van der Krieke, Lian; Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith Gm; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

2015-08-07

Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher's tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Results suggest that automated analysis and interpretation of times series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. 
Analysis of additional datasets is needed in order to validate and refine the application for general use.
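The core statistical step that a tool of this kind automates, fitting a vector autoregressive model, can be sketched without any libraries for the smallest case. The code below is not AutoVAR and does not reproduce its model search; it is a minimal illustration that assumes a bivariate series with zero mean, a single lag and no intercept, and estimates the coefficient matrix by ordinary least squares. The function names and the simulated data are hypothetical.

```python
import random


def simulate_var1(coef, n_steps, rng):
    """Simulate y_t = A y_{t-1} + e_t with independent standard-normal noise."""
    y = [0.0, 0.0]
    series = []
    for _step in range(n_steps):
        y = [
            coef[0][0] * y[0] + coef[0][1] * y[1] + rng.gauss(0.0, 1.0),
            coef[1][0] * y[0] + coef[1][1] * y[1] + rng.gauss(0.0, 1.0),
        ]
        series.append(y)
    return series


def fit_var1(series):
    """Least-squares estimate of A in y_t = A y_{t-1} + e_t (no intercept)."""
    lagged, current = series[:-1], series[1:]
    # Entries of the 2x2 normal matrix built from the lagged values.
    s11 = sum(x[0] * x[0] for x in lagged)
    s12 = sum(x[0] * x[1] for x in lagged)
    s22 = sum(x[1] * x[1] for x in lagged)
    det = s11 * s22 - s12 * s12
    coef = []
    for k in range(2):  # one regression per equation of the system
        b1 = sum(x[0] * y[k] for x, y in zip(lagged, current))
        b2 = sum(x[1] * y[k] for x, y in zip(lagged, current))
        coef.append([(s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det])
    return coef


true_coef = [[0.5, 0.2], [0.0, 0.3]]
estimate = fit_var1(simulate_var1(true_coef, 5000, random.Random(0)))
print(estimate)
```

In this toy system the off-diagonal entry in the first row is the quantity a Granger causality test would examine: it measures how much the second variable at time t-1 helps predict the first variable at time t.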

Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study

PubMed Central

Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

2015-01-01

Background Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions Results suggest that automated analysis and interpretation of times series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. 
Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160

Basic Student Charges at Postsecondary Institutions: Academic Year 1992-93. Tuition and Required Fees and Room and Board Charges at 4-year, 2-year, and Public Less-than-2-year Institutions. Statistical Analysis Report.

ERIC Educational Resources Information Center

Broyles, Susan G.; Morgan, Frank B.

This report lists the typical tuition and required fees and room and board charges for academic year 1992-93 at nearly 5,000 4-year, 2-year, and public less-than-2-year postsecondary institutions in the United States and its outlying areas. Included are tuition and fee charges to in-state and out-of-state students at the undergraduate and graduate…

Spacecraft platform cost estimating relationships

NASA Technical Reports Server (NTRS)

Gruhl, W. M.

1972-01-01

The three main cost areas of unmanned satellite development are discussed. The areas are identified as: (1) the spacecraft platform (SCP), (2) the payload or experiments, and (3) the postlaunch ground equipment and operations. The SCP normally accounts for over half of the total project cost and accurate estimates of SCP costs are required early in project planning as a basis for determining total project budget requirements. The development of single formula SCP cost estimating relationships (CER) from readily available data by statistical linear regression analysis is described. The advantages of single formula CER are presented.

A study of the portability of an Ada system in the software engineering laboratory (SEL)

NASA Technical Reports Server (NTRS)

Jun, Linda O.; Valett, Susan Ray

1990-01-01

A particular porting effort is discussed, and various statistics on analyzing the portability of Ada and the total staff months (overall and by phase) required to accomplish the rehost, are given. This effort is compared to past experiments on the rehosting of FORTRAN systems. The discussion includes an analysis of the types of errors encountered during the rehosting, the changes required to rehost the system, experiences with the Alsys IBM Ada compiler, the impediments encountered, and the lessons learned during this study.

Statistical Approaches to Adjusting Weights for Dependent Arms in Network Meta-analysis.

PubMed

Su, Yu-Xuan; Tu, Yu-Kang

2018-05-22

Network meta-analysis compares multiple treatments in terms of their efficacy and harm by including evidence from randomized controlled trials. Most clinical trials use a parallel design, where patients are randomly allocated to different treatments and receive only one treatment. However, some trials use within-person designs such as split-body, split-mouth and cross-over designs, where each patient may receive more than one treatment. Data from treatment arms within these trials are no longer independent, so the correlations between dependent arms need to be accounted for within the statistical analyses. Ignoring these correlations may result in incorrect conclusions. The main objective of this study is to develop statistical approaches to adjusting weights for dependent arms within special design trials. In this study, we demonstrate the following three approaches: the data augmentation approach, the adjusting variance approach, and the reducing weight approach. These three methods can be readily applied in current statistical tools such as R and STATA. An example of periodontal regeneration was used to demonstrate how these approaches could be undertaken and implemented within statistical software packages, and to compare results from different approaches. The adjusting variance approach can be implemented within the network package in STATA, while the reducing weight approach requires computer programming to set up the within-study variance-covariance matrix.
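Why dependent arms change the weights can be seen from the variance of a treatment contrast. The sketch below uses the standard identity for the variance of a difference of two correlated estimates; it is offered as background to the idea of adjusting variances and is not the authors' procedure. The correlation value is an assumption that would have to come from the trial or from external evidence, and the function names are hypothetical.

```python
def contrast_variance(se_arm1, se_arm2, rho):
    """Variance of (arm1 - arm2) when the arm estimates have correlation rho."""
    return se_arm1 ** 2 + se_arm2 ** 2 - 2.0 * rho * se_arm1 * se_arm2


def inverse_variance_weight(se_arm1, se_arm2, rho):
    """Weight a contrast would receive in an inverse-variance analysis."""
    return 1.0 / contrast_variance(se_arm1, se_arm2, rho)


# Made-up standard errors for a split-mouth trial; rho = 0 treats the
# arms as independent, rho = 0.5 acknowledges the within-person design.
for rho in (0.0, 0.5):
    print(rho, contrast_variance(0.5, 0.5, rho), inverse_variance_weight(0.5, 0.5, rho))
```

With a positive correlation the contrast is estimated more precisely than an independent-arms analysis assumes, so ignoring the design gives the trial too little weight.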

Bed Capacity Planning Using Stochastic Simulation Approach in Cardiac-surgery Department of Teaching Hospitals, Tehran, Iran

PubMed Central

TORABIPOUR, Amin; ZERAATI, Hojjat; ARAB, Mohammad; RASHIDIAN, Arash; AKBARI SARI, Ali; SARZAIEM, Mahmuod Reza

2016-01-01

Background: To determine the number of hospital beds required in cardiac surgery departments using a stochastic simulation approach. Methods: This study was performed from Mar 2011 to Jul 2012 in three phases: first, data were collected from 649 patients in the cardiac surgery departments of two large teaching hospitals (in Tehran, Iran); second, statistical analysis was performed and a multivariate linear regression model was formulated to determine factors that affect patients' length of stay; third, a stochastic simulation system (from admission to discharge) was developed based on key parameters to estimate required bed capacity. Results: The current cardiac surgery department with 33 beds can admit patients on only 90.7% of days (4535 d) and will need more than 33 beds on only 9.3% of days (efficient cut-off point). According to the simulation method, the studied cardiac surgery department will require 41–52 beds to admit all patients over the next 12 years. Finally, a one-day reduction in length of stay leads to a decrease in need of two hospital beds annually. Conclusion: Variation in length of stay and the factors affecting it can affect the number of beds required. Statistical and stochastic simulation models are applicable and useful methods for estimating and managing hospital beds based on key hospital parameters. PMID:27957466
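The kind of admission-to-discharge simulation described here can be illustrated with a small sketch. This is not the authors' model: it assumes Poisson daily admissions and exponentially distributed lengths of stay rounded to whole days, and all parameter values are invented rather than taken from the study. It shows how a simulated occupancy series yields both the share of days a given bed count suffices and the bed count needed for a target coverage.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_daily_occupancy(n_days, mean_admissions, mean_stay, rng):
    """Return the number of beds demanded on each day, with no capacity cap."""
    remaining_days = []  # days left in hospital for each current patient
    occupancy = []
    for _day in range(n_days):
        for _patient in range(poisson_draw(mean_admissions, rng)):
            stay = max(1, round(rng.expovariate(1.0 / mean_stay)))
            remaining_days.append(stay)
        occupancy.append(len(remaining_days))
        # Patients on their last day are discharged before the next day.
        remaining_days = [d - 1 for d in remaining_days if d > 1]
    return occupancy


def share_of_days_within(occupancy, beds):
    """Fraction of days on which demand does not exceed the bed count."""
    return sum(1 for o in occupancy if o <= beds) / len(occupancy)


def beds_needed(occupancy, coverage):
    """Smallest bed count that meets demand on at least `coverage` of days."""
    ordered = sorted(occupancy)
    index = max(0, math.ceil(coverage * len(ordered)) - 1)
    return ordered[index]


occupancy = simulate_daily_occupancy(365, 3.0, 8.0, random.Random(1))
print(share_of_days_within(occupancy, 33), beds_needed(occupancy, 0.9))
```

Rerunning the simulation with a shorter mean stay shows how the required bed count responds to length of stay, which is the relationship the abstract's final result refers to.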

Relationship between High School Mathematics Grade and Number of Attempts Required to Pass the Medication Calculation Test in Nurse Education: An Explorative Study

PubMed Central

Alteren, Johanne; Nerdal, Lisbeth

2015-01-01

In Norwegian nurse education, students are required to achieve a perfect score in a medication calculation test before undertaking their first practice period during the second semester. Passing the test is a challenge, and students often require several attempts. Adverse events in medication administration can be related to poor mathematical skills. The purpose of this study was to explore the relationship between high school mathematics grade and the number of attempts required to pass the medication calculation test in nurse education. The study used an exploratory design. The participants were 90 students enrolled in a bachelor’s nursing program. They completed a self-report questionnaire, and statistical analysis was performed. The results provided no basis for the conclusion that a statistical relationship existed between high school mathematics grade and number of attempts required to pass the medication calculation test. Regardless of their grades in mathematics, 43% of the students passed the medication calculation test on the first attempt. All of the students who had achieved grade 5 had passed by the third attempt. High grades in mathematics were not crucial to passing the medication calculation test. Nonetheless, the grade may be important in ensuring a pass within fewer attempts. PMID:27417767

Power in randomized group comparisons: the value of adding a single intermediate time point to a traditional pretest-posttest design.

PubMed

Venter, Anre; Maxwell, Scott E; Bolig, Erika

2002-06-01

Adding a pretest as a covariate to a randomized posttest-only design increases statistical power, as does the addition of intermediate time points to a randomized pretest-posttest design. Although typically 5 waves of data are required in this instance to produce meaningful gains in power, a 3-wave intensive design allows the evaluation of the straight-line growth model and may reduce the effect of missing data. The authors identify the statistically most powerful method of data analysis in the 3-wave intensive design. If straight-line growth is assumed, the pretest-posttest slope must assume fairly extreme values for the intermediate time point to increase power beyond the standard analysis of covariance on the posttest with the pretest as covariate, ignoring the intermediate time point.

[Pitfalls in informed consent: a statistical analysis of malpractice law suits].

PubMed

Echigo, Junko

2014-05-01

In medical malpractice law suits, the notion of informed consent is often relevant in assessing whether negligence can be attributed to the medical practitioner who has caused injury to a patient. Furthermore, it is not rare that courts award damages for a lack of appropriate informed consent alone. In this study, two results were arrived at from a statistical analysis of medical malpractice law suits. One, unexpectedly, was that the severity of a patient's illness made no significant difference to whether damages were awarded. The other was that cases of typical medical treatment that national medical insurance does not cover were involved significantly more often than insured treatment cases. In cases where damages were awarded, the courts required more disclosure and written documents of information by medical practitioners, especially about complications and adverse effects that the patient might suffer.

Low-dose ionizing radiation increases the mortality risk of solid cancers in nuclear industry workers: A meta-analysis.

PubMed

Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu

2018-05-01

Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only a few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to focus on solid cancer mortality risk from LDIR in the nuclear industry using standard mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects by IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality.
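How a pooled SMR and its confidence interval can be obtained from several cohorts is sketched below. The abstract does not say whether a fixed-effect or random-effects model was used, so this is only one common option: inverse-variance pooling on the log scale, with the variance of each log SMR approximated by the reciprocal of the observed deaths. The study counts are invented and the function name is hypothetical.

```python
import math


def pooled_smr(studies):
    """Fixed-effect pooled SMR with an approximate 95% confidence interval.

    studies is a list of (observed_deaths, expected_deaths). Each log SMR is
    weighted by its observed count, since var(log SMR) is roughly 1/observed.
    """
    weights = [observed for observed, _expected in studies]
    log_smrs = [math.log(observed / expected) for observed, expected in studies]
    total_weight = sum(weights)
    pooled_log = sum(w * l for w, l in zip(weights, log_smrs)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * se),
        math.exp(pooled_log + 1.96 * se),
    )


# Made-up cohorts: (observed deaths, expected deaths).
print(pooled_smr([(80, 100), (45, 50), (120, 140)]))
```

A pooled SMR below 1 means fewer deaths than expected from the reference population, so the direction of the estimate matters as much as its statistical significance when reading results of this kind.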

The effect of dexmedetomidine continuous infusion as an adjuvant to general anesthesia on sevoflurane requirements: A study based on entropy analysis.

PubMed

Patel, Chirag Ramanlal; Engineer, Smita R; Shah, Bharat J; Madhu, S

2013-07-01

Dexmedetomidine, an α2 agonist used as an adjuvant in general anesthesia, has anesthetic- and analgesic-sparing properties. This study aimed to evaluate the effect of continuous infusion of dexmedetomidine alone, without use of opioids, on the requirement of sevoflurane during general anesthesia with continuous monitoring of depth of anesthesia by entropy analysis. Sixty patients were randomly divided into 2 groups of 30 each. In group A, fentanyl 2 mcg/kg was given, while in group B, dexmedetomidine was given intravenously as a loading dose of 1 mcg/kg over 10 min prior to induction. After induction with thiopentone in group B, dexmedetomidine was given as an infusion at a dose of 0.2-0.8 mcg/kg. Sevoflurane was used as the inhalation agent in both groups. Hemodynamic variables, sevoflurane inspired fraction (FIsevo), sevoflurane expired fraction (ETsevo), and entropy (response entropy and state entropy) were continuously recorded. Statistical analysis was done by unpaired Student's t-test and Chi-square test for continuous and categorical variables, respectively. A P-value < 0.05 was considered significant. The use of dexmedetomidine with sevoflurane was associated with a statistically significant decrease in ETsevo at 5 minutes post-intubation (1.49 ± 0.11) and 60 minutes post-intubation (1.11 ± 0.28) as compared to group A [1.73 ± 0.30 (5 minutes); 1.68 ± 0.50 (60 minutes)]. There was an average 21.5% decrease in ETsevo in group B as compared to group A. Dexmedetomidine, as an adjuvant in general anesthesia, decreases the requirement of sevoflurane for maintaining adequate depth of anesthesia.
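The unpaired Student's t-test named in the abstract can be reproduced from summary statistics alone. The sketch below assumes that the reported "±" values are standard deviations and that each group contains 30 patients; the abstract states the group sizes but does not say what the "±" values represent, so the first assumption is a guess. The function name is hypothetical.

```python
import math


def student_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# ETsevo at 5 minutes post-intubation: group A versus group B.
t_stat, df = student_t_from_summary(1.73, 0.30, 30, 1.49, 0.11, 30)
print(t_stat, df)
```

Because the two standard deviations differ considerably, a Welch test that does not pool the variances would be a reasonable sensitivity check; with equal group sizes it gives the same t statistic but fewer degrees of freedom.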

Specialized data analysis for the Space Shuttle Main Engine and diagnostic evaluation of advanced propulsion system components

NASA Technical Reports Server (NTRS)

1993-01-01

The Marshall Space Flight Center is responsible for the development and management of advanced launch vehicle propulsion systems, including the Space Shuttle Main Engine (SSME), which is presently operational, and the Space Transportation Main Engine (STME) under development. The SSME's provide high performance within stringent constraints on size, weight, and reliability. Based on operational experience, continuous design improvement is in progress to enhance system durability and reliability. Specialized data analysis and interpretation is required in support of SSME and advanced propulsion system diagnostic evaluations. Comprehensive evaluation of the dynamic measurements obtained from test and flight operations is necessary to provide timely assessment of the vibrational characteristics indicating the operational status of turbomachinery and other critical engine components. Efficient performance of this effort is critical due to the significant impact of dynamic evaluation results on ground test and launch schedules, and requires direct familiarity with SSME and derivative systems, test data acquisition, and diagnostic software. Detailed analysis and evaluation of dynamic measurements obtained during SSME and advanced system ground test and flight operations was performed including analytical/statistical assessment of component dynamic behavior, and the development and implementation of analytical/statistical models to efficiently define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational condition. In addition, the SSME and J-2 data will be applied to develop vibroacoustic environments for advanced propulsion system components, as required. This study will provide timely assessment of engine component operational status, identify probable causes of malfunction, and indicate feasible engineering solutions. This contract will be performed through accomplishment of negotiated task orders.

Impact of Requirements Quality on Project Success or Failure

NASA Astrophysics Data System (ADS)

Tamai, Tetsuo; Kamata, Mayumi Itakura

We are interested in the relationship between the quality of the requirements specifications for software projects and the subsequent outcome of the projects. To examine this relationship, we investigated 32 projects started and completed between 2003 and 2005 by the software development division of a large company in Tokyo. The company has collected reliable data on requirements specification quality, as evaluated by software quality assurance teams, and overall project performance data relating to cost and time overruns. The data for requirements specification quality were first converted into a multiple-dimensional space, with each dimension corresponding to an item of the recommended structure for software requirements specifications (SRS) defined in IEEE Std. 830-1998. We applied various statistical analysis methods to the SRS quality data and project outcomes.

Reply on Comment on "High resolution coherence analysis between planetary and climate oscillations" by S. Holm

NASA Astrophysics Data System (ADS)

Scafetta, Nicola

2018-07-01

Holm (ASR, 2018) claims that Scafetta (ASR 57, 2121-2135, 2016) is "irreproducible" because I would have left "undocumented" the values of two parameters (a reduced-rank index p and a regularization term δ) that he claimed are required in the Magnitude Squared Coherence Canonical Correlation Analysis (MSC-CCA). Yet, my analysis did not require these two parameters. In fact: (1) using the MSC-CCA reduced-rank option neither changes the result nor was needed, since Scafetta (2016) statistically evaluated the significance of the coherence spectral peaks; (2) the analysis algorithm neither contains nor needed the regularization term δ. Herein I show that Holm could not replicate Scafetta (2016) because he used different analysis algorithms. In fact, although Holm claimed to be using MSC-CCA, for his Figs. 2-4 he used a MatLab code labeled "gcs_cca_1D.m" (see paragraph 2 of his Section 3), which Holm also modified, that implements a different methodology known as the Generalized Coherence Spectrum using the Canonical Correlation Analysis (GCS-CCA). This code is herein demonstrated to be unreliable under specific statistical circumstances such as those required to replicate Scafetta (2016). On the contrary, the MSC-CCA method is stable and reliable. Moreover, Holm also could not replicate my result in his Fig. 5 because there he used the basic Welch MSC algorithm, erroneously equating it to MSC-CCA. Herein I clarify step-by-step how to proceed with the correct analysis, and I fully confirm the 95% significance of my results. I add data and codes to easily replicate my results.


Reinventing Biostatistics Education for Basic Scientists

PubMed Central

Weissgerber, Tracey L.; Garovic, Vesna D.; Milin-Lazovic, Jelena S.; Winham, Stacey J.; Obradovic, Zoran; Trzeciakowski, Jerome P.; Milic, Natasa M.

2016-01-01

Numerous studies demonstrating that statistical errors are common in basic science publications have led to calls to improve statistical training for basic scientists. In this article, we sought to evaluate statistical requirements for PhD training and to identify opportunities for improving biostatistics education in the basic sciences. We provide recommendations for improving statistics training for basic biomedical scientists, including: 1. Encouraging departments to require statistics training, 2. Tailoring coursework to the students’ fields of research, and 3. Developing tools and strategies to promote education and dissemination of statistical knowledge. We also provide a list of statistical considerations that should be addressed in statistics education for basic scientists. PMID:27058055

How can my research paper be useful for future meta-analyses on forest restoration practices?

Treesearch

Enrique Andivia; Pedro Villar‑Salvador; Juan A. Oliet; Jaime Puertolas; R. Kasten Dumroese

2018-01-01

Statistical meta-analysis is a powerful and useful tool to quantitatively synthesize the information conveyed in published studies on a particular topic. It allows identifying and quantifying overall patterns and exploring causes of variation. The inclusion of published works in meta-analyses requires, however, a minimum quality standard of the reported data and...

Generalized linear models and point count data: statistical considerations for the design and analysis of monitoring studies

Treesearch

Nathaniel E. Seavy; Suhel Quader; John D. Alexander; C. John Ralph

2005-01-01

The success of avian monitoring programs to effectively guide management decisions requires that studies be efficiently designed and data be properly analyzed. A complicating factor is that point count surveys often generate data with non-normal distributional properties. In this paper we review methods of dealing with deviations from normal assumptions, and we focus...

Constructed Response Tests in the NELS:88 High School Effectiveness Study. National Education Longitudinal Study of 1988 Second Followup. Statistical Analysis Report.

ERIC Educational Resources Information Center

Pollock, Judith M.; And Others

This report describes an experiment in constructed response testing undertaken in conjunction with the National Education Longitudinal Study of 1988 (NELS:88). Constructed response questions are those that require students to produce their own response rather than selecting the correct answer from several options. Participants in this experiment…

Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

NASA Astrophysics Data System (ADS)

Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

2018-03-01

This study aims to design an Android Statistics Data Analysis application that can be accessed through mobile devices, making it easier for users to access. The Statistics Data Analysis application covers various topics in basic statistics along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used by students, lecturers, and users who need the results of statistical calculations quickly and in an easily understood form. The Android application is developed using the Java programming language. The server side uses PHP with the CodeIgniter framework, and the database is MySQL. The system development methodology is the Waterfall methodology, with the stages of analysis, design, coding, testing, implementation, and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make it easier for students to understand statistical analysis on mobile devices.

Kurtosis Approach Nonlinear Blind Source Separation

NASA Technical Reports Server (NTRS)

Duong, Vu A.; Stubbemd, Allen R.

2005-01-01

In this paper, we introduce a new algorithm for blind source signal separation for post-nonlinear mixtures. The mixtures are assumed to be linearly mixed from unknown sources first and then distorted by memoryless nonlinear functions. The nonlinear functions are assumed to be smooth and can be approximated by polynomials. Both the coefficients of the unknown mixing matrix and the coefficients of the approximated polynomials are estimated by the gradient descent method conditional on the higher order statistical requirements. The results of simulation experiments presented in this paper demonstrate the validity and usefulness of our approach for nonlinear blind source signal separation. Keywords: Independent Component Analysis, Kurtosis, Higher order statistics.

Neuronal Correlation Parameter and the Idea of Thermodynamic Entropy of an N-Body Gravitationally Bounded System.

PubMed

Haranas, Ioannis; Gkigkitzis, Ioannis; Kotsireas, Ilias; Austerlitz, Carlos

2017-01-01

Understanding how the brain encodes information and performs computation requires statistical and functional analysis. Given the complexity of the human brain, simple methods that facilitate the interpretation of statistical correlations among different brain regions can be very useful. In this report we introduce a numerical correlation measure that may serve the interpretation of correlational neuronal data, and may assist in the evaluation of different brain states. The description of the dynamical brain system through a global numerical measure may indicate the presence of an action principle, which may facilitate the application of physics principles in the study of the human brain and cognition.

Location error uncertainties - an advanced using of probabilistic inverse theory

NASA Astrophysics Data System (ADS)

Debski, Wojciech

2016-04-01

The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analyzed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. While estimating the location of earthquake foci is relatively simple, a quantitative estimation of the location accuracy is a really challenging task even if the probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling, and a priori uncertainties. In this presentation we address this task when the statistics of observational and/or modelling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we illustrate an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on uncertainties of the solution found.
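
The entropy measure mentioned in this abstract can be illustrated with a minimal sketch, assuming the a posteriori distribution has been discretized into a list of non-negative weights (for example, posterior values on a grid of candidate source locations). The function name and the discretization are assumptions for illustration and are not taken from the presentation.

```python
import math

def shannon_entropy(weights):
    """Shannon entropy (in nats) of a discrete distribution.

    `weights` are non-negative values, e.g. posterior values on a grid
    (a hypothetical discretization); they are normalised to sum to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    entropy = 0.0
    for w in weights:
        p = w / total
        if p > 0:  # terms with p == 0 contribute nothing
            entropy -= p * math.log(p)
    return entropy
```

A sharply peaked distribution gives an entropy near zero, while a flat one over n cells gives log(n), so larger values correspond to a more diffuse, less certain location estimate.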

Exploring dangerous neighborhoods: Latent Semantic Analysis and computing beyond the bounds of the familiar

PubMed Central

Cohen, Trevor; Blatter, Brett; Patel, Vimla

2005-01-01

Certain applications require computer systems to approximate intended human meaning. This is achievable in constrained domains with a finite number of concepts. Areas such as psychiatry, however, draw on concepts from the world-at-large. A knowledge structure with broad scope is required to comprehend such domains. Latent Semantic Analysis (LSA) is an unsupervised corpus-based statistical method that derives quantitative estimates of the similarity between words and documents from their contextual usage statistics. The aim of this research was to evaluate the ability of LSA to derive meaningful associations between concepts relevant to the assessment of dangerousness in psychiatry. An expert reference model of dangerousness was used to guide the construction of a relevant corpus. Derived associations between words in the corpus were evaluated qualitatively. A similarity-based scoring function was used to assign dangerousness categories to discharge summaries. LSA was shown to derive intuitive relationships between concepts and correlated significantly better than random with human categorization of psychiatric discharge summaries according to dangerousness. The use of LSA to derive a simulated knowledge structure can extend the scope of computer systems beyond the boundaries of constrained conceptual domains. PMID:16779020

Vision-based localization of the center of mass of large space debris via statistical shape analysis

NASA Astrophysics Data System (ADS)

Biondi, G.; Mauro, S.; Pastorelli, S.

2017-08-01

The current overpopulation of artificial objects orbiting the Earth has increased the interest of the space agencies in planning missions for de-orbiting the largest inoperative satellites. Since this kind of operation involves the capture of the debris, accurate knowledge of the position of their center of mass is a fundamental safety requirement. As ground observations are not sufficient to reach the required accuracy level, this information should be acquired in situ just before any contact between the chaser and the target. Some estimation methods in the literature rely on the usage of stereo cameras for tracking several features of the target surface. The actual positions of these features are estimated together with the location of the center of mass by state observers. The principal drawback of these methods is related to possible sudden disappearances of one or more features from the field of view of the cameras. An alternative method based on 3D kinematic registration is presented in this paper. The method, which does not suffer from the mentioned drawback, includes a preliminary reduction of the inaccuracies in detecting features by the usage of statistical shape analysis.

Implementation of Statistical Process Control: Evaluating the Mechanical Performance of a Candidate Silicone Elastomer Docking Seal

NASA Technical Reports Server (NTRS)

Oravec, Heather Ann; Daniels, Christopher C.

2014-01-01

The National Aeronautics and Space Administration has been developing a novel docking system to meet the requirements of future exploration missions to low-Earth orbit and beyond. A dynamic gas pressure seal is located at the main interface between the active and passive mating components of the new docking system. This seal is designed to operate in the harsh space environment, but is also to perform within strict loading requirements while maintaining an acceptable level of leak rate. In this study, a candidate silicone elastomer seal was designed, and multiple subscale test articles were manufactured for evaluation purposes. The force required to fully compress each test article at room temperature was quantified and found to be below the maximum allowable load for the docking system. However, a significant amount of scatter was observed in the test results. Due to the stochastic nature of the mechanical performance of this candidate docking seal, a statistical process control technique was implemented to isolate unusual compression behavior from typical mechanical performance. The results of this statistical analysis indicated a lack of process control, suggesting a variation in the manufacturing phase of the process. Further investigation revealed that changes in the manufacturing molding process had occurred which may have influenced the mechanical performance of the seal. This knowledge improves the chance of this and future space seals to satisfy or exceed design specifications.

Statistical sensor fusion analysis of near-IR polarimetric and thermal imagery for the detection of minelike targets

NASA Astrophysics Data System (ADS)

Weisenseel, Robert A.; Karl, William C.; Castanon, David A.; DiMarzio, Charles A.

1999-02-01

We present an analysis of statistical model based data-level fusion for near-IR polarimetric and thermal data, particularly for the detection of mines and mine-like targets. Typical detection-level data fusion methods, approaches that fuse detections from individual sensors rather than fusing at the level of the raw data, do not account rationally for the relative reliability of different sensors, nor the redundancy often inherent in multiple sensors. Representative examples of such detection-level techniques include logical AND/OR operations on detections from individual sensors and majority vote methods. In this work, we exploit a statistical data model for the detection of mines and mine-like targets to compare and fuse multiple sensor channels. Our purpose is to quantify the amount of knowledge that each polarimetric or thermal channel supplies to the detection process. With this information, we can make reasonable decisions about the usefulness of each channel. We can use this information to improve the detection process, or we can use it to reduce the number of required channels.

Prediction of Regulation Reserve Requirements in California ISO Control Area based on BAAL Standard

DOE Office of Scientific and Technical Information (OSTI.GOV)

Etingov, Pavel V.; Makarov, Yuri V.; Samaan, Nader A.

This paper presents new methodologies developed at Pacific Northwest National Laboratory (PNNL) to estimate regulation capacity requirements in the California ISO control area. Two approaches have been developed: (1) an approach based on statistical analysis of actual historical area control error (ACE) and regulation data, and (2) an approach based on balancing authority ACE limit control performance standard. The approaches predict regulation reserve requirements on a day-ahead basis including upward and downward requirements, for each operating hour of a day. California ISO data has been used to test the performance of the proposed algorithms. Results show that software tool allows saving up to 30% on the regulation procurements cost.

Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis

PubMed Central

Steele, Joe; Bastola, Dhundy

2014-01-01

Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base–base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel–Ziv techniques from data compression. PMID:23904502

Impact of distributions on the archetypes and prototypes in heterogeneous nanoparticle ensembles.

PubMed

Fernandez, Michael; Wilson, Hugh F; Barnard, Amanda S

2017-01-05

The magnitude and complexity of the structural and functional data available on nanomaterials requires data analytics, statistical analysis and information technology to drive discovery. We demonstrate that multivariate statistical analysis can recognise the sets of truly significant nanostructures and their most relevant properties in heterogeneous ensembles with different probability distributions. The prototypical and archetypal nanostructures of five virtual ensembles of Si quantum dots (SiQDs) with Boltzmann, frequency, normal, Poisson and random distributions are identified using clustering and archetypal analysis, where we find that their diversity is defined by size and shape, regardless of the type of distribution. At the complex hull of the SiQD ensembles, simple configuration archetypes can efficiently describe a large number of SiQDs, whereas more complex shapes are needed to represent the average ordering of the ensembles. This approach provides a route towards the characterisation of computationally intractable virtual nanomaterial spaces, which can convert big data into smart data, and significantly reduce the workload to simulate experimentally relevant virtual samples.

Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves.

PubMed

Ernst, Dominique; Köhler, Jürgen

2013-01-21

We provide experimental results on the accuracy of diffusion coefficients obtained by a mean squared displacement (MSD) analysis of single-particle trajectories. We have recorded very long trajectories comprising more than 1.5 × 10^5 data points and decomposed these long trajectories into shorter segments providing us with ensembles of trajectories of variable lengths. This enabled a statistical analysis of the resulting MSD curves as a function of the lengths of the segments. We find that the relative error of the diffusion coefficient can be minimized by taking an optimum number of points into account for fitting the MSD curves, and that this optimum does not depend on the segment length. Yet, the magnitude of the relative error for the diffusion coefficient does, and achieving an accuracy in the order of 10% requires the recording of trajectories with about 1000 data points. Finally, we compare our results with theoretical predictions and find very good qualitative and quantitative agreement between experiment and theory.
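
As a rough illustration of the kind of MSD analysis this abstract describes, the sketch below computes a time-averaged MSD for a simulated one-dimensional random walk and estimates the diffusion coefficient from a least-squares line through the first few lag points, using MSD(t) = 2Dt in one dimension. The function names, the 1-D setting, the Gaussian step model, and the default of four fit points are all assumptions for illustration; this is not the authors' procedure.

```python
import random

def msd(traj, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag."""
    out = []
    for lag in range(1, max_lag + 1):
        sq = [(traj[i + lag] - traj[i]) ** 2 for i in range(len(traj) - lag)]
        out.append(sum(sq) / len(sq))
    return out

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (intercept included)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def estimate_d(traj, dt, n_fit=4):
    """Estimate D in 1-D from the slope of the first n_fit MSD points."""
    lags = [dt * k for k in range(1, n_fit + 1)]
    return fit_slope(lags, msd(traj, n_fit)) / 2.0

def simulate(n_steps, d_true, dt, seed=0):
    """Simulate a 1-D Brownian trajectory with diffusion coefficient d_true."""
    rng = random.Random(seed)
    sigma = (2.0 * d_true * dt) ** 0.5  # std. dev. of one step
    x, traj = 0.0, [0.0]
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)
        traj.append(x)
    return traj
```

Calling `estimate_d(simulate(1000, 1.0, 0.01), 0.01)` on repeated seeds, and varying `n_fit`, gives a feel for how the scatter of the estimate depends on trajectory length and on the number of fitted points.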

Handbook Of X-ray Astronomy

NASA Astrophysics Data System (ADS)

Arnaud, Keith A.; Smith, R. K.; Siemiginowska, A.; Edgar, R. J.; Grant, C. E.; Kuntz, K. D.; Schwartz, D. A.

2011-09-01

This poster advertises a book to be published in September 2011 by Cambridge University Press. Written for graduate students, professional astronomers and researchers who want to start working in this field, this book is a practical guide to x-ray astronomy. The handbook begins with x-ray optics, basic detector physics and CCDs, before focussing on data analysis. It introduces the reduction and calibration of x-ray data, scientific analysis, archives, statistical issues and the particular problems of highly extended sources. The book describes the main hardware used in x-ray astronomy, emphasizing the implications for data analysis. The concepts behind common x-ray astronomy data analysis software are explained. The appendices present reference material often required during data analysis.

ProteoSign: an end-user online differential proteomics statistical analysis platform.

PubMed

Efstathiou, Georgios; Antonakis, Andreas N; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Divanach, Peter; Trudgian, David C; Thomas, Benjamin; Papanikolaou, Nikolas; Aivaliotis, Michalis; Acuto, Oreste; Iliopoulos, Ioannis

2017-07-03

Profiling of proteome dynamics is crucial for understanding cellular behavior in response to intrinsic and extrinsic stimuli and maintenance of homeostasis. Over the last 20 years, mass spectrometry (MS) has emerged as the most powerful tool for large-scale identification and characterization of proteins. Bottom-up proteomics, the most common MS-based proteomics approach, has always been challenging in terms of data management, processing, analysis and visualization, with modern instruments capable of producing several gigabytes of data out of a single experiment. Here, we present ProteoSign, a freely available web application, dedicated to allowing users to perform proteomics differential expression/abundance analysis in a user-friendly and self-explanatory way. Although several non-commercial standalone tools have been developed for post-quantification statistical analysis of proteomics data, most of them are not appealing to end users as they often require very stringent installation of programming environments, third-party software packages and sometimes further scripting or computer programming. To avoid this bottleneck, we have developed a user-friendly software platform accessible via a web interface in order to enable proteomics laboratories and core facilities to statistically analyse quantitative proteomics data sets in a resource-efficient manner. ProteoSign is available at http://bioinformatics.med.uoc.gr/ProteoSign and the source code at https://github.com/yorgodillo/ProteoSign. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labby, Z.

Physicists are often expected to have a solid grounding in experimental design and statistical analysis, sometimes filling in when biostatisticians or other experts are not available for consultation. Unfortunately, graduate education on these topics is seldom emphasized and few opportunities for continuing education exist. Clinical physicists incorporate new technology and methods into their practice based on published literature. A poor understanding of experimental design and analysis could result in inappropriate use of new techniques. Clinical physicists also improve current practice through quality initiatives that require sound experimental design and analysis. Academic physicists with a poor understanding of design and analysis may produce ambiguous (or misleading) results. This can result in unnecessary rewrites, publication rejection, and experimental redesign (wasting time, money, and effort). This symposium will provide a practical review of error and uncertainty, common study designs, and statistical tests. Instruction will primarily focus on practical implementation through examples and answer questions such as: where would you typically apply the test/design and where is the test/design typically misapplied (i.e., common pitfalls)? An analysis of error and uncertainty will also be explored using biological studies and associated modeling as a specific use case. Learning Objectives: (1) Understand common experimental testing and clinical trial designs, what questions they can answer, and how to interpret the results. (2) Determine where specific statistical tests are appropriate and identify common pitfalls. (3) Understand how uncertainty and error are addressed in biological testing and associated biological modeling.

A statistical study of EMIC waves observed by Cluster. 1. Wave properties. EMIC Wave Properties

DOE PAGES

Allen, R. C.; Zhang, J. -C.; Kistler, L. M.; ...

2015-07-23

Electromagnetic ion cyclotron (EMIC) waves are an important mechanism for particle energization and losses inside the magnetosphere. In order to better understand the effects of these waves on particle dynamics, detailed information about the occurrence rate, wave power, ellipticity, normal angle, energy propagation angle distributions, and local plasma parameters is required. Previous statistical studies have used in situ observations to investigate the distribution of these parameters in the magnetic local time versus L-shell (MLT-L) frame within a limited magnetic latitude (MLAT) range. In our study, we present a statistical analysis of EMIC wave properties using 10 years (2001–2010) of data from Cluster, totaling 25,431 min of wave activity. Due to the polar orbit of Cluster, we are able to investigate EMIC waves at all MLATs and MLTs. This allows us to further investigate the MLAT dependence of various wave properties inside different MLT sectors and further explore the effects of Shabansky orbits on EMIC wave generation and propagation. The statistical analysis is presented in two papers. This paper focuses on the wave occurrence distribution as well as the distribution of wave properties. The companion paper focuses on local plasma parameters during wave observations as well as wave generation proxies.

Implementation of statistical process control for proteomic experiments via LC MS/MS.

PubMed

Bereman, Michael S; Johnson, Richard; Bollinger, James; Boss, Yuval; Shulman, Nick; MacLean, Brendan; Hoofnagle, Andrew N; MacCoss, Michael J

2014-04-01

Statistical process control (SPC) is a robust set of tools that aids in the visualization, detection, and identification of assignable causes of variation in any process that creates products, services, or information. A tool has been developed termed Statistical Process Control in Proteomics (SProCoP) which implements aspects of SPC (e.g., control charts and Pareto analysis) into the Skyline proteomics software. It monitors five quality control metrics in a shotgun or targeted proteomic workflow. None of these metrics require peptide identification. The source code, written in the R statistical language, runs directly from the Skyline interface, which supports the use of raw data files from several of the mass spectrometry vendors. It provides real time evaluation of the chromatographic performance (e.g., retention time reproducibility, peak asymmetry, and resolution), and mass spectrometric performance (targeted peptide ion intensity and mass measurement accuracy for high resolving power instruments) via control charts. Thresholds are experiment- and instrument-specific and are determined empirically from user-defined quality control standards that enable the separation of random noise and systematic error. Finally, Pareto analysis provides a summary of performance metrics and guides the user to metrics with high variance. The utility of these charts to evaluate proteomic experiments is illustrated in two case studies.

A hierarchical fuzzy rule-based approach to aphasia diagnosis.

PubMed

Akbarzadeh-T, Mohammad-R; Moshtagh-Khorasani, Majid

2007-10-01

Aphasia diagnosis is a particularly challenging medical diagnostic task due to the linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, large number of measurements with imprecision, natural diversity and subjectivity in test objects as well as in opinions of experts who diagnose the disease. To efficiently address this diagnostic process, a hierarchical fuzzy rule-based structure is proposed here that considers the effect of different features of aphasia by statistical analysis in its construction. This approach can be efficient for diagnosis of aphasia and possibly other medical diagnostic applications due to its fuzzy and hierarchical reasoning construction. Initially, the symptoms of the disease which each consists of different features are analyzed statistically. The measured statistical parameters from the training set are then used to define membership functions and the fuzzy rules. The resulting two-layered fuzzy rule-based system is then compared with a back propagating feed-forward neural network for diagnosis of four Aphasia types: Anomic, Broca, Global and Wernicke. In order to reduce the number of required inputs, the technique is applied and compared on both comprehensive and spontaneous speech tests. Statistical t-test analysis confirms that the proposed approach uses fewer Aphasia features while also presenting a significant improvement in terms of accuracy.

Using statistical text classification to identify health information technology incidents

PubMed Central

Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah

2013-01-01

Objective To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements κ statistic, F1 score, precision and recall. Results Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777
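
The performance measures named in this abstract (precision, recall, and F1 score) follow directly from confusion-matrix counts. The sketch below shows the standard definitions; the function name and the example counts in the note after it are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and
    false negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, with hypothetical counts of 90 true positives, 10 false positives, and 30 false negatives, `precision_recall_f1(90, 10, 30)` yields a precision of 0.9 and a recall of 0.75. Because F1 is a harmonic mean, it is pulled toward the lower of the two values, which is why high recall combined with low precision still gives a low F1.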

Statistical process control: separating signal from noise in emergency department operations.

PubMed

Pimentel, Laura; Barrueto, Fermin

2015-05-01

Statistical process control (SPC) is a visually appealing and statistically rigorous methodology very suitable to the analysis of emergency department (ED) operations. We demonstrate that the control chart is the primary tool of SPC; it is constructed by plotting data measuring the key quality indicators of operational processes in rationally ordered subgroups such as units of time. Control limits are calculated using formulas reflecting the variation in the data points from one another and from the mean. SPC allows managers to determine whether operational processes are controlled and predictable. We review why the moving range chart is most appropriate for use in the complex ED milieu, how to apply SPC to ED operations, and how to determine when performance improvement is needed. SPC is an excellent tool for operational analysis and quality improvement for these reasons: 1) control charts make large data sets intuitively coherent by integrating statistical and visual descriptions; 2) SPC provides analysis of process stability and capability rather than simple comparison with a benchmark; 3) SPC allows distinction between special cause variation (signal), indicating an unstable process requiring action, and common cause variation (noise), reflecting a stable process; and 4) SPC keeps the focus of quality improvement on process rather than individual performance. Because data have no meaning apart from their context, and every process generates information that can be used to improve it, we contend that SPC should be seriously considered for driving quality improvement in emergency medicine. Copyright © 2015 Elsevier Inc. All rights reserved.
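
The control-limit calculation for an individuals and moving range (XmR) chart can be sketched as follows. The constants 2.66 and 3.267 are the standard XmR chart factors for moving ranges of size two; the function names and the sample data in the note after the code are hypothetical, and the abstract does not specify the authors' exact formulas.

```python
def xmr_limits(values):
    """Compute individuals-chart and moving-range-chart limits."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = sum(values) / len(values)
    # Moving range: absolute difference between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,   # lower control limit
        "ucl": center + 2.66 * mr_bar,   # upper control limit
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,        # upper limit for moving ranges
    }

def special_cause_points(values, limits):
    """Indices of observations outside the individuals-chart limits."""
    return [i for i, v in enumerate(values)
            if v < limits["lcl"] or v > limits["ucl"]]
```

A typical use would be to compute limits from a baseline period, for example hypothetical daily median wait times `[10, 12, 11, 13, 12, 11, 12, 13]`, and then pass later observations to `special_cause_points` to flag candidates for special cause variation. Points inside the limits are treated as common cause variation.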

Statistical analysis of corn yields responding to climate variability at various spatio-temporal resolutions

NASA Astrophysics Data System (ADS)

Jiang, H.; Lin, T.

2017-12-01

Rain-fed corn production systems are subject to sub-seasonal variations of precipitation and temperature during the growing season. As each growth phase involves different inherent physiological processes, plants require different optimal environmental conditions during each phase. However, this temporal heterogeneity in the response to climate variability over the crop lifecycle is often simplified and fixed as constant responses in large-scale statistical modeling analysis. To capture the time-variant growing requirements in large-scale statistical analysis, we develop and compare statistical models at various spatial and temporal resolutions to quantify the relationship between corn yield and weather factors for 12 Corn Belt states from 1981 to 2016. The study compares three spatial resolutions (county, agricultural district, and state scale) and three temporal resolutions (crop growth phase, monthly, and growing season) to characterize the effects of spatial and temporal variability. Our results show that the agricultural district model together with growth phase resolution can explain 52% of the variation in corn yield caused by temperature and precipitation variability. It provides a practical model structure that balances the overfitting problem of the county-specific model against the weak explanatory power of the state-specific model. In the US Corn Belt, precipitation has a positive impact on corn yield during the growing season except for the vegetative stage, while sensitivity to extreme heat is highest from the silking to the dough phase. The results show that the northern counties in the Corn Belt are less affected by extreme heat but more vulnerable to water deficiency.

19 CFR 141.61 - Completion of entry and entry summary documentation.

Code of Federal Regulations, 2010 CFR

2010-04-01

... on CBP Form 7501. (e) Statistical information—(1) Information required on entry summary or withdrawal... a separate statistical reporting number, the applicable information required by the General Statistical Notes, Harmonized Tariff Schedule of the United States (HTSUS), must be shown on the entry summary...

[Transient enlargement of craniopharyngioma cysts after stereotactic radiotherapy and radiosurgery].

PubMed

Mazerkina, N A; Savateev, A N; Gorelyshev, S K; Konovalov, A N; Trunin, Yu Yu; Golanov, A V; Medvedeva, O A; Kalinin, P L; Kutin, M A; Astafieva, L I; Krasnova, T S; Ozerova, V I; Serova, N K; Butenko, E I; Strunina, Yu V

Stereotactic radiotherapy/radiosurgery (RT/RS) is an effective technique for treating craniopharyngiomas (CPs). However, enlargement of the cystic part of the tumor occurs in some cases after irradiation. The enlargement may be transient and not require treatment or be a true relapse requiring treatment. In this study, we performed a retrospective analysis of 79 pediatric patients who underwent stereotactic RT or RS after resection of craniopharyngioma. Five-year relapse-free survival after complex treatment of CP was 86%. In the early period after irradiation, 3.5 months (2.7-9.4) on average, enlargement of the cystic component of the tumor was detected in 10 (12.7%) patients; in 9 (11.4%) of them, the enlargement was transient and did not require treatment; in one case, the patient underwent surgery due to reduced visual acuity. In 8 (10.1%) patients, an increase in the residual tumor (a solid component of the tumor in 2 cases and a cystic component of the tumor in 6 cases) occurred in the long-term period after irradiation, after 26.3 months (16.6-48.9), and did not decrease during follow-up in any of the cases, i.e. continued growth of the tumor was diagnosed. A statistical analysis revealed that differences in the terms of transient enlargement and true continued growth were statistically significant (p<0.01). Enlargement of a craniopharyngioma cyst in the early period (up to 1 year) after RT/RS is usually transient and does not require surgical treatment (except cases where worsening of neurological symptoms occurs, or occlusive hydrocephalus develops).

Sampling designs for contaminant temporal trend analyses using sedentary species exemplified by the snails Bellamya aeruginosa and Viviparus viviparus.

PubMed

Yin, Ge; Danielsson, Sara; Dahlberg, Anna-Karin; Zhou, Yihui; Qiu, Yanling; Nyberg, Elisabeth; Bignert, Anders

2017-10-01

Environmental monitoring typically assumes samples and sampling activities to be representative of the population being studied. Given a limited budget, an appropriate sampling strategy is essential to support detecting temporal trends of contaminants. In the present study, based on real chemical analysis data on polybrominated diphenyl ethers in snails collected from five subsites in Tianmu Lake, computer simulation is performed to evaluate three sampling strategies by estimating the sample size required to detect an annual change of 5% with a statistical power of 80% and 90% at a significance level of 5%. The results showed that sampling from an arbitrarily selected sampling spot is the worst strategy, requiring many more individual analyses to achieve the above-mentioned criteria than the other two approaches. A fixed sampling site requires the lowest sample size but may not be representative of the intended study object (e.g., a lake) and is also sensitive to changes at that particular sampling site. In contrast, sampling at multiple sites along the shore each year, and using pooled samples when the cost of collecting and preparing individual specimens is much lower than the cost of chemical analysis, would be the most robust and cost-efficient strategy in the long run. Using statistical power as the criterion, the results demonstrated quantitatively the consequences of various sampling strategies, and could guide users with respect to the sample sizes required by a given sampling design for long-term monitoring programs. Copyright © 2017 Elsevier Ltd. All rights reserved.

Assigning statistical significance to proteotypic peptides via database searches

PubMed Central

Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo

2011-01-01

Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489

GIA Model Statistics for GRACE Hydrology, Cryosphere, and Ocean Science

NASA Astrophysics Data System (ADS)

Caron, L.; Ivins, E. R.; Larour, E.; Adhikari, S.; Nilsson, J.; Blewitt, G.

2018-03-01

We provide a new analysis of glacial isostatic adjustment (GIA) with the goal of assembling the model uncertainty statistics required for rigorously extracting trends in surface mass from the Gravity Recovery and Climate Experiment (GRACE) mission. Such statistics are essential for deciphering sea level, ocean mass, and hydrological changes because the latter signals can be relatively small (≤2 mm/yr water height equivalent) over very large regions, such as major ocean basins and watersheds. With abundant new >7 year continuous measurements of vertical land motion (VLM) reported by Global Positioning System stations on bedrock and new relative sea level records, our new statistical evaluation of GIA uncertainties incorporates Bayesian methodologies. A unique aspect of the method is that both the ice history and 1-D Earth structure vary through a total of 128,000 forward models. We find that best fit models poorly capture the statistical inferences needed to correctly invert for lower mantle viscosity and that GIA uncertainty exceeds the uncertainty ascribed to trends from 14 years of GRACE data in polar regions.

Statistical strength of experiments to reject local realism with photon pairs and inefficient detectors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang Yanbao; Mathematical and Computational Sciences Division, National Institute of Standards and Technology, Boulder, Colorado, 80305; Knill, Emanuel

2010-03-15

Because of the fundamental importance of Bell's theorem, a loophole-free demonstration of a violation of local realism (LR) is highly desirable. Here, we study violations of LR involving photon pairs. We quantify the experimental evidence against LR by using measures of statistical strength related to the Kullback-Leibler (KL) divergence, as suggested by van Dam et al. [W. van Dam, R. D. Gill, and P. D. Grunwald, IEEE Trans. Inf. Theory. 51, 2812 (2005)]. Specifically, we analyze a test of LR with entangled states created from two independent polarized photons passing through a polarizing beam splitter. We numerically study the detection efficiency required to achieve a specified statistical strength for the rejection of LR depending on whether photon counters or detectors are used. Based on our results, we find that a test of LR free of the detection loophole requires photon counters with efficiencies of at least 89.71%, or photon detectors with efficiencies of at least 91.11%. For comparison, we also perform this analysis with ideal unbalanced Bell states, which are known to allow rejection of LR with detector efficiencies above 2/3.

Psychometric properties of the Generalized Anxiety Disorder Inventory in a Canadian sample.

PubMed

Henderson, Leigh C; Antony, Martin M; Koerner, Naomi

2014-05-01

The Generalized Anxiety Disorder Inventory is a recently developed self-report measure that assesses symptoms of generalized anxiety disorder. Its psychometric properties have not been investigated further since its original development. The current study investigated its psychometric properties in a Canadian student/community sample. Exploratory principal component analysis replicated the original three-component structure. The total scale and subscales demonstrated excellent internal consistency reliability (α = 0.84-0.94) and correlated strongly with the Penn State Worry Questionnaire (r = 0.41-0.74, all ps <0.001) and Generalized Anxiety Disorder-7 (r = 0.55-0.84, all ps <0.001). However, only the total scale and cognitive subscale (r = 0.48-0.49, all ps <0.05) significantly predicted generalized anxiety disorder diagnosis established by diagnostic interview. The somatic subscale in particular may require revision to improve predictive validity. Revision may also be necessary given changes in required somatic symptoms for generalized anxiety disorder diagnostic criteria in more recent versions of the Diagnostic and Statistical Manual of Mental Disorders (i.e. although major changes occurred from Diagnostic and Statistical Manual of Mental Disorders-III-R to Diagnostic and Statistical Manual of Mental Disorders-IV, changes in Diagnostic and Statistical Manual of Mental Disorders-5 were minimal) and the possibility of changes in the upcoming 11th revision of the International Classification of Diseases.

A study of engineering student attributes and time to completion of first-year required courses at Texas A&M University

NASA Astrophysics Data System (ADS)

Kimball, Jorja Lay

For many years, colleges of engineering across the nation have required that a foundational set of courses be completed for entry into upper division coursework or into a specific engineering major. Since 1998, The Dwight Look College of Engineering at Texas A&M University (TAMU) has required that incoming first-time enrolling students complete a Core Body of Knowledge (CBK) with specific cumulative grade points required for specific majors. However, like most institutions, TAMU has not taken into account the time to completion of coursework or other student characteristics and academic factors. The purpose of this study is to determine, for first-year engineering students at TAMU, the relationship of gender, ethnicity, engineering major, unmet financial need, cumulative grade point average, and total transfer hours to time to completion of CBK courses. The results of the analysis showed that cumulative grade point average (CGPA) had the strongest relationship to completion of CBK of any independent variable in this study. Statistical significance was found for the following variables in this study: CGPA, gender, ethnicity, and unmet financial need. For the study's variable of major, statistical significance was found for Chemical, Electrical, and Computer Engineering majors. The one variable in this study that did not show statistical significance in relation to time to completion of CBK was transfer credit. A finding with implications for the recruitment and retention of underrepresented groups in engineering is the statistically significant result that, on average, females take less time than males to complete CBK. The conclusion from the study is that efforts to attract more women into engineering have merit, as do programs to support underrepresented students so that they may complete CBK at a faster pace. Further study is recommended to determine profiles of those majors where statistical significance was found for students taking more or less time than the mean to complete CBK, as is ongoing data collection and comparison for current cohorts of engineering majors at TAMU.

GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

PubMed

Zheng, Qi; Wang, Xiu-Jie

2008-07-01

Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools with GO-related analysis functions already exist, new tools are still needed to meet the requirements of data generated by newly developed technologies or of advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides a better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis of data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) one unique feature of GOEAST is that it allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
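
Identifying a statistically overrepresented GO term is often framed as a one-sided hypergeometric test. The sketch below assumes that framing; the abstract does not say which tests GOEAST applies, and the function name and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from a
    `population` in which `annotated` genes carry the GO term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes overall, 4 annotated with the term,
# a gene set of 3, of which 2 carry the term.
p_value = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
```

In practice one such p-value is computed per GO term, so some form of multiple-testing correction would normally follow.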

Descriptive data analysis.

PubMed

Thompson, Cheryl Bagley

2009-01-01

This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.

The new statistics: why and how.

PubMed

Cumming, Geoff

2014-01-01

We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.

Bangladesh.

PubMed

Ahmed, K S

1979-01-01

In Bangladesh the Population Control and Family Planning Division of the Ministry of Health and Population Control has decided to delegate increased financial and administrative powers to the officers of the family planning program at the district level and below. Currently, about 20,000 family planning workers and officials are at work in rural areas. The government believes that the success of the entire family planning program depends on the performance of workers in rural areas, because that is where about 90% of the population lives. Awareness of the need to improve statistical data in Bangladesh has been increasing, particularly in regard to the development of rural areas. An accurate statistical profile of rural Bangladesh is crucial to the formation, implementation and evaluation of rural development programs. A Seminar on Statistics for Rural Development will be held from June 18-20, 1980. The primary objectives of the Seminar are to make an exhaustive analysis of the current availability of statistics required for rural development programs and to consider methodological and operational improvements toward building up an adequate data base.

Good experimental design and statistics can save animals, but how can it be promoted?

PubMed

Festing, Michael F W

2004-06-01

Surveys of published papers show that there are many errors both in the design of the experiments and in the statistical analysis of the resulting data. This must result in a waste of animals and scientific resources, and it is surely unethical. Scientific quality might be improved, to some extent, by journal editors, but they are constrained by lack of statistical referees and inadequate statistical training of those referees that they do use. Other parties, such as welfare regulators, ethical review committees and individual scientists also have an interest in scientific quality, but they do not seem to be well placed to make the required changes. However, those who fund research would have the power to do something if they could be convinced that it is in their best interests to do so. More examples of the way in which better experimental design has led to improved experiments would be helpful in persuading these funding organisations to take further action.

Method for Identifying Probable Archaeological Sites from Remotely Sensed Data

NASA Technical Reports Server (NTRS)

Tilton, James C.; Comer, Douglas C.; Priebe, Carey E.; Sussman, Daniel

2011-01-01

Archaeological sites are being compromised or destroyed at a catastrophic rate in most regions of the world. The best solution to this problem is for archaeologists to find and study these sites before they are compromised or destroyed. One way to facilitate the rapid, wide area surveys needed to find these archaeological sites is through the generation of maps of probable archaeological sites from remotely sensed data. We describe an approach for identifying probable locations of archaeological sites over a wide area based on detecting subtle anomalies in vegetative cover through a statistically based analysis of remotely sensed data from multiple sources. We further developed this approach under a recent NASA ROSES Space Archaeology Program project. Under this project we refined and elaborated this statistical analysis to compensate for potential slight misregistrations between the remote sensing data sources and the archaeological site location data. We also explored data quantization approaches (required by the statistical analysis approach), and we identified a superior data quantization approach based on a unique image segmentation approach. In our presentation we will summarize our refined approach and demonstrate the effectiveness of the overall approach with test data from Santa Catalina Island off the southern California coast. Finally, we discuss our future plans for further improving our approach.

Analysis of Loss-of-Offsite-Power Events 1997-2015

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, Nancy Ellen; Schroeder, John Alton

2016-07-01

Loss of offsite power (LOOP) can have a major negative impact on a power plant’s ability to achieve and maintain safe shutdown conditions. LOOP event frequencies and times required for subsequent restoration of offsite power are important inputs to plant probabilistic risk assessments. This report presents a statistical and engineering analysis of LOOP frequencies and durations at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience during calendar years 1997 through 2015. LOOP events during critical operation that do not result in a reactor trip, are not included. Frequencies and durations were determined for four event categories: plant-centered, switchyard-centered, grid-related, and weather-related. Emergency diesel generator reliability is also considered (failure to start, failure to load and run, and failure to run more than 1 hour). There is an adverse trend in LOOP durations. The previously reported adverse trend in LOOP frequency was not statistically significant for 2006-2015. Grid-related LOOPs happen predominantly in the summer. Switchyard-centered LOOPs happen predominantly in winter and spring. Plant-centered and weather-related LOOPs do not show statistically significant seasonality. The engineering analysis of LOOP data shows that human errors have been much less frequent since 1997 than in the 1986-1996 time period.

A change in humidification system can eliminate endotracheal tube occlusion.

PubMed

Doyle, Alex; Joshi, Manasi; Frank, Peter; Craven, Thomas; Moondi, Parvez; Young, Peter

2011-12-01

Inadequate airway humidification can result in endotracheal tube occlusion. There is evidence that heat and moisture exchangers (HMEs) are more prone to endotracheal tube occlusion than heated humidifiers (HHs) that contain a heated wire circuit. We aimed to compare the incidence of endotracheal tube occlusion while introducing a new dual-heated wire circuit HH in place of an established hydrophobic HME. This was a prospective observational study. All patients who required intubation were included in our analysis. Univariate statistical analysis was performed using a Fisher exact test. P < .05 was considered statistically significant. There were 158 patients in the HME group and 88 patients in the HH group. The incidence of endotracheal tube occlusion was 5.7% in the HME group and 0% in the HH group. Statistical analysis revealed a significant difference between the 2 groups (P = .02). In light of this finding, we changed our practice to provide humidification exclusively by HH. In the subsequent 18-month period, there were no further episodes of endotracheal tube occlusion. Our study demonstrates that there is a significant increase in the incidence of endotracheal tube occlusion when using a hydrophobic HME compared with an HH and that using a dual-heated wire circuit HH can eliminate endotracheal tube occlusion. Copyright © 2011 Elsevier Inc. All rights reserved.
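
The Fisher exact test named in this abstract can be sketched in plain Python. The 2x2 counts are an assumption: 5.7% of 158 patients is taken to mean 9 occlusions in the HME group, against 0 of 88 in the HH group. The function name is hypothetical, and the abstract does not state whether a one-sided or two-sided P value was reported, so both are computed.

```python
from math import comb


def fisher_exact(a, b, c, d):
    """Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns (one_sided, two_sided), where one_sided is P(X >= a) and
    two_sided sums all tables no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(k):
        # Hypergeometric probability of k events in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [pmf(k) for k in range(lo, hi + 1)]
    p_obs = pmf(a)
    one_sided = sum(pmf(k) for k in range(a, hi + 1))
    # Small relative tolerance guards against floating-point ties.
    two_sided = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return one_sided, min(1.0, two_sided)


# Assumed counts: rows are HME and HH, columns are occluded and not occluded.
p_one_sided, p_two_sided = fisher_exact(9, 149, 0, 88)
```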

Nearfield Summary and Statistical Analysis of the Second AIAA Sonic Boom Prediction Workshop

NASA Technical Reports Server (NTRS)

Park, Michael A.; Nemec, Marian

2017-01-01

A summary is provided for the Second AIAA Sonic Boom Workshop held 8-9 January 2017 in conjunction with AIAA SciTech 2017. The workshop used three required models of increasing complexity: an axisymmetric body, a wing body, and a complete configuration with flow-through nacelle. An optional complete configuration with propulsion boundary conditions is also provided. These models are designed with similar nearfield signatures to isolate geometry and shock/expansion interaction effects. Eleven international participant groups submitted nearfield signatures with forces, pitching moment, and iterative convergence norms. Statistics and grid convergence of these nearfield signatures are presented. These submissions are propagated to the ground, and noise levels are computed. This allows the grid convergence and the statistical distribution of a noise level to be computed. While progress is documented since the first workshop, improvement to the analysis methods for a possible subsequent workshop are provided. The complete configuration with flow-through nacelle showed the most dramatic improvement between the two workshops. The current workshop cases are more relevant to vehicles with lower loudness and have the potential for lower annoyance than the first workshop cases. The models for this workshop with quieter ground noise levels than the first workshop exposed weaknesses in analysis, particularly in convective discretization.

78 FR 64883 - Filing Financial and Other Reports

Federal Register 2010, 2011, 2012, 2013, 2014

2013-10-30

... paragraph (a) introductory text to read as follows: Sec. 741.6 Financial and statistical and other reports..., statistical, and other reports and credit union profiles by requiring all federally insured credit unions.... Section 741.6(a) of NCUA's regulations requires FICUs to file financial, statistical, and other reports...

Finite Element Analysis of Reverberation Chambers

NASA Technical Reports Server (NTRS)

Bunting, Charles F.; Nguyen, Duc T.

2000-01-01

The primary motivating factor behind the initiation of this work was to provide a deterministic means of establishing the validity of the statistical methods that are recommended for the determination of fields that interact in an avionics system. The application of finite element analysis to reverberation chambers is the initial step required to establish a reasonable course of inquiry in this particularly data-intensive study. The use of computational electromagnetics provides a high degree of control of the "experimental" parameters that can be utilized in a simulation of reverberating structures. As the work evolved, there were four primary focus areas: 1. The eigenvalue problem for the source free problem. 2. The development of a complex efficient eigensolver. 3. The application of a source for the TE and TM fields for statistical characterization. 4. The examination of shielding effectiveness in a reverberating environment. One early purpose of this work was to establish the utility of finite element techniques in the development of an extended low frequency statistical model for reverberation phenomena. By employing finite element techniques, structures of arbitrary complexity can be analyzed due to the use of triangular shape functions in the spatial discretization. The effects of both frequency stirring and mechanical stirring are presented. It is suggested that for low frequency operation the typical tuner size is inadequate to provide a sufficiently random field and that frequency stirring should be used. The results of the finite element analysis of the reverberation chamber illustrate the potential utility of a 2D representation for enhancing the basic statistical characteristics of the chamber when operating in a low frequency regime. The basic field statistics are verified for frequency stirring over a wide range of frequencies. Mechanical stirring is shown to provide an effective frequency deviation.

A retrospective analysis of hyperthermic intraperitoneal chemotherapy for gastric cancer with peritoneal metastasis

PubMed Central

Yuan, Meiqin; Wang, Zeng; Hu, Guinv; Yang, Yunshan; Lv, Wangxia; Lu, Fangxiao; Zhong, Haijun

2016-01-01

Peritoneal metastasis (PM) is a poor prognostic factor in patients with gastric cancer. The aim of this study was to evaluate the efficacy and safety of hyperthermic intraperitoneal chemotherapy (HIPEC) in patients with advanced gastric cancer with PM by retrospective analysis. A total of 54 gastric cancer patients with positive ascitic fluid cytology were included in this study: 23 patients were treated with systemic chemotherapy combined with HIPEC (HIPEC+ group) and 31 received systemic chemotherapy alone (HIPEC- group). The patients were divided into 4 categories according to the changes of ascites, namely disappear, decrease, stable and increase. The disappear + decrease rate in the HIPEC+ group was 82.60%, which was statistically significantly superior to that of the HIPEC- group (54.80%). The disappear + decrease + stable rate was 95.70% in the HIPEC+ group and 74.20% in the HIPEC- group, but the difference was not statistically significant. In 33 patients with complete survival data, including 12 from the HIPEC+ and 21 from the HIPEC- group, the median progression-free survival was 164 and 129 days, respectively, and the median overall survival (OS) was 494 and 223 days, respectively. In patients with ascites disappear/decrease/stable, the OS appeared to be better compared with that in patients with ascites increase, but the difference was not statistically significant. Further analysis revealed that patients with controlled disease (complete response + partial response + stable disease) may have a better OS compared with patients with progressive disease, with a statistically significant difference. The toxicities were well tolerated in both groups. Therefore, HIPEC was found to improve survival in advanced gastric cancer patients with PM, but the difference was not statistically significant, which may be attributed to the small number of cases. Further studies with larger samples are required to confirm our data. PMID:27446587

Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

USGS Publications Warehouse

Lee, L.; Helsel, D.

2007-01-01

Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
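
The idea of applying K-M to left-censored data can be sketched in a few lines. This is plain Python rather than the authors' S-language routines; the function name and the concentration data are hypothetical. Working downward from the largest detected value is equivalent to flipping the data into right-censored form, and a nondetect reported as "<x" is treated as still at risk at a detected value equal to x, following the usual K-M tie convention.

```python
def km_left_censored(observations):
    """Kaplan-Meier estimate of P(X < x) for left-censored data.

    `observations` is a list of (value, detected) pairs; detected=False means
    the true value is only known to lie below `value` (a detection limit).
    Returns (x, P(X < x)) for each distinct detected value, largest first.
    """
    detected_values = sorted({v for v, det in observations if det}, reverse=True)
    prob_below = 1.0
    curve = []
    for x in detected_values:
        # Risk set: every observation, censored or not, with value <= x.
        at_risk = sum(1 for v, _ in observations if v <= x)
        events = sum(1 for v, det in observations if det and v == x)
        prob_below *= 1 - events / at_risk
        curve.append((x, prob_below))
    return curve


# Hypothetical concentrations with two detection limits (0.5 and 1.0).
data = [(0.5, False), (1.0, True), (1.0, False), (2.0, True), (3.0, True)]
ecdf = km_left_censored(data)
```

The estimate is defined only at the detected values, which reflects the abstract's point that K-M methods do not extrapolate or interpolate.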

A statistical simulation model for field testing of non-target organisms in environmental risk assessment of genetically modified plants.

PubMed

Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore

2014-04-01

Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided.

A statistical simulation model for field testing of non-target organisms in environmental risk assessment of genetically modified plants

PubMed Central

Goedhart, Paul W; van der Voet, Hilko; Baldacchino, Ferdinando; Arpaia, Salvatore

2014-01-01

Genetic modification of plants may result in unintended effects causing potentially adverse effects on the environment. A comparative safety assessment is therefore required by authorities, such as the European Food Safety Authority, in which the genetically modified plant is compared with its conventional counterpart. Part of the environmental risk assessment is a comparative field experiment in which the effect on non-target organisms is compared. Statistical analysis of such trials comes in two flavors: difference testing and equivalence testing. It is important to know the statistical properties of these, for example, the power to detect environmental change of a given magnitude, before the start of an experiment. Such prospective power analysis can best be studied by means of a statistical simulation model. This paper describes a general framework for simulating data typically encountered in environmental risk assessment of genetically modified plants. The simulation model, available as Supplementary Material, can be used to generate count data having different statistical distributions, possibly with excess zeros. In addition, the model employs completely randomized or randomized block experiments, can be used to simulate single or multiple trials across environments, enables genotype by environment interaction by adding random variety effects, and finally includes repeated measures in time following a constant, linear or quadratic pattern in time, possibly with some form of autocorrelation. The model also allows a set of reference varieties to be added to the GM plant and its comparator to assess the natural variation, which can then be used to set limits of concern for equivalence testing. The different count distributions are described in some detail and some examples of how to use the simulation model to study various aspects, including a prospective power analysis, are provided. PMID:24834325
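
The prospective power analysis described in this abstract can be illustrated with a minimal sketch. This is not the authors' simulation model (which is in their Supplementary Material); it assumes a completely randomized two-group trial, negative binomial counts drawn as a gamma-Poisson mixture, and a simple z-test on log(count + 1) as the difference test. The names `neg_binomial` and `simulate_power` and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def neg_binomial(rng, mu, k):
    """Overdispersed count: mean mu, variance mu + mu**2 / k (gamma-Poisson mixture)."""
    return poisson(rng, rng.gammavariate(k, mu / k))


def simulate_power(mu, ratio, k, n_plots, n_sims, alpha=0.05, seed=1):
    """Fraction of simulated trials in which a two-sided z-test on
    log(count + 1) detects a difference between comparator (mean mu)
    and GM plant (mean mu * ratio). Assumed, simplified difference test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        ctrl = [math.log(neg_binomial(rng, mu, k) + 1) for _ in range(n_plots)]
        gm = [math.log(neg_binomial(rng, mu * ratio, k) + 1) for _ in range(n_plots)]
        se = math.sqrt(variance(ctrl) / n_plots + variance(gm) / n_plots)
        if se > 0 and abs(mean(gm) - mean(ctrl)) / se > z_crit:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Power to detect a halving of abundance versus the false-positive rate.
    print("power (ratio 0.5):", simulate_power(20, 0.5, 2, 20, 300))
    print("rejection rate (ratio 1.0):", simulate_power(20, 1.0, 2, 20, 300))
```

Repeating the call over a grid of `n_plots` values gives a rough power curve for choosing the number of replicates before the experiment starts.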

The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms

PubMed Central

Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F

2013-01-01

Abstract To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTO's). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTO's. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTO's are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches – for example, analysis of variance (ANOVA) – are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. 
We offer generic advice to risk assessors and applicants that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in field testing. PMID:24567836

The use of statistical tools in field testing of putative effects of genetically modified plants on nontarget organisms.

PubMed

Semenov, Alexander V; Elsas, Jan Dirk; Glandorf, Debora C M; Schilthuizen, Menno; Boer, Willem F

2013-08-01

To fulfill existing guidelines, applicants that aim to place their genetically modified (GM) insect-resistant crop plants on the market are required to provide data from field experiments that address the potential impacts of the GM plants on nontarget organisms (NTO's). Such data may be based on varied experimental designs. The recent EFSA guidance document for environmental risk assessment (2010) does not provide clear and structured suggestions that address the statistics of field trials on effects on NTO's. This review examines existing practices in GM plant field testing such as the way of randomization, replication, and pseudoreplication. Emphasis is placed on the importance of design features used for the field trials in which effects on NTO's are assessed. The importance of statistical power and the positive and negative aspects of various statistical models are discussed. Equivalence and difference testing are compared, and the importance of checking the distribution of experimental data is stressed to decide on the selection of the proper statistical model. While for continuous data (e.g., pH and temperature) classical statistical approaches - for example, analysis of variance (ANOVA) - are appropriate, for discontinuous data (counts) only generalized linear models (GLM) are shown to be efficient. There is no golden rule as to which statistical test is the most appropriate for any experimental situation. In particular, in experiments in which block designs are used and covariates play a role GLMs should be used. Generic advice is offered that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in this testing. The combination of decision trees and a checklist for field trials, which are provided, will help in the interpretation of the statistical analyses of field trials and to assess whether such analyses were correctly applied. 
We offer generic advice to risk assessors and applicants that will help in both the setting up of field testing and the interpretation and data analysis of the data obtained in field testing.

Using volcano plots and regularized-chi statistics in genetic association studies.

PubMed

Li, Wentian; Freudenberg, Jan; Suh, Young Ju; Yang, Yaning

2014-02-01

Labor intensive experiments are typically required to identify the causal disease variants from a list of disease associated variants in the genome. For designing such experiments, candidate variants are ranked by their strength of genetic association with the disease. However, the two commonly used measures of genetic association, the odds-ratio (OR) and p-value may rank variants in different order. To integrate these two measures into a single analysis, here we transfer the volcano plot methodology from gene expression analysis to genetic association studies. In its original setting, volcano plots are scatter plots of fold-change and t-test statistic (or -log of the p-value), with the latter being more sensitive to sample size. In genetic association studies, the OR and Pearson's chi-square statistic (or equivalently its square root, chi; or the standardized log(OR)) can be analogously used in a volcano plot, allowing for their visual inspection. Moreover, the geometric interpretation of these plots leads to an intuitive method for filtering results by a combination of both OR and chi-square statistic, which we term "regularized-chi". This method selects associated markers by a smooth curve in the volcano plot instead of the right-angled lines which corresponds to independent cutoffs for OR and chi-square statistic. The regularized-chi incorporates relatively more signals from variants with lower minor-allele-frequencies than chi-square test statistic. As rare variants tend to have stronger functional effects, regularized-chi is better suited to the task of prioritization of candidate genes. Copyright © 2013 Elsevier Ltd. All rights reserved.
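
As a rough illustration of the two volcano-plot coordinates named in this abstract, the sketch below computes the log odds ratio and the signed square root of Pearson's chi-square statistic from a 2x2 allele-count table. The function name `volcano_point` and the table layout are assumptions; the abstract does not give the regularized-chi formula, so it is not reproduced here.

```python
import math


def volcano_point(a, b, c, d):
    """Volcano-plot coordinates for one variant from a 2x2 table.

    Assumed layout: a = case minor alleles, b = case major alleles,
                    c = control minor alleles, d = control major alleles.
    Returns (log odds ratio, signed chi), where chi is the square root of
    Pearson's chi-square statistic without continuity correction.
    """
    n = a + b + c + d
    log_or = math.log((a * d) / (b * c))
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    signed_chi = math.copysign(math.sqrt(chi2), a * d - b * c)
    return log_or, signed_chi


if __name__ == "__main__":
    x, y = volcano_point(30, 70, 15, 85)
    print("log(OR) =", x, "chi =", y)
```

Plotting `log_or` on the horizontal axis against `signed_chi` (or its absolute value) on the vertical axis for every variant gives the scatter that the abstract calls a volcano plot.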

Compilation and Analysis of 20 and 30 GHz Rain Fade Events at the ACTS NASA Ground Station: Statistics and Model Assessment

NASA Technical Reports Server (NTRS)

Manning, Robert M.

1996-01-01

The purpose of the propagation studies within the ACTS Project Office is to acquire 20 and 30 GHz rain fade statistics using the ACTS beacon links received at the NGS (NASA Ground Station) in Cleveland. Other than the raw, statistically unprocessed rain fade events that occur in real time, relevant rain fade statistics derived from such events are the cumulative rain fade statistics as well as fade duration statistics (beyond given fade thresholds) over monthly and yearly time intervals. Concurrent with the data logging exercise, monthly maximum rainfall levels recorded at the US Weather Service at Hopkins Airport are appended to the database to facilitate comparison of observed fade statistics with those predicted by the ACTS Rain Attenuation Model. Also, the raw fade data will be in a format, complete with documentation, for use by other investigators who require realistic fade event evolution in time for simulation purposes or further analysis for comparisons with other rain fade prediction models, etc. The raw time series data from the 20 and 30 GHz beacon signals is purged of non relevant data intervals where no rain fading has occurred. All other data intervals which contain rain fade events are archived with the accompanying time stamps. The definition of just what constitutes a rain fade event will be discussed later. The archived data serves two purposes. First, all rain fade event data is recombined into a contiguous data series every month and every year; this will represent an uninterrupted record of the actual (i.e., not statistically processed) temporal evolution of rain fade at 20 and 30 GHz at the location of the NGS. The second purpose of the data in such a format is to enable a statistical analysis of prevailing propagation parameters such as cumulative distributions of attenuation on a monthly and yearly basis as well as fade duration probabilities below given fade thresholds, also on a monthly and yearly basis. 
In addition, various subsidiary statistics such as attenuation rate probabilities are derived. The purged raw rain fade data as well as the results of the analyzed data will be made available for use by parties in the private sector upon their request. The process which will be followed in this dissemination is outlined in this paper.

SandiaMRCR

DOE Office of Scientific and Technical Information (OSTI.GOV)

2012-01-05

SandiaMCR was developed to identify pure components and their concentrations from spectral data. This software efficiently implements the multivariate calibration regression alternating least squares (MCR-ALS), principal component analysis (PCA), and singular value decomposition (SVD). Version 3.37 also includes the PARAFAC-ALS Tucker-1 (for trilinear analysis) algorithms. The alternating least squares methods can be used to determine the composition without or with incomplete prior information on the constituents and their concentrations. It allows the specification of numerous preprocessing, initialization and data selection and compression options for the efficient processing of large data sets. The software includes numerous options including the definition of equality and non-negativity constraints to realistically restrict the solution set, various normalization or weighting options based on the statistics of the data, several initialization choices and data compression. The software has been designed to provide a practicing spectroscopist the tools required to routinely analyze data in a reasonable time and without requiring expert intervention.

Sensitivity analysis, calibration, and testing of a distributed hydrological model using error‐based weighting and one objective function

USGS Publications Warehouse

Foglia, L.; Hill, Mary C.; Mehl, Steffen W.; Burlando, P.

2009-01-01

We evaluate the utility of three interrelated means of using data to calibrate the fully distributed rainfall‐runoff model TOPKAPI as applied to the Maggia Valley drainage area in Switzerland. The use of error‐based weighting of observation and prior information data, local sensitivity analysis, and single‐objective function nonlinear regression provides quantitative evaluation of sensitivity of the 35 model parameters to the data, identification of data types most important to the calibration, and identification of correlations among parameters that contribute to nonuniqueness. Sensitivity analysis required only 71 model runs, and regression required about 50 model runs. The approach presented appears to be ideal for evaluation of models with long run times or as a preliminary step to more computationally demanding methods. The statistics used include composite scaled sensitivities, parameter correlation coefficients, leverage, Cook's D, and DFBETAS. Tests suggest predictive ability of the calibrated model typical of hydrologic models.

Formalizing the definition of meta-analysis in Molecular Ecology.

PubMed

ArchMiller, Althea A; Bauer, Eric F; Koch, Rebecca E; Wijayawardena, Bhagya K; Anil, Ammu; Kottwitz, Jack J; Munsterman, Amelia S; Wilson, Alan E

2015-08-01

Meta-analysis, the statistical synthesis of pertinent literature to develop evidence-based conclusions, is relatively new to the field of molecular ecology, with the first meta-analysis published in the journal Molecular Ecology in 2003 (Slate & Phua 2003). The goal of this article is to formalize the definition of meta-analysis for the authors, editors, reviewers and readers of Molecular Ecology by completing a review of the meta-analyses previously published in this journal. We also provide a brief overview of the many components required for meta-analysis with a more specific discussion of the issues related to the field of molecular ecology, including the use and statistical considerations of Wright's FST and its related analogues as effect sizes in meta-analysis. We performed a literature review to identify articles published as 'meta-analyses' in Molecular Ecology, which were then evaluated by at least two reviewers. We specifically targeted Molecular Ecology publications because as a flagship journal in this field, meta-analyses published in Molecular Ecology have the potential to set the standard for meta-analyses in other journals. We found that while many of these reviewed articles were strong meta-analyses, others failed to follow standard meta-analytical techniques. One of these unsatisfactory meta-analyses was in fact a secondary analysis. Other studies attempted meta-analyses but lacked the fundamental statistics that are considered necessary for an effective and powerful meta-analysis. By drawing attention to the inconsistency of studies labelled as meta-analyses, we emphasize the importance of understanding the components of traditional meta-analyses to fully embrace the strengths of quantitative data synthesis in the field of molecular ecology. © 2015 John Wiley & Sons Ltd.

Computation of the Molenaar Sijtsma Statistic

NASA Astrophysics Data System (ADS)

Andries van der Ark, L.

The Molenaar Sijtsma statistic is an estimate of the reliability of a test score. In some special cases, computation of the Molenaar Sijtsma statistic requires provisional measures. These provisional measures have not been fully described in the literature, and we show that they have not been implemented in the software. We describe the required provisional measures so as to allow the computation of the Molenaar Sijtsma statistic for all data sets.

Statistical analysis of the limitation of half integer resonances on the available momentum acceptance of the High Energy Photon Source

NASA Astrophysics Data System (ADS)

Jiao, Yi; Duan, Zhe

2017-01-01

In a diffraction-limited storage ring, half integer resonances can have strong effects on the beam dynamics, associated with the large detuning terms from the strong focusing and strong sextupoles as required for an ultralow emittance. In this study, the limitation of half integer resonances on the available momentum acceptance (MA) was statistically analyzed based on one design of the High Energy Photon Source (HEPS). It was found that the probability of MA reduction due to crossing of half integer resonances is closely correlated with the level of beta beats at the nominal tunes, but independent of the error sources. The analysis indicated that for the presented HEPS lattice design, the rms amplitude of beta beats should be kept below 1.5% horizontally and 2.5% vertically to reach a small MA reduction probability of about 1%.

Specification of ISS Plasma Environment Variability

NASA Technical Reports Server (NTRS)

Minow, Joseph I.; Neergaard, Linda F.; Bui, Them H.; Mikatarian, Ronald R.; Barsamian, H.; Koontz, Steven L.

2004-01-01

Quantifying spacecraft charging risks and associated hazards for the International Space Station (ISS) requires a plasma environment specification for the natural variability of ionospheric temperature (Te) and density (Ne). Empirical ionospheric specification and forecast models such as the International Reference Ionosphere (IRI) model typically only provide long term (seasonal) mean Te and Ne values for the low Earth orbit environment. This paper describes a statistical analysis of historical ionospheric low Earth orbit plasma measurements from the AE-C, AE-D, and DE-2 satellites used to derive a model of deviations of observed data values from IRI-2001 estimates of Ne, Te parameters for each data point to provide a statistical basis for modeling the deviations of the plasma environment from the IRI model output. Application of the deviation model with the IRI-2001 output yields a method for estimating extreme environments for the ISS spacecraft charging analysis.

Statistical properties of the radiation belt seed population

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boyd, A. J.; Spence, H. E.; Huang, C. -L.

Here, we present a statistical analysis of phase space density data from the first 26 months of the Van Allen Probes mission. In particular, we investigate the relationship between the tens and hundreds of keV seed electrons and >1 MeV core radiation belt electron population. Using a cross-correlation analysis, we find that the seed and core populations are well correlated with a coefficient of ≈0.73 with a time lag of 10–15 h. We present evidence of a seed population threshold that is necessary for subsequent acceleration. The depth of penetration of the seed population determines the inner boundary of the acceleration process. However, we show that an enhanced seed population alone is not enough to produce acceleration in the higher energies, implying that the seed population of hundreds of keV electrons is only one of several conditions required for MeV electron radiation belt acceleration.
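
A lagged cross-correlation of the kind mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' procedure: it assumes two equally sampled series (here called `seed` and `core`, both hypothetical) and reports the non-negative lag at which the Pearson correlation between `seed[t]` and `core[t + lag]` is largest.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(seed, core, lag):
    """Correlation between seed[t] and core[t + lag], for lag >= 0 samples."""
    return pearson(seed[: len(seed) - lag], core[lag:])


def best_lag(seed, core, max_lag):
    """Lag in 0..max_lag (in samples) with the highest correlation."""
    return max(range(max_lag + 1), key=lambda lag: lagged_correlation(seed, core, lag))
```

With hourly samples, a returned lag of 12 would correspond to the core population following the seed population by about 12 h.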

Analysis of Convair 990 rejected-takeoff accident with emphasis on decision making, training and procedures

NASA Technical Reports Server (NTRS)

Batthauer, Byron E.

1987-01-01

This paper analyzes a NASA Convair 990 (CV-990) accident with emphasis on rejected-takeoff (RTO) decision making, training, procedures, and accident statistics. The NASA Aircraft Accident Investigation Board was somewhat perplexed that an aircraft could be destroyed as a result of blown tires during the takeoff roll. To provide a better understanding of tire failure RTO's, The Board obtained accident reports, Federal Aviation Administration (FAA) studies, and other pertinent information related to the elements of this accident. This material enhanced the analysis process and convinced the Accident Board that high-speed RTO's in transport aircraft should be given more emphasis during pilot training. Pilots should be made aware of various RTO situations and statistics with emphasis on failed-tire RTO's. This background information could enhance the split-second decision-making process that is required prior to initiating an RTO.

Statistical properties of the radiation belt seed population

DOE PAGES

Boyd, A. J.; Spence, H. E.; Huang, C. -L.; ...

2016-07-25

Here, we present a statistical analysis of phase space density data from the first 26 months of the Van Allen Probes mission. In particular, we investigate the relationship between the tens and hundreds of keV seed electrons and >1 MeV core radiation belt electron population. Using a cross-correlation analysis, we find that the seed and core populations are well correlated with a coefficient of ≈0.73 with a time lag of 10–15 h. We present evidence of a seed population threshold that is necessary for subsequent acceleration. The depth of penetration of the seed population determines the inner boundary of the acceleration process. However, we show that an enhanced seed population alone is not enough to produce acceleration in the higher energies, implying that the seed population of hundreds of keV electrons is only one of several conditions required for MeV electron radiation belt acceleration.

Low-dose ionizing radiation increases the mortality risk of solid cancers in nuclear industry workers: A meta-analysis

PubMed Central

Qu, Shu-Gen; Gao, Jin; Tang, Bo; Yu, Bo; Shen, Yue-Ping; Tu, Yu

2018-01-01

Low-dose ionizing radiation (LDIR) may increase the mortality of solid cancers in nuclear industry workers, but only few individual cohort studies exist, and the available reports have low statistical power. The aim of the present study was to focus on solid cancer mortality risk from LDIR in the nuclear industry using standard mortality ratios (SMRs) and 95% confidence intervals. A systematic literature search through the PubMed and Embase databases identified 27 studies relevant to this meta-analysis. There was statistical significance for total, solid and lung cancers, with meta-SMR values of 0.88, 0.80, and 0.89, respectively. There was evidence of stochastic effects by IR, but more definitive conclusions require additional analyses using standardized protocols to determine whether LDIR increases the risk of solid cancer-related mortality. PMID:29725540

Rolling-Element Fatigue Testing and Data Analysis - A Tutorial

NASA Technical Reports Server (NTRS)

Vlcek, Brian L.; Zaretsky, Erwin V.

2011-01-01

In order to rank bearing materials, lubricants and other design variables using rolling-element bench type fatigue testing of bearing components and full-scale rolling-element bearing tests, the investigator needs to be cognizant of the variables that affect rolling-element fatigue life and be able to maintain and control them within an acceptable experimental tolerance. Once these variables are controlled, the number of tests and the test conditions must be specified to assure reasonable statistical certainty of the final results. There is a reasonable correlation between the results from elemental test rigs with those results obtained with full-scale bearings. Using the statistical methods of W. Weibull and L. Johnson, the minimum number of tests required can be determined. This paper brings together and discusses the technical aspects of rolling-element fatigue testing and data analysis as well as making recommendations to assure quality and reliable testing of rolling-element specimens and full-scale rolling-element bearings.

Detection of Anomalies in Hydrometric Data Using Artificial Intelligence Techniques

NASA Astrophysics Data System (ADS)

Lauzon, N.; Lence, B. J.

2002-12-01

This work focuses on the detection of anomalies in hydrometric data sequences, such as 1) outliers, which are individual data having statistical properties that differ from those of the overall population; 2) shifts, which are sudden changes over time in the statistical properties of the historical records of data; and 3) trends, which are systematic changes over time in the statistical properties. For the purpose of the design and management of water resources systems, it is important to be aware of these anomalies in hydrometric data, for they can induce a bias in the estimation of water quantity and quality parameters. These anomalies may be viewed as specific patterns affecting the data, and therefore pattern recognition techniques can be used for identifying them. However, the number of possible patterns is very large for each type of anomaly and consequently large computing capacities are required to account for all possibilities using the standard statistical techniques, such as cluster analysis. Artificial intelligence techniques, such as the Kohonen neural network and fuzzy c-means, are clustering techniques commonly used for pattern recognition in several areas of engineering and have recently begun to be used for the analysis of natural systems. They require much less computing capacity than the standard statistical techniques, and therefore are well suited for the identification of outliers, shifts and trends in hydrometric data. This work constitutes a preliminary study, using synthetic data representing hydrometric data that can be found in Canada. The analysis of the results obtained shows that the Kohonen neural network and fuzzy c-means are reasonably successful in identifying anomalies. This work also addresses the problem of uncertainties inherent to the calibration procedures that fit the clusters to the possible patterns for both the Kohonen neural network and fuzzy c-means. 
Indeed, for the same database, different sets of clusters can be established with these calibration procedures. A simple method for analyzing uncertainties associated with the Kohonen neural network and fuzzy c-means is developed here. The method combines the results from several sets of clusters, either from the Kohonen neural network or fuzzy c-means, so as to provide an overall diagnosis as to the identification of outliers, shifts and trends. The results indicate an improvement in the performance for identifying anomalies when the method of combining cluster sets is used, compared with when only one cluster set is used.

Interpreting the Results of Weighted Least-Squares Regression: Caveats for the Statistical Consumer.

ERIC Educational Resources Information Center

Willett, John B.; Singer, Judith D.

In research, data sets often occur in which the variance of the distribution of the dependent variable at given levels of the predictors is a function of the values of the predictors. In this situation, the use of weighted least-squares (WLS) techniques is required. Weights suitable for use in a WLS regression analysis must be estimated. A…
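
To make the weighted least-squares idea concrete, here is a minimal sketch of a WLS fit of a straight line. It assumes the weights are already known (for example, the reciprocal of each observation's error variance), whereas the abstract stresses that in practice they must be estimated; the function name `wls_line` is hypothetical.

```python
def wls_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    Minimizes sum(w[i] * (y[i] - intercept - slope * x[i]) ** 2).
    Weights are assumed known, e.g. w[i] = 1 / variance of y[i].
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


if __name__ == "__main__":
    # If the error standard deviation grows in proportion to x,
    # a natural choice of weights is 1 / x**2.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [5.1, 7.8, 11.5, 13.2, 18.0]
    print(wls_line(x, y, [1.0 / xi ** 2 for xi in x]))
```

Setting all weights equal reduces the fit to ordinary least squares, which makes it easy to see how much the weighting changes the estimates.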

Surface inspection of flat products by means of texture analysis: on-line implementation using neural networks

NASA Astrophysics Data System (ADS)

Fernandez, Carlos; Platero, Carlos; Campoy, Pascual; Aracil, Rafael

1994-11-01

This paper describes some texture-based techniques that can be applied to quality assessment of flat products continuously produced (metal strips, wooden surfaces, cork, textile products, ...). Since the most difficult task is that of inspecting for product appearance, human-like inspection ability is required. A common feature of all these products is the presence of non-deterministic texture on their surfaces. Two main subjects are discussed: statistical techniques for both surface finishing determination and surface defect analysis, and real-time implementation for on-line inspection in high-speed applications. For surface finishing determination, a Gray Level Difference technique is presented that operates on low-resolution, that is, non-zoomed, images. Defect analysis is performed by means of statistical texture analysis over defective portions of the surface. On-line implementation is accomplished by means of neural networks. When a defect arises, textural analysis is applied, which results in a data vector that acts as input to a neural net previously trained in a supervised way. This approach aims at on-line performance in automated visual inspection applications where texture is present on flat product surfaces.

Imaging mass spectrometry statistical analysis.

PubMed

Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A

2012-08-30

Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.

A Review of the Study Designs and Statistical Methods Used in the Determination of Predictors of All-Cause Mortality in HIV-Infected Cohorts: 2002–2011

PubMed Central

Otwombe, Kennedy N.; Petzold, Max; Martinson, Neil; Chirwa, Tobias

2014-01-01

Background: Research in the predictors of all-cause mortality in HIV-infected people has widely been reported in literature. Making an informed decision requires understanding the methods used. Objectives: We present a review on study designs, statistical methods and their appropriateness in original articles reporting on predictors of all-cause mortality in HIV-infected people between January 2002 and December 2011. Statistical methods were compared between 2002–2006 and 2007–2011. Time-to-event analysis techniques were considered appropriate. Data Sources: Pubmed/Medline. Study Eligibility Criteria: Original English-language articles were abstracted. Letters to the editor, editorials, reviews, systematic reviews, meta-analysis, case reports and any other ineligible articles were excluded. Results: A total of 189 studies were identified (n = 91 in 2002–2006 and n = 98 in 2007–2011) out of which 130 (69%) were prospective and 56 (30%) were retrospective. One hundred and eighty-two (96%) studies described their sample using descriptive statistics while 32 (17%) made comparisons using t-tests. Kaplan-Meier methods for time-to-event analysis were commonly used in the earlier period (n = 69, 76% vs. n = 53, 54%, p = 0.002). Predictors of mortality in the two periods were commonly determined using Cox regression analysis (n = 67, 75% vs. n = 63, 64%, p = 0.12). Only 7 (4%) used advanced survival analysis methods of Cox regression analysis with frailty in which 6 (3%) were used in the later period. Thirty-two (17%) used logistic regression while 8 (4%) used other methods. There were significantly more articles from the first period using appropriate methods compared to the second (n = 80, 88% vs. n = 69, 70%, p-value = 0.003). Conclusion: Descriptive statistics and survival analysis techniques remain the most common methods of analysis in publications on predictors of all-cause mortality in HIV-infected cohorts while prospective research designs are favoured. 
Sophisticated techniques of time-dependent Cox regression and Cox regression with frailty are scarce. This motivates more training in the use of advanced time-to-event methods. PMID:24498313

Single-row, double-row, and transosseous equivalent techniques for isolated supraspinatus tendon tears with minimal atrophy: A retrospective comparative outcome and radiographic analysis at minimum 2-year followup

PubMed Central

McCormick, Frank; Gupta, Anil; Bruce, Ben; Harris, Josh; Abrams, Geoff; Wilson, Hillary; Hussey, Kristen; Cole, Brian J.

2014-01-01

Purpose: The purpose of this study was to measure and compare the subjective, objective, and radiographic healing outcomes of single-row (SR), double-row (DR), and transosseous equivalent (TOE) suture techniques for arthroscopic rotator cuff repair. Materials and Methods: A retrospective comparative analysis of arthroscopic rotator cuff repairs by one surgeon from 2004 to 2010 at minimum 2-year followup was performed. Cohorts were matched for age, sex, and tear size. Subjective outcome variables included ASES, Constant, SST, UCLA, and SF-12 scores. Objective outcome variables included strength and active range of motion (ROM). Radiographic healing was assessed by magnetic resonance imaging (MRI). Statistical analysis was performed using analysis of variance (ANOVA), Mann-Whitney and Kruskal-Wallis tests, and the Fisher exact probability test, with significance set at P < 0.05. Results: Sixty-three patients completed the study requirements (20 SR, 21 DR, 22 TOE). There was a clinically and statistically significant improvement in outcomes with all repair techniques (ASES mean improvement P < 0.0001). The mean final ASES scores were: SR 83 (SD 21.4); DR 87 (SD 18.2); TOE 87 (SD 13.2) (P = 0.73). There was a statistically significant improvement in strength for each repair technique (P < 0.001). There was no significant difference between techniques across all secondary outcome assessments: ASES improvement, Constant, SST, UCLA, SF-12, ROM, strength, and MRI re-tear rates. There was a decrease in re-tear rates from single-row (22%) to double-row (18%) to transosseous equivalent (11%); however, this difference was not statistically significant (P = 0.6). Conclusions: Compared to preoperatively, arthroscopic rotator cuff repair, using SR, DR, or TOE techniques, yielded a clinically and statistically significant improvement in subjective and objective outcomes at a minimum 2-year follow-up. Level of Evidence: Therapeutic level 3. PMID:24926159

A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.

PubMed

Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S

2016-01-01

Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings are validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. 
GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.

Cyber Risk Management for Critical Infrastructure: A Risk Analysis Model and Three Case Studies.

PubMed

Paté-Cornell, M-Elisabeth; Kuypers, Marshall; Smith, Matthew; Keller, Philip

2018-02-01

Managing cyber security in an organization involves allocating the protection budget across a spectrum of possible options. This requires assessing the benefits and the costs of these options. The risk analyses presented here are statistical when relevant data are available, and system-based for high-consequence events that have not happened yet. This article presents, first, a general probabilistic risk analysis framework for cyber security in an organization to be specified. It then describes three examples of forward-looking analyses motivated by recent cyber attacks. The first one is the statistical analysis of an actual database, extended at the upper end of the loss distribution by a Bayesian analysis of possible, high-consequence attack scenarios that may happen in the future. The second is a systems analysis of cyber risks for a smart, connected electric grid, showing that there is an optimal level of connectivity. The third is an analysis of sequential decisions to upgrade the software of an existing cyber security system or to adopt a new one to stay ahead of adversaries trying to find their way in. The results are distributions of losses to cyber attacks, with and without some considered countermeasures in support of risk management decisions based both on past data and anticipated incidents. © 2017 Society for Risk Analysis.

Combined statistical analyses for long-term stability data with multiple storage conditions: a simulation study.

PubMed

Almalik, Osama; Nijhuis, Michiel B; van den Heuvel, Edwin R

2014-01-01

Shelf-life estimation usually requires that at least three registration batches are tested for stability at multiple storage conditions. The shelf-life estimates are often obtained by linear regression analysis per storage condition, an approach implicitly suggested by ICH guideline Q1E. A linear regression analysis combining all data from multiple storage conditions was recently proposed in the literature when variances are homogeneous across storage conditions. The combined analysis is expected to perform better than the separate analysis per storage condition, since pooling data would lead to an improved estimate of the variation and higher numbers of degrees of freedom, but this is not evident for shelf-life estimation. Indeed, the two approaches treat the observed initial batch results, the intercepts in the model, and poolability of batches differently, which may eliminate or reduce the expected advantage of the combined approach with respect to the separate approach. Therefore, a simulation study was performed to compare the distribution of simulated shelf-life estimates on several characteristics between the two approaches and to quantify the difference in shelf-life estimates. In general, the combined statistical analysis does estimate the true shelf life more consistently and precisely than the analysis per storage condition, but it did not outperform the separate analysis in all circumstances.

Computational Analysis for Rocket-Based Combined-Cycle Systems During Rocket-Only Operation

NASA Technical Reports Server (NTRS)

Steffen, C. J., Jr.; Smith, T. D.; Yungster, S.; Keller, D. J.

2000-01-01

A series of Reynolds-averaged Navier-Stokes calculations were employed to study the performance of rocket-based combined-cycle systems operating in an all-rocket mode. This parametric series of calculations were executed within a statistical framework, commonly known as design of experiments. The parametric design space included four geometric and two flowfield variables set at three levels each, for a total of 729 possible combinations. A D-optimal design strategy was selected. It required that only 36 separate computational fluid dynamics (CFD) solutions be performed to develop a full response surface model, which quantified the linear, bilinear, and curvilinear effects of the six experimental variables. The axisymmetric, Reynolds-averaged Navier-Stokes simulations were executed with the NPARC v3.0 code. The response used in the statistical analysis was created from Isp efficiency data integrated from the 36 CFD simulations. The influence of turbulence modeling was analyzed by using both one- and two-equation models. Careful attention was also given to quantify the influence of mesh dependence, iterative convergence, and artificial viscosity upon the resulting statistical model. Thirteen statistically significant effects were observed to have an influence on rocket-based combined-cycle nozzle performance. It was apparent that the free-expansion process, directly downstream of the rocket nozzle, can influence the Isp efficiency. Numerical schlieren images and particle traces have been used to further understand the physical phenomena behind several of the statistically significant results.
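The run counts quoted in this abstract can be checked with a short sketch. It assumes, as an illustration only, that the response surface is a full second-order model (intercept, linear, two-factor "bilinear", and pure quadratic "curvilinear" terms); the abstract does not list the model terms, and the variable names below are hypothetical.

```python
from itertools import product
from math import comb

N_FACTORS = 6   # four geometric + two flowfield variables (from the abstract)
N_LEVELS = 3    # each variable set at three levels (from the abstract)
N_RUNS = 36     # CFD solutions required by the D-optimal design (from the abstract)

# Enumerate the full factorial design space: every combination of levels.
full_factorial = list(product(range(N_LEVELS), repeat=N_FACTORS))

# Coefficients in an assumed full second-order response surface model:
# intercept + linear terms + two-factor interactions + pure quadratic terms.
n_terms = 1 + N_FACTORS + comb(N_FACTORS, 2) + N_FACTORS

# Runs left over for estimating error once every coefficient is fitted.
spare_runs = N_RUNS - n_terms

print(len(full_factorial), n_terms, spare_runs)
```

Under that assumption the 36 runs exceed the number of model coefficients, which is consistent with the abstract's statement that 36 solutions sufficed for a full response surface model.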

Efforts to improve international migration statistics: a historical perspective.

PubMed

Kraly, E P; Gnanasekaran, K S

1987-01-01

During the past decade, the international statistical community has made several efforts to develop standards for the definition, collection and publication of statistics on international migration. This article surveys the history of official initiatives to standardize international migration statistics by reviewing the recommendations of the International Statistical Institute, International Labor Organization, and the UN, and reports a recently proposed agenda for moving toward comparability among national statistical systems. Heightening awareness of the benefits of exchange and creating motivation to implement international standards requires a 3-pronged effort from the international statistical community. 1st, it is essential to continue discussion about the significance of improvement, specifically standardization, of international migration statistics. The move from theory to practice in this area requires ongoing focus by migration statisticians so that conformity to international standards itself becomes a criterion by which national statistical practices are examined and assessed. 2nd, the countries should be provided with technical documentation to support and facilitate the implementation of the recommended statistical systems. Documentation should be developed with an understanding that conformity to international standards for migration and travel statistics must be achieved within existing national statistical programs. 3rd, the call for statistical research in this area requires more efforts by the community of migration statisticians, beginning with the mobilization of bilateral and multilateral resources to undertake the preceding list of activities.

Some evidentiary considerations for physician billing.

PubMed

Rooks, Franklin J

2011-01-01

In a criminal prosecution for medical billing fraud alleging up-coding and overbilling, the government's evidence may encompass the practice's entire billings and draw inferences from them. In addition, fraud may be demonstrated through statistical analysis comparing a physician's billings relative to other providers of the same specialty. The Federal Rules of Evidence govern the admissibility of evidence during a trial, to provide fairness for both the prosecution and the defense. Physicians and practice managers should be well versed in the billing requirements and particularly careful when CPT codes are expressed in terms of "required times" as opposed to "typical times."

Mars Microprobe Entry Analysis

NASA Technical Reports Server (NTRS)

Braun, Robert D.; Mitcheltree, Robert A.; Cheatwood, F. McNeil

1998-01-01

The Mars Microprobe mission will provide the first opportunity for subsurface measurements, including water detection, near the south pole of Mars. In this paper, performance of the Microprobe aeroshell design is evaluated through development of a six-degree-of-freedom (6-DOF) aerodynamic database and flight dynamics simulation. Numerous mission uncertainties are quantified and a Monte-Carlo analysis is performed to statistically assess mission performance. Results from this 6-DOF Monte-Carlo simulation demonstrate that, in a majority of the cases (approximately 2-sigma), the penetrator impact conditions are within current design tolerances. Several trajectories are identified in which the current set of impact requirements are not satisfied. From these cases, critical design parameters are highlighted and additional system requirements are suggested. In particular, a relatively large angle-of-attack range near peak heating is identified.

Statistical principle and methodology in the NISAN system.

PubMed Central

Asano, C

1979-01-01

The NISAN system is a new interactive statistical analysis program package constructed by an organization of Japanese statisticians. The package is widely available for both statistical situations, confirmatory analysis and exploratory analysis, and is planned to obtain statistical wisdom and to choose the optimal process of statistical analysis for senior statisticians. PMID:540594

CMM Data Analysis Tool

DOE Office of Scientific and Technical Information (OSTI.GOV)

Due to the increase in the use of Coordinate Measuring Machines (CMMs) to measure fine details and complex geometries in manufacturing, many programs have been made to compile and analyze the data. These programs typically require extensive setup to determine the expected results in order to not only track the pass/fail of a dimension, but also to use statistical process control (SPC). These extra steps and setup times have been addressed through the CMM Data Analysis Tool, which only requires the output of the CMM to provide both pass/fail analysis on all parts run to the same inspection program as well as provide graphs which help visualize where the part measures within the allowed tolerances. This provides feedback not only to the customer for approval of a part during development, but also to machining process engineers to identify when any dimension is drifting towards an out of tolerance condition during production. This program can handle hundreds of parts with complex dimensions and will provide an analysis within minutes.

The effect of dexmedetomidine continuous infusion as an adjuvant to general anesthesia on sevoflurane requirements: A study based on entropy analysis

PubMed Central

Patel, Chirag Ramanlal; Engineer, Smita R; Shah, Bharat J; Madhu, S

2013-01-01

Background: Dexmedetomidine, an α2 agonist used as an adjuvant in general anesthesia, has anesthetic- and analgesic-sparing properties. Aims: To evaluate the effect of continuous infusion of dexmedetomidine alone, without use of opioids, on the requirement of sevoflurane during general anesthesia with continuous monitoring of depth of anesthesia by entropy analysis. Materials and Methods: Sixty patients were randomly divided into 2 groups of 30 each. In group A, fentanyl 2 mcg/kg was given while in group B, dexmedetomidine was given intravenously as a loading dose of 1 mcg/kg over 10 min prior to induction. After induction with thiopentone in group B, dexmedetomidine was given as an infusion at a dose of 0.2-0.8 mcg/kg. Sevoflurane was used as the inhalation agent in both groups. Hemodynamic variables, sevoflurane inspired fraction (FIsevo), sevoflurane expired fraction (ETsevo), and entropy (response entropy and state entropy) were continuously recorded. Statistical analysis was done by unpaired Student's t-test and Chi-square test for continuous and categorical variables, respectively. A P-value < 0.05 was considered significant. Results: The use of dexmedetomidine with sevoflurane was associated with a statistically significant decrease in ETsevo at 5 minutes post-intubation (1.49 ± 0.11) and 60 minutes post-intubation (1.11 ± 0.28) as compared to group A [1.73 ± 0.30 (5 minutes); 1.68 ± 0.50 (60 minutes)]. There was an average 21.5% decrease in ETsevo in group B as compared to group A. Conclusions: Dexmedetomidine, as an adjuvant in general anesthesia, decreases the requirement of sevoflurane for maintaining adequate depth of anesthesia. PMID:24106354

The influence of anthropometrics on physical employment standard performance.

PubMed

Reilly, T; Spivock, M; Prayal-Brown, A; Stockbrugger, B; Blacklock, R

2016-10-01

The Canadian Armed Forces (CAF) recently implemented the Fitness for Operational Requirements of CAF Employment (FORCE), a new physical employment standard (PES). Data collection throughout development included anthropometric profiles of the CAF. To determine if anthropometric measurements and demographic information would predict the performance outcomes of the FORCE and/or Common Military Task Fitness Evaluation (CMTFE). We conducted a secondary analysis of data from FORCE research. We obtained bioelectrical impedance and segmental analysis. Statistical analysis included correlation and linear regression analyses. Among the 668 study subjects, as predicted, any task requiring lifting, pulling or moving of an object was significantly and positively correlated (r > 0.67) to lean body mass (LBM) measurements. LBM correlated with stretcher carry (r = 0.78) and with lifting actions such as sand bag drag (r = 0.77), vehicle extrication (r = 0.71), sand bag fortification (r = 0.68) and sand bag lift time (r = -0.67). The difference between the correlation of dead mass (DM) with task performance compared with LBM was not statistically significant. DM and LBM can be used in a PES to predict success on military tasks such as casualty evacuation and manual material handling. However, there is no minimum LBM required to perform these tasks successfully. These data direct future research on how we should diversify research participants by anthropometrics, in addition to the traditional demographic variables of gender and age, to highlight potential important adverse impact with PES design. In addition, the results can be used to develop better training regimens to facilitate passing a PES. © All rights reserved. ‘The Influence of Anthropometrics on Physical Employment Standard Performance’ has been reproduced with the permission of DND, 2016.

Tranexamic acid versus aminocaproic acid for blood management after total knee and total hip arthroplasty: A systematic review and meta-analysis.

PubMed

Liu, Qiuliang; Geng, Peishuo; Shi, Longyan; Wang, Qi; Wang, Pengliang

2018-06-01

To compare the efficacy and safety of tranexamic acid and aminocaproic acid for reducing blood loss and transfusion requirements after total knee and total hip arthroplasty. We conduct electronic searches of Medline (1966-2017.11), PubMed (1966-2017.11), Embase (1980-2017.11), ScienceDirect (1985-2017.11) and the Cochrane Library (1900-2017.11). The primary outcomes include total blood loss, hemoglobin decline and transfusion requirements. Secondary outcomes include length of hospital stay and postoperative complications such as the incidence of deep vein thrombosis and pulmonary embolism. Each outcome is combined and calculated using the statistical software STATA 12.0. A fixed- or random-effects model is adopted based on the heterogeneity assessed by the I² statistic. A total of 1714 patients are analyzed across three randomized controlled trials (RCTs) and one non-RCT. The present meta-analysis reveals that TXA is associated with a significant reduction of total blood loss and postoperative hemoglobin drop compared with EACA. No significant differences are identified in terms of transfusion rates, length of hospital stay, and the incidence of postoperative complications. Although total blood loss and postoperative hemoglobin drop are significantly greater in EACA groups, there is no significant difference between TXA and EACA groups in terms of transfusion rates. Based on the current evidence available, higher quality RCTs are still required for further research. Copyright © 2018 IJS Publishing Group Ltd. Published by Elsevier Ltd. All rights reserved.
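The model-selection step mentioned in this abstract can be illustrated with a minimal sketch of the I² heterogeneity statistic, computed from Cochran's Q with inverse-variance weights. The effect sizes and variances are hypothetical, not data from the review, and the 50% cut-off for switching to a random-effects model is a common convention assumed here; the abstract does not state the threshold the authors used.

```python
def i_squared(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared in percent)."""
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical per-study mean differences and their variances.
pooled, q, i2 = i_squared([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])

# Assumed convention: I-squared above 50% suggests a random-effects model.
model = "random" if i2 > 50.0 else "fixed"
print(round(pooled, 3), round(q, 3), round(i2, 1), model)
```

With these made-up inputs, Q is about 8 on 2 degrees of freedom and I² is about 75%, so the sketch would select a random-effects model.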

An integrated user-friendly ArcMAP tool for bivariate statistical modeling in geoscience applications

NASA Astrophysics Data System (ADS)

Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusof, Z.; Tehrany, M. S.

2014-10-01

Modeling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modeling. Bivariate statistical analysis (BSA) assists in hazard modeling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, BSM (bivariate statistical modeler), for BSA technique is proposed. Three popular BSA techniques such as frequency ratio, weights-of-evidence, and evidential belief function models are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and is created by a simple graphical user interface, which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.

An integrated user-friendly ArcMAP tool for bivariate statistical modelling in geoscience applications

NASA Astrophysics Data System (ADS)

Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusoff, Z. M.; Tehrany, M. S.

2015-03-01

Modelling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modelling. Bivariate statistical analysis (BSA) assists in hazard modelling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time-consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, bivariate statistical modeler (BSM), for BSA technique is proposed. Three popular BSA techniques, such as frequency ratio, weight-of-evidence (WoE), and evidential belief function (EBF) models, are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and created by a simple graphical user interface (GUI), which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve (AUC) is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.
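Of the three BSA techniques named in this abstract, the frequency ratio is the simplest to show in code. The sketch below uses the standard definition (the share of hazard pixels falling in a class divided by the share of all pixels in that class); it is not the BSM tool's own code, and the class labels and pixel counts are hypothetical.

```python
def frequency_ratio(class_pixels, hazard_pixels):
    """Frequency ratio per class of one conditioning factor.

    class_pixels:  total pixel count in each class
    hazard_pixels: count of hazard (e.g. landslide or flood) pixels in each class
    """
    total_pixels = sum(class_pixels.values())
    total_hazard = sum(hazard_pixels.values())
    return {
        name: (hazard_pixels[name] / total_hazard) / (class_pixels[name] / total_pixels)
        for name in class_pixels
    }


# Hypothetical slope classes (degrees) with made-up pixel counts.
slope_pixels = {"0-5": 500, "5-15": 300, "15+": 200}
slope_hazard = {"0-5": 10, "5-15": 30, "15+": 60}

fr = frequency_ratio(slope_pixels, slope_hazard)
for name, value in fr.items():
    print(name, round(value, 2))
```

A ratio above 1 marks a class in which the hazard occurs more often than its area alone would suggest; a ratio below 1 marks the opposite.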

Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

NASA Astrophysics Data System (ADS)

Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

2018-07-01

Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10⁴ simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.

Space Shuttle booster thrust imbalance analysis

NASA Technical Reports Server (NTRS)

Bailey, W. R.; Blackwell, D. L.

1985-01-01

An analysis of the Shuttle SRM thrust imbalance during the steady-state and tailoff portions of the boost phase of flight is presented. Results from flights STS-1 through STS-13 are included. A statistical analysis of the observed thrust imbalance data is presented. A 3-sigma thrust imbalance history versus time was generated from the observed data and is compared to the vehicle design requirements. The effect on Shuttle thrust imbalance from the use of replacement SRM segments is predicted. Comparisons of observed thrust imbalances with respect to predicted imbalances are presented for the two Space Shuttle flights which used replacement aft segments (STS-9 and STS-13).

Atom-scale compositional distribution in InAlAsSb-based triple junction solar cells by atom probe tomography.

PubMed

Hernández-Saz, J; Herrera, M; Delgado, F J; Duguay, S; Philippe, T; Gonzalez, M; Abell, J; Walters, R J; Molina, S I

2016-07-29

The analysis by atom probe tomography (APT) of InAlAsSb layers with applications in triple junction solar cells (TJSCs) has shown the existence of In- and Sb-rich regions in the material. The composition variation found is not evident from the direct observation of the 3D atomic distribution and because of this a statistical analysis has been required. From previous analysis of these samples, it is shown that the small compositional fluctuations determined have a strong effect on the optical properties of the material and ultimately on the performance of TJSCs.

Space biology initiative program definition review. Trade study 4: Design modularity and commonality

NASA Technical Reports Server (NTRS)

Jackson, L. Neal; Crenshaw, John, Sr.; Davidson, William L.; Herbert, Frank J.; Bilodeau, James W.; Stoval, J. Michael; Sutton, Terry

1989-01-01

The relative cost impacts (up or down) of developing Space Biology hardware using design modularity and commonality are studied. Recommendations for how the hardware development should be accomplished to meet optimum design modularity requirements for Life Science investigation hardware will be provided. In addition, the relative cost impacts of implementing commonality of hardware for all Space Biology hardware are defined. Cost analysis and supporting recommendations for levels of modularity and commonality are presented. A mathematical or statistical cost analysis method with the capability to support development of production design modularity and commonality impacts to parametric cost analysis is provided.

Examining the effectiveness of discriminant function analysis and cluster analysis in species identification of male field crickets based on their calling songs.

PubMed

Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini

2013-01-01

Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. 
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.

THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures

PubMed Central

Theobald, Douglas L.; Wuttke, Deborah S.

2008-01-01

Summary THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907

An Elementary Algorithm for Autonomous Air Terminal Merging and Interval Management

NASA Technical Reports Server (NTRS)

White, Allan L.

2017-01-01

A central element of air traffic management is the safe merging and spacing of aircraft during the terminal area flight phase. This paper derives and examines an algorithm for the merging and interval managing problem for Standard Terminal Arrival Routes. It describes a factor analysis for performance based on the distribution of arrivals, the operating period of the terminal, and the topology of the arrival routes; then presents results from a performance analysis and from a safety analysis for a realistic topology based on typical routes for a runway at Phoenix International Airport. The heart of the safety analysis is a statistical derivation on how to conduct a safety analysis for a local simulation when the safety requirement is given for the entire airspace.

Competent statistical programmer: Need of business process outsourcing industry

PubMed Central

Khan, Imran

2014-01-01

Over the last two decades, Business Process Outsourcing (BPO) has evolved into a much more mature practice. India is regarded as a preferred destination for pharmaceutical outsourcing because of cost arbitrage. Within biometrics outsourcing, statistical programming and analysis require a very niche skill set for service delivery. Demand and supply are imbalanced owing to a high churn rate and a short supply of competent programmers. The industry is moving from task delivery to ownership and accountability. The paradigm shift from outsourcing to consulting is triggering the need for competent statistical programmers. Programmers should be trained in technical, analytical, problem-solving, decision-making and soft skills, as customer expectations are changing from task delivery to accountability for the project. This paper highlights the common issues the SAS programming service industry is facing and the skills programmers need to develop to cope with these changes. PMID:24987578

Competent statistical programmer: Need of business process outsourcing industry.

PubMed

Khan, Imran

2014-07-01

Over the last two decades, Business Process Outsourcing (BPO) has evolved into a much more mature practice. India is regarded as a preferred destination for pharmaceutical outsourcing because of cost arbitrage. Within biometrics outsourcing, statistical programming and analysis require a very niche skill set for service delivery. Demand and supply are imbalanced owing to a high churn rate and a short supply of competent programmers. The industry is moving from task delivery to ownership and accountability. The paradigm shift from outsourcing to consulting is triggering the need for competent statistical programmers. Programmers should be trained in technical, analytical, problem-solving, decision-making and soft skills, as customer expectations are changing from task delivery to accountability for the project. This paper highlights the common issues the SAS programming service industry is facing and the skills programmers need to develop to cope with these changes.

A Statistical Framework for the Functional Analysis of Metagenomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharon, Itai; Pati, Amrita; Markowitz, Victor

2008-10-01

Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.

[The metrology of uncertainty: a study of vital statistics from Chile and Brazil].

PubMed

Carvajal, Yuri; Kottow, Miguel

2012-11-01

This paper addresses the issue of uncertainty in the measurements used in public health analysis and decision-making. The Shannon-Wiener entropy measure was adapted to express the uncertainty contained in counting causes of death in official vital statistics from Chile. Based on the findings, the authors conclude that metrological requirements in public health are as important as the measurements themselves. The study also considers and argues for the existence of uncertainty associated with the statistics' performative properties, both by the way the data are structured as a sort of syntax of reality and by exclusion of what remains beyond the quantitative modeling used in each case. Following the legacy of pragmatic thinking and using conceptual tools from the sociology of translation, the authors emphasize that by taking uncertainty into account, public health can contribute to a discussion on the relationship between technology, democracy, and formation of a participatory public.
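
As a rough illustration of the entropy measure this abstract refers to, the sketch below computes Shannon entropy over a set of cause-of-death counts. It is not the authors' adaptation: the function name `shannon_entropy` and the counts are hypothetical, and the normalization by the maximum entropy is an added assumption for readability.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as non-negative counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    entropy = 0.0
    for c in counts:
        if c < 0:
            raise ValueError("counts must be non-negative")
        if c > 0:  # by convention, 0 * log(0) contributes nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical death counts by cause (illustrative numbers, not from the paper).
deaths_by_cause = {
    "circulatory": 40,
    "neoplasms": 30,
    "respiratory": 20,
    "ill-defined": 10,
}

h = shannon_entropy(list(deaths_by_cause.values()))
h_max = math.log2(len(deaths_by_cause))  # entropy if all causes were equally likely
print(f"H = {h:.3f} bits, normalized = {h / h_max:.3f}")
```

A value near the maximum means deaths are spread evenly across categories; a value near zero means they concentrate in a single category.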

Online Updating of Statistical Inference in the Big Data Setting.

PubMed

Schifano, Elizabeth D; Wu, Jing; Wang, Chun; Yan, Jun; Chen, Ming-Hui

2016-01-01

We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness-of-fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting.
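
The idea of updating a linear model as data arrive, without keeping the historical records, can be sketched for the simplest case of one predictor. This is a minimal illustration, not the estimator proposed in the paper: the class `OnlineSimpleRegression` is hypothetical, it stores only five running sums, and it does not handle the rank-deficiency or inference questions the abstract mentions.

```python
class OnlineSimpleRegression:
    """Least-squares fit of y = a + b*x, updated batch by batch.

    Only running sums are stored, so past observations can be discarded.
    """

    def __init__(self):
        self.n = 0
        self.sx = 0.0   # sum of x
        self.sy = 0.0   # sum of y
        self.sxx = 0.0  # sum of x*x
        self.sxy = 0.0  # sum of x*y

    def update(self, xs, ys):
        """Fold a new batch of observations into the running sums."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        """Return (intercept, slope) from the sums accumulated so far."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


# Two batches drawn from the exact line y = 1 + 2x (illustrative data).
model = OnlineSimpleRegression()
model.update([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
model.update([3.0, 4.0], [7.0, 9.0])
print(model.coefficients())
```

The estimates after the second batch equal those from a single fit on all five points, which is the property that makes storage of the stream unnecessary in this simple setting.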

Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models

PubMed Central

Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong

2017-01-01

To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data. PMID:28000696

Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models.

PubMed

Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong

2017-02-01

To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.

Learning the Language of Statistics: Challenges and Teaching Approaches

ERIC Educational Resources Information Center

Dunn, Peter K.; Carey, Michael D.; Richardson, Alice M.; McDonald, Christine

2016-01-01

Learning statistics requires learning the language of statistics. Statistics draws upon words from general English, mathematical English, discipline-specific English and words used primarily in statistics. This leads to many linguistic challenges in teaching statistics and the way in which the language is used in statistics creates an extra layer…

Statistical Analysis of Research Data | Center for Cancer Research

Cancer.gov

Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.

Study designs, use of statistical tests, and statistical analysis software choice in 2015: Results from two Pakistani monthly Medline indexed journals.

PubMed

Shaikh, Masood Ali

2017-09-01

Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes helps determine the research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP) in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that the cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and the statistical software programme SPSS were the most common study design, inferential statistical analysis, and statistical analysis software programme, respectively. These results echo a previously published assessment of these two journals for the year 2014.

Mapping the global health employment market: an analysis of global health jobs.

PubMed

Keralis, Jessica M; Riggin-Pathak, Brianne L; Majeski, Theresa; Pathak, Bogdan A; Foggia, Janine; Cullinen, Kathleen M; Rajagopal, Abbhirami; West, Heidi S

2018-02-27

The number of university global health training programs has grown in recent years. However, there is little research on the needs of the global health profession. We therefore set out to characterize the global health employment market by analyzing global health job vacancies. We collected data from advertised, paid positions posted to web-based job boards, email listservs, and global health organization websites from November 2015 to May 2016. Data on requirements for education, language proficiency, technical expertise, physical location, and experience level were analyzed for all vacancies. Descriptive statistics were calculated for the aforementioned job characteristics. Associations between technical specialty area and requirements for non-English language proficiency and overseas experience were calculated using Chi-square statistics. A qualitative thematic analysis was performed on a subset of vacancies. We analyzed the data from 1007 global health job vacancies from 127 employers. Among private and non-profit sector vacancies, 40% (n = 354) were for technical or subject matter experts, 20% (n = 177) for program directors, and 16% (n = 139) for managers, compared to 9.8% (n = 87) for entry-level and 13.6% (n = 120) for mid-level positions. The most common technical focus area was program or project management, followed by HIV/AIDS and quantitative analysis. Thematic analysis demonstrated a common emphasis on program operations, relations, design and planning, communication, and management. Our analysis shows a demand for candidates with several years of experience with global health programs, particularly program managers/directors and technical experts, with very few entry-level positions accessible to recent graduates of global health training programs. It is unlikely that global health training programs equip graduates to be competitive for the majority of positions that are currently available in this field.
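
The chi-square association test mentioned in this abstract can be sketched as follows. The function `chi_square_statistic` and the 2×2 table are hypothetical and are not taken from the study; the sketch computes only the Pearson statistic and its degrees of freedom, leaving the p-value to a statistics library or a table.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, all rows the same length.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("a row or column has no observations")
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows are two technical specialty areas, columns are
# vacancies that do / do not require non-English language proficiency.
observed = [
    [10, 20],
    [20, 10],
]
stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.3f} on {dof} degree(s) of freedom")
```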

Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

PubMed

Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy

2014-11-01

Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
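
Of the word-frequency measures listed in this abstract, the D2 statistic is the simplest to sketch: it is the inner product of the two sequences' k-word count vectors. The helper names `kmer_counts` and `d2` and the example sequences are hypothetical, and the sketch uses raw counts without the normalizations that refined variants of D2 apply.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """D2 statistic: sum over shared k-words of (count in a) * (count in b)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for words it has never seen, so unshared words drop out.
    return sum(count * counts_b[word] for word, count in counts_a.items())


# Illustrative sequences only.
print(d2("ACGTACGT", "ACGTTTTT", k=2))
```

Because no alignment is computed, the cost grows linearly with sequence length rather than with the product of the two lengths, which is the computational advantage the review points to.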

ACCOUNTING FOR CALIBRATION UNCERTAINTIES IN X-RAY ANALYSIS: EFFECTIVE AREAS IN SPECTRAL FITTING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Hyunsook; Kashyap, Vinay L.; Drake, Jeremy J.

2011-04-20

While considerable advance has been made to account for statistical uncertainties in astronomical analyses, systematic instrumental uncertainties have been generally ignored. This can be crucial to a proper interpretation of analysis results because instrumental calibration uncertainty is a form of systematic uncertainty. Ignoring it can underestimate error bars and introduce bias into the fitted values of model parameters. Accounting for such uncertainties currently requires extensive case-specific simulations if using existing analysis packages. Here, we present general statistical methods that incorporate calibration uncertainties into spectral analysis of high-energy data. We first present a method based on multiple imputation that can be applied with any fitting method, but is necessarily approximate. We then describe a more exact Bayesian approach that works in conjunction with a Markov chain Monte Carlo based fitting. We explore methods for improving computational efficiency, and in particular detail a method of summarizing calibration uncertainties with a principal component analysis of samples of plausible calibration files. This method is implemented using recently codified Chandra effective area uncertainties for low-resolution spectral analysis and is verified using both simulated and actual Chandra data. Our procedure for incorporating effective area uncertainty is easily generalized to other types of calibration uncertainties.

Can upstaging of ductal carcinoma in situ be predicted at biopsy by histologic and mammographic features?

NASA Astrophysics Data System (ADS)

Shi, Bibo; Grimm, Lars J.; Mazurowski, Maciej A.; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.

2017-03-01

Reducing the overdiagnosis and overtreatment associated with ductal carcinoma in situ (DCIS) requires accurate prediction of the invasive potential at cancer screening. In this work, we investigated the utility of pre-operative histologic and mammographic features to predict upstaging of DCIS. The goal was to provide intentionally conservative baseline performance using readily available data from radiologists and pathologists and only linear models. We conducted a retrospective analysis on 99 patients with DCIS. Of those, 25 were upstaged to invasive cancer at the time of definitive surgery. Pre-operative factors including both the histologic features extracted from stereotactic core needle biopsy (SCNB) reports and the mammographic features annotated by an expert breast radiologist were investigated with statistical analysis. Furthermore, we built classification models based on those features in an attempt to predict the presence of an occult invasive component in DCIS, with generalization performance assessed by receiver operating characteristic (ROC) curve analysis. Histologic features including nuclear grade and DCIS subtype did not show statistically significant differences between cases with pure DCIS and with DCIS plus invasive disease. However, three mammographic features, i.e., the major axis length of the DCIS lesion, the BI-RADS level of suspicion, and the radiologist's assessment, did achieve statistical significance. Using those three statistically significant features as input, a linear discriminant model was able to distinguish patients with DCIS plus invasive disease from those with pure DCIS, with AUC-ROC equal to 0.62. Overall, mammograms used for breast screening contain useful information that can be perceived by radiologists and help predict occult invasive components in DCIS.
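
For readers unfamiliar with the AUC-ROC figure quoted in this abstract, the sketch below computes it from classifier scores using the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive case scores higher, with ties counted as half. The function `auc_roc` and the scores are hypothetical and are unrelated to the study's data or model.

```python
def auc_roc(pos_scores, neg_scores):
    """Area under the ROC curve from scores of positive and negative cases."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: higher means "more likely upstaged".
upstaged = [0.9, 0.8]
pure_dcis = [0.1, 0.85]
print(auc_roc(upstaged, pure_dcis))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.62 indicates modest discrimination.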

Can Propensity Score Analysis Approximate Randomized Experiments Using Pretest and Demographic Information in Pre-K Intervention Research?

PubMed

Dong, Nianbo; Lipsey, Mark W

2017-01-01

It is unclear whether propensity score analysis (PSA) based on pretest and demographic covariates will meet the ignorability assumption for replicating the results of randomized experiments. This study applies within-study comparisons to assess whether pre-Kindergarten (pre-K) treatment effects on achievement outcomes estimated using PSA based on a pretest and demographic covariates can approximate those found in a randomized experiment. Data: Four studies with samples of pre-K children each provided data on two math achievement outcome measures with baseline pretests and child demographic variables that included race, gender, age, language spoken at home, and mother's highest education. Research Design and Data Analysis: A randomized study of a pre-K math curriculum provided benchmark estimates of effects on achievement measures. Comparison samples from other pre-K studies were then substituted for the original randomized control and the effects were reestimated using PSA. The correspondence was evaluated using multiple criteria. The effect estimates using PSA were in the same direction as the benchmark estimates, had similar but not identical statistical significance, and did not differ from the benchmarks at statistically significant levels. However, the magnitude of the effect sizes differed and displayed both absolute and relative bias larger than required to show statistical equivalence with formal tests, but those results were not definitive because of the limited statistical power. We conclude that treatment effect estimates based on a single pretest and demographic covariates in PSA correspond to those from a randomized experiment on the most general criteria for equivalence.

A Quality Management Evaluation of the Graduate Education Process for Ocean Engineers in the Civil Engineer Corps

DTIC Science & Technology

1993-12-01

graduate education required for Ocean Facilities Program (OFP) officers in the Civil Engineer Corps (CEC) of the United States Navy. For the purpose...determined by distributing questionnaires to all officers in the OFP. Statistical analyses of numerical data and judgmental analysis of professional...45 B. Ocean Facility Program Officer Graduate Education Questionnaire ....... 47 C. Summary of Questionnaire Responses

General Blending Models for Data From Mixture Experiments

PubMed Central

Brown, L.; Donev, A. N.; Bissett, A. C.

2015-01-01

We propose a new class of models providing a powerful unification and extension of existing statistical methodology for analysis of data obtained in mixture experiments. These models, which integrate models proposed by Scheffé and Becker, extend considerably the range of mixture component effects that may be described. They become complex when the studied phenomenon requires it, but remain simple whenever possible. This article has supplementary material online. PMID:26681812

Prototyping with Data Dictionaries for Requirements Analysis.

DTIC Science & Technology

1985-03-01

statistical packages and software for screen layout. These items work at a higher level than another category of prototyping tool, program generators... Program generators are software packages which, when given specifications, produce source listings, usually in a high order language such as COBOL...with users and this will not happen if he must stop to develop a detailed program. [Ref. 241] Hardware as well as software should be considered in

20 CFR 634.4 - Statistical standards.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 20 Employees' Benefits 3 2011-04-01 2011-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs. ...

20 CFR 634.4 - Statistical standards.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 20 Employees' Benefits 3 2010-04-01 2010-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs. ...

Genome-wide association study identifies multiple loci associated with bladder cancer risk

PubMed Central

Figueroa, Jonine D.; Ye, Yuanqing; Siddiq, Afshan; Garcia-Closas, Montserrat; Chatterjee, Nilanjan; Prokunina-Olsson, Ludmila; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Dinney, Colin P.; Malats, Núria; Baris, Dalsu; Purdue, Mark; Jacobs, Eric J.; Albanes, Demetrius; Wang, Zhaoming; Deng, Xiang; Chung, Charles C.; Tang, Wei; Bas Bueno-de-Mesquita, H.; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Kamat, Ashish M.; Lerner, Seth P.; Barton Grossman, H.; Lin, Jie; Gu, Jian; Pu, Xia; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Kogevinas, Manolis; Tardón, Adonina; Serra, Consol; Carrato, Alfredo; García-Closas, Reina; Lloreta, Josep; Schwenn, Molly; Karagas, Margaret R.; Johnson, Alison; Schned, Alan; Armenti, Karla R.; Hosain, G.M.; Andriole, Gerald; Grubb, Robert; Black, Amanda; Ryan Diver, W.; Gapstur, Susan M.; Weinstein, Stephanie J.; Virtamo, Jarmo; Haiman, Chris A.; Landi, Maria T.; Caporaso, Neil; Fraumeni, Joseph F.; Vineis, Paolo; Wu, Xifeng; Silverman, Debra T.; Chanock, Stephen; Rothman, Nathaniel

2014-01-01

Candidate gene and genome-wide association studies (GWAS) have identified 11 independent susceptibility loci associated with bladder cancer risk. To discover additional risk variants, we conducted a new GWAS of 2422 bladder cancer cases and 5751 controls, followed by a meta-analysis with two independently published bladder cancer GWAS, resulting in a combined analysis of 6911 cases and 11 814 controls of European descent. TaqMan genotyping of 13 promising single nucleotide polymorphisms with P < 1 × 10⁻⁵ was pursued in a follow-up set of 801 cases and 1307 controls. Two new loci achieved genome-wide statistical significance: rs10936599 on 3q26.2 (P = 4.53 × 10⁻⁹) and rs907611 on 11p15.5 (P = 4.11 × 10⁻⁸). Two notable loci were also identified that approached genome-wide statistical significance: rs6104690 on 20p12.2 (P = 7.13 × 10⁻⁷) and rs4510656 on 6p22.3 (P = 6.98 × 10⁻⁷); these require further studies for confirmation. In conclusion, our study has identified new susceptibility alleles for bladder cancer risk that require fine-mapping and laboratory investigation, which could further understanding into the biological underpinnings of bladder carcinogenesis. PMID:24163127

The dynamic improvement methods of energy efficiency and reliability of oil production submersible electric motors

NASA Astrophysics Data System (ADS)

Romanov, V. S.; Goldstein, V. G.

2018-01-01

In the production and operation of submersible electric motors (SEM), the most essential element of electric submersible installations (EI) in the oil industry, specific operating conditions must be taken into account. These are determined by the conditions under which the installations are operated. To give a comprehensive picture of the current state of the SEM fleet in oil production, the results of its statistical analysis are given. An assessment is given of the operational characteristics of submersible equipment produced by the main manufacturers. It is noted that standard equipment cannot fully ensure efficient operation with serial installations, so new technologies and corresponding equipment need to be developed.

[Induced abortion and labor activity. Reflections for discussion].

PubMed

Orjuela-Ramírez, María E

2012-06-01

Induced abortion is a global phenomenon that, according to various authors, responds to socially constructed patterns of behavior and is influenced by the social realities of each country. Understanding this phenomenon requires information on the complex process that leads women to opt for abortion and on the social, economic and health factors that can explain it. To this end, some considerations are presented for discussion on voluntary abortion and the labor activity of women who opt for this practice, with special mention of the situation in Spain. The arguments are supported by statistical analysis of voluntary interruptions of pregnancy (IVE) reported by the Ministry of Health and Social Policy, data on the participation of women in the labor market in Spain obtained from the National Statistics Institute (INE), research results on the association between women's employment status and voluntary termination of pregnancy, and a comprehensive review of the scientific literature on the different approaches to voluntary abortion. Women's work activity deserves special attention as a possible factor in the decision to terminate a pregnancy; most investigations have identified it as a socioeconomic condition of women who choose that alternative, on the view that pregnancy interferes with women's employment or, rather, prevents them from being employed.

Using meta-information of a posteriori Bayesian solutions of the hypocentre location task for improving accuracy of location error estimation

NASA Astrophysics Data System (ADS)

Debski, Wojciech

2015-06-01

The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analysed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. Although estimating the location of earthquake foci is relatively simple, quantitatively estimating the location accuracy is a challenging task even if the probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling and a priori uncertainties. In this paper, we addressed this task when statistics of observational and/or modelling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we propose an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on uncertainties of the solution found.

Characterization of interfade duration for satellite communication systems design and optimization in a temperate climate

NASA Astrophysics Data System (ADS)

Jorge, Flávio; Riva, Carlo; Rocha, Armando

2016-03-01

The characterization of fade dynamics on Earth-satellite links is an important subject when designing the so-called fade mitigation techniques that contribute to the reliability of satellite communication systems and the customers' quality of service (QoS). The interfade duration, defined as the period between two consecutive fade events, has been only poorly analyzed using limited data sets, but its complete characterization would enable the design and optimization of satellite communication systems by estimating the system requirements to recover in time before the next propagation impairment. Depending on this analysis, several actions can be taken to ensure service maintenance. In this paper we present for the first time a detailed and comprehensive analysis of the statistical properties of interfade events based on 9 years of in-excess attenuation measurements at Ka band (19.7 GHz) with the very high availability that is required to build a reliable data set, mainly for the longer interfade duration events. The number of years necessary to reach statistical stability of the interfade duration is also evaluated for the first time, providing a reference when assessing the relevance of results published in the past. The study is carried out in Aveiro, Portugal, which has a temperate Mediterranean climate with Oceanic influences.
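
The definition of interfade duration given in this abstract (the period between two consecutive fade events) can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function name `interfade_durations`, the fixed attenuation threshold, the uniform sampling interval and the sample series are all assumptions.

```python
def interfade_durations(attenuation_db, threshold_db, dt_s=1.0):
    """Return the durations (in seconds) of gaps between consecutive fades.

    A fade event is taken to be a contiguous run of samples whose
    attenuation is at or above threshold_db. Time before the first fade
    and after the last fade is not counted, because it is not bounded
    by two fade events.
    """
    durations = []
    gap_samples = 0
    seen_fade = False
    for attenuation in attenuation_db:
        if attenuation >= threshold_db:
            if seen_fade and gap_samples > 0:
                durations.append(gap_samples * dt_s)
            seen_fade = True
            gap_samples = 0
        elif seen_fade:
            gap_samples += 1
    return durations


# Hypothetical 1 Hz attenuation samples (dB) with a 3 dB fade threshold.
series = [0, 5, 5, 0, 0, 0, 5, 0, 5, 5]
print(interfade_durations(series, threshold_db=3.0, dt_s=1.0))
```

Collecting these durations over many years of measurements would give the empirical distribution whose statistical stability the abstract discusses.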

Spatial and temporal variation of water quality of a segment of Marikina River using multivariate statistical methods.

PubMed

Chounlamany, Vanseng; Tanchuling, Maria Antonia; Inoue, Takanobu

2017-09-01

Payatas landfill in Quezon City, Philippines, releases leachate to the Marikina River through a creek. Multivariate statistical techniques were applied to study temporal and spatial variations in water quality of a segment of the Marikina River. The data set included 12 physico-chemical parameters for five monitoring stations over a year. Cluster analysis grouped the monitoring stations into four clusters and identified January-May as dry season and June-September as wet season. Principal components analysis showed that three latent factors are responsible for the data set explaining 83% of its total variance. The chemical oxygen demand, biochemical oxygen demand, total dissolved solids, Cl⁻ and PO₄³⁻ are influenced by anthropogenic impact/eutrophication pollution from point sources. Total suspended solids, turbidity and SO₄²⁻ are influenced by rain and soil erosion. The highest state of pollution is at the Payatas creek outfall from March to May, whereas at downstream stations it is in May. The current study indicates that the river monitoring requires only four stations, nine water quality parameters and testing over three specific months of the year. The findings of this study imply that Payatas landfill requires a proper leachate collection and treatment system to reduce its impact on the Marikina River.

Innovative Approach for Developing Spacecraft Interior Acoustic Requirement Allocation

NASA Technical Reports Server (NTRS)

Chu, S. Reynold; Dandaroy, Indranil; Allen, Christopher S.

2016-01-01

The Orion Multi-Purpose Crew Vehicle (MPCV) is an American spacecraft for carrying four astronauts during deep space missions. This paper describes an innovative application of the Power Injection Method (PIM) for allocating Orion cabin continuous noise Sound Pressure Level (SPL) limits to the sound power level (PWL) limits of major noise sources in the Environmental Control and Life Support System (ECLSS) during all mission phases. PIM is simulated using both Statistical Energy Analysis (SEA) and Hybrid Statistical Energy Analysis-Finite Element (SEA-FE) models of the Orion MPCV to obtain the transfer matrix from the PWL of the noise sources to the acoustic energies of the receivers, i.e., the cavities associated with the cabin habitable volume. The goal of the allocation strategy is to control the total energy of the cabin habitable volume in order to maintain the required SPL limits. Simulations are used to demonstrate that applying the allocated PWLs to the noise sources in the models indeed reproduces the SPL limits in the habitable volume. The effects of Noise Control Treatment (NCT) on allocated noise source PWLs are investigated. The measurement of the source PWLs of the fan and pump development units involved is also discussed, as it relates to some case-specific details of the allocation strategy presented here.

Interim analyses in 2 x 2 crossover trials.

PubMed

Cook, R J

1995-09-01

A method is presented for performing interim analyses in long term 2 x 2 crossover trials with serial patient entry. The analyses are based on a linear statistic that combines data from individuals observed for one treatment period with data from individuals observed for both periods. The coefficients in this linear combination can be chosen quite arbitrarily, but we focus on variance-based weights to maximize power for tests regarding direct treatment effects. The type I error rate of this procedure is controlled by utilizing the joint distribution of the linear statistics over analysis stages. Methods for performing power and sample size calculations are indicated. A two-stage sequential design involving simultaneous patient entry and a single between-period interim analysis is considered in detail. The power and average number of measurements required for this design are compared to those of the usual crossover trial. The results indicate that, while there is minimal loss in power relative to the usual crossover design in the absence of differential carry-over effects, the proposed design can have substantially greater power when differential carry-over effects are present. The two-stage crossover design can also lead to more economical studies in terms of the expected number of measurements required, due to the potential for early stopping. Attention is directed toward normally distributed responses.

Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

PubMed

Caruso, J C

2001-06-01

The unreliability of difference scores is a well documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adult and Adolescent Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and unpowerful significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.

Targeting change: Assessing a faculty learning community focused on increasing statistics content in life science curricula.

PubMed

Parker, Loran Carleton; Gleichsner, Alyssa M; Adedokun, Omolola A; Forney, James

2016-11-12

Transformation of research in all biological fields necessitates the design, analysis, and interpretation of large data sets. Preparing students with the requisite skills in experimental design, statistical analysis and interpretation, and mathematical reasoning will require both curricular reform and faculty who are willing and able to integrate mathematical and statistical concepts into their life science courses. A new Faculty Learning Community (FLC) was constituted each year for four years to assist in the transformation of the life sciences curriculum and faculty at a large, Midwestern research university. Participants were interviewed after participation and surveyed before and after participation to assess the impact of the FLC on their attitudes toward teaching, perceived pedagogical skills, and planned teaching practice. Overall, the FLC had a meaningful positive impact on participants' attitudes toward teaching, knowledge about teaching, and perceived pedagogical skills. Interestingly, confidence for viewing the classroom as a site for research about teaching declined. Implications for the creation and development of FLCs for science faculty are discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 44(6):517-525, 2016.

Lactobacillus for preventing recurrent urinary tract infections in women: meta-analysis.

PubMed

Grin, Peter M; Kowalewska, Paulina M; Alhazzan, Waleed; Fox-Robichaud, Alison E

2013-02-01

Urinary tract infections (UTIs) are the most common infections affecting women, and often recur. Lactobacillus probiotics could potentially replace low dose, long term antibiotics as a safer prophylactic for recurrent UTI (rUTI). This systematic review and meta-analysis was performed to compile the results of existing randomized clinical trials (RCTs) to determine the efficacy of probiotic Lactobacillus species in preventing rUTI. MEDLINE and EMBASE were searched from inception to July 2012 for RCTs using a Lactobacillus prophylactic against rUTI in premenopausal adult women. A random-effects model meta-analysis was performed using a pooled risk ratio, comparing incidence of rUTI in patients receiving Lactobacillus to control. Data from 294 patients across five studies were included. There was no statistically significant difference in the risk for rUTI in patients receiving Lactobacillus versus controls, as indicated by the pooled risk ratio of 0.85 (95% confidence interval of 0.58-1.25, p = 0.41). A sensitivity analysis was performed, excluding studies using ineffective strains and studies testing for safety. Data from 127 patients in two studies were included. A statistically significant decrease in rUTI was found in patients given Lactobacillus, denoted by the pooled risk ratio of 0.51 (95% confidence interval 0.26-0.99, p = 0.05) with no statistical heterogeneity (I² = 0%). Probiotic strains of Lactobacillus are safe and effective in preventing rUTI in adult women. However, more RCTs are required before a definitive recommendation can be made since the patient population contributing data to this meta-analysis was small.
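
The pooled risk ratios, confidence intervals and p-values quoted in this abstract can be cross-checked with a standard back-calculation. The sketch below is an illustration only: it assumes the interval is symmetric on the log scale and that the test is a two-sided normal (Wald) test, which the abstract does not state, and the function name `p_value_from_ratio_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def p_value_from_ratio_ci(ratio, ci_low, ci_high, level=0.95):
    """Approximate two-sided p-value for a ratio estimate from its CI.

    Assumes the confidence interval was built as
    exp(log(ratio) +/- z_crit * SE) with a normal reference distribution.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)
    # Recover the standard error of log(ratio) from the interval width.
    se_log = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(ratio) / se_log
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


# Values quoted in the abstract: main analysis, then sensitivity analysis.
p_main = p_value_from_ratio_ci(0.85, 0.58, 1.25)
p_sensitivity = p_value_from_ratio_ci(0.51, 0.26, 0.99)
print(round(p_main, 3), round(p_sensitivity, 3))
```

Under these assumptions the two results come out close to the reported p = 0.41 and p = 0.05, which suggests the quoted figures are internally consistent.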

Statistical process management: An essential element of quality improvement

NASA Astrophysics Data System (ADS)

Buckner, M. R.

Successful quality improvement requires a balanced program involving the three elements that control quality: organization, people and technology. The focus of the SPC/SPM User's Group is to advance the technology component of Total Quality by networking within the Group and by providing an outreach within Westinghouse to foster the appropriate use of statistical techniques to achieve Total Quality. SPM encompasses the disciplines by which a process is measured against its intrinsic design capability, in the face of measurement noise and other obscuring variability. SPM tools facilitate decisions about the process that generated the data. SPM deals typically with manufacturing processes, but with some flexibility of definition and technique it accommodates many administrative processes as well. The techniques of SPM are those of Statistical Process Control, Statistical Quality Control, Measurement Control, and Experimental Design. In addition, techniques such as job and task analysis and concurrent engineering are important elements of the systematic planning and analysis that are needed early in the design process to ensure success. The SPC/SPM User's Group is endeavoring to achieve its objectives by sharing successes that have occurred within the members' own Westinghouse departments as well as within other US and foreign industry. In addition, failures are reviewed to establish lessons learned in order to improve future applications. In broader terms, the Group is interested in making SPM the accepted way of doing business within Westinghouse.

Metrology: Calibration and measurement processes guidelines

NASA Technical Reports Server (NTRS)

Castrup, Howard T.; Eicke, Woodward G.; Hayes, Jerry L.; Mark, Alexander; Martin, Robert E.; Taylor, James L.

1994-01-01

The guide is intended as a resource to aid engineers and systems contractors in the design, implementation, and operation of metrology, calibration, and measurement systems, and to assist NASA personnel in the uniform evaluation of such systems supplied or operated by contractors. Methodologies and techniques acceptable in fulfilling metrology quality requirements for NASA programs are outlined. The measurement process is covered from a high level through more detailed discussions of key elements within the process. Emphasis is given to the flowdown of project requirements to measurement system requirements, then through the activities that will provide measurements with defined quality. In addition, innovations and techniques for error analysis, development of statistical measurement process control, optimization of calibration recall systems, and evaluation of measurement uncertainty are presented.

Ideas for Effective Communication of Statistical Results

DOE PAGES

Anderson-Cook, Christine M.

2015-03-01

Effective presentation of statistical results to those with less statistical training, including managers and decision-makers, requires planning, anticipation and thoughtful delivery. Here are several recommendations for effectively presenting statistical results.

Electron microscopic quantification of collagen fibril diameters in the rabbit medial collateral ligament: a baseline for comparison.

PubMed

Frank, C; Bray, D; Rademaker, A; Chrusch, C; Sabiston, P; Bodie, D; Rangayyan, R

1989-01-01

To establish a normal baseline for comparison, thirty-one thousand collagen fibril diameters were measured in calibrated transmission electron (TEM) photomicrographs of normal rabbit medial collateral ligaments (MCL's). A new automated method of quantitation was used to compare statistically fibril minimum diameter distributions in one midsubstance location in both MCL's from six animals at 3 months of age (immature) and three animals at 10 months of age (mature). Pooled results demonstrate that rabbit MCL's have statistically different (p < 0.001) mean minimum diameters at these two ages. Interanimal differences in mean fibril minimum diameters were also significant (p < 0.001) and varied by 20% to 25% in both mature and immature animals. Finally, there were significant differences (p < 0.001) in mean diameters and distributions from side-to-side in all animals. These mean left-to-right differences were less than 10% in all mature animals but as much as 62% in some immature animals. Statistical analysis of these data demonstrates that animal-to-animal comparisons using these protocols require a large number of animals, with appropriate numbers of fibrils being measured, to detect small intergroup differences. With experiments which compare left to right ligaments, far fewer animals are required to detect similarly small differences. These results demonstrate the necessity for rigorous control of sampling, an extensive normal baseline and statistically confirmed experimental designs in any TEM comparisons of collagen fibril diameters.

ASCS online fault detection and isolation based on an improved MPCA

NASA Astrophysics Data System (ADS)

Peng, Jianxin; Liu, Haiou; Hu, Yuhui; Xi, Junqiang; Chen, Huiyan

2014-09-01

Multi-way principal component analysis (MPCA) has received considerable attention and been widely used in process monitoring. A traditional MPCA algorithm unfolds multiple batches of historical data into a two-dimensional matrix and cuts the matrix along the time axis to form subspaces. However, low efficiency of subspaces and difficult fault isolation are common disadvantages of the principal component model. This paper presents a new subspace construction method based on a kernel density estimation function that can effectively reduce the amount of storage needed for the subspace information. The MPCA model and the knowledge base are built based on the new subspace. Fault detection and isolation with the squared prediction error (SPE) statistic and the Hotelling T² statistic are then realized in process monitoring. When a fault occurs, fault isolation based on the SPE statistic is achieved by residual contribution analysis of different variables. For fault isolation of a subspace based on the T² statistic, the relationship between the statistic indicator and the state variables is constructed, and constraint conditions are presented to check the validity of fault isolation. Then, to improve the robustness of fault isolation to unexpected disturbances, a statistical method is adopted to set the relation between a single subspace and multiple subspaces in order to increase the correct rate of fault isolation. Finally, fault detection and isolation based on the improved MPCA is used to monitor the automatic shift control system (ASCS) to prove the correctness and effectiveness of the algorithm. The research proposes a new subspace construction method to reduce the required storage capacity and to improve the robustness of the principal component model, and sets the relationship between the state variables and fault detection indicators for fault isolation.

Uncertainty Requirement Analysis for the Orbit, Attitude, and Burn Performance of the 1st Lunar Orbit Insertion Maneuver

NASA Astrophysics Data System (ADS)

Song, Young-Joo; Bae, Jonghee; Kim, Young-Rok; Kim, Bang-Yeop

2016-12-01

In this study, the uncertainty requirements for orbit, attitude, and burn performance were estimated and analyzed for the execution of the 1st lunar orbit insertion (LOI) maneuver of the Korea Pathfinder Lunar Orbiter (KPLO) mission. During the early design phase of the system, the associated analysis is an essential design factor, as the 1st LOI maneuver is the largest burn that utilizes the onboard propulsion system; the success of the lunar capture is directly affected by the performance achieved. For the analysis, the spacecraft is assumed to have already approached the periselene with a hyperbolic arrival trajectory around the moon. In addition, diverse arrival conditions and mission constraints were considered, such as varying periselene approach velocity, altitude, and orbital period of the capture orbit after execution of the 1st LOI maneuver. The current analysis assumed an impulsive LOI maneuver, and two-body equations of motion were adopted to simplify the problem for a preliminary analysis. Monte Carlo simulations were performed for the statistical analysis of the diverse uncertainties that might arise at the moment when the maneuver is executed. As a result, three major requirements were analyzed and estimated for the early design phase. First, the minimum requirements were estimated for the burn performance to be captured around the moon. Second, the requirements for orbit, attitude, and maneuver burn performances were simultaneously estimated and analyzed to maintain the 1st elliptical orbit achieved around the moon within the specified orbital period. Finally, the dispersion requirements on the B-plane aiming at target points to meet the target insertion goal were analyzed and can be utilized as reference target guidelines for a mid-course correction (MCC) maneuver during the transfer. More detailed system requirements for the KPLO mission, particularly for the spacecraft bus itself and for the flight dynamics subsystem at the ground control center, are expected to be prepared and established based on the current results, including a contingency trajectory design plan.

Analysis of Variance: What Is Your Statistical Software Actually Doing?

ERIC Educational Resources Information Center

Li, Jian; Lomax, Richard G.

2011-01-01

Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs: mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…

The Population Tracking Model: A Simple, Scalable Statistical Model for Neural Population Data

PubMed Central

O'Donnell, Cian; Gonçalves, J. Tiago; Whiteley, Nick; Portera-Cailliau, Carlos; Sejnowski, Terrence J.

2017-01-01

Our understanding of neural population coding has been limited by a lack of analysis methods to characterize spiking data from large populations. The biggest challenge comes from the fact that the number of possible network activity patterns scales exponentially with the number of neurons recorded (∼2^Neurons). Here we introduce a new statistical method for characterizing neural population activity that requires semi-independent fitting of only as many parameters as the square of the number of neurons, requiring drastically smaller data sets and minimal computation time. The model works by matching the population rate (the number of neurons synchronously active) and the probability that each individual neuron fires given the population rate. We found that this model can accurately fit synthetic data from up to 1000 neurons. We also found that the model could rapidly decode visual stimuli from neural population data from macaque primary visual cortex about 65 ms after stimulus onset. Finally, we used the model to estimate the entropy of neural population activity in developing mouse somatosensory cortex and, surprisingly, found that it first increases, and then decreases during development. This statistical model opens new options for interrogating neural population data and can bolster the use of modern large-scale in vivo Ca²⁺ and voltage imaging tools. PMID:27870612
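
To make the scaling argument in this abstract concrete, the sketch below compares the number of possible binary activity patterns (2^N) with the N² parameter count quoted for the model. It is simple arithmetic under the assumption that each neuron is either active or silent in a time bin; the function name is hypothetical and the exact parameter count of the published model may differ from N².

```python
def pattern_and_parameter_counts(n_neurons):
    """Return (number of binary activity patterns, N**2 parameter count)."""
    n_patterns = 2 ** n_neurons      # every neuron is either on or off
    n_parameters = n_neurons ** 2    # scale quoted in the abstract
    return n_patterns, n_parameters


for n in (10, 50, 1000):
    patterns, parameters = pattern_and_parameter_counts(n)
    # Print the digit count of the pattern total, since it grows too fast
    # to display in full for large populations.
    print(n, parameters, len(str(patterns)))
```

For 1000 neurons the parameter count is one million, while the number of possible patterns has hundreds of digits, which is why directly estimating a probability for every pattern is infeasible.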

A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations.

PubMed

Zhang, Han; Wheeler, William; Hyland, Paula L; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

2016-06-01

Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS-established T2D loci were excluded, we detected 43 globally significant T2D pathways (with Bonferroni-corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed that 7 of the 43 pathways identified in European populations remained significant in eastern Asians at a false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs.
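
The Bonferroni correction mentioned in this abstract can be sketched in a few lines. This is a generic illustration, not the sARTP implementation: it assumes the correction is applied across the 4,713 candidate pathways, and the raw p-values below are hypothetical placeholders rather than results from the study.

```python
def bonferroni_adjust(p_values, n_tests=None):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


N_PATHWAYS = 4713  # number of candidate pathways quoted in the abstract

# Equivalent per-pathway threshold for a family-wise level of 0.05.
per_test_threshold = 0.05 / N_PATHWAYS

# Hypothetical raw pathway p-values, for illustration only.
raw = [1e-6, 2e-5, 0.5]
adjusted = bonferroni_adjust(raw, n_tests=N_PATHWAYS)
significant = [p_adj < 0.05 for p_adj in adjusted]
print(per_test_threshold, adjusted, significant)
```

Under this assumption a pathway would need a raw p-value below roughly 1.1e-5 to count as globally significant.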

A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations

PubMed Central

Zhang, Han; Wheeler, William; Hyland, Paula L.; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

2016-01-01

Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS-established T2D loci were excluded, we detected 43 globally significant T2D pathways (with Bonferroni-corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed that 7 of the 43 pathways identified in European populations remained significant in eastern Asians at a false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs. PMID:27362418

Kinetic analysis of single molecule FRET transitions without trajectories

NASA Astrophysics Data System (ADS)

Schrangl, Lukas; Göhring, Janett; Schütz, Gerhard J.

2018-03-01

Single molecule Förster resonance energy transfer (smFRET) is a popular tool to study biological systems that undergo topological transitions on the nanometer scale. smFRET experiments typically require recording of long smFRET trajectories and subsequent statistical analysis to extract parameters such as the states' lifetimes. Alternatively, analysis of probability distributions exploits the shapes of smFRET distributions at well chosen exposure times and hence works without the acquisition of time traces. Here, we describe a variant that utilizes statistical tests to compare experimental datasets with Monte Carlo simulations. For a given model, parameters are varied to cover the full realistic parameter space. As output, the method yields p-values which quantify the likelihood for each parameter setting to be consistent with the experimental data. The method provides suitable results even if the actual lifetimes differ by an order of magnitude. We also demonstrated the robustness of the method to inaccurately determined input parameters. As proof of concept, the new method was applied to the determination of transition rate constants for Holliday junctions.

Statistical Significance for Hierarchical Clustering

PubMed Central

Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

2017-01-01

Summary: Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

Monitoring the metering performance of an electronic voltage transformer on-line based on cyber-physics correlation analysis

NASA Astrophysics Data System (ADS)

Zhang, Zhu; Li, Hongbin; Tang, Dengping; Hu, Chen; Jiao, Yang

2017-10-01

Metering performance is the key parameter of an electronic voltage transformer (EVT), and it requires high accuracy. The conventional off-line calibration method using a standard voltage transformer is not suitable for key equipment in a smart substation, which needs on-line monitoring. In this article, we propose a method for monitoring the metering performance of an EVT on-line based on cyber-physics correlation analysis. Based on the electrical and physical properties of a substation running in three-phase symmetry, principal component analysis is used to separate the metering deviation caused by primary fluctuation from that caused by an EVT anomaly. The characteristic statistics of the measured data during operation are extracted, and the metering performance of the EVT is evaluated by analyzing the change in these statistics. The experimental results show that the method accurately monitors the metering deviation of a Class 0.2 EVT. The method thus allows the metering performance of an EVT to be evaluated accurately on-line without a standard voltage transformer.

[Design and implementation of online statistical analysis function in information system of air pollution and health impact monitoring].

PubMed

Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun

2018-01-01

The aim was to implement an online statistical analysis function in the information system for air pollution and health impact monitoring, and to obtain data analysis information in real time. Descriptive statistics, time-series analysis and multivariate regression analysis were combined with the SQL language and visualization tools to implement online statistical analysis on top of database software. The system generates basic statistical tables and summary tables of air pollution exposure and health impact data online; generates trend charts for each part of the data online, with interactive connection to the database; and generates interface sheets online that can be imported directly into R, SAS and SPSS. The information system for air pollution and health impact monitoring thus implements the statistical analysis function online and can provide real-time analysis results to its users.

A Mokken scale analysis of the peer physical examination questionnaire.

PubMed

Vaughan, Brett; Grace, Sandra

2018-01-01

Peer physical examination (PPE) is a teaching and learning strategy utilised in most health profession education programs. Perceptions of participating in PPE have been described in the literature, focusing on areas of the body students are willing, or unwilling, to examine. A small number of questionnaires exist to evaluate these perceptions, however none have described the measurement properties that may allow them to be used longitudinally. The present study undertook a Mokken scale analysis of the Peer Physical Examination Questionnaire (PPEQ) to evaluate its dimensionality and structure when used with Australian osteopathy students. Students enrolled in Year 1 of the osteopathy programs at Victoria University (Melbourne, Australia) and Southern Cross University (Lismore, Australia) were invited to complete the PPEQ prior to their first practical skills examination class. R, an open-source statistics program, was used to generate the descriptive statistics and perform a Mokken scale analysis. Mokken scale analysis is a non-parametric item response theory approach that is used to cluster items measuring a latent construct. Initial analysis suggested the PPEQ did not form a single scale. Further analysis identified three subscales: 'comfort', 'concern', and 'professionalism and education'. The properties of each subscale suggested they were unidimensional with variable internal structures. The 'comfort' subscale was the strongest of the three identified. All subscales demonstrated acceptable reliability estimation statistics (McDonald's omega > 0.75) supporting the calculation of a sum score for each subscale. The subscales identified are consistent with the literature. The 'comfort' subscale may be useful to longitudinally evaluate student perceptions of PPE. Further research is required to evaluate changes with PPE and the utility of the questionnaire with other health profession education programs.
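Mokken scale analysis clusters items using Loevinger's scalability coefficient H, which compares the observed covariance between items with the maximum covariance their marginals allow. The sketch below computes the scale-level H for dichotomous (0/1) items only; this is a simplifying assumption, since questionnaires such as the PPEQ typically use polytomous responses and the study itself used R. The function name and the toy response patterns are hypothetical.

```python
from itertools import combinations

def loevinger_h(responses):
    """Scale-level Loevinger's H for dichotomous items.

    responses: list of respondent rows, each a list of 0/1 item scores.
    """
    n = len(responses)
    n_items = len(responses[0])
    # Proportion of respondents endorsing each item.
    p = [sum(row[i] for row in responses) / n for i in range(n_items)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(n_items), 2):
        p_ij = sum(row[i] * row[j] for row in responses) / n
        cov_sum += p_ij - p[i] * p[j]                    # observed covariance
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]     # maximum given the marginals
    return cov_sum / cov_max_sum

# A perfect Guttman pattern: nobody endorses a harder item without the easier ones.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_guttman = loevinger_h(guttman)

# Two statistically independent items.
independent = [[0, 0], [0, 1], [1, 0], [1, 1]]
h_independent = loevinger_h(independent)
```

The Guttman pattern gives H = 1 and the independent items give H = 0. By the usual convention, H of at least 0.3 is taken as the minimum for a usable scale, with higher values indicating a stronger one.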

Multivariate approaches for stability control of the olive oil reference materials for sensory analysis - part I: framework and fundamentals.

PubMed

Valverde-Som, Lucia; Ruiz-Samblás, Cristina; Rodríguez-García, Francisco P; Cuadros-Rodríguez, Luis

2018-02-09

Virgin olive oil is the only food product for which sensory analysis is regulated to classify it in different quality categories. To harmonize the results of the sensorial method, the use of standards or reference materials is crucial. The stability of sensory reference materials is required to enable their suitable control, aiming to confirm that their specific target values are maintained on an ongoing basis. Currently, such stability is monitored by means of sensory analysis and the sensory panels are in the paradoxical situation of controlling the standards that are devoted to controlling the panels. In the present study, several approaches based on similarity analysis are exploited. For each approach, the specific methodology to build a proper multivariate control chart to monitor the stability of the sensory properties is explained and discussed. The normalized Euclidean and Mahalanobis distances, the so-called nearness and hardiness indices respectively, have been defined as new similarity indices to range the values from 0 to 1. Also, the squared mean from Hotelling's T²-statistic and Q²-statistic has been proposed as another similarity index. © 2018 Society of Chemical Industry.

Evaluating the efficiency of environmental monitoring programs

USGS Publications Warehouse

Levine, Carrie R.; Yanai, Ruth D.; Lampman, Gregory G.; Burns, Douglas A.; Driscoll, Charles T.; Lawrence, Gregory B.; Lynch, Jason; Schoch, Nina

2014-01-01

Statistical uncertainty analyses can be used to improve the efficiency of environmental monitoring, allowing sampling designs to maximize information gained relative to resources required for data collection and analysis. In this paper, we illustrate four methods of data analysis appropriate to four types of environmental monitoring designs. To analyze a long-term record from a single site, we applied a general linear model to weekly stream chemistry data at Biscuit Brook, NY, to simulate the effects of reducing sampling effort and to evaluate statistical confidence in the detection of change over time. To illustrate a detectable difference analysis, we analyzed a one-time survey of mercury concentrations in loon tissues in lakes in the Adirondack Park, NY, demonstrating the effects of sampling intensity on statistical power and the selection of a resampling interval. To illustrate a bootstrapping method, we analyzed the plot-level sampling intensity of forest inventory at the Hubbard Brook Experimental Forest, NH, to quantify the sampling regime needed to achieve a desired confidence interval. Finally, to analyze time-series data from multiple sites, we assessed the number of lakes and the number of samples per year needed to monitor change over time in Adirondack lake chemistry using a repeated-measures mixed-effects model. Evaluations of time series and synoptic long-term monitoring data can help determine whether sampling should be re-allocated in space or time to optimize the use of financial and human resources.
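The bootstrapping approach mentioned in this abstract can be sketched as follows: resample the existing plot measurements at several candidate sample sizes and see how wide the resulting confidence interval for the mean becomes. This is only an illustrative sketch, not the authors' procedure; the plot values are synthetic, and the function name, the percentile interval, and the candidate sample sizes are assumptions.

```python
import random
import statistics

def bootstrap_ci_width(values, n_plots, n_boot=2000, alpha=0.05, seed=0):
    """Width of a percentile bootstrap CI for the mean when n_plots are sampled."""
    rng = random.Random(seed)
    # Resample n_plots values with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n_plots)) for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return upper - lower

# Hypothetical plot-level measurements (synthetic, not data from the study).
data_rng = random.Random(42)
plots = [data_rng.gauss(100.0, 20.0) for _ in range(200)]

# Compare interval widths for three candidate sampling intensities.
widths = {n: bootstrap_ci_width(plots, n) for n in (10, 40, 160)}
for n, width in widths.items():
    print(f"{n:>3} plots: 95% CI width of about {width:.1f}")
```

Reading the widths against a target interval gives the smallest sampling intensity that still meets it, which is the kind of trade-off between effort and confidence the abstract describes.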

Smart Sampling and HPC-based Probabilistic Look-ahead Contingency Analysis Implementation and its Evaluation with Real-world Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Yousu; Etingov, Pavel V.; Ren, Huiying

This paper describes a probabilistic look-ahead contingency analysis application that incorporates smart sampling and high-performance computing (HPC) techniques. Smart sampling techniques are implemented to effectively represent the structure and statistical characteristics of uncertainty introduced by different sources in the power system. They can significantly reduce the data set size required for multiple look-ahead contingency analyses, and therefore reduce the time required to compute them. High-performance computing (HPC) techniques are used to further reduce computational time. These two techniques enable a predictive capability that forecasts the impact of various uncertainties on potential transmission limit violations. The developed package has been tested with real-world data from the Bonneville Power Administration. Case study results are presented to demonstrate the performance of the applications developed.

Are patient specific meshes required for EIT head imaging?

PubMed

Jehl, Markus; Aristovich, Kirill; Faulkner, Mayo; Holder, David

2016-06-01

Head imaging with electrical impedance tomography (EIT) is usually done with time-differential measurements, to reduce time-invariant modelling errors. Previous research suggested that more accurate head models improved image quality, but no thorough analysis has been done on the required accuracy. We propose a novel pipeline for creation of precise head meshes from magnetic resonance imaging and computed tomography scans, which was applied to four different heads. Voltages were simulated on all four heads for perturbations of different magnitude, haemorrhage and ischaemia, in five different positions and for three levels of instrumentation noise. Statistical analysis showed that reconstructions on the correct mesh were on average 25% better than on the other meshes. However, the stroke detection rates were not improved. We conclude that a generic head mesh is sufficient for monitoring patients for secondary strokes following head trauma.

