Science.gov

Sample records for adequate statistical methods

  1. [Adequate application of quantitative and qualitative statistic analytic methods in acupuncture clinical trials].

    PubMed

    Tan, Ming T; Liu, Jian-ping; Lao, Lixing

    2012-08-01

    Recently, the proper use of statistical methods in traditional Chinese medicine (TCM) randomized controlled trials (RCTs) has received increased attention. Statistical inference based on hypothesis testing is the foundation of clinical trials and evidence-based medicine. In this article, the authors describe the methodological differences between literature published in Chinese and Western journals in the design and analysis of acupuncture RCTs and in the application of basic statistical principles. In China, qualitative analysis methods have been widely used in acupuncture and TCM clinical trials, while between-group quantitative analyses of clinical symptom scores are commonly used in the West. The evidence for and against these analytical differences was discussed based on data from RCTs assessing acupuncture for pain relief. The authors conclude that although both methods have their unique advantages, quantitative analysis should be used as the primary analysis, while qualitative analysis can serve as a secondary criterion. The purpose of this paper is to inspire further discussion of such special issues in clinical research design and thus contribute to the increased scientific rigor of TCM research.
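
    As a rough illustration of the distinction discussed above, the following Python sketch contrasts a between-group quantitative analysis of symptom scores with a qualitative responder-rate analysis of the same hypothetical trial; the data, sample sizes, and the 45-point responder cutoff are invented for illustration only.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)

        # Hypothetical post-treatment pain scores (0-100 scale) for two arms.
        acupuncture = rng.normal(loc=42.0, scale=15.0, size=60)
        control = rng.normal(loc=50.0, scale=15.0, size=60)

        # Quantitative (between-group) analysis: compare mean symptom scores.
        t_stat, p_quant = stats.ttest_ind(acupuncture, control)
        print(f"t-test on scores: t = {t_stat:.2f}, p = {p_quant:.3f}")

        # Qualitative (responder) analysis: dichotomize into improved vs. not
        # improved with an arbitrary 45-point cutoff, then compare proportions.
        responders = np.array([(acupuncture < 45).sum(), (control < 45).sum()])
        non_responders = np.array([len(acupuncture), len(control)]) - responders
        chi2, p_qual, _, _ = stats.chi2_contingency(np.c_[responders, non_responders])
        print(f"chi-square on responder rates: chi2 = {chi2:.2f}, p = {p_qual:.3f}")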

  2. Are shear force methods adequately reported?

    PubMed

    Holman, Benjamin W B; Fowler, Stephanie M; Hopkins, David L

    2016-09-01

    This study aimed to determine the detail to which shear force (SF) protocols and methods have been reported in the scientific literature between 2009 and 2015. Articles (n=734) published in peer-reviewed animal and food science journals and limited to only those testing the SF of unprocessed and non-fabricated mammal meats were evaluated. It was found that most of these SF articles originated in Europe (35.3%), investigated bovine species (49.0%), measured m. longissimus samples (55.2%), used tenderometers manufactured by Instron (31.2%), and equipped with Warner-Bratzler blades (68.8%). SF samples were also predominantly thawed prior to cooking (37.1%) and cooked sous vide, using a water bath (50.5%). Information pertaining to blade crosshead speed (47.5%), recorded SF resistance (56.7%), muscle fibre orientation when tested (49.2%), sub-section or core dimension (21.8%), end-point temperature (29.3%), and other factors contributing to SF variation were often omitted. This base failure diminishes repeatability and accurate SF interpretation, and must therefore be rectified. PMID:27107727

  3. Are shear force methods adequately reported?

    PubMed

    Holman, Benjamin W B; Fowler, Stephanie M; Hopkins, David L

    2016-09-01

    This study aimed to determine the detail to which shear force (SF) protocols and methods have been reported in the scientific literature between 2009 and 2015. Articles (n=734) published in peer-reviewed animal and food science journals and limited to only those testing the SF of unprocessed and non-fabricated mammal meats were evaluated. It was found that most of these SF articles originated in Europe (35.3%), investigated bovine species (49.0%), measured m. longissimus samples (55.2%), used tenderometers manufactured by Instron (31.2%), and equipped with Warner-Bratzler blades (68.8%). SF samples were also predominantly thawed prior to cooking (37.1%) and cooked sous vide, using a water bath (50.5%). Information pertaining to blade crosshead speed (47.5%), recorded SF resistance (56.7%), muscle fibre orientation when tested (49.2%), sub-section or core dimension (21.8%), end-point temperature (29.3%), and other factors contributing to SF variation were often omitted. This base failure diminishes repeatability and accurate SF interpretation, and must therefore be rectified.

  4. 42 CFR 417.568 - Adequate financial records, statistical data, and cost finding.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    § 417.568 Adequate financial records, statistical data, and cost finding. (a) Maintenance of records. (1) An HMO or CMP must maintain sufficient financial records and statistical data for proper determination...

  5. 42 CFR 417.568 - Adequate financial records, statistical data, and cost finding.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    § 417.568 Adequate financial records, statistical data, and cost finding. (a) Maintenance of records. (1) An HMO or CMP must maintain sufficient financial records and statistical data for proper determination...

  6. 42 CFR 417.568 - Adequate financial records, statistical data, and cost finding.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    § 417.568 Adequate financial records, statistical data, and cost finding. (a) Maintenance of records. (1) An HMO or CMP must maintain sufficient financial records and statistical data for proper determination...

  7. 42 CFR 417.568 - Adequate financial records, statistical data, and cost finding.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    § 417.568 Adequate financial records, statistical data, and cost finding. (a) Maintenance of records. (1) An HMO or CMP must maintain sufficient financial records and statistical data for proper determination of costs payable by...

  8. Geopositional Statistical Methods

    NASA Technical Reports Server (NTRS)

    Ross, Kenton

    2006-01-01

    RMSE-based methods distort circular error estimates (by up to 50% overestimation). The empirical approach is the only statistically unbiased estimator offered. The Ager modification of the Shultz approach is nearly unbiased, but cumbersome. All methods hover around 20% uncertainty (at 95% confidence) for low geopositional bias error estimates, which requires careful consideration in the assessment of higher-accuracy products.
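
    The contrast between an RMSE-based circular error estimate and the empirical (percentile) estimator mentioned above can be sketched as follows; the simulated errors and the circular-normal scaling factor are illustrative assumptions, not figures from the report.

        import numpy as np
        from scipy.stats import chi2

        rng = np.random.default_rng(1)

        # Hypothetical horizontal position errors (metres) in x and y.
        ex = rng.normal(0.0, 3.0, 500)
        ey = rng.normal(0.0, 3.0, 500)
        radial = np.hypot(ex, ey)

        # Empirical CE95: the 95th percentile of the radial errors.
        ce95_empirical = np.percentile(radial, 95)

        # RMSE-based CE95: scale the per-axis RMSE by sqrt(chi2_0.95(2)) ~ 2.448,
        # which is exact only for zero-mean (bias-free) circular normal errors.
        rmse = np.sqrt(np.mean(ex**2 + ey**2) / 2.0)
        ce95_rmse = np.sqrt(chi2.ppf(0.95, df=2)) * rmse

        print(f"empirical CE95 = {ce95_empirical:.2f} m, RMSE-based CE95 = {ce95_rmse:.2f} m")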

  9. Statistical Methods in Cosmology

    NASA Astrophysics Data System (ADS)

    Verde, L.

    2010-03-01

    The advent of large data sets in cosmology has meant that in the past 10 or 20 years our knowledge and understanding of the Universe has changed not only quantitatively but also, and most importantly, qualitatively. Cosmologists rely on data in which a host of useful information is enclosed, but encoded in a non-trivial way. The challenges in extracting this information must be overcome to make the most of a large experimental effort. Even after having converged on a standard cosmological model (the LCDM model), we should keep in mind that this model is described by 10 or more physical parameters, and if we want to study deviations from it, the number of parameters is even larger. Dealing with such a high-dimensional parameter space and finding parameter constraints is a challenge in itself. Cosmologists want to be able to compare and combine different data sets, both to test for possible disagreements (which could indicate new physics) and to improve parameter determinations. Finally, cosmologists in many cases want to find out, before actually doing the experiment, how much one would be able to learn from it. For all these reasons, sophisticated statistical techniques are being employed in cosmology, and it has become crucial to know some statistical background to understand recent literature in the field. I will introduce some statistical tools that any cosmologist should know about in order to be able to understand recently published results from the analysis of cosmological data sets. I will not present a complete and rigorous introduction to statistics, as there are several good books, listed in the references, to which the reader should refer.

  10. Statistical Methods in Psychology Journals.

    ERIC Educational Resources Information Center

    Wilkinson, Leland

    1999-01-01

    Proposes guidelines for revising the American Psychological Association (APA) publication manual or other APA materials to clarify the application of statistics in research reports. The guidelines are intended to induce authors and editors to recognize the thoughtless application of statistical methods. Contains 54 references. (SLD)

  11. Statistical methods in language processing.

    PubMed

    Abney, Steven

    2011-05-01

    The term statistical methods here refers to a methodology that has been dominant in computational linguistics since about 1990. It is characterized by the use of stochastic models, substantial data sets, machine learning, and rigorous experimental evaluation. The shift to statistical methods in computational linguistics parallels a movement in artificial intelligence more broadly. Statistical methods have so thoroughly permeated computational linguistics that almost all work in the field draws on them in some way. There has, however, been little penetration of the methods into general linguistics. The methods themselves are largely borrowed from machine learning and information theory. We limit attention to that which has direct applicability to language processing, though the methods are quite general and have many nonlinguistic applications. Not every use of statistics in language processing falls under statistical methods as we use the term. Standard hypothesis testing and experimental design, for example, are not covered in this article. WIREs Cogn Sci 2011, 2, 315-322. DOI: 10.1002/wcs.111

  12. Improved ASTM G72 Test Method for Ensuring Adequate Fuel-to-Oxidizer Ratios

    NASA Technical Reports Server (NTRS)

    Juarez, Alfredo; Harper, Susana Tapia

    2016-01-01

    The ASTM G72/G72M-15 Standard Test Method for Autogenous Ignition Temperature of Liquids and Solids in a High-Pressure Oxygen-Enriched Environment is currently used to evaluate materials for ignition susceptibility driven by exposure to external heat in an enriched oxygen environment. Testing performed on highly volatile liquids such as cleaning solvents has proven problematic due to inconsistent test results (non-ignitions). Non-ignition results can be misinterpreted as favorable oxygen compatibility, although they are more likely associated with inadequate fuel-to-oxidizer ratios. Forced evaporation during purging and inadequate sample size were identified as two potential causes of inadequate available sample material during testing. In an effort to maintain adequate fuel-to-oxidizer ratios within the reaction vessel during testing, several parameters were considered, including sample size, pretest sample chilling, pretest purging, and test pressure. Tests on a variety of solvents exhibiting a range of volatilities are presented in this paper, along with a proposed improvement to the standard test protocol resulting from this evaluation. The final proposed test protocol outlines an incremental-step method for determining optimal conditions, using increased sample sizes while respecting test-system safety limits. The proposed improved test method increases confidence in results obtained with the ASTM G72 autogenous ignition temperature test method and can aid in the oxygen compatibility assessment of highly volatile liquids and other conditions that may lead to false non-ignition results.

  13. Recent statistical methods for orientation data

    NASA Technical Reports Server (NTRS)

    Batschelet, E.

    1972-01-01

    The application of statistical methods to studies of animal orientation and navigation is discussed. The methods employed are limited to the two-dimensional case. Various tests for determining the validity of the statistical analysis are presented. Mathematical models are included to support the theoretical considerations, and tables of data are developed to show the value of the information obtained by statistical analysis.

  14. Statistical methods in physical mapping

    SciTech Connect

    Nelson, D.O.

    1995-05-01

    One of the great success stories of modern molecular genetics has been the ability of biologists to isolate and characterize the genes responsible for serious inherited diseases like fragile X syndrome, cystic fibrosis and myotonic muscular dystrophy. This dissertation concentrates on constructing high-resolution physical maps. It demonstrates how probabilistic modeling and statistical analysis can aid molecular geneticists in the tasks of planning, execution, and evaluation of physical maps of chromosomes and large chromosomal regions. The dissertation is divided into six chapters. Chapter 1 provides an introduction to the field of physical mapping, describing the role of physical mapping in gene isolation and in past efforts at mapping chromosomal regions. The next two chapters review and extend known results on predicting progress in large mapping projects. Such predictions help project planners decide between various approaches and tactics for mapping large regions of the human genome. Chapter 2 shows how probability models have been used in the past to predict progress in mapping projects. Chapter 3 presents new results, based on stationary point process theory, for progress measures for mapping projects based on directed mapping strategies. Chapter 4 describes in detail the construction of an initial high-resolution physical map for human chromosome 19. This chapter introduces the probability and statistical models involved in map construction in the context of a large, ongoing physical mapping project. Chapter 5 concentrates on one such model, the trinomial model. This chapter contains new results on the large-sample behavior of this model, including distributional results, asymptotic moments, and detection error rates. In addition, it contains an optimality result concerning experimental procedures based on the trinomial model. The last chapter explores unsolved problems and describes future work.

  15. Elementary Science Methods Courses and the "National Science Education Standards": Are We Adequately Preparing Teachers?

    ERIC Educational Resources Information Center

    Smith, Leigh K.; Gess-Newsome, Julie

    2004-01-01

    Despite the apparent lack of universally accepted goals or objectives for elementary science methods courses, teacher educators nationally are autonomously designing these classes to prepare prospective teachers to teach science. It is unclear, however, whether science methods courses are preparing teachers to teach science effectively or to…

  16. Are adequate methods available to detect protist parasites on fresh produce?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Human parasitic protists such as Cryptosporidium, Giardia and microsporidia contaminate a variety of fresh produce worldwide. Existing detection methods lack sensitivity and specificity for most foodborne parasites. Furthermore, detection has been problematic because these parasites adhere tenacious...

  17. Statistical methods for nuclear material management

    SciTech Connect

    Bowen, W.M.; Bennett, C.A.

    1988-12-01

    This book is intended as a reference manual of statistical methodology for nuclear material management practitioners. It describes statistical methods currently or potentially important in nuclear material management, explains the choice of methods for specific applications, and provides examples of practical applications to nuclear material management problems. Together with the accompanying training manual, which contains fully worked out problems keyed to each chapter, this book can also be used as a textbook for courses in statistical methods for nuclear material management. It should provide increased understanding and guidance to help improve the application of statistical methods to nuclear material management problems.

  18. Statistical methods for material characterization and qualification

    SciTech Connect

    Hunn, John D; Kercher, Andrew K

    2005-01-01

    This document describes a suite of statistical methods that can be used to infer lot parameters from the data obtained from inspection/testing of random samples taken from that lot. Some of these methods will be needed to perform the statistical acceptance tests required by the Advanced Gas Reactor Fuel Development and Qualification (AGR) Program. Special focus has been placed on proper interpretation of acceptance criteria and unambiguous methods of reporting the statistical results. In addition, modified statistical methods are described that can provide valuable measures of quality for different lots of material. This document has been written for use as a reference and a guide for performing these statistical calculations. Examples of each method are provided. Uncertainty analysis (e.g., measurement uncertainty due to instrumental bias) is not included in this document, but should be considered when reporting statistical results.

  19. Statistical Methods for Material Characterization and Qualification

    SciTech Connect

    Kercher, A.K.

    2005-04-01

    This document describes a suite of statistical methods that can be used to infer lot parameters from the data obtained from inspection/testing of random samples taken from that lot. Some of these methods will be needed to perform the statistical acceptance tests required by the Advanced Gas Reactor Fuel Development and Qualification (AGR) Program. Special focus has been placed on proper interpretation of acceptance criteria and unambiguous methods of reporting the statistical results. In addition, modified statistical methods are described that can provide valuable measures of quality for different lots of material. This document has been written for use as a reference and a guide for performing these statistical calculations. Examples of each method are provided. Uncertainty analysis (e.g., measurement uncertainty due to instrumental bias) is not included in this document, but should be considered when reporting statistical results.

  20. Quasi-Isotropic Approximation of Geometrical Optics Method as Adequate Electrodynamical Basis for Tokamak Plasma Polarimetry

    NASA Astrophysics Data System (ADS)

    Bieg, Bohdan; Chrzanowski, Janusz; Kravtsov, Yury A.; Orsitto, Francesco

    Basic principles and recent findings of the quasi-isotropic approximation (QIA) of the geometrical optics method are presented in a compact manner. QIA was developed in 1969 to describe electromagnetic waves in weakly anisotropic media. It represents the wave field as a power series in two small parameters, one of which is the traditional geometrical optics parameter, equal to the ratio of the wavelength to the plasma characteristic scale, while the other is the largest component of the anisotropy tensor. As a result, QIA is ideally suited to tokamak polarimetry/interferometry systems in the submillimeter range, where the plasma manifests the properties of a weakly anisotropic medium.

  1. Estimating the benefits of maintaining adequate lake levels to homeowners using the hedonic property method

    NASA Astrophysics Data System (ADS)

    Loomis, John; Feldman, Marvin

    2003-09-01

    The hedonic property method was used to estimate residents' economic benefits from maintaining high and stable lake levels at Lake Almanor, California. Nearly a thousand property transactions over a 14-year period from 1987 to 2001 were analyzed. The linear hedonic property regression explained more than 60% of the variation in house prices. Property prices were negatively and significantly related to the number of linear feet of exposed lake shoreline: each additional foot of exposed shoreline reduced the property price by $108-$119. A view of the lake added nearly $31,000 to house prices, while lakefront properties sold for $209,000 more than non-lakefront properties.
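
    A minimal sketch of a linear hedonic price regression of the kind described above, fitted with statsmodels on made-up transaction data; the variable names and coefficients are illustrative, not those of the Lake Almanor study.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        rng = np.random.default_rng(2)
        n = 1000

        # Hypothetical transactions: exposed shoreline (ft), lake view and lakefront flags.
        df = pd.DataFrame({
            "exposed_shoreline_ft": rng.uniform(0, 40, n),
            "lake_view": rng.integers(0, 2, n),
            "lakefront": rng.integers(0, 2, n),
        })
        price = (250_000
                 - 110 * df["exposed_shoreline_ft"]
                 + 31_000 * df["lake_view"]
                 + 209_000 * df["lakefront"]
                 + rng.normal(0, 40_000, n))

        X = sm.add_constant(df)
        model = sm.OLS(price, X).fit()
        print(model.params)       # estimated implicit price of each attribute
        print(model.rsquared)     # share of price variation explained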

  2. Statistical methods for environmental pollution monitoring

    SciTech Connect

    Gilbert, R.O.

    1987-01-01

    The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedures, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Some statistical techniques given here are not commonly seen in statistics books. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and space-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.
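
    One of the nonparametric trend tests referred to above (Chapters 16 and 17) is the Mann-Kendall test; a bare-bones version, ignoring corrections for ties and serial correlation, might look like this.

        import numpy as np
        from scipy.stats import norm

        def mann_kendall(x):
            """Mann-Kendall trend test (no tie or autocorrelation correction)."""
            x = np.asarray(x, dtype=float)
            n = len(x)
            # S = number of increasing pairs minus number of decreasing pairs.
            s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
            var_s = n * (n - 1) * (2 * n + 5) / 18.0
            # Continuity-corrected normal approximation for the test statistic.
            if s > 0:
                z = (s - 1) / np.sqrt(var_s)
            elif s < 0:
                z = (s + 1) / np.sqrt(var_s)
            else:
                z = 0.0
            p = 2 * (1 - norm.cdf(abs(z)))   # two-sided p-value
            return s, z, p

        # Example: a monitoring series with a mild upward drift plus noise.
        rng = np.random.default_rng(3)
        series = 0.05 * np.arange(40) + rng.normal(0, 1, 40)
        print(mann_kendall(series))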

  3. Statistical Methods for Selecting Merit Schools.

    ERIC Educational Resources Information Center

    Abalos, Jose; And Others

    This study investigated six statistical merit school selection methods using student scores on a nationally normed, standardized achievement test to identify merit schools. More specifically, its purpose was to select a method for the Palm Beach County School system which meets the Florida merit school program criterion of fairness in terms of…

  4. A Statistical Method for Syntactic Dialectometry

    ERIC Educational Resources Information Center

    Sanders, Nathan C.

    2010-01-01

    This dissertation establishes the utility and reliability of a statistical distance measure for syntactic dialectometry, expanding dialectometry's methods to include syntax as well as phonology and the lexicon. It establishes the measure's reliability by comparing its results to those of dialectology and phonological dialectometry on Swedish…

  5. ESD protection device design using statistical methods

    NASA Astrophysics Data System (ADS)

    Shigyo, N.; Kawashima, H.; Yasuda, S.

    2002-12-01

    This paper describes a design of the electrostatic discharge (ESD) protection device to minimize its area Ap while maintaining the breakdown voltage VESD. Hypothesis tests using measured data were performed to find the severest applied surge condition and to select control factors for the design-of-experiments (DOE). Also, technology CAD (TCAD) was used to estimate VESD. An optimum device structure, in which a salicide block was employed, was found using statistical methods and TCAD.

  6. Computational Statistical Methods for Social Network Models

    PubMed Central

    Hunter, David R.; Krivitsky, Pavel N.; Schweinberger, Michael

    2013-01-01

    We review the broad range of recent statistical work in social network models, with emphasis on computational aspects of these methods. Particular focus is applied to exponential-family random graph models (ERGM) and latent variable models for data on complete networks observed at a single time point, though we also briefly review many methods for incompletely observed networks and networks observed at multiple time points. Although we mention far more modeling techniques than we can possibly cover in depth, we provide numerous citations to current literature. We illustrate several of the methods on a small, well-known network dataset, Sampson’s monks, providing code where possible so that these analyses may be duplicated. PMID:23828720

  7. Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation

    NASA Technical Reports Server (NTRS)

    DePriest, Douglas; Morgan, Carolyn

    2003-01-01

    The cost and safety goals for NASA's next-generation reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models in accurately predicting the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.

  8. Some useful statistical methods for model validation.

    PubMed Central

    Marcus, A H; Elias, R W

    1998-01-01

    Although formal hypothesis tests provide a convenient framework for displaying the statistical results of empirical comparisons, standard tests should not be used without consideration of underlying measurement error structure. As part of the validation process, predictions of individual blood lead concentrations from models with site-specific input parameters are often compared with blood lead concentrations measured in field studies that also report lead concentrations in environmental media (soil, dust, water, paint) as surrogates for exposure. Measurements of these environmental media are subject to several sources of variability, including temporal and spatial sampling, sample preparation and chemical analysis, and data entry or recording. Adjustments for measurement error must be made before statistical tests can be used to empirically compare environmental data with model predictions. This report illustrates the effect of measurement error correction using a real dataset of child blood lead concentrations for an undisclosed midwestern community. We illustrate both the apparent failure of some standard regression tests and the success of adjustment of such tests for measurement error using the SIMEX (simulation-extrapolation) procedure. This procedure adds simulated measurement error to model predictions and then subtracts the total measurement error, analogous to the method of standard additions used by analytical chemists. PMID:9860913
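
    The SIMEX idea described above (add extra simulated measurement error at increasing multiples, refit, then extrapolate back to the no-error case) can be sketched for a simple linear regression slope; everything below, including the assumed-known error variance, is illustrative.

        import numpy as np

        rng = np.random.default_rng(4)
        n = 500
        sigma_u = 0.8                      # assumed known measurement-error SD

        # True exposure x, error-prone observed exposure w, and outcome y.
        x = rng.normal(0, 1, n)
        w = x + rng.normal(0, sigma_u, n)
        y = 2.0 * x + rng.normal(0, 1, n)  # true slope 2.0; the naive slope is attenuated

        def slope(w, y):
            return np.polyfit(w, y, 1)[0]

        # Simulation step: add extra error with variance lambda * sigma_u^2 and refit.
        lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
        slopes = [np.mean([slope(w + np.sqrt(lam) * sigma_u * rng.normal(0, 1, n), y)
                           for _ in range(200)]) for lam in lambdas]

        # Extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1.
        coeffs = np.polyfit(lambdas, slopes, 2)
        simex_slope = np.polyval(coeffs, -1.0)
        print(f"naive slope = {slopes[0]:.3f}, SIMEX-corrected slope = {simex_slope:.3f}")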

  9. Advanced statistical methods for the definition of new staging models.

    PubMed

    Kates, Ronald; Schmitt, Manfred; Harbeck, Nadia

    2003-01-01

    Adequate staging procedures are the prerequisite for individualized therapy concepts in cancer, particularly in the adjuvant setting. Molecular staging markers tend to characterize specific, fundamental disease processes to a greater extent than conventional staging markers. At the biological level, the course of the disease will almost certainly involve interactions between multiple underlying processes. Since new therapeutic strategies tend to target specific processes as well, their impact will also involve interactions. Hence, assessment of the prognostic impact of new markers and their utilization for prediction of response to therapy will require increasingly sophisticated statistical tools that are capable of detecting and modeling complicated interactions. Because they are designed to model arbitrary interactions, neural networks offer a promising approach to improved staging. However, the typical clinical data environment poses severe challenges to high-performance survival modeling using neural nets, particularly the key problem of maintaining good generalization. Nonetheless, it turns out that by using newly developed methods to minimize unnecessary complexity in the neural network representation of disease course, it is possible to obtain models with high predictive performance. This performance has been validated on both simulated and real patient data sets. There are important applications for design of studies involving targeted therapy concepts and for identification of the improvement in decision support resulting from new staging markers. In this article, advantages of advanced statistical methods such as neural networks for definition of new staging models will be illustrated using breast cancer as an example.

  10. Are the most distressing concerns of patients with inoperable lung cancer adequately assessed? A mixed-methods analysis.

    PubMed

    Tishelman, Carol; Lövgren, Malin; Broberger, Eva; Hamberg, Katarina; Sprangers, Mirjam A G

    2010-04-10

    PURPOSE: Standardized questionnaires for patient-reported outcomes are generally composed of specified predetermined items, although other areas may also cause patients distress. We therefore studied reports of what was most distressing for 343 patients with inoperable lung cancer (LC) at six time points during the first year postdiagnosis and how these concerns were assessed by three quality-of-life and symptom questionnaires. PATIENTS AND METHODS: Qualitative analysis of patients' responses to the question "What do you find most distressing at present?" generated 20 categories, with 17 under the dimensions of "bodily distress," "life situation with LC," and "iatrogenic distress." Descriptive and inferential statistical analyses were conducted. RESULTS: The majority of statements reported as most distressing related to somatic and psychosocial problems, with 26% of patients reporting an overarching form of distress instead of specific problems at some time point. Twenty-seven percent reported some facet of their contact with the health care system as causing them most distress. While 55% to 59% of concerns reported as most distressing were clearly assessed by the European Organisation for Research and Treatment for Cancer Quality of Life Questionnaire Core-30 and Lung Cancer Module instruments, the Memorial Symptom Assessment Scale, and the modified Distress Screening Tool, iatrogenic distress is not specifically targeted by any of the three instruments examined. CONCLUSION: Using this approach, several distressing issues were found to be commonly reported by this patient group but were not assessed by standardized questionnaires. This highlights the need to carefully consider choice of instrument in relation to study objectives and characteristics of the sample investigated and to consider complementary means of assessment in clinical practice.

  11. Seasonal UK Drought Forecasting using Statistical Methods

    NASA Astrophysics Data System (ADS)

    Richardson, Doug; Fowler, Hayley; Kilsby, Chris; Serinaldi, Francesco

    2016-04-01

    In the UK, drought is a recurrent feature of the climate with potentially large impacts on public water supply. Water companies' ability to mitigate the impacts of drought by managing diminishing availability depends on forward planning, and it would be extremely valuable to improve forecasts of drought on monthly to seasonal time scales. By focusing on statistical forecasting methods, this research aims to provide techniques that are simpler, faster and computationally cheaper than physically based models. In general, statistical forecasting is done by relating the variable of interest (some hydro-meteorological variable such as rainfall or streamflow, or a drought index) to one or more predictors via some formal dependence. These predictors are generally antecedent values of the response variable or external factors such as teleconnections. A candidate model class is Generalised Additive Models for Location, Scale and Shape (GAMLSS). GAMLSS is a very flexible class allowing for more general distribution functions (e.g. highly skewed and/or kurtotic distributions) and the modelling of not just the location parameter but also the scale and shape parameters. Additionally, GAMLSS permits the forecasting of an entire distribution, allowing the output to be assessed in probabilistic terms rather than simply as the mean and confidence intervals. Exploratory analysis of the relationship between long-memory processes (e.g. large-scale atmospheric circulation patterns, sea surface temperatures and soil moisture content) and drought should result in the identification of suitable predictors to be included in the forecasting model, and further our understanding of the drivers of UK drought.

  12. Methods of the computer-aided statistical analysis of microcircuits

    NASA Astrophysics Data System (ADS)

    Beliakov, Iu. N.; Kurmaev, F. A.; Batalov, B. V.

    Methods that are currently used for the computer-aided statistical analysis of microcircuits at the design stage are summarized. In particular, attention is given to methods for solving problems in statistical analysis, statistical planning, and factorial model synthesis by means of irregular experimental design. Efficient ways of reducing the computer time required for statistical analysis and numerical methods of microcircuit analysis are proposed. The discussion also covers various aspects of the organization of computer-aided microcircuit modeling and analysis systems.

  13. Statistical methods of estimating mining costs

    USGS Publications Warehouse

    Long, K.R.

    2011-01-01

    Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.

  14. A new method for derivation of statistical weight of the Gentile Statistics

    NASA Astrophysics Data System (ADS)

    Selvi, Sevilay; Uncu, Haydar

    2015-10-01

    We present a new method for obtaining the statistical weight of the Gentile statistics. In a recent paper, Hernandez-Perez and Tun (2007) presented an approximate combinatoric formula and an exact recursive formula for the statistical weight of the Gentile statistics, starting from the bosonic and fermionic cases, respectively. In this paper, we obtain two exact formulae, one combinatoric and one recursive, for the statistical weight of the Gentile statistics by another approach. The combinatoric formula is valid only for special cases, whereas the recursive formula is valid for all possible cases. Moreover, for a given q (the maximum number of particles that can occupy a level in Gentile statistics), the recursive formula we have derived gives the result much faster than the recursive formula presented in Hernandez-Perez and Tun (2007) when one uses a computer program. We also obtained the statistical weight for the distribution proposed by Dai and Xie (2009).
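
    The basic combinatorial object behind such formulae is the number of ways of distributing N identical particles over g states with at most q particles per state. A simple memoized recursion for that count, offered as a sketch of the underlying combinatorics rather than the exact statistical-weight expression derived in the paper, is:

        from functools import lru_cache

        @lru_cache(maxsize=None)
        def gentile_count(n, g, q):
            """Ways to place n identical particles in g states, at most q per state."""
            if n == 0:
                return 1          # one way: every remaining state is empty
            if g == 0:
                return 0          # particles left over but no states remain
            # Put k = 0..min(q, n) particles in the first state, recurse on the rest.
            return sum(gentile_count(n - k, g - 1, q) for k in range(min(q, n) + 1))

        # q = 1 reproduces the fermionic count C(g, n); large q approaches the bosonic count.
        print(gentile_count(3, 5, 1))   # 10 == C(5, 3)
        print(gentile_count(3, 5, 3))   # 35 == C(5 + 3 - 1, 3), the bosonic value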

  15. MSD Recombination Method in Statistical Machine Translation

    NASA Astrophysics Data System (ADS)

    Gros, Jerneja Žganec

    2008-11-01

    Freely available tools and language resources were used to build the VoiceTRAN statistical machine translation (SMT) system. Various configuration variations of the system are presented and evaluated. The VoiceTRAN SMT system outperformed the baseline conventional rule-based MT system in all English-Slovenian in-domain test setups. To further increase the generalization capability of the translation model for lower-coverage out-of-domain test sentences, an "MSD-recombination" approach was proposed. This approach not only allows a better exploitation of conventional translation models, but also performs well in the more demanding translation direction; that is, into a highly inflectional language. Using this approach in the out-of-domain setup of the English-Slovenian JRC-ACQUIS task, we have achieved significant improvements in translation quality.

  16. Statistical Methods Used in Gifted Education Journals, 2006-2010

    ERIC Educational Resources Information Center

    Warne, Russell T.; Lazo, Maria; Ramos, Tammy; Ritter, Nicola

    2012-01-01

    This article describes the statistical methods used in quantitative and mixed methods articles between 2006 and 2010 in five gifted education research journals. Results indicate that the most commonly used statistical methods are means (85.9% of articles), standard deviations (77.8%), Pearson's "r" (47.8%), χ² (32.2%), ANOVA (30.7%),…

  17. Statistical methods and computing for big data

    PubMed Central

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay. PMID:27695593

  18. Statistical methods and computing for big data

    PubMed Central

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay.
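
    A toy version of the divide-and-conquer strategy mentioned above: fit the same logistic regression independently on chunks of the data and combine the chunk estimates by simple averaging (more refined schemes weight by the estimated information). The data and model are invented for illustration.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(5)
        n, p = 100_000, 5
        X = rng.normal(size=(n, p))
        true_beta = np.array([1.0, -0.5, 0.25, 0.0, 0.75])
        y = rng.binomial(1, 1 / (1 + np.exp(-X @ true_beta)))

        # Divide: split the rows into chunks small enough to fit in memory.
        chunks = np.array_split(np.arange(n), 10)

        # Conquer: fit on each chunk (large C ~ essentially unpenalized), then average.
        betas = []
        for idx in chunks:
            clf = LogisticRegression(C=1e6, max_iter=1000).fit(X[idx], y[idx])
            betas.append(clf.coef_.ravel())
        beta_dc = np.mean(betas, axis=0)

        print("true coefficients:", true_beta)
        print("divide-and-conquer:", np.round(beta_dc, 3))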

  19. Problems of applicability of statistical methods in cosmology

    SciTech Connect

    Levin, S. F.

    2015-12-15

    The problems arising from incorrectly formulated measurement problems in the identification of cosmological models, and from violations of the conditions of applicability of statistical methods, are considered.

  20. Cratering statistics on asteroids: Methods and perspectives

    NASA Astrophysics Data System (ADS)

    Chapman, C.

    2014-07-01

    Crater size-frequency distributions (SFDs) on the surfaces of solid-surfaced bodies in the solar system have provided valuable insights about planetary surface processes and about impactor populations since the first spacecraft images were obtained in the 1960s. They can be used to determine relative age differences between surficial units, to obtain absolute model ages if the impactor flux and scaling laws are understood, to assess various endogenic planetary or asteroidal processes that degrade craters or resurface units, as well as assess changes in impactor populations across the solar system and/or with time. The first asteroid SFDs were measured from Galileo images of Gaspra and Ida (cf. Chapman 2002). Despite the superficial simplicity of these studies, they are fraught with many difficulties, including confusion by secondary and/or endogenic cratering and poorly understood aspects of varying target properties (including regoliths, ejecta blankets, and nearly-zero-g rubble piles), widely varying attributes of impactors, and a host of methodological problems including recognizability of degraded craters, which is affected by illumination angle and by the "personal equations" of analysts. Indeed, controlled studies (Robbins et al. 2014) demonstrate crater-density differences of a factor of two or more between experienced crater counters. These inherent difficulties have been especially apparent in divergent results for Vesta from different members of the Dawn Science Team (cf. Russell et al. 2013). Indeed, they have been exacerbated by misuse of a widely available tool (Craterstats: hrscview.fu-berlin.de/craterstats.html), which incorrectly computes error bars for proper interpretation of cumulative SFDs, resulting in derived model ages specified to three significant figures and interpretations of statistically insignificant kinks. They are further exacerbated, and for other small-body crater SFDs analyzed by the Berlin group, by stubbornly adopting

  1. Review of robust multivariate statistical methods in high dimension.

    PubMed

    Filzmoser, Peter; Todorov, Valentin

    2011-10-31

    General ideas of robust statistics, and specifically robust statistical methods for calibration and dimension reduction are discussed. The emphasis is on analyzing high-dimensional data. The discussed methods are applied using the packages chemometrics and rrcov of the statistical software environment R. It is demonstrated how the functions can be applied to real high-dimensional data from chemometrics, and how the results can be interpreted.
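
    Because the abstract's examples use the R packages chemometrics and rrcov, here is a rough Python analogue of one core idea, robust outlier flagging with the Minimum Covariance Determinant estimator in scikit-learn; the data and the chi-square cutoff are illustrative.

        import numpy as np
        from scipy.stats import chi2
        from sklearn.covariance import MinCovDet

        rng = np.random.default_rng(6)

        # Hypothetical multivariate chemometric measurements with a few outliers.
        X = rng.normal(size=(200, 8))
        X[:10] += 6.0                      # contaminate the first ten rows

        mcd = MinCovDet(random_state=0).fit(X)
        d2 = mcd.mahalanobis(X)            # squared robust Mahalanobis distances

        # Flag observations beyond the 97.5% chi-square quantile (8 degrees of freedom).
        cutoff = chi2.ppf(0.975, df=X.shape[1])
        print("flagged rows:", np.where(d2 > cutoff)[0])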

  2. Online Statistics Labs in MSW Research Methods Courses: Reducing Reluctance toward Statistics

    ERIC Educational Resources Information Center

    Elliott, William; Choi, Eunhee; Friedline, Terri

    2013-01-01

    This article presents results from an evaluation of an online statistics lab as part of a foundations research methods course for master's-level social work students. The article discusses factors that contribute to an environment in social work that fosters attitudes of reluctance toward learning and teaching statistics in research methods…

  3. An Introductory Overview of Statistical Methods for Discrete Time Series

    NASA Astrophysics Data System (ADS)

    Meng, X.-L.; California-Harvard AstroStat Collaboration

    2004-08-01

    A number of statistical problems encountered in astrophysics are concerned with discrete time series, such as photon counts with variation in source intensity over time. This talk provides an introductory overview of the current state-of-the-art methods in statistics, including Bayesian methods aided by Markov chain Monte Carlo, for modeling and analyzing such data. These methods have also been successfully applied in other fields, such as economics.
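
    A minimal sketch of the kind of Bayesian analysis of photon counts mentioned above: a constant-rate Poisson model with a Gamma prior, sampled by random-walk Metropolis (overkill here, since this posterior is conjugate, but it shows the MCMC machinery on a toy problem). All numbers are illustrative.

        import numpy as np

        rng = np.random.default_rng(7)
        counts = rng.poisson(4.2, size=100)          # hypothetical photon counts per time bin

        def log_post(lam, a=1.0, b=1.0):
            """Log posterior (up to a constant) for Poisson rate lam with a Gamma(a, b) prior."""
            if lam <= 0:
                return -np.inf
            return (np.sum(counts) * np.log(lam) - len(counts) * lam
                    + (a - 1) * np.log(lam) - b * lam)

        # Random-walk Metropolis sampler.
        lam, samples = 1.0, []
        for _ in range(20_000):
            prop = lam + rng.normal(0, 0.2)
            if np.log(rng.uniform()) < log_post(prop) - log_post(lam):
                lam = prop
            samples.append(lam)
        samples = np.array(samples[5_000:])          # discard burn-in

        print(f"posterior mean rate ~ {samples.mean():.2f}, "
              f"95% interval ~ ({np.percentile(samples, 2.5):.2f}, "
              f"{np.percentile(samples, 97.5):.2f})")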

  4. The estimation of the measurement results with using statistical methods

    NASA Astrophysics Data System (ADS)

    Velychko, O.; Gordiyenko, T.

    2015-02-01

    A number of international standards and guides describe various statistical methods that are applied to the management, control, and improvement of processes for the purpose of analyzing technical measurement results. The analysis of international standards and guides on statistical methods for estimation of measurement results, together with recommendations for applying them in laboratories, is described. To carry out this analysis, cause-and-effect Ishikawa diagrams concerning the application of statistical methods for estimation of measurement results were constructed.

  5. Using statistical methods and genotyping to detect tuberculosis outbreaks

    PubMed Central

    2013-01-01

    Background: Early identification of outbreaks remains a key component in continuing to reduce the burden of infectious disease in the United States. Previous studies have applied statistical methods to detect unexpected cases of disease in space or time. The objectives of our study were to assess the ability and timeliness of three spatio-temporal methods to detect known outbreaks of tuberculosis. Methods: We used routinely available molecular and surveillance data to retrospectively assess the effectiveness of three statistical methods in detecting tuberculosis outbreaks: county-based log-likelihood ratio, cumulative sums, and a spatial scan statistic. Results: Our methods identified 8 of the 9 outbreaks, and 6 outbreaks would have been identified 1–52 months (median = 10 months) before local public health authorities identified them. Assuming no delays in data availability, 46 (59.7%) of the 77 patients in the 9 outbreaks were identified after our statistical methods would have detected the outbreak but before local public health authorities became aware of the problem. Conclusions: Statistical methods, when applied retrospectively to routinely collected tuberculosis data, can successfully detect known outbreaks, potentially months before local public health authorities become aware of the problem. The three methods showed similar results; no single method was clearly superior to the other two. Further study to elucidate the performance of these methods in detecting tuberculosis outbreaks will be done in a prospective analysis. PMID:23497235
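
    Of the three methods compared above, the cumulative-sum (CUSUM) idea is the simplest to sketch: accumulate evidence that case counts exceed a baseline and signal when the sum crosses a threshold. The baseline, reference value, and threshold below are conventional defaults, not the study's parameters.

        import numpy as np

        def cusum_alarm(counts, baseline, k=0.5, h=4.0):
            """One-sided CUSUM on standardized counts.

            k is the reference value (allowance) and h the decision threshold,
            both in standard-deviation units.
            """
            sd = np.sqrt(baseline)                  # Poisson-like variability
            z = (np.asarray(counts, float) - baseline) / sd
            s, alarms = 0.0, []
            for t, zt in enumerate(z):
                s = max(0.0, s + zt - k)            # accumulate excess over the allowance
                if s > h:
                    alarms.append(t)
                    s = 0.0                         # reset after signalling
            return alarms

        # Hypothetical weekly case counts with a cluster starting at week 30.
        rng = np.random.default_rng(8)
        weekly = rng.poisson(2.0, 52)
        weekly[30:38] += rng.poisson(3.0, 8)
        print("alarm weeks:", cusum_alarm(weekly, baseline=2.0))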

  6. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    PubMed Central

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  7. [Evaluation of using statistical methods in selected national medical journals].

    PubMed

    Sych, Z

    1996-01-01

    The paper evaluates the frequency with which statistical methods were applied in works published in six selected national medical journals in the years 1988-1992. The journals chosen for analysis were: Klinika Oczna, Medycyna Pracy, Pediatria Polska, Polski Tygodnik Lekarski, Roczniki Państwowego Zakładu Higieny, and Zdrowie Publiczne. A number of works corresponding to the average in the remaining journals was randomly selected from the respective volumes of Pol. Tyg. Lek. Works in which no statistical analysis was implemented were excluded, for both national and international publications, as were review papers, case reports, reviews of books, handbooks and monographs, reports from scientific congresses, and papers on historical topics. The number of works was determined for each volume. Next, the way a suitable sample was obtained in the respective studies was examined, distinguishing two categories: random and targeted selection. Attention was also paid to the presence of a control sample in the individual works, and to the completeness of the sample characteristics, using three categories: complete, partial, and lacking. An effort was made to present the results of the studies in tables and figures (Tab. 1, 3). The analysis established the rate at which statistical methods were employed in the relevant volumes of the six selected national medical journals for the years 1988-1992, as well as the number of works in which no statistical methods were used, and the frequency with which individual statistical methods were applied in the scrutinized works. Prominence was given to fundamental methods of descriptive statistics (measures of position, measures of dispersion) as well as

  8. Statistical Methods for Establishing Personalized Treatment Rules in Oncology

    PubMed Central

    Ma, Junsheng; Hobbs, Brian P.; Stingo, Francesco C.

    2015-01-01

    The process for using statistical inference to establish personalized treatment strategies requires specific techniques for data-analysis that optimize the combination of competing therapies with candidate genetic features and characteristics of the patient and disease. A wide variety of methods have been developed. However, heretofore the usefulness of these recent advances has not been fully recognized by the oncology community, and the scope of their applications has not been summarized. In this paper, we provide an overview of statistical methods for establishing optimal treatment rules for personalized medicine and discuss specific examples in various medical contexts with oncology as an emphasis. We also point the reader to statistical software for implementation of the methods when available. PMID:26446492

  9. Advances in Statistical Methods for Substance Abuse Prevention Research

    PubMed Central

    MacKinnon, David P.; Lockwood, Chondra M.

    2010-01-01

    The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467
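
    A compact illustration of the mediation analysis mentioned above, estimating the indirect (mediated) effect as the product of coefficients a*b with a Sobel (first-order delta-method) standard error; the simulated prevention-program data are purely illustrative.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(9)
        n = 400
        program = rng.integers(0, 2, n)                            # 0 = control, 1 = intervention
        norms = 0.5 * program + rng.normal(0, 1, n)                # mediator: perceived norms
        use = -0.4 * norms + 0.1 * program + rng.normal(0, 1, n)   # outcome: substance use

        # Path a: program -> mediator.
        ma = sm.OLS(norms, sm.add_constant(program)).fit()
        a, se_a = ma.params[1], ma.bse[1]

        # Path b: mediator -> outcome, adjusting for the program.
        Xb = sm.add_constant(np.column_stack([program, norms]))
        mb = sm.OLS(use, Xb).fit()
        b, se_b = mb.params[2], mb.bse[2]

        # Indirect effect and Sobel standard error.
        indirect = a * b
        se_ab = np.sqrt(a**2 * se_b**2 + b**2 * se_a**2)
        print(f"indirect effect = {indirect:.3f}, z = {indirect / se_ab:.2f}")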

  10. Knowledge acquisition for expert systems using statistical methods

    NASA Technical Reports Server (NTRS)

    Belkin, Brenda L.; Stengel, Robert F.

    1991-01-01

    A common problem in the design of expert systems is the definition of rules from data obtained in system operation or simulation. A statistical method for generating rule bases from numerical data, motivated by an example based on aircraft navigation with multiple sensors is presented. The specific objective is to design an expert system that selects a satisfactory suite of measurements from a dissimilar, redundant set, given an arbitrary navigation geometry and possible sensor failures. The systematic development of a Navigation Sensor Management (NSM) Expert System from Kalman Filter covariance data is described. The development method invokes two statistical techniques: Analysis-of-Variance (ANOVA) and the ID3 algorithm. The ANOVA technique indicates whether variations of problem parameters give statistically different covariance results, and the ID3 algorithm identifies the relationships between the problem parameters using probabilistic knowledge extracted from a simulation example set.
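
    The ID3 step referred to above chooses, at each node, the attribute with the greatest information gain; a small entropy/information-gain helper, applied to a made-up sensor-selection table rather than the NSM data, looks like this.

        import numpy as np
        from collections import Counter

        def entropy(labels):
            counts = np.array(list(Counter(labels).values()), dtype=float)
            p = counts / counts.sum()
            return -np.sum(p * np.log2(p))

        def information_gain(attribute_values, labels):
            """Expected reduction in label entropy from splitting on one attribute."""
            total, n, remainder = entropy(labels), len(labels), 0.0
            for v in set(attribute_values):
                subset = [lab for av, lab in zip(attribute_values, labels) if av == v]
                remainder += len(subset) / n * entropy(subset)
            return total - remainder

        # Hypothetical examples: (geometry class, sensor-failure flag) -> chosen sensor suite.
        geometry = ["A", "A", "B", "B", "B", "C", "C", "C"]
        failure  = ["no", "yes", "no", "yes", "no", "no", "yes", "yes"]
        suite    = ["GPS", "INS", "GPS", "INS", "GPS", "GPS", "INS", "INS"]

        print("gain(geometry) =", round(information_gain(geometry, suite), 3))
        print("gain(failure)  =", round(information_gain(failure, suite), 3))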

  11. Landslide Susceptibility Statistical Methods: A Critical and Systematic Literature Review

    NASA Astrophysics Data System (ADS)

    Mihir, Monika; Malamud, Bruce; Rossi, Mauro; Reichenbach, Paola; Ardizzone, Francesca

    2014-05-01

    Landslide susceptibility assessment, the subject of this systematic review, is aimed at understanding the spatial probability of slope failures under a set of geomorphological and environmental conditions. It is estimated that about 375 of the landslides that occur globally each year are fatal, killing around 4600 people per year. Past studies have brought out the increasing cost of landslide damage, which can primarily be attributed to human occupation of, and increased human activities in, vulnerable environments. To evaluate and reduce landslide risk, many scientists have made an effort to efficiently map landslide susceptibility using different statistical methods. In this paper, we carry out a critical and systematic review of the landslide susceptibility literature in terms of the different statistical methods used. For each of a broad set of studies reviewed we note: (i) study geography region and areal extent, (ii) landslide types, (iii) inventory type and temporal period covered, (iv) mapping technique, (v) thematic variables used, (vi) statistical models, (vii) assessment of model skill, (viii) uncertainty assessment methods, (ix) validation methods. We then pulled out broad trends within our review of landslide susceptibility, particularly regarding the statistical methods. We found that the most common statistical methods used in the study of landslide susceptibility include logistic regression, artificial neural networks, discriminant analysis and weights of evidence. Although most of the studies we reviewed assessed the model skill, very few assessed model uncertainty. In terms of geographic extent, the largest number of landslide susceptibility zonations were in Turkey, Korea, Spain, Italy and Malaysia. However, there are also many landslides and fatalities in other localities, particularly India, China, Philippines, Nepal and Indonesia, Guatemala, and Pakistan, where there are much fewer landslide susceptibility studies available in the peer-reviewed literature. This
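
    As a schematic of the most common approach found in the review (logistic regression with an assessment of model skill), the following sketch fits a susceptibility model on hypothetical mapping-unit data and scores it with the area under the ROC curve; the predictors and data are invented.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(10)
        n = 5000

        # Hypothetical thematic variables for each mapping unit.
        slope_deg = rng.uniform(0, 45, n)
        rainfall = rng.normal(1200, 300, n)
        dist_fault = rng.exponential(500, n)
        X = np.column_stack([slope_deg, rainfall, dist_fault])

        # Synthetic landslide presence/absence driven mainly by slope and rainfall.
        logit = 0.08 * slope_deg + 0.002 * (rainfall - 1200) - 0.001 * dist_fault - 3.0
        y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

        # Validation: ROC AUC on held-out units measures model skill.
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        print(f"held-out ROC AUC = {auc:.3f}")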

  12. Peer-Assisted Learning in Research Methods and Statistics

    ERIC Educational Resources Information Center

    Stone, Anna; Meade, Claire; Watling, Rosamond

    2012-01-01

    Feedback from students on a Level 1 Research Methods and Statistics module, studied as a core part of a BSc Psychology programme, highlighted demand for additional tutorials to help them to understand basic concepts. Students in their final year of study commonly request work experience to enhance their employability. All students on the Level 1…

  13. Statistical Consequences of Attribute Misspecification in the Rule Space Method

    ERIC Educational Resources Information Center

    Im, Seongah; Corter, James E.

    2011-01-01

    The present study investigates the statistical consequences of attribute misspecification in the rule space method for cognitively diagnostic measurement. The two types of attribute misspecifications examined in the present study are exclusion of an essential attribute (which affects problem-solving performance) and inclusion of a superfluous…

  14. Reliability of groundwater vulnerability maps obtained through statistical methods.

    PubMed

    Sorichetta, Alessandro; Masetti, Marco; Ballabio, Cristiano; Sterlacchini, Simone; Beretta, Giovanni Pietro

    2011-04-01

    Statistical methods are widely used in environmental studies to evaluate natural hazards. Within groundwater vulnerability in particular, statistical methods are used to support decisions about environmental planning and management. The production of vulnerability maps obtained by statistical methods can greatly help decision making. One of the key points in all of these studies is the validation of the model outputs, which is performed through the application of various techniques to analyze the quality and reliability of the final results and to evaluate the model having the best performance. In this study, a groundwater vulnerability assessment to nitrate contamination was performed for the shallow aquifer located in the Province of Milan (Italy). The Weights of Evidence modeling technique was used to generate six model outputs, each one with a different number of input predictive factors. Considering that a vulnerability map is meaningful and useful only if it represents the study area through a limited number of classes with different degrees of vulnerability, the spatial agreement of different reclassified maps has been evaluated through the kappa statistic, and a series of validation procedures has been proposed and applied to evaluate the reliability of the reclassified maps. Results show that performance is not directly related to the number of input predictive factors and that it is possible to identify, among apparently similar maps, those best representing groundwater vulnerability in the study area. Thus, vulnerability maps generated using statistical modeling techniques have to be carefully handled before they are disseminated. Indeed, the results may appear to be excellent and final maps may perform quite well when, in fact, the depicted spatial distribution of vulnerability is greatly different from the actual one. For this reason, it is necessary to carefully evaluate the obtained results using multiple statistical techniques that are capable of
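
    The map-agreement step can be reproduced with Cohen's kappa on the flattened class rasters. The sketch below is a minimal illustration with synthetic four-class maps; in the study itself the inputs would be the reclassified Weights of Evidence outputs.

```python
# Minimal sketch of the map-agreement check: Cohen's kappa between two reclassified
# vulnerability maps, each flattened to a 1-D array of class labels per cell.
# The 4-class rasters below are synthetic stand-ins.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
map_a = rng.integers(0, 4, size=(200, 200))           # e.g. low/medium/high/very high
map_b = np.where(rng.random((200, 200)) < 0.85,       # mostly agrees with map_a
                 map_a, rng.integers(0, 4, size=(200, 200)))

kappa = cohen_kappa_score(map_a.ravel(), map_b.ravel())
print(f"Spatial agreement (kappa): {kappa:.2f}")
```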

  15. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P.; Brandt, James M.; Gentile, Ann C.; Marzouk, Youssef M.; Hale, Darrian J.; Thompson, David C.

    2011-01-25

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.

  16. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P.; Brandt, James M.; Gentile, Ann C.; Marzouk, Youssef M.; Hale, Darrian J.; Thompson, David C.

    2010-07-13

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.

  17. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P.; Brandt, James M.; Gentile, Ann C.; Marzouk, Youssef M.; Hale, Darrian J.; Thompson, David C.

    2011-01-04

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.

  18. Predicting recreational water quality advisories: A comparison of statistical methods

    USGS Publications Warehouse

    Brooks, Wesley R.; Corsi, Steven R.; Fienen, Michael N.; Carvin, Rebecca B.

    2016-01-01

    Epidemiological studies indicate that fecal indicator bacteria (FIB) in beach water are associated with illnesses among people having contact with the water. In order to mitigate public health impacts, many beaches are posted with an advisory when the concentration of FIB exceeds a beach action value. The most commonly used method of measuring FIB concentration takes 18–24 h before returning a result. In order to avoid the 24 h lag, it has become common to "nowcast" the FIB concentration using statistical regressions on environmental surrogate variables. Most commonly, nowcast models are estimated using ordinary least squares regression, but other regression methods from the statistical and machine learning literature are sometimes used. This study compares 14 regression methods across 7 Wisconsin beaches to identify which consistently produces the most accurate predictions. A random forest model is identified as the most accurate, followed by multiple regression fit using the adaptive LASSO.
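
    A hedged sketch of such a nowcast is shown below: a random forest predicts log FIB concentration from environmental surrogates, and an advisory decision is made by comparing the prediction to a beach action value. The variable names, coefficients and action value are invented for illustration; the study itself compared 14 methods across 7 beaches.

```python
# Illustrative "nowcast" regression with a random forest. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 400
X = np.column_stack([
    rng.uniform(0, 50, n),    # rainfall in previous 24 h (mm)
    rng.uniform(5, 25, n),    # water temperature (deg C)
    rng.uniform(0, 2, n),     # turbidity proxy
])
log_fib = 1.5 + 0.04 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.4, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, log_fib, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

action_value = 2.37                     # hypothetical log10 beach action value
predicted = model.predict(X_te)
advisories = predicted > action_value   # beaches that would be posted
print(f"Test R^2 = {model.score(X_te, y_te):.2f}, advisories = {advisories.sum()}")
```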

  19. Statistical approaches to pharmacodynamic modeling: motivations, methods, and misperceptions.

    PubMed

    Mick, R; Ratain, M J

    1993-01-01

    We have attempted to outline the fundamental statistical aspects of pharmacodynamic modeling. Unexpected yet substantial variability in effect in a group of similarly treated patients is the key motivation for pharmacodynamic investigations. Pharmacokinetic and/or pharmacodynamic factors may influence this variability. Residual variability in effect that persists after accounting for drug exposure indicates that further statistical modeling with pharmacodynamic factors is warranted. Factors that significantly predict interpatient variability in effect may then be employed to individualize the drug dose. In this paper we have emphasized the need to understand the properties of the effect measure and explanatory variables in terms of scale, distribution, and statistical relationship. The assumptions that underlie many types of statistical models have been discussed. The role of residual analysis has been stressed as a useful method to verify assumptions. We have described transformations and alternative regression methods that are employed when these assumptions are found to be in violation. Sequential selection procedures for the construction of multivariate models have been presented. The importance of assessing model performance has been underscored, most notably in terms of bias and precision. In summary, pharmacodynamic analyses are now commonly performed and reported in the oncologic literature. The content and format of these analyses has been variable. The goals of such analyses are to identify and describe pharmacodynamic relationships and, in many cases, to propose a statistical model. However, the appropriateness and performance of the proposed model are often difficult to judge. Table 1 displays suggestions (in a checklist format) for structuring the presentation of pharmacodynamic analyses, which reflect the topics reviewed in this paper. PMID:8269582

  20. Yang-Yang equilibrium statistical mechanics: A brilliant method

    NASA Astrophysics Data System (ADS)

    Guan, Xi-Wen; Chen, Yang-Yang

    2016-03-01

    Yang and Yang in 1969 [J. Math. Phys. 10, 1115 (1969)] for the first time proposed a rigorous approach to the thermodynamics of the one-dimensional system of bosons with a delta-function interaction. This paper was a breakthrough in exact statistical mechanics, after Yang [Phys. Rev. Lett. 19, 1312 (1967)] published his seminal work on the discovery of the Yang-Baxter equation in 1967. Yang and Yang’s brilliant method yields significant applications in a wide range of fields of physics. In this paper, we briefly introduce the method of the Yang-Yang equilibrium statistical mechanics and demonstrate a fundamental application of the Yang-Yang method for the study of thermodynamics of the Lieb-Liniger model with strong and weak interactions in a whole temperature regime. We also consider the equivalence between the Yang-Yang’s thermodynamic Bethe ansatz equation and the thermodynamics of the ideal gas with the Haldane’s generalized exclusion statistics.
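
    For reference, a hedged transcription of the Yang-Yang thermodynamic Bethe ansatz equation for the Lieb-Liniger gas is given below, in the units hbar = 2m = 1 commonly used for this model; sign and normalization conventions may differ from those adopted in the paper.

```latex
% Yang-Yang thermodynamic Bethe ansatz equation for the Lieb-Liniger gas
% (units \hbar = 2m = 1, coupling c, chemical potential \mu, temperature T):
\varepsilon(k) = k^{2} - \mu
  - \frac{T}{2\pi} \int_{-\infty}^{\infty}
    \frac{2c}{c^{2} + (k-q)^{2}} \, \ln\!\left(1 + e^{-\varepsilon(q)/T}\right) dq,
\qquad
p(\mu, T) = \frac{T}{2\pi} \int_{-\infty}^{\infty}
    \ln\!\left(1 + e^{-\varepsilon(k)/T}\right) dk .
```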

  1. Yang-Yang Equilibrium Statistical Mechanics: A Brilliant Method

    NASA Astrophysics Data System (ADS)

    Guan, Xi-Wen; Chen, Yang-Yang

    C. N. Yang and C. P. Yang in 1969 (J. Math. Phys. 10, 1115 (1969)) for the first time proposed a rigorous approach to the thermodynamics of the one-dimensional system of bosons with a delta-function interaction. This paper was a breakthrough in exact statistical mechanics, after C. N. Yang (Phys. Rev. Lett. 19, 1312 (1967)) published his seminal work on the discovery of the Yang-Baxter equation in 1967. Yang and Yang's brilliant method yields significant applications in a wide range of fields of physics. In this communication, we briefly introduce the method of the Yang-Yang equilibrium statistical mechanics and demonstrate a fundamental application of the Yang-Yang method for the study of thermodynamics of the Lieb-Liniger model with strong and weak interactions in a whole temperature regime. We also consider the equivalence between the Yang-Yang's thermodynamic Bethe ansatz equation and the thermodynamics of the ideal gas with the Haldane's generalized exclusion statistics.

  2. An analytic method to compute star cluster luminosity statistics

    NASA Astrophysics Data System (ADS)

    da Silva, Robert L.; Krumholz, Mark R.; Fumagalli, Michele; Fall, S. Michael

    2014-03-01

    The luminosity distribution of the brightest star clusters in a population of galaxies encodes critical pieces of information about how clusters form, evolve and disperse, and whether and how these processes depend on the large-scale galactic environment. However, extracting constraints on models from these data is challenging, in part because comparisons between theory and observation have traditionally required computationally intensive Monte Carlo methods to generate mock data that can be compared to observations. We introduce a new method that circumvents this limitation by allowing analytic computation of cluster order statistics, i.e. the luminosity distribution of the Nth most luminous cluster in a population. Our method is flexible and requires few assumptions, allowing for parametrized variations in the initial cluster mass function and its upper and lower cutoffs, variations in the cluster age distribution, stellar evolution and dust extinction, as well as observational uncertainties in both the properties of star clusters and their underlying host galaxies. The method is fast enough to make it feasible for the first time to use Markov chain Monte Carlo methods to search parameter space to find best-fitting values for the parameters describing cluster formation and disruption, and to obtain rigorous confidence intervals on the inferred values. We implement our method in a software package called the Cluster Luminosity Order-Statistic Code, which we have made publicly available.

  3. Statistical Methods Handbook for Advanced Gas Reactor Fuel Materials

    SciTech Connect

    J. J. Einerson

    2005-05-01

    Fuel materials such as kernels, coated particles, and compacts are being manufactured for experiments simulating service in the next generation of high temperature gas reactors. These must meet predefined acceptance specifications. Many tests are performed for quality assurance, and many of these correspond to criteria that must be met with specified confidence, based on random samples. This report describes the statistical methods to be used. The properties of the tests are discussed, including the risk of false acceptance, the risk of false rejection, and the assumption of normality. Methods for calculating sample sizes are also described.

  4. Methods for estimating low-flow statistics for Massachusetts streams

    USGS Publications Warehouse

    Ries, Kernell G.; Friesz, Paul J.

    2000-01-01

    Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The
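
    The drainage-area ratio method mentioned above reduces to a one-line scaling rule. The sketch below is a minimal illustration; the flow value, drainage areas and the 0.3-1.5 recommended range are used as in the report, but the example numbers are hypothetical.

```python
# Minimal sketch of the drainage-area ratio method: scale a low-flow statistic from
# an index streamgage by the ratio of drainage areas.
def drainage_area_ratio_estimate(q_index, area_ungaged, area_index):
    """Estimate a low-flow statistic at an ungaged site from an index gage."""
    ratio = area_ungaged / area_index
    if not 0.3 <= ratio <= 1.5:
        raise ValueError(f"Drainage-area ratio {ratio:.2f} outside recommended 0.3-1.5 range")
    return q_index * ratio

# Hypothetical example: 7-day, 10-year low flow of 1.2 ft^3/s at a 40 mi^2 index gage,
# transferred to a 28 mi^2 ungaged site.
print(drainage_area_ratio_estimate(1.2, area_ungaged=28.0, area_index=40.0))
```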

  5. Accuracy Evaluation of a Mobile Mapping System with Advanced Statistical Methods

    NASA Astrophysics Data System (ADS)

    Toschi, I.; Rodríguez-Gonzálvez, P.; Remondino, F.; Minto, S.; Orlandini, S.; Fuller, A.

    2015-02-01

    This paper discusses a methodology to evaluate the precision and the accuracy of a commercial Mobile Mapping System (MMS) with advanced statistical methods. So far, the metric potential of this emerging mapping technology has been studied in only a few papers, which generally assume that errors follow a normal distribution. In fact, this hypothesis should be carefully verified in advance, in order to test how well classic Gaussian statistics can adapt to datasets that are usually affected by asymmetrical gross errors. The workflow adopted in this study relies on a Gaussian assessment, followed by an outlier filtering process. Finally, non-parametric statistical models are applied in order to achieve a robust estimation of the error dispersion. Among the different MMSs available on the market, the latest solution provided by RIEGL is tested here, i.e. the VMX-450 Mobile Laser Scanning System. The test area is the historic city centre of Trento (Italy), selected in order to assess the system performance in a challenging, historic urban scenario. Reference measures are derived from photogrammetric and Terrestrial Laser Scanning (TLS) surveys. All datasets show a large lack of symmetry, which leads to the conclusion that the standard normal parameters are not adequate to assess this type of data. The use of non-normal statistics thus gives a more appropriate description of the data and yields results that meet the quoted a priori errors.
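
    A common robust, non-parametric pair for this kind of error assessment is the median together with the normalized median absolute deviation (NMAD). The sketch below assumes those estimators purely for illustration; whether they are the exact statistics used in the paper is not stated in the abstract.

```python
# Hedged sketch: robust accuracy measures for check-point errors (MMS minus reference).
# The median and NMAD (= 1.4826 * MAD) are a common robust alternative to the
# mean/standard deviation; the error data below are synthetic.
import numpy as np

rng = np.random.default_rng(3)
errors = rng.normal(0.0, 0.02, 500)                 # 2 cm Gaussian core (metres)
errors[:25] += rng.normal(0.15, 0.05, 25)           # asymmetric gross errors

print(f"mean   = {errors.mean():.4f} m, std  = {errors.std(ddof=1):.4f} m")
print(f"median = {np.median(errors):.4f} m, "
      f"NMAD = {1.4826 * np.median(np.abs(errors - np.median(errors))):.4f} m")
```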

  6. Application of the Bootstrap Statistical Method in Deriving Vibroacoustic Specifications

    NASA Technical Reports Server (NTRS)

    Hughes, William O.; Paez, Thomas L.

    2006-01-01

    This paper discusses the Bootstrap Method for specification of vibroacoustic test specifications. Vibroacoustic test specifications are necessary to properly accept or qualify a spacecraft and its components for the expected acoustic, random vibration and shock environments seen on an expendable launch vehicle. Traditionally, NASA and the U.S. Air Force have employed methods of Normal Tolerance Limits to derive these test levels based upon the amount of data available, and the probability and confidence levels desired. The Normal Tolerance Limit method contains inherent assumptions about the distribution of the data. The Bootstrap is a distribution-free statistical subsampling method which uses the measured data themselves to establish estimates of statistical measures of random sources. This is achieved through the computation of large numbers of Bootstrap replicates of a data measure of interest and the use of these replicates to derive test levels consistent with the probability and confidence desired. The comparison of the results of these two methods is illustrated via an example utilizing actual spacecraft vibroacoustic data.
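
    The core of the approach is resampling the measured levels and reading the desired probability/confidence combination off the bootstrap replicates. The sketch below is illustrative only: the flight-level values and the P95/50-style target are invented, and the exact statistic bootstrapped in the paper may differ.

```python
# Illustrative bootstrap sketch: estimate a 95th-percentile vibroacoustic level from a
# small data set, with confidence taken from the distribution of bootstrap replicates.
import numpy as np

rng = np.random.default_rng(4)
measured_db = np.array([128.1, 130.4, 127.6, 131.2, 129.9, 128.8, 132.0, 129.3])

n_boot = 10_000
replicates = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(measured_db, size=measured_db.size, replace=True)
    replicates[i] = np.percentile(resample, 95)

# P95/50-style level: the median (50% confidence) of the bootstrap P95 replicates.
print(f"P95/50 estimate: {np.median(replicates):.1f} dB")
print(f"P95/90 (90% confidence) estimate: {np.percentile(replicates, 90):.1f} dB")
```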

  7. Development of an optomechanical statistical tolerancing method for cost reduction

    NASA Astrophysics Data System (ADS)

    Lamontagne, Frédéric; Doucet, Michel

    2012-10-01

    Optical systems generally require a high level of optical component positioning precision, resulting in elevated manufacturing cost. The optomechanical tolerance analysis is usually performed by the optomechanical engineer using his personal knowledge of the manufacturing precision capability. Worst-case or root-sum-square (RSS) tolerance calculation methods are frequently used for their simplicity. In most situations, the chance of encountering the worst-case error is statistically almost nil. On the other hand, the RSS method is generally not an accurate representation of reality, since it assumes centered normal distributions. Moreover, the RSS method is not suitable for multidimensional tolerance analysis that combines translational and rotational variations. An optomechanical tolerance analysis method based on Monte Carlo simulation has been developed at INO to reduce the overdesign caused by pessimistic manufacturing and assembly error predictions. Manufacturing error data have been compiled and processed for use as input to the optomechanical Monte Carlo tolerance model. This results in a more realistic prediction of optical component positioning errors (decenter, tilt and air gap). The calculated error probabilities were validated on a real lens barrel assembly using a high-precision centering machine. Results show that the statistical error prediction is more accurate and that it can significantly relax the required precision compared with the worst-case method. Manufacturing, inspection, adjustment mechanism and alignment costs can then be reduced considerably.
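
    A minimal Monte Carlo stack-up in the spirit of this approach is sketched below. The contributor list, their distributions and the 0.020 mm tolerance limit are all assumptions made for illustration, not values from the paper.

```python
# Minimal Monte Carlo tolerance sketch: lens decenter simulated as the sum of several
# manufacturing/assembly error contributors, each drawn from an assumed distribution.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

barrel_bore_runout = rng.normal(0, 0.005, n)          # mm, centered process
lens_edge_clearance = rng.uniform(-0.010, 0.010, n)   # mm, radial play in the seat
seat_machining_bias = rng.normal(0.003, 0.002, n)     # mm, slightly off-center process

decenter = np.abs(barrel_bore_runout + lens_edge_clearance + seat_machining_bias)

tolerance = 0.020                                     # mm, hypothetical requirement
print(f"P(decenter > {tolerance} mm) = {(decenter > tolerance).mean():.4f}")
print(f"99.7th percentile decenter   = {np.percentile(decenter, 99.7):.4f} mm")
```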

  8. Statistical methods for investigating quiescence and other temporal seismicity patterns

    USGS Publications Warehouse

    Matthews, M.V.; Reasenberg, P.A.

    1988-01-01

    We propose a statistical model and a technique for objective recognition of one of the most commonly cited seismicity patterns: microearthquake quiescence. We use a Poisson process model for seismicity and define a process with quiescence as one with a particular type of piecewise constant intensity function. From this model, we derive a statistic for testing stationarity against a 'quiescence' alternative. The large-sample null distribution of this statistic is approximated from simulated distributions of appropriate functionals applied to Brownian bridge processes. We point out the restrictiveness of the particular model we propose and of the quiescence idea in general. The fact that there are many point processes which have neither constant nor quiescent rate functions underscores the need to test for and describe nonuniformity thoroughly. We advocate the use of the quiescence test in conjunction with various other tests for nonuniformity and with graphical methods such as density estimation. Ideally, these methods may promote accurate description of temporal seismicity distributions and useful characterizations of interesting patterns. © 1988 Birkhäuser Verlag.

  9. Statistics

    Cancer.gov

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  10. A review of statistical methods for preprocessing oligonucleotide microarrays.

    PubMed

    Wu, Zhijin

    2009-12-01

    Microarrays have become an indispensable tool in biomedical research. This powerful technology not only makes it possible to quantify a large number of nucleic acid molecules simultaneously, but also produces data with many sources of noise. A number of preprocessing steps are therefore necessary to convert the raw data, usually in the form of hybridisation images, to measures of biological meaning that can be used in further statistical analysis. Preprocessing of oligonucleotide arrays includes image processing, background adjustment, data normalisation/transformation and sometimes summarisation when multiple probes are used to target one genomic unit. In this article, we review the issues encountered in each preprocessing step and introduce the statistical models and methods in preprocessing.

  11. Huffman and linear scanning methods with statistical language models.

    PubMed

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
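
    The core idea, assigning fewer switch activations to more probable symbols, rests on ordinary Huffman coding over language-model probabilities. The sketch below builds such a code from a made-up probability table; a real system would condition the probabilities on the typing context.

```python
# Sketch of the idea behind Huffman scanning: build a Huffman code over symbol
# probabilities so that likely letters need fewer switch activations.
import heapq
from itertools import count

def huffman_code(probs):
    """Return {symbol: bitstring} for a dict of symbol probabilities."""
    tiebreak = count()  # keeps heap comparisons away from unorderable payloads
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

probs = {"e": 0.40, "t": 0.25, "a": 0.15, "o": 0.12, "_": 0.08}  # hypothetical LM output
for symbol, code in sorted(huffman_code(probs).items(), key=lambda kv: len(kv[1])):
    print(symbol, code)
```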

  12. Huffman and linear scanning methods with statistical language models.

    PubMed

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning. PMID:25672825

  13. Radiological decontamination, survey, and statistical release method for vehicles

    SciTech Connect

    Goodwill, M.E.; Lively, J.W.; Morris, R.L.

    1996-06-01

    Earth-moving vehicles (e.g., dump trucks, belly dumps) commonly haul radiologically contaminated materials from a site being remediated to a disposal site. Traditionally, each vehicle must be surveyed before being released. The logistical difficulties of implementing the traditional approach on a large scale demand that an alternative be devised. A statistical method for assessing product quality from a continuous process was adapted to the vehicle decontamination process. This method produced a sampling scheme that automatically compensates and accommodates fluctuating batch sizes and changing conditions without the need to modify or rectify the sampling scheme in the field. Vehicles are randomly selected (sampled) upon completion of the decontamination process to be surveyed for residual radioactive surface contamination. The frequency of sampling is based on the expected number of vehicles passing through the decontamination process in a given period and the confidence level desired. This process has been successfully used for 1 year at the former uranium millsite in Monticello, Utah (a cleanup site regulated under the Comprehensive Environmental Response, Compensation, and Liability Act). The method forces improvement in the quality of the decontamination process and results in a lower likelihood that vehicles exceeding the surface contamination standards are offered for survey. Implementation of this statistical sampling method on Monticello projects has resulted in more efficient processing of vehicles through decontamination and radiological release, saved hundreds of hours of processing time, provided a high level of confidence that release limits are met, and improved the radiological cleanliness of vehicles leaving the controlled site.

  14. Statistical length measurement method by direct imaging of carbon nanotubes.

    PubMed

    Bengio, E Amram; Tsentalovich, Dmitri E; Behabtu, Natnael; Kleinerman, Olga; Kesselman, Ellina; Schmidt, Judith; Talmon, Yeshayahu; Pasquali, Matteo

    2014-05-14

    The influence of carbon nanotube (CNT) length on macroscopic properties requires an accurate methodology for CNT length measurement. So far, existing techniques are limited to short (less than a few micrometers) CNTs and sample preparation methods that bias the measured values. Here, we show that the average length of CNTs can be measured by cryogenic transmission electron microscopy (cryo-TEM) of CNTs in chlorosulfonic acid. The method consists of dissolving CNTs at low concentration in chlorosulfonic acid (a true solvent), imaging the individual CNTs by cryo-TEM, and processing and analyzing the images to determine CNT length. By measuring the total CNT contour length and number of CNT ends in each image, and by applying statistical analysis, we extend the method to cases where each CNT is long enough to span many cryo-TEM images, making the direct length measurement of an entire CNT impractical. Hence, this new technique can be used effectively to estimate lengths in samples spanning a wide range of CNT lengths, although we find that cryo-TEM imaging may bias the measurement towards longer CNTs, which are easier to detect. Our statistical method is also applied to AFM images of CNTs to show that, by using only a few AFM images, it yields estimates that are consistent with literature techniques, based on individually measuring a higher number of CNTs. PMID:24773046
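
    One estimator consistent with counting contour length and ends is the end-count average length: if each CNT contributes two ends, the number-average length is roughly twice the summed contour length divided by the number of observed ends. Whether this is the exact statistic used in the paper is an assumption; the numbers below are invented.

```python
# Hedged sketch of an end-count length estimator:
#   <L> ~= 2 * (total contour length) / (number of ends observed), summed over images.
def average_cnt_length_um(total_contour_length_um, n_ends):
    """Number-average CNT length from summed contour length and end count."""
    if n_ends == 0:
        raise ValueError("No CNT ends observed; cannot estimate length")
    return 2.0 * total_contour_length_um / n_ends

# Hypothetical example: 1850 um of contour length and 370 ends across a set of images.
print(f"{average_cnt_length_um(1850.0, 370):.1f} um average length")
```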

  15. Texture analysis with statistical methods for wheat ear extraction

    NASA Astrophysics Data System (ADS)

    Bakhouche, M.; Cointault, F.; Gouton, P.

    2007-01-01

    In the agronomic domain, the simplification of crop counting, necessary for yield prediction and agronomic studies, is an important project for technical institutes such as Arvalis. Although the main objective of our global project is to design a mobile robot for natural image acquisition directly in a field, Arvalis first proposed that we detect, by image processing, the number of wheat ears in images before counting them, which will allow us to obtain the first component of the yield. In this paper we compare different texture image segmentation techniques based on feature extraction by first- and higher-order statistical methods, which have been applied to our images. The extracted features are used for unsupervised pixel classification to obtain the different classes in the image. The K-means algorithm is implemented before the choice of a threshold to highlight the ears. Three methods have been tested in this feasibility study, with an average error of 6%. Although the quality of the detection is currently evaluated visually, automatic evaluation algorithms are being implemented. Moreover, other higher-order statistical methods will be implemented in the future, jointly with methods based on spatio-frequential transforms and specific filtering.
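
    The unsupervised pixel-classification step can be sketched with first-order texture features feeding a K-means clustering, as below. The window size, number of clusters and feature set are assumptions for illustration, not the exact settings used in the paper.

```python
# Sketch of texture-based pixel clustering: local mean and standard deviation as
# first-order features, classified with K-means. The image is a synthetic stand-in.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
image = rng.random((120, 160))              # stand-in for a grey-level wheat image

local_mean = uniform_filter(image, size=7)
local_sq_mean = uniform_filter(image ** 2, size=7)
local_std = np.sqrt(np.clip(local_sq_mean - local_mean ** 2, 0, None))

features = np.column_stack([local_mean.ravel(), local_std.ravel()])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
label_image = labels.reshape(image.shape)   # one cluster would then be thresholded as "ears"
print("pixels per class:", np.bincount(labels))
```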

  16. Hybrid perturbation methods based on statistical time series models

    NASA Astrophysics Data System (ADS)

    San-Juan, Juan Félix; San-Martín, Montserrat; Pérez, Iván; López, Rosario

    2016-04-01

    In this work we present a new methodology for orbit propagation, the hybrid perturbation theory, based on the combination of an integration method and a prediction technique. The former, which can be a numerical, analytical or semianalytical theory, generates an initial approximation that contains some inaccuracies derived from the fact that, in order to simplify the expressions and subsequent computations, not all the involved forces are taken into account and only low-order terms are considered, not to mention the fact that mathematical models of perturbations do not always reproduce physical phenomena with absolute precision. The prediction technique, which can be based on either statistical time series models or computational intelligence methods, is aimed at modelling and reproducing the missing dynamics in the previously integrated approximation. This combination improves the precision of conventional numerical, analytical and semianalytical theories for determining the position and velocity of any artificial satellite or space debris object. In order to validate this methodology, we present a family of three hybrid orbit propagators formed by the combination of three different orders of approximation of an analytical theory and a statistical time series model, and analyse their capability to process the effect produced by the flattening of the Earth. The three considered analytical components are the integration of the Kepler problem, a first-order and a second-order analytical theory, whereas the prediction technique is the same in the three cases, namely an additive Holt-Winters method.
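
    A hedged sketch of the prediction component alone is given below: an additive Holt-Winters model (statsmodels' ExponentialSmoothing) fitted to the residual between a low-order propagation and a reference trajectory, then used to forecast the missing dynamics. The residual series, period and horizon are synthetic assumptions.

```python
# Additive Holt-Winters forecast of a synthetic residual time series, standing in for
# the "missing dynamics" correction applied on top of an analytical propagation.
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(7)
t = np.arange(400)
period = 50                                            # samples per orbital revolution
residual = 0.002 * t + 0.5 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.05, t.size)

model = ExponentialSmoothing(residual, trend="add",
                             seasonal="add", seasonal_periods=period).fit()
forecast = model.forecast(100)                         # residual correction, next 100 steps
print(forecast[:5])
```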

  17. How to eradicate fraudulent statistical methods: statisticians must do science

    SciTech Connect

    Bross, I.D. )

    1990-12-01

    The two steps necessary for the clinical expression of a mutagenic disease, genetic damage and viability, are countervailing forces and therefore the dosage response curve for mutagens must have a maximum. To illustrate that science is common sense reduced to calculation, a new mathematical derivation of this result and supporting data are given. This example also shows that the term 'context-free' is a snare and a delusion. When statistical methods are used in a scientific context where their assumptions are known to fail and where there is a reasonable presumption of intent to deceive, they are fraudulent. Estimation of low-level mutagenic risks by linear extrapolation from high-dose data is one example of such a method that is widely used by Executive Branch agencies. Other examples are given of fraudulent statistical methods that are currently used in biomedical research done by or for U.S. government agencies. In the long run, it is argued, the surest way to eradicate such fraud is for biostatisticians to do their own science.

  18. Optimization of Statistical Methods Impact on Quantitative Proteomics Data.

    PubMed

    Pursiheimo, Anna; Vehmas, Anni P; Afzal, Saira; Suomi, Tomi; Chand, Thaman; Strauss, Leena; Poutanen, Matti; Rokka, Anne; Corthals, Garry L; Elo, Laura L

    2015-10-01

    As tools for quantitative label-free mass spectrometry (MS) rapidly develop, a consensus about the best practices is not apparent. In the work described here we compared popular statistical methods for detecting differential protein expression from quantitative MS data using both controlled experiments with known quantitative differences for specific proteins used as standards as well as "real" experiments where differences in protein abundance are not known a priori. Our results suggest that data-driven reproducibility-optimization can consistently produce reliable differential expression rankings for label-free proteome tools and are straightforward in their application. PMID:26321463

  19. Statistical estimation of mineral age by K-Ar method

    SciTech Connect

    Vistelius, A.B.; Drubetzkoy, E.R.; Faas, A.V. )

    1989-11-01

    Statistical estimation of age from ⁴⁰Ar/⁴⁰K ratios may be considered a result of the convolution of uniform and normal distributions with different weights for different minerals. Data from the Gul'shad Massif (Nearbalkhash, Kazakhstan, USSR) indicate that ⁴⁰Ar/⁴⁰K ratios reflecting the intensity of geochemical processes can be resolved using convolutions. Loss of ⁴⁰Ar in biotites is shown, whereas hornblende retained its original content of ⁴⁰Ar throughout the geological history of the massif. The results demonstrate that different estimation methods must be used for different minerals and different rocks when radiometric ages are employed for dating.

  20. Of pacemakers and statistics: the actuarial method extended.

    PubMed

    Dussel, J; Wolbarst, A B; Scott-Millar, R N; Obel, I W

    1980-01-01

    Pacemakers cease functioning because of either natural battery exhaustion (nbe) or component failure (cf). A study of four series of pacemakers shows that a simple extension of the actuarial method, so as to incorporate Normal statistics, makes possible a quantitative differentiation between the two modes of failure. This involves the separation of the overall failure probability density function PDF(t) into constituent parts pdf_nbe(t) and pdf_cf(t). The approach should allow a meaningful comparison of the characteristics of different pacemaker types.

  1. Stochastic and statistical methods in hydrology and environmental engineering

    NASA Astrophysics Data System (ADS)

    Hipel, K. W.

    1995-03-01

    Recent developments in stochastic and statistical methods in hydrology and environmental engineering presented in the upcoming sequence of research papers are evaluated, compared and put into proper perspective. These papers are being published as a memorial to Professor T. E. Unny who was a founding Editor of the journal Stochastic Hydrology and Hydraulics. As explained in this introductory paper, other activities that took place to celebrate Professor Unny's lifetime academic accomplishments include an international conference held in his honor at the University of Waterloo in June, 1993 and the publication of a four-volume conference proceedings in 1994.

  2. Optimization of Statistical Methods Impact on Quantitative Proteomics Data.

    PubMed

    Pursiheimo, Anna; Vehmas, Anni P; Afzal, Saira; Suomi, Tomi; Chand, Thaman; Strauss, Leena; Poutanen, Matti; Rokka, Anne; Corthals, Garry L; Elo, Laura L

    2015-10-01

    As tools for quantitative label-free mass spectrometry (MS) rapidly develop, a consensus about the best practices is not apparent. In the work described here we compared popular statistical methods for detecting differential protein expression from quantitative MS data using both controlled experiments with known quantitative differences for specific proteins used as standards as well as "real" experiments where differences in protein abundance are not known a priori. Our results suggest that data-driven reproducibility-optimization can consistently produce reliable differential expression rankings for label-free proteome tools and are straightforward in their application.

  3. Estimated Accuracy of Three Common Trajectory Statistical Methods

    NASA Technical Reports Server (NTRS)

    Kabashnikov, Vitaliy P.; Chaikovsky, Anatoli P.; Kucsera, Tom L.; Metelskaya, Natalia S.

    2011-01-01

    Three well-known trajectory statistical methods (TSMs), namely concentration field (CF), concentration weighted trajectory (CWT), and potential source contribution function (PSCF) methods were tested using known sources and artificially generated data sets to determine the ability of TSMs to reproduce spatial distribution of the sources. In the works by other authors, the accuracy of the trajectory statistical methods was estimated for particular species and at specified receptor locations. We have obtained a more general statistical estimation of the accuracy of source reconstruction and have found optimum conditions to reconstruct source distributions of atmospheric trace substances. Only virtual pollutants of the primary type were considered. In real world experiments, TSMs are intended for application to a priori unknown sources. Therefore, the accuracy of TSMs has to be tested with all possible spatial distributions of sources. An ensemble of geographical distributions of virtual sources was generated. Spearman's rank-order correlation coefficient between spatial distributions of the known virtual and the reconstructed sources was taken to be a quantitative measure of the accuracy. Statistical estimates of the mean correlation coefficient and a range of the most probable values of correlation coefficients were obtained. All the TSMs that were considered here showed similar close results. The maximum of the ratio of the mean correlation to the width of the correlation interval containing the most probable correlation values determines the optimum conditions for reconstruction. An optimal geographical domain roughly coincides with the area supplying most of the substance to the receptor. The optimal domain's size is dependent on the substance decay time. Under optimum reconstruction conditions, the mean correlation coefficients can reach 0.70–0.75. The boundaries of the interval with the most probable correlation values are 0.6–0.9 for the decay time of 240 h
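
    The accuracy measure used above is straightforward to compute: Spearman's rank-order correlation between the known virtual source field and the reconstructed field, both flattened over the geographic grid. The fields in the sketch below are synthetic stand-ins.

```python
# Minimal sketch of the accuracy measure: Spearman correlation between a known virtual
# source distribution and an imperfect TSM reconstruction of it.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(8)
true_sources = rng.gamma(shape=2.0, scale=1.0, size=(40, 60))
reconstructed = true_sources + rng.normal(0, 0.8, true_sources.shape)  # noisy reconstruction

rho, p_value = spearmanr(true_sources.ravel(), reconstructed.ravel())
print(f"Spearman rho = {rho:.2f} (p = {p_value:.2g})")
```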

  4. A Statistical Process Control Method for Semiconductor Manufacturing

    NASA Astrophysics Data System (ADS)

    Kubo, Tomoaki; Ino, Tomomi; Minami, Kazuhiro; Minami, Masateru; Homma, Tetsuya

    To maintain stable operation of semiconductor fabrication lines, statistical process control (SPC) methods are recognized to be effective. However, in semiconductor fabrication lines, there exist a huge number of process state signals to be monitored, and these signals contain both normally and non-normally distributed data. Therefore, if we try to apply SPC methods to those signals, we need one which satisfies three requirements: 1) It can deal with both normally distributed data, and non-normally distributed data, 2) It can be set up automatically, 3) It can be easily understood by engineers and technicians. In this paper, we propose a new SPC method which satisfies these three requirements at the same time. This method uses similar rules to the Shewhart chart, but can deal with non-normally distributed data by introducing “effective standard deviations”. Usefulness of this method is demonstrated by comparing false alarm ratios to that of the Shewhart chart method. In the demonstration, we use various kinds of artificially generated data, and real data observed in a chemical vapor deposition (CVD) process tool in a semiconductor fabrication line.
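
    The abstract does not define "effective standard deviations", so the sketch below assumes one plausible percentile-based construction (half the spread between the 15.87th and 84.13th percentiles, which equals sigma for normal data) combined with Shewhart-style 3-sigma limits around the median. It is an illustration of the idea, not the authors' method.

```python
# Hedged sketch: Shewhart-like control limits for a skewed (non-normal) signal, using a
# percentile-based "effective standard deviation" instead of the sample std deviation.
import numpy as np

rng = np.random.default_rng(9)
signal = rng.lognormal(mean=0.0, sigma=0.3, size=2000)   # skewed process-state signal

center = np.median(signal)
eff_sigma = 0.5 * (np.percentile(signal, 84.13) - np.percentile(signal, 15.87))
ucl, lcl = center + 3 * eff_sigma, center - 3 * eff_sigma

new_points = rng.lognormal(mean=0.0, sigma=0.3, size=200)
alarms = (new_points > ucl) | (new_points < lcl)
print(f"UCL = {ucl:.3f}, LCL = {lcl:.3f}, false-alarm ratio = {alarms.mean():.3%}")
```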

  5. FOREWORD: Special issue on Statistical and Probabilistic Methods for Metrology

    NASA Astrophysics Data System (ADS)

    Bich, Walter; Cox, Maurice G.

    2006-08-01

    This special issue of Metrologia is the first that is not devoted to units, or constants, or measurement techniques in some specific field of metrology, but to the generic topic of statistical and probabilistic methods for metrology. The number of papers on this subject in measurement journals, and in Metrologia in particular, has continued to increase over the years, driven by the publication of the Guide to the Expression of Uncertainty in Measurement (GUM) [1] and the Mutual Recognition Arrangement (MRA) of the CIPM [2]. The former stimulated metrologists to think in greater depth about the appropriate modelling of their measurements, in order to provide uncertainty evaluations associated with measurement results. The latter obliged the metrological community to investigate reliable measures for assessing the calibration and measurement capabilities declared by the national metrology institutes (NMIs). Furthermore, statistical analysis of measurement data became even more important than hitherto, with the need, on the one hand, to treat the greater quantities of data provided by sophisticated measurement systems, and, on the other, to deal appropriately with relatively small sets of data that are difficult or expensive to obtain. The importance of supporting the GUM and extending its provisions was recognized by the formation in the year 2000 of Working Group 1, Measurement uncertainty, of the Joint Committee for Guides in Metrology. The need to provide guidance on key comparison data evaluation was recognized by the formation in the year 2001 of the BIPM Director's Advisory Group on Uncertainty. A further international initiative was the revision, in the year 2004, of the remit and title of a working group of ISO/TC 69, Application of Statistical Methods, to reflect the need to concentrate more on statistical methods to support measurement uncertainty evaluation. These international activities are supplemented by national programmes such as the Software Support

  6. Regional homogenization of surface temperature records using robust statistical methods

    NASA Astrophysics Data System (ADS)

    Pintar, A. L.; Possolo, A.; Zhang, N. F.

    2013-09-01

    An algorithm is described that is intended to estimate and remove spurious influences from the surface temperature record at a meteorological station, which may be due to changes in the location of the station or in its environment, or in the method used to make measurements, and which are unrelated to climate change, similarly to [1]. The estimate of these influences is based on a comparison of non-parametric decompositions of the target series with series measured at other stations in a neighborhood of the target series. The uncertainty of the estimated spurious artifacts is determined using a statistical bootstrap method that accounts for temporal correlation structure beyond what is expected from seasonal effects. Our computer-intensive bootstrap procedure lends itself readily to parallelization, which makes the algorithm practicable for large collections of stations. The role that the proposed procedure may play in practice is contingent on the results of large-scale testing, still under way, using historical data.

  7. Statistical method for detecting structural change in the growth process.

    PubMed

    Ninomiya, Yoshiyuki; Yoshimoto, Atsushi

    2008-03-01

    Due to competition among individual trees and other exogenous factors that change the growth environment, each tree grows following its own growth trend with some structural changes in growth over time. In the present article, a new method is proposed to detect a structural change in the growth process. We formulate the method as a simple statistical test for signal detection without constructing any specific model for the structural change. To evaluate the p-value of the test, the tube method is developed because the regular distribution theory is insufficient. Using two sets of tree diameter growth data sampled from planted forest stands of Cryptomeria japonica in Japan, we conduct an analysis of identifying the effect of thinning on the growth process as a structural change. Our results demonstrate that the proposed method is useful to identify the structural change caused by thinning. We also provide the properties of the method in terms of the size and power of the test. PMID:17608782

  8. Measurement of Plethysmogram and Statistical Method for Analysis

    NASA Astrophysics Data System (ADS)

    Shimizu, Toshihiro

    The plethysmogram is measured at different points on the human body by using a photo interrupter, and it depends sensitively on the physical and mental state of the body. In this paper the statistical methods of data analysis are investigated to discuss the dependence of the plethysmogram on stress and aging. The first is a representation method based on the return map, which provides useful information on the waveform, the fluctuation in phase and the fluctuation in amplitude. The return map method makes it possible to understand the fluctuation of the plethysmogram in amplitude and in phase more clearly and globally than the conventional power spectrum method. The second is the Lissajous plot and the correlation function, used to analyze the phase difference between the plethysmograms of the right and left finger tips. The third is the R-index, from which we can estimate "the age of the blood flow". The R-index is defined by the global character of the plethysmogram, which is different from the usual APG index. The stress- and age-dependence of the plethysmogram is discussed using these methods.

  9. Statistical methods for the detection and analysis of radioactive sources

    NASA Astrophysics Data System (ADS)

    Klumpp, John

    We consider four topics from areas of radioactive statistical analysis in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
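
    For the count-rate portion, the standard conjugate construction is a Gamma prior with a Poisson likelihood, which gives a closed-form posterior. The sketch below uses this textbook Poisson-Gamma update for illustration; the prior values and measurement are invented, and the review's own models (sequential updating, moving background, zero inflation) go well beyond this.

```python
# Illustrative Bayesian count-rate update: with a Gamma(alpha, beta) prior on the rate
# and a Poisson likelihood, the posterior after n counts in time t is Gamma(alpha+n, beta+t).
from scipy.stats import gamma

alpha_prior, beta_prior = 1.0, 1.0       # weakly informative prior (rate in counts/s)
n_counts, live_time_s = 57, 10.0         # hypothetical measurement

alpha_post = alpha_prior + n_counts
beta_post = beta_prior + live_time_s
posterior = gamma(a=alpha_post, scale=1.0 / beta_post)

print(f"posterior mean rate = {posterior.mean():.2f} counts/s")
print(f"95% credible interval = {posterior.interval(0.95)}")
```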

  10. Jet Noise Diagnostics Supporting Statistical Noise Prediction Methods

    NASA Technical Reports Server (NTRS)

    Bridges, James E.

    2006-01-01

    compared against measurements of mean and rms velocity statistics over a range of jet speeds and temperatures. Models for flow parameters used in the acoustic analogy, most notably the space-time correlations of velocity, have been compared against direct measurements, and modified to better fit the observed data. These measurements have been extremely challenging for hot, high speed jets, and represent a sizeable investment in instrumentation development. As an intermediate check that the analysis is predicting the physics intended, phased arrays have been employed to measure source distributions for a wide range of jet cases. And finally, careful far-field spectral directivity measurements have been taken for final validation of the prediction code. Examples of each of these experimental efforts will be presented. The main result of these efforts is a noise prediction code, named JeNo, which is in middevelopment. JeNo is able to consistently predict spectral directivity, including aft angle directivity, for subsonic cold jets of most geometries. Current development on JeNo is focused on extending its capability to hot jets, requiring inclusion of a previously neglected second source associated with thermal fluctuations. A secondary result of the intensive experimentation is the archiving of various flow statistics applicable to other acoustic analogies and to development of time-resolved prediction methods. These will be of lasting value as we look ahead at future challenges to the aeroacoustic experimentalist.

  11. Development and testing of improved statistical wind power forecasting methods.

    SciTech Connect

    Mendes, J.; Bessa, R.J.; Keko, H.; Sumaili, J.; Miranda, V.; Ferreira, C.; Gama, J.; Botterud, A.; Zhou, Z.; Wang, J.

    2011-12-06

    (with spatial and/or temporal dependence). Statistical approaches to uncertainty forecasting basically consist of estimating the uncertainty based on observed forecasting errors. Quantile regression (QR) is currently a commonly used approach in uncertainty forecasting. In Chapter 3, we propose new statistical approaches to the uncertainty estimation problem by employing kernel density forecast (KDF) methods. We use two estimators in both offline and time-adaptive modes, namely, the Nadaraya-Watson (NW) and Quantile-copula (QC) estimators. We conduct detailed tests of the new approaches using QR as a benchmark. One of the major issues in wind power generation is sudden, large changes of wind power output over a short period of time, namely ramping events. In Chapter 4, we perform a comparative study of existing definitions and methodologies for ramp forecasting. We also introduce a new probabilistic method for ramp event detection. The method starts with a stochastic algorithm that generates wind power scenarios, which are passed through a high-pass filter for ramp detection and estimation of the likelihood that ramp events will happen. The report is organized as follows: Chapter 2 presents the results of the application of ITL training criteria to deterministic WPF; Chapter 3 reports the study on probabilistic WPF, including new contributions to wind power uncertainty forecasting; Chapter 4 presents a new method to predict and visualize ramp events, comparing it with state-of-the-art methodologies; Chapter 5 briefly summarizes the main findings and contributions of this report.
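
    The quantile-regression benchmark mentioned above can be sketched with statsmodels' quantreg: one model per quantile of observed power conditional on the point forecast, yielding a predictive interval. The data, quantile levels and variable names below are synthetic assumptions, not the report's configuration.

```python
# Hedged sketch of quantile regression for wind power uncertainty: fit the 5%, 50% and
# 95% conditional quantiles of observed power given the point forecast.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(10)
point_forecast = rng.uniform(0, 1, 1000)                       # normalized power forecast
observed = np.clip(point_forecast + rng.normal(0, 0.08 + 0.1 * point_forecast, 1000), 0, 1)
df = pd.DataFrame({"obs": observed, "fcst": point_forecast})

quantile_models = {
    q: smf.quantreg("obs ~ fcst", df).fit(q=q) for q in (0.05, 0.50, 0.95)
}
new = pd.DataFrame({"fcst": [0.6]})
for q, res in quantile_models.items():
    print(f"q={q:.2f}: predicted power = {res.predict(new).iloc[0]:.3f}")
```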

  12. Using the statistical analysis method to assess the landslide susceptibility

    NASA Astrophysics Data System (ADS)

    Chan, Hsun-Chuan; Chen, Bo-An; Wen, Yo-Ting

    2015-04-01

    This study assessed the landslide susceptibility in the Jing-Shan River upstream watershed, central Taiwan. The landslide inventories during typhoons Toraji in 2001, Mindulle in 2004, Kalmaegi and Sinlaku in 2008, Morakot in 2009, and the 0719 rainfall event in 2011, which were established by the Taiwan Central Geological Survey, were used as landslide data. This study aims to assess the landslide susceptibility using different statistical methods, including logistic regression, the instability index method and support vector machines (SVM). After the evaluations, elevation, slope, slope aspect, lithology, terrain roughness, slope roughness, plan curvature, profile curvature, total curvature and average rainfall were chosen as the landslide factors. The validity of the three established models was further examined by the receiver operating characteristic curve. The result of logistic regression showed that the factors of terrain roughness and slope roughness had a stronger impact on the susceptibility value. The instability index method showed that the factors of terrain roughness and lithology had a stronger impact on the susceptibility value. The use of the instability index method may, however, lead to underestimation near the river side. In addition, the instability index method raises a potential issue regarding the number of factor classes: increasing the number of classes may cause an excessive variation coefficient of the factor, while decreasing it may place a large range of nearby cells into the same susceptibility level. Finally, the receiver operating characteristic curve was used to compare the three models. SVM is preferred over the other methods for assessing landslide susceptibility. Moreover, SVM performs similarly to logistic regression in recognizing the medium-high and high susceptibility classes.
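
    One of the models compared above, logistic regression validated with the area under the ROC curve, can be sketched as below. The factor values, coefficients and landslide labels are synthetic; a real study would use the mapped inventory and terrain derivatives.

```python
# Minimal sketch: logistic regression on landslide factors, validated with ROC AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(11)
n = 2000
slope = rng.uniform(0, 60, n)                 # degrees
roughness = rng.uniform(0, 1, n)
rainfall = rng.uniform(100, 800, n)           # mm per event

logit = -6 + 0.06 * slope + 2.0 * roughness + 0.004 * rainfall
landslide = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([slope, roughness, rainfall])
X_tr, X_te, y_tr, y_te = train_test_split(X, landslide, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"ROC AUC on held-out cells: {auc:.2f}")
```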

  13. A Comparison of Three Presentation Methods of Teaching Statistics.

    ERIC Educational Resources Information Center

    Packard, Abbot L.; And Others

    The use of computer assisted instruction in teaching statistical concepts was studied. Students enrolled in classes in education who lacked statistical experience participated. Knowledge questions for pretest and posttest assessments were prepared from a pool of questions used in the statistics department of the College of Education at Virginia…

  14. Systematic variational method for statistical nonlinear state and parameter estimation.

    PubMed

    Ye, Jingxin; Rey, Daniel; Kadakia, Nirag; Eldridge, Michael; Morone, Uriel I; Rozdeba, Paul; Abarbanel, Henry D I; Quinn, John C

    2015-11-01

    In statistical data assimilation one evaluates the conditional expected values, conditioned on measurements, of interesting quantities on the path of a model through observation and prediction windows. This often requires working with very high dimensional integrals in the discrete time descriptions of the observations and model dynamics, which become functional integrals in the continuous-time limit. Two familiar methods for performing these integrals include (1) Monte Carlo calculations and (2) variational approximations using the method of Laplace plus perturbative corrections to the dominant contributions. We attend here to aspects of the Laplace approximation and develop an annealing method for locating the variational path satisfying the Euler-Lagrange equations that comprises the major contribution to the integrals. This begins with the identification of the minimum action path starting with a situation where the model dynamics is totally unresolved in state space, and the consistent minimum of the variational problem is known. We then proceed to slowly increase the model resolution, seeking to remain in the basin of the minimum action path, until a path that gives the dominant contribution to the integral is identified. After a discussion of some general issues, we give examples of the assimilation process for some simple, instructive models from the geophysical literature. Then we explore a slightly richer model of the same type with two distinct time scales. This is followed by a model characterizing the biophysics of individual neurons. PMID:26651756

  15. Statistically qualified neuro-analytic failure detection method and system

    DOEpatents

    Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.

    2002-03-02

    An apparatus and method for monitoring a process involve development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two stages: deterministic model adaptation and stochastic model modification of the deterministic model adaptation. Deterministic model adaptation involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation error minimization technique. Stochastic model modification involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system. Illustrative of the method and apparatus, the method is applied to a peristaltic pump system.
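
    The patent mentions validation with sequential probability ratio tests; the sketch below shows a generic Wald SPRT applied to Gaussian model residuals to decide between "no fault" and "fault". The residual distributions, the hypothesized mean shift, and the error rates are illustrative assumptions, not the patented system.

      import numpy as np

      def sprt(residuals, sigma, mu1, alpha=0.01, beta=0.01):
          """Test H0: residual mean 0 vs H1: mean mu1; return decision and stopping index."""
          upper = np.log((1 - beta) / alpha)     # accept H1 (fault) above this bound
          lower = np.log(beta / (1 - alpha))     # accept H0 (normal) below this bound
          llr = 0.0
          for k, r in enumerate(residuals):
              llr += (mu1 * r - 0.5 * mu1 ** 2) / sigma ** 2   # Gaussian log-likelihood ratio increment
              if llr >= upper:
                  return "fault detected", k
              if llr <= lower:
                  return "no fault", k
          return "undecided", len(residuals) - 1

      rng = np.random.default_rng(3)
      normal = rng.normal(0.0, 0.1, size=200)    # healthy residuals (assumed)
      faulty = rng.normal(0.25, 0.1, size=200)   # residuals after a process change (assumed)
      print(sprt(normal, sigma=0.1, mu1=0.25))
      print(sprt(faulty, sigma=0.1, mu1=0.25))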

  16. Reexamination of Statistical Methods for Comparative Anatomy: Examples of Its Application and Comparisons with Other Parametric and Nonparametric Statistics.

    PubMed

    Aversi-Ferreira, Roqueline A G M F; Nishijo, Hisao; Aversi-Ferreira, Tales Alexandre

    2015-01-01

    Various statistical methods have been published for comparative anatomy. However, few studies compared parametric and nonparametric statistical methods. Moreover, some previous studies using statistical method for comparative anatomy (SMCA) proposed the formula for comparison of groups of anatomical structures (multiple structures) among different species. The present paper described the usage of SMCA and compared the results by SMCA with those by parametric test (t-test) and nonparametric analyses (cladistics) of anatomical data. In conclusion, the SMCA can offer a more exact and precise way to compare single and multiple anatomical structures across different species, which requires analyses of nominal features in comparative anatomy. PMID:26413553

  17. Reexamination of Statistical Methods for Comparative Anatomy: Examples of Its Application and Comparisons with Other Parametric and Nonparametric Statistics

    PubMed Central

    Aversi-Ferreira, Roqueline A. G. M. F.; Nishijo, Hisao; Aversi-Ferreira, Tales Alexandre

    2015-01-01

    Various statistical methods have been published for comparative anatomy. However, few studies compared parametric and nonparametric statistical methods. Moreover, some previous studies using statistical method for comparative anatomy (SMCA) proposed the formula for comparison of groups of anatomical structures (multiple structures) among different species. The present paper described the usage of SMCA and compared the results by SMCA with those by parametric test (t-test) and nonparametric analyses (cladistics) of anatomical data. In conclusion, the SMCA can offer a more exact and precise way to compare single and multiple anatomical structures across different species, which requires analyses of nominal features in comparative anatomy. PMID:26413553

  18. Empirical Laws in Economics Uncovered Using Methods in Statistical Mechanics

    NASA Astrophysics Data System (ADS)

    Stanley, H. Eugene

    2001-06-01

    In recent years, statistical physicists and computational physicists have determined that physical systems which consist of a large number of interacting particles obey universal "scaling laws" that serve to demonstrate an intrinsic self-similarity operating in such systems. Further, the parameters appearing in these scaling laws appear to be largely independent of the microscopic details. Since economic systems also consist of a large number of interacting units, it is plausible that scaling theory can be usefully applied to economics. To test this possibility using realistic data sets, a number of scientists have begun analyzing economic data using methods of statistical physics [1]. We have found evidence for scaling (and data collapse), as well as universality, in various quantities, and these recent results will be reviewed in this talk--starting with the most recent study [2]. We also propose models that may lead to some insight into these phenomena. These results will be discussed, as well as the overall rationale for why one might expect scaling principles to hold for complex economic systems. This work on which this talk is based is supported by BP, and was carried out in collaboration with L. A. N. Amaral S. V. Buldyrev, D. Canning, P. Cizeau, X. Gabaix, P. Gopikrishnan, S. Havlin, Y. Lee, Y. Liu, R. N. Mantegna, K. Matia, M. Meyer, C.-K. Peng, V. Plerou, M. A. Salinger, and M. H. R. Stanley. [1.] See, e.g., R. N. Mantegna and H. E. Stanley, Introduction to Econophysics: Correlations & Complexity in Finance (Cambridge University Press, Cambridge, 1999). [2.] P. Gopikrishnan, B. Rosenow, V. Plerou, and H. E. Stanley, "Identifying Business Sectors from Stock Price Fluctuations," e-print cond-mat/0011145; V. Plerou, P. Gopikrishnan, L. A. N. Amaral, X. Gabaix, and H. E. Stanley, "Diffusion and Economic Fluctuations," Phys. Rev. E (Rapid Communications) 62, 3023-3026 (2000); P. Gopikrishnan, V. Plerou, X. Gabaix, and H. E. Stanley, "Statistical Properties of

  19. Statistical methods for detecting periodic fragments in DNA sequence data

    PubMed Central

    2011-01-01

    Background Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT), integer period discrete Fourier transform (IPDFT) and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~ 21% and ~ 19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS). Conclusions For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of period detection in the
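
    A bare-bones version of the kind of test discussed above can be written in a few lines: compute the DFT power at (approximately) period 10 for a 0/1 dinucleotide indicator and assess it against a blockwise-bootstrap null. The synthetic sequence, the block length, and the exact bootstrap scheme are illustrative assumptions rather than the paper's procedure.

      import numpy as np

      def power_at_period(x, period):
          x = x - x.mean()
          n = len(x)
          spec = np.abs(np.fft.rfft(x)) ** 2 / n
          k = int(round(n / period))            # DFT bin nearest to the target period
          return spec[k]

      def block_bootstrap(x, block_len, rng):
          n = len(x)
          starts = rng.integers(0, n - block_len, size=n // block_len + 1)
          return np.concatenate([x[s:s + block_len] for s in starts])[:n]

      rng = np.random.default_rng(4)
      n = 2000
      # Synthetic indicator: baseline dinucleotide rate plus a weak period-10 component
      p = 0.25 + 0.08 * (np.arange(n) % 10 == 0)
      x = rng.binomial(1, p)

      obs = power_at_period(x, 10)
      null = np.array([power_at_period(block_bootstrap(x, 7, rng), 10) for _ in range(999)])
      p_value = (1 + np.sum(null >= obs)) / (len(null) + 1)
      print("period-10 power:", round(obs, 3), "bootstrap p-value:", round(p_value, 3))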

  20. System Synthesis in Preliminary Aircraft Design using Statistical Methods

    NASA Technical Reports Server (NTRS)

    DeLaurentis, Daniel; Mavris, Dimitri N.; Schrage, Daniel P.

    1996-01-01

    This paper documents an approach to conceptual and preliminary aircraft design in which system synthesis is achieved using statistical methods, specifically design of experiments (DOE) and response surface methodology (RSM). These methods are employed in order to more efficiently search the design space for optimum configurations. In particular, a methodology incorporating three uses of these techniques is presented. First, response surface equations are formed which represent aerodynamic analyses, in the form of regression polynomials, which are more sophisticated than generally available in early design stages. Next, a regression equation for an overall evaluation criterion is constructed for the purpose of constrained optimization at the system level. This optimization, though achieved in an innovative way, is still traditional in that it is a point design solution. The methodology put forward here remedies this by introducing uncertainty into the problem, resulting in solutions which are probabilistic in nature. DOE/RSM is used for the third time in this setting. The process is demonstrated through a detailed aero-propulsion optimization of a high speed civil transport. Fundamental goals of the methodology, then, are to introduce higher fidelity disciplinary analyses to the conceptual aircraft synthesis and provide a roadmap for transitioning from point solutions to probabilistic designs (and eventually robust ones).

  1. System Synthesis in Preliminary Aircraft Design Using Statistical Methods

    NASA Technical Reports Server (NTRS)

    DeLaurentis, Daniel; Mavris, Dimitri N.; Schrage, Daniel P.

    1996-01-01

    This paper documents an approach to conceptual and early preliminary aircraft design in which system synthesis is achieved using statistical methods, specifically Design of Experiments (DOE) and Response Surface Methodology (RSM). These methods are employed in order to more efficiently search the design space for optimum configurations. In particular, a methodology incorporating three uses of these techniques is presented. First, response surface equations are formed which represent aerodynamic analyses, in the form of regression polynomials, which are more sophisticated than generally available in early design stages. Next, a regression equation for an Overall Evaluation Criterion is constructed for the purpose of constrained optimization at the system level. This optimization, though achieved in an innovative way, is still traditional in that it is a point design solution. The methodology put forward here remedies this by introducing uncertainty into the problem, resulting in solutions which are probabilistic in nature. DOE/RSM is used for the third time in this setting. The process is demonstrated through a detailed aero-propulsion optimization of a High Speed Civil Transport. Fundamental goals of the methodology, then, are to introduce higher fidelity disciplinary analyses to the conceptual aircraft synthesis and provide a roadmap for transitioning from point solutions to probabilistic designs (and eventually robust ones).
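
    As a small illustration of the response-surface step, the sketch below fits a second-order regression polynomial to the output of a placeholder "analysis" evaluated on a three-level factorial design. The two coded design variables and the analysis function are illustrative assumptions, not the aero-propulsion analyses used in the paper.

      import numpy as np
      from itertools import product

      def expensive_analysis(x1, x2):
          # Placeholder for a disciplinary analysis (e.g., an aerodynamic code)
          return 3.0 + 1.5 * x1 - 2.0 * x2 + 0.8 * x1 * x2 + 0.5 * x1 ** 2

      # Three-level full factorial design on coded variables in [-1, 1]
      levels = [-1.0, 0.0, 1.0]
      X = np.array(list(product(levels, levels)))
      y = np.array([expensive_analysis(x1, x2) for x1, x2 in X])

      # Second-order model terms: 1, x1, x2, x1*x2, x1^2, x2^2
      A = np.column_stack([
          np.ones(len(X)), X[:, 0], X[:, 1],
          X[:, 0] * X[:, 1], X[:, 0] ** 2, X[:, 1] ** 2,
      ])
      coef, *_ = np.linalg.lstsq(A, y, rcond=None)
      print("fitted response-surface coefficients:", np.round(coef, 3))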

  2. Methods in probability and statistical inference. Final report, June 15, 1975-June 30, 1979. [Dept. of Statistics, Univ. of Chicago

    SciTech Connect

    Wallace, D L; Perlman, M D

    1980-06-01

    This report describes the research activities of the Department of Statistics, University of Chicago, during the period June 15, 1975 to July 30, 1979. Nine research projects are briefly described on the following subjects: statistical computing and approximation techniques in statistics; numerical computation of first passage distributions; probabilities of large deviations; combining independent tests of significance; small-sample efficiencies of tests and estimates; improved procedures for simultaneous estimation and testing of many correlations; statistical computing and improved regression methods; comparison of several populations; and unbiasedness in multivariate statistics. A description of the statistical consultation activities of the Department that are of interest to DOE, in particular, the scientific interactions between the Department and the scientists at Argonne National Laboratories, is given. A list of publications issued during the term of the contract is included.

  3. Hydrologic extremes - an intercomparison of multiple gridded statistical downscaling methods

    NASA Astrophysics Data System (ADS)

    Werner, Arelia T.; Cannon, Alex J.

    2016-04-01

    Gridded statistical downscaling methods are the main means of preparing climate model data to drive distributed hydrological models. Past work on the validation of climate downscaling methods has focused on temperature and precipitation, with less attention paid to the ultimate outputs from hydrological models. Also, as attention shifts towards projections of extreme events, downscaling comparisons now commonly assess methods in terms of climate extremes, but hydrologic extremes are less well explored. Here, we test the ability of gridded downscaling models to replicate historical properties of climate and hydrologic extremes, as measured in terms of temporal sequencing (i.e. correlation tests) and distributional properties (i.e. tests for equality of probability distributions). Outputs from seven downscaling methods - bias correction constructed analogues (BCCA), double BCCA (DBCCA), BCCA with quantile mapping reordering (BCCAQ), bias correction spatial disaggregation (BCSD), BCSD using minimum/maximum temperature (BCSDX), the climate imprint delta method (CI), and bias corrected CI (BCCI) - are used to drive the Variable Infiltration Capacity (VIC) model over the snow-dominated Peace River basin, British Columbia. Outputs are tested using split-sample validation on 26 climate extremes indices (ClimDEX) and two hydrologic extremes indices (3-day peak flow and 7-day peak flow). To characterize observational uncertainty, four atmospheric reanalyses are used as climate model surrogates and two gridded observational data sets are used as downscaling target data. The skill of the downscaling methods generally depended on reanalysis and gridded observational data set. However, CI failed to reproduce the distribution and BCSD and BCSDX the timing of winter 7-day low-flow events, regardless of reanalysis or observational data set. Overall, DBCCA passed the greatest number of tests for the ClimDEX indices, while BCCAQ, which is designed to more accurately resolve event

  4. Hydrologic extremes - an intercomparison of multiple gridded statistical downscaling methods

    NASA Astrophysics Data System (ADS)

    Werner, A. T.; Cannon, A. J.

    2015-06-01

    Gridded statistical downscaling methods are the main means of preparing climate model data to drive distributed hydrological models. Past work on the validation of climate downscaling methods has focused on temperature and precipitation, with less attention paid to the ultimate outputs from hydrological models. Also, as attention shifts towards projections of extreme events, downscaling comparisons now commonly assess methods in terms of climate extremes, but hydrologic extremes are less well explored. Here, we test the ability of gridded downscaling models to replicate historical properties of climate and hydrologic extremes, as measured in terms of temporal sequencing (i.e., correlation tests) and distributional properties (i.e., tests for equality of probability distributions). Outputs from seven downscaling methods - bias correction constructed analogues (BCCA), double BCCA (DBCCA), BCCA with quantile mapping reordering (BCCAQ), bias correction spatial disaggregation (BCSD), BCSD using minimum/maximum temperature (BCSDX), climate imprint delta method (CI), and bias corrected CI (BCCI) - are used to drive the Variable Infiltration Capacity (VIC) model over the snow-dominated Peace River basin, British Columbia. Outputs are tested using split-sample validation on 26 climate extremes indices (ClimDEX) and two hydrologic extremes indices (3 day peak flow and 7 day peak flow). To characterize observational uncertainty, four atmospheric reanalyses are used as climate model surrogates and two gridded observational datasets are used as downscaling target data. The skill of the downscaling methods generally depended on reanalysis and gridded observational dataset. However, CI failed to reproduce the distribution and BCSD and BCSDX the timing of winter 7 day low flow events, regardless of reanalysis or observational dataset. Overall, DBCCA passed the greatest number of tests for the ClimDEX indices, while BCCAQ, which is designed to more accurately resolve event
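
    The bias-correction building block shared by several of the methods listed above (e.g. BCSD, BCCAQ) is quantile mapping; a minimal sketch is shown below on synthetic precipitation-like series. The gamma-distributed series and the number of quantiles are illustrative assumptions.

      import numpy as np

      def quantile_map(model_hist, obs_hist, model_future, n_q=100):
          """Map model values onto the observed distribution via empirical quantiles."""
          q = np.linspace(0.0, 1.0, n_q)
          mq = np.quantile(model_hist, q)
          oq = np.quantile(obs_hist, q)
          # Pass each future model value through the model-to-observation quantile transfer
          return np.interp(model_future, mq, oq)

      rng = np.random.default_rng(6)
      obs_hist = rng.gamma(shape=2.0, scale=3.0, size=3000)      # "observed" precipitation
      model_hist = rng.gamma(shape=2.0, scale=4.0, size=3000)    # biased "model" precipitation
      model_future = rng.gamma(shape=2.0, scale=4.5, size=3000)

      corrected = quantile_map(model_hist, obs_hist, model_future)
      print("raw future mean:", round(model_future.mean(), 2),
            "bias-corrected mean:", round(corrected.mean(), 2))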

  5. A statistical method for draft tube pressure pulsation analysis

    NASA Astrophysics Data System (ADS)

    Doerfler, P. K.; Ruchonnet, N.

    2012-11-01

    Draft tube pressure pulsation (DTPP) in Francis turbines is composed of various components originating from different physical phenomena. These components may be separated because they differ by their spatial relationships and by their propagation mechanism. The first step for such an analysis was to distinguish between so-called synchronous and asynchronous pulsations; only approximately periodic phenomena could be described in this manner. However, less regular pulsations are always present, and these become important when turbines have to operate in the far off-design range, in particular at very low load. The statistical method described here makes it possible to separate the stochastic (random) component from the two traditional 'regular' components. It works in connection with the standard technique of model testing with several pressure signals measured in the draft tube cone. The difference between the individual signals and the averaged pressure signal, together with the coherence between the individual pressure signals, is used for the analysis. An example reveals that a generalized, non-periodic version of the asynchronous pulsation is important at low load.
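
    A toy version of the separation described above is sketched below: the average of several simultaneous cone pressure signals carries the synchronous pulsation, the deviations from that average carry the asynchronous and stochastic parts, and the coherence between deviations at different sensors indicates how much of the residual is organized rather than random. The synthetic signals, frequencies, and sensor layout are illustrative assumptions.

      import numpy as np
      from scipy.signal import coherence

      rng = np.random.default_rng(7)
      fs = 1000.0
      t = np.arange(0, 20.0, 1.0 / fs)
      sync = 0.8 * np.sin(2 * np.pi * 3.0 * t)          # plunging (synchronous) pulsation

      # Four cone sensors: a rotating (asynchronous) component appears with 90-degree phase steps
      signals = []
      for i in range(4):
          asyn = 0.5 * np.sin(2 * np.pi * 1.2 * t + i * np.pi / 2)
          noise = 0.3 * rng.normal(size=t.size)         # stochastic (random) component
          signals.append(sync + asyn + noise)
      signals = np.array(signals)

      synchronous = signals.mean(axis=0)                # rotating parts cancel in the average
      deviations = signals - synchronous                # asynchronous plus stochastic content

      f, coh = coherence(deviations[0], deviations[1], fs=fs, nperseg=4096)
      print("coherence near 1.2 Hz (organized part):", round(float(coh[np.argmin(np.abs(f - 1.2))]), 2))
      print("median coherence elsewhere (random part):", round(float(np.median(coh)), 2))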

  6. Standardization of methods for extracting statistics from surface profile measurements

    NASA Astrophysics Data System (ADS)

    Takacs, Peter Z.

    2014-08-01

    Surface profilers and optical interferometers produce 2D maps of surface and wavefront topography. Traditional standards and methods for characterizing the properties of these surfaces use coordinate space representations of the surface topography. The computing power available today in modest personal computers makes it easy to transform into frequency space and apply well-known signal processing techniques to analyze the data. The Power Spectral Density (PSD) function of the surface height distribution is a powerful tool to assess the quality and characteristics of the surface in question. In order to extract useful information about the spectral distribution of surface roughness or mid-spatial frequency error over a particular spatial frequency band, it is necessary to pre-process the data by first detrending the surface figure terms and then applying a window function before computing the PSD. This process eliminates discontinuities at the borders of the profile that would otherwise produce large amounts of spurious power that would mask the true nature of the surface texture. This procedure is now part of a new draft standard that is being adopted by the US OEOSC for analysis of the statistics of optical surfaces, OP1.005. Illustrations of the usefulness of these procedures will be presented.
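
    The pre-processing chain described above (detrend, window, PSD) is easy to prototype; the sketch below applies it to a synthetic profile. The profile model, the Hann window, and the PSD normalization are illustrative assumptions and do not reproduce the draft standard's exact prescriptions.

      import numpy as np

      rng = np.random.default_rng(8)
      n, dx = 1024, 1e-3                     # 1024 samples, 1 mm spacing (assumed)
      x = np.arange(n) * dx
      # Synthetic profile: tilt and curvature (figure terms) plus nanometre-scale roughness
      profile = 5e-6 * x + 2e-4 * (x - x.mean()) ** 2 + 1e-8 * rng.normal(size=n)

      # Detrend: remove a low-order polynomial fit (piston/tilt/power)
      residual = profile - np.polyval(np.polyfit(x, profile, deg=2), x)

      # Window to suppress edge discontinuities, then compute a one-sided PSD
      window = np.hanning(n)
      u = np.sum(window ** 2) / n            # window power correction factor
      psd = np.abs(np.fft.rfft(residual * window)) ** 2 * dx / (n * u)
      freqs = np.fft.rfftfreq(n, d=dx)       # spatial frequency (cycles per metre)

      band = (freqs > 1.0) & (freqs < 100.0) # a mid-spatial-frequency band (assumed)
      print("band-integrated PSD:", np.sum(psd[band]) * (freqs[1] - freqs[0]))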

  7. Determination of Reference Catalogs for Meridian Observations Using Statistical Method

    NASA Astrophysics Data System (ADS)

    Li, Z. Y.

    2014-09-01

    The meridian observational data are useful for developing high-precision planetary ephemerides of the solar system. These historical data are provided by the Jet Propulsion Laboratory (JPL) or the Institut De Mecanique Celeste Et De Calcul Des Ephemerides (IMCCE). However, we find that the reference systems (realized by the fundamental catalogs FK3 (Third Fundamental Catalogue), FK4 (Fourth Fundamental Catalogue), and FK5 (Fifth Fundamental Catalogue), or Hipparcos), to which the observations are referred, are not given explicitly for some sets of data. The incompleteness of information prevents us from eliminating the systematic effects due to the different fundamental catalogs. The purpose of this paper is to specify clearly the reference catalogs of these observations with the problems in their records by using the JPL DE421 ephemeris. The data for the corresponding planets in the geocentric celestial reference system (GCRS) obtained from the DE421 are transformed to the apparent places under different hypotheses regarding the reference catalogs. The validity of each hypothesis is then tested by two kinds of statistical quantities which are used to indicate the significance of the difference between the original and transformed data series. As a result, this method is proved to be effective for specifying the reference catalogs, and the missing information is determined unambiguously. Finally, these meridian data are transformed to the GCRS for further applications in the development of planetary ephemerides.

  8. Instruments, methods, statistics, plasmaphysical interpretation of type IIIb bursts

    NASA Astrophysics Data System (ADS)

    Urbarz, H. W.

    Type-IIIb solar bursts in the m-dkm band and the methods used to study them are characterized in a review of recent research. The value of high-resolution spectrographs (with effective apertures of 1000-100,000 sq m, frequency resolution 20 kHz, and time resolution 100 msec) in detecting and investigating type-IIIb bursts is emphasized, and the parameters of the most important instruments are listed in a table. Burst spectra, sources, polarization, flux, occurrence, and association with other types are discussed and illustrated with sample spectra, tables, and histograms. The statistics of observations made at Weissenau Observatory (Tuebingen, FRG) from August, 1978, through December, 1979, are considered in detail. Theories proposed to explain type-III and type-IIIb bursts are summarized, including frequency splitting (FS) of the Langmuir spectrum, FS during the transverse-wave conversion process, FS during propagation-effect transverse-wave escape, and discrete source regions with different f(p) values.

  9. Statistical comparison of random allocation methods in cancer clinical trials.

    PubMed

    Hagino, Atsushi; Hamada, Chikuma; Yoshimura, Isao; Ohashi, Yasuo; Sakamoto, Junichi; Nakazato, Hiroaki

    2004-12-01

    The selection of a trial design is an important issue in the planning of clinical trials. One of the most important considerations in trial design is the method of treatment allocation and the appropriate analysis plan corresponding to the design. In this article, we conducted computer simulations using the actual data from 2158 rectal cancer patients enrolled in the surgery-alone group from seven randomized controlled trials in Japan to compare the performance of three allocation methods, simple randomization, stratified randomization, and minimization, in relatively small-scale trials (total number of patients in the two groups of 50, 100, 150, or 200). The degree of imbalance in prognostic factors between groups was evaluated by changing the allocation probability of minimization from 1.00 to 0.70 by 0.05. The simulation demonstrated that minimization performs best in ensuring balance in the number of patients between groups and in prognostic factors. Moreover, to keep the 1st percentile of the chi-square test p-value for prognostic-factor balance around 0.50, the allocation probability of minimization was required to be set to 0.95 for 50, 0.80 for 100, 0.75 for 150 and 0.70 for 200 patients. When the sample size was larger, sufficient balance could be achieved even when the allocation probability was reduced. The simulation using actual data demonstrated that unadjusted tests for the allocation factors resulted in conservative type I errors when dynamic allocation, such as minimization, was used. In contrast, adjusted tests for allocation factors as covariates brought type I errors closer to the nominal significance level and provided slightly higher power. In conclusion, both the statistical and clinical validity of minimization was demonstrated in our study.
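
    The following sketch implements a basic minimization (dynamic allocation) rule of the Pocock-Simon type with an adjustable allocation probability, which is the kind of procedure the simulation study varies. The prognostic factors, the marginal imbalance measure, and the allocation probability are illustrative assumptions, not the trial data used in the paper.

      import numpy as np

      def minimization_assign(counts, patient_factors, p_alloc, rng):
          """counts[arm][factor][level] holds current marginal totals for the two arms."""
          imbalance = []
          for arm in (0, 1):
              total = 0
              for f, level in enumerate(patient_factors):
                  trial = [counts[a][f][level] + (1 if a == arm else 0) for a in (0, 1)]
                  total += abs(trial[0] - trial[1])
              imbalance.append(total)
          if imbalance[0] == imbalance[1]:
              return int(rng.integers(2))                     # tie: pure randomization
          preferred = int(imbalance[1] < imbalance[0])        # arm minimizing imbalance
          return preferred if rng.random() < p_alloc else 1 - preferred

      rng = np.random.default_rng(9)
      n_levels = [2, 3, 2]                                    # e.g. sex, stage, centre size (assumed)
      counts = [[[0] * k for k in n_levels] for _ in (0, 1)]
      arms = []
      for _ in range(100):
          patient = [int(rng.integers(k)) for k in n_levels]
          arm = minimization_assign(counts, patient, p_alloc=0.80, rng=rng)
          for f, level in enumerate(patient):
              counts[arm][f][level] += 1
          arms.append(arm)
      print("patients per arm:", arms.count(0), arms.count(1))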

  10. Statistics.

    PubMed

    1993-02-01

    In 1984, 99% of abortions conducted in Bombay, India, were of female fetuses. In 1986-87, 30,000-50,000 female fetuses were aborted in India. In 1987-88, 7 Delhi clinics conducted 13,000 sex determination tests. Thus, discrimination against females begins before birth in India. Some states (Maharashtra, Goa, and Gujarat) have drafted legislation to prevent the use of prenatal diagnostic tests (e.g., ultrasonography) for sex determination purposes. Families make decisions about an infant's nutrition based on the infant's sex, so it is not surprising to see a higher incidence of morbidity among girls than boys (e.g., for respiratory infections in 1985, 55.5% vs. 27.3%). Consequently, they are more likely to die than boys. Even though vasectomy is simpler and safer than tubectomy, the government promotes female sterilizations. The percentage of all sexual sterilizations being tubectomy has increased steadily from 84% to 94% (1986-90). Family planning programs focus on female contraceptive methods, despite the higher incidence of adverse health effects from female methods (e.g., IUD causes pain and heavy bleeding). Some women's advocates believe the effects to be so great that India should ban contraceptives and injectable contraceptives. The maternal mortality rate is quite high (460/100,000 live births), equaling a lifetime risk of 1:18 of a pregnancy-related death. 70% of these maternal deaths are preventable. Leading causes of maternal deaths in India are anemia, hemorrhage, eclampsia, sepsis, and abortion. Most pregnant women do not receive prenatal care. Untrained personnel attend about 70% of deliveries in rural areas and 29% in urban areas. Appropriate health services and other interventions would prevent the higher age-specific death rates for females between 0 and 35 years old. Even though the government does provide maternal and child health services, it needs to stop decreasing the resources allocated to health and start increasing them. PMID:12286355

  11. Teaching biology through statistics: application of statistical methods in genetics and zoology courses.

    PubMed

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math-biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology.

  12. Teaching Biology through Statistics: Application of Statistical Methods in Genetics and Zoology Courses

    PubMed Central

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A.

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math–biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology. PMID:21885822

  13. Cluster size statistic and cluster mass statistic: two novel methods for identifying changes in functional connectivity between groups or conditions.

    PubMed

    Ing, Alex; Schwarzbauer, Christian

    2014-01-01

    Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods--the cluster size statistic (CSS) and cluster mass statistic (CMS)--are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73% N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity.
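
    In the spirit of the cluster size statistic, the sketch below thresholds edge-wise group differences in synthetic connectivity data, takes the largest connected cluster of supra-threshold connections as the test statistic, and builds its family-wise null distribution by permuting group labels. The data dimensions, the edge-level t-test, and the cluster-forming threshold are illustrative assumptions, not the authors' implementation.

      import numpy as np
      from scipy import stats, sparse
      from scipy.sparse.csgraph import connected_components

      def max_cluster_size(conn_a, conn_b, n_nodes, thresh):
          t, _ = stats.ttest_ind(conn_a, conn_b, axis=0)      # edge-wise t statistics
          adj = np.zeros((n_nodes, n_nodes))
          iu = np.triu_indices(n_nodes, k=1)
          adj[iu] = np.abs(t) > thresh
          adj = adj + adj.T
          n_comp, labels = connected_components(sparse.csr_matrix(adj), directed=False)
          sizes = [adj[labels == c][:, labels == c].sum() / 2 for c in range(n_comp)]
          return max(sizes) if sizes else 0.0                 # cluster size = number of edges

      rng = np.random.default_rng(10)
      n_nodes, n_a, n_b = 30, 12, 12
      n_edges = np.triu_indices(n_nodes, k=1)[0].size
      conn_a = rng.normal(size=(n_a, n_edges))
      conn_b = rng.normal(size=(n_b, n_edges))
      conn_b[:, :40] += 1.5                                   # a connected block of altered edges

      observed = max_cluster_size(conn_a, conn_b, n_nodes, thresh=3.0)
      both = np.vstack([conn_a, conn_b])
      null = []
      for _ in range(500):                                    # permute group labels
          perm = rng.permutation(n_a + n_b)
          null.append(max_cluster_size(both[perm[:n_a]], both[perm[n_a:]], n_nodes, thresh=3.0))
      p = (1 + np.sum(np.array(null) >= observed)) / (len(null) + 1)
      print("largest cluster (edges):", observed, "FWE-corrected p:", round(p, 3))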

  14. Asbestos/NESHAP adequately wet guidance

    SciTech Connect

    Shafer, R.; Throwe, S.; Salgado, O.; Garlow, C.; Hoerath, E.

    1990-12-01

    The Asbestos NESHAP requires facility owners and/or operators involved in demolition and renovation activities to control emissions of particulate asbestos to the outside air because no safe concentration of airborne asbestos has ever been established. The primary method used to control asbestos emissions is to adequately wet the Asbestos Containing Material (ACM) with a wetting agent prior to, during and after demolition/renovation activities. The purpose of the document is to provide guidance to asbestos inspectors and the regulated community on how to determine if friable ACM is adequately wet as required by the Asbestos NESHAP.

  15. Debating Curricular Strategies for Teaching Statistics and Research Methods: What Does the Current Evidence Suggest?

    ERIC Educational Resources Information Center

    Barron, Kenneth E.; Apple, Kevin J.

    2014-01-01

    Coursework in statistics and research methods is a core requirement in most undergraduate psychology programs. However, is there an optimal way to structure and sequence methodology courses to facilitate student learning? For example, should statistics be required before research methods, should research methods be required before statistics, or…

  16. Basic Statistical Concepts and Methods for Earth Scientists

    USGS Publications Warehouse

    Olea, Ricardo A.

    2008-01-01

    INTRODUCTION Statistics is the science of collecting, analyzing, interpreting, modeling, and displaying masses of numerical data primarily for the characterization and understanding of incompletely known systems. Over the years, these objectives have led to a fair amount of analytical work to achieve, substantiate, and guide descriptions and inferences.

  17. CAI and CMI Methods for Teaching Business Statistics Using COMPENSTAT.

    ERIC Educational Resources Information Center

    Sanders, William V.

    COMPENSTAT, a menu-driven statistical program for IBM-compatible microcomputers, has two distinct versions: instructional and computational. The instructional version can be used by instructors as a classroom resource, and the computational version is used directly by students to calculate answers to problems. The software package is primarily…

  18. Accountability Indicators from the Viewpoint of Statistical Method.

    ERIC Educational Resources Information Center

    Jordan, Larry

    Few people seriously regard students as "products" coming off an educational assembly line, but notions about accountability and quality improvement in higher education are pervaded by manufacturing ideas and metaphors. Because numerical indicators of quality are inevitably expressed by trend lines or statistical control charts of some kind, they…

  19. Critical Realism and Statistical Methods--A Response to Nash

    ERIC Educational Resources Information Center

    Scott, David

    2007-01-01

    This article offers a defence of critical realism in the face of objections Nash (2005) makes to it in a recent edition of this journal. It is argued that critical and scientific realisms are closely related and that both are opposed to statistical positivism. However, the suggestion is made that scientific realism retains (from statistical…

  20. Statistical methods of combining information: Applications to sensor data fusion

    SciTech Connect

    Burr, T.

    1996-12-31

    This paper reviews some statistical approaches to combining information from multiple sources. Promising new approaches will be described, and potential applications to combining not-so-different data sources such as sensor data will be discussed. Experiences with one real data set are described.
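
    The abstract gives no algorithmic detail, but one classic example of statistically combining information from multiple sources is inverse-variance weighting of independent sensor estimates, sketched below with assumed sensor values and error variances.

      import numpy as np

      measurements = np.array([10.2, 9.7, 10.5])   # three sensors observing the same quantity (assumed)
      variances = np.array([0.4, 0.1, 0.9])        # assumed sensor error variances

      weights = 1.0 / variances                    # inverse-variance weights
      fused = np.sum(weights * measurements) / np.sum(weights)
      fused_var = 1.0 / np.sum(weights)
      print("fused estimate:", round(fused, 2), "+/-", round(np.sqrt(fused_var), 2))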

  1. Statistical Methods for Rapid Aerothermal Analysis and Design Technology

    NASA Technical Reports Server (NTRS)

    Morgan, Carolyn; DePriest, Douglas; Thompson, Richard (Technical Monitor)

    2002-01-01

    The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to establish statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The research work was focused on establishing the suitable mathematical/statistical models for these purposes. It is anticipated that the resulting models can be incorporated into a software tool to provide rapid, variable-fidelity, aerothermal environments to predict heating along an arbitrary trajectory. This work will support development of an integrated design tool to perform automated thermal protection system (TPS) sizing and material selection.

  2. [Methods of the multivariate statistical analysis of so-called polyetiological diseases using the example of coronary heart disease].

    PubMed

    Lifshits, A M

    1979-01-01

    General characteristics of multivariate statistical analysis (MSA) are given. Methodological premises and criteria for selecting an adequate MSA method applicable to pathoanatomical investigations of the epidemiology of multicausal diseases are presented. The experience of using MSA with computers and standard computing programs in studies of coronary artery atherosclerosis, based on material from 2060 autopsies, is described. The combined use of four MSA methods (sequential, correlational, regressional, and discriminant) made it possible to quantify the contribution of each of the eight examined risk factors to the development of atherosclerosis. The most important factors were found to be age, arterial hypertension, and heredity. Occupational hypodynamia and increased fatness were more important in men, whereas diabetes mellitus was more important in women. Taking this combination of risk factors into account with MSA methods provides a more reliable prognosis of the likelihood of fatal coronary heart disease than a prognosis based on the degree of coronary atherosclerosis alone.

  3. Modification of codes NUALGAM and BREMRAD. Volume 3: Statistical considerations of the Monte Carlo method

    NASA Technical Reports Server (NTRS)

    Firstenberg, H.

    1971-01-01

    The statistics are considered of the Monte Carlo method relative to the interpretation of the NUGAM2 and NUGAM3 computer code results. A numerical experiment using the NUGAM2 code is presented and the results are statistically interpreted.

  4. M&M's "The Method," and Other Ideas about Teaching Elementary Statistics.

    ERIC Educational Resources Information Center

    May, E. Lee Jr.

    2000-01-01

    Consists of a collection of observations about the teaching of the first course in elementary probability and statistics offered by many colleges and universities. Highlights the Goldberg Method for solving problems in probability and statistics. (Author/ASK)

  5. Statistical Methods and Tools for Hanford Staged Feed Tank Sampling

    SciTech Connect

    Fountain, Matthew S.; Brigantic, Robert T.; Peterson, Reid A.

    2013-10-01

    This report summarizes work conducted by Pacific Northwest National Laboratory to technically evaluate the current approach to staged feed sampling of high-level waste (HLW) sludge to meet waste acceptance criteria (WAC) for transfer from tank farms to the Hanford Waste Treatment and Immobilization Plant (WTP). The current sampling and analysis approach is detailed in the document titled Initial Data Quality Objectives for WTP Feed Acceptance Criteria, 24590-WTP-RPT-MGT-11-014, Revision 0 (Arakali et al. 2011). The goal of this current work is to evaluate and provide recommendations to support a defensible, technical and statistical basis for the staged feed sampling approach that meets WAC data quality objectives (DQOs).

  6. Statistical energy analysis response prediction methods for structural systems

    NASA Technical Reports Server (NTRS)

    Davis, R. F.

    1979-01-01

    The results of an effort to document methods for accomplishing response predictions for commonly encountered aerospace structural configurations is presented. Application of these methods to specified aerospace structure to provide sample analyses is included. An applications manual, with the structural analyses appended as example problems is given. Comparisons of the response predictions with measured data are provided for three of the example problems.

  7. Students' Attitudes toward Statistics across the Disciplines: A Mixed-Methods Approach

    ERIC Educational Resources Information Center

    Griffith, James D.; Adams, Lea T.; Gu, Lucy L.; Hart, Christian L.; Nichols-Whitehead, Penney

    2012-01-01

    Students' attitudes toward statistics were investigated using a mixed-methods approach including a discovery-oriented qualitative methodology among 684 undergraduate students across business, criminal justice, and psychology majors where at least one course in statistics was required. Students were asked about their attitudes toward statistics and…

  8. Counting Better? An Examination of the Impact of Quantitative Method Teaching on Statistical Anxiety and Confidence

    ERIC Educational Resources Information Center

    Chamberlain, John Martyn; Hillier, John; Signoretta, Paola

    2015-01-01

    This article reports the results of research concerned with students' statistical anxiety and confidence to both complete and learn to complete statistical tasks. Data were collected at the beginning and end of a quantitative method statistics module. Students recognised the value of numeracy skills but felt they were not necessarily relevant for…

  9. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates they can explain. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
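
    One plausible way to model such count data, sketched below, is a Poisson regression of the number of errors on workload and novelty scores with the number of files radiated as exposure. The synthetic data, the chosen covariates, and the Poisson GLM are illustrative assumptions, not the mission dataset or the exact models fitted in the paper.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(12)
      n = 120
      files_radiated = rng.integers(5, 60, size=n)          # exposure per reporting period (assumed)
      workload = rng.uniform(0, 1, size=n)                  # subjective workload score (assumed)
      novelty = rng.uniform(0, 1, size=n)                   # operational novelty score (assumed)
      rate = 0.01 * np.exp(1.2 * workload + 0.6 * novelty)  # assumed true error rate per file
      errors = rng.poisson(rate * files_radiated)

      X = sm.add_constant(np.column_stack([workload, novelty]))
      fit = sm.GLM(errors, X, family=sm.families.Poisson(),
                   offset=np.log(files_radiated)).fit()
      print("estimated coefficients (const, workload, novelty):", np.round(fit.params, 3))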

  10. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A general problem with using conventional multivariate statistical approaches for the classification of data of multiple types is that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability, but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.
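
    A minimal sketch of consensus-style combination is given below: a separate classifier is trained on each data source, each source is weighted by an estimated reliability (here its own validation accuracy), and the weighted posteriors are combined with a logarithmic opinion pool. The synthetic sources, the reliability measure, and the pooling rule are illustrative assumptions.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(13)
      n, n_classes = 1500, 3
      y = rng.integers(n_classes, size=n)
      # Two hypothetical sources of different quality (e.g. spectral vs. topographic data)
      src1 = rng.normal(loc=y[:, None] * 1.0, size=(n, 4))
      src2 = rng.normal(loc=y[:, None] * 0.3, size=(n, 4))    # noisier, less reliable source

      idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.4, random_state=0)
      weights, probs = [], []
      for src in (src1, src2):
          m = LogisticRegression(max_iter=1000).fit(src[idx_tr], y[idx_tr])
          weights.append(m.score(src[idx_te], y[idx_te]))     # reliability = validation accuracy
          probs.append(m.predict_proba(src[idx_te]))

      # Logarithmic opinion pool: reliability-weighted geometric mean of source posteriors
      w = np.array(weights) / np.sum(weights)
      consensus = np.argmax(sum(wi * np.log(p + 1e-12) for wi, p in zip(w, probs)), axis=1)
      print("source accuracies:", np.round(weights, 3),
            "consensus accuracy:", round(float(np.mean(consensus == y[idx_te])), 3))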

  11. APA's Learning Objectives for Research Methods and Statistics in Practice: A Multimethod Analysis

    ERIC Educational Resources Information Center

    Tomcho, Thomas J.; Rice, Diana; Foels, Rob; Folmsbee, Leah; Vladescu, Jason; Lissman, Rachel; Matulewicz, Ryan; Bopp, Kara

    2009-01-01

    Research methods and statistics courses constitute a core undergraduate psychology requirement. We analyzed course syllabi and faculty self-reported coverage of both research methods and statistics course learning objectives to assess the concordance with APA's learning objectives (American Psychological Association, 2007). We obtained a sample of…

  12. Best Practices in Teaching Statistics and Research Methods in the Behavioral Sciences [with CD-ROM

    ERIC Educational Resources Information Center

    Dunn, Dana S., Ed.; Smith, Randolph A., Ed.; Beins, Barney, Ed.

    2007-01-01

    This book provides a showcase for "best practices" in teaching statistics and research methods in two- and four-year colleges and universities. A helpful resource for teaching introductory, intermediate, and advanced statistics and/or methods, the book features coverage of: (1) ways to integrate these courses; (2) how to promote ethical conduct;…

  13. Relationship between Students' Scores on Research Methods and Statistics, and Undergraduate Project Scores

    ERIC Educational Resources Information Center

    Ossai, Peter Agbadobi Uloku

    2016-01-01

    This study examined the relationship between students' scores on Research Methods and Statistics and their undergraduate project scores in the final year. The purpose was to find out whether students matched knowledge of research with project-writing skill. The study adopted an ex post facto correlational design. Scores on Research Methods and Statistics for…

  14. Statistical classification methods for estimating ancestry using morphoscopic traits.

    PubMed

    Hefner, Joseph T; Ousley, Stephen D

    2014-07-01

    Ancestry assessments using cranial morphoscopic traits currently rely on subjective trait lists and observer experience rather than empirical support. The trait list approach, which is untested, unverified, and in many respects unrefined, is relied upon because of tradition and subjective experience. Our objective was to examine the utility of frequently cited morphoscopic traits and to explore eleven appropriate and novel methods for classifying an unknown cranium into one of several reference groups. Based on these results, artificial neural networks (aNNs), OSSA, support vector machines, and random forest models showed mean classification accuracies of at least 85%. The aNNs had the highest overall classification rate (87.8%), and random forests show the smallest difference between the highest (90.4%) and lowest (76.5%) classification accuracies. The results of this research demonstrate that morphoscopic traits can be successfully used to assess ancestry without relying only on the experience of the observer.
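
    A hedged sketch of the kind of classifier compared in the study is shown below: a random forest assigning group membership from ordinal trait scores, evaluated by cross-validation. The synthetic trait distributions and group structure are illustrative assumptions, not the craniometric reference data.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(14)
      n_per_group, n_traits, n_groups = 150, 8, 3
      X, y = [], []
      for g in range(n_groups):
          # Each group has its own distribution of ordinal trait scores (1-3), drawn at random here
          p = rng.dirichlet(np.ones(3), size=n_traits)
          scores = np.column_stack([rng.choice([1, 2, 3], size=n_per_group, p=p[t])
                                    for t in range(n_traits)])
          X.append(scores)
          y.append(np.full(n_per_group, g))
      X, y = np.vstack(X), np.concatenate(y)

      clf = RandomForestClassifier(n_estimators=300, random_state=0)
      acc = cross_val_score(clf, X, y, cv=5)
      print("cross-validated accuracy:", round(float(acc.mean()), 3))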

  15. Predicting sulphur and nitrogen deposition using a simple statistical method

    NASA Astrophysics Data System (ADS)

    Oulehle, Filip; Kopáček, Jiří; Chuman, Tomáš; Černohous, Vladimír; Hůnová, Iva; Hruška, Jakub; Krám, Pavel; Lachmanová, Zora; Navrátil, Tomáš; Štěpánek, Petr; Tesař, Miroslav; Evans, Christopher D.

    2016-09-01

    Data from 32 long-term (1994-2012) monitoring sites were used to assess temporal development and spatial variability of sulphur (S) and inorganic nitrogen (N) concentrations in bulk precipitation, and S in throughfall, for the Czech Republic. Despite large variance in absolute S and N concentration/deposition among sites, temporal coherence using standardised data (Z score) was demonstrated. Overall significant declines of SO4 concentration in bulk and throughfall precipitation, as well as NO3 and NH4 concentration in bulk precipitation, were observed. Median Z score values of bulk SO4, NO3 and NH4 and throughfall SO4 derived from observations and the respective emission rates of SO2, NOx and NH3 in the Czech Republic and Slovakia showed highly significant (p < 0.001) relationships. Using linear regression models, Z score values were calculated for the whole period 1900-2012 and then back-transformed to give estimates of concentration for the individual sites. Uncertainty associated with the concentration calculations was estimated as 20% for SO4 bulk precipitation, 22% for throughfall SO4, 18% for bulk NO3 and 28% for bulk NH4. The application of the method suggested that it is effective in the long-term reconstruction and prediction of S and N deposition at a variety of sites. Multiple regression modelling was used to extrapolate site characteristics (mean precipitation chemistry and its standard deviation) from monitored to unmonitored sites. Spatially distributed temporal development of S and N depositions were calculated since 1900. The method allows spatio-temporal estimation of the acid deposition in regions with extensive monitoring of precipitation chemistry.
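
    The reconstruction logic described above can be prototyped in a few lines: standardize each site's concentrations to Z scores, regress the median Z score on an emissions series, and back-transform predicted Z scores to site-level concentrations. All series in the sketch are synthetic stand-ins for the monitoring and emission data.

      import numpy as np

      rng = np.random.default_rng(15)
      years_obs = np.arange(1994, 2013)
      emissions_obs = np.linspace(1.8, 0.6, years_obs.size)        # assumed SO2 emission index
      site_means = rng.uniform(1.0, 4.0, size=10)                  # per-site mean SO4 (assumed)
      site_sds = 0.25 * site_means
      # Synthetic observations: concentrations scale with emissions plus site-level noise
      conc = (site_means[:, None] * (emissions_obs / emissions_obs.mean())
              + rng.normal(scale=site_sds[:, None], size=(10, years_obs.size)))

      z = (conc - conc.mean(axis=1, keepdims=True)) / conc.std(axis=1, keepdims=True)
      z_median = np.median(z, axis=0)

      # Linear relation between the median Z score and emissions, fitted on the monitored period
      slope, intercept = np.polyfit(emissions_obs, z_median, deg=1)

      # Back-transform a predicted Z score to concentration at each site for an unmonitored year
      emissions_1950 = 1.2                                         # assumed historical emission level
      z_pred = slope * emissions_1950 + intercept
      conc_1950 = z_pred * conc.std(axis=1) + conc.mean(axis=1)
      print("estimated 1950 SO4 at the first three sites:", np.round(conc_1950[:3], 2))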

  16. Refining developmental coordination disorder subtyping with multivariate statistical methods

    PubMed Central

    2012-01-01

    Background With a large number of potentially relevant clinical indicators, penalization and ensemble learning methods are thought to provide better predictive performance than usual linear predictors. However, little is known about how they perform in clinical studies where few cases are available. We used Random Forests and Partial Least Squares Discriminant Analysis to select the most salient impairments in Developmental Coordination Disorder (DCD) and assess patients' similarity. Methods We considered a wide-range testing battery for various neuropsychological and visuo-motor impairments which aimed at characterizing subtypes of DCD in a sample of 63 children. Classifiers were optimized on a training sample, and they were used subsequently to rank the 49 items according to a permuted measure of variable importance. In addition, subtyping consistency was assessed with cluster analysis on the training sample. Clustering fitness and predictive accuracy were evaluated on the validation sample. Results Both classifiers yielded a relevant subset of impairment items that altogether accounted for a sharp discrimination between three DCD subtypes: ideomotor, visual-spatial and constructional, and mixed dyspraxia. The main impairments that were found to characterize the three subtypes were: digital perception, imitations of gestures, digital praxia, lego blocks, visual spatial structuration, visual motor integration, coordination between upper and lower limbs. Classification accuracy was above 90% for all classifiers, and clustering fitness was found to be satisfactory. Conclusions Random Forests and Partial Least Squares Discriminant Analysis are useful tools to extract salient features from a large pool of correlated binary predictors, but also provide a way to assess individuals' proximities in a reduced factor space. Less than 15 neuro-visual, neuro-psychomotor and neuro-psychological tests might be required to provide a sensitive and specific diagnostic of DCD on this
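
    A small sketch of ranking test items by a permuted measure of variable importance, as in the feature-selection step described above, is given below with a random forest and scikit-learn's permutation_importance. The synthetic battery of items and the three hypothetical subtypes are illustrative assumptions.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.inspection import permutation_importance
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(16)
      n, n_items = 120, 20
      subtype = rng.integers(3, size=n)                      # three hypothetical subtypes
      X = rng.binomial(1, 0.3, size=(n, n_items)).astype(float)
      X[:, 0] += (subtype == 0) * 0.8                        # items 0-2 carry the subtype signal
      X[:, 1] += (subtype == 1) * 0.8
      X[:, 2] += (subtype == 2) * 0.8

      X_tr, X_te, y_tr, y_te = train_test_split(X, subtype, test_size=0.3, random_state=0)
      rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
      imp = permutation_importance(rf, X_te, y_te, n_repeats=50, random_state=0)
      ranking = np.argsort(imp.importances_mean)[::-1]
      print("most salient items:", ranking[:5], "test accuracy:", round(rf.score(X_te, y_te), 3))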

  17. A REVIEW OF STATISTICAL METHODS FOR THE METEOROLOGICAL ADJUSTMENT OF TROPOSPHERIC OZONE

    EPA Science Inventory

    A variety of statistical methods for meteorological adjustment of ozone have been proposed in the literature over the last decade for purposes of forecasting, estimating ozone time trends, or investigating underlying mechanisms from an empirical perspective. The methods can be...

  18. Statistical methods for the forensic analysis of striated tool marks

    SciTech Connect

    Hoeksema, Amy Beth

    2013-01-01

    In forensics, fingerprints can be used to uniquely identify suspects in a crime. Similarly, a tool mark left at a crime scene can be used to identify the tool that was used. However, the current practice of identifying matching tool marks involves visual inspection of marks by forensic experts which can be a very subjective process. As a result, declared matches are often successfully challenged in court, so law enforcement agencies are particularly interested in encouraging research in more objective approaches. Our analysis is based on comparisons of profilometry data, essentially depth contours of a tool mark surface taken along a linear path. In current practice, for stronger support of a match or non-match, multiple marks are made in the lab under the same conditions by the suspect tool. We propose the use of a likelihood ratio test to analyze the difference between a sample of comparisons of lab tool marks to a field tool mark, against a sample of comparisons of two lab tool marks. Chumbley et al. (2010) point out that the angle of incidence between the tool and the marked surface can have a substantial impact on the tool mark and on the effectiveness of both manual and algorithmic matching procedures. To better address this problem, we describe how the analysis can be enhanced to model the effect of tool angle and allow for angle estimation for a tool mark left at a crime scene. With sufficient development, such methods may lead to more defensible forensic analyses.
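
    As one concrete (and simplified) reading of the proposed comparison, the sketch below applies a normal-model likelihood ratio test to two samples of similarity scores, lab-versus-field comparisons against lab-versus-lab comparisons. The score distributions and the Gaussian model are illustrative assumptions and do not reproduce the paper's profilometry-based statistics.

      import numpy as np
      from scipy import stats

      def gaussian_lr_test(a, b):
          """Likelihood ratio test of equal means (common variance) for two samples."""
          pooled = np.concatenate([a, b])
          def loglik(x, mu, sigma):
              return np.sum(stats.norm.logpdf(x, loc=mu, scale=sigma))
          # H0: common mean and variance
          ll0 = loglik(pooled, pooled.mean(), pooled.std())
          # H1: separate means, common variance (MLE from the pooled residuals)
          resid = np.concatenate([a - a.mean(), b - b.mean()])
          ll1 = loglik(a, a.mean(), resid.std()) + loglik(b, b.mean(), resid.std())
          lr = 2 * (ll1 - ll0)
          return lr, stats.chi2.sf(lr, df=1)                 # one extra mean parameter under H1

      rng = np.random.default_rng(17)
      lab_vs_lab = rng.normal(0.80, 0.05, size=30)     # similarity of repeated lab marks (assumed)
      lab_vs_field = rng.normal(0.72, 0.05, size=15)   # similarity of lab marks to the field mark (assumed)
      print("LR statistic and p-value:", gaussian_lr_test(lab_vs_field, lab_vs_lab))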

  19. Deep Mixing in Stellar Variability: Improved Method, Statistics, and Applications

    NASA Astrophysics Data System (ADS)

    Arkhypov, Oleksiy V.; Khodachenko, Maxim L.; Lammer, Helmut; Güdel, Manuel; Lüftinger, Theresa; Johnstone, Colin P.

    2016-07-01

    The preliminary results on deep-mixing manifestations in stellar variability are tested using our improved method and extended data set. We measure the timescales τ m of the stochastic change in the spectral power of rotational harmonics with numbers m ≤ 3 in the light curves of 1361 main-sequence stars from the Kepler mission archive. We find that the gradient [{log}({τ }2)-{log}({τ }1)]/[{log}(2)-{log}(1)] has a histogram maximum at ‑2/3, demonstrating agreement with Kolmogorov’s theory of turbulence and therefore confirming the manifestation of deep mixing. The squared amplitudes of the first and second rotational harmonics, corrected for integral photometry distortion, also show a quasi-Kolmogorov character with spectral index ≈‑5/3. Moreover, the reduction of τ 1 and τ 2 to the timescales τ lam1 and τ lam2 of laminar convection in the deep stellar layers reveals the proximity of both τ lam1 and τ lam2 to the turnover time τ MLT of standard mixing length theory. Considering this result, we use the obtained stellar variability timescales instead of τ MLT in our analysis of the relation between stellar activity and the Rossby number P/τ MLT. Comparison of our diagrams with previous results and theoretical expectations shows that best-fit correspondence is achieved for τ lam1, which can therefore be used as an analog of τ MLT. This means that the laminar component (giant cells) of stellar turbulent convection indeed plays an important role in the physics of stars. Additionally, we estimate the diffusivity of magnetic elements in stellar photospheres.

  20. Deep Mixing in Stellar Variability: Improved Method, Statistics, and Applications

    NASA Astrophysics Data System (ADS)

    Arkhypov, Oleksiy V.; Khodachenko, Maxim L.; Lammer, Helmut; Güdel, Manuel; Lüftinger, Theresa; Johnstone, Colin P.

    2016-07-01

    The preliminary results on deep-mixing manifestations in stellar variability are tested using our improved method and extended data set. We measure the timescales τ m of the stochastic change in the spectral power of rotational harmonics with numbers m ≤ 3 in the light curves of 1361 main-sequence stars from the Kepler mission archive. We find that the gradient [{log}({τ }2)-{log}({τ }1)]/[{log}(2)-{log}(1)] has a histogram maximum at -2/3, demonstrating agreement with Kolmogorov’s theory of turbulence and therefore confirming the manifestation of deep mixing. The squared amplitudes of the first and second rotational harmonics, corrected for integral photometry distortion, also show a quasi-Kolmogorov character with spectral index ≈-5/3. Moreover, the reduction of τ 1 and τ 2 to the timescales τ lam1 and τ lam2 of laminar convection in the deep stellar layers reveals the proximity of both τ lam1 and τ lam2 to the turnover time τ MLT of standard mixing length theory. Considering this result, we use the obtained stellar variability timescales instead of τ MLT in our analysis of the relation between stellar activity and the Rossby number P/τ MLT. Comparison of our diagrams with previous results and theoretical expectations shows that best-fit correspondence is achieved for τ lam1, which can therefore be used as an analog of τ MLT. This means that the laminar component (giant cells) of stellar turbulent convection indeed plays an important role in the physics of stars. Additionally, we estimate the diffusivity of magnetic elements in stellar photospheres.

  1. Methods for estimating selected low-flow statistics and development of annual flow-duration statistics for Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Kula, Stephanie P.

    2013-01-01

    This report presents the results of a study to develop methods for estimating selected low-flow statistics and for determining annual flow-duration statistics for Ohio streams. Regression techniques were used to develop equations for estimating 10-year recurrence-interval (10-percent annual-nonexceedance probability) low-flow yields, in cubic feet per second per square mile, with averaging periods of 1, 7, 30, and 90 days, and for estimating the yield corresponding to the long-term 80-percent duration flow. These equations, which estimate low-flow yields as a function of a streamflow-variability index, are based on previously published low-flow statistics for 79 long-term continuous-record streamgages with at least 10 years of data collected through water year 1997. When applied to the calibration dataset, average absolute percent errors for the regression equations ranged from 15.8 to 42.0 percent. The regression results have been incorporated into the U.S. Geological Survey (USGS) StreamStats application for Ohio (http://water.usgs.gov/osw/streamstats/ohio.html) in the form of a yield grid to facilitate estimation of the corresponding streamflow statistics in cubic feet per second. Logistic-regression equations also were developed and incorporated into the USGS StreamStats application for Ohio for selected low-flow statistics to help identify occurrences of zero-valued statistics. Quantiles of daily and 7-day mean streamflows were determined for annual and annual-seasonal (September–November) periods for each complete climatic year of streamflow-gaging station record for 110 selected streamflow-gaging stations with 20 or more years of record. The quantiles determined for each climatic year were the 99-, 98-, 95-, 90-, 80-, 75-, 70-, 60-, 50-, 40-, 30-, 25-, 20-, 10-, 5-, 2-, and 1-percent exceedance streamflows. Selected exceedance percentiles of the annual-exceedance percentiles were subsequently computed and tabulated to help facilitate consideration of the
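
    The annual flow-duration statistics described above reduce to percentiles of daily flows within each climatic year. A minimal sketch, assuming a hypothetical year of daily mean streamflow values rather than actual Ohio streamgage records, is:

      import numpy as np

      # Hypothetical climatic year of daily mean streamflow (cubic feet per second).
      rng = np.random.default_rng(1)
      daily_flow = rng.lognormal(mean=4.0, sigma=1.0, size=365)

      # The P-percent exceedance flow is the flow exceeded P percent of the time,
      # i.e., the (100 - P)th percentile of the daily values.
      exceedance_levels = [99, 98, 95, 90, 80, 75, 70, 60, 50,
                           40, 30, 25, 20, 10, 5, 2, 1]
      duration_stats = {p: np.percentile(daily_flow, 100 - p) for p in exceedance_levels}

      for p, q in duration_stats.items():
          print(f"{p:2d}-percent exceedance flow: {q:8.1f} ft^3/s")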

  2. Using the Bootstrap Method for a Statistical Significance Test of Differences between Summary Histograms

    NASA Technical Reports Server (NTRS)

    Xu, Kuan-Man

    2006-01-01

    A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
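
    A minimal sketch of this kind of resampling test, using synthetic individual histograms in place of the cloud-object data and the Euclidean distance as the test statistic (the null distribution is built by resampling with replacement from the pooled set of individual histograms; the paper's exact resampling scheme may differ):

      import numpy as np

      def summary_hist(individual_hists):
          """Accumulate individual histograms into a normalized summary histogram."""
          h = individual_hists.sum(axis=0).astype(float)
          return h / h.sum()

      def euclidean(h1, h2):
          return np.sqrt(np.sum((h1 - h2) ** 2))

      rng = np.random.default_rng(2)
      # Synthetic individual histograms (one row per cloud object) for two size categories.
      group_a = rng.poisson(lam=5.0, size=(200, 20))
      group_b = rng.poisson(lam=5.5, size=(150, 20))

      observed = euclidean(summary_hist(group_a), summary_hist(group_b))

      pooled = np.vstack([group_a, group_b])
      n_a, n_b = len(group_a), len(group_b)
      boot = np.empty(2000)
      for i in range(boot.size):
          # Resample both groups from the pooled data under the null hypothesis
          # that the two size categories share one underlying distribution.
          a = pooled[rng.integers(0, len(pooled), size=n_a)]
          b = pooled[rng.integers(0, len(pooled), size=n_b)]
          boot[i] = euclidean(summary_hist(a), summary_hist(b))

      p_value = np.mean(boot >= observed)
      print(f"observed distance {observed:.4f}, bootstrap p-value {p_value:.3f}")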

  3. 34 CFR 85.900 - Adequate evidence.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Adequate evidence. 85.900 Section 85.900 Education Office of the Secretary, Department of Education GOVERNMENTWIDE DEBARMENT AND SUSPENSION (NONPROCUREMENT) Definitions § 85.900 Adequate evidence. Adequate evidence means information sufficient to support...

  4. 12 CFR 380.52 - Adequate protection.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 12 Banks and Banking 5 2012-01-01 2012-01-01 false Adequate protection. 380.52 Section 380.52... ORDERLY LIQUIDATION AUTHORITY Receivership Administrative Claims Process § 380.52 Adequate protection. (a... interest of a claimant, the receiver shall provide adequate protection by any of the following means:...

  5. 12 CFR 380.52 - Adequate protection.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 12 Banks and Banking 5 2013-01-01 2013-01-01 false Adequate protection. 380.52 Section 380.52... ORDERLY LIQUIDATION AUTHORITY Receivership Administrative Claims Process § 380.52 Adequate protection. (a... interest of a claimant, the receiver shall provide adequate protection by any of the following means:...

  6. 12 CFR 380.52 - Adequate protection.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 12 Banks and Banking 5 2014-01-01 2014-01-01 false Adequate protection. 380.52 Section 380.52... ORDERLY LIQUIDATION AUTHORITY Receivership Administrative Claims Process § 380.52 Adequate protection. (a... interest of a claimant, the receiver shall provide adequate protection by any of the following means:...

  7. 21 CFR 1404.900 - Adequate evidence.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 9 2010-04-01 2010-04-01 false Adequate evidence. 1404.900 Section 1404.900 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY GOVERNMENTWIDE DEBARMENT AND SUSPENSION (NONPROCUREMENT) Definitions § 1404.900 Adequate evidence. Adequate evidence means information sufficient...

  8. A Statistical Method of Evaluating the Pronunciation Proficiency/Intelligibility of English Presentations by Japanese Speakers

    ERIC Educational Resources Information Center

    Kibishi, Hiroshi; Hirabayashi, Kuniaki; Nakagawa, Seiichi

    2015-01-01

    In this paper, we propose a statistical evaluation method of pronunciation proficiency and intelligibility for presentations made in English by native Japanese speakers. We statistically analyzed the actual utterances of speakers to find combinations of acoustic and linguistic features with high correlation between the scores estimated by the…

  9. Strategies for Enhancing the Learning of Ecological Research Methods and Statistics by Tertiary Environmental Science Students

    ERIC Educational Resources Information Center

    Panizzon, D. L.; Boulton, A. J.

    2004-01-01

    To undertake rigorous research in biology and ecology, students must be able to pose testable hypotheses, design decisive studies, and analyse results using suitable statistics. Yet, few biology students excel in topics involving statistics and most attempt to evade optional courses in research methods. Over the last few years, we have developed…

  10. Physics-based statistical model and simulation method of RF propagation in urban environments

    DOEpatents

    Pao, Hsueh-Yuan; Dvorak, Steven L.

    2010-09-14

    A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient closed-form parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields from which predictions of communications capability may be made.

  11. Recommended methods for statistical analysis of data containing less-than-detectable measurements

    SciTech Connect

    Atwood, C.L.; Blackwood, L.G.; Harris, G.A.; Loehr, C.A.

    1990-09-01

    This report is a manual for statistical workers dealing with environmental measurements, when some of the measurements are not given exactly but are only reported as less than detectable. For some statistical settings with such data, many methods have been proposed in the literature, while for others few or none have been proposed. This report gives a recommended method in each of the settings considered. The body of the report gives a brief description of each recommended method. Appendix A gives example programs using the statistical package SAS, for those recommended methods that are nonstandard. Appendix B presents the methods that were compared and the reasons for selecting each recommended method, and explains any fine points that might be of interest. This is an interim version. Future revisions will complete the recommendations. 34 refs., 2 figs., 11 tabs.

  12. Probability of Detection (POD) as a statistical model for the validation of qualitative methods.

    PubMed

    Wehling, Paul; LaBudde, Robert A; Brunelle, Sharon L; Nelson, Maria T

    2011-01-01

    A statistical model is presented for use in validation of qualitative methods. This model, termed Probability of Detection (POD), harmonizes the statistical concepts and parameters between quantitative and qualitative method validation. POD characterizes method response with respect to concentration as a continuous variable. The POD model provides a tool for graphical representation of response curves for qualitative methods. In addition, the model allows comparisons between candidate and reference methods, and provides calculations of repeatability, reproducibility, and laboratory effects from collaborative study data. Single laboratory study and collaborative study examples are given.
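
    The central idea, modeling the probability of a positive result as a continuous function of concentration, can be sketched as follows (Python, with hypothetical detect/non-detect counts and a simple logistic curve fitted by least squares; the published POD model and its interlaboratory variance components are more elaborate):

      import numpy as np
      from scipy.optimize import curve_fit

      def pod_curve(conc, b0, b1):
          """Logistic model for probability of detection versus log10(concentration)."""
          return 1.0 / (1.0 + np.exp(-(b0 + b1 * np.log10(conc))))

      # Hypothetical validation data: concentration levels, replicates tested,
      # and replicates reporting "detected".
      conc     = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0])
      n_tested = np.array([12, 12, 12, 12, 12, 12])
      n_detect = np.array([1, 4, 7, 10, 12, 12])

      pod_obs = n_detect / n_tested                 # observed POD at each level
      params, _ = curve_fit(pod_curve, conc, pod_obs, p0=[0.0, 2.0])

      for c, p in zip(conc, pod_obs):
          print(f"conc {c:5.2f}: observed POD {p:.2f}, fitted POD {pod_curve(c, *params):.2f}")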

  13. Probability of identification: a statistical model for the validation of qualitative botanical identification methods.

    PubMed

    LaBudde, Robert A; Harnly, James M

    2012-01-01

    A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. The report describes the development and validation of studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis follow closely those of the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts for botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single collaborator and multicollaborative study examples are given.

  14. Statistical methods to estimate treatment effects from multichannel electroencephalography (EEG) data in clinical trials.

    PubMed

    Ma, Junshui; Wang, Shubing; Raubertas, Richard; Svetnik, Vladimir

    2010-07-15

    With the increasing popularity of using electroencephalography (EEG) to reveal the treatment effect in drug development clinical trials, the vast volume and complex nature of EEG data compose an intriguing, but challenging, topic. In this paper the statistical analysis methods recommended by the EEG community, along with methods frequently used in the published literature, are first reviewed. A straightforward adjustment of the existing methods to handle multichannel EEG data is then introduced. In addition, based on the spatial smoothness property of EEG data, a new category of statistical methods is proposed. The new methods use a linear combination of low-degree spherical harmonic (SPHARM) basis functions to represent a spatially smoothed version of the EEG data on the scalp, which is close to a sphere in shape. In total, seven statistical methods, including both the existing and the newly proposed methods, are applied to two clinical datasets to compare their power to detect a drug effect. Contrary to the EEG community's recommendation, our results suggest that (1) the nonparametric method does not outperform its parametric counterpart; and (2) including baseline data in the analysis does not always improve the statistical power. In addition, our results recommend that (3) simple paired statistical tests should be avoided due to their poor power; and (4) the proposed spatially smoothed methods perform better than their unsmoothed versions.

  15. Performance comparison of three predictor selection methods for statistical downscaling of daily precipitation

    NASA Astrophysics Data System (ADS)

    Yang, Chunli; Wang, Ninglian; Wang, Shijin; Zhou, Liang

    2016-10-01

    Predictor selection is a critical factor affecting the statistical downscaling of daily precipitation. This study provides a general comparison between uncertainties in downscaled results from three commonly used predictor selection methods (correlation analysis, partial correlation analysis, and stepwise regression analysis). Uncertainty is analyzed by comparing statistical indices, including the mean, variance, and the distribution of monthly mean daily precipitation, wet spell length, and the number of wet days. The downscaled results are produced by the artificial neural network (ANN) statistical downscaling model and 50 years (1961-2010) of observed daily precipitation together with reanalysis predictors. Although results show little difference between downscaling methods, stepwise regression analysis is generally the best method for selecting predictors for the ANN statistical downscaling model of daily precipitation, followed by partial correlation analysis and then correlation analysis.

  16. Statistical methods to assess the reliability of measurements in the procedures for forensic age estimation.

    PubMed

    Ferrante, L; Cameriere, R

    2009-07-01

    In forensic science, anthropology, and archaeology, several techniques have been developed to estimate chronological age in both children and adults, using the relationship between age and morphological changes in the structure of teeth. Before implementing a statistical model to describe age as a function of the measured morphological variables, the reliability of the measurements of these variables must be evaluated using suitable statistical methods. This paper introduces some commonly used statistical methods for assessing the reliability of procedures for age estimation in the forensic field. The use of the concordance correlation coefficient and the intraclass correlation coefficient are explained. Finally, some pitfalls in the choice of the statistical methods to assess reliability of the measurements in age estimation are discussed.
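
    A minimal sketch of one of the reliability indices mentioned above, Lin's concordance correlation coefficient, computed for hypothetical repeated measurements of a tooth morphological variable by two observers (the intraclass correlation coefficient would additionally require an ANOVA decomposition, not shown here):

      import numpy as np

      def concordance_ccc(x, y):
          """Lin's concordance correlation coefficient between two measurement series."""
          x, y = np.asarray(x, float), np.asarray(y, float)
          sxy = np.cov(x, y, bias=True)[0, 1]
          return 2.0 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

      # Hypothetical measurements (e.g., pulp/tooth area ratios) from two observers.
      obs1 = np.array([0.21, 0.18, 0.25, 0.30, 0.22, 0.19, 0.27, 0.24])
      obs2 = np.array([0.22, 0.17, 0.26, 0.29, 0.23, 0.20, 0.26, 0.25])

      print(f"Lin's concordance correlation coefficient: {concordance_ccc(obs1, obs2):.3f}")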

  17. Teaching Research Methods and Statistics in eLearning Environments: Pedagogy, Practical Examples, and Possible Futures.

    PubMed

    Rock, Adam J; Coventry, William L; Morgan, Methuen I; Loi, Natasha M

    2016-01-01

    Generally, academic psychologists are mindful of the fact that, for many students, the study of research methods and statistics is anxiety provoking (Gal et al., 1997). Given the ubiquitous and distributed nature of eLearning systems (Nof et al., 2015), teachers of research methods and statistics need to cultivate an understanding of how to effectively use eLearning tools to inspire psychology students to learn. Consequently, the aim of the present paper is to discuss critically how using eLearning systems might engage psychology students in research methods and statistics. First, we critically appraise definitions of eLearning. Second, we examine numerous important pedagogical principles associated with effectively teaching research methods and statistics using eLearning systems. Subsequently, we provide practical examples of our own eLearning-based class activities designed to engage psychology students to learn statistical concepts such as Factor Analysis and Discriminant Function Analysis. Finally, we discuss general trends in eLearning and possible futures that are pertinent to teachers of research methods and statistics in psychology. PMID:27014147

  18. Teaching Research Methods and Statistics in eLearning Environments: Pedagogy, Practical Examples, and Possible Futures.

    PubMed

    Rock, Adam J; Coventry, William L; Morgan, Methuen I; Loi, Natasha M

    2016-01-01

    Generally, academic psychologists are mindful of the fact that, for many students, the study of research methods and statistics is anxiety provoking (Gal et al., 1997). Given the ubiquitous and distributed nature of eLearning systems (Nof et al., 2015), teachers of research methods and statistics need to cultivate an understanding of how to effectively use eLearning tools to inspire psychology students to learn. Consequently, the aim of the present paper is to discuss critically how using eLearning systems might engage psychology students in research methods and statistics. First, we critically appraise definitions of eLearning. Second, we examine numerous important pedagogical principles associated with effectively teaching research methods and statistics using eLearning systems. Subsequently, we provide practical examples of our own eLearning-based class activities designed to engage psychology students to learn statistical concepts such as Factor Analysis and Discriminant Function Analysis. Finally, we discuss general trends in eLearning and possible futures that are pertinent to teachers of research methods and statistics in psychology.

  19. Teaching Research Methods and Statistics in eLearning Environments: Pedagogy, Practical Examples, and Possible Futures

    PubMed Central

    Rock, Adam J.; Coventry, William L.; Morgan, Methuen I.; Loi, Natasha M.

    2016-01-01

    Generally, academic psychologists are mindful of the fact that, for many students, the study of research methods and statistics is anxiety provoking (Gal et al., 1997). Given the ubiquitous and distributed nature of eLearning systems (Nof et al., 2015), teachers of research methods and statistics need to cultivate an understanding of how to effectively use eLearning tools to inspire psychology students to learn. Consequently, the aim of the present paper is to discuss critically how using eLearning systems might engage psychology students in research methods and statistics. First, we critically appraise definitions of eLearning. Second, we examine numerous important pedagogical principles associated with effectively teaching research methods and statistics using eLearning systems. Subsequently, we provide practical examples of our own eLearning-based class activities designed to engage psychology students to learn statistical concepts such as Factor Analysis and Discriminant Function Analysis. Finally, we discuss general trends in eLearning and possible futures that are pertinent to teachers of research methods and statistics in psychology. PMID:27014147

  20. The limitations of multivariate statistical methods in the mensuration of human misery.

    PubMed

    Hall, W

    1989-12-01

    Multivariate statistical methods have been widely used in the analysis of the multiple symptom data which are routinely collected in psychiatric research on the classification of depressive illnesses. The most commonly used methods, those of factor analysis and discriminant function analysis, were introduced into research on the classification of depressive illness with unreasonably high expectations about what they could achieve. The failure to realize these expectations has produced scepticism in some quarters about the usefulness of multivariate methods in psychiatric research. When evaluated more circumspectly, multivariate statistical methods have made a contribution to our understanding of depressive illnesses, and they will continue to do so, if they are used with more reasonable expectations.

  1. Statistics-based reconstruction method with high random-error tolerance for integral imaging.

    PubMed

    Zhang, Juan; Zhou, Liqiu; Jiao, Xiaoxue; Zhang, Lei; Song, Lipei; Zhang, Bo; Zheng, Yi; Zhang, Zan; Zhao, Xing

    2015-10-01

    A three-dimensional (3D) digital reconstruction method for integral imaging with high random-error tolerance based on statistics is proposed. By statistically analyzing the points reconstructed by triangulation from all corresponding image points in an elemental images array, 3D reconstruction with high random-error tolerance could be realized. To simulate the impacts of random errors, random offsets with different error levels are added to a different number of elemental images in simulation and optical experiments. The results of simulation and optical experiments showed that the proposed statistic-based reconstruction method has relatively stable and better reconstruction accuracy than the conventional reconstruction method. It can be verified that the proposed method can effectively reduce the impacts of random errors on 3D reconstruction of integral imaging. This method is simple and very helpful to the development of integral imaging technology.

  2. Initial evaluation of Centroidal Voronoi Tessellation method for statistical sampling and function integration.

    SciTech Connect

    Romero, Vicente Jose; Peterson, Janet S.; Burkhardt, John V.; Gunzburger, Max Donald

    2003-09-01

    A recently developed Centroidal Voronoi Tessellation (CVT) unstructured sampling method is investigated here to assess its suitability for use in statistical sampling and function integration. CVT efficiently generates a highly uniform distribution of sample points over arbitrarily shaped M-Dimensional parameter spaces. It has recently been shown on several 2-D test problems to provide superior point distributions for generating locally conforming response surfaces. In this paper, its performance as a statistical sampling and function integration method is compared to that of Latin-Hypercube Sampling (LHS) and Simple Random Sampling (SRS) Monte Carlo methods, and Halton and Hammersley quasi-Monte-Carlo sequence methods. Specifically, sampling efficiencies are compared for function integration and for resolving various statistics of response in a 2-D test problem. It is found that on balance CVT performs best of all these sampling methods on our test problems.
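
    CVT sampling itself is not readily available in standard libraries, but the comparison it is benchmarked against can be sketched briefly: simple random sampling versus Latin hypercube sampling for estimating a 2-D integral (Python with scipy.stats.qmc; the integrand and sample size are arbitrary choices for illustration):

      import numpy as np
      from scipy.stats import qmc

      def f(x):
          """Smooth test integrand on [0, 1]^2; exact integral is (1 - cos 1)^2."""
          return np.sin(x[:, 0]) * np.sin(x[:, 1])

      exact = (1.0 - np.cos(1.0)) ** 2
      n = 256
      rng = np.random.default_rng(3)

      srs = rng.random((n, 2))                           # simple random sampling
      lhs = qmc.LatinHypercube(d=2, seed=3).random(n)    # Latin hypercube sampling

      print(f"exact integral : {exact:.5f}")
      print(f"SRS estimate   : {f(srs).mean():.5f}")
      print(f"LHS estimate   : {f(lhs).mean():.5f}")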

  3. Review of statistical methods used in enhanced-oil-recovery research and performance prediction. [131 references

    SciTech Connect

    Selvidge, J.E.

    1982-06-01

    Recent literature in the field of enhanced oil recovery (EOR) was surveyed to determine the extent to which researchers in EOR take advantage of statistical techniques in analyzing their data. In addition to determining the current level of reliance on statistical tools, another objective of this study is to promote by example the greater use of these tools. To serve this objective, the discussion of the techniques highlights the observed trend toward the use of increasingly more sophisticated methods and points out the strengths and pitfalls of different approaches. Several examples are also given of opportunities for extending EOR research findings by additional statistical manipulation. The search of the EOR literature, conducted mainly through computerized data bases, yielded nearly 200 articles containing mathematical analysis of the research. Of these, 21 were found to include examples of statistical approaches to data analysis and are discussed in detail in this review. The use of statistical techniques, as might be expected from their general purpose nature, extends across nearly all types of EOR research covering thermal methods of recovery, miscible processes, and micellar polymer floods. Data come from field tests, the laboratory, and computer simulation. The statistical methods range from simple comparisons of mean values to multiple non-linear regression equations and to probabilistic decision functions. The methods are applied to both engineering and economic data. The results of the survey are grouped by statistical technique and include brief descriptions of each of the 21 relevant papers. Complete abstracts of the papers are included in the bibliography. Brief bibliographic information (without abstracts) is also given for the articles identified in the initial search as containing mathematical analyses using other than statistical methods.

  4. A Comparative Review of Sensitivity and Uncertainty Analysis of Large-Scale Systems - II: Statistical Methods

    SciTech Connect

    Cacuci, Dan G.; Ionescu-Bujor, Mihaela

    2004-07-15

    Part II of this review paper highlights the salient features of the most popular statistical methods currently used for local and global sensitivity and uncertainty analysis of both large-scale computational models and indirect experimental measurements. These statistical procedures represent sampling-based methods (random sampling, stratified importance sampling, and Latin Hypercube sampling), first- and second-order reliability algorithms (FORM and SORM, respectively), variance-based methods (correlation ratio-based methods, the Fourier Amplitude Sensitivity Test, and the Sobol Method), and screening design methods (classical one-at-a-time experiments, global one-at-a-time design methods, systematic fractional replicate designs, and sequential bifurcation designs). It is emphasized that all statistical uncertainty and sensitivity analysis procedures first commence with the 'uncertainty analysis' stage and only subsequently proceed to the 'sensitivity analysis' stage; this path is the exact reverse of the conceptual path underlying the methods of deterministic sensitivity and uncertainty analysis where the sensitivities are determined prior to using them for uncertainty analysis. By comparison to deterministic methods, statistical methods for uncertainty and sensitivity analysis are relatively easier to develop and use but cannot yield exact values of the local sensitivities. Furthermore, current statistical methods have two major inherent drawbacks as follows: 1. Since many thousands of simulations are needed to obtain reliable results, statistical methods are at best expensive (for small systems) or, at worst, impracticable (e.g., for large time-dependent systems).2. Since the response sensitivities and parameter uncertainties are inherently and inseparably amalgamated in the results produced by these methods, improvements in parameter uncertainties cannot be directly propagated to improve response uncertainties; rather, the entire set of simulations and

  5. Defining the ecological hydrology of Taiwan Rivers using multivariate statistical methods

    NASA Astrophysics Data System (ADS)

    Chang, Fi-John; Wu, Tzu-Ching; Tsai, Wen-Ping; Herricks, Edwin E.

    2009-09-01

    The identification and verification of ecohydrologic flow indicators has found new support as the importance of ecological flow regimes is recognized in modern water resources management, particularly in river restoration and reservoir management. An ecohydrologic indicator system reflecting the unique characteristics of Taiwan's water resources and hydrology has been developed, the Taiwan ecohydrological indicator system (TEIS). A major challenge for the water resources community is using the TEIS to provide environmental flow rules that improve existing water resources management. This paper examines data from the extensive network of flow monitoring stations in Taiwan using TEIS statistics to define and refine environmental flow options in Taiwan. Multivariate statistical methods were used to examine TEIS statistics for 102 stations representing the geographic and land use diversity of Taiwan. Pearson correlation coefficients showed high multicollinearity among the TEIS statistics. Watersheds were separated into upper and lower-watershed locations. An analysis of variance indicated significant differences between upstream, more natural, and downstream, more developed, locations in the same basin with hydrologic indicator redundancy in flow change and magnitude statistics. Issues of multicollinearity were examined using a Principal Component Analysis (PCA), with the first three components related to general flow and high/low flow statistics, frequency and time statistics, and quantity statistics. These principal components explain about 85% of the total variation. A major conclusion is that managers must be aware of differences among basins, as well as differences within basins that will require careful selection of management procedures to achieve needed flow regimes.
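
    A minimal sketch of the PCA step on standardized indicator statistics, using a synthetic station-by-indicator matrix in place of the TEIS data (the reported ~85% explained variance is a property of the real dataset, not of this toy example):

      import numpy as np

      rng = np.random.default_rng(4)
      # Synthetic matrix: 102 gauging stations x 12 correlated hydrologic indicators.
      latent = rng.normal(size=(102, 3))
      loadings = rng.normal(size=(3, 12))
      X = latent @ loadings + 0.3 * rng.normal(size=(102, 12))

      # Standardize each indicator, then run PCA via singular value decomposition.
      Z = (X - X.mean(axis=0)) / X.std(axis=0)
      U, s, Vt = np.linalg.svd(Z, full_matrices=False)
      explained = s ** 2 / np.sum(s ** 2)

      print("variance explained by PC1-PC3:", np.round(explained[:3], 3),
            "total:", round(float(explained[:3].sum()), 3))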

  6. Which Ab Initio Wave Function Methods Are Adequate for Quantitative Calculations of the Energies of Biradicals? The Performance of Coupled-Cluster and Multi-Reference Methods Along a Single-Bond Dissociation Coordinate

    SciTech Connect

    Yang, Ke; Jalan, Amrit; Green, William H.; Truhlar, Donald G.

    2013-01-08

    We examine the accuracy of single-reference and multireference correlated wave function methods for predicting accurate energies and potential energy curves of biradicals. The biradicals considered are intermediate species along the bond dissociation coordinates for breaking the F-F bond in F2, the O-O bond in H2O2, and the C-C bond in CH3CH3. We apply a host of single-reference and multireference approximations in a consistent way to the same cases to provide a better assessment of their relative accuracies than was previously possible. The most accurate method studied is coupled cluster theory with all connected excitations through quadruples, CCSDTQ. Without explicit quadruple excitations, the most accurate potential energy curves are obtained by the single-reference RCCSDt method, followed, in order of decreasing accuracy, by UCCSDT, RCCSDT, UCCSDt, seven multireference methods, including perturbation theory, configuration interaction, and coupled-cluster methods (with MRCI+Q being the best and Mk-MR-CCSD the least accurate), four CCSD(T) methods, and then CCSD.

  7. Statistical studies of animal response data from USF toxicity screening test method

    NASA Technical Reports Server (NTRS)

    Hilado, C. J.; Machado, A. M.

    1978-01-01

    Statistical examination of animal response data obtained using Procedure B of the USF toxicity screening test method indicates that the data deviate only slightly from a normal or Gaussian distribution. This slight departure from normality is not expected to invalidate conclusions based on theoretical statistics. Comparison of times to staggering, convulsions, collapse, and death as endpoints shows that time to death appears to be the most reliable endpoint because it offers the lowest probability of missed observations and premature judgements.

  8. A new statistical method for design and analyses of component tolerance

    NASA Astrophysics Data System (ADS)

    Movahedi, Mohammad Mehdi; Khounsiavash, Mohsen; Otadi, Mahmood; Mosleh, Maryam

    2016-09-01

    Tolerancing conducted by design engineers to meet customers' needs is a prerequisite for producing high-quality products. Engineers use handbooks to conduct tolerancing. While the use of statistical methods for tolerancing is not new, engineers often assume known distributions, including the normal distribution. Yet, if the statistical distribution of the given variable is unknown, a new statistical method is needed to design tolerances. In this paper, we use the generalized lambda distribution for the design and analysis of component tolerances. We use the percentile method (PM) to estimate the distribution parameters. The findings indicated that, when the distribution of the component data is unknown, the proposed method can be used to expedite the design of component tolerances. Moreover, in the case of assembled sets, wider tolerances for each component can be used while achieving the same target performance.

  9. Feasibility of voxel-based statistical analysis method for myocardial PET

    NASA Astrophysics Data System (ADS)

    Ram Yu, A.; Kim, Jin Su; Paik, Chang H.; Kim, Kyeong Min; Moo Lim, Sang

    2014-09-01

    Although statistical parametric mapping (SPM) analysis is widely used in neuroimaging studies, to the best of our knowledge it has not been applied to myocardial PET data analysis. In this study, we developed a voxel-based statistical analysis method for myocardial PET that provides statistical comparisons between groups in image space. PET emission data of normal and myocardial infarction rats were acquired. For the SPM analysis, a rat heart template was created. In addition, individual PET data were spatially normalized and smoothed. Two-sample t-tests were performed to identify the myocardial infarct region. The developed SPM method was compared with conventional ROI methods. Myocardial glucose metabolism was decreased in the lateral wall of the left ventricle. In the ROI analysis, the mean value of the lateral wall was decreased by 29%. The newly developed SPM method for myocardial PET could provide quantitative information in myocardial PET studies.
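
    A minimal sketch of the voxel-wise two-sample comparison, using synthetic spatially normalized PET volumes and an uncorrected threshold (the authors' pipeline involves template creation, normalization, and smoothing, which are not reproduced here):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      shape = (32, 32, 16)                     # voxels of a spatially normalized volume

      # Synthetic normalized PET uptake: 8 normal rats and 8 infarct rats,
      # with reduced uptake in a small "lateral wall" region of the infarct group.
      normal = rng.normal(1.0, 0.1, size=(8,) + shape)
      infarct = rng.normal(1.0, 0.1, size=(8,) + shape)
      infarct[:, 20:26, 20:26, 6:10] -= 0.3

      t_map, p_map = stats.ttest_ind(normal, infarct, axis=0)

      # Uncorrected voxel-wise threshold; a real analysis would correct for
      # multiple comparisons (e.g., random field theory or false discovery rate).
      significant = p_map < 0.001
      print("voxels flagged as different between groups:", int(significant.sum()))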

  10. Verification of statistical method CORN for modeling of microfuel in the case of high grain concentration

    SciTech Connect

    Chukbar, B. K.

    2015-12-15

    Two methods of modeling a double-heterogeneity fuel are studied: the deterministic positioning and the statistical method CORN of the MCU software package. The effect of distribution of microfuel in a pebble bed on the calculation results is studied. The results of verification of the statistical method CORN for the cases of the microfuel concentration up to 170 cm⁻³ in a pebble bed are presented. The admissibility of homogenization of the microfuel coating with the graphite matrix is studied. The dependence of the reactivity on the relative location of fuel and graphite spheres in a pebble bed is found.

  11. Differential Expression Analysis for RNA-Seq: An Overview of Statistical Methods and Computational Software

    PubMed Central

    Huang, Huei-Chung; Niu, Yi; Qin, Li-Xuan

    2015-01-01

    Deep sequencing has recently emerged as a powerful alternative to microarrays for the high-throughput profiling of gene expression. In order to account for the discrete nature of RNA sequencing data, new statistical methods and computational tools have been developed for the analysis of differential expression to identify genes that are relevant to a disease such as cancer. In this paper, it is thus timely to provide an overview of these analysis methods and tools. For readers with statistical background, we also review the parameter estimation algorithms and hypothesis testing strategies used in these methods. PMID:26688660

  12. Verification of statistical method CORN for modeling of microfuel in the case of high grain concentration

    NASA Astrophysics Data System (ADS)

    Chukbar, B. K.

    2015-12-01

    Two methods of modeling a double-heterogeneity fuel are studied: the deterministic positioning and the statistical method CORN of the MCU software package. The effect of distribution of microfuel in a pebble bed on the calculation results is studied. The results of verification of the statistical method CORN for the cases of the microfuel concentration up to 170 cm⁻³ in a pebble bed are presented. The admissibility of homogenization of the microfuel coating with the graphite matrix is studied. The dependence of the reactivity on the relative location of fuel and graphite spheres in a pebble bed is found.

  13. A NEW METHOD TO CORRECT FOR FIBER COLLISIONS IN GALAXY TWO-POINT STATISTICS

    SciTech Connect

    Guo Hong; Zehavi, Idit; Zheng Zheng

    2012-09-10

    In fiber-fed galaxy redshift surveys, the finite size of the fiber plugs prevents two fibers from being placed too close to one another, limiting the ability to study galaxy clustering on all scales. We present a new method for correcting such fiber collision effects in galaxy clustering statistics based on spectroscopic observations. The target galaxy sample is divided into two distinct populations according to the targeting algorithm of fiber placement, one free of fiber collisions and the other consisting of collided galaxies. The clustering statistics are a combination of the contributions from these two populations. Our method makes use of observations in tile overlap regions to measure the contributions from the collided population, and to therefore recover the full clustering statistics. The method is rooted in solid theoretical ground and is tested extensively on mock galaxy catalogs. We demonstrate that our method can well recover the projected and the full three-dimensional (3D) redshift-space two-point correlation functions (2PCFs) on scales both below and above the fiber collision scale, superior to the commonly used nearest neighbor and angular correction methods. We discuss potential systematic effects in our method. The statistical correction accuracy of our method is only limited by sample variance, which scales down with (the square root of) the volume probed. For a sample similar to the final SDSS-III BOSS galaxy sample, the statistical correction error is expected to be at the level of 1% on scales ≈0.1–30 h⁻¹ Mpc for the 2PCFs. The systematic error only occurs on small scales, caused by imperfect correction of collision multiplets, and its magnitude is expected to be smaller than 5%. Our correction method, which can be generalized to other clustering statistics as well, enables more accurate measurements of full 3D galaxy clustering on all scales with galaxy redshift surveys.

  14. The Effectiveness of Propositional Manipulation as a Lecturing Method in the Statistics Knowledge Domain

    ERIC Educational Resources Information Center

    Leppink, Jimmie; Broers, Nick J.; Imbos, Tjaart; van der Vleuten, Cees P. M.; Berger, Martijn P. F.

    2013-01-01

    The current experiment examined the potential effects of the method of propositional manipulation (MPM) as a lecturing method on motivation to learn and conceptual understanding of statistics. MPM aims to help students develop conceptual understanding by guiding them into self-explanation at two different stages: First, at the stage of…

  15. Choice of Statistical Method Influences Apparent Association Between Structure and Function in Glaucoma

    PubMed Central

    Marín-Franch, Iván; Malik, Rizwan; Crabb, David P.; Swanson, William H.

    2013-01-01

    Purpose. The aim of this study was to explore how different statistical methods may lead to inconsistent inferences about the association between structure and function in glaucoma. Methods. Two datasets from published studies were selected for their illustrative value. The first consisted of measurements of neuroretinal rim area in the superior-temporal sector paired with the corresponding visual field sensitivity. The second consisted of measurements of average retinal nerve fiber layer thickness over all sectors paired with the corresponding visual field sensitivity. Statistical methods included linear and segmented regression, and a nonparametric local-linear fit known as loess. The analyses were repeated with all measurements expressed as percent of mean normal. Results. Slopes from linear fits to the data changed by a factor of 10 depending on the linear regression method applied. Inferences about whether structural abnormality precedes functional abnormality varied with the statistical design and the units of measure used. Conclusions. The apparent association between structure and function in glaucoma, and consequent interpretation, varies with the statistical method and units of measure. Awareness of the limitations of any statistical analysis is necessary to avoid finding spurious results that ultimately may lead to inadequate clinical recommendations. PMID:23640041
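
    One way the reported dependence on the statistical method can arise is the choice of regression direction. The sketch below, on synthetic structure-function data, compares the ordinary least-squares slope of function on structure with the slope implied by regressing structure on function; the two differ by a factor of 1/r², which is large when the correlation is modest (this is an illustration of the general point, not a re-analysis of the published datasets):

      import numpy as np

      rng = np.random.default_rng(6)
      # Synthetic structure (e.g., rim area) and function (sensitivity), both as
      # percent of mean normal, with substantial measurement scatter.
      structure = rng.normal(100, 20, size=120)
      function = 0.5 * structure + rng.normal(0, 25, size=120) + 50

      def ols_slope(x, y):
          """Ordinary least-squares slope of y regressed on x."""
          return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

      slope_y_on_x = ols_slope(structure, function)
      slope_x_on_y = 1.0 / ols_slope(function, structure)   # slope implied by the x-on-y fit
      r = np.corrcoef(structure, function)[0, 1]

      print(f"correlation r = {r:.2f}")
      print(f"slope, function on structure          : {slope_y_on_x:.2f}")
      print(f"slope implied by structure on function: {slope_x_on_y:.2f}  (ratio = 1/r^2)")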

  16. Big data analysis using modern statistical and machine learning methods in medicine.

    PubMed

    Yoo, Changwon; Ramirez, Luis; Liuzzi, Juan

    2014-06-01

    In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science grow larger and more complicated. Thus, in medicine, we also need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical methods aimed at analyzing such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, as well as modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent interactions among clinical, genomic, and environmental data in modern statistical models. We conclude with Bayesian networks, a promising modern statistical method suitable for analyzing big datasets that combine different types of large-scale clinical, genomic, and environmental data. Such statistical models for big data will provide a more comprehensive understanding of human physiology and disease.

  17. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.

    PubMed

    Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J

    2015-07-01

    Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf.

  18. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

    SciTech Connect

    Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van't

    2012-03-15

    Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
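
    A minimal sketch of an L1-penalized (LASSO-type) logistic model for a binary complication outcome, with cross-validated discrimination and the list of predictors retained by the penalty (synthetic data and scikit-learn are used here purely for illustration; the study's NTCP models, predictors, and validation scheme are more involved):

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(7)
      # Synthetic NTCP-style data: 200 patients, 15 candidate dose/clinical predictors,
      # with the binary complication outcome driven by only a few of them.
      X = rng.normal(size=(200, 15))
      logit = 0.8 * X[:, 0] + 0.6 * X[:, 3] - 0.5 * X[:, 7]
      y = (rng.random(200) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

      # L1-penalized logistic regression; C controls the amount of shrinkage.
      lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
      auc = cross_val_score(lasso, X, y, cv=5, scoring="roc_auc")
      lasso.fit(X, y)

      print("cross-validated AUC:", round(float(auc.mean()), 3))
      print("predictors retained by the L1 penalty:", np.flatnonzero(lasso.coef_[0]))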

  19. A model and variance reduction method for computing statistical outputs of stochastic elliptic partial differential equations

    SciTech Connect

    Vidal-Codina, F.; Nguyen, N.C.; Giles, M.B.; Peraire, J.

    2015-09-15

    We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.

  20. A comparison of statistical methods for deriving freshwater quality criteria for the protection of aquatic organisms.

    PubMed

    Xing, Liqun; Liu, Hongling; Zhang, Xiaowei; Hecker, Markus; Giesy, John P; Yu, Hongxia

    2014-01-01

    Species sensitivity distributions (SSDs) are increasingly used in both ecological risk assessment and derivation of water quality criteria. However, there has been debate about the choice of an appropriate approach for derivation of water quality criteria based on SSDs because the various methods can generate different values. The objective of this study was to compare the differences among various methods. Data sets of acute toxicities of 12 substances to aquatic organisms, representing a range of classes with different modes of action, were studied. Nine typical statistical approaches, including parametric and nonparametric methods, were used to construct SSDs for 12 chemicals. Water quality criteria, expressed as hazardous concentration for 5% of species (HC5), were derived by use of several approaches. All approaches produced comparable results, and the data generated by the different approaches were significantly correlated. Variability among estimates of HC5 of all inclusive species decreased with increasing sample size, and variability was similar among the statistical methods applied. Of the statistical methods selected, the bootstrap method represented the best-fitting model for all chemicals, while log-triangle and Weibull were the best models among the parametric methods evaluated. The bootstrap method was the primary choice to derive water quality criteria when data points are sufficient (more than 20). If the available data are few, all other methods should be constructed, and that which best describes the distribution of the data was selected.
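
    A minimal sketch of one parametric variant discussed above: fitting a log-normal species sensitivity distribution to hypothetical acute toxicity values and estimating HC5, with a nonparametric bootstrap confidence interval (the study's bootstrap, log-triangle, and Weibull procedures differ in detail):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(8)
      # Hypothetical acute toxicity values (e.g., LC50 in mg/L) for 24 species.
      toxicity = rng.lognormal(mean=1.0, sigma=1.2, size=24)

      def hc5_lognormal(values):
          """HC5 (5th percentile) of a log-normal species sensitivity distribution."""
          logs = np.log10(values)
          return 10 ** (logs.mean() + stats.norm.ppf(0.05) * logs.std(ddof=1))

      hc5 = hc5_lognormal(toxicity)

      # Nonparametric bootstrap confidence interval for HC5.
      boot = np.array([hc5_lognormal(rng.choice(toxicity, size=toxicity.size, replace=True))
                       for _ in range(2000)])
      lo, hi = np.percentile(boot, [2.5, 97.5])
      print(f"HC5 = {hc5:.3f} mg/L (95% bootstrap CI {lo:.3f}-{hi:.3f})")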

  1. A statistical method for the analysis of nonlinear temperature time series from compost.

    PubMed

    Yu, Shouhai; Clark, O Grant; Leonard, Jerry J

    2008-04-01

    Temperature is widely accepted as a critical indicator of aerobic microbial activity during composting but, to date, little effort has been made to devise an appropriate statistical approach for the analysis of temperature time series. Nonlinear, time-correlated effects have not previously been considered in the statistical analysis of temperature data from composting, despite their importance and the ubiquity of such features. A novel mathematical model is proposed here, based on a modified Gompertz function, which includes nonlinear, time-correlated effects. Methods are shown to estimate initial values for the model parameter. Algorithms in SAS are used to fit the model to different sets of temperature data from passively aerated compost. Methods are then shown for testing the goodness-of-fit of the model to data. Next, a method is described to determine, in a statistically rigorous manner, the significance of differences among the time-correlated characteristics of the datasets as described using the proposed model. An extra-sum-of-squares method was selected for this purpose. Finally, the model and methods are used to analyze a sample dataset and are shown to be useful tools for the statistical comparison of temperature data in composting. PMID:17997302
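
    A minimal sketch of fitting a Gompertz-type temperature curve to hypothetical compost data with nonlinear least squares (the authors' modified Gompertz model additionally includes time-correlated error terms, which are not reproduced here):

      import numpy as np
      from scipy.optimize import curve_fit

      def gompertz(t, T0, A, b, c):
          """Simple Gompertz temperature curve: baseline T0 plus a sigmoidal rise of height A."""
          return T0 + A * np.exp(-np.exp(b - c * t))

      rng = np.random.default_rng(9)
      t = np.arange(0, 30, 0.5)                          # days of composting
      true_curve = gompertz(t, 20.0, 45.0, 2.0, 0.4)     # hypothetical "true" trajectory
      temp = true_curve + rng.normal(0, 1.0, size=t.size)

      params, cov = curve_fit(gompertz, t, temp, p0=[15.0, 40.0, 1.0, 0.3])
      print("fitted (T0, A, b, c):", np.round(params, 2))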

  2. Application of statistics filter method and clustering analysis in fault diagnosis of roller bearings

    NASA Astrophysics Data System (ADS)

    Song, L. Y.; Wang, H. Q.; Gao, J. J.; Yang, J. F.; Liu, W. B.; Chen, P.

    2012-05-01

    Condition diagnosis of roller bearings depends largely on the feature analysis of vibration signals. The spectrum statistics filter (SSF) method can adaptively reduce noise. This method is based on hypothesis testing in the frequency domain to eliminate the components common to the reference signal and the primary signal. This paper presents a statistical parameter, namely the similarity factor, to evaluate the filtering performance. The performance of the method is compared with a classical method, the band-pass filter (BPF). Results show that the statistics filter is preferable to the BPF in vibration signal processing. Moreover, the significance level α would be optimized by genetic algorithms. However, it is very difficult to identify fault states from the time-domain waveform or frequency spectrum alone when the noise is strong or the fault features are not obvious. Pattern recognition is therefore applied to fault diagnosis in this study through a system clustering method. This paper processes experimental rig data after statistics filtering, and the accuracy of the clustering analysis increases substantially.

  3. Assessment of methods for creating a national building statistics database for atmospheric dispersion modeling

    SciTech Connect

    Velugubantla, S. P.; Burian, S. J.; Brown, M. J.; McKinnon, A. T.; McPherson, T. N.; Han, W. S.

    2004-01-01

    Mesoscale meteorological codes and transport and dispersion models are increasingly being applied in urban areas. Representing urban terrain characteristics in these models is critical for accurate predictions of air flow, heating and cooling, and airborne contaminant concentrations in cities. A key component of urban terrain characterization is the description of building morphology (e.g., height, plan area, frontal area) and derived properties (e.g., roughness length). Methods to determine building morphological statistics range from manual field surveys to automated processing of digital building databases. In order to improve the quality and consistency of mesoscale meteorological and atmospheric dispersion modeling, a national dataset of building morphological statistics is needed. Currently, due to the expense and logistics of conducting detailed field surveys, building statistics have been derived for only small sections of a few cities. In most other cities, modeling projects rely on building statistics estimated using intuition and best guesses. There has been increasing emphasis in recent years to derive building statistics using digital building data or other data sources as a proxy for those data. Although there is a current expansion in public and private sector development of digital building data, at present there is insufficient data to derive a national building statistics database using automated analysis tools. Too many cities lack digital data on building footprints and heights and many of the cities having such data do so for only small areas. Due to the lack of sufficient digital building data, other datasets are used to estimate building statistics. Land use often serves as means to provide building statistics for a model domain, but the strength and consistency of the relationship between land use and building morphology is largely uncertain. In this paper, we investigate whether building statistics can be correlated to the underlying land

  4. A study of two statistical methods as applied to shuttle solid rocket booster expenditures

    NASA Technical Reports Server (NTRS)

    Perlmutter, M.; Huang, Y.; Graves, M.

    1974-01-01

    The state probability technique and the Monte Carlo technique are applied to finding shuttle solid rocket booster expenditure statistics. For a given attrition rate per launch, the probable number of boosters needed for a given mission of 440 launches is calculated. Several cases are considered, including the elimination of the booster after a maximum of 20 consecutive launches. Also considered is the case where the booster is composed of replaceable components with independent attrition rates. A simple cost analysis is carried out to indicate the number of boosters to build initially, depending on booster costs. Two statistical methods were applied in the analysis: (1) state probability method which consists of defining an appropriate state space for the outcome of the random trials, and (2) model simulation method or the Monte Carlo technique. It was found that the model simulation method was easier to formulate while the state probability method required less computing time and was more accurate.
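
    A minimal sketch of the Monte Carlo side of such an analysis, for a deliberately simplified model in which one booster is flown repeatedly until it is lost (with a fixed attrition probability per launch) or retired after a maximum number of uses; the attrition rate, launch count, and retirement rule are illustrative stand-ins, not the study's values:

      import numpy as np

      rng = np.random.default_rng(10)

      def boosters_needed(n_launches=440, attrition=0.02, max_uses=20, n_trials=2000):
          """Monte Carlo estimate of booster expenditures for a launch program."""
          totals = np.empty(n_trials, dtype=int)
          for trial in range(n_trials):
              built, uses = 1, 0
              for _ in range(n_launches):
                  uses += 1
                  lost = rng.random() < attrition
                  if lost or uses >= max_uses:       # replace after loss or max consecutive uses
                      built += 1
                      uses = 0
              totals[trial] = built
          return totals

      totals = boosters_needed()
      print(f"mean boosters needed: {totals.mean():.1f}, "
            f"95th percentile: {np.percentile(totals, 95):.0f}")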

  5. Statistical Physics Methods Provide the Exact Solution to a Long-Standing Problem of Genetics.

    PubMed

    Samal, Areejit; Martin, Olivier C

    2015-06-12

    Analytic and computational methods developed within statistical physics have found applications in numerous disciplines. In this Letter, we use such methods to solve a long-standing problem in statistical genetics. The problem, posed by Haldane and Waddington [Genetics 16, 357 (1931)], concerns so-called recombinant inbred lines (RILs) produced by repeated inbreeding. Haldane and Waddington derived the probabilities of RILs when considering two and three genes but the case of four or more genes has remained elusive. Our solution uses two probabilistic frameworks relatively unknown outside of physics: Glauber's formula and self-consistent equations of the Schwinger-Dyson type. Surprisingly, this combination of statistical formalisms unveils the exact probabilities of RILs for any number of genes. Extensions of the framework may have applications in population genetics and beyond.

  6. Statistical Physics Methods Provide the Exact Solution to a Long-Standing Problem of Genetics

    NASA Astrophysics Data System (ADS)

    Samal, Areejit; Martin, Olivier C.

    2015-06-01

    Analytic and computational methods developed within statistical physics have found applications in numerous disciplines. In this Letter, we use such methods to solve a long-standing problem in statistical genetics. The problem, posed by Haldane and Waddington [Genetics 16, 357 (1931)], concerns so-called recombinant inbred lines (RILs) produced by repeated inbreeding. Haldane and Waddington derived the probabilities of RILs when considering two and three genes but the case of four or more genes has remained elusive. Our solution uses two probabilistic frameworks relatively unknown outside of physics: Glauber's formula and self-consistent equations of the Schwinger-Dyson type. Surprisingly, this combination of statistical formalisms unveils the exact probabilities of RILs for any number of genes. Extensions of the framework may have applications in population genetics and beyond.

  7. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    PubMed

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, as well as normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data, nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance the risks of committing type 1 errors (false positives) against type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (random clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
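
    To make the parametric/nonparametric distinction concrete, the following hedged example (hypothetical skewed data; SciPy assumed) compares a t test on raw data, a t test after log transformation, and a rank-based Mann-Whitney test.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    # Two hypothetical treatment groups with right-skewed (lognormal) responses
    group_a = rng.lognormal(mean=0.0, sigma=0.8, size=30)
    group_b = rng.lognormal(mean=0.5, sigma=0.8, size=30)

    # Parametric test on raw (skewed) data -- normality assumption questionable
    t_raw = stats.ttest_ind(group_a, group_b)

    # Log-transforming can restore approximate normality, allowing a t test
    t_log = stats.ttest_ind(np.log(group_a), np.log(group_b))

    # Nonparametric alternative based on ranks
    mw = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

    print(f"t test (raw):  p = {t_raw.pvalue:.4f}")
    print(f"t test (log):  p = {t_log.pvalue:.4f}")
    print(f"Mann-Whitney:  p = {mw.pvalue:.4f}")
    ```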

  8. Teaching Statistical Research Methods to Graduate Students: Lessons Learned from Three Different Degree Programs

    ERIC Educational Resources Information Center

    Ekmekci, Ozgur; Hancock, Adrienne B.; Swayze, Susan

    2012-01-01

    This paper examines the challenge of teaching statistical research methods in three master's degree programs at a private university based in Washington, DC. We, as three professors teaching at this university, discuss the way we employ innovative approaches to deal with this challenge. We ground our discussion within the theoretical framework of…

  9. A Mixed-Methods Assessment of Using an Online Commercial Tutoring System to Teach Introductory Statistics

    ERIC Educational Resources Information Center

    Xu, Yonghong Jade; Meyer, Katrina A.; Morgan, Dianne D.

    2009-01-01

    This study used a mixed-methods approach to evaluate a hybrid teaching format that incorporated an online tutoring system, ALEKS, to address students' learning needs in a graduate-level introductory statistics course. Student performance in the hybrid course with ALEKS was found to be no different from that in a course taught in a traditional…

  10. Taguchi statistical design and analysis of cleaning methods for spacecraft materials

    NASA Technical Reports Server (NTRS)

    Lin, Y.; Chung, S.; Kazarians, G. A.; Blosiu, J. O.; Beaudet, R. A.; Quigley, M. S.; Kern, R. G.

    2003-01-01

    In this study, we have extensively tested various cleaning protocols. The variant parameters included the type and concentration of solvent, type of wipe, pretreatment conditions, and various rinsing systems. Taguchi statistical method was used to design and evaluate various cleaning conditions on ten common spacecraft materials.

  11. Using Matrix Structures to Integrate Theory and Statistics into a Research Methods Course.

    ERIC Educational Resources Information Center

    Youngs, George A., Jr.

    1987-01-01

    Describes a visual device, borrowed from matrix algebra, which helps students integrate theory, methods, and statistics in sociological research. Data are organized in a rectangular structure, with the columns of variable names representing theory, the rows representing observations whose quality is related to methodological concerns and the need…

  12. A REVIEW OF STATISTICAL METHODS FOR THE METEOROLOGICAL ADJUSTMENT OF TROPOSPHERIC OZONE. (R825173)

    EPA Science Inventory

    Abstract

    A variety of statistical methods for meteorological adjustment of ozone have been proposed in the literature over the last decade for purposes of forecasting, estimating ozone time trends, or investigating underlying mechanisms from an empirical perspective. T...

  13. Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.

    ERIC Educational Resources Information Center

    Kieffer, Kevin M.; Thompson, Bruce

    As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate unless "corrected" effect…

  14. Demonstrating the Effectiveness of an Integrated and Intensive Research Methods and Statistics Course Sequence

    ERIC Educational Resources Information Center

    Pliske, Rebecca M.; Caldwell, Tracy L.; Calin-Jageman, Robert J.; Taylor-Ritzler, Tina

    2015-01-01

    We developed a two-semester series of intensive (six-contact hours per week) behavioral research methods courses with an integrated statistics curriculum. Our approach includes the use of team-based learning, authentic projects, and Excel and SPSS. We assessed the effectiveness of our approach by examining our students' content area scores on the…

  15. Meta-analysis for Discovering Rare-Variant Associations: Statistical Methods and Software Programs

    PubMed Central

    Tang, Zheng-Zheng; Lin, Dan-Yu

    2015-01-01

    There is heightened interest in using next-generation sequencing technologies to identify rare variants that influence complex human diseases and traits. Meta-analysis is essential to this endeavor because large sample sizes are required for detecting associations with rare variants. In this article, we provide a comprehensive overview of statistical methods for meta-analysis of sequencing studies for discovering rare-variant associations. Specifically, we discuss the calculation of relevant summary statistics from participating studies, the construction of gene-level association tests, the choice of transformation for quantitative traits, the use of fixed-effects versus random-effects models, and the removal of shadow association signals through conditional analysis. We also show that meta-analysis based on properly calculated summary statistics is as powerful as joint analysis of individual-participant data. In addition, we demonstrate the performance of different meta-analysis methods by using both simulated and empirical data. We then compare four major software packages for meta-analysis of rare-variant associations—MASS, RAREMETAL, MetaSKAT, and seqMeta—in terms of the underlying statistical methodology, analysis pipeline, and software interface. Finally, we present PreMeta, a software interface that integrates the four meta-analysis packages and allows a consortium to combine otherwise incompatible summary statistics. PMID:26094574

  16. Meta-analysis for Discovering Rare-Variant Associations: Statistical Methods and Software Programs.

    PubMed

    Tang, Zheng-Zheng; Lin, Dan-Yu

    2015-07-01

    There is heightened interest in using next-generation sequencing technologies to identify rare variants that influence complex human diseases and traits. Meta-analysis is essential to this endeavor because large sample sizes are required for detecting associations with rare variants. In this article, we provide a comprehensive overview of statistical methods for meta-analysis of sequencing studies for discovering rare-variant associations. Specifically, we discuss the calculation of relevant summary statistics from participating studies, the construction of gene-level association tests, the choice of transformation for quantitative traits, the use of fixed-effects versus random-effects models, and the removal of shadow association signals through conditional analysis. We also show that meta-analysis based on properly calculated summary statistics is as powerful as joint analysis of individual-participant data. In addition, we demonstrate the performance of different meta-analysis methods by using both simulated and empirical data. We then compare four major software packages for meta-analysis of rare-variant associations-MASS, RAREMETAL, MetaSKAT, and seqMeta-in terms of the underlying statistical methodology, analysis pipeline, and software interface. Finally, we present PreMeta, a software interface that integrates the four meta-analysis packages and allows a consortium to combine otherwise incompatible summary statistics.

  17. An overview of recent developments in genomics and associated statistical methods.

    PubMed

    Bickel, Peter J; Brown, James B; Huang, Haiyan; Li, Qunhua

    2009-11-13

    The landscape of genomics has changed drastically in the last two decades. Increasingly inexpensive sequencing has shifted the primary focus from the acquisition of biological sequences to the study of biological function. Assays have been developed to study many intricacies of biological systems, and publicly available databases have given rise to integrative analyses that combine information from many sources to draw complex conclusions. Such research was the focus of the recent workshop at the Isaac Newton Institute, 'High dimensional statistics in biology'. Many computational methods from modern genomics and related disciplines were presented and discussed. Using, as much as possible, the material from these talks, we give an overview of modern genomics: from the essential assays that make data-generation possible, to the statistical methods that yield meaningful inference. We point to current analytical challenges, where novel methods, or novel applications of extant methods, are presently needed.

  18. A Probability-Based Statistical Method to Extract Water Body of TM Images with Missing Information

    NASA Astrophysics Data System (ADS)

    Lian, Shizhong; Chen, Jiangping; Luo, Minghai

    2016-06-01

    Water information cannot be accurately extracted from TM images in which true information is lost to blocking clouds and missing data stripes. Because water is continuously distributed under natural conditions, this paper proposes a new method of water body extraction based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different disturbances from clouds and missing data stripes are simulated. Water information is extracted from the simulated images using global histogram matching, local histogram matching, and the probability-based statistical method. Experiments show that a smaller Areal Error and higher Boundary Recall can be obtained with this method compared with the conventional methods.

  19. Exploratory study on a statistical method to analyse time resolved data obtained during nanomaterial exposure measurements

    NASA Astrophysics Data System (ADS)

    Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.

    2013-04-01

    Most of the measurement strategies that are suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring, in real time, airborne particle concentrations (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature, ranging from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the search for an appropriate and robust method continues. In this context, this exploratory study investigates a statistical method to analyse time resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films have been used. In this workplace study, the background issue has been addressed through the near-field and far-field approaches and several size integrated and time resolved devices have been used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. While one was measuring at the source of the released particles, the other one was measuring in parallel far-field. The Bayesian probabilistic approach allows a probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions. The probability distributions issuing from time resolved data obtained at the source can be compared with the probability distributions issuing from the time resolved data obtained far-field, leading in a

  20. Effect Size as the Essential Statistic in Developing Methods for mTBI Diagnosis.

    PubMed

    Gibson, Douglas Brandt

    2015-01-01

    The descriptive statistic known as "effect size" measures the distinguishability of two sets of data. Distinguishability is at the core of diagnosis. This article is intended to point out the importance of effect size in the development of effective diagnostics for mild traumatic brain injury (mTBI) and the applicability of the effect size statistic in comparing diagnostic efficiency across the main proposed TBI diagnostic methods: psychological, physiological, biochemical, and radiologic. Comparing diagnostic approaches is difficult because different researchers in different fields have different approaches to measuring efficacy. Converting diverse measures to effect sizes, as is done in meta-analysis, is a relatively easy way to make studies comparable.
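
    For illustration, a minimal computation of one common effect size measure, Cohen's d with a pooled standard deviation, on hypothetical case/control scores (not data from the article):

    ```python
    import numpy as np

    def cohens_d(x, y):
        """Standardized mean difference (Cohen's d) with a pooled SD."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        nx, ny = len(x), len(y)
        pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
        return (x.mean() - y.mean()) / np.sqrt(pooled_var)

    # Hypothetical diagnostic scores for mTBI cases and controls
    rng = np.random.default_rng(7)
    cases    = rng.normal(loc=55.0, scale=10.0, size=40)
    controls = rng.normal(loc=50.0, scale=10.0, size=40)

    print(f"Cohen's d = {cohens_d(cases, controls):.2f}")
    ```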

  1. A statistical method for verifying mesh convergence in Monte Carlo simulations with application to fragmentation

    SciTech Connect

    Bishop, Joseph E.; Strack, O. E.

    2011-03-22

    A novel method is presented for assessing the convergence of a sequence of statistical distributions generated by direct Monte Carlo sampling. The primary application is to assess the mesh or grid convergence, and possibly divergence, of stochastic outputs from non-linear continuum systems. Example systems include those from fluid or solid mechanics, particularly those with instabilities and sensitive dependence on initial conditions or system parameters. The convergence assessment is based on demonstrating empirically that a sequence of cumulative distribution functions converges in the L-infinity (supremum) norm. The effect of finite sample sizes is quantified using confidence levels from the Kolmogorov–Smirnov statistic. The statistical method is independent of the underlying distributions and is demonstrated using two examples: (1) the logistic map in the chaotic regime, and (2) a fragmenting ductile ring modeled with an explicit-dynamics finite element code. In the fragmenting ring example the convergence of the distribution describing neck spacing is investigated. The initial yield strength is treated as a random field. Two different random fields are considered, one with spatial correlation and the other without. Both cases converged, albeit to different distributions. The case with spatial correlation exhibited a significantly higher convergence rate compared with the one without spatial correlation.
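
    As a simplified illustration of the idea (not the authors' full procedure), the sketch below compares empirical CDFs from two hypothetical mesh levels with the two-sample Kolmogorov-Smirnov statistic and a finite-sample noise floor, using SciPy.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    # Hypothetical stochastic outputs (e.g., fragment neck spacing) from
    # Monte Carlo runs at two successive mesh resolutions.
    coarse = rng.gamma(shape=4.0, scale=1.0, size=500)
    fine   = rng.gamma(shape=4.2, scale=0.95, size=500)

    # Two-sample Kolmogorov-Smirnov statistic: the L-infinity distance between
    # the two empirical CDFs, with a p-value accounting for finite sample size.
    ks = stats.ks_2samp(coarse, fine)
    print(f"sup-norm distance D = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")

    # A simple convergence check: the KS distance between successive mesh levels
    # should fall below a tolerance that itself exceeds the sampling noise floor.
    n, m = len(coarse), len(fine)
    noise_floor = 1.36 * np.sqrt((n + m) / (n * m))   # ~95% two-sample KS critical value
    print(f"95% sampling noise floor ~ {noise_floor:.3f}")
    ```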

  2. Improved statistical methods for hit selection in high-throughput screening.

    PubMed

    Brideau, Christine; Gunter, Bert; Pikounis, Bill; Liaw, Andy

    2003-12-01

    High-throughput screening (HTS) plays a central role in modern drug discovery, allowing the rapid screening of large compound collections against a variety of putative drug targets. HTS is an industrial-scale process, relying on sophisticated automation, control, and state-of-the-art detection technologies to organize, test, and measure hundreds of thousands to millions of compounds in nano- to microliter volumes. Despite this high technology, hit selection for HTS is still typically done using simple data analysis and basic statistical methods. The authors discuss in this article some shortcomings of these methods and present alternatives based on modern methods of statistical data analysis. Most importantly, they describe and show numerous real examples from the biologist-friendly Stat Server HTS application (SHS), a custom-developed software tool built on the commercially available S-PLUS and StatServer statistical analysis and server software. This system remotely processes HTS data using powerful and sophisticated statistical methodology but insulates users from the technical details by outputting results in a variety of readily interpretable graphs and tables.
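
    One example of a "modern" alternative to naive cutoff-based hit selection is a robust, median/MAD-based z score. The sketch below is a generic illustration on simulated plate data, not the SHS application described in the article.

    ```python
    import numpy as np

    def robust_z_scores(plate):
        """Median/MAD-based z scores for one plate of raw HTS readouts."""
        plate = np.asarray(plate, float)
        med = np.median(plate)
        mad = np.median(np.abs(plate - med)) * 1.4826   # consistent with SD under normality
        return (plate - med) / mad

    rng = np.random.default_rng(11)
    plate = rng.normal(loc=100.0, scale=5.0, size=384)
    plate[[10, 57, 200]] -= 40.0          # a few strong inhibitors ("hits")

    z = robust_z_scores(plate)
    hits = np.flatnonzero(z < -3.0)       # flag wells more than 3 robust SDs below the median
    print("candidate hit wells:", hits)
    ```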

  3. Application of multivariate statistical methods to the analysis of ancient Turkish potsherds

    SciTech Connect

    Martin, R.C.

    1986-01-01

    Three hundred ancient Turkish potsherds were analyzed by instrumental neutron activation analysis, and the resulting data analyzed by several techniques of multivariate statistical analysis, some only recently developed. The programs AGCLUS, MASLOC, and SIMCA were sequentially employed to characterize and group the samples by type of pottery and site of excavation. Comparison of the statistical analyses by each method provided archaeological insight into the site/type relationships of the samples and ultimately evidence relevant to the commercial relations between the ancient communities and specialization of pottery production over time. The techniques used for statistical analysis were found to be of significant potential utility in the future analysis of other archaeometric data sets. 25 refs., 33 figs.

  4. A Meta-View of Multivariate Statistical Inference Methods in European Psychology Journals.

    PubMed

    Harlow, Lisa L; Korendijk, Elly; Hamaker, Ellen L; Hox, Joop; Duerr, Sunny R

    2013-09-01

    We investigated the extent and nature of multivariate statistical inferential procedures used in eight European psychology journals covering a range of content (i.e., clinical, social, health, personality, organizational, developmental, educational, and cognitive). Multivariate methods included those found in popular texts that focused on prediction, group difference, and advanced modeling: multiple regression, logistic regression, analysis of covariance, multivariate analysis of variance, factor or principal component analysis, structural equation modeling, multilevel modeling, and other methods. Results revealed that an average of 57% of the articles from these eight journals involved multivariate analyses with a third using multiple regression, 17% using structural modeling, and the remaining methods collectively comprising about 50% of the analyses. The most frequently occurring inferential procedures involved prediction weights, dichotomous p values, figures with data, and significance tests with very few articles involving confidence intervals, statistical mediation, longitudinal analyses, power analysis, or meta-analysis. Contributions, limitations and future directions are discussed.

  5. Monte Carlo based statistical power analysis for mediation models: methods and software.

    PubMed

    Zhang, Zhiyong

    2014-12-01

    The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.
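
    A minimal sketch of bootstrap-based power estimation for a simple mediation model via Monte Carlo simulation, with hypothetical path coefficients and sample size, written in plain NumPy rather than the bmem R package described in the abstract.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def simulate_mediation(n, a, b, c_prime):
        # X -> M -> Y simple mediation with unit-variance normal errors
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = b * m + c_prime * x + rng.normal(size=n)
        return x, m, y

    def indirect_effect(x, m, y):
        # a-path: regress M on X; b-path: regress Y on M controlling for X
        a_hat = np.polyfit(x, m, 1)[0]
        X = np.column_stack([np.ones_like(x), x, m])
        b_hat = np.linalg.lstsq(X, y, rcond=None)[0][2]
        return a_hat * b_hat

    def bootstrap_ci(x, m, y, n_boot=500, alpha=0.05):
        n = len(x)
        est = np.empty(n_boot)
        for i in range(n_boot):
            idx = rng.integers(0, n, n)
            est[i] = indirect_effect(x[idx], m[idx], y[idx])
        return np.quantile(est, [alpha / 2, 1 - alpha / 2])

    def power(n, a, b, c_prime, n_rep=200):
        hits = 0
        for _ in range(n_rep):
            x, m, y = simulate_mediation(n, a, b, c_prime)
            lo, hi = bootstrap_ci(x, m, y)
            hits += (lo > 0) or (hi < 0)   # CI excludes zero -> reject H0: ab = 0
        return hits / n_rep

    print(f"estimated power = {power(n=100, a=0.3, b=0.3, c_prime=0.1):.2f}")
    ```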

  6. A new mathematical evaluation of smoking problem based of algebraic statistical method.

    PubMed

    Mohammed, Maysaa J; Rakhimov, Isamiddin S; Shitan, Mahendran; Ibrahim, Rabha W; Mohammed, Nadia F

    2016-01-01

    The smoking problem has been a topic of intense interest for many years. In spite of overpowering facts about the dangers, smoking is still a bad habit that is widespread and socially accepted. Many people start smoking during their secondary school (gymnasium) years. The discovery of the dangers of smoking provided a clear warning for individuals. Different statistical methods have been used to analyze the dangers of smoking. In this study, we apply an algebraic statistical method to analyze and classify real data using a Markov basis for the independence model on the contingency table. Results show that the Markov basis based classification is able to distinguish different data elements. Moreover, we check our proposed method via information theory, utilizing the Shannon formula to illustrate which of the alternative tables is best in terms of independence.

  7. Hot spot or not: a comparison of spatial statistical methods to predict prospective malaria infections

    PubMed Central

    2014-01-01

    Background Within affected communities, Plasmodium falciparum infections may be skewed in distribution such that single or small clusters of households consistently harbour a disproportionate number of infected individuals throughout the year. Identifying these hotspots of malaria transmission would permit targeting of interventions and a more rapid reduction in malaria burden across the whole community. This study set out to compare different statistical methods of hotspot detection (SaTScan, kernel smoothing, weighted local prevalence) using different indicators (PCR positivity, AMA-1 and MSP-1 antibodies) for prediction of infection the following year. Methods Two full surveys of four villages in Mwanza, Tanzania were completed over consecutive years, 2010-2011. In both surveys, infection was assessed using nested polymerase chain reaction (nPCR). In addition in 2010, serologic markers (AMA-1 and MSP-119 antibodies) of exposure were assessed. Baseline clustering of infection and serological markers were assessed using three geospatial methods: spatial scan statistics, kernel analysis and weighted local prevalence analysis. Methods were compared in their ability to predict infection in the second year of the study using random effects logistic regression models, and comparisons of the area under the receiver operating curve (AUC) for each model. Sensitivity analysis was conducted to explore the effect of varying radius size for the kernel and weighted local prevalence methods and maximum population size for the spatial scan statistic. Results Guided by AUC values, the kernel method and spatial scan statistics appeared to be more predictive of infection in the following year. Hotspots of PCR-detected infection and seropositivity to AMA-1 were predictive of subsequent infection. For the kernel method, a 1 km window was optimal. Similarly, allowing hotspots to contain up to 50% of the population was a better predictor of infection in the second year using spatial

  8. A parsimonious statistical method to detect groupwise differentially expressed functional connectivity networks.

    PubMed

    Chen, Shuo; Kang, Jian; Xing, Yishi; Wang, Guoqing

    2015-12-01

    Group-level functional connectivity analyses often aim to detect the altered connectivity patterns between subgroups with different clinical or psychological experimental conditions, for example, comparing cases and healthy controls. We present a new statistical method to detect differentially expressed connectivity networks with significantly improved power and lower false-positive rates. The goal of our method was to capture most differentially expressed connections within networks of constrained numbers of brain regions (by the rule of parsimony). By virtue of parsimony, the false-positive individual connectivity edges within a network are effectively reduced, whereas the informative (differentially expressed) edges are allowed to borrow strength from each other to increase the overall power of the network. We develop a test statistic for each network in light of combinatorics graph theory, and provide p-values for the networks (in the weak sense) by using permutation test with multiple-testing adjustment. We validate and compare this new approach with existing methods, including false discovery rate and network-based statistic, via simulation studies and a resting-state functional magnetic resonance imaging case-control study. The results indicate that our method can identify differentially expressed connectivity networks, whereas existing methods are limited.
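
    The following is a generic sketch of a permutation test for a network-level statistic on simulated case-control connectivity data; it illustrates only the permutation machinery, not the authors' parsimony-based edge selection, and all data and edge sets are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def network_stat(conn, labels, edges):
        """Sum of absolute case-control differences over a candidate edge set."""
        cases, controls = conn[labels == 1], conn[labels == 0]
        return np.abs(cases[:, edges].mean(axis=0) - controls[:, edges].mean(axis=0)).sum()

    def permutation_pvalue(conn, labels, edges, n_perm=5000):
        observed = network_stat(conn, labels, edges)
        null = np.empty(n_perm)
        for i in range(n_perm):
            null[i] = network_stat(conn, rng.permutation(labels), edges)
        return (1 + np.sum(null >= observed)) / (1 + n_perm)

    # Hypothetical data: 40 subjects x 100 connectivity edges, 20 cases and
    # 20 controls, with a small group effect injected into edges 0-4.
    conn = rng.normal(size=(40, 100))
    labels = np.repeat([1, 0], 20)
    conn[labels == 1, :5] += 0.8

    print("network p-value:", permutation_pvalue(conn, labels, edges=np.arange(5)))
    ```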

  9. A Parsimonious Statistical Method to Detect Groupwise Differentially Expressed Functional Connectivity Networks

    PubMed Central

    Chen, Shuo; Kang, Jian; Xing, Yishi; Wang, Guoqing

    2016-01-01

    Group-level functional connectivity analyses often aim to detect the altered connectivity patterns between subgroups with different clinical or psychological experimental conditions, for example, comparing cases and healthy controls. We present a new statistical method to detect differentially expressed connectivity networks with significantly improved power and lower false-positive rates. The goal of our method was to capture most differentially expressed connections within networks of constrained numbers of brain regions (by the rule of parsimony). By virtue of parsimony, the false-positive individual connectivity edges within a network are effectively reduced, whereas the informative (differentially expressed) edges are allowed to borrow strength from each other to increase the overall power of the network. We develop a test statistic for each network in light of combinatorics graph theory, and provide p-values for the networks (in the weak sense) by using permutation test with multiple-testing adjustment. We validate and compare this new approach with existing methods, including false discovery rate and network-based statistic, via simulation studies and a resting-state functional magnetic resonance imaging case–control study. The results indicate that our method can identify differentially expressed connectivity networks, whereas existing methods are limited. PMID:26416398

  10. Automated counting of morphologically normal red blood cells by using digital holographic microscopy and statistical methods

    NASA Astrophysics Data System (ADS)

    Moon, Inkyu; Yi, Faliu

    2015-09-01

    In this paper we present an overview of a method to automatically count morphologically normal red blood cells (RBCs) by using off-axis digital holographic microscopy and statistical methods. Three kinds of RBCs are used as training and testing data. All of the RBC phase images are obtained with digital holographic microscopy (DHM), which is robust to transparent or semitransparent biological cells. For the determination of morphologically normal RBCs, the RBC phase images are first segmented with a marker-controlled watershed transform algorithm. Multiple features are extracted from the segmented cells. Moreover, Hotelling's T-squared test is applied to show that 3D features from the 3D imaging method can improve the discrimination performance for counting normal RBC shapes. Finally, the classifier is designed using a statistical Bayesian algorithm and the misclassification rates are measured with the leave-one-out technique. Experimental results show the feasibility of the classification method for calculating the percentage of each typical normal RBC shape.
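
    A hedged sketch of the final classification step: a Gaussian (naive) Bayes classifier evaluated with leave-one-out cross-validation on hypothetical RBC features, using scikit-learn. The features and data are invented for illustration and do not reproduce the authors' pipeline.

    ```python
    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(2)

    # Hypothetical morphological features (e.g., projected area, mean phase, volume)
    # for three RBC shape classes, 30 cells each.
    X = np.vstack([rng.normal(loc=mu, scale=1.0, size=(30, 3))
                   for mu in ([0, 0, 0], [2, 1, 0], [0, 2, 2])])
    y = np.repeat([0, 1, 2], 30)

    # Bayesian (Gaussian naive Bayes) classifier evaluated with leave-one-out
    scores = cross_val_score(GaussianNB(), X, y, cv=LeaveOneOut())
    print(f"leave-one-out misclassification rate = {1 - scores.mean():.3f}")
    ```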

  11. A Statistical Approach for the Concurrent Coupling of Molecular Dynamics and Finite Element Methods

    NASA Technical Reports Server (NTRS)

    Saether, E.; Yamakov, V.; Glaessgen, E.

    2007-01-01

    Molecular dynamics (MD) methods are opening new opportunities for simulating the fundamental processes of material behavior at the atomistic level. However, increasing the size of the MD domain quickly presents intractable computational demands. A robust approach to surmount this computational limitation has been to unite continuum modeling procedures such as the finite element method (FEM) with MD analyses thereby reducing the region of atomic scale refinement. The challenging problem is to seamlessly connect the two inherently different simulation techniques at their interface. In the present work, a new approach to MD-FEM coupling is developed based on a restatement of the typical boundary value problem used to define a coupled domain. The method uses statistical averaging of the atomistic MD domain to provide displacement interface boundary conditions to the surrounding continuum FEM region, which, in return, generates interface reaction forces applied as piecewise constant traction boundary conditions to the MD domain. The two systems are computationally disconnected and communicate only through a continuous update of their boundary conditions. With the use of statistical averages of the atomistic quantities to couple the two computational schemes, the developed approach is referred to as an embedded statistical coupling method (ESCM) as opposed to a direct coupling method where interface atoms and FEM nodes are individually related. The methodology is inherently applicable to three-dimensional domains, avoids discretization of the continuum model down to atomic scales, and permits arbitrary temperatures to be applied.

  12. A statistical method for studying correlated rare events and their risk factors

    PubMed Central

    Xue, Xiaonan; Kim, Mimi Y; Wang, Tao; Kuniholm, Mark H; Strickler, Howard D

    2016-01-01

    Longitudinal studies of rare events such as cervical high-grade lesions or colorectal polyps that can recur often involve correlated binary data. Risk factors for these events cannot be reliably examined using conventional statistical methods. For example, logistic regression models that incorporate generalized estimating equations often fail to converge or provide inaccurate results when analyzing data of this type. Although exact methods have been reported, they are complex and computationally difficult. The current paper proposes a mathematically straightforward and easy-to-use two-step approach involving (i) an additive model to measure associations between a rare or uncommon correlated binary event and potential risk factors and (ii) a permutation test to estimate the statistical significance of these associations. Simulation studies showed that the proposed method reliably tests and accurately estimates the associations of exposure with correlated binary rare events. This method was then applied to a longitudinal study of human leukocyte antigen (HLA) genotype and risk of cervical high grade squamous intraepithelial lesions (HSIL) among HIV-infected and HIV-uninfected women. Results showed statistically significant associations of two HLA alleles among HIV-negative but not HIV-positive women, suggesting that immune status may modify the HLA and cervical HSIL association. Overall, the proposed method avoids model nonconvergence problems and provides a computationally simple, accurate, and powerful approach for the analysis of risk factor associations with rare/uncommon correlated binary events. PMID:25854937

  13. Feature-based and statistical methods for analyzing the Deepwater Horizon oil spill with AVIRIS imagery

    USGS Publications Warehouse

    Rand, R.S.; Clark, R.N.; Livo, K.E.

    2011-01-01

    The Deepwater Horizon oil spill covered a very large geographical area in the Gulf of Mexico creating potentially serious environmental impacts on both marine life and the coastal shorelines. Knowing the oil's areal extent and thickness as well as denoting different categories of the oil's physical state is important for assessing these impacts. High spectral resolution data in hyperspectral imagery (HSI) sensors such as Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) provide a valuable source of information that can be used for analysis by semi-automatic methods for tracking an oil spill's areal extent, oil thickness, and oil categories. However, the spectral behavior of oil in water is inherently a highly non-linear and variable phenomenon that changes depending on oil thickness and oil/water ratios. For certain oil thicknesses there are well-defined absorption features, whereas for very thin films sometimes there are almost no observable features. Feature-based imaging spectroscopy methods are particularly effective at classifying materials that exhibit specific well-defined spectral absorption features. Statistical methods are effective at classifying materials with spectra that exhibit a considerable amount of variability and that do not necessarily exhibit well-defined spectral absorption features. This study investigates feature-based and statistical methods for analyzing oil spills using hyperspectral imagery. The appropriate use of each approach is investigated and a combined feature-based and statistical method is proposed. © 2011 SPIE.

  14. A parsimonious statistical method to detect groupwise differentially expressed functional connectivity networks.

    PubMed

    Chen, Shuo; Kang, Jian; Xing, Yishi; Wang, Guoqing

    2015-12-01

    Group-level functional connectivity analyses often aim to detect the altered connectivity patterns between subgroups with different clinical or psychological experimental conditions, for example, comparing cases and healthy controls. We present a new statistical method to detect differentially expressed connectivity networks with significantly improved power and lower false-positive rates. The goal of our method was to capture most differentially expressed connections within networks of constrained numbers of brain regions (by the rule of parsimony). By virtue of parsimony, the false-positive individual connectivity edges within a network are effectively reduced, whereas the informative (differentially expressed) edges are allowed to borrow strength from each other to increase the overall power of the network. We develop a test statistic for each network in light of combinatorics graph theory, and provide p-values for the networks (in the weak sense) by using permutation test with multiple-testing adjustment. We validate and compare this new approach with existing methods, including false discovery rate and network-based statistic, via simulation studies and a resting-state functional magnetic resonance imaging case-control study. The results indicate that our method can identify differentially expressed connectivity networks, whereas existing methods are limited. PMID:26416398

  15. Quantification and Statistical Analysis Methods for Vessel Wall Components from Stained Images with Masson's Trichrome

    PubMed Central

    Hernández-Morera, Pablo; Castaño-González, Irene; Travieso-González, Carlos M.; Mompeó-Corredera, Blanca; Ortega-Santana, Francisco

    2016-01-01

    Purpose To develop a digital image processing method to quantify structural components (smooth muscle fibers and extracellular matrix) in the vessel wall stained with Masson’s trichrome, and a statistical method suitable for small sample sizes to analyze the results previously obtained. Methods The quantification method comprises two stages. The pre-processing stage improves tissue image appearance and the vessel wall area is delimited. In the feature extraction stage, the vessel wall components are segmented by grouping pixels with a similar color. The area of each component is calculated by normalizing the number of pixels of each group by the vessel wall area. Statistical analyses are implemented by permutation tests, based on resampling without replacement from the set of the observed data to obtain a sampling distribution of an estimator. The implementation can be parallelized on a multicore machine to reduce execution time. Results The methods have been tested on 48 vessel wall samples of the internal saphenous vein stained with Masson’s trichrome. The results show that the segmented areas are consistent with the perception of a team of doctors and demonstrate good correlation between the expert judgments and the measured parameters for evaluating vessel wall changes. Conclusion The proposed methodology offers a powerful tool to quantify some components of the vessel wall. It is more objective, sensitive and accurate than the biochemical and qualitative methods traditionally used. The permutation tests are suitable statistical techniques to analyze the numerical measurements obtained when the underlying assumptions of the other statistical techniques are not met. PMID:26761643

  16. New Methods for Applying Statistical State Dynamics to Problems in Atmospheric Turbulence

    NASA Astrophysics Data System (ADS)

    Farrell, B.; Ioannou, P. J.

    2015-12-01

    Adopting the perspective of statistical state dynamics (SSD) has led to a number of recent advances in understanding and simulating atmospheric turbulence at both boundary layer and planetary scales. Traditionally, realizations have been used to study turbulence, and if a statistical quantity was needed it was obtained by averaging. However, it is now becoming more widely appreciated that there are important advantages to studying the SSD directly. In turbulent systems, statistical quantities are often the most useful, and the advantage of obtaining these quantities directly as state variables is obvious. Moreover, quantities such as the probability density function (pdf) are often difficult to obtain accurately by sampling state trajectories. In the event that the pdf is itself time dependent or even chaotic, as is the case in the turbulence of the planetary boundary layer, the pdf can only be obtained as a state variable. However, perhaps the greatest advantage of the SSD approach is that it reveals directly the essential cooperative mechanisms of interaction among spatial and temporal scales that underlie the turbulent state. In order to exploit these advantages of the SSD approach to geophysical turbulence, new analytical and computational methods are being developed. Example problems in atmospheric turbulence will be presented in which these new SSD analysis and computational methods are used.

  17. Proposal for a biometrics of the cortical surface: a statistical method for relative surface distance metrics

    NASA Astrophysics Data System (ADS)

    Bookstein, Fred L.

    1995-08-01

    Recent advances in computational geometry have greatly extended the range of neuroanatomical questions that can be approached by rigorous quantitative methods. One of the major current challenges in this area is to describe the variability of human cortical surface form and its implications for individual differences in neurophysiological functioning. Existing techniques for representation of stochastically invaginated surfaces do not conduce to the necessary parametric statistical summaries. In this paper, following a hint from David Van Essen and Heather Drury, I sketch a statistical method customized for the constraints of this complex data type. Cortical surface form is represented by its Riemannian metric tensor and averaged according to parameters of a smooth averaged surface. Sulci are represented by integral trajectories of the smaller principal strains of this metric, and their statistics follow the statistics of that relative metric. The diagrams visualizing this tensor analysis look like alligator leather but summarize all aspects of cortical surface form in between the principal sulci, the reliable ones; no flattening is required.

  18. Statistical and Computational Methods for High-Throughput Sequencing Data Analysis of Alternative Splicing

    PubMed Central

    2013-01-01

    The burgeoning field of high-throughput sequencing significantly improves our ability to understand the complexity of transcriptomes. Alternative splicing, as one of the most important driving forces for transcriptome diversity, can now be studied at an unprecedented resolution. Efficient and powerful computational and statistical methods are urgently needed to facilitate the characterization and quantification of alternative splicing events. Here we discuss methods in splice junction read mapping, and methods in exon-centric or isoform-centric quantification of alternative splicing. In addition, we discuss HITS-CLIP and splicing QTL analyses, which are novel high-throughput sequencing-based approaches to the dissection of splicing regulation. PMID:24058384

  19. A combined approach to the estimation of statistical error of the direct simulation Monte Carlo method

    NASA Astrophysics Data System (ADS)

    Plotnikov, M. Yu.; Shkarupa, E. V.

    2015-11-01

    Presently, the direct simulation Monte Carlo (DSMC) method is widely used for solving rarefied gas dynamics problems. As applied to steady-state problems, a feature of this method is the use of dependent sample values of random variables for the calculation of macroparameters of gas flows. A new combined approach to estimating the statistical error of the method is proposed that requires practically no additional computations and is applicable to any degree of probabilistic dependence among sample values. Features of the proposed approach are analyzed theoretically and numerically. The approach is tested using the classical Fourier problem and the problem of supersonic flow of rarefied gas through a permeable obstacle.
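
    As one standard way to estimate the statistical error of a mean computed from dependent samples, the sketch below uses batch means on a simulated correlated series. It is illustrative only and is not the combined approach proposed in the record.

    ```python
    import numpy as np

    def batch_means_se(samples, n_batches=20):
        """Standard error of the mean for correlated samples via batch means."""
        samples = np.asarray(samples, float)
        usable = len(samples) - len(samples) % n_batches
        batches = samples[:usable].reshape(n_batches, -1).mean(axis=1)
        return batches.std(ddof=1) / np.sqrt(n_batches)

    # Hypothetical correlated time series of a sampled macroparameter (AR(1) process)
    rng = np.random.default_rng(9)
    x = np.empty(20_000)
    x[0] = 0.0
    for i in range(1, len(x)):
        x[i] = 0.9 * x[i - 1] + rng.normal()

    naive_se = x.std(ddof=1) / np.sqrt(len(x))        # ignores correlation, too optimistic
    print(f"naive SE = {naive_se:.4f}, batch-means SE = {batch_means_se(x):.4f}")
    ```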

  20. Data Mining Methods Applied to Flight Operations Quality Assurance Data: A Comparison to Standard Statistical Methods

    NASA Technical Reports Server (NTRS)

    Stolzer, Alan J.; Halford, Carl

    2007-01-01

    In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general, data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
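
    A hedged sketch comparing a linear regression with a regression tree on simulated, nonlinear "fuel-flow"-like data using cross-validated R^2 (scikit-learn assumed). The data and predictors are hypothetical, not FOQA data.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)

    # Hypothetical flight-parameter matrix (e.g., altitude, speed, gross weight, ...)
    # and a nonlinear fuel-flow-like response with an interaction term.
    X = rng.normal(size=(2000, 5))
    y = 50 + 8 * X[:, 0] - 3 * X[:, 1] + 4 * X[:, 0] * X[:, 2] + rng.normal(scale=2, size=2000)

    for name, model in [("linear regression", LinearRegression()),
                        ("regression tree", DecisionTreeRegressor(max_depth=6))]:
        r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        print(f"{name:>17s}: cross-validated R^2 = {r2:.3f}")
    ```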

  1. Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group

    PubMed Central

    2013-01-01

    The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction. PMID:24280020

  2. Supervision of Student Teachers: How Adequate?

    ERIC Educational Resources Information Center

    Dean, Ken

    This study attempted to ascertain how adequately student teachers are supervised by college supervisors and supervising teachers. Questions to be answered were as follows: a) How do student teachers rate the adequacy of supervision given them by college supervisors and supervising teachers? and b) Are there significant differences between ratings…

  3. Small Rural Schools CAN Have Adequate Curriculums.

    ERIC Educational Resources Information Center

    Loustaunau, Martha

    The small rural school's foremost and largest problem is providing an adequate curriculum for students in a changing world. Often the small district cannot or is not willing to pay the per-pupil cost of curriculum specialists, specialized courses using expensive equipment no more than one period a day, and remodeled rooms to accommodate new…

  4. Toward More Adequate Quantitative Instructional Research.

    ERIC Educational Resources Information Center

    VanSickle, Ronald L.

    1986-01-01

    Sets an agenda for improving instructional research conducted with classical quantitative experimental or quasi-experimental methodology. Includes guidelines regarding the role of a social perspective, adequate conceptual and operational definition, quality instrumentation, control of threats to internal and external validity, and the use of…

  5. An Adequate Education Defined. Fastback 476.

    ERIC Educational Resources Information Center

    Thomas, M. Donald; Davis, E. E. (Gene)

    Court decisions historically have dealt with educational equity; now they are helping to establish "adequacy" as a standard in education. Legislatures, however, have been slow to enact remedies. One debate over education adequacy, though, is settled: Schools are not financed at an adequate level. This fastback is divided into three sections.…

  6. Funding the Formula Adequately in Oklahoma

    ERIC Educational Resources Information Center

    Hancock, Kenneth

    2015-01-01

    This report is a longitudinal simulation study that looks at how the ratio of state support to local support affects the number of school districts that break the common school funding formula, which in turn affects the equity of distribution to the common schools. After nearly two decades of adequately supporting the funding formula, Oklahoma…

  7. Statistical analysis on extended reference method for volume holographic data storage

    NASA Astrophysics Data System (ADS)

    Dai, Foster F.; Gu, Claire

    1997-06-01

    We previously proposed a novel recording method to reduce the crosstalk noise and improve the storage capacity in angle-multiplexed volume holographic data storage. Instead of conventionally using a plane wave to record holograms, the technique employs a recording reference extending uniformly within a narrow spatial frequency bandwidth and reads the memory with a plane wave. The analytical results show that the SNR obtained by using the extended reference method is about 20 dB higher than that achieved with the conventional recording method. For interpixel crosstalk noise, a statistical analysis is presented and the SNR is given in closed form for both the point reference and the extended reference methods. Considering both interpage and interpixel crosstalk noise, we further investigate the crosstalk-limited storage density. The results show that the proposed extended reference method achieves about 10 times larger storage density than the point reference method.

  8. Autonomous Correction of Sensor Data Applied to Building Technologies Utilizing Statistical Processing Methods

    SciTech Connect

    Castello, Charles C; New, Joshua Ryan

    2012-01-01

    Autonomous detection and correction of potentially missing or corrupt sensor data is an essential concern in building technologies, since data availability and correctness are necessary to develop accurate software models for instrumented experiments. Therefore, this paper aims to address this problem by using statistical processing methods, including: (1) least squares; (2) maximum likelihood estimation; (3) segmentation averaging; and (4) threshold-based techniques. These validation schemes are applied to a subset of data collected from Oak Ridge National Laboratory's (ORNL) ZEBRAlliance research project, which comprises four single-family homes in Oak Ridge, TN, outfitted with a total of 1,218 sensors. The focus of this paper is on three different types of sensor data: (1) temperature; (2) humidity; and (3) energy consumption. Simulations illustrate that the threshold-based statistical processing method performed best in predicting temperature, humidity, and energy data.

  9. Statistical Methods for Estimating the Uncertainty in the Best Basis Inventories

    SciTech Connect

    WILMARTH, S.R.

    2000-09-07

    This document describes the statistical methods used to determine sample-based uncertainty estimates for the Best Basis Inventory (BBI). For each waste phase, the equation for the inventory of an analyte in a tank is Inventory (kg or Ci) = Concentration x Density x Waste Volume. The total inventory is the sum of the inventories in the different waste phases. Using tank sample data, statistical methods are used to obtain estimates of the mean concentration of an analyte, the density of the waste, and their standard deviations. The volumes of waste in the different phases, and their standard deviations, are estimated based on other types of data. The three estimates are multiplied to obtain the inventory estimate. The standard deviations are combined to obtain a standard deviation of the inventory. The uncertainty estimate for the BBI is the approximate 95% confidence interval on the inventory.
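
    A minimal sketch of the inventory calculation and a first-order (delta-method) uncertainty propagation for the product Concentration x Density x Volume, with purely illustrative numbers; the BBI procedure itself may differ in detail.

    ```python
    import numpy as np

    # Illustrative (hypothetical) estimates for one analyte in one waste phase
    conc,    sd_conc = 1.2e-4, 2.0e-5    # mass fraction (kg analyte / kg waste) and its SD
    density, sd_dens = 1.4,    0.05      # waste density (kg/L) and its SD
    volume,  sd_vol  = 3.0e6,  2.0e5     # waste volume (L) and its SD

    inventory = conc * density * volume  # kg of analyte in this phase

    # First-order propagation for a product of independent estimates:
    # relative variances add.
    rel_var = (sd_conc / conc) ** 2 + (sd_dens / density) ** 2 + (sd_vol / volume) ** 2
    sd_inventory = inventory * np.sqrt(rel_var)

    # Approximate 95% confidence interval on the phase inventory
    lo, hi = inventory - 1.96 * sd_inventory, inventory + 1.96 * sd_inventory
    print(f"inventory = {inventory:.1f} kg, 95% CI = ({lo:.1f}, {hi:.1f}) kg")
    ```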

  10. Statistical and optimization methods to expedite neural network training for transient identification

    SciTech Connect

    Reifman, J. . Reactor Analysis Div.); Vitela, E.J. . Inst. de Ciencias Nucleares); Lee, J.C. . Dept. of Nuclear Engineering)

    1993-01-01

    Two complementary methods, statistical feature selection and nonlinear optimization through conjugate gradients, are used to expedite feedforward neural network training. Statistical feature selection techniques in the form of linear correlation coefficients and information-theoretic entropy are used to eliminate redundant and non-informative plant parameters to reduce the size of the network. The method of conjugate gradients is used to accelerate the network training convergence and to systematically calculate the learning and momentum constants at each iteration. The proposed techniques are compared with the backpropagation algorithm using the entire set of plant parameters in the training of neural networks to identify transients simulated with the Midland Nuclear Power Plant Unit 2 simulator. By using 25% of the plant parameters and the conjugate gradients, a 30-fold reduction in CPU time was obtained without degrading the diagnostic ability of the network.
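
    A hedged sketch of correlation-based and information-theoretic (mutual information) feature ranking on simulated plant-parameter data, using scikit-learn. The variable names and data are hypothetical, and this is not the authors' code or parameter set.

    ```python
    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    rng = np.random.default_rng(6)

    # Hypothetical plant parameters: 1000 time samples x 20 sensors, with a
    # transient class label driven by only the first three sensors.
    X = rng.normal(size=(1000, 20))
    y = (X[:, 0] + 0.5 * X[:, 1] - X[:, 2] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

    # Linear correlation of each parameter with the target class
    corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])

    # Information-theoretic relevance (mutual information with the class label)
    mi = mutual_info_classif(X, y, random_state=0)

    # Keep the parameters ranked highly by either criterion
    keep = np.union1d(np.argsort(corr)[-5:], np.argsort(mi)[-5:])
    print("selected parameter indices:", keep)
    ```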

  11. Statistical and optimization methods to expedite neural network training for transient identification

    SciTech Connect

    Reifman, J.; Vitela, E.J.; Lee, J.C.

    1993-03-01

    Two complementary methods, statistical feature selection and nonlinear optimization through conjugate gradients, are used to expedite feedforward neural network training. Statistical feature selection techniques in the form of linear correlation coefficients and information-theoretic entropy are used to eliminate redundant and non-informative plant parameters to reduce the size of the network. The method of conjugate gradients is used to accelerate the network training convergence and to systematically calculate the learning and momentum constants at each iteration. The proposed techniques are compared with the backpropagation algorithm using the entire set of plant parameters in the training of neural networks to identify transients simulated with the Midland Nuclear Power Plant Unit 2 simulator. By using 25% of the plant parameters and the conjugate gradients, a 30-fold reduction in CPU time was obtained without degrading the diagnostic ability of the network.

  12. Statistical damage detection method for frame structures using a confidence interval

    NASA Astrophysics Data System (ADS)

    Li, Weiming; Zhu, Hongping; Luo, Hanbin; Xia, Yong

    2010-03-01

    A novel damage detection method is applied to a 3-story frame structure to obtain statistical quantification control criteria for the existence, location, and extent of damage. The mean, standard deviation, and exponentially weighted moving average (EWMA) are applied to detect damage information according to statistical process control (SPC) theory. It is concluded that detection is insignificant with the mean and EWMA because the structural response is neither independent nor normally distributed. On the other hand, the damage information is detected well with the standard deviation because the influence of the data distribution is less pronounced for this parameter. A suitable moderate confidence level is explored for more reliable damage location and quantification, and the impact of noise is investigated to illustrate the robustness of the method.
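
    For illustration, a minimal EWMA control-chart sketch on a simulated monitoring feature with a shift injected partway through. It shows the SPC machinery only and does not reproduce the paper's comparison of the mean, standard deviation, and EWMA statistics.

    ```python
    import numpy as np

    def ewma(x, lam=0.2):
        """Exponentially weighted moving average of a response sequence."""
        z = np.empty_like(x, dtype=float)
        z[0] = x[0]
        for i in range(1, len(x)):
            z[i] = lam * x[i] + (1 - lam) * z[i - 1]
        return z

    rng = np.random.default_rng(8)
    baseline  = rng.normal(loc=0.0, scale=1.0, size=200)   # undamaged-state feature
    monitored = rng.normal(loc=0.6, scale=1.0, size=100)   # shifted after (simulated) damage
    x = np.concatenate([baseline, monitored])

    # Control limits from the baseline (undamaged) data only
    mu, sigma, lam = baseline.mean(), baseline.std(ddof=1), 0.2
    z = ewma(x, lam)
    sigma_z = sigma * np.sqrt(lam / (2 - lam))              # asymptotic EWMA standard deviation
    out_of_control = np.flatnonzero(np.abs(z - mu) > 3 * sigma_z)

    print("first out-of-control index:", out_of_control[0] if len(out_of_control) else None)
    ```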

  13. Fractional exclusion statistics: the method for describing interacting particle systems as ideal gases

    NASA Astrophysics Data System (ADS)

    Anghel, Dragoş-Victor

    2012-11-01

    I show that if the total energy of a system of interacting particles may be written as a sum of quasiparticle energies, then the system of quasiparticles can be viewed, in general, as an ideal gas with fractional exclusion statistics (FES). The general method for calculating the FES parameters is also provided. The interacting particle system cannot be described as an ideal gas of Bose and Fermi quasiparticles except in trivial situations.

  14. A Framework for the Economic Analysis of Data Collection Methods for Vital Statistics

    PubMed Central

    Jimenez-Soto, Eliana; Hodge, Andrew; Nguyen, Kim-Huong; Dettrick, Zoe; Lopez, Alan D.

    2014-01-01

    Background Over recent years there has been a strong movement towards the improvement of vital statistics and other types of health data that inform evidence-based policies. Collecting such data is not cost free. To date there is no systematic framework to guide investment decisions on methods of data collection for vital statistics or health information in general. We developed a framework to systematically assess the comparative costs and outcomes/benefits of the various data collection methods (DCMs) for vital statistics. Methodology The proposed framework is four-pronged and utilises two major economic approaches to systematically assess the available data collection methods: cost-effectiveness analysis and efficiency analysis. We built a stylised example of a hypothetical low-income country to perform a simulation exercise in order to illustrate an application of the framework. Findings Using simulated data, the results from the stylised example show that the rankings of the data collection methods are not affected by the use of either cost-effectiveness or efficiency analysis. However, the rankings are affected by how quantities are measured. Conclusion There have been several calls for global improvements in collecting useable data, including vital statistics, from health information systems to inform public health policies. Ours is the first study that proposes a systematic framework to assist countries in undertaking an economic evaluation of DCMs. Despite numerous challenges, we demonstrate that a systematic assessment of outputs and costs of DCMs is not only necessary, but also feasible. The proposed framework is general enough to be easily extended to other areas of health information. PMID:25171152

  15. A new non-invasive statistical method to assess the spontaneous cardiac baroreflex in humans.

    PubMed

    Ducher, M; Fauvel, J P; Gustin, M P; Cerutti, C; Najem, R; Cuisinaud, G; Laville, M; Pozet, N; Paultre, C Z

    1995-06-01

    1. A new method was developed to evaluate cardiac baroreflex sensitivity. The association of a high systolic blood pressure with a low heart rate or the converse is considered to be under the influence of cardiac baroreflex activity. This method is based on the determination of the statistical dependence between systolic blood pressure and heart rate values obtained non-invasively by a Finapres device. Our computerized analysis selects the associations with the highest statistical dependence. A 'Z-coefficient' quantifies the strength of the statistical dependence. The slope of the linear regression, computed on these selected associations, is used to estimate baroreflex sensitivity. 2. The present study was carried out in 11 healthy resting male subjects. The results obtained by the 'Z-coefficient' method were compared with those obtained by cross-spectrum analysis, which has already been validated in humans. Furthermore, the reproducibility of both methods was checked after 1 week. 3. The results obtained by the two methods were significantly correlated (r = 0.78 for the first and r = 0.76 for the second experiment, P < 0.01). When repeated after 1 week, the average results were not significantly different. Considering individual results, test-retest correlation coefficients were higher with the Z-analysis (r = 0.79, P < 0.01) than with the cross-spectrum analysis (r = 0.61, P < 0.05). 4. In conclusion, as the Z-method gives results similar to but more reproducible than the cross-spectrum method, it might be a powerful and reliable tool to assess baroreflex sensitivity in humans.

  16. Nonparametric statistical methods for comparing two sites based on data with multiple non-detect limits

    NASA Astrophysics Data System (ADS)

    Millard, Steven P.; Deverel, Steven J.

    1988-12-01

    As concern over the effects of trace amounts of pollutants has increased, so has the need for statistical methods that deal appropriately with data that include values reported as "less than" the detection limit. It has become increasingly common for water quality data to include censored values that reflect more than one detection limit for a single analyte. For such multiply censored data sets, standard statistical methods (for example, to compare analyte concentration in two areas) are not valid. In such cases, methods from the biostatistical field of survival analysis are applicable. Several common two-sample censored data rank tests are explained, and their behaviors are studied via a Monte Carlo simulation in which sample sizes and censoring mechanisms are varied under an assumed lognormal distribution. These tests are applied to shallow groundwater chemistry data from two sites in the San Joaquin Valley, California. The best overall test, in terms of maintained α level, is the normal scores test based on a permutation variance. In cases where the α level is maintained, however, the Peto-Prentice statistic based on an asymptotic variance performs as well or better.
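
    A small self-contained sketch of one such rank test may be useful: a Gehan-style generalized Wilcoxon comparison of two multiply left-censored samples, with a permutation p-value. This is an illustrative stand-in for the suite of tests studied (normal scores, Peto-Prentice, etc.); the data are synthetic, and True marks a value reported as below its detection limit.

    import numpy as np

    def gehan_score(a, ca, b, cb):
        # Pairwise Gehan score for left-censored data: a, b are reported values,
        # ca, cb are True when the value is a detection limit (true value below it).
        if not ca and not cb:
            return np.sign(a - b)
        if ca and not cb:            # a < DL_a: definitely smaller only if DL_a <= b
            return -1.0 if a <= b else 0.0
        if not ca and cb:            # b < DL_b: a definitely larger only if a >= DL_b
            return 1.0 if a >= b else 0.0
        return 0.0                   # both censored: indeterminate

    def gehan_statistic(x, cx, y, cy):
        return sum(gehan_score(a, ca, b, cb)
                   for a, ca in zip(x, cx) for b, cb in zip(y, cy))

    def permutation_pvalue(x, cx, y, cy, n_perm=2000, seed=0):
        # Two-sided permutation p-value obtained by shuffling the site labels.
        rng = np.random.default_rng(seed)
        vals, cens, n1 = np.concatenate([x, y]), np.concatenate([cx, cy]), len(x)
        observed = abs(gehan_statistic(x, cx, y, cy))
        count = 0
        for _ in range(n_perm):
            idx = rng.permutation(len(vals))
            v, c = vals[idx], cens[idx]
            if abs(gehan_statistic(v[:n1], c[:n1], v[n1:], c[n1:])) >= observed:
                count += 1
        return (count + 1) / (n_perm + 1)

    # Two sites with different detection limits; True marks "< detection limit".
    site1 = np.array([0.5, 1.2, 0.1, 2.3, 0.1, 0.8])
    cens1 = np.array([False, False, True, False, True, False])
    site2 = np.array([1.5, 2.8, 0.5, 3.1, 0.5, 2.2])
    cens2 = np.array([False, False, True, False, True, False])
    print("two-sided permutation p-value:", permutation_pvalue(site1, cens1, site2, cens2))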

  17. Statistical Methods for Estimation of Direct and Differential Kinematics of the Vocal Tract

    PubMed Central

    Lammert, Adam; Goldstein, Louis; Narayanan, Shrikanth; Iskarous, Khalil

    2012-01-01

    We present and evaluate two statistical methods for estimating kinematic relationships of the speech production system: Artificial Neural Networks and Locally-Weighted Regression. The work is motivated by the need to characterize this motor system, with particular focus on estimating differential aspects of kinematics. Kinematic analysis will facilitate progress in a variety of areas, including the nature of speech production goals, articulatory redundancy and, relatedly, acoustic-to-articulatory inversion. Statistical methods must be used to estimate these relationships from data since they are infeasible to express in closed form. Statistical models are optimized and evaluated – using a heldout data validation procedure – on two sets of synthetic speech data. The theoretical and practical advantages of both methods are also discussed. It is shown that both direct and differential kinematics can be estimated with high accuracy, even for complex, nonlinear relationships. Locally-Weighted Regression displays the best overall performance, which may be due to practical advantages in its training procedure. Moreover, accurate estimation can be achieved using only a modest amount of training data, as judged by convergence of performance. The algorithms are also applied to real-time MRI data, and the results are generally consistent with those obtained from synthetic data. PMID:24052685
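
    A minimal numpy sketch of locally-weighted linear regression that returns both the predicted output and the local Jacobian, i.e. an estimate of the differential kinematics, illustrates the second method; the toy input-output map, kernel and bandwidth are assumptions of this sketch, not the vocal-tract model of the article.

    import numpy as np

    def lwr_predict(X, Y, x_query, bandwidth=0.3):
        # Locally-weighted linear regression around x_query.
        # X: (n, d) inputs, Y: (n, k) outputs. Returns the prediction (k,) and the
        # local Jacobian (k, d), i.e. the differential kinematics implied by the fit.
        diffs = X - x_query
        w = np.exp(-0.5 * np.sum(diffs ** 2, axis=1) / bandwidth ** 2)
        A = np.hstack([np.ones((len(X), 1)), diffs])      # local affine basis
        sw = np.sqrt(w)[:, None]                          # weighted least squares
        beta, *_ = np.linalg.lstsq(sw * A, sw * Y, rcond=None)
        return beta[0], beta[1:].T

    # Toy nonlinear "kinematic" map with 2 inputs and 1 output.
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(500, 2))
    Y = (np.sin(2.0 * X[:, 0]) + X[:, 1] ** 2)[:, None] + rng.normal(0.0, 0.01, (500, 1))
    x0 = np.array([0.2, -0.3])
    y_hat, J = lwr_predict(X, Y, x0)
    print("prediction:", y_hat, "estimated Jacobian:", J)
    # The analytic Jacobian at x0 is [2*cos(0.4), -0.6], roughly [1.84, -0.60].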

  18. Identifying minefields and verifying clearance: adapting statistical methods for UXO target detection

    NASA Astrophysics Data System (ADS)

    Gilbert, Richard O.; O'Brien, Robert F.; Wilson, John E.; Pulsipher, Brent A.; McKinstry, Craig A.

    2003-09-01

    It may not be feasible to completely survey large tracts of land suspected of containing minefields. It is desirable to develop a characterization protocol that will confidently identify minefields within these large land tracts if they exist. Naturally, surveying areas of greatest concern and most likely locations would be necessary but will not provide the needed confidence that an unknown minefield had not eluded detection. Once minefields are detected, methods are needed to bound the area that will require detailed mine detection surveys. The US Department of Defense Strategic Environmental Research and Development Program (SERDP) is sponsoring the development of statistical survey methods and tools for detecting potential UXO targets. These methods may be directly applicable to demining efforts. Statistical methods are employed to determine the optimal geophysical survey transect spacing to have confidence of detecting target areas of a critical size, shape, and anomaly density. Other methods under development determine the proportion of a land area that must be surveyed to confidently conclude that there are no UXO present. Adaptive sampling schemes are also being developed as an approach for bounding the target areas. These methods and tools will be presented and the status of relevant research in this area will be discussed.

  19. Evaluation of Statistical Rainfall Disaggregation Methods Using Rain-Gauge Information for West-Central Florida

    SciTech Connect

    Murch, Renee Rokicki; Zhang, Jing; Ross, Mark; Ganguly, Auroop R; Nachabe, Mahmood

    2008-01-01

    Rainfall disaggregation in time can be useful for the simulation of hydrologic systems and the prediction of floods and flash floods. Disaggregation of rainfall to timescales less than 1 h can be especially useful for small urbanized watershed study, and for continuous hydrologic simulations and when Hortonian or saturation-excess runoff dominates. However, the majority of rain gauges in any region record rainfall in daily time steps or, very often, hourly records have extensive missing data. Also, the convective nature of the rainfall can result in significant differences in the measured rainfall at nearby gauges. This study evaluates several statistical approaches for rainfall disaggregation which may be applicable using data from West-Central Florida, specifically from 1 h observations to 15 min records, and proposes new methodologies that have the potential to outperform existing approaches. Four approaches are examined. The first approach is an existing direct scaling method that utilizes observed 15 min rainfall at secondary rain gauges, to disaggregate observed 1 h rainfall at more numerous primary rain gauges. The second approach is an extension of an existing method for continuous rainfall disaggregation through statistical distributional assumptions. The third approach relies on artificial neural networks for the disaggregation process without sorting and the fourth approach extends the neural network methods through statistical preprocessing via new sorting and desorting schemes. The applicability and performance of these methods were evaluated using information from a fairly dense rain gauge network in West-Central Florida. Of the four methods compared, the sorted neural networks and the direct scaling method predicted peak rainfall magnitudes significantly better than the remaining techniques. The study also suggests that desorting algorithms would also be useful to randomly replace the artificial hyetograph within a rainfall period.

  20. The breaking load method - Results and statistical modification from the ASTM interlaboratory test program

    NASA Technical Reports Server (NTRS)

    Colvin, E. L.; Emptage, M. R.

    1992-01-01

    The breaking load test provides quantitative stress corrosion cracking data by determining the residual strength of tension specimens that have been exposed to corrosive environments. Eight laboratories have participated in a cooperative test program under the auspices of ASTM Committee G-1 to evaluate the new test method. All eight laboratories were able to distinguish between three tempers of aluminum alloy 7075. The statistical analysis procedures that were used in the test program do not work well in all situations. An alternative procedure using Box-Cox transformations shows a great deal of promise. An ASTM standard method has been drafted which incorporates the Box-Cox procedure.
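
    A short scipy sketch of the statistical step described above: a Box-Cox transformation estimated by maximum likelihood, followed by a conventional one-way comparison of the transformed residual-strength data. The three groups are synthetic stand-ins, not data from the interlaboratory program.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    # Synthetic residual-strength data for three tempers (skewed, unequal spread).
    temper_a = rng.lognormal(mean=6.2, sigma=0.10, size=20)
    temper_b = rng.lognormal(mean=6.0, sigma=0.20, size=20)
    temper_c = rng.lognormal(mean=5.7, sigma=0.35, size=20)

    pooled = np.concatenate([temper_a, temper_b, temper_c])
    transformed, lam = stats.boxcox(pooled)          # lambda estimated by maximum likelihood
    print(f"estimated Box-Cox lambda: {lam:.3f}")

    # Compare the tempers on the transformed scale with a one-way ANOVA.
    g1, g2, g3 = np.split(transformed, [20, 40])
    f_stat, p_value = stats.f_oneway(g1, g2, g3)
    print(f"ANOVA on transformed data: F = {f_stat:.2f}, p = {p_value:.2e}")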

  1. Sharpening method of satellite thermal image based on the geographical statistical model

    NASA Astrophysics Data System (ADS)

    Qi, Pengcheng; Hu, Shixiong; Zhang, Haijun; Guo, Guangmeng

    2016-04-01

    To improve the effectiveness of thermal sharpening in mountainous regions, paying more attention to the laws of land surface energy balance, a thermal sharpening method based on the geographical statistical model (GSM) is proposed. Explanatory variables were selected from the processes of land surface energy budget and thermal infrared electromagnetic radiation transmission, then high spatial resolution (57 m) raster layers were generated for these variables through spatially simulating or using other raster data as proxies. Based on this, the local adaptation statistical relationship between brightness temperature (BT) and the explanatory variables, i.e., the GSM, was built at 1026-m resolution using the method of multivariate adaptive regression splines. Finally, the GSM was applied to the high-resolution (57-m) explanatory variables; thus, the high-resolution (57-m) BT image was obtained. This method produced a sharpening result with low error and good visual effect. The method can avoid the blind choice of explanatory variables and remove the dependence on synchronous imagery at visible and near-infrared bands. The influences of the explanatory variable combination, sampling method, and the residual error correction on sharpening results were analyzed deliberately, and their influence mechanisms are reported herein.
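
    A schematic numpy sketch of the coarse-to-fine workflow described above: fit a statistical model between coarse-resolution BT and block-aggregated explanatory variables, apply it to the fine-resolution variables, then push the coarse-scale residual back onto the fine grid. Ordinary least squares is used here as a simple stand-in for multivariate adaptive regression splines, and the grid sizes, aggregation factor and predictors are illustrative assumptions.

    import numpy as np

    def aggregate(fine, factor):
        # Block-average a fine-resolution raster down to the coarse grid.
        h, w = fine.shape[0] // factor, fine.shape[1] // factor
        return fine[:h * factor, :w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

    def sharpen(bt_coarse, predictors_fine, factor):
        # 1. Fit BT ~ explanatory variables at the coarse scale (OLS as a MARS stand-in).
        Xc = np.column_stack([aggregate(p, factor).ravel() for p in predictors_fine])
        Xc = np.column_stack([np.ones(len(Xc)), Xc])
        coef, *_ = np.linalg.lstsq(Xc, bt_coarse.ravel(), rcond=None)
        # 2. Apply the fitted model to the fine-resolution explanatory variables.
        Xf = np.column_stack([p.ravel() for p in predictors_fine])
        Xf = np.column_stack([np.ones(len(Xf)), Xf])
        bt_fine = (Xf @ coef).reshape(predictors_fine[0].shape)
        # 3. Residual correction: add the coarse-scale residual back onto the fine grid.
        residual = bt_coarse - aggregate(bt_fine, factor)
        return bt_fine + np.kron(residual, np.ones((factor, factor)))

    # Toy example: an 18x18 coarse BT image sharpened to 54x54 with two synthetic predictors.
    rng = np.random.default_rng(3)
    elevation = rng.normal(0.0, 1.0, (54, 54))
    vegetation = rng.normal(0.0, 1.0, (54, 54))
    bt_coarse = 300.0 - 2.0 * aggregate(elevation, 3) + 1.5 * aggregate(vegetation, 3)
    bt_sharp = sharpen(bt_coarse, [elevation, vegetation], factor=3)
    print(bt_sharp.shape, bt_sharp.mean())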

  2. A robust vector field correction method via a mixture statistical model of PIV signal

    NASA Astrophysics Data System (ADS)

    Lee, Yong; Yang, Hua; Yin, Zhouping

    2016-03-01

    Outliers (spurious vectors) are a common problem in practical velocity field measurement using particle image velocimetry (PIV), and they should be validated and replaced by reliable values. One of the most challenging problems is to correctly label the outliers when measurement noise exists or the flow becomes turbulent. Moreover, outliers often occur in clusters, which makes it difficult to pick out all of them. Most current methods validate and correct the outliers using local statistical models in a single pass. In this work, a vector field correction (VFC) method is proposed directly from a mixture statistical model of the PIV signal. This problem is formulated as a maximum a posteriori (MAP) estimation of a Bayesian model with hidden/latent variables labeling the outliers in the original field. The solution of this MAP estimation, i.e., the outlier set and the restored flow field, is optimized iteratively using an expectation-maximization algorithm. We illustrated this VFC method on two kinds of synthetic velocity fields and two kinds of experimental data and demonstrated that it is robust to a very large number of outliers (even up to 60 %). Besides, the proposed VFC method has high accuracy and excellent compatibility for clustered outliers, compared with the state-of-the-art methods. Our VFC algorithm is computationally efficient, and the corresponding Matlab code is provided for others to use. In addition, our approach is general and can be seamlessly extended to three-dimensional three-component (3D3C) PIV data.
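
    A simplified numpy/scipy sketch of the underlying idea: residuals from a locally smoothed field are modeled with a two-component (Gaussian inlier plus uniform outlier) mixture fitted by expectation-maximization, and vectors with a high posterior outlier probability are replaced. The median-filter reference field, the uniform outlier range and the 0.5 decision threshold are assumptions of this sketch, not the authors' full Bayesian vector-field model.

    import numpy as np
    from scipy.ndimage import median_filter

    def em_outlier_detection(u, v, n_iter=30, outlier_span=10.0):
        # Label spurious PIV vectors with a Gaussian(inlier)+uniform(outlier) mixture
        # fitted by EM on residuals from a median-smoothed field.
        res = np.stack([u - median_filter(u, 3), v - median_filter(v, 3)], axis=-1)
        r2 = np.sum(res ** 2, axis=-1)
        sigma2, pi_out = r2.mean(), 0.1
        p_unif = 1.0 / outlier_span ** 2        # uniform density over an assumed residual range
        for _ in range(n_iter):
            # E-step: posterior probability that each vector is an outlier.
            p_in = (1.0 - pi_out) * np.exp(-0.5 * r2 / sigma2) / (2.0 * np.pi * sigma2)
            p_out = pi_out * p_unif
            gamma = p_out / (p_in + p_out)
            # M-step: update inlier variance and outlier fraction.
            w = 1.0 - gamma
            sigma2 = np.sum(w * r2) / (2.0 * np.sum(w) + 1e-12)
            pi_out = gamma.mean()
        outliers = gamma > 0.5
        # Simple correction step: replace flagged vectors by the locally smoothed field.
        u_corr, v_corr = u.copy(), v.copy()
        u_corr[outliers] = median_filter(u, 3)[outliers]
        v_corr[outliers] = median_filter(v, 3)[outliers]
        return outliers, u_corr, v_corr

    # Toy field with a small cluster of spurious vectors.
    rng = np.random.default_rng(4)
    y, x = np.mgrid[0:32, 0:32]
    u = np.sin(x / 8.0) + rng.normal(0.0, 0.05, x.shape)
    v = np.cos(y / 8.0) + rng.normal(0.0, 0.05, y.shape)
    u[10:13, 10:13] += rng.uniform(-5.0, 5.0, (3, 3))
    outliers, u_corr, v_corr = em_outlier_detection(u, v)
    print("flagged outliers:", int(outliers.sum()))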

  3. Advances in statistical methods to map quantitative trait loci in outbred populations.

    PubMed

    Hoeschele, I; Uimari, P; Grignola, F E; Zhang, Q; Gage, K M

    1997-11-01

    Statistical methods to map quantitative trait loci (QTL) in outbred populations are reviewed, extensions and applications to human and plant genetic data are indicated, and areas for further research are identified. Simple and computationally inexpensive methods include (multiple) linear regression of phenotype on marker genotypes and regression of squared phenotypic differences among relative pairs on estimated proportions of identity-by-descent at a locus. These methods are less suited for genetic parameter estimation in outbred populations but allow the determination of test statistic distributions via simulation or data permutation; however, further inferences including confidence intervals of QTL location require the use of Monte Carlo or bootstrap sampling techniques. A method which is intermediate in computational requirements is residual maximum likelihood (REML) with a covariance matrix of random QTL effects conditional on information from multiple linked markers. Testing for the number of QTLs on a chromosome is difficult in a classical framework. The computationally most demanding methods are maximum likelihood and Bayesian analysis, which take account of the distribution of multilocus marker-QTL genotypes on a pedigree and permit investigators to fit different models of variation at the QTL. The Bayesian analysis includes the number of QTLs on a chromosome as an unknown.

  4. Performance of statistical methods to correct food intake distribution: comparison between observed and estimated usual intake.

    PubMed

    Verly-Jr, Eliseu; Oliveira, Dayan C R S; Fisberg, Regina M; Marchioni, Dirce Maria L

    2016-09-01

    There are statistical methods that remove the within-person random error and estimate the usual intake when there is a second 24-h recall (24HR) for at least a subsample of the study population. We aimed to compare the distribution of usual food intake estimated by statistical models with the distribution of observed usual intake. A total of 302 individuals from Rio de Janeiro (Brazil) answered twenty, non-consecutive 24HR; the average length of follow-up was 3 months. The usual food intake was considered as the average of the 20 collection days of food intake. Using data sets with a pair of 2 collection days, usual percentiles of intake of the selected foods using two methods were estimated (National Cancer Institute (NCI) method and Multiple Source Method (MSM)). These estimates were compared with the percentiles of the observed usual intake. Selected foods comprised a range of parameter distributions: skewness, percentage of zero intakes and within- and between-person intakes. Both methods performed well but failed in some situations. In most cases, NCI and MSM produced similar percentiles between each other and values very close to the true intake, and they better represented the usual intake compared with 2-d mean. The smallest precision was observed in the upper tail of the distribution. In spite of the underestimation and overestimation of percentiles of intake, from a public health standpoint, these biases appear not to be of major concern. PMID:27523187

  6. Adaptive contour-based statistical background subtraction method for moving target detection in infrared video sequences

    NASA Astrophysics Data System (ADS)

    Akula, Aparna; Khanna, Nidhi; Ghosh, Ripul; Kumar, Satish; Das, Amitava; Sardana, H. K.

    2014-03-01

    A robust contour-based statistical background subtraction method for the detection of non-uniform thermal targets in infrared imagery is presented. The first step of the method is the generation of a background frame using statistical information from an initial set of frames containing no targets. The generated background frame is made adaptive by continuously updating the background using the motion information of the scene. The background subtraction method, followed by a clutter rejection stage, ensures the detection of foreground objects. The next step is the detection of contours and the distinction of target boundaries from the noisy background. This is achieved by using the Canny edge detector to extract the contours, followed by a k-means clustering approach to differentiate the object contours from the background contours. The post-processing step uses a morphological edge-linking approach to close any broken contours, and finally flood fill is performed to generate the silhouettes of moving targets. The method is validated on infrared video data containing a variety of moving targets. Experimental results demonstrate a high detection rate with minimal false alarms, establishing the robustness of the proposed method.
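
    A schematic numpy/scipy sketch of the statistical background model and its adaptive update is given below; the contour extraction with the Canny detector, the k-means clustering and the flood-fill stages are omitted, and the threshold k and learning rate alpha are illustrative values of this sketch.

    import numpy as np
    from scipy import ndimage

    def build_background(frames):
        # Per-pixel mean and standard deviation from target-free initial frames.
        stack = np.stack(frames).astype(float)
        return stack.mean(axis=0), stack.std(axis=0) + 1e-6

    def detect_and_update(frame, mean, std, k=3.0, alpha=0.02):
        # Flag pixels departing from the statistical background, clean the mask
        # morphologically, and update the background only where no target is present.
        frame = frame.astype(float)
        foreground = np.abs(frame - mean) > k * std
        foreground = ndimage.binary_opening(foreground, structure=np.ones((3, 3)))
        foreground = ndimage.binary_fill_holes(foreground)
        bg = ~foreground
        mean[bg] = (1.0 - alpha) * mean[bg] + alpha * frame[bg]
        std[bg] = np.sqrt((1.0 - alpha) * std[bg] ** 2 + alpha * (frame[bg] - mean[bg]) ** 2)
        return foreground, mean, std

    # Toy IR-like sequence: noisy background plus a warm non-uniform blob.
    rng = np.random.default_rng(5)
    init = [rng.normal(100.0, 2.0, (64, 64)) for _ in range(20)]
    mean, std = build_background(init)
    frame = rng.normal(100.0, 2.0, (64, 64))
    frame[30:38, 40:48] += 25.0           # thermal target
    mask, mean, std = detect_and_update(frame, mean, std)
    print("detected target pixels:", int(mask.sum()))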

  7. An Embedded Statistical Method for Coupling Molecular Dynamics and Finite Element Analyses

    NASA Technical Reports Server (NTRS)

    Saether, E.; Glaessgen, E.H.; Yamakov, V.

    2008-01-01

    The coupling of molecular dynamics (MD) simulations with finite element methods (FEM) yields computationally efficient models that link fundamental material processes at the atomistic level with continuum field responses at higher length scales. The theoretical challenge involves developing a seamless connection along an interface between two inherently different simulation frameworks. Various specialized methods have been developed to solve particular classes of problems. Many of these methods link the kinematics of individual MD atoms with FEM nodes at their common interface, necessarily requiring that the finite element mesh be refined to atomic resolution. Some of these coupling approaches also require simulations to be carried out at 0 K and restrict modeling to two-dimensional material domains due to difficulties in simulating full three-dimensional material processes. In the present work, a new approach to MD-FEM coupling is developed based on a restatement of the standard boundary value problem used to define a coupled domain. The method replaces a direct linkage of individual MD atoms and finite element (FE) nodes with a statistical averaging of atomistic displacements in local atomic volumes associated with each FE node in an interface region. The FEM and MD computational systems are effectively independent and communicate only through an iterative update of their boundary conditions. With the use of statistical averages of the atomistic quantities to couple the two computational schemes, the developed approach is referred to as an embedded statistical coupling method (ESCM). ESCM provides an enhanced coupling methodology that is inherently applicable to three-dimensional domains, avoids discretization of the continuum model to atomic scale resolution, and permits finite temperature states to be applied.

  8. Direct comparison of two statistical methods for determination of evoked-potential thresholds

    NASA Astrophysics Data System (ADS)

    Langford, Ted L.; Patterson, James H., Jr.

    1994-07-01

    Several statistical procedures have been proposed as objective methods for determining evoked-potential thresholds. Data have been presented to support each of the methods, but there have not been direct comparisons using the same data. The goal of the present study was to evaluate correlation and variance ratio statistics using common data. A secondary goal was to evaluate the utility of a derived potential for determining thresholds. Chronic, bipolar electrodes were stereotaxically implanted in the inferior colliculi of six chinchillas. Evoked potentials were obtained at 0.25, 0.5, 1.0, 2.0, 4.0 and 8.0 kHz using 12-ms tone bursts and 12-ms tone bursts superimposed on 120-ms pedestal tones which were of the same frequency as the bursts, but lower in amplitude by 15 dB. Alternate responses were averaged in blocks of 200 to 4000 depending on the size of the response. Correlations were calculated for the pairs of averages. A response was deemed present if the correlation coefficient reached the 0.05 level of significance in 4000 or fewer averages. Threshold was defined as the mean of the level at which the correlation was significant and a level 5 dB below that at which it was not. Variance ratios were calculated as described by Elberling and Don (1984) using the same data. Averaged tone burst and tone burst-plus pedestal data were differenced and the resulting waveforms subjected to the same statistical analyses described above. All analyses yielded thresholds which were essentially the same as those obtained using behavioral methods. When the difference between stimulus durations is taken into account, however, evoked-potential methods produced lower thresholds than behavioral methods.
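
    A minimal scipy sketch of the correlation criterion described above: alternate sweeps are averaged into two buffers and a response is declared present when the correlation between the buffers reaches the 0.05 significance level. The simulated waveform, noise level and sweep counts are illustrative, and the variance-ratio statistic is not reproduced here.

    import numpy as np
    from scipy import stats

    def response_present(sweeps, alpha=0.05):
        # Split alternate sweeps into two buffers, average each, and test whether
        # the two averages are significantly (positively) correlated.
        a = sweeps[0::2].mean(axis=0)
        b = sweeps[1::2].mean(axis=0)
        r, p = stats.pearsonr(a, b)
        return (p < alpha and r > 0), r, p

    # Simulated evoked-potential sweeps: a damped oscillation buried in noise.
    rng = np.random.default_rng(13)
    t = np.linspace(0.0, 0.012, 240)                  # 12-ms epoch
    template = np.exp(-t / 0.004) * np.sin(2 * np.pi * 500 * t)

    def make_sweeps(n, amplitude):
        return amplitude * template + rng.normal(0.0, 1.0, size=(n, t.size))

    for label, amp in [("weak response", 0.05), ("strong response", 0.3)]:
        present, r, p = response_present(make_sweeps(4000, amp))
        print(f"{label}: r = {r:.3f}, p = {p:.1e}, detected = {present}")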

  9. A novel method for fast and statistically verified morphological characterization of filamentous fungi.

    PubMed

    Posch, Andreas E; Spadiut, Oliver; Herwig, Christoph

    2012-07-01

    Along with productivity and physiology, morphological growth behavior is the key parameter in bioprocess design for filamentous fungi. Despite complex interactions between fungal morphology, broth viscosity, mixing kinetics, transport characteristics and process productivity, morphology is still commonly tackled only by empirical trial-and-error techniques during strain selection and process development procedures. In fact, morphological growth characteristics are investigated by computational analysis of only a limited number of pre-selected microscopic images or via manual evaluation of images, which causes biased results and does not allow any automation or high-throughput quantification. To overcome the lack of tools for fast, reliable and quantitative morphological analysis, this work introduces a method enabling statistically verified quantification of fungal morphology in accordance with Quality by Design principles. The novel, high-throughput method presented here interlinks fully automated recording of microscopic images with a newly developed evaluation approach reducing the need for manual intervention to a minimum. Validity of results is ensured by concomitantly testing the acquired sample for representativeness by statistical inference via bootstrap analysis. The novel approach for statistical verification can be equally applied as control logic to automatically proceed with morphological analysis of a consecutive sample once user defined acceptance criteria are met. Hence, analysis time can be reduced to an absolute minimum. The quantitative potential of the developed methodology is demonstrated by characterizing the morphological growth behavior of two industrial Penicillium chrysogenum production strains in batch cultivation.
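
    A minimal numpy sketch of the statistical-verification idea: bootstrap the per-object morphology measurements and accept the sample as representative once the confidence interval of the summary statistic is sufficiently tight. The morphology metric (projected area), the 10% relative-width criterion and the lognormal test data are assumptions of this sketch.

    import numpy as np

    def bootstrap_ci(values, stat=np.mean, n_boot=2000, alpha=0.05, seed=0):
        # Percentile bootstrap confidence interval for a morphology summary statistic.
        rng = np.random.default_rng(seed)
        boots = np.array([stat(rng.choice(values, size=len(values), replace=True))
                          for _ in range(n_boot)])
        return np.quantile(boots, [alpha / 2, 1 - alpha / 2])

    def sample_is_representative(values, rel_width=0.10):
        # Accept the acquired sample once the relative CI width of the mean drops
        # below a user-defined criterion (10% here, an assumed value).
        lo, hi = bootstrap_ci(values)
        return (hi - lo) / np.mean(values) < rel_width

    # Simulated projected areas of fungal objects from automatically recorded images.
    rng = np.random.default_rng(6)
    areas_small = rng.lognormal(3.0, 0.8, 50)       # few objects: CI still wide
    areas_large = rng.lognormal(3.0, 0.8, 2000)     # many objects: CI narrow
    print("50 objects representative? ", sample_is_representative(areas_small))
    print("2000 objects representative?", sample_is_representative(areas_large))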

  10. Statistical Methods and Tools for Uxo Characterization (SERDP Final Technical Report)

    SciTech Connect

    Pulsipher, Brent A.; Gilbert, Richard O.; Wilson, John E.; Hassig, Nancy L.; Carlson, Deborah K.; O'Brien, Robert F.; Bates, Derrick J.; Sandness, Gerald A.; Anderson, Kevin K.

    2004-11-15

    The Strategic Environmental Research and Development Program (SERDP) issued a statement of need for FY01 titled Statistical Sampling for Unexploded Ordnance (UXO) Site Characterization that solicited proposals to develop statistically valid sampling protocols for cost-effective, practical, and reliable investigation of sites contaminated with UXO; protocols that could be validated through subsequent field demonstrations. The SERDP goal was the development of a sampling strategy for which a fraction of the site is initially surveyed by geophysical detectors to confidently identify clean areas and subsections (target areas, TAs) that had elevated densities of anomalous geophysical detector readings that could indicate the presence of UXO. More detailed surveys could then be conducted to search the identified TAs for UXO. SERDP funded three projects: those proposed by the Pacific Northwest National Laboratory (PNNL) (SERDP Project No. UXO 1199), Sandia National Laboratory (SNL), and Oak Ridge National Laboratory (ORNL). The projects were closely coordinated to minimize duplication of effort and facilitate use of shared algorithms where feasible. This final report for PNNL Project 1199 describes the methods developed by PNNL to address SERDP's statement-of-need for the development of statistically-based geophysical survey methods for sites where 100% surveys are unattainable or cost prohibitive.

  11. Performance of statistical methods for analysing survival data in the presence of non-random compliance.

    PubMed

    Odondi, Lang'o; McNamee, Roseanne

    2010-12-20

    Noncompliance often complicates estimation of treatment efficacy from randomized trials. Under random noncompliance, per protocol analyses or even simple regression adjustments for noncompliance could be adequate for causal inference, but special methods are needed when noncompliance is related to risk. For survival data, Robins and Tsiatis introduced the semi-parametric structural Causal Accelerated Life Model (CALM), which allows time-dependent departures from randomized treatment in either arm and relates each observed event time to a potential event time that would have been observed if the control treatment had been given throughout the trial. Alternatively, Loeys and Goetghebeur developed a structural Proportional Hazards (C-Prophet) model for when there is all-or-nothing noncompliance in the treatment arm only. White et al. proposed a 'complier average causal effect' method for Proportional Hazards estimation which allows time-dependent departures from randomized treatment in the active arm. A time-invariant version of this estimator (CHARM) consists of a simple adjustment to the Intention-to-Treat hazard ratio estimate. We used simulation studies mimicking a randomized controlled trial of active treatment versus control with censored time-to-event data, under both random and non-random time-dependent noncompliance, to evaluate the performance of these methods in terms of 95 per cent confidence interval coverage, bias and root mean square error (RMSE). All methods performed well in terms of bias, even the C-Prophet used after treating time-varying compliance as all-or-nothing. Coverage of the latter method, as implemented in Stata, was too low. The CALM method performed best in terms of bias and coverage but had the largest RMSE. PMID:20963732

  12. Boosting Bayesian parameter inference of stochastic differential equation models with methods from statistical physics

    NASA Astrophysics Data System (ADS)

    Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

    2016-04-01

    Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to the most sophisticated methods
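
    A minimal Euler-Maruyama simulation of the toy model as described (a linear reservoir driven by constant input, with Gaussian white noise whose standard deviation scales linearly with the stored volume) is sketched below. Parameter values are illustrative, and the Hamiltonian Monte Carlo inference itself is not reproduced here.

    import numpy as np

    def simulate_reservoir(n_steps=20000, dt=0.01, k=1.0, inflow=1.0, sigma=0.3, v0=1.0, seed=7):
        # Euler-Maruyama integration of dV = (inflow - V/k) dt + sigma * V dW.
        rng = np.random.default_rng(seed)
        v = np.empty(n_steps)
        v[0] = v0
        for i in range(1, n_steps):
            dW = rng.normal(0.0, np.sqrt(dt))
            v[i] = v[i - 1] + (inflow - v[i - 1] / k) * dt + sigma * v[i - 1] * dW
            v[i] = max(v[i], 0.0)          # stored volume stays non-negative
        return v

    v = simulate_reservoir()
    runoff = v / 1.0                        # linear reservoir: outflow proportional to storage
    print("mean runoff:", runoff.mean())
    # Even with constant input, the multiplicative noise skews the output distribution.
    print("skew proxy (mean - median):", runoff.mean() - np.median(runoff))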

  13. Application of the non-equilibrium statistical operator method (NESOM) to dissipation atomic force microscopy

    NASA Astrophysics Data System (ADS)

    Mo, M. Y.; Kantorovich, L.

    2001-02-01

    We apply the non-equilibrium statistical operator method to non-contact atomic force microscopy, considering explicitly the statistical effects of (classical) vibrations of surface atoms and associated energy transfer from the tip to the surface. We derive several, physically and mathematically equivalent, forms of the equation of motion for the tip, each containing a friction term due to the so-called intrinsic mechanism of energy dissipation first suggested by Gauthier and Tsukada. Our exact treatment supports the results of some earlier work which were all approximate. We also demonstrate, using the same theory, that the distribution function of the tip in the coordinate-momentum phase subspace is governed by the Fokker-Planck equation and should be considered as strongly peaked around the exact time-dependent values of the momentum and the position of the tip, respectively.

  14. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Astrophysics Data System (ADS)

    Feigelson, Eric D.

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  15. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.

    1992-01-01

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  16. Shell Model Nuclear Level Densities using the Methods of Statistical Spectroscopy

    NASA Astrophysics Data System (ADS)

    Karampagia, Sofia; Sen'kov, Roman; Zelevinsky, Vladimir; Brown, Alex B.

    2016-03-01

    An algorithm has been developed for the calculation of spin- and parity-dependent nuclear level densities, based on a two-body shell-model Hamiltonian. Instead of diagonalizing the full shell-model Hamiltonian, the algorithm uses methods of statistical spectroscopy in order to derive nuclear level densities. This method allows one to calculate the exact level densities (coinciding with the shell-model densities) very fast and for model spaces that the shell model cannot reach. In this work we study the evolution of the level density under variation of specific matrix elements of the shell-model Hamiltonian. We also study the impact on the calculated level density resulting from the expansion of the single-particle model space. As an application of the method, whenever possible and where experimental information exists, we compare the nuclear level densities calculated with our method with experimental level densities. Supported by NSF Grant PHY-1404442.

  17. Statistical Methods and Software for the Analysis of Occupational Exposure Data with Non-detectable Values

    SciTech Connect

    Frome, EL

    2005-09-20

    Environmental exposure measurements are, in general, positive and may be subject to left censoring; i.e., the measured value is less than a "detection limit". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. Parametric methods used to determine acceptable levels of exposure are often based on a two-parameter lognormal distribution. The mean exposure level, an upper percentile, and the exceedance fraction are used to characterize exposure levels, and confidence limits are used to describe the uncertainty in these estimates. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left-censored lognormal data are described, and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left-censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on an upper percentile (i.e., the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known, but computational complexity has limited their use in routine data analysis with left-censored data. The recent development of the R environment for statistical data analysis and graphics has greatly enhanced the availability of high-quality nonproprietary (open source) software that serves as the basis for implementing the methods in this paper.
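
    A compact scipy sketch of the maximum likelihood step for randomly left-censored lognormal data: detected values contribute the lognormal log-density and non-detects contribute the log-CDF at their detection limits, and the exceedance fraction follows from the fitted parameters. The data, the two detection limits and the exposure limit are synthetic assumptions; confidence limits and the nonparametric alternatives are not shown.

    import numpy as np
    from scipy import stats, optimize

    def neg_loglik(params, x, detected):
        # Left-censored lognormal log-likelihood: log-density for detects,
        # log-CDF at the detection limit for non-detects (all on the log scale).
        mu, log_sigma = params
        sigma = np.exp(log_sigma)
        z = (np.log(x) - mu) / sigma
        ll_det = stats.norm.logpdf(z[detected]) - np.log(sigma) - np.log(x[detected])
        ll_cen = stats.norm.logcdf(z[~detected])
        return -(ll_det.sum() + ll_cen.sum())

    # Synthetic exposure data with two detection limits (two labs/methods).
    rng = np.random.default_rng(8)
    true = rng.lognormal(mean=-1.0, sigma=0.9, size=60)
    dl = np.where(np.arange(60) < 30, 0.2, 0.4)
    detected = true > dl
    x = np.where(detected, true, dl)            # non-detects recorded at their detection limit

    res = optimize.minimize(neg_loglik, x0=[0.0, 0.0], args=(x, detected))
    mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
    oel = 1.0                                   # assumed occupational exposure limit
    exceedance = 1 - stats.norm.cdf((np.log(oel) - mu_hat) / sigma_hat)
    print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}, exceedance fraction = {exceedance:.3f}")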

  18. Comparisons of power of statistical methods for gene-environment interaction analyses.

    PubMed

    Ege, Markus J; Strachan, David P

    2013-10-01

    Any genome-wide analysis is hampered by reduced statistical power due to multiple comparisons. This is particularly true for interaction analyses, which have lower statistical power than analyses of associations. To assess gene-environment interactions in population settings we have recently proposed a statistical method based on a modified two-step approach, where first genetic loci are selected by their associations with disease and environment, respectively, and subsequently tested for interactions. We have simulated various data sets resembling real world scenarios and compared single-step and two-step approaches with respect to true positive rate (TPR) in 486 scenarios and (study-wide) false positive rate (FPR) in 252 scenarios. Our simulations confirmed that in all two-step methods the two steps are not correlated. In terms of TPR, two-step approaches combining information on gene-disease association and gene-environment association in the first step were superior to all other methods, while preserving a low FPR in over 250 million simulations under the null hypothesis. Our weighted modification yielded the highest power across various degrees of gene-environment association in the controls. An optimal threshold for step 1 depended on the interacting allele frequency and the disease prevalence. In all scenarios, the least powerful method was to proceed directly to an unbiased full interaction model, applying conventional genome-wide significance thresholds. This simulation study confirms the practical advantage of two-step approaches to interaction testing over more conventional one-step designs, at least in the context of dichotomous disease outcomes and other parameters that might apply in real-world settings.
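
    A schematic numpy/scipy sketch in the spirit of the two-step approach described: step 1 screens each marker on its gene-disease and gene-environment associations, and step 2 tests multiplicative interaction (here via the difference between case and control gene-environment log odds ratios) only for markers passing the screen, so the multiplicity correction applies to a much smaller set. The screening threshold, the binary coding and this particular interaction statistic are assumptions of the sketch, not the authors' weighted procedure.

    import numpy as np
    from scipy import stats

    def log_odds_ratio(g, e):
        # Log odds ratio and its variance for two binary vectors (0.5 continuity correction).
        a = np.sum((g == 1) & (e == 1)) + 0.5
        b = np.sum((g == 1) & (e == 0)) + 0.5
        c = np.sum((g == 0) & (e == 1)) + 0.5
        d = np.sum((g == 0) & (e == 0)) + 0.5
        return np.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

    def two_step_scan(G, E, D, screen_alpha=0.01):
        # Step 1: screen markers on gene-disease and gene-environment association.
        passed = []
        for j in range(G.shape[1]):
            g = G[:, j]
            p_gd = stats.chi2_contingency(np.histogram2d(g, D, bins=2)[0])[1]
            p_ge = stats.chi2_contingency(np.histogram2d(g, E, bins=2)[0])[1]
            if min(p_gd, p_ge) < screen_alpha:
                passed.append(j)
        # Step 2: interaction test (case vs control G-E odds ratios) on the passed subset.
        results = {}
        for j in passed:
            g = G[:, j]
            lo_ca, v_ca = log_odds_ratio(g[D == 1], E[D == 1])
            lo_co, v_co = log_odds_ratio(g[D == 0], E[D == 0])
            z = (lo_ca - lo_co) / np.sqrt(v_ca + v_co)
            p = 2 * stats.norm.sf(abs(z))
            results[j] = (p, min(1.0, p * len(passed)))   # raw and Bonferroni-adjusted
        return results

    # Simulated data: 2000 subjects, 50 markers, marker 0 carries a true interaction.
    rng = np.random.default_rng(9)
    n, m = 2000, 50
    G = rng.binomial(1, 0.3, size=(n, m))
    E = rng.binomial(1, 0.4, size=n)
    logit = -1.5 + 0.3 * G[:, 0] + 0.3 * E + 1.0 * G[:, 0] * E
    D = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    print(two_step_scan(G, E, D))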

  19. Combined Bayesian statistics and load duration curve method for bacteria nonpoint source loading estimation.

    PubMed

    Shen, Jian; Zhao, Yuan

    2010-01-01

    Nonpoint source load estimation is an essential part of the development of the bacterial total maximum daily load (TMDL) mandated by the Clean Water Act. However, the currently widely used watershed-receiving water modeling approach is usually associated with a high level of uncertainty and requires long-term observational data and intensive training effort. The load duration curve (LDC) method recommended by the EPA provides a simpler way to estimate bacteria loading. This method, however, does not take into consideration the specific fate and transport mechanisms of the pollutant and cannot address the uncertainty. In this study, a Bayesian statistical approach is applied to the Escherichia coli TMDL development of a stream on the Eastern Shore of Virginia to inversely estimate watershed bacteria loads from the in-stream monitoring data. The mechanism of bacteria transport is incorporated. The effects of temperature, bottom slope, and flow on allowable and existing load calculations are discussed. The uncertainties associated with load estimation are also fully described. Our method combines the merits of LDC, mechanistic modeling, and Bayesian statistics, while overcoming some of the shortcomings associated with these methods. It is a cost-effective tool for bacteria TMDL development and can be modified and applied to multi-segment streams as well. PMID:19781737

  20. Methods of artificial enlargement of the training set for statistical shape models.

    PubMed

    Koikkalainen, Juha; Tölli, Tuomas; Lauerma, Kirsi; Antila, Kari; Mattila, Elina; Lilja, Mikko; Lötjönen, Jyrki

    2008-11-01

    Due to the small size of training sets, statistical shape models often over-constrain the deformation in medical image segmentation. Hence, artificial enlargement of the training set has been proposed as a solution to increase the flexibility of the models. In this paper, different methods for artificially enlarging a training set were evaluated. Furthermore, the objectives were to study the effects of the size of the training set, to estimate the optimal number of deformation modes, to study the effects of different error sources, and to compare different deformation methods. The study was performed for a cardiac shape model consisting of ventricles, atria, and epicardium, built from magnetic resonance (MR) volume images of 25 subjects. Both shape modeling and image segmentation accuracies were studied. The objectives were reached by utilizing different training sets and datasets, and two deformation methods. The evaluation showed that artificial enlargement of the training set improves both the modeling and segmentation accuracy. All but one of the enlargement techniques gave statistically significantly (p < 0.05) better segmentation results than the standard method without enlargement. The two best enlargement techniques were the nonrigid movement technique and the technique that combines principal component analysis (PCA) and a finite element model (FEM). The optimal number of deformation modes was found to be near 100 modes in our application. The active shape model segmentation gave better segmentation accuracy than the one based on simulated annealing optimization of the model weights.
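
    A minimal numpy sketch of the general recipe: build a point-distribution model by PCA and artificially enlarge the training set with smooth random displacements of each training shape (a crude stand-in for the nonrigid movement technique). The toy contours, perturbation amplitude and smoothing are assumptions of this sketch, and the PCA+FEM combination is not reproduced.

    import numpy as np

    def build_pca_model(shapes, var_keep=0.98):
        # Point-distribution model: mean shape plus the principal deformation modes.
        X = shapes.reshape(len(shapes), -1)
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        var = s ** 2 / (len(shapes) - 1)
        n_modes = int(np.searchsorted(np.cumsum(var) / var.sum(), var_keep) + 1)
        return mean, Vt[:n_modes], var[:n_modes]

    def enlarge_training_set(shapes, n_copies=5, smooth=5, amplitude=0.05, seed=0):
        # Add smoothly varying random displacements to each shape to enlarge the set.
        rng = np.random.default_rng(seed)
        n, p, d = shapes.shape
        kernel = np.ones(smooth) / smooth
        enlarged = [shapes]
        for _ in range(n_copies):
            noise = rng.normal(0.0, amplitude, size=(n, p, d))
            # Moving-average smoothing along the point index keeps displacements smooth.
            noise = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), 1, noise)
            enlarged.append(shapes + noise)
        return np.concatenate(enlarged)

    # Toy 2-D "cardiac" contours: 25 training shapes with 100 landmark points each.
    rng = np.random.default_rng(10)
    t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
    base = np.stack([np.cos(t), np.sin(t)], axis=1)
    shapes = np.array([base * (1 + 0.1 * rng.normal()) + 0.02 * rng.normal(size=base.shape)
                       for _ in range(25)])
    _, modes_small, _ = build_pca_model(shapes)
    _, modes_big, _ = build_pca_model(enlarge_training_set(shapes))
    print("deformation modes without / with enlargement:", len(modes_small), len(modes_big))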

  1. Van der Waals interactions: evaluations by use of a statistical mechanical method.

    PubMed

    Høye, Johan S

    2011-10-01

    In this work the induced van der Waals interaction between a pair of neutral atoms or molecules is considered by use of a statistical mechanical method. With use of the Schrödinger equation this interaction can be obtained by standard quantum mechanical perturbation theory to second order. However, the latter is restricted to electrostatic interactions between dipole moments. So with radiating dipole-dipole interaction where retardation effects are important for large separations of the particles, other methods are needed, and the resulting induced interaction is the Casimir-Polder interaction usually obtained by field theory. It can also be evaluated, however, by a statistical mechanical method that utilizes the path integral representation. We here show explicitly by use of this method the equivalence of the Casimir-Polder interaction and the van der Waals interaction based upon the Schrödinger equation. The equivalence is to leading order for short separations where retardation effects can be neglected. In recent works [J. S. Høye, Physica A 389, 1380 (2010); Phys. Rev. E 81, 061114 (2010)], the Casimir-Polder or Casimir energy was added as a correction to calculations of systems like the electron clouds of molecules. The equivalence to van der Waals interactions indicates that the added Casimir energy will improve the accuracy of calculated molecular energies. Thus, we give numerical estimates of this energy including analysis and estimates for the uniform electron gas.

  2. Evaluation and projection of daily temperature percentiles from statistical and dynamical downscaling methods

    NASA Astrophysics Data System (ADS)

    Casanueva, A.; Herrera, S.; Fernández, J.; Frías, M. D.; Gutiérrez, J. M.

    2013-08-01

    The study of extreme events has become of great interest in recent years due to their direct impact on society. Extremes are usually evaluated by using extreme indicators, based on order statistics on the tail of the probability distribution function (typically percentiles). In this study, we focus on the tail of the distribution of daily maximum and minimum temperatures. For this purpose, we analyse high (95th) and low (5th) percentiles in daily maximum and minimum temperatures on the Iberian Peninsula, respectively, derived from different downscaling methods (statistical and dynamical). First, we analyse the performance of reanalysis-driven downscaling methods in present climate conditions. The comparison among the different methods is performed in terms of the bias of seasonal percentiles, considering as observations the public gridded data sets E-OBS and Spain02, and obtaining an estimation of both the mean and spatial percentile errors. Secondly, we analyse the increments of future percentile projections under the SRES A1B scenario and compare them with those corresponding to the mean temperature, showing that their relative importance depends on the method, and stressing the need to consider an ensemble of methodologies.

  3. Statistical method for detecting phase shifts in alpha rhythm from human electroencephalogram data

    NASA Astrophysics Data System (ADS)

    Naruse, Yasushi; Takiyama, Ken; Okada, Masato; Umehara, Hiroaki

    2013-04-01

    We developed a statistical method for detecting discontinuous phase changes (phase shifts) in fluctuating alpha rhythms in the human brain from electroencephalogram (EEG) data obtained in a single trial. This method uses the state space models and the line process technique, which is a Bayesian method for detecting discontinuity in an image. By applying this method to simulated data, we were able to detect the phase and amplitude shifts in a single simulated trial. Further, we demonstrated that this method can detect phase shifts caused by a visual stimulus in the alpha rhythm from experimental EEG data even in a single trial. The results for the experimental data showed that the timings of the phase shifts in the early latency period were similar between many of the trials, and that those in the late latency period were different between the trials. The conventional averaging method can only detect phase shifts that occur at similar timings between many of the trials, and therefore, the phase shifts that occur at differing timings cannot be detected using the conventional method. Consequently, our obtained results indicate the practicality of our method. Thus, we believe that our method will contribute to studies examining the phase dynamics of nonlinear alpha rhythm oscillators.

  4. Introducing 3D U-statistic method for separating anomaly from background in exploration geochemical data with associated software development

    NASA Astrophysics Data System (ADS)

    Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir

    2016-03-01

    The U-statistic method is one of the most important structural methods for separating anomaly from background. It considers the locations of samples and carries out the statistical analysis of the data without judging from a geochemical point of view, trying to separate subpopulations and determine anomalous areas. In the present study, to use the U-statistic method in a three-dimensional (3D) setting, the U-statistic is applied to the grades of two ideal test examples, taking the sample Z values (elevations) into account. This is the first time that this method has been applied in a 3D setting. To evaluate the performance of the 3D U-statistic method and to compare it with a non-structural method, the method of threshold assessment based on median and standard deviation (MSD method) is applied to the same two test examples. Results show that the samples indicated as anomalous by the U-statistic method are more regular and involve less dispersion than those indicated by the MSD method, so that, according to the locations of the anomalous samples, their denser areas can be delineated as promising zones. Moreover, the results show that at a threshold of U = 0, the total misclassification error of the U-statistic method is much smaller than that of the x̄ + n × s criterion. Finally, a 3D model of the two test examples for separating anomaly from background using the 3D U-statistic method is provided. The source code of a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all the geochemical varieties and can be used in similar exploration projects.
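
    A short numpy sketch of the non-structural MSD-type baseline referred to above: a single global threshold of the form median + n × s flags anomalous samples, whose 3D coordinates can then be inspected for clustering. The multiplier n and the synthetic grades are illustrative, and the spatial 3D U-statistic itself is not reproduced here.

    import numpy as np

    def msd_anomalies(grades, n=2.0):
        # Non-structural threshold: grades above median + n * std are labelled anomalous.
        threshold = np.median(grades) + n * np.std(grades, ddof=1)
        return grades > threshold, threshold

    # Synthetic 3-D exploration data: background grades plus an enriched cluster at depth.
    rng = np.random.default_rng(11)
    xyz = rng.uniform(0, 100, size=(500, 3))            # x, y and z (elevation) of samples
    grades = rng.lognormal(mean=0.0, sigma=0.4, size=500)
    in_cluster = np.linalg.norm(xyz - np.array([60.0, 40.0, 20.0]), axis=1) < 15
    grades[in_cluster] *= 4.0                           # anomalous zone

    mask, threshold = msd_anomalies(grades)
    print(f"threshold = {threshold:.2f}, anomalous samples = {int(mask.sum())}")
    print("centroid of anomalous samples:", xyz[mask].mean(axis=0))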

  5. Monitoring Method of Cow Anthrax Based on Gis and Spatial Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Li, Lin; Yang, Yong; Wang, Hongbin; Dong, Jing; Zhao, Yujun; He, Jianbin; Fan, Honggang

    Geographic information system (GIS) is a computer application system which possesses the ability to manipulate spatial information and has been used in many fields related to spatial information management. Many methods and models have been established for analyzing animal disease distributions and temporal-spatial transmission. Great benefits have been gained from the application of GIS in animal disease epidemiology, and GIS is now a very important tool in animal disease epidemiological research. The spatial analysis function of GIS can be widened and strengthened by using spatial statistical analysis, allowing for deeper exploration, analysis, manipulation and interpretation of the spatial pattern and spatial correlation of animal disease. In this paper, we analyzed the spatial distribution characteristics of cow anthrax in the target district A (because the epidemic data are confidential we call it district A), based on the established GIS of cow anthrax in this district, combining spatial statistical analysis and GIS. Cow anthrax is a biogeochemical disease whose geographical distribution is related closely to the environmental factors of habitats and shows distinct spatial characteristics; correct analysis of its spatial distribution therefore plays a very important role in monitoring, prevention and control. However, the application of classic statistical methods in some areas is very difficult because of the pastoral nomadic context. The high mobility of livestock and the lack of suitable sampling frames are among the current difficulties in monitoring and make it nearly impossible to apply rigorous random sampling methods. It is thus necessary to develop an alternative sampling method that can overcome the lack of sampling and meet the requirements for randomness. The GIS computer application software ArcGIS 9.1 was used to overcome the lack of sampling-site data. Using ArcGIS 9.1 and GEODA

  6. Evaluation of the applicability in the future climate of a statistical downscaling method in France

    NASA Astrophysics Data System (ADS)

    Dayon, G.; Boé, J.; Martin, E.

    2013-12-01

    The uncertainties in climate projections for the next decades generally remain large, with an important contribution from internal climate variability. To quantify and capture the impact of those uncertainties in impact projections, multi-model and multi-member approaches are essential. Statistical downscaling (SD) methods are computationally inexpensive, allowing for large ensemble approaches. The main weakness of SD is that it relies on a stationarity hypothesis, namely that the statistical relation established in the present climate remains valid in the climate change context. In this study, the evaluation of SD methods developed for a future study of hydrological changes during the next decades over France is presented, focusing on precipitation. The SD methods are all based on the analogs method, which is quite simple to set up and permits easy testing of different combinations of predictors, the only changing parameter in the methods discussed in this presentation. The basic idea of the analogs method is that for the same large-scale climatic state, the state of local variables will be identical. In a climate change context, the statistical relation established on past climate is assumed to remain valid in the future climate. In practice, this stationarity assumption is impossible to verify until the future climate is effectively observed. It is possible to evaluate the ability of SD methods to reproduce the interannual variability in the present climate, but this approach does not guarantee their validity in the future climate, as the mechanisms that play in the interannual and climate change contexts may not be identical. Another common approach is to test whether an SD method is able to reproduce observed trends, as these may be partly caused by climate change. The observed trends in precipitation are compared to those obtained by downscaling four different atmospheric reanalyses with analogs methods. The uncertainties in downscaled trends due to reanalyses are very large

  7. Application of statistical mechanical methods to the modeling of social networks

    NASA Astrophysics Data System (ADS)

    Strathman, Anthony Robert

    With the recent availability of large-scale social data sets, social networks have become open to quantitative analysis via the methods of statistical physics. We examine the statistical properties of a real large-scale social network, generated from cellular phone call-trace logs. We find this network, like many other social networks, to be assortative (r = 0.31) and clustered (i.e., strongly transitive, C = 0.21). We measure fluctuation scaling to identify the presence of internal structure in the network and find that structural inhomogeneity effectively disappears at the scale of a few hundred nodes, though there is no sharp cutoff. We introduce an agent-based model of social behavior, designed to model the formation and dissolution of social ties. The model is a modified Metropolis algorithm containing agents operating under the basic sociological constraints of reciprocity, communication need and transitivity. The model introduces the concept of a social temperature. We go on to show that this simple model reproduces the global statistical network features (including assortativity, connected fraction, mean degree, clustering, and mean shortest path length) of the real network data and undergoes two phase transitions, one from a "gas" to a "liquid" state and the second from a liquid to a glassy state as a function of this social temperature.

  8. A method for obtaining a statistically stationary turbulent free shear flow

    NASA Technical Reports Server (NTRS)

    Timson, Stephen F.; Lele, S. K.; Moser, R. D.

    1994-01-01

    The long-term goal of the current research is the study of Large-Eddy Simulation (LES) as a tool for aeroacoustics. New algorithms and developments in computer hardware are making possible a new generation of tools for aeroacoustic predictions, which rely on the physics of the flow rather than empirical knowledge. LES, in conjunction with an acoustic analogy, holds the promise of predicting the statistics of noise radiated to the far-field of a turbulent flow. LES's predictive ability will be tested through extensive comparison of acoustic predictions based on a Direct Numerical Simulation (DNS) and LES of the same flow, as well as a priori testing of DNS results. The method presented here is aimed at allowing simulation of a turbulent flow field that is both simple and amenable to acoustic predictions. A free shear flow is homogeneous in both the streamwise and spanwise directions and which is statistically stationary will be simulated using equations based on the Navier-Stokes equations with a small number of added terms. Studying a free shear flow eliminates the need to consider flow-surface interactions as an acoustic source. The homogeneous directions and the flow's statistically stationary nature greatly simplify the application of an acoustic analogy.

  9. New Statistical Methods for the Analysis of the Cratering on Venus

    NASA Astrophysics Data System (ADS)

    Xie, M.; Smrekar, S. E.; Handcock, M. S.

    2014-12-01

    The sparse crater population (~1000 craters) on Venus is the most important clue for determining the planet's surface age and aids in understanding its geologic history. What processes (volcanism, tectonism, weathering, etc.) modify the total impact crater population? Are the processes regional or global in occurrence? The heated debate on these questions points to the need for better approaches. We present new statistical methods for the analysis of crater locations and characteristics. Specifically: 1) We produce a map of crater density and the proportion of no-halo craters (inferred to be modified) by using generalized additive models and smoothing splines with a spherical spline basis set. Based on this map, we are able to predict the probability that a crater has no halo, given that there is a crater at that point. We also obtain a continuous representation of the ratio of craters with no halo as a function of crater density. This approach allows us to look for regions that appear to have experienced more or less modification, and are thus potentially older or younger. 2) We examine the randomness or clustering of distributions of craters by type (e.g. dark floored, intermediate). For example, for dark floored craters we consider two hypotheses: i) the dark floored craters are randomly distributed on the surface; ii) the dark floored craters are random given the locations of the crater population. Instead of only using a single measure such as the average nearest neighbor distance, we use the probability density function of these distances and compare it to complete spatial randomness to get the relative probability density function. This function gives us a clearer picture of how and where the nearest neighbor distances differ from complete spatial randomness. We also conduct statistical tests of these hypotheses. Confidence intervals with specified global coverage are constructed. Software to reproduce the methods is available in the open source statistics
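
    A small Monte Carlo sketch, assuming hypothetical point data, of how nearest-neighbor distance distributions can be compared against complete spatial randomness on a sphere, as the abstract describes; the helper functions and sample sizes are illustrative.

      import numpy as np

      rng = np.random.default_rng(2)

      def random_points_on_sphere(n):
          # Uniform points on the unit sphere (complete spatial randomness).
          v = rng.normal(size=(n, 3))
          return v / np.linalg.norm(v, axis=1, keepdims=True)

      def nearest_neighbor_angles(points):
          # Great-circle (angular) distance from each point to its nearest neighbor.
          cosines = np.clip(points @ points.T, -1.0, 1.0)
          np.fill_diagonal(cosines, -1.0)          # exclude self-matches
          return np.arccos(cosines.max(axis=1))

      observed = random_points_on_sphere(100)      # placeholder for e.g. dark-floored craters
      obs_nn = nearest_neighbor_angles(observed)

      # Null distribution of the mean NN distance under complete spatial randomness.
      null_means = np.array([nearest_neighbor_angles(random_points_on_sphere(100)).mean()
                             for _ in range(999)])
      p_value = (np.sum(null_means <= obs_nn.mean()) + 1) / (len(null_means) + 1)
      print("Monte Carlo p-value for clustering:", p_value)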

  10. Identifying Potentially Biased Test Items: A Comparison of the Mantel-Haenszel Statistic and Several Item Response Theory Methods.

    ERIC Educational Resources Information Center

    Hambleton, Ronald K.; And Others

    Four item bias methods were studied. The methods compared include the Mantel-Haenszel statistic, the plot method, the root mean squared difference method, and the total area method; the latter two methods are based on item response theory. The test consisted of item responses of 451 male and 486 female ninth graders to 75 test items on the 1985…

  11. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network

    PubMed Central

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-01

    A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to obtain weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, good SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006

  12. Statistical methods for the assessment of EQAPOL proficiency testing: ELISpot, Luminex, and Flow Cytometry.

    PubMed

    Rountree, Wes; Vandergrift, Nathan; Bainbridge, John; Sanchez, Ana M; Denny, Thomas N

    2014-07-01

    In September 2011 Duke University was awarded a contract to develop the National Institutes of Health/National Institute of Allergy and Infectious Diseases (NIH/NIAID) External Quality Assurance Program Oversight Laboratory (EQAPOL). Through EQAPOL, proficiency testing programs are administered for Interferon-γ (IFN-γ) Enzyme-linked immunosorbent spot (ELISpot), Intracellular Cytokine Staining Flow Cytometry (ICS) and Luminex-based cytokine assays. One of the charges of the EQAPOL program was to apply statistical methods to determine overall site performance. We utilized various statistical methods for each program to find the most appropriate for assessing laboratory performance using the consensus average as the target value. Accuracy ranges were calculated based on Wald-type confidence intervals, exact Poisson confidence intervals, or via simulations. Given the nature of proficiency testing data, which have repeated measures within each donor/sample made across several laboratories, the use of mixed effects models with alpha adjustments for multiple comparisons was also explored. Mixed effects models were found to be the most useful method to assess laboratory performance with respect to accuracy to the consensus. Model-based approaches to the proficiency testing data in EQAPOL will continue to be utilized. Mixed effects models also provided a means of performing more complex analyses that would address secondary research questions regarding within- and between-laboratory variability as well as longitudinal analyses. PMID:24456626
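
    A minimal sketch of a mixed effects model for proficiency-testing data of the kind described above, using statsmodels; the data frame, effect structure and column names are assumptions for illustration, not the EQAPOL analysis itself.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(3)

      # Hypothetical proficiency-testing data: each lab measures each donor sample in triplicate.
      labs = [f"lab{i}" for i in range(5)]
      samples = [f"s{i}" for i in range(8)]
      rows = [(lab, s, rng.normal(loc=100 + 2 * int(lab[-1]), scale=5))
              for lab in labs for s in samples for _ in range(3)]
      df = pd.DataFrame(rows, columns=["lab", "sample", "value"])

      # Deviation of each measurement from the consensus (all-lab) mean of its sample.
      df["deviation"] = df["value"] - df.groupby("sample")["value"].transform("mean")

      # Mixed effects model: fixed lab effect on the deviation, random intercept per sample.
      model = smf.mixedlm("deviation ~ lab", df, groups=df["sample"]).fit()
      print(model.summary())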

  13. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network.

    PubMed

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-01

    A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to obtain weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, good SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method.

  14. Comparison of classical statistical methods and artificial neural network in traffic noise prediction

    SciTech Connect

    Nedic, Vladimir; Despotovic, Danijela; Cvetanovic, Slobodan; Despotovic, Milan; Babic, Sasa

    2014-11-15

    Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. The structure of the traffic flow and the average speed of the traffic flow are chosen as input variables of the neural network. The output variable of the network is the equivalent noise level in the given time period L{sub eq}. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise, using an originally developed user-friendly software package. It is shown that artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate the equivalent noise level by means of classical methods, and a comparative analysis is given. The results clearly show that the ANN approach is superior to the other statistical methods in traffic noise level prediction. - Highlights: • We propose an ANN model for the prediction of traffic noise. • We developed an originally designed user-friendly software package. • The results are compared with classical statistical methods. • The ANN model shows much better predictive capabilities than the classical methods.
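
    A brief sketch, under assumed synthetic data, of an ANN regression from traffic-flow descriptors to the equivalent noise level, in the spirit of the approach above; scikit-learn's MLPRegressor stands in for the authors' network, and the noise formula used to generate the data is purely illustrative.

      import numpy as np
      from sklearn.neural_network import MLPRegressor
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import mean_absolute_error

      rng = np.random.default_rng(4)

      # Hypothetical measurements: traffic flow (veh/h), share of heavy vehicles, mean speed (km/h).
      n = 500
      flow = rng.uniform(100, 2000, n)
      heavy = rng.uniform(0.0, 0.3, n)
      speed = rng.uniform(20, 90, n)
      # Synthetic equivalent noise level Leq (dBA), standing in for measured data.
      leq = 10 * np.log10(flow * (1 + 5 * heavy)) + 0.1 * speed + 30 + rng.normal(0, 1.0, n)

      X = np.column_stack([flow, heavy, speed])
      X_train, X_test, y_train, y_test = train_test_split(X, leq, random_state=0)

      ann = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=5000, random_state=0))
      ann.fit(X_train, y_train)
      print("MAE on held-out data:", mean_absolute_error(y_test, ann.predict(X_test)))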

  15. A Nonparametric Statistical Method That Improves Physician Cost of Care Analysis

    PubMed Central

    Metfessel, Brent A; Greene, Robert A

    2012-01-01

    Objective: To develop a compositing method that demonstrates improved performance compared with commonly used tests for statistical analysis of physician cost of care data. Data Source: Commercial preferred provider organization (PPO) claims data for internists from a large metropolitan area. Study Design: We created a nonparametric composite performance metric that maintains risk adjustment using the Wilcoxon rank-sum (WRS) test. We compared the resulting algorithm to the parametric observed-to-expected ratio, with and without a statistical test, for stability of physician cost ratings among different outlier trimming methods and across two partially overlapping time periods. Principal Findings: The WRS algorithm showed significantly greater within-physician stability among several typical outlier trimming and capping methods. The algorithm also showed significantly greater within-physician stability when the same physicians were analyzed across time periods. Conclusions: The nonparametric algorithm described is a more robust and more stable methodology for evaluating physician cost of care than commonly used observed-to-expected ratio techniques. Use of such an algorithm can improve physician cost assessment for important current applications such as public reporting, pay for performance, and tiered benefit design. PMID:22524195
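
    A minimal sketch of the core comparison implied above: a Wilcoxon rank-sum (Mann-Whitney U) test of one physician's episode costs against a peer group, with synthetic cost data standing in for claims data.

      import numpy as np
      from scipy.stats import mannwhitneyu

      rng = np.random.default_rng(5)

      # Hypothetical risk-adjusted episode costs: one physician versus the peer group.
      physician_costs = rng.lognormal(mean=7.1, sigma=0.5, size=40)
      peer_costs = rng.lognormal(mean=7.0, sigma=0.5, size=400)

      # Wilcoxon rank-sum (Mann-Whitney U) test: is the physician's cost distribution
      # shifted relative to peers?  Nonparametric, so no normality assumption is needed.
      stat, p = mannwhitneyu(physician_costs, peer_costs, alternative="two-sided")
      print(f"U = {stat:.1f}, p = {p:.3f}")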

  16. A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

    PubMed

    Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

    2015-01-01

    PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating the tumor boundary from PET are yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies. PMID:26736741

  17. Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes.

    PubMed

    Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M

    2011-12-01

    This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate adaptive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy.

  18. Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods.

    PubMed

    Ogilvie, Huw A; Heled, Joseph; Xie, Dong; Drummond, Alexei J

    2016-05-01

    Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent. PMID:26821913

  19. Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods

    PubMed Central

    Ogilvie, Huw A.; Heled, Joseph; Xie, Dong; Drummond, Alexei J.

    2016-01-01

    Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent. PMID:26821913

  20. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network.

    PubMed

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-01

    A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to obtain weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, good SPs that have high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006

  1. A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

    PubMed

    Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

    2015-01-01

    PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating the tumor boundary from PET are yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies.

  2. University and Student Segmentation: Multilevel Latent-Class Analysis of Students' Attitudes towards Research Methods and Statistics

    ERIC Educational Resources Information Center

    Mutz, Rudiger; Daniel, Hans-Dieter

    2013-01-01

    Background: It is often claimed that psychology students' attitudes towards research methods and statistics affect course enrolment, persistence, achievement, and course climate. However, the inter-institutional variability has been widely neglected in the research on students' attitudes towards research methods and statistics, but it is important…

  3. An Analysis of Research Methods and Statistical Techniques Used by Doctoral Dissertation at the Education Sciences in Turkey

    ERIC Educational Resources Information Center

    Karadag, Engin

    2010-01-01

    To assess the research methods and statistical analysis techniques employed by educational researchers, this study surveyed unpublished doctoral dissertations from 2003 to 2007. Frequently used research methods consisted of experimental research, surveys, correlational studies, and case studies. Descriptive statistics, t-test, ANOVA, factor…

  4. Investigation of M2 factor influence for paraxial computer generated hologram reconstruction using a statistical method

    NASA Astrophysics Data System (ADS)

    Flury, M.; Gérard, P.; Takakura, Y.; Twardworski, P.; Fontaine, J.

    2005-04-01

    In this paper, we study the influence of the M2 quality factor of an incident beam on the reconstruction performance of a computer generated hologram (CGH). We use a statistical method to analyze the evolution of different quality criteria, such as diffraction efficiency, root mean square error, illumination uniformity and correlation coefficient, calculated in the numerical reconstruction as the M2 quality factor increases. The simulation results show that this factor must always be taken into account in the CGH design when the M2 value is greater than 2.

  5. Computing physical properties with quantum Monte Carlo methods with statistical fluctuations independent of system size.

    PubMed

    Assaraf, Roland

    2014-12-01

    We show that the recently proposed correlated sampling without reweighting procedure extends the locality (asymptotic independence of the system size) of a physical property to the statistical fluctuations of its estimator. This makes the approach potentially vastly more efficient for computing space-localized properties in large systems compared with standard correlated methods. A proof is given for a large collection of noninteracting fragments. Calculations on hydrogen chains suggest that this behavior holds not only for systems displaying short-range correlations, but also for systems with long-range correlations.

  6. Typical Behavior of the Linear Programming Method for Combinatorial Optimization Problems: A Statistical-Mechanical Perspective

    NASA Astrophysics Data System (ADS)

    Takabe, Satoshi; Hukushima, Koji

    2014-04-01

    The typical behavior of the linear programming (LP) problem is studied as a relaxation of the minimum vertex cover problem, which is a type of integer programming (IP) problem. To deal with LP and IP using statistical mechanics, a lattice-gas model on the Erdös-Rényi random graphs is analyzed by a replica method. It is found that the LP optimal solution is typically equal to that given by IP below the critical average degree c*=e in the thermodynamic limit. The critical threshold for LP = IP extends the previous result c = 1, and coincides with the replica symmetry-breaking threshold of the IP.
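
    For illustration, the sketch below solves the LP relaxation of minimum vertex cover on a small assumed graph with scipy; when the optimum is integral, LP and IP coincide, mirroring the regime discussed above. The graph and all numbers are hypothetical.

      import numpy as np
      from scipy.optimize import linprog

      # Minimum vertex cover LP relaxation on a small example graph:
      # minimize sum_v x_v  subject to  x_u + x_v >= 1 for every edge, 0 <= x_v <= 1.
      edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
      n = 4

      c = np.ones(n)
      A_ub = np.zeros((len(edges), n))
      for k, (u, v) in enumerate(edges):
          A_ub[k, u] = A_ub[k, v] = -1.0      # -(x_u + x_v) <= -1
      b_ub = -np.ones(len(edges))

      res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
      print("LP optimum:", res.fun, "solution:", res.x)
      # Half-integral values (0, 1/2, 1) are typical; when the solution is fully
      # integral, the LP and IP optima coincide, as in the low-degree regime above.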

  7. A shortcut through the Coulomb gas method for spectral linear statistics on random matrices

    NASA Astrophysics Data System (ADS)

    Deelan Cunden, Fabio; Facchi, Paolo; Vivo, Pierpaolo

    2016-04-01

    In the last decade, spectral linear statistics on large dimensional random matrices have attracted significant attention. Within the physics community, a privileged role has been played by invariant matrix ensembles for which a two-dimensional Coulomb gas analogy is available. We present a critical revision of the Coulomb gas method in random matrix theory (RMT) borrowing language and tools from large deviations theory. This allows us to formalize an equivalent, but more effective and quicker route toward RMT free energy calculations. Moreover, we argue that this more modern viewpoint is likely to shed further light on the interesting issues of weak phase transitions and evaporation phenomena recently observed in RMT.

  8. Comparison of statistical methods for assessment of population genetic diversity by DNA fingerprinting

    SciTech Connect

    Leonard, T.; Roth, A.; Gordon, D.; Wessendarp, T.; Smith, M.K.; Silbiger, R.; Torsella, J.

    1995-12-31

    The advent of newer techniques for genomic characterization, e.g., Random Amplified Polymorphic DNA (RAPD) fingerprinting, has motivated development of a number of statistical approaches for creating hypothesis tests using this genetic information. The authors' specific interest is in methods for deriving relative genetic diversity measures of feral populations subjected to varying degrees of environmental impacts. Decreased polymorphism and loss of alleles have been documented in stressed populations of some species as assayed by allozyme analysis and, more recently, by DNA fingerprinting. Multilocus fingerprinting techniques (such as RAPDs) differ from allozyme analysis in that they do not explicitly yield information on allelism and heterozygosity. Therefore, in order to infer these parameters, assumptions must be made concerning the relationship of the observed data to the underlying DNA architecture. In particular, assessments of population genetic diversity from DNA fingerprint data have employed at least three approaches based on different assumptions about the data. The authors compare different statistics, using a previously presented set of RAPD fingerprints of three populations of brown bullhead catfish. Furthermore, the behavior of these statistics is examined as the sample sizes of fish per population and polymorphisms per fish are varied. Sample sizes are reduced either randomly or, in the case of polymorphisms (which are electrophoretic bands), systematically pruned using the criterion of high reproducibility between duplicate samples for inclusion of data. Implications for sampling individuals and loci in assessments of population genetic diversity are discussed. Concern about population N value and statistical power is very relevant to field situations where the individuals available for sampling may be limited in number.

  9. Estimation of social value of statistical life using willingness-to-pay method in Nanjing, China.

    PubMed

    Yang, Zhao; Liu, Pan; Xu, Xin

    2016-10-01

    Rational decision making regarding safety-related investment programs greatly depends on the economic valuation of traffic crashes. The primary objective of this study was to estimate the social value of statistical life in the city of Nanjing in China. A stated preference survey was conducted to investigate travelers' willingness to pay for traffic risk reduction. Face-to-face interviews were conducted at stations, shopping centers, schools, and parks in different districts in the urban area of Nanjing. The respondents were categorized into two groups, including motorists and non-motorists. Both the binary logit model and the mixed logit model were developed for the two groups of people. The results revealed that the mixed logit model is superior to the fixed coefficient binary logit model. The factors that significantly affect people's willingness to pay for risk reduction include income, education, gender, age, drive age (for motorists), occupation, whether the charged fees were used to improve private vehicle equipment (for motorists), reduction in fatality rate, and change in travel cost. The Monte Carlo simulation method was used to generate the distribution of value of statistical life (VSL). Based on the mixed logit model, the VSL had a mean value of 3,729,493 RMB ($586,610) with a standard deviation of 2,181,592 RMB ($343,142) for motorists; and a mean of 3,281,283 RMB ($505,318) with a standard deviation of 2,376,975 RMB ($366,054) for non-motorists. Using the tax system to illustrate the contribution of different income groups to social funds, the social value of statistical life was estimated. The average social value of statistical life was found to be 7,184,406 RMB ($1,130,032). PMID:27178028
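
    A hedged sketch of the Monte Carlo step described above: drawing random mixed-logit coefficients and forming the implied VSL distribution. The coefficient distributions, units and scaling are illustrative assumptions, not the study's estimates.

      import numpy as np

      rng = np.random.default_rng(6)
      n_draws = 100_000

      # Hypothetical mixed-logit coefficients: marginal utility of a fatality-risk
      # reduction of 1 in 100,000, assumed normally distributed across travelers, and
      # marginal (dis)utility of travel cost in RMB (illustrative values only).
      beta_risk = rng.normal(loc=0.40, scale=0.15, size=n_draws)
      beta_cost = -np.abs(rng.normal(loc=0.012, scale=0.004, size=n_draws))

      # Monte Carlo distribution of the value of statistical life:
      # VSL = (dU/d risk) / (-dU/d cost), scaled from "1 in 100,000" to a whole life.
      vsl = (beta_risk / -beta_cost) * 100_000
      vsl = vsl[vsl > 0]                      # keep the economically meaningful draws
      print("mean VSL (RMB):", vsl.mean(), " sd:", vsl.std())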

  10. Estimation of social value of statistical life using willingness-to-pay method in Nanjing, China.

    PubMed

    Yang, Zhao; Liu, Pan; Xu, Xin

    2016-10-01

    Rational decision making regarding safety-related investment programs greatly depends on the economic valuation of traffic crashes. The primary objective of this study was to estimate the social value of statistical life in the city of Nanjing in China. A stated preference survey was conducted to investigate travelers' willingness to pay for traffic risk reduction. Face-to-face interviews were conducted at stations, shopping centers, schools, and parks in different districts in the urban area of Nanjing. The respondents were categorized into two groups, including motorists and non-motorists. Both the binary logit model and the mixed logit model were developed for the two groups of people. The results revealed that the mixed logit model is superior to the fixed coefficient binary logit model. The factors that significantly affect people's willingness to pay for risk reduction include income, education, gender, age, drive age (for motorists), occupation, whether the charged fees were used to improve private vehicle equipment (for motorists), reduction in fatality rate, and change in travel cost. The Monte Carlo simulation method was used to generate the distribution of value of statistical life (VSL). Based on the mixed logit model, the VSL had a mean value of 3,729,493 RMB ($586,610) with a standard deviation of 2,181,592 RMB ($343,142) for motorists; and a mean of 3,281,283 RMB ($505,318) with a standard deviation of 2,376,975 RMB ($366,054) for non-motorists. Using the tax system to illustrate the contribution of different income groups to social funds, the social value of statistical life was estimated. The average social value of statistical life was found to be 7,184,406 RMB ($1,130,032).

  11. Statistical physics inspired methods to assign statistical significance in bioinformatics and proteomics: From sequence comparison to mass spectrometry based peptide sequencing

    NASA Astrophysics Data System (ADS)

    Alves, Gelio

    After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data as well as understanding how components interact with each other. The former is usually undertaken using bioinformatics methods, while the latter task is generally termed proteomics. Success in both parts demands correct statistical significance assignments for the results found. In my dissertation, I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database search methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high-scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translational modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. Assessing the similarity of two domains usually requires alignment from head to toe, i.e., a global alignment. Good alignment score statistics with an appropriate null model enable us to distinguish biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence. For global alignment, which is useful in domain alignment, there is still much room for

  12. Whole vertebral bone segmentation method with a statistical intensity-shape model based approach

    NASA Astrophysics Data System (ADS)

    Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer

    2011-03-01

    An automatic segmentation algorithm for the vertebrae in human body CT images is presented. In particular, we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper thoracic, lower thoracic and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing step to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge of both the intensities and the shapes of the objects. After principal component analysis (PCA) of such shape-intensity expressions obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed by fitting this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In an experiment using 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777, and 0.939 mm for the cervical, upper thoracic, lower thoracic, and lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed a fair performance for cervical, thoracic and lumbar vertebrae.

  13. Vibration-based structural health monitoring using adaptive statistical method under varying environmental condition

    NASA Astrophysics Data System (ADS)

    Jin, Seung-Seop; Jung, Hyung-Jo

    2014-03-01

    It is well known that the dynamic properties of a structure, such as natural frequencies, depend not only on damage but also on environmental conditions (e.g., temperature). The variation in dynamic characteristics of a structure due to environmental conditions may mask damage of the structure. Without taking the change of environmental conditions into account, false-positive or false-negative damage diagnoses may occur, so that structural health monitoring becomes unreliable. In order to address this problem, an approach that constructs a regression model based on structural responses considering environmental factors has been commonly used by many researchers. The key to the success of this approach is the formulation between the input and output variables of the regression model to take into account the environmental variations. However, it is quite challenging to determine proper environmental variables and measurement locations in advance for fully representing the relationship between the structural responses and the environmental variations. One alternative (i.e., novelty detection) is to remove the variations caused by environmental factors from the structural responses by using multivariate statistical analysis (e.g., principal component analysis (PCA), factor analysis, etc.). The success of this method depends heavily on the accuracy of the description of the normal condition. Generally, there is no prior information on the normal condition during data acquisition, so the normal condition is determined subjectively with human intervention. The proposed method is a novel adaptive multivariate statistical analysis for monitoring of structural damage detection under environmental change. One advantage of this method is the ability of a generative learning approach to capture the intrinsic characteristics of the normal condition. The proposed method is tested on numerically simulated data for a range of measurement noise under environmental variation. A comparative

  14. A prediction method for radon in groundwater using GIS and multivariate statistics.

    PubMed

    Skeppström, Kirlna; Olofsson, Bo

    2006-08-31

    Radon (222Rn) in groundwater constitutes a source of natural radioactivity to indoor air. It is difficult to make predictions of radon levels in groundwater due to the heterogeneous distribution of uranium and radium, flow patterns and varying geochemical conditions. High radon concentrations in groundwater are not always associated with high uranium content in the bedrock, since groundwater with a high radon content has been found in regions with low to moderate uranium concentrations in the bedrock. This paper describes a methodology for predicting areas with high concentrations of 222Rn in groundwater on a general scale, within an area of approximately 185x145km2. The methodology is based on multivariate statistical analyses, including principal component analysis and regression analysis, and investigates the factors of geology, land use, topography and uranium (U) content in the bedrock. A statistical variable based method (the RV method) was used to estimate risk values related to different radon concentrations. The method was calibrated and tested on more than 4400 drilled wells in Stockholm County. The results showed that radon concentration was clearly correlated to bedrock type, well altitude and distance from fracture zones. The weighted index (risk value) estimated by the RV method provided a fair prediction of radon potential in groundwater on a general scale. Risk values obtained using the RV method were compared to radon measurements in 12 test areas (on a local scale, each of area 25x25km2) in Stockholm County and a high correlation (r=-0.87) was observed. The study showed that the occurrence and spread of radon in groundwater are guided by multiple factors, which can be used in a radon prediction method on a general scale. However, it does not provide any direct information on the geochemical and flow processes involved.

  15. Jacobian integration method increases the statistical power to measure gray matter atrophy in multiple sclerosis

    PubMed Central

    Nakamura, Kunio; Guizard, Nicolas; Fonov, Vladimir S.; Narayanan, Sridar; Collins, D. Louis; Arnold, Douglas L.

    2013-01-01

    Gray matter atrophy provides important insights into neurodegeneration in multiple sclerosis (MS) and can be used as a marker of neuroprotection in clinical trials. Jacobian integration is a method for measuring volume change that uses integration of the local Jacobian determinants of the nonlinear deformation field registering two images, and is a promising tool for measuring gray matter atrophy. Our main objective was to compare the statistical power of the Jacobian integration method to commonly used methods in terms of the sample size required to detect a treatment effect on gray matter atrophy. We used multi-center longitudinal data from relapsing–remitting MS patients and evaluated combinations of cross-sectional and longitudinal pre-processing with SIENAX/FSL, SPM, and FreeSurfer, as well as the Jacobian integration method. The Jacobian integration method outperformed these other commonly used methods, reducing the required sample size by a factor of 4–5. The results demonstrate the advantage of using the Jacobian integration method to assess neuroprotection in MS clinical trials. PMID:24266007
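
    A minimal numpy sketch of Jacobian integration as summarized above: the local Jacobian determinant of an assumed deformation field is integrated over a gray-matter mask to estimate volume change; the field, mask and sizes are hypothetical.

      import numpy as np

      rng = np.random.default_rng(7)

      # Hypothetical 3D displacement field (in voxels) from a nonlinear registration of a
      # follow-up scan to a baseline scan, and a binary gray-matter mask on the baseline.
      shape = (64, 64, 64)
      disp = rng.normal(scale=0.05, size=(3,) + shape)   # small random deformation
      gm_mask = np.zeros(shape, bool)
      gm_mask[16:48, 16:48, 16:48] = True

      # Jacobian of the transformation x -> x + u(x): J = I + grad(u).
      grads = [np.gradient(disp[i]) for i in range(3)]   # grads[i][j] = d u_i / d x_j
      J = np.zeros(shape + (3, 3))
      for i in range(3):
          for j in range(3):
              J[..., i, j] = grads[i][j] + (1.0 if i == j else 0.0)

      # Jacobian integration: local volume change is det(J); integrating it over the
      # gray-matter mask estimates the follow-up gray-matter volume in baseline space.
      detJ = np.linalg.det(J)
      percent_change = 100.0 * (detJ[gm_mask].mean() - 1.0)
      print(f"estimated gray matter volume change: {percent_change:.3f}%")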

  16. Assessing statistical reliability of phylogenetic trees via a speedy double bootstrap method.

    PubMed

    Ren, Aizhen; Ishida, Takashi; Akiyama, Yutaka

    2013-05-01

    Evaluating the reliability of estimated phylogenetic trees is of critical importance in the field of molecular phylogenetics, and for other endeavors that depend on accurate phylogenetic reconstruction. The bootstrap method is a well-known computational approach to phylogenetic tree assessment, and more generally for assessing the reliability of statistical models. However, it is known to be biased under certain circumstances, calling into question the accuracy of the method. Several advanced bootstrap methods have been developed to achieve higher accuracy, one of which is the double bootstrap approach, but the computational burden of this method has precluded its application to practical problems of phylogenetic tree selection. We address this issue by proposing a simple method called the speedy double bootstrap, which circumvents the second-tier resampling step in the regular double bootstrap approach. We also develop an implementation of the regular double bootstrap for comparison with our speedy method. The speedy double bootstrap suffers no significant loss of accuracy compared with the regular double bootstrap, while performing calculations significantly more rapidly (at minimum around 371 times faster, based on analysis of mammalian mitochondrial amino acid sequences and 12S and 16S rRNA genes). Our method thus enables, for the first time, the practical application of the double bootstrap technique in the context of molecular phylogenetics. The approach can also be used more generally for model selection problems wherever the maximum likelihood criterion is used.

  17. Estimating soil organic carbon stocks and spatial patterns with statistical and GIS-based methods.

    PubMed

    Zhi, Junjun; Jing, Changwei; Lin, Shengpan; Zhang, Cao; Liu, Qiankun; DeGloria, Stephen D; Wu, Jiaping

    2014-01-01

    Accurately quantifying soil organic carbon (SOC) is considered fundamental to studying soil quality, modeling the global carbon cycle, and assessing global climate change. This study evaluated the uncertainties caused by up-scaling of soil properties from the county scale to the provincial scale and from lower-level classification of Soil Species to Soil Group, using four methods: the mean, median, Soil Profile Statistics (SPS), and pedological professional knowledge based (PKB) methods. For the SPS method, SOC stock is calculated at the county scale by multiplying the mean SOC density value of each soil type in a county by its corresponding area. For the mean or median method, SOC density value of each soil type is calculated using provincial arithmetic mean or median. For the PKB method, SOC density value of each soil type is calculated at the county scale considering soil parent materials and spatial locations of all soil profiles. A newly constructed 1:50,000 soil survey geographic database of Zhejiang Province, China, was used for evaluation. Results indicated that with soil classification levels up-scaling from Soil Species to Soil Group, the variation of estimated SOC stocks among different soil classification levels was obviously lower than that among different methods. The difference in the estimated SOC stocks among the four methods was lowest at the Soil Species level. The differences in SOC stocks among the mean, median, and PKB methods for different Soil Groups resulted from the differences in the procedure of aggregating soil profile properties to represent the attributes of one soil type. Compared with the other three estimation methods (i.e., the SPS, mean and median methods), the PKB method holds significant promise for characterizing spatial differences in SOC distribution because spatial locations of all soil profiles are considered during the aggregation procedure.
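
    A small pandas sketch of the SPS aggregation described above (mean SOC density per soil type within a county times the mapped area, summed over the county); the tables and values are invented for illustration.

      import pandas as pd

      # Hypothetical soil-profile table: county, soil type, SOC density (kg C m-2) per profile,
      # plus a map table giving the area (m2) of each soil type within each county.
      profiles = pd.DataFrame({
          "county":      ["A", "A", "A", "B", "B"],
          "soil_type":   ["t1", "t1", "t2", "t1", "t2"],
          "soc_density": [5.2, 4.8, 7.1, 5.9, 6.4],
      })
      areas = pd.DataFrame({
          "county":    ["A", "A", "B", "B"],
          "soil_type": ["t1", "t2", "t1", "t2"],
          "area_m2":   [2.0e8, 1.5e8, 3.0e8, 0.5e8],
      })

      # SPS method: mean SOC density of each soil type within each county multiplied by
      # the mapped area of that soil type, then summed over the county.
      mean_density = profiles.groupby(["county", "soil_type"], as_index=False)["soc_density"].mean()
      merged = areas.merge(mean_density, on=["county", "soil_type"])
      merged["soc_stock_kg"] = merged["soc_density"] * merged["area_m2"]
      print(merged.groupby("county")["soc_stock_kg"].sum())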

  18. A comparison of dynamical and statistical downscaling methods for regional wave climate projections along French coastlines.

    NASA Astrophysics Data System (ADS)

    Laugel, Amélie; Menendez, Melisa; Benoit, Michel; Mattarolo, Giovanni; Mendez, Fernando

    2013-04-01

    Wave climate forecasting is a major issue for numerous marine and coastal activities, such as offshore industries, flooding risk assessment and wave energy resource evaluation, among others. Generally, there are two main ways to predict the impacts of climate change on the wave climate at the regional scale: dynamical and statistical downscaling of a GCM (Global Climate Model). In this study, both methods have been applied to the French coast (Atlantic, English Channel and North Sea shorelines) under three climate change scenarios (A1B, A2, B1) simulated with the GCM ARPEGE-CLIMAT, from Météo-France (AR4, IPCC). The aim of the work is to characterise the wave climatology of the 21st century and to compare the statistical and dynamical methods, pointing out the advantages and disadvantages of each approach. The statistical downscaling method proposed by the Environmental Hydraulics Institute of Cantabria (Spain) has been applied (Menendez et al., 2011). At a particular location, the sea-state climate (predictand Y) is defined as a function, Y=f(X), of several atmospheric circulation patterns (predictor X). Assuming these climate associations between predictor and predictand are stationary, the statistical approach has been used to project the future wave conditions with reference to the GCM. The statistical relations between predictor and predictand have been established over 31 years, from 1979 to 2009. The predictor is built as the 3-day-averaged squared sea level pressure gradient from the hourly CFSR database (Climate Forecast System Reanalysis, http://cfs.ncep.noaa.gov/cfsr/). The predictand has been extracted from the 31-year hindcast sea-state database ANEMOC-2 performed with the 3G spectral wave model TOMAWAC (Benoit et al., 1996), developed at EDF R&D LNHE and Saint-Venant Laboratory for Hydraulics and forced by the CFSR 10 m wind field. Significant wave height, peak period and mean wave direction have been extracted with an hourly resolution at

  19. Biomarkers for pancreatic cancer: recent achievements in proteomics and genomics through classical and multivariate statistical methods.

    PubMed

    Marengo, Emilio; Robotti, Elisa

    2014-10-01

    Pancreatic cancer (PC) is one of the most aggressive and lethal neoplastic diseases. A valid alternative to the usual invasive diagnostic tools would certainly be the determination of biomarkers in peripheral fluids to provide less invasive tools for early diagnosis. Nowadays, biomarkers are generally investigated mainly in peripheral blood and tissues through high-throughput omics techniques comparing control vs pathological samples. The results can be evaluated by two main strategies: (1) classical methods in which the identification of significant biomarkers is accomplished by monovariate statistical tests where each biomarker is considered as independent from the others; and (2) multivariate methods, taking into consideration the correlations existing among the biomarkers themselves. This last approach is very powerful since it allows the identification of pools of biomarkers with diagnostic and prognostic performances which are superior to single markers in terms of sensitivity, specificity and robustness. Multivariate techniques are usually applied with variable selection procedures to provide a restricted set of biomarkers with the best predictive ability; however, standard selection methods are usually aimed at the identification of the smallest set of variables with the best predictive ability and exhaustivity is usually neglected. The exhaustive search for biomarkers is instead an important alternative to standard variable selection since it can provide information about the etiology of the pathology by producing a comprehensive set of markers. In this review, the most recent applications of the omics techniques (proteomics, genomics and metabolomics) to the identification of exploratory biomarkers for PC will be presented with particular regard to the statistical methods adopted for their identification. The basic theory related to classical and multivariate methods for identification of biomarkers is presented and then, the most recent applications in

  20. Testing for Additivity at Select Mixture Groups of Interest Based on Statistical Equivalence Testing Methods

    SciTech Connect

    Stork, LeAnna M.; Gennings, Chris; Carchman, Richard; Carter, Jr., Walter H.; Pounds, Joel G.; Mumtaz, Moiz

    2006-12-01

    Several assumptions, defined and undefined, are used in the toxicity assessment of chemical mixtures. In scientific practice mixture components in the low-dose region, particularly subthreshold doses, are often assumed to behave additively (i.e., zero interaction) based on heuristic arguments. This assumption has important implications in the practice of risk assessment, but has not been experimentally tested. We have developed methodology to test for additivity in the sense of Berenbaum (Advances in Cancer Research, 1981), based on the statistical equivalence testing literature where the null hypothesis of interaction is rejected for the alternative hypothesis of additivity when data support the claim. The implication of this approach is that conclusions of additivity are made with a false positive rate controlled by the experimenter. The claim of additivity is based on prespecified additivity margins, which are chosen using expert biological judgment such that small deviations from additivity, which are not considered to be biologically important, are not statistically significant. This approach is in contrast to the usual hypothesis-testing framework that assumes additivity in the null hypothesis and rejects when there is significant evidence of interaction. In this scenario, failure to reject may be due to lack of statistical power making the claim of additivity problematic. The proposed method is illustrated in a mixture of five organophosphorus pesticides that were experimentally evaluated alone and at relevant mixing ratios. Motor activity was assessed in adult male rats following acute exposure. Four low-dose mixture groups were evaluated. Evidence of additivity is found in three of the four low-dose mixture groups. The proposed method tests for additivity of the whole mixture and does not take into account subset interactions (e.g., synergistic, antagonistic) that may have occurred and cancelled each other out.
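
    A generic sketch of statistical equivalence testing with prespecified margins, using two one-sided t-tests (TOST) on hypothetical motor-activity data; this illustrates the equivalence-testing idea only and is not the Berenbaum-based procedure of the paper. All values are assumptions.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(8)

      # Hypothetical data: motor-activity responses in one mixture group, an additivity-
      # predicted mean for that group, and prespecified additivity margins.
      mixture_response = rng.normal(loc=98.0, scale=10.0, size=12)
      predicted_additive_mean = 100.0
      lower_margin, upper_margin = -15.0, 15.0     # biologically unimportant deviation

      # Two one-sided tests (TOST): conclude additivity only if the deviation from the
      # additivity prediction is significantly above the lower margin AND significantly
      # below the upper margin.
      deviation = mixture_response - predicted_additive_mean
      _, p_lo = stats.ttest_1samp(deviation, lower_margin, alternative="greater")
      _, p_hi = stats.ttest_1samp(deviation, upper_margin, alternative="less")
      p_tost = max(p_lo, p_hi)
      print(f"TOST p-value: {p_tost:.3f} -> "
            f"{'additivity supported' if p_tost < 0.05 else 'inconclusive'}")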

  1. A Tool Preference Choice Method for RNA Secondary Structure Prediction by SVM with Statistical Tests

    PubMed Central

    Hor, Chiou-Yi; Yang, Chang-Biau; Chang, Chia-Hung; Tseng, Chiou-Ting; Chen, Hung-Hsin

    2013-01-01

    The prediction of RNA secondary structures has drawn much attention from both biologists and computer scientists. Many useful tools have been developed for this purpose. These tools have their individual strengths and weaknesses. Accordingly, we propose a tool choice method based on support vector machines (SVM) which integrates three prediction tools: pknotsRG, RNAStructure, and NUPACK. Our method first extracts features from the target RNA sequence, and adopts two information-theoretic feature selection methods for feature ranking. We propose a method to combine feature selection and classifier fusion in an incremental manner. Our test data set contains 720 RNA sequences, where 225 pseudoknotted RNA sequences are obtained from PseudoBase, and 495 nested RNA sequences are obtained from RNA SSTRAND. The method serves as a preprocessing step in analyzing RNA sequences before the RNA secondary structure prediction tools are employed. In addition, the performance of various configurations is subjected to statistical tests to examine their significance. The best base-pair accuracy achieved is 75.5%, which is obtained by the proposed incremental method, and is significantly higher than the 68.8% associated with the best single predictor, pknotsRG. PMID:23641141
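
    An illustrative scikit-learn sketch, on random placeholder data, of the general recipe described above: information-theoretic feature ranking followed by an SVM that chooses which prediction tool to prefer; the feature definitions and labels are assumptions, not the authors' pipeline.

      import numpy as np
      from sklearn.feature_selection import SelectKBest, mutual_info_classif
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      rng = np.random.default_rng(9)

      # Hypothetical data: sequence-derived features for 720 RNAs and, as the class label,
      # the index of the tool (0, 1, 2) that predicted that sequence's structure best.
      X = rng.normal(size=(720, 40))
      y = rng.integers(0, 3, size=720)

      # Information-theoretic feature ranking followed by an SVM classifier.
      clf = make_pipeline(StandardScaler(),
                          SelectKBest(mutual_info_classif, k=15),
                          SVC(kernel="rbf", C=1.0))
      print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())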

  2. Performance analysis of Wald-statistic based network detection methods for radiation sources

    SciTech Connect

    Sen, Satyabrata; Rao, Nageswara S; Wu, Qishi; Barry, M. L..; Grieme, M.; Brooks, Richard R; Cordone, G.

    2016-01-01

    There have been increasingly large deployments of radiation detection networks that require computationally fast algorithms to produce prompt results over ad-hoc sub-networks of mobile devices, such as smart-phones. These algorithms are in sharp contrast to complex network algorithms that require all measurements to be sent to powerful central servers. In this work, at the individual sensors, we employ Wald-statistic based detection algorithms, which are computationally very fast and are implemented as one of three Z-tests and four chi-square tests. At the fusion center, we apply K-out-of-N fusion to combine the sensors' hard decisions. We characterize the performance of the detection methods by deriving analytical expressions for the distributions of the underlying test statistics, and by analyzing the fusion performance in terms of K, N, and the false-alarm rates of the individual detectors. We experimentally validate our methods using measurements from indoor and outdoor characterization tests of the Intelligence Radiation Sensors Systems (IRSS) program. In particular, utilizing the outdoor measurements, we construct two important real-life scenarios, boundary surveillance and portal monitoring, and present the results of our algorithms.
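
    A short sketch of K-out-of-N decision fusion with independent sensors, as referenced above; the per-sensor rates are illustrative assumptions, and the binomial-tail expressions are the standard textbook form rather than the paper's derivations.

      from scipy.stats import binom

      # K-out-of-N decision fusion: the fusion center declares a source present when at
      # least K of the N sensors raise an alarm.  With independent sensors that each have
      # false-alarm probability p_fa and detection probability p_d, the system-level
      # rates follow binomial tail probabilities.
      N, K = 10, 3
      p_fa, p_d = 0.05, 0.60

      system_false_alarm = binom.sf(K - 1, N, p_fa)   # P(at least K alarms | no source)
      system_detection = binom.sf(K - 1, N, p_d)      # P(at least K alarms | source present)
      print(f"system false-alarm rate: {system_false_alarm:.4f}")
      print(f"system detection rate:   {system_detection:.4f}")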

  3. A Statistical Method for Assessing Peptide Identification Confidence in Accurate Mass and Time Tag Proteomics

    SciTech Connect

    Stanley, Jeffrey R.; Adkins, Joshua N.; Slysz, Gordon W.; Monroe, Matthew E.; Purvine, Samuel O.; Karpievitch, Yuliya V.; Anderson, Gordon A.; Smith, Richard D.; Dabney, Alan R.

    2011-07-15

    High-throughput proteomics is rapidly evolving to require high mass measurement accuracy for a variety of different applications. Increased mass measurement accuracy in bottom-up proteomics specifically allows for an improved ability to distinguish and characterize detected MS features, which may in turn be identified by, e.g., matching to entries in a database for both precursor and fragmentation mass identification methods. Many tools exist with which to score the identification of peptides from LC-MS/MS measurements or to assess matches to an accurate mass and time (AMT) tag database, but these two calculations remain distinctly unrelated. Here we present a statistical method, Statistical Tools for AMT tag Confidence (STAC), which extends our previous work incorporating prior probabilities of correct sequence identification from LC-MS/MS, as well as the quality with which LC-MS features match AMT tags, to evaluate peptide identification confidence. Compared to existing tools, we are able to obtain significantly more high-confidence peptide identifications at a given false discovery rate and additionally assign confidence estimates to individual peptide identifications. Freely available software implementations of STAC are available in both command line and as a Windows graphical application.

  4. [Somatic hypermutagenesis in immunoglobulin genes. I. Connection of somatic mutations with repeats. A statistical weighting method].

    PubMed

    Solov'ev, V V; Rogozin, I V; Kolchanov, N A

    1989-01-01

    Based on the analysis of the nucleotide sequences of a number of immunoglobulin genes, it has been suggested that somatic mutations emerge by means of the correction of imperfect duplexes formed by mispairing of complementary regions of direct and inverted repeats. The present work provides new data confirming this mechanism of somatic hypermutagenesis. It has been shown that the presented sample of V- and J-segments of immunoglobulin genes is abundant in nonrandom imperfect direct repeats and complementary palindromes. To prove the connection of somatic mutations with the correction of imperfect duplexes made up of the regions of these repeats, we have developed a method of statistical weights, permitting us to analyse samples of mutations and repeats and to assess the reliability of the connection between them. Using this method we have investigated a collection of 203 nucleotide substitutions in V- and J-segments and have shown a statistically reliable (P less than 10(-4)) connection of these mutation positions with imperfect repeats.

  5. Methods for estimating selected low-flow frequency statistics for unregulated streams in Kentucky

    USGS Publications Warehouse

    Martin, Gary R.; Arihood, Leslie D.

    2010-01-01

    This report provides estimates of, and presents methods for estimating, selected low-flow frequency statistics for unregulated streams in Kentucky including the 30-day mean low flows for recurrence intervals of 2 and 5 years (30Q2 and 30Q5) and the 7-day mean low flows for recurrence intervals of 5, 10, and 20 years (7Q2, 7Q10, and 7Q20). Estimates of these statistics are provided for 121 U.S. Geological Survey streamflow-gaging stations with data through the 2006 climate year, which is the 12-month period ending March 31 of each year. Data were screened to identify the periods of homogeneous, unregulated flows for use in the analyses. Logistic-regression equations are presented for estimating the annual probability of the selected low-flow frequency statistics being equal to zero. Weighted-least-squares regression equations were developed for estimating the magnitude of the nonzero 30Q2, 30Q5, 7Q2, 7Q10, and 7Q20 low flows. Three low-flow regions were defined for estimating the 7-day low-flow frequency statistics. The explicit explanatory variables in the regression equations include total drainage area and the mapped streamflow-variability index measured from a revised statewide coverage of this characteristic. The percentage of the station low-flow statistics correctly classified as zero or nonzero by use of the logistic-regression equations ranged from 87.5 to 93.8 percent. The average standard errors of prediction of the weighted-least-squares regression equations ranged from 108 to 226 percent. The 30Q2 regression equations have the smallest standard errors of prediction, and the 7Q20 regression equations have the largest standard errors of prediction. The regression equations are applicable only to stream sites with low flows unaffected by regulation from reservoirs and local diversions of flow and to drainage basins in specified ranges of basin characteristics. Caution is advised when applying the equations for basins with characteristics near the

  6. A statistical method for assessing peptide identification confidence in accurate mass and time tag proteomics.

    PubMed

    Stanley, Jeffrey R; Adkins, Joshua N; Slysz, Gordon W; Monroe, Matthew E; Purvine, Samuel O; Karpievitch, Yuliya V; Anderson, Gordon A; Smith, Richard D; Dabney, Alan R

    2011-08-15

    Current algorithms for quantifying peptide identification confidence in the accurate mass and time (AMT) tag approach assume that the AMT tags themselves have been correctly identified. However, there is uncertainty in the identification of AMT tags, because this is based on matching LC-MS/MS fragmentation spectra to peptide sequences. In this paper, we incorporate confidence measures for the AMT tag identifications into the calculation of probabilities for correct matches to an AMT tag database, resulting in a more accurate overall measure of identification confidence for the AMT tag approach. The method is referred to as Statistical Tools for AMT Tag Confidence (STAC). STAC additionally provides a uniqueness probability (UP) to help distinguish between multiple matches to an AMT tag and a method to calculate an overall false discovery rate (FDR). STAC is freely available for download as both a command-line tool and a Windows graphical application.

  7. Effect of the Target Motion Sampling Temperature Treatment Method on the Statistics and Performance

    NASA Astrophysics Data System (ADS)

    Viitanen, Tuomas; Leppänen, Jaakko

    2014-06-01

    Target Motion Sampling (TMS) is a stochastic on-the-fly temperature treatment technique that is being developed as a part of the Monte Carlo reactor physics code Serpent. The method provides for modeling of arbitrary temperatures in continuous-energy Monte Carlo tracking routines with only one set of cross sections stored in the computer memory. Previously, only the performance of the TMS method in terms of CPU time per transported neutron has been discussed. Since the effective cross sections are not calculated at any point of a transport simulation with TMS, reaction rate estimators must be scored using sampled cross sections, which is expected to increase the variances and, consequently, to decrease the figures-of-merit. This paper examines the effects of TMS on the statistics and performance in practical calculations involving reaction rate estimation with collision estimators. Contrary to expectations, it turned out that the use of sampled response values has no practical effect on the performance of reaction rate estimators when using TMS with elevated basis cross section temperatures (EBT), i.e. the usual way. With 0 Kelvin cross sections a significant increase in the variances of capture rate estimators was observed right below the energy region of unresolved resonances, but at these energies the figures-of-merit could be increased using a simple resampling technique to decrease the variances of the responses. It was, however, noticed that the usage of the TMS method increases the statistical deviations of all estimators, including the flux estimator, by tens of percent in the vicinity of very strong resonances. This effect is not related to the usage of sampled responses, but is instead an inherent property of the TMS tracking method and concerns both EBT and 0 K calculations.

  8. Higher-order statistical moments and a procedure that detects potentially anomalous years as two alternative methods describing alterations in continuous environmental data

    USGS Publications Warehouse

    Arismendi, Ivan; Johnson, Sherri L.; Dunham, Jason

    2015-01-01

    Statistics of central tendency and dispersion may not capture relevant or desired characteristics of the distribution of continuous phenomena and, thus, they may not adequately describe temporal patterns of change. Here, we present two methodological approaches that can help to identify temporal changes in environmental regimes. First, we use higher-order statistical moments (skewness and kurtosis) to examine potential changes of empirical distributions at decadal extents. Second, we adapt a statistical procedure combining a non-metric multidimensional scaling technique and higher density region plots to detect potentially anomalous years. We illustrate the use of these approaches by examining long-term stream temperature data from minimally and highly human-influenced streams. In particular, we contrast predictions about thermal regime responses to changing climates and human-related water uses. Using these methods, we effectively diagnose years with unusual thermal variability and patterns in variability through time, as well as spatial variability linked to regional and local factors that influence stream temperature. Our findings highlight the complexity of responses of thermal regimes of streams and reveal their differential vulnerability to climate warming and human-related water uses. The two approaches presented here can be applied with a variety of other continuous phenomena to address historical changes, extreme events, and their associated ecological responses.
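
    A minimal sketch of the first approach (tracking higher-order moments over decadal windows), assuming a pandas Series of daily stream temperatures indexed by date; the synthetic series and the decade grouping below are illustrative, not the authors' data or code.

        import numpy as np
        import pandas as pd
        from scipy import stats

        def decadal_moments(temps):
            # Summarize a daily series by decade with the first four moments;
            # skewness and kurtosis capture distribution-shape changes that the
            # mean and variance alone may miss.
            decade = (temps.index.year // 10) * 10
            grouped = temps.groupby(decade)
            return pd.DataFrame({
                "mean": grouped.mean(),
                "std": grouped.std(),
                "skewness": grouped.apply(lambda x: stats.skew(x, nan_policy="omit")),
                "kurtosis": grouped.apply(lambda x: stats.kurtosis(x, nan_policy="omit")),
            })

        # Synthetic daily stream temperatures, 1980-2009 (illustrative only).
        rng = np.random.default_rng(0)
        idx = pd.date_range("1980-01-01", "2009-12-31", freq="D")
        seasonal = 10 + 6 * np.sin(2 * np.pi * idx.dayofyear.values / 365.25)
        temps = pd.Series(seasonal + rng.normal(0, 1.5, len(idx)), index=idx)
        print(decadal_moments(temps))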

  9. Characterization of squamous cell carcinomas of the head and neck using methods of spatial statistics.

    PubMed

    Mattfeldt, T; Fleischer, F

    2014-10-01

    In the present study, 53 cases of squamous cell carcinomas of the head and neck were characterized by a quantitative histological texture analysis based on principles of spatial statistics. A planar tessellation of the epithelial tumour component was generated by a skeletonization algorithm. The size distribution of the virtual cells of this planar tessellation, and the size distribution of the profiles of the tumour cell nuclei were estimated in terms of area and boundary length. The intensity, the reduced second moment function (K-function) and the pair correlation function of the point process of the centroids of the profiles of the tumour cell nuclei were also estimated. For both purposes, it is necessary to correct for edge effects, which we consider in this paper in some detail. Specifically, the point patterns of the tumour cell nuclei were considered as realizations of a point process, where the points exist only in the epithelial tumour component (the permitted phase) and not in the stroma (the forbidden phase). The methods make it possible to characterize each individual tumour by a series of summary statistics. The total set of cases was then partitioned into two groups: 19 cases without lymph node metastases (pN0), and 34 nodal positive cases (pN1 or pN2). Statistical analysis showed no significant differences between the intensities, the mean K-functions and the mean pair correlation functions of the tumour cell nucleus profiles of the two groups. However, there were some significant differences between the sizes of the virtual cells and of the nucleus profiles of the nodal negative cases as compared to the nodal positive cases. In a logistic regression analysis, one of the quantitative nuclear size variables (mean nuclear area) was found to be a significant predictor of lymph node metastasis, in addition to tumour stage. The study shows the potential of methods of spatial statistics for objective quantitative grading of squamous cell carcinomas of the head and
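
    The K-function estimate mentioned above can be sketched in a few lines. The version below is a deliberately naive estimator with no edge correction and no restriction of points to a permitted phase, both of which the paper treats carefully; the coordinates, radii, and window area are illustrative.

        import numpy as np

        def ripley_k(points, radii, area):
            # Naive K-function estimator for a 2-D point pattern: no edge
            # correction and no permitted/forbidden-phase restriction.
            pts = np.asarray(points, dtype=float)
            n = len(pts)
            d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
            np.fill_diagonal(d, np.inf)          # exclude self-pairs
            lam = n / area                       # intensity estimate
            return np.array([(d <= r).sum() / (lam * n) for r in radii])

        # Under complete spatial randomness K(r) is close to pi * r**2.
        rng = np.random.default_rng(1)
        pts = rng.uniform(0, 1, size=(500, 2))   # stand-in for nucleus centroids
        radii = np.array([0.05, 0.10, 0.15])
        print(ripley_k(pts, radii, area=1.0))
        print(np.pi * radii ** 2)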

  10. Statistical Classification of Soft Solder Alloys by Laser-Induced Breakdown Spectroscopy: Review of Methods

    NASA Astrophysics Data System (ADS)

    Zdunek, R.; Nowak, M.; Pliński, E.

    2016-02-01

    This paper reviews machine-learning methods that are nowadays the most frequently used for the supervised classification of spectral signals in laser-induced breakdown spectroscopy (LIBS). We analyze and compare various statistical classification methods, such as linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), partial least-squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), support vector machine (SVM), naive Bayes method, probabilistic neural networks (PNN), and K-nearest neighbor (KNN) method. The theoretical considerations are supported with experiments conducted for real soft-solder-alloy spectra obtained using LIBS. We consider two decision problems: binary and multiclass classification. The former is used to distinguish overheated soft solders from their normal versions. The latter aims to assign a testing sample to a given group of materials. The measurements are obtained for several laser-energy values, projection masks, and numbers of laser shots. Using cross-validation, we evaluate the above classification methods in terms of their usefulness in solving both classification problems.
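
    A minimal scikit-learn sketch of the kind of cross-validated comparison reviewed here. The feature matrix is a synthetic stand-in for LIBS spectra, and only the classifiers available in scikit-learn are shown (PLS-DA and SIMCA would require additional code or packages); parameters such as C and the number of neighbours are illustrative.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                                    QuadraticDiscriminantAnalysis)
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # Synthetic stand-in for LIBS spectra: rows are spectra, columns intensities.
        X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                                   n_classes=3, n_clusters_per_class=1, random_state=0)

        classifiers = {
            "LDA": LinearDiscriminantAnalysis(),
            "QDA": QuadraticDiscriminantAnalysis(reg_param=0.1),
            "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0)),
            "Naive Bayes": GaussianNB(),
            "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        }

        for name, clf in classifiers.items():
            scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation
            print(f"{name:12s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")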

  11. Statistical methods for the analysis of a screening test for chronic beryllium disease

    SciTech Connect

    Frome, E.L.; Neubert, R.L.; Smith, M.H.; Littlefield, L.G.; Colyer, S.P.

    1994-10-01

    The lymphocyte proliferation test (LPT) is a noninvasive screening procedure used to identify persons who may have chronic beryllium disease. A practical problem in the analysis of LPT well counts is the occurrence of outlying data values (approximately 7% of the time). A log-linear regression model is used to describe the expected well counts for each set of test conditions. The variance of the well counts is proportional to the square of the expected counts, and two resistant regression methods are used to estimate the parameters of interest. The first approach uses least absolute values (LAV) on the log of the well counts to estimate beryllium stimulation indices (SIs) and the coefficient of variation. The second approach uses a resistant regression version of maximum quasi-likelihood estimation. A major advantage of the resistant regression methods is that it is not necessary to identify and delete outliers. These two new methods for the statistical analysis of the LPT data and the outlier rejection method that is currently being used are applied to 173 LPT assays. The authors strongly recommend the LAV method for routine analysis of the LPT.
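
    The first resistant approach, least absolute values (LAV) regression on the log well counts, can be approximated by median (0.5-quantile) regression. The sketch below uses statsmodels' QuantReg on invented well counts with injected outliers; it is not the authors' implementation, and the stimulation-index interpretation of the slope is a simplification.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(2)

        # Illustrative well counts: 12 unstimulated and 12 beryllium-stimulated
        # wells, with two gross outliers injected (roughly 7% of the data).
        stimulated = np.repeat([0.0, 1.0], 12)
        counts = np.exp(7.0 + 0.9 * stimulated + rng.normal(0, 0.25, 24))
        counts[[3, 17]] *= 8.0

        # LAV on the log counts is median (0.5-quantile) regression.
        X = sm.add_constant(stimulated)
        lav_fit = sm.QuantReg(np.log(counts), X).fit(q=0.5)

        # The slope estimates log(SI); the outliers barely move this estimate.
        print("log(SI):", lav_fit.params[1])
        print("stimulation index:", np.exp(lav_fit.params[1]))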

  12. PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks

    PubMed Central

    Pham, Thong; Sheridan, Paul; Shimodaira, Hidetoshi

    2015-01-01

    Preferential attachment is a stochastic process that has been proposed to explain certain topological features characteristic of complex networks from diverse domains. The systematic investigation of preferential attachment is an important area of research in network science, not only for the theoretical matter of verifying whether this hypothesized process is operative in real-world networks, but also for the practical insights that follow from knowledge of its functional form. Here we describe a maximum likelihood based estimation method for the measurement of preferential attachment in temporal complex networks. We call the method PAFit, and implement it in an R package of the same name. PAFit constitutes an advance over previous methods primarily because we based it on a nonparametric statistical framework that enables attachment kernel estimation free of any assumptions about its functional form. We show this results in PAFit outperforming the popular methods of Jeong and Newman in Monte Carlo simulations. What is more, we found that the application of PAFit to a publicly available Flickr social network dataset yielded clear evidence for a deviation of the attachment kernel from the popularly assumed log-linear form. Independent of our main work, we provide a correction to a consequential error in Newman’s original method which had evidently gone unnoticed since its publication over a decade ago. PMID:26378457

  13. Computer program for the calculation of grain size statistics by the method of moments

    USGS Publications Warehouse

    Sawyer, Michael B.

    1977-01-01

    A computer program is presented for a Hewlett-Packard Model 9830A desk-top calculator (1) which calculates statistics using weight or point count data from a grain-size analysis. The program uses the method of moments in contrast to the more commonly used but less inclusive graphic method of Folk and Ward (1957). The merits of the program are: (1) it is rapid; (2) it can accept data in either grouped or ungrouped format; (3) it allows direct comparison with grain-size data in the literature that have been calculated by the method of moments; (4) it utilizes all of the original data rather than percentiles from the cumulative curve as in the approximation technique used by the graphic method; (5) it is written in the computer language BASIC, which is easily modified and adapted to a wide variety of computers; and (6) when used in the HP-9830A, it does not require punching of data cards. The method of moments should be used only if the entire sample has been measured and the worker defines the measured grain-size range. (1) Use of brand names in this paper does not imply endorsement of these products by the U.S. Geological Survey.
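
    The moment calculations themselves are easy to restate outside the original HP-9830A BASIC. The sketch below computes the mean, sorting (standard deviation), skewness and kurtosis from grouped sieve data; the phi midpoints and weight percentages are illustrative, not taken from the report.

        import numpy as np

        def moment_statistics(midpoints_phi, weight_pct):
            # Grain-size statistics by the method of moments from grouped data:
            # class midpoints in phi units and their weight percentages.
            m = np.asarray(midpoints_phi, dtype=float)
            f = np.asarray(weight_pct, dtype=float)
            f = f / f.sum()                              # normalize weights to 1
            mean = np.sum(f * m)
            std = np.sqrt(np.sum(f * (m - mean) ** 2))   # sorting
            skew = np.sum(f * (m - mean) ** 3) / std ** 3
            kurt = np.sum(f * (m - mean) ** 4) / std ** 4
            return mean, std, skew, kurt

        # Illustrative sieve data: phi midpoints and weight percent retained.
        midpoints = [0.5, 1.5, 2.5, 3.5, 4.5]
        weights = [5.0, 25.0, 40.0, 22.0, 8.0]
        print(moment_statistics(midpoints, weights))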

  14. Statistical method for sparse coding of speech including a linear predictive model

    NASA Astrophysics Data System (ADS)

    Rufiner, Hugo L.; Goddard, John; Rocha, Luis F.; Torres, María E.

    2006-07-01

    Recently, different methods for obtaining sparse representations of a signal using dictionaries of waveforms have been studied. They are often motivated by the way the brain seems to process certain sensory signals. Algorithms have been developed using a specific criterion to choose the waveforms occurring in the representation. The waveforms are chosen from a fixed dictionary and some algorithms also construct them as a part of the method. In the case of speech signals, most approaches do not take into consideration the important temporal correlations that are exhibited. It is known that these correlations are well approximated by linear models. Incorporating this a priori knowledge of the signal can facilitate the search for a suitable representation solution and also can help with its interpretation. Lewicki proposed a method to solve the noisy and overcomplete independent component analysis problem. In the present paper we propose a modification of this statistical technique for obtaining a sparse representation using a generative parametric model. The representations obtained with the method proposed here and other techniques are applied to artificial data and real speech signals, and compared using different coding costs and sparsity measures. The results show that the proposed method achieves more efficient representations of these signals compared to the others. A qualitative analysis of these results is also presented, which suggests that the restriction imposed by the parametric model is helpful in discovering meaningful characteristics of the signals.
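
    The paper's method builds on Lewicki's overcomplete ICA formulation with a linear predictive model; as a much simpler stand-in, the sketch below shows generic matching pursuit over a fixed unit-norm dictionary. It conveys what a sparse representation over a waveform dictionary looks like but is not the authors' statistical technique; the dictionary and signal are synthetic.

        import numpy as np

        def matching_pursuit(signal, dictionary, n_atoms):
            # Greedy sparse decomposition over unit-norm dictionary columns.
            residual = signal.astype(float).copy()
            coeffs = np.zeros(dictionary.shape[1])
            for _ in range(n_atoms):
                correlations = dictionary.T @ residual
                k = int(np.argmax(np.abs(correlations)))   # best-matching atom
                coeffs[k] += correlations[k]
                residual -= correlations[k] * dictionary[:, k]
            return coeffs, residual

        # Toy overcomplete dictionary for a 64-sample frame (256 random atoms).
        rng = np.random.default_rng(3)
        D = rng.normal(size=(64, 256))
        D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
        x = 2.0 * D[:, 10] - 1.5 * D[:, 99]        # signal built from two atoms
        c, r = matching_pursuit(x, D, n_atoms=5)
        print(np.nonzero(np.abs(c) > 1e-6)[0], np.linalg.norm(r))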

  15. [Concentration retrieving method of SO2 using differential optical absorption spectroscopy based on statistics].

    PubMed

    Liu, Bin; Sun, Chang-Ku; Zhang, Chi; Zhao, Yu-Mei; Liu, Jun-Ping

    2011-01-01

    A concentration retrieving method using statistics is presented, which is applied in differential optical absorption spectroscopy (DOAS) for measuring the concentration of SO2. The method uses the standard deviation of the differential absorption to represent the gas concentration. Principal component analysis (PCA) is used to process the differential absorption spectrum. In the method, the basis data for the concentration retrieval of SO2 is the combination of the PCA processing result, the correlation coefficient, and the standard deviation of the differential absorption. The method is applied to a continuous emission monitoring system (CEMS) with an optical path length of 0.3 m. Its measuring range for SO2 concentration is 0-5 800 mg x m(-3). The nonlinear calibration and the temperature compensation for the system were executed. The full scale error of the retrieving concentration is less than 0.7% FS. And the measuring result is -4.54 mg x m(-3) when the concentration of SO2 is zero. PMID:21428087

  16. Statistical methods for temporal and space-time analysis of community composition data.

    PubMed

    Legendre, Pierre; Gauthier, Olivier

    2014-03-01

    This review focuses on the analysis of temporal beta diversity, which is the variation in community composition along time in a study area. Temporal beta diversity is measured by the variance of the multivariate community composition time series and that variance can be partitioned using appropriate statistical methods. Some of these methods are classical, such as simple or canonical ordination, whereas others are recent, including the methods of temporal eigenfunction analysis developed for multiscale exploration (i.e. addressing several scales of variation) of univariate or multivariate response data, which are reviewed here, to our knowledge, for the first time. These methods are illustrated with ecological data from 13 years of benthic surveys in Chesapeake Bay, USA. The following methods are applied to the Chesapeake data: distance-based Moran's eigenvector maps, asymmetric eigenvector maps, scalogram, variation partitioning, multivariate correlogram, multivariate regression tree, and two-way MANOVA to study temporal and space-time variability. Local (temporal) contributions to beta diversity (LCBD indices) are computed and analysed graphically and by regression against environmental variables, and the role of species in determining the LCBD values is analysed by correlation analysis. A tutorial detailing the analyses in the R language is provided in an appendix. PMID:24430848
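
    The LCBD indices are computed in the paper with an R workflow (a tutorial is given in the appendix); the sketch below reproduces only the core calculation under the Hellinger transformation, following the variance-partitioning definition of LCBD, with a toy site-by-species matrix. It is a simplified illustration, not the authors' code.

        import numpy as np

        def lcbd_from_community(Y):
            # Local contributions to beta diversity from a site-by-species matrix,
            # using the Hellinger transformation: total beta diversity is the total
            # variance of the transformed matrix, and LCBD_i is site i's share of it.
            Y = np.asarray(Y, dtype=float)
            hel = np.sqrt(Y / Y.sum(axis=1, keepdims=True))
            centred = hel - hel.mean(axis=0)
            ss_site = (centred ** 2).sum(axis=1)
            ss_total = ss_site.sum()
            beta_total = ss_total / (Y.shape[0] - 1)
            return beta_total, ss_site / ss_total        # LCBD values sum to 1

        # Toy data: 4 sampling times x 3 species.
        Y = np.array([[10, 5, 0],
                      [ 8, 6, 1],
                      [ 0, 2, 9],
                      [ 1, 1, 8]])
        beta, lcbd = lcbd_from_community(Y)
        print(beta, lcbd, lcbd.sum())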

  17. Adequate iodine levels in healthy pregnant women. A cross-sectional survey of dietary intake in Turkey

    PubMed Central

    Kasap, Burcu; Akbaba, Gülhan; Yeniçeri, Emine N.; Akın, Melike N.; Akbaba, Eren; Öner, Gökalp; Turhan, Nilgün Ö.; Duru, Mehmet E.

    2016-01-01

    Objectives: To assess current iodine levels and related factors among healthy pregnant women. Methods: In this cross-sectional, hospital-based study, healthy pregnant women (n=135) were scanned for thyroid volume, provided urine samples for urinary iodine concentration and completed a questionnaire including sociodemographic characteristics and dietary habits targeted for iodine consumption at the Department of Obstetrics and Gynecology, School of Medicine, Muğla Sıtkı Koçman University, Muğla, Turkey, between August 2014 and February 2015. Sociodemographic data were analyzed by simple descriptive statistics. Results: Median urinary iodine concentration was 222.0 µg/L, indicating adequate iodine intake during pregnancy. According to World Health Organization (WHO) criteria, 28.1% of subjects had iodine deficiency, 34.1% had adequate iodine intake, 34.8% had more than adequate iodine intake, and 3.0% had excessive iodine intake during pregnancy. Education level, higher monthly income, current employment, consuming iodized salt, and adding salt to food during, or after cooking were associated with higher urinary iodine concentration. Conclusion: Iodine status of healthy pregnant women was adequate, although the percentage of women with more than adequate iodine intake was higher than the reported literature. PMID:27279519

  18. A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects

    PubMed Central

    Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna

    2013-01-01

    Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
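
    SICORE itself assesses co-occurrence against a network-randomization null model. As a simpler baseline for the same question (are the common neighbours of two same-type nodes surprisingly numerous?), the sketch below applies a hypergeometric test; it illustrates the co-occurrence statistic but is not the SICORE method, and the counts in the example are invented.

        from scipy.stats import hypergeom

        def cooccurrence_pvalue(n_targets, deg_a, deg_b, shared):
            # P-value for seeing at least `shared` common neighbours of two
            # regulators A and B in a bipartite graph with `n_targets` possible
            # targets, under a hypergeometric null (targets of B drawn at random).
            return hypergeom.sf(shared - 1, n_targets, deg_a, deg_b)

        # Invented example: 2000 candidate target proteins; miRNA A hits 120,
        # miRNA B hits 90, and the two share 18 targets.
        print(cooccurrence_pvalue(2000, 120, 90, 18))   # small p -> notable co-occurrence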

  19. Exploring the use of statistical process control methods to assess course changes

    NASA Astrophysics Data System (ADS)

    Vollstedt, Ann-Marie

    This dissertation pertains to the field of Engineering Education. The Department of Mechanical Engineering at the University of Nevada, Reno (UNR) is hosting this dissertation under a special agreement. This study was motivated by the desire to find an improved, quantitative measure of student quality that is both convenient to use and easy to evaluate. While traditional statistical analysis tools such as ANOVA (analysis of variance) are useful, they are somewhat time consuming and are subject to error because they are based on grades, which are influenced by numerous variables, independent of student ability and effort (e.g. inflation and curving). Additionally, grades are currently the only measure of quality in most engineering courses even though most faculty agree that grades do not accurately reflect student quality. Based on a literature search, in this study, quality was defined as content knowledge, cognitive level, self efficacy, and critical thinking. Nineteen treatments were applied to a pair of freshmen classes in an effort to increase these qualities. The qualities were measured via quiz grades, essays, surveys, and online critical thinking tests. Results from the quality tests were adjusted and filtered prior to analysis. All test results were subjected to Chauvenet's criterion in order to detect and remove outlying data. In addition to removing outliers from data sets, it was felt that individual course grades needed adjustment to accommodate for the large portion of the grade that was defined by group work. A new method was developed to adjust grades within each group based on the residual of the individual grades within the group and the portion of the course grade defined by group work. It was found that the grade adjustment method agreed 78% of the time with the manual grade changes instructors made in 2009, and also increased the correlation between group grades and individual grades. Using these adjusted grades, Statistical Process Control
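
    Chauvenet's criterion, used above to screen outlying quiz and test scores, flags a point when the expected number of values at least as extreme, N * P(|Z| >= z_i), drops below 0.5. The sketch below is a minimal illustration with invented scores, not the dissertation's actual data handling.

        import numpy as np
        from scipy.stats import norm

        def chauvenet_mask(values):
            # True for points rejected by Chauvenet's criterion: the expected
            # number of values at least this extreme, N * P(|Z| >= z_i), is < 0.5.
            x = np.asarray(values, dtype=float)
            z = np.abs(x - x.mean()) / x.std(ddof=1)
            expected_extreme = x.size * 2.0 * norm.sf(z)
            return expected_extreme < 0.5

        # Invented quiz scores with one suspicious entry.
        scores = np.array([78, 82, 85, 74, 90, 88, 81, 79, 15, 84], dtype=float)
        mask = chauvenet_mask(scores)
        print(scores[mask])     # -> [15.]  (flagged outlier)
        print(scores[~mask])    # cleaned data used in subsequent analysis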

  20. New Developments in the Embedded Statistical Coupling Method: Atomistic/Continuum Crack Propagation

    NASA Technical Reports Server (NTRS)

    Saether, E.; Yamakov, V.; Glaessgen, E.

    2008-01-01

    A concurrent multiscale modeling methodology that embeds a molecular dynamics (MD) region within a finite element (FEM) domain has been enhanced. The concurrent MD-FEM coupling methodology uses statistical averaging of the deformation of the atomistic MD domain to provide interface displacement boundary conditions to the surrounding continuum FEM region, which, in turn, generates interface reaction forces that are applied as piecewise constant traction boundary conditions to the MD domain. The enhancement is based on the addition of molecular dynamics-based cohesive zone model (CZM) elements near the MD-FEM interface. The CZM elements are a continuum interpretation of the traction-displacement relationships taken from MD simulations using Cohesive Zone Volume Elements (CZVE). The addition of CZM elements to the concurrent MD-FEM analysis provides a consistent set of atomistically-based cohesive properties within the finite element region near the growing crack. Another set of CZVEs are then used to extract revised CZM relationships from the enhanced embedded statistical coupling method (ESCM) simulation of an edge crack under uniaxial loading.

  1. Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods

    NASA Technical Reports Server (NTRS)

    Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan

    2016-01-01

    The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.
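
    A minimal statsmodels sketch of the distributed-testing analysis chain: a full D-optimal plan is not reproduced here (that would need a dedicated DOE package), but fitting a quadratic response-surface model to single-replicate points scattered over many conditions, followed by an ANOVA on the terms, is shown with synthetic data; the factor names and coefficients are invented.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(4)

        # Distributed testing: one observation at each of many (x1, x2) conditions.
        n = 40
        df = pd.DataFrame({"x1": rng.uniform(-1, 1, n), "x2": rng.uniform(-1, 1, n)})
        df["y"] = (3.0 + 2.0 * df.x1 - 1.5 * df.x2 + 1.2 * df.x1 * df.x2
                   + 0.8 * df.x1 ** 2 + rng.normal(0, 0.3, n))

        # Quadratic response-surface model fitted by ordinary least squares.
        rsm = smf.ols("y ~ x1 + x2 + x1:x2 + I(x1**2) + I(x2**2)", data=df).fit()
        print(rsm.params)

        # ANOVA table showing which model terms contribute significantly.
        print(sm.stats.anova_lm(rsm, typ=2))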

  2. Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons.

    PubMed

    Obuchowski, Nancy A; Reeves, Anthony P; Huang, Erich P; Wang, Xiao-Feng; Buckler, Andrew J; Kim, Hyun J Grace; Barnhart, Huiman X; Jackson, Edward F; Giger, Maryellen L; Pennello, Gene; Toledano, Alicia Y; Kalpathy-Cramer, Jayashree; Apanasovich, Tatiyana V; Kinahan, Paul E; Myers, Kyle J; Goldgof, Dmitry B; Barboriak, Daniel P; Gillies, Robert J; Schwartz, Lawrence H; Sullivan, Daniel C

    2015-02-01

    Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.

  3. Statistical and graphical methods for quality control determination of high-throughput screening data.

    PubMed

    Gunter, Bert; Brideau, Christine; Pikounis, Bill; Liaw, Andy

    2003-12-01

    High-throughput screening (HTS) is used in modern drug discovery to screen hundreds of thousands to millions of compounds on selected protein targets. It is an industrial-scale process relying on sophisticated automation and state-of-the-art detection technologies. Quality control (QC) is an integral part of the process and is used to ensure good quality data and minimize assay variability while maintaining assay sensitivity. The authors describe new QC methods and show numerous real examples from their biologist-friendly Stat Server HTS application, a custom-developed software tool built from the commercially available S-PLUS and Stat Server statistical analysis and server software. This system remotely processes HTS data using powerful and sophisticated statistical methodology but insulates users from the technical details by outputting results in a variety of readily interpretable graphs and tables. It allows users to visualize HTS data and examine assay performance during the HTS campaign to quickly react to or avoid quality problems.
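
    The Stat Server application itself is proprietary S-PLUS code, so the sketch below shows only a generic per-plate QC diagnostic of the kind such systems compute: robust z-scores based on the plate median and MAD, which resist distortion by active (hit) wells. The plate values and flagging threshold are illustrative, not the paper's methods.

        import numpy as np

        def robust_zscores(plate_values):
            # Robust per-plate z-scores: deviations from the plate median scaled
            # by 1.4826 * MAD, which resists distortion by active (hit) wells.
            x = np.asarray(plate_values, dtype=float)
            med = np.median(x)
            mad = np.median(np.abs(x - med))
            return (x - med) / (1.4826 * mad)

        # Invented 384-well plate with one strong hit and one failed well.
        rng = np.random.default_rng(5)
        plate = rng.normal(100.0, 8.0, 384)
        plate[[17, 200]] = [320.0, 15.0]
        z = robust_zscores(plate)
        print(np.where(np.abs(z) > 3)[0])   # wells flagged for QC review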

  4. [The principal components analysis--method to classify the statistical variables with applications in medicine].

    PubMed

    Dascălu, Cristina Gena; Antohe, Magda Ecaterina

    2009-01-01

    Based on eigenvalue and eigenvector analysis, principal component analysis has the purpose of identifying the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criterion function and minimizing it). This method can be successfully used to simplify the statistical analysis of questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying principal component analysis, we identified 7 principal components--or main items--a fact that simplifies the further statistical analysis of the data significantly. PMID:21495371
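
    A minimal sketch of the item-reduction idea with simulated questionnaire responses: eigen-decomposition of the item correlation matrix and retention of components with eigenvalue greater than 1 (the Kaiser criterion, one common choice, not necessarily the authors' rule). The 28-item questionnaire and respondent data are placeholders.

        import numpy as np

        rng = np.random.default_rng(6)

        # Simulated questionnaire: 200 respondents x 28 items driven by a few
        # latent factors plus noise (a placeholder for the real survey data).
        latent = rng.normal(size=(200, 7))
        loadings = rng.normal(size=(7, 28))
        items = latent @ loadings + rng.normal(scale=2.0, size=(200, 28))

        # Principal components from the item correlation matrix.
        corr = np.corrcoef(items, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(corr)
        order = np.argsort(eigvals)[::-1]
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]

        kept = eigvals > 1.0                       # Kaiser criterion
        print("components retained:", int(kept.sum()))
        print("variance explained:", eigvals[kept].sum() / eigvals.sum())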

  5. Quantitative Imaging Biomarkers: A Review of Statistical Methods for Computer Algorithm Comparisons

    PubMed Central

    2014-01-01

    Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. PMID:24919829

  6. [The principal components analysis--method to classify the statistical variables with applications in medicine].

    PubMed

    Dascălu, Cristina Gena; Antohe, Magda Ecaterina

    2009-01-01

    Based on eigenvalue and eigenvector analysis, principal component analysis has the purpose of identifying the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criterion function and minimizing it). This method can be successfully used to simplify the statistical analysis of questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying principal component analysis, we identified 7 principal components--or main items--a fact that simplifies the further statistical analysis of the data significantly.

  7. Interactive statistical-distribution-analysis program utilizing numerical and graphical methods

    SciTech Connect

    Glandon, S. R.; Fields, D. E.

    1982-04-01

    The TERPED/P program is designed to facilitate the quantitative analysis of experimental data, determine the distribution function that best describes the data, and provide graphical representations of the data. This code differs from its predecessors, TEDPED and TERPED, in that a printer-plotter has been added for graphical output flexibility. The addition of the printer-plotter provides TERPED/P with a method of generating graphs that is not dependent on DISSPLA, Integrated Software Systems Corporation's confidential proprietary graphics package. This makes it possible to use TERPED/P on systems not equipped with DISSPLA. In addition, the printer plot is usually produced more rapidly than a high-resolution plot can be generated. Graphical and numerical tests are performed on the data in accordance with the user's assumption of normality or lognormality. Statistical analysis options include computation of the chi-squared statistic and its significance level and the Kolmogorov-Smirnov one-sample test confidence level for data sets of more than 80 points. Plots can be produced on a Calcomp paper plotter, a FR80 film plotter, or a graphics terminal using the high-resolution, DISSPLA-dependent plotter or on a character-type output device by the printer-plotter. The plots are of cumulative probability (abscissa) versus user-defined units (ordinate). The program was developed on a Digital Equipment Corporation (DEC) PDP-10 and consists of 1500 statements. The language used is FORTRAN-10, DEC's extended version of FORTRAN-IV.
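
    The numerical checks TERPED/P reports (goodness of fit under a normality or lognormality assumption) can be sketched with SciPy; this is not the original FORTRAN-10 code, and because the normal parameters are estimated from the data, the Kolmogorov-Smirnov p-value below is only approximate.

        import numpy as np
        from scipy import stats

        def fit_and_test(data, assume_lognormal=False):
            # Kolmogorov-Smirnov test of the user's assumption: a normal fitted to
            # the data, or to log(data) when lognormality is assumed. Parameters
            # are estimated from the data, so the p-value is only approximate.
            x = np.log(data) if assume_lognormal else np.asarray(data, dtype=float)
            mu, sigma = x.mean(), x.std(ddof=1)
            ks_stat, p_value = stats.kstest(x, "norm", args=(mu, sigma))
            return mu, sigma, ks_stat, p_value

        rng = np.random.default_rng(7)
        sample = rng.lognormal(mean=1.0, sigma=0.4, size=120)
        print("normal assumption:   ", fit_and_test(sample))
        print("lognormal assumption:", fit_and_test(sample, assume_lognormal=True))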

  8. Krylov iterative methods and synthetic acceleration for transport in binary statistical media

    SciTech Connect

    Fichtl, Erin D; Warsa, James S; Prinja, Anil K

    2008-01-01

    In particle transport applications there are numerous physical constructs in which heterogeneities are randomly distributed. The quantity of interest in these problems is the ensemble average of the flux, or the average of the flux over all possible material 'realizations.' The Levermore-Pomraning closure assumes Markovian mixing statistics and allows a closed, coupled system of equations to be written for the ensemble averages of the flux in each material. Generally, binary statistical mixtures are considered in which there are two (homogeneous) materials and corresponding coupled equations. The solution process is iterative, but convergence may be slow as either or both materials approach the diffusion and/or atomic mix limits. A three-part acceleration scheme is devised to expedite convergence, particularly in the atomic mix-diffusion limit where computation is extremely slow. The iteration is first divided into a series of 'inner' material and source iterations to attenuate the diffusion and atomic mix error modes separately. Secondly, atomic mix synthetic acceleration is applied to the inner material iteration and S2 synthetic acceleration to the inner source iterations to offset the cost of doing several inner iterations per outer iteration. Finally, a Krylov iterative solver is wrapped around each iteration, inner and outer, to further expedite convergence. A spectral analysis is conducted and iteration counts and computing cost for the new two-step scheme are compared against those for a simple one-step iteration, to which a Krylov iterative method can also be applied.

  9. A comparison of different statistical methods analyzing hypoglycemia data using bootstrap simulations.

    PubMed

    Jiang, Honghua; Ni, Xiao; Huster, William; Heilmann, Cory

    2015-01-01

    Hypoglycemia has long been recognized as a major barrier to achieving normoglycemia with intensive diabetic therapies. It is a common safety concern for the diabetes patients. Therefore, it is important to apply appropriate statistical methods when analyzing hypoglycemia data. Here, we carried out bootstrap simulations to investigate the performance of the four commonly used statistical models (Poisson, negative binomial, analysis of covariance [ANCOVA], and rank ANCOVA) based on the data from a diabetes clinical trial. Zero-inflated Poisson (ZIP) model and zero-inflated negative binomial (ZINB) model were also evaluated. Simulation results showed that Poisson model inflated type I error, while negative binomial model was overly conservative. However, after adjusting for dispersion, both Poisson and negative binomial models yielded slightly inflated type I errors, which were close to the nominal level and reasonable power. Reasonable control of type I error was associated with ANCOVA model. Rank ANCOVA model was associated with the greatest power and with reasonable control of type I error. Inflated type I error was observed with ZIP and ZINB models.
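
    A minimal statsmodels sketch of two of the four models compared (Poisson and negative binomial GLMs) on synthetic overdispersed event counts; it is not the trial data or the bootstrap study, and the dispersion parameter, effect size, and sample size are invented.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(8)

        # Synthetic, overdispersed hypoglycemia event counts for two arms.
        n = 200
        treatment = rng.integers(0, 2, n).astype(float)
        mu = np.exp(0.5 - 0.4 * treatment)
        counts = rng.negative_binomial(n=1.2, p=1.2 / (1.2 + mu))

        X = sm.add_constant(treatment)
        poisson_fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
        negbin_fit = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()

        # Pearson chi-square / residual df well above 1 signals overdispersion,
        # the situation in which the Poisson model inflates type I error.
        print("Poisson dispersion:", poisson_fit.pearson_chi2 / poisson_fit.df_resid)
        print("Poisson effect, p :", poisson_fit.params[1], poisson_fit.pvalues[1])
        print("NegBin  effect, p :", negbin_fit.params[1], negbin_fit.pvalues[1])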

  10. Computed statistics at streamgages, and methods for estimating low-flow frequency statistics and development of regional regression equations for estimating low-flow frequency statistics at ungaged locations in Missouri

    USGS Publications Warehouse

    Southard, Rodney E.

    2013-01-01

    The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical
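
    The trend-screening step (Kendall's tau on the annual series, trimming early years while a trend is detected) is easy to sketch; the annual 7-day minimum flows below are synthetic, not USGS records.

        import numpy as np
        from scipy.stats import kendalltau

        rng = np.random.default_rng(9)

        # Synthetic annual 7-day minimum flows with a gentle downward trend.
        years = np.arange(1971, 2011)
        flows = 120.0 - 0.8 * (years - years[0]) + rng.normal(0, 12, years.size)

        tau, p_value = kendalltau(years, flows)
        print(f"Kendall's tau = {tau:.3f}, p = {p_value:.4f}")
        if p_value < 0.05:
            print("Trend detected: drop early water years and retest before "
                  "computing low-flow frequency statistics.")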

  11. Zeolites to peptides: Statistical mechanics methods for structure solution and property evaluation

    NASA Astrophysics Data System (ADS)

    Pophale, Ramdas S.

    Methods in statistical mechanics are used to study structure and properties of zeolites and peptides. A Monte Carlo method is applied to solve structure of a newly synthesized zeolite. Understanding the structure of a zeolite could lead to optimization of chemical processes it is involved in. Dielectric constant and elastic modulus are calculated using a molecular dynamics method for a pure silica zeolite with and without the structure directing agent used in its synthesis. These properties are of interest due to the potential use of this zeolite as low dielectric constant material in manufacturing integrated circuits. Results of four methods probing energy landscapes in peptides are compared for four cyclic peptides. Their ability to equilibrate structural properties and their relative speeds are important in their ability to simulate complex structures. A docking study is carried out to probe interactions between two proteins, Cripto and Snail, and E-cadherin promoter. The study supports experimental evidence that Cripto is involved in expression of E-cadherin through a promoter priming mechanism. Finally, the use of computational models in the design of better vaccines is illustrated through an example of Influenza.

  12. A Statistical Method for Volcanic Hazard Assessment: Applications to Colima, Popocatepetl and Citlaltepetl Volcanoes, Mexico.

    NASA Astrophysics Data System (ADS)

    Mendoza-Rosas, A. T.; de La Cruz-Reyna, S.

    2007-05-01

    Volcanic-eruption time series are sequences describing processes of great complexity, and they represent one of the main tools for the assessment of volcanic hazard. The analysis of such series is thus a critical step in the precise assessment of the volcanic risk. The study of low-magnitude eruption sequences, which contain larger data populations, can usually be done using conventional methods and statistics, namely the Binomial or Poisson distributions. However, time-dependent processes, or sequences including rare or extreme events involving very few data, require special and specific methods of analysis, such as non-homogeneous Poisson process analysis or extreme-value theory. A general methodology for analyzing these types of processes is proposed in this work with the purpose of calculating more precise values of the volcanic eruption hazard. This is done in four steps: First, an exploratory analysis of the repose-period and eruptive-magnitude series is done, complementing the historical eruptive time series with geological eruption data and thus expanding the data population. Secondly, a Weibull analysis is performed on the distribution of repose times between successive eruptions. Thirdly, the eruption occurrence data are analyzed using a non-homogeneous Poisson process with a generalized Pareto distribution as its intensity function. Finally, these results are compared with fittings obtained from conventional Poisson and Binomial distributions. The hazard or eruption probabilities of three active polygenetic Mexican volcanoes, Colima, Popocatepetl and Citlaltepetl, are then calculated with this method and compared with the results obtained with other methods.
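
    The Weibull step of the methodology can be sketched directly; the repose times below are invented, and the non-homogeneous Poisson process with a generalized Pareto intensity is not reproduced.

        import numpy as np
        from scipy.stats import weibull_min

        # Invented repose times between successive eruptions, in years.
        repose_years = np.array([3.1, 7.4, 1.8, 12.5, 5.2, 2.9, 9.7, 4.4, 6.1, 15.3])

        # Two-parameter Weibull fit (location fixed at zero).
        shape, loc, scale = weibull_min.fit(repose_years, floc=0)

        # Probability of at least one eruption within t years of the last one,
        # i.e. P(repose <= t) under the fitted model.
        t = 5.0
        p_within_t = weibull_min.cdf(t, shape, loc=loc, scale=scale)
        print(f"shape={shape:.2f}, scale={scale:.2f}, "
              f"P(eruption within {t:.0f} yr)={p_within_t:.2f}")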

  13. A new method to obtain uniform distribution of ground control points based on regional statistical information

    NASA Astrophysics Data System (ADS)

    Ma, Chao; An, Wei; Deng, Xinpu

    2015-10-01

    Ground Control Points (GCPs) are an important source of fundamental data in geometric correction for remote sensing imagery. The quantity, accuracy and distribution of GCPs are three factors which may affect the accuracy of geometric correction. It is generally required that the distribution of GCPs be uniform, so they can fully control the accuracy of the mapped regions. In this paper, we establish an objective standard for evaluating the uniformity of the GCPs' distribution based on regional statistical information (RSI), and obtain an optimal distribution of GCPs. This sampling method is called RSIS for short in this work. The numbers of GCPs in different regions, obtained by equally partitioning the image into regions in different manners, are counted, which forms a vector called the RSI vector in this work. The uniformity of the GCPs' distribution can be evaluated by a mathematical quantity of the RSI vector. An optimal distribution of GCPs is obtained by searching for the RSI vector with the minimum mathematical quantity. In this paper, simulated annealing is employed to search for the optimal distribution of GCPs that has the minimum mathematical quantity of the RSI vector. Experiments are carried out to test the method proposed in this paper, and the sampling designs compared are simple random sampling and universal kriging model-based sampling. The experiments indicate that this method is highly recommended as a new GCP sampling design method for geometric correction of remotely sensed imagery.

  14. Evaluation of an Alternative Statistical Method for Analysis of RCRA Groundwater Monitoring Data at the Hanford Site

    SciTech Connect

    Chou, Charissa J.

    2004-06-24

    Statistical methods are required in groundwater monitoring programs to determine if a RCRA-regulated unit affects groundwater quality beneath a site. This report presents the results of the statistical analysis of groundwater monitoring data acquired at B Pond and the 300 Area process trenches during a 2-year trial test period.

  15. Reporting of allocation method and statistical analyses that deal with bilaterally affected wrists in clinical trials for carpal tunnel syndrome.

    PubMed

    Page, Matthew J; O'Connor, Denise A; Pitt, Veronica; Massy-Westropp, Nicola

    2013-11-01

    The authors aimed to describe how often the allocation method and the statistical analyses that deal with bilateral involvement are reported in clinical trials for carpal tunnel syndrome and to determine whether reporting has improved over time. Forty-two trials identified from recently published systematic reviews were assessed. Information about allocation method and statistical analyses was obtained from published reports and trialists. Only 15 trialists (36%) reported the method of random sequence generation used, and 6 trialists (14%) reported the method of allocation concealment used. Of 25 trials including participants with bilateral carpal tunnel syndrome, 17 (68%) reported the method used to allocate the wrists, whereas only 1 (4%) reported using a statistical analysis that appropriately dealt with bilateral involvement. There was no clear trend of improved reporting over time. Interventions are needed to improve reporting quality and statistical analyses of these trials so that these can provide more reliable evidence to inform clinical practice.

  16. Statistical Angle-of-Arrival and Doppler Method for GPS Radio Interferometry of TIDS

    NASA Astrophysics Data System (ADS)

    Afraimovich, E. L.; Palamartchouk, K. S.; Perevalova, N. P.

    A Statistical Angle-of-arrival and Doppler Method for GPS radio interferometry (SADM-GPS) is proposed for determining the characteristics of Travelling Ionospheric Disturbances (TIDs) by measuring variations of GPS phase derivatives with respect to time and spatial coordinates. These data are used to calculate corresponding values of the velocity vector, in view of a correction for satellite motion based on current information available regarding the angular coordinates of the satellites. Through a computer simulation it was shown that multi-satellite GPS radio interferometry in conjunction with the SADM-GPS algorithm allows for detecting and measuring the velocity vector of TIDs in virtually the entire azimuthal range of possible TID propagation directions.

  17. Forensic classification of counterfeit banknote paper by X-ray fluorescence and multivariate statistical methods.

    PubMed

    Guo, Hongling; Yin, Baohua; Zhang, Jie; Quan, Yangke; Shi, Gaojun

    2016-09-01

    Counterfeiting of banknotes is a crime and seriously harmful to economy. Examination of the paper, ink and toners used to make counterfeit banknotes can provide useful information to classify and link different cases in which the suspects use the same raw materials. In this paper, 21 paper samples of counterfeit banknotes seized from 13 cases were analyzed by wavelength dispersive X-ray fluorescence. After measuring the elemental composition in paper semi-quantitatively, the normalized weight percentage data of 10 elements were processed by multivariate statistical methods of cluster analysis and principle component analysis. All these paper samples were mainly classified into 3 groups. Nine separate cases were successfully linked. It is demonstrated that elemental composition measured by XRF is a useful way to compare and classify papers used in different cases. PMID:27342345

  18. A study of turbulent flow between parallel plates by a statistical method

    NASA Technical Reports Server (NTRS)

    Srinivasan, R.; Giddens, D. P.; Bangert, L. H.; Wu, J. C.

    1976-01-01

    Turbulent Couette flow between parallel plates was studied from a statistical mechanics approach utilizing a model equation, similar to the Boltzmann equation of kinetic theory, which was proposed by Lundgren from the velocity distribution of fluid elements. Solutions to this equation are obtained numerically, employing the discrete ordinate method and finite differences. Two types of boundary conditions on the distribution function are considered, and the results of the calculations are compared to available experimental data. The research establishes that Lundgren's equation provides a very good description of turbulence for the flow situation considered and that it offers an analytical tool for further study of more complex turbulent flows. The present work also indicates that modelling of the boundary conditions is an area where further study is required.

  19. Computing light statistics in heterogeneous media based on a mass weighted probability density function method.

    PubMed

    Jenny, Patrick; Mourad, Safer; Stamm, Tobias; Vöge, Markus; Simon, Klaus

    2007-08-01

    Based on the transport theory, we present a modeling approach to light scattering in turbid material. It uses an efficient and general statistical description of the material's scattering and absorption behavior. The model estimates the spatial distribution of intensity and the flow direction of radiation, both of which are required, e.g., for adaptable predictions of the appearance of colors in halftone prints. This is achieved by employing a computational particle method, which solves a model equation for the probability density function of photon positions and propagation directions. In this framework, each computational particle represents a finite probability of finding a photon in a corresponding state, including properties like wavelength. Model evaluations and verifications conclude the discussion.

  20. Statistically Qualified Neuro-Analytic system and Method for Process Monitoring

    SciTech Connect

    Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.

    1998-11-04

    An apparatus and method for monitoring a process involves development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two steps: deterministic model adaptation and stochastic model adaptation. Deterministic model adaptation involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation error minimization technique. Stochastic model adaptation involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system.
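
    The validation step above relies on sequential probability ratio tests. The sketch below is a generic Gaussian-mean Wald SPRT on model residuals, with invented thresholds and parameters; it illustrates the test, not the patented SQNA monitoring system.

        import numpy as np

        def sprt_gaussian(residuals, mu0, mu1, sigma, alpha=0.01, beta=0.01):
            # Wald sequential probability ratio test on model residuals:
            # H0: mean mu0 (no change) versus H1: mean mu1 (process change).
            upper = np.log((1 - beta) / alpha)     # decide H1 above this
            lower = np.log(beta / (1 - alpha))     # decide H0 below this
            llr = 0.0
            for x in residuals:
                llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2
                if llr >= upper:
                    return "accept H1 (process change)"
                if llr <= lower:
                    return "accept H0 (no change)"
            return "continue monitoring"

        rng = np.random.default_rng(10)
        print(sprt_gaussian(rng.normal(0.0, 1.0, 200), mu0=0.0, mu1=1.0, sigma=1.0))
        print(sprt_gaussian(rng.normal(1.0, 1.0, 200), mu0=0.0, mu1=1.0, sigma=1.0))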

  1. Statistical method to assess usual dietary intakes in the European population.

    PubMed

    Vilone, Giulia; Comiskey, Damien; Heraud, Fanny; O'Mahony, Cian

    2014-01-01

    Food consumption data are a key element of EFSA's risk assessment activities, forming the basis of dietary exposure assessment at the European level. In 2011, EFSA released the Comprehensive European Food Consumption Database, gathering consumption data from 34 national surveys representing 66,492 individuals from 22 European Union member states. Due to the different methodologies used, national survey data cannot be combined to generate European estimates of dietary exposure. This study was executed to assess how existing consumption data and the representativeness of dietary exposure and risk estimates at the European Union level can be improved by developing a 'Compiled European Food Consumption Database'. To create the database, the usual intake distributions of 589 food items representing the total diet were estimated for 36 clusters composed of subjects belonging to the same age class, gender and having a similar diet. An adapted form of the National Cancer Institute (NCI) method was used for this, with a number of important modifications. Season, body weight and whether or not the food was consumed at the weekend were used to predict the probability of consumption. A gamma distribution was found to be more suitable for modelling the distribution of food amounts in the different food groups instead of a normal distribution. These distributions were combined with food correlation matrices according to the Iman-Conover method in order to simulate 28 days of consumption for 40,000 simulated individuals. The simulated data were validated by comparing the consumption statistics of the simulated individuals and food groups with the same statistics estimated from the Comprehensive Database. The opportunities and limitations of using the simulated database for exposure assessments are described.
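
    Two of the building blocks, a gamma model for nonzero amounts and Bernoulli consumption days, are easy to illustrate in a heavily simplified form. The sketch below uses invented parameters and omits the NCI-method adaptations, the season and body-weight predictors, and the Iman-Conover correlation step described above.

        import numpy as np
        from scipy.stats import gamma

        rng = np.random.default_rng(11)

        # Invented nonzero consumption amounts (g/day) for one food item in one
        # cluster; in the real database these come from the national surveys.
        amounts = rng.gamma(shape=2.0, scale=60.0, size=500)

        # Fit a gamma distribution to the nonzero amounts (location fixed at 0).
        shape, loc, scale = gamma.fit(amounts, floc=0)
        p_consume = 0.35            # assumed daily probability of consumption

        # Simulate 28 days of intake for 1000 simulated individuals.
        n_people, n_days = 1000, 28
        consumed = rng.random((n_people, n_days)) < p_consume
        draws = gamma.rvs(shape, loc=0, scale=scale,
                          size=(n_people, n_days), random_state=rng)
        intake = np.where(consumed, draws, 0.0)
        usual = intake.mean(axis=1)                  # crude 28-day mean per person
        print(usual.mean(), np.percentile(usual, 95))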

  2. GSHSite: Exploiting an Iteratively Statistical Method to Identify S-Glutathionylation Sites with Substrate Specificity

    PubMed Central

    Chen, Yi-Ju; Lu, Cheng-Tsung; Huang, Kai-Yao; Wu, Hsin-Yi; Chen, Yu-Ju; Lee, Tzong-Yi

    2015-01-01

    S-glutathionylation, the covalent attachment of a glutathione (GSH) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-glutathionylation remains unknown. Based on a total of 1783 experimentally identified S-glutathionylation sites from mouse macrophages, this work presents an informatics investigation on S-glutathionylation sites including structural factors such as the flanking amino acid composition and the accessible surface area (ASA). TwoSampleLogo analysis shows that positively charged amino acids flanking the S-glutathionylated cysteine may influence the formation of S-glutathionylation in a closed three-dimensional environment. A statistical method is further applied to iteratively detect the conserved substrate motifs with statistical significance. A support vector machine (SVM) is then applied to generate a predictive model considering the substrate motifs. According to five-fold cross-validation, the SVMs trained with substrate motifs could achieve enhanced sensitivity, specificity, and accuracy, and provide a promising performance in an independent test set. The effectiveness of the proposed method is demonstrated by the correct identification of previously reported S-glutathionylation sites of mouse thioredoxin (TXN) and human protein tyrosine phosphatase 1b (PTP1B). Finally, the constructed models are adopted to implement an effective web-based tool, named GSHSite (http://csb.cse.yzu.edu.tw/GSHSite/), for identifying uncharacterized GSH substrate sites on the protein sequences. PMID:25849935
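    A schematic of the SVM-with-five-fold-cross-validation workflow mentioned above, using synthetic stand-in features rather than the paper's motif encodings (scikit-learn assumed):

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(2)
        # Stand-in features for sequence windows centred on candidate cysteines
        X = rng.normal(size=(400, 40))
        y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=400) > 0).astype(int)  # 1 = site

        model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")       # five-fold CV
        print(f"five-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")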

  3. Statistical method to assess usual dietary intakes in the European population.

    PubMed

    Vilone, Giulia; Comiskey, Damien; Heraud, Fanny; O'Mahony, Cian

    2014-01-01

    Food consumption data are a key element of EFSA's risk assessment activities, forming the basis of dietary exposure assessment at the European level. In 2011, EFSA released the Comprehensive European Food Consumption Database, gathering consumption data from 34 national surveys representing 66,492 individuals from 22 European Union member states. Due to the different methodologies used, national survey data cannot be combined to generate European estimates of dietary exposure. This study was executed to assess how existing consumption data and the representativeness of dietary exposure and risk estimates at the European Union level can be improved by developing a 'Compiled European Food Consumption Database'. To create the database, the usual intake distributions of 589 food items representing the total diet were estimated for 36 clusters composed of subjects belonging to the same age class, gender and having a similar diet. An adapted form of the National Cancer Institute (NCI) method was used for this, with a number of important modifications. Season, body weight and whether or not the food was consumed at the weekend were used to predict the probability of consumption. A gamma distribution was found to be more suitable for modelling the distribution of food amounts in the different food groups instead of a normal distribution. These distributions were combined with food correlation matrices according to the Iman-Conover method in order to simulate 28 days of consumption for 40,000 simulated individuals. The simulated data were validated by comparing the consumption statistics of the simulated individuals and food groups with the same statistics estimated from the Comprehensive Database. The opportunities and limitations of using the simulated database for exposure assessments are described. PMID:25205439

  4. A Method for Simulating Correlated Non-Normal Systems of Statistical Equations.

    ERIC Educational Resources Information Center

    Headrick, Todd C.; Beasley, T. Mark

    Real world data often fail to meet the underlying assumptions of normal statistical theory. Many statistical procedures in the psychological and educational sciences involve models that may include a system of statistical equations with non-normal correlated variables (e.g., factor analysis, structural equation modeling, or other complex…

  5. Statistical method for prediction of gait kinematics with Gaussian process regression.

    PubMed

    Yun, Youngmok; Kim, Hyun-Chul; Shin, Sung Yul; Lee, Junwon; Deshpande, Ashish D; Kim, Changhwan

    2014-01-01

    We propose a novel methodology for predicting human gait pattern kinematics based on a statistical and stochastic approach using a method called Gaussian process regression (GPR). We selected 14 body parameters that significantly affect the gait pattern and 14 joint motions that represent gait kinematics. The body parameter and gait kinematics data were recorded from 113 subjects by anthropometric measurements and a motion capture system. We generated a regression model with GPR for gait pattern prediction and built a stochastic function mapping from body parameters to gait kinematics based on the database and GPR, and validated the model with a cross validation method. The function can not only produce trajectories for the joint motions associated with gait kinematics, but can also estimate the associated uncertainties. Our approach results in a novel, low-cost and subject-specific method for predicting gait kinematics with only the subject's body parameters as the necessary input, and also enables a comprehensive understanding of the correlation and uncertainty between body parameters and gait kinematics. PMID:24211221
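    A minimal Gaussian process regression sketch in the spirit of the abstract, mapping 14 body parameters to a single gait-kinematics target and returning a predictive uncertainty; the data are synthetic and the kernel choice is an assumption (scikit-learn assumed):

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(3)
        # Synthetic stand-ins: X = body parameters (e.g. height, leg length, ...),
        # y = one gait-kinematics target (e.g. a peak joint angle).
        X = rng.normal(size=(113, 14))
        y = X @ rng.normal(size=14) + 0.3 * rng.normal(size=113)

        kernel = 1.0 * RBF(length_scale=np.ones(14)) + WhiteKernel(noise_level=0.1)
        gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

        x_new = rng.normal(size=(1, 14))              # a new subject's body parameters
        mean, std = gpr.predict(x_new, return_std=True)
        print(f"predicted value: {mean[0]:.2f}, predictive std: {std[0]:.2f}")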

  6. Statistical Evaluation and Improvement of Methods for Combining Random and Harmonic Loads

    NASA Technical Reports Server (NTRS)

    Brown, A. M.; McGhee, D. S.

    2003-01-01

    Structures in many environments experience both random and harmonic excitation. A variety of closed-form techniques has been used in the aerospace industry to combine the loads resulting from the two sources. The resulting combined loads are then used to design for both yield/ultimate strength and high-cycle fatigue capability. This Technical Publication examines the cumulative distribution function (CDF) percentiles obtained using each method by integrating the joint probability density function of the sine and random components. A new Microsoft Excel spreadsheet macro that links with the software program Mathematica to calculate the combined value corresponding to any desired percentile is then presented along with a curve fit to this value. Another Excel macro that calculates the combination using Monte Carlo simulation is shown. Unlike the traditional techniques, these methods quantify the calculated load value with a consistent percentile. Using either of the presented methods can be extremely valuable in probabilistic design, which requires a statistical characterization of the loading. Additionally, since the CDF at high probability levels is very flat, the design value is extremely sensitive to the predetermined percentile; therefore, applying the new techniques can substantially lower the design loading without losing any of the identified structural reliability.
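    A hedged sketch of the Monte Carlo combination idea: sample the harmonic load at a uniformly random phase, add a zero-mean Gaussian random load, and read off a design value at a consistent percentile of the combined distribution (amplitudes and percentile are illustrative, not values from the publication):

        import numpy as np

        rng = np.random.default_rng(4)
        n = 1_000_000

        sigma_random = 1.0      # RMS of the random (Gaussian) load component (assumed)
        sine_amplitude = 1.5    # peak amplitude of the harmonic component (assumed)

        # Monte Carlo combination of the two load sources
        phase = rng.uniform(0.0, 2.0 * np.pi, size=n)
        combined = sine_amplitude * np.sin(phase) + rng.normal(0.0, sigma_random, size=n)

        # Design value at a consistent percentile of the combined-load CDF
        p = 99.865                               # e.g. roughly a 3-sigma one-sided level
        design_load = np.percentile(combined, p)
        print(f"{p}th-percentile combined load: {design_load:.3f}")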

  7. Statistical Comparison and Improvement of Methods for Combining Random and Harmonic Loads

    NASA Technical Reports Server (NTRS)

    Brown, Andrew M.; McGhee, David S.

    2004-01-01

    Structures in many environments experience both random and harmonic excitation. A variety of closed-form techniques has been used in the aerospace industry to combine the loads resulting from the two sources. The resulting combined loads are then used to design for both yield/ultimate strength and high-cycle fatigue capability. This paper examines the cumulative distribution function (CDF) percentiles obtained using each method by integrating the joint probability density function of the sine and random components. A new Microsoft Excel spreadsheet macro that links with the software program Mathematica is then used to calculate the combined value corresponding to any desired percentile along with a curve fit to this value. Another Excel macro is used to calculate the combination using a Monte Carlo simulation. Unlike the traditional techniques, these methods quantify the calculated load value with a consistent percentile. Using either of the presented methods can be extremely valuable in probabilistic design, which requires a statistical characterization of the loading. Also, since the CDF at high probability levels is very flat, the design value is extremely sensitive to the predetermined percentile; therefore, applying the new techniques can lower the design loading substantially without losing any of the identified structural reliability.

  8. GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data

    PubMed Central

    Cooke, Thomas F.; Yee, Muh-Ching; Muzzio, Marina; Sockell, Alexandra; Bell, Ryan; Cornejo, Omar E.; Kelley, Joanna L.; Bailliet, Graciela; Bravi, Claudio M.; Bustamante, Carlos D.; Kenny, Eimear E.

    2016-01-01

    Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth. PMID:26828719

  9. Quantitative imaging biomarkers: a review of statistical methods for technical performance assessment.

    PubMed

    Raunig, David L; McShane, Lisa M; Pennello, Gene; Gatsonis, Constantine; Carson, Paul L; Voyvodic, James T; Wahl, Richard L; Kurland, Brenda F; Schwarz, Adam J; Gönen, Mithat; Zahlmann, Gudrun; Kondratovich, Marina V; O'Donnell, Kevin; Petrick, Nicholas; Cole, Patricia E; Garra, Brian; Sullivan, Daniel C

    2015-02-01

    Technological developments and greater rigor in the quantitative measurement of biological features in medical images have given rise to an increased interest in using quantitative imaging biomarkers to measure changes in these features. Critical to the performance of a quantitative imaging biomarker in preclinical or clinical settings are three primary metrology areas of interest: measurement linearity and bias, repeatability, and the ability to consistently reproduce equivalent results when conditions change, as would be expected in any clinical trial. Unfortunately, performance studies to date differ greatly in designs, analysis method, and metrics used to assess a quantitative imaging biomarker for clinical use. It is therefore difficult or not possible to integrate results from different studies or to use reported results to design studies. The Radiological Society of North America and the Quantitative Imaging Biomarker Alliance with technical, radiological, and statistical experts developed a set of technical performance analysis methods, metrics, and study designs that provide terminology, metrics, and methods consistent with widely accepted metrological standards. This document provides a consistent framework for the conduct and evaluation of quantitative imaging biomarker performance studies so that results from multiple studies can be compared, contrasted, or combined.
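    For illustration only, two of the metrology quantities discussed above (a test-retest difference and a 95% repeatability coefficient) can be computed from paired replicate measurements as follows; the data and the 2.77·wSD convention are assumptions, not values from the document.

        import numpy as np

        # Paired test-retest measurements of a quantitative imaging biomarker
        # (e.g. lesion volume in mL) for five subjects; values are illustrative.
        scan1 = np.array([10.2, 14.8, 7.9, 21.5, 12.4])
        scan2 = np.array([10.9, 14.1, 8.4, 20.3, 13.0])

        diff = scan2 - scan1
        mean_diff = diff.mean()                        # mean test-retest difference
        wsd = np.sqrt(np.mean(diff**2) / 2.0)          # within-subject standard deviation
        rc = 2.77 * wsd                                # 95% repeatability coefficient
        print(f"mean difference = {mean_diff:.2f}, wSD = {wsd:.2f}, RC = {rc:.2f}")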

  10. A Review of the Statistical and Quantitative Methods Used to Study Alcohol-Attributable Crime.

    PubMed

    Fitterer, Jessica L; Nelson, Trisalyn A

    2015-01-01

    Modelling the relationship between alcohol consumption and crime generates new knowledge for crime prevention strategies. Advances in data, particularly data with spatial and temporal attributes, have led to a growing suite of applied methods for modelling. In support of alcohol and crime researchers we synthesized and critiqued existing methods of spatially and quantitatively modelling the effects of alcohol exposure on crime to aid method selection, and identify new opportunities for analysis strategies. We searched the alcohol-crime literature from 1950 to January 2014. Analyses that statistically evaluated or mapped the association between alcohol and crime were included. For modelling purposes, crime data were most often derived from generalized police reports, aggregated to large spatial units such as census tracts or postal codes, and standardized by residential population data. Sixty-eight of the 90 selected studies included geospatial data of which 48 used cross-sectional datasets. Regression was the predominant modelling choice (n = 78), though many variations existed depending on the data. There are opportunities to improve information for alcohol-attributable crime prevention by using alternative population data to standardize crime rates, sourcing crime information from non-traditional platforms (social media), increasing the number of panel studies, and conducting analysis at the local level (neighbourhood, block, or point). Due to the spatio-temporal advances in crime data, we expect a continued uptake of flexible Bayesian hierarchical modelling, a greater inclusion of spatial-temporal point pattern analysis, and a shift toward prospective (forecast) modelling over small areas (e.g., blocks).

  11. GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data.

    PubMed

    Cooke, Thomas F; Yee, Muh-Ching; Muzzio, Marina; Sockell, Alexandra; Bell, Ryan; Cornejo, Omar E; Kelley, Joanna L; Bailliet, Graciela; Bravi, Claudio M; Bustamante, Carlos D; Kenny, Eimear E

    2016-02-01

    Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth. PMID:26828719

  12. Statistical analysis to assess automated level of suspicion scoring methods in breast ultrasound

    NASA Astrophysics Data System (ADS)

    Galperin, Michael

    2003-05-01

    A well-defined rule-based system has been developed for scoring the Level of Suspicion (LOS) on a 0-5 scale based on a qualitative lexicon describing the ultrasound appearance of breast lesions. The purpose of the research is to assess and select one of the automated quantitative LOS scoring methods developed during preliminary studies on the reduction of benign biopsies. The study used a Computer Aided Imaging System (CAIS) to improve the uniformity and accuracy of applying the LOS scheme by automatically detecting, analyzing and comparing breast masses. The overall goal is to reduce biopsies on the masses with lower levels of suspicion, rather than to increase the accuracy of diagnosis of cancers (which will require biopsy anyway). On complex cysts and fibroadenoma cases, experienced radiologists were up to 50% less certain in true negatives than CAIS. Full correlation analysis was applied to determine which of the proposed LOS quantification methods serves CAIS accuracy the best. This paper presents current results of applying statistical analysis for automated LOS scoring quantification for breast masses with known biopsy results. It was found that the First Order Ranking method yielded the most accurate results. The CAIS system (Image Companion, Data Companion software) was developed by Almen Laboratories and was used to achieve the results.

  13. Two-time Green's functions and spectral density method in nonextensive quantum statistical mechanics.

    PubMed

    Cavallo, A; Cosenza, F; De Cesare, L

    2008-05-01

    We extend the formalism of the thermodynamic two-time Green's functions to nonextensive quantum statistical mechanics. Working in the optimal Lagrangian multiplier representation, the q-spectral properties and the methods for a direct calculation of the two-time q-Green's functions and the related q-spectral density (q measures the nonextensivity degree) for two generic operators are presented in strict analogy with the extensive (q=1) counterpart. Some emphasis is devoted to the nonextensive version of the less known spectral density method whose effectiveness in exploring equilibrium and transport properties of a wide variety of systems has been well established in conventional classical and quantum many-body physics. To check how both the equations of motion and the spectral density methods work to study the q-induced nonextensivity effects in nontrivial many-body problems, we focus on the equilibrium properties of a second-quantized model for a high-density Bose gas with strong attraction between particles for which exact results exist in extensive conditions. Remarkably, the contributions to several thermodynamic quantities of the q-induced nonextensivity close to the extensive regime are explicitly calculated in the low-temperature regime by overcoming the calculation of the q-grand-partition function.
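    For orientation, the extensive (q = 1) object that this formalism generalizes is the standard retarded two-time Green's function; the textbook definition (not an equation taken from the paper) reads

        G^{(r)}_{AB}(t,t') \equiv \langle\langle A(t); B(t') \rangle\rangle_{r}
            = -\, i\, \theta(t - t')\, \big\langle [A(t), B(t')]_{\eta} \big\rangle ,
        \qquad \eta = \pm 1 ,

    where θ is the Heaviside step function and η selects the commutator or anticommutator according to the operators' statistics; broadly speaking, the nonextensive formalism of the paper replaces the canonical thermal average with the corresponding q-average in the optimal Lagrangian multiplier representation.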

  14. On Statistical Methods for Common Mean and Reference Confidence Intervals in Interlaboratory Comparisons for Temperature

    NASA Astrophysics Data System (ADS)

    Witkovský, Viktor; Wimmer, Gejza; Ďuriš, Stanislav

    2015-08-01

    We consider a problem of constructing the exact and/or approximate coverage intervals for the common mean of several independent distributions. In a metrological context, this problem is closely related to the evaluation of interlaboratory comparison experiments, and in particular, to the determination of the reference value (estimate) of a measurand and its uncertainty, or alternatively, to the determination of the coverage interval for a measurand at a given level of confidence, based on such comparison data. We present a brief overview of some specific statistical models, methods, and algorithms useful for determination of the common mean and its uncertainty, or alternatively, the proper interval estimator. We illustrate their applicability by a simple simulation study and also by an example of interlaboratory comparisons for temperature. In particular, we shall consider methods based on (i) the heteroscedastic common mean fixed effect model, assuming negligible laboratory biases, (ii) the heteroscedastic common mean random effects model with common (unknown) distribution of the laboratory biases, and (iii) the heteroscedastic common mean random effects model with possibly different (known) distributions of the laboratory biases. Finally, we consider a method, recently suggested by Singh et al., for determination of the interval estimator for a common mean based on combining information from independent sources through confidence distributions.
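    Under approach (i), the fixed-effects common mean with negligible laboratory biases, the reference value is often taken as the inverse-variance weighted mean; a minimal sketch with hypothetical laboratory results follows (the Graybill-Deal-type estimator shown is a standard choice, not necessarily the exact estimator used in the paper).

        import numpy as np

        # Hypothetical laboratory results for a temperature comparison: value and
        # standard uncertainty (both illustrative).
        x = np.array([20.013, 20.021, 20.008, 20.017])   # degrees C
        u = np.array([0.004, 0.006, 0.003, 0.005])       # standard uncertainties

        # Inverse-variance weighted common mean under the fixed-effects model
        w = 1.0 / u**2
        ref_value = np.sum(w * x) / np.sum(w)
        ref_uncertainty = np.sqrt(1.0 / np.sum(w))

        print(f"reference value: {ref_value:.4f} C, standard uncertainty: {ref_uncertainty:.4f} C")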

  15. A Review of the Statistical and Quantitative Methods Used to Study Alcohol-Attributable Crime

    PubMed Central

    Fitterer, Jessica L.; Nelson, Trisalyn A.

    2015-01-01

    Modelling the relationship between alcohol consumption and crime generates new knowledge for crime prevention strategies. Advances in data, particularly data with spatial and temporal attributes, have led to a growing suite of applied methods for modelling. In support of alcohol and crime researchers we synthesized and critiqued existing methods of spatially and quantitatively modelling the effects of alcohol exposure on crime to aid method selection, and identify new opportunities for analysis strategies. We searched the alcohol-crime literature from 1950 to January 2014. Analyses that statistically evaluated or mapped the association between alcohol and crime were included. For modelling purposes, crime data were most often derived from generalized police reports, aggregated to large spatial units such as census tracts or postal codes, and standardized by residential population data. Sixty-eight of the 90 selected studies included geospatial data of which 48 used cross-sectional datasets. Regression was the predominant modelling choice (n = 78), though many variations existed depending on the data. There are opportunities to improve information for alcohol-attributable crime prevention by using alternative population data to standardize crime rates, sourcing crime information from non-traditional platforms (social media), increasing the number of panel studies, and conducting analysis at the local level (neighbourhood, block, or point). Due to the spatio-temporal advances in crime data, we expect a continued uptake of flexible Bayesian hierarchical modelling, a greater inclusion of spatial-temporal point pattern analysis, and a shift toward prospective (forecast) modelling over small areas (e.g., blocks). PMID:26418016

  16. Computed statistics at streamgages, and methods for estimating low-flow frequency statistics and development of regional regression equations for estimating low-flow frequency statistics at ungaged locations in Missouri

    USGS Publications Warehouse

    Southard, Rodney E.

    2013-01-01

    The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical
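    A schematic of the trend screening described above: compute Kendall's tau on an annual low-flow series and, in the spirit of the variable length method, drop water years from the beginning of the record until no significant trend is detected (the series, significance level, and 10-year minimum are illustrative assumptions, not USGS data).

        import numpy as np
        from scipy.stats import kendalltau

        rng = np.random.default_rng(5)
        # Synthetic annual 7-day low-flow series for one streamgage (illustrative units)
        years = np.arange(1960, 2011)
        flows = 50 + 0.3 * (years - 1960) + rng.normal(0, 5, size=years.size)

        alpha = 0.05
        start = 0
        while start < years.size - 10:                   # keep at least 10 years of record
            tau, p = kendalltau(years[start:], flows[start:])
            if p >= alpha:                               # no significant monotonic trend
                break
            start += 1                                   # drop one more early water year

        print(f"period of no detected trend: {years[start]}-{years[-1]} "
              f"(tau={tau:.2f}, p={p:.3f})")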

  17. Is a vegetarian diet adequate for children.

    PubMed

    Hackett, A; Nathan, I; Burgess, L

    1998-01-01

    The number of people who avoid eating meat is growing, especially among young people. Benefits to health from a vegetarian diet have been reported in adults, but it is not clear to what extent these benefits are due to diet or to other aspects of lifestyle. In children, concern has been expressed about the adequacy of vegetarian diets, especially with regard to growth. The risks/benefits seem to be related to the degree of restriction of the diet; anaemia is probably both the main and the most serious risk, but this also applies to omnivores. Vegan diets are more likely to be associated with malnutrition, especially if the diets are the result of authoritarian dogma. Overall, lacto-ovo-vegetarian children consume diets closer to recommendations than omnivores and their pre-pubertal growth is at least as good. The simplest strategy when becoming vegetarian may involve reliance on vegetarian convenience foods, which are not necessarily superior in nutritional composition. The vegetarian sector of the food industry could do more to produce foods closer to recommendations. Vegetarian diets can be, but are not necessarily, adequate for children, providing vigilance is maintained, particularly to ensure variety. Identical comments apply to omnivorous diets. Three threats to the diet of children are too much reliance on convenience foods, lack of variety and lack of exercise.

  18. Inferences on weather extremes and weather-related disasters: a review of statistical methods

    NASA Astrophysics Data System (ADS)

    Visser, H.; Petersen, A. C.

    2011-09-01

    The study of weather extremes and their impacts, such as weather-related disasters, plays an important role in climate-change research. Due to the great societal consequences of extremes - historically, now and in the future - the peer-reviewed literature on this theme has been growing enormously since the 1980s. Data sources have a wide origin, from century-long climate reconstructions from tree rings to short databases with disaster statistics and human impacts (30 to 60 yr). In scanning the peer-reviewed literature on weather extremes and impacts thereof we noticed that many different methods are used to make inferences. However, discussions on methods are rare. Such discussions are important since a particular methodological choice might substantially influence the inferences made. A calculation of a return period of once in 500 yr, based on a normal distribution will deviate from that based on a Gumbel distribution. And the particular choice between a linear or a flexible trend model might influence inferences as well. In this article we give a concise overview of statistical methods applied in the field of weather extremes and weather-related disasters. Methods have been evaluated as to stationarity assumptions, the choice for specific probability density functions (PDFs) and the availability of uncertainty information. As for stationarity we found that good testing is essential. Inferences on extremes may be wrong if data are assumed stationary while they are not. The same holds for the block-stationarity assumption. As for PDF choices we found that often more than one PDF shape fits to the same data. From a simulation study we conclude that both the generalized extreme value (GEV) distribution and the log-normal PDF fit very well to a variety of indicators. The application of the normal and Gumbel distributions is more limited. As for uncertainty it is advised to test conclusions on extremes for assumptions underlying the modeling approach. Finally, we
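    To make the return-period point concrete, the sketch below fits GEV, Gumbel, and normal distributions to a synthetic annual-maximum series and compares the implied 500-yr return levels (all numbers are illustrative; scipy assumed).

        import numpy as np
        from scipy import stats

        # Synthetic annual maxima (e.g. daily precipitation, mm); parameters are invented
        annual_maxima = stats.genextreme.rvs(c=-0.1, loc=40.0, scale=8.0, size=80,
                                             random_state=42)

        return_period = 500.0
        p_exceed = 1.0 / return_period

        # 500-yr return level under a fitted GEV...
        shape, loc, scale = stats.genextreme.fit(annual_maxima)
        level_gev = stats.genextreme.ppf(1.0 - p_exceed, shape, loc=loc, scale=scale)

        # ...versus under fitted Gumbel and normal distributions
        gl, gs = stats.gumbel_r.fit(annual_maxima)
        level_gum = stats.gumbel_r.ppf(1.0 - p_exceed, loc=gl, scale=gs)
        mu, sigma = stats.norm.fit(annual_maxima)
        level_norm = stats.norm.ppf(1.0 - p_exceed, loc=mu, scale=sigma)

        print(f"500-yr return level  GEV: {level_gev:.1f}  Gumbel: {level_gum:.1f}  "
              f"normal: {level_norm:.1f}")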

  19. Inferences on weather extremes and weather-related disasters: a review of statistical methods

    NASA Astrophysics Data System (ADS)

    Visser, H.; Petersen, A. C.

    2012-02-01

    The study of weather extremes and their impacts, such as weather-related disasters, plays an important role in research of climate change. Due to the great societal consequences of extremes - historically, now and in the future - the peer-reviewed literature on this theme has been growing enormously since the 1980s. Data sources have a wide origin, from century-long climate reconstructions from tree rings to relatively short (30 to 60 yr) databases with disaster statistics and human impacts. When scanning peer-reviewed literature on weather extremes and its impacts, it is noticeable that many different methods are used to make inferences. However, discussions on these methods are rare. Such discussions are important since a particular methodological choice might substantially influence the inferences made. A calculation of a return period of once in 500 yr, based on a normal distribution will deviate from that based on a Gumbel distribution. And the particular choice between a linear or a flexible trend model might influence inferences as well. In this article, a concise overview of statistical methods applied in the field of weather extremes and weather-related disasters is given. Methods have been evaluated as to stationarity assumptions, the choice for specific probability density functions (PDFs) and the availability of uncertainty information. As for stationarity assumptions, the outcome was that good testing is essential. Inferences on extremes may be wrong if data are assumed stationary while they are not. The same holds for the block-stationarity assumption. As for PDF choices it was found that often more than one PDF shape fits to the same data. From a simulation study the conclusion can be drawn that both the generalized extreme value (GEV) distribution and the log-normal PDF fit very well to a variety of indicators. The application of the normal and Gumbel distributions is more limited. As for uncertainty, it is advisable to test conclusions on extremes

  20. Oil and Gas on Indian Reservations: Statistical Methods Help to Establish Value for Royalty Purposes

    ERIC Educational Resources Information Center

    Fowler, Mary S.; Kadane, Joseph B.

    2006-01-01

    Part of the history of oil and gas development on Indian reservations concerns potential underpayment of royalties due to under-valuation of production by oil companies. This paper discusses a model used by the Shoshone and Arapaho tribes in a lawsuit against the Federal government, claiming the Government failed to collect adequate royalties.…

  1. Statistical methods in public health and epidemiology: a look at the recent past and projections for the next decade.

    PubMed

    Levy, P S; Stolte, K

    2000-02-01

    This article attempts to prognosticate from past patterns, the type of statistical methods that will be used in published public health and epidemiological studies in the decade that follows the millennium. With this in mind, we conducted a study that would characterize trends in use of statistical methods in two major public health journals: the American Journal of Public Health, and the American Journal of Epidemiology. We took a probability sample of 348 articles published in these journals between 1970 and 1998. For each article sampled, we abstracted information on the design of the study and the types of statistical methods used in the article. Our major findings are that the proportion of articles using statistical methods as well as the mean number of statistical methods used per article has increased dramatically over the three decades surveyed. Also, the proportion of published articles using study designs that we classified as analytic has increased over the years. We also examined patterns of use in these journals of three statistical methodologies: logistic regression, proportional hazards regression, and methods for analysis of data from complex sample surveys. These methods were selected because they had been introduced initially in the late 1960s or early 1970s and had made considerable impact on data analysis in the biomedical sciences in the 1970s-90s. Estimated usage of each of these techniques remained relatively low until user-friendly software became available. Our overall conclusions are that new statistical methods are developed on the basis of need, disseminated to potential users over a course of many years, and often do not reach maximum use until tools for their comfortable use are made readily available to potential users. Based on these conclusions, we identify certain needs that are not now being met and which are likely to generate new statistical methodologies that we will see in the next decade.

  2. Statistical downscaling of precipitation using local regression and high accuracy surface modeling method

    NASA Astrophysics Data System (ADS)

    Zhao, Na; Yue, Tianxiang; Zhou, Xun; Zhao, Mingwei; Liu, Yu; Du, Zhengping; Zhang, Lili

    2016-03-01

    Downscaling precipitation is required in local scale climate impact studies. In this paper, a statistical downscaling scheme was presented that combines a geographically weighted regression (GWR) model with a recently developed method, the high accuracy surface modeling method (HASM). This proposed method was compared with another downscaling method using the Coupled Model Intercomparison Project Phase 5 (CMIP5) database and ground-based data from 732 stations across China for the period 1976-2005. The residual produced by GWR was corrected by comparing different interpolators, including HASM, Kriging, the inverse distance weighted method (IDW), and Spline. The spatial downscaling from 1° to 1-km grids for the period 1976-2005 and future scenarios was achieved by using the proposed downscaling method. The prediction accuracy was assessed at two separate validation sites throughout China and Jiangxi Province on both annual and seasonal scales, with the root mean square error (RMSE), mean relative error (MRE), and mean absolute error (MAE). The results indicate that the developed model in this study outperforms the method that builds a transfer function using the gauge values. There is a large improvement in the results when using a residual correction with meteorological station observations. In comparison with the other three classical interpolators, HASM shows better performance in modifying the residual produced by the local regression method. The success of the developed technique lies in the effective use of the datasets and the modification process of the residual by using HASM. The results from the future climate scenarios show that precipitation exhibits an overall increasing trend from T1 (2011-2040) to T2 (2041-2070) and T2 to T3 (2071-2100) in the RCP2.6, RCP4.5, and RCP8.5 emission scenarios. The most significant increase occurs in RCP8.5 from T2 to T3, while the lowest increase is found in RCP2.6 from T2 to T3, with increases of 47.11 and 2.12 mm, respectively.
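    The validation criteria named above (RMSE, MAE, MRE) are straightforward to compute once downscaled values are paired with station observations; a small sketch with made-up numbers and an assumed definition of MRE follows.

        import numpy as np

        def validation_metrics(observed, predicted):
            """RMSE, MAE and mean relative error for downscaled vs observed values."""
            observed = np.asarray(observed, dtype=float)
            predicted = np.asarray(predicted, dtype=float)
            err = predicted - observed
            rmse = np.sqrt(np.mean(err**2))
            mae = np.mean(np.abs(err))
            mre = np.mean(np.abs(err) / observed)    # assumes strictly positive observations
            return rmse, mae, mre

        obs = np.array([812.0, 950.5, 1104.2, 765.3])    # annual precipitation (mm; illustrative)
        pred = np.array([798.4, 975.1, 1080.9, 790.6])   # downscaled estimates at the same stations
        print("RMSE=%.1f mm, MAE=%.1f mm, MRE=%.3f" % validation_metrics(obs, pred))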

  3. Indoor Soiling Method and Outdoor Statistical Risk Analysis of Photovoltaic Power Plants

    NASA Astrophysics Data System (ADS)

    Rajasekar, Vidyashree

    This is a two-part thesis. Part 1 presents an approach for working towards the development of a standardized artificial soiling method for laminated photovoltaic (PV) cells or mini-modules. Construction of an artificial chamber to maintain controlled environmental conditions and the components/chemicals used in artificial soil formulation are briefly explained. Both poly-Si mini-modules and single-cell mono-Si coupons were soiled, and characterization tests such as I-V, reflectance and quantum efficiency (QE) were carried out on both soiled and cleaned coupons. From the results obtained, poly-Si mini-modules proved to be a good measure of soil uniformity, as any non-uniformity present would not result in a smooth curve during I-V measurements. The challenges faced while executing reflectance and QE characterization tests on poly-Si due to the smaller cell size were eliminated on the mono-Si coupons with large cells, allowing highly repeatable measurements. This study indicates that the reflectance measurements between 600-700 nm wavelengths can be used as a direct measure of soil density on the modules. Part 2 determines the most dominant failure modes of field-aged PV modules using experimental data obtained in the field and a statistical analysis, FMECA (Failure Mode, Effect, and Criticality Analysis). The failure and degradation modes of about 744 poly-Si glass/polymer frameless modules fielded for 18 years under the cold-dry climate of New York were evaluated. A defect chart, degradation rates (at both string and module levels) and a safety map were generated using the field-measured data. A statistical reliability tool, FMECA, which uses the Risk Priority Number (RPN), is used to determine the dominant failure or degradation modes in the strings and modules by means of ranking and prioritizing the modes. This study on PV power plants considers all the failure and degradation modes from both safety and performance perspectives. The indoor and outdoor soiling studies were jointly
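    The FMECA ranking reduces to computing a Risk Priority Number per failure mode and sorting; the sketch below uses hypothetical severity/occurrence/detection ratings, not the thesis data.

        # Hypothetical FMECA ranking: RPN = severity x occurrence x detection (1-10 scales)
        failure_modes = {
            "encapsulant discoloration": (4, 7, 3),
            "solder bond fatigue":       (7, 4, 6),
            "glass breakage":            (9, 2, 2),
            "backsheet cracking":        (6, 3, 5),
        }

        rpn = {mode: s * o * d for mode, (s, o, d) in failure_modes.items()}
        for mode, value in sorted(rpn.items(), key=lambda kv: kv[1], reverse=True):
            print(f"{mode:28s} RPN = {value}")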

  4. How to choose the right statistical software?-a method increasing the post-purchase satisfaction.

    PubMed

    Cavaliere, Roberto

    2015-12-01

    Nowadays, we live in the "data era" where the use of statistical or data analysis software is inevitable, in any research field. This means that the choice of the right software tool or platform is a strategic issue for a research department. Nevertheless, in many cases decision makers do not pay the right attention to a comprehensive and appropriate evaluation of what the market offers. Indeed, the choice still depends on a few factors like, for instance, the researcher's personal inclination, e.g., which software has been used at the university or is already known. This is not wrong in principle, but in some cases it is not enough at all and might lead to a "dead end" situation, typically after months or years of investment already made in the wrong software. This article, far from being a full and complete guide to statistical software evaluation, aims to illustrate some key points of the decision process and introduce an extended range of factors which can help in making the right choice, at least in potential. There is not enough literature on this topic, which is most of the time underestimated, both in the traditional literature and even in the so-called "gray literature", even if some documents or short pages can be found online. Anyhow, it seems there is no common and established standpoint on the process of software evaluation from the final user perspective. We suggest a multi-factor analysis leading to an evaluation matrix tool, to be intended as a flexible and customizable tool, aimed to provide a clearer picture of the software alternatives available, not in the abstract but related to the researcher's own context and needs. This method is the result of about twenty years of experience of the author in the field of evaluating and using technical-computing software and partially arises from research on such topics carried out as part of a project funded by the European Commission under the Lifelong Learning Programme 2011. PMID:26793368
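    The evaluation matrix reduces to a weighted multi-criteria score; a toy sketch with invented criteria, weights, and package scores (purely illustrative, not the author's matrix) is shown below.

        # Hypothetical evaluation matrix: criteria weights and per-package scores (1-5)
        weights = {"statistical coverage": 0.30, "licensing cost": 0.20,
                   "learning curve": 0.15, "support & documentation": 0.20,
                   "integration with existing tools": 0.15}

        scores = {
            "Package A": {"statistical coverage": 5, "licensing cost": 2, "learning curve": 3,
                          "support & documentation": 4, "integration with existing tools": 4},
            "Package B": {"statistical coverage": 4, "licensing cost": 5, "learning curve": 4,
                          "support & documentation": 3, "integration with existing tools": 3},
        }

        for package, s in scores.items():
            total = sum(weights[c] * s[c] for c in weights)
            print(f"{package}: weighted score = {total:.2f}")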

  5. Influence of threshold value in the use of statistical methods for groundwater vulnerability assessment.

    PubMed

    Masetti, Marco; Sterlacchini, Simone; Ballabio, Cristiano; Sorichetta, Alessandro; Poli, Simone

    2009-06-01

    Statistical techniques can be used in groundwater pollution problems to determine the relationships among observed contamination (impacted wells representing an occurrence of what has to be predicted), environmental factors that may influence it, and the potential contamination sources. Determination of a threshold concentration to discriminate between impacted and non-impacted wells represents a key issue in the application of these techniques. In this work the effects of using different threshold values on groundwater vulnerability assessment by statistical methods have been evaluated. The study area (Province of Milan, northern Italy) is about 2000 km² and groundwater nitrate concentration is constantly monitored by a network of about 300 wells. Along with different predictor factors, three different threshold values of nitrate concentration have been considered to perform the vulnerability assessment of the shallow unconfined aquifer. The likelihood ratio model has been chosen to analyze the spatial distribution of the vulnerable areas. The reliability of the three final vulnerability maps has been tested, showing that all maps identify a general positive trend relating the mean nitrate concentration in the wells to the vulnerability classes the same wells belong to. Then, using the kappa coefficient, the influence of the different threshold values has been evaluated by comparing the spatial distribution of the resulting vulnerability classes in each map. The use of different thresholds does not lead to different vulnerability assessments if results are analyzed on a broad scale, even if the smallest threshold value gives the poorest performance in terms of reliability. On the contrary, the spatial distribution of a detailed vulnerability assessment is strongly influenced by the selected threshold used to identify the occurrences, suggesting that there is a strong relationship among the number of identified occurrences, the scale of the maps representing the predictor
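    The kappa comparison between threshold-specific vulnerability maps can be sketched as follows, with synthetic class labels standing in for the mapped grid cells (scikit-learn assumed).

        import numpy as np
        from sklearn.metrics import cohen_kappa_score

        rng = np.random.default_rng(6)
        # Vulnerability classes (e.g. 1=low ... 4=high) assigned to the same grid cells by
        # two maps built with different nitrate thresholds (synthetic labels for illustration)
        map_low_threshold = rng.integers(1, 5, size=2000)
        map_high_threshold = np.where(rng.random(2000) < 0.8, map_low_threshold,
                                      rng.integers(1, 5, size=2000))

        kappa = cohen_kappa_score(map_low_threshold, map_high_threshold)
        print(f"agreement between the two vulnerability maps: kappa = {kappa:.2f}")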

  6. How to choose the right statistical software?-a method increasing the post-purchase satisfaction.

    PubMed

    Cavaliere, Roberto

    2015-12-01

    Nowadays, we live in the "data era" where the use of statistical or data analysis software is inevitable, in any research field. This means that the choice of the right software tool or platform is a strategic issue for a research department. Nevertheless, in many cases decision makers do not pay the right attention to a comprehensive and appropriate evaluation of what the market offers. Indeed, the choice still depends on a few factors like, for instance, the researcher's personal inclination, e.g., which software has been used at the university or is already known. This is not wrong in principle, but in some cases it is not enough at all and might lead to a "dead end" situation, typically after months or years of investment already made in the wrong software. This article, far from being a full and complete guide to statistical software evaluation, aims to illustrate some key points of the decision process and introduce an extended range of factors which can help in making the right choice, at least in potential. There is not enough literature on this topic, which is most of the time underestimated, both in the traditional literature and even in the so-called "gray literature", even if some documents or short pages can be found online. Anyhow, it seems there is no common and established standpoint on the process of software evaluation from the final user perspective. We suggest a multi-factor analysis leading to an evaluation matrix tool, to be intended as a flexible and customizable tool, aimed to provide a clearer picture of the software alternatives available, not in the abstract but related to the researcher's own context and needs. This method is the result of about twenty years of experience of the author in the field of evaluating and using technical-computing software and partially arises from research on such topics carried out as part of a project funded by the European Commission under the Lifelong Learning Programme 2011.

  7. Methods of learning in statistical education: Design and analysis of a randomized trial

    NASA Astrophysics Data System (ADS)

    Boyd, Felicity Turner

    Background. Recent psychological and technological advances suggest that active learning may enhance understanding and retention of statistical principles. A randomized trial was designed to evaluate the addition of innovative instructional methods within didactic biostatistics courses for public health professionals. Aims. The primary objectives were to evaluate and compare the addition of two active learning methods (cooperative and internet) on students' performance; assess their impact on performance after adjusting for differences in students' learning style; and examine the influence of learning style on trial participation. Methods. Consenting students enrolled in a graduate introductory biostatistics course were randomized to cooperative learning, internet learning, or control after completing a pretest survey. The cooperative learning group participated in eight small group active learning sessions on key statistical concepts, while the internet learning group accessed interactive mini-applications on the same concepts. Controls received no intervention. Students completed evaluations after each session and a post-test survey. Study outcome was performance quantified by examination scores. Intervention effects were analyzed by generalized linear models using intent-to-treat analysis and marginal structural models accounting for reported participation. Results. Of 376 enrolled students, 265 (70%) consented to randomization; 69, 100, and 96 students were randomized to the cooperative, internet, and control groups, respectively. Intent-to-treat analysis showed no differences between study groups; however, 51% of students in the intervention groups had dropped out after the second session. After accounting for reported participation, expected examination scores were 2.6 points higher (of 100 points) after completing one cooperative learning session (95% CI: 0.3, 4.9) and 2.4 points higher after one internet learning session (95% CI: 0.0, 4.7), versus

  8. TEGS-CN: A Statistical Method for Pathway Analysis of Genome-wide Copy Number Profile.

    PubMed

    Huang, Yen-Tsung; Hsu, Thomas; Christiani, David C

    2014-01-01

    The effects of copy number alterations make up a significant part of the tumor genome profile, but pathway analyses of these alterations are still not well established. We proposed a novel method to analyze multiple copy numbers of genes within a pathway, termed Test for the Effect of a Gene Set with Copy Number data (TEGS-CN). TEGS-CN was adapted from TEGS, a method that we previously developed for gene expression data using a variance component score test. With additional development, we extend the method to analyze DNA copy number data, accounting for different sizes and thus various numbers of copy number probes in genes. The test statistic follows a mixture of χ² distributions that can be obtained using permutation with a scaled χ² approximation. We conducted simulation studies to evaluate the size and the power of TEGS-CN and to compare its performance with TEGS. We analyzed genome-wide copy number data from 264 patients with non-small-cell lung cancer. With the Molecular Signatures Database (MSigDB) pathway database, the genome-wide copy number data can be classified into 1814 biological pathways or gene sets. We investigated associations of the copy number profile of the 1814 gene sets with pack-years of cigarette smoking. Our analysis revealed five pathways with significant P values after Bonferroni adjustment (<2.8 × 10⁻⁵), including the PTEN pathway (7.8 × 10⁻⁷), the gene set up-regulated under heat shock (3.6 × 10⁻⁶), the gene sets involved in the immune profile for rejection of kidney transplantation (9.2 × 10⁻⁶) and for transcriptional control of leukocytes (2.2 × 10⁻⁵), and the ganglioside biosynthesis pathway (2.7 × 10⁻⁵). In conclusion, we present a new method for pathway analyses of copy number data, and causal mechanisms of the five pathways require further study.
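    As a quick arithmetic check of the significance threshold quoted above, the Bonferroni adjustment over the 1814 gene sets gives:

        n_gene_sets = 1814
        alpha = 0.05
        print(f"Bonferroni threshold: {alpha / n_gene_sets:.1e}")   # ~2.8e-05, as in the abstract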

  9. Feature Selection Applying Statistical and Neurofuzzy Methods to EEG-Based BCI.

    PubMed

    Martinez-Leon, Juan-Antonio; Cano-Izquierdo, Jose-Manuel; Ibarrola, Julio

    2015-01-01

    This paper presents an investigation aimed at drastically reducing the processing burden required by motor imagery brain-computer interface (BCI) systems based on electroencephalography (EEG). In this research, the focus has moved from the channel to the feature paradigm, and a 96% reduction of the number of features required in the process has been achieved while maintaining and even improving the classification success rate. This way, it is possible to build cheaper, quicker, and more portable BCI systems. The data set used was provided within the framework of BCI Competition III, which allows the presented results to be compared with the classification accuracy achieved in the contest. Furthermore, a new three-step methodology has been developed which includes a feature discriminant character calculation stage; a score, order, and selection phase; and a final feature selection step. For the first stage, both statistical methods and fuzzy criteria are used. The fuzzy criteria are based on the S-dFasArt classification algorithm, which has shown excellent performance in previous papers undertaking the BCI multiclass motor imagery problem. The score, order, and selection stage is used to sort the features according to their discriminant nature. Finally, both order selection and Group Method Data Handling (GMDH) approaches are used to choose the most discriminant ones.
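    A generic score-order-select sketch in the spirit of the pipeline above, ranking synthetic features by an ANOVA F-score and keeping the top few; this stands in for, and is not, the paper's S-dFasArt or GMDH stages (scikit-learn assumed).

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, f_classif

        # Synthetic stand-in for EEG-derived features of motor-imagery trials
        X, y = make_classification(n_samples=300, n_features=200, n_informative=10,
                                   random_state=0)

        # Score and order features by a statistical discriminant criterion, then keep
        # the top ~4% of them.
        selector = SelectKBest(score_func=f_classif, k=8).fit(X, y)
        top_features = np.argsort(selector.scores_)[::-1][:8]
        print("indices of the most discriminant features:", top_features)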

  10. Bayesian Analysis of Two Stellar Populations in Galactic Globular Clusters. I. Statistical and Computational Methods

    NASA Astrophysics Data System (ADS)

    Stenning, D. C.; Wagner-Kaiser, R.; Robinson, E.; van Dyk, D. A.; von Hippel, T.; Sarajedini, A.; Stein, N.

    2016-07-01

    We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations. Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties—age, metallicity, helium abundance, distance, absorption, and initial mass—are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and also show how model misspecification can potentially be identified. As a proof of concept, we analyze the two stellar populations of globular cluster NGC 5272 using our model and methods. (BASE-9 is available from GitHub: https://github.com/argiopetech/base/releases).
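    The adaptive MCMC ingredient can be illustrated with a minimal Haario-style adaptive Metropolis sampler on a toy two-parameter posterior; this is a generic sketch, not the BASE-9 algorithm, and the target density is invented.

        import numpy as np

        rng = np.random.default_rng(10)

        def log_post(theta):
            """Toy log-posterior with two correlated parameters (stand-in for cluster
            parameters such as age and helium abundance)."""
            cov = np.array([[1.0, 0.8], [0.8, 1.0]])
            return -0.5 * theta @ np.linalg.solve(cov, theta)

        n_iter, d = 20_000, 2
        chain = np.zeros((n_iter, d))
        theta, lp = np.zeros(d), log_post(np.zeros(d))
        prop_cov = 0.1 * np.eye(d)

        for i in range(1, n_iter):
            proposal = rng.multivariate_normal(theta, prop_cov)
            lp_new = log_post(proposal)
            if np.log(rng.random()) < lp_new - lp:       # Metropolis accept/reject
                theta, lp = proposal, lp_new
            chain[i] = theta
            # Adaptive step: after a burn-in, retune the proposal to the empirical
            # covariance of the chain so far.
            if i > 1000 and i % 500 == 0:
                prop_cov = (2.38**2 / d) * np.cov(chain[:i].T) + 1e-6 * np.eye(d)

        print("posterior mean estimate:", chain[5000:].mean(axis=0).round(2))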

  11. A Statistical Method for Measuring the Galactic Potential and Testing Gravity with Cold Tidal Streams

    NASA Astrophysics Data System (ADS)

    Peñarrubia, Jorge; Koposov, Sergey E.; Walker, Matthew G.

    2012-11-01

    We introduce the Minimum Entropy Method, a simple statistical technique for constraining the Milky Way gravitational potential and simultaneously testing different gravity theories directly from 6D phase-space surveys and without adopting dynamical models. We demonstrate that orbital energy distributions that are separable (i.e., independent of position) have an associated entropy that increases under wrong assumptions about the gravitational potential and/or gravity theory. Of known objects, "cold" tidal streams from low-mass progenitors follow orbital distributions that most nearly satisfy the condition of separability. Although the orbits of tidally stripped stars are perturbed by the progenitor's self-gravity, systematic variations of the energy distribution can be quantified in terms of the cross-entropy of individual tails, giving further sensitivity to theoretical biases in the host potential. The feasibility of using the Minimum Entropy Method to test a wide range of gravity theories is illustrated by evolving restricted N-body models in a Newtonian potential and examining the changes in entropy introduced by Dirac, MONDian, and f(R) gravity modifications.
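    A toy numerical illustration of the minimum-entropy idea: for a mock stream whose stars share nearly one energy in the true potential, the entropy of the binned energy distribution is smallest when the trial potential matches the truth. The potential family, units, and noise level are all invented for the sketch and are not taken from the paper.

        import numpy as np

        rng = np.random.default_rng(11)
        # Mock "cold stream": radii r and speeds v chosen so that all stars share nearly
        # the same energy in a logarithmic potential with v_circ^2 = 1.2 (code units).
        r = rng.uniform(5.0, 15.0, size=2000)
        v = np.sqrt(2.0 * (1.2 * np.log(r / 5.0) + 0.05)) * (1.0 + 0.01 * rng.normal(size=2000))

        def energy_entropy(v_circ_sq):
            """Shannon entropy of the binned energy distribution under a trial potential."""
            energy = 0.5 * v**2 - v_circ_sq * np.log(r / 5.0)
            counts, _ = np.histogram(energy, bins=np.linspace(-2.0, 2.0, 81))
            p = counts[counts > 0] / counts.sum()
            return -np.sum(p * np.log(p))

        trial = np.linspace(0.6, 1.8, 25)
        entropies = [energy_entropy(t) for t in trial]
        best = trial[int(np.argmin(entropies))]
        print(f"entropy-minimizing trial potential: v_circ^2 = {best:.2f} (true value 1.2)")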

  12. Remote access methods for exploratory data analysis and statistical modelling: Privacy-Preserving Analytics.

    PubMed

    Sparks, Ross; Carter, Chris; Donnelly, John B; O'Keefe, Christine M; Duncan, Jodie; Keighley, Tim; McAullay, Damien

    2008-09-01

    This paper is concerned with the challenge of enabling the use of confidential or private data for research and policy analysis, while protecting confidentiality and privacy by reducing the risk of disclosure of sensitive information. Traditional solutions to the problem of reducing disclosure risk include releasing de-identified data and modifying data before release. In this paper we discuss the alternative approach of using a remote analysis server which does not enable any data release, but instead is designed to deliver useful results of user-specified statistical analyses with a low risk of disclosure. The techniques described in this paper enable a user to conduct a wide range of methods in exploratory data analysis, regression and survival analysis, while at the same time reducing the risk that the user can read or infer any individual record attribute value. We illustrate our methods with examples from biostatistics using publicly available data. We have implemented our techniques into a software demonstrator called Privacy-Preserving Analytics (PPA), via a web-based interface to the R software. We believe that PPA may provide an effective balance between the competing goals of providing useful information and reducing disclosure risk in some situations.

  13. Feature Selection Applying Statistical and Neurofuzzy Methods to EEG-Based BCI

    PubMed Central

    Martinez-Leon, Juan-Antonio; Cano-Izquierdo, Jose-Manuel; Ibarrola, Julio

    2015-01-01

    This paper presents an investigation aimed at drastically reducing the processing burden required by motor imagery brain-computer interface (BCI) systems based on electroencephalography (EEG). In this research, the focus has moved from the channel to the feature paradigm, and a 96% reduction of the number of features required in the process has been achieved while maintaining and even improving the classification success rate. This way, it is possible to build cheaper, quicker, and more portable BCI systems. The data set used was provided within the framework of BCI Competition III, which allows the presented results to be compared with the classification accuracy achieved in the contest. Furthermore, a new three-step methodology has been developed which includes a feature discriminant character calculation stage; a score, order, and selection phase; and a final feature selection step. For the first stage, both statistical methods and fuzzy criteria are used. The fuzzy criteria are based on the S-dFasArt classification algorithm, which has shown excellent performance in previous papers undertaking the BCI multiclass motor imagery problem. The score, order, and selection stage is used to sort the features according to their discriminant nature. Finally, both order selection and Group Method Data Handling (GMDH) approaches are used to choose the most discriminant ones. PMID:25977685

  14. Detecting Gene-Environment Interactions in Human Birth Defects: Study Designs and Statistical Methods

    PubMed Central

    Tai, Caroline G.; Graff, Rebecca E.; Liu, Jinghua; Passarelli, Michael N.; Mefford, Joel A.; Shaw, Gary M.; Hoffmann, Thomas J.; Witte, John S.

    2015-01-01

    Background The National Birth Defects Prevention Study (NBDPS) contains a wealth of information on affected and unaffected family triads, and thus provides numerous opportunities to study gene-environment interactions (GxE) in the etiology of birth defect outcomes. Depending on the research objective, several analytic options exist to estimate GxE effects that utilize varying combinations of individuals drawn from available triads. Methods In this paper we discuss several considerations in the collection of genetic data and environmental exposures. We will also present several population- and family-based approaches that can be applied to data from the NBDPS including case-control, case-only, family-based trio, and maternal versus fetal effects. For each, we describe the data requirements, applicable statistical methods, advantages and disadvantages. Discussion A range of approaches can be used to evaluate potentially important GxE effects in the NBDPS. Investigators should be aware of the limitations inherent to each approach when choosing a study design and interpreting results. PMID:26010994

  15. Secondary minimum coagulation in charged colloidal suspensions from statistical mechanics methods.

    PubMed

    Cortada, María; Anta, Juan A; Molina-Bolívar, J A

    2007-02-01

    A statistical mechanics approach is applied to predict the critical parameters of coagulation in the secondary minimum for charged colloidal suspensions. This method is based on the solution of the reference hypernetted chain (RHNC) integral equation, and it is intended to estimate only the locus of the critical point rather than fully computing the "gas-liquid" coexistence. We have used an extrapolation procedure because the integral equation has no solution in the vicinity of the critical point. Knowing that the osmotic isothermal compressibility of the colloidal system should ideally diverge at the critical point, we work out the critical salt concentration for which the inverse of the compressibility should be zero. This extrapolation procedure is more rapid than that previously proposed by Morales and co-workers [Morales, V.; Anta, J. A.; Lago, S. Langmuir 2003, 19, 475], and it is shown to give equivalent results. We also present experimental results on secondary minimum coagulation for polystyrene latexes and use our method to reproduce the experimental trends. The comparison between theory and experiment is quite good for all colloidal diameters studied.
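    A minimal sketch of the extrapolation step described above: fit the inverse osmotic compressibility, computed at salt concentrations where the integral equation still converges, as a smooth function of salt concentration and take its zero crossing as the critical salt concentration. The numerical values and the quadratic fit are illustrative assumptions, not RHNC results.

```python
# Sketch of the extrapolation idea only: fit the inverse osmotic compressibility
# versus salt concentration in the solvable region and locate its zero crossing,
# taken as the critical salt concentration. Data and the quadratic form are toys.
import numpy as np

c_salt = np.array([1.0, 2.0, 3.0, 4.0, 5.0])         # mM, solvable region (toy)
inv_chi = np.array([0.80, 0.55, 0.34, 0.18, 0.07])   # inverse compressibility (toy)

coeffs = np.polyfit(c_salt, inv_chi, 2)               # quadratic extrapolation
roots = np.roots(coeffs)
real_roots = roots[np.isreal(roots)].real
c_crit = min(r for r in real_roots if r > c_salt.max())
print(f"estimated critical salt concentration: {c_crit:.2f} mM")
```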

  16. Native fluorescence spectroscopy of cervical tissues: classification by different statistical methods

    NASA Astrophysics Data System (ADS)

    Ganesan, Singaravelu; Vengadesan, Nammalver; Anbupalam, Thalaimuthu; Hemamalini, Srinivasan; Aruna, Prakasa R.; Karkuzhali, P.

    2002-05-01

    Optical spectroscopy in the diagnosis of diseases has attracted the medical community due to its minimally invasive nature. Among various optical spectroscopic techniques, native fluorescence spectroscopy has emerged as a potential tool in diagnostic oncology. However, the reasons for the altered spectral signatures between normal and cancerous tissues are not yet completely understood. Recently reported data suggest that emission changes due to the alteration of some proteins are associated with the transformation of normal tissue into malignant tissue. In this regard, the present study aims to characterize the native fluorescence spectra of abnormal and normal cervical tissues at 280 nm excitation. From the study, it is observed that normal and pathologically diseased cervical tissues have their peak emission around 339 and 336 nm respectively, with a secondary peak around 440 nm. The FWHM value of the emission spectra of abnormal tissues is lower than that of normal tissues. The fluorescence spectra of normal and various pathological conditions of cancerous tissues were analyzed by various empirical and statistical methods. Among the various types of discriminant analysis, a combination of ratio values and a linear discriminant method provides better discrimination of normal from pre-malignant and malignant tissues.
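    As an illustration of the ratio-plus-linear-discriminant idea (not the study's data or exact pipeline), the sketch below forms intensity-ratio features from toy peak intensities and scores a linear discriminant classifier by cross-validation; the ratio definitions and intensities are assumptions.

```python
# Illustrative sketch: intensity-ratio features from fluorescence peaks, classified
# with linear discriminant analysis. The ratios (e.g., I340/I440) and toy data are
# assumptions, not the study's measurements.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 60
labels = np.repeat([0, 1, 2], n)               # normal, pre-malignant, malignant (toy)

# Toy peak intensities; diseased tissue gets a relatively stronger ~340 nm peak.
i340 = rng.normal(1.0, 0.1, 3 * n) + 0.25 * labels
i440 = rng.normal(0.5, 0.05, 3 * n)
features = np.column_stack([i340 / i440, i340 / (i340 + i440)])

clf = LinearDiscriminantAnalysis()
scores = cross_val_score(clf, features, labels, cv=5)
print("5-fold CV accuracy:", scores.mean().round(3))
```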

  17. PNS and statistical experiments simulation in subcritical systems using Monte-Carlo method on example of Yalina-Thermal assembly

    NASA Astrophysics Data System (ADS)

    Sadovich, Sergey; Talamo, A.; Burnos, V.; Kiyavitskaya, H.; Fokov, Yu.

    2014-06-01

    In subcritical systems driven by an external neutron source, experimental methods based on a pulsed neutron source and statistical techniques play an important role in reactivity measurement. Simulation of these methods is a very time-consuming procedure. Several improvements to the neutronic calculations have been made for simulations in Monte Carlo programs. This paper introduces a new method for simulating PNS and statistical measurements. In this method, all events occurring in the detector during the simulation are stored in a file using the PTRAC feature of MCNP. After that, PNS and statistical methods can be simulated with a special post-processing code. Additionally, different neutron pulse shapes and lengths, as well as detector dead time, can be included in the simulation. The methods described above were tested on the subcritical assembly Yalina-Thermal, located at the Joint Institute for Power and Nuclear Research SOSNY, Minsk, Belarus. A good agreement between experimental and simulated results was shown.

  18. Spectral-Lagrangian methods for collisional models of non-equilibrium statistical states

    SciTech Connect

    Gamba, Irene M. Tharkabhushanam, Sri Harsha

    2009-04-01

    We propose a new spectral Lagrangian based deterministic solver for the non-linear Boltzmann transport equation (BTE) in d-dimensions for variable hard sphere (VHS) collision kernels with conservative or non-conservative binary interactions. The method is based on symmetries of the Fourier transform of the collision integral, where the complexity in its computation is reduced to a separate integral over the unit sphere S^(d-1). The conservation of moments is enforced by Lagrangian constraints. The resulting scheme, implemented in free space, is very versatile and adjusts in a very simple manner to several cases that involve energy dissipation due to local micro-reversibility (inelastic interactions) or elastic models of the slowing-down process. Our simulations are benchmarked with available exact self-similar solutions, exact moment equations and analytical estimates for the homogeneous Boltzmann equation, both for elastic and inelastic VHS interactions. Benchmarking of the simulations involves the selection of a time self-similar rescaling of the numerical distribution function, which is performed using the continuous spectrum of the equation for Maxwell molecules as studied first in Bobylev et al. [A.V. Bobylev, C. Cercignani, G. Toscani, Proof of an asymptotic property of self-similar solutions of the Boltzmann equation for granular materials, Journal of Statistical Physics 111 (2003) 403-417] and generalized to a wide range of related models in Bobylev et al. [A.V. Bobylev, C. Cercignani, I.M. Gamba, On the self-similar asymptotics for generalized non-linear kinetic Maxwell models, Communications in Mathematical Physics, in press]. The method also produces accurate results in the case of inelastic diffusive Boltzmann equations for hard spheres (inelastic collisions under a thermal bath), where overpopulated non-Gaussian exponential tails have been conjectured in computations by stochastic methods [T.V. Noije, M. Ernst

  19. Statistical Track-Before-Detect Methods Applied to Faint Optical Observations of Resident Space Objects

    NASA Astrophysics Data System (ADS)

    Fujimoto, K.; Yanagisawa, T.; Uetsuhara, M.

    Automated detection and tracking of faint objects in optical, or bearing-only, sensor imagery is a topic of immense interest in space surveillance. Robust methods in this realm will lead to better space situational awareness (SSA) while reducing the cost of sensors and optics. They are especially relevant in the search for high area-to-mass ratio (HAMR) objects, as their apparent brightness can change significantly over time. A track-before-detect (TBD) approach has been shown to be suitable for faint, low signal-to-noise ratio (SNR) images of resident space objects (RSOs). TBD does not rely upon the extraction of feature points within the image based on some thresholding criteria, but rather directly takes as input the intensity information from the image file. Not only is all of the available information from the image used, but TBD also avoids the computational intractability of the conventional feature-based line detection (i.e., "string of pearls") approach to track detection for low SNR data. Implementation of TBD rooted in finite set statistics (FISST) theory has been proposed recently by Vo et al. Compared to other TBD methods applied so far to SSA, such as the stacking method or multi-pass multi-period denoising, the FISST approach is statistically rigorous and has been shown to be more computationally efficient, thus paving the path toward on-line processing. In this paper, we intend to apply a multi-Bernoulli filter to actual CCD imagery of RSOs. The multi-Bernoulli filter can explicitly account for the birth and death of multiple targets in a measurement arc. TBD is achieved via a sequential Monte Carlo implementation. Preliminary results with simulated single-target data indicate that a Bernoulli filter can successfully track and detect objects with measurement SNR as low as 2.4. Although the advent of fast-cadence scientific CMOS sensors has made the automation of faint object detection a realistic goal, it is nonetheless a difficult goal, as measurements

  20. A statistical model for assessing performance standards for quantitative and semiquantitative disinfectant test methods.

    PubMed

    Parker, Albert E; Hamilton, Martin A; Tomasino, Stephen F

    2014-01-01

    A performance standard for a disinfectant test method can be evaluated by quantifying the (Type I) pass-error rate for ineffective products and the (Type II) fail-error rate for highly effective products. This paper shows how to calculate these error rates for test methods where the log reduction in a microbial population is used as a measure of antimicrobial efficacy. The calculations can be used to assess performance standards that may require multiple tests of multiple microbes at multiple laboratories. Notably, the error rates account for among-laboratory variance of the log reductions estimated from a multilaboratory data set and the correlation among tests of different microbes conducted in the same laboratory. Performance standards that require that a disinfectant product pass all tests or multiple tests on average, are considered. The proposed statistical methodology is flexible and allows for a different acceptable outcome for each microbe tested, since, for example, variability may be different for different microbes. The approach can also be applied to semiquantitative methods for which product efficacy is reported as the number of positive carriers out of a treated set and the density of the microbes on control carriers is quantified, thereby allowing a log reduction to be calculated. Therefore, using the approach described in this paper, the error rates can also be calculated for semiquantitative method performance standards specified solely in terms of the maximum allowable number of positive carriers per test. The calculations are demonstrated in a case study of the current performance standard for the semiquantitative AOAC Use-Dilution Methods for Pseudomonas aeruginosa (964.02) and Staphylococcus aureus (955.15), which allow up to one positive carrier out of a set of 60 inoculated and treated carriers in each test. A simulation study was also conducted to verify the validity of the model's assumptions and accuracy. Our approach, easily implemented
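    A Monte Carlo sketch of the kind of error-rate calculation described above, assuming illustrative variance components and a "pass all tests" rule rather than the AOAC case-study values:

```python
# Monte Carlo sketch of pass/fail error rates for a log-reduction performance
# standard (illustrative parameters): log reductions vary among laboratories and
# are correlated within a lab; a product "passes" only if every test meets the
# required log reduction.
import numpy as np

rng = np.random.default_rng(3)

def pass_rate(true_lr, required_lr=5.0, n_labs=3, tests_per_lab=2,
              sd_lab=0.5, sd_test=0.4, n_sim=20000):
    """Probability that all tests in all labs meet the required log reduction."""
    lab_effect = rng.normal(0.0, sd_lab, (n_sim, n_labs, 1))
    tests = true_lr + lab_effect + rng.normal(0.0, sd_test,
                                              (n_sim, n_labs, tests_per_lab))
    return np.mean(np.all(tests >= required_lr, axis=(1, 2)))

# Type I (pass-error) rate: an ineffective product (true LR = 4) passes anyway.
print("pass-error rate:", pass_rate(true_lr=4.0))
# Type II (fail-error) rate: a highly effective product (true LR = 6.5) fails.
print("fail-error rate:", 1.0 - pass_rate(true_lr=6.5))
```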

  1. Evaluation of Oceanic Transport Statistics By Use of Transient Tracers and Bayesian Methods

    NASA Astrophysics Data System (ADS)

    Trossman, D. S.; Thompson, L.; Mecking, S.; Bryan, F.; Peacock, S.

    2013-12-01

    Key variables that quantify the time scales over which atmospheric signals penetrate into the oceanic interior and their uncertainties are computed using Bayesian methods and transient tracers from both models and observations. First, the mean residence times, subduction rates, and formation rates of Subtropical Mode Water (STMW) and Subpolar Mode Water (SPMW) in the North Atlantic and Subantarctic Mode Water (SAMW) in the Southern Ocean are estimated by combining a model and observations of chlorofluorocarbon-11 (CFC-11) via Bayesian Model Averaging (BMA), a statistical technique that weights model estimates according to how closely they agree with observations. Second, a Bayesian method is presented to find two oceanic transport parameters associated with the age distribution of ocean waters, the transit-time distribution (TTD), by combining an eddying global ocean model's estimate of the TTD with hydrographic observations of CFC-11, temperature, and salinity. Uncertainties associated with objectively mapping irregularly spaced bottle data are quantified by making use of a thin-plate spline and then propagated via the two Bayesian techniques. It is found that the subduction of STMW, SPMW, and SAMW is mostly an advective process, but up to about one-third of STMW subduction is likely due to non-advective processes. Also, while the formation of STMW is mostly due to subduction, the formation of SPMW is mostly due to other processes. About half of the formation of SAMW is due to subduction and half is due to other processes. A combination of air-sea flux, acting on relatively short time scales, and turbulent mixing, acting on a wide range of time scales, is likely the dominant SPMW erosion mechanism. Air-sea flux is likely responsible for most STMW erosion, and turbulent mixing is likely responsible for most SAMW erosion. Two oceanic transport parameters, the mean age of a water parcel and the half-variance associated with the TTD, estimated using the model's tracers as
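    A minimal sketch of the BMA weighting idea (not the study's implementation): each model's tracer field is scored against observations, and the resulting weights average the models' residence-time estimates. All numbers below are toy values.

```python
# Minimal Bayesian Model Averaging sketch: weight each model by its likelihood
# against observed tracer values, then form a weighted average of the models'
# residence-time estimates. Everything here is illustrative.
import numpy as np

obs = np.array([1.8, 2.1, 2.5, 2.9])                  # observed tracer values (toy)
model_fields = np.array([[1.7, 2.0, 2.6, 3.0],         # model A prediction (toy)
                         [2.2, 2.6, 3.0, 3.4],         # model B prediction (toy)
                         [1.9, 2.2, 2.4, 2.8]])        # model C prediction (toy)
sigma = 0.2                                            # assumed observational error

log_lik = -0.5 * np.sum(((model_fields - obs) / sigma) ** 2, axis=1)
weights = np.exp(log_lik - log_lik.max())
weights /= weights.sum()

residence_times = np.array([12.0, 18.0, 14.0])         # each model's estimate (yr, toy)
print("BMA weights:", weights.round(3))
print("BMA residence time:", np.dot(weights, residence_times).round(2), "yr")
```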

  2. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances

    PubMed Central

    Abut, Fatih; Akay, Mehmet Fatih

    2015-01-01

    Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance. PMID:26346869

  3. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances.

    PubMed

    Abut, Fatih; Akay, Mehmet Fatih

    2015-01-01

    Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.
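    To make the survey's two reporting metrics concrete, the sketch below fits a multiple linear regression and a support vector regressor to synthetic data and reports R and SEE (computed here simply as test-set RMSE); the predictor set and data are assumptions, not any study from the survey.

```python
# Sketch comparing two of the surveyed model families on synthetic VO2max data,
# reporting the survey's metrics: multiple correlation coefficient (R) and
# standard error of estimate (SEE, approximated by test-set RMSE).
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n = 400
age = rng.uniform(18, 60, n)
weight = rng.uniform(50, 100, n)
run_time = rng.uniform(8, 20, n)                        # minutes, assumed predictor
vo2max = 80 - 0.2 * age - 0.1 * weight - 1.5 * run_time + rng.normal(0, 3, n)

X = np.column_stack([age, weight, run_time])
X_tr, X_te, y_tr, y_te = train_test_split(X, vo2max, random_state=0)

def r_and_see(model):
    pred = model.fit(X_tr, y_tr).predict(X_te)
    R = np.corrcoef(pred, y_te)[0, 1]
    SEE = np.sqrt(np.mean((pred - y_te) ** 2))
    return R, SEE

for name, model in [("MLR", LinearRegression()),
                    ("SVR", make_pipeline(StandardScaler(), SVR(C=10.0)))]:
    R, SEE = r_and_see(model)
    print(f"{name}: R = {R:.3f}, SEE = {SEE:.2f} ml/kg/min")
```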

  4. Adequate mathematical modelling of environmental processes

    NASA Astrophysics Data System (ADS)

    Chashechkin, Yu. D.

    2012-04-01

    In environmental observations and laboratory visualization, both large-scale flow components, like currents, jets, vortices and waves, and a fine structure are registered (different examples are given). Conventional mathematical modelling, both analytical and numerical, is directed mostly at the description of energetically important flow components. The role of the fine structure still remains obscure. The variety of existing models makes it difficult to choose the most adequate one and to assess their mutual degree of correspondence. The goal of the talk is to give a careful analysis of the kinematics and dynamics of flows. The difference between the concept of "motion", as a transformation of a vector space into itself with distance conservation, and the concept of "flow", as displacement and rotation of deformable "fluid particles", is underlined. Basic physical quantities of the flow, namely density, momentum, energy (entropy) and admixture concentration, are selected as physical parameters defined by the fundamental set, which includes the differential D'Alembert, Navier-Stokes, Fourier and/or Fick equations and a closing equation of state. All of them are observable and independent. Calculations of continuous Lie groups show that only the fundamental set is characterized by the ten-parameter Galilean group reflecting the basic principles of mechanics. The presented analysis demonstrates that conventionally used approximations dramatically change the symmetries of the governing equation sets, which leads to their incompatibility or even degeneration. The fundamental set is analyzed taking into account the condition of compatibility. The high order of the set indicates a complex structure of the complete solutions corresponding to the physical structure of real flows. Analytical solutions of a number of problems, including flows induced by diffusion on topography and generation of periodic internal waves by compact sources in weakly dissipative media, as well as numerical solutions of the same

  5. Comparing Trend and Gap Statistics across Tests: Distributional Change Using Ordinal Methods and Bayesian Inference

    ERIC Educational Resources Information Center

    Denbleyker, John Nickolas

    2012-01-01

    The shortcomings of the proportion above cut (PAC) statistic used so prominently in the educational landscape render it a very problematic measure for making correct inferences with student test data. The limitations of PAC-based statistics are more pronounced with cross-test comparisons due to their dependency on cut-score locations. A better…

  6. An Inferential Confidence Interval Method of Establishing Statistical Equivalence that Corrects Tryon's (2001) Reduction Factor

    ERIC Educational Resources Information Center

    Tryon, Warren W.; Lewis, Charles

    2008-01-01

    Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis and failing to reject H[subscript 0] is not evidence…

  7. USING STATISTICAL METHODS FOR WATER QUALITY MANAGEMENT: ISSUES, PROBLEMS AND SOLUTIONS

    EPA Science Inventory

    This book is readable, comprehensible and, I anticipate, usable. The author has an enthusiasm which comes out in the text. Statistics is presented as a living, breathing subject, still being debated, defined, and refined. This statistics book actually has examples in the field...

  8. Developing Students' Thought Processes for Choosing Appropriate Statistical Methods

    ERIC Educational Resources Information Center

    Murray, James; Knowles, Elizabeth

    2014-01-01

    Students often struggle to select appropriate statistical tests when investigating research questions. The authors present a lesson study designed to make students' thought processes visible while considering this choice. The authors taught their students a way to organize knowledge about statistical tests and observed its impact in the…

  9. The T(ea) Test: Scripted Stories Increase Statistical Method Selection Skills

    ERIC Educational Resources Information Center

    Hackathorn, Jana; Ashdown, Brien

    2015-01-01

    To teach statistics, teachers must attempt to overcome pedagogical obstacles, such as dread, anxiety, and boredom. There are many options available to teachers that facilitate a pedagogically conducive environment in the classroom. The current study examined the effectiveness of incorporating scripted stories and humor into statistical method…

  10. Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

    ERIC Educational Resources Information Center

    Jones, Andrew T.

    2011-01-01

    Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…

  11. Comparison of precipitation nowcasting by extrapolation and statistical-advection methods

    NASA Astrophysics Data System (ADS)

    Sokol, Zbynek; Kitzmiller, David; Pesice, Petr; Mejsnar, Jan

    2013-04-01

    Two models for nowcasting of 1-h, 2-h and 3-h precipitation in the warm part of the year were evaluated. The first model was based on the extrapolation of observed radar reflectivity (COTREC-IPA) and the second one combined the extrapolation with the application of a statistical model (SAMR). The accuracy of the model forecasts was evaluated on independent data using the standard measures of root-mean-squared-error, absolute error, bias and correlation coefficient as well as by spatial verification methods Fractions Skill Score and SAL technique. The results show that SAMR yields slightly better forecasts during the afternoon period. On the other hand very small or no improvement is realized at night and in the very early morning. COTREC-IPA and SAMR forecast a very similar horizontal structure of precipitation patterns but the model forecasts differ in values. SAMR, similarly as COTREC-IPA, is not able to develop new storms or significantly intensify already existing storms. This is caused by a large uncertainty regarding future development. On the other hand, the SAMR model can reliably predict decreases in precipitation intensity.
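    The verification measures named above can be illustrated on toy precipitation grids as follows (RMSE, bias, correlation, and a neighbourhood Fractions Skill Score at one threshold and window size); the fields, threshold, and window are assumptions, not the study's data.

```python
# Sketch of standard nowcast verification measures on toy precipitation grids:
# RMSE, bias, correlation, and a Fractions Skill Score (FSS) at one threshold.
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(5)
obs = rng.gamma(2.0, 1.0, (100, 100))                  # toy observed rain field (mm/h)
fcst = obs + rng.normal(0.0, 0.8, obs.shape)           # toy forecast with random error

rmse = np.sqrt(np.mean((fcst - obs) ** 2))
bias = np.mean(fcst - obs)
corr = np.corrcoef(fcst.ravel(), obs.ravel())[0, 1]

def fss(fcst, obs, threshold=2.0, window=9):
    """Fractions Skill Score: compare neighbourhood exceedance fractions."""
    f_frac = uniform_filter((fcst >= threshold).astype(float), size=window)
    o_frac = uniform_filter((obs >= threshold).astype(float), size=window)
    mse = np.mean((f_frac - o_frac) ** 2)
    mse_ref = np.mean(f_frac ** 2) + np.mean(o_frac ** 2)
    return 1.0 - mse / mse_ref

print(f"RMSE={rmse:.2f}  bias={bias:.3f}  corr={corr:.3f}  FSS={fss(fcst, obs):.3f}")
```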

  12. Integrating Symbolic and Statistical Methods for Testing Intelligent Systems Applications to Machine Learning and Computer Vision

    SciTech Connect

    Jha, Sumit Kumar; Pullum, Laura L; Ramanathan, Arvind

    2016-01-01

    Embedded intelligent systems ranging from tiny implantable biomedical devices to large swarms of autonomous unmanned aerial systems are becoming pervasive in our daily lives. While we depend on the flawless functioning of such intelligent systems, and often take their behavioral correctness and safety for granted, it is notoriously difficult to generate test cases that expose subtle errors in the implementations of machine learning algorithms. Hence, the validation of intelligent systems is usually achieved by studying their behavior on representative data sets, using methods such as cross-validation and bootstrapping. In this paper, we present a new testing methodology for studying the correctness of intelligent systems. Our approach uses symbolic decision procedures coupled with statistical hypothesis testing to generate test cases that expose errors in implementations of machine learning algorithms. We also use our algorithm to analyze the robustness of a human detection algorithm built using the OpenCV open-source computer vision library. We show that the human detection implementation can fail to detect humans in perturbed video frames even when the perturbations are so small that the corresponding frames look identical to the naked eye.

  13. Statistical methods for efficient design of community surveys of response to noise: Random coefficients regression models

    NASA Technical Reports Server (NTRS)

    Tomberlin, T. J.

    1985-01-01

    Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
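    A toy simulation of the random-coefficients setting (not the report's analytical formulas): the annoyance-versus-noise slope varies across study areas, and the precision of the pooled slope is compared for two allocations of the same total number of interviews.

```python
# Toy random-coefficients simulation: area-specific annoyance-vs-noise slopes are
# drawn from a distribution, residents are sampled within areas, and the empirical
# SE of the mean within-area OLS slope is compared across allocations.
import numpy as np

rng = np.random.default_rng(11)

def slope_se(n_areas, per_area, n_rep=2000, slope_mean=0.08,
             slope_sd=0.02, resid_sd=1.0):
    """Empirical SE of the mean within-area OLS slope across simulated surveys."""
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        slopes = np.empty(n_areas)
        for a in range(n_areas):
            noise = rng.uniform(50.0, 75.0, per_area)          # dB exposure levels
            beta = rng.normal(slope_mean, slope_sd)            # area-specific slope
            annoyance = beta * noise + rng.normal(0.0, resid_sd, per_area)
            x = noise - noise.mean()
            slopes[a] = np.sum(x * annoyance) / np.sum(x * x)  # within-area OLS slope
        estimates[r] = slopes.mean()
    return estimates.std(ddof=1)

print("SE, 10 areas x 60 residents:", round(slope_se(10, 60), 4))
print("SE, 30 areas x 20 residents:", round(slope_se(30, 20), 4))
```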

  14. Statistical methods for biodosimetry in the presence of both Berkson and classical measurement error

    NASA Astrophysics Data System (ADS)

    Miller, Austin

    In radiation epidemiology, the true dose received by those exposed cannot be assessed directly. Physical dosimetry uses a deterministic function of the source term, distance and shielding to estimate dose. For the atomic bomb survivors, the physical dosimetry system is well established. The classical measurement errors plaguing the location and shielding inputs to the physical dosimetry system are well known. Adjusting for the associated biases requires an estimate for the classical measurement error variance, for which no data-driven estimate exists. In this case, an instrumental variable solution is the most viable option to overcome the classical measurement error indeterminacy. Biological indicators of dose may serve as instrumental variables. Specification of the biodosimeter dose-response model requires identification of the radiosensitivity variables, for which we develop statistical definitions and variables. More recently, researchers have recognized Berkson error in the dose estimates, introduced by averaging assumptions for many components in the physical dosimetry system. We show that Berkson error induces a bias in the instrumental variable estimate of the dose-response coefficient, and then address the estimation problem. This model is specified by developing an instrumental variable mixed measurement error likelihood function, which is then maximized using a Monte Carlo EM Algorithm. These methods produce dose estimates that incorporate information from both physical and biological indicators of dose, as well as the first instrumental variable based data-driven estimate for the classical measurement error variance.
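    A toy sketch of why a biological indicator can serve as an instrumental variable under classical measurement error (a simple linear model, not the paper's mixed Berkson/classical likelihood or its Monte Carlo EM): the naive slope is attenuated, while the instrumental-variable ratio estimator recovers the true dose-response slope.

```python
# Minimal instrumental-variable sketch under classical measurement error: the
# physical dose x is error-prone, the biodosimeter reading z is the instrument.
import numpy as np

rng = np.random.default_rng(6)
n = 50000
true_dose = rng.gamma(2.0, 1.0, n)
x = true_dose + rng.normal(0.0, 0.8, n)        # physical dose with classical error
z = 2.0 * true_dose + rng.normal(0.0, 0.5, n)  # biodosimeter reading (instrument)
y = 1.0 + 0.5 * true_dose + rng.normal(0.0, 0.3, n)   # response, true slope = 0.5

naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(f"naive slope (attenuated): {naive:.3f}   IV slope: {iv:.3f}")
```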

  15. Spiked proteomic standard dataset for testing label-free quantitative software and statistical methods.

    PubMed

    Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Dorssaeler, Alain Van; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

    2016-03-01

    This data article describes a controlled, spiked proteomic dataset for which the "ground truth" of variant proteins is known. It is based on the LC-MS analysis of samples composed of a fixed background of yeast lysate and different spiked amounts of the UPS1 mixture of 48 recombinant proteins. It can be used to objectively evaluate bioinformatic pipelines for label-free quantitative analysis, and their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. More specifically, it can be useful for tuning software tool parameters, but also for testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. The raw MS files can be downloaded from ProteomeXchange with identifier PXD001819. Starting from some raw files of this dataset, we also provide here some processed data obtained through various bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, to exemplify the use of such data in the context of software benchmarking, as discussed in detail in the accompanying manuscript [1]. The experimental design used here for data processing takes advantage of the different spike levels introduced in the samples composing the dataset, and processed data are merged in a single file to facilitate the evaluation and illustration of software tools results for the detection of variant proteins with different absolute expression levels and fold change values.

  16. An Efficient Augmented Lagrangian Method for Statistical X-Ray CT Image Reconstruction

    PubMed Central

    Li, Jiaojiao; Niu, Shanzhou; Huang, Jing; Bian, Zhaoying; Feng, Qianjin; Yu, Gaohang; Liang, Zhengrong; Chen, Wufan; Ma, Jianhua

    2015-01-01

    Statistical iterative reconstruction (SIR) for X-ray computed tomography (CT) under the penalized weighted least-squares criteria can yield significant gains over conventional analytical reconstruction from the noisy measurement. However, due to the nonlinear expression of the objective function, most existing algorithms related to the SIR unavoidably suffer from heavy computation load and slow convergence rate, especially when an edge-preserving or sparsity-based penalty or regularization is incorporated. In this work, to address the above-mentioned issues of the general algorithms related to the SIR, we propose an adaptive nonmonotone alternating direction algorithm in the framework of the augmented Lagrangian multiplier method, which is termed “ALM-ANAD”. The algorithm effectively combines an alternating direction technique with an adaptive nonmonotone line search to minimize the augmented Lagrangian function at each iteration. To evaluate the present ALM-ANAD algorithm, both qualitative and quantitative studies were conducted by using digital and physical phantoms. Experimental results show that the present ALM-ANAD algorithm can achieve noticeable gains over the classical nonlinear conjugate gradient algorithm and the state-of-the-art split Bregman algorithm in terms of noise reduction, contrast-to-noise ratio, convergence rate, and universal quality index metrics. PMID:26495975
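    For orientation, the sketch below writes down the penalized weighted least-squares objective that SIR minimizes and solves a tiny 1-D version in closed form with a quadratic roughness penalty; it is not the paper's ALM-ANAD algorithm or an edge-preserving penalty.

```python
# Tiny 1-D sketch of the penalized weighted least-squares (PWLS) objective:
# minimize (y - A x)^T W (y - A x) + beta * ||D x||^2, solved in closed form.
import numpy as np

rng = np.random.default_rng(7)
n_pix, n_meas = 40, 60
A = rng.uniform(0.0, 1.0, (n_meas, n_pix))             # toy system matrix
x_true = np.zeros(n_pix); x_true[15:25] = 1.0           # simple object
var = 0.05 + 0.02 * (A @ x_true)                        # noise variance per measurement
y = A @ x_true + rng.normal(0.0, np.sqrt(var))

W = np.diag(1.0 / var)                                  # statistical weights
D = np.eye(n_pix) - np.eye(n_pix, k=1)                  # first-difference roughness
beta = 5.0

x_hat = np.linalg.solve(A.T @ W @ A + beta * D.T @ D, A.T @ W @ y)
print("reconstruction RMSE:", np.sqrt(np.mean((x_hat - x_true) ** 2)).round(3))
```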

  17. "Geo-statistics methods and neural networks in geophysical applications: A case study"

    NASA Astrophysics Data System (ADS)

    Rodriguez Sandoval, R.; Urrutia Fucugauchi, J.; Ramirez Cruz, L. C.

    2008-12-01

    The study focuses on the Ebano-Panuco basin of northeastern Mexico, which is being explored for hydrocarbon reservoirs. These reservoirs are in limestones, and there is interest in determining porosity and permeability in the carbonate sequences. The porosity maps presented in this study are estimated by applying multiattribute and neural network techniques, which combine geophysical logs and 3-D seismic data by means of statistical relationships. Multiattribute analysis is a process to predict a volume of any subsurface petrophysical measurement from well-log and seismic data. The data consist of a series of target logs from wells which tie a 3-D seismic volume. The target logs are neutron porosity logs. From the 3-D seismic volume a series of sample attributes is calculated. The objective of this step is to derive a relationship between the set of attributes and the target log values. The selected set is determined by a process of forward stepwise regression. The analysis can be linear or nonlinear. In the linear mode the method consists of a series of weights derived by least-squares minimization. In the nonlinear mode, a neural network is trained using the selected attributes as inputs; in this case we used a probabilistic neural network (PNN). The method is applied to a real data set from PEMEX. For better reservoir characterization the porosity distribution was estimated using both techniques. The case showed a continuous improvement in the prediction of porosity from the multiattribute to the neural network analysis. The improvement is in both the training and the validation, which are important indicators of the reliability of the results. The neural network showed an improvement in resolution over the multiattribute analysis. The final maps provide more realistic results of the porosity distribution.

  18. Meta-analysis as Statistical and Analytical Method of Journal’s Content Scientific Evaluation

    PubMed Central

    Masic, Izet; Begic, Edin

    2015-01-01

    Introduction: A meta-analysis is a statistical and analytical method which combines and synthesizes different independent studies and integrates their results into one common result. Goal: Analysis of the journals "Medical Archives", "Materia Socio Medica" and "Acta Informatica Medica", which are indexed in the most eminent databases of the biomedical field. Material and methods: The study has a retrospective and descriptive character, and covered the calendar year 2014. The study included six issues of each of the three journals (a total of 18 issues). Results: In this period a total of 291 articles was published (110 in "Medical Archives", 97 in "Materia Socio Medica", and 84 in "Acta Informatica Medica"). Most of the articles were original articles; smaller numbers were published as professional papers, review articles and case reports. Clinical topics were most common in the first two journals, while articles in "Acta Informatica Medica" mostly belonged to the field of medical informatics, as part of the pre-clinical medical disciplines. Articles usually required a period of fifty to fifty-nine days for review. Articles were received from four continents, mostly from Europe. The authors were most often from Bosnia and Herzegovina, followed by Iran, Kosovo and Macedonia. Conclusion: The number of articles published each year is increasing, with greater participation of authors from different continents and from abroad. Clinical medical disciplines are the most common, with a broader spectrum of topics and a growing number of original articles. Greater support from the wider scientific community is needed for further development of all three of the aforementioned journals. PMID:25870484

  19. Statistical methods for conducting agreement (comparison of clinical tests) and precision (repeatability or reproducibility) studies in optometry and ophthalmology.

    PubMed

    McAlinden, Colm; Khadka, Jyoti; Pesudovs, Konrad

    2011-07-01

    The ever-expanding choice of ocular metrology and imaging equipment has driven research into the validity of their measurements. Consequently, studies of the agreement between two instruments or clinical tests have proliferated in the ophthalmic literature. It is important that researchers apply the appropriate statistical tests in agreement studies. Correlation coefficients are hazardous and should be avoided. The 'limits of agreement' method originally proposed by Altman and Bland in 1983 is the statistical procedure of choice. Its step-by-step use and practical considerations in relation to optometry and ophthalmology are detailed in addition to sample size considerations and statistical approaches to precision (repeatability or reproducibility) estimates.
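    A minimal sketch of the limits-of-agreement calculation on toy paired readings from two instruments (the data and units are assumptions): bias and bias ± 1.96 SD of the between-method differences.

```python
# Limits-of-agreement sketch on toy paired measurements from two instruments.
import numpy as np

rng = np.random.default_rng(8)
true_val = rng.normal(44.0, 1.5, 100)                      # e.g., corneal power (D)
method_a = true_val + rng.normal(0.0, 0.25, 100)
method_b = true_val + 0.10 + rng.normal(0.0, 0.25, 100)    # small systematic offset

diff = method_a - method_b
bias = diff.mean()
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)
print(f"bias = {bias:.3f} D, 95% limits of agreement = ({loa[0]:.3f}, {loa[1]:.3f}) D")
```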

  20. Examination of two methods for statistical analysis of data with magnitude and direction emphasizing vestibular research applications

    NASA Technical Reports Server (NTRS)

    Calkins, D. S.

    1998-01-01

    When the dependent (or response) variable in an experiment has direction and magnitude, one approach that has been used for statistical analysis involves splitting magnitude and direction and applying univariate statistical techniques to the components. However, such treatment of quantities with direction and magnitude is not justifiable mathematically and can lead to incorrect conclusions about relationships among variables and, as a result, to flawed interpretations. This note discusses a problem with that practice and recommends mathematically correct procedures to be used with dependent variables that have direction and magnitude for 1) computation of mean values, 2) statistical contrasts of and confidence intervals for means, and 3) correlation methods.
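    A short sketch of the recommended treatment: resolve each response into components, average the components, and convert back, rather than averaging magnitudes and angles separately. The toy data are assumptions.

```python
# Contrast of the two treatments for data with magnitude and direction.
import numpy as np

magnitude = np.array([10.0, 12.0, 11.0, 9.0])            # e.g., deg/s
direction_deg = np.array([350.0, 10.0, 5.0, 355.0])      # clustered around 0 deg

# Incorrect: separate univariate means (the angle mean lands near 180 deg here).
print("separate means:", magnitude.mean(), direction_deg.mean())

# Correct: vector (component-wise) mean.
theta = np.deg2rad(direction_deg)
mean_x = np.mean(magnitude * np.cos(theta))
mean_y = np.mean(magnitude * np.sin(theta))
mean_mag = np.hypot(mean_x, mean_y)
mean_dir = np.degrees(np.arctan2(mean_y, mean_x)) % 360.0
print(f"vector mean: magnitude = {mean_mag:.2f}, direction = {mean_dir:.1f} deg")
```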

  1. Statistical prediction of dynamic distortion of inlet flow using minimum dynamic measurement. An application to the Melick statistical method and inlet flow dynamic distortion prediction without RMS measurements

    NASA Technical Reports Server (NTRS)

    Schweikhard, W. G.; Chen, Y. S.

    1986-01-01

    The Melick method of inlet flow dynamic distortion prediction by statistical means is outlined. A hypothetical vortex model is used as the basis for the mathematical formulations. The main variables are identified by matching the theoretical total pressure rms ratio with the measured total pressure rms ratio. Data comparisons, using the HiMAT inlet test data set, indicate satisfactory prediction of the dynamic peak distortion for cases with boundary layer control device vortex generators. A method for dynamic probe selection was developed. Validity of the probe selection criteria is demonstrated by comparing the reduced-probe predictions with the 40-probe predictions. It is indicated that the number of dynamic probes can be reduced to as few as two and still retain good accuracy.

  2. A New Method for Assessing the Statistical Significance in the Differential Functioning of Items and Tests (DFIT) Framework

    ERIC Educational Resources Information Center

    Oshima, T. C.; Raju, Nambury S.; Nanda, Alice O.

    2006-01-01

    A new item parameter replication method is proposed for assessing the statistical significance of the noncompensatory differential item functioning (NCDIF) index associated with the differential functioning of items and tests framework. In this new method, a cutoff score for each item is determined by obtaining a (1-alpha ) percentile rank score…

  3. Multivariate statistical data analysis methods for detecting baroclinic wave interactions in the thermally driven rotating annulus

    NASA Astrophysics Data System (ADS)

    von Larcher, Thomas; Harlander, Uwe; Alexandrov, Kiril; Wang, Yongtai

    2010-05-01

    Experiments on baroclinic wave instabilities in a rotating cylindrical gap have long been performed, e.g., to identify regular waves of different zonal wave number, to better understand the transition to the quasi-chaotic regime, and to reveal the underlying dynamical processes of complex wave flows. We present the application of appropriate multivariate data analysis methods to time series data sets acquired by the use of non-intrusive measurement techniques of quite different natures. While highly accurate Laser-Doppler velocimetry (LDV) is used for measurements of the radial velocity component at equidistant azimuthal positions, a highly sensitive thermographic camera measures the surface temperature field. The measurements are performed at particular parameter points where our former studies show that complex wave patterns occur [1, 2]. Obviously, the temperature data set has much more information content than the velocity data set due to the particular measurement techniques. Both sets of time series data are analyzed using multivariate statistical techniques. While the LDV data sets are studied by applying Multi-Channel Singular Spectrum Analysis (M-SSA), the temperature data sets are analyzed by applying Empirical Orthogonal Functions (EOF). Our goal is (a) to verify the results yielded with the analysis of the velocity data and (b) to compare the data analysis methods. Therefore, the temperature data are processed in a way that makes them comparable to the LDV data, i.e. reducing the size of the data set in such a manner that the temperature measurements would effectively be performed at equidistant azimuthal positions only. This approach initially results in a great loss of information, but applying the M-SSA to the reduced temperature data sets enables us to compare the methods. [1] Th. von Larcher and C. Egbers, Experiments on transitions of baroclinic waves in a differentially heated rotating annulus, Nonlinear Processes in Geophysics
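    A minimal EOF sketch (not the experimental data or the M-SSA code): subtract the time mean at each spatial point and take the SVD of the anomaly matrix, so that the right singular vectors are the spatial EOFs and the scaled left singular vectors are the principal-component time series.

```python
# EOF analysis via SVD of the anomaly data matrix (time x space), on a toy field.
import numpy as np

rng = np.random.default_rng(9)
n_time, n_space = 500, 64
t = np.arange(n_time)
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, n_space))   # one wave-like spatial mode
data = np.outer(np.cos(0.1 * t), pattern) + 0.2 * rng.normal(size=(n_time, n_space))

anomalies = data - data.mean(axis=0)
U, S, Vt = np.linalg.svd(anomalies, full_matrices=False)
explained = S**2 / np.sum(S**2)

print("variance explained by first three EOFs:", explained[:3].round(3))
pc1 = U[:, 0] * S[0]          # leading principal-component time series
eof1 = Vt[0]                  # leading spatial pattern
```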

  4. Statistical analysis using the Bayesian nonparametric method for irradiation embrittlement of reactor pressure vessels

    NASA Astrophysics Data System (ADS)

    Takamizawa, Hisashi; Itoh, Hiroto; Nishiyama, Yutaka

    2016-10-01

    In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster, to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperature. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased. Comparing cluster IDs 2 and 6, embrittlement of high-Cu-bearing materials (>0.07 wt%) was larger than that of low-Cu-bearing materials (<0.07 wt%). This is attributed to irradiation-induced Cu-enriched clusters, as well as those that are irradiation-enhanced [4]. A similar feature is recognized for cluster IDs 5 and 8 in materials with a higher Ni content. A flux effect with a higher flux range was demonstrated for cluster ID 3, comprising MTR irradiation in a high flux region (up to 1 × 10^13 n/cm^2/s) [44]. For cluster ID 10, classification is rendered based upon the flux effect, where embrittlement is accelerated in high-Cu-bearing materials irradiated at lower flux levels (less than 5 × 10^9 n/cm^2·s). This is possibly due to increased thermal equilibrium vacancies [44,45]. Based on all the above considerations, it was ascertained that data belonging to identical cluster ID
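    The clustering idea can be sketched with a truncated Dirichlet-process Gaussian mixture from scikit-learn applied to toy composition and flux features; this stands in for, but is not, the study's BNP model or database.

```python
# Bayesian nonparametric clustering sketch: a truncated Dirichlet-process Gaussian
# mixture groups toy [Cu, Ni, log10 flux] records without fixing the cluster count.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(10)
groups = [
    rng.normal([0.03, 0.6, 12.0], [0.01, 0.1, 0.3], (80, 3)),
    rng.normal([0.12, 0.6, 12.0], [0.02, 0.1, 0.3], (80, 3)),
    rng.normal([0.12, 1.3, 13.2], [0.02, 0.1, 0.2], (80, 3)),
]
X = np.vstack(groups)

bgm = BayesianGaussianMixture(
    n_components=10,                                    # truncation level
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
)
labels = bgm.fit_predict(X)
print("clusters actually used:", np.unique(labels).size)
```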

  5. Low dose dynamic CT myocardial perfusion imaging using a statistical iterative reconstruction method

    SciTech Connect

    Tao, Yinghua; Chen, Guang-Hong; Hacker, Timothy A.; Raval, Amish N.; Van Lysel, Michael S.; Speidel, Michael A.

    2014-07-15

    Purpose: Dynamic CT myocardial perfusion imaging has the potential to provide both functional and anatomical information regarding coronary artery stenosis. However, radiation dose can be potentially high due to repeated scanning of the same region. The purpose of this study is to investigate the use of statistical iterative reconstruction to improve parametric maps of myocardial perfusion derived from a low tube current dynamic CT acquisition. Methods: Four pigs underwent high (500 mA) and low (25 mA) dose dynamic CT myocardial perfusion scans with and without coronary occlusion. To delineate the affected myocardial territory, an N-13 ammonia PET perfusion scan was performed for each animal in each occlusion state. Filtered backprojection (FBP) reconstruction was first applied to all CT data sets. Then, a statistical iterative reconstruction (SIR) method was applied to data sets acquired at low dose. Image voxel noise was matched between the low dose SIR and high dose FBP reconstructions. CT perfusion maps were compared among the low dose FBP, low dose SIR and high dose FBP reconstructions. Numerical simulations of a dynamic CT scan at high and low dose (20:1 ratio) were performed to quantitatively evaluate SIR and FBP performance in terms of flow map accuracy, precision, dose efficiency, and spatial resolution. Results: For in vivo studies, the 500 mA FBP maps gave −88.4%, −96.0%, −76.7%, and −65.8% flow change in the occluded anterior region compared to the open-coronary scans (four animals). The percent changes in the 25 mA SIR maps were in good agreement, measuring −94.7%, −81.6%, −84.0%, and −72.2%. The 25 mA FBP maps gave unreliable flow measurements due to streaks caused by photon starvation (percent changes of +137.4%, +71.0%, −11.8%, and −3.5%). Agreement between 25 mA SIR and 500 mA FBP global flow was −9.7%, 8.8%, −3.1%, and 26.4%. The average variability of flow measurements in a nonoccluded region was 16.3%, 24.1%, and 937

  6. Geochemical correlation study of Oklahoma crude oils using a multivariate statistical method

    SciTech Connect

    Imbus, S.W.; Engel, M.H.; Zumberge, J.E.

    1987-05-01

    Despite significant production in the southern Mid-Continent, organic geochemical characterization of oils and potential source rocks has been limited and, with respect to correlation studies, somewhat inconclusive. In the present study, 46 Oklahoma oils of varying reservoir ages (Cambro-Ordovician to Pennsylvanian-Morrowan) from an extensive geographic area were analyzed for saturate and aromatic hydrocarbon distributions, percent S, and stable carbon isotopic compositions. Eighteen of the oils were analyzed by GC/MS for biological marker compound distributions. Similarities and differences between oils were assessed using multivariate statistical methods. In general, the majority of oils were mature, of marine origin, and very similar with respect to chemical and stable carbon isotopic compositions. Some differences were, however, observed. In particular, oils from the Oklahoma Panhandle (Pennsylvanian-Morrowan) were enriched in 13C and had high Pr/Ph values. Oils from the Marietta basin and the southeast portion of the Anadarko basin (Ordovician) had high n-C19/n-C18 values. With few exceptions, the remaining oils from throughout the state appeared to be identical. Simultaneous R-mode, Q-mode factor analyses confirm these distinctions, and it is tentatively proposed that the Oklahoma oils that have been analyzed to date represent three distinct families. The geographic distributions of these families may be useful for establishing their respective sources and migration pathways. While the Woodford formation is commonly recognized as the principal source for Oklahoma oils, with organic facies changes accounting for some of the slight differences in oil compositions, other possible local sources as well as the possibility of multiple sources, i.e., mixing, are currently being investigated.

  7. Epoch of reionization window. II. Statistical methods for foreground wedge reduction

    NASA Astrophysics Data System (ADS)

    Liu, Adrian; Parsons, Aaron R.; Trott, Cathryn M.

    2014-07-01

    For there to be a successful measurement of the 21 cm epoch of reionization (EoR) power spectrum, it is crucial that strong foreground contaminants be robustly suppressed. These foregrounds come from a variety of sources (such as Galactic synchrotron emission and extragalactic point sources), but almost all share the property of being spectrally smooth and, when viewed through the chromatic response of an interferometer, occupy a signature "wedge" region in cylindrical (k⊥, k∥) Fourier space. The complement of the foreground wedge is termed the "EoR window" and is expected to be mostly foreground-free, allowing clean measurements of the power spectrum. This paper is a sequel to a previous paper that established a rigorous mathematical framework for describing the foreground wedge and the EoR window. Here, we use our framework to explore statistical methods by which the EoR window can be enlarged, thereby increasing the sensitivity of a power spectrum measurement. We adapt the Feldman-Kaiser-Peacock approximation (commonly used in galaxy surveys) for 21 cm cosmology and also compare the optimal quadratic estimator to simpler estimators that ignore covariances between different Fourier modes. The optimal quadratic estimator is found to suppress foregrounds by an extra factor of ~10^5 in power at the peripheries of the EoR window, boosting the detection of the cosmological signal from 12σ to 50σ at the midpoint of reionization in our fiducial models. If numerical issues can be finessed, decorrelation techniques allow the EoR window to be further enlarged, enabling measurements to be made deep within the foreground wedge. These techniques do not assume that foreground is Gaussian distributed, and we additionally prove that a final round of foreground subtraction can be performed after decorrelation in a way that is guaranteed to have no cosmological signal loss.

  8. The evaluation of the statistical monomineral thermobarometric methods for the reconstruction of the lithospheric mantle structure

    NASA Astrophysics Data System (ADS)

    Ashchepkov, I.; Vishnyakova, E.

    2009-04-01

    The modified versions of the thermobarometers for mantle assemblages were revised using statistical calibrations against the results of Opx thermobarometry. The modifications suggest calculating the Fe# of coexisting olivine, Fe#Ol, from statistical approximations by regressions obtained from a kimberlite xenolith database including >700 associations. They reproduce the Opx-based TP estimates and provide a complete set of TP values for mantle xenoliths and xenocrysts. GARNET: Three variants of the barometer give similar results. The first is published (Ashchepkov, 2006). The second calculates the Al2O3 in orthopyroxene from garnet according to the procedure: xCrOpx=Cr2O3/CaO)/FeO/MgO/500; xAlOpx=1/(3875*(exp(Cr2O3^0.2/CaO)-0.3)*CaO/989+16)-XcrOpx; Al2O3=xAlOp*24.64/Cr2O3^0.2*CaO/2.+FeO*(ToK-501)/1002, and then uses the Al2O3-in-Opx barometer (McGregor, 1974). The third variant is a transformation of the G. Grutter (2006) method, introducing the influence of temperature: P=40+(Cr2O3)-4.5)*10/3-20/7*CaO+(ToC)*0.0000751*MgO)*CaO+2.45*Cr2O3*(7-xv(5,8)) -Fe*0.5, with the correction for P>55: P=55+(P-55)*55/(1+0.9*P). The average of these three methods gives appropriate values comparable with those determined with the (McGregor, 1974) barometer. Temperatures are estimated according to a transformed Krogh thermometer: Fe#Ol_Gar=Fe#Gar/2+(T(K)-1420)*0.000112+0.01; for deep-seated associations (P>55 kbar): T=T-(0.25/(0.4-0.004*(20-P))-0.38/Ca)*275+51*Ca*Cr2-378*CaO-0.51)-Cr/Ca2*5+Mg/(Fe+0.0001)*17.4. ILMENITE: P= ((TiO2-23.)*2.15-(T0-973)/20*MgO*Cr2O3, and next P=(60-P)/6.1+P; ToK is determined according to (Taylor et al., 1998): Fe#Ol_Chr =(Fe/(Fe+Mg)ilm -0.35)/2.252-0.0000351*(T(K)-973). CHROMITE: The equations for PT estimates with chromite compositions are P=Cr/(Cr+Al)*T(K)/14.+Ti*0.10, with the next iteration P=-0.0053*P^2+1.1292*P+5.8059 +0.00135*T(K)*Ti*410-8.2; for P>57: P=P+(P-57)*2.75. Temperature estimates are according to the O

  9. Adequate histologic sectioning of prostate needle biopsies.

    PubMed

    Bostwick, David G; Kahane, Hillel

    2013-08-01

    No standard method exists for sampling prostate needle biopsies, although most reports claim to embed 3 cores per block and obtain 3 slices from each block. This study was undertaken to determine the extent of histologic sectioning necessary for optimal examination of prostate biopsies. We prospectively compared the impact on cancer yield of submitting 1 biopsy core per cassette (biopsies from January 2010) with 3 cores per cassette (biopsies from August 2010) from a large national reference laboratory. Between 6 and 12 slices were obtained with the former 1-core method, resulting in 3 to 6 slices being placed on each of 2 slides; for the latter 3-core method, a limit of 6 slices was obtained, resulting in 3 slices being placed on each of 2 slides. A total of 6708 sets of 12 to 18 core biopsies were studied, including 3509 biopsy sets from the 1-biopsy-core-per-cassette group (January 2010) and 3199 biopsy sets from the 3-biopsy-cores-per-cassette group (August 2010). The yield of diagnoses was classified as benign, atypical small acinar proliferation, high-grade prostatic intraepithelial neoplasia, and cancer and was similar with the 2 methods: 46.2%, 8.2%, 4.5%, and 41.1% and 46.7%, 6.3%, 4.4%, and 42.6%, respectively (P = .02). Submission of 1 core or 3 cores per cassette had no effect on the yield of atypical small acinar proliferation, prostatic intraepithelial neoplasia, or cancer in prostate needle biopsies. Consequently, we recommend submission of 3 cores per cassette to minimize labor and cost of processing. PMID:23764163

  10. Methods for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma

    USGS Publications Warehouse

    Esralew, Rachel A.; Smith, S. Jerrod

    2010-01-01

    Flow statistics can be used to provide decision makers with surface-water information needed for activities such as water-supply permitting, flow regulation, and other water rights issues. Flow statistics could be needed at any location along a stream. Most often, streamflow statistics are needed at ungaged sites, where no flow data are available to compute the statistics. Methods are presented in this report for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma. Flow statistics included the (1) annual (period of record), (2) seasonal (summer-autumn and winter-spring), and (3) 12 monthly duration statistics, including the 20th, 50th, 80th, 90th, and 95th percentile flow exceedances, and the annual mean-flow (mean of daily flows for the period of record). Flow statistics were calculated from daily streamflow information collected from 235 streamflow-gaging stations throughout Oklahoma and areas in adjacent states. A drainage-area ratio method is the preferred method for estimating flow statistics at an ungaged location that is on a stream near a gage. The method generally is reliable only if the drainage-area ratio of the two sites is between 0.5 and 1.5. Regression equations that relate flow statistics to drainage-basin characteristics were developed for the purpose of estimating selected flow-duration and annual mean-flow statistics for ungaged streams that are not near gaging stations on the same stream. Regression equations were developed from flow statistics and drainage-basin characteristics for 113 unregulated gaging stations. Separate regression equations were developed by using U.S. Geological Survey streamflow-gaging stations in regions with similar drainage-basin characteristics. These equations can increase the accuracy of regression equations used for estimating flow-duration and annual mean-flow statistics at ungaged stream locations in Oklahoma. Streamflow-gaging stations were grouped by selected drainage
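    A small sketch of the drainage-area ratio transfer described above, including the 0.5-1.5 applicability check; a unit exponent is assumed here, and the report's regional regression equations are not reproduced.

```python
# Drainage-area ratio transfer of a flow statistic from a gaged to an ungaged site
# on the same stream, with the 0.5-1.5 ratio applicability check.
def transfer_flow_statistic(q_gaged, area_gaged_mi2, area_ungaged_mi2):
    """Estimate a flow statistic at an ungaged site near a gage on the same stream."""
    ratio = area_ungaged_mi2 / area_gaged_mi2
    if not 0.5 <= ratio <= 1.5:
        raise ValueError(
            f"drainage-area ratio {ratio:.2f} outside 0.5-1.5; "
            "use the regional regression equations instead"
        )
    return q_gaged * ratio

# Example: a 50th-percentile daily flow of 120 ft^3/s at a gage draining 410 mi^2,
# transferred to an ungaged site draining 330 mi^2 on the same stream.
print(round(transfer_flow_statistic(120.0, 410.0, 330.0), 1), "ft^3/s")
```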

  11. Adipose Tissue - Adequate, Accessible Regenerative Material

    PubMed Central

    Kolaparthy, Lakshmi Kanth.; Sanivarapu, Sahitya; Moogla, Srinivas; Kutcham, Rupa Sruthi

    2015-01-01

    The potential use of stem cell based therapies for the repair and regeneration of various tissues offers a paradigm shift that may provide alternative therapeutic solutions for a number of diseases. The use of either embryonic stem cells (ESCs) or induced pluripotent stem cells in clinical situations is limited due to cell regulations and to technical and ethical considerations involved in genetic manipulation of human ESCs, even though these cells are highly beneficial. Mesenchymal stem cells appear to be an ideal population of stem cells; in particular, adipose-derived stem cells (ASCs) can be obtained in large numbers and are easily harvested from adipose tissue. Adipose tissue is ubiquitously available and has several advantages compared to other sources: it is easily accessible in large quantities with a minimally invasive harvesting procedure, and isolation of adipose-derived mesenchymal stem cells yields a high number of stem cells, which is essential for stem cell based therapies and tissue engineering. Recently, periodontal tissue regeneration using ASCs has been examined in some animal models. This method has potential in the regeneration of functional periodontal tissues because the various growth factors secreted by ASCs might not only promote the regeneration of periodontal tissues but also encourage neovascularization of the damaged tissues. This review summarizes the sources, isolation, and characteristics of adipose-derived stem cells and discusses their potential role in periodontal regeneration. PMID:26634060

  12. Adipose Tissue - Adequate, Accessible Regenerative Material.

    PubMed

    Kolaparthy, Lakshmi Kanth; Sanivarapu, Sahitya; Moogla, Srinivas; Kutcham, Rupa Sruthi

    2015-11-01

    The potential use of stem cell based therapies for the repair and regeneration of various tissues offers a paradigm shift that may provide alternative therapeutic solutions for a number of diseases. The use of either embryonic stem cells (ESCs) or induced pluripotent stem cells in clinical situations is limited due to cell regulations and to technical and ethical considerations involved in genetic manipulation of human ESCs, even though these cells are highly beneficial. Mesenchymal stem cells appear to be an ideal population of stem cells; in particular, adipose-derived stem cells (ASCs) can be obtained in large numbers and are easily harvested from adipose tissue. Adipose tissue is ubiquitously available and has several advantages compared to other sources: it is easily accessible in large quantities with a minimally invasive harvesting procedure, and isolation of adipose-derived mesenchymal stem cells yields a high number of stem cells, which is essential for stem cell based therapies and tissue engineering. Recently, periodontal tissue regeneration using ASCs has been examined in some animal models. This method has potential in the regeneration of functional periodontal tissues because the various growth factors secreted by ASCs might not only promote the regeneration of periodontal tissues but also encourage neovascularization of the damaged tissues. This review summarizes the sources, isolation, and characteristics of adipose-derived stem cells and discusses their potential role in periodontal regeneration. PMID:26634060

  13. A new statistical modeling and detection method for rolling element bearing faults based on alpha-stable distribution

    NASA Astrophysics Data System (ADS)

    Yu, Gang; Li, Changning; Zhang, Jianfeng

    2013-12-01

    Due to the limited information given by traditional local statistics, a new statistical modeling method for rolling element bearing fault signals is proposed based on the alpha-stable distribution. In order to take full advantage of the complete information provided by the alpha-stable distribution, this paper focuses on testing the validity of the proposed statistical model. A number of hypothesis test methods were applied to practical bearing fault vibration signals with different fault types and degrees. By testing the consistency of three alpha-stable parameter estimation methods, and how well the probability density functions of the fault signals are fitted by their corresponding hypothesized alpha-stable distributions, it can be concluded that such a non-Gaussian model is sufficient to thoroughly describe the statistical characteristics of bearing fault signals with impulsive behavior, and consequently the alpha-stable hypothesis is verified. In addition, a new bearing fault detection method based on the kurtogram and the α parameter of the alpha-stable model is proposed; experimental results show that the proposed method performs better at detecting incipient bearing faults than a method based on the traditional kurtogram alone.
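
    The following is a minimal sketch, not the authors' procedure, of fitting an alpha-stable model to an impulsive signal and checking the fit, using SciPy's levy_stable distribution on synthetic heavy-tailed data:

        import numpy as np
        from scipy.stats import levy_stable, kstest

        rng = np.random.default_rng(0)

        # Synthetic stand-in for an impulsive bearing-fault signal: heavy-tailed noise
        # (alpha < 2 gives the impulsive behavior the paper attributes to fault signals).
        signal = levy_stable.rvs(alpha=1.6, beta=0.0, loc=0.0, scale=1.0,
                                 size=300, random_state=rng)

        # Estimate the four stable parameters (alpha, beta, loc, scale) by maximum
        # likelihood (numerically heavy; quantile-based estimators are common in practice).
        alpha, beta, loc, scale = levy_stable.fit(signal)
        print(f"alpha={alpha:.2f} beta={beta:.2f} loc={loc:.2f} scale={scale:.2f}")

        # A Kolmogorov-Smirnov test gives one rough check of how well the fitted
        # alpha-stable model describes the signal's distribution.
        stat, pvalue = kstest(signal, levy_stable.cdf, args=(alpha, beta, loc, scale))
        print(f"KS statistic={stat:.3f}, p-value={pvalue:.3f}")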

  14. Identifying relatively high-risk group of coronary artery calcification based on progression rate: statistical and machine learning methods.

    PubMed

    Kim, Ha-Young; Yoo, Sanghyun; Lee, Jihyun; Kam, Hye Jin; Woo, Kyoung-Gu; Choi, Yoon-Ho; Sung, Jidong; Kang, Mira

    2012-01-01

    The coronary artery calcification (CAC) score is an important predictor of coronary artery disease (CAD), which is the primary cause of death in advanced countries. Early identification of high CAC risk based on progression rate enables people to prevent CAD from developing into severe symptoms and disease. In this study, we developed various classifiers to identify patients at high risk of CAC using statistical and machine learning methods, and compared their predictive accuracy. For the statistical approaches, a linear regression based classifier and a logistic regression model were developed. For the machine learning approaches, we suggested three kinds of ensemble-based classifiers (best, top-k, and voting method) to deal with the imbalanced distribution of our data set. The ensemble voting method outperformed all other methods, including the regression methods, with an AUC of 0.781. PMID:23366360
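
    A minimal sketch of the soft-voting ensemble idea on a synthetic imbalanced data set (scikit-learn; the base learners, class balance, and features are illustrative assumptions, not the study's pipeline):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier, VotingClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        # Synthetic stand-in for an imbalanced data set (roughly 10% "high-risk" cases).
        X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                                   random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                            random_state=0)

        # Soft-voting ensemble over heterogeneous base learners; this only illustrates
        # the general voting idea, not the paper's specific ensemble construction.
        ensemble = VotingClassifier(
            estimators=[("logreg", LogisticRegression(max_iter=1000)),
                        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
                        ("forest", RandomForestClassifier(n_estimators=200, random_state=0))],
            voting="soft",
        )
        ensemble.fit(X_train, y_train)
        auc = roc_auc_score(y_test, ensemble.predict_proba(X_test)[:, 1])
        print(f"ensemble voting AUC: {auc:.3f}")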

  15. Application of statistical methods for analyzing the relationship between casting distortion, mold filling, and interfacial heat transfer in sand molds

    SciTech Connect

    Y. A. Owusu

    1999-03-31

    This report presents a statistical method of evaluating geometric tolerances of casting products using point cloud data generated by the coordinate measuring machine (CMM) process. The focus of this report is to present a statistics-based approach to evaluate the differences in dimensional and form variations or tolerances of casting products as affected by the casting gating system, molding material, casting thickness, and casting orientation at the mold-metal interface. Form parameters such as flatness and parallelism, and other geometric profiles such as angularity, casting length, and height of casting products, were obtained and analyzed from CMM point cloud data. In order to relate the dimensional and form errors, such as flatness and parallelism, to the factors under consideration, a factorial analysis of variance and statistical tests of means were performed to identify the factors that contributed to casting distortion at the mold-metal interface.
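
    A short sketch of the factorial analysis-of-variance step on hypothetical CMM flatness measurements (statsmodels; the factor levels, effect sizes, and column names are invented for illustration):

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        from statsmodels.formula.api import ols

        rng = np.random.default_rng(1)

        # Hypothetical CMM results: flatness error (mm) for combinations of gating
        # system and molding material, 5 replicate castings per cell.
        gating = np.repeat(["top_gate", "bottom_gate"], 10)
        material = np.tile(np.repeat(["green_sand", "resin_sand"], 5), 2)
        flatness = (0.30
                    + 0.05 * (gating == "top_gate")
                    + 0.08 * (material == "green_sand")
                    + rng.normal(0, 0.02, size=20))
        df = pd.DataFrame({"gating": gating, "material": material, "flatness": flatness})

        # Two-way factorial ANOVA with interaction, to see which factors drive distortion.
        model = ols("flatness ~ C(gating) * C(material)", data=df).fit()
        print(sm.stats.anova_lm(model, typ=2))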

  16. A Comparison of the Standardization and IRT Methods of Adjusting Pretest Item Statistics Using Realistic Data.

    ERIC Educational Resources Information Center

    Chang, Shun-Wen; Hanson, Bradley A.; Harris, Deborah J.

    The requirement of large sample sizes for calibrating items based on item response theory (IRT) models is not easily met in many practical pretesting situations. Although classical item statistics could be estimated with much smaller samples, the values may not be comparable across different groups of examinees. This study extended the authors'…

  17. From association to prediction: statistical methods for the dissection and selection of complex traits in plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Quantification of genotype-to-phenotype associations is central to many scientific investigations, yet the ability to obtain consistent results may be thwarted without appropriate statistical analyses. Models for association can consider confounding effects in the materials and complex genetic inter...

  18. Statistical Methods of Latent Structure Discovery in Child-Directed Speech

    ERIC Educational Resources Information Center

    Panteleyeva, Natalya B.

    2010-01-01

    This dissertation investigates how distributional information in the speech stream can assist infants in the initial stages of acquisition of their native language phonology. An exploratory statistical analysis derives this information from the adult speech data in the corpus of conversations between adults and young children in Russian. Because…

  19. Modern Robust Statistical Methods: An Easy Way to Maximize the Accuracy and Power of Your Research

    ERIC Educational Resources Information Center

    Erceg-Hurn, David M.; Mirosevich, Vikki M.

    2008-01-01

    Classic parametric statistical significance tests, such as analysis of variance and least squares regression, are widely used by researchers in many disciplines, including psychology. For classic parametric tests to produce accurate results, the assumptions underlying them (e.g., normality and homoscedasticity) must be satisfied. These assumptions…

  20. Study of UV Cu + Ne – CuBr laser lifetime by statistical methods

    SciTech Connect

    Iliev, I P; Gocheva-Ilieva, S G

    2013-11-30

    On the basis of a large amount of experimental data, statistical investigation of the average lifetime of a UV Cu + Ne – CuBr laser depending on ten input physical laser parameters is carried out. It is found that only three of the parameters have a substantial influence on the laser lifetime. Physical analysis and interpretation of the results are provided. (lasers)

  1. Using the Bootstrap Method to Evaluate the Critical Range of Misfit for Polytomous Rasch Fit Statistics.

    PubMed

    Seol, Hyunsoo

    2016-06-01

    The purpose of this study was to apply the bootstrap procedure to evaluate how the bootstrapped confidence intervals (CIs) for polytomous Rasch fit statistics might differ according to sample sizes and test lengths in comparison with the rule-of-thumb critical value of misfit. A total of 25 simulated data sets were generated to fit the Rasch measurement model, and then a total of 1,000 replications were conducted to compute the bootstrapped CIs under each of 25 testing conditions. The results showed that rule-of-thumb critical values for assessing the magnitude of misfit were not applicable because the infit and outfit mean square error statistics showed different magnitudes of variability over testing conditions and the standardized fit statistics did not exactly follow the standard normal distribution. Further, the statistics did not share the same critical range for item and person misfit. Based on the results of the study, the bootstrapped CIs can be used to identify misfitting items or persons, as they offer a reasonable alternative solution, especially when the distributions of the infit and outfit statistics are not well known and depend on sample size.
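
    A generic sketch of the percentile bootstrap mechanics used to form such confidence intervals (Python/NumPy; the statistic and data here are toy stand-ins, not the study's Rasch infit/outfit simulation):

        import numpy as np

        def bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
            """Percentile bootstrap confidence interval for an arbitrary statistic."""
            rng = np.random.default_rng(seed)
            n = len(data)
            boot_stats = np.empty(n_boot)
            for b in range(n_boot):
                sample = rng.choice(data, size=n, replace=True)
                boot_stats[b] = statistic(sample)
            lower, upper = np.percentile(boot_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
            return lower, upper

        # Toy example: CI for a mean-square-type statistic of simulated residuals.
        rng = np.random.default_rng(42)
        residuals = rng.normal(0, 1, size=200)
        lo, hi = bootstrap_ci(residuals, lambda x: np.mean(x ** 2))
        print(f"95% bootstrap CI for the mean-square statistic: ({lo:.2f}, {hi:.2f})")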

  2. Statistical Methods for Assessments in Simulations and Serious Games. Research Report. ETS RR-14-12

    ERIC Educational Resources Information Center

    Fu, Jianbin; Zapata, Diego; Mavronikolas, Elia

    2014-01-01

    Simulation or game-based assessments produce outcome data and process data. In this article, some statistical models that can potentially be used to analyze data from simulation or game-based assessments are introduced. Specifically, cognitive diagnostic models that can be used to estimate latent skills from outcome data so as to scale these…

  3. Research design and statistical methods in Indian medical journals: a retrospective survey.

    PubMed

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S; Mayya, Shreemathi S

    2015-01-01

    Good quality medical research generally requires not only expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles which have been published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors, and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were study design types and their frequencies, the proportion of errors/defects in study design and statistical analyses, and implementation of the CONSORT checklist in RCTs (randomized clinical trials). From 2003 to 2013: The proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418), 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, p<0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ2=16.783, Φ=0.12, p<0.0001), 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, p<0.0001), 82.2% (263/320) compared to 66.3% (325/490) and

  4. Efficient and Statistically Valid Method of Textural Sea Floor Characterization in Benthic Habitat Mapping

    NASA Astrophysics Data System (ADS)

    Kostylev, V. E.; Orpin, A. R.

    2004-12-01

    The advent of multibeam bathymetric sonar technology and the thematic development of benthic habitat research have spawned renewed interest in the systematic characterization and mapping of the seafloor. This necessitates the application of reliable and accurate sea floor descriptors in combination with a robust means to statistically assess descriptor associations. Traditionally, geoscientific sea floor mapping consisted primarily of identifying the spatial extent and relationship of geological units, broadly following chronostratigraphic conventions. Classifying seafloor sediments using geological facies may not be meaningful biologically because they incorporate temporal elements that stem from a geochronological qualifier. Textural properties of geological facies are typically reliant on the application of distribution-dependent statistics, which have been shown to be inappropriate with multimodal marine sediments. While the relationship between grain size and biota appears self-evident, there is a compelling argument that granulometric properties alone are not a determinant of species distribution or community composition. The classification process is problematic because most statistical clustering techniques will, by their very nature, form clusters which may or may not represent meaningful and discernable differences. Moreover, as habitat mapping is aimed at boundary definition, the boundaries between clusters in such cases could be based on very subtle differences, or noise (e.g. sampling bias). An independent measure of the appropriate number of groups in a dataset is required. Therefore, we examine a statistical approach pioneered by Calinski & Harabasz (C-H), which was implemented by a computer routine to work in partnership with information entropy analysis of grain size data. We utilize a 30-year legacy of grain size data collected from the Scotian Shelf, Canadian Atlantic continental margin, and show that considerable improvements in textural
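
    A minimal sketch of using the Calinski-Harabasz criterion to choose the number of groups (scikit-learn; synthetic blobs stand in for the grain-size descriptors, and the clustering algorithm is an assumption):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.datasets import make_blobs
        from sklearn.metrics import calinski_harabasz_score

        # Synthetic stand-in for grain-size descriptors (e.g., entropy, mean size, sorting).
        X, _ = make_blobs(n_samples=400, centers=4, n_features=3, random_state=0)

        # The Calinski-Harabasz criterion gives an independent measure of how many
        # groups a data set supports: pick the k that maximizes the index.
        scores = {}
        for k in range(2, 9):
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
            scores[k] = calinski_harabasz_score(X, labels)

        best_k = max(scores, key=scores.get)
        print({k: round(v, 1) for k, v in scores.items()})
        print(f"C-H criterion suggests {best_k} clusters")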

  5. Total Quality Management: Statistics and Graphics III - Experimental Design and Taguchi Methods. AIR 1993 Annual Forum Paper.

    ERIC Educational Resources Information Center

    Schwabe, Robert A.

    Interest in Total Quality Management (TQM) at institutions of higher education has been stressed in recent years as an important area of activity for institutional researchers. Two previous AIR Forum papers have presented some of the statistical and graphical methods used for TQM. This paper, the third in the series, first discusses some of the…

  6. Overcoming Student Disengagement and Anxiety in Theory, Methods, and Statistics Courses by Building a Community of Learners

    ERIC Educational Resources Information Center

    Macheski, Ginger E.; Buhrmann, Jan; Lowney, Kathleen S.; Bush, Melanie E. L.

    2008-01-01

    Participants in the 2007 American Sociological Association teaching workshop, "Innovative Teaching Practices for Difficult Subjects," shared concerns about teaching statistics, research methods, and theory. Strategies for addressing these concerns center on building a community of learners by creating three processes throughout the course: 1) an…

  7. The Role of Statistics and Research Methods in the Academic Success of Psychology Majors: Do Performance and Enrollment Timing Matter?

    ERIC Educational Resources Information Center

    Freng, Scott; Webber, David; Blatter, Jamin; Wing, Ashley; Scott, Walter D.

    2011-01-01

    Comprehension of statistics and research methods is crucial to understanding psychology as a science (APA, 2007). However, psychology majors sometimes approach methodology courses with derision or anxiety (Onwuegbuzie & Wilson, 2003; Rajecki, Appleby, Williams, Johnson, & Jeschke, 2005); consequently, students may postpone enrollment…

  8. United States Census 2000 Population with Bridged Race Categories. Vital and Health Statistics. Data Evaluation and Methods Research.

    ERIC Educational Resources Information Center

    Ingram, Deborah D.; Parker, Jennifer D.; Schenker, Nathaniel; Weed, James A.; Hamilton, Brady; Arias, Elizabeth; Madans, Jennifer H.

    This report documents the National Center for Health Statistics' (NCHS) methods for bridging the Census 2000 multiple-race resident population to single-race categories and describing bridged race resident population estimates. Data came from the pooled 1997-2000 National Health Interview Surveys. The bridging models included demographic and…

  9. Stats on the Cheap: Using Free and Inexpensive Internet Resources to Enhance the Teaching of Statistics and Research Methods

    ERIC Educational Resources Information Center

    Hartnett, Jessica L.

    2013-01-01

    The present article describes four free or inexpensive Internet-based activities that can be used to supplement statistics/research methods/general psychology classes. Each activity and subsequent homework assessment is described, as well as homework performance outcome and student opinion data for each activity. (Contains 1 table.)

  10. A Meta-Analytic Review of Studies of the Effectiveness of Small-Group Learning Methods on Statistics Achievement

    ERIC Educational Resources Information Center

    Kalaian, Sema A.; Kasim, Rafa M.

    2014-01-01

    This meta-analytic study focused on the quantitative integration and synthesis of the accumulated pedagogical research in undergraduate statistics education literature. These accumulated research studies compared the academic achievement of students who had been instructed using one of the various forms of small-group learning methods to those who…

  11. On the statistical significance of excess events: Remarks of caution and the need for a standard method of calculation

    NASA Technical Reports Server (NTRS)

    Staubert, R.

    1985-01-01

    Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.

  12. Eliminating the influence of serial correlation on statistical process control charts using trend free pre-whitening (TFPW) method

    NASA Astrophysics Data System (ADS)

    Desa, Nor Hasliza Mat; Jemain, Abdul Aziz

    2013-11-01

    A key assumption in traditional statistical process control (SPC) techniques is that observations or time series data are normally and independently distributed. The presence of serial autocorrelation results in a number of problems, including an increase in the type I error rate and thereby in the expected number of false alarms in the process observations. The independence assumption is often violated in practice due to the influence of serial correlation in the observations. Therefore, the aim of this paper is to demonstrate, with hospital admission data, the influence of serial correlation on statistical control charts. The trend free pre-whitening (TFPW) method was applied as an alternative method to obtain a residual series whose values are statistically uncorrelated with each other. In this study, a data set of daily hospital admissions for respiratory and cardiovascular diseases was used, covering the period from 1 January 2009 to 31 December 2009 (365 days). Results showed that the TFPW method is an easy and useful method for removing the influence of serial correlation from the hospital admission data. It can be concluded that statistical control charts based on the residual series perform better than charts based on the original hospital admission series, which is influenced by serial correlation.
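
    A rough sketch of trend-free pre-whitening on a synthetic admissions series (Python/NumPy; an OLS trend is used here for brevity, whereas TFPW variants typically use Sen's slope, and the data are simulated):

        import numpy as np

        def trend_free_prewhiten(x):
            """Remove a linear trend and lag-1 autocorrelation from a series.

            Returns an approximately uncorrelated residual series suitable for a
            standard control chart. (Variants of TFPW estimate the trend with Sen's
            slope and add it back after prewhitening; this sketch uses an OLS trend.)
            """
            x = np.asarray(x, dtype=float)
            t = np.arange(x.size)
            slope, intercept = np.polyfit(t, x, 1)
            detrended = x - (slope * t + intercept)
            r1 = np.corrcoef(detrended[:-1], detrended[1:])[0, 1]
            residuals = detrended[1:] - r1 * detrended[:-1]
            return residuals, slope, r1

        # Toy daily-admissions series with a mild trend and AR(1) noise.
        rng = np.random.default_rng(3)
        noise = np.zeros(365)
        for i in range(1, 365):
            noise[i] = 0.6 * noise[i - 1] + rng.normal(0, 5)
        admissions = 50 + 0.02 * np.arange(365) + noise

        residuals, slope, r1 = trend_free_prewhiten(admissions)
        print(f"estimated trend={slope:.3f}/day, lag-1 r={r1:.2f}")
        print(f"lag-1 r after prewhitening="
              f"{np.corrcoef(residuals[:-1], residuals[1:])[0, 1]:.2f}")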

  13. Guidelines for the statistical analysis of a collaborative study of a laboratory method for testing disinfectant product performance.

    PubMed

    Hamilton, Martin A; Hamilton, Gordon Cord; Goeres, Darla M; Parker, Albert E

    2013-01-01

    This paper presents statistical techniques suitable for analyzing a collaborative study (multilaboratory study or ring trial) of a laboratory disinfectant product performance test (DPPT) method. Emphasis is on the assessment of the repeatability, reproducibility, resemblance, and responsiveness of the DPPT method. The suggested statistical techniques are easily modified for application to a single laboratory study. The presentation includes descriptions of the plots and tables that should be constructed during initial examination of the data, including a discussion of outliers and QA checks. The statistical recommendations deal with evaluations of prevailing types of DPPTs, including both quantitative and semiquantitative tests. The presentation emphasizes tests in which the disinfectant treatment is applied to surface-associated microbes and the outcome is a viable cell count; however, the statistical guidelines are appropriate for suspension tests and other test systems. The recommendations also are suitable for disinfectant tests using any microbe (vegetative bacteria, virus, spores, etc.) or any disinfectant treatment. The descriptions of the statistical techniques include either examples of calculations based on published data or citations to published calculations. Computer code is provided in an appendix.

  14. Thinking About Data, Research Methods, and Statistical Analyses: Commentary on Sijtsma's (2014) "Playing with Data".

    PubMed

    Waldman, Irwin D; Lilienfeld, Scott O

    2016-03-01

    We comment on Sijtsma's (2014) thought-provoking essay on how to minimize questionable research practices (QRPs) in psychology. We agree with Sijtsma that proactive measures to decrease the risk of QRPs will ultimately be more productive than efforts to target individual researchers and their work. In particular, we concur that encouraging researchers to make their data and research materials public is the best institutional antidote against QRPs, although we are concerned that Sijtsma's proposal to delegate more responsibility to statistical and methodological consultants could inadvertently reinforce the dichotomy between the substantive and statistical aspects of research. We also discuss sources of false-positive findings and replication failures in psychological research, and outline potential remedies for these problems. We conclude that replicability is the best metric of the minimization of QRPs and their adverse effects on psychological research.

  15. Using the UMLS and Simple Statistical Methods to Semantically Categorize Causes of Death on Death Certificates.

    PubMed

    Riedl, Bill; Than, Nhan; Hogarth, Michael

    2010-11-13

    Cause of death data is an invaluable resource for shaping our understanding of population health. Mortality statistics is one of the principal sources of health information and in many countries the most reliable source of health data. 1 A quick classification process for this data can significantly improve public health efforts. Currently, cause of death data is captured in unstructured form requiring months to process. We think this process can be automated, at least partially, using simple statistical Natural Language Processing, NLP, techniques and the Unified Medical Language System, UMLS, as a vocabulary resource. A system, Medical Match Master, MMM, was built to exercise this theory. We evaluate this simple NLP approach in the classification of causes of death. This technique performed well if we engaged the use of a large biomedical vocabulary and applied certain syntactic maneuvers made possible by textual relationships within the vocabulary.

  16. Method for nondestructive testing using multiple-energy CT and statistical pattern classification

    NASA Astrophysics Data System (ADS)

    Homem, Murillo R. P.; Mascarenhas, Nelson D. A.; Cruvinel, Paulo E.

    1999-10-01

    This paper reports on how multiple energy techniques in X and gamma-ray CT scanning are able to provide good results with the use of Statistical Pattern Classification theory. We obtained a set of four images with different energies (40, 60, 85 and 662 keV) containing aluminum, phosphorus, calcium, water and plexiglass, with a minitomograph scanner for soil science. We analyzed those images through both a supervised classifier based on the maximum-likelihood criterion under the multivariate Gaussian model and a supervised contextual classifier based on the ICM (iterated conditional modes) algorithm using an a priori Potts-Strauss model. A comparison between them was performed through the statistical kappa coefficient. A feature selection procedure using the Jeffries-Matusita (J-M) distance was also performed. Both the classification and the feature selection procedures were found to be in agreement with the predicted discrimination given by the separation of the linear attenuation coefficient curves for different materials.
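
    A minimal sketch of the maximum-likelihood Gaussian classification and kappa-agreement steps on synthetic multi-energy attenuation features (scikit-learn's quadratic discriminant analysis is the ML rule under class-wise multivariate Gaussians; the contextual ICM classifier and J-M feature selection are not shown, and all values are invented):

        import numpy as np
        from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
        from sklearn.metrics import cohen_kappa_score

        rng = np.random.default_rng(0)

        # Synthetic pixel feature vectors: one attenuation value per energy (4 energies),
        # for three hypothetical materials.
        n = 300
        means = {"water": [20, 18, 16, 8],
                 "aluminum": [60, 50, 42, 20],
                 "calcium": [90, 75, 60, 30]}
        X = np.vstack([rng.normal(m, 4.0, size=(n, 4)) for m in means.values()])
        y = np.repeat(list(means.keys()), n)

        # Quadratic discriminant analysis is the maximum-likelihood rule under a
        # class-conditional multivariate Gaussian model (one covariance per class).
        clf = QuadraticDiscriminantAnalysis().fit(X, y)
        pred = clf.predict(X)

        # Kappa measures agreement between the classification and the reference labels.
        print(f"kappa = {cohen_kappa_score(y, pred):.3f}")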

  17. Methods of mathematical statistics for verification of hydrogen content in zirconium hydride moderator

    SciTech Connect

    Ponomarev-Stepnoi, N.N.; Bubelev, V.G.; Glushkov, Ye.S.; Kompaniets, G.V.; Nosov, V.I. )

    1995-02-01

    The hydrogen content of zirconium hydride blocks used as the moderator in Topaz-2-type space reactors is estimated according to correlation-regression analysis procedures of mathematical statistics and is based on the results of the determination of the reactivity of the blocks in a research critical assembly. A linear mathematical model for a variable response is formulated within the framework of the first-order perturbation theory applied to the estimation of reactivity effects in reactors. A PASPORT computer code is written based on the developed algorithm. The statistical analysis of the available data performed by using PASPORT shows that the developed approach allows determination of the insignificance of the contribution of the impurities to the reactivity of the blocks, verification of the manufacturer's data on the hydrogen content in zirconium hydride blocks, and estimation of the reactivity shift in a standard block.

  18. Use of statistical methods in industrial water pollution control regulations in the United States.

    PubMed

    Kahn, H D; Rubin, M B

    1989-11-01

    This paper describes the process for developing regulations limiting the discharge of pollutants from industrial sources into the waters of the United States. The process includes surveys of the industry to define products, processes, wastewater sources and characteristics, appropriate subcategorization, and control technologies in use. Limitations on the amounts of pollutants that may be discharged in treated wastewater are based on statistical analysis of physical and chemical analytical data characterizing the performance capability of technologies in use in the industry. A general discussion of the statistical approach employed is provided along with some examples based on work performed to support recently promulgated regulations. The determination of regulatory discharge limitations, based on estimates of percentiles of lognormal distributions of measured pollutant concentrations in treated wastewater, is presented. Modifications to account for different averaging periods and detection limit observations are discussed. PMID:24243169
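
    A sketch of the kind of lognormal-percentile calculation described, on hypothetical effluent data (SciPy/NumPy; the concentrations, the fitted parameters, and the choice of the 99th percentile are illustrative assumptions, not regulatory values):

        import numpy as np
        from scipy.stats import lognorm, norm

        rng = np.random.default_rng(7)

        # Hypothetical long-term effluent concentrations (mg/L) from a well-operated
        # treatment technology, assumed lognormally distributed.
        conc = lognorm.rvs(s=0.6, scale=np.exp(1.0), size=500, random_state=rng)

        # Fit the lognormal via the mean and standard deviation of the log-transformed
        # data, then take an upper percentile (e.g., the 99th) as a daily-maximum-type limit.
        log_c = np.log(conc)
        mu, sigma = log_c.mean(), log_c.std(ddof=1)
        p99 = np.exp(mu + norm.ppf(0.99) * sigma)
        print(f"estimated 99th percentile (candidate daily maximum limit): {p99:.2f} mg/L")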

  19. Statistical analysis of nonlinear dynamical systems using differential geometric sampling methods.

    PubMed

    Calderhead, Ben; Girolami, Mark

    2011-12-01

    Mechanistic models based on systems of nonlinear differential equations can help provide a quantitative understanding of complex physical or biological phenomena. The use of such models to describe nonlinear interactions in molecular biology has a long history; however, it is only recently that advances in computing have allowed these models to be set within a statistical framework, further increasing their usefulness and binding modelling and experimental approaches more tightly together. A probabilistic approach to modelling allows us to quantify uncertainty in both the model parameters and the model predictions, as well as in the model hypotheses themselves. In this paper, the Bayesian approach to statistical inference is adopted and we examine the significant challenges that arise when performing inference over nonlinear ordinary differential equation models describing cell signalling pathways and enzymatic circadian control; in particular, we address the difficulties arising owing to strong nonlinear correlation structures, high dimensionality and non-identifiability of parameters. We demonstrate how recently introduced differential geometric Markov chain Monte Carlo methodology alleviates many of these issues by making proposals based on local sensitivity information, which ultimately allows us to perform effective statistical analysis. Along the way, we highlight the deep link between the sensitivity analysis of such dynamic system models and the underlying Riemannian geometry of the induced posterior probability distributions. PMID:23226584

  20. Statistical methods for analysis of time-dependent inhibition of cytochrome p450 enzymes.

    PubMed

    Yates, Phillip; Eng, Heather; Di, Li; Obach, R Scott

    2012-12-01

    Time-dependent inhibition (TDI) of cytochrome P450 (P450) enzymes, especially CYP3A4, is an important attribute of drugs in evaluating the potential for pharmacokinetic drug-drug interactions. The analysis of TDI data for P450 enzymes can be challenging, yet it is important to be able to reliably evaluate whether a drug is a TDI or not, and if so, how best to derive the inactivation kinetic parameters K(I) and k(inact). In the present investigation a two-step statistical evaluation was developed to evaluate CYP3A4 TDI data. In the first step, a two-sided two-sample z-test is used to compare the k(obs) values measured in the absence and presence of the test compound to answer the question of whether the test compound is a TDI or not. In the second step, k(obs) values are plotted versus both [I] and ln[I] to determine whether a significant correlation exists, which can then inform the investigator of whether the inactivation kinetic parameters, K(I) and k(inact), can be reliably estimated. Use of this two-step statistical evaluation is illustrated with the examination of five drugs of varying capabilities to inactivate CYP3A4: ketoconazole, erythromycin, raloxifene, rosiglitazone, and pioglitazone. The use of a set statistical algorithm offers a more robust and objective approach to the analysis of P450 TDI data than frequently employed empirically derived or heuristic approaches.
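
    An illustrative sketch of the two steps on made-up k(obs) values (SciPy; the concentrations, estimates, and standard errors are invented, and the exact test construction in the paper may differ):

        import numpy as np
        from scipy import stats

        # Hypothetical k_obs estimates (min^-1) and their standard errors from fitting
        # log(remaining activity) versus preincubation time, at several inhibitor
        # concentrations [I] (uM); values are illustrative only.
        conc = np.array([1.0, 3.0, 10.0, 30.0, 100.0])
        kobs = np.array([0.012, 0.020, 0.038, 0.060, 0.078])
        se_kobs = np.array([0.002, 0.002, 0.003, 0.004, 0.005])
        kobs_0, se_0 = 0.008, 0.002   # vehicle control (no test compound)

        # Step 1: two-sided two-sample z-test of k_obs at the top concentration vs control.
        z = (kobs[-1] - kobs_0) / np.hypot(se_kobs[-1], se_0)
        p = 2 * stats.norm.sf(abs(z))
        print(f"z = {z:.2f}, p = {p:.2g} -> {'TDI signal' if p < 0.05 else 'no TDI signal'}")

        # Step 2: check whether k_obs correlates with [I] and ln[I]; a significant
        # correlation suggests K_I and k_inact can be estimated from these data.
        for label, x in [("[I]", conc), ("ln[I]", np.log(conc))]:
            r, p_corr = stats.pearsonr(x, kobs)
            print(f"corr(k_obs, {label}): r = {r:.2f}, p = {p_corr:.3f}")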

  1. Quantifying variability within water samples: the need for adequate subsampling.

    PubMed

    Donohue, Ian; Irvine, Kenneth

    2008-01-01

    Accurate and precise determination of the concentration of nutrients and other substances in waterbodies is an essential requirement for supporting effective management and legislation. Owing primarily to logistic and financial constraints, however, national and regional agencies responsible for monitoring surface waters tend to quantify chemical indicators of water quality using a single sample from each waterbody, thus largely ignoring spatial variability. We show here that total sample variability, which comprises both analytical variability and within-sample heterogeneity, of a number of important chemical indicators of water quality (chlorophyll a, total phosphorus, total nitrogen, soluble molybdate-reactive phosphorus and dissolved inorganic nitrogen) varies significantly both over time and among determinands, and can be extremely high. Within-sample heterogeneity, whose mean contribution to total sample variability ranged between 62% and 100%, was significantly higher in samples taken from rivers compared with those from lakes, and was shown to be reduced by filtration. Our results show clearly that neither a single sample, nor even two sub-samples from that sample is adequate for the reliable, and statistically robust, detection of changes in the quality of surface waters. We recommend strongly that, in situations where it is practicable to take only a single sample from a waterbody, a minimum of three sub-samples are analysed from that sample for robust quantification of both the concentrations of determinands and total sample variability. PMID:17706740

  2. Statistical methods for estimating normal blood chemistry ranges and variance in rainbow trout (Salmo gairdneri), Shasta Strain

    USGS Publications Warehouse

    Wedemeyer, Gary A.; Nelson, Nancy C.

    1975-01-01

    Gaussian and nonparametric (percentile estimate and tolerance interval) statistical methods were used to estimate normal ranges for blood chemistry (bicarbonate, bilirubin, calcium, hematocrit, hemoglobin, magnesium, mean cell hemoglobin concentration, osmolality, inorganic phosphorus, and pH) for juvenile rainbow trout (Salmo gairdneri, Shasta strain) held under defined environmental conditions. The percentile estimate and Gaussian methods gave similar normal ranges, whereas the tolerance interval method gave consistently wider ranges for all blood variables except hemoglobin. If the underlying frequency distribution is unknown, the percentile estimate procedure would be the method of choice.
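
    A minimal sketch contrasting the Gaussian and percentile-estimate normal ranges on simulated values (NumPy; the analyte, sample size, and numbers are hypothetical):

        import numpy as np

        rng = np.random.default_rng(11)

        # Hypothetical blood-chemistry values (e.g., plasma calcium, mg/dL) from
        # juvenile trout held under defined conditions; values are illustrative only.
        values = rng.normal(loc=10.5, scale=0.8, size=120)

        # Gaussian method: mean +/- 1.96 SD covers the central 95% if the data are normal.
        mean, sd = values.mean(), values.std(ddof=1)
        gauss_range = (mean - 1.96 * sd, mean + 1.96 * sd)

        # Nonparametric percentile-estimate method: 2.5th and 97.5th sample percentiles,
        # the method of choice when the underlying distribution is unknown.
        pct_range = tuple(np.percentile(values, [2.5, 97.5]))

        print(f"Gaussian normal range:   {gauss_range[0]:.2f} - {gauss_range[1]:.2f}")
        print(f"Percentile normal range: {pct_range[0]:.2f} - {pct_range[1]:.2f}")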

  3. Design and statistical methods in studies using animal models of development.

    PubMed

    Festing, Michael F W

    2006-01-01

    Experiments involving neonates should follow the same basic principles as most other experiments. They should be unbiased, be powerful, have a good range of applicability, not be excessively complex, and be statistically analyzable to show the range of uncertainty in the conclusions. However, investigation of growth and development in neonatal multiparous animals poses special problems associated with the choice of "experimental unit" and differences between litters: the "litter effect." Two main types of experiments are described, with recommendations regarding their design and statistical analysis: First, the "between litter design" is used when females or whole litters are assigned to a treatment group. In this case the litter, rather than the individuals within a litter, is the experimental unit and should be the unit for the statistical analysis. Measurements made on individual neonatal animals need to be combined within each litter. Counting each neonate as a separate observation may lead to incorrect conclusions. The number of observations for each outcome ("n") is based on the number of treated females or whole litters. Where litter sizes vary, it may be necessary to use a weighted statistical analysis because means based on more observations are more reliable than those based on a few observations. Second, the more powerful "within-litter design" is used when neonates can be individually assigned to treatment groups so that individuals within a litter can have different treatments. In this case, the individual neonate is the experimental unit, and "n" is based on the number of individual pups, not on the number of whole litters. However, variation in litter size means that it may be difficult to perform balanced experiments with equal numbers of animals in each treatment group within each litter. This increases the complexity of the statistical analysis. A numerical example using a general linear model analysis of variance is provided in the Appendix. The

  4. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.

  5. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Astrophysics Data System (ADS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.; Stanley, H. E.

    1995-09-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C.elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of the coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger ``n-gram redundancy'') than the coding regions. In contrast to the three chromosomes, we find that for vertebrates-such as primates and rodents-and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of zero- and first-order Markovian models or simple nucleotide repeats to account fully for these ``linguistic'' features of DNA. Finally, we emphasize that our results by no means prove the existence of a ``language'' in noncoding DNA.

  6. Statistical Methods for Analysis of High-Throughput RNA Interference Screens

    PubMed Central

    Birmingham, Amanda; Selfors, Laura M.; Forster, Thorsten; Wrobel, David; Kennedy, Caleb J.; Shanks, Emma; Santoyo-Lopez, Javier; Dunican, Dara J.; Long, Aideen; Kelleher, Dermot; Smith, Queta; Beijersbergen, Roderick L.; Ghazal, Peter; Shamu, Caroline E.

    2009-01-01

    RNA interference (RNAi) has become a powerful technique for reverse genetics and drug discovery and, in both of these areas, large-scale high-throughput RNAi screens are commonly performed. The statistical techniques used to analyze these screens are frequently borrowed directly from small-molecule screening; however small-molecule and RNAi data characteristics differ in meaningful ways. We examine the similarities and differences between RNAi and small-molecule screens, highlighting particular characteristics of RNAi screen data that must be addressed during analysis. Additionally, we provide guidance on selection of analysis techniques in the context of a sample workflow. PMID:19644458

  7. A Comparison of Five Statistical Methods for Analyzing Pretest-Posttest Designs.

    ERIC Educational Resources Information Center

    Hendrix, Leland J.; And Others

    1978-01-01

    Five methods for analyzing data from pretest-post-test research designs are discussed. Analysis of gain scores, with pretests as a covariate, is indicated as a superior method when the assumptions underlying covariance analysis are met. (Author/GDC)

  8. STATISTICAL VALIDATION OF SULFATE QUANTIFICATION METHODS USED FOR ANALYSIS OF ACID MINE DRAINAGE

    EPA Science Inventory

    Turbidimetric method (TM), ion chromatography (IC) and inductively coupled plasma atomic emission spectrometry (ICP-AES) with and without acid digestion have been compared and validated for the determination of sulfate in mining wastewater. Analytical methods were chosen to compa...

  9. A statistical method of testing the gamma ray emission mechanisms of blazars.

    NASA Astrophysics Data System (ADS)

    Chi, X.; Young, E. C. M.

    1997-09-01

    Models for the generation of high energy gamma rays in blazars can be classified into two types of mechanisms in the jet comoving frame: relativistic electron scattering on internal photons or the magnetic field (virtual photons) (SIP) and on external photons (SEP). These two mechanisms are known to result in a significant difference in the beaming effect. In this work, we propose a statistical test for the two types of mechanisms based on the beaming difference. The random variable is taken to be the K-corrected gamma ray to radio flux ratio, and its distribution is shown to be a power law whose index is model-dependent. The feasibility of such a test is investigated with a limited sample of data compiled from the EGRET gamma ray survey, low resolution radio surveys and a VLBI radio survey. A correlation study indicates that the VLBI data are more suitable for this purpose than the low resolution data. Due to the limited amount of available data, the current test result is not statistically significant enough to discriminate between the two emission mechanisms. Future generations of high energy gamma ray telescopes are needed to produce a larger sample of gamma ray blazars, and simultaneous VLBI observations of them are called for.

  10. Automated microcalcification detection in mammograms using statistical variable-box-threshold filter method

    NASA Astrophysics Data System (ADS)

    Wilson, Mark; Mitra, Sunanda; Roberson, Glenn H.; Shieh, Yao-Yang

    1997-10-01

    Currently, early detection of breast cancer is primarily accomplished by mammography, and suspicious findings may lead to a decision to perform a biopsy. Digital enhancement and pattern recognition techniques may aid in early detection of some patterns, such as microcalcification clusters indicating onset of DCIS (ductal carcinoma in situ), which accounts for 20% of all mammographically detected breast cancers and could be treated when detected early. These individual calcifications are hard to detect due to size and shape variability and inhomogeneous background texture. Our study addresses only early detection of microcalcifications, which allows the radiologist to interpret the x-ray findings in a computer-aided, enhanced form more easily than by evaluating the x-ray film directly. We present an algorithm which locates microcalcifications based on local grayscale variability of tissue structures and on image statistics. Threshold filters with lower and upper bounds computed from the image statistics of the entire image and selected subimages were designed to enhance the entire image. This enhanced image was used as the initial image for identifying the microcalcifications based on variable box threshold filters at different resolutions. The test images came from the Texas Tech University Health Sciences Center and the MIAS mammographic database, which are classified into various categories including microcalcifications. Classification of other types of abnormalities in mammograms based on their characteristic features is addressed in later studies.
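
    A rough stand-in for a variable-box-threshold filter (SciPy/NumPy; the box size, threshold multiplier, and toy image are assumptions, and the authors' multi-resolution enhancement step is not reproduced):

        import numpy as np
        from scipy.ndimage import uniform_filter

        def box_threshold_detect(image, box=15, k=3.0):
            """Flag pixels brighter than the local mean plus k local standard deviations.

            The threshold adapts to local image statistics computed over a box x box
            neighbourhood, a simple analogue of a variable-box-threshold filter.
            """
            img = image.astype(float)
            local_mean = uniform_filter(img, size=box)
            local_sqmean = uniform_filter(img ** 2, size=box)
            local_std = np.sqrt(np.maximum(local_sqmean - local_mean ** 2, 0.0))
            return img > local_mean + k * local_std

        # Toy "mammogram": smooth background texture with a few bright spots added.
        rng = np.random.default_rng(5)
        background = uniform_filter(rng.normal(100, 10, size=(256, 256)), size=9)
        image = background.copy()
        for r, c in [(60, 60), (61, 63), (200, 130)]:
            image[r, c] += 60  # simulated microcalcifications

        mask = box_threshold_detect(image, box=15, k=3.5)
        print(f"{mask.sum()} candidate microcalcification pixels flagged")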

  11. The Kernel Method of Equating Score Distributions. Program Statistics Research Technical Report No. 89-84.

    ERIC Educational Resources Information Center

    Holland, Paul W.; Thayer, Dorothy T.

    A new and unified approach to test equating is described that is based on log-linear models for smoothing score distributions and on the kernel method of nonparametric density estimation. The new method contains both linear and standard equipercentile methods as special cases and can handle several important equating data collection designs. An…

  12. An investigation of the 'Overlap' between the Statistical-Discrete-Gust and the Power-Spectral-Density analysis methods

    NASA Technical Reports Server (NTRS)

    Perry, Boyd, III; Pototzky, Anthony S.; Woods, Jessica A.

    1989-01-01

    This paper presents the results of a NASA investigation of a claimed 'Overlap' between two gust response analysis methods: the Statistical Discrete Gust (SDG) method and the Power Spectral Density (PSD) method. The claim is that the ratio of an SDG response to the corresponding PSD response is 10.4. Analytical results presented in this paper for several different airplanes at several different flight conditions indicate that such an 'Overlap' does appear to exist. However, the claim was not met precisely: a scatter of up to about 10 percent about the 10.4 factor can be expected.

  13. An Investigation of the Overlap Between the Statistical Discrete Gust and the Power Spectral Density Analysis Methods

    NASA Technical Reports Server (NTRS)

    Perry, Boyd, III; Pototzky, Anthony S.; Woods, Jessica A.

    1989-01-01

    The results of a NASA investigation of a claimed Overlap between two gust response analysis methods: the Statistical Discrete Gust (SDG) Method and the Power Spectral Density (PSD) Method are presented. The claim is that the ratio of an SDG response to the corresponding PSD response is 10.4. Analytical results presented for several different airplanes at several different flight conditions indicate that such an Overlap does appear to exist. However, the claim was not met precisely: a scatter of up to about 10 percent about the 10.4 factor can be expected.

  14. The Fusion of Financial Analysis and Seismology: Statistical Methods from Financial Market Analysis Applied to Earthquake Data

    NASA Astrophysics Data System (ADS)

    Ohyanagi, S.; Dileonardo, C.

    2013-12-01

    As a natural phenomenon, earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends, were tested against earthquake data. Earthquakes above Mw 4.0 located off the shore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spreading of the candlesticks prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquake occurrence. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with the characteristics of both the candlestick chart and Bollinger Band analyses. These results show there is a high correlation between patterns in earthquake occurrence and the trend analysis by these two statistical methods. The results of this study support the appropriateness of applying these financial analysis methods to the analysis of earthquake occurrence.
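
    A minimal sketch of the Bollinger Band calculation applied to a synthetic monthly event-count series (pandas; the counts are random Poisson draws, not the Sanriku catalog, and the window and band width are conventional choices):

        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(9)

        # Toy monthly series of earthquake counts (Mw >= 4.0) for a single region;
        # a real analysis would use an actual catalog instead.
        months = pd.date_range("1973-02", periods=480, freq="MS")
        counts = pd.Series(rng.poisson(6, size=480), index=months, dtype=float)

        # Bollinger Bands: a rolling mean with bands at +/- 2 rolling standard deviations.
        window = 12
        mid = counts.rolling(window).mean()
        std = counts.rolling(window).std()
        upper, lower = mid + 2 * std, mid - 2 * std

        # "Convergence" of the bands (a narrow band width) is the kind of pattern the
        # study associates with a change in the trend of earthquake occurrence.
        bandwidth = (upper - lower) / mid
        print(bandwidth.tail())
        print(f"narrowest band width: {bandwidth.min():.2f} at {bandwidth.idxmin().date()}")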

  15. Identification of robust statistical downscaling methods based on a comprehensive suite of performance metrics for South Korea

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Cannon, A. J.

    2015-12-01

    Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch of spatial resolution between GCMs and regional applications, in particular for a region characterized by complex terrain such as the Korean peninsula. Therefore, a downscaling procedure is essential to assess regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly due to their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split sampling scheme, all methods are calibrated with observational station data for the 19 years from 1973 to 1991 and tested on the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distribution, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR, and all methods lead to large improvements in representing all performance metrics. According to the seasonal performance metrics evaluated, when TOPSIS is applied, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is recognized as near perfect climate data in climate studies. Therefore, the
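
    A minimal sketch of the TOPSIS ranking step on hypothetical seasonal performance metrics (NumPy; the metric values, weights, and benefit/cost directions are invented for illustration and do not reproduce the study's results):

        import numpy as np

        def topsis(scores, weights, benefit):
            """Rank alternatives with TOPSIS.

            scores:  (n_alternatives, n_criteria) matrix of performance metrics
            weights: criterion weights summing to 1
            benefit: True where larger is better, False where smaller is better
            """
            m = scores / np.linalg.norm(scores, axis=0)          # vector normalization
            v = m * weights                                      # weighted normalized matrix
            ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
            anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
            d_best = np.linalg.norm(v - ideal, axis=1)
            d_worst = np.linalg.norm(v - anti, axis=1)
            return d_worst / (d_best + d_worst)                  # closeness to the ideal

        # Hypothetical seasonal metrics for four downscaling methods: temporal correlation
        # (higher better), a distribution distance (lower better), spatial correlation
        # (higher better), and extreme-event bias (lower better).
        methods = ["BCSD", "BCCA", "MACA", "BCCI"]
        scores = np.array([[0.82, 0.10, 0.75, 0.20],
                           [0.78, 0.12, 0.70, 0.25],
                           [0.88, 0.07, 0.80, 0.12],
                           [0.84, 0.09, 0.77, 0.18]])
        closeness = topsis(scores, weights=np.array([0.25, 0.25, 0.25, 0.25]),
                           benefit=np.array([True, False, True, False]))
        for name, c in sorted(zip(methods, closeness), key=lambda t: -t[1]):
            print(f"{name}: closeness = {c:.3f}")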

  16. Upscaling Method Using Multipoint Geostatistics for Statistical Integration of Seismic and Electromagnetic Data

    NASA Astrophysics Data System (ADS)

    Lee, J.; Mukerji, T.; Tompkins, M. J.

    2012-12-01

    Joint integration of seismic and electromagnetic (EM) data has been studied to better characterize hydrocarbon reservoirs because the two data types are sensitive to different reservoir properties. Most studies, however, applied deterministic joint inversion, which provides the best estimate of the spatial distribution of reservoir properties in a least-squares sense. Although this way of integrating two different data types helps to obtain an improved reservoir model matching both data sets, it gives only one reservoir model, whereas numerous reservoir models can be consistent with the seismic and EM data obtained from field measurements. Therefore, the uncertainty associated with reservoir models should be quantified to reduce the risk of making wrong decisions in reservoir management. We suggest statistical integration with a new upscaling scheme, which simulates the joint probability distribution of field-scale seismic and EM data as well as reservoir properties, such as facies, porosity, and fluid saturation, not only to estimate reservoir properties but also to assess the uncertainty of the estimates. Statistical data integration has been used (e.g., Lucet and Mavko, 1991; Avseth et al., 2001a, 2001b; Mukerji et al., 1998, 2001; Eidsvik et al., 2004) to characterize reservoirs from seismic attributes in geophysics. The main issue in applying statistical integration to joint seismic and EM data is the scale difference between the two data types, because seismic (crosswell or surface seismic) and EM measurements (crosswell EM or CSEM) represent different volumes of a reservoir. In this research, reservoirs geologically analogous to the target reservoir were generated by unconstrained simulation with the multipoint geostatistical algorithm SNESIM (Strebelle, 2000, 2002). Well-log scale seismic and EM attributes were randomly assigned to the analogous reservoirs using conditional probability distributions of the attributes given facies obtained from well log analysis. Forward modeling and inversion of the analogous reservoirs were

  17. Parametric and Nonparametric Statistical Methods for Genomic Selection of Traits with Additive and Epistatic Genetic Architectures

    PubMed Central

    Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.

    2014-01-01

    Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., proportion of phenotypic variability, had the second greatest impact on estimates of accuracy and MSE. PMID:24727289
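
    A toy comparison in the spirit of the study, contrasting a parametric (ridge) and a nonparametric (RBF kernel ridge) predictor on simulated additive and epistatic architectures (scikit-learn; marker coding, effect sizes, and noise level are assumptions, and the outcome will not match the paper's):

        import numpy as np
        from sklearn.kernel_ridge import KernelRidge
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(2)

        # Simulated biallelic markers coded 0/1/2 for 300 individuals at 200 loci.
        X = rng.integers(0, 3, size=(300, 200)).astype(float)
        beta = rng.normal(0, 1, size=200)

        additive = X @ beta                                   # purely additive architecture
        epistatic = (X[:, 0] * X[:, 1] + X[:, 2] * X[:, 3]    # purely two-way epistatic
                     + X[:, 4] * X[:, 5]) * 3.0

        for name, signal in [("additive", additive), ("epistatic", epistatic)]:
            noise = rng.normal(0, signal.std(), size=300)     # roughly half the variance is noise
            y = signal + noise
            for label, model in [("ridge (parametric)", Ridge(alpha=1.0)),
                                 ("RBF kernel ridge (nonparametric)",
                                  KernelRidge(kernel="rbf", alpha=1.0, gamma=1e-3))]:
                r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
                print(f"{name:9s} | {label:32s} | CV R^2 = {r2:.2f}")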

  18. Computational and statistical methods for high-throughput analysis of post-translational modifications of proteins.

    PubMed

    Schwämmle, Veit; Verano-Braga, Thiago; Roepstorff, Peter

    2015-11-01

    The investigation of post-translational modifications (PTMs) represents one of the main research focuses for the study of protein function and cell signaling. Mass spectrometry instrumentation with increasing sensitivity, improved protocols for PTM enrichment, and recently established pipelines for high-throughput experiments allow large-scale identification and quantification of several PTM types. This review addresses the concurrently emerging challenges for the computational analysis of the resulting data and presents PTM-centered approaches for spectra identification, statistical analysis, multivariate analysis and data interpretation. We furthermore discuss the potential of future developments that will help to gain deep insight into the PTM-ome and its biological role in cells. This article is part of a Special Issue entitled: Computational Proteomics.

  20. Statistical and graphical methods for evaluating solute transport models: Overview and application

    NASA Astrophysics Data System (ADS)

    Loague, Keith; Green, Richard E.

    1991-01-01

    Mathematical modeling is the major tool to predict the mobility and the persistence of pollutants to and within groundwater systems. Several comprehensive institutional models have been developed in recent years for this purpose. However, evaluation procedures are not well established for models of saturated-unsaturated soil-water flow and chemical transport. This paper consists of three parts: (1) an overview of various aspects of mathematical modeling focused upon solute transport models; (2) an introduction to statistical criteria and graphical displays that can be useful for model evaluation; and (3) an example of model evaluation for a mathematical model of pesticide leaching. The model testing example uses observed and predicted atrazine concentration profiles from a small catchment in Georgia. The model tested is the EPA pesticide root zone model (PRZM).
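
    A minimal sketch of the kind of statistical criteria used for model evaluation, computed for hypothetical observed and predicted concentration profiles; the criteria shown (maximum error, root mean square error, modelling efficiency, coefficient of residual mass) are common choices and may not match the paper's exact set or notation.

        import numpy as np

        # Hypothetical observed and predicted pesticide concentrations (e.g. mg/kg)
        obs  = np.array([0.90, 0.72, 0.55, 0.31, 0.18, 0.07])
        pred = np.array([0.95, 0.65, 0.50, 0.35, 0.12, 0.10])

        def evaluation_statistics(obs, pred):
            """Common goodness-of-fit criteria for solute transport models."""
            resid = pred - obs
            return {
                "maximum error": np.max(np.abs(resid)),
                "RMSE":          np.sqrt(np.mean(resid ** 2)),
                # Modelling efficiency: 1 is perfect, < 0 is worse than the observed mean
                "efficiency":    1 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2),
                # Coefficient of residual mass: > 0 indicates overall under-prediction
                "residual mass": (obs.sum() - pred.sum()) / obs.sum(),
            }

        for name, value in evaluation_statistics(obs, pred).items():
            print(f"{name:15s} {value: .3f}")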

  1. Incorrect statistical method in parallel-groups RCT led to unsubstantiated conclusions.

    PubMed

    Allison, David B; Antoine, Lisa H; George, Brandon J

    2016-01-01

    The article by Aiso et al. titled "Compared with the intake of commercial vegetable juice, the intake of fresh fruit and komatsuna (Brassica rapa L. var perviridis) juice mixture reduces serum cholesterol in middle-aged men: a randomized controlled pilot study" does not meet the expected standards of Lipids in Health and Disease. Although the article concludes that there are some significant benefits to their komatsuna juice mixture, these claims are not supported by the statistical analyses used. An incorrect procedure was used to compare the differences in two treatment groups over time, and a large number of outcomes were tested without correction; both issues are known to produce high rates of false positives, making the conclusions of the study unjustified. The study also fails to follow published journal standards regarding clinical trial registration and reporting. PMID:27083538
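
    To illustrate the statistical point rather than the data of the criticized trial, the sketch below contrasts the flawed approach (separate within-group paired tests) with a direct between-group comparison of change scores, using simulated cholesterol values; all numbers are invented.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)
        n = 15                                    # per group, pilot-sized
        # Simulated total cholesterol (mmol/L) before and after, with no true group effect
        base_t = rng.normal(5.8, 0.6, n); post_t = base_t + rng.normal(-0.1, 0.4, n)
        base_c = rng.normal(5.8, 0.6, n); post_c = base_c + rng.normal(-0.1, 0.4, n)

        # Flawed approach: separate paired tests within each group.  A "significant"
        # within-treatment change alongside a non-significant control change does
        # NOT demonstrate a between-group difference.
        print("within treatment p =", stats.ttest_rel(post_t, base_t).pvalue.round(3))
        print("within control   p =", stats.ttest_rel(post_c, base_c).pvalue.round(3))

        # Appropriate approach: compare the *changes* between groups directly.
        change_t, change_c = post_t - base_t, post_c - base_c
        print("between-group change p =",
              stats.ttest_ind(change_t, change_c).pvalue.round(3))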

  2. Overview of selected multivariate statistical methods and their use in phytopathological research.

    PubMed

    Sanogo, S; Yang, X B

    2004-09-01

    To disentangle the nature of a pathosystem or a component of the system, such as disease epidemics, for descriptive or predictive purposes, mensuration is conducted on several variables of the physical and chemical environment, pathogenic populations, and host plants. For instance, it may be desired to (i) distinguish pathogenic variation among several isolates of a pathogen based on disease severity; (ii) identify the most important variables that characterize the structure of an epidemic; and (iii) assess the potential of developing regional-scale versus site-specific pest management schemes using weather and site variation. In all these cases, a simultaneous handling of several variables is required, and this entails the use of multivariate statistics such as discriminant analysis, multivariate analysis of variance, correspondence analysis, and canonical correlation analysis. These tools have been used to varying degrees in the phytopathological literature. A succinct overview of these tools is presented with cited examples.
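
    As a hedged example of one of the tools mentioned, the sketch below applies linear discriminant analysis to distinguish hypothetical pathogen isolates from several simulated disease-severity variables; the data are illustrative only.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(3)

        # Simulated severity variables (lesion area, sporulation, incidence) for
        # three hypothetical isolate groups, 20 plants per isolate.
        means = np.array([[2.0, 1.0, 0.3], [3.5, 1.8, 0.5], [5.0, 2.5, 0.8]])
        X = np.vstack([rng.normal(m, 0.5, size=(20, 3)) for m in means])
        y = np.repeat([0, 1, 2], 20)                 # isolate labels

        lda = LinearDiscriminantAnalysis()
        scores = lda.fit(X, y).transform(X)          # canonical discriminant scores
        print("classification accuracy:", round(lda.score(X, y), 2))
        print("variance explained by discriminant axes:",
              lda.explained_variance_ratio_.round(2))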

  3. A new modeling and simulation method for important statistical performance prediction of single photon avalanche diode detectors

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Xiang, Ping; Xie, Xiaopeng; Huang, Yang

    2016-06-01

    This paper presents a new modeling and simulation method to predict the important statistical performance of single photon avalanche diode (SPAD) detectors, including photon detection efficiency (PDE), dark count rate (DCR) and afterpulsing probability (AP). Three local electric field models are derived for the PDE, DCR and AP calculations, which show analytical dependence of key parameters such as avalanche triggering probability, impact ionization rate and electric field distributions that can be directly obtained from Geiger mode Technology Computer Aided Design (TCAD) simulation. The model calculation results are proven to be in good agreement with the reported experimental data in the open literature, suggesting that the proposed modeling and simulation method is very suitable for the prediction of SPAD statistical performance.

  4. A statistical method for evaluating sampling configurations of spatially variable parameters in environmental site audits

    SciTech Connect

    Molash, E.; McTernan, W.F.

    1995-11-01

    At hazardous waste sites, the existence, number, and areal distribution of buried drums are unknown factors that are critical in defining the extent of contamination and in building decision models for remediation system design. The location and removal of these drums are appropriate first actions in responding to hazardous waste disposal sites that threaten groundwater resources. Magnetometry utilizes the earth's natural magnetic field as an inducing element for the detection of ferrometallic objects in the subsurface. The contrast between most earth materials, which tend to have very low magnetic susceptibilities, and steel drums, which have very high magnetic susceptibilities, provides the basis for detecting and locating these objects using magnetic field attributes. The results of the geostatistical analysis of the magnetometry surveys over the Western Processing Superfund site showed that the total field intensity and vertical gradient data displayed distinctly different spatial statistical properties. The total magnetic intensity data had an experimental semivariogram that was easily defined mathematically using the 3.05 meter (10 foot) triangular sampling grid. The subsequent kriged estimates for the total magnetic field intensity had very good statistical correlations with the original sampled data, with some minor distortions of the probability distributions of the results; these minor distortions were concluded to be intrinsic products of the mathematics of the kriging process. In contrast, the experimental semivariogram for the vertical magnetic gradient indicated that more than 69% of the correlation structure existed at spatial wavelengths less than that of the 10 foot triangular sampling grid, indicating much higher spatial frequencies. This implies that field measurements of the vertical magnetic gradient, with the intent of locating buried steel drums, should be designed with sampling grid spacings considerably less than 10 feet.
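
    A minimal sketch of the experimental semivariogram computation underlying such an analysis, here for a single transect of hypothetical total-field readings on a regular 3.05 m spacing (the study itself used a 2-D triangular sampling grid):

        import numpy as np

        rng = np.random.default_rng(4)

        # Hypothetical total-field magnetometry readings (nT) along one transect,
        # sampled every 3.05 m (10 ft).
        spacing = 3.05
        z = 50000 + np.cumsum(rng.normal(0, 5, 60))    # spatially correlated toy signal

        def experimental_semivariogram(z, spacing, max_lag=10):
            """gamma(h) = (1 / 2N(h)) * sum of squared differences at lag h."""
            lags, gammas = [], []
            for k in range(1, max_lag + 1):
                diffs = z[k:] - z[:-k]
                lags.append(k * spacing)
                gammas.append(0.5 * np.mean(diffs ** 2))
            return np.array(lags), np.array(gammas)

        lags, gammas = experimental_semivariogram(z, spacing)
        for h, g in zip(lags, gammas):
            print(f"lag {h:6.2f} m   gamma = {g:8.1f} nT^2")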

  5. Methods for obtaining 3D training images for multiple-point statistics simulations: a comparative study

    NASA Astrophysics Data System (ADS)

    Jha, S. K.; Comunian, A.; Mariethoz, G.; Kelly, B. F.

    2013-12-01

    In recent years, multiple-point statistics (MPS) has been used in several studies for characterizing facies heterogeneity in geological formations. MPS uses a conceptual representation of the expected facies distribution, called a training image (TI), to generate patterns of facies heterogeneity. In two-dimensional (2D) simulations the TI can be a hand-drawn image, an analogue outcrop image, or derived from geological reconstructions using a combination of geological analogues and geophysical data. However, obtaining suitable TIs in three dimensions (3D) from geological analogues or geophysical data is harder and has limited the use of MPS for simulating facies heterogeneity in 3D. There have been attempts to generate 3D training images using object-based simulation (OBS); however, determining suitable values for the large number of parameters required by OBS is often challenging. In this study, we compare two approaches for generating three-dimensional training images to model a valley-filling sequence deposited by meandering rivers. The first approach is based on deriving statistical information from two-dimensional TIs: the 3D domain is simulated with a sequence of 2D MPS simulation steps, performed along different directions on slices of the 3D domain, and at each 2D simulation step the facies simulated at the previous steps that lie on the current 2D slice are used as conditioning data. The second approach uses hand-drawn two-dimensional TIs and produces complex patterns resembling the geological structures by applying rotation and affinity transformations in the facies simulation. The two techniques are compared using transition probabilities, facies proportions, and connectivity metrics. In the presentation we discuss the benefits of each approach for generating three-dimensional facies models.
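
    A hedged sketch of two of the comparison metrics mentioned, facies proportions and vertical transition probabilities, computed on a hypothetical 3-D facies array; the connectivity metrics used in the study are not reproduced here.

        import numpy as np

        rng = np.random.default_rng(5)
        n_facies = 3
        facies = rng.integers(0, n_facies, size=(30, 30, 20))   # toy 3-D facies model

        # Facies proportions
        props = np.bincount(facies.ravel(), minlength=n_facies) / facies.size
        print("facies proportions:", props.round(3))

        # Vertical (downward) transition probability matrix T[i, j] =
        # P(facies j immediately below facies i)
        upper, lower = facies[:, :, :-1].ravel(), facies[:, :, 1:].ravel()
        T = np.zeros((n_facies, n_facies))
        np.add.at(T, (upper, lower), 1)
        T /= T.sum(axis=1, keepdims=True)
        print("vertical transition probabilities:\n", T.round(3))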

  6. Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

    NASA Astrophysics Data System (ADS)

    McCray, Wilmon Wil L., Jr.

    The research was prompted by the need to assess the process improvement, quality management, and analytical techniques taught to students in undergraduate and graduate systems engineering and computing science (e.g., software engineering, computer science, and information technology) degree programs at U.S. colleges and universities that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods, and how they relate to process performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI) models as process improvement frameworks for improving business process performance. High-maturity process areas in the CMMI model imply the use of analytical, statistical, and quantitative management techniques and of process-performance modeling to identify and eliminate sources of variation, continually improve process performance, reduce cost, and predict future outcomes. The study identifies and provides a detailed discussion of the gap-analysis findings on the process improvement and quantitative analysis techniques taught in U.S. systems engineering and computing science degree programs, the gaps that exist in the literature, and a comparison analysis that identifies the gaps between the SEI's "healthy ingredients" of a process performance model and the courses taught in U.S. university degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization

  7. Statistics Clinic

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James

    2014-01-01

    Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.

  8. Cosmic statistics of statistics

    NASA Astrophysics Data System (ADS)

    Szapudi, István; Colombi, Stéphane; Bernardeau, Francis

    1999-12-01

    The errors on statistics measured in finite galaxy catalogues are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly non-linear to weakly non-linear scales. For non-linear functions of unbiased estimators, such as the cumulants, the phenomenon of cosmic bias is identified and computed. Since it is subdued by the cosmic errors in the range of applicability of the theory, correction for it is inconsequential. In addition, the method of Colombi, Szapudi & Szalay concerning sampling effects is generalized, adapting the theory for inhomogeneous galaxy catalogues. While previous work focused on the variance only, the present article calculates the cross-correlations between moments and connected moments as well for a statistically complete description. The final analytic formulae representing the full theory are explicit but somewhat complicated. Therefore we have made available a Fortran program capable of calculating the described quantities numerically (for further details e-mail SC at colombi@iap.fr). An important special case is the evaluation of the errors on the two-point correlation function, for which this should be more accurate than any method put forward previously. This tool will be immensely useful in the future for assessing the precision of measurements from existing catalogues, as well as aiding the design of new galaxy surveys. To illustrate the applicability of the results and to explore the numerical aspects of the theory qualitatively and quantitatively, the errors and cross-correlations are predicted under a wide range of assumptions for the future Sloan Digital Sky Survey. The principal results concerning the cumulants ξ, Q3 and Q4 are that

  9. Development of Flood Forecasting Using Statistical Method in Four River Basins in Terengganu, Malaysia

    NASA Astrophysics Data System (ADS)

    Noor, M. S. F. M.; Sidek, L. M.; Basri, H.; Husni, M. M. M.; Jaafar, A. S.; Kamaluddin, M. H.; Majid, W. H. A. W. A.; Mohammad, A. H.; Osman, S.

    2016-03-01

    One of the critical flood-prone regions in Malaysia is Terengganu, located on the east coast of Peninsular Malaysia. Flooding occurs regularly in Terengganu because of its topography and climate, including the northeast monsoon, and high-intensity rainfall from November to February acts as the main forcing factor for floods. The main objectives of this study are to forecast water stage and to derive the related equations using the least-squares method. Two methods are compared: inclusion of the residual (Method A) and non-inclusion of the residual (Method B). The results show that Method B performed better in forecasting the water stage at the selected case-study basins (Besut, Dungun, Kemaman, Terengganu).
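
    The abstract does not specify the regressors or the exact residual-inclusion scheme of Methods A and B, so the sketch below simply regresses next-day water stage on the current stage and rainfall by least squares, with and without a lagged-residual term, on invented data:

        import numpy as np

        rng = np.random.default_rng(6)
        T = 200
        rain = rng.gamma(2.0, 5.0, T)                       # hypothetical daily rainfall (mm)
        stage = np.zeros(T)                                 # hypothetical water stage (m)
        for t in range(1, T):
            stage[t] = 0.7 * stage[t - 1] + 0.02 * rain[t - 1] + rng.normal(0, 0.1)

        # Design matrix: intercept, current stage, current rainfall
        X = np.column_stack([np.ones(T - 1), stage[:-1], rain[:-1]])
        y = stage[1:]

        # Fit without the residual term: plain least squares
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        pred_b = X @ coef
        resid = y - pred_b

        # Fit with the lagged residual added as an extra regressor
        Xa = np.column_stack([X[1:], resid[:-1]])
        coef_a, *_ = np.linalg.lstsq(Xa, y[1:], rcond=None)
        pred_a = Xa @ coef_a

        rmse = lambda e: np.sqrt(np.mean(e ** 2))
        print("RMSE without residual term:", round(rmse(y[1:] - pred_b[1:]), 3))
        print("RMSE with residual term   :", round(rmse(y[1:] - pred_a), 3))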

  10. 40 CFR 51.354 - Adequate tools and resources.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 2 2013-07-01 2013-07-01 false Adequate tools and resources. 51.354... Requirements § 51.354 Adequate tools and resources. (a) Administrative resources. The program shall maintain the administrative resources necessary to perform all of the program functions including...

  11. 40 CFR 51.354 - Adequate tools and resources.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 2 2014-07-01 2014-07-01 false Adequate tools and resources. 51.354... Requirements § 51.354 Adequate tools and resources. (a) Administrative resources. The program shall maintain the administrative resources necessary to perform all of the program functions including...

  12. 40 CFR 51.354 - Adequate tools and resources.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 2 2012-07-01 2012-07-01 false Adequate tools and resources. 51.354... Requirements § 51.354 Adequate tools and resources. (a) Administrative resources. The program shall maintain the administrative resources necessary to perform all of the program functions including...

  13. 10 CFR 1304.114 - Responsibility for maintaining adequate safeguards.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Responsibility for maintaining adequate safeguards. 1304.114 Section 1304.114 Energy NUCLEAR WASTE TECHNICAL REVIEW BOARD PRIVACY ACT OF 1974 § 1304.114 Responsibility for maintaining adequate safeguards. The Board has the responsibility for maintaining...

  14. 13 CFR 108.200 - Adequate capital for NMVC Companies.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... VENTURE CAPITAL ("NMVC") PROGRAM Qualifications for the NMVC Program Capitalizing A Nmvc Company § 108.200 Adequate capital for NMVC Companies. You must meet the requirements of §§ 108.200-108.230 in order to... 13 Business Credit and Assistance 1 2010-01-01 2010-01-01 false Adequate capital for...

  15. 34 CFR 200.20 - Making adequate yearly progress.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 34 Education 1 2012-07-01 2012-07-01 false Making adequate yearly progress. 200.20 Section 200.20... Basic Programs Operated by Local Educational Agencies Adequate Yearly Progress (ayp) § 200.20 Making... State data system; (vi) Include, as separate factors in determining whether schools are making AYP for...

  16. 34 CFR 200.20 - Making adequate yearly progress.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 34 Education 1 2013-07-01 2013-07-01 false Making adequate yearly progress. 200.20 Section 200.20... Basic Programs Operated by Local Educational Agencies Adequate Yearly Progress (ayp) § 200.20 Making... State data system; (vi) Include, as separate factors in determining whether schools are making AYP for...

  17. 34 CFR 200.20 - Making adequate yearly progress.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Making adequate yearly progress. 200.20 Section 200.20... Basic Programs Operated by Local Educational Agencies Adequate Yearly Progress (ayp) § 200.20 Making... State data system; (vi) Include, as separate factors in determining whether schools are making AYP for...

  18. 34 CFR 200.20 - Making adequate yearly progress.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 34 Education 1 2014-07-01 2014-07-01 false Making adequate yearly progress. 200.20 Section 200.20... Basic Programs Operated by Local Educational Agencies Adequate Yearly Progress (ayp) § 200.20 Making... State data system; (vi) Include, as separate factors in determining whether schools are making AYP for...

  19. 34 CFR 200.20 - Making adequate yearly progress.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 34 Education 1 2011-07-01 2011-07-01 false Making adequate yearly progress. 200.20 Section 200.20... Basic Programs Operated by Local Educational Agencies Adequate Yearly Progress (ayp) § 200.20 Making... State data system; (vi) Include, as separate factors in determining whether schools are making AYP for...

  20. 40 CFR 716.25 - Adequate file search.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 31 2011-07-01 2011-07-01 false Adequate file search. 716.25 Section... ACT HEALTH AND SAFETY DATA REPORTING General Provisions § 716.25 Adequate file search. The scope of a person's responsibility to search records is limited to records in the location(s) where the...

  1. 40 CFR 716.25 - Adequate file search.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 32 2013-07-01 2013-07-01 false Adequate file search. 716.25 Section... ACT HEALTH AND SAFETY DATA REPORTING General Provisions § 716.25 Adequate file search. The scope of a person's responsibility to search records is limited to records in the location(s) where the...

  2. 40 CFR 716.25 - Adequate file search.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 31 2014-07-01 2014-07-01 false Adequate file search. 716.25 Section... ACT HEALTH AND SAFETY DATA REPORTING General Provisions § 716.25 Adequate file search. The scope of a person's responsibility to search records is limited to records in the location(s) where the...

  3. 40 CFR 716.25 - Adequate file search.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 32 2012-07-01 2012-07-01 false Adequate file search. 716.25 Section... ACT HEALTH AND SAFETY DATA REPORTING General Provisions § 716.25 Adequate file search. The scope of a person's responsibility to search records is limited to records in the location(s) where the...

  4. 40 CFR 716.25 - Adequate file search.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 30 2010-07-01 2010-07-01 false Adequate file search. 716.25 Section... ACT HEALTH AND SAFETY DATA REPORTING General Provisions § 716.25 Adequate file search. The scope of a person's responsibility to search records is limited to records in the location(s) where the...

  5. 9 CFR 305.3 - Sanitation and adequate facilities.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 9 Animals and Animal Products 2 2010-01-01 2010-01-01 false Sanitation and adequate facilities. 305.3 Section 305.3 Animals and Animal Products FOOD SAFETY AND INSPECTION SERVICE, DEPARTMENT OF... OF VIOLATION § 305.3 Sanitation and adequate facilities. Inspection shall not be inaugurated if...

  6. 9 CFR 305.3 - Sanitation and adequate facilities.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 9 Animals and Animal Products 2 2011-01-01 2011-01-01 false Sanitation and adequate facilities. 305.3 Section 305.3 Animals and Animal Products FOOD SAFETY AND INSPECTION SERVICE, DEPARTMENT OF... OF VIOLATION § 305.3 Sanitation and adequate facilities. Inspection shall not be inaugurated if...

  7. 40 CFR 51.354 - Adequate tools and resources.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 2 2011-07-01 2011-07-01 false Adequate tools and resources. 51.354... Requirements § 51.354 Adequate tools and resources. (a) Administrative resources. The program shall maintain the administrative resources necessary to perform all of the program functions including...

  8. 40 CFR 51.354 - Adequate tools and resources.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 2 2010-07-01 2010-07-01 false Adequate tools and resources. 51.354... Requirements § 51.354 Adequate tools and resources. (a) Administrative resources. The program shall maintain the administrative resources necessary to perform all of the program functions including...

  9. 10 CFR 1304.114 - Responsibility for maintaining adequate safeguards.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 4 2011-01-01 2011-01-01 false Responsibility for maintaining adequate safeguards. 1304.114 Section 1304.114 Energy NUCLEAR WASTE TECHNICAL REVIEW BOARD PRIVACY ACT OF 1974 § 1304.114 Responsibility for maintaining adequate safeguards. The Board has the responsibility for maintaining...

  10. 10 CFR 1304.114 - Responsibility for maintaining adequate safeguards.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 10 Energy 4 2014-01-01 2014-01-01 false Responsibility for maintaining adequate safeguards. 1304.114 Section 1304.114 Energy NUCLEAR WASTE TECHNICAL REVIEW BOARD PRIVACY ACT OF 1974 § 1304.114 Responsibility for maintaining adequate safeguards. The Board has the responsibility for maintaining...

  11. 10 CFR 1304.114 - Responsibility for maintaining adequate safeguards.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 10 Energy 4 2013-01-01 2013-01-01 false Responsibility for maintaining adequate safeguards. 1304.114 Section 1304.114 Energy NUCLEAR WASTE TECHNICAL REVIEW BOARD PRIVACY ACT OF 1974 § 1304.114 Responsibility for maintaining adequate safeguards. The Board has the responsibility for maintaining...

  12. 10 CFR 1304.114 - Responsibility for maintaining adequate safeguards.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 4 2012-01-01 2012-01-01 false Responsibility for maintaining adequate safeguards. 1304.114 Section 1304.114 Energy NUCLEAR WASTE TECHNICAL REVIEW BOARD PRIVACY ACT OF 1974 § 1304.114 Responsibility for maintaining adequate safeguards. The Board has the responsibility for maintaining...

  13. 13 CFR 107.200 - Adequate capital for Licensees.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 13 Business Credit and Assistance 1 2010-01-01 2010-01-01 false Adequate capital for Licensees. 107.200 Section 107.200 Business Credit and Assistance SMALL BUSINESS ADMINISTRATION SMALL BUSINESS INVESTMENT COMPANIES Qualifying for an SBIC License Capitalizing An Sbic § 107.200 Adequate capital...

  14. 21 CFR 201.5 - Drugs; adequate directions for use.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 4 2010-04-01 2010-04-01 false Drugs; adequate directions for use. 201.5 Section 201.5 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) DRUGS: GENERAL LABELING General Labeling Provisions § 201.5 Drugs; adequate directions for use....

  15. 21 CFR 201.5 - Drugs; adequate directions for use.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 4 2011-04-01 2011-04-01 false Drugs; adequate directions for use. 201.5 Section 201.5 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) DRUGS: GENERAL LABELING General Labeling Provisions § 201.5 Drugs; adequate directions for use....

  16. 7 CFR 4290.200 - Adequate capital for RBICs.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 15 2010-01-01 2010-01-01 false Adequate capital for RBICs. 4290.200 Section 4290.200 Agriculture Regulations of the Department of Agriculture (Continued) RURAL BUSINESS-COOPERATIVE SERVICE AND... Qualifications for the RBIC Program Capitalizing A Rbic § 4290.200 Adequate capital for RBICs. You must meet...

  17. "Something Adequate"? In Memoriam Seamus Heaney, Sister Quinlan, Nirbhaya

    ERIC Educational Resources Information Center

    Parker, Jan

    2014-01-01

    Seamus Heaney talked of poetry's responsibility to represent the "bloody miracle", the "terrible beauty" of atrocity; to create "something adequate". This article asks, what is adequate to the burning and eating of a nun and the murderous gang rape and evisceration of a medical student? It considers Njabulo…

  18. Estimating Small-area Populations by Age and Sex Using Spatial Interpolation and Statistical Inference Methods

    SciTech Connect

    Qai, Qiang; Rushton, Gerald; Bhaduri, Budhendra L; Bright, Eddie A; Coleman, Phil R

    2006-01-01

    The objective of this research is to compute population estimates by age and sex for small areas whose boundaries are different from those for which the population counts were made. In our approach, population surfaces and age-sex proportion surfaces are separately estimated. Age-sex population estimates for small areas and their confidence intervals are then computed using a binomial model with the two surfaces as inputs. The approach was implemented for Iowa using a 90 m resolution population grid (LandScan USA) and U.S. Census 2000 population. Three spatial interpolation methods, the areal weighting (AW) method, the ordinary kriging (OK) method, and a modification of the pycnophylactic method, were used on Census Tract populations to estimate the age-sex proportion surfaces. To verify the model, age-sex population estimates were computed for paired Block Groups that straddled Census Tracts and therefore were spatially misaligned with them. The pycnophylactic method and the OK method were more accurate than the AW method. The approach is general and can be used to estimate subgroup-count types of variables from information in existing administrative areas for custom-defined areas used as the spatial basis of support in other applications.
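
    A minimal sketch of the binomial step described above, combining a small-area population total from a population surface with an interpolated age-sex proportion to give an estimate and an approximate confidence interval; the numbers and the normal-approximation interval are illustrative assumptions.

        import numpy as np
        from scipy import stats

        # Hypothetical inputs for one custom-defined small area
        N = 4800          # population total aggregated from the population surface
        p = 0.062         # interpolated proportion of, say, females aged 20-24

        # Binomial model: count ~ Binomial(N, p)
        estimate = N * p
        se = np.sqrt(N * p * (1 - p))

        # Approximate 95% confidence interval (normal approximation)
        z = stats.norm.ppf(0.975)
        lower, upper = estimate - z * se, estimate + z * se
        print(f"estimated count: {estimate:.0f}  (95% CI {lower:.0f} - {upper:.0f})")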

  19. Spoilt for choice: implications of using alternative methods of costing hospital episode statistics.

    PubMed

    Geue, Claudia; Lewsey, James; Lorgelly, Paula; Govan, Lindsay; Hart, Carole; Briggs, Andrew

    2012-10-01

    In the absence of a 'gold standard' to estimate the economic burden of disease, a decision about the most appropriate costing method is required. Researchers have employed various methods to cost hospital stays, including per diem or diagnosis-related group (DRG)-based costs. Alternative methods differ in data collection and costing methodology. Using data from Scotland as an illustrative example, costing methods are compared, highlighting the wider implications for other countries with a publicly financed healthcare system. Five methods are compared using longitudinal data including baseline survey data (Midspan) linked to acute hospital admissions. Cost variables are derived using two forms of DRG-type costs, costs per diem, costs per episode-using a novel approach that distinguishes between variable and fixed costs and incorporates individual length of stay (LOS), and costs per episode using national average LOS. Cost estimates are generated using generalised linear model regression. Descriptive analysis shows substantial variation between costing methods. Differences found in regression analyses highlight the magnitude of variation in cost estimates for subgroups of the sample population. This paper emphasises that any inference made from econometric modelling of costs, where the marginal effect of explanatory variables is assessed, is substantially influenced by the costing method.
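
    As a hedged illustration of the kind of generalised linear model used for cost estimation (not the authors' specification), the sketch below fits gamma regressions with a log link to simulated episode costs derived under two hypothetical costing methods:

        import numpy as np
        from sklearn.linear_model import GammaRegressor

        rng = np.random.default_rng(7)
        n = 500
        age = rng.uniform(40, 90, n)
        los = rng.poisson(5, n) + 1                       # length of stay (days)

        # Simulated per-episode costs under two hypothetical costing methods:
        # a per-diem cost (scales with LOS) and a DRG-style tariff that does not.
        mu_per_diem = 400.0 * los * np.exp(0.005 * (age - 65))
        mu_drg = 2500.0 * np.exp(0.004 * (age - 65))
        costs = {"per diem": rng.gamma(shape=4, scale=mu_per_diem / 4),
                 "DRG-type": rng.gamma(shape=4, scale=mu_drg / 4)}

        X = np.column_stack([age, los])
        for method, y in costs.items():
            glm = GammaRegressor(alpha=0.0, max_iter=1000).fit(X, y)
            # Coefficients are on the log scale; exp() gives multiplicative effects
            effects = np.exp(glm.coef_)
            print(f"{method:8s} cost ratio per year of age: {effects[0]:.3f}, "
                  f"per extra day of stay: {effects[1]:.3f}")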

  20. Adaptive and robust statistical methods for processing near-field scanning microwave microscopy images.

    PubMed

    Coakley, K J; Imtiaz, A; Wallis, T M; Weber, J C; Berweger, S; Kabos, P

    2015-03-01

    Near-field scanning microwave microscopy offers great potential to facilitate characterization, development and modeling of materials. By acquiring microwave images at multiple frequencies and amplitudes (along with the other modalities) one can study material and device physics at different lateral and depth scales. Images are typically noisy and contaminated by artifacts that can vary from scan line to scan line and planar-like trends due to sample tilt errors. Here, we level images based on an estimate of a smooth 2-d trend determined with a robust implementation of a local regression method. In this robust approach, features and outliers which are not due to the trend are automatically downweighted. We denoise images with the Adaptive Weights Smoothing method. This method smooths out additive noise while preserving edge-like features in images. We demonstrate the feasibility of our methods on topography images and microwave |S11| images. For one challenging test case, we demonstrate that our method outperforms alternative methods from the scanning probe microscopy data analysis software package Gwyddion. Our methods should be useful for massive image data sets where manual selection of landmarks or image subsets by a user is impractical.
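
    A hedged sketch of the leveling idea: robustly estimate and remove a smooth 2-D trend so that sharp features and outliers are downweighted. The sketch uses a quadratic surface with Tukey-biweight iteratively reweighted least squares rather than the local-regression and Adaptive Weights Smoothing implementations used in the paper.

        import numpy as np

        rng = np.random.default_rng(8)
        ny, nx = 64, 64
        yy, xx = np.mgrid[0:ny, 0:nx] / 64.0

        # Toy "topography" image: tilted plane + a sharp feature + noise + outliers
        image = 3.0 * xx + 1.5 * yy + rng.normal(0, 0.05, (ny, nx))
        image[20:25, 30:35] += 2.0                        # feature that must survive leveling
        image[rng.random((ny, nx)) < 0.01] += 5.0         # sparse outlier pixels

        # Quadratic trend surface fitted by iteratively reweighted least squares
        # with Tukey biweight so features and outliers are downweighted.
        A = np.column_stack([np.ones(ny * nx), xx.ravel(), yy.ravel(),
                             xx.ravel() ** 2, yy.ravel() ** 2, (xx * yy).ravel()])
        z = image.ravel()
        w = np.ones_like(z)
        for _ in range(10):
            sw = np.sqrt(w)
            coef, *_ = np.linalg.lstsq(A * sw[:, None], z * sw, rcond=None)
            resid = z - A @ coef
            scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # robust scale
            u = resid / (4.685 * scale)
            w = np.where(np.abs(u) < 1.0, (1.0 - u ** 2) ** 2, 0.0)        # Tukey weights

        leveled = (z - A @ coef).reshape(ny, nx)
        background = np.ones((ny, nx), dtype=bool)
        background[20:25, 30:35] = False
        print("background std before leveling:", round(float(image[background].std()), 3))
        print("background std after leveling :", round(float(leveled[background].std()), 3))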