Sample records for statistical inference econometric

  1. Bayesian Nonparametric Prediction and Statistical Inference

    DTIC Science & Technology

    1989-09-07

    Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds. ...the random model and weighted least squares regression," in Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217 ...likelihood function. On the other hand, H. Jeffreys's theory of hypothesis testing covers the most important situations in which the prior is not diffuse. See

  2. Identification and Inference for Econometric Models

    NASA Astrophysics Data System (ADS)

    Andrews, Donald W. K.; Stock, James H.

    2005-07-01

    This volume contains the papers presented in honor of the lifelong achievements of Thomas J. Rothenberg on the occasion of his retirement. The authors of the chapters include many of the leading econometricians of our day, and the chapters address topics of current research significance in econometric theory. The chapters cover four themes: identification and efficient estimation in econometrics, asymptotic approximations to the distributions of econometric estimators and tests, inference involving potentially nonstationary time series, such as processes that might have a unit autoregressive root, and nonparametric and semiparametric inference. Several of the chapters provide overviews and treatments of basic conceptual issues, while others advance our understanding of the properties of existing econometric procedures and/or propose new ones. Specific topics include identification in nonlinear models, inference with weak instruments, tests for nonstationarity in time series and panel data, generalized empirical likelihood estimation, and the bootstrap.

  3. Much ado about two: reconsidering retransformation and the two-part model in health econometrics.

    PubMed

    Mullahy, J

    1998-06-01

    In health economics applications involving outcomes (y) and covariates (x), it is often the case that the central inferential problems of interest involve E[y|x] and its associated partial effects or elasticities. Many such outcomes have two fundamental statistical properties: y ≥ 0; and the outcome y = 0 is observed with sufficient frequency that the zeros cannot be ignored econometrically. This paper (1) describes circumstances where the standard two-part model with homoskedastic retransformation will fail to provide consistent inferences about important policy parameters; and (2) demonstrates some alternative approaches that are likely to prove helpful in applications.
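
    A minimal sketch of the standard two-part model with Duan's smearing retransformation, the approach whose failure modes Mullahy analyzes; the data, coefficients, and variable names below are synthetic illustrations, not the paper's application.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = sm.add_constant(x)

# Simulate a non-negative outcome with a mass point at zero.
p = 1 / (1 + np.exp(-(-0.5 + 0.8 * x)))        # P(y > 0 | x)
positive = rng.uniform(size=n) < p
y = np.where(positive,
             np.exp(1.0 + 0.5 * x + rng.normal(scale=0.7, size=n)), 0.0)

# Part 1: logit for the probability that y is positive.
part1 = sm.Logit((y > 0).astype(int), X).fit(disp=False)

# Part 2: log-scale regression on the positive observations only.
pos = y > 0
part2 = sm.OLS(np.log(y[pos]), X[pos]).fit()

# Homoskedastic retransformation via Duan's smearing factor:
# E[y | x, y > 0] ~ exp(x'b) * mean(exp(log-scale residuals)).
smear = np.mean(np.exp(part2.resid))
E_y = part1.predict(X) * np.exp(part2.predict(X)) * smear
```

    If the log-scale error variance depends on x, a single smearing factor of this kind is inconsistent for E[y|x], which is the situation the paper warns about.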

  4. Managing heteroscedasticity in general linear models.

    PubMed

    Rosopa, Patrick J; Schaffer, Meline M; Schroeder, Amber N

    2013-09-01

    Heteroscedasticity refers to a violation of the statistical assumption of homoscedasticity, that is, of constant error variance. When the homoscedasticity assumption is violated, Type I error rates can be inflated or statistical power reduced. Because this can adversely affect substantive conclusions, the failure to detect and manage heteroscedasticity could have serious implications for theory, research, and practice. In addition, heteroscedasticity is not uncommon in the behavioral and social sciences. Thus, in the current article, we synthesize extant literature in applied psychology, econometrics, quantitative psychology, and statistics, and we offer recommendations for researchers and practitioners regarding available procedures for detecting heteroscedasticity and mitigating its effects. In addition to discussing the strengths and weaknesses of various procedures and comparing them in terms of existing simulation results, we describe a 3-step data-analytic process for detecting and managing heteroscedasticity: (a) fitting a model based on theory and saving the residuals, (b) analyzing the residuals, and (c) drawing statistical inferences (e.g., hypothesis tests and confidence intervals) from the parameter estimates. We also demonstrate this data-analytic process using an illustrative example. Overall, detecting violations of the homoscedasticity assumption and mitigating their biasing effects can strengthen the validity of inferences from behavioral and social science data.
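
    A minimal sketch of the 3-step process on synthetic data, using the Breusch-Pagan test and HC3 robust standard errors as one concrete choice among the procedures the article surveys.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 10, n)
y = 2 + 0.5 * x + rng.normal(scale=0.3 * x, size=n)  # error variance grows with x
X = sm.add_constant(x)

# Step (a): fit the theoretically motivated model and save residuals.
fit = sm.OLS(y, X).fit()

# Step (b): analyze the residuals for heteroscedasticity.
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(fit.resid, X)
print(f"Breusch-Pagan LM p-value: {lm_pval:.4f}")

# Step (c): draw inferences using heteroscedasticity-consistent (HC3)
# standard errors if the assumption appears violated.
robust = sm.OLS(y, X).fit(cov_type="HC3")
print(robust.summary())
```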

  5. Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables

    NASA Astrophysics Data System (ADS)

    Barnett, Lionel; Barrett, Adam B.; Seth, Anil K.

    2009-12-01

    Granger causality is a statistical notion of causal influence based on prediction via vector autoregression. Developed originally in the field of econometrics, it has since found application in a broader arena, particularly in neuroscience. More recently transfer entropy, an information-theoretic measure of time-directed information transfer between jointly dependent processes, has gained traction in a similarly wide field. While it has been recognized that the two concepts must be related, the exact relationship has until now not been formally described. Here we show that for Gaussian variables, Granger causality and transfer entropy are entirely equivalent, thus bridging autoregressive and information-theoretic approaches to data-driven causal inference.
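
    A minimal numerical illustration of the paper's result: for jointly Gaussian variables, transfer entropy equals half the log-ratio form of Granger causality. The bivariate system and lag order 1 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50_000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):                      # y drives x with coefficient 0.4
    y[t] = 0.6 * y[t - 1] + rng.normal()
    x[t] = 0.5 * x[t - 1] + 0.4 * y[t - 1] + rng.normal()

def resid_var(target, regressors):
    """Residual variance from an OLS fit."""
    beta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return np.var(target - regressors @ beta)

full = np.column_stack([x[:-1], y[:-1]])   # predict x from the past of both
reduced = x[:-1, None]                     # predict x from its own past only
gc = np.log(resid_var(x[1:], reduced) / resid_var(x[1:], full))

# For Gaussian processes, TE(y -> x) = GC / 2 when measured in nats.
print(f"Granger causality F = {gc:.4f}; implied transfer entropy = {gc / 2:.4f}")
```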

  6. The log-periodic-AR(1)-GARCH(1,1) model for financial crashes

    NASA Astrophysics Data System (ADS)

    Gazola, L.; Fernandes, C.; Pizzinga, A.; Riera, R.

    2008-02-01

    This paper intends to meet recent claims for the attainment of more rigorous statistical methodology within the econophysics literature. To this end, we consider an econometric approach to investigate the outcomes of the log-periodic model of price movements, which has been largely used to forecast financial crashes. In order to accomplish reliable statistical inference for unknown parameters, we incorporate an autoregressive dynamic and a conditional heteroskedasticity structure in the error term of the original model, yielding the log-periodic-AR(1)-GARCH(1,1) model. Both the original and the extended models are fitted to financial indices of the U.S. market, namely the S&P500 and the NASDAQ. Our analysis reveals two main points: (i) the log-periodic-AR(1)-GARCH(1,1) model has residuals with better statistical properties and (ii) the estimation of the parameter concerning the time of the financial crash has been improved.
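
    A minimal sketch of the extended model's structure: a log-periodic mean with an AR(1) error whose innovations follow a GARCH(1,1) process. The parameter values below are illustrative, not estimates for the S&P500 or NASDAQ.

```python
import numpy as np

rng = np.random.default_rng(3)
T, tc = 1000, 1050.0                     # sample ends before the critical time tc
t = np.arange(T, dtype=float)

# Log-periodic mean: A + B*(tc - t)^m * (1 + C*cos(omega*log(tc - t) + phi)).
A, B, C, m, omega, phi = 7.0, -0.5, 0.1, 0.5, 6.0, 0.0
mean = A + B * (tc - t) ** m * (1 + C * np.cos(omega * np.log(tc - t) + phi))

# AR(1) error driven by GARCH(1,1) innovations.
ar1, w, alpha, beta = 0.7, 1e-5, 0.08, 0.90
u = np.zeros(T)
eps = np.zeros(T)
h = np.full(T, w / (1 - alpha - beta))   # conditional variance
for i in range(1, T):
    h[i] = w + alpha * eps[i - 1] ** 2 + beta * h[i - 1]
    eps[i] = np.sqrt(h[i]) * rng.normal()
    u[i] = ar1 * u[i - 1] + eps[i]

log_price = mean + u                     # simulated log-price path
```

    In estimation, the mean parameters and the AR/GARCH dynamics would be fitted jointly by maximum likelihood; the simulation above only shows the data-generating structure the paper posits.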

  7. Econometric Assessment of "One Minute" Paper as a Pedagogic Tool

    ERIC Educational Resources Information Center

    Das, Amaresh

    2010-01-01

    This paper presents an econometric test of the one-minute paper used as a tool to manage and assess instruction in a statistics class. One finding is that the one-minute paper, tested using an OLS estimate in a controlled-versus-experimental design framework, is statistically significant and effective in enhancing…

  8. Estimating the Regional Economic Significance of Airports

    DTIC Science & Technology

    1992-09-01

    following three options for estimating induced impacts: the economic base model, an econometric model, and a regional input-output model. One approach to...limitations, however, the economic base model has been widely used for regional economic analysis. A second approach is to develop an econometric model of...analysis is the principal statistical tool used to estimate the economic relationships. Regional econometric models are capable of estimating a single

  9. Scale Mixture Models with Applications to Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

    2003-11-01

    Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.

  10. Hedonic approaches based on spatial econometrics and spatial statistics: application to evaluation of project benefits

    NASA Astrophysics Data System (ADS)

    Tsutsumi, Morito; Seya, Hajime

    2009-12-01

    This study discusses the theoretical foundation of the application of spatial hedonic approaches—the hedonic approach employing spatial econometrics and/or spatial statistics—to benefits evaluation. The study highlights the limitations of the spatial econometrics approach since it uses a spatial weight matrix that is not employed by the spatial statistics approach. Further, the study presents empirical analyses by applying the Spatial Autoregressive Error Model (SAEM), which is based on the spatial econometrics approach, and the Spatial Process Model (SPM), which is based on the spatial statistics approach. SPMs are estimated under both isotropy and anisotropy and applied to different mesh sizes. The empirical analysis reveals that the estimated benefits are quite different, especially between isotropic and anisotropic SPM and between isotropic SPM and SAEM; the estimated benefits are similar for SAEM and anisotropic SPM. The study demonstrates that the mesh size does not affect the estimated amount of benefits. Finally, the study provides a confidence interval for the estimated benefits and raises an issue with regard to benefit evaluation.
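
    A minimal sketch of a spatial autoregressive error model of the kind the spatial econometrics approach uses, written against PySAL's spreg and libpysal packages; the hedonic variables (price, floor area, coordinates) are hypothetical, and API details may differ across package versions.

```python
import numpy as np
from libpysal.weights import KNN
from spreg import ML_Error

rng = np.random.default_rng(4)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))       # property locations
area = rng.uniform(40, 120, size=(n, 1))       # floor area
price = 50 + 1.2 * area + rng.normal(scale=10, size=(n, 1))

# The spatial weight matrix W (here row-standardized k-nearest neighbours)
# is exactly the modelling choice the paper contrasts with the spatial
# statistics approach, which avoids specifying W.
w = KNN.from_array(coords, k=5)
w.transform = "r"

model = ML_Error(price, area, w=w, name_y="price", name_x=["area"])
print(model.summary)
```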

  11. Legitimate Techniques for Improving the R-Square and Related Statistics of a Multiple Regression Model

    DTIC Science & Technology

    1981-01-01

    explanatory variable has been omitted. Ramsey (1974) has developed a rather interesting test for detecting specification errors using estimates of the...Peter. (1979) A Guide to Econometrics, Cambridge, MA: The MIT Press. Ramsey, J.B. (1974), "Classical Model Selection Through Specification Error Tests," in P. Zarembka, Ed., Frontiers in Econometrics, New York: Academic Press. Theil, Henri. (1971), Principles of Econometrics, New York: John Wiley

  12. Informing Ex Ante Event Studies with Macro-Econometric Evidence on the Structural and Policy Impacts of Terrorism.

    PubMed

    Nassios, Jason; Giesecke, James A

    2018-04-01

    Economic consequence analysis is one of many inputs to terrorism contingency planning. Computable general equilibrium (CGE) models are being used more frequently in these analyses, in part because of their capacity to accommodate high levels of event-specific detail. In modeling the potential economic effects of a hypothetical terrorist event, two broad sets of shocks are required: (1) physical impacts on observable variables (e.g., asset damage); (2) behavioral impacts on unobservable variables (e.g., investor uncertainty). Assembling shocks describing the physical impacts of a terrorist incident is relatively straightforward, since estimates are either readily available or plausibly inferred. However, assembling shocks describing behavioral impacts is more difficult. Values for behavioral variables (e.g., required rates of return) are typically inferred or estimated by indirect means. Generally, this has been achieved via reference to extraneous literature or ex ante surveys. This article explores a new method. We elucidate the magnitude of CGE-relevant structural shifts implicit in econometric evidence on terrorist incidents, with a view to informing future ex ante event assessments. Ex post econometric studies of terrorism by Blomberg et al. yield macro-econometric equations that describe the response of observable economic variables (e.g., GDP growth) to terrorist incidents. We use these equations to determine estimates for relevant (unobservable) structural and policy variables impacted by terrorist incidents, using a CGE model of the United States. This allows us to: (i) compare values for these shifts with input assumptions in earlier ex ante CGE studies; and (ii) discuss how future ex ante studies can be informed by our analysis. © 2017 Society for Risk Analysis.

  13. Something old, something new, something borrowed, something blue: a framework for the marriage of health econometrics and cost-effectiveness analysis.

    PubMed

    Hoch, Jeffrey S; Briggs, Andrew H; Willan, Andrew R

    2002-07-01

    Economic evaluation is often seen as a branch of health economics divorced from mainstream econometric techniques. Instead, it is perceived as relying on statistical methods for clinical trials. Furthermore, the statistic of interest in cost-effectiveness analysis, the incremental cost-effectiveness ratio, is not amenable to regression-based methods, hence the traditional reliance on comparing aggregate measures across the arms of a clinical trial. In this paper, we explore the potential for health economists undertaking cost-effectiveness analysis to exploit the plethora of established econometric techniques through the use of the net-benefit framework - a recently suggested reformulation of the cost-effectiveness problem that avoids the reliance on cost-effectiveness ratios and their associated statistical problems. This allows the formulation of the cost-effectiveness problem within a standard regression type framework. We provide an example with empirical data to illustrate how a regression type framework can enhance the net-benefit method. We go on to suggest that practical advantages of the net-benefit regression approach include being able to use established econometric techniques, adjust for imperfect randomisation, and identify important subgroups in order to estimate the marginal cost-effectiveness of an intervention. Copyright 2002 John Wiley & Sons, Ltd.
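
    A minimal sketch of the net-benefit regression idea described above: for a willingness-to-pay threshold lambda, define each person's net benefit as nb = lambda * effect - cost and regress it on a treatment indicator, so the slope estimates incremental net benefit. All data and the lambda value are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 400
treat = rng.integers(0, 2, n)                        # 1 = new intervention
effect = 0.6 + 0.1 * treat + rng.normal(0, 0.2, n)   # e.g. QALYs
cost = 4000 + 1500 * treat + rng.normal(0, 800, n)   # e.g. dollars

lam = 50_000                                         # willingness to pay per QALY
nb = lam * effect - cost                             # individual net benefit

X = sm.add_constant(treat)
fit = sm.OLS(nb, X).fit(cov_type="HC1")              # robust SEs: costs are noisy
print(fit.params[1], fit.conf_int()[1])              # incremental net benefit
```

    Covariates can be added to adjust for imperfect randomisation, and treatment-covariate interactions can identify subgroups, as the paper suggests.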

  14. The need for econometric research in laboratory animal operations.

    PubMed

    Baker, David G; Kearney, Michael T

    2015-06-01

    The scarcity of research funding can affect animal facilities in various ways. These effects can be evaluated by examining the allocation of financial resources in animal facilities, which can be facilitated by the use of mathematical and statistical methods to analyze economic problems, a discipline known as econometrics. The authors applied econometrics to study whether increasing per diem charges had a negative effect on the number of days of animal care purchased by animal users. They surveyed animal numbers and per diem charges at 20 research institutions and found that demand for large animals decreased as per diem charges increased. The authors discuss some of the challenges involved in their study and encourage research institutions to carry out more robust econometric studies of this and other economic questions facing laboratory animal research.

  15. The Child as Econometrician: A Rational Model of Preference Understanding in Children

    PubMed Central

    Lucas, Christopher G.; Griffiths, Thomas L.; Xu, Fei; Fawcett, Christine; Gopnik, Alison; Kushnir, Tamar; Markson, Lori; Hu, Jane

    2014-01-01

    Recent work has shown that young children can learn about preferences by observing the choices and emotional reactions of other people, but there is no unified account of how this learning occurs. We show that a rational model, built on ideas from economics and computer science, explains the behavior of children in several experiments, and offers new predictions as well. First, we demonstrate that when children use statistical information to learn about preferences, their inferences match the predictions of a simple econometric model. Next, we show that this same model can explain children's ability to learn that other people have preferences similar to or different from their own and use that knowledge to reason about the desirability of hidden objects. Finally, we use the model to explain a developmental shift in preference understanding. PMID:24667309

  16. Judging Statistical Models of Individual Decision Making under Risk Using In- and Out-of-Sample Criteria

    PubMed Central

    Drichoutis, Andreas C.; Lusk, Jayson L.

    2014-01-01

    Despite the fact that conceptual models of individual decision making under risk are deterministic, attempts to econometrically estimate risk preferences require some assumption about the stochastic nature of choice. Unfortunately, the consequences of making different assumptions are, at present, unclear. In this paper, we compare three popular error specifications (Fechner, contextual utility, and Luce error) for three different preference functionals (expected utility, rank-dependent utility, and a mixture of those two) using in- and out-of-sample selection criteria. We find drastically different inferences about structural risk preferences across the competing functionals and error specifications. Expected utility theory is least affected by the selection of the error specification. A mixture model combining the two conceptual models assuming contextual utility provides the best fit of the data both in- and out-of-sample. PMID:25029467
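
    A minimal sketch of one of the three error specifications compared here: a Fechner error attached to expected utility, giving choice probability Phi((EU_A - EU_B)/mu). The CRRA utility function and the lottery payoffs are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def crra(x, r):
    """CRRA utility with relative risk aversion r (r != 1)."""
    return x ** (1 - r) / (1 - r)

def prob_choose_A(pay_A, pr_A, pay_B, pr_B, r, mu):
    """P(choose lottery A) under expected utility with Fechner noise scale mu."""
    eu_A = np.dot(pr_A, crra(np.asarray(pay_A, float), r))
    eu_B = np.dot(pr_B, crra(np.asarray(pay_B, float), r))
    return norm.cdf((eu_A - eu_B) / mu)

# Lottery A: $30 for sure; lottery B: 50/50 chance of $10 or $60.
p = prob_choose_A([30.0], [1.0], [10.0, 60.0], [0.5, 0.5], r=0.5, mu=0.1)
print(f"P(choose the safe lottery) = {p:.3f}")
```

    Maximizing the likelihood of observed choices over r and mu recovers the structural risk preferences; the paper compares such fits, under alternative error stories and preference functionals, both in and out of sample.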

  17. Judging statistical models of individual decision making under risk using in- and out-of-sample criteria.

    PubMed

    Drichoutis, Andreas C; Lusk, Jayson L

    2014-01-01

    Despite the fact that conceptual models of individual decision making under risk are deterministic, attempts to econometrically estimate risk preferences require some assumption about the stochastic nature of choice. Unfortunately, the consequences of making different assumptions are, at present, unclear. In this paper, we compare three popular error specifications (Fechner, contextual utility, and Luce error) for three different preference functionals (expected utility, rank-dependent utility, and a mixture of those two) using in- and out-of-sample selection criteria. We find drastically different inferences about structural risk preferences across the competing functionals and error specifications. Expected utility theory is least affected by the selection of the error specification. A mixture model combining the two conceptual models assuming contextual utility provides the best fit of the data both in- and out-of-sample.

  18. The child as econometrician: a rational model of preference understanding in children.

    PubMed

    Lucas, Christopher G; Griffiths, Thomas L; Xu, Fei; Fawcett, Christine; Gopnik, Alison; Kushnir, Tamar; Markson, Lori; Hu, Jane

    2014-01-01

    Recent work has shown that young children can learn about preferences by observing the choices and emotional reactions of other people, but there is no unified account of how this learning occurs. We show that a rational model, built on ideas from economics and computer science, explains the behavior of children in several experiments, and offers new predictions as well. First, we demonstrate that when children use statistical information to learn about preferences, their inferences match the predictions of a simple econometric model. Next, we show that this same model can explain children's ability to learn that other people have preferences similar to or different from their own and use that knowledge to reason about the desirability of hidden objects. Finally, we use the model to explain a developmental shift in preference understanding.

  19. Non-robust dynamic inferences from macroeconometric models: Bifurcation stratification of confidence regions

    NASA Astrophysics Data System (ADS)

    Barnett, William A.; Duzhak, Evgeniya Aleksandrovna

    2008-06-01

    Grandmont [J.M. Grandmont, On endogenous competitive business cycles, Econometrica 53 (1985) 995-1045] found that the parameter space of the most classical dynamic models is stratified into an infinite number of subsets supporting an infinite number of different kinds of dynamics, from monotonic stability at one extreme to chaos at the other extreme, and with many forms of multiperiodic dynamics in between. The econometric implications of Grandmont’s findings are particularly important, if bifurcation boundaries cross the confidence regions surrounding parameter estimates in policy-relevant models. Stratification of a confidence region into bifurcated subsets seriously damages robustness of dynamical inferences. Recently, interest in policy in some circles has moved to New-Keynesian models. As a result, in this paper we explore bifurcation within the class of New-Keynesian models. We develop the econometric theory needed to locate bifurcation boundaries in log-linearized New-Keynesian models with Taylor policy rules or inflation-targeting policy rules. Central results needed in this research are our theorems on the existence and location of Hopf bifurcation boundaries in each of the cases that we consider.

  20. Application of modern tests for stationarity to single-trial MEG data: transferring powerful statistical tools from econometrics to neuroscience.

    PubMed

    Kipiński, Lech; König, Reinhard; Sielużycki, Cezary; Kordecki, Wojciech

    2011-10-01

    Stationarity is a crucial yet rarely questioned assumption in the analysis of time series of magneto- (MEG) or electroencephalography (EEG). One key drawback of the commonly used tests for stationarity of encephalographic time series is the fact that conclusions on stationarity are only indirectly inferred either from the Gaussianity (e.g. the Shapiro-Wilk test or Kolmogorov-Smirnov test) or the randomness of the time series and the absence of trend using very simple time-series models (e.g. the sign and trend tests by Bendat and Piersol). We present a novel approach to the analysis of the stationarity of MEG and EEG time series by applying modern statistical methods which were specifically developed in econometrics to verify the hypothesis that a time series is stationary. We report our findings of the application of three different tests of stationarity--the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for trend or mean stationarity, the Phillips-Perron (PP) test for the presence of a unit root and the White test for homoscedasticity--on an illustrative set of MEG data. For five stimulation sessions, we found, already for short epochs of 250 and 500 ms duration, that although the majority of the studied epochs of single MEG trials were usually mean-stationary (KPSS test and PP test), they were classified as nonstationary due to their heteroscedasticity (White test). We also observed that the presence of external auditory stimulation did not significantly affect the findings regarding the stationarity of the data. We conclude that the combination of these tests allows a refined analysis of the stationarity of MEG and EEG time series.
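
    A minimal sketch applying the same three tests to a single synthetic epoch: KPSS and the White test from statsmodels, and the Phillips-Perron test from the separate arch package.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import kpss
from statsmodels.stats.diagnostic import het_white
from arch.unitroot import PhillipsPerron

rng = np.random.default_rng(6)
n = 500
epoch = rng.normal(size=n) * np.linspace(0.5, 2.0, n)  # heteroscedastic noise

# KPSS: null hypothesis of (level) stationarity.
kpss_stat, kpss_pval, _, _ = kpss(epoch, regression="c", nlags="auto")

# Phillips-Perron: null hypothesis of a unit root.
pp = PhillipsPerron(epoch)

# White test: regress on time, then test the residuals for homoscedasticity.
t = sm.add_constant(np.arange(n, dtype=float))
resid = sm.OLS(epoch, t).fit().resid
white_stat, white_pval, _, _ = het_white(resid, t)

print(f"KPSS p={kpss_pval:.3f}, PP p={pp.pvalue:.3f}, White p={white_pval:.3g}")
```

    As in the paper's MEG epochs, a series can pass the mean-stationarity and unit-root checks while the White test still flags it as nonstationary in variance.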

  1. 77 FR 1454 - Request for Nominations of Members To Serve on the Census Scientific Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-10

    ..., statistical analysis, survey methodology, geospatial analysis, econometrics, cognitive psychology, and... following disciplines: Demography, economics, geography, psychology, statistics, survey methodology, social... technical expertise in such areas as demography, economics, geography, psychology, statistics, survey...

  2. Simulating Quantile Models with Applications to Economics and Management

    NASA Astrophysics Data System (ADS)

    Machado, José A. F.

    2010-05-01

    The massive increase in the speed of computers over the past forty years changed the way that social scientists, applied economists and statisticians approach their trades, and also the very nature of the problems that they can feasibly tackle. The new methods that make intensive use of computer power go by the names of "computer-intensive" or "simulation" methods. My lecture will start with a bird's-eye view of the uses of simulation in Economics and Statistics. Then I will turn to my own research on uses of computer-intensive methods. From a methodological point of view, the question I address is how to infer marginal distributions having estimated a conditional quantile process ("Counterfactual Decomposition of Changes in Wage Distributions Using Quantile Regression," Journal of Applied Econometrics 20, 2005). Illustrations will be provided of the use of the method to perform counterfactual analysis in several different areas of knowledge.
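
    A minimal sketch of the simulation idea behind the cited Machado and Mata (2005) method: estimate the conditional quantile process, then pair random quantile draws with random covariate draws to approximate a marginal distribution. Data and variable names are synthetic.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(7)
n = 2000
educ = rng.uniform(8, 18, n)                       # years of schooling
wage = 1.0 + 0.08 * educ + rng.normal(0, 0.3, n) * (0.5 + 0.05 * educ)
X = sm.add_constant(educ)

# Estimate the conditional quantile process on a grid of quantiles.
taus = np.linspace(0.02, 0.98, 49)
betas = np.array([QuantReg(wage, X).fit(q=t).params for t in taus])

# Simulate from the marginal: random quantile draw + random covariate draw.
m = 5000
q_idx = rng.integers(0, len(taus), m)
x_draw = X[rng.integers(0, n, m)]
marginal = np.einsum("ij,ij->i", x_draw, betas[q_idx])
print(np.percentile(marginal, [10, 50, 90]))       # implied marginal wage dist.
```

    Replacing the covariate draws with a counterfactual covariate distribution yields the counterfactual decomposition described in the cited paper.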

  3. Application of econometric and ecology analysis methods in physics software

    NASA Astrophysics Data System (ADS)

    Han, Min Cheol; Hoff, Gabriela; Kim, Chan Hyeong; Kim, Sung Hun; Grazia Pia, Maria; Ronchieri, Elisabetta; Saracco, Paolo

    2017-10-01

    Some data analysis methods typically used in econometric studies and in ecology have been evaluated and applied in physics software environments. They concern the evolution of observables through objective identification of change points and trends, and measurements of inequality, diversity and evenness across a data set. Within each analysis area, various statistical tests and measures have been examined. This conference paper gives a brief overview of some of these methods.

  4. Factors influencing crime rates: an econometric analysis approach

    NASA Astrophysics Data System (ADS)

    Bothos, John M. A.; Thomopoulos, Stelios C. A.

    2016-05-01

    The scope of the present study is to research the dynamics that determine the commission of crimes in US society. Our study is part of a model we are developing to understand urban crime dynamics and to enhance citizens' "perception of security" in large urban environments. The main targets of our research are to highlight the dependence of crime rates on certain social and economic factors and on basic elements of state anticrime policies. In conducting our research, we use as guides previous relevant studies on crime dependence that have been performed with similar quantitative analyses in mind, regarding the dependence of crime on certain social and economic factors, using statistics and econometric modelling. Our first approach consists of conceptual state space dynamic cross-sectional econometric models that incorporate a feedback loop that describes crime as a feedback process. In order to define the model variables dynamically, we use statistical analysis of crime records and of records about social and economic conditions and policing characteristics (like police force and policing results - crime arrests), to determine their influence as independent variables on crime, the dependent variable of our model. The econometric models we apply in this first approach are an exponential log-linear model and a logit model. In a second approach, we try to study the evolution of violent crime through time in the US, independently as an autonomous social phenomenon, using autoregressive and moving average time-series econometric models. Our findings show that there are certain social and economic characteristics that affect the formation of crime rates in the US, either positively or negatively. Furthermore, the results of our time-series econometric modelling show that violent crime, viewed solely and independently as a social phenomenon, correlates with previous years' crime rates and depends on the conditions of the social and economic environment during previous years.
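
    A minimal sketch of the second approach: treating violent crime as an autonomous time series and fitting an autoregressive moving-average model. The series below is simulated, not the US crime record.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(8)
T = 60                                     # e.g. annual observations
crime = np.zeros(T)
crime[0] = 500
for t in range(1, T):                      # persistence in year-to-year rates
    crime[t] = 500 + 0.8 * (crime[t - 1] - 500) + 20 * rng.normal()

fit = ARIMA(crime, order=(1, 0, 1)).fit()  # ARMA(1,1) with a constant
print(fit.params)                          # the AR term captures the correlation
                                           # with previous years' crime rates
```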

  5. Statistical Cost Estimation in Higher Education: Some Alternatives.

    ERIC Educational Resources Information Center

    Brinkman, Paul T.; Niwa, Shelley

    Recent developments in econometrics that are relevant to the task of estimating costs in higher education are reviewed. The relative effectiveness of alternative statistical procedures for estimating costs is also tested. Statistical cost estimation involves three basic parts: a model, a data set, and an estimation procedure. Actual data are used…

  6. 78 FR 67103 - Request for Nominations of Members To Serve on the Census Scientific Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-08

    ... analysis, survey methodology, geospatial analysis, econometrics, cognitive psychology, and computer science... following disciplines: demography, economics, geography, psychology, statistics, survey methodology, social... expertise in such areas as demography, economics, geography, psychology, statistics, survey methodology...

  7. On the Validity of Econometric Techniques with Weak Instruments--Inference on Returns to Education Using Compulsory School Attendance Laws

    ERIC Educational Resources Information Center

    Cruz, Luiz M.; Moreira, Marcelo J.

    2005-01-01

    The authors evaluate Angrist and Krueger (1991) and Bound, Jaeger, and Baker (1995) by constructing reliable confidence regions around the 2SLS and LIML estimators for returns-to-schooling regardless of the quality of the instruments. The results indicate that the returns-to-schooling were between 8 and 25 percent in 1970 and between 4 and 14…

  8. 76 FR 11195 - Request for Nominations of Members To Serve on the Census Scientific Advisory Committee

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-01

    ..., econometrics, cognitive psychology, and computer science as they pertain to the full range of Census Bureau... technical expertise from the following disciplines: demography, economics, geography, psychology, statistics..., psychology, statistics, survey methodology, social and behavioral sciences, Information Technology, computing...

  9. 77 FR 28607 - Advisory Committee on Organ Transplantation; Request for Nominations for Voting Members

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-15

    ... bioethics, behavioral sciences, economics and statistics, as well as representatives of transplant...; law and bioethics; behavioral sciences; economics and econometrics; organ procurement organizations...

  10. Day of the week effect in paper submission/acceptance/rejection to/in/by peer review journals. II. An ARCH econometric-like modeling

    NASA Astrophysics Data System (ADS)

    Ausloos, Marcel; Nedic, Olgica; Dekanski, Aleksandar; Mrowinski, Maciej J.; Fronczak, Piotr; Fronczak, Agata

    2017-02-01

    This paper aims at providing a statistical model for the preferred behavior of authors submitting a paper to a scientific journal. The electronic submission of (about 600) papers to the Journal of the Serbian Chemical Society has been recorded for every day from Jan. 01, 2013 till Dec. 31, 2014, together with each paper's fate (acceptance or rejection). Seasonal effects and editor roles (through desk rejection and subfield editors) are examined. An ARCH-like econometric model is derived stressing the main determinants of the favorite day-of-week process.

  11. Municipal water consumption forecast accuracy

    NASA Astrophysics Data System (ADS)

    Fullerton, Thomas M.; Molina, Angel L.

    2010-06-01

    Municipal water consumption planning is an active area of research because of infrastructure construction and maintenance costs, supply constraints, and water quality assurance. In spite of that, relatively few water forecast accuracy assessments have been completed to date, although some internal documentation may exist as part of the proprietary "grey literature." This study utilizes a data set of previously published municipal consumption forecasts to partially fill that gap in the empirical water economics literature. Previously published municipal water econometric forecasts for three public utilities are examined for predictive accuracy against two random walk benchmarks commonly used in regional analyses. Descriptive metrics used to quantify forecast accuracy include root-mean-square error and Theil inequality statistics. Formal statistical assessments are completed using four-pronged error differential regression F tests. Similar to studies for other metropolitan econometric forecasts in areas with similar demographic and labor market characteristics, model predictive performances for the municipal water aggregates in this effort are mixed for each of the municipalities included in the sample. Given the competitiveness of the benchmarks, analysts should employ care when utilizing econometric forecasts of municipal water consumption for planning purposes, comparing them to recent historical observations and trends to ensure reliability. Comparative results using data from other markets, including regions facing differing labor and demographic conditions, would also be helpful.
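
    A minimal sketch of the accuracy comparison described above: root-mean-square error and a Theil U statistic for a forecast against a random-walk benchmark. The series are synthetic stand-ins for consumption data and forecasts.

```python
import numpy as np

rng = np.random.default_rng(9)
actual = 100 + np.cumsum(rng.normal(0.5, 2.0, 40))   # observed consumption
model_fc = actual + rng.normal(0, 1.5, 40)           # econometric forecast
rw_fc = np.concatenate(([actual[0]], actual[:-1]))   # random walk: last value

def rmse(forecast, observed):
    return np.sqrt(np.mean((forecast - observed) ** 2))

# Theil's U2: ratio of model RMSE to random-walk RMSE; values below 1 mean
# the model beats the naive benchmark.
u2 = rmse(model_fc, actual) / rmse(rw_fc, actual)
print(f"model RMSE={rmse(model_fc, actual):.2f}, "
      f"RW RMSE={rmse(rw_fc, actual):.2f}, U2={u2:.2f}")
```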

  12. 78 FR 49276 - Advisory Committee on Organ Transplantation; Request for Nominations for Voting Members

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-13

    ...; behavioral sciences; economics and econometrics; organ procurement organizations; transplant candidates..., non-physician transplant professions, nursing, epidemiology, immunology, law and bioethics, behavioral sciences, economics and statistics, as well as representatives of transplant candidates, transplant...

  13. 75 FR 57807 - Advisory Committee on Organ Transplantation; Request for Nominations for Voting Members

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-22

    ... bioethics; behavioral sciences; economics and econometrics; organ procurement organizations; transplant..., non-physician transplant professions, nursing, epidemiology, immunology, law and bioethics, behavioral sciences, economics and statistics, as well as representatives of transplant candidates, transplant...

  14. Valuing Eastern Visibility: A Field Test of the Contingent Valuation Method (1993)

    EPA Pesticide Factsheets

    The report describes the Eastern visibility survey design in detail, presents the implementation of and data obtained from the surveys, provides summary statistics on the overall response and discusses the econometric techniques employed to value benefits.

  15. ENVIRONMENTAL ECONOMICS FOR WATERSHED RESTORATION

    EPA Science Inventory

    This book overviews non-market valuation, input-output analysis, cost-benefit analysis, and presents case studies from the Mid Atlantic Highland region, with all but the bare minimum econometrics, statistics, and math excluded or relegated to an appendix. It is a non-market valu...

  16. A statistical analysis of the effects of a uniform minimum drinking age

    DOT National Transportation Integrated Search

    1987-04-01

    This report examines the relationship between minimum drinking age (MDA) and highway fatalities during the 1975-1985 period, when 35 states changed their MDAs. An econometric model of fatalities involving the 18-20 year-old driver normalized by...

  17. SPATIAL STATISTICS AND ECONOMETRICS FOR MODELS IN FISHERIES ECONOMICS. (R828012)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  18. A Framework for Restructuring the Military Retirement System

    DTIC Science & Technology

    2013-07-01

    Associate Professor of Economics in the Social Sciences Department at West Point where he teaches econometrics and labor economics. His areas of...others worth considering, but each should be carefully benchmarked against our proposed framework. 25 ENDNOTES 1. Office of the Actuary, Statistical

  19. 47 CFR 1.363 - Introduction of statistical data.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... case of sample surveys, there shall be a clear description of the survey design, including the... evidence in common carrier hearing proceedings, including but not limited to sample surveys, econometric... description of the experimental design shall be set forth, including a specification of the controlled...

  20. 47 CFR 1.363 - Introduction of statistical data.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... case of sample surveys, there shall be a clear description of the survey design, including the... evidence in common carrier hearing proceedings, including but not limited to sample surveys, econometric... description of the experimental design shall be set forth, including a specification of the controlled...

  1. 47 CFR 1.363 - Introduction of statistical data.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... case of sample surveys, there shall be a clear description of the survey design, including the... evidence in common carrier hearing proceedings, including but not limited to sample surveys, econometric... description of the experimental design shall be set forth, including a specification of the controlled...

  2. 47 CFR 1.363 - Introduction of statistical data.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... case of sample surveys, there shall be a clear description of the survey design, including the... evidence in common carrier hearing proceedings, including but not limited to sample surveys, econometric... description of the experimental design shall be set forth, including a specification of the controlled...

  3. 47 CFR 1.363 - Introduction of statistical data.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... case of sample surveys, there shall be a clear description of the survey design, including the... evidence in common carrier hearing proceedings, including but not limited to sample surveys, econometric... description of the experimental design shall be set forth, including a specification of the controlled...

  4. Analytic Methods for Adjusting Subjective Rating Schemes.

    ERIC Educational Resources Information Center

    Cooper, Richard V. L.; Nelson, Gary R.

    Statistical and econometric techniques of correcting for supervisor bias in models of individual performance appraisal were developed, using a variant of the classical linear regression model. Location bias occurs when individual performance is systematically overestimated or underestimated, while scale bias results when raters either exaggerate…

  5. An econometric investigation of the sunspot number record since the year 1700 and its prediction into the 22nd century

    NASA Astrophysics Data System (ADS)

    Travaglini, Guido

    2015-09-01

    Solar activity, as measured by the yearly revisited time series of sunspot numbers (SSN) for the period 1700-2014 (Clette et al., 2014), undergoes in this paper a triple statistical and econometric checkup. The conclusions are that the SSN sequence: (1) is best modeled as a signal that features nonlinearity in mean and variance, long memory, mean reversion, 'threshold' symmetry, and stationarity; (2) is best described as a discrete damped harmonic oscillator which linearly approximates the flux-transport dynamo model; (3) its prediction well into the 22nd century testifies to a substantial fall of the SSN centered around the year 2030. In addition, the first and last Gleissberg cycles show almost the same peak number and height during the period considered, yet the former slightly prevails when measured by means of the estimated smoother. All of these conclusions are achieved by making use of modern tools developed in the field of Financial Econometrics and of two new proposed procedures for signal smoothing and prediction.
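
    One standard discrete damped harmonic oscillator is an AR(2) process with complex characteristic roots, whose modulus gives the damping and whose angle gives the cycle period. A minimal sketch, with a simulated roughly 11-year cycle standing in for the SSN record:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

period, damp = 11.0, 0.95                      # assumed cycle length and damping
theta = 2 * np.pi / period
a1, a2 = 2 * damp * np.cos(theta), -damp ** 2  # implied AR(2) coefficients

rng = np.random.default_rng(10)
T = 300
x = np.zeros(T)
for t in range(2, T):
    x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.normal()

fit = AutoReg(x, lags=2).fit()
b1, b2 = fit.params[1], fit.params[2]          # estimated AR coefficients
mod = np.sqrt(-b2)                             # recovered damping (root modulus)
per = 2 * np.pi / np.arccos(b1 / (2 * mod))    # recovered period (root angle)
print(f"estimated damping = {mod:.3f}, period = {per:.1f}")
```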

  6. Economics of technological change - A joint model for the aircraft and airline industries

    NASA Technical Reports Server (NTRS)

    Kneafsey, J. T.; Taneja, N. K.

    1981-01-01

    The principal focus of this econometric model is on the process of technological change in the U.S. aircraft manufacturing and airline industries. The problem of predicting the rate of introduction of current technology aircraft into an airline's fleet during the period of research, development, and construction for new technology aircraft arises in planning aeronautical research investments. The approach in this model is a statistical one. It attempts to identify major factors that influence transport aircraft manufacturers and airlines, and to correlate them with the patterns of delivery of new aircraft to the domestic trunk carriers. The functional form of the model has been derived from several earlier econometric models on the economics of innovation, acquisition, and technological change.

  7. Schools and Labor Market Outcomes. EQW Working Papers WP33.

    ERIC Educational Resources Information Center

    Crawford, David L.; And Others

    The relationship between school characteristics and labor market outcomes was examined through a literature review and an econometric analysis of the effects of various characteristics of the schooling experience on students' labor market performance after high school. Data from the National Center on Education Statistics' longitudinal survey of…

  8. Interpreting Bivariate Regression Coefficients: Going beyond the Average

    ERIC Educational Resources Information Center

    Halcoussis, Dennis; Phillips, G. Michael

    2010-01-01

    Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
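
    A minimal sketch of the note's idea: several familiar averages fall out of intercept-only regressions. The numbers are arbitrary.

```python
import numpy as np
import statsmodels.api as sm

y = np.array([2.0, 4.0, 8.0, 16.0])
w = np.array([1.0, 2.0, 3.0, 4.0])
ones = np.ones_like(y)

arith = sm.OLS(y, ones).fit().params[0]                 # arithmetic mean: 7.5
weighted = sm.WLS(y, ones, weights=w).fit().params[0]   # weighted mean: 9.8
geom = np.exp(sm.OLS(np.log(y), ones).fit().params[0])  # geometric mean: ~5.66
harm = 1 / sm.OLS(1 / y, ones).fit().params[0]          # harmonic mean: ~4.27

print(arith, weighted, geom, harm)
```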

  9. Impact of Education on the Income of Different Social Groups

    ERIC Educational Resources Information Center

    Yue, Changjun; Liu, Yanping

    2007-01-01

    This study investigates, statistically and econometrically, the income level, income inequality, education inequality, and the relationship between education and income of different social groups, on the basis of the Chinese Urban Household Survey conducted in 2005, the Gini coefficient and the quartile regression method. Research findings…

  10. Bureau of Labor Statistics Employment Projections: Detailed Analysis of Selected Occupations and Industries. Report to the Honorable Berkley Bedell, United States House of Representatives.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC.

    To compile its projections of future employment levels, the Bureau of Labor Statistics (BLS) combines the following five interlinked models in a six-step process: a labor force model, an econometric model of the U.S. economy, an industry activity model, an industry labor demand model, and an occupational labor demand model. The BLS was asked to…

  11. Using directed information for influence discovery in interconnected dynamical systems

    NASA Astrophysics Data System (ADS)

    Rao, Arvind; Hero, Alfred O.; States, David J.; Engel, James Douglas

    2008-08-01

    Structure discovery in non-linear dynamical systems is an important and challenging problem that arises in various applications such as computational neuroscience, econometrics, and biological network discovery. Each of these systems has multiple interacting variables, and the key problem is the inference of the underlying structure of the system (which variables are connected to which others) based on the output observations (such as multiple time trajectories of the variables). Since such applications demand the inference of directed relationships among variables in these non-linear systems, current methods that have a linear assumption on structure or yield undirected variable dependencies are insufficient. Hence, in this work, we present a methodology for structure discovery using an information-theoretic metric called directed time information (DTI). Using both synthetic dynamical systems as well as true biological datasets (kidney development and T-cell data), we demonstrate the utility of DTI in such problems.

  12. Statistical Inference at Work: Statistical Process Control as an Example

    ERIC Educational Resources Information Center

    Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia

    2008-01-01

    To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…

  13. Regression Models of Quarterly Overhead Costs for Six Government Aerospace Contractors.

    DTIC Science & Technology

    1986-03-01

    Testing for Serial Correlation After Least Squares Regression, Econometrica, Vol. 36, No. 1, pp. 133-150, January 1968. Intriligator, M.D., Econometric ...to be superior. These two estimators are both two-stage estimators that are calculated utilizing Wallis's test statistic for fourth-order ...utilizing Wallis's test statistic for fourth-order autocorrelation.

  14. Econometric models for predicting confusion crop ratios

    NASA Technical Reports Server (NTRS)

    Umberger, D. E.; Proctor, M. H.; Clark, J. E.; Eisgruber, L. M.; Braschler, C. B. (Principal Investigator)

    1979-01-01

    Results for both the United States and Canada show that econometric models can provide estimates of confusion crop ratios that are more accurate than historical ratios. Whether these models can support the LACIE 90/90 accuracy criterion is uncertain. In the United States, experimenting with additional model formulations could provide improved models in some CRDs, particularly in winter wheat. Improved models may also be possible for the Canadian CDs. The more aggregative province/state models outperformed individual CD/CRD models. This result was expected partly because acreage statistics are based on sampling procedures, and the sampling precision declines from the province/state to the CD/CRD level. Declining sampling precision and the need to substitute province/state data for the CD/CRD data introduced measurement error into the CD/CRD models.

  15. Treatment effects model for assessing disease management: measuring outcomes and strengthening program management.

    PubMed

    Wendel, Jeanne; Dumitras, Diana

    2005-06-01

    This paper describes an analytical methodology for obtaining statistically unbiased outcome estimates for programs in which participation decisions may be correlated with variables that impact outcomes. This methodology is particularly useful for intraorganizational program evaluations conducted for business purposes. In this situation, data is likely to be available for a population of managed care members who are eligible to participate in a disease management (DM) program, with some electing to participate while others eschew the opportunity. The most pragmatic analytical strategy for in-house evaluation of such programs is likely to be the pre-intervention/post-intervention design in which the control group consists of people who were invited to participate in the DM program, but declined the invitation. Regression estimates of program impacts may be statistically biased if factors that impact participation decisions are correlated with outcomes measures. This paper describes an econometric procedure, the Treatment Effects model, developed to produce statistically unbiased estimates of program impacts in this type of situation. Two equations are estimated to (a) estimate the impacts of patient characteristics on decisions to participate in the program, and then (b) use this information to produce a statistically unbiased estimate of the impact of program participation on outcomes. This methodology is well-established in economics and econometrics, but has not been widely applied in the DM outcomes measurement literature; hence, this paper focuses on one illustrative application.
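
    A minimal two-step sketch of a treatment-effects selection model of the kind described: (a) a probit for the participation decision, (b) an outcome regression augmented with the implied hazard (inverse Mills ratio) term. The variables and coefficients are synthetic illustrations, not the paper's data.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(11)
n = 3000
severity = rng.normal(size=n)                  # observed patient characteristic
u = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], n)  # correlated errors

# Participation depends on severity; the outcome depends on severity and on
# program participation, with selection on unobservables (error corr = 0.5).
d = (0.4 * severity + u[:, 0] > 0).astype(int)
y = 10.0 - 1.0 * severity - 2.0 * d + u[:, 1]

Z = sm.add_constant(severity)
gamma = sm.Probit(d, Z).fit(disp=False).params # step (a): participation probit

zg = Z @ gamma                                 # selection-correction term
lam = np.where(d == 1, norm.pdf(zg) / norm.cdf(zg),
               -norm.pdf(zg) / (1 - norm.cdf(zg)))

X = np.column_stack([np.ones(n), severity, d, lam])
fit = sm.OLS(y, X).fit()                       # step (b): corrected outcome eq.
print(fit.params)   # the coefficient on d is the corrected program effect
```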

  16. Synthetic Indicators of Quality of Life in Europe

    ERIC Educational Resources Information Center

    Somarriba, Noelia; Pena, Bernardo

    2009-01-01

    For more than three decades now, sociologists, politicians and economists have used a wide range of statistical and econometric techniques to analyse and measure the quality of life of individuals with the aim of obtaining useful instruments for social, political and economic decision making. The aim of this paper is to analyse the advantages and…

  17. Gender and Migration Background in Intergenerational Educational Mobility

    ERIC Educational Resources Information Center

    Schneebaum, Alyssa; Rumplmaier, Bernhard; Altzinger, Wilfried

    2016-01-01

    We employ 2011 European Union Statistics on Income and Living Conditions survey data for Austria to perform uni- and multivariate econometric analyses to study the role of gender and migration background (MB) in intergenerational educational mobility. We find that there is more persistence in the educational attainment of girls relative to their…

  18. College Choice in America.

    ERIC Educational Resources Information Center

    Manski, Charles F.; And Others

    The processes of choosing a college and being accepted by a college are analyzed, based on data on nearly 23,000 seniors from more than 1,300 high schools from the National Longitudinal Study of the Class of 1972. Econometric modeling and descriptive statistics are provided on: student behavior in selecting a college, choosing school/nonschool…

  19. Estimated Effects of Retirement Revision on Retention of Navy Tactical Pilots.

    DTIC Science & Technology

    1986-12-01

    detailed explanation of the procedure and proofs can be found in Hanushek and Jackson [Ref. 44]. ... VI. RESULTS AND ANALYSIS A. DESCRIPTIVE ...Introduction to Econometrics, pp. 242-243, Prentice-Hall, 1978. 44. Hanushek, Eric and Jackson, John, Statistical Methods for Social Scientists, p. 188

  20. A Diagrammatic Exposition of Regression and Instrumental Variables for the Beginning Student

    ERIC Educational Resources Information Center

    Foster, Gigi

    2009-01-01

    Some beginning students of statistics and econometrics have difficulty with traditional algebraic approaches to explaining regression and related techniques. For these students, a simple and intuitive diagrammatic introduction as advocated by Kennedy (2008) may prove a useful framework to support further study. The author presents a series of…

  1. Web-based Learning Environments Guided by Principles of Good Teaching Practice.

    ERIC Educational Resources Information Center

    Chizmar, John F.; Walbert, Mark S.

    1999-01-01

    Describes the preparation and execution of a statistics course, an undergraduate econometrics course, and a microeconomic theory course that all utilize Internet technology. Reviews seven principles of teaching practice in order to demonstrate how to enhance the quality of student learning using Web technologies. Includes reactions by Steve Hurd…

  2. Influences on Labor Market Outcomes of African American College Graduates: A National Study

    ERIC Educational Resources Information Center

    Strayhorn, Terrell L.

    2008-01-01

    Using an expanded econometric model, this study sought to estimate more precisely the net effect of independent variables (i.e., attending an HBCU) on three measures of labor market outcomes for African American college graduates. Findings reveal a statistically significant, albeit moderate, relationship between measures of background, human and…

  3. Pedagogy and the PC: Trends in the AIS Curriculum

    ERIC Educational Resources Information Center

    Badua, Frank

    2008-01-01

    The author investigated the array of course topics in accounting information systems (AIS), as embodied in course syllabi. The author (a) used exploratory data analysis to determine the topics that AIS courses most frequently offered and (b) used descriptive statistics and econometric analysis to trace the diversity of course topics through time,…

  4. Attrition Bias in Panel Data: A Sheep in Wolf's Clothing? A Case Study Based on the Mabel Survey.

    PubMed

    Cheng, Terence C; Trivedi, Pravin K

    2015-09-01

    This paper investigates the nature and consequences of sample attrition in a unique longitudinal survey of medical doctors. We describe the patterns of non-response and examine whether attrition affects the econometric analysis of medical labour market outcomes, using the estimation of physician earnings equations as a case study. We compare the econometric estimates obtained from a number of different modelling strategies, which are as follows: balanced versus unbalanced samples; an attrition model for panel data based on the classic sample selection model; and a recently developed copula-based selection model. Descriptive evidence shows that doctors who work longer hours, have fewer years of experience, are overseas trained and have changed their work location are more likely to drop out. Our analysis suggests that the impact of attrition on inference about the earnings of general practitioners is small. For specialists, there appears to be some evidence for an economically significant bias. Finally, we discuss how the top-up samples in the Medicine in Australia: Balancing Employment and Life survey can be used to address the problem of panel attrition. Copyright © 2015 John Wiley & Sons, Ltd.

  5. Stochastic Calculus and Differential Equations for Physics and Finance

    NASA Astrophysics Data System (ADS)

    McCauley, Joseph L.

    2013-02-01

    1. Random variables and probability distributions; 2. Martingales, Markov, and nonstationarity; 3. Stochastic calculus; 4. Ito processes and Fokker-Planck equations; 5. Selfsimilar Ito processes; 6. Fractional Brownian motion; 7. Kolmogorov's PDEs and Chapman-Kolmogorov; 8. Non-Markov Ito processes; 9. Black-Scholes, martingales, and Feynman-Kac; 10. Stochastic calculus with martingales; 11. Statistical physics and finance, a brief history of both; 12. Introduction to new financial economics; 13. Statistical ensembles and time series analysis; 14. Econometrics; 15. Semimartingales; References; Index.

  6. Statistical and Economic Techniques for Site-specific Nematode Management.

    PubMed

    Liu, Zheng; Griffin, Terry; Kirkpatrick, Terrence L

    2014-03-01

    Recent advances in precision agriculture technologies and spatial statistics allow realistic, site-specific estimation of nematode damage to field crops and provide a platform for the site-specific delivery of nematicides within individual fields. This paper reviews the spatial statistical techniques that model correlations among neighboring observations and develops a spatial economic analysis to determine the potential of site-specific nematicide application. The spatial econometric methodology applied in the context of site-specific crop yield response contributes to closing the gap between data analysis and realistic site-specific nematicide recommendations and helps to provide a practical method of site-specifically controlling nematodes.

  7. Doing a Monty: Who Opened the Door to This Game for Economists?

    ERIC Educational Resources Information Center

    Round, David K.

    2007-01-01

    The Monty Hall three-door, "Let's Make a Deal" game, named after the 1970s television show, is used widely in economics, econometrics, statistics, and game-theory-based teaching, as well as in many other disciplines. Its solutions and underlying assumptions arouse great passion and argument, in both the academic and popular press. Most economists…

  8. National projections of forest and rangeland condition indicators: a supporting technical document for the 1999 RPA assessment.

    Treesearch

    John Hof; Curtis Flather; Tony Baltic; Stephen Davies

    1999-01-01

    The 1999 forest and rangeland condition indicator model is a set of independent econometric production functions for environmental outputs (measured with condition indicators) at the national scale. This report documents the development of the database and the statistical estimation required by this particular production structure with emphasis on two special...

  9. Declining national park visitation: An economic analysis

    Treesearch

    Thomas H. Stevens; Thomas A. More; Marla Markowski-Lindsay

    2014-01-01

    Visitation to the major nature-based national parks has been declining. This paper specifies an econometric model that estimates the relative impact of consumer incomes, travel costs, entry fees and other factors on per capita attendance from 1993 to 2010. Results suggest that entrance fees have had a statistically significant but small impact on per capita attendance...

  10. 78 FR 29258 - Blueberry Promotion, Research and Information Order; Assessment Rate Increase

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-20

    .... \\6\\ The econometric model used statistical methods with time series data to measure how strongly the... program has been over 15 times greater than the costs. At the opposite end of the spectrum in the supply... times greater than the costs. Given the wide range of supply responses considered in the analysis, and...

  11. 78 FR 59775 - Blueberry Promotion, Research and Information Order; Assessment Rate Increase

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-30

    ... demand. The econometric model used statistical methods with time series data to measure how strongly ... been over 15 times greater than the costs. At the opposite end of the spectrum in the supply response, the average BCR was computed to be 5.36, implying that the benefits of the USHBC were over five times ...

  12. Investigation of Statistical Inference Methodologies Through Scale Model Propagation Experiments

    DTIC Science & Technology

    2015-09-30

    statistical inference methodologies for ocean-acoustic problems by investigating and applying statistical methods to data collected from scale-model ... to begin planning experiments for statistical inference applications. ... In the ocean acoustics community over the past two decades ... solutions for waveguide parameters. With the introduction of statistical inference to the field of ocean acoustics came the desire to interpret marginal

  13. Stan: Statistical inference

    NASA Astrophysics Data System (ADS)

    Stan Development Team

    2018-01-01

    Stan facilitates statistical inference at the frontiers of applied statistics and provides both a modeling language for specifying complex statistical models and a library of statistical algorithms for computing inferences with those models. These components are exposed through interfaces in environments such as R, Python, and the command line.

  14. Taxes in a Labor Supply Model with Joint Wage-Hours Determination.

    ERIC Educational Resources Information Center

    Rosen, Harvey S.

    1976-01-01

    Payroll and progressive income taxes play an enormous role in the American fiscal system. The purpose of this study is to present some econometric evidence on the effects of taxes on married women, a group of growing importance in the American labor force. A testable model of labor supply is developed which permits statistical estimation of a…

  15. The Determinants of Academic Performance of under Graduate Students: In the Case of Arba Minch University Chamo Campus

    ERIC Educational Resources Information Center

    Yigermal, Moges Endalamaw

    2017-01-01

    The main objective of the paper is to investigate the determinant factors affecting the academic performance of regular undergraduate students at Arba Minch University (AMU) Chamo Campus. To meet the objective, the Pearson product-moment correlation statistical tool and an econometric data analysis (OLS regression) method were used with the…

  16. The Development of Introductory Statistics Students' Informal Inferential Reasoning and Its Relationship to Formal Inferential Reasoning

    ERIC Educational Resources Information Center

    Jacob, Bridgette L.

    2013-01-01

    The difficulties introductory statistics students have with formal statistical inference are well known in the field of statistics education. "Informal" statistical inference has been studied as a means to introduce inferential reasoning well before and without the formalities of formal statistical inference. This mixed methods study…

  17. Students' Emergent Articulations of Statistical Models and Modeling in Making Informal Statistical Inferences

    ERIC Educational Resources Information Center

    Braham, Hana Manor; Ben-Zvi, Dani

    2017-01-01

    A fundamental aspect of statistical inference is representation of real-world data using statistical models. This article analyzes students' articulations of statistical models and modeling during their first steps in making informal statistical inferences. An integrated modeling approach (IMA) was designed and implemented to help students…

  18. Analysis by the Residual Method for Estimate Market Value of Land on the Areas with Mining Exploitation in Subsoil under Future New Building

    NASA Astrophysics Data System (ADS)

    Gwozdz-Lason, Monika

    2017-12-01

    This paper attempts to answer the following questions: what is the main selling advantage of a plot of land in areas with mining exploitation? Which attributes influence market value the most? And how should the influence of mining in the subsoil beneath a future new building be reflected in the market value of a plot with commercial use? This focus is not accidental, as the paper sets out to prove that the subsoil load-bearing capacity, as directly inferred from the local geotechnical properties under mining exploitation, considerably influences the market value of this type of real estate. The analysis and calculations presented here are part of ongoing development work aimed at suggesting a new technology and procedures for estimating the value of land belonging to the third geotechnical category. The question was examined in both theoretical and empirical terms. The results and final conclusions were derived from the analysed calculations in the residual method, using numerical, statistical and econometric tools. A market analysis yielded a group of subsoil stabilization costs which depend on the interaction of mining operations, subsoil parameters, the type of the contemplated structure, its foundations, the selected stabilization method, and its overall area and shape.

  19. Panel data analysis of cardiotocograph (CTG) data.

    PubMed

    Horio, Hiroyuki; Kikuchi, Hitomi; Ikeda, Tomoaki

    2013-01-01

    Panel data analysis is a statistical method, widely used in econometrics, that deals with two-dimensional panel data collected over time and over individuals. Cardiotocography (CTG), which monitors fetal heart rate (FHR) using Doppler ultrasound and uterine contractions with a strain gauge, is commonly used in the intrapartum treatment of pregnant women. Although the relationship between the FHR waveform pattern and outcomes such as umbilical blood gas data at delivery has long been analyzed, there is no accumulated body of FHR patterns from a large number of cases. As time-series economic fluctuations in econometrics, such as consumption trends, have been studied using panel data consisting of time-series and cross-sectional data, we tried to apply this method to CTG data. Panel data composed of symbolized segments of the FHR pattern can be easily handled, and a perinatologist can get the whole FHR pattern view from the microscopic level of time-series FHR data.
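
    For readers unfamiliar with the panel data machinery this record borrows from econometrics, the sketch below shows the standard within (fixed-effects) estimator on made-up data; it illustrates the general technique only and does not reproduce the authors' CTG analysis:

      import numpy as np
      import pandas as pd

      # Hypothetical long-format panel: one row per (subject, time) pair.
      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "subject": np.repeat(np.arange(50), 20),
          "time": np.tile(np.arange(20), 50),
      })
      df["x"] = rng.normal(size=len(df))
      df["y"] = 0.5 * df["x"] + rng.normal(size=len(df))

      # Within (fixed-effects) estimator: demean y and x inside each subject,
      # then run pooled OLS on the demeaned data.
      demeaned = df[["y", "x"]] - df.groupby("subject")[["y", "x"]].transform("mean")
      beta = np.linalg.lstsq(demeaned[["x"]], demeaned["y"], rcond=None)[0]
      print(beta)  # close to the true slope of 0.5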

  20. Reliability-based econometrics of aerospace structural systems: Design criteria and test options. Ph.D. Thesis - Georgia Inst. of Tech.

    NASA Technical Reports Server (NTRS)

    Thomas, J. M.; Hanagud, S.

    1974-01-01

    The design criteria and test options for aerospace structural reliability were investigated. A decision methodology was developed for selecting a combination of structural tests and structural design factors. The decision method involves the use of Bayesian statistics and statistical decision theory. Procedures are discussed for obtaining and updating data-based probabilistic strength distributions for aerospace structures when test information is available and for obtaining subjective distributions when data are not available. The techniques used in developing the distributions are explained.

  1. An Econometric Model of External Labor Supply to the Establishment Within a Confined Geographic Market.

    ERIC Educational Resources Information Center

    Hines, Robert James

    The study conducted in the Buffalo, New York standard metropolitan statistical area, was undertaken to formulate and test a simple model of labor supply for a local labor market. The principal variables to be examined to determine the external supply function of labor to the establishment are variants of the rate of change of the entry wage and…

  2. An Econometric Model for Estimating IQ Scores and Environmental Influences on the Pattern of IQ Scores Over Time.

    ERIC Educational Resources Information Center

    Kadane, Joseph B.; And Others

    This paper offers a preliminary analysis of the effects of a semi-segregated school system on the IQ's of its students. The basic data consist of IQ scores for fourth, sixth, and eighth grades and associated environmental data obtained from their school records. A statistical model is developed to analyze longitudinal data when both process error…

  3. Econometrics of joint production: another approach. [Petroleum refining and petrochemicals production

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griffin, J.M.

    1977-11-01

    The pseudo data approach to the joint production of petroleum refining and chemicals is described as an alternative that avoids the multicollinearity of time series data and allows a complex technology to be characterized in a statistical price possibility frontier. Intended primarily for long-range analysis, the pseudo data method can be used as a source of elasticity estimates for policy analysis. 19 references.

  4. Dynamics of Markets

    NASA Astrophysics Data System (ADS)

    McCauley, Joseph L.

    2009-09-01

    Preface; 1. Econophysics: why and what; 2. Neo-classical economic theory; 3. Probability and stochastic processes; 4. Introduction to financial economics; 5. Introduction to portfolio selection theory; 6. Scaling, pair correlations, and conditional densities; 7. Statistical ensembles: deducing dynamics from time series; 8. Martingale option pricing; 9. FX market globalization: evolution of the dollar to worldwide reserve currency; 10. Macroeconomics and econometrics: regression models vs. empirically based modeling; 11. Complexity; Index.

  5. The Reasoning behind Informal Statistical Inference

    ERIC Educational Resources Information Center

    Makar, Katie; Bakker, Arthur; Ben-Zvi, Dani

    2011-01-01

    Informal statistical inference (ISI) has been a frequent focus of recent research in statistics education. Considering the role that context plays in developing ISI calls into question the need to be more explicit about the reasoning that underpins ISI. This paper uses educational literature on informal statistical inference and philosophical…

  6. India's pulp and paper industry: Productivity and energy efficiency

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schumacher, Katja

    1999-07-01

    Historical estimates of productivity growth in India's pulp and paper sector vary from indicating an improvement to a decline in the sector's productivity. The variance may be traced to the time period of study, source of data for analysis, and type of indices and econometric specifications used for reporting productivity growth. The authors derive both statistical and econometric estimates of productivity growth for this sector. Their results show that productivity declined over the observed period from 1973-74 to 1993-94 by 1.1% p.a. Using a translog specification, the econometric analysis reveals that technical progress in India's pulp and paper sector has been biased towards the use of energy and material, while it has been capital and labor saving. The decline in productivity was caused largely by the protection afforded by high tariffs on imported paper products and other policies, which allowed inefficient, small plants to enter the market and flourish. Will these trends continue into the future, particularly where energy use is concerned? The authors examine the changes in structure and energy efficiency currently underway in the sector. Their analysis shows that with liberalization of the sector, and tighter environmental controls, the industry is moving towards higher efficiency and productivity. However, the analysis also shows that because these improvements are being hampered by significant financial and other barriers, the industry might have a long way to go.

  7. Bootstrapping Student Understanding of What Is Going on in Econometrics.

    ERIC Educational Resources Information Center

    Kennedy, Peter E.

    2001-01-01

    Explains that econometrics is an intellectual game played by rules based on the sampling distribution concept. Contains explanations for why many students are uncomfortable with econometrics. Encourages instructors to use explain-how-to-bootstrap exercises to promote student understanding. (RLH)

  8. The Standard Model in the history of the Natural Sciences, Econometrics, and the social sciences

    NASA Astrophysics Data System (ADS)

    Fisher, W. P., Jr.

    2010-07-01

    In the late 18th and early 19th centuries, scientists appropriated Newton's laws of motion as a model for the conduct of any other field of investigation that would purport to be a science. This early form of a Standard Model eventually informed the basis of analogies for the mathematical expression of phenomena previously studied qualitatively, such as cohesion, affinity, heat, light, electricity, and magnetism. James Clerk Maxwell is known for his repeated use of a formalized version of this method of analogy in lectures, teaching, and the design of experiments. Economists transferring skills learned in physics made use of the Standard Model, especially after Maxwell demonstrated the value of conceiving it in abstract mathematics instead of as a concrete and literal mechanical analogy. Haavelmo's probability approach in econometrics and R. Fisher's Statistical Methods for Research Workers brought a statistical approach to bear on the Standard Model, quietly reversing the perspective of economics and the social sciences relative to that of physics. Where physicists, and Maxwell in particular, intuited scientific method as imposing stringent demands on the quality and interrelations of data, instruments, and theory in the name of inferential and comparative stability, statistical models and methods disconnected theory from data by removing the instrument as an essential component. New possibilities for reconnecting economics and the social sciences to Maxwell's sense of the method of analogy are found in Rasch's probabilistic models for measurement.

  9. From Data to Bonuses: A Case Study of the Issues Related to Awarding Teachers Pay on the Basis of Their Students' Progress. Working Paper 2008-14

    ERIC Educational Resources Information Center

    McCaffrey, Daniel F.; Han, Bing; Lockwood, J. R.

    2008-01-01

    A key component to the new wave of performance-based pay initiatives is the use of student achievement data to evaluate teacher performance. As greater amounts of student achievement data are being collected, researchers have been developing and applying innovative statistical and econometric models to longitudinal data to develop measures of an…

  10. An Econometric Analysis of the Unemployment Insurance System in a Local Urban Labor Market. Final Report for September 1, 1973--September 30, 1974.

    ERIC Educational Resources Information Center

    Marston, Stephen Tilney

    The study derives a model of the unemployment insurance (UI) system and its relationship to the labor market, estimates it with data from the Detroit Standard Metropolitan Statistical Area, and evaluates its potential use to forecast UI benefit amounts, UI insured unemployment, and UI exhaustions. It further uses the model to analyze policy issues…

  11. Oscillatory dynamics of investment and capacity utilization

    NASA Astrophysics Data System (ADS)

    Greenblatt, R. E.

    2017-01-01

    Capitalist economic systems display a wide variety of oscillatory phenomena whose underlying causes are often not well understood. In this paper, I consider a very simple model of the reciprocal interaction between investment, capacity utilization, and their time derivatives. The model, which gives rise to periodic oscillations, predicts qualitatively the phase relations between these variables. These predictions are observed to be consistent in a statistical sense with econometric data from the US economy.

  12. [Demand for cigarettes and tax increases in El Salvador].

    PubMed

    Ramos-Carbajales, Alejandro; González-Rozada, Martín; Vallarino, Hugo

    2016-10-01

    Analyze short- and long-term elasticities of demand for cigarettes in El Salvador as a tool for supporting recommendations on tax increases to reduce prevalence and consumption through price increases. Demand for cigarettes in El Salvador was analyzed through an econometric time-series model using a database from El Salvador's General Directorate of Internal Taxes (DGII) and the General Directorate of Statistics and Census (DIGESTYC). The analysis period was quarterly: 2000Q1-2012Q4. The usual tests were done to prevent a spurious econometric estimation. It was found that the variables sales volume, real sale prices, and real per capita income exhibited first-order cointegration; this result makes it possible to use an error correction model with short- and long-term elasticity estimates. Only long-term elasticities were found to be statistically significant at the 5% level. Results show a long-term (5-quarter) price elasticity of -0.9287 and an income elasticity of 0.9978. The absolute price elasticity is somewhat high, although it is within the levels estimated in other studies in low per-capita income countries. A tax increase from a base amount of US$1.04 per pack of 20 cigarettes to US$1.66 within three years would reduce demand by 20% to 31% and would increase tax revenues by 9% to 22%.
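
    One common way to obtain such short- and long-run elasticities is the two-step Engle-Granger procedure: a cointegrating regression in log levels, followed by an error correction model in differences. The sketch below uses simulated stand-ins, not the DGII/DIGESTYC data, and the paper's exact specification may differ:

      import numpy as np
      import statsmodels.api as sm
      from statsmodels.tsa.stattools import coint

      # Simulated quarterly series in logs, so slopes read as elasticities.
      rng = np.random.default_rng(1)
      m = np.cumsum(rng.normal(size=200))            # I(1) log income proxy
      p = np.cumsum(rng.normal(size=200))            # I(1) log price proxy
      y = -0.9 * p + 1.0 * m + rng.normal(size=200)  # cointegrated log demand

      # Step 1: long-run (cointegrating) regression; the slopes are the
      # long-run price and income elasticities.
      longrun = sm.OLS(y, sm.add_constant(np.column_stack([p, m]))).fit()
      ect = longrun.resid                            # error-correction term
      print(coint(y, np.column_stack([p, m])))       # Engle-Granger test

      # Step 2: short-run dynamics with the lagged error-correction term.
      dy, dp, dm = np.diff(y), np.diff(p), np.diff(m)
      X = sm.add_constant(np.column_stack([dp, dm, ect[:-1]]))
      shortrun = sm.OLS(dy, X).fit()
      print(longrun.params, shortrun.params)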

  13. A κ-generalized statistical mechanics approach to income analysis

    NASA Astrophysics Data System (ADS)

    Clementi, F.; Gallegati, M.; Kaniadakis, G.

    2009-02-01

    This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
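
    For orientation, the Kaniadakis kappa-exponential underlying this line of work, and the survival function commonly used for the kappa-generalized income distribution, can be sketched as below; the exact parameterization used in the paper is an assumption here, not verified:

      import numpy as np

      def exp_kappa(x, kappa):
          """Kaniadakis kappa-exponential; tends to exp(x) as kappa -> 0."""
          return (np.sqrt(1.0 + kappa ** 2 * x ** 2) + kappa * x) ** (1.0 / kappa)

      def survival_kappa(x, alpha, beta, kappa):
          """Assumed complementary CDF of the kappa-generalized income
          distribution: P(X > x) = exp_kappa(-beta * x**alpha)."""
          return exp_kappa(-beta * x ** alpha, kappa)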

  14. Travel cost demand model based river recreation benefit estimates with on-site and household surveys: Comparative results and a correction procedure

    NASA Astrophysics Data System (ADS)

    Loomis, John

    2003-04-01

    Past recreation studies have noted that on-site or visitor intercept surveys are subject to over-sampling of avid users (i.e., endogenous stratification) and have offered econometric solutions to correct for this. However, past papers do not estimate the empirical magnitude of the bias in benefit estimates with a real data set, nor do they compare the corrected estimates to benefit estimates derived from a population sample. This paper empirically examines the magnitude of the recreation benefits per trip bias by comparing estimates from an on-site river visitor intercept survey to a household survey. The difference in average benefits is quite large, with the on-site visitor survey yielding $24 per day trip, while the household survey yields $9.67 per day trip. A simple econometric correction for endogenous stratification in our count data model lowers the benefit estimate to $9.60 per day trip, a mean value nearly identical and not statistically different from the household survey estimate.
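
    A simple and widely cited correction for endogenous stratification in on-site Poisson samples (usually attributed to Shaw, 1988) is to fit the count model to y - 1, since trips minus one follow an ordinary Poisson under on-site sampling. The sketch below illustrates that correction on simulated data; the paper's own count data correction may differ in detail:

      import numpy as np
      import statsmodels.api as sm

      # Hypothetical on-site sample: observed trips y >= 1, travel cost tc.
      rng = np.random.default_rng(2)
      tc = rng.uniform(5, 50, size=500)
      lam = np.exp(1.5 - 0.04 * tc)
      # Under on-site (endogenously stratified) Poisson sampling,
      # y - 1 ~ Poisson(lam), so drawing Poisson(lam) + 1 simulates it exactly.
      y = rng.poisson(lam) + 1

      # The correction: fit an ordinary Poisson regression to y - 1.
      fit = sm.GLM(y - 1, sm.add_constant(tc), family=sm.families.Poisson()).fit()
      print(fit.params)              # recovers roughly (1.5, -0.04)
      print(-1.0 / fit.params[1])    # consumer surplus per trip, -1/beta_tc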

  15. Children's weight and participation in organized sports.

    PubMed

    Quinto Romani, Annette

    2011-11-01

    Literature dealing with the impact of organized sports on children's weight has been marked by a lack of consensus. A major weakness characterizing most of this research is a lack of proper measurement methods. This paper seeks to fill an important knowledge gap through careful application of econometric methods. Estimations are carried out using data on 1,400 children attending 6th grade in 2008 in the municipality of Aalborg, Denmark. We use standard ordinary least squares (OLS) and class fixed effects to explore the effect of sports participation on body mass index (BMI) as well as underweight, overweight and obesity. Results indicate that participation in organized sports reduced BMI by 2.1%. Likewise it reduced the likelihood of being overweight by 8.2 percentage points and obese by 3.1 percentage points. It is the unique dataset combined with econometric methods that distinguishes our contribution from that of others in the field, thereby offering new insight. Results using statistically sound methods suggest that participation in organized sports has a beneficial effect on children's weight.

  16. Granger causality for state-space models

    NASA Astrophysics Data System (ADS)

    Barnett, Lionel; Seth, Anil K.

    2015-04-01

    Granger causality has long been a prominent method for inferring causal interactions between stochastic variables for a broad range of complex physical systems. However, it has been recognized that a moving average (MA) component in the data presents a serious confound to Granger causal analysis, as routinely performed via autoregressive (AR) modeling. We solve this problem by demonstrating that Granger causality may be calculated simply and efficiently from the parameters of a state-space (SS) model. Since SS models are equivalent to autoregressive moving average models, Granger causality estimated in this fashion is not degraded by the presence of a MA component. This is of particular significance when the data has been filtered, downsampled, observed with noise, or is a subprocess of a higher dimensional process, since all of these operations—commonplace in application domains as diverse as climate science, econometrics, and the neurosciences—induce a MA component. We show how Granger causality, conditional and unconditional, in both time and frequency domains, may be calculated directly from SS model parameters via solution of a discrete algebraic Riccati equation. Numerical simulations demonstrate that Granger causality estimators thus derived have greater statistical power and smaller bias than AR estimators. We also discuss how the SS approach facilitates relaxation of the assumptions of linearity, stationarity, and homoscedasticity underlying current AR methods, thus opening up potentially significant new areas of research in Granger causal analysis.
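
    The paper's contribution, computing Granger causality from state-space parameters via a discrete algebraic Riccati equation, is not reproduced here. For orientation only, the conventional AR-based time-domain estimator that it improves upon can be sketched as follows (zero-mean series assumed; hypothetical helper code):

      import numpy as np

      def gc_time_domain(x, y, p):
          """Conventional AR-based Granger causality from y to x:
          GC = ln(var(restricted residuals) / var(full residuals))."""
          T = len(x)
          target = x[p:]

          def lagmat(*series):
              # Columns are lags 1..p of each series, aligned with target.
              return np.column_stack(
                  [s[p - k - 1:T - k - 1] for s in series for k in range(p)])

          def rss(X):
              coef = np.linalg.lstsq(X, target, rcond=None)[0]
              return np.sum((target - X @ coef) ** 2)

          # Restricted model: past of x only. Full model: past of x and y.
          return np.log(rss(lagmat(x)) / rss(lagmat(x, y)))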

  17. Application of an Entropic Approach to Assessing Systems Integration

    DTIC Science & Technology

    2012-03-01

    two econometric measures of information efficiency – Shannon entropy and Hurst exponent. Shannon entropy (which is explained in Chapter III) can be ... applied to evaluate long-term correlation of time series, while the Hurst exponent can be applied to classify the time series according to the existence ... of a trend. The Hurst exponent is a statistical measure of time series long-range dependence, and its value falls in the interval [0, 1] – a value in
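
    The Hurst exponent named in this record is classically estimated by rescaled-range (R/S) analysis: E[R/S] grows like c * n**H, so H is the slope of log(R/S) against log(n). An illustrative sketch, not the report's code:

      import numpy as np

      def hurst_rs(x, window_sizes=(8, 16, 32, 64, 128)):
          """Estimate the Hurst exponent by rescaled-range (R/S) analysis."""
          rs = []
          for n in window_sizes:
              vals = []
              for start in range(0, len(x) - n + 1, n):
                  w = x[start:start + n]
                  z = np.cumsum(w - w.mean())   # cumulative deviations
                  r = z.max() - z.min()         # range
                  s = w.std(ddof=1)             # standard deviation
                  if s > 0:
                      vals.append(r / s)
              rs.append(np.mean(vals))
          slope, _ = np.polyfit(np.log(window_sizes), np.log(rs), 1)
          return slope

      rng = np.random.default_rng(3)
      print(hurst_rs(rng.normal(size=4096)))   # ~0.5 for white noise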

  18. Design-based Sample and Probability Law-Assumed Sample: Their Role in Scientific Investigation.

    ERIC Educational Resources Information Center

    Ojeda, Mario Miguel; Sahai, Hardeo

    2002-01-01

    Discusses some key statistical concepts in probabilistic and non-probabilistic sampling to provide an overview for understanding the inference process. Suggests a statistical model constituting the basis of statistical inference and provides a brief review of the finite population descriptive inference and a quota sampling inferential theory.…

  19. The Importance of Statistical Modeling in Data Analysis and Inference

    ERIC Educational Resources Information Center

    Rollins, Derrick, Sr.

    2017-01-01

    Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…

  20. State Labor Market Research Study: An Econometric Analysis of the Effects of Labor Subsidies.

    ERIC Educational Resources Information Center

    MacRae, C. Duncan; And Others

    The report describes the construction, application, and theoretical implications of an econometric model depicting the effects of labor subsidies on the supply of workers in the U.S. Three papers deal with the following aspects of constructing the econometric model: (1) examination of equilibrium wages, employment, and earnings of primary and…

  1. What is mLearning and How Can It Be Used to Support Learning and Teaching in Econometrics?

    ERIC Educational Resources Information Center

    Morales, Lucia

    2013-01-01

    The aim of this case study was to analyze the integration of mobile learning technologies in a postgraduate course in Finance (MSc in Finance) at Dublin Institute of Technology, where econometrics is an important course component. Previous experience with students undertaking econometrics modules supported this analysis, where the researcher…

  2. Inferring Demographic History Using Two-Locus Statistics.

    PubMed

    Ragsdale, Aaron P; Gutenkunst, Ryan N

    2017-06-01

    Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.

  3. Teaching Statistical Inference for Causal Effects in Experiments and Observational Studies

    ERIC Educational Resources Information Center

    Rubin, Donald B.

    2004-01-01

    Inference for causal effects is a critical activity in many branches of science and public policy. The field of statistics is the one field most suited to address such problems, whether from designed experiments or observational studies. Consequently, it is arguably essential that departments of statistics teach courses in causal inference to both…

  4. Multi-Year Revenue and Expenditure Forecasting for Small Municipal Governments.

    DTIC Science & Technology

    1981-03-01

    Keywords: management audit, econometric revenue forecast, gap and impact analysis, deterministic expenditure forecast, municipal forecasting, municipal budget. ... together with a multi-year revenue and expenditure forecasting model for the City of Monterey, California. The Monterey model includes an econometric ... [contents fragments: D. Forecast Based on the Econometric Model; E. Forecast Based on Expert Judgment and Trend Analysis]

  5. Econometrics as evidence? Examining the 'causal' connections between financial speculation and commodities prices.

    PubMed

    Williams, James W; Cook, Nikolai M

    2016-10-01

    One of the lasting legacies of the financial crisis of 2008, and the legislative energies that followed from it, is the growing reliance on econometrics as part of the rulemaking process. Financial regulators are increasingly expected to rationalize proposed rules using available econometric techniques, and the courts have vacated several key rules emanating from Dodd-Frank on the grounds of alleged deficiencies in this evidentiary effort. The turn toward such econometric tools is seen as a significant constraint on and challenge to regulators as they endeavor to engage with such essential policy questions as the impact of financial speculation on food security. Yet, outside of the specialized practitioner community, very little is known about these techniques. This article examines one such econometric test, Granger causality, and its role in a pivotal Dodd-Frank rulemaking. Through an examination of the test for Granger causality and its attempts to distill the causal connections between financial speculation and commodities prices, the article argues that econometrics is a blunt but useful tool, limited in its ability to provide decisive insights into commodities markets and yet yielding useful returns for those who are able to wield it.

  6. Reasoning about Informal Statistical Inference: One Statistician's View

    ERIC Educational Resources Information Center

    Rossman, Allan J.

    2008-01-01

    This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…

  7. Data-driven inference for the spatial scan statistic.

    PubMed

    Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

    2011-08-02

    Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. The clusters' statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is proposed, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed case map with a most likely cluster of size k, taking into account for comparison only those most likely clusters of size k found under the null hypothesis? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, as it bears on the correctness of the decision based on this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
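
    For context, the usual (unconditional) inference that the paper modifies has two ingredients: Kulldorff's Poisson log-likelihood ratio for each candidate cluster, and a Monte Carlo p-value comparing the observed maximum with maxima from case maps simulated under the null. A hedged sketch of both:

      import numpy as np

      def poisson_llr(n_in, mu_in, n_tot, mu_tot):
          """Kulldorff log-likelihood ratio for one candidate cluster under
          the Poisson model: n_in observed vs. mu_in expected cases inside."""
          if n_in <= mu_in:
              return 0.0
          n_out, mu_out = n_tot - n_in, mu_tot - mu_in
          return n_in * np.log(n_in / mu_in) + n_out * np.log(n_out / mu_out)

      def mc_pvalue(observed_max_llr, null_max_llrs):
          """Monte Carlo p-value: rank of the observed maximum LLR among
          maxima from case maps simulated under the null hypothesis."""
          null_max_llrs = np.asarray(null_max_llrs)
          return (1 + np.sum(null_max_llrs >= observed_max_llr)) / (1 + len(null_max_llrs))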

  8. Assessment of statistical education in Indonesia: Preliminary results and initiation to simulation-based inference

    NASA Astrophysics Data System (ADS)

    Saputra, K. V. I.; Cahyadi, L.; Sembiring, U. A.

    2018-01-01

    Starting with this paper, we assess our traditional elementary statistics education, and we also introduce elementary statistics with simulation-based inference. To assess our statistics class, we adapt the well-known CAOS (Comprehensive Assessment of Outcomes in Statistics) test, which serves as an external measure of students' basic statistical literacy. This test is generally accepted as a measure of statistical literacy. We also introduce a new teaching method in the elementary statistics class. Unlike the traditional elementary statistics course, we introduce a simulation-based inference method for conducting hypothesis testing. The literature has shown that this new teaching method works very well in increasing students' understanding of statistics.
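
    Simulation-based inference of the kind introduced here typically replaces the t-test with a resampling argument. A minimal, hypothetical example is a two-sample permutation test for a difference in means:

      import numpy as np

      def permutation_test(a, b, n_perm=10_000, seed=0):
          """Two-sided permutation test for a difference in means."""
          rng = np.random.default_rng(seed)
          observed = a.mean() - b.mean()
          pooled = np.concatenate([a, b])
          hits = 0
          for _ in range(n_perm):
              rng.shuffle(pooled)
              diff = pooled[:len(a)].mean() - pooled[len(a):].mean()
              hits += abs(diff) >= abs(observed)
          return hits / n_perm   # p-value estimate

      rng = np.random.default_rng(1)
      print(permutation_test(rng.normal(0.5, 1, 30), rng.normal(0.0, 1, 30)))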

  9. Apes are intuitive statisticians.

    PubMed

    Rakoczy, Hannes; Clüver, Annette; Saucke, Liane; Stoffregen, Nicole; Gräbener, Alice; Migura, Judith; Call, Josep

    2014-04-01

    Inductive learning and reasoning, as we use it both in everyday life and in science, is characterized by flexible inferences based on statistical information: inferences from populations to samples and vice versa. Many forms of such statistical reasoning have been found to develop late in human ontogeny, depending on formal education and language, and to be fragile even in adults. New revolutionary research, however, suggests that even preverbal human infants make use of intuitive statistics. Here, we conducted the first investigation of such intuitive statistical reasoning with non-human primates. In a series of 7 experiments, Bonobos, Chimpanzees, Gorillas and Orangutans drew flexible statistical inferences from populations to samples. These inferences, furthermore, were truly based on statistical information regarding the relative frequency distributions in a population, and not on absolute frequencies. Intuitive statistics in its most basic form is thus an evolutionarily more ancient rather than a uniquely human capacity. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. The SRI-WEFA Soviet Econometric Model: Phase One Documentation

    DTIC Science & Technology

    1975-03-01

    established prices. We also have an estimated equation for an end-use residual category which conceptually includes state grain reserves, other undis... forecasting. An important virtue of the econometric discipline is that it requires one first to conceptualize and estimate regularities of behavior ... any descriptive analysis. Within the framework of an econometric model, the analyst is able to discriminate among these "special events

  11. Cluster mass inference via random field theory.

    PubMed

    Zhang, Hui; Nichols, Thomas E; Johnson, Timothy D

    2009-01-01

    Cluster extent and voxel intensity are two widely used statistics in neuroimaging inference. Cluster extent is sensitive to spatially extended signals while voxel intensity is better for intense but focal signals. In order to leverage strength from both statistics, several nonparametric permutation methods have been proposed to combine the two methods. Simulation studies have shown that of the different cluster permutation methods, the cluster mass statistic is generally the best. However, to date, there is no parametric cluster mass inference available. In this paper, we propose a cluster mass inference method based on random field theory (RFT). We develop this method for Gaussian images, evaluate it on Gaussian and Gaussianized t-statistic images and investigate its statistical properties via simulation studies and real data. Simulation results show that the method is valid under the null hypothesis and demonstrate that it can be more powerful than the cluster extent inference method. Further, analyses with a single subject and a group fMRI dataset demonstrate better power than traditional cluster size inference, and good accuracy relative to a gold-standard permutation test.

  12. A Test by Any Other Name: P Values, Bayes Factors, and Statistical Inference.

    PubMed

    Stern, Hal S

    2016-01-01

    Procedures used for statistical inference are receiving increased scrutiny as the scientific community studies the factors associated with ensuring reproducible research. This note addresses recent negative attention directed at p values, the relationship of confidence intervals and tests, and the role of Bayesian inference and Bayes factors, with an eye toward better understanding these different strategies for statistical inference. We argue that researchers and data analysts too often resort to binary decisions (e.g., whether to reject or accept the null hypothesis) in settings where this may not be required.

  13. Economic growth and CO2 emissions: an investigation with smooth transition autoregressive distributed lag models for the 1800-2014 period in the USA.

    PubMed

    Bildirici, Melike; Ersin, Özgür Ömer

    2018-01-01

    The study aims to combine the autoregressive distributed lag (ARDL) cointegration framework with smooth transition autoregressive (STAR)-type nonlinear econometric models for causal inference. Further, the proposed STAR distributed lag (STARDL) models offer new insights in terms of modeling nonlinearity in the long- and short-run relations between the analyzed variables. The STARDL method allows modeling and testing nonlinearity in the short-run parameters, the long-run parameters, or both. To this aim, the relation between CO2 emissions and economic growth rates in the USA is investigated for the 1800-2014 period, which is one of the largest data sets available. The proposed hybrid models, the logistic, exponential, and second-order logistic smooth transition autoregressive distributed lag (LSTARDL, ESTARDL, and LSTAR2DL) models, combine the STAR framework with nonlinear ARDL-type cointegration to augment the linear ARDL approach with smooth transitional nonlinearity. The proposed models provide a new approach to the relevant econometrics and environmental economics literature. Our results indicate the presence of asymmetric long-run and short-run relations between the analyzed variables, running from GDP towards CO2 emissions. By the use of the newly proposed STARDL models, the results are in favor of important differences in terms of the response of CO2 emissions in regimes 1 and 2 for the estimated LSTAR2DL and LSTARDL models.
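
    The three STARDL variants named above differ only in their transition function G, which smoothly moves the model between regimes. The standard forms in the STAR family (following Teräsvirta; the paper's exact parameterization is assumed, not verified) can be sketched as:

      import numpy as np

      def logistic_transition(s, gamma, c):
          """LSTAR: first-order logistic transition, G in (0, 1);
          s is the transition variable, gamma the smoothness, c the threshold."""
          return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

      def exponential_transition(s, gamma, c):
          """ESTAR: exponential transition, symmetric around c."""
          return 1.0 - np.exp(-gamma * (s - c) ** 2)

      def logistic2_transition(s, gamma, c1, c2):
          """LSTAR2: second-order logistic transition with two thresholds."""
          return 1.0 / (1.0 + np.exp(-gamma * (s - c1) * (s - c2)))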

  14. Effectiveness of conservation easements in agricultural regions.

    PubMed

    Braza, Mark

    2017-08-01

    Conservation easements are a standard technique for preventing habitat loss, particularly in agricultural regions with extensive cropland cultivation, yet little is known about their effectiveness. I developed a spatial econometric approach to propensity-score matching and used the approach to estimate the amount of habitat loss prevented by a grassland conservation easement program of the U.S. federal government. I used a spatial autoregressive probit model to predict tract enrollment in the easement program as of 2001 based on tract agricultural suitability, habitat quality, and spatial interactions among neighboring tracts. Using the predicted values from the model, I matched enrolled tracts with similar unenrolled tracts to form a treatment group and a control group. To measure the program's impact on subsequent grassland loss, I estimated cropland cultivation rates for both groups in 2014 with a second spatial probit model. Between 2001 and 2014, approximately 14.9% of control tracts were cultivated and 0.3% of treated tracts were cultivated. Therefore, approximately 14.6% of the protected land would have been cultivated in the absence of the program. My results demonstrate that conservation easements can significantly reduce habitat loss in agricultural regions; however, the enrollment of tracts with low cropland suitability may constrain the amount of habitat loss they prevent. My results also show that spatial econometric models can improve the validity of control groups and thereby strengthen causal inferences about program effectiveness in situations when spatial interactions influence conservation decisions. © 2017 Society for Conservation Biology.
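
    The matching step described here can be illustrated with a plain logit in place of the paper's spatial autoregressive probit (a deliberate simplification); the tract-level data below are hypothetical:

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      # Hypothetical tracts: covariates X (agricultural suitability, habitat
      # quality), treatment t (1 = easement), outcome y (1 = cultivated).
      rng = np.random.default_rng(4)
      X = rng.normal(size=(2000, 2))
      t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
      y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * X[:, 1] - 1.5 * t))))

      # Step 1: propensity scores (the paper adds spatial lags here).
      ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

      # Step 2: match each treated tract to the nearest-score control tract.
      treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
      matches = control[np.abs(ps[control][None, :] - ps[treated][:, None]).argmin(axis=1)]

      # Step 3: average treatment effect on the treated.
      print(y[treated].mean() - y[matches].mean())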

  15. Using Guided Reinvention to Develop Teachers' Understanding of Hypothesis Testing Concepts

    ERIC Educational Resources Information Center

    Dolor, Jason; Noll, Jennifer

    2015-01-01

    Statistics education reform efforts emphasize the importance of informal inference in the learning of statistics. Research suggests statistics teachers experience similar difficulties understanding statistical inference concepts as students and how teacher knowledge can impact student learning. This study investigates how teachers reinvented an…

  16. An Analysis of Selectivity Bias in the Medicare AAPCC

    PubMed Central

    Dowd, Bryan; Feldman, Roger; Moscovice, Ira; Wisner, Catherine; Bland, Pat; Finch, Mike

    1996-01-01

    Using econometric models of endogenous sample selection, we examine possible payment bias to Medicare Tax Equity and Fiscal Responsibility Act of 1982 (TEFRA)-risk health maintenance organizations (HMOs) in the Twin Cities in 1988. We do not find statistically significant evidence of favorable HMO selection. In fact, the sign of the selection term indicates adverse selection into HMOs. This finding is interesting, in view of the fact that three of the five risk HMOs in the study have since converted to non-risk contracts. PMID:10158735

  17. A multi-resolution approach for optimal mass transport

    NASA Astrophysics Data System (ADS)

    Dominitz, Ayelet; Angenent, Sigurd; Tannenbaum, Allen

    2007-09-01

    Optimal mass transport is an important technique with numerous applications in econometrics, fluid dynamics, automatic control, statistical physics, shape optimization, expert systems, and meteorology. Motivated by certain problems in image registration and medical image visualization, in this note, we describe a simple gradient descent methodology for computing the optimal L2 transport mapping which may be easily implemented using a multiresolution scheme. We also indicate how the optimal transport map may be computed on the sphere. A numerical example is presented illustrating our ideas.

  18. The Defense Department’s Support of Industry’s Independent Research and Development (IR&D). Analyses and Evaluation

    DTIC Science & Technology

    1989-04-01

    Hill and Susan Bodilly were incorporated into this report. Todd Porter, a RAND Summer Intern, contributed to the data development, statistical... Bailey and Lawrence, 1987, pp. 19-20. The R&D tax credit allowed a 25 percent credit for R&D expenditures in excess of the average amount spent... See J. A. Hausman, "Specification Tests in Econometrics," Econometrica, Vol. 46, No. 6, November 1978. ... (7) and (8) report the results of this

  19. Joint Use of the MAB-II and MicroCog for Improvements in the Clinical and Neuropsychological Screening and Aeromedical Waiver Process of Rated USAF Pilots

    DTIC Science & Technology

    2010-01-01

    medical flight screening and the aeromedical waiver process (Olea & Ree, 1994; Ree & Carretta, 1996; Ree, Carretta, & Teachout, 1995). Currently, the... Student pilots with high scores on ability tests are more likely to complete training (Olea & Ree, 1994; Ree & Carretta, 1996; Ree, Carretta, & Teachout... Matrix differential calculus with applications in statistics and econometrics. New York, NY: John Wiley. Olea, M., & Ree, M.J. (1994

  20. Does a hospital's quality depend on the quality of other hospitals? A spatial econometrics approach

    PubMed Central

    Gravelle, Hugh; Santos, Rita; Siciliani, Luigi

    2014-01-01

    We examine whether a hospital's quality is affected by the quality provided by other hospitals in the same market. We first sketch a theoretical model with regulated prices and derive conditions on demand and cost functions which determine whether a hospital will increase its quality if its rivals increase their quality. We then apply spatial econometric methods to a sample of English hospitals in 2009–10 and a set of 16 quality measures including mortality rates, readmission, revision and redo rates, and three patient reported indicators, to examine the relationship between the quality of hospitals. We find that a hospital's quality is positively associated with the quality of its rivals for seven out of the sixteen quality measures. There are no statistically significant negative associations. In those cases where there is a significant positive association, an increase in rivals' quality by 10% increases a hospital's quality by 1.7% to 2.9%. The finding suggests that for some quality measures a policy which improves the quality in one hospital will have positive spillover effects on the quality in other hospitals. PMID:25843994

  1. Does a hospital's quality depend on the quality of other hospitals? A spatial econometrics approach.

    PubMed

    Gravelle, Hugh; Santos, Rita; Siciliani, Luigi

    2014-11-01

    We examine whether a hospital's quality is affected by the quality provided by other hospitals in the same market. We first sketch a theoretical model with regulated prices and derive conditions on demand and cost functions which determine whether a hospital will increase its quality if its rivals increase their quality. We then apply spatial econometric methods to a sample of English hospitals in 2009-10 and a set of 16 quality measures including mortality rates, readmission, revision and redo rates, and three patient reported indicators, to examine the relationship between the quality of hospitals. We find that a hospital's quality is positively associated with the quality of its rivals for seven out of the sixteen quality measures. There are no statistically significant negative associations. In those cases where there is a significant positive association, an increase in rivals' quality by 10% increases a hospital's quality by 1.7% to 2.9%. The finding suggests that for some quality measures a policy which improves the quality in one hospital will have positive spillover effects on the quality in other hospitals.

  2. Evaluating the Use of Random Distribution Theory to Introduce Statistical Inference Concepts to Business Students

    ERIC Educational Resources Information Center

    Larwin, Karen H.; Larwin, David A.

    2011-01-01

    Bootstrapping methods and random distribution methods are increasingly recommended as better approaches for teaching students about statistical inference in introductory-level statistics courses. The authors examined the effect of teaching undergraduate business statistics students using random distribution and bootstrapping simulations. It is the…

  3. Application of Transformations in Parametric Inference

    ERIC Educational Resources Information Center

    Brownstein, Naomi; Pensky, Marianna

    2008-01-01

    The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…

  4. Assessing Independent Variables Used in Econometric Modeling Forest Land Use or Land Cover Change: A Meta-Analysis

    Treesearch

    J Jeuck; F. Cubbage; R. Abt; R. Bardon; J. McCarter; J. Coulston; M. Renkow

    2014-01-01

    We conducted a meta-analysis on 64 econometric models from 47 studies predicting forestland conversion to agriculture (F2A), forestland to development (F2D), forestland to non-forested (F2NF) and undeveloped (including forestland) to developed (U2D) land. Over 250 independent econometric variables were identified from 21 F2A models, 21 F2D models, 12 F2NF models, and...

  5. Statistical inference for tumor growth inhibition T/C ratio.

    PubMed

    Wu, Jianrong

    2010-09-01

    The tumor growth inhibition T/C ratio is commonly used to quantify treatment effects in drug screening tumor xenograft experiments. The T/C ratio is converted to an antitumor activity rating using an arbitrary cutoff point and often without any formal statistical inference. Here, we applied a nonparametric bootstrap method and a small sample likelihood ratio statistic to make a statistical inference of the T/C ratio, including both hypothesis testing and a confidence interval estimate. Furthermore, sample size and power are also discussed for statistical design of tumor xenograft experiments. Tumor xenograft data from an actual experiment were analyzed to illustrate the application.
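
    A nonparametric bootstrap for the T/C ratio can be sketched as a percentile confidence interval for the ratio of mean tumor volumes. This illustrates the general idea only; the paper's procedure (which also includes a small sample likelihood ratio statistic) is not reproduced, and the data below are made up:

      import numpy as np

      def bootstrap_tc_ci(treated, control, n_boot=10_000, alpha=0.05, seed=0):
          """Percentile bootstrap CI for the T/C ratio of mean tumor volumes."""
          rng = np.random.default_rng(seed)
          ratios = np.empty(n_boot)
          for b in range(n_boot):
              t = rng.choice(treated, size=len(treated), replace=True)
              c = rng.choice(control, size=len(control), replace=True)
              ratios[b] = t.mean() / c.mean()
          return np.quantile(ratios, [alpha / 2, 1 - alpha / 2])

      rng = np.random.default_rng(3)
      print(bootstrap_tc_ci(rng.gamma(2, 50, 10), rng.gamma(2, 120, 10)))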

  6. A stratification approach using logit-based models for confounder adjustment in the study of continuous outcomes.

    PubMed

    Tan, Chuen Seng; Støer, Nathalie C; Chen, Ying; Andersson, Marielle; Ning, Yilin; Wee, Hwee-Lin; Khoo, Eric Yin Hao; Tai, E-Shyong; Kao, Shih Ling; Reilly, Marie

    2017-01-01

    The control of confounding is an area of extensive epidemiological research, especially in the field of causal inference for observational studies. Matched cohort and case-control study designs are commonly implemented to control for confounding effects without specifying the functional form of the relationship between the outcome and confounders. This paper extends the commonly used regression models in matched designs for binary and survival outcomes (i.e. conditional logistic and stratified Cox proportional hazards) to studies of continuous outcomes through a novel interpretation and application of logit-based regression models from the econometrics and marketing research literature. We compare the performance of the maximum likelihood estimators using simulated data and propose a heuristic argument for obtaining the residuals for model diagnostics. We illustrate our proposed approach with two real data applications. Our simulation studies demonstrate that our stratification approach is robust to model misspecification and that the distribution of the estimated residuals provides a useful diagnostic when the strata are of moderate size. In our applications to real data, we demonstrate that parity and menopausal status are associated with percent mammographic density, and that the mean level and variability of inpatient blood glucose readings vary between medical and surgical wards within a national tertiary hospital. Our work highlights how the same class of regression models, available in most statistical software, can be used to adjust for confounding in the study of binary, time-to-event and continuous outcomes.

  7. Investigating Mathematics Teachers' Thoughts of Statistical Inference

    ERIC Educational Resources Information Center

    Yang, Kai-Lin

    2012-01-01

    Research on statistical cognition and application suggests that statistical inference concepts are commonly misunderstood by students and even misinterpreted by researchers. Although some research has been done on students' misunderstanding or misconceptions of confidence intervals (CIs), few studies explore either students' or mathematics…

  8. Lessons from Inferentialism for Statistics Education

    ERIC Educational Resources Information Center

    Bakker, Arthur; Derry, Jan

    2011-01-01

    This theoretical paper relates recent interest in informal statistical inference (ISI) to the semantic theory termed inferentialism, a significant development in contemporary philosophy, which places inference at the heart of human knowing. This theory assists epistemological reflection on challenges in statistics education encountered when…

  9. Statistical Inference and Patterns of Inequality in the Global North

    ERIC Educational Resources Information Center

    Moran, Timothy Patrick

    2006-01-01

    Cross-national inequality trends have historically been a crucial field of inquiry across the social sciences, and new methodological techniques of statistical inference have recently improved the ability to analyze these trends over time. This paper applies Monte Carlo, bootstrap inference methods to the income surveys of the Luxembourg Income…

  10. Statistical inference and Aristotle's Rhetoric.

    PubMed

    Macdonald, Ranald R

    2004-11-01

    Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

  11. Allocating physicians' overhead costs to services: an econometric/accounting-activity based-approach.

    PubMed

    Peden, Al; Baker, Judith J

    2002-01-01

    Using the optimizing properties of econometric analysis, this study analyzes how physician overhead costs (OC) can be allocated to multiple activities to maximize precision in reimbursing the costs of services. Drawing on work by Leibenstein and Friedman, the analysis also shows that allocating OC to multiple activities unbiased by revenue requires controlling for revenue when making the estimates. Further econometric analysis shows that it is possible to save about 10 percent of OC by paying only for those that are necessary.

  12. Price of gasoline: forecasting comparisons. [Box-Jenkins, econometric, and regression methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bopp, A.E.; Neri, J.A.

    Gasoline prices are simulated using three popular forecasting methodologies: a Box-Jenkins type method, an econometric method, and a regression method. One-period-ahead and 18-period-ahead comparisons are made. For the one-period-ahead comparison, a Box-Jenkins type time-series model simulated best, although all do well. However, for the 18-period simulation, the econometric and regression methods perform substantially better than the Box-Jenkins formulation. A rationale for and implications of these results are discussed. 11 references.
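
    The one-period-ahead comparison described here amounts to rolling a model through the sample, refitting and forecasting one step at a time. A hypothetical sketch of the time-series leg of such a comparison, using a generic ARIMA specification rather than the study's fitted model:

      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA

      # Simulated monthly price series standing in for gasoline prices.
      rng = np.random.default_rng(5)
      prices = 50 + np.cumsum(rng.normal(scale=0.5, size=120))

      # Expanding-window, one-step-ahead forecasts over a holdout period.
      errors = []
      for t in range(100, 120):
          fit = ARIMA(prices[:t], order=(1, 1, 1)).fit()
          forecast = fit.forecast(steps=1)[0]
          errors.append(prices[t] - forecast)
      print(np.sqrt(np.mean(np.square(errors))))   # one-step-ahead RMSE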

  13. FBST for Cointegration Problems

    NASA Astrophysics Data System (ADS)

    Diniz, M.; Pereira, C. A. B.; Stern, J. M.

    2008-11-01

    In order to estimate causal relations, time series econometrics has to be aware of spurious correlation, a problem first mentioned by Yule [21]. To solve the problem, one can work with differenced series or use multivariate models like VAR or VEC models. In this case, the analysed series will present a long-run relation, i.e. a cointegration relation. Even though the Bayesian literature about inference on VAR/VEC models is quite advanced, Bauwens et al. [2] highlight that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results." This paper presents the Full Bayesian Significance Test applied to cointegration rank selection tests in multivariate (VAR/VEC) time series models and shows how to implement it using data sets available in the literature as well as simulated ones. A standard non-informative prior is assumed.

  14. Analytical-numerical solution of a nonlinear integrodifferential equation in econometrics

    NASA Astrophysics Data System (ADS)

    Kakhktsyan, V. M.; Khachatryan, A. Kh.

    2013-07-01

    A mixed problem for a nonlinear integrodifferential equation arising in econometrics is considered. An analytical-numerical method is proposed for solving the problem. Some numerical results are presented.

  15. CADDIS Volume 4. Data Analysis: Biological and Environmental Data Requirements

    EPA Pesticide Factsheets

    Overview of PECBO Module, using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, methods for inferring environmental conditions, statistical scripts in module.

  16. Statistical methods for the beta-binomial model in teratology.

    PubMed Central

    Yamamoto, E; Yanagimoto, T

    1994-01-01

    The beta-binomial model is widely used for analyzing teratological data involving littermates. Recent developments in statistical analyses of teratological data are briefly reviewed, with emphasis on the model. For statistical inference of the parameters in the beta-binomial distribution, separation of the likelihood yields an improved likelihood inference. This reduces the biases of estimators and also improves the accuracy of the empirical significance levels of tests. Separate inference of the parameters can be conducted in a unified way.
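
    The beta-binomial likelihood at the center of this record is easy to write down: for y affected fetuses out of n per litter, P(y | n) = C(n, y) B(y + a, n - y + b) / B(a, b) with shape parameters a and b. A minimal sketch of the log-likelihood (hypothetical code; the paper's separated-likelihood inference is not reproduced):

      import numpy as np
      from scipy.special import betaln, gammaln

      def beta_binomial_loglik(params, y, n):
          """Log-likelihood of the beta-binomial model for litter data:
          y affected out of n per litter, shape parameters a and b."""
          a, b = params
          log_comb = gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
          return np.sum(log_comb + betaln(y + a, n - y + b) - betaln(a, b))

      # Hypothetical usage with scipy's optimizer:
      # from scipy.optimize import minimize
      # res = minimize(lambda p: -beta_binomial_loglik(p, y, n), x0=[1.0, 1.0],
      #                bounds=[(1e-6, None), (1e-6, None)])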

  17. On Some Assumptions of the Null Hypothesis Statistical Testing

    ERIC Educational Resources Information Center

    Patriota, Alexandre Galvão

    2017-01-01

    Bayesian and classical statistical approaches are based on different types of logical principles. In order to avoid mistaken inferences and misguided interpretations, the practitioner must respect the inference rules embedded into each statistical method. Ignoring these principles leads to the paradoxical conclusions that the hypothesis…

  18. Direct evidence for a dual process model of deductive inference.

    PubMed

    Markovits, Henry; Brunet, Marie-Laurence; Thompson, Valerie; Brisson, Janie

    2013-07-01

    In 2 experiments, we tested a strong version of a dual process theory of conditional inference (cf. Verschueren et al., 2005a, 2005b) that assumes that most reasoners have 2 strategies available, the choice of which is determined by situational variables, cognitive capacity, and metacognitive control. The statistical strategy evaluates inferences probabilistically, accepting those with high conditional probability. The counterexample strategy rejects inferences when a counterexample shows the inference to be invalid. To discriminate strategy use, we presented reasoners with conditional statements (if p, then q) and explicit statistical information about the relative frequency of q given p (50% vs. 90%). A statistical strategy would accept the more probable inferences more frequently, whereas the counterexample one would reject both. In Experiment 1, reasoners under time pressure used the statistical strategy more, but switched to the counterexample strategy when time constraints were removed; the former took less time than the latter. These data are consistent with the hypothesis that the statistical strategy is the default heuristic. Under a free-time condition, reasoners preferred the counterexample strategy and kept it when put under time pressure. Thus, it is not simply a lack of capacity that produces a statistical strategy; instead, it seems that time pressure disrupts the ability to make good metacognitive choices. In line with this conclusion, in a 2nd experiment, we measured reasoners' confidence in their performance; those under time pressure were less confident in the statistical than the counterexample strategy and more likely to switch strategies under free-time conditions.

  19. The APA Task Force on Statistical Inference (TFSI) Report as a Framework for Teaching and Evaluating Students' Understandings of Study Validity.

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Web-based statistical instruction, like all statistical instruction, ought to focus on teaching the essence of the research endeavor: the exercise of reflective judgment. Using the framework of the recent report of the American Psychological Association (APA) Task Force on Statistical Inference (Wilkinson and the APA Task Force on Statistical…

  20. Data Analysis Techniques for Physical Scientists

    NASA Astrophysics Data System (ADS)

    Pruneau, Claude A.

    2017-10-01

    Preface; How to read this book; 1. The scientific method; Part I. Foundation in Probability and Statistics: 2. Probability; 3. Probability models; 4. Classical inference I: estimators; 5. Classical inference II: optimization; 6. Classical inference III: confidence intervals and statistical tests; 7. Bayesian inference; Part II. Measurement Techniques: 8. Basic measurements; 9. Event reconstruction; 10. Correlation functions; 11. The multiple facets of correlation functions; 12. Data correction methods; Part III. Simulation Techniques: 13. Monte Carlo methods; 14. Collision and detector modeling; List of references; Index.

  1. CADDIS Volume 4. Data Analysis: Predicting Environmental Conditions from Biological Observations (PECBO Appendix)

    EPA Pesticide Factsheets

    Overview of PECBO Module, using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, methods for inferring environmental conditions, statistical scripts in module.

  2. Cost Effectiveness Trade-Offs in Software Support Environment Standardization.

    DTIC Science & Technology

    1986-09-30

    FINAL REPORT, September 30, 1986, Technion International, Inc. Contents include: a summary description of the econometric model; the causal chain used as the basis for the model; and summaries of the tangible benefits and tangible costs in the econometric equations.

  3. A statistical test of the stability assumption inherent in empirical estimates of economic depreciation.

    PubMed

    Shriver, K A

    1986-01-01

    Realistic estimates of economic depreciation are required for analyses of tax policy, economic growth and production, and national income and wealth. The purpose of this paper is to examine the stability assumption underlying the econometric derivation of empirical estimates of economic depreciation for industrial machinery and equipment. The results suggest that economic depreciation rates of decline may be reasonably stable over time. Thus, the assumption of a constant rate of economic depreciation may be a reasonable approximation for further empirical economic analyses.

  4. An economic approach to abortion demand.

    PubMed

    Rothstein, D S

    1992-01-01

    "This paper uses econometric multiple regression techniques in order to analyze the socioeconomic factors affecting the demand for abortion for the year 1985. A cross-section of the 50 [U.S.] states and Washington D.C. is examined and a household choice theoretical framework is utilized. The results suggest that average price of abortion, disposable personal per capita income, percentage of single women, whether abortions are state funded, unemployment rate, divorce rate, and if the state is located in the far West, are statistically significant factors in the determination of the demand for abortion." excerpt

  5. Appendix : airborne incidents : an econometric analysis of severity

    DOT National Transportation Integrated Search

    2014-12-19

    This is the Appendix for Airborne Incidents: An Econometric Analysis of Severity Report. : Airborne loss of separation incidents occur when an aircraft breaches the defined separation limit (vertical and/or horizontal) with another aircraft or terrai...

  6. Statistical Inferences from Formaldehyde Dna-Protein Cross-Link Data

    EPA Science Inventory

    Physiologically-based pharmacokinetic (PBPK) modeling has reached considerable sophistication in its application in the pharmacological and environmental health areas. Yet, mature methodologies for making statistical inferences have not been routinely incorporated in these applic...

  7. Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beggs, W.J.

    1981-02-01

    This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.

  8. Introducing Statistical Inference to Biology Students through Bootstrapping and Randomization

    ERIC Educational Resources Information Center

    Lock, Robin H.; Lock, Patti Frazer

    2008-01-01

    Bootstrap methods and randomization tests are increasingly being used as alternatives to standard statistical procedures in biology. They also serve as an effective introduction to the key ideas of statistical inference in introductory courses for biology students. We discuss the use of such simulation-based procedures in an integrated curriculum…
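    A minimal sketch of the kind of randomization test such a course might introduce, assuming invented measurements for two groups; the permutation logic, not the numbers, is the point.

```python
# Sketch: randomization (permutation) test for a difference in group means,
# the kind of simulation-based procedure the article advocates.
import numpy as np

rng = np.random.default_rng(42)
group_a = np.array([4.1, 5.2, 6.0, 5.5, 4.8])   # illustrative measurements
group_b = np.array([5.9, 6.4, 7.1, 6.2, 6.8])
observed = group_b.mean() - group_a.mean()

pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)                          # random relabeling
    diff = pooled[n_a:].mean() - pooled[:n_a].mean()
    if diff >= observed:
        count += 1

print(f"observed diff = {observed:.2f}, one-sided p = {count / n_perm:.4f}")
```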

  9. Developing Young Children's Emergent Inferential Practices in Statistics

    ERIC Educational Resources Information Center

    Makar, Katie

    2016-01-01

    Informal statistical inference has now been researched at all levels of schooling and initial tertiary study. Work in informal statistical inference is least understood in the early years, where children have had little if any exposure to data handling. A qualitative study in Australia was carried out through a series of teaching experiments with…

  10. Directory of Energy Information Administration Model Abstracts

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1986-07-16

    This directory partially fulfills the requirements of Section 8c of the documentation order, which states in part that: The Office of Statistical Standards will annually publish an EIA document based on the collected abstracts and the appendices. This report contains brief statements about each model's title, acronym, purpose, and status, followed by more detailed information on characteristics, uses, and requirements. Sources for additional information are identified. All models active through March 1985 are included. The main body of this directory is an alphabetical list of all active EIA models. Appendix A identifies major EIA modeling systems and the models within these systems, and Appendix B identifies active EIA models by type (basic, auxiliary, and developing). EIA also leases models developed by proprietary software vendors. Documentation for these proprietary models is the responsibility of the companies from which they are leased. EIA has recently leased models from Chase Econometrics, Inc., Data Resources, Inc. (DRI), the Oak Ridge National Laboratory (ORNL), and Wharton Econometric Forecasting Associates (WEFA). Leased models are not abstracted here. The directory is intended for the use of energy and energy-policy analysts in the public and private sectors.

  11. Macro policy responses to oil booms and busts in the United Arab Emirates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Al-Mutawa, A.K.

    1991-01-01

    The effects of oil shocks and macro policy changes in the United Arab Emirates are analyzed. A theoretical model is developed within the framework of the Dutch Disease literature. It contains four features specific to the United Arab Emirates' economy. These are: (1) the presence of a large foreign labor force; (2) OPEC's oil export quotas; (3) the division of oil profits; and (4) the important role of government expenditures. The model is then used to examine the welfare effects of the above-mentioned shocks. An econometric model is then specified that conforms to the analytical model. In the econometric model, the method of principal components is applied owing to the small sample size; the principal components methodology is used in both the identification testing and the estimation of the structural equations. The oil and macro policy shocks are then simulated. The simulation results show that an oil-quantity boom leads to a higher welfare gain than an oil-price boom. Under certain circumstances, this finding is also confirmed by the comparative statics that follow from the analytical model.

  12. Load balancing for massively-parallel soft-real-time systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hailperin, M.

    1988-09-01

    Global load balancing, if practical, would allow the effective use of massively-parallel ensemble architectures for large soft-real-time problems. The challenge is to replace quick global communication, which is impractical in a massively-parallel system, with statistical techniques. In this vein, the author proposes a novel approach to decentralized load balancing based on statistical time-series analysis. Each site estimates the system-wide average load using information about past loads of individual sites and attempts to equal that average. This estimation process is practical because the soft-real-time systems of interest naturally exhibit loads that are periodic, in a statistical sense akin to seasonality in econometrics. It is shown how this load-characterization technique can be the foundation for a load-balancing system in an architecture employing cut-through routing and an efficient multicast protocol.

  13. Airborne incidents : an econometric analysis of severity, December 31, 2014 : technical summary

    DOT National Transportation Integrated Search

    2014-12-31

    This is a technical summary of the Airborne Incidents: An Econometric Analysis of Severity main report. : Airborne loss of separation incidents occur when an aircraft breaches the defined separation limit (vertical and/or horizontal) with anoth...

  14. Using Alien Coins to Test Whether Simple Inference Is Bayesian

    ERIC Educational Resources Information Center

    Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

    2016-01-01

    Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…

  15. Econometric Models of U.S. Navy Career Petty Officer Retention.

    DTIC Science & Technology

    1981-06-01

    Naval Postgraduate School, Monterey, CA. Thesis by John Joseph Bepko III, June 1981; thesis advisor: George W. Thomas. Econometric models of U.S. Navy career petty officer retention.

  16. Econometrics of exhaustible resource supply: a theory and an application. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Epple, D.; Hansen, L.P.

    1981-12-01

    An econometric model of US oil and natural gas discoveries is developed in this study. The econometric model is explicitly derived as the solution to the problem of maximizing the expected discounted after-tax present value of revenues net of exploration, development, and production costs. The model contains equations representing producers' formation of price expectations and separate equations giving producers' optimal exploration decisions contingent on expected prices. A procedure is developed for imposing resource base constraints (e.g., ultimate recovery estimates based on geological analysis) when estimating the econometric model. The model is estimated using aggregate post-war data for the United States. Production from a given addition to proved reserves is assumed to follow a negative exponential path, and additions of proved reserves from a given discovery are assumed to follow a negative exponential path. Annual discoveries of oil and natural gas are estimated as latent variables. These latent variables are the endogenous variables in the econometric model of oil and natural gas discoveries. The model is estimated without resource base constraints. The model is also estimated imposing the mean oil and natural gas ultimate recovery estimates of the US Geological Survey. Simulations through the year 2020 are reported for various future price regimes.

  17. Students' Expressions of Uncertainty in Making Informal Inference When Engaged in a Statistical Investigation Using TinkerPlots

    ERIC Educational Resources Information Center

    Henriques, Ana; Oliveira, Hélia

    2016-01-01

    This paper reports on the results of a study investigating the potential to embed Informal Statistical Inference in statistical investigations, using TinkerPlots, for assisting 8th grade students' informal inferential reasoning to emerge, particularly their articulations of uncertainty. Data collection included students' written work on a…

  18. Driving factors of interactions between the exchange rate market and the commodity market: A wavelet-based complex network perspective

    NASA Astrophysics Data System (ADS)

    Wen, Shaobo; An, Haizhong; Chen, Zhihua; Liu, Xueyong

    2017-08-01

    In traditional econometrics, a time series must be stationary. In practice, however, series usually show time-varying fluctuations, and it remains a challenge to carry out a multiscale analysis of the data and discover the topological characteristics of conduction at different scales. Wavelet analysis and complex networks from statistical physics have particular advantages for these problems. We select the exchange rate variable from the Chinese market and the commodity price index variable from the world market as the time series of our study. We explore the driving factors behind the behavior of the two markets and their topological characteristics in three steps. First, we use the Kalman filter to find the optimal estimate of the relationship between the two markets. Second, wavelet analysis is used to extract the scales of the relationship driven by wavelets of different frequencies; meanwhile, we search for the actual economic variables corresponding to those frequencies. Finally, a complex network is used to search for the transfer characteristics of the combinations of states driven by wavelets of different frequencies. The results show that statistical physics has a unique advantage over traditional econometrics. The Chinese market has time-varying impacts on the world market: it has greater influence when the world economy is stable and less influence in times of turmoil. The process of forming the state combinations is random, and transitions between state combinations have a clustering feature. Based on these characteristics, we can effectively reduce the information burden on investors and respond correctly to the government's policy mix.
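    Only the wavelet step of this pipeline lends itself to a short sketch. The example below, assuming PyWavelets (pywt) and a simulated two-frequency series in place of the exchange-rate data, shows how a series is separated into scale-specific components.

```python
# Sketch: multiscale decomposition of a simulated series with PyWavelets;
# only the wavelet step of the paper's pipeline is shown here.
import numpy as np
import pywt

rng = np.random.default_rng(7)
n = 512
t = np.arange(n)
series = (np.sin(2 * np.pi * t / 256)         # low-frequency component
          + 0.3 * np.sin(2 * np.pi * t / 16)  # higher-frequency component
          + 0.1 * rng.normal(size=n))         # noise

coeffs = pywt.wavedec(series, "db4", level=4)
for i, c in enumerate(coeffs):
    label = "approximation" if i == 0 else f"detail level {5 - i}"
    print(f"{label}: {len(c)} coefficients, energy = {np.sum(c**2):.1f}")
```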

  19. Faster Mass Spectrometry-based Protein Inference: Junction Trees are More Efficient than Sampling and Marginalization by Enumeration

    PubMed Central

    Serang, Oliver; Noble, William Stafford

    2012-01-01

    The problem of identifying the proteins in a complex mixture using tandem mass spectrometry can be framed as an inference problem on a graph that connects peptides to proteins. Several existing protein identification methods make use of statistical inference methods for graphical models, including expectation maximization, Markov chain Monte Carlo, and full marginalization coupled with approximation heuristics. We show that, for this problem, the majority of the cost of inference usually comes from a few highly connected subgraphs. Furthermore, we evaluate three different statistical inference methods using a common graphical model, and we demonstrate that junction tree inference substantially improves rates of convergence compared to existing methods. The python code used for this paper is available at http://noble.gs.washington.edu/proj/fido. PMID:22331862

  20. Using genetic data to strengthen causal inference in observational research.

    PubMed

    Pingault, Jean-Baptiste; O'Reilly, Paul F; Schoeler, Tabea; Ploubidis, George B; Rijsdijk, Frühling; Dudbridge, Frank

    2018-06-05

    Causal inference is essential across the biomedical, behavioural and social sciences. By progressing from confounded statistical associations to evidence of causal relationships, causal inference can reveal complex pathways underlying traits and diseases and help to prioritize targets for intervention. Recent progress in genetic epidemiology - including statistical innovation, massive genotyped data sets and novel computational tools for deep data mining - has fostered the intense development of methods exploiting genetic data and relatedness to strengthen causal inference in observational research. In this Review, we describe how such genetically informed methods differ in their rationale, applicability and inherent limitations and outline how they should be integrated in the future to offer a rich causal inference toolbox.

  1. Risk, statistical inference, and the law of evidence: the use of epidemiological data in toxic tort cases.

    PubMed

    Brannigan, V M; Bier, V M; Berg, C

    1992-09-01

    Toxic torts are product liability cases dealing with alleged injuries due to chemical or biological hazards such as radiation, thalidomide, or Agent Orange. Toxic tort cases typically rely more heavily than other product liability cases on indirect or statistical proof of injury. There have been numerous theoretical analyses of statistical proof of injury in toxic tort cases. However, there have been only a handful of actual legal decisions regarding the use of such statistical evidence, and most of those decisions have been inconclusive. Recently, a major case from the Fifth Circuit, involving allegations that Bendectin (a morning sickness drug) caused birth defects, was decided entirely on the basis of statistical inference. This paper examines both the conceptual basis of that decision, and also the relationships among statistical inference, scientific evidence, and the rules of product liability in general.

  2. Making statistical inferences about software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1988-01-01

    Failure times of software undergoing random debugging can be modelled as order statistics of independent but nonidentically distributed exponential random variables. Using this model, inferences can be made about current reliability and, if debugging continues, future reliability. This model also shows the difficulty inherent in statistical verification of very highly reliable software such as that used by digital avionics in commercial aircraft.

  3. Comparing Trend and Gap Statistics across Tests: Distributional Change Using Ordinal Methods and Bayesian Inference

    ERIC Educational Resources Information Center

    Denbleyker, John Nickolas

    2012-01-01

    The shortcomings of the proportion above cut (PAC) statistic used so prominently in the educational landscape renders it a very problematic measure for making correct inferences with student test data. The limitations of PAC-based statistics are more pronounced with cross-test comparisons due to their dependency on cut-score locations. A better…

  4. Aspects of First Year Statistics Students' Reasoning When Performing Intuitive Analysis of Variance: Effects of Within- and Between-Group Variability

    ERIC Educational Resources Information Center

    Trumpower, David L.

    2015-01-01

    Making inferences about population differences based on samples of data, that is, performing intuitive analysis of variance (IANOVA), is common in everyday life. However, the intuitive reasoning of individuals when making such inferences (even following statistics instruction), often differs from the normative logic of formal statistics. The…

  5. Econometrics and Psychometrics: A Survey of Communalities

    ERIC Educational Resources Information Center

    Goldberger, Arthur S.

    1971-01-01

    Several themes which are common to both econometrics and psychometrics are surveyed. The themes are illustrated by reference to permanent income hypotheses, simultaneous equation models, adaptive expectations and partial adjustment schemes, and by reference to test score theory, factor analysis, and time-series models. (Author)

  6. Difference to Inference: teaching logical and statistical reasoning through on-line interactivity.

    PubMed

    Malloy, T E

    2001-05-01

    Difference to Inference is an on-line JAVA program that simulates theory testing and falsification through research design and data collection in a game format. The program, based on cognitive and epistemological principles, is designed to support learning of the thinking skills underlying deductive and inductive logic and statistical reasoning. Difference to Inference has database connectivity so that game scores can be counted as part of course grades.

  7. Inference as Prediction

    ERIC Educational Resources Information Center

    Watson, Jane

    2007-01-01

    Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

  8. A Framework for Thinking about Informal Statistical Inference

    ERIC Educational Resources Information Center

    Makar, Katie; Rubin, Andee

    2009-01-01

    Informal inferential reasoning has shown some promise in developing students' deeper understanding of statistical processes. This paper presents a framework to think about three key principles of informal inference--generalizations "beyond the data," probabilistic language, and data as evidence. The authors use primary school classroom…

  9. Sensitivity to the Sampling Process Emerges From the Principle of Efficiency.

    PubMed

    Jara-Ettinger, Julian; Sun, Felix; Schulz, Laura; Tenenbaum, Joshua B

    2018-05-01

    Humans can seamlessly infer other people's preferences, based on what they do. Broadly, two types of accounts have been proposed to explain different aspects of this ability. The first account focuses on spatial information: agents' efficient navigation in space reveals what they like. The second account focuses on statistical information: uncommon choices reveal stronger preferences. Together, these two lines of research suggest that we have two distinct capacities for inferring preferences. Here we propose that this is not the case, and that spatial-based and statistical-based preference inferences can be explained by the single assumption that agents act efficiently. We show that people's sensitivity to spatial and statistical information when they infer preferences is best predicted by a computational model of the principle of efficiency, and that this model outperforms dual-system models, even when the latter are fit to participant judgments. Our results suggest that, as adults, a unified understanding of agency under the principle of efficiency underlies our ability to infer preferences.

  10. The Interaction of Statistics and Geology -- Finite Deformations.

    DTIC Science & Technology

    1980-11-01

    These might be the result of a sequence of linear deformations or homogeneous strains, as described by Ramsey (1967). In this section we summarize the description of the problem, which may be found in textbooks on econometrics (see, e.g., Theil (1971)) as the errors-in-variables model: y = Bξ + f, x = ξ + e, where e and f are the errors of measurement of x and y.

  11. Effect of economic growth and environmental quality on tourism in Southeast Asian Countries

    NASA Astrophysics Data System (ADS)

    Firmansyah

    2017-02-01

    Tourism is an important income-generating sector for a country; nevertheless, it is sensitive to changes in the economy as well as to changes in environmental quality. Employing error-correction econometric models on annual data, this study examines the influence of environmental quality and of domestic and global economic growth on foreign tourist arrivals in selected Southeast Asian countries, namely Indonesia, Malaysia, Thailand, the Philippines, and Singapore. The findings show that the long-run models for all countries are statistically supported, indicating that world economic growth as well as environmental quality affect foreign tourist arrivals.
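    An error-correction model of this kind can be sketched in two steps (Engle-Granger style). The simulated series below stand in for the tourism and income data, which the record does not include, so every number here is an assumption.

```python
# Sketch: Engle-Granger two-step error-correction model on simulated data,
# standing in for the tourism/income series, which are not reproduced here.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x = np.cumsum(rng.normal(size=n))                  # e.g. world income, I(1)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=n)  # e.g. tourist arrivals

# Step 1: long-run (cointegrating) regression
long_run = sm.OLS(y, sm.add_constant(x)).fit()
ect = long_run.resid                               # error-correction term

# Step 2: short-run dynamics with the lagged ECT
dy, dx = np.diff(y), np.diff(x)
X = sm.add_constant(np.column_stack([dx, ect[:-1]]))
ecm = sm.OLS(dy, X).fit()
print(ecm.summary().tables[1])   # the ECT coefficient should be negative
```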

  12. Statistical inference for extended or shortened phase II studies based on Simon's two-stage designs.

    PubMed

    Zhao, Junjun; Yu, Menggang; Feng, Xi-Ping

    2015-06-07

    Simon's two-stage designs are popular choices for conducting phase II clinical trials, especially in oncology, to reduce the number of patients placed on ineffective experimental therapies. Recently Koyama and Chen (2008) discussed how to conduct proper inference for such studies because they found that inference procedures used with Simon's designs almost always ignore the actual sampling plan used. In particular, they proposed an inference method for studies whose actual second-stage sample sizes differ from the planned ones. We consider an alternative inference method based on the likelihood ratio. In particular, we order permissible sample paths under Simon's two-stage designs using their corresponding conditional likelihood. In this way, we can calculate p-values using the common definition: the probability of obtaining a test statistic value at least as extreme as that observed under the null hypothesis. In addition to providing inference for a couple of scenarios where Koyama and Chen's method can be difficult to apply, the resulting estimate based on our method appears to have certain advantages in terms of inference properties in many numerical simulations. It generally led to smaller biases and narrower confidence intervals while maintaining similar coverage. We also illustrate the two methods in a real data setting. Inference procedures used with Simon's designs almost always ignore the actual sampling plan; reported p-values, point estimates, and confidence intervals for the response rate are not usually adjusted for the design's adaptiveness. Proper statistical inference procedures should be used.
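    The path-enumeration idea can be made concrete. The sketch below enumerates all sample paths of a hypothetical Simon design and orders them by a simple likelihood-ratio statistic to produce an exact p-value; the design parameters, observed counts, and the particular ordering statistic are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch: exact p-value for a Simon two-stage design, enumerating every
# permissible sample path and ordering paths by a likelihood-ratio
# statistic, in the spirit of the method described above.
import math
from scipy.stats import binom

n1, r1, n2 = 10, 2, 19        # stage sizes; stop if stage-1 responses <= r1
p0, p1 = 0.10, 0.30           # null and alternative response rates

def log_lr(x, n):
    """Log-likelihood ratio of p1 versus p0 for x responses in n patients."""
    return x * math.log(p1 / p0) + (n - x) * math.log((1 - p1) / (1 - p0))

# Enumerate each path with its null probability and its ordering statistic.
paths = []
for x1 in range(n1 + 1):
    if x1 <= r1:              # early stop after stage 1
        paths.append((binom.pmf(x1, n1, p0), log_lr(x1, n1)))
    else:                     # continue to stage 2
        for x2 in range(n2 + 1):
            prob = binom.pmf(x1, n1, p0) * binom.pmf(x2, n2, p0)
            paths.append((prob, log_lr(x1 + x2, n1 + n2)))

observed = log_lr(4 + 7, n1 + n2)   # e.g., 4 + 7 responses observed
p_value = sum(prob for prob, stat in paths if stat >= observed)
print(f"exact p-value = {p_value:.4f}")
```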

  13. Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

    NASA Astrophysics Data System (ADS)

    Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

    2018-07-01

    Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for the joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
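    Rejection ABC, the simplest of the likelihood-free methods this paper improves upon, fits in a few lines. The Gaussian toy model, prior bounds, and tolerance below are assumptions of the sketch, not the DELFI method itself.

```python
# Sketch: rejection ABC, a much simpler relative of the likelihood-free
# methods discussed above, inferring a Gaussian mean from its sample mean.
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(loc=2.0, scale=1.0, size=100)  # "data", true mean 2
s_obs = observed.mean()                              # compressed summary

n_sims, eps = 20_000, 0.05
theta = rng.uniform(-5, 5, size=n_sims)              # draws from the prior
# Forward-simulate each candidate and keep those whose summary is close.
s_sim = rng.normal(loc=theta, scale=1.0, size=(100, n_sims)).mean(axis=0)
accepted = theta[np.abs(s_sim - s_obs) < eps]

print(f"accepted {accepted.size} draws; "
      f"posterior mean = {accepted.mean():.2f}")
```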

  14. A Coalitional Game for Distributed Inference in Sensor Networks With Dependent Observations

    NASA Astrophysics Data System (ADS)

    He, Hao; Varshney, Pramod K.

    2016-04-01

    We consider the problem of collaborative inference in a sensor network with heterogeneous and statistically dependent sensor observations. Each sensor aims to maximize its inference performance by forming a coalition with other sensors and sharing information within the coalition. It is proved that the inference performance is a nondecreasing function of the coalition size. However, in an energy constrained network, the energy consumption of inter-sensor communication also increases with increasing coalition size, which discourages the formation of the grand coalition (the set of all sensors). In this paper, the formation of non-overlapping coalitions with statistically dependent sensors is investigated under a specific communication constraint. We apply a game theoretical approach to fully explore and utilize the information contained in the spatial dependence among sensors to maximize individual sensor performance. Before formulating the distributed inference problem as a coalition formation game, we first quantify the gain and loss in forming a coalition by introducing the concepts of diversity gain and redundancy loss for both estimation and detection problems. These definitions, enabled by the statistical theory of copulas, allow us to characterize the influence of statistical dependence among sensor observations on inference performance. An iterative algorithm based on merge-and-split operations is proposed for the solution and the stability of the proposed algorithm is analyzed. Numerical results are provided to demonstrate the superiority of our proposed game theoretical approach.

  15. Glossary for econometrics and epidemiology.

    PubMed

    Gunasekara, F Imlach; Carter, K; Blakely, T

    2008-10-01

    Epidemiologists and econometricians are often interested in similar topics, such as socioeconomic position and health outcomes, but the different languages that epidemiologists and economists use to interpret and discuss their results can create a barrier to mutual communication. This glossary defines key terms used in econometrics and epidemiology to assist in bridging this gap.

  16. Transportation and socioeconomic impacts of bypasses on communities : an integrated synthesis of panel data, multilevel, and spatial econometric models with case studies.

    DOT National Transportation Integrated Search

    2011-09-21

    Title: Transportation and Socioeconomic Impacts of Bypasses on Communities: An Integrated Synthesis of Panel Data, Multilevel, and Spatial Econometric Models with Case Studies. The title used at the start of this project was Transportation and Soc...

  17. Time Series Econometrics for the 21st Century

    ERIC Educational Resources Information Center

    Hansen, Bruce E.

    2017-01-01

    The field of econometrics largely started with time series analysis because many early datasets were time-series macroeconomic data. As the field developed, more cross-sectional and longitudinal datasets were collected, which today dominate the majority of academic empirical research. In nonacademic (private sector, central bank, and governmental)…

  18. Spurious correlations and inference in landscape genetics

    Treesearch

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...

  19. The Philosophical Foundations of Prescriptive Statements and Statistical Inference

    ERIC Educational Resources Information Center

    Sun, Shuyan; Pan, Wei

    2011-01-01

    From the perspectives of the philosophy of science and statistical inference, we discuss the challenges of making prescriptive statements in quantitative research articles. We first consider the prescriptive nature of educational research and argue that prescriptive statements are a necessity in educational research. The logic of deduction,…

  20. Inference and the Introductory Statistics Course

    ERIC Educational Resources Information Center

    Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross

    2011-01-01

    This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…

  1. "Magnitude-based inference": a statistical review.

    PubMed

    Welsh, Alan H; Knight, Emma J

    2015-04-01

    We consider "magnitude-based inference" and its interpretation by examining in detail its use in the problem of comparing two means. We extract from the spreadsheets, which are provided to users of the analysis (http://www.sportsci.org/), a precise description of how "magnitude-based inference" is implemented. We compare the implemented version of the method with general descriptions of it and interpret the method in familiar statistical terms. We show that "magnitude-based inference" is not a progressive improvement on modern statistics. The additional probabilities introduced are not directly related to the confidence interval but, rather, are interpretable either as P values for two different nonstandard tests (for different null hypotheses) or as approximate Bayesian calculations, which also lead to a type of test. We also discuss sample size calculations associated with "magnitude-based inference" and show that the substantial reduction in sample sizes claimed for the method (30% of the sample size obtained from standard frequentist calculations) is not justifiable so the sample size calculations should not be used. Rather than using "magnitude-based inference," a better solution is to be realistic about the limitations of the data and use either confidence intervals or a fully Bayesian analysis.

  2. Refining cost-effectiveness analyses using the net benefit approach and econometric methods: an example from a trial of anti-depressant treatment.

    PubMed

    Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony

    2013-04-01

    Economic evaluation analyses can be enhanced by employing regression methods, which allow for the identification of important sub-groups, adjustment for imperfect randomisation in clinical trials, and analysis of non-randomised data. The aim was to explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed, and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic, with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the threshold used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporating econometric methods into cost-effectiveness analyses using the NB approach.
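    A minimal sketch of net-benefit regression, assuming invented trial data and a hypothetical willingness-to-pay threshold; the variable names (prev_depression, treated) are illustrative, not from the trial.

```python
# Sketch: net-benefit regression. NB = wtp * effect - cost is regressed on
# the treatment arm and a baseline covariate; all data are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 300
prev_depression = rng.integers(0, 2, size=n)       # prior similar episode
treated = rng.integers(0, 2, size=n)               # SSRI arm indicator
effect = (rng.normal(0.6, 0.2, n)
          - 0.15 * prev_depression + 0.05 * treated)
cost = rng.normal(800, 150, n) + 100 * treated

wtp = 20_000                                       # willingness to pay
nb = wtp * effect - cost                           # net benefit statistic
X = sm.add_constant(np.column_stack([treated, prev_depression]))
print(sm.OLS(nb, X).fit().summary().tables[1])
```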

  3. Multi-Agent Inference in Social Networks: A Finite Population Learning Approach.

    PubMed

    Fan, Jianqing; Tong, Xin; Zeng, Yao

    When people in a society want to make inference about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interest and coordination problems arise in the process. Classical statistics does not consider people's incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game-theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning, to address whether, with high probability, a large fraction of people in a given finite population network can make "good" inference. Serving as a foundation, this concept enables us to study the long-run trend of aggregate inference quality as the population grows.

  4. Intuitive statistics by 8-month-old infants

    PubMed Central

    Xu, Fei; Garcia, Vashti

    2008-01-01

    Human learners make inductive inferences based on small amounts of data: we generalize from samples to populations and vice versa. The academic discipline of statistics formalizes these intuitive statistical inferences. What is the origin of this ability? We report six experiments investigating whether 8-month-old infants are “intuitive statisticians.” Our results showed that, given a sample, the infants were able to make inferences about the population from which the sample had been drawn. Conversely, given information about the entire population of relatively small size, the infants were able to make predictions about the sample. Our findings provide evidence that infants possess a powerful mechanism for inductive learning, either using heuristics or basic principles of probability. This ability to make inferences based on samples or information about the population develops early and in the absence of schooling or explicit teaching. Human infants may be rational learners from very early in development. PMID:18378901

  5. An Econometric Model for Forecasting Income and Employment in Hawaii.

    ERIC Educational Resources Information Center

    Chau, Laurence C.

    This report presents the methodology for short-run forecasting of personal income and employment in Hawaii. The econometric model developed in the study is used to make actual forecasts through 1973 of income and employment, with major components forecasted separately. Several sets of forecasts are made, under different assumptions on external…

  6. Outputs as Educator Effectiveness in the United States: Shifting towards Political Accountability

    ERIC Educational Resources Information Center

    Piro, Jody S.; Mullen, Laurie

    2013-01-01

    The definition of educator effectiveness is being redefined by econometric modeling to evidence student achievement on standardized tests. While the reasons that econometric frameworks are in vogue are many, it is clear that the strength of such models lies in the quantifiable evidence of student learning. Current accountability models frame…

  7. Econometric Models for Forecasting of Macroeconomic Indices

    ERIC Educational Resources Information Center

    Sukhanova, Elena I.; Shirnaeva, Svetlana Y.; Mokronosov, Aleksandr G.

    2016-01-01

    The urgency of the research topic was stipulated by the necessity of effectively managing an economic system, which can hardly be imagined without forecasting the indices characteristic of that system. An econometric model is a reliable forecasting tool which makes it possible to take into consideration the trend of indices…

  8. Technical Change in the North American Forestry Sector: A Review

    Treesearch

    Jeffery C. Stier; David N. Bengston

    1992-01-01

    Economists have examined the impact of technical change on the forest products sector using the historical, index number, and econometric approaches. This paper reviews econometric analyses of the rate and bias of technical change, examining functional form, factors included, and empirical results. Studies are classified as first-, second-, or third-generation...

  9. Econometric Methods for Causal Evaluation of Education Policies and Practices: A Non-Technical Guide

    ERIC Educational Resources Information Center

    Schlotter, Martin; Schwerdt, Guido; Woessmann, Ludger

    2011-01-01

    Education policy-makers and practitioners want to know which policies and practices can best achieve their goals. But research that can inform evidence-based policy often requires complex methods to distinguish causation from accidental association. Avoiding econometric jargon and technical detail, this paper explains the main idea and intuition…

  10. An econometric model of the hardwood lumber market

    Treesearch

    William G. Luppold

    1982-01-01

    A recursive econometric model with causal flow originating from the demand relationship is used to analyze the effects of exogenous variables on quantity and price of hardwood lumber. Wage rates, interest rates, stumpage price, lumber exports, and price of lumber demanders' output were the major factors influencing quantities demanded and supplied and hardwood...

  11. The Status of Econometrics in the Economics Major: A Survey

    ERIC Educational Resources Information Center

    Johnson, Bruce K.; Perry, John J.; Petkus, Marie

    2012-01-01

    In this article, the authors describe the place of econometrics in undergraduate economics curricula in all American colleges and universities that offer economics majors as listed in the "U.S. News & World Report" "Best Colleges 2010" guide ("U.S. News & World Report" 2009). Data come from online catalogs, departmental Web sites, and online…

  12. Empirical methods for modeling landscape change, ecosystem services, and biodiversity

    Treesearch

    David Lewis; Ralph Alig

    2009-01-01

    The purpose of this paper is to synthesize recent economics research aimed at integrating discrete-choice econometric models of land-use change with spatially-explicit landscape simulations and quantitative ecology. This research explicitly models changes in the spatial pattern of landscapes in two steps: 1) econometric estimation of parcel-scale transition...

  13. Transfer Entropy as a Log-Likelihood Ratio

    NASA Astrophysics Data System (ADS)

    Barnett, Lionel; Bossomaier, Terry

    2012-09-01

    Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology, and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic χ2 distribution is established for the transfer entropy estimator. The result generalizes the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.
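    The Gaussian equivalence mentioned at the end can be demonstrated directly. The sketch below, using a simulated bivariate autoregression, estimates transfer entropy as half the log ratio of restricted to full residual variances; the lag order and coefficients are assumptions of the example.

```python
# Sketch: in the Gaussian case, transfer entropy from X to Y equals half
# the log ratio of residual variances of the restricted and full
# autoregressions -- the Granger-causality equivalence cited above.
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()  # X drives Y

Y, Ylag, Xlag = y[1:], y[:-1], x[:-1]

def resid_var(target, *regressors):
    A = np.column_stack([np.ones_like(target)] + list(regressors))
    beta, *_ = np.linalg.lstsq(A, target, rcond=None)
    return np.var(target - A @ beta)

var_restricted = resid_var(Y, Ylag)         # Y's own past only
var_full = resid_var(Y, Ylag, Xlag)         # plus X's past
te = 0.5 * np.log(var_restricted / var_full)
print(f"estimated transfer entropy X->Y: {te:.3f} nats")
```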

  15. Assessing colour-dependent occupation statistics inferred from galaxy group catalogues

    NASA Astrophysics Data System (ADS)

    Campbell, Duncan; van den Bosch, Frank C.; Hearin, Andrew; Padmanabhan, Nikhil; Berlind, Andreas; Mo, H. J.; Tinker, Jeremy; Yang, Xiaohu

    2015-09-01

    We investigate the ability of current implementations of galaxy group finders to recover colour-dependent halo occupation statistics. To test the fidelity of group catalogue inferred statistics, we run three different group finders used in the literature over a mock that includes galaxy colours in a realistic manner. Overall, the resulting mock group catalogues are remarkably similar, and most colour-dependent statistics are recovered with reasonable accuracy. However, it is also clear that certain systematic errors arise as a consequence of correlated errors in group membership determination, central/satellite designation, and halo mass assignment. We introduce a new statistic, the halo transition probability (HTP), which captures the combined impact of all these errors. As a rule of thumb, errors tend to equalize the properties of distinct galaxy populations (i.e. red versus blue galaxies or centrals versus satellites), and to result in inferred occupation statistics that are more accurate for red galaxies than for blue galaxies. A statistic that is particularly poorly recovered from the group catalogues is the red fraction of central galaxies as a function of halo mass. Group finders do a good job in recovering galactic conformity, but also have a tendency to introduce weak conformity when none is present. We conclude that proper inference of colour-dependent statistics from group catalogues is best achieved using forward modelling (i.e. running group finders over mock data) or by implementing a correction scheme based on the HTP, as long as the latter is not too strongly model dependent.

  16. Statistical learning and selective inference.

    PubMed

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
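    The cherry-picking effect is easy to simulate. The sketch below tests only the strongest of many null correlations and shows how often it clears the conventional 0.05 bar; all sizes and counts are illustrative.

```python
# Sketch: why selection demands a higher bar. Testing only the strongest
# of many null correlations makes "significant" findings routine.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(11)
n_obs, n_predictors, n_trials = 50, 100, 200
false_positives = 0
for _ in range(n_trials):
    y = rng.normal(size=n_obs)
    X = rng.normal(size=(n_obs, n_predictors))   # all predictors are null
    pvals = [pearsonr(X[:, j], y)[1] for j in range(n_predictors)]
    if min(pvals) < 0.05:                        # test only the best one
        false_positives += 1

print(f"naive 'discovery' rate under the null: "
      f"{false_positives / n_trials:.2f}")       # far above 0.05
```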

  17. Variations on Bayesian Prediction and Inference

    DTIC Science & Technology

    2016-05-09

    There are a number of statistical inference problems that are not generally formulated via a full probability model. For the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle.

  18. Inferring causal relationships between phenotypes using summary statistics from genome-wide association studies.

    PubMed

    Meng, Xiang-He; Shen, Hui; Chen, Xiang-Ding; Xiao, Hong-Mei; Deng, Hong-Wen

    2018-03-01

    Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with diverse complex phenotypes and diseases, and provided tremendous opportunities for further analyses using summary association statistics. Recently, Pickrell et al. developed a robust method for causal inference using independent putative causal SNPs. However, this method may fail to infer the causal relationship between two phenotypes when only a limited number of independent putative causal SNPs are identified. Here, we extended Pickrell's method to make it more applicable to general situations. We extended the causal inference method by replacing the putative causal SNPs with the lead SNPs (the set of the most significant SNPs in each independent locus) and tested the performance of our extended method using both simulated and empirical data. Simulations suggested that when the same number of genetic variants is used, our extended method had a similar distribution of the test statistic under the null model as well as comparable power under the causal model compared with the original method by Pickrell et al. In practice, however, our extended method will generally be more powerful because the number of independent lead SNPs is often larger than the number of independent putative causal SNPs; including more SNPs, on the other hand, does not cause more false positives. By applying our extended method to summary statistics from GWAS for blood metabolites and femoral neck bone mineral density (FN-BMD), we successfully identified ten blood metabolites that may causally influence FN-BMD. We extended a causal inference method for inferring the putative causal relationship between two phenotypes using summary statistics from GWAS, and identified a number of potential causal metabolites for FN-BMD, which may provide novel insights into the pathophysiological mechanisms underlying osteoporosis.

  19. In defence of model-based inference in phylogeography

    PubMed Central

    Beaumont, Mark A.; Nielsen, Rasmus; Robert, Christian; Hey, Jody; Gaggiotti, Oscar; Knowles, Lacey; Estoup, Arnaud; Panchal, Mahesh; Corander, Jukka; Hickerson, Mike; Sisson, Scott A.; Fagundes, Nelson; Chikhi, Lounès; Beerli, Peter; Vitalis, Renaud; Cornuet, Jean-Marie; Huelsenbeck, John; Foll, Matthieu; Yang, Ziheng; Rousset, Francois; Balding, David; Excoffier, Laurent

    2017-01-01

    Recent papers have promoted the view that model-based methods in general, and those based on Approximate Bayesian Computation (ABC) in particular, are flawed in a number of ways, and are therefore inappropriate for the analysis of phylogeographic data. These papers further argue that Nested Clade Phylogeographic Analysis (NCPA) offers the best approach in statistical phylogeography. In order to remove the confusion and misconceptions introduced by these papers, we justify and explain the reasoning behind model-based inference. We argue that ABC is a statistically valid approach, alongside other computational statistical techniques that have been successfully used to infer parameters and compare models in population genetics. We also examine the NCPA method and highlight numerous deficiencies, either when used with single or multiple loci. We further show that the ages of clades are carelessly used to infer the ages of demographic events, and that these ages are estimated under a simple model of panmixia and population stationarity but are then used under different and unspecified models to test hypotheses, a usage that invalidates these testing procedures. We conclude by encouraging researchers to study and use model-based inference in population genetics. PMID:29284924

  20. Statistical inference for remote sensing-based estimates of net deforestation

    Treesearch

    Ronald E. McRoberts; Brian F. Walters

    2012-01-01

    Statistical inference requires expression of an estimate in probabilistic terms, usually in the form of a confidence interval. An approach to constructing confidence intervals for remote sensing-based estimates of net deforestation is illustrated. The approach is based on post-classification methods using two independent forest/non-forest classifications because...

  1. Pulling Econometrics Students up by Their Bootstraps

    ERIC Educational Resources Information Center

    O'Hara, Michael E.

    2014-01-01

    Although the concept of the sampling distribution is at the core of much of what we do in econometrics, it is a concept that is often difficult for students to grasp. The thought process behind bootstrapping provides a way for students to conceptualize the sampling distribution in a way that is intuitive and visual. However, teaching students to…
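
    A minimal sketch of the kind of exercise the abstract has in mind, assuming Python in the classroom (the data and model below are invented): resample the observations with replacement, re-estimate a regression slope each time, and inspect the spread of the re-estimates as a stand-in for the sampling distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented classroom data: y = 1 + 2x + noise.
n = 50
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 3, n)

def ols_slope(x, y):
    """OLS slope via the covariance formula."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

# Resample (x, y) pairs with replacement and re-estimate the slope each time;
# a histogram of boot_slopes is a picture of the sampling distribution.
boot_slopes = np.array([
    ols_slope(x[idx], y[idx])
    for idx in (rng.integers(0, n, n) for _ in range(5000))
])

ci_lo, ci_hi = np.percentile(boot_slopes, [2.5, 97.5])
print(f"slope {ols_slope(x, y):.2f}, 95% bootstrap CI [{ci_lo:.2f}, {ci_hi:.2f}]")
```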

  2. Econometric models of road use, accidents, and road investment decisions. Volume 2 : an econometric model of car ownership, road use, accidents, and their severity (Essay 3)

    DOT National Transportation Integrated Search

    1999-11-01

    Using a fairly large cross-section/time-series data base, covering all provinces of Norway and all months between January 1973 and December 1994, we estimate non-linear (Box-Cox) regression equations explaining aggregate car ownership, road use, seat...

  3. Econometric Methods for Research in Education. NBER Working Paper No. 16003

    ERIC Educational Resources Information Center

    Meghir, Costas; Rivkin, Steven G.

    2010-01-01

    This paper reviews some of the econometric methods that have been used in the economics of education. The focus is on understanding how the assumptions made to justify and implement such methods relate to the underlying economic model and the interpretation of the results. We start by considering the estimation of the returns to education both…

  4. Econometric Estimation of the Economic Impact of a University. AIR 1993 Annual Forum Paper.

    ERIC Educational Resources Information Center

    Gana, Rajaram

    This study conducted an econometric analysis of the impact of the University of Delaware (UD), a public, doctoral level institution, on the Delaware economy, particularly the impact of nonresident students. To construct a model the study used historical institutional data from the Office of Institutional Research and Planning at UD and…

  5. The Nexus of Place and Finance in the Analysis of Educational Attainment: A Spatial Econometric Approach

    ERIC Educational Resources Information Center

    Sutton, Farah

    2012-01-01

    This study examines the spatial distribution of educational attainment and then builds upon current predictive frameworks for understanding patterns of educational attainment by applying a spatial econometric method of analysis. The research from this study enables a new approach to the policy discussion on how to improve educational attainment…

  6. First-Year Study Success in Economics and Econometrics: The Role of Gender, Motivation, and Math Skills

    ERIC Educational Resources Information Center

    Arnold, Ivo J. M.; Rowaan, Wietske

    2014-01-01

    In this study, the authors investigate the relationships among gender, math skills, motivation, and study success in economics and econometrics. They find that female students have stronger intrinsic motivation, yet lower study confidence than their male counterparts. They also find weak evidence for a gender gap over the entire first-year…

  7. Determinants of Educational Achievement in Morocco: A Micro-Econometric Analysis Applied to the TIMSS Study

    ERIC Educational Resources Information Center

    Ibourk, Aomar

    2013-01-01

    Based on data from international surveys measuring learning (TIMSS), this article focuses on the analysis of the academic performance of Moroccan students. The results of the econometric model show that the students' characteristics, their family environment, and the school context are key determinants of this performance. The study also shows that the…

  8. An Initial Econometric Consideration of Supply and Demand in the Guaranteed Student Loan Program.

    ERIC Educational Resources Information Center

    Bayus, Barry; Kendis, Kurt

    1982-01-01

    In this econometric model of the Guaranteed Student Loan Program (GSLP), supply is related to banks' liquidity and yield curves, all lenders' economic costs and returns, and Student Loan Marketing Association activity. GSLP demand is based on loan costs, family debt position, and net student need for financial aid. (RW)

  9. 78 FR 41161 - Self-Regulatory Organizations; The Options Clearing Corporation; Notice of Filing of Advance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-09

    ... behavior is included in the econometric models underlying STANS, time series of proportional changes in implied... calculate daily margin requirements. OCC has proposed at this time to clear only OTC Options on the S&P 500...

  10. Thermodynamics of statistical inference by cells.

    PubMed

    Lang, Alex H; Fisher, Charles K; Mora, Thierry; Mehta, Pankaj

    2014-10-03

    The deep connection between thermodynamics, computation, and information is now well established both theoretically and experimentally. Here, we extend these ideas to show that thermodynamics also places fundamental constraints on statistical estimation and learning. To do so, we investigate the constraints placed by (nonequilibrium) thermodynamics on the ability of biochemical signaling networks to estimate the concentration of an external signal. We show that accuracy is limited by energy consumption, suggesting that there are fundamental thermodynamic constraints on statistical inference.

  11. Genetic markers as instrumental variables.

    PubMed

    von Hinke, Stephanie; Davey Smith, George; Lawlor, Debbie A; Propper, Carol; Windmeijer, Frank

    2016-01-01

    The use of genetic markers as instrumental variables (IV) is receiving increasing attention from economists, statisticians, epidemiologists and social scientists. Although IV is commonly used in economics, the appropriate conditions for the use of genetic variants as instruments have not been well defined. The increasing availability of biomedical data, however, makes understanding of these conditions crucial to the successful use of genotypes as instruments. We combine the econometric IV literature with that from genetic epidemiology, and discuss the biological conditions and IV assumptions within the statistical potential outcomes framework. We review this in the context of two illustrative applications. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
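
    The workhorse estimator in this literature is two-stage least squares with the genotype as the instrument. A minimal simulated sketch (Mendelian-randomization flavored; all numbers are invented) shows how the instrument recovers a causal effect that naive OLS misses:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

g = rng.binomial(2, 0.3, n)              # genotype (0/1/2 risk alleles)
u = rng.normal(0, 1, n)                  # unobserved confounder
exposure = 0.5 * g + u + rng.normal(0, 1, n)
outcome = 1.0 * exposure + u + rng.normal(0, 1, n)   # true effect = 1.0

def lstsq_slope(design, target):
    """Last coefficient of an OLS fit with intercept."""
    return np.linalg.lstsq(design, target, rcond=None)[0][-1]

# Naive OLS is biased upward by the shared confounder u.
print("OLS :", lstsq_slope(np.column_stack([np.ones(n), exposure]), outcome))

# 2SLS: project the exposure on the instrument, then regress on the fit.
Z = np.column_stack([np.ones(n), g])
exposure_hat = Z @ np.linalg.lstsq(Z, exposure, rcond=None)[0]
print("2SLS:", lstsq_slope(np.column_stack([np.ones(n), exposure_hat]), outcome))
```

    The IV assumptions the paper discusses (the genotype affects the outcome only through the exposure, and is not confounded) are exactly what licenses the second stage here.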

  12. Out of sight but not out of mind: Home countries' macroeconomic volatilities and immigrants' mental health.

    PubMed

    Nguyen, Ha Trong; Connelly, Luke Brian

    2018-01-01

    We provide the first empirical evidence that better economic performance by immigrants' countries of origin, as measured by a lower consumer price index (CPI) or higher gross domestic product, improves immigrants' mental health. We use an econometrically-robust approach that exploits exogenous changes in macroeconomic conditions across immigrants' home countries over time and controls for immigrants' observable and unobservable characteristics. The CPI effect is statistically significant and sizeable. Furthermore, the CPI effect diminishes as the time since emigrating increases. By contrast, home countries' unemployment rates and exchange rate fluctuations have no impact on immigrants' mental health. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Proper and Paradigmatic Metonymy as a Lens for Characterizing Student Conceptions of Distributions and Sampling

    ERIC Educational Resources Information Center

    Noll, Jennifer; Hancock, Stacey

    2015-01-01

    This research investigates what students' use of statistical language can tell us about their conceptions of distribution and sampling in relation to informal inference. Prior research documents students' challenges in understanding ideas of distribution and sampling as tools for making informal statistical inferences. We know that these…

  14. Robust estimation procedure in panel data model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shariff, Nurul Sima Mohamad; Hamzah, Nor Aishah

    2014-06-19

    Panel data modeling has received great attention in econometric research recently. This is due to the availability of data sources and the interest in studying cross sections of individuals observed over time. However, problems may arise in modeling the panel in the presence of cross-sectional dependence and outliers. Even though there are a few methods that take into consideration the presence of cross-sectional dependence in the panel, these methods may provide inconsistent parameter estimates and inferences when outliers occur in the panel. As such, an alternative method that is robust to outliers and cross-sectional dependence is introduced in this paper. The properties and construction of the confidence interval for the parameter estimates are also considered. The robustness of the procedure is investigated and comparisons are made to the existing method via simulation studies. Our results show that the robust approach is able to produce accurate and reliable parameter estimates under the conditions considered.

  15. Multi-Agent Inference in Social Networks: A Finite Population Learning Approach

    PubMed Central

    Tong, Xin; Zeng, Yao

    2016-01-01

    When people in a society want to make inference about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interest and coordination problems arise in the process. Classical statistics does not consider people’s incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game-theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning, to address whether, with high probability, a large fraction of people in a given finite population network can make “good” inference. Serving as a foundation, this concept enables us to study the long-run trend of aggregate inference quality as the population grows. PMID:27076691

  16. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    PubMed

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is modest. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
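
    A hedged sketch of the min-p idea under simplifying assumptions (a two-arm comparison with two candidate statistics; the data are invented and the statistics chosen here are not necessarily the paper's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Invented two-arm trial with a modest location shift.
treat = rng.normal(0.5, 1.0, 60)
ctrl = rng.normal(0.0, 1.0, 60)
pooled = np.concatenate([treat, ctrl])
labels = np.array([1] * 60 + [0] * 60)

def min_p(y, g):
    """Minimum of a t-test and a rank-sum p-value for groups g."""
    a, b = y[g == 1], y[g == 0]
    return min(stats.ttest_ind(a, b).pvalue,
               stats.mannwhitneyu(a, b, alternative="two-sided").pvalue)

observed = min_p(pooled, labels)

# Permutation reference distribution: relabeling preserves the null that the
# two groups are indistinguishable, so min-p keeps its type I error control.
perm = np.array([min_p(pooled, rng.permutation(labels)) for _ in range(2000)])
print("permutation p-value:", (perm <= observed).mean())
```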

  17. The Anatomy of a Likely Donor: Econometric Evidence on Philanthropy to Higher Education

    ERIC Educational Resources Information Center

    Lara, Christen; Johnson, Daniel

    2014-01-01

    In 2011, philanthropic giving to higher education institutions totaled $30.3 billion, an 8.2% increase over the previous year. Roughly 26% of those funds came from alumni donations. This article builds upon existing economic models to create an econometric model to explain and predict the pattern of alumni giving. We test the model using data…

  18. An Econometric Approach to Evaluate Navy Advertising Efficiency.

    DTIC Science & Technology

    1996-03-01

    This thesis uses an econometric approach to systematically and comprehensively analyze Navy advertising and recruiting data to determine Navy... advertising cost efficiency in the Navy recruiting process. Current recruiting and advertising cost data are merged into an appropriate data base and...evaluated using multiple regression techniques to find assessments of the relationships between Navy advertising expenditures and recruit contracts attained

  19. Patterns of Marine Corps Reserve Continuation Behavior: Pre- and Post-9/11

    DTIC Science & Technology

    2011-03-01

    to consider when studying reserve retention and very difficult to measure using multivariate econometric models, which rely solely on observational...chapter present an interesting supplement to standard economic theoretical perspectives commonly used in econometric analyses. Notably, the structural...relevant to this thesis. These factors contribute to the over- arching themes of job satisfaction and organizational commitment and therefore ultimately

  20. Predicting future forestland area: a comparison of econometric approaches.

    Treesearch

    SoEun Ahn; Andrew J. Plantinga; Ralph J. Alig

    2000-01-01

    Predictions of future forestland area are an important component of forest policy analyses. In this article, we test the ability of econometric land use models to accurately forecast forest area. We construct a panel data set for Alabama consisting of county and time-series observations for the period 1964 to 1992. We estimate models using restricted data sets-namely,...

  1. Econometric analysis of the factors influencing forest acreage trends in the southeast.

    Treesearch

    Ralph J. Alig

    1986-01-01

    Econometric models of changes in land use acreages in the Southeast by physiographic region have been developed by pooling cross-section and time series data. Separate acreage equations have been estimated for the three major private forestland owner classes and the three major classes of nonforest land use. Observations were drawn at three or four different points in...

  2. An econometric model of the U.S. pallet market

    Treesearch

    Albert T. Schuler; Walter B. Wallin

    1979-01-01

    A need for quantitative information on demand and price has been expressed by the pallet industry. In response to this, an econometric model of the aggregate U.S. pallet market was developed. Demand was found to be affected by real pallet price, industrial and food production levels, and slipsheet prices. Supply was affected by real price, housing starts lagged 1 year...

  3. Ensemble stacking mitigates biases in inference of synaptic connectivity.

    PubMed

    Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N

    2018-01-01

    A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
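
    The stacking step, learning a linear combination of per-method scores against ground truth, can be sketched as follows, assuming scikit-learn is available; the three "methods" below are stand-ins that add noise to a known adjacency, not the authors' inference algorithms:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

# Ground-truth adjacency labels for 1000 candidate connections, plus scores
# from three stand-in "inference methods" (noisy views of the truth).
truth = rng.binomial(1, 0.1, 1000)
scores = np.column_stack([truth + rng.normal(0, s, 1000) for s in (1.0, 1.5, 2.0)])

# Stack: a logistic model learns one weight per method on a training half.
stacker = LogisticRegression().fit(scores[:500], truth[:500])
ensemble = stacker.predict_proba(scores[500:])[:, 1]
print("learned per-method weights:", stacker.coef_.round(2))
```

    The learned weights down-weight the noisier methods, which is the mechanism by which the ensemble mitigates method-specific biases.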

  4. Sample selection and spatial models of housing price indexes, and, A disequilibrium analysis of the U.S. gasoline market using panel data

    NASA Astrophysics Data System (ADS)

    Hu, Haixin

    This dissertation consists of two parts. The first part studies sample selection and spatial models of housing price indexes using transaction data on detached single-family houses in two California metropolitan areas from 1990 through 2008. House prices are often spatially correlated due to shared amenities, or when the properties are viewed as close substitutes in a housing submarket. There have been many studies that address spatial correlation in the context of housing markets; however, to the best of my knowledge, none has used spatial models to construct housing price indexes at the zip code level for the entire time period analyzed in this dissertation. In this paper, I study a first-order autoregressive spatial model with four different weighting matrix schemes. Four sets of housing price indexes are constructed accordingly. Gatzlaff and Haurin (1997, 1998) study the sample selection problem in housing indexes by using Heckman's two-step method. This method, however, is generally inefficient and can cause multicollinearity problems. Also, it requires data on unsold houses in order to carry out the first-step probit regression. The maximum likelihood (ML) method can be used to estimate a truncated incidental model which allows one to correct for sample selection based on transaction data only; however, convergence problems are prevalent in practice. In this paper I adopt Lewbel's (2007) sample selection correction method, which does not require one to specify or estimate the selection model, except for some very general assumptions. I then extend this method to correct for spatial correlation. In the second part, I analyze the U.S. gasoline market with a disequilibrium model that allows lagged-latent variables, endogenous prices, and panel data with fixed effects. Most existing studies of the gasoline market (see the survey of Espey, 1998, Energy Economics) assume equilibrium. In practice, however, prices do not always adjust fast enough to clear the market. Equilibrium assumptions greatly simplify statistical inference, but are very restrictive and can produce conflicting estimates. For example, econometric models of markets that assume equilibrium often produce more elastic demand price elasticities than their disequilibrium counterparts (Holt and Johnson, 1989, Review of Economics and Statistics; Oczkowski, 1998, Economics Letters). The few studies that allow disequilibrium, however, have been limited to macroeconomic time-series data without lagged-latent variables. While time series data allow one to investigate national trends, they cannot be used to identify and analyze regional differences and the role of local markets. Exclusion of the lagged-latent variables is also undesirable because such variables capture adjustment costs and inter-temporal spillovers. Simulation methods offer tractable solutions to dynamic and panel data disequilibrium models (Lee, 1997, Journal of Econometrics), but assume normally distributed errors. This paper compares estimates of price/income elasticity and excess supply/demand across time periods, regions, and model specifications, using both equilibrium and disequilibrium methods. In the equilibrium model, I compare the within-group estimator with Anderson and Hsiao's first-difference 2SLS estimator. In the disequilibrium model, I extend Amemiya's 2SLS by using Newey's efficient estimator with optimal instruments.
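
    For reference, the Heckman two-step procedure criticized above can be sketched in a few lines with statsmodels (assumed available); the data-generating process is invented, and the estimators actually developed in the dissertation (Lewbel's method and its spatial extension) are substantially more involved:

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(5)
n = 5000

# Invented data: sale price is observed only for houses that sell, and the
# selection and outcome errors are correlated (rho = 0.6), biasing OLS.
z = rng.normal(size=n)                          # affects selling only
x = rng.normal(size=n)                          # covariate in both equations
e_sel, e_out = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], n).T
sold = (0.5 + 1.0 * z + 0.8 * x + e_sel > 0)    # selection equation
price = 2.0 + 1.5 * x + e_out                   # outcome equation, slope = 1.5

naive = sm.OLS(price[sold], sm.add_constant(x[sold])).fit()
print("naive OLS slope:", round(naive.params[1], 3))       # biased

# Step 1: probit of selection; inverse Mills ratio from the linear index.
W = sm.add_constant(np.column_stack([z, x]))
probit = sm.Probit(sold.astype(float), W).fit(disp=0)
xb = W @ probit.params
mills = norm.pdf(xb) / norm.cdf(xb)

# Step 2: OLS on the sold sample with the Mills ratio as an extra regressor.
X2 = sm.add_constant(np.column_stack([x[sold], mills[sold]]))
two_step = sm.OLS(price[sold], X2).fit()
print("Heckman slope:", round(two_step.params[1], 3))       # near 1.5
```

    Note that step 1 needs the selection covariate z observed for unsold houses too, which is exactly the data requirement the dissertation avoids.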

  5. Bayesian inference of physiologically meaningful parameters from body sway measurements.

    PubMed

    Tietäväinen, A; Gutmann, M U; Keski-Vakkuri, E; Corander, J; Hæggström, E

    2017-06-19

    The control of human body sway by the central nervous system, muscles, and conscious brain is of interest since body sway carries information about the physiological status of a person. Several models have been proposed to describe body sway in an upright standing position; however, due to the statistical intractability of the more realistic models, no formal parameter inference has previously been conducted, and the expressive power of such models for real human subjects remains unknown. Using the latest advances in Bayesian statistical inference for intractable models, we fitted a nonlinear control model to posturographic measurements, and we showed that it can accurately predict the sway characteristics of both simulated and real subjects. Our method provides a full statistical characterization of the uncertainty related to all model parameters as quantified by posterior probability density functions, which is useful for comparisons across subjects and test settings. The ability to infer intractable control models from sensor data opens new possibilities for monitoring and predicting body status in health applications.

  6. Research participant compensation: A matter of statistical inference as well as ethics.

    PubMed

    Swanson, David M; Betensky, Rebecca A

    2015-11-01

    The ethics of compensation of research subjects for participation in clinical trials has been debated for years. One ethical issue of concern is variation among subjects in the level of compensation for identical treatments. Surprisingly, the impact of variation on the statistical inferences made from trial results has not been examined. We seek to identify how variation in compensation may influence any existing dependent censoring in clinical trials, thereby also influencing inference about the survival curve, hazard ratio, or other measures of treatment efficacy. In simulation studies, we consider a model for how compensation structure may influence the censoring model. Under existing dependent censoring, we estimate survival curves under different compensation structures and observe how these structures induce variability in the estimates. We show through this model that if the compensation structure affects the censoring model and dependent censoring is present, then variation in that structure induces variation in the estimates and affects the accuracy of estimation and inference on treatment efficacy. From the perspectives of both ethics and statistical inference, standardization and transparency in the compensation of participants in clinical trials are warranted. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Pre-Service Mathematics Teachers' Use of Probability Models in Making Informal Inferences about a Chance Game

    ERIC Educational Resources Information Center

    Kazak, Sibel; Pratt, Dave

    2017-01-01

    This study considers probability models as tools for both making informal statistical inferences and building stronger conceptual connections between data and chance topics in teaching statistics. In this paper, we aim to explore pre-service mathematics teachers' use of probability models for a chance game, where the sum of two dice matters in…

  8. Phylogeography Takes a Relaxed Random Walk in Continuous Space and Time

    PubMed Central

    Lemey, Philippe; Rambaut, Andrew; Welch, John J.; Suchard, Marc A.

    2010-01-01

    Research aimed at understanding the geographic context of evolutionary histories is burgeoning across biological disciplines. Recent endeavors attempt to interpret contemporaneous genetic variation in the light of increasingly detailed geographical and environmental observations. Such interest has promoted the development of phylogeographic inference techniques that explicitly aim to integrate such heterogeneous data. One promising development involves reconstructing phylogeographic history on a continuous landscape. Here, we present a Bayesian statistical approach to infer continuous phylogeographic diffusion using random walk models while simultaneously reconstructing the evolutionary history in time from molecular sequence data. Moreover, by accommodating branch-specific variation in dispersal rates, we relax the most restrictive assumption of the standard Brownian diffusion process and demonstrate increased statistical efficiency in spatial reconstructions of overdispersed random walks by analyzing both simulated and real viral genetic data. We further illustrate how drawing inference about summary statistics from a fully specified stochastic process over both sequence evolution and spatial movement reveals important characteristics of a rabies epidemic. Together with recent advances in discrete phylogeographic inference, the continuous model developments furnish a flexible statistical framework for biogeographical reconstructions that is easily expanded upon to accommodate various landscape genetic features. PMID:20203288

  9. Variation in reaction norms: Statistical considerations and biological interpretation.

    PubMed

    Morrissey, Michael B; Liefting, Maartje

    2016-09-01

    Analysis of reaction norms, the functions by which the phenotype produced by a given genotype depends on the environment, is critical to studying many aspects of phenotypic evolution. Different techniques are available for quantifying different aspects of reaction norm variation. We examine what biological inferences can be drawn from some of the more readily applicable analyses for studying reaction norms. We adopt a strongly biologically motivated view, but draw on statistical theory to highlight strengths and drawbacks of different techniques. In particular, consideration of some formal statistical theory leads to revision of some recently, and forcefully, advocated opinions on reaction norm analysis. We clarify what simple analysis of the slope between mean phenotype in two environments can tell us about reaction norms, explore the conditions under which polynomial regression can provide robust inferences about reaction norm shape, and explore how different existing approaches may be used to draw inferences about variation in reaction norm shape. We show how mixed model-based approaches can provide more robust inferences than more commonly used multistep statistical approaches, and derive new metrics of the relative importance of variation in reaction norm intercepts, slopes, and curvatures. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.
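
    As a concrete example of the polynomial-regression route discussed above, a quadratic fit recovers an intercept, slope, and curvature for a single genotype's reaction norm (the environment and phenotype values below are simulated, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(12)

# One genotype's phenotype measured along a simulated environmental gradient.
env = np.linspace(-2, 2, 40)
pheno = 1.0 + 0.8 * env - 0.3 * env ** 2 + rng.normal(0, 0.3, env.size)

# Quadratic reaction norm: polyfit returns [curvature, slope, intercept].
curv, slope, intercept = np.polyfit(env, pheno, deg=2)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, curvature={curv:.2f}")
```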

  10. Reflections on Heckman and Pinto’s Causal Analysis After Haavelmo

    DTIC Science & Technology

    2013-11-01

    Econometric Analysis , Cambridge University Press, 477–490, 1995. Halpern, J. (1998). Axiomatizing causal reasoning. In Uncertainty in Artificial...Models, Structural Models and Econometric Policy Evaluation. Elsevier B.V., Amsterdam, 4779–4874. Heckman, J. J. (1979). Sample selection bias as a...Reflections on Heckman and Pinto’s “Causal Analysis After Haavelmo” Judea Pearl University of California, Los Angeles Computer Science Department Los

  11. The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics. NBER Working Paper No. 15794

    ERIC Educational Resources Information Center

    Angrist, Joshua; Pischke, Jorn-Steffen

    2010-01-01

    This essay reviews progress in empirical economics since Leamer's (1983) critique. Leamer highlighted the benefits of sensitivity analysis, a procedure in which researchers show how their results change with changes in specification or functional form. Sensitivity analysis has had a salutary but not a revolutionary effect on econometric practice.…

  12. The Role of Probability-Based Inference in an Intelligent Tutoring System.

    ERIC Educational Resources Information Center

    Mislevy, Robert J.; Gitomer, Drew H.

    Probability-based inference in complex networks of interdependent variables is an active topic in statistical research, spurred by such diverse applications as forecasting, pedigree analysis, troubleshooting, and medical diagnosis. This paper concerns the role of Bayesian inference networks for updating student models in intelligent tutoring…

  13. Boosting Bayesian parameter inference of stochastic differential equation models with methods from statistical physics

    NASA Astrophysics Data System (ADS)

    Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

    2016-04-01

    Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and is almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to the most sophisticated methods that have been developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automated manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
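
    The toy model described, a linear reservoir whose noise standard deviation scales linearly with the stored volume, can be simulated with a basic Euler-Maruyama scheme. The sketch below uses invented parameter values and only illustrates the forward model, not the Hamiltonian Monte Carlo inference:

```python
import numpy as np

rng = np.random.default_rng(6)

# Linear reservoir SDE: dV = (r - V / tau) dt + sigma * V dW  (invented values)
r, tau, sigma = 1.0, 5.0, 0.15
dt, n_steps = 0.01, 20000

V = np.empty(n_steps)
V[0] = r * tau                          # start at the deterministic steady state
for t in range(1, n_steps):
    drift = r - V[t - 1] / tau
    noise_sd = sigma * V[t - 1]         # noise scales linearly with the state
    V[t] = V[t - 1] + drift * dt + noise_sd * np.sqrt(dt) * rng.normal()

# Even with constant input the stationary distribution is right-skewed.
skew = ((V - V.mean()) ** 3).mean() / V.std() ** 3
print(f"mean {V.mean():.2f}, skewness {skew:.2f}")
```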

  14. An inferentialist perspective on the coordination of actions and reasons involved in making a statistical inference

    NASA Astrophysics Data System (ADS)

    Bakker, Arthur; Ben-Zvi, Dani; Makar, Katie

    2017-12-01

    To understand how statistical and other types of reasoning are coordinated with actions to reduce uncertainty, we conducted a case study in vocational education that involved statistical hypothesis testing. We analyzed an intern's research project in a hospital laboratory in which reducing uncertainties was crucial to make a valid statistical inference. In his project, the intern, Sam, investigated whether patients' blood could be sent through pneumatic post without influencing the measurement of particular blood components. We asked, in the process of making a statistical inference, how are reasons and actions coordinated to reduce uncertainty? For the analysis, we used the semantic theory of inferentialism, specifically, the concept of webs of reasons and actions—complexes of interconnected reasons for facts and actions; these reasons include premises and conclusions, inferential relations, implications, motives for action, and utility of tools for specific purposes in a particular context. Analysis of interviews with Sam, his supervisor and teacher as well as video data of Sam in the classroom showed that many of Sam's actions aimed to reduce variability, rule out errors, and thus reduce uncertainties so as to arrive at a valid inference. Interestingly, the decisive factor was not the outcome of a t test but of the reference change value, a clinical chemical measure of analytic and biological variability. With insights from this case study, we expect that students can be better supported in connecting statistics with context and in dealing with uncertainty.

  15. Forecasting space weather: Can new econometric methods improve accuracy?

    NASA Astrophysics Data System (ADS)

    Reikard, Gordon

    2011-06-01

    Space weather forecasts are currently used in areas ranging from navigation and communication to electric power system operations. The relevant forecast horizons can range from as little as 24 h to several days. This paper analyzes the predictability of two major space weather measures using new time series methods, many of them derived from econometrics. The data sets are the Ap geomagnetic index and the solar radio flux at 10.7 cm. The methods tested include nonlinear regressions, neural networks, frequency domain algorithms, GARCH models (which utilize the residual variance), state transition models, and models that combine elements of several techniques. While combined models are complex, they can be programmed using modern statistical software. The data frequency is daily, and forecasting experiments are run over horizons ranging from 1 to 7 days. Two major conclusions stand out. First, the frequency domain method forecasts the Ap index more accurately than any time domain model, including both regressions and neural networks. This finding is very robust, and holds for all forecast horizons. Combining the frequency domain method with other techniques yields a further small improvement in accuracy. Second, the neural network forecasts the solar flux more accurately than any other method, although at short horizons (2 days or less) the regression and the net yield similar results. The neural net does best when it includes measures of the long-term component in the data.
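
    One of the model classes tested, an autoregressive mean with GARCH residual variance, can be sketched with the third-party arch package (assumed installed); the series below is simulated, not the Ap index:

```python
import numpy as np
from arch import arch_model  # third-party: pip install arch

rng = np.random.default_rng(7)

# Simulated daily series: AR(1) mean with GARCH(1,1) innovations.
n = 2000
y, eps, sig2 = np.zeros(n), np.zeros(n), 1.0
for t in range(1, n):
    sig2 = 0.1 + 0.1 * eps[t - 1] ** 2 + 0.8 * sig2   # conditional variance
    eps[t] = np.sqrt(sig2) * rng.normal()
    y[t] = 0.5 * y[t - 1] + eps[t]

# Fit AR(1)-GARCH(1,1) and produce 1- to 7-day-ahead mean forecasts.
res = arch_model(y, mean="AR", lags=1, vol="Garch", p=1, q=1).fit(disp="off")
print(res.params.round(3))
print(res.forecast(horizon=7).mean.iloc[-1])
```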

  16. PyClone: statistical inference of clonal population structure in cancer.

    PubMed

    Roth, Andrew; Khattra, Jaswinder; Yap, Damian; Wan, Adrian; Laks, Emma; Biele, Justina; Ha, Gavin; Aparicio, Samuel; Bouchard-Côté, Alexandre; Shah, Sohrab P

    2014-04-01

    We introduce PyClone, a statistical model for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy-number changes and normal-cell contamination. Single-cell sequencing validation demonstrates PyClone's accuracy.

  17. Statistical Signal Models and Algorithms for Image Analysis

    DTIC Science & Technology

    1984-10-25

    In this report, two-dimensional stochastic linear models are used in developing algorithms for image analysis such as classification, segmentation, and object detection in images characterized by textured backgrounds. These models generate two-dimensional random processes as outputs to which statistical inference procedures can naturally be applied. A common thread throughout our algorithms is the interpretation of the inference procedures in terms of linear prediction

  18. Fair Inference on Outcomes

    PubMed Central

    Nabi, Razieh; Shpitser, Ilya

    2017-01-01

    In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are “sensitive,” in the sense of having potential of creating discrimination. In this paper, we argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.

  19. P values are only an index to evidence: 20th- vs. 21st-century statistical science.

    PubMed

    Burnham, K P; Anderson, D R

    2014-03-01

    Early statistical methods focused on pre-data probability statements (i.e., data as random variables) such as P values; these are not really inferences, nor are P values evidential. Statistical science clung to these principles throughout much of the 20th century as a wide variety of methods were developed for special cases. Looking back, it is clear that the underlying paradigm (i.e., testing and P values) was weak. As Kuhn (1970) suggests, new paradigms have taken the place of earlier ones: this is a goal of good science. New methods have been developed and older methods extended, and these allow proper measures of strength of evidence and multimodel inference. It is time to move forward with sound theory and practice for the difficult practical problems that lie ahead. Given data, the useful foundation shifts to post-data probability statements such as model probabilities (Akaike weights) or related quantities such as odds ratios and likelihood intervals. These new methods allow formal inference from multiple models in the a priori set. These quantities are properly evidential. The past century was aimed at finding the "best" model and making inferences from it. The goal in the 21st century is to base inference on all the models weighted by their model probabilities (model averaging). Estimates of precision can include model selection uncertainty leading to variances conditional on the model set. The 21st century will be about the quantification of information, proper measures of evidence, and multimodel inference. Nelder (1999:261) concludes, "The most important task before us in developing statistical science is to demolish the P-value culture, which has taken root to a frightening extent in many areas of both pure and applied science and technology".
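
    The post-data quantities advocated here are simple to compute. Given AIC values for a candidate model set, Akaike weights rescale the AIC differences into model probabilities that can then drive model averaging (the numbers below are invented for illustration):

```python
import numpy as np

# AIC values and point estimates for a hypothetical candidate set (invented).
aic = np.array([102.3, 100.1, 105.8, 100.9])
estimates = np.array([1.8, 2.1, 1.2, 2.0])

# Akaike weights: rescaled AIC differences act as model probabilities.
delta = aic - aic.min()
weights = np.exp(-0.5 * delta)
weights /= weights.sum()
print("weights:", weights.round(3))

# Multimodel inference: average estimates over the whole set, not the "best".
print("model-averaged estimate:", np.dot(weights, estimates).round(2))
```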

  20. Statistical inference of the generation probability of T-cell receptors from sequence repertoires.

    PubMed

    Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G

    2012-10-02

    Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.

  1. Meta-Analysis of Land Use / Land Cover Change Factors in the Conterminous US and Prediction of Potential Working Timberlands in the US South from FIA Inventory Plots and NLCD Cover Maps

    NASA Astrophysics Data System (ADS)

    Jeuck, James A.

    This dissertation consists of research projects related to forest land use / land cover (LULC): (1) factors predicting LULC change and (2) methodology to predict particular forest use, or "potential working timberland" (PWT), from current forms of land data. The first project resulted in a published paper, a meta-analysis of 64 econometric models from 47 studies predicting forest land use changes. The response variables, representing some form of forest land change, were organized into four groups: forest conversion to agriculture (F2A), forestland to development (F2D), forestland to non-forested (F2NF), and undeveloped (including forestland) to developed (U2D) land. Over 250 independent econometric variables were identified, from 21 F2A models, 21 F2D models, 12 F2NF models, and 10 U2D models. These variables were organized into a hierarchy of 119 independent variable groups, 15 categories, and 4 econometric drivers suitable for conducting simple vote count statistics. Vote counts were summarized at the independent variable group level and formed into ratios estimating the predictive success of each variable group. Two ratio estimates were developed based on (1) the proportion of times independent variables achieved statistical significance (p ≤ 0.10), and (2) the proportion of times independent variables met the original researchers' expectations. In F2D models, popular independent variables such as population, income, and urban proximity often achieved statistical significance. In F2A models, popular independent variables such as forest and agricultural rents and costs, governmental programs, and site quality often achieved statistical significance. In U2D models, successful independent variables included urban rents and costs, zoning issues concerning forestland loss, site quality, urban proximity, population, and income. In F2NF models, high-success variables were agricultural rents, site quality, population, and income. This meta-analysis provides insight into the general success of econometric independent variables for future forest use or cover change research. The second part of this dissertation developed a method for predicting area estimates and the spatial distribution of PWT in the US South. This technique determined land use from USFS Forest Inventory and Analysis (FIA) plots and land cover from the National Land Cover Database (NLCD). Three dependent variable forms (DV Forms) were derived from the FIA data: DV Form 1, timberland, other; DV Form 2, short timberland, tall timberland, agriculture, other; and DV Form 3, short hardwood (HW) timberland, tall HW timberland, short softwood (SW) timberland, tall SW timberland, agriculture, other. The prediction accuracy of each DV Form was investigated using both random forest and logistic regression model specifications and data optimization techniques. Model verification employing a "leave-group-out" Monte Carlo simulation determined the selection of a stratified version of the random forest model using one-year NLCD observations, with an overall accuracy of 0.53-0.94. The lower end of the accuracy range occurred when predictions were made from the aggregated NLCD land cover class "grass_shrub". The selected model specification was run using the 2011 NLCD and the other predictor variables to produce three levels of timberland prediction and probability maps for the US South. Spatial masks removed areas unlikely to be working forests (protected and urbanized lands), resulting in PWT maps.
The areas of the resulting maps compared well with USFS area estimates, and the masked PWT maps showed an 8-11% reduction from the USFS timberland estimate for the US South, depending on the DV Form. Change analysis of the 2011 NLCD to PWT showed that (1) the majority of the short timberland came from NLCD grass_shrub; (2) the majority of NLCD grass_shrub was predicted into tall timberland; and (3) NLCD grass_shrub was more strongly associated with timberland in the Coastal Plain. The resulting map products provide practical analytical tools for those interested in studying the area and distribution of PWT in the US South.
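
    The classification step can be sketched with scikit-learn (assumed available); the features, labels, and accuracy below are placeholders, since the dissertation's FIA/NLCD predictors, stratification, and leave-group-out verification are far richer:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)

# Placeholder plot-level table (the real predictors come from FIA/NLCD).
n = 2000
X = np.column_stack([
    rng.uniform(0, 1, n),   # stand-in: forest fraction around the plot
    rng.uniform(0, 1, n),   # stand-in: grass_shrub fraction
    rng.normal(0, 1, n),    # stand-in: site-quality index
])
y = (X[:, 0] + 0.5 * X[:, 1] + 0.2 * rng.normal(size=n) > 0.8).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("holdout accuracy:", round(rf.score(X_te, y_te), 3))
# predict_proba supplies the per-class probabilities behind probability maps.
print(rf.predict_proba(X_te[:3]).round(2))
```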

  2. Statistical Inference and Reverse Engineering of Gene Regulatory Networks from Observational Expression Data

    PubMed Central

    Emmert-Streib, Frank; Glazko, Galina V.; Altay, Gökmen; de Matos Simoes, Ricardo

    2012-01-01

    In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms. PMID:22408642

  3. Information Entropy Production of Maximum Entropy Markov Chains from Spike Trains

    NASA Astrophysics Data System (ADS)

    Cofré, Rodrigo; Maldonado, Cesar

    2018-01-01

    We consider the maximum entropy Markov chain inference approach to characterize the collective statistics of neuronal spike trains, focusing on the statistical properties of the inferred model. We review large deviations techniques useful in this context to describe properties of accuracy and convergence in terms of sampling size. We use these results to study the statistical fluctuation of correlations, distinguishability and irreversibility of maximum entropy Markov chains. We illustrate these applications using simple examples where the large deviation rate function is explicitly obtained for maximum entropy models of relevance in this field.

  4. The Model Analyst’s Toolkit: Scientific Model Development, Analysis, and Validation

    DTIC Science & Technology

    2015-02-20

    being integrated within MAT, including Granger causality. Granger causality tests whether a data series helps when predicting future values of another...relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, 424-438. Granger, C. W. (1980). Testing ... testing dataset. This effort is described in Section 3.2. 3.1. Improvements in Granger Causality User Interface Various metrics of causality are
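
    Granger causality testing of the kind being integrated here is available off the shelf. A minimal sketch using statsmodels (the two series are simulated so that x leads y by one step; nothing here reflects the MAT implementation):

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(9)

# Simulated pair: x drives y with a one-step lag.
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * x[t - 1] + 0.3 * y[t - 1] + rng.normal(scale=0.5)

# The test asks whether the second column helps predict the first.
results = grangercausalitytests(np.column_stack([y, x]), maxlag=3, verbose=False)
for lag, (tests, _) in results.items():
    print(f"lag {lag}: ssr F-test p-value = {tests['ssr_ftest'][1]:.3g}")
```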

  5. Geographical Network Analysis and Spatial Econometrics as Tools to Enhance Our Understanding of Student Migration Patterns and Benefits in the U.S. Higher Education Network

    ERIC Educational Resources Information Center

    González Canché, Manuel S.

    2018-01-01

    This study measures the extent to which student outmigration outside the 4-year sector takes place and posits that the benefits from attracting non-resident students exist regardless of sector of enrollment. The study also provides empirical evidence about the relevance of employing geographical network analysis (GNA) and spatial econometrics in…

  6. A spatial econometric analysis of land-use change with land cover trends data: an application to the Pacific Northwest

    Treesearch

    David J. Lewis; Ralph J. Alig

    2014-01-01

    This paper develops a plot-level spatial econometric land-use model and estimates it with U.S. Geological Survey Land Cover Trends (LCT) geographic information system panel data for the western halves of the states of Oregon and Washington. The discrete-choice framework we use models plot-scale choices of the three dominant land uses in this region: forest, agriculture...

  7. Crime Pattern Analysis: A Spatial Frequent Pattern Mining Approach

    DTIC Science & Technology

    2012-05-10

    econometrics. A companion to theoretical econometrics, pages 310-330, 1988. [5] L. Anselin, J. Cohen, D. Cook, W. Gorr, and G. Tita . Spatial analyses...52] G. Mohler, M. Short, P. Brantingham, F. Schoenberg, and G. Tita . Self-exciting point process modeling of crime. Journal of the American...Systems, 9:462, 2010. [69] M. Short, P. Brantingham, A. Bertozzi, and G. Tita . Dissipation and displacement of hotspots in reaction-diffusion models

  8. An Applied Physicist Does Econometrics

    NASA Astrophysics Data System (ADS)

    Taff, L. G.

    2010-02-01

    The biggest problem facing those attempting to understand econometric data via modeling is that economics has no F = ma. Without a theoretical underpinning, econometricians have no way to build a good model to fit observations to. Physicists do, and when F = ma failed, we knew it. Still desiring to comprehend econometric data, applied economists turn to mis-applying probability theory---especially with regard to the assumptions concerning random errors---and choosing extremely simplistic analytical formulations of inter-relationships. This introduces model bias to an unknown degree. An applied physicist, used to having to match observations to a numerical or analytical model with a firm theoretical basis, modify the model, re-perform the analysis, and then know why, and when, to delete ``outliers'', is at a considerable advantage when quantitatively analyzing econometric data. I treat two cases. One is to determine the household density distribution of total assets, annual income, age, level of education, race, and marital status. Each of these ``independent'' variables is highly correlated with every other, but only current annual income and level of education follow a linear relationship. The other is to discover the functional dependence of total assets on the distribution of assets: total assets have an amazingly tight power law dependence on a quadratic function of portfolio composition. Who knew?
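
    Computationally, a power-law finding of this kind reduces to a linear fit in log-log coordinates. A hedged sketch with invented data (the actual quadratic composition index from the talk is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(13)

# Invented households: assets ~ C * z**b for a composition index z.
z = rng.uniform(0.5, 5.0, 500)
assets = 10.0 * z ** 1.8 * np.exp(rng.normal(0, 0.1, 500))

# A power law is linear in logs: log(assets) = log(C) + b * log(z).
b, logC = np.polyfit(np.log(z), np.log(assets), deg=1)
print(f"estimated exponent b = {b:.2f}, scale C = {np.exp(logC):.1f}")
```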

  9. Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation.

    PubMed

    Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi

    2015-07-01

    Doubly truncated data consist of samples whose observed values fall between the right- and left-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE) that is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.

  10. Structured statistical models of inductive reasoning.

    PubMed

    Kemp, Charles; Tenenbaum, Joshua B

    2009-01-01

    Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet both goals and describes [corrected] 4 applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the 4 models are defined over different kinds of structures that capture different relationships between the categories in a domain. The framework therefore shows how statistical inference can operate over structured background knowledge, and the authors argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.

  11. Inference on network statistics by restricting to the network space: applications to sexual history data.

    PubMed

    Goyal, Ravi; De Gruttola, Victor

    2018-01-30

    Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as the mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among network statistics, and knowledge of these relationships can aid in addressing the concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias, compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd.

  12. Factors Influencing Energy Use and Carbon Emissions in China

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fisher-Vanden, Karen; Jefferson, Gary

    This research project was designed to fill a critical void in our understanding of the state of energy research and innovation in China. It seeks to provide a comprehensive review and accounting of the various elements of the Chinese government and non-governmental sectors (commercial, university, research institutes) that are engaged in energy-related R&D and various aspects of energy innovation, including specific programs and projects designed to promote renewable energy innovation and energy conservation. The project provides an interrelated descriptive, statistical, and econometric account of China's overall energy innovation activities and capabilities, spanning the full economy with a particular focus on the dynamic industrial sector.

  15. The Hog Cycle of Law Professors: An Econometric Time Series Analysis of the Entry-Level Job Market in Legal Academia.

    PubMed

    Engel, Christoph; Hamann, Hanjo

    2016-01-01

    The (German) market for law professors fulfils the conditions for a hog cycle: In the short run, supply cannot be extended or limited; future law professors must be hired soon after they first present themselves, or leave the market; demand is inelastic. Using a comprehensive German dataset, we show that the number of market entries today is negatively correlated with the number of market entries eight years ago. This suggests short-sighted behavior of young scholars at the time when they decide to prepare for the market. Using our statistical model, we make out-of-sample predictions for the German academic market in law until 2020.
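
    The lag-8 dependence the authors test is, in essence, a single autoregression. The sketch below simulates a hypothetical series of yearly market entries with hog-cycle feedback and regresses entries on their eighth lag; the series, coefficients, and noise level are invented for illustration.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)

    # Hypothetical yearly counts of market entries with a negative
    # dependence on entries eight years earlier (a hog-cycle signature).
    T, lag = 60, 8
    x = np.zeros(T)
    x[:lag] = rng.poisson(30, size=lag)
    for t in range(lag, T):
        x[t] = max(0.0, 30 - 0.5 * (x[t - lag] - 30) + rng.normal(0, 3))

    y, ylag = x[lag:], x[:-lag]
    fit = sm.OLS(y, sm.add_constant(ylag)).fit()
    print("lag-8 slope:", fit.params[1])      # negative, as in a hog cycle
    print("p-value:    ", fit.pvalues[1])
    ```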

  16. Statistical comparison of a hybrid approach with approximate and exact inference models for Fusion 2+

    NASA Astrophysics Data System (ADS)

    Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew

    2007-04-01

    One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian networks, and possibility theory, in the form of fuzzy logic systems, has recently been introduced to provide a rigorous framework for high-level inference. Previous research developed the theoretical basis and benefits of the hybrid approach, but a concrete experimental comparison of the hybrid framework with traditional fusion methods, which would demonstrate and quantify this benefit, has been lacking. The goal of this research, therefore, is to statistically compare the accuracy and performance of the hybrid approach with pure Bayesian systems, pure fuzzy systems, and an inexact Bayesian system approximated using particle filtering. To accomplish this task, domain-specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo simulation, against situational ground truth to measure accuracy and fidelity. A rigorous statistical analysis of the performance results will then quantify the benefit of hybrid inference relative to other fusion tools.

  17. Statistically optimal perception and learning: from behavior to neural representations

    PubMed Central

    Fiser, József; Berkes, Pietro; Orbán, Gergő; Lengyel, Máté

    2010-01-01

    Human perception has recently been characterized as statistical inference based on noisy and ambiguous sensory inputs. Moreover, suitable neural representations of uncertainty have been identified that could underlie such probabilistic computations. In this review, we argue that learning an internal model of the sensory environment is another key aspect of the same statistical inference procedure and thus perception and learning need to be treated jointly. We review evidence for statistically optimal learning in humans and animals, and reevaluate possible neural representations of uncertainty based on their potential to support statistically optimal learning. We propose that spontaneous activity can have a functional role in such representations leading to a new, sampling-based, framework of how the cortex represents information and uncertainty. PMID:20153683

  18. On the analysis of very small samples of Gaussian repeated measurements: an alternative approach.

    PubMed

    Westgate, Philip M; Burchett, Woodrow W

    2017-03-15

    The analysis of very small samples of Gaussian repeated measurements can be challenging. First, due to a very small number of independent subjects contributing outcomes over time, statistical power can be quite small. Second, nuisance covariance parameters must be appropriately accounted for in the analysis in order to maintain the nominal test size. However, available statistical strategies that ensure valid statistical inference may lack power, whereas more powerful methods may have the potential for inflated test sizes. Therefore, we explore an alternative approach to the analysis of very small samples of Gaussian repeated measurements, with the goal of maintaining valid inference while also improving statistical power relative to other valid methods. This approach uses generalized estimating equations with a bias-corrected empirical covariance matrix that accounts for all small-sample aspects of nuisance correlation parameter estimation in order to maintain valid inference. Furthermore, the approach utilizes correlation selection strategies with the goal of choosing the working structure that will result in the greatest power. In our study, we show that when accurate modeling of the nuisance correlation structure impacts the efficiency of regression parameter estimation, this method can improve power relative to existing methods that yield valid inference.

  19. Statistical inference for noisy nonlinear ecological dynamic systems.

    PubMed

    Wood, Simon N

    2010-08-26

    Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
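
    The synthetic-likelihood recipe itself is compact: simulate replicate series at a candidate parameter, reduce each to summary statistics, fit a multivariate Gaussian to the simulated statistics, and score the observed summaries under it. The sketch below uses a toy noisy map and three generic summaries rather than the blowfly models of the paper.

    ```python
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(3)

    def simulate(theta, T=100):
        """Toy noisy nonlinear map; stands in for an ecological simulator."""
        x = np.empty(T)
        x[0] = 1.0
        for t in range(1, T):
            x[t] = theta * x[t - 1] * np.exp(-x[t - 1]) + rng.normal(0, 0.05)
        return x

    def summaries(x):
        """Phase-insensitive summaries: mean, spread, local dynamics."""
        return np.array([x.mean(), x.std(), np.corrcoef(x[:-1], x[1:])[0, 1]])

    def log_synthetic_likelihood(theta, s_obs, n_rep=200):
        S = np.array([summaries(simulate(theta)) for _ in range(n_rep)])
        mu, cov = S.mean(axis=0), np.cov(S, rowvar=False)
        return multivariate_normal(mu, cov, allow_singular=True).logpdf(s_obs)

    s_obs = summaries(simulate(3.0))          # pretend these are the data
    for theta in [2.0, 2.5, 3.0, 3.5, 4.0]:
        print(theta, round(log_synthetic_likelihood(theta, s_obs), 2))
    ```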

  20. Probabilistic Graphical Model Representation in Phylogenetics

    PubMed Central

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-01-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.] PMID:24951559

  1. Protein and gene model inference based on statistical modeling in k-partite graphs.

    PubMed

    Gerster, Sarah; Qeli, Ermir; Ahrens, Christian H; Bühlmann, Peter

    2010-07-06

    One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.

  2. minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information.

    PubMed

    Meyer, Patrick E; Lafitte, Frédéric; Bontempi, Gianluca

    2008-10-29

    This paper presents the R/Bioconductor package minet (version 1.1.6), which provides a set of functions to infer mutual information networks from a dataset. Once fed with a microarray dataset, the package returns a network where nodes denote genes, edges model statistical dependencies between genes, and the weight of an edge quantifies the statistical evidence of a specific (e.g., transcriptional) gene-to-gene interaction. Four different entropy estimators are made available in the package minet (empirical, Miller-Madow, Schurmann-Grassberger and shrink) as well as four different inference methods, namely relevance networks, ARACNE, CLR and MRNET. The package also integrates accuracy assessment tools, like F-scores, PR-curves and ROC-curves, in order to compare the inferred network with a reference one. The package minet provides a series of tools for inferring transcriptional networks from microarray data. It is freely available from the Comprehensive R Archive Network (CRAN) as well as from the Bioconductor website.
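
    minet itself is an R package; purely as an illustration of the relevance-network step it implements (pairwise mutual information followed by edge thresholding), here is a small Python analogue. The discretisation into 8 bins and the 90th-percentile cutoff are arbitrary choices and do not mirror the minet API.

    ```python
    import numpy as np
    from sklearn.metrics import mutual_info_score

    rng = np.random.default_rng(4)
    n_samples, n_genes = 200, 6

    # Synthetic "expression" matrix with one regulatory pair (gene0, gene1).
    X = rng.standard_normal((n_samples, n_genes))
    X[:, 1] = X[:, 0] + 0.3 * rng.standard_normal(n_samples)

    def mi(a, b, bins=8):
        """Empirical mutual information after equal-width discretisation."""
        da = np.digitize(a, np.histogram_bin_edges(a, bins))
        db = np.digitize(b, np.histogram_bin_edges(b, bins))
        return mutual_info_score(da, db)

    W = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            W[i, j] = W[j, i] = mi(X[:, i], X[:, j])

    threshold = np.percentile(W[W > 0], 90)   # keep the strongest edges
    edges = np.argwhere(np.triu(W > threshold, k=1))
    print(edges)                              # should recover the (0, 1) pair
    ```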

  3. Computational statistics using the Bayesian Inference Engine

    NASA Astrophysics Data System (ADS)

    Weinberg, Martin D.

    2013-09-01

    This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible, object-oriented framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
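
    The sampling machinery at the core of such a package can be suggested by a minimal random-walk Metropolis sketch targeting a bimodal posterior, the kind of target the BIE's tempered schemes are built for (plain Metropolis, shown here for brevity, can mix poorly on harder versions of this problem). The target density, step size, and iteration count are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def log_post(theta):
        """Toy bimodal posterior: two Gaussian bumps at -2 and +2."""
        return np.logaddexp(-0.5 * ((theta - 2) / 0.5) ** 2,
                            -0.5 * ((theta + 2) / 0.5) ** 2)

    def metropolis(n_iter=20000, step=1.5):
        theta, chain = 0.0, np.empty(n_iter)
        lp = log_post(theta)
        for i in range(n_iter):
            prop = theta + step * rng.standard_normal()
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
                theta, lp = prop, lp_prop
            chain[i] = theta
        return chain

    chain = metropolis()
    # ~0.5 only if the chain mixes across both modes; tempering helps here.
    print("fraction of samples in the positive mode:", (chain > 0).mean())
    ```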

  4. A Theory of Bayesian Data Analysis

    DTIC Science & Technology

    1989-10-10

    and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds., Academic Press, 245-268. Edwards, W. ... Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217. Hill, Bruce M. (1980c), Review of Specification Searches, by E. ... also Hill (1970a, 1975a) for earlier thoughts on the subject with regard to tests of significance, and Smith (1986). The Bayesian theory of tests of ...

  5. Econometric comparisons of liquid rocket engines for dual-fuel advanced earth-to-orbit shuttles

    NASA Technical Reports Server (NTRS)

    Martin, J. A.

    1978-01-01

    Econometric analyses of advanced Earth-to-orbit vehicles indicate that there are economic benefits from development of new vehicles beyond the space shuttle as traffic increases. Vehicle studies indicate the advantage of the dual-fuel propulsion in single-stage vehicles. This paper shows the economic effect of incorporating dual-fuel propulsion in advanced vehicles. Several dual-fuel propulsion systems are compared to a baseline hydrogen and oxygen system.

  6. Time Series Modeling of Army Mission Command Communication Networks: An Event-Driven Analysis

    DTIC Science & Technology

    2013-06-01

    Lehmann, D. R. (1984). How advertising affects sales: Meta-analysis of econometric results. Journal of Marketing Research, 21, 65-74. Barabási, A. L. ... 317-357. Leone, R. P. (1983). Modeling sales-advertising relationships: An integrated time series-econometric approach. Journal of Marketing Research, 20, 291-295. McGrath, J. E., & Kravitz, D. A. (1982). Group research. Annual Review of Psychology, 33, 195-230. Monge, P. R., & Contractor ...

  7. Teach a Confidence Interval for the Median in the First Statistics Course

    ERIC Educational Resources Information Center

    Howington, Eric B.

    2017-01-01

    Few introductory statistics courses consider statistical inference for the median. This article argues in favour of adding a confidence interval for the median to the first statistics course. Several methods suitable for introductory statistics students are identified and briefly reviewed.

  8. Pointwise probability reinforcements for robust statistical inference.

    PubMed

    Frénay, Benoît; Verleysen, Michel

    2014-02-01

    Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR, and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation.
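
    One plausible reading of the reinforcement idea, reduced to Gaussian location estimation: each observation's likelihood term p(y_i) is boosted by a nonnegative reinforcement r_i, and a penalty lambda * sum(r_i) controls the total boost, so the optimizer spends reinforcement on abnormally frequent points instead of letting them bias the estimate. The data, lambda, and optimizer below are illustrative guesses, not the paper's exact formulation.

    ```python
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    rng = np.random.default_rng(6)
    y = np.concatenate([rng.normal(0.0, 1.0, 45),
                        [8.0, 9.0, 10.0, 11.0, 12.0]])   # planted outliers
    n, lam = len(y), 5.0              # lam controls total reinforcement

    def objective(params):
        mu, r = params[0], params[1:]
        # Reinforced likelihood: each point's density is boosted by r_i >= 0,
        # so outliers can be "explained away" instead of dragging mu upward.
        return -np.sum(np.log(norm.pdf(y, mu, 1.0) + r)) + lam * r.sum()

    x0 = np.concatenate([[y.mean()], np.zeros(n)])
    bounds = [(None, None)] + [(0.0, None)] * n
    res = minimize(objective, x0, bounds=bounds, method="L-BFGS-B")

    print("plain sample mean:", y.mean().round(2))       # pulled up by outliers
    print("PPR estimate of mu:", res.x[0].round(2))      # pulled back toward 0
    print("reinforcement on the 5 outliers:", res.x[-5:].round(2))
    ```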

  9. Probability, statistics, and computational science.

    PubMed

    Beerenwinkel, Niko; Siebourg, Juliane

    2012-01-01

    In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
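
    As a concrete instance of the inference algorithms such a chapter covers, here is the scaled forward recursion that computes the log-likelihood of an observation sequence under a small hidden Markov model; all parameter values are hypothetical.

    ```python
    import numpy as np

    # Two-state HMM with categorical emissions (parameters invented).
    A = np.array([[0.9, 0.1],
                  [0.2, 0.8]])            # state transition matrix
    B = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])       # emission probabilities
    pi = np.array([0.5, 0.5])             # initial state distribution
    obs = [0, 0, 1, 2, 2, 1, 0]

    # Scaled forward algorithm: normalize alpha at each step and
    # accumulate the log of the scaling constants to avoid underflow.
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    alpha /= c
    log_lik = np.log(c)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        alpha /= c
        log_lik += np.log(c)

    print("log P(obs):", log_lik)
    ```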

  10. A Not-So-Fundamental Limitation on Studying Complex Systems with Statistics: Comment on Rabin (2011)

    NASA Astrophysics Data System (ADS)

    Thomas, Drew M.

    2012-12-01

    Although living organisms are affected by many interrelated and unidentified variables, this complexity does not automatically impose a fundamental limitation on statistical inference. Nor need one invoke such complexity as an explanation of the "Truth Wears Off" or "decline" effect; similar "decline" effects occur with far simpler systems studied in physics. Selective reporting and publication bias, and scientists' biases in favor of reporting eye-catching results (in general) or conforming to others' results (in physics) better explain this feature of the "Truth Wears Off" effect than Rabin's suggested limitation on statistical inference.

  11. Safety regulations, firm size, and the risk of accidents in E&P operations on the Gulf of Mexico outer continental shelf

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iledare, O.O.; Pulsipher, A.G.; Baumann, R.H.

    1996-12-31

    The current expanded role of smaller independent oil producers in the OCS has led to concern about the possibility of increased risk of accidents in E&P operations on the Gulf of Mexico OCS. In addition, questions have been posed concerning the effects of the Minerals Management Service's (MMS) safety regulations and inspection program, firm size, and industry practices on the risk of accidents in E&P operations on the Gulf of Mexico OCS. The specific purposes of the study reported in this paper were to ascertain (1) whether any empirical justification exists for the widespread concern that an increase in independents' relative share of E&P operations in the Gulf OCS region will be detrimental to safety, and (2) whether MMS policies and safety programs have reduced the frequency or severity of accidents on the OCS. Our statistical and descriptive analyses of data on accidents from MMS provide no statistical evidence to support the apprehension that an expanded role for independents in E&P activity constitutes any major threat to safety on the OCS. Further, the results of our econometric analysis confirm the expectation that the more effective MMS inspectors are at detecting incidents of noncompliance, the lower the rate of accidents on the OCS, ceteris paribus. In addition, the results indicate that the variability in platform exposure years--the cumulative age of operating platforms--explains a significant portion of the variation in accidents per operating platform in comparison to other factors. That is, the platform aging process provides more opportunity for accidents than any other contributing factor. Our econometric analysis also suggests that, if the other factors contributing to offshore accidents are held constant, the responsiveness of the accident rate to drilling activity is inelastic while its response to production activity levels is elastic.

  12. The Changing Balance: South and North Korean Capabilities for Long-Term Military Competition

    DTIC Science & Technology

    1985-12-01

    econometric model. Ideally, a model should be estimated over one period and then tested over a different period. If one estimates and tests over the ... unprecedented impending shift of political leadership from Kim Il-Sung to his son, Kim Chong-Il. Section III summarizes an aggregative econometric model of the South Korean economy, which we have developed to test the effect on that economy of alternative South Korean military force postures and ...

  13. Bayesian Inference: with ecological applications

    USGS Publications Warehouse

    Link, William A.; Barker, Richard J.

    2010-01-01

    This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference, with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies, as well as students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.

  14. Theory-based Bayesian Models of Inductive Inference

    DTIC Science & Technology

    2010-07-19

    Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom/papers/randscenes.pdf ... (in press). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom

  15. Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.

    PubMed

    Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W

    2018-05-18

    Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.

  16. Inferring action structure and causal relationships in continuous sequences of human action.

    PubMed

    Buchsbaum, Daphna; Griffiths, Thomas L; Plunkett, Dillon; Gopnik, Alison; Baldwin, Dare

    2015-02-01

    In the real world, causal variables do not come pre-identified or occur in isolation, but instead are embedded within a continuous temporal stream of events. A challenge faced by both human learners and machine learning algorithms is identifying subsequences that correspond to the appropriate variables for causal inference. A specific instance of this problem is action segmentation: dividing a sequence of observed behavior into meaningful actions, and determining which of those actions lead to effects in the world. Here we present a Bayesian analysis of how statistical and causal cues to segmentation should optimally be combined, as well as four experiments investigating human action segmentation and causal inference. We find that both people and our model are sensitive to statistical regularities and causal structure in continuous action, and are able to combine these sources of information in order to correctly infer both causal relationships and segmentation boundaries.

  17. Applications of statistics to medical science (1) Fundamental concepts.

    PubMed

    Watanabe, Hiroshi

    2011-01-01

    The conceptual frameworks of statistical tests and statistical inference are discussed, and the epidemiological background of statistics is briefly reviewed. This study is one of a series in which we survey the basics of statistics and practical methods used in medical statistics. Arguments related to actual statistical analysis procedures will be made in subsequent papers.

  18. The role of causal criteria in causal inferences: Bradford Hill's "aspects of association".

    PubMed

    Ward, Andrew C

    2009-06-17

    As noted by Wesley Salmon and many others, causal concepts are ubiquitous in every branch of theoretical science, in the practical disciplines and in everyday life. In the theoretical and practical sciences especially, people often base claims about causal relations on applications of statistical methods to data. However, the source and type of data place important constraints on the choice of statistical methods as well as on the warrant attributed to the causal claims based on the use of such methods. For example, much of the data used by people interested in making causal claims come from non-experimental, observational studies in which random allocations to treatment and control groups are not present. Thus, one of the most important problems in the social and health sciences concerns making justified causal inferences using non-experimental, observational data. In this paper, I examine one method of justifying such inferences that is especially widespread in epidemiology and the health sciences generally - the use of causal criteria. I argue that while the use of causal criteria is not appropriate for either deductive or inductive inferences, they do have an important role to play in inferences to the best explanation. As such, causal criteria, exemplified by what Bradford Hill referred to as "aspects of [statistical] associations", have an indispensable part to play in the goal of making justified causal claims.

  20. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.

    PubMed

    Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi

    2016-05-01

    Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions of the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use a lattice protein (LP) model to benchmark those inverse statistical approaches. We build MSA of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSA. We find that inferred Potts Hamiltonians reproduce many important aspects of 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of the native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models used as protein Hamiltonians for the design of new sequences are able to generate, with high probability, completely new sequences with the desired folds, which is not possible using independent-site models. These are remarkable results, as the effective LP Hamiltonians used to generate the MSA are not simple pairwise models, due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.

  1. Logical reasoning versus information processing in the dual-strategy model of reasoning.

    PubMed

    Markovits, Henry; Brisson, Janie; de Chantal, Pier-Luc

    2017-01-01

    One of the major debates concerning the nature of inferential reasoning is between counterexample-based strategies such as mental model theory and statistical strategies underlying probabilistic models. The dual-strategy model, proposed by Verschueren, Schaeken, & d'Ydewalle (2005a, 2005b), which suggests that people might have access to both kinds of strategy, has been supported by several recent studies. These have shown that statistical reasoners make inferences based on using information about premises in order to generate a likelihood estimate of conclusion probability. However, while results concerning counterexample reasoners are consistent with a counterexample detection model, these results could equally be interpreted as indicating a greater sensitivity to logical form. In order to distinguish these 2 interpretations, in Studies 1 and 2, we presented reasoners with Modus ponens (MP) inferences with statistical information about premise strength and in Studies 3 and 4, naturalistic MP inferences with premises having many disabling conditions. Statistical reasoners accepted the MP inference more often than counterexample reasoners in Studies 1 and 2, while the opposite pattern was observed in Studies 3 and 4. Results show that these strategies must be defined in terms of information processing, with no clear relations to "logical" reasoning. These results have additional implications for the underlying debate about the nature of human reasoning.

  2. Statistical inference for Hardy-Weinberg proportions in the presence of missing genotype information.

    PubMed

    Graffelman, Jan; Sánchez, Milagros; Cook, Samantha; Moreno, Victor

    2013-01-01

    In genetic association studies, tests for Hardy-Weinberg proportions are often employed as a quality control checking procedure. Missing genotypes are typically discarded prior to testing. In this paper we show that inference for Hardy-Weinberg proportions can be biased when missing values are discarded. We propose to use multiple imputation of missing values in order to improve inference for Hardy-Weinberg proportions. For imputation we employ a multinomial logit model that uses information from allele intensities and/or neighbouring markers. Analysis of an empirical data set of single nucleotide polymorphisms possibly related to colon cancer reveals that missing genotypes are not missing completely at random. Deviation from Hardy-Weinberg proportions is mostly due to a lack of heterozygotes. Inbreeding coefficients estimated by multiple imputation of the missings are typically lowered with respect to inbreeding coefficients estimated by discarding the missings. Accounting for missings by multiple imputation qualitatively changed the results of 10 to 17% of the statistical tests performed. Estimates of inbreeding coefficients obtained by multiple imputation showed high correlation with estimates obtained by single imputation using an external reference panel. Our conclusion is that imputation of missing data leads to improved statistical inference for Hardy-Weinberg proportions.
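
    For reference, the complete-case test whose bias motivates the paper is the classical 1-degree-of-freedom chi-square test for Hardy-Weinberg proportions, sketched below with invented genotype counts; the paper's multinomial-logit multiple imputation of the missing genotypes is not reproduced here.

    ```python
    import numpy as np
    from scipy.stats import chi2

    def hwe_chisq(n_AA, n_AB, n_BB):
        """Classical 1-df chi-square test for Hardy-Weinberg proportions."""
        n = n_AA + n_AB + n_BB
        p = (2 * n_AA + n_AB) / (2 * n)          # allele frequency of A
        expected = n * np.array([p ** 2, 2 * p * (1 - p), (1 - p) ** 2])
        observed = np.array([n_AA, n_AB, n_BB])
        stat = np.sum((observed - expected) ** 2 / expected)
        return stat, chi2.sf(stat, df=1)

    # Complete-case analysis discards missing genotypes; when missingness
    # is informative (e.g. heterozygotes fail more often), the test below
    # can be biased -- the motivation for imputing the missings instead.
    print(hwe_chisq(298, 489, 213))
    ```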

  3. [Application of statistics in chronic-disease-related observational research papers].

    PubMed

    Hong, Zhi-heng; Wang, Ping; Cao, Wei-hua

    2012-09-01

    To study the application of statistics in chronic-disease-related observational research papers recently published in Chinese Medical Association journals with an impact factor above 0.5. Using a self-developed checklist, two investigators independently assessed the application of statistics in each paper; disagreements were resolved through discussion. A total of 352 papers from 6 journals, including the Chinese Journal of Epidemiology, Chinese Journal of Oncology, Chinese Journal of Preventive Medicine, Chinese Journal of Cardiology, Chinese Journal of Internal Medicine and Chinese Journal of Endocrinology and Metabolism, were reviewed. The rates of clearly stating the research objectives, target population, sampling issues, inclusion criteria and variable definitions were 99.43%, 98.57%, 95.43%, 92.86% and 96.87%, respectively. The rates of correctly describing quantitative and qualitative data were 90.94% and 91.46%, respectively, and the rates of correctly reporting the results of statistical inference for quantitative data, qualitative data and modeling were 100%, 95.32% and 87.19%, respectively. 89.49% of the conclusions responded directly to the research objectives. However, 69.60% of the papers did not name the study design they used, 11.14% lacked a statement of the exclusion criteria, only 5.16% clearly explained the sample size estimation, and only 24.21% clearly described the variable value assignment. The rate of describing the statistical analysis procedures and database methods was only 24.15%, 18.75% of the papers did not state their statistical inference methods sufficiently, and a quarter of the papers did not use standardization appropriately. As for statistical inference, only 24.12% of the papers described the prerequisites of the statistical tests employed, while 9.94% did not even employ the statistical inference method that should have been used. The main deficiencies in the application of statistics in these papers were: lack of sample size determination, insufficient description of variable value assignment, statistical methods not introduced clearly or properly, and lack of consideration of the prerequisites for statistical inference.

  4. Allocation of Future Federal Airport and Airway Costs.

    DTIC Science & Technology

    1986-12-01

    attributable to users are allocated among them based upon Ramsey pricing, which minimizes the distortion in aviation markets resulting from the allocation of ... the years following 1992, the producer price index projections made by Wharton Econometric Forecasting Associates were employed. This latter set ... and on econometric cost estimation techniques. These are Volumes 3 and 5, respectively.

  5. Econometrics in outcomes research: the use of instrumental variables.

    PubMed

    Newhouse, J P; McClellan, M

    1998-01-01

    We describe an econometric technique, instrumental variables, that can be useful in estimating the effectiveness of clinical treatments in situations when a controlled trial has not or cannot be done. This technique relies upon the existence of one or more variables that induce substantial variation in the treatment variable but have no direct effect on the outcome variable of interest. We illustrate the use of the technique with an application to aggressive treatment of acute myocardial infarction in the elderly.
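
    The logic of the technique is easy to demonstrate on simulated data: the sketch below (all coefficients hypothetical) shows ordinary least squares biased by an unobserved confounder, while two-stage least squares with a valid instrument recovers the true treatment effect.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n = 5000

    # z: instrument (shifts treatment, no direct effect on the outcome)
    # u: unobserved confounder (the reason OLS is biased)
    z = rng.standard_normal(n)
    u = rng.standard_normal(n)
    treat = 0.8 * z + u + rng.standard_normal(n)
    y = 1.5 * treat + 2.0 * u + rng.standard_normal(n)   # true effect = 1.5

    X = np.column_stack([np.ones(n), treat])
    Z = np.column_stack([np.ones(n), z])

    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    # 2SLS: first stage projects treatment on the instrument; the second
    # stage regresses the outcome on the fitted treatment values.
    treat_hat = Z @ np.linalg.lstsq(Z, treat, rcond=None)[0]
    Xhat = np.column_stack([np.ones(n), treat_hat])
    beta_iv = np.linalg.lstsq(Xhat, y, rcond=None)[0]

    print("OLS estimate:", beta_ols[1])    # biased upward by confounding
    print("IV  estimate:", beta_iv[1])     # close to the true 1.5
    ```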

  6. Econometric model for age- and population-dependent radiation exposures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sandquist, G.M.; Slaughter, D.M.; Rogers, V.C.

    1991-01-01

    The economic impact associated with ionizing radiation exposures in a given human population depends on numerous factors, including the individual's mean economic status as a function of age, the age distribution of the population, the future life expectancy at each age, and the latency period for the occurrence of radiation-induced health effects. A simple mathematical model has been developed that provides an analytical methodology for estimating the societal econometrics associated with radiation exposures, so that radiation effects can be assessed and compared for economic evaluation.

  7. [Econometric and ethical validation of logistic regression: reducing the number of patients in the evaluation of mortality].

    PubMed

    Castiel, D; Herve, C

    1992-01-01

    In general, a large number of patients is needed to conclude whether the results of a therapeutic strategy are significant or not. One can lower this number with a logit. The method has been proposed in an article published recently (Cost-utility analysis of early thrombolytic therapy, Pharmaco Economics, 1992). The present article is an essay aimed at validating the method, both from the econometric and ethical points of view.

  8. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations and communities

    USGS Publications Warehouse

    Royle, J. Andrew; Dorazio, Robert M.

    2008-01-01

    A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including:
    * occurrence or occupancy models for estimating species distribution
    * abundance models based on many sampling protocols, including distance sampling
    * capture-recapture models with individual effects
    * spatial capture-recapture models based on camera trapping and related methods
    * population and metapopulation dynamic models
    * models of biodiversity, community structure and dynamics.

  9. Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective

    USGS Publications Warehouse

    Barker, Richard J.; Link, William A.

    2015-01-01

    Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
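
    For concreteness, Akaike weights are computed from AIC differences as in the sketch below (AIC values invented); reading the resulting weights as model probabilities carries exactly the assumption the authors stress, namely that one model in the set is an exact description of the data-generating mechanism.

    ```python
    import numpy as np

    # Akaike weights: renormalized relative likelihoods of the models.
    aic = np.array([1204.3, 1206.1, 1210.8])   # hypothetical AIC values
    delta = aic - aic.min()                    # differences from the best model
    w = np.exp(-0.5 * delta)
    w /= w.sum()                               # weights sum to 1
    print(np.round(w, 3))                      # roughly [0.69 0.28 0.03]
    ```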

  10. Selecting the right statistical model for analysis of insect count data by using information theoretic measures.

    PubMed

    Sileshi, G

    2006-10-01

    Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stål and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.
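
    The model-comparison exercise the paper recommends can be sketched with statsmodels: fit competing count models to the same data and compare information criteria. Only the Poisson and negative binomial fits are shown (statsmodels also provides zero-inflated variants, which follow the same pattern), and the overdispersed dataset is simulated.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 500

    # Overdispersed counts with many zeros, drawn from a negative binomial.
    X = sm.add_constant(rng.standard_normal(n))
    mu = np.exp(0.3 + 0.7 * X[:, 1])
    y = rng.negative_binomial(1.0, 1.0 / (1.0 + mu))   # mean mu, size 1

    fits = {
        "Poisson": sm.Poisson(y, X).fit(disp=0),
        "NegBin":  sm.NegativeBinomial(y, X).fit(disp=0),
    }
    for name, res in fits.items():
        print(f"{name:8s} AIC = {res.aic:8.1f}   BIC = {res.bic:8.1f}")
    # The negative binomial fit should win decisively on both criteria here.
    ```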

  11. Comparison of a non-stationary voxelation-corrected cluster-size test with TFCE for group-level MRI inference.

    PubMed

    Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong

    2017-03-01

    Two powerful methods for statistical inference on MRI brain images have been proposed recently, a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory and threshold-free cluster enhancement (TFCE) based on calculating the level of local support for a cluster, then using permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended for group-level fMRI analysis compared to other statistical methods. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom and signal to noise ratios. Our results suggest that both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data, and they are both superior to current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing and by also being applicable for single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data.
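
    TFCE has a compact definition: each voxel's statistic is replaced by an integral over thresholds h of (extent of the voxel's cluster at h)^E times h^H. A one-dimensional sketch with the conventional defaults E = 0.5 and H = 2 follows; the signal is synthetic, and in practice the enhanced scores are compared against a permutation null distribution.

    ```python
    import numpy as np
    from scipy import ndimage

    def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
        """Threshold-free cluster enhancement for a 1-D statistic map."""
        out = np.zeros_like(stat)
        for h in np.arange(dh, stat.max() + dh, dh):
            labels, n = ndimage.label(stat >= h)   # supra-threshold clusters
            for k in range(1, n + 1):
                mask = labels == k
                out[mask] += (mask.sum() ** E) * (h ** H) * dh
        return out

    rng = np.random.default_rng(9)
    z = rng.standard_normal(200)
    z[80:110] += 2.5                  # a broad, moderate "activation"
    enhanced = tfce_1d(z.clip(min=0))
    print("peak TFCE score in the cluster:", enhanced[80:110].max().round(1))
    ```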

  13. Unbiased split variable selection for random survival forests using maximally selected rank statistics.

    PubMed

    Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas

    2017-04-15

    The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used.
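
    A maximally selected rank statistic can be sketched outside the survival setting: standardize a rank-sum statistic at every candidate cutpoint of a covariate and take the maximum. The sketch below uses a numeric outcome and a Monte-Carlo permutation p-value for simplicity, whereas the forests described above use survival outcomes and analytic p-value approximations; all data are simulated.

    ```python
    import numpy as np

    rng = np.random.default_rng(10)
    n = 120
    x = rng.uniform(size=n)
    y = (x > 0.6) * 1.0 + rng.standard_normal(n)   # effect beyond a cutpoint

    def max_rank_stat(x, y):
        r = y.argsort().argsort() + 1.0            # ranks of the outcome
        best = 0.0
        for c in np.quantile(x, np.linspace(0.1, 0.9, 25)):
            g = x <= c
            m = g.sum()
            if m in (0, len(x)):
                continue
            # mean and variance of a rank sum of m out of n ranks under H0
            mean = m * (len(x) + 1) / 2
            var = m * (len(x) - m) * (len(x) + 1) / 12
            best = max(best, abs(r[g].sum() - mean) / np.sqrt(var))
        return best

    obs = max_rank_stat(x, y)
    null = [max_rank_stat(x, rng.permutation(y)) for _ in range(500)]
    print("max selected rank statistic:", round(obs, 2))
    print("permutation p-value:", np.mean([s >= obs for s in null]))
    ```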

  14. Incorporating Biological Knowledge into Evaluation of Causal Regulatory Hypotheses

    NASA Technical Reports Server (NTRS)

    Chrisman, Lonnie; Langley, Pat; Bay, Stephen; Pohorille, Andrew; DeVincenzi, D. (Technical Monitor)

    2002-01-01

    Biological data can be scarce and costly to obtain. The small number of samples available typically limits statistical power and makes reliable inference of causal relations extremely difficult. However, we argue that statistical power can be increased substantially by incorporating prior knowledge and data from diverse sources. We present a Bayesian framework that combines information from different sources and we show empirically that this lets one make correct causal inferences with small sample sizes that otherwise would be impossible.

  15. A Review of Some Aspects of Robust Inference for Time Series.

    DTIC Science & Technology

    1984-09-01

    A Review of Some Aspects of Robust Inference for Time Series, by R. D. Martin, Technical Report No. 63, September 1984, Department of Statistics, University of Washington, Seattle. ... clear. One cannot hope to have a good method for dealing with outliers in time series by using only an instantaneous nonlinear transformation of the data ...

  16. The researcher and the consultant: from testing to probability statements.

    PubMed

    Hamra, Ghassan B; Stang, Andreas; Poole, Charles

    2015-09-01

    In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.

  17. Spectral likelihood expansions for Bayesian inference

    NASA Astrophysics Data System (ADS)

    Nagel, Joseph B.; Sudret, Bruno

    2016-03-01

    A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
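
    In one dimension with a uniform prior on [-1, 1], the recipe is short: fit Legendre coefficients to the likelihood by least squares, read the evidence off the constant coefficient, and obtain the posterior mean from the first-order coefficient. The toy Gaussian likelihood and the expansion degree below are illustrative; the orthogonality identities quoted in the comments hold for the uniform prior.

    ```python
    import numpy as np
    from numpy.polynomial import legendre

    def likelihood(x):
        return np.exp(-0.5 * ((x - 0.3) / 0.2) ** 2)   # toy likelihood

    # Spectral likelihood expansion: least-squares fit of Legendre
    # coefficients on a collocation grid over the prior's support.
    xs = np.linspace(-1, 1, 400)
    coef = legendre.legfit(xs, likelihood(xs), deg=20)

    # Under the uniform prior, E[P_0] = 1 and E[P_n] = 0 for n >= 1,
    # so the model evidence is just the constant coefficient.
    evidence = coef[0]
    # x = P_1(x) and E[P_1 * P_n] = 1/3 only for n = 1, hence:
    post_mean = (coef[1] / 3.0) / evidence

    print("evidence:      ", evidence)
    print("posterior mean:", post_mean, "(should be close to 0.3)")
    ```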

  18. Synthesizing Econometric Evidence: The Case of Demand Elasticity Estimates.

    PubMed

    DeCicca, Philip; Kenkel, Don

    2015-06-01

    Econometric estimates of the responsiveness of health-related consumer demand to higher prices are often key ingredients for risk policy analysis. We review the potential advantages and challenges of synthesizing econometric evidence on the price-responsiveness of consumer demand. We draw on examples of research on consumer demand for health-related goods, especially cigarettes. We argue that the overarching goal of research synthesis in this context is to provide policy-relevant evidence for broad-brush conclusions. We propose three main criteria to select among research synthesis methods. We discuss how in principle and in current practice synthesis of research on the price-elasticity of smoking meets our proposed criteria. Our analysis of current practice also contributes to academic research on the specific policy question of the effectiveness of higher cigarette prices to reduce smoking. Although we point out challenges and limitations, we believe more work on research synthesis in this area will be productive and important.

  19. Parameterized examination in econometrics

    NASA Astrophysics Data System (ADS)

    Malinova, Anna; Kyurkchiev, Vesselin; Spasov, Georgi

    2018-01-01

    The paper presents a parameterization of basic types of exam questions in Econometrics. This algorithm is used to automate and facilitate the process of examination, assessment and self-preparation of a large number of students. The proposed parameterization of testing questions reduces the time required to author tests and course assignments. It enables tutors to generate a large number of different but equivalent dynamic questions (with dynamic answers) on a certain topic, which are automatically assessed. The presented methods are implemented in DisPeL (Distributed Platform for e-Learning) and provide questions in the areas of filtering and smoothing of time-series data, forecasting, building and analysis of single-equation econometric models. Questions also cover elasticity, average and marginal characteristics, product and cost functions, measurement of monopoly power, supply, demand and equilibrium price, consumer and product surplus, etc. Several approaches are used to enable the required numerical computations in DisPeL - integration of third-party mathematical libraries, developing our own procedures from scratch, and wrapping our legacy math codes in order to modernize and reuse them.

  20. Statistics, Computation, and Modeling in Cosmology

    NASA Astrophysics Data System (ADS)

    Jewell, Jeff; Guiness, Joe; SAMSI 2016 Working Group in Cosmology

    2017-01-01

    Current and future ground- and space-based missions are designed not only to detect, but to map out with increasing precision, details of the universe from its infancy to the present day. As a result we are faced with the challenge of analyzing and interpreting observations from a wide variety of instruments to form a coherent view of the universe. Finding solutions to a broad range of challenging inference problems in cosmology is one of the goals of the “Statistics, Computation, and Modeling in Cosmology” working groups, formed as part of the year-long program on ‘Statistical, Mathematical, and Computational Methods for Astronomy’, hosted by the Statistical and Applied Mathematical Sciences Institute (SAMSI), a National Science Foundation funded institute. Two application areas have emerged for focused development in the cosmology working group: advanced algorithmic implementations of exact Bayesian inference for the Cosmic Microwave Background, and statistical modeling of galaxy formation. The former includes study and development of advanced Markov Chain Monte Carlo algorithms designed to confront challenging inference problems, including inference for spatial Gaussian random fields in the presence of sources of galactic emission (an example of a source separation problem). Extending these methods to future redshift survey data probing the nonlinear regime of large scale structure formation is also included in the working group activities. In addition, the working group is focused on the study of ‘Galacticus’, a galaxy formation model applied to dark matter-only cosmological N-body simulations operating on time-dependent halo merger trees. The working group is interested in calibrating the Galacticus model to match statistics of galaxy survey observations, specifically stellar mass functions, luminosity functions, and color-color diagrams. The group will use subsampling approaches and fractional factorial designs to explore the Galacticus parameter space in a statistically and computationally efficient way. The group will also use the Galacticus simulations to study the relationship between the topological and physical structure of the halo merger trees and the properties of the resulting galaxies.

  1. Hybrid regulatory models: a statistically tractable approach to model regulatory network dynamics.

    PubMed

    Ocone, Andrea; Millar, Andrew J; Sanguinetti, Guido

    2013-04-01

    Computational modelling of the dynamics of gene regulatory networks is a central task of systems biology. For networks of small/medium scale, the dominant paradigm is represented by systems of coupled non-linear ordinary differential equations (ODEs). ODEs afford great mechanistic detail and flexibility, but calibrating these models to data is often an extremely difficult statistical problem. Here, we develop a general statistical inference framework for stochastic transcription-translation networks. We use a coarse-grained approach, which represents the system as a network of stochastic (binary) promoter and (continuous) protein variables. We derive an exact inference algorithm and an efficient variational approximation that allows scalable inference and learning of the model parameters. We demonstrate the power of the approach on two biological case studies, showing that the method allows a high degree of flexibility and is capable of generating testable, novel biological predictions. Software is available at http://homepages.inf.ed.ac.uk/gsanguin/software.html. Supplementary data are available at Bioinformatics online.
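
    To make the coarse-grained model class concrete, here is a minimal forward simulation (our own illustration with invented rates, not the authors' inference code): a binary 'telegraph' promoter switches on and off stochastically and drives a continuous protein level.

    ```python
    import numpy as np

    # Forward simulation of the coarse-grained model class: a binary promoter
    # mu(t) switching on/off at made-up rates drives a continuous protein x(t)
    # via dx/dt = A*mu(t) - lam*x (Euler discretization; illustration only).
    rng = np.random.default_rng(1)
    k_on, k_off = 0.05, 0.08   # promoter switching rates (1/min)
    A, lam = 2.0, 0.10         # production and degradation rates
    dt, T = 0.1, 600.0         # time step and total time (min)

    n = int(T / dt)
    mu, x = np.zeros(n, dtype=int), np.zeros(n)
    for t in range(1, n):
        flip_rate = k_off if mu[t - 1] == 1 else k_on
        mu[t] = 1 - mu[t - 1] if rng.random() < flip_rate * dt else mu[t - 1]
        x[t] = x[t - 1] + dt * (A * mu[t] - lam * x[t - 1])

    print("fraction of time promoter is on:", mu.mean())  # ~ k_on / (k_on + k_off)
    ```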

  2. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical mechanisms.

  3. Economic Impacts of Wind Turbine Development in U.S. Counties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    J., Brown; B., Hoen; E., Lantz

    2011-07-25

    The objective is to address the research question using post-construction, county-level data and econometric evaluation methods. Wind energy is expanding rapidly in the United States: Over the last 4 years, wind power has contributed approximately 35 percent of all new electric power capacity. Wind power plants are often developed in rural areas where local economic development impacts from the installation are projected, including land lease and property tax payments and employment growth during plant construction and operation. Wind energy represented 2.3 percent of the U.S. electricity supply in 2010, but studies show that penetrations of at least 20 percent are feasible. Several studies have used input-output models to predict direct, indirect, and induced economic development impacts. These analyses have often been completed prior to project construction. Available studies have not yet investigated the economic development impacts of wind development at the county level using post-construction econometric evaluation methods. Analysis of county-level impacts is limited. However, previous county-level analyses have estimated operation-period employment at 0.2 to 0.6 jobs per megawatt (MW) of power installed and earnings at $9,000/MW to $50,000/MW. We find statistically significant evidence of positive impacts of wind development on county-level per capita income from the OLS and spatial lag models when they are applied to the full set of wind and non-wind counties. The total impact on annual per capita income of wind turbine development (measured in MW per capita) in the spatial lag model was $21,604 per MW. This estimate is within the range of values estimated in the literature using input-output models. OLS results for the wind-only counties and matched samples are similar in magnitude, but are not statistically significant at the 10-percent level. We find a statistically significant impact of wind development on employment in the OLS analysis for wind counties only, but not in the other models. Our estimates of employment impacts are not precise enough to assess the validity of employment impacts from input-output models applied in advance of wind energy project construction. The analysis provides empirical evidence of positive income effects at the county level from cumulative wind turbine development, consistent with the range of impacts estimated using input-output models. Employment impacts are less clear.
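
    A stripped-down version of the OLS side of such an analysis can be sketched as follows (synthetic data and an invented control; the planted coefficient of $21,604 per MW per capita echoes the spatial-lag estimate above, and the spatial-lag model itself would additionally need a spatial weights matrix, e.g. via PySAL, which we omit).

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Synthetic stand-in for the county cross-section (illustrative only):
    # regress per capita income on installed wind capacity per capita plus
    # a made-up socioeconomic control.
    rng = np.random.default_rng(42)
    n = 500
    wind_mw_pc = rng.exponential(0.001, n)   # MW of wind capacity per capita
    educ = rng.normal(0.25, 0.05, n)         # share with a college degree
    income = 25_000 + 21_604 * wind_mw_pc + 40_000 * educ + rng.normal(0, 2_000, n)

    X = sm.add_constant(pd.DataFrame({"wind_mw_pc": wind_mw_pc, "educ": educ}))
    fit = sm.OLS(income, X).fit(cov_type="HC1")  # heteroskedasticity-robust SEs
    print(fit.summary().tables[1])
    ```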

  4. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    EPA Pesticide Factsheets

    Scripts for computing nonparametric regression analyses, with an overview of using the scripts to infer environmental conditions from biological observations and to statistically estimate species-environment relationships.
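
    The CADDIS scripts themselves are written in R; for readers working in Python, a roughly comparable nonparametric (LOWESS) species-environment regression on synthetic data might look like this.

    ```python
    import numpy as np
    from statsmodels.nonparametric.smoothers_lowess import lowess

    # Rough Python analogue of a nonparametric species-environment regression
    # (the CADDIS appendix provides R scripts; data here are synthetic).
    rng = np.random.default_rng(3)
    temperature = np.sort(rng.uniform(5, 30, 200))            # gradient (deg C)
    abundance = np.exp(-0.5 * ((temperature - 18) / 4) ** 2)  # unimodal response
    abundance += rng.normal(0, 0.08, temperature.size)        # observation noise

    smoothed = lowess(abundance, temperature, frac=0.3)  # columns: x, fitted y
    print(smoothed[:5])
    ```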

  5. Test Theory Reconceived.

    ERIC Educational Resources Information Center

    Mislevy, Robert J.

    Educational test theory consists of statistical and methodological tools to support inferences about examinees' knowledge, skills, and accomplishments. The evolution of test theory has been shaped by the nature of users' inferences which, until recently, have been framed almost exclusively in terms of trait and behavioral psychology. Progress in…

  6. Data-driven sensitivity inference for Thomson scattering electron density measurement systems.

    PubMed

    Fujii, Keisuke; Yamada, Ichihiro; Hasuo, Masahiro

    2017-01-01

    We developed a method to infer the calibration parameters of multichannel measurement systems, such as channel-to-channel variations in sensitivity and noise amplitude, from experimental data. We regard such uncertainties in the calibration parameters as dependent noise. The statistical properties of the dependent noise and those of the latent functions were modeled and implemented in the Gaussian process kernel. Based on their statistical difference, both sets of parameters were inferred from the data. We applied this method to the electron density measurement system by Thomson scattering for the Large Helical Device plasma, which is equipped with 141 spatial channels. Based on the 210 sets of experimental data, we evaluated the correction factor of the sensitivity and the noise amplitude for each channel. The correction factor varies by ≈10%, and the random noise amplitude is ≈2%, i.e., the measurement accuracy increases by a factor of 5 after this sensitivity correction. We also demonstrated the resulting improvement in the certainty of spatial-derivative inference.
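
    The paper's Gaussian-process treatment is considerably more sophisticated, but the core idea of inferring per-channel sensitivities from many shots can be caricatured with a simple iterative ratio-to-smooth-fit scheme (entirely our own simplification, with an invented profile shape; only the 141-channel, 210-shot layout is borrowed from the record above).

    ```python
    import numpy as np

    # Crude stand-in for data-driven sensitivity calibration (not the authors'
    # Gaussian-process method): iterate between fitting a smooth radial profile
    # per shot and re-estimating each channel's gain as a ratio to that fit.
    rng = np.random.default_rng(7)
    n_ch, n_shot = 141, 210
    r = np.linspace(0.0, 1.0, n_ch)
    true_gain = 1.0 + 0.1 * rng.standard_normal(n_ch)     # ~10% channel variation
    profile = np.exp(-2.0 * r**2)                         # invented density shape
    shots = profile[None, :] * (1.0 + 0.3 * rng.random((n_shot, 1)))
    data = true_gain * shots * (1.0 + 0.02 * rng.standard_normal((n_shot, n_ch)))

    gain = np.ones(n_ch)
    for _ in range(10):
        corrected = data / gain
        # Smooth profile per shot (6th-order polynomial, purely illustrative).
        fits = np.array([np.polyval(np.polyfit(r, s, 6), r) for s in corrected])
        gain = np.median(data / fits, axis=0)             # channel-wise ratio
        gain /= gain.mean()                               # fix the overall scale

    print(np.corrcoef(gain, true_gain)[0, 1])             # close to 1 if recovered
    ```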

  7. Evaluating sufficient similarity for drinking-water disinfection by-product (DBP) mixtures with bootstrap hypothesis test procedures.

    PubMed

    Feder, Paul I; Ma, Zhenxu J; Bull, Richard J; Teuschler, Linda K; Rice, Glenn

    2009-01-01

    In chemical mixtures risk assessment, the use of dose-response data developed for one mixture to estimate risk posed by a second mixture depends on whether the two mixtures are sufficiently similar. While evaluations of similarity may be made using qualitative judgments, this article uses nonparametric statistical methods based on the "bootstrap" resampling technique to address the question of similarity among mixtures of chemical disinfectant by-products (DBP) in drinking water. The bootstrap resampling technique is a general-purpose, computer-intensive approach to statistical inference that substitutes empirical sampling for theoretically based parametric mathematical modeling. Nonparametric, bootstrap-based inference involves fewer assumptions than parametric normal theory based inference. The bootstrap procedure is appropriate, at least in an asymptotic sense, whether or not the parametric, distributional assumptions hold, even approximately. The statistical analysis procedures in this article are initially illustrated with data from 5 water treatment plants (Schenck et al., 2009), and then extended using data developed from a study of 35 drinking-water utilities (U.S. EPA/AMWA, 1989), which permits inclusion of a greater number of water constituents and increased structure in the statistical models.
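
    As a reminder of how such bootstrap hypothesis tests work mechanically, here is a generic two-sample bootstrap test of a difference in means on synthetic concentration data (the paper's similarity statistics for whole mixtures are more elaborate than this).

    ```python
    import numpy as np

    # Generic bootstrap test for a difference in means between two plants'
    # DBP concentrations (synthetic data). H0: equal means, assessed by
    # resampling from the pooled data.
    rng = np.random.default_rng(0)
    plant_a = rng.lognormal(3.0, 0.4, 40)
    plant_b = rng.lognormal(3.2, 0.4, 35)

    observed = plant_b.mean() - plant_a.mean()
    pooled = np.concatenate([plant_a, plant_b])
    boot = np.empty(10_000)
    for i in range(boot.size):
        resample = rng.choice(pooled, pooled.size, replace=True)
        boot[i] = resample[:plant_b.size].mean() - resample[plant_b.size:].mean()

    p_value = np.mean(np.abs(boot) >= abs(observed))   # two-sided p-value
    print(f"difference = {observed:.2f}, bootstrap p = {p_value:.3f}")
    ```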

  8. Local dependence in random graph models: characterization, properties and statistical inference

    PubMed Central

    Schweinberger, Michael; Handcock, Mark S.

    2015-01-01

    Summary Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142

  9. Billions of Dollars are Involved in Taxation of the Life Insurance Industry -- Some Corrections in the Law are Needed.

    DTIC Science & Technology

    1981-09-17

    ...leading life companies, 1979 ... TABLES: 16. A comparative example of the reserve test calculation; 17. Comparative income tax burden of life... pp. 159-61. 2/ J. David Cummins, An Econometric Model of the Life Insurance Sector of the U.S. Economy (Lexington, Mass.: Lexington Books, 1975), p. 57 ... 3/ Cummins, Econometric Model, p. 44. 4/ Fact Book 1979, p. 32 ... decades earlier. 1/ This decline has been attributed to two sources. First, as...

  10. IWR-MAIN Water Use Forecasting System. Version 5.1. User’s Manual and System Description

    DTIC Science & Technology

    1987-12-01

    Crosschecks for Input Data ... II-1 Organization of the IWR-MAIN System ... II-2 Example of Econometric Demand Model ... II-3 Example of Unit Use Coefficient... Unaccounted Loss and free service (entry does not affect default calculations): Y ... Conservation Data ... City Name: Test City USA ... F1-Help, F2-return to menu, F4... socioeconomic data. (1) Internal Growth Models: The IWR-MAIN program contains a subroutine called GROWTH, which uses econometric growth models based on...

  11. Long-term US energy outlook

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friesen, G.

    Chase Econometrics summarizes the assumptions underlying long-term US energy forecasts. To illustrate the uncertainty involved in forecasting for the period to the year 2000, they compare Chase Econometrics forecasts with some recent projections prepared by the DOE Office of Policy, Planning and Analysis for the annual National Energy Policy Plan supplement. Scenario B, the mid-range reference case, is emphasized. The purpose of providing Scenario B as well as Scenarios A and C as alternate cases is to show the sensitivity of oil price projections to small swings in energy demand. 4 tables.

  12. Statistical inference, the bootstrap, and neural-network modeling with application to foreign exchange rates.

    PubMed

    White, H; Racine, J

    2001-01-01

    We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
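
    The basic intuition behind testing input relevance can be illustrated with a permutation-flavoured check on a small feedforward network (the paper develops formal bootstrap tests with proper level and power; this sketch, with invented data, only shows the idea of breaking an input's link to the output and comparing losses).

    ```python
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Permutation-flavoured check of input relevance in a feedforward network.
    rng = np.random.default_rng(0)
    n = 1000
    X = rng.standard_normal((n, 3))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.1, n)  # input 2 irrelevant

    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000, random_state=0)
    net.fit(X[:700], y[:700])
    base = np.mean((net.predict(X[700:]) - y[700:]) ** 2)  # held-out loss

    for j in range(3):
        Xp = X[700:].copy()
        Xp[:, j] = rng.permutation(Xp[:, j])          # break input j's link to y
        loss = np.mean((net.predict(Xp) - y[700:]) ** 2)
        print(f"input {j}: loss ratio {loss / base:.2f}")  # ~1 if irrelevant
    ```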

  13. FUNSTAT and statistical image representations

    NASA Technical Reports Server (NTRS)

    Parzen, E.

    1983-01-01

    General ideas of functional statistical inference are outlined for the analysis of one and two samples, univariate and bivariate. The ONESAM program is applied to analyze the univariate probability distributions of multi-spectral image data.

  14. Building Intuitions about Statistical Inference Based on Resampling

    ERIC Educational Resources Information Center

    Watson, Jane; Chance, Beth

    2012-01-01

    Formal inference, which makes theoretical assumptions about distributions and applies hypothesis testing procedures with null and alternative hypotheses, is notoriously difficult for tertiary students to master. The debate about whether this content should appear in Years 11 and 12 of the "Australian Curriculum: Mathematics" has gone on…

  15. Theory-based Bayesian models of inductive learning and reasoning.

    PubMed

    Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles

    2006-07-01

    Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.

  16. Data free inference with processed data products

    DOE PAGES

    Chowdhary, K.; Najm, H. N.

    2014-07-12

    Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the underlying data are unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate consistent data with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.
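
    A drastically simplified instance of the pipeline might look as follows (our own toy version: the report gives a mean and standard error, the maximum-entropy data-generating choice is Gaussian, an ABC-style accept step enforces consistency with the summaries, and the resulting posteriors are averaged on a grid; all numbers are invented).

    ```python
    import numpy as np
    from scipy import stats

    # Toy data-free inference: only a mean, standard error, and sample size
    # were reported for a quantity theta.
    rng = np.random.default_rng(0)
    m_rep, se_rep, n = 2.0, 0.16, 10
    sigma = se_rep * np.sqrt(n)               # implied sample standard deviation

    grid = np.linspace(1.0, 3.0, 400)
    pooled = np.zeros_like(grid)
    accepted = 0
    while accepted < 200:
        data = rng.normal(m_rep, sigma, n)    # maximum-entropy candidate data set
        if abs(data.mean() - m_rep) > 0.02 or abs(data.std(ddof=1) - sigma) > 0.05:
            continue                          # ABC rejection on the summaries
        accepted += 1
        # Posterior for theta under a flat prior and known noise scale sigma.
        pooled += stats.norm.pdf(grid, data.mean(), sigma / np.sqrt(n))

    pooled /= accepted                        # averaged (pooled) density
    print("pooled posterior mean:", (grid * pooled).sum() / pooled.sum())
    ```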

  17. Statistics at the Chinese Universities.

    DTIC Science & Technology

    1981-09-01

    education in China in the postwar years is provided to give some perspective. My observations on statistics at the Chinese universities are necessarily...has been accepted as a member society of ISI. 3. Education in China. Understanding of statistics in universities in China will be enhanced through some...programming), Statistical Mathematics (inference, data analysis, industrial statistics, information theory), Mathematical Physics (differential

  18. The role of familiarity in binary choice inferences.

    PubMed

    Honda, Hidehito; Abe, Keiga; Matsuka, Toshihiko; Yamagishi, Kimihiko

    2011-07-01

    In research on the recognition heuristic (Goldstein & Gigerenzer, Psychological Review, 109, 75-90, 2002), knowledge of recognized objects has been categorized as "recognized" or "unrecognized" without regard to the degree of familiarity of the recognized object. In the present article, we propose a new inference model--familiarity-based inference. We hypothesize that when subjective knowledge levels (familiarity) of recognized objects differ, the degree of familiarity of recognized objects will influence inferences. Specifically, people are predicted to infer that the more familiar object in a pair of two objects has a higher criterion value on the to-be-judged dimension. In two experiments, using a binary choice task, we examined inferences about populations in a pair of two cities. Results support predictions of familiarity-based inference. Participants inferred that the more familiar city in a pair was more populous. Statistical modeling showed that individual differences in familiarity-based inference lie in the sensitivity to differences in familiarity. In addition, we found that familiarity-based inference can be generally regarded as an ecologically rational inference. Furthermore, when cue knowledge about the inference criterion was available, participants made inferences based on the cue knowledge about population instead of familiarity. Implications of the role of familiarity in psychological processes are discussed.

  19. Cost-effectiveness of aortic valve replacement in the elderly: an introductory study.

    PubMed

    Wu, YingXing; Jin, Ruyun; Gao, Guangqiang; Grunkemeier, Gary L; Starr, Albert

    2007-03-01

    With increased life expectancy and improved technology, valve replacement is being offered to increasing numbers of elderly patients with satisfactory clinical results. By using standard econometric techniques, we estimated the relative cost-effectiveness of aortic valve replacement by drawing on a large prospective database at our institution. By using aortic valve replacement as an example, this introductory report paves the way to more definitive studies of these issues in the future. From 1961 to 2003, 4617 adult patients underwent aortic valve replacement at our service. These patients were provided with a prospective lifetime follow-up. As of 2005, these patients had accumulated 31,671 patient-years of follow-up (maximum 41 years) and had returned 22,396 yearly questionnaires. A statistical model was used to estimate the future life years of patients who are currently alive. In the absence of direct estimates of utility, quality-adjusted life years were estimated from New York Heart Association class. The cost-effectiveness ratio was calculated by the patient's age at surgery. The overall cost-effectiveness ratio was approximately 13,528 dollars per quality-adjusted life year gained. The cost-effectiveness ratio increased according to age at surgery, up to 19,826 dollars per quality-adjusted life year for octogenarians and 27,182 dollars per quality-adjusted life year for nonagenarians. Given the limited scope of this introductory study, aortic valve replacement is cost-effective for all age groups and is very cost-effective for all but the most elderly according to standard econometric rules of thumb.

  20. Statistical modeling of software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1992-01-01

    This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.

  1. Quantum-Like Representation of Non-Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

    2013-01-01

    This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are experimental studies whose statistical data cannot be described by classical probability theory, and the process of decision making that generates these data cannot be reduced to classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented classical Bayesian inference in a natural way within the framework of quantum mechanics. Using this representation, in this paper we discuss non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe the "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.

  2. Why environmental scientists are becoming Bayesians

    Treesearch

    James S. Clark

    2005-01-01

    Advances in computational statistics provide a general framework for the high dimensional models typically needed for ecological inference and prediction. Hierarchical Bayes (HB) represents a modelling structure with capacity to exploit diverse sources of information, to accommodate influences that are unknown (or unknowable), and to draw inference on large numbers of...

  3. Pseudocontingencies and Choice Behavior in Probabilistic Environments with Context-Dependent Outcomes

    ERIC Educational Resources Information Center

    Meiser, Thorsten; Rummel, Jan; Fleig, Hanna

    2018-01-01

    Pseudocontingencies are inferences about correlations in the environment that are formed on the basis of statistical regularities like skewed base rates or varying base rates across environmental contexts. Previous research has demonstrated that pseudocontingencies provide a pervasive mechanism of inductive inference in numerous social judgment…

  4. Cross-Situational Learning of Minimal Word Pairs

    ERIC Educational Resources Information Center

    Escudero, Paola; Mulak, Karen E.; Vlach, Haley A.

    2016-01-01

    "Cross-situational statistical learning" of words involves tracking co-occurrences of auditory words and objects across time to infer word-referent mappings. Previous research has demonstrated that learners can infer referents across sets of very phonologically distinct words (e.g., WUG, DAX), but it remains unknown whether learners can…

  5. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
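
    The GLM backbone of this framework is easy to demonstrate on one simulated channel: convolve a task boxcar with a canonical double-gamma HRF, add drift and noise, fit by ordinary least squares, and compute a t-statistic for the task regressor (a toy sketch that ignores the serial-correlation corrections discussed in the review; all parameters are invented).

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    fs, T = 10.0, 300.0                       # sampling rate (Hz), duration (s)
    t = np.arange(0, T, 1 / fs)

    # Common double-gamma approximation of the hemodynamic response function.
    hrf_t = np.arange(0, 30, 1 / fs)
    hrf = stats.gamma.pdf(hrf_t, 6) - stats.gamma.pdf(hrf_t, 16) / 6.0

    stim = np.zeros_like(t)
    stim[(t % 60) < 20] = 1.0                 # 20 s task blocks every 60 s
    regressor = np.convolve(stim, hrf)[: t.size] / fs

    y = 0.8 * regressor + 0.05 * t / T + rng.normal(0, 0.3, t.size)
    X = np.column_stack([regressor, t / T, np.ones_like(t)])  # task, drift, const

    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    dof = t.size - X.shape[1]
    se = np.sqrt((res[0] / dof) * np.linalg.inv(X.T @ X)[0, 0])
    t_stat = beta[0] / se
    print("t =", t_stat, " p =", 2 * stats.t.sf(abs(t_stat), dof))
    ```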

  6. Spatial Analysis of China Province-level Perinatal Mortality

    PubMed Central

    XIANG, Kun; SONG, Deyong

    2016-01-01

    Background: Spatial analysis tools are used to determine the spatial patterns of China's province-level perinatal mortality, and spatial econometric models are used to examine the impacts of health care resources and different socio-economic factors on perinatal mortality. Methods: The Global Moran's I index is used to examine whether spatial autocorrelation exists in the selected regions, and the Moran's I scatter plot is used to examine spatial clustering among regions. Spatial econometric models are used to investigate the spatial relationships between perinatal mortality and contributing factors. Results: The overall Moran's I index indicates that perinatal mortality displays positive spatial autocorrelation. The Moran's I scatter plot analysis implies that there is significant clustering of mortality in both high-rate and low-rate regions. The spatial econometric model analyses confirm the existence of a direct link between perinatal mortality and health care resources and socio-economic factors. Conclusions: Since positive spatial autocorrelation has been detected in China's province-level perinatal mortality, the upgrading of regional economic development and medical service levels will affect mortality not only in the region itself but also in its adjacent regions. PMID:27398334
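
    The Global Moran's I statistic used here has a compact closed form, I = (n/S0) * z'Wz / z'z, where z are the mean-centred rates, W is a spatial weights matrix, and S0 is the sum of all weights; a minimal illustration on an invented five-region adjacency follows.

    ```python
    import numpy as np

    def morans_i(x, W):
        """Global Moran's I for values x under weight matrix W."""
        z = x - x.mean()
        return (x.size / W.sum()) * (z @ W @ z) / (z @ z)

    rates = np.array([12.0, 11.5, 10.8, 6.2, 5.9])   # perinatal deaths per 1000
    W = np.array([[0, 1, 1, 0, 0],                   # binary contiguity (symmetric)
                  [1, 0, 1, 0, 0],
                  [1, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1],
                  [0, 0, 0, 1, 0]], dtype=float)
    # Values above the null expectation -1/(n-1) indicate positive clustering.
    print("Moran's I =", morans_i(rates, W))
    ```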

  7. The impact of tropospheric ozone pollution on trial plot winter wheat yields in Great Britain - an econometric approach.

    PubMed

    Kaliakatsou, Evridiki; Bell, J Nigel B; Thirtle, Colin; Rose, Daniel; Power, Sally A

    2010-05-01

    Numerous experiments have demonstrated reductions in the yields of cereal crops due to tropospheric O(3), with losses of up to 25%. However, the only British econometric study of O(3) impacts on winter wheat yields found that a 10% increase in AOT40 would decrease yields by only 0.23%. An attempt is made here to reconcile these observations by developing AOT40 maps for Great Britain and matching levels with a large number of standardised trial plot wheat yields from many sites over a 13-year period. Panel estimates (repeated measures on the same plots over time) show a 0.54% decrease in yields, and it is hypothesised that plant breeders may have inadvertently selected for O(3) tolerance in wheat. Some support for this is provided by fumigations of cultivars of differing introduction dates. A case is made for the use of econometric as well as experimental studies in the prediction of air-pollution-induced crop loss. Copyright 2009 Elsevier Ltd. All rights reserved.
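
    A sketch of the panel (fixed-effects) idea on invented trial-plot data: regress log yield on log AOT40 with site and year effects; the planted elasticity of -0.054 mirrors the 0.54% loss per 10% AOT40 increase reported above.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Invented trial-plot panel: 30 sites observed over 13 years.
    rng = np.random.default_rng(0)
    df = pd.DataFrame([(s, y) for s in range(30) for y in range(1990, 2003)],
                      columns=["site", "year"])
    df["aot40"] = rng.uniform(1.0, 6.0, len(df))        # ozone exposure index
    site_fe = rng.normal(0.0, 0.10, 30)                 # site fixed effects
    year_fe = rng.normal(0.0, 0.05, 13)                 # year fixed effects
    df["log_yield"] = (2.2 + site_fe[df["site"]] + year_fe[df["year"] - 1990]
                       - 0.054 * np.log(df["aot40"]) + rng.normal(0, 0.02, len(df)))

    fit = smf.ols("log_yield ~ np.log(aot40) + C(site) + C(year)", data=df).fit()
    print(fit.params["np.log(aot40)"])                  # recovers ~ -0.054
    ```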

  8. Accounting for measurement error: a critical but often overlooked process.

    PubMed

    Harris, Edward F; Smith, Richard N

    2009-12-01

    Due to instrument imprecision and human inconsistencies, measurements are not free of error. Technical error of measurement (TEM) is the variability encountered between dimensions when the same specimens are measured at multiple sessions. A goal of a data collection regimen is to minimise TEM. The few studies that actually quantify TEM, regardless of discipline, report that it is substantial and can affect results and inferences. This paper reviews some statistical approaches for identifying and controlling TEM. Statistically, TEM is part of the residual ('unexplained') variance in a statistical test, so accounting for TEM, which requires repeated measurements, enhances the chances of finding a statistically significant difference if one exists. The aim of this paper was to review and discuss common statistical designs relating to types of error and statistical approaches to error accountability. This paper addresses issues of landmark location, validity, technical and systematic error, analysis of variance, scaled measures and correlation coefficients in order to guide the reader towards correct identification of true experimental differences. Researchers commonly infer characteristics about populations from comparatively restricted study samples. Most inferences are statistical and, aside from concerns about adequate accounting for known sources of variation with the research design, an important source of variability is measurement error. Variability in locating landmarks that define variables is obvious in odontometrics, cephalometrics and anthropometry, but the same concerns about measurement accuracy and precision extend to all disciplines. With increasing accessibility to computer-assisted methods of data collection, the ease of incorporating repeated measures into statistical designs has improved. Accounting for this technical source of variation increases the chance of finding biologically true differences when they exist.
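
    For duplicate measurements of the same specimens at two sessions, TEM is commonly computed with Dahlberg's formula, TEM = sqrt(sum(d^2) / 2n); a worked example on invented measurements:

    ```python
    import numpy as np

    # Dahlberg's technical error of measurement for duplicate sessions (mm).
    session1 = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.2])
    session2 = np.array([10.4, 11.3, 9.9, 12.4, 10.7, 11.3])

    d = session1 - session2
    tem = np.sqrt(np.sum(d**2) / (2 * d.size))            # same units as the data
    rtem = 100 * tem / np.concatenate([session1, session2]).mean()
    print(f"TEM = {tem:.3f} mm, relative TEM = {rtem:.1f}%")
    ```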

  9. Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

    PubMed

    Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

    2017-12-01

    Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants. A wider implication of this is that, for models with more than two states (for example, DNA sequence alignments with four-state models), we find that methods which rely on phylogenetic invariants are incapable of satisfying all three of the stated statistical properties. This is because in these cases the relevant Markov invariants belong to a class of polynomials independent from the phylogenetic invariants.

  10. Imputation approaches for animal movement modeling

    USGS Publications Warehouse

    Scharf, Henry; Hooten, Mevin B.; Johnson, Devin S.

    2017-01-01

    The analysis of telemetry data is common in animal ecological studies. While the collection of telemetry data for individual animals has improved dramatically, the methods to properly account for inherent uncertainties (e.g., measurement error, dependence, barriers to movement) have lagged behind. Still, many new statistical approaches have been developed to infer unknown quantities affecting animal movement or predict movement based on telemetry data. Hierarchical statistical models are useful to account for some of the aforementioned uncertainties, as well as provide population-level inference, but they often come with an increased computational burden. For certain types of statistical models, it is straightforward to provide inference if the latent true animal trajectory is known, but challenging otherwise. In these cases, approaches related to multiple imputation have been employed to account for the uncertainty associated with our knowledge of the latent trajectory. Despite the increasing use of imputation approaches for modeling animal movement, the general sensitivity and accuracy of these methods have not been explored in detail. We provide an introduction to animal movement modeling and describe how imputation approaches may be helpful for certain types of models. We also assess the performance of imputation approaches in two simulation studies. Our simulation studies suggest that inference for model parameters directly related to the location of an individual may be more accurate than inference for parameters associated with higher-order processes such as velocity or acceleration. Finally, we apply these methods to analyze a telemetry data set involving northern fur seals (Callorhinus ursinus) in the Bering Sea. Supplementary materials accompanying this paper appear online.
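
    When the same movement model is fitted once per imputed trajectory, the per-imputation estimates are typically combined with Rubin's rules; a minimal sketch with placeholder numbers:

    ```python
    import numpy as np

    # Pooling across multiply-imputed trajectories with Rubin's rules.
    est = np.array([0.82, 0.79, 0.88, 0.84, 0.80])       # estimate per imputation
    var = np.array([0.010, 0.012, 0.011, 0.009, 0.010])  # its sampling variance

    m = est.size
    q_bar = est.mean()                      # pooled point estimate
    w_bar = var.mean()                      # average within-imputation variance
    b = est.var(ddof=1)                     # between-imputation variance
    total_var = w_bar + (1 + 1 / m) * b     # Rubin's total variance
    print(f"estimate = {q_bar:.3f} +/- {np.sqrt(total_var):.3f}")
    ```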

  11. Using a Five-Step Procedure for Inferential Statistical Analyses

    ERIC Educational Resources Information Center

    Kamin, Lawrence F.

    2010-01-01

    Many statistics texts pose inferential statistical problems in a disjointed way. By using a simple five-step procedure as a template for statistical inference problems, the student can solve problems in an organized fashion. The problem and its solution will thus be a stand-by-itself organic whole and a single unit of thought and effort. The…

  12. Design-based and model-based inference in surveys of freshwater mollusks

    USGS Publications Warehouse

    Dorazio, R.M.

    1999-01-01

    Well-known concepts in statistical inference and sampling theory are used to develop recommendations for planning and analyzing the results of quantitative surveys of freshwater mollusks. Two methods of inference commonly used in survey sampling (design-based and model-based) are described and illustrated using examples relevant in surveys of freshwater mollusks. The particular objectives of a survey and the type of information observed in each unit of sampling can be used to help select the sampling design and the method of inference. For example, the mean density of a sparsely distributed population of mollusks can be estimated with higher precision by using model-based inference or by using design-based inference with adaptive cluster sampling than by using design-based inference with conventional sampling. More experience with quantitative surveys of natural assemblages of freshwater mollusks is needed to determine the actual benefits of different sampling designs and inferential procedures.

  13. Bayesian inference for joint modelling of longitudinal continuous, binary and ordinal events.

    PubMed

    Li, Qiuju; Pan, Jianxin; Belcher, John

    2016-12-01

    In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England, involving trivariate responses from each patient: Body Mass Index, Depression (Yes/No) ascertained with cut-off score not less than 8 at the Hospital Anxiety and Depression Scale, and Pain Interference generated from the Medical Outcomes Study 36-item short-form health survey with values returned on an ordinal scale 1-5. There are some well-established methods for combined continuous and binary, or even continuous and ordinal responses, but little work has been done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the estimates of parameters in the models are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analysis, and shows other good statistical properties too. © The Author(s) 2014.

  14. Neuronal couplings between retinal ganglion cells inferred by efficient inverse statistical physics methods

    PubMed Central

    Cocco, Simona; Leibler, Stanislas; Monasson, Rémi

    2009-01-01

    Complexity of neural systems often makes impracticable explicit measurements of all interactions between their constituents. Inverse statistical physics approaches, which infer effective couplings between neurons from their spiking activity, have been so far hindered by their computational complexity. Here, we present 2 complementary, computationally efficient inverse algorithms based on the Ising and “leaky integrate-and-fire” models. We apply those algorithms to reanalyze multielectrode recordings in the salamander retina in darkness and under random visual stimulus. We find strong positive couplings between nearby ganglion cells common to both stimuli, whereas long-range couplings appear under random stimulus only. The uncertainty on the inferred couplings due to limitations in the recordings (duration, small area covered on the retina) is discussed. Our methods will allow real-time evaluation of couplings for large assemblies of neurons. PMID:19666487
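
    One of the simplest members of this family of efficient inverse methods is the naive mean-field approximation, in which couplings are read off the inverse correlation matrix of the binarized spike trains; a toy illustration on invented data (not the authors' integrate-and-fire algorithm):

    ```python
    import numpy as np

    # Naive mean-field inverse Ising: J_ij ~= -(C^-1)_ij for i != j, where C
    # is the connected correlation matrix of the spins.
    rng = np.random.default_rng(0)
    spikes = (rng.random((5000, 8)) < 0.1).astype(float)   # stand-in spike bins
    # Make cell 1 copy cell 0 in 60% of bins, so the pair is strongly coupled.
    spikes[:, 1] = np.where(rng.random(5000) < 0.6, spikes[:, 0], spikes[:, 1])

    s = 2 * spikes - 1                       # map binary bins to Ising spins +-1
    C = np.cov(s, rowvar=False)              # connected correlations
    J = -np.linalg.inv(C)
    np.fill_diagonal(J, 0.0)                 # self-couplings are not defined
    print(J[0, 1], J[0, 2])                  # large positive vs. near zero
    ```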

  15. Confidence crisis of results in biomechanics research.

    PubMed

    Knudson, Duane

    2017-11-01

    Many biomechanics studies have small sample sizes and incorrect statistical analyses, so reporting of inaccurate inferences and inflated magnitude of effects are common in the field. This review examines these issues in biomechanics research and summarises potential solutions from research in other fields to increase the confidence in the experimental effects reported in biomechanics. Authors, reviewers and editors of biomechanics research reports are encouraged to improve sample sizes and the resulting statistical power, improve reporting transparency, improve the rigour of statistical analyses used, and increase the acceptance of replication studies to improve the validity of inferences from data in biomechanics research. The application of sports biomechanics research results would also improve if a larger percentage of unbiased effects and their uncertainty were reported in the literature.

  16. The Empirical Nature and Statistical Treatment of Missing Data

    ERIC Educational Resources Information Center

    Tannenbaum, Christyn E.

    2009-01-01

    Introduction. Missing data is a common problem in research and can produce severely misleading analyses, including biased estimates of statistical parameters, and erroneous conclusions. In its 1999 report, the APA Task Force on Statistical Inference encouraged authors to report complications such as missing data and discouraged the use of…

  17. Cognitive Transfer Outcomes for a Simulation-Based Introductory Statistics Curriculum

    ERIC Educational Resources Information Center

    Backman, Matthew D.; Delmas, Robert C.; Garfield, Joan

    2017-01-01

    Cognitive transfer is the ability to apply learned skills and knowledge to new applications and contexts. This investigation evaluates cognitive transfer outcomes for a tertiary-level introductory statistics course using the CATALST curriculum, which exclusively used simulation-based methods to develop foundations of statistical inference. A…

  18. The Role of the Sampling Distribution in Understanding Statistical Inference

    ERIC Educational Resources Information Center

    Lipson, Kay

    2003-01-01

    Many statistics educators believe that few students develop the level of conceptual understanding essential for them to apply correctly the statistical techniques at their disposal and to interpret their outcomes appropriately. It is also commonly believed that the sampling distribution plays an important role in developing this understanding.…

  19. A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

    PubMed

    Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

    2015-01-01

    PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating tumor boundaries from PET have yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies in which tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on the phantom studies and 0.835 ± 0.039 on the clinical studies.
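
    The DSC used for validation is simply twice the overlap divided by the summed sizes, DSC = 2|A ∩ B| / (|A| + |B|); a minimal check on toy masks:

    ```python
    import numpy as np

    def dice(a: np.ndarray, b: np.ndarray) -> float:
        """Dice similarity coefficient between two binary masks."""
        a, b = a.astype(bool), b.astype(bool)
        return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

    a = np.zeros((20, 20), bool); a[5:15, 5:15] = True    # toy segmentation
    b = np.zeros((20, 20), bool); b[7:17, 6:16] = True    # toy reference
    print(f"DSC = {dice(a, b):.3f}")
    ```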

  20. Empirical evidence for acceleration-dependent amplification factors

    USGS Publications Warehouse

    Borcherdt, R.D.

    2002-01-01

    Site-specific amplification factors, Fa and Fv, used in current U.S. building codes decrease with increasing base acceleration level as implied by the Loma Prieta earthquake at 0.1g and extrapolated using numerical models and laboratory results. The Northridge earthquake recordings of 17 January 1994 and subsequent geotechnical data permit empirical estimates of amplification at base acceleration levels up to 0.5g. Distance measures and normalization procedures used to infer amplification ratios from soil-rock pairs in predetermined azimuth-distance bins significantly influence the dependence of amplification estimates on base acceleration. Factors inferred using a hypocentral distance norm do not show a statistically significant dependence on base acceleration. Factors inferred using norms implied by the attenuation functions of Abrahamson and Silva show a statistically significant decrease with increasing base acceleration. The decrease is statistically more significant for stiff clay and sandy soil (site class D) sites than for stiffer sites underlain by gravely soils and soft rock (site class C). The decrease in amplification with increasing base acceleration is more pronounced for the short-period amplification factor, Fa, than for the midperiod factor, Fv.

  1. Karl Pearson and eugenics: personal opinions and scientific rigor.

    PubMed

    Delzell, Darcie A P; Poliak, Cathy D

    2013-09-01

    The influence of personal opinions and biases on scientific conclusions is a threat to the advancement of knowledge. Expertise and experience do not render one immune to this temptation. In this work, one of the founding fathers of statistics, Karl Pearson, is used as an illustration of how even the most talented among us can produce misleading results when inferences are made without caution or reference to potential bias and other analysis limitations. A study performed by Pearson on British Jewish schoolchildren is examined in light of ethical and professional statistical practice. The methodology used and inferences made by Pearson and his coauthor are sometimes questionable and offer insight into how Pearson's support of eugenics and his own British nationalism could have influenced his often careless and far-fetched inferences. A short background on Pearson's work and beliefs is provided, along with an in-depth examination of the authors' overall experimental design and statistical practices. In addition, portions of the study regarding intelligence and tuberculosis are discussed in more detail, along with historical reactions to their work.

  2. Crystal study and econometric model

    NASA Technical Reports Server (NTRS)

    1975-01-01

    An econometric model was developed that can be used to predict demand and supply figures for crystals over a time horizon roughly concurrent with that of NASA's Space Shuttle Program, that is, 1975 through 1990. The model includes an equation to predict the impact on investment in the crystal-growing industry. Two models are actually presented: the first is a theoretical model that follows rather strictly the standard economic concepts involved in supply and demand analysis; the second is a modified version that, though not quite as theoretically sound, is testable using existing data sources.

  3. The Science of Science Policy: A Federal Research Roadmap

    DTIC Science & Technology

    2008-11-01

    and Atmospheric Administration, http://www.ncdc.noaa.gov/oa/climate/globalwarming.html#q4. The Science of Science Policy: A Federal Research Roadmap ... maintain the nation's dominance ... econometric studies, surveys, case studies, and retrospective analyses. Econometric studies include the macroeconomic growth models pioneered by Robert... What are the behavioral foundations of innovation? Understanding the behavior of individuals and

  4. Energy modeling. Volume 2: Inventory and details of state energy models

    NASA Astrophysics Data System (ADS)

    Melcher, A. G.; Underwood, R. G.; Weber, J. C.; Gist, R. L.; Holman, R. P.; Donald, D. W.

    1981-05-01

    An inventory of energy models developed by or for state governments is presented, and certain models are discussed in depth. These models address a variety of purposes, such as the supply or demand of energy or of certain types of energy, emergency management of energy, and energy economics. Ten models are described. The purpose, use, and history of each model are discussed, and information is given on its outputs, inputs, and mathematical structure. Five of the models deal with energy demand: one is econometric and four are econometric-engineering end-use models.

  5. In silico model-based inference: a contemporary approach for hypothesis testing in network biology

    PubMed Central

    Klinke, David J.

    2014-01-01

    Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900’s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. PMID:25139179

  6. In silico model-based inference: a contemporary approach for hypothesis testing in network biology.

    PubMed

    Klinke, David J

    2014-01-01

    Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. © 2014 American Institute of Chemical Engineers.

  7. Spatio-temporal conditional inference and hypothesis tests for neural ensemble spiking precision

    PubMed Central

    Harrison, Matthew T.; Amarasingham, Asohan; Truccolo, Wilson

    2014-01-01

    The collective dynamics of neural ensembles create complex spike patterns with many spatial and temporal scales. Understanding the statistical structure of these patterns can help resolve fundamental questions about neural computation and neural dynamics. Spatio-temporal conditional inference (STCI) is introduced here as a semiparametric statistical framework for investigating the nature of precise spiking patterns from collections of neurons that is robust to arbitrarily complex and nonstationary coarse spiking dynamics. The main idea is to focus statistical modeling and inference, not on the full distribution of the data, but rather on families of conditional distributions of precise spiking given different types of coarse spiking. The framework is then used to develop families of hypothesis tests for probing the spatio-temporal precision of spiking patterns. Relationships among different conditional distributions are used to improve multiple hypothesis testing adjustments and to design novel Monte Carlo spike resampling algorithms. Of special note are algorithms that can locally jitter spike times while still preserving the instantaneous peri-stimulus time histogram (PSTH) or the instantaneous total spike count from a group of recorded neurons. The framework can also be used to test whether first-order maximum entropy models with possibly random and time-varying parameters can account for observed patterns of spiking. STCI provides a detailed example of the generic principle of conditional inference, which may be applicable in other areas of neurostatistical analysis. PMID:25380339
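
    A basic interval-jitter resample, a simpler relative of the PSTH- and count-preserving algorithms described above, redraws each spike uniformly within its jitter window so that the spike count in every window is preserved exactly:

    ```python
    import numpy as np

    def jitter(spike_times, window=0.025, seed=0):
        """Redraw each spike uniformly within its jitter window (seconds)."""
        rng = np.random.default_rng(seed)
        bins = np.floor(np.asarray(spike_times) / window)  # window index per spike
        return np.sort((bins + rng.random(bins.size)) * window)

    spikes = np.array([0.012, 0.031, 0.033, 0.102, 0.240])  # seconds
    print(jitter(spikes))  # same counts per 25 ms window, new fine timing
    ```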

  8. Data mining and statistical inference in selective laser melting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, Chandrika

    Selective laser melting (SLM) is an additive manufacturing process that builds a complex three-dimensional part, layer-by-layer, using a laser beam to fuse fine metal powder together. The design freedom afforded by SLM comes associated with complexity. As the physical phenomena occur over a broad range of length and time scales, the computational cost of modeling the process is high. At the same time, the large number of parameters that control the quality of a part make experiments expensive. In this paper, we describe ways in which we can use data mining and statistical inference techniques to intelligently combine simulations and experiments to build parts with desired properties. We start with a brief summary of prior work in finding process parameters for high-density parts. We then expand on this work to show how we can improve the approach by using feature selection techniques to identify important variables, data-driven surrogate models to reduce computational costs, improved sampling techniques to cover the design space adequately, and uncertainty analysis for statistical inference. Here, our results indicate that techniques from data mining and statistics can complement those from physical modeling to provide greater insight into complex processes such as selective laser melting.

  9. Data mining and statistical inference in selective laser melting

    DOE PAGES

    Kamath, Chandrika

    2016-01-11

    Selective laser melting (SLM) is an additive manufacturing process that builds a complex three-dimensional part, layer-by-layer, using a laser beam to fuse fine metal powder together. The design freedom afforded by SLM comes associated with complexity. As the physical phenomena occur over a broad range of length and time scales, the computational cost of modeling the process is high. At the same time, the large number of parameters that control the quality of a part make experiments expensive. In this paper, we describe ways in which we can use data mining and statistical inference techniques to intelligently combine simulations and experiments to build parts with desired properties. We start with a brief summary of prior work in finding process parameters for high-density parts. We then expand on this work to show how we can improve the approach by using feature selection techniques to identify important variables, data-driven surrogate models to reduce computational costs, improved sampling techniques to cover the design space adequately, and uncertainty analysis for statistical inference. Here, our results indicate that techniques from data mining and statistics can complement those from physical modeling to provide greater insight into complex processes such as selective laser melting.
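
    As a rough illustration of the surrogate-modeling and feature-selection steps described in the two records above, the following sketch fits a data-driven surrogate to synthetic "simulation" outputs and ranks input variables by importance. The process variables, data, and model choice are assumptions for illustration only:

```python
# Hedged sketch of the surrogate-modeling idea: fit a cheap data-driven
# model to (expensive) simulation outputs and rank inputs by importance.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.uniform(size=(200, 4))     # e.g., power, speed, hatch, thickness
density = 0.9 + 0.08 * X[:, 0] - 0.05 * X[:, 1] + rng.normal(0, 0.01, 200)

surrogate = RandomForestRegressor(n_estimators=300, random_state=0)
surrogate.fit(X, density)
for name, imp in zip(["power", "speed", "hatch", "thickness"],
                     surrogate.feature_importances_):
    print(f"{name:10s} importance = {imp:.2f}")
# The surrogate can now stand in for the simulator when sampling the
# design space or propagating input uncertainty.
```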

  10. Fully Bayesian inference for structural MRI: application to segmentation and statistical analysis of T2-hypointensities.

    PubMed

    Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark

    2013-01-01

    Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for the preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov chain Monte Carlo methods were used to simulate posterior distributions. First, we developed a segmentation algorithm that uses outlier detection based on model-checking techniques within a Bayesian mixture model. Second, we developed an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields), mitigating the need to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity, primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for the preprocessing and statistical analysis of structural MRI data.

  11. Drug target inference through pathway analysis of genomics data

    PubMed Central

    Ma, Haisu; Zhao, Hongyu

    2013-01-01

    Statistical modeling coupled with bioinformatics is commonly used for drug discovery. Although there exist many approaches for single-target-based drug design and target inference, recent years have seen a paradigm shift to system-level pharmacological research. Pathway analysis of genomics data represents one promising direction for computational inference of drug targets. This article aims at providing a comprehensive review of the evolving issues in this field, covering methodological developments, their pros and cons, as well as future research directions. PMID:23369829

  12. Applications of statistics to medical science, II overview of statistical procedures for general use.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    Procedures of statistical analysis are reviewed to provide an overview of applications of statistics for general use. Topics that are dealt with are inference on a population, comparison of two populations with respect to means and probabilities, and multiple comparisons. This study is the second part of a series in which we survey medical statistics. Arguments related to statistical associations and regressions will be made in subsequent papers.

  13. The impacts of recent smoking control policies on individual smoking choice: the case of Japan

    PubMed Central

    2013-01-01

    This article comprehensively examines the impact of recent smoking control policies in Japan (increases in cigarette taxes and the enforcement of the Health Promotion Law) on individual smoking choice, using multi-year, nationwide individual survey data to overcome the analytical problems of previous Japanese studies. In the econometric analyses, I specify a simple binary choice model based on a random utility model to examine the effects of smoking control policies on individual smoking choice, employing the instrumental variable probit model to control for the endogeneity of cigarette prices. The empirical results show that an increase in cigarette prices statistically significantly reduces the smoking probability of males by 1.0 percent and that of females by 1.4 to 2.0 percent. The enforcement of the Health Promotion Law has a statistically significant effect on reducing the smoking probability of males by 15.2 percent and of females by 11.9 percent. Furthermore, an increase in cigarette prices has a statistically significant negative effect on the smoking probability of office workers, non-workers, male manual workers, and female unemployed people, and the enforcement of the Health Promotion Law has a statistically significant effect on decreasing the smoking probabilities of office workers, female manual workers, and male non-workers. JEL classification: C25, C26, I18. PMID:23497490
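
    One common way to implement the instrumental-variable probit idea used in this study is the two-step control-function (Rivers-Vuong) estimator. The sketch below uses invented variable names and synthetic data; it is not the paper's actual specification:

```python
# Sketch of a two-step control-function (Rivers-Vuong) approach to an
# endogenous regressor in a probit model. Data and names are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5000
z = rng.normal(size=n)                      # instrument (e.g., a tax change)
u = rng.normal(size=n)                      # unobserved confounder
price = 1.0 + 0.8 * z + u                   # endogenous cigarette price
smoke = (0.5 - 0.6 * price + 0.7 * u + rng.normal(size=n) > 0).astype(int)

# Step 1: first-stage OLS of the endogenous price on the instrument
first = sm.OLS(price, sm.add_constant(z)).fit()
v_hat = first.resid

# Step 2: probit including the first-stage residual as a control function
X = sm.add_constant(np.column_stack([price, v_hat]))
probit = sm.Probit(smoke, X).fit(disp=0)
print(probit.summary())        # coefficient on price is now consistent
```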

  14. PROBABILITY SAMPLING AND POPULATION INFERENCE IN MONITORING PROGRAMS

    EPA Science Inventory

    A fundamental difference between probability sampling and conventional statistics is that "sampling" deals with real, tangible populations, whereas "conventional statistics" usually deals with hypothetical populations that have no real-world realization. The focus here is on real ...

  15. Statistical Inference in the Learning of Novel Phonetic Categories

    ERIC Educational Resources Information Center

    Zhao, Yuan

    2010-01-01

    Learning a phonetic category (or any linguistic category) requires integrating different sources of information. A crucial unsolved problem for phonetic learning is how this integration occurs: how can we update our previous knowledge about a phonetic category as we hear new exemplars of the category? One model of learning is Bayesian Inference,…

  16. Conceptual Challenges in Coordinating Theoretical and Data-Centered Estimates of Probability

    ERIC Educational Resources Information Center

    Konold, Cliff; Madden, Sandra; Pollatsek, Alexander; Pfannkuch, Maxine; Wild, Chris; Ziedins, Ilze; Finzer, William; Horton, Nicholas J.; Kazak, Sibel

    2011-01-01

    A core component of informal statistical inference is the recognition that judgments based on sample data are inherently uncertain. This implies that instruction aimed at developing informal inference needs to foster basic probabilistic reasoning. In this article, we analyze and critique the now-common practice of introducing students to both…

  17. Campbell's and Rubin's Perspectives on Causal Inference

    ERIC Educational Resources Information Center

    West, Stephen G.; Thoemmes, Felix

    2010-01-01

    Donald Campbell's approach to causal inference (D. T. Campbell, 1957; W. R. Shadish, T. D. Cook, & D. T. Campbell, 2002) is widely used in psychology and education, whereas Donald Rubin's causal model (P. W. Holland, 1986; D. B. Rubin, 1974, 2005) is widely used in economics, statistics, medicine, and public health. Campbell's approach focuses on…

  18. Direct Evidence for a Dual Process Model of Deductive Inference

    ERIC Educational Resources Information Center

    Markovits, Henry; Brunet, Marie-Laurence; Thompson, Valerie; Brisson, Janie

    2013-01-01

    In 2 experiments, we tested a strong version of a dual process theory of conditional inference (cf. Verschueren et al., 2005a, 2005b) that assumes that most reasoners have 2 strategies available, the choice of which is determined by situational variables, cognitive capacity, and metacognitive control. The statistical strategy evaluates inferences…

  19. The Role of Probability in Developing Learners' Models of Simulation Approaches to Inference

    ERIC Educational Resources Information Center

    Lee, Hollylynne S.; Doerr, Helen M.; Tran, Dung; Lovett, Jennifer N.

    2016-01-01

    Repeated sampling approaches to inference that rely on simulations have recently gained prominence in statistics education, and probabilistic concepts are at the core of this approach. In this approach, learners need to develop a mapping among the problem situation, a physical enactment, computer representations, and the underlying randomization…

  20. From Blickets to Synapses: Inferring Temporal Causal Networks by Observation

    ERIC Educational Resources Information Center

    Fernando, Chrisantha

    2013-01-01

    How do human infants learn the causal dependencies between events? Evidence suggests that this remarkable feat can be achieved by observation of only a handful of examples. Many computational models have been produced to explain how infants perform causal inference without explicit teaching about statistics or the scientific method. Here, we…

  1. It's a Girl! Random Numbers, Simulations, and the Law of Large Numbers

    ERIC Educational Resources Information Center

    Goodwin, Chris; Ortiz, Enrique

    2015-01-01

    Modeling using mathematics and making inferences about mathematical situations are becoming more prevalent in most fields of study. Descriptive statistics cannot be used to generalize about a population or make predictions of what can occur. Instead, inference must be used. Simulation and sampling are essential in building a foundation for…

  2. Thou Shalt Not Bear False Witness against Null Hypothesis Significance Testing

    ERIC Educational Resources Information Center

    García-Pérez, Miguel A.

    2017-01-01

    Null hypothesis significance testing (NHST) has been the subject of debate for decades and alternative approaches to data analysis have been proposed. This article addresses this debate from the perspective of scientific inquiry and inference. Inference is an inverse problem and application of statistical methods cannot reveal whether effects…

  3. Hypothesis-Testing Demands Trustworthy Data—A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy

    PubMed Central

    Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363
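
    The error-rate simulations that motivate RPS can be reproduced in miniature. The sketch below (sample size, effect size, and alpha are assumed values) estimates the power of a two-sample t-test by simulation; one minus this estimate is the false-negative rate:

```python
# Minimal simulation of NHST error rates: estimate the power of a
# two-sample t-test at a given n, effect size d, and alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, d, reps, alpha = 20, 0.5, 10000, 0.05
rejections = 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d, 1.0, n)                  # true effect of d SDs
    if stats.ttest_ind(a, b).pvalue < alpha:
        rejections += 1
print(f"power ~= {rejections / reps:.2f}")     # ~0.34 for n=20, d=0.5
```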

  4. NIRS-SPM: statistical parametric mapping for near infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Tak, Sungho; Jang, Kwang Eun; Jung, Jinwook; Jang, Jaeduck; Jeong, Yong; Ye, Jong Chul

    2008-02-01

    Even though there exists a powerful statistical parametric mapping (SPM) tool for fMRI, similar public-domain tools are not available for near infrared spectroscopy (NIRS). In this paper, we describe a new public-domain statistical toolbox called NIRS-SPM for quantitative analysis of NIRS signals. Specifically, NIRS-SPM statistically analyzes the NIRS data using a general linear model (GLM) and performs inference via the excursion probability of a random field interpolated from the sparse measurements. In order to obtain correct inference, NIRS-SPM offers pre-coloring and pre-whitening methods for temporal correlation estimation. For NIRS signals recorded simultaneously with fMRI, the spatial mapping between the fMRI image and real coordinates from a 3-D digitizer is estimated using Horn's algorithm. These tools allow super-resolution localization of brain activation, which is not possible with conventional NIRS analysis tools.

  5. Hypothesis-Testing Demands Trustworthy Data-A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy.

    PubMed

    Krefeld-Schwalb, Antonia; Witte, Erich H; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a "pure" Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis.

  6. Application of Bayesian inference to the study of hierarchical organization in self-organized complex adaptive systems

    NASA Astrophysics Data System (ADS)

    Knuth, K. H.

    2001-05-01

    We consider the application of Bayesian inference to the study of self-organized structures in complex adaptive systems. In particular, we examine the distribution of elements, agents, or processes in systems dominated by hierarchical structure. We demonstrate that results obtained by Caianiello [1] on Hierarchical Modular Systems (HMS) can be found by applying Jaynes' Principle of Group Invariance [2] to a few key assumptions about our knowledge of hierarchical organization. Subsequent application of the Principle of Maximum Entropy allows inferences to be made about specific systems. The utility of the Bayesian method is considered by examining both successes and failures of the hierarchical model. We discuss how Caianiello's original statements suffer from the Mind Projection Fallacy [3] and we restate his assumptions thus widening the applicability of the HMS model. The relationship between inference and statistical physics, described by Jaynes [4], is reiterated with the expectation that this realization will aid the field of complex systems research by moving away from often inappropriate direct application of statistical mechanics to a more encompassing inferential methodology.

  7. A Balanced Approach to Adaptive Probability Density Estimation.

    PubMed

    Kovacs, Julio A; Helmick, Cailee; Wriggers, Willy

    2017-01-01

    Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adaptive Density Estimation (BADE) method that effectively optimizes the amount of smoothing at each point. To do this, BADE relies on an efficient nearest-neighbor search which results in good scaling for large data sizes. Our tests on simulated data show that BADE exhibits equal or better accuracy than existing methods, and visual tests on univariate and bivariate experimental data show that the results are also aesthetically pleasing. This is due in part to the use of a visual criterion for setting the smoothing level of the density estimate. Our results suggest that BADE offers an attractive new take on the fundamental density estimation problem in statistics. We have applied it on molecular dynamics simulations of membrane pore formation. We also expect BADE to be generally useful for low-dimensional applications in other statistical application domains such as bioinformatics, signal processing and econometrics.
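
    A simple relative of this adaptive-smoothing idea is a k-nearest-neighbor "balloon" density estimate, where the bandwidth at each evaluation point is set by the distance to the k-th nearest sample. The sketch below is not BADE itself, only a minimal illustration on synthetic, uneven data:

```python
# Sketch of a k-nearest-neighbor "balloon" density estimate, a simple
# relative of adaptive smoothing (this is not the BADE method itself).
import numpy as np

def knn_density(x_eval, data, k=10):
    """Estimate a 1-D density at x_eval from the distance to the k-th
    nearest sample point: p(x) ~ k / (2 * n * r_k(x))."""
    data = np.asarray(data)
    dists = np.abs(data[None, :] - np.asarray(x_eval)[:, None])
    r_k = np.sort(dists, axis=1)[:, k - 1]
    return k / (2.0 * data.size * r_k)

rng = np.random.default_rng(5)
# Uneven sample: a broad mode plus a narrow, sparsely separated one
sample = np.concatenate([rng.normal(0, 1, 900), rng.normal(5, 0.2, 100)])
grid = np.linspace(-3, 6, 200)
density = knn_density(grid, sample, k=25)   # bandwidth adapts to the data
```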

  8. Temperature rise, sea level rise and increased radiative forcing - an application of cointegration methods

    NASA Astrophysics Data System (ADS)

    Schmith, Torben; Thejll, Peter; Johansen, Søren

    2016-04-01

    We analyse the statistical relationship between changes in global temperature, global steric sea level, and radiative forcing in order to reveal causal relationships. There are, however, potential pitfalls in this due to the trending nature of the time series. We therefore apply a statistical method called cointegration analysis, originating from the field of econometrics, which correctly handles the analysis of series with trends and other long-range dependencies. We find a relationship between steric sea level and temperature, namely that temperature causally depends on the steric sea level, which can be understood as a consequence of the large heat capacity of the ocean. This result is obtained both when analyzing observed data and data from a CMIP5 historical model run. We further find that, in the data from the historical run, the steric sea level is in turn driven by the external forcing. Finally, we demonstrate that combining these two results can lead to a novel estimate of radiative forcing back in time based on observations.
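
    The kind of analysis described here can be illustrated with the simpler Engle-Granger two-step cointegration test on two synthetic trending series (the data below are invented, and the study's own cointegration method may differ):

```python
# Hedged sketch of an Engle-Granger cointegration test on two trending
# series; the series are synthetic stand-ins, not the study's data.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(6)
trend = np.cumsum(rng.normal(size=500))         # shared stochastic trend
temp = trend + rng.normal(0, 0.5, 500)          # "temperature"
steric = 0.7 * trend + rng.normal(0, 0.5, 500)  # "steric sea level"

t_stat, p_value, _ = coint(temp, steric)
print(f"Engle-Granger t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value rejects "no cointegration": the series share a common
# trend and hence a stable long-run relationship.
```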

  9. Correlates of violence in Guinea's Maison Centrale Prison: a statistical approach to documenting human rights abuses.

    PubMed

    Osborn, Ronald E

    2010-12-15

    Les Mêmes Droits Pour Tous (MDT) is a human rights NGO in Guinea, West Africa that focuses on the rights of prisoners in Maison Centrale, the country's largest prison located in the capital city of Conakry. In 2007, MDT completed a survey of the prison population to assess basic legal and human rights conditions. This article uses statistical tools to explore MDT's survey results in greater depth, shedding light on human rights violations in Guinea. It contributes to human rights literature that argues for greater use of econometric tools in rights reporting, and demonstrates how human rights practitioners and academics can work together to construct an etiology of violence and torture by state actors, as physical violence is perhaps the most extreme violation of the individual's right to health. Copyright © 2010 Osborn. This is an open access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.

  10. Conditional statistical inference with multistage testing designs.

    PubMed

    Zwitser, Robert J; Maris, Gunter

    2015-03-01

    In this paper it is demonstrated how statistical inference from multistage test designs can be made based on the conditional likelihood. Special attention is given to parameter estimation, as well as the evaluation of model fit. Two reasons are provided why the fit of simple measurement models is expected to be better in adaptive designs, compared to linear designs: more parameters are available for the same number of observations; and undesirable response behavior, like slipping and guessing, might be avoided owing to a better match between item difficulty and examinee proficiency. The results are illustrated with simulated data, as well as with real data.

  11. Use of Tests of Statistical Significance and Other Analytic Choices in a School Psychology Journal: Review of Practices and Suggested Alternatives.

    ERIC Educational Resources Information Center

    Snyder, Patricia A.; Thompson, Bruce

    The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association…

  12. Standard deviation and standard error of the mean.

    PubMed

    Lee, Dong Kyu; In, Junyong; Lee, Sangseok

    2015-06-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage of the SD and SEM in the medical literature. Because the processes of calculating the SD and SEM involve different statistical inferences, each has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents the sample data. The meaning of SEM, however, involves statistical inference based on the sampling distribution: SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of the appropriate use of each. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results.

  13. Standard deviation and standard error of the mean

    PubMed Central

    In, Junyong; Lee, Sangseok

    2015-01-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage of the SD and SEM in the medical literature. Because the processes of calculating the SD and SEM involve different statistical inferences, each has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents the sample data. The meaning of SEM, however, involves statistical inference based on the sampling distribution: SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of the appropriate use of each. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results. PMID:26045923
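
    The distinction drawn in both records can be made concrete in a few lines; the sample values below are invented:

```python
# Minimal illustration of the SD/SEM distinction discussed above.
import numpy as np

rng = np.random.default_rng(7)
sample = rng.normal(loc=120.0, scale=15.0, size=25)  # e.g., blood pressure

sd = sample.std(ddof=1)            # spread of the observations themselves
sem = sd / np.sqrt(sample.size)    # spread of the sample mean
print(f"SD  = {sd:.1f}   (describes the data)")
print(f"SEM = {sem:.1f}   (describes the precision of the mean)")
```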

  14. Visual shape perception as Bayesian inference of 3D object-centered shape representations.

    PubMed

    Erdogan, Goker; Jacobs, Robert A

    2017-11-01

    Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  15. Robust inference for group sequential trials.

    PubMed

    Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei

    2017-03-01

    For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is a loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time-to-event trials, although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.
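
    Classical combining rules give the flavor of the method. Note the caveat in the comments: naive Fisher or Stouffer combination assumes independent P values, whereas the method above combines correlated test statistics while controlling the type I error rate:

```python
# Sketch of combining P values from several test statistics. Values are
# illustrative. Caveat: these classical rules assume independent P values;
# the paper's method handles correlated statistics from the same data.
from scipy import stats

p_values = [0.04, 0.11, 0.02]            # e.g., from log-rank, Wilcoxon, ...
chi2, p_fisher = stats.combine_pvalues(p_values, method="fisher")
z, p_stouffer = stats.combine_pvalues(p_values, method="stouffer")
print(f"Fisher   combined p = {p_fisher:.4f}")
print(f"Stouffer combined p = {p_stouffer:.4f}")
```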

  16. Long-term strategy for the statistical design of a forest health monitoring system

    Treesearch

    Hans T. Schreuder; Raymond L. Czaplewski

    1993-01-01

    A conceptual framework is given for a broad-scale survey of forest health that accomplishes three objectives: generate descriptive statistics; detect changes in such statistics; and simplify analytical inferences that identify, and possibly establish cause-effect relationships. Our paper discusses the development of sampling schemes to satisfy these three objectives,...

  17. Assessing Understanding of Sampling Distributions and Differences in Learning amongst Different Learning Styles

    ERIC Educational Resources Information Center

    Beeman, Jennifer Leigh Sloan

    2013-01-01

    Research has found that students successfully complete an introductory course in statistics without fully comprehending the underlying theory or being able to exhibit statistical reasoning. This is particularly true for the understanding about the sampling distribution of the mean, a crucial concept for statistical inference. This study…

  18. Using Action Research to Develop a Course in Statistical Inference for Workplace-Based Adults

    ERIC Educational Resources Information Center

    Forbes, Sharleen

    2014-01-01

    Many adults who need an understanding of statistical concepts have limited mathematical skills. They need a teaching approach that includes as little mathematical context as possible. Iterative participatory qualitative research (action research) was used to develop a statistical literacy course for adult learners informed by teaching in…

  19. Applying Statistical Process Control to Clinical Data: An Illustration.

    ERIC Educational Resources Information Center

    Pfadt, Al; And Others

    1992-01-01

    Principles of statistical process control are applied to a clinical setting through the use of control charts to detect changes, as part of treatment planning and clinical decision-making processes. The logic of control chart analysis is derived from principles of statistical inference. Sample charts offer examples of evaluating baselines and…

  20. Efficiency Analysis: Enhancing the Statistical and Evaluative Power of the Regression-Discontinuity Design.

    ERIC Educational Resources Information Center

    Madhere, Serge

    An analytic procedure, efficiency analysis, is proposed for improving the utility of quantitative program evaluation for decision making. The three features of the procedure are explained: (1) for statistical control, it adopts and extends the regression-discontinuity design; (2) for statistical inferences, it de-emphasizes hypothesis testing in…

  1. The penumbra of learning: a statistical theory of synaptic tagging and capture.

    PubMed

    Gershman, Samuel J

    2014-01-01

    Learning in humans and animals is accompanied by a penumbra: Learning one task benefits from learning an unrelated task shortly before or after. At the cellular level, the penumbra of learning appears when weak potentiation of one synapse is amplified by strong potentiation of another synapse on the same neuron during a critical time window. Weak potentiation sets a molecular tag that enables the synapse to capture plasticity-related proteins synthesized in response to strong potentiation at another synapse. This paper describes a computational model which formalizes synaptic tagging and capture in terms of statistical learning mechanisms. According to this model, synaptic strength encodes a probabilistic inference about the dynamically changing association between pre- and post-synaptic firing rates. The rate of change is itself inferred, coupling together different synapses on the same neuron. When the inputs to one synapse change rapidly, the inferred rate of change increases, amplifying learning at other synapses.

  2. Space-Time Data fusion for Remote Sensing Applications

    NASA Technical Reports Server (NTRS)

    Braverman, Amy; Nguyen, H.; Cressie, N.

    2011-01-01

    NASA has been collecting massive amounts of remote sensing data about Earth's systems for more than a decade. Missions are selected to be complementary in quantities measured, retrieval techniques, and sampling characteristics, so these datasets are highly synergistic. To fully exploit this, a rigorous methodology for combining data with heterogeneous sampling characteristics is required. For scientific purposes, the methodology must also provide quantitative measures of uncertainty that propagate input-data uncertainty appropriately. We view this as a statistical inference problem. The true but not directly observed quantities form a vector-valued field continuous in space and time. Our goal is to infer those true values, or some function of them, and to provide uncertainty quantification for those inferences. We use a spatio-temporal statistical model that relates the unobserved quantities of interest at point level to the spatially aggregated, observed data. We describe and illustrate our method using CO2 data from two NASA data sets.

  3. Inference of missing data and chemical model parameters using experimental statistics

    NASA Astrophysics Data System (ADS)

    Casey, Tiernan; Najm, Habib

    2017-11-01

    A method for determining the joint parameter density of Arrhenius rate expressions through the inference of missing experimental data is presented. This approach proposes noisy hypothetical data sets from target experiments and accepts those which agree with the reported statistics, in the form of nominal parameter values and their associated uncertainties. The data exploration procedure is formalized using Bayesian inference, employing maximum entropy and approximate Bayesian computation methods to arrive at a joint density on data and parameters. The method is demonstrated in the context of reactions in the H2-O2 system for predictive modeling of combustion systems of interest. Work supported by the US DOE BES CSGB. Sandia National Labs is a multimission lab managed and operated by Nat. Technology and Eng'g Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell Intl, for the US DOE NCSA under contract DE-NA-0003525.
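
    The accept-if-consistent-with-reported-statistics step lends itself to a rejection-style approximate Bayesian computation (ABC) sketch. Everything below (the prior range, target statistics, and tolerances) is invented for illustration and is far simpler than the paper's maximum-entropy procedure:

```python
# Minimal sketch of ABC by rejection, in the spirit of the data-exploration
# step above. The target statistics and tolerances are invented.
import numpy as np

rng = np.random.default_rng(8)
target_mean, target_sd = 12.0, 0.5   # "reported" nominal ln(A) and uncertainty

accepted = []
for _ in range(100000):
    ln_A = rng.uniform(8.0, 16.0)             # draw parameter from prior
    data = rng.normal(ln_A, 0.8, size=5)      # propose hypothetical data set
    # accept if the proposed data reproduce the reported statistics
    if (abs(data.mean() - target_mean) < 0.2 and
            abs(data.std(ddof=1) - target_sd) < 0.2):
        accepted.append(ln_A)

print(f"accepted {len(accepted)} draws; "
      f"posterior mean ln(A) ~= {np.mean(accepted):.2f}")
```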

  4. Statistical numeracy as a moderator of (pseudo)contingency effects on decision behavior.

    PubMed

    Fleig, Hanna; Meiser, Thorsten; Ettlin, Florence; Rummel, Jan

    2017-03-01

    Pseudocontingencies denote contingency estimates inferred from base rates rather than from cell frequencies. We examined the role of statistical numeracy in the effects of such fallible but adaptive inferences on choice behavior. In Experiment 1, we provided information on single observations as well as on base rates and tracked participants' eye movements. In Experiment 2, we manipulated the availability of information on cell frequencies and base rates between conditions. Our results demonstrate that a focus on base rates rather than cell frequencies promotes pseudocontingency effects. Learners who are more proficient in (conditional) probability calculation, however, prefer to rely on cell frequencies to judge contingencies, as was evident from their gaze behavior. If cell frequencies are available in summarized format, such learners may infer the true contingency between options and outcomes. Otherwise, however, even highly numerate learners are susceptible to pseudocontingency effects. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

    PubMed Central

    Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

    2014-01-01

    In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g., direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to that achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
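
    The central computation in such a multivariate Gaussian model is reading direct couplings off the inverse of a regularized covariance matrix. The sketch below uses a random stand-in for an encoded alignment; real direct-coupling analysis encodes each amino-acid site as a block of columns rather than a single one:

```python
# Sketch of the core step in the multivariate Gaussian variant of
# direct-coupling analysis: couplings from the inverse covariance matrix.
import numpy as np

rng = np.random.default_rng(9)
n_seq, n_sites = 500, 20
msa = rng.normal(size=(n_seq, n_sites))   # stand-in for an encoded alignment

cov = np.cov(msa, rowvar=False)
cov += 0.1 * np.eye(n_sites)              # regularization (pseudocount-like)
J = -np.linalg.inv(cov)                   # direct couplings (off-diagonal)

# score site pairs by coupling strength; top pairs predict contacts
scores = np.abs(np.triu(J, k=1))
i, j = np.unravel_index(np.argmax(scores), scores.shape)
print(f"strongest predicted coupling: sites {i} and {j}")
```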

  6. The space of ultrametric phylogenetic trees.

    PubMed

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space, formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce, and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  7. Assessing risk factors for dental caries: a statistical modeling approach.

    PubMed

    Trottini, Mario; Bossù, Maurizio; Corridore, Denise; Ierardo, Gaetano; Luzzi, Valeria; Saccucci, Matteo; Polimeni, Antonella

    2015-01-01

    The problem of identifying potential determinants and predictors of dental caries is of key importance in caries research and has received considerable attention in the scientific literature. From the methodological side, a broad range of statistical models is currently available to analyze dental caries indices (DMFT, dmfs, etc.). These models have been applied in several studies to investigate the impact of different risk factors on the cumulative severity of dental caries experience. However, in most cases (i) these studies focus on a very specific subset of risk factors; and (ii) in the statistical modeling only a few candidate models are considered, and model selection is at best only marginally addressed. As a result, our understanding of the robustness of the statistical inferences with respect to the choice of the model is very limited; the richness of the set of statistical models available for analysis is only marginally exploited; and inferences could be biased due to the omission of potentially important confounding variables in the model's specification. In this paper we argue that these limitations can be overcome by considering a general class of candidate models and carefully exploring the model space using standard model selection criteria and measures of global fit and predictive performance of the candidate models. Strengths and limitations of the proposed approach are illustrated with a real data set. In our illustration the model space contains more than 2.6 million models, which requires inferences to be adjusted for 'optimism'.

  8. Inferring Characteristics of Sensorimotor Behavior by Quantifying Dynamics of Animal Locomotion

    NASA Astrophysics Data System (ADS)

    Leung, KaWai

    Locomotion is one of the most well-studied topics in animal behavioral studies. Much fundamental and clinical research makes use of the locomotion of an animal model to explore various aspects of sensorimotor behavior. In the past, most of these studies focused on the population average of a specific trait due to limitations in data collection and processing power. With recent advances in computer vision and statistical modeling techniques, it is now possible to track and analyze large amounts of behavioral data. In this thesis, I present two projects that aim to infer the characteristics of sensorimotor behavior by quantifying the dynamics of locomotion of the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, shedding light on the statistical dependence between sensing and behavior. In the first project, I investigate the possibility of inferring noxious sensory information from the behavior of Caenorhabditis elegans. I develop a statistical model to infer the heat stimulus level perceived by individual animals from their stereotyped escape responses after stimulation by an IR laser. The model allows quantification of analgesic-like effects of chemical agents or genetic mutations in the worm. At the same time, the method is able to differentiate perturbations of locomotion behavior that go beyond affecting the sensory system. With this model I propose experimental designs that allow statistically significant identification of analgesic-like effects. In the second project, I investigate the relationship of energy budget and stability of locomotion in determining the walking speed distribution of Drosophila melanogaster during aging. The locomotion stability at different age groups is estimated from video recordings using Floquet theory. I calculate the power consumption at different locomotion speeds using a biomechanics model. In conclusion, the power consumption, not stability, predicts the locomotion speed distribution at different ages.

  9. Towards a Phylogenetic Approach to the Composition of Species Complexes in the North and Central American Triatoma, Vectors of Chagas Disease

    PubMed Central

    de la Rúa, Nicholas M.; Bustamante, Dulce M.; Menes, Marianela; Stevens, Lori; Monroy, Carlota; Kilpatrick, William; Rizzo, Donna; Klotz, Stephen A.; Schmidt, Justin; Axen, Heather J.; Dorn, Patricia L.

    2014-01-01

    Phylogenetic relationships of insect vectors of parasitic diseases are important for understanding the evolution of epidemiologically relevant traits, and may be useful in vector control. The subfamily Triatominae (Hemiptera:Reduviidae) includes ~140 extant species arranged in five tribes comprised of 15 genera. The genus Triatoma is the most species-rich and contains important vectors of Trypanosoma cruzi, the causative agent of Chagas disease. Triatoma species were grouped into complexes originally by morphology and more recently with the addition of information from molecular phylogenetics (the four-complex hypothesis); however, without a strict adherence to monophyly. To date, the validity of proposed species complexes has not been tested by statistical tests of topology. The goal of this study was to clarify the systematics of 19 Triatoma species from North and Central America. We inferred their evolutionary relatedness using two independent data sets: the complete nuclear Internal Transcribed Spacer-2 ribosomal DNA (ITS-2 rDNA) and head morphometrics. In addition, we used the Shimodaira-Hasegawa statistical test of topology to assess the fit of the data to a set of competing systematic hypotheses (topologies). An unconstrained topology inferred from the ITS-2 data was compared to topologies constrained based on the four-complex hypothesis or one inferred from our morphometry results. The unconstrained topology represents a statistically significant better fit of the molecular data than either the four-complex or the morphometric topology. We propose an update to the composition of species complexes in the North and Central American Triatoma, based on a phylogeny inferred from ITS-2 as a first step towards updating the phylogeny of the complexes based on monophyly and statistical tests of topologies. PMID:24681261

  10. Emerging Concepts of Data Integration in Pathogen Phylodynamics.

    PubMed

    Baele, Guy; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe

    2017-01-01

    Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. [Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.]

  11. Emerging Concepts of Data Integration in Pathogen Phylodynamics

    PubMed Central

    Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe

    2017-01-01

    Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. [Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504

  12. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    PubMed

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.

  13. BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

    PubMed Central

    Ayres, Daniel L.; Darling, Aaron; Zwickl, Derrick J.; Beerli, Peter; Holder, Mark T.; Lewis, Paul O.; Huelsenbeck, John P.; Ronquist, Fredrik; Swofford, David L.; Cummings, Michael P.; Rambaut, Andrew; Suchard, Marc A.

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software. PMID:21963610

  14. Beyond P Values and Hypothesis Testing: Using the Minimum Bayes Factor to Teach Statistical Inference in Undergraduate Introductory Statistics Courses

    ERIC Educational Resources Information Center

    Page, Robert; Satake, Eiki

    2017-01-01

    While interest in Bayesian statistics has been growing in statistics education, the treatment of the topic is still inadequate in both textbooks and the classroom. Because so many fields of study lead to careers that involve a decision-making process requiring an understanding of Bayesian methods, it is becoming increasingly clear that Bayesian…

  15. Human Inferences about Sequences: A Minimal Transition Probability Model

    PubMed Central

    2016-01-01

    The brain constantly infers the causes of the inputs it receives and uses these inferences to generate statistical expectations about future observations. Experimental evidence for these expectations and their violations include explicit reports, sequential effects on reaction times, and mismatch or surprise signals recorded in electrophysiology and functional MRI. Here, we explore the hypothesis that the brain acts as a near-optimal inference device that constantly attempts to infer the time-varying matrix of transition probabilities between the stimuli it receives, even when those stimuli are in fact fully unpredictable. This parsimonious Bayesian model, with a single free parameter, accounts for a broad range of findings on surprise signals, sequential effects and the perception of randomness. Notably, it explains the pervasive asymmetry between repetitions and alternations encountered in those studies. Our analysis suggests that a neural machinery for inferring transition probabilities lies at the core of human sequence knowledge. PMID:28030543
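
    A fixed-window caricature of this inference can be written in a few lines: keep Dirichlet (here Beta) counts for each transition and report posterior means. The model's leaky integration and its single free parameter are deliberately omitted:

```python
# Minimal sketch of Bayesian inference of transition probabilities from a
# binary stimulus sequence, a simplified cousin of the model above.
import numpy as np

rng = np.random.default_rng(10)
seq = rng.integers(0, 2, size=200)           # fully random stimuli

counts = np.ones((2, 2))                     # Beta(1,1) prior on each row
for prev, nxt in zip(seq[:-1], seq[1:]):
    counts[prev, nxt] += 1

post_mean = counts / counts.sum(axis=1, keepdims=True)
print("posterior mean transition matrix:")
print(np.round(post_mean, 2))                # ~0.5 throughout for random input
# Surprise at observing `nxt` after `prev` is -log2(post_mean[prev, nxt]).
```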

  16. Exploring High School Students' Beginning Reasoning about Significance Tests with Technology

    ERIC Educational Resources Information Center

    García, Víctor N.; Sánchez, Ernesto

    2017-01-01

    In the present study we analyze how students reason about or make inferences in a particular hypothesis-testing problem (without having studied formal methods of statistical inference) when using Fathom. They use Fathom to create an empirical sampling distribution through computer simulation. It is found that most students' reasoning relies on…

  17. Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization

    ERIC Educational Resources Information Center

    Gelman, Andrew; Lee, Daniel; Guo, Jiqiang

    2015-01-01

    Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia. It holds great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…

  18. IMNN: Information Maximizing Neural Networks

    NASA Astrophysics Data System (ADS)

    Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

    2018-04-01

    This software trains artificial neural networks to find non-linear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). Compressing large data sets to a manageable number of summaries vastly simplifies both frequentist and Bayesian inference, but important information may be inadvertently lost in the process. Likelihood-free inference based on automatically derived IMNN summaries produces summaries that are good approximations to sufficient statistics. IMNNs are robustly capable of automatically finding optimal, non-linear summaries of the data even in cases where linear compression fails: inferring the variance of a Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima.

  19. Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.

    PubMed

    Sayyari, Erfan; Mirarab, Siavash

    2016-11-11

    Inferring species trees from gene trees using coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves. We show in simulation studies that DISTIQUE has accuracy comparable to leading coalescent-based summary methods and reduced running times.

  20. Econometric analysis of the impact of the relationship of GDP and the pension capital

    NASA Astrophysics Data System (ADS)

    Nepp, A. N.; Amiryan, A. A.

    2016-12-01

    The article demonstrates the impact of institutional risks on indicators of compulsory pension insurance and describes the results of a comparative analysis of investment risks faced by the pension systems of the Russian Federation and OECD countries. The efficiency of private companies managing pension funds in Russia and OECD countries is compared and analyzed to show the necessity of liberalizing the requirements placed on investments of pension savings funds. On the basis of the available statistical data, the article puts forward and discusses the hypothesis that the basic indicators of the pension system can be improved by reducing its institutional risks. It is concluded that if institutional risks are reduced and the level of trust increases, there will be enhanced growth in the pension system's key indicators, such as pension payments and the replacement rate.

  1. Bubbles, shocks and elementary technical trading strategies

    NASA Astrophysics Data System (ADS)

    Fry, John

    2014-01-01

    In this paper we provide a unifying framework for a set of seemingly disparate models for bubbles, shocks and elementary technical trading strategies in financial markets. Markets operate by balancing intrinsic levels of risk and return. This seemingly simple observation is commonly overlooked by academics and practitioners alike. Our model shares its origins in statistical physics with others. However, under our approach, changes in market regime can be explicitly shown to represent a phase transition from random to deterministic behaviour in prices. This structure leads to an improved physical and econometric model. We develop models for bubbles, shocks and elementary technical trading strategies. The list of empirical applications is both interesting and topical and includes real-estate bubbles and the ongoing Eurozone crisis. We close by comparing the results of our model with purely qualitative findings from the finance literature.

  2. Spatial data analytics on heterogeneous multi- and many-core parallel architectures using python

    USGS Publications Warehouse

    Laura, Jason R.; Rey, Sergio J.

    2017-01-01

    Parallel vector spatial analysis concerns the application of parallel computational methods to facilitate vector-based spatial analysis. The history of parallel computation in spatial analysis is reviewed, and this work is placed into the broader context of high-performance computing (HPC) and parallelization research. The rise of cyber infrastructure and its manifestation in spatial analysis as CyberGIScience is seen as a main driver of renewed interest in parallel computation in the spatial sciences. Key problems in spatial analysis that have been the focus of parallel computing are covered. Chief among these are spatial optimization problems, computational geometric problems including polygonization and spatial contiguity detection, the use of Monte Carlo Markov chain simulation in spatial statistics, and parallel implementations of spatial econometric methods. Future directions for research on parallelization in computational spatial analysis are outlined.

  3. Data Acquisition and Preprocessing in Studies on Humans: What Is Not Taught in Statistics Classes?

    PubMed

    Zhu, Yeyi; Hernandez, Ladia M; Mueller, Peter; Dong, Yongquan; Forman, Michele R

    2013-01-01

    The aim of this paper is to address issues in research that may be missing from statistics classes and important for (bio-)statistics students. In the context of a case study, we discuss data acquisition and preprocessing steps that fill the gap between research questions posed by subject matter scientists and statistical methodology for formal inference. Issues include participant recruitment, data collection training and standardization, variable coding, data review and verification, data cleaning and editing, and documentation. Despite the critical importance of these details in research, most of these issues are rarely discussed in an applied statistics program. One reason for the lack of more formal training is the difficulty in addressing the many challenges that can possibly arise in the course of a study in a systematic way. This article can help to bridge this gap between research questions and formal statistical inference by using an illustrative case study for a discussion. We hope that reading and discussing this paper and practicing data preprocessing exercises will sensitize statistics students to these important issues and achieve optimal conduct, quality control, analysis, and interpretation of a study.

  4. Entropy Econometrics for combining regional economic forecasts: A Data-Weighted Prior Estimator

    NASA Astrophysics Data System (ADS)

    Fernández-Vázquez, Esteban; Moreno, Blanca

    2017-10-01

    Forecast combination has been studied in econometrics for a long time, and the literature has shown the superior performance of forecast combination over individual predictions. However, there is still controversy over which procedure is best for specifying the forecast weights. This paper explores the possibility of using a procedure based on Entropy Econometrics, which allows setting the weights for the individual forecasts as a mixture of different alternatives. In particular, we examine the ability of the Data-Weighted Prior Estimator proposed by Golan (J Econom 101(1):165-193, 2001) to combine forecasting models in a context of small sample sizes, a relatively common scenario when dealing with time series for regional economies. We test the validity of the proposed approach using a simulation exercise and a real-world example that aims at predicting gross regional product growth rates for a regional economy. The forecasting performance of the proposed Data-Weighted Prior Estimator is compared with that of other combining methods. The simulation results indicate that in scenarios of heavily ill-conditioned datasets the suggested approach dominates other forecast combination strategies. The empirical results are consistent with the conclusions found in the numerical experiment.
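
    As background for the combination problem, the snippet below shows perhaps the simplest weighting scheme, inverse-MSE (Bates-Granger style) weights. It is offered as a baseline illustration of forecast combination, not as the entropy-based Data-Weighted Prior estimator the paper studies.

```python
import numpy as np

def combine_forecasts(forecasts, actuals):
    """Combine competing forecasts with inverse-MSE weights.

    forecasts: (T, k) array, one column per forecasting model
    actuals:   (T,) array of realized values
    Returns the weight vector and the combined forecast series."""
    errors = forecasts - actuals[:, None]
    mse = (errors ** 2).mean(axis=0)
    weights = (1.0 / mse) / (1.0 / mse).sum()   # inverse-MSE, summing to 1
    return weights, forecasts @ weights

# Two noisy forecasters of regional growth; the less noisy one earns more weight.
rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=40)                    # growth rates
f = np.column_stack([y + rng.normal(0, 0.5, 40),     # accurate model
                     y + rng.normal(0, 1.5, 40)])    # noisy model
w, combined = combine_forecasts(f, y)
print(w)   # most weight on the first column
```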

  5. Ratio-based estimators for a change point in persistence.

    PubMed

    Halunga, Andreea G; Osborn, Denise R

    2012-11-01

    We study estimation of the date of change in persistence, from [Formula: see text] to [Formula: see text] or vice versa. Contrary to statements in the original papers, our analytical results establish that the ratio-based break point estimators of Kim [Kim, J.Y., 2000. Detection of change in persistence of a linear time series. Journal of Econometrics 95, 97-116], Kim et al. [Kim, J.Y., Belaire-Franch, J., Badillo Amador, R., 2002. Corrigendum to "Detection of change in persistence of a linear time series". Journal of Econometrics 109, 389-392] and Busetti and Taylor [Busetti, F., Taylor, A.M.R., 2004. Tests of stationarity against a change in persistence. Journal of Econometrics 123, 33-66] are inconsistent when a mean (or other deterministic component) is estimated for the process. In such cases, the estimators converge to random variables with upper bound given by the true break date when persistence changes from [Formula: see text] to [Formula: see text]. A Monte Carlo study confirms the large sample downward bias and also finds substantial biases in moderate sized samples, partly due to properties at the end points of the search interval.

  6. A comparative analysis of errors in long-term econometric forecasts

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tepel, R.

    1986-04-01

    The growing body of literature that documents forecast accuracy falls generally into two parts. The first is prescriptive and is carried out by modelers who use simulation analysis as a tool for model improvement. These studies are ex post, that is, they make use of known values for exogenous variables and generate an error measure wholly attributable to the model. The second type of analysis is descriptive and seeks to measure errors, identify patterns among errors and variables and compare forecasts from different sources. Most descriptive studies use an ex ante approach, that is, they evaluate model outputs based on estimated (or forecasted) exogenous variables. In this case, it is the forecasting process, rather than the model, that is under scrutiny. This paper uses an ex ante approach to measure errors in forecast series prepared by Data Resources Incorporated (DRI), Wharton Econometric Forecasting Associates (Wharton), and Chase Econometrics (Chase) and to determine if systematic patterns of errors can be discerned between services, types of variables (by degree of aggregation), length of forecast and time at which the forecast is made. Errors are measured as the percent difference between actual and forecasted values for the historical period of 1971 to 1983.
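
    The error measure itself is a one-liner; the version below assumes the convention the description suggests (actual minus forecast, relative to actual), and the figures in the usage line are made up.

```python
def percent_error(actual, forecast):
    """Ex ante forecast error as the percent difference between actual and
    forecasted values (sign and denominator conventions assumed)."""
    return 100.0 * (actual - forecast) / actual

print(percent_error(actual=3083.0, forecast=2950.0))  # hypothetical values, ~4.3%
```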

  7. An Artificial Intelligence Approach to Analyzing Student Errors in Statistics.

    ERIC Educational Resources Information Center

    Sebrechts, Marc M.; Schooler, Lael J.

    1987-01-01

    Describes the development of an artificial intelligence system called GIDE that analyzes student errors in statistics problems by inferring the students' intentions. Learning strategies involved in problem solving are discussed and the inclusion of goal structures is explained. (LRW)

  8. Network inference using informative priors.

    PubMed

    Mukherjee, Sach; Speed, Terence P

    2008-09-23

    Recent years have seen much interest in the study of systems characterized by multiple interacting components. A class of statistical models called graphical models, in which graphs are used to represent probabilistic relationships between variables, provides a framework for formal inference regarding such systems. In many settings, the object of inference is the network structure itself. This problem of "network inference" is well known to be a challenging one. However, in scientific settings there is very often existing information regarding network connectivity. A natural idea then is to take account of such information during inference. This article addresses the question of incorporating prior information into network inference. We focus on directed models called Bayesian networks, and use Markov chain Monte Carlo to draw samples from posterior distributions over network structures. We introduce prior distributions on graphs capable of capturing information regarding network features including edges, classes of edges, degree distributions, and sparsity. We illustrate our approach in the context of systems biology, applying our methods to network inference in cancer signaling.
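
    One way to make "prior information on edges" concrete is a concordance-style log-prior that rewards candidate graphs for agreeing with prior edge beliefs; inside an MCMC sampler, this term would be added to the marginal log-likelihood when scoring structure proposals. The sketch below is generic and hypothetical, not the specific prior family introduced in the article.

```python
import numpy as np

def log_graph_prior(adj, edge_belief, strength=1.0):
    """Score a candidate directed graph by how well its edges agree with
    prior edge beliefs (a generic concordance-style structure prior).

    adj:         (p, p) 0/1 adjacency matrix of the candidate network
    edge_belief: (p, p) prior probabilities that each edge is present
    strength:    confidence placed in the prior information"""
    eps = 1e-10  # guards log(0) for hard 0/1 beliefs
    agree = (adj * np.log(edge_belief + eps)
             + (1 - adj) * np.log(1 - edge_belief + eps))
    return strength * agree.sum()

# A graph matching the prior scores higher than one contradicting it.
belief = np.array([[0.0, 0.9],
                   [0.1, 0.0]])
print(log_graph_prior(np.array([[0, 1], [0, 0]]), belief))  # ~ -0.21
print(log_graph_prior(np.array([[0, 0], [1, 0]]), belief))  # ~ -4.61
```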

  9. HUMAN DECISIONS AND MACHINE PREDICTIONS.

    PubMed

    Kleinberg, Jon; Lakkaraju, Himabindu; Leskovec, Jure; Ludwig, Jens; Mullainathan, Sendhil

    2018-02-01

    Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; and these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals. JEL Codes: C10 (Econometric and statistical methods and methodology), C55 (Large datasets: Modeling and analysis), K40 (Legal procedure, the legal system, and illegal behavior).

  10. HUMAN DECISIONS AND MACHINE PREDICTIONS

    PubMed Central

    Kleinberg, Jon; Lakkaraju, Himabindu; Leskovec, Jure; Ludwig, Jens; Mullainathan, Sendhil

    2018-01-01

    Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; and these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals. JEL Codes: C10 (Econometric and statistical methods and methodology), C55 (Large datasets: Modeling and analysis), K40 (Legal procedure, the legal system, and illegal behavior) PMID:29755141

  11. Food preparation patterns in German family households. An econometric approach with time budget data.

    PubMed

    Möser, Anke

    2010-08-01

    In Germany, the rising importance of out-of-home consumption, the increasing use of convenience products, and younger individuals' decreasing knowledge of how to prepare traditional dishes can be seen as obvious indicators of shifting patterns in food preparation. In this paper, econometric analyses are used to shed more light on the factors that may influence the time spent on food preparation in two-parent family households with children. Two time budget surveys, carried out in 1991/92 and 2001/02 by the German National Statistical Office, provide the necessary data. The time budget analyses reveal that over the last ten years the time spent on food preparation in Germany has decreased. The results show that the time resources of a household, for example gainful employment of the parents, significantly affect the amount of time spent on food preparation. The analysis further confirms that the allocation of time spent on cooking, baking or laying the table has become more equal between women and men over the last ten years. Due to changing attitudes and, conceivably, adaptation to economic conditions, differences in time devoted to food preparation seem to have vanished between Eastern and Western Germany. Greater time spent on eating out in Germany, together with decreasing time spent on food preparation at home, reveals that the food provisioning of families is no longer a primarily private task of the households themselves but needs more public attention and institutional offers and help. Among other points, the possibility of addressing mothers' lack of time as well as the growing "food illiteracy" of children and young adults is discussed. 2010 Elsevier Ltd. All rights reserved.

  12. Analysis of environmental constraints on expanding reserves in current and future reservoirs in wetlands. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harder, B.J.

    1995-03-01

    Louisiana wetlands require careful management to allow exploitation of non-renewable resources without destroying renewable resources. Current regulatory requirements have been moderately successful in meeting this goal by restricting development in wetland habitats. Continuing public emphasis on reducing the environmental impacts of resource development is causing regulators to reassess their regulations and operators to rethink their compliance strategies. We examined the regulatory system and found that moving to a single application process and providing a coherent map of the steps required for operations in wetland areas would reduce regulatory burdens. Incremental changes can be made to regulations to allow one agency to take the lead for wetland permitting at minimal cost to operators. Operators need cost-effective means of access that will reduce environmental impacts, decrease permitting time, and limit future liability. Regulators and industry must partner to develop incentive-based regulations that can provide significant environmental impact reduction at minimal economic cost. In addition, regulators need forecasts of future E&P trends to estimate the impact of future regulations. To determine future activity, we attempted to survey potential operators; when this approach was unsuccessful, we created two econometric models of north and south Louisiana relating drilling activity, success ratio, and price to predict future wetland activity. Results of the econometric models indicate that environmental regulations have a small but statistically significant effect on drilling operations in wetland areas of Louisiana. We examined current wetland practices, evaluated them by comparing environmental versus economic costs, and created a method for ranking the practices.

  13. Do health care workforce, population, and service provision significantly contribute to the total health expenditure? An econometric analysis of Serbia.

    PubMed

    Santric-Milicevic, M; Vasic, V; Terzic-Supic, Z

    2016-08-15

    In times of austerity, the availability of econometric health knowledge assists policy-makers in understanding and balancing health expenditure with health care plans within fiscal constraints. The objective of this study is to explore whether the health workforce supply of the public health care sector, population number, and utilization of inpatient care significantly contribute to total health expenditure. The dependent variable is the total health expenditure (THE) in Serbia from 2003 to 2011. The independent variables are the number of health workers employed in the public health care sector, population number, and inpatient care discharges per 100 population. The statistical analyses include the quadratic interpolation method, natural logarithm and differentiation, and multiple linear regression analyses. The level of significance is set at P < 0.05. The regression model captures 90 % of the variation in the observed dependent variable (adjusted R squared), and the model is significant (P < 0.001). The growth rate of total health expenditure increased by 1.21 standard deviations for a 1 standard deviation increase in the health workforce growth rate; it decreased by 1.12 standard deviations for a 1 standard deviation increase in the (negative) population growth rate; and it increased by 0.38 standard deviations for a 1 standard deviation increase in the growth rate of inpatient care discharges per 100 population (P < 0.001). The study results demonstrate that the government has been making an effort to strongly control health budget growth. Exploring causality relationships between health expenditure and the health workforce is important for countries that are trying to consolidate their public health finances and achieve universal health coverage at the same time.

  14. Reaction Time in Grade 5: Data Collection within the Practice of Statistics

    ERIC Educational Resources Information Center

    Watson, Jane; English, Lyn

    2017-01-01

    This study reports on a classroom activity for Grade 5 students investigating their reaction times. The investigation was part of a 3-year research project introducing students to informal inference and giving them experience carrying out the practice of statistics. For this activity the focus within the practice of statistics was on introducing…

  15. An Inferentialist Perspective on the Coordination of Actions and Reasons Involved in Making a Statistical Inference

    ERIC Educational Resources Information Center

    Bakker, Arthur; Ben-Zvi, Dani; Makar, Katie

    2017-01-01

    To understand how statistical and other types of reasoning are coordinated with actions to reduce uncertainty, we conducted a case study in vocational education that involved statistical hypothesis testing. We analyzed an intern's research project in a hospital laboratory in which reducing uncertainties was crucial to make a valid statistical…

  16. Causal inference in biology networks with integrated belief propagation.

    PubMed

    Chang, Rui; Karr, Jonathan R; Schadt, Eric E

    2015-01-01

    Inferring causal relationships among molecular and higher order phenotypes is a critical step in elucidating the complexity of living systems. Here we propose a novel method for inferring causality that is no longer constrained by the conditional dependency arguments that limit the ability of statistical causal inference methods to resolve causal relationships within sets of graphical models that are Markov equivalent. Our method utilizes Bayesian belief propagation to infer the responses of perturbation events on molecular traits given a hypothesized graph structure. A distance measure between the inferred response distribution and the observed data is defined to assess the 'fitness' of the hypothesized causal relationships. To test our algorithm, we infer causal relationships within equivalence classes of gene networks, in which the possible functional interactions are assumed to be nonlinear, given synthetic microarray and RNA sequencing data. We also apply our method to infer causality in a real metabolic network with a v-structure and a feedback loop. We show that our method can recapitulate the causal structure and recover the feedback loop from steady-state data alone, which conventional methods cannot.

  17. Gene-network inference by message passing

    NASA Astrophysics Data System (ADS)

    Braunstein, A.; Pagnani, A.; Weigt, M.; Zecchina, R.

    2008-01-01

    The inference of gene-regulatory processes from gene-expression data is one of the major challenges of computational systems biology. Here we address the problem from a statistical-physics perspective and develop a message-passing algorithm which is able to infer sparse, directed and combinatorial regulatory mechanisms. Using the replica technique, the algorithmic performance can be characterized analytically for artificially generated data. The algorithm is applied to genome-wide expression data of baker's yeast under various environmental conditions. We find clear cases of combinatorial control, and enrichment in common functional annotations of regulated genes and their regulators.

  18. Distinguishing between statistical significance and practical/clinical meaningfulness using statistical inference.

    PubMed

    Wilkinson, Michael

    2014-03-01

    Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
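
    The magnitude-based calculation reduces to a normal tail probability: given an effect estimate and its standard error, compute the chance that the true effect exceeds the smallest practically important magnitude. A minimal sketch, assuming a normal sampling distribution for the estimate:

```python
from scipy import stats

def prob_exceeds(estimate, se, threshold):
    """Probability that the true effect exceeds a practically/clinically
    important threshold, given a normal sampling distribution."""
    return 1.0 - stats.norm.cdf(threshold, loc=estimate, scale=se)

# Observed effect 0.4 (SE 0.2); smallest worthwhile effect 0.2.
print(prob_exceeds(0.4, 0.2, 0.2))  # ~0.84: "likely" of practical importance
```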

  19. Bayesian multimodel inference for dose-response studies

    USGS Publications Warehouse

    Link, W.A.; Albers, P.H.

    2007-01-01

    Statistical inference in dose-response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose-response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparverius) exposed to various sublethal dietary concentrations of methylmercury.

  20. Statistical Inference for Big Data Problems in Molecular Biophysics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramanathan, Arvind; Savol, Andrej; Burger, Virginia

    2012-01-01

    We highlight the role of statistical inference techniques in providing biological insights from the analysis of long time-scale molecular simulation data. Technological and algorithmic improvements in computation have brought molecular simulations to the forefront of techniques applied to investigating the basis of living systems. While these longer simulations, increasingly complex and presently reaching petabyte scales, promise a detailed view into microscopic behavior, teasing out the important information has become a true challenge in its own right. Mining this data for important patterns is critical to automating the discovery of therapeutic interventions, improving protein design, and fundamentally understanding the mechanistic basis of cellular homeostasis.

  1. Tropical geometry of statistical models.

    PubMed

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.

  2. ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data.

    PubMed

    Salehi, Sohrab; Steif, Adi; Roth, Andrew; Aparicio, Samuel; Bouchard-Côté, Alexandre; Shah, Sohrab P

    2017-03-01

    Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.

  3. The Heuristic Value of p in Inductive Statistical Inference

    PubMed Central

    Krueger, Joachim I.; Heck, Patrick R.

    2017-01-01

    Many statistical methods yield the probability of the observed data – or data more extreme – under the assumption that a particular hypothesis is true. This probability is commonly known as ‘the’ p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say. PMID:28649206

  4. Subjective randomness as statistical inference.

    PubMed

    Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

    2018-06-01

    Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
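
    The inferential account can be illustrated with a two-model comparison: score a sequence under a fair random generator versus a simple "regular" alternative that tends to repeat. The repetition probability and the choice of regular model below are illustrative assumptions, far simpler than the restricted regularity models developed in the paper.

```python
import numpy as np

def randomness_evidence(seq, p_repeat=0.8):
    """Log-likelihood ratio comparing a fair random generator against a
    'regular' process that repeats the last outcome with probability
    p_repeat. Positive values favor the random process."""
    seq = np.asarray(seq)
    repeats = (seq[1:] == seq[:-1])
    ll_random = len(seq) * np.log(0.5)
    ll_regular = np.log(0.5) + np.sum(
        np.where(repeats, np.log(p_repeat), np.log(1.0 - p_repeat)))
    return ll_random - ll_regular

print(randomness_evidence([1, 1, 1, 1, 1, 1, 1, 1]))  # negative: looks non-random
print(randomness_evidence([1, 0, 0, 1, 0, 1, 1, 0]))  # positive: looks random
```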

  5. The Heuristic Value of p in Inductive Statistical Inference.

    PubMed

    Krueger, Joachim I; Heck, Patrick R

    2017-01-01

    Many statistical methods yield the probability of the observed data - or data more extreme - under the assumption that a particular hypothesis is true. This probability is commonly known as 'the' p -value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p -value has been subjected to much speculation, analysis, and criticism. We explore how well the p -value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p -value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p -value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.

  6. Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin.

    PubMed

    Sagers, Jason D; Knobles, David P

    2014-06-01

    Addressed is the statistical inference of the sound-speed depth profile of a thick soft seabed from broadband sound propagation data recorded in the Gulf of Oman Basin in 1977. The acoustic data are in the form of time series signals recorded on a sparse vertical line array and generated by explosive sources deployed along a 280 km track. The acoustic data offer a unique opportunity to study a deep-water bottom-limited thickly sedimented environment because of the large number of time series measurements, very low seabed attenuation, and auxiliary measurements. A maximum entropy method is employed to obtain a conditional posterior probability distribution (PPD) for the sound-speed ratio and the near-surface sound-speed gradient. The multiple data samples allow for a determination of the average error constraint value required to uniquely specify the PPD for each data sample. Two complicating features of the statistical inference study are addressed: (1) the need to develop an error function that can both utilize the measured multipath arrival structure and mitigate the effects of data errors and (2) the effect of small bathymetric slopes on the structure of the bottom interacting arrivals.

  7. Experimental and environmental factors affect spurious detection of ecological thresholds

    USGS Publications Warehouse

    Daily, Jonathan P.; Hitt, Nathaniel P.; Smith, David; Snyder, Craig D.

    2012-01-01

    Threshold detection methods are increasingly popular for assessing nonlinear responses to environmental change, but their statistical performance remains poorly understood. We simulated linear change in stream benthic macroinvertebrate communities and evaluated the performance of commonly used threshold detection methods based on model fitting (piecewise quantile regression [PQR]), data partitioning (nonparametric change point analysis [NCPA]), and a hybrid approach (significant zero crossings [SiZer]). We demonstrated that false detection of ecological thresholds (type I errors) and inferences on threshold locations are influenced by sample size, rate of linear change, and frequency of observations across the environmental gradient (i.e., sample-environment distribution, SED). However, the relative importance of these factors varied among statistical methods and between inference types. False detection rates were influenced primarily by user-selected parameters for PQR (τ) and SiZer (bandwidth) and secondarily by sample size (for PQR) and SED (for SiZer). In contrast, the location of reported thresholds was influenced primarily by SED. Bootstrapped confidence intervals for NCPA threshold locations revealed strong correspondence to SED. We conclude that the choice of statistical methods for threshold detection should be matched to experimental and environmental constraints to minimize false detection rates and avoid spurious inferences regarding threshold location.

  8. Advances in Bayesian Modeling in Educational Research

    ERIC Educational Resources Information Center

    Levy, Roy

    2016-01-01

    In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…

  9. Statistical Inference for Quality-Adjusted Survival Time

    DTIC Science & Technology

    2003-08-01

    survival functions of QAL. If an influence function for a test statistic exists for complete data case, denoted as ’i, then a test statistic for...the survival function for the censoring variable. Zhao and Tsiatis (2001) proposed a test statistic where O is the influence function of the general...to 1 everywhere until a subject’s death. We have considered other forms of test statistics. One option is to use an influence function 0i that is

  10. Intimate Partner Violence in the United States - 2010

    MedlinePlus

    Table of contents (excerpt): … administration; Statistical testing and inference; Additional methodological information; 2. Prevalence and Frequency of Individual …

  11. Estimating the probability of rare events: addressing zero failure data.

    PubMed

    Quigley, John; Revie, Matthew

    2011-07-01

    Traditional statistical procedures for estimating the probability of an event result in an estimate of zero when no events are realized. Alternative inferential procedures have been proposed for the situation where zero events have been realized, but often these are ad hoc, relying on selecting methods dependent on the data that have been realized. Such data-dependent inference decisions violate fundamental statistical principles, resulting in estimation procedures whose benefits are difficult to assess. In this article, we propose estimating the probability of an event occurring through minimax inference on the probability that future samples of equal size realize no more events than that in the data on which the inference is based. Although motivated by inference on rare events, the method is not restricted to zero event data and closely approximates the maximum likelihood estimate (MLE) for nonzero data. The use of the minimax procedure provides a risk-averse inferential procedure where no events are realized. A comparison is made with the MLE and regions of the underlying probability are identified where this approach is superior. Moreover, a comparison is made with three standard approaches to supporting inference where no event data are realized, which we argue are unduly pessimistic. We show that for situations of zero events the estimator can be simply approximated by 1/(2.5n), where n is the number of trials. © 2011 Society for Risk Analysis.
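
    A sketch of the resulting estimator's behavior, using the 1/(2.5n) approximation reported above for the zero-event case and the MLE otherwise:

```python
def rare_event_estimate(events, n):
    """Probability estimate that falls back to the ~1/(2.5n) minimax
    approximation when no events are observed, and equals the MLE
    otherwise (a sketch of the behavior described in the abstract)."""
    if events == 0:
        return 1.0 / (2.5 * n)
    return events / n  # MLE for nonzero data

print(rare_event_estimate(0, 100))   # 0.004 rather than an estimate of 0
print(rare_event_estimate(3, 100))   # 0.03, the MLE
```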

  12. Learning Quantitative Sequence-Function Relationships from Massively Parallel Experiments

    NASA Astrophysics Data System (ADS)

    Atwal, Gurinder S.; Kinney, Justin B.

    2016-03-01

    A fundamental aspect of biological information processing is the ubiquity of sequence-function relationships—functions that map the sequence of DNA, RNA, or protein to a biochemically relevant activity. Most sequence-function relationships in biology are quantitative, but only recently have experimental techniques for effectively measuring these relationships been developed. The advent of such "massively parallel" experiments presents an exciting opportunity for the concepts and methods of statistical physics to inform the study of biological systems. After reviewing these recent experimental advances, we focus on the problem of how to infer parametric models of sequence-function relationships from the data produced by these experiments. Specifically, we retrace and extend recent theoretical work showing that inference based on mutual information, not the standard likelihood-based approach, is often necessary for accurately learning the parameters of these models. Closely connected with this result is the emergence of "diffeomorphic modes"—directions in parameter space that are far less constrained by data than likelihood-based inference would suggest. Analogous to Goldstone modes in physics, diffeomorphic modes arise from an arbitrarily broken symmetry of the inference problem. An analytically tractable model of a massively parallel experiment is then described, providing an explicit demonstration of these fundamental aspects of statistical inference. This paper concludes with an outlook on the theoretical and computational challenges currently facing studies of quantitative sequence-function relationships.

  13. Statistical primer: how to deal with missing data in scientific research?

    PubMed

    Papageorgiou, Grigorios; Grant, Stuart W; Takkenberg, Johanna J M; Mokhles, Mostafa M

    2018-05-10

    Missing data are a common challenge encountered in research which can compromise the results of statistical inference when not handled appropriately. This paper aims to introduce basic concepts of missing data to a non-statistical audience, to list and compare some of the most popular approaches for handling missing data in practice, and to provide guidelines and recommendations for dealing with and reporting missing data in scientific research. Complete case analysis and single imputation are simple approaches for handling missing data and are popular in practice; however, in most cases they are not guaranteed to provide valid inferences. Multiple imputation is a robust and general alternative which is appropriate for data missing at random, surpassing the disadvantages of the simpler approaches, but it should always be conducted with care. The aforementioned approaches are illustrated and compared in an example application using Cox regression.
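
    A minimal sketch of the contrast between complete-case analysis and multiple imputation, using scikit-learn's IterativeImputer with posterior sampling to produce several stochastic imputations. The paper's own illustration uses Cox regression; here the pooled quantity is just a mean, and Rubin's rules for combining the variances are omitted for brevity.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy data: two correlated covariates with x2 values missing at random.
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=200)
X = np.column_stack([x1, x2])
X[rng.random(200) < 0.3, 1] = np.nan   # ~30% of x2 missing

# Complete-case analysis simply drops rows with any missing value.
complete = X[~np.isnan(X).any(axis=1)]

# Multiple imputation: several stochastic imputations, analyze each, pool.
estimates = []
for seed in range(5):
    imp = IterativeImputer(sample_posterior=True, random_state=seed)
    estimates.append(imp.fit_transform(X)[:, 1].mean())

print("complete-case mean:", complete[:, 1].mean())
print("pooled MI mean:", np.mean(estimates))
```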

  14. Is the P-Value Really Dead? Assessing Inference Learning Outcomes for Social Science Students in an Introductory Statistics Course

    ERIC Educational Resources Information Center

    Lane-Getaz, Sharon

    2017-01-01

    In reaction to misuses and misinterpretations of p-values and confidence intervals, a social science journal editor banned p-values from its pages. This study aimed to show that education could address misuse and abuse. This study examines inference-related learning outcomes for social science students in an introductory course supplemented with…

  15. Back to BaySICS: a user-friendly program for Bayesian Statistical Inference from Coalescent Simulations.

    PubMed

    Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love

    2014-01-01

    Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities that make it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
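
    At its core, the ABC machinery that BaySICS wraps in a graphical interface is a simulate-and-compare loop. The rejection-ABC sketch below uses a deliberately simple, non-genetic toy model (an exponential waiting-time process) to show the structure; BaySICS itself simulates coalescent genealogies from DNA sequence data.

```python
import numpy as np

def rejection_abc(observed, prior_sampler, simulate, summary, eps, n_draws=20_000):
    """Minimal rejection ABC: keep parameter draws whose simulated summary
    statistic falls within eps of the observed one."""
    rng = np.random.default_rng(3)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        if abs(summary(simulate(theta, rng)) - s_obs) < eps:
            accepted.append(theta)
    return np.array(accepted)  # approximate posterior sample

# Example: infer the scale of an exponential waiting-time model.
data = np.random.default_rng(4).exponential(scale=2.0, size=50)
post = rejection_abc(
    observed=data,
    prior_sampler=lambda rng: rng.uniform(0.1, 10.0),          # uniform prior
    simulate=lambda th, rng: rng.exponential(scale=th, size=50),
    summary=np.mean,
    eps=0.1,
)
print(post.mean(), post.size)
```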

  16. STATISTICAL METHODOLOGY FOR THE SIMULTANEOUS ANALYSIS OF MULTIPLE TYPES OF OUTCOMES IN NONLINEAR THRESHOLD MODELS.

    EPA Science Inventory

    Multiple outcomes are often measured on each experimental unit in toxicology experiments. These multiple observations typically imply the existence of correlation between endpoints, and a statistical analysis that incorporates it may result in improved inference. When both disc...

  17. The Love of Large Numbers: A Popularity Bias in Consumer Choice.

    PubMed

    Powell, Derek; Yu, Jingqi; DeWolf, Melissa; Holyoak, Keith J

    2017-10-01

    Social learning – the ability to learn from observing the decisions of other people and the outcomes of those decisions – is fundamental to human evolutionary and cultural success. The Internet now provides social evidence on an unprecedented scale. However, properly utilizing this evidence requires a capacity for statistical inference. We examined how people's interpretation of online review scores is influenced by the number of reviews – a potential indicator both of an item's popularity and of the precision of the average review score. Our task was designed to pit statistical information against social information. We modeled the behavior of an "intuitive statistician" using empirical prior information from millions of reviews posted on Amazon.com and then compared the model's predictions with the behavior of experimental participants. Under certain conditions, people preferred a product with more reviews to one with fewer reviews even though the statistical model indicated that the latter was likely to be of higher quality than the former. Overall, participants' judgments suggested that they failed to make meaningful statistical inferences.
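
    The "intuitive statistician" benchmark amounts to shrinking a product's average score toward a prior mean, with the amount of shrinkage controlled by the number of reviews. The pseudo-count sketch below uses made-up prior values in place of the empirical Amazon prior fitted in the paper.

```python
def adjusted_score(mean_stars, n_reviews, prior_mean=4.2, prior_strength=20):
    """Posterior-mean review score under a pseudo-count prior: scores based
    on few reviews are shrunk toward the prior mean, so a 5.0 from 3 reviews
    can rank below a 4.6 from 500 (prior values are hypothetical)."""
    return ((prior_strength * prior_mean + n_reviews * mean_stars)
            / (prior_strength + n_reviews))

print(adjusted_score(5.0, 3))     # ~4.30: little evidence, heavy shrinkage
print(adjusted_score(4.6, 500))   # ~4.58: many reviews, score mostly trusted
```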

  18. Learning what to expect (in visual perception)

    PubMed Central

    Seriès, Peggy; Seitz, Aaron R.

    2013-01-01

    Expectations are known to greatly affect our experience of the world. A growing theory in computational neuroscience is that perception can be successfully described using Bayesian inference models and that the brain is “Bayes-optimal” under some constraints. In this context, expectations are particularly interesting, because they can be viewed as prior beliefs in the statistical inference process. A number of questions remain unsolved, however, for example: How fast do priors change over time? Are there limits in the complexity of the priors that can be learned? How do an individual’s priors compare to the true scene statistics? Can we unlearn priors that are thought to correspond to natural scene statistics? Where and what are the neural substrate of priors? Focusing on the perception of visual motion, we here review recent studies from our laboratories and others addressing these issues. We discuss how these data on motion perception fit within the broader literature on perceptual Bayesian priors, perceptual expectations, and statistical and perceptual learning and review the possible neural basis of priors. PMID:24187536

  19. Statistical Inference for Porous Materials using Persistent Homology.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moon, Chul; Heath, Jason E.; Mitchell, Scott A.

    2017-12-01

    We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making statistical inferences about the fluid flow and transport properties of porous materials based on their geometry and connectivity.

  20. Philosophy and the practice of Bayesian statistics

    PubMed Central

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2015-01-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575

  1. Philosophy and the practice of Bayesian statistics.

    PubMed

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2013-02-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.

  2. 78 FR 24138 - Implementing Public Safety Broadband Provisions of the Middle Class Tax Relief and Job Creation...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-04-24

    ... Bureau, Statistical Abstract of the United States: 2011, Table 427 (2007). \\28\\ The 2007 U.S Census data.... (U.S. CENSUS BUREAU, STATISTICAL ABSTRACT OF THE UNITED STATES 2011, Table 428.) The criterion by... Statistical Abstract of the U.S., that inference is further supported by the fact that in both Tables, many...

  3. Airfreight forecasting methodology and results

    NASA Technical Reports Server (NTRS)

    1978-01-01

    A series of econometric behavioral equations was developed to explain and forecast the evolution of airfreight traffic demand for the total U.S. domestic airfreight system, the total U.S. international airfreight system, and the total scheduled international cargo traffic carried by the top 44 foreign airlines. The basic explanatory variables used in these macromodels were the real gross national products of the countries involved and a measure of relative transportation costs. The results of the econometric analysis reveal that the models explain more than 99 percent of the historical evolution of freight traffic. The long term traffic forecasts generated with these models are based on scenarios of the likely economic outlook in the United States and 31 major foreign countries.

  4. The effect of relationship status on health with dynamic health and persistent relationships.

    PubMed

    Kohn, Jennifer L; Averett, Susan L

    2014-07-01

    The dynamic evolution of health and persistent relationship status pose econometric challenges to disentangling the causal effect of relationships on health from the selection effect of health on relationship choice. Using a new econometric strategy we find that marriage is not universally better for health. Rather, cohabitation benefits the health of men and women over 45, being never married is no worse for health, and only divorce marginally harms the health of younger men. We find strong evidence that unobservable health-related factors can confound estimates. Our method can be applied to other research questions with dynamic dependent and multivariate endogenous variables. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. Irrigation water demand: A meta-analysis of price elasticities

    NASA Astrophysics Data System (ADS)

    Scheierling, Susanne M.; Loomis, John B.; Young, Robert A.

    2006-01-01

    Metaregression models are estimated to investigate sources of variation in empirical estimates of the price elasticity of irrigation water demand. Elasticity estimates are drawn from 24 studies reported in the United States since 1963, including mathematical programming, field experiments, and econometric studies. The mean price elasticity is 0.48. Long-run elasticities, those that are most useful for policy purposes, are likely larger than the mean estimate. Empirical results suggest that estimates may be more elastic if they are derived from mathematical programming or econometric studies and calculated at a higher irrigation water price. Less elastic estimates are found to be derived from models based on field experiments and in the presence of high-valued crops.
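
    A metaregression of this kind can be sketched as a regression of elasticity estimates on study characteristics. The sketch below assumes statsmodels and uses invented estimates and covariates; it is not the authors' specification or data.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 24                                        # one estimate per study
    econometric = rng.integers(0, 2, n)           # 1 if econometric/programming study
    log_price = rng.normal(3.0, 0.5, n)           # log water price at evaluation point
    elasticity = (0.2 + 0.15 * econometric + 0.08 * log_price
                  + rng.normal(0, 0.1, n))        # synthetic reported elasticities

    X = sm.add_constant(np.column_stack([econometric, log_price]))
    fit = sm.OLS(elasticity, X).fit(cov_type="HC1")   # heteroskedasticity-robust SEs
    print("coefficients:", fit.params.round(3))
    print("std. errors :", fit.bse.round(3))
    ```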

  6. Augmenting Latent Dirichlet Allocation and Rank Threshold Detection with Ontologies

    DTIC Science & Technology

    2010-03-01

    Probabilistic Latent Semantic Indexing (PLSI) is an automated indexing information retrieval model [20]. It is based on a statistical latent class model which is...uses a statistical foundation that is more accurate in finding hidden semantic relationships [20]. The model uses factor analysis of count data, number...principle of statistical infer- ence which asserts that all of the information in a sample is contained in the likelihood function [20]. The statistical

  7. Inference Control Mechanism for Statistical Database: Frequency-Imposed Data Distortions.

    ERIC Educational Resources Information Center

    Liew, Chong K.; And Others

    1985-01-01

    Introduces two data distortion methods (Frequency-Imposed Distortion, Frequency-Imposed Probability Distortion) and uses a Monte Carlo study to compare their performance with that of other distortion methods (Point Distortion, Probability Distortion). Indications that data generated by these two methods produce accurate statistics and protect…

  8. Notes on power of normality tests of error terms in regression models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Střelec, Luboš

    2015-03-10

    Normality is one of the basic assumptions in applying statistical procedures. For example, in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as the t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences; this explains the necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
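
    As a minimal illustration of the kind of check discussed above, the sketch below fits an OLS regression with deliberately heavy-tailed errors and applies two classical normality tests to the residuals (assuming SciPy and statsmodels); the RT class of robust tests from the contribution is not reproduced here.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.uniform(0, 10, 200)
    y = 2.0 + 0.5 * x + rng.standard_t(df=3, size=200)   # heavy-tailed disturbances

    resid = sm.OLS(y, sm.add_constant(x)).fit().resid

    # Small p-values flag non-normal errors, so t/F-based inference may mislead.
    print("Shapiro-Wilk:", stats.shapiro(resid))
    print("Jarque-Bera :", stats.jarque_bera(resid))
    ```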

  9. Learning planar Ising models

    DOE PAGES

    Johnson, Jason K.; Oyen, Diane Adele; Chertkov, Michael; ...

    2016-12-01

    Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus on the class of planar Ising models, for which exact inference is tractable using techniques of statistical physics. Based on these techniques and recent methods for planarity testing and planar embedding, we propose a greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. Finally, we demonstrate our method in simulations and for two applications: modeling senate voting records and identifying geo-chemical depth trends from Mars rover data.
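
    The greedy selection step can be caricatured as follows, assuming NumPy and networkx for planarity testing; the Ising parameter fitting and the paper's actual selection criterion are omitted, with correlation magnitude used as a stand-in score.

    ```python
    import itertools
    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(3)
    X = np.where(rng.normal(size=(500, 8)) > 0, 1, -1)    # 500 samples of 8 spins
    C = np.corrcoef(X, rowvar=False)

    G = nx.Graph()
    G.add_nodes_from(range(8))
    for u, v in sorted(itertools.combinations(range(8), 2),
                       key=lambda e: -abs(C[e])):
        G.add_edge(u, v)                 # tentatively add the strongest remaining edge
        if not nx.check_planarity(G)[0]:
            G.remove_edge(u, v)          # reject edges that break planarity

    print("selected planar graph edges:", sorted(G.edges))
    ```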

  10. Learning planar Ising models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Jason K.; Oyen, Diane Adele; Chertkov, Michael

    Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus on the class of planar Ising models, for which exact inference is tractable using techniques of statistical physics. Based on these techniques and recent methods for planarity testing and planar embedding, we propose a greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. Finally, we demonstrate our method in simulations and for two applications: modeling senate voting records and identifying geo-chemical depth trends from Mars rover data.

  11. An argument for mechanism-based statistical inference in cancer

    PubMed Central

    Ochs, Michael; Price, Nathan D.; Tomasetti, Cristian; Younes, Laurent

    2015-01-01

    Cancer is perhaps the prototypical systems disease, and as such has been the focus of extensive study in quantitative systems biology. However, translating these programs into personalized clinical care remains elusive and incomplete. In this perspective, we argue that realizing this agenda—in particular, predicting disease phenotypes, progression and treatment response for individuals—requires going well beyond standard computational and bioinformatics tools and algorithms. It entails designing global mathematical models over network-scale configurations of genomic states and molecular concentrations, and learning the model parameters from limited available samples of high-dimensional and integrative omics data. As such, any plausible design should accommodate: biological mechanism, necessary for both feasible learning and interpretable decision making; stochasticity, to deal with uncertainty and observed variation at many scales; and a capacity for statistical inference at the patient level. This program, which requires a close, sustained collaboration between mathematicians and biologists, is illustrated in several contexts, including learning bio-markers, metabolism, cell signaling, network inference and tumorigenesis. PMID:25381197

  12. Network inference using informative priors

    PubMed Central

    Mukherjee, Sach; Speed, Terence P.

    2008-01-01

    Recent years have seen much interest in the study of systems characterized by multiple interacting components. A class of statistical models called graphical models, in which graphs are used to represent probabilistic relationships between variables, provides a framework for formal inference regarding such systems. In many settings, the object of inference is the network structure itself. This problem of “network inference” is well known to be a challenging one. However, in scientific settings there is very often existing information regarding network connectivity. A natural idea then is to take account of such information during inference. This article addresses the question of incorporating prior information into network inference. We focus on directed models called Bayesian networks, and use Markov chain Monte Carlo to draw samples from posterior distributions over network structures. We introduce prior distributions on graphs capable of capturing information regarding network features including edges, classes of edges, degree distributions, and sparsity. We illustrate our approach in the context of systems biology, applying our methods to network inference in cancer signaling. PMID:18799736

  13. Bayesian Inference for Functional Dynamics Exploring in fMRI Data.

    PubMed

    Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing

    2016-01-01

    This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.

  14. Modular Spectral Inference Framework Applied to Young Stars and Brown Dwarfs

    NASA Technical Reports Server (NTRS)

    Gully-Santiago, Michael A.; Marley, Mark S.

    2017-01-01

    In practice, synthetic spectral models are imperfect, causing inaccurate estimates of stellar parameters. Using forward modeling and statistical inference, we derive accurate stellar parameters for a given observed spectrum by emulating a grid of precomputed spectra to track uncertainties. We apply spectral inference to brown dwarfs using the newest grid of synthetic spectral models (Marley et al. 1996 and 2014), which spans a massive multi-dimensional parameter space, fitting IGRINS spectra and improving atmospheric models ahead of JWST. When the framework is applied to young stars (~10 Myr) with large starspots, the spots can be measured spectroscopically, especially in the near-IR with IGRINS.

  15. Inference of neuronal network spike dynamics and topology from calcium imaging data

    PubMed Central

    Lütcke, Henry; Gerhard, Felipe; Zenke, Friedemann; Gerstner, Wulfram; Helmchen, Fritjof

    2013-01-01

    Two-photon calcium imaging enables functional analysis of neuronal circuits by inferring action potential (AP) occurrence (“spike trains”) from cellular fluorescence signals. It remains unclear how experimental parameters such as signal-to-noise ratio (SNR) and acquisition rate affect spike inference and whether additional information about network structure can be extracted. Here we present a simulation framework for quantitatively assessing how well spike dynamics and network topology can be inferred from noisy calcium imaging data. For simulated AP-evoked calcium transients in neocortical pyramidal cells, we analyzed the quality of spike inference as a function of SNR and data acquisition rate using a recently introduced peeling algorithm. Given experimentally attainable values of SNR and acquisition rate, neural spike trains could be reconstructed accurately and with up to millisecond precision. We then applied statistical neuronal network models to explore how remaining uncertainties in spike inference affect estimates of network connectivity and topological features of network organization. We define the experimental conditions suitable for inferring whether the network has a scale-free structure and determine how well hub neurons can be identified. Our findings provide a benchmark for future calcium imaging studies that aim to reliably infer neuronal network properties. PMID:24399936

  16. [Simulation and data mining model for identifying and prediction budget changes in the care of patients with hypertension].

    PubMed

    Joyanes-Aguilar, Luis; Castaño, Néstor J; Osorio, José H

    2015-10-01

    Objective To present a simulation model that establishes the economic impact to the health care system produced by the diagnostic evolution of patients suffering from arterial hypertension. Methodology The information used corresponds to that available in Individual Health Records (RIPS, in Spanish). A statistical characterization was carried out and a model for matrix storage in MATLAB was proposed. Data mining was used to create predictors. Finally, a simulation environment was built to determine the economic cost of diagnostic evolution. Results 5.7 % of the population progresses beyond the initial diagnosis, and the associated cost overrun is 43.2 %. Conclusions The results show the applicability of, and the potential for, focusing research on establishing diagnostic relationships using all the information reported in the RIPS in order to create econometric indicators that can determine which diagnostic evolutions are most relevant to budget allocation.

  17. Impacts of Weather Shocks on Murder and Drug Cartel Violence in Mexico

    NASA Astrophysics Data System (ADS)

    Miguel, E.; Hsiang, S. M.; Burke, M.; Gonzalez, F.; Baysan, C.

    2014-12-01

    We estimate impacts of weather shocks on several dimensions of violence in Mexico during 1990-2010, using disaggregated data at the state-by-month level. Controlling for location and time fixed effects, we show that higher than normal temperatures lead to: (i) higher total murder rates, (ii) higher rates of drug cartel related murders, and (iii) higher suicide rates. The effects of high temperatures on inter-personal violence (murders) and on inter-group violence (drug cartel related murders) are large, statistically significant and similar to those found in other recent settings. The use of panel data econometric methods to examine the effect of weather on suicide incidence is novel. We assess the role of economic channels (i.e., agricultural production affected by weather) and conclude that they cannot account for most of the estimated impacts, suggesting that other mechanisms, including psychological explanations, are likely to be important in this setting.
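
    A minimal sketch of the fixed-effects specification described above, on invented state-by-month data and assuming pandas and statsmodels; all variable names and magnitudes are placeholders.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    df = pd.DataFrame({
        "state": np.repeat(np.arange(32), 120),
        "month": np.tile(np.arange(120), 32),
        "temp": rng.normal(22, 4, 32 * 120),
    })
    df["murders"] = 5 + 0.08 * df["temp"] + rng.normal(0, 1, len(df))

    # State and month fixed effects absorb location- and time-specific levels;
    # standard errors are clustered by state.
    fit = smf.ols("murders ~ temp + C(state) + C(month)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["state"]})
    print("temp effect:", fit.params["temp"], "SE:", fit.bse["temp"])
    ```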

  18. Role of non-traditional locations for seasonal flu vaccination: Empirical evidence and evaluation.

    PubMed

    Kim, Namhoon; Mountain, Travis P

    2017-05-19

    This study investigated the role of non-traditional locations in the decision to vaccinate for seasonal flu. We measured individuals' preferred location for seasonal flu vaccination by examining the National H1N1 Flu Survey (NHFS) conducted from late 2009 to early 2010. Our econometric model estimated the probabilities of possible choices by varying individual characteristics, and predicted the way in which the probabilities are expected to change given the specific covariates of interest. From this estimation, we observed that non-traditional locations significantly influenced the vaccination of certain individuals, such as those who are high-income, educated, White, employed, and living in a metropolitan statistical area (MSA), by increasing the coverage. Thus, based on the empirical evidence, our study suggested that supporting non-traditional locations for vaccination could be effective in increasing vaccination coverage. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Clinical judgment research on economic topics: Role of congruence of tasks in clinical practice.

    PubMed

    Huttin, Christine C

    2017-01-01

    This paper discusses what can ensure the performance of judgment studies with an information design that integrates the economics of medical systems, in the context of the digitalization of healthcare. It is part of a series of 5 methodological papers on statistical procedures and problems in implementing judgment research designs and decision models, especially to address cost of care, and ways to measure conversations on cost of care between physicians and patients, with unstructured data such as economic narratives to complement billing and financial information (e.g. cost cognitive cues in conjoint or reversed conjoint designs). The paper discusses how congruence of tasks can increase the reliability of data. It uses some results of two meta-reviews of judgment studies in different fields of application: psychology, business, medical sciences and education. It compares tests for congruence in judgment studies and efficiency tests in econometric studies.

  20. Journal of Air Transportation, Volume 8, No. 2. Volume 8, No. 2

    NASA Technical Reports Server (NTRS)

    Bowen, Brent (Editor); Kabashkin, Igor (Editor); Nickerson, Jocelyn (Editor)

    2003-01-01

    The mission of the Journal of Air Transportation (JAT) is to provide the global community immediate key resource information in all areas of air transportation. This journal contains articles on the following: Fuel Consumption Modeling of a Transport Category Aircraft: A Flight Operations Quality Assurance (FOQA) Analysis; Demand for Air Travel in the United States: Bottom-Up Econometric Estimation and Implications for Forecasts by Origin and Destination Pairs; Blind Flying on the Beam: Aeronautical Communication, Navigation and Surveillance: Its Origins and the Politics of Technology, Part II: Political Oversight and Promotion; Blind Flying on the Beam: Aeronautical Communication, Navigation and Surveillance: Its Origins and the Politics of Technology, Part III: Emerging Technologies; Ethics Education in University Aviation Management Programs in the US, Part Two: Statistical Analysis of Current Practice; Integrating Human Factors into the Human-Computer Interface; and How Best to Display Meteorological Information for Critical Aviation Decision-Making and Performance.

  1. Type Ia Supernova Light Curve Inference: Hierarchical Models for Nearby SN Ia in the Optical and Near Infrared

    NASA Astrophysics Data System (ADS)

    Mandel, Kaisey; Kirshner, R. P.; Narayan, G.; Wood-Vasey, W. M.; Friedman, A. S.; Hicken, M.

    2010-01-01

    I have constructed a comprehensive statistical model for Type Ia supernova light curves spanning optical through near infrared data simultaneously. The near infrared light curves are found to be excellent standard candles (sigma(MH) = 0.11 +/- 0.03 mag) that are less vulnerable to systematic error from dust extinction, a major confounding factor for cosmological studies. A hierarchical statistical framework incorporates coherently multiple sources of randomness and uncertainty, including photometric error, intrinsic supernova light curve variations and correlations, dust extinction and reddening, peculiar velocity dispersion and distances, for probabilistic inference with Type Ia SN light curves. Inferences are drawn from the full probability density over individual supernovae and the SN Ia and dust populations, conditioned on a dataset of SN Ia light curves and redshifts. To compute probabilistic inferences with hierarchical models, I have developed BayeSN, a Markov Chain Monte Carlo algorithm based on Gibbs sampling. This code explores and samples the global probability density of parameters describing individual supernovae and the population. I have applied this hierarchical model to optical and near infrared data of over 100 nearby Type Ia SN from PAIRITEL, the CfA3 sample, and the literature. Using this statistical model, I find that SN with optical and NIR data have a smaller residual scatter in the Hubble diagram than SN with only optical data. The continued study of Type Ia SN in the near infrared will be important for improving their utility as precise and accurate cosmological distance indicators.
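
    Gibbs sampling for hierarchical models alternates draws from each parameter's conditional posterior. The sketch below (NumPy only) does this for a toy two-level normal model; it is a schematic of the approach, not the BayeSN model itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    J, n = 10, 20
    y = rng.normal(0, 1, J)[:, None] + rng.normal(0, 0.5, (J, n))  # J groups, n obs

    sigma2, tau2 = 0.25, 1.0         # known within- and between-group variances
    mu0 = 0.0                        # population mean, given a flat prior
    draws = []
    for it in range(2000):
        # (1) group means | mu0: conjugate normal update per group
        prec = n / sigma2 + 1 / tau2
        mean = (y.sum(axis=1) / sigma2 + mu0 / tau2) / prec
        theta = rng.normal(mean, 1 / np.sqrt(prec))
        # (2) population mean | group means
        mu0 = rng.normal(theta.mean(), np.sqrt(tau2 / J))
        draws.append(mu0)

    print("posterior mean of population mean:", round(np.mean(draws[500:]), 3))
    ```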

  2. Statistical inference involving binomial and negative binomial parameters.

    PubMed

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2009-05-01

    Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.
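
    The testing problem can be sketched with a generic likelihood-ratio test that combines a geometric likelihood (negative binomial sampling up to the first success) with a binomial likelihood for the later trials, assuming SciPy; the exact small-sample tests derived in the paper are not reproduced.

    ```python
    import numpy as np
    from scipy import stats, optimize

    x = 7            # failures before the first success (negative binomial phase)
    n, k = 40, 12    # trials and successes after the first success (binomial phase)

    def nll(p1, p2):
        ll = np.log(p1) + x * np.log1p(-p1)              # geometric log-likelihood
        ll += k * np.log(p2) + (n - k) * np.log1p(-p2)   # binomial kernel
        return -ll

    free = optimize.minimize(lambda q: nll(q[0], q[1]), [0.2, 0.2],
                             bounds=[(1e-6, 1 - 1e-6)] * 2)
    null = optimize.minimize(lambda q: nll(q[0], q[0]), [0.2],
                             bounds=[(1e-6, 1 - 1e-6)])

    lrt = 2 * (null.fun - free.fun)                      # asymptotically chi2(1)
    print("LRT statistic:", round(lrt, 3),
          "p-value:", round(stats.chi2.sf(lrt, df=1), 4))
    ```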

  3. Econophysical visualization of Adam Smith’s invisible hand

    NASA Astrophysics Data System (ADS)

    Cohen, Morrel H.; Eliazar, Iddo I.

    2013-02-01

    Consider a complex system whose macrostate is statistically observable, and yet whose operating mechanism is an unknown black-box. In this paper we address the problem of inferring, from the system’s macrostate statistics, the system’s intrinsic force yielding the observed statistics. The inference is established via two diametrically opposite approaches which result in the very same intrinsic force: a top-down approach based on the notion of entropy, and a bottom-up approach based on the notion of Langevin dynamics. The general results established are applied to the problem of visualizing the intrinsic socioeconomic force, Adam Smith’s invisible hand, shaping the distribution of wealth in human societies. Our analysis yields quantitative econophysical representations of figurative socioeconomic forces, quantitative definitions of “poor” and “rich”, and a quantitative characterization of the “poor-get-poorer” and the “rich-get-richer” phenomena.
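
    The bottom-up Langevin route admits a compact sketch: in one dimension, the stationary density p(x) of overdamped Langevin dynamics satisfies F(x) ∝ d ln p(x)/dx, so an intrinsic force can be read off the estimated log-density gradient. The code below assumes SciPy and uses an invented lognormal 'wealth' sample.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    wealth = rng.lognormal(mean=1.0, sigma=0.6, size=50_000)   # stand-in data

    kde = stats.gaussian_kde(wealth)
    xs = np.linspace(wealth.min(), np.percentile(wealth, 99), 400)
    log_p = np.log(kde(xs))

    # Stationarity of overdamped Langevin dynamics: force ∝ gradient of log-density.
    force = np.gradient(log_p, xs)
    print("inferred force at the median:", np.interp(np.median(wealth), xs, force))
    ```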

  4. Sampling and counting genome rearrangement scenarios

    PubMed Central

    2015-01-01

    Background Even for moderate-size inputs, there are a tremendous number of optimal rearrangement scenarios, regardless of what the model is and which specific question is to be answered. Therefore, giving one optimal solution might be misleading and cannot be used for statistical inference. Statistically well-founded methods are necessary to sample uniformly from the solution space; a small number of samples is then sufficient for statistical inference. Contribution In this paper, we give a mini-review of the state of the art in sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models. Beyond that, we also give a Gibbs sampler for sampling the most parsimonious labeling of evolutionary trees under the SCJ model. The method has been implemented and tested on real-life data. The software package together with example data can be downloaded from http://www.renyi.hu/~miklosi/SCJ-Gibbs/ PMID:26452124

  5. Online Updating of Statistical Inference in the Big Data Setting.

    PubMed

    Schifano, Elizabeth D; Wu, Jing; Wang, Chun; Yan, Jun; Chen, Ming-Hui

    2016-01-01

    We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness-of-fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting.
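
    For linear models, the core online-updating idea reduces to maintaining the sufficient statistics X'X and X'y across arriving blocks, never storing historical rows. A minimal NumPy sketch; the rank-deficiency handling and predictive residual tests from the paper are omitted.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)
    p = 3
    XtX = np.zeros((p, p))
    Xty = np.zeros(p)

    for block in range(100):                       # data arriving in streams
        X = rng.normal(size=(500, p))
        y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=500)
        XtX += X.T @ X                             # accumulate sufficient statistics
        Xty += X.T @ y                             # historical rows are discarded

    beta_hat = np.linalg.solve(XtX, Xty)           # same as full-data OLS
    print("streaming OLS estimate:", beta_hat.round(3))
    ```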

  6. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.

  7. Econometrically calibrated computable general equilibrium models: Applications to the analysis of energy and climate politics

    NASA Astrophysics Data System (ADS)

    Schu, Kathryn L.

    Economy-energy-environment models are the mainstay of economic assessments of policies to reduce carbon dioxide (CO2) emissions, yet their empirical basis is often criticized as being weak. This thesis addresses these limitations by constructing econometrically calibrated models in two policy areas. The first is a 35-sector computable general equilibrium (CGE) model of the U.S. economy which analyzes the uncertain impacts of CO2 emission abatement. Econometric modeling of sectors' nested constant elasticity of substitution (CES) cost functions based on a 45-year price-quantity dataset yields estimates of capital-labor-energy-material input substitution elasticities and biases of technical change that are incorporated into the CGE model. I use the estimated standard errors and variance-covariance matrices to construct the joint distribution of the parameters of the economy's supply side, which I sample to perform Monte Carlo baseline and counterfactual runs of the model. The resulting probabilistic abatement cost estimates highlight the importance of the uncertainty in baseline emissions growth. The second model is an equilibrium simulation of the market for new vehicles which I use to assess the response of vehicle prices, sales and mileage to CO2 taxes and increased corporate average fuel economy (CAFE) standards. I specify an econometric model of a representative consumer's vehicle preferences using a nested CES expenditure function which incorporates mileage and other characteristics in addition to prices, and develop a novel calibration algorithm to link this structure to vehicle model supplies by manufacturers engaged in Bertrand competition. CO2 taxes' effects on gasoline prices reduce vehicle sales and manufacturers' profits if vehicles' mileage is fixed, but these losses shrink once mileage can be adjusted. Accelerated CAFE standards induce manufacturers to pay fines for noncompliance rather than incur the higher costs of radical mileage improvements. Neither policy induces major increases in fuel economy.
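
    The parametric Monte Carlo step described above can be sketched by drawing parameter vectors from their estimated joint normal distribution and re-running a model for each draw. Everything below (assuming NumPy) is an invented placeholder for the actual CGE model and its estimates.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    beta_hat = np.array([0.4, 0.7])                  # point estimates (invented)
    V = np.array([[0.010, 0.002],
                  [0.002, 0.020]])                   # estimated covariance (invented)

    def abatement_cost(beta):
        # Stand-in for one counterfactual run of the CGE model.
        return 100.0 / (beta[0] + beta[1])

    draws = rng.multivariate_normal(beta_hat, V, size=5000)
    costs = np.array([abatement_cost(b) for b in draws])
    print("abatement cost, 5th-95th percentile:",
          np.percentile(costs, [5, 95]).round(1))
    ```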

  8. Probabilistic Signal Recovery and Random Matrices

    DTIC Science & Technology

    2016-12-08

    applications in statistics, biomedical data analysis, quantization, dimension reduction, and network science. 1. High-dimensional inference and geometry Our...low-rank approximation, with applications to community detection in networks, Annals of Statistics 44 (2016), 373–400. [7] C. Le, E. Levina, R...approximation, with applications to community detection in networks, Annals of Statistics 44 (2016), 373–400. C. Le, E. Levina, R. Vershynin, Concentration

  9. The Use of a Context-Based Information Retrieval Technique

    DTIC Science & Technology

    2009-07-01

    provided in context. Latent Semantic Analysis (LSA) is a statistical technique for inferring contextual and structural information, and previous studies...WAIS). 1.4.4 Latent Semantic Analysis LSA, which is also known as latent semantic indexing (LSI), uses a statistical and...1.4.6 Language Models In contrast, natural language models apply algorithms that combine statistical information with semantic information. Semantic

  10. An Introduction to Confidence Intervals for Both Statistical Estimates and Effect Sizes.

    ERIC Educational Resources Information Center

    Capraro, Mary Margaret

    This paper summarizes methods of estimating confidence intervals, including classical intervals and intervals for effect sizes. The recent American Psychological Association (APA) Task Force on Statistical Inference report suggested that confidence intervals should always be reported, and the fifth edition of the APA "Publication Manual"…

  11. Balancing Treatment and Control Groups in Quasi-Experiments: An Introduction to Propensity Scoring

    ERIC Educational Resources Information Center

    Connelly, Brian S.; Sackett, Paul R.; Waters, Shonna D.

    2013-01-01

    Organizational and applied sciences have long struggled with improving causal inference in quasi-experiments. We introduce organizational researchers to propensity scoring, a statistical technique that has become popular in other applied sciences as a means for improving internal validity. Propensity scoring statistically models how individuals in…

  12. Modeling Cross-Situational Word-Referent Learning: Prior Questions

    ERIC Educational Resources Information Center

    Yu, Chen; Smith, Linda B.

    2012-01-01

    Both adults and young children possess powerful statistical computation capabilities--they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of…

  13. Temporal and Statistical Information in Causal Structure Learning

    ERIC Educational Resources Information Center

    McCormack, Teresa; Frosch, Caren; Patrick, Fiona; Lagnado, David

    2015-01-01

    Three experiments examined children's and adults' abilities to use statistical and temporal information to distinguish between common cause and causal chain structures. In Experiment 1, participants were provided with conditional probability information and/or temporal information and asked to infer the causal structure of a 3-variable mechanical…

  14. Secondary Analysis of National Longitudinal Transition Study 2 Data

    ERIC Educational Resources Information Center

    Hicks, Tyler A.; Knollman, Greg A.

    2015-01-01

    This review examines published secondary analyses of National Longitudinal Transition Study 2 (NLTS2) data, with a primary focus upon statistical objectives, paradigms, inferences, and methods. Its primary purpose was to determine which statistical techniques have been common in secondary analyses of NLTS2 data. The review begins with an…

  15. Some General Goals in Teaching Statistics.

    ERIC Educational Resources Information Center

    Blalock, H. M.

    1987-01-01

    States that regardless of the content or level of a statistics course, five goals to reach are: (1) overcoming fears, resistances, and tendencies to memorize; (2) the importance of intellectual honesty and integrity; (3) understanding relationship between deductive and inductive inferences; (4) learning to play role of reasonable critic; and (5)…

  16. Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers

    ERIC Educational Resources Information Center

    Keiffer, Greggory L.; Lane, Forrest C.

    2016-01-01

    Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…

  17. Technology Focus: Using Technology to Explore Statistical Inference

    ERIC Educational Resources Information Center

    Garofalo, Joe; Juersivich, Nicole

    2007-01-01

    There is much research that documents what many teachers know, that students struggle with many concepts in probability and statistics. This article presents two sample activities the authors use to help preservice teachers develop ideas about how they can use technology to promote their students' ability to understand mathematics and connect…

  18. The Impact of an Instructional Intervention Designed to Support Development of Stochastic Understanding of Probability Distribution

    ERIC Educational Resources Information Center

    Conant, Darcy Lynn

    2013-01-01

    Stochastic understanding of probability distribution undergirds development of conceptual connections between probability and statistics and supports development of a principled understanding of statistical inference. This study investigated the impact of an instructional course intervention designed to support development of stochastic…

  19. Basic Statistical Concepts and Methods for Earth Scientists

    USGS Publications Warehouse

    Olea, Ricardo A.

    2008-01-01

    INTRODUCTION Statistics is the science of collecting, analyzing, interpreting, modeling, and displaying masses of numerical data primarily for the characterization and understanding of incompletely known systems. Over the years, these objectives have led to a fair amount of analytical work to achieve, substantiate, and guide descriptions and inferences.

  20. Metacontrast Inferred from Reaction Time and Verbal Report: Replication and Comments on the Feher-Biederman Experiment

    ERIC Educational Resources Information Center

    Amundson, Vickie E.; Bernstein, Ira H.

    1973-01-01

    Authors note that Fehrer and Biederman's two statistical tests were not of equal power and that their conclusion could be a statistical artifact of both the lesser power of the verbal report comparison and the insensitivity of their particular verbal report indicator. (Editor)

  1. Optimism bias leads to inconclusive results - an empirical study

    PubMed Central

    Djulbegovic, Benjamin; Kumar, Ambuj; Magazin, Anja; Schroen, Anneke T.; Soares, Heloisa; Hozo, Iztok; Clarke, Mike; Sargent, Daniel; Schell, Michael J.

    2010-01-01

    Objective: Optimism bias refers to unwarranted belief in the efficacy of new therapies. We assessed the impact of optimism bias on the proportion of trials that did not answer their research question successfully, and explored whether poor accrual or optimism bias is responsible for inconclusive results. Study Design: Systematic review. Setting: Retrospective analysis of a consecutive series of phase III randomized controlled trials (RCTs) performed under the aegis of National Cancer Institute Cooperative groups. Results: 359 trials (374 comparisons) enrolling 150,232 patients were analyzed. 70% (262/374) of the trials generated conclusive results according to the statistical criteria. Investigators made definitive statements related to the treatment preference in 73% (273/374) of studies. Investigators’ judgments and statistical inferences were concordant in 75% (279/374) of trials. Investigators consistently overestimated their expected treatment effects, but to a significantly larger extent for inconclusive trials. The median ratio of expected over observed hazard ratio or odds ratio was 1.34 (range 0.19–15.40) in conclusive trials compared to 1.86 (range 1.09–12.00) in inconclusive studies (p<0.0001). Only 17% of the trials had treatment effects that matched the original researchers’ expectations. Conclusion: Formal statistical inference is sufficient to answer the research question in 75% of RCTs. The answers to the other 25% depend mostly on subjective judgments, which at times are in conflict with statistical inference. Optimism bias significantly contributes to inconclusive results. PMID:21163620

  2. Optimism bias leads to inconclusive results-an empirical study.

    PubMed

    Djulbegovic, Benjamin; Kumar, Ambuj; Magazin, Anja; Schroen, Anneke T; Soares, Heloisa; Hozo, Iztok; Clarke, Mike; Sargent, Daniel; Schell, Michael J

    2011-06-01

    Optimism bias refers to unwarranted belief in the efficacy of new therapies. We assessed the impact of optimism bias on a proportion of trials that did not answer their research question successfully and explored whether poor accrual or optimism bias is responsible for inconclusive results. Systematic review. Retrospective analysis of a consecutive-series phase III randomized controlled trials (RCTs) performed under the aegis of National Cancer Institute Cooperative groups. Three hundred fifty-nine trials (374 comparisons) enrolling 150,232 patients were analyzed. Seventy percent (262 of 374) of the trials generated conclusive results according to the statistical criteria. Investigators made definitive statements related to the treatment preference in 73% (273 of 374) of studies. Investigators' judgments and statistical inferences were concordant in 75% (279 of 374) of trials. Investigators consistently overestimated their expected treatment effects but to a significantly larger extent for inconclusive trials. The median ratio of expected and observed hazard ratio or odds ratio was 1.34 (range: 0.19-15.40) in conclusive trials compared with 1.86 (range: 1.09-12.00) in inconclusive studies (P<0.0001). Only 17% of the trials had treatment effects that matched original researchers' expectations. Formal statistical inference is sufficient to answer the research question in 75% of RCTs. The answers to the other 25% depend mostly on subjective judgments, which at times are in conflict with statistical inference. Optimism bias significantly contributes to inconclusive results. Copyright © 2011 Elsevier Inc. All rights reserved.

  3. Marginal regression approach for additive hazards models with clustered current status data.

    PubMed

    Su, Pei-Fang; Chi, Yunchan

    2014-01-15

    Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics, and demography and sociology studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Because the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of the survival function converges at a rate of n^(1/3) to a complicated limiting distribution. Hence, semiparametric regression models such as the additive hazards model have been extended for independent current status data to derive the test statistics, whose distributions converge at a rate of n^(1/2), for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intracluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model for clustered current status data. The comparative results from simulation studies are presented, and the application of the proposed estimating functions to one real data set is illustrated. Copyright © 2013 John Wiley & Sons, Ltd.

  4. Effects of Platform Design on the Customer Experience in an Online Solar PV Marketplace

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    OShaughnessy, Eric J.; Margolis, Robert M.; Leibowicz, Benjamin

    We analyze a unique dataset of residential solar PV quotes offered in an online marketplace to understand how platform design changes affect customer outcomes. Three of the four design changes are associated with statistically significant and robust reductions in offer prices, though none of the policies were designed explicitly to reduce prices. The results suggest that even small changes in how prospective solar PV customers interact with installers can affect customer outcomes such as prices. Specifically, the four changes we evaluate are: 1) a customer map that shows potential new EnergySage registrants the locations of nearby customers; 2) a quote cap that precludes more than seven installers from bidding on any one customer; 3) a price guidance feature that informs installers about competitive prices in the customer's market before they submit quotes; and 4) no pre-quote messaging to prohibit installers from contacting customers prior to offering quotes. We calculate descriptive statistics to investigate whether each design change accomplished its specific objectives. Then, we econometrically evaluate the impacts of the design changes on PV quote prices and purchase prices using a regression discontinuity approach.
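
    A minimal regression-discontinuity sketch of the evaluation strategy named above, assuming statsmodels: compare quote prices just around a design-change date with a local linear fit on each side of the cutoff. The data and magnitudes are synthetic.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(9)
    t = rng.uniform(-90, 90, 4000)                 # days relative to design change
    treated = (t >= 0).astype(float)
    price = 3.6 - 0.15 * treated + 0.001 * t + rng.normal(0, 0.3, 4000)  # $/W

    bw = 30                                        # bandwidth around the cutoff
    w = np.abs(t) <= bw
    X = sm.add_constant(np.column_stack([treated[w], t[w], treated[w] * t[w]]))
    fit = sm.OLS(price[w], X).fit(cov_type="HC1")
    print("estimated discontinuity ($/W):", fit.params[1].round(3))
    ```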

  5. Using the Lorenz Curve to Characterize Risk Predictiveness and Etiologic Heterogeneity

    PubMed Central

    Mauguen, Audrey; Begg, Colin B.

    2017-01-01

    The Lorenz curve is a graphical tool that is used widely in econometrics. It represents the spread of a probability distribution, and its traditional use has been to characterize population distributions of wealth or income, or more specifically, inequalities in wealth or income. However, its utility in public health research has not been broadly established. The purpose of this article is to explain its special usefulness for characterizing the population distribution of disease risks, and in particular for identifying the precise disease burden that can be predicted to occur in segments of the population that are known to have especially high (or low) risks, a feature that is important for evaluating the yield of screening or other disease prevention initiatives. We demonstrate that, although the Lorenz curve represents the distribution of predicted risks in a population at risk for the disease, in fact it can be estimated from a case–control study conducted in the population without the need for information on absolute risks. We explore two different estimation strategies and compare their statistical properties using simulations. The Lorenz curve is a statistical tool that deserves wider use in public health research. PMID:27096256
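
    The Lorenz curve of predicted risks can be computed directly by sorting the population by risk and accumulating shares, as in the NumPy sketch below; the beta-distributed risks are illustrative and the case-control estimators compared in the paper are not reproduced.

    ```python
    import numpy as np

    rng = np.random.default_rng(10)
    risk = rng.beta(0.5, 20, size=100_000)          # skewed predicted risks

    order = np.argsort(risk)
    cum_pop = np.arange(1, risk.size + 1) / risk.size
    cum_cases = np.cumsum(risk[order]) / risk.sum()

    # E.g., the share of total disease burden carried by the riskiest 10%:
    share_top10 = 1 - np.interp(0.9, cum_pop, cum_cases)
    print(f"top 10% of risks account for {share_top10:.1%} of expected cases")
    ```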

  6. The analysis of factors of management of safety of critical information infrastructure with use of dynamic models

    NASA Astrophysics Data System (ADS)

    Trostyansky, S. N.; Kalach, A. V.; Lavlinsky, V. V.; Lankin, O. V.

    2018-03-01

    Based on the analysis of a dynamic panel-data model by region, which includes fire statistics for supervised sites, a set of regional socio-economic indicators, and the rapid-response times of the state fire service, we estimate the probability of fires at supervised sites and the risk of human death resulting from such fires as functions of the corresponding indicators for the previous year, the regional socio-economic factors, and the regional rapid-response times of the state fire service. The results obtained are consistent with the results of applying the rational-offender model to fire risks. The estimate of the economic equivalent of human life from data on supervised sites in Russia, calculated on the basis of the presented dynamic model of fire risks, agrees with known literature values. The results obtained on the basis of the econometric approach to fire risks allow us to forecast fire risks at supervised sites in the regions of Russia and to develop management solutions to minimize such risks.

  7. Forward and backward inference in spatial cognition.

    PubMed

    Penny, Will D; Zeidman, Peter; Burgess, Neil

    2013-01-01

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.

  8. Forward and Backward Inference in Spatial Cognition

    PubMed Central

    Penny, Will D.; Zeidman, Peter; Burgess, Neil

    2013-01-01

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of ‘lower-level’ computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus. PMID:24348230

  9. Statistical inference approach to structural reconstruction of complex networks from binary time series

    NASA Astrophysics Data System (ADS)

    Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

    2018-02-01

    Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
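
    The EM separation the authors describe, distinguishing probabilities attached to actual versus nonexistent links, can be caricatured by fitting a two-component mixture with EM. The sketch below (assuming SciPy) runs EM on invented one-dimensional pairwise scores; the paper's full likelihood over binary time series is much richer.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(12)
    scores = np.concatenate([rng.normal(0.1, 0.05, 400),   # nonexistent links
                             rng.normal(0.6, 0.05, 100)])  # actual links

    mu = np.array([0.0, 1.0]); sig = np.array([0.2, 0.2]); w = np.array([0.5, 0.5])
    for _ in range(100):
        # E-step: posterior responsibility of each component for each pair.
        dens = w * stats.norm.pdf(scores[:, None], mu, sig)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and standard deviations.
        w = r.mean(axis=0)
        mu = (r * scores[:, None]).sum(axis=0) / r.sum(axis=0)
        sig = np.sqrt((r * (scores[:, None] - mu) ** 2).sum(axis=0) / r.sum(axis=0))

    links = r[:, 1] > 0.5          # pairs assigned to the "actual link" component
    print("inferred link count:", links.sum(), "of", scores.size)
    ```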

  10. A statistical model for interpreting computerized dynamic posturography data

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Metter, E. Jeffrey; Paloski, William H.

    2002-01-01

    Computerized dynamic posturography (CDP) is widely used for assessment of altered balance control. CDP trials are quantified using the equilibrium score (ES), which ranges from zero to 100, as a decreasing function of peak sway angle. The problem of how best to model and analyze ESs from a controlled study is considered. The ES often exhibits a skewed distribution in repeated trials, which can lead to incorrect inference when applying standard regression or analysis of variance models. Furthermore, CDP trials are terminated when a patient loses balance. In these situations, the ES is not observable, but is assigned the lowest possible score--zero. As a result, the response variable has a mixed discrete-continuous distribution, further compromising inference obtained by standard statistical methods. Here, we develop alternative methodology for analyzing ESs under a stochastic model extending the ES to a continuous latent random variable that always exists, but is unobserved in the event of a fall. Loss of balance occurs conditionally, with probability depending on the realized latent ES. After fitting the model by a form of quasi-maximum-likelihood, one may perform statistical inference to assess the effects of explanatory variables. An example is provided, using data from the NIH/NIA Baltimore Longitudinal Study on Aging.
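
    The mixed discrete-continuous likelihood has the flavor of a Tobit model: a latent normal score is observed when no fall occurs, while falls contribute a censoring probability. A minimal quasi-likelihood sketch assuming SciPy, with an invented fall threshold and simulated data.

    ```python
    import numpy as np
    from scipy import stats, optimize

    rng = np.random.default_rng(11)
    latent = rng.normal(70, 12, 400)          # latent equilibrium scores
    fell = latent < 40                        # loss of balance below a threshold
    es = np.where(fell, 0.0, latent)          # falls are recorded as ES = 0

    def negloglik(params):
        mu, sigma = params
        ll = fell.sum() * stats.norm.logcdf((40 - mu) / sigma)  # censored mass
        ll += stats.norm.logpdf(es[~fell], mu, sigma).sum()     # completed trials
        return -ll

    fit = optimize.minimize(negloglik, x0=[60.0, 10.0],
                            bounds=[(1.0, 100.0), (1.0, 50.0)])
    print("quasi-ML estimates (mu, sigma):", fit.x.round(2))
    ```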

  11. Statistical inference of protein structural alignments using information and compression.

    PubMed

    Collier, James H; Allison, Lloyd; Lesk, Arthur M; Stuckey, Peter J; Garcia de la Banda, Maria; Konagurthu, Arun S

    2017-04-01

    Structural molecular biology depends crucially on computational techniques that compare protein three-dimensional structures and generate structural alignments (the assignment of one-to-one correspondences between subsets of amino acids based on atomic coordinates). Despite its importance, the structural alignment problem has not been formulated, much less solved, in a consistent and reliable way. To overcome these difficulties, we present here a statistical framework for the precise inference of structural alignments, built on the Bayesian and information-theoretic principle of Minimum Message Length (MML). The quality of any alignment is measured by its explanatory power: the amount of lossless compression achieved to explain the protein coordinates using that alignment. We have implemented this approach in MMLigner, the first program able to infer statistically significant structural alignments. We also demonstrate the reliability of MMLigner's alignment results when compared with the state of the art. Importantly, MMLigner can also discover different structural alignments of comparable quality, a challenging problem for oligomers and protein complexes. Source code, binaries and an interactive web version are available at http://lcb.infotech.monash.edu.au/mmligner. arun.konagurthu@monash.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  12. Statistical inference with quantum measurements: methodologies for nitrogen vacancy centers in diamond

    NASA Astrophysics Data System (ADS)

    Hincks, Ian; Granade, Christopher; Cory, David G.

    2018-01-01

    The analysis of photon count data from the standard nitrogen vacancy (NV) measurement process is treated as a statistical inference problem. This has applications toward gaining better and more rigorous error bars for tasks such as parameter estimation (e.g. magnetometry), tomography, and randomized benchmarking. We start by providing a summary of the standard phenomenological model of the NV optical process in terms of Lindblad jump operators. This model is used to derive random variables describing emitted photons during measurement, to which finite visibility, dark counts, and imperfect state preparation are added. NV spin-state measurement is then stated as an abstract statistical inference problem consisting of an underlying biased coin obstructed by three Poisson rates. Relevant frequentist and Bayesian estimators are provided, discussed, and quantitatively compared. We show numerically that the risk of the maximum likelihood estimator is well approximated by the Cramér-Rao bound, for which we provide a simple formula. Of the estimators, we in particular promote the Bayes estimator, owing to its slightly better risk performance, and straightforward error propagation into more complex experiments. This is illustrated on experimental data, where quantum Hamiltonian learning is performed and cross-validated in a fully Bayesian setting, and compared to a more traditional weighted least squares fit.
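
    The 'biased coin obstructed by Poisson rates' abstraction admits a compact grid-based Bayes estimator, sketched below assuming SciPy. The two state-dependent photon rates, the flat prior, and the counts are invented, and dark counts and finite visibility are ignored for brevity.

    ```python
    import numpy as np
    from scipy import stats

    alpha0, alpha1 = 40.0, 25.0               # mean photon counts per spin state
    counts = np.array([31, 28, 35, 30, 27])   # repeated measurements (invented)

    p_grid = np.linspace(0, 1, 1001)          # coin bias = probability of state 0
    rate = p_grid[:, None] * alpha0 + (1 - p_grid[:, None]) * alpha1
    loglik = stats.poisson.logpmf(counts[None, :], rate).sum(axis=1)

    post = np.exp(loglik - loglik.max())      # flat prior on p, then normalize
    post /= post.sum()
    print("posterior mean of p:", float(np.sum(p_grid * post)))
    ```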

  13. Statistical inference approach to structural reconstruction of complex networks from binary time series.

    PubMed

    Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

    2018-02-01

    Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.

  14. PREFACE: ELC International Meeting on Inference, Computation, and Spin Glasses (ICSG2013)

    NASA Astrophysics Data System (ADS)

    Kabashima, Yoshiyuki; Hukushima, Koji; Inoue, Jun-ichi; Tanaka, Toshiyuki; Watanabe, Osamu

    2013-12-01

    The close relationship between probability-based inference and the statistical mechanics of disordered systems has been noted for some time. This relationship has provided researchers in various fields of information processing with a theoretical foundation for analytical performance evaluation and for the construction of efficient algorithms based on message-passing or Monte Carlo sampling schemes. The ELC International Meeting on 'Inference, Computation, and Spin Glasses (ICSG2013)' was held in Sapporo, 28-30 July 2013. The meeting was organized as a satellite meeting of STATPHYS25, in order to offer a forum where interested researchers could assemble and exchange information on the latest results and newly established methodologies, and discuss future directions of interdisciplinary studies between statistical mechanics and the information sciences. Financial support from the Grant-in-Aid for Scientific Research on Innovative Areas, MEXT, Japan, 'Exploring the Limits of Computation (ELC)' is gratefully acknowledged. We are pleased to publish 23 papers contributed by invited speakers of ICSG2013 in this volume of Journal of Physics: Conference Series. We hope that this volume will promote further development of this highly vigorous interdisciplinary field between statistical mechanics and information/computer science. Editors and ICSG2013 Organizing Committee: Koji Hukushima, Jun-ichi Inoue (Local Chair of ICSG2013), Yoshiyuki Kabashima (Editor-in-Chief), Toshiyuki Tanaka, Osamu Watanabe (General Chair of ICSG2013)

  15. Measuring the Number of M Dwarfs per M Dwarf Using Kepler Eclipsing Binaries

    NASA Astrophysics Data System (ADS)

    Shan, Yutong; Johnson, John A.; Morton, Timothy D.

    2015-11-01

    We measure the binarity of detached M dwarfs in the Kepler field with orbital periods in the range of 1-90 days. Kepler's photometric precision and nearly continuous monitoring of stellar targets over time baselines ranging from 3 months to 4 years make its detection efficiency for eclipsing binaries nearly complete over this period range and for all radius ratios. Our investigation employs a statistical framework akin to that used for inferring planetary occurrence rates from planetary transits. The obvious simplification is that eclipsing binaries have a vastly improved detection efficiency that is limited chiefly by their geometric probabilities to eclipse. For the M-dwarf sample observed by the Kepler Mission, the fractional incidence of eclipsing binaries implies that there are 0.11 (+0.02, -0.04) close stellar companions per apparently single M dwarf. Our measured binarity is higher than previous inferences of the occurrence rate of close binaries via radial velocity techniques, at roughly the 2σ level. This study represents the first use of eclipsing binary detections from a high-quality transiting planet mission to infer binary statistics. Application of this statistical framework to the eclipsing binaries discovered by future transit surveys will establish better constraints on the short-period M+M binary rate, as well as binarity measurements for stars of other spectral types.
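
    The occurrence-rate logic sketched above can be made concrete in a few lines: if each close companion eclipses with geometric probability p_eclipse, the detected count is approximately binomial, and a Beta posterior on the effective rate converts directly into companions per star. All numbers below are invented for illustration, and the detection model is deliberately oversimplified (one average eclipse probability, perfect detection of eclipsing systems).

    ```python
    import numpy as np
    from scipy.stats import beta

    n_stars = 2600     # apparently single M dwarfs searched (assumed)
    n_eb = 50          # detected eclipsing binaries (assumed)
    p_eclipse = 0.15   # mean geometric eclipse probability ~ (R1+R2)/a (assumed)

    # Detected count ~ Binomial(n_stars, q) with q = f * p_eclipse, where f is
    # the companion rate. A flat prior on q gives a Beta posterior.
    post = beta(n_eb + 1, n_stars - n_eb + 1)
    q_lo, q_med, q_hi = post.ppf([0.16, 0.5, 0.84])
    f_lo, f_med, f_hi = q_lo / p_eclipse, q_med / p_eclipse, q_hi / p_eclipse
    print(f"companions per M dwarf: {f_med:.3f} "
          f"(+{f_hi - f_med:.3f} / -{f_med - f_lo:.3f})")
    ```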

  16. Lifetime Prediction for Degradation of Solar Mirrors using Step-Stress Accelerated Testing (Presentation)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, J.; Elmore, R.; Kennedy, C.

    This research illustrates the use of statistical inference techniques to quantify the uncertainty surrounding reliability estimates in a step-stress accelerated degradation testing (SSADT) scenario. SSADT can be used when a researcher is faced with a resource-constrained environment, e.g., limits on chamber time or on the number of units to test. We apply the SSADT methodology to a degradation experiment involving concentrated solar power (CSP) mirrors and compare the results to a more traditional multiple accelerated testing paradigm. Specifically, our work includes: (1) designing a durability testing plan for solar mirrors (3M's new improved silvered acrylic "Solar Reflector Film (SFM) 1100") using the ultra-accelerated weathering system (UAWS), (2) defining degradation paths of optical performance based on the SSADT model, which is accelerated by high UV-radiant exposure, and (3) developing service lifetime prediction models for solar mirrors using advanced statistical inference. We use the method of least squares to estimate the model parameters, and this serves as the basis for the statistical inference in SSADT. Several quantities of interest can be estimated from this procedure, e.g., mean time to failure (MTTF) and warranty time. The methods allow for the estimation of quantities that may be of interest to the domain scientists.
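
    As a cartoon of the least-squares step described above, the snippet below fits a linear degradation path to hypothetical dose-versus-reflectance-loss data and extrapolates to a failure threshold. Every number, the linear path, the 5% threshold, and the field dose rate are assumptions; a real SSADT analysis models the stress steps explicitly and propagates parameter uncertainty into the MTTF and warranty-time estimates.

    ```python
    import numpy as np

    # Hypothetical step-stress degradation data (made up for illustration):
    # cumulative UV dose vs. loss of specular reflectance.
    dose = np.array([0, 50, 100, 150, 200, 300, 400.0])   # MJ/m^2 UV exposure
    loss = np.array([0.0, 0.4, 0.9, 1.3, 1.9, 2.8, 3.7])  # reflectance loss, %

    # Least-squares fit of a linear degradation path
    slope, intercept = np.polyfit(dose, loss, 1)

    # Extrapolate to a failure threshold, then convert dose to field years
    threshold = 5.0           # % loss defining failure (assumption)
    field_dose_rate = 20.0    # MJ/m^2 of UV per service year (assumption)
    dose_to_failure = (threshold - intercept) / slope
    print(f"predicted lifetime: {dose_to_failure / field_dose_rate:.1f} years")
    ```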

  17. Cancer Survival Estimates Due to Non-Uniform Loss to Follow-Up and Non-Proportional Hazards

    PubMed

    K M, Jagathnath Krishna; Mathew, Aleyamma; Sara George, Preethi

    2017-06-25

    Background: Cancer survival estimates depend on loss to follow-up (LFU) and non-proportional hazards (non-PH). If LFU is high, survival will be over-estimated. If a hazard is non-PH, rank tests will provide biased inference and the Cox model will provide a biased hazard ratio. We assessed the bias due to LFU and a non-PH factor in cancer survival and provide alternate methods for unbiased inference and hazard ratios. Materials and Methods: Kaplan-Meier survival curves were plotted using a realistic breast cancer (BC) data set with >40% 5-year LFU and compared with another BC data set with <15% 5-year LFU, to assess the bias in survival due to high LFU. Age at diagnosis in the latter data set was used to illustrate the bias due to a non-PH factor. The log-rank test was employed to assess the bias in the p-value, and the Cox model was used to assess the bias in the hazard ratio for the non-PH factor. The Schoenfeld statistic was used to test the non-proportionality of age. For the non-PH factor, we employed the Renyi statistic for inference and a time-dependent Cox model for the hazard ratio. Results: Five-year BC survival was 69% (SE: 1.1%) vs. 90% (SE: 0.7%) for the data with low vs. high LFU, respectively. Age (<45, 46-54 and >54 years) was a non-PH factor (p-value: 0.036). Survival by age was significant using the log-rank test (p-value: 0.026) but not using the Renyi statistic (p=0.067). The hazard ratio (HR) for age from the Cox model was 1.012 (95% CI: 1.004-1.019), while the time-dependent Cox model gave an estimate in the other direction (HR: 0.997; 95% CI: 0.997-0.998). Conclusion: Over-estimated survival was observed for cancer data with high LFU. The log-rank statistic and Cox model provided biased results for the non-PH factor. For data with non-PH factors, the Renyi statistic and time-dependent Cox model can be used as alternate methods to obtain unbiased inference and estimates.
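
    The diagnostic part of this workflow can be sketched in Python with the third-party lifelines package (assumed available); the data, censoring pattern, and effect sizes below are simulated, not the authors'. The sketch fits a Kaplan-Meier curve on a cohort with heavy censoring, then fits a Cox model and runs its Schoenfeld-residual-based proportional-hazards check; the Renyi statistic and time-dependent Cox model the authors recommend as remedies are not shown.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import KaplanMeierFitter, CoxPHFitter

    # Simulated toy cohort: heavy loss to follow-up mimicked by early censoring
    rng = np.random.default_rng(42)
    n = 500
    age_grp = rng.integers(0, 3, n)                      # 0: <45, 1: 46-54, 2: >54
    t_event = rng.exponential(60 / (1 + 0.2 * age_grp))  # months to death
    t_lfu = rng.exponential(48, n)                       # months to loss to follow-up
    df = pd.DataFrame({
        "T": np.minimum(t_event, t_lfu).clip(max=60),
        "E": (t_event <= np.minimum(t_lfu, 60)).astype(int),
        "age_grp": age_grp,
    })

    # Kaplan-Meier: with heavy censoring the tail is estimated from ever fewer
    # subjects, which is where the over-estimation bias creeps in.
    kmf = KaplanMeierFitter().fit(df["T"], event_observed=df["E"])
    print(kmf.survival_function_.tail())

    # Cox model plus a Schoenfeld-residual-based proportional-hazards check;
    # a rejection means the constant hazard ratio should not be trusted.
    cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
    cph.print_summary()
    cph.check_assumptions(df, p_value_threshold=0.05)
    ```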

  18. Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef

    Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computational expense. This project aims to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale sequential data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data, and demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.
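
    For context on what objective 1 is accelerating, the baseline it competes with looks like the plain random-walk Metropolis sampler below, applied here to a toy inverse problem (inferring a decay rate from noisy observations). The model, noise level, and step size are invented for illustration; it is exactly this kind of sampler whose cost becomes prohibitive for expensive forward models in high dimensions.

    ```python
    import numpy as np

    def metropolis(log_post, x0, steps=5000, scale=0.05, seed=0):
        """Plain random-walk Metropolis: the unaccelerated baseline."""
        rng = np.random.default_rng(seed)
        x = np.atleast_1d(np.asarray(x0, dtype=float))
        lp = log_post(x)
        chain = np.empty((steps, x.size))
        for i in range(steps):
            prop = x + scale * rng.standard_normal(x.size)
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
                x, lp = prop, lp_prop
            chain[i] = x
        return chain

    # Toy data: exponential decay with rate k_true = 0.8 plus Gaussian noise
    rng = np.random.default_rng(1)
    t = np.linspace(0, 5, 30)
    y = np.exp(-0.8 * t) + 0.05 * rng.standard_normal(t.size)

    def log_post(theta):                 # flat prior on k > 0, Gaussian noise
        k = theta[0]
        if k <= 0:
            return -np.inf
        resid = y - np.exp(-k * t)
        return -0.5 * (resid ** 2).sum() / 0.05 ** 2

    chain = metropolis(log_post, [1.0])
    print(chain[1000:].mean(), chain[1000:].std())   # posterior mean, sd of k
    ```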

  19. Unraveling multiple changes in complex climate time series using Bayesian inference

    NASA Astrophysics Data System (ADS)

    Berner, Nadine; Trauth, Martin H.; Holschneider, Matthias

    2016-04-01

    Change points in time series are perceived as heterogeneities in the statistical or dynamical characteristics of observations. Unraveling such transitions yields essential information for understanding the observed system. The precise detection and basic characterization of underlying changes is therefore of particular importance in the environmental sciences. We present a kernel-based Bayesian inference approach to investigate direct as well as indirect climate observations for multiple generic transition events. In order to develop a diagnostic approach designed to capture a variety of natural processes, the basic statistical features of central tendency and dispersion are used to locally approximate a complex time series by a generic transition model. A Bayesian inversion approach is developed to robustly infer the location and the generic pattern of such a transition. To systematically investigate time series for multiple changes occurring at different temporal scales, the Bayesian inversion is extended to a kernel-based inference approach. By introducing basic kernel measures, the kernel inference results are composed into a proxy for the posterior distribution of multiple transitions. Thus, based on a generic transition model, a probability expression is derived that is capable of indicating multiple changes within a complex time series. We discuss the method's performance by investigating direct and indirect climate observations. The approach is applied to an environmental time series of about 100 years from the weather station in Tuscaloosa, Alabama, and confirms documented instrumentation changes. Moreover, the approach is used to investigate a set of complex terrigenous dust records from ODP sites 659, 721/722 and 967, interpreted as climate indicators of the African region during the Plio-Pleistocene (about 5 Ma). The detailed inference unravels multiple transitions underlying the indirect climate observations, coinciding with established global climate events.
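
    As a much-simplified analogue of the transition inference described above, the sketch below computes a posterior over the location of a single shift in central tendency in a Gaussian series, setting segment means to their MLEs rather than integrating over them. The paper's kernel-based approach handles multiple transitions, changes in dispersion, and different temporal scales, none of which this toy attempts.

    ```python
    import numpy as np

    def changepoint_posterior(x, sigma=1.0):
        """Posterior over the index of a single mean shift (flat prior on the
        location, known noise sigma, segment means set to their MLEs)."""
        n = len(x)
        log_like = np.full(n, -np.inf)
        for k in range(2, n - 2):        # require a few points on each side
            left, right = x[:k], x[k:]
            rss = (((left - left.mean()) ** 2).sum()
                   + ((right - right.mean()) ** 2).sum())
            log_like[k] = -rss / (2 * sigma ** 2)
        post = np.exp(log_like - log_like.max())
        return post / post.sum()

    # Synthetic record with a mean shift at index 120
    rng = np.random.default_rng(3)
    x = np.concatenate([rng.normal(0.0, 1, 120), rng.normal(1.5, 1, 80)])
    post = changepoint_posterior(x)
    print("most probable change point:", post.argmax())
    ```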

  20. Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology

    PubMed Central

    Murakami, Yohei

    2014-01-01

    Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework known as approximate Bayesian computation (ABC) is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific parameter value with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that it can also be used to compute Bayesian posterior distributions in the ABC framework. To deal with the non-identifiability of representative parameter values, we proposed running simulations with a parameter ensemble sampled from the posterior distribution, named the "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm for generating a posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the ABC framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the ABC framework and conduct model selection based on the Bayes factor. PMID:25089832
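
    To make the ABC setting concrete, here is plain rejection ABC producing a "posterior parameter ensemble" for a toy Poisson model; everything in it (the model, summary statistic, and tolerance rule) is invented for illustration. Population annealing improves on this scheme by moving a weighted ensemble through a schedule of decreasing tolerances and yields the marginal likelihood as a by-product, neither of which this sketch attempts.

    ```python
    import numpy as np

    def abc_rejection(observed, simulate, prior_sample, n_draws=20000, q=0.01):
        """Plain rejection ABC: keep the parameter draws whose simulated
        summary statistic lands closest to the observed one."""
        thetas = np.array([prior_sample() for _ in range(n_draws)])
        dists = np.array([abs(simulate(th) - observed) for th in thetas])
        eps = np.quantile(dists, q)    # tolerance = closest 1% of simulations
        return thetas[dists <= eps]    # the "posterior parameter ensemble"

    rng = np.random.default_rng(7)
    observed = 4.2                                    # observed mean count
    simulate = lambda th: rng.poisson(th, 50).mean()  # forward model + summary
    prior_sample = lambda: rng.uniform(0, 10)         # flat prior on the rate

    ensemble = abc_rejection(observed, simulate, prior_sample)
    print(ensemble.mean(), ensemble.std())  # use the whole ensemble downstream,
                                            # not a single representative value
    ```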
