Sample records for missing covariate information

  1. Cox regression analysis with missing covariates via nonparametric multiple imputation.

    PubMed

    Hsu, Chiu-Hsieh; Yu, Mandi

    2018-01-01

    We consider the situation of estimating Cox regression in which some covariates are subject to missingness, and there exists additional information (including the observed event time, censoring indicator and fully observed covariates) that may be predictive of the missing covariates. We propose to use two working regression models: one for predicting the missing covariates and the other for predicting the missingness probabilities. For each missing covariate observation, these two working models are used to define a nearest-neighbor imputing set. This set is then used to non-parametrically impute covariate values for the missing observation. Upon completion of the imputation, Cox regression is performed on the multiply imputed datasets to estimate the regression coefficients. In a simulation study, we compare the nonparametric multiple imputation approach with the augmented inverse probability weighted (AIPW) method, which directly incorporates the two working models into estimation of the Cox regression, and the predictive mean matching (PMM) imputation method. We show that all approaches can reduce bias due to a non-ignorable missingness mechanism. The proposed nonparametric imputation method is robust to misspecification of either one of the two working models and to misspecification of the link function of the two working models. In contrast, the PMM method is sensitive to misspecification of the covariates included in imputation, and the AIPW method is sensitive to the selection probability. We apply the approaches to a breast cancer dataset from the Surveillance, Epidemiology and End Results (SEER) Program.
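
The two-working-model imputation scheme described above can be sketched in a few lines. The following is an illustrative toy version, not the authors' implementation: both working models are taken to be linear here (the paper would use, e.g., a logistic model for missingness), and the names (`nn_multiple_impute`, `n_neighbors`) are ours.

```python
import numpy as np

def nn_multiple_impute(X_obs, z, miss, n_neighbors=5, n_imputations=10, rng=None):
    """Nearest-neighbour multiple imputation via two working models (toy sketch).

    X_obs : (n, p) fully observed covariates
    z     : (n,) covariate with np.nan where missing
    miss  : (n,) boolean mask, True where z is missing
    """
    rng = np.random.default_rng(0) if rng is None else rng
    A = np.column_stack([np.ones(len(z)), X_obs])
    comp = ~miss
    # Working model 1: predict the missing covariate from the observed ones.
    beta, *_ = np.linalg.lstsq(A[comp], z[comp], rcond=None)
    score1 = A @ beta
    # Working model 2: predict the missingness probability (a linear-probability
    # stand-in for a logistic working model).
    gamma, *_ = np.linalg.lstsq(A, miss.astype(float), rcond=None)
    score2 = A @ gamma
    scores = np.column_stack([score1, score2])
    scores = (scores - scores.mean(0)) / (scores.std(0) + 1e-12)  # comparable scales
    imputations = []
    for _ in range(n_imputations):
        z_imp = z.copy()
        for i in np.flatnonzero(miss):
            # Imputing set: the complete cases nearest in the two-score space.
            d = np.linalg.norm(scores[comp] - scores[i], axis=1)
            donors = z[comp][np.argsort(d)[:n_neighbors]]
            z_imp[i] = rng.choice(donors)  # draw nonparametrically from the donors
        imputations.append(z_imp)
    return imputations
```

Because imputed values are drawn from observed donor values rather than from a parametric model, the approach inherits the robustness to link misspecification the abstract describes.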

  2. On analyzing ordinal data when responses and covariates are both missing at random.

    PubMed

    Rana, Subrata; Roy, Surupa; Das, Kalyan

    2016-08-01

    On many occasions, particularly in biomedical studies, data are unavailable for some responses and covariates. This leads to biased inference when a substantial proportion of responses, a covariate, or both are missing. Except in a few situations, methods for missing data have previously been considered either for missing responses or for missing covariates; comparatively little attention has been directed to accounting for both missing responses and missing covariates, which is partly attributable to the complexity of modeling and computation. This matters because the precise impact of substantial missing data depends on the association between the two missing data processes as well. The real difficulty arises when the responses are ordinal in nature. We develop a joint model to take into account simultaneously the association between the ordinal response variable and covariates and that between the missing data indicators. Such a complex model is analyzed here using the Markov chain Monte Carlo approach and the Monte Carlo relative likelihood approach. Their performance in estimating the model parameters in finite samples has been examined. We illustrate the application of these two methods using data from an orthodontic study. Analysis of such data provides some interesting information on human habit.

  3. Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.

    PubMed

    Xie, Yanmei; Zhang, Biao

    2017-04-20

    Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and

  4. Marginalized zero-inflated Poisson models with missing covariates.

    PubMed

    Benecha, Habtamu K; Preisser, John S; Divaris, Kimon; Herring, Amy H; Das, Kalyan

    2018-05-11

    Unlike zero-inflated Poisson regression, marginalized zero-inflated Poisson (MZIP) models for counts with excess zeros provide estimates with direct interpretations for the overall effects of covariates on the marginal mean. In the presence of missing covariates, MZIP and many other count data models are ordinarily fitted using complete case analysis methods due to lack of appropriate statistical methods and software. This article presents an estimation method for MZIP models with missing covariates. The method, which is applicable to other missing data problems, is illustrated and compared with complete case analysis by using simulations and dental data on the caries preventive effects of a school-based fluoride mouthrinse program.

  5. Estimation of covariate-specific time-dependent ROC curves in the presence of missing biomarkers.

    PubMed

    Li, Shanshan; Ning, Yang

    2015-09-01

    Covariate-specific time-dependent ROC curves are often used to evaluate the diagnostic accuracy of a biomarker with time-to-event outcomes, when certain covariates have an impact on the test accuracy. In many medical studies, measurements of biomarkers are subject to missingness due to high cost or limitation of technology. This article considers estimation of covariate-specific time-dependent ROC curves in the presence of missing biomarkers. To incorporate the covariate effect, we assume a proportional hazards model for the failure time given the biomarker and the covariates, and a semiparametric location model for the biomarker given the covariates. In the presence of missing biomarkers, we propose a simple weighted estimator for the ROC curves where the weights are inversely proportional to the selection probability. We also propose an augmented weighted estimator which utilizes information from the subjects with missing biomarkers. The augmented weighted estimator enjoys the double-robustness property in the sense that the estimator remains consistent if either the missing data process or the conditional distribution of the missing data given the observed data is correctly specified. We derive the large sample properties of the proposed estimators and evaluate their finite sample performance using numerical studies. The proposed approaches are illustrated using the US Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset.
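
The simple weighted estimator described above can be illustrated with a toy inverse-probability-weighted mean. This is a minimal sketch under the assumption that the selection probabilities are known; in the paper they would come from a fitted missingness model, and the function name `ipw_mean` is ours.

```python
import numpy as np

def ipw_mean(y, observed, prob_observed):
    """Horvitz-Thompson-style mean of a partially missing variable: complete
    cases are weighted inversely to their selection probability, which
    corrects the bias of a plain complete-case mean."""
    w = observed / prob_observed
    return np.sum(w * np.where(observed, y, 0.0)) / np.sum(w)
```

When larger values of the variable are more likely to be observed, the complete-case mean is biased upward, while the weighted estimator recovers the population mean.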

  6. Local influence for generalized linear models with missing covariates.

    PubMed

    Shi, Xiaoyan; Zhu, Hongtu; Ibrahim, Joseph G

    2009-12-01

    In the analysis of missing data, sensitivity analyses are commonly used to check the sensitivity of the parameters of interest with respect to the missing data mechanism and other distributional and modeling assumptions. In this article, we formally develop a general local influence method to carry out sensitivity analyses of minor perturbations to generalized linear models in the presence of missing covariate data. We examine two types of perturbation schemes (the single-case and global perturbation schemes) for perturbing various assumptions in this setting. We show that the metric tensor of a perturbation manifold provides useful information for selecting an appropriate perturbation. We also develop several local influence measures to identify influential points and test model misspecification. Simulation studies are conducted to evaluate our methods, and real datasets are analyzed to illustrate the use of our local influence measures.

  7. Covariate Selection for Multilevel Models with Missing Data

    PubMed Central

    Marino, Miguel; Buxton, Orfeu M.; Li, Yi

    2017-01-01

    Missing covariate data hampers variable selection in multilevel regression settings. Current variable selection techniques for multiply-imputed data commonly address missingness in the predictors through list-wise deletion and stepwise-selection methods which are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply-imputed data for multilevel random effects models when missing data is present. Specifically, we propose to stack the multiply-imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyze the Healthy Directions-Small Business cancer prevention study, which evaluated a behavioral intervention program targeting multiple risk-related behaviors in a working-class, multi-ethnic population. PMID:28239457
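
The stack-then-select idea can be illustrated as follows. This sketch (our naming) performs only the stacking and a weighted least-squares fit; the paper's actual selection step applies group-lasso regularization to the stacked data, which would replace the plain solve below.

```python
import numpy as np

def stack_and_fit(imputed_Xs, y, ridge=0.0):
    """Stack M multiply-imputed design matrices and fit one weighted model.

    Each imputed copy of the data receives weight 1/M, so the stacked fit
    averages over the imputations; a group penalty on the stacked design
    would then assess each predictor's overall impact across imputations.
    """
    M = len(imputed_Xs)
    Xs = np.vstack(imputed_Xs)
    ys = np.concatenate([y] * M)
    w = 1.0 / M
    XtX = w * Xs.T @ Xs + ridge * np.eye(Xs.shape[1])
    Xty = w * Xs.T @ ys
    return np.linalg.solve(XtX, Xty)
```

Stacking avoids the inconsistency of running selection separately per imputed dataset, where different datasets may select different covariates.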

  8. Model selection for marginal regression analysis of longitudinal data with missing observations and covariate measurement error.

    PubMed

    Shen, Chung-Wei; Chen, Yi-Hau

    2015-10-01

    Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model, accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. In contrast, naive procedures that ignore such complexity in the data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.

  9. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials

    PubMed Central

    Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2016-01-01

    Attrition is a common occurrence in cluster randomised trials, leading to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanism and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation gives unbiased estimates only when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage when the number of clusters in each intervention group is small. PMID:27177885

  10. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.

    PubMed

    Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2017-06-01

    Attrition is a common occurrence in cluster randomised trials, leading to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanism and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation gives unbiased estimates only when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage when the number of clusters in each intervention group is small.
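
The unadjusted cluster-level analysis compared above reduces each cluster to its mean over complete records and contrasts the two arms. A minimal sketch, with our own function name and a simple two-sample standard error treating cluster means as the units of analysis:

```python
import numpy as np

def cluster_level_estimate(outcomes, cluster, arm):
    """Unadjusted cluster-level analysis for a two-arm cluster randomised trial.

    outcomes : per-individual outcome, np.nan for missing (complete records)
    cluster  : per-individual integer cluster id
    arm      : per-cluster 0/1 intervention indicator, indexed by cluster id
    Returns the difference in arm-specific means of the cluster means and a
    two-sample standard error.
    """
    ids = np.unique(cluster)
    means = np.array([np.nanmean(outcomes[cluster == c]) for c in ids])
    g = arm[ids]
    m1, m0 = means[g == 1], means[g == 0]
    diff = m1.mean() - m0.mean()
    se = np.sqrt(m1.var(ddof=1) / len(m1) + m0.var(ddof=1) / len(m0))
    return diff, se
```

Under missingness that differs between arms or interacts with a baseline covariate, this unadjusted estimator is exactly the one the abstract shows to be biased.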

  11. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  12. A stochastic multiple imputation algorithm for missing covariate data in tree-structured survival analysis.

    PubMed

    Wallace, Meredith L; Anderson, Stewart J; Mazumdar, Sati

    2010-12-20

    Missing covariate data present a challenge to tree-structured methodology because a single tree model, as opposed to an estimated parameter value, may be desired for use in a clinical setting. To address this problem, we suggest a multiple imputation algorithm that adds draws of stochastic error to a tree-based single imputation method presented by Conversano and Siciliano (Technical Report, University of Naples, 2003). Unlike previously proposed techniques for accommodating missing covariate data in tree-structured analyses, our methodology allows the modeling of complex and nonlinear covariate structures while still resulting in a single tree model. We perform a simulation study to evaluate our stochastic multiple imputation algorithm when covariate data are missing at random and compare it to other currently used methods. Our algorithm is advantageous for identifying the true underlying covariate structure when complex data and larger percentages of missing covariate observations are present. It is competitive with other current methods with respect to prediction accuracy. To illustrate our algorithm, we create a tree-structured survival model for predicting time to treatment response in older, depressed adults.
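
The core move, turning a deterministic single imputation into multiple imputation by adding draws of stochastic error, can be sketched with a linear imputation model standing in for the paper's tree-based one (the function name and the linear model are our simplifications):

```python
import numpy as np

def stochastic_mi(X, z, miss, n_imputations=10, rng=None):
    """Single regression imputation plus stochastic error draws = multiple
    imputation. X: fully observed covariates; z: covariate with np.nan where
    missing; miss: boolean missingness mask."""
    rng = np.random.default_rng(0) if rng is None else rng
    A = np.column_stack([np.ones(len(z)), X])
    comp = ~miss
    beta, *_ = np.linalg.lstsq(A[comp], z[comp], rcond=None)
    resid = z[comp] - A[comp] @ beta
    sigma = resid.std(ddof=A.shape[1])       # residual scale for the error draws
    out = []
    for _ in range(n_imputations):
        z_m = z.copy()
        # Deterministic prediction plus a fresh stochastic error draw.
        z_m[miss] = A[miss] @ beta + rng.normal(scale=sigma, size=miss.sum())
        out.append(z_m)
    return out
```

The added noise is what restores between-imputation variability, so downstream analyses do not treat imputed values as if they were observed without error.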

  13. Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates.

    PubMed

    Quartagno, M; Carpenter, J R

    2016-07-30

    Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing while allowing (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials.

  14. Sensitivity analysis for missing dichotomous outcome data in multi-visit randomized clinical trial with randomization-based covariance adjustment.

    PubMed

    Li, Siying; Koch, Gary G; Preisser, John S; Lam, Diana; Sanchez-Kam, Matilde

    2017-01-01

    Dichotomous endpoints in clinical trials have only two possible outcomes, either directly or via categorization of an ordinal or continuous observation. It is common to have missing data for one or more visits during a multi-visit study. This paper presents a closed form method for sensitivity analysis of a randomized multi-visit clinical trial that possibly has missing not at random (MNAR) dichotomous data. Counts of missing data are redistributed to the favorable and unfavorable outcomes mathematically to address possibly informative missing data. Adjusted proportion estimates and their closed form covariance matrix estimates are provided. Treatment comparisons over time are addressed with Mantel-Haenszel adjustment for a stratification factor and/or randomization-based adjustment for baseline covariables. The application of such sensitivity analyses is illustrated with an example. An appendix outlines an extension of the methodology to ordinal endpoints.

  15. TRANSPOSABLE REGULARIZED COVARIANCE MODELS WITH AN APPLICATION TO MISSING DATA IMPUTATION

    PubMed Central

    Allen, Genevera I.; Tibshirani, Robert

    2015-01-01

    Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility. PMID:26877823
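
An EM-type imputation algorithm of the kind these models build on can be sketched for the ordinary multivariate (non-transposable) normal model; the transposable row-and-column structure and the inverse-covariance penalties are omitted here, and the function name is ours.

```python
import numpy as np

def em_mvn_impute(X, n_iter=50):
    """EM for a multivariate normal with missing entries (np.nan).
    E-step: fill each missing block with its conditional mean given the
    observed block, accumulating the conditional covariance; M-step: update
    the mean and covariance. Returns (X_imputed, mu, Sigma)."""
    X = X.copy()
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)                      # start from column means
    X[miss] = np.take(mu, np.where(miss)[1])
    Sigma = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        bias = np.zeros((p, p))                     # summed conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            B = Sigma[np.ix_(m, o)] @ np.linalg.inv(Sigma[np.ix_(o, o)])
            X[i, m] = mu[m] + B @ (X[i, o] - mu[o])  # conditional mean fill
            bias[np.ix_(m, m)] += Sigma[np.ix_(m, m)] - B @ Sigma[np.ix_(o, m)]
        mu = X.mean(axis=0)
        Xc = X - mu
        Sigma = (Xc.T @ Xc + bias) / n              # ML covariance update
    return X, mu, Sigma
```

The `bias` term is what distinguishes EM from naive iterated mean-filling: it keeps the covariance estimate from shrinking toward the conditional means.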

  16. TRANSPOSABLE REGULARIZED COVARIANCE MODELS WITH AN APPLICATION TO MISSING DATA IMPUTATION.

    PubMed

    Allen, Genevera I; Tibshirani, Robert

    2010-06-01

    Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.

  17. The competing risks Cox model with auxiliary case covariates under weaker missing-at-random cause of failure.

    PubMed

    Nevo, Daniel; Nishihara, Reiko; Ogino, Shuji; Wang, Molin

    2017-08-04

    In the analysis of time-to-event data with multiple causes using a competing risks Cox model, often the cause of failure is unknown for some of the cases. The probability of a missing cause is typically assumed to be independent of the cause given the time of the event and covariates measured before the event occurred. In practice, however, the underlying missing-at-random assumption does not necessarily hold. Motivated by colorectal cancer molecular pathological epidemiology analysis, we develop a method to conduct valid analysis when additional auxiliary variables are available for cases only. We consider a weaker missing-at-random assumption, with missing pattern depending on the observed quantities, which include the auxiliary covariates. We use an informative likelihood approach that will yield consistent estimates even when the underlying model for missing cause of failure is misspecified. The superiority of our method over naive methods in finite samples is demonstrated by simulation study results. We illustrate the use of our method in an analysis of colorectal cancer data from the Nurses' Health Study cohort, where, apparently, the traditional missing-at-random assumption fails to hold.

  18. Addressing missing covariates for the regression analysis of competing risks: Prognostic modelling for triaging patients diagnosed with prostate cancer.

    PubMed

    Escarela, Gabriel; Ruiz-de-Chavez, Juan; Castillo-Morales, Alberto

    2016-08-01

    Competing risks arise in medical research when subjects are exposed to various types or causes of death. Data from large cohort studies usually exhibit subsets of regressors that are missing for some study subjects. Furthermore, such studies often give rise to censored data. In this article, a carefully formulated likelihood-based technique for the regression analysis of right-censored competing risks data when two of the covariates are discrete and partially missing is developed. The approach envisaged here comprises two models: one describes the covariate effects on both long-term incidence and conditional latencies for each cause of death, whilst the other deals with the observation process by which the covariates are missing. The former is formulated with a well-established mixture model and the latter is characterised by copula-based bivariate probability functions for both the missing covariates and the missing data mechanism. The resulting formulation lends itself to the empirical assessment of non-ignorability by performing sensitivity analyses using models with and without a non-ignorable component. The methods are illustrated on a 20-year follow-up involving a prostate cancer cohort from the National Cancer Institute's Surveillance, Epidemiology, and End Results program.

  19. Causal inference with missing exposure information: Methods and applications to an obstetric study.

    PubMed

    Zhang, Zhiwei; Liu, Wei; Zhang, Bo; Tang, Li; Zhang, Jun

    2016-10-01

    Causal inference in observational studies is frequently challenged by the occurrence of missing data, in addition to confounding. Motivated by the Consortium on Safe Labor, a large observational study of obstetric labor practice and birth outcomes, this article focuses on the problem of missing exposure information in a causal analysis of observational data. This problem can be approached from different angles (i.e. missing covariates and causal inference), and useful methods can be obtained by drawing upon the available techniques and insights in both areas. In this article, we describe and compare a collection of methods based on different modeling assumptions, under standard assumptions for missing data (i.e. missing-at-random and positivity) and for causal inference with complete data (i.e. no unmeasured confounding and another positivity assumption). These methods involve three models: one for treatment assignment, one for the dependence of outcome on treatment and covariates, and one for the missing data mechanism. In general, consistent estimation of causal quantities requires correct specification of at least two of the three models, although there may be some flexibility as to which two models need to be correct. Such flexibility is afforded by doubly robust estimators adapted from the missing covariates literature and the literature on causal inference with complete data, and by a newly developed triply robust estimator that is consistent if any two of the three models are correct. The methods are applied to the Consortium on Safe Labor data and compared in a simulation study mimicking the Consortium on Safe Labor.

  20. Spatio-Temporal Regression Based Clustering of Precipitation Extremes in a Presence of Systematically Missing Covariates

    NASA Astrophysics Data System (ADS)

    Kaiser, Olga; Martius, Olivia; Horenko, Illia

    2017-04-01

    Regression-based Generalized Pareto Distribution (GPD) models are often used to describe the dynamics of hydrological threshold excesses, relying on the explicit availability of all of the relevant covariates. In real applications, however, the complete set of relevant covariates might not be available. In this context, it was shown that under weak assumptions the influence of systematically missing covariates can be reflected by nonstationary and nonhomogeneous dynamics. We present a data-driven, semiparametric and adaptive approach for spatio-temporal regression-based clustering of threshold excesses in the presence of systematically missing covariates. The nonstationary and nonhomogeneous behavior of threshold excesses is described by a set of local stationary GPD models, where the parameters are expressed as regression models, and a non-parametric spatio-temporal hidden switching process. Exploiting the nonparametric Finite Element time-series analysis Methodology (FEM) with Bounded Variation of the model parameters (BV) for resolving the spatio-temporal switching process, the approach goes beyond the strong a priori assumptions made in standard latent class models such as Mixture Models and Hidden Markov Models. Additionally, the presented FEM-BV-GPD provides a pragmatic description of the corresponding spatial dependence structure by grouping together all locations that exhibit similar behavior of the switching process. The performance of the framework is demonstrated on daily accumulated precipitation series over 17 different locations in Switzerland from 1981 to 2013, showing that the introduced approach allows for a better description of the historical data.

  21. Covariance Structure Model Fit Testing under Missing Data: An Application of the Supplemented EM Algorithm

    ERIC Educational Resources Information Center

    Cai, Li; Lee, Taehun

    2009-01-01

    We apply the Supplemented EM algorithm (Meng & Rubin, 1991) to address a chronic problem with the "two-stage" fitting of covariance structure models in the presence of ignorable missing data: the lack of an asymptotically chi-square distributed goodness-of-fit statistic. We show that the Supplemented EM algorithm provides a…

  22. Cox model with interval-censored covariate in cohort studies.

    PubMed

    Ahn, Soohyun; Lim, Johan; Paik, Myunghee Cho; Sacco, Ralph L; Elkind, Mitchell S

    2018-05-18

    In cohort studies the outcome is often time to a particular event, and subjects are followed at regular intervals. Periodic visits may also monitor a secondary irreversible event influencing the event of primary interest, and a significant proportion of subjects develop the secondary event over the period of follow-up. The status of the secondary event serves as a time-varying covariate, but is recorded only at the times of the scheduled visits, generating incomplete time-varying covariates. Whereas information on a typical time-varying covariate is missing for the entire follow-up period except at the visit times, the status of the secondary event is unavailable only between visits at which the status has changed, and is thus interval-censored. One may view the interval-censored secondary event status as a missing time-varying covariate, yet the missingness is only partial, since some information is available throughout the follow-up period. The current practice of using the latest observed status produces biased estimators, and existing missing covariate techniques cannot accommodate the special feature of missingness due to interval censoring. To handle interval-censored covariates in the Cox proportional hazards model, we propose an available-data estimator and a doubly robust-type estimator, as well as the maximum likelihood estimator via the EM algorithm, and present their asymptotic properties. We also present practical approaches that are valid. We demonstrate the proposed methods using our motivating example from the Northern Manhattan Study.

  3. Missing Data Imputation versus Full Information Maximum Likelihood with Second-Level Dependencies

    ERIC Educational Resources Information Center

    Larsen, Ross

    2011-01-01

    Missing data in the presence of upper level dependencies in multilevel models have never been thoroughly examined. Whereas first-level subjects are independent over time, the second-level subjects might exhibit nonzero covariances over time. This study compares 2 missing data techniques in the presence of a second-level dependency: multiple…

  4. Covariance Manipulation for Conjunction Assessment

    NASA Technical Reports Server (NTRS)

    Hejduk, M. D.

    2016-01-01

    The manipulation of space object covariances to try to provide additional or improved information to conjunction risk assessment is not an uncommon practice. Types of manipulation include fabricating a covariance when it is missing or unreliable to force the probability of collision (Pc) to a maximum value ('PcMax'), scaling a covariance to try to improve its realism or see the effect of covariance volatility on the calculated Pc, and constructing the equivalent of an epoch covariance at a convenient future point in the event ('covariance forecasting'). In bringing these methods to bear for Conjunction Assessment (CA) operations, however, some do not remain fully consistent with best practices for conducting risk management, some seem to be of relatively low utility, and some require additional information before they can contribute fully to risk analysis. This study describes some basic principles of modern risk management (following the Kaplan construct) and then examines the PcMax and covariance forecasting paradigms for alignment with these principles; it then further examines the expected utility of these methods in the modern CA framework. Both paradigms are found to be not without utility, but only in situations that are somewhat carefully circumscribed.

  5. Do people treat missing information adaptively when making inferences?

    PubMed

    Garcia-Retamero, Rocio; Rieskamp, Jörg

    2009-10-01

    When making inferences, people are often confronted with situations with incomplete information. Previous research has led to a mixed picture about how people react to missing information. Options include ignoring missing information, treating it as either positive or negative, using the average of past observations for replacement, or using the most frequent observation of the available information as a placeholder. The accuracy of these inference mechanisms depends on characteristics of the environment. When missing information is uniformly distributed, it is most accurate to treat it as the average, whereas when it is negatively correlated with the criterion to be judged, treating missing information as if it were negative is most accurate. Whether people treat missing information adaptively according to the environment was tested in two studies. The results show that participants were sensitive to how missing information was distributed in an environment and most frequently selected the mechanism that was most adaptive. From these results the authors conclude that reacting to missing information in different ways is an adaptive response to environmental characteristics.
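The replacement mechanisms listed above (treating missing information as negative, replacing it with the average of past observations, or with the most frequent observation) can be sketched in a few lines (hypothetical Python; the mechanism names are our labels, not the authors'):

```python
def fill_missing(cues, mechanism):
    """Replace None entries in a list of cue values according to one of
    the mechanisms discussed above (illustrative sketch)."""
    observed = [c for c in cues if c is not None]
    if mechanism == "negative":
        fill = 0.0                                      # treat as negative
    elif mechanism == "average":
        fill = sum(observed) / len(observed)            # average of observed
    elif mechanism == "most_frequent":
        fill = max(set(observed), key=observed.count)   # modal observation
    else:
        raise ValueError(mechanism)
    return [fill if c is None else c for c in cues]

cues = [1, None, 0, 1, None]
print(fill_missing(cues, "most_frequent"))  # [1, 1, 0, 1, 1]
```

Which fill is most accurate depends, as the abstract notes, on how missing information is distributed in the environment.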

  6. A Simple Approach to Inference in Covariance Structure Modeling with Missing Data: Bayesian Analysis. Project 2.4, Quantitative Models To Monitor the Status and Progress of Learning and Performance and Their Antecedents.

    ERIC Educational Resources Information Center

    Muthen, Bengt

    This paper investigates methods that avoid using multiple groups to represent the missing data patterns in covariance structure modeling, attempting instead to do a single-group analysis where the only action the analyst has to take is to indicate that data is missing. A new covariance structure approach developed by B. Muthen and G. Arminger is…

  7. VARIABLE SELECTION FOR REGRESSION MODELS WITH MISSING DATA

    PubMed Central

    Garcia, Ramon I.; Ibrahim, Joseph G.; Zhu, Hongtu

    2009-01-01

We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation (SCAD) penalty and the adaptive LASSO and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. In particular, we propose to use a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial are presented to illustrate the proposed methodology. PMID:20336190
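For concreteness, the SCAD penalty the authors investigate has a standard closed form (Fan and Li's definition, with the conventional default a = 3.7); this sketch shows the penalty alone, not the paper's ICQ-based tuning:

```python
def scad_penalty(theta, lam, a=3.7):
    """Smoothly clipped absolute deviation (SCAD) penalty: linear (like
    the LASSO) near zero, then flattening to a constant so that large
    coefficients are not shrunk -- the source of the oracle property."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    elif t <= a * lam:
        return (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    else:
        return (a + 1) * lam**2 / 2

print(scad_penalty(0.5, lam=1.0))   # 0.5  (linear region)
print(scad_penalty(10.0, lam=1.0))  # 2.35 (constant region: no extra shrinkage)
```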

  8. Semiparametric Bayesian analysis of gene-environment interactions with error in measurement of environmental covariates and missing genetic data.

    PubMed

    Lobach, Iryna; Mallick, Bani; Carroll, Raymond J

    2011-01-01

Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, and long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for the analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve the quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function; therefore, conventional Bayesian techniques may not be technically correct. We propose an approach using Markov chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produces parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma development.

  9. A Cautious Note on Auxiliary Variables That Can Increase Bias in Missing Data Problems.

    PubMed

    Thoemmes, Felix; Rose, Norman

    2014-01-01

The treatment of missing data in the social sciences has changed tremendously during the last decade. Modern missing data techniques such as multiple imputation and full-information maximum likelihood are used much more frequently. These methods assume that data are missing at random. One very common approach to making the missing-at-random assumption more plausible is to include many covariates as so-called auxiliary variables. These variables are either included based on data considerations or in an inclusive fashion; that is, taking all available auxiliary variables. In this article, we point out that there are some instances in which auxiliary variables exhibit the surprising property of increasing bias in missing data problems. In a series of focused simulation studies, we highlight some situations in which this type of biasing behavior can occur. We briefly discuss possible ways to avoid selecting bias-inducing covariates as auxiliary variables.

  10. 19 CFR 201.3a - Missing children information.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 3 2010-04-01 2010-04-01 false Missing children information. 201.3a Section 201.3a Customs Duties UNITED STATES INTERNATIONAL TRADE COMMISSION GENERAL RULES OF GENERAL APPLICATION Miscellaneous § 201.3a Missing children information. (a) Pursuant to 39 U.S.C. 3220, penalty mail sent by the...

  11. 19 CFR 201.3a - Missing children information.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 3 2013-04-01 2013-04-01 false Missing children information. 201.3a Section 201.3a Customs Duties UNITED STATES INTERNATIONAL TRADE COMMISSION GENERAL RULES OF GENERAL APPLICATION Miscellaneous § 201.3a Missing children information. (a) Pursuant to 39 U.S.C. 3220, penalty mail sent by the...

  12. 19 CFR 201.3a - Missing children information.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 3 2014-04-01 2014-04-01 false Missing children information. 201.3a Section 201.3a Customs Duties UNITED STATES INTERNATIONAL TRADE COMMISSION GENERAL RULES OF GENERAL APPLICATION Miscellaneous § 201.3a Missing children information. (a) Pursuant to 39 U.S.C. 3220, penalty mail sent by the...

  13. 19 CFR 201.3a - Missing children information.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 3 2012-04-01 2012-04-01 false Missing children information. 201.3a Section 201.3a Customs Duties UNITED STATES INTERNATIONAL TRADE COMMISSION GENERAL RULES OF GENERAL APPLICATION Miscellaneous § 201.3a Missing children information. (a) Pursuant to 39 U.S.C. 3220, penalty mail sent by the...

  14. 19 CFR 201.3a - Missing children information.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 3 2011-04-01 2011-04-01 false Missing children information. 201.3a Section 201.3a Customs Duties UNITED STATES INTERNATIONAL TRADE COMMISSION GENERAL RULES OF GENERAL APPLICATION Miscellaneous § 201.3a Missing children information. (a) Pursuant to 39 U.S.C. 3220, penalty mail sent by the...

  15. 38 CFR 1.705 - Restrictions on use of missing children information.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... missing children information. 1.705 Section 1.705 Pensions, Bonuses, and Veterans' Relief DEPARTMENT OF VETERANS AFFAIRS GENERAL PROVISIONS Use of Official Mail in the Location and Recovery of Missing Children § 1.705 Restrictions on use of missing children information. Missing children pictures and...

  16. 38 CFR 1.705 - Restrictions on use of missing children information.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... missing children information. 1.705 Section 1.705 Pensions, Bonuses, and Veterans' Relief DEPARTMENT OF VETERANS AFFAIRS GENERAL PROVISIONS Use of Official Mail in the Location and Recovery of Missing Children § 1.705 Restrictions on use of missing children information. Missing children pictures and...

  17. 38 CFR 1.705 - Restrictions on use of missing children information.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... missing children information. 1.705 Section 1.705 Pensions, Bonuses, and Veterans' Relief DEPARTMENT OF VETERANS AFFAIRS GENERAL PROVISIONS Use of Official Mail in the Location and Recovery of Missing Children § 1.705 Restrictions on use of missing children information. Missing children pictures and...

  18. 38 CFR 1.705 - Restrictions on use of missing children information.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... missing children information. 1.705 Section 1.705 Pensions, Bonuses, and Veterans' Relief DEPARTMENT OF VETERANS AFFAIRS GENERAL PROVISIONS Use of Official Mail in the Location and Recovery of Missing Children § 1.705 Restrictions on use of missing children information. Missing children pictures and...

  19. 38 CFR 1.705 - Restrictions on use of missing children information.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... missing children information. 1.705 Section 1.705 Pensions, Bonuses, and Veterans' Relief DEPARTMENT OF VETERANS AFFAIRS GENERAL PROVISIONS Use of Official Mail in the Location and Recovery of Missing Children § 1.705 Restrictions on use of missing children information. Missing children pictures and...

  20. Adaptive Mechanisms for Treating Missing Information: A Simulation Study

    ERIC Educational Resources Information Center

    Garcia-Retamero, Rocio; Rieskamp, Jorg

    2008-01-01

    People often make inferences with incomplete information. Previous research has led to a mixed picture of how people treat missing information. To explain these results, the authors follow the Brunswikian perspective on human inference and hypothesize that the mechanism's accuracy for treating missing information depends on how it is distributed…

  1. Covariance Manipulation for Conjunction Assessment

    NASA Technical Reports Server (NTRS)

    Hejduk, M. D.

    2016-01-01

The use of the probability of collision (Pc) has brought sophistication to CA, made possible by the JSpOC precision catalogue, which provides covariances. Pc has essentially replaced miss distance as the basic CA parameter. The embrace of Pc has elevated methods to 'manipulate' the covariance in order to enable or improve CA calculations. Two such methods are examined here: compensation for absent or unreliable covariances through 'Maximum Pc' calculation constructs, and projection (not propagation) of epoch covariances forward in time to try to enable better risk assessments. Two questions are answered about each: the situations to which such approaches are properly applicable, and the amount of utility that such methods offer.

  2. Modeling Achievement Trajectories when Attrition Is Informative

    ERIC Educational Resources Information Center

    Feldman, Betsy J.; Rabe-Hesketh, Sophia

    2012-01-01

    In longitudinal education studies, assuming that dropout and missing data occur completely at random is often unrealistic. When the probability of dropout depends on covariates and observed responses (called "missing at random" [MAR]), or on values of responses that are missing (called "informative" or "not missing at random" [NMAR]),…

  3. On Obtaining Estimates of the Fraction of Missing Information from Full Information Maximum Likelihood

    ERIC Educational Resources Information Center

    Savalei, Victoria; Rhemtulla, Mijke

    2012-01-01

    Fraction of missing information [lambda][subscript j] is a useful measure of the impact of missing data on the quality of estimation of a particular parameter. This measure can be computed for all parameters in the model, and it communicates the relative loss of efficiency in the estimation of a particular parameter due to missing data. It has…

  4. Quantifying lost information due to covariance matrix estimation in parameter inference

    NASA Astrophysics Data System (ADS)

    Sellentin, Elena; Heavens, Alan F.

    2017-02-01

Parameter inference with an estimated covariance matrix systematically loses information due to the remaining uncertainty of the covariance matrix. Here, we quantify this loss of precision and develop a framework to hypothetically restore it, which allows one to judge how far away a given analysis is from the ideal case of a known covariance matrix. We point out that it is insufficient to estimate this loss by debiasing the Fisher matrix as previously done, due to a fundamental inequality that describes how biases arise in non-linear functions. We therefore develop direct estimators for parameter credibility contours and the figure of merit, finding that significantly fewer simulations than previously thought are sufficient to reach satisfactory precisions. We apply our results to DES Science Verification weak lensing data, detecting a 10 per cent loss of information that increases their credibility contours. No significant loss of information is found for KiDS. For a Euclid-like survey with about 10 nuisance parameters, we find that 2900 simulations are sufficient to limit the systematically lost information to 1 per cent, with an additional uncertainty of about 2 per cent. Without any nuisance parameters, 1900 simulations are sufficient to lose only 1 per cent of information. We further derive estimators for all quantities needed for forecasting with estimated covariance matrices. Our formalism allows one to determine the sweet spot between running sophisticated simulations to reduce the number of nuisance parameters and running as many fast simulations as possible.
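For context, the conventional Fisher-matrix debiasing that the authors argue is insufficient on its own multiplies the inverse sample covariance by the standard correction factor (n_sims − n_data − 2)/(n_sims − 1); a minimal sketch:

```python
def hartlap_factor(n_sims, n_data):
    """Standard correction factor that debiases the inverse of a
    covariance matrix estimated from n_sims simulations of an
    n_data-dimensional data vector (the previously used Fisher-matrix
    debiasing referred to in the abstract)."""
    if n_sims <= n_data + 2:
        raise ValueError("need n_sims > n_data + 2 for an invertible, debiased estimate")
    return (n_sims - n_data - 2) / (n_sims - 1)

# With 2900 simulations and a 100-point data vector the correction is mild:
print(hartlap_factor(2900, 100))
```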

  5. MISSE in the Materials and Processes Technical Information System (MAPTIS )

    NASA Technical Reports Server (NTRS)

    Burns, DeWitt; Finckenor, Miria; Henrie, Ben

    2013-01-01

Materials International Space Station Experiment (MISSE) data is now being collected and distributed through the Materials and Processes Technical Information System (MAPTIS) at Marshall Space Flight Center in Huntsville, Alabama. MISSE data has been instrumental in many programs and continues to be an important source of data for the space community. To facilitate greater access to the MISSE data, the International Space Station (ISS) program office and MAPTIS are working to gather this data into a central location. The MISSE database contains information about materials, samples, and flights, along with pictures, PDFs, Excel files, Word documents, and other file types. Major capabilities of the system are access control, browsing, searching, reports, and record comparison. The search capabilities search within any searchable files, so even if the desired metadata has not been associated, data can still be retrieved. Other functionality will continue to be added to the MISSE database as the Athena Platform is expanded.

  6. Missing in space: an evaluation of imputation methods for missing data in spatial analysis of risk factors for type II diabetes.

    PubMed

    Baker, Jannah; White, Nicole; Mengersen, Kerrie

    2014-11-20

Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. We present a cross-validation approach to select among three imputation methods for health survey data with correlated lifestyle covariates, using, as a case study, type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs). We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. The best choice of imputation method depends upon the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease, with more confidence in the results to inform public policy decision-making.
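A minimal cross-validation-style comparison of mean imputation against a simple regression imputation can be sketched as below (illustrative numpy only; note that in the paper's spatial application mean imputation was selected, whereas in this toy setup the auxiliary covariate is strongly predictive, so regression imputation wins):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                        # fully observed covariate
y = 2.0 * x + rng.normal(scale=0.5, size=n)   # covariate with holes

mask = rng.random(n) < 0.3                    # hold out ~30% of y as "missing"
y_obs = y.copy()
y_obs[mask] = np.nan

# Method 1: mean imputation
mean_fill = np.nanmean(y_obs)
rmse_mean = np.sqrt(np.mean((y[mask] - mean_fill) ** 2))

# Method 2: regression imputation from x, fit on the complete cases
b, a = np.polyfit(x[~mask], y_obs[~mask], 1)
rmse_reg = np.sqrt(np.mean((y[mask] - (b * x[mask] + a)) ** 2))

print(rmse_mean, rmse_reg)  # here regression wins, because y depends on x
```

The cross-validation idea is exactly this held-out comparison: mask known values, impute them with each candidate method, and score against the truth.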

  7. Covariance Function for Nearshore Wave Assimilation Systems

    DTIC Science & Technology

    2018-01-30

    covariance can be modeled by a parameterized Gaussian function, for nearshore wave assimilation applications, the covariance function depends primarily on...case of missing values at the compiled time series, the gaps were filled by weighted interpolation. The weights depend on the number of the...averaging, in order to create the continuous time series, filters out the dependency on the instantaneous meteorological and oceanographic conditions

  8. Missing Data in Clinical Studies: Issues and Methods

    PubMed Central

    Ibrahim, Joseph G.; Chu, Haitao; Chen, Ming-Hui

    2012-01-01

    Missing data are a prevailing problem in any type of data analyses. A participant variable is considered missing if the value of the variable (outcome or covariate) for the participant is not observed. In this article, various issues in analyzing studies with missing data are discussed. Particularly, we focus on missing response and/or covariate data for studies with discrete, continuous, or time-to-event end points in which generalized linear models, models for longitudinal data such as generalized linear mixed effects models, or Cox regression models are used. We discuss various classifications of missing data that may arise in a study and demonstrate in several situations that the commonly used method of throwing out all participants with any missing data may lead to incorrect results and conclusions. The methods described are applied to data from an Eastern Cooperative Oncology Group phase II clinical trial of liver cancer and a phase III clinical trial of advanced non–small-cell lung cancer. Although the main area of application discussed here is cancer, the issues and methods we discuss apply to any type of study. PMID:22649133
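The complete-case pitfall the authors demonstrate, throwing out all participants with any missing data, is easy to reproduce in a toy simulation where missingness depends on an observed covariate (illustrative Python; the setup is our own construction, not the trial data):

```python
import numpy as np

rng = np.random.default_rng(1)
age = rng.uniform(40, 80, size=2000)
response = 0.05 * age + rng.normal(scale=0.2, size=2000)

# Response is missing more often for older participants (MAR given age),
# so complete-case analysis over-represents the young:
missing = rng.random(2000) < (age - 40) / 50

complete_case_mean = response[~missing].mean()
true_mean = response.mean()
print(true_mean, complete_case_mean)  # the complete-case mean is biased low
```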

  9. Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model

    PubMed Central

    Seaman, Shaun R; White, Ian R; Carpenter, James R

    2015-01-01

    Missing covariate data commonly occur in epidemiological and clinical research, and are often dealt with using multiple imputation. Imputation of partially observed covariates is complicated if the substantive model is non-linear (e.g. Cox proportional hazards model), or contains non-linear (e.g. squared) or interaction terms, and standard software implementations of multiple imputation may impute covariates from models that are incompatible with such substantive models. We show how imputation by fully conditional specification, a popular approach for performing multiple imputation, can be modified so that covariates are imputed from models which are compatible with the substantive model. We investigate through simulation the performance of this proposal, and compare it with existing approaches. Simulation results suggest our proposal gives consistent estimates for a range of common substantive models, including models which contain non-linear covariate effects or interactions, provided data are missing at random and the assumed imputation models are correctly specified and mutually compatible. Stata software implementing the approach is freely available. PMID:24525487
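A drastically reduced sketch of fully conditional specification for a single partially observed covariate is below (illustrative numpy; the paper's actual contribution, drawing imputations that are compatible with a non-linear substantive model, is not shown here, and the function name is ours):

```python
import numpy as np

def impute_fcs(x, z, n_draws=5, rng=None):
    """Impute the NaN entries of x from a linear model x ~ z fitted on
    the complete cases, drawing each imputation with residual noise.
    Returns n_draws completed copies of x (a multiple-imputation sketch)."""
    rng = rng or np.random.default_rng()
    obs = ~np.isnan(x)
    b, a = np.polyfit(z[obs], x[obs], 1)
    resid_sd = np.std(x[obs] - (b * z[obs] + a))
    draws = []
    for _ in range(n_draws):
        xi = x.copy()
        xi[~obs] = b * z[~obs] + a + rng.normal(scale=resid_sd, size=(~obs).sum())
        draws.append(xi)
    return draws

rng = np.random.default_rng(2)
z = rng.normal(size=200)
x = z + rng.normal(scale=0.3, size=200)
x[:40] = np.nan                       # first 40 values missing
completed = impute_fcs(x, z, rng=rng)
print(len(completed), np.isnan(completed[0]).any())
```

In full FCS, this regression-then-draw step cycles over every partially observed variable in turn; the paper modifies the draw so it agrees with the substantive analysis model.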

  10. Influence function based variance estimation and missing data issues in case-cohort studies.

    PubMed

    Mark, S D; Katki, H

    2001-12-01

Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design, in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or it may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

  11. 20 CFR 364.3 - Publication of missing children information in the Railroad Retirement Board's in-house...

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 20 Employees' Benefits 1 2013-04-01 2012-04-01 true Publication of missing children information in... THE LOCATION AND RECOVERY OF MISSING CHILDREN § 364.3 Publication of missing children information in the Railroad Retirement Board's in-house publications. (a) All-A-Board. Information about missing...

  12. 20 CFR 364.3 - Publication of missing children information in the Railroad Retirement Board's in-house...

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 20 Employees' Benefits 1 2014-04-01 2012-04-01 true Publication of missing children information in... THE LOCATION AND RECOVERY OF MISSING CHILDREN § 364.3 Publication of missing children information in the Railroad Retirement Board's in-house publications. (a) All-A-Board. Information about missing...

  13. 20 CFR 364.3 - Publication of missing children information in the Railroad Retirement Board's in-house...

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 20 Employees' Benefits 1 2012-04-01 2012-04-01 false Publication of missing children information... THE LOCATION AND RECOVERY OF MISSING CHILDREN § 364.3 Publication of missing children information in the Railroad Retirement Board's in-house publications. (a) All-A-Board. Information about missing...

  14. An Upper Bound on Orbital Debris Collision Probability When Only One Object has Position Uncertainty Information

    NASA Technical Reports Server (NTRS)

    Frisbee, Joseph H., Jr.

    2015-01-01

Upper bounds on high speed satellite collision probability, P (sub c), have been investigated. Previous methods assume that an individual position error covariance matrix is available for each object, with the two matrices combined into a single, relative position error covariance matrix. Components of the combined error covariance are then varied to obtain a maximum P (sub c). If error covariance information for only one of the two objects was available, either some default shape has been used or nothing could be done. An alternative is presented that uses the known covariance information along with a critical value of the missing covariance to obtain an approximate but useful P (sub c) upper bound. There are various avenues along which an upper bound on the high speed satellite collision probability has been pursued. Typically, for the collision plane representation of the high speed collision probability problem, the predicted miss position in the collision plane is assumed fixed. Then the shape (aspect ratio of ellipse), the size (scaling of standard deviations) or the orientation (rotation of ellipse principal axes) of the combined position error ellipse is varied to obtain a maximum P (sub c). Regardless of the exact details of the approach, previously presented methods all assume that an individual position error covariance matrix is available for each object and that the two are combined into a single, relative position error covariance matrix. This combined position error covariance matrix is then modified according to the chosen scheme to arrive at a maximum P (sub c). But what if error covariance information for one of the two objects is not available? When error covariance information for one of the objects is not available, the analyst has commonly defaulted to the situation in which only the relative miss position and velocity are known, without any corresponding state error covariance information. The various usual methods of finding a maximum P (sub c) do
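The collision-plane Pc computation, and the idea of searching over manipulations of a single known covariance for the largest Pc, can be sketched numerically (illustrative Python with grid quadrature; the numbers and the pure-scaling search are our assumptions, not the paper's method):

```python
import numpy as np

def collision_probability(miss, cov, hard_body_radius, n=400):
    """2-D collision-plane Pc: integrate a zero-mean bivariate normal
    (the relative position error) over a disk of radius R centred at
    the predicted miss position. Grid quadrature sketch, not flight code."""
    r = hard_body_radius
    xs = np.linspace(miss[0] - r, miss[0] + r, n)
    ys = np.linspace(miss[1] - r, miss[1] + r, n)
    X, Y = np.meshgrid(xs, ys)
    inside = (X - miss[0]) ** 2 + (Y - miss[1]) ** 2 <= r ** 2
    inv = np.linalg.inv(cov)
    dens = np.exp(-0.5 * (inv[0, 0] * X**2 + 2 * inv[0, 1] * X * Y + inv[1, 1] * Y**2))
    dens /= 2 * np.pi * np.sqrt(np.linalg.det(cov))
    cell = (xs[1] - xs[0]) * (ys[1] - ys[0])
    return float((dens * inside).sum() * cell)

# Search over scalings of one known covariance for the largest Pc --
# the spirit of a 'maximum Pc' construct when the other covariance is missing:
miss = np.array([200.0, 100.0])            # metres, in the collision plane
cov = np.diag([50.0**2, 30.0**2])          # the one available covariance
pcs = {k: collision_probability(miss, k * cov, 20.0) for k in (0.5, 1, 2, 4, 8, 16)}
print(max(pcs, key=pcs.get))
```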

  15. Sleepy driver near-misses may predict accident risks.

    PubMed

    Powell, Nelson B; Schechtman, Kenneth B; Riley, Robert W; Guilleminault, Christian; Chiang, Rayleigh Ping-ying; Weaver, Edward M

    2007-03-01

    To quantify the prevalence of self-reported near-miss sleepy driving accidents and their association with self-reported actual driving accidents. A prospective cross-sectional internet-linked survey on driving behaviors. Dateline NBC News website. Results are given on 35,217 (88% of sample) individuals with a mean age of 37.2 +/- 13 years, 54.8% women, and 87% white. The risk of at least one accident increased monotonically from 23.2% if there were no near-miss sleepy accidents to 44.5% if there were > or = 4 near-miss sleepy accidents (P < 0.0001). After covariate adjustments, subjects who reported at least one near-miss sleepy accident were 1.13 (95% CI, 1.10 to 1.16) times as likely to have reported at least one actual accident as subjects reporting no near-miss sleepy accidents (P < 0.0001). The odds of reporting at least one actual accident in those reporting > or = 4 near-miss sleepy accidents as compared to those reporting no near-miss sleepy accidents was 1.87 (95% CI, 1.64 to 2.14). Furthermore, after adjustments, the summary Epworth Sleepiness Scale (ESS) score had an independent association with having a near-miss or actual accident. An increase of 1 unit of ESS was associated with a covariate adjusted 4.4% increase of having at least one accident (P < 0.0001). A statistically significant dose-response was seen between the numbers of self-reported sleepy near-miss accidents and an actual accident. These findings suggest that sleepy near-misses may be dangerous precursors to an actual accident.

  16. Minimax Rate-optimal Estimation of High-dimensional Covariance Matrices with Incomplete Data*

    PubMed Central

    Cai, T. Tony; Zhang, Anru

    2016-01-01

    Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random model in the sense that the missingness is not dependent on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data. PMID:27777471
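Under MCAR, a natural covariance estimator from incomplete data uses, for each pair of coordinates, only the rows where both entries are observed. A minimal sketch (our own illustration of the idea; the paper's bandable/sparse estimators add regularization on top of such an estimate):

```python
import numpy as np

def cov_incomplete(X):
    """Generalized sample covariance from data with NaNs: entry (j, k)
    is computed from the rows where both column j and column k are
    observed. Unbiased in spirit under MCAR; may not be PSD in general."""
    n, p = X.shape
    S = np.empty((p, p))
    for j in range(p):
        for k in range(p):
            ok = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            xj, xk = X[ok, j], X[ok, k]
            S[j, k] = np.mean((xj - xj.mean()) * (xk - xk.mean()))
    return S

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
Xm = X.copy()
Xm[rng.random(X.shape) < 0.2] = np.nan   # 20% MCAR holes
print(np.round(cov_incomplete(Xm), 2))
```

With no missing entries this reduces to the usual (biased, 1/n) sample covariance.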

  17. Minimax Rate-optimal Estimation of High-dimensional Covariance Matrices with Incomplete Data.

    PubMed

    Cai, T Tony; Zhang, Anru

    2016-09-01

    Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random model in the sense that the missingness is not dependent on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.

  18. Simulation-Extrapolation for Estimating Means and Causal Effects with Mismeasured Covariates

    ERIC Educational Resources Information Center

    Lockwood, J. R.; McCaffrey, Daniel F.

    2015-01-01

    Regression, weighting and related approaches to estimating a population mean from a sample with nonrandom missing data often rely on the assumption that conditional on covariates, observed samples can be treated as random. Standard methods using this assumption generally will fail to yield consistent estimators when covariates are measured with…

  19. A class of covariate-dependent spatiotemporal covariance functions

    PubMed Central

    Reich, Brian J; Eidsvik, Jo; Guindani, Michele; Nail, Amy J; Schmidt, Alexandra M.

    2014-01-01

    In geostatistics, it is common to model spatially distributed phenomena through an underlying stationary and isotropic spatial process. However, these assumptions are often untenable in practice because of the influence of local effects in the correlation structure. Therefore, there has been prolonged interest in the literature in flexible and effective ways to model non-stationarity in the spatial effects. Arguably, due to the local nature of the problem, we might envision that the correlation structure would be highly dependent on local characteristics of the domain of study, namely the latitude, longitude and altitude of the observation sites, as well as other locally defined covariate information. In this work, we provide a flexible and computationally feasible way of allowing the correlation structure of the underlying processes to depend on local covariate information. We discuss the properties of the induced covariance functions and present methods to assess their dependence on local covariate information by means of a simulation study and an analysis of data observed at ozone-monitoring stations in the Southeast United States. PMID:24772199

  20. Analyzing semi-competing risks data with missing cause of informative terminal event.

    PubMed

    Zhou, Renke; Zhu, Hong; Bondy, Melissa; Ning, Jing

    2017-02-28

    Cancer studies frequently yield multiple event times that correspond to landmarks in disease progression, including non-terminal events (i.e., cancer recurrence) and an informative terminal event (i.e., cancer-related death). Hence, we often observe semi-competing risks data. Work on such data has focused on scenarios in which the cause of the terminal event is known. However, in some circumstances, the information on cause for patients who experience the terminal event is missing; consequently, we are not able to differentiate an informative terminal event from a non-informative terminal event. In this article, we propose a method to handle missing data regarding the cause of an informative terminal event when analyzing the semi-competing risks data. We first consider the nonparametric estimation of the survival function for the terminal event time given missing cause-of-failure data via the expectation-maximization algorithm. We then develop an estimation method for semi-competing risks data with missing cause of the terminal event, under a pre-specified semiparametric copula model. We conduct simulation studies to investigate the performance of the proposed method. We illustrate our methodology using data from a study of early-stage breast cancer. Copyright © 2016 John Wiley & Sons, Ltd.

  1. Missing data exploration: highlighting graphical presentation of missing pattern

    PubMed Central

    2015-01-01

    Functions shipped with base R can fulfill many tasks of missing data handling. However, because the data volume of electronic medical record (EMR) systems is often very large, more sophisticated methods may be helpful in data management. This article focuses on missing data handling using advanced techniques. There are three types of missing data: missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR). This classification depends on how the missing values are generated. Two packages, Multivariate Imputation by Chained Equations (MICE) and Visualization and Imputation of Missing Values (VIM), provide sophisticated functions to explore missing data patterns. In particular, the VIM package is especially helpful for visual inspection of missing data. Finally, correlation analysis provides information on the dependence of missing data on other variables. Such information is useful in subsequent imputations. PMID:26807411

  2. Missing data exploration: highlighting graphical presentation of missing pattern.

    PubMed

    Zhang, Zhongheng

    2015-12-01

    Functions shipped with base R can fulfill many tasks of missing data handling. However, because the data volume of electronic medical record (EMR) systems is often very large, more sophisticated methods may be helpful in data management. This article focuses on missing data handling using advanced techniques. There are three types of missing data: missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR). This classification depends on how the missing values are generated. Two packages, Multivariate Imputation by Chained Equations (MICE) and Visualization and Imputation of Missing Values (VIM), provide sophisticated functions to explore missing data patterns. In particular, the VIM package is especially helpful for visual inspection of missing data. Finally, correlation analysis provides information on the dependence of missing data on other variables. Such information is useful in subsequent imputations.
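
The exploration steps described above translate directly into pandas for readers working outside R (MICE and VIM are R packages; the snippet below is only a rough Python analogue, with made-up EMR-style variables):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Toy EMR-style table in which 'lactate' is missing more often for older
# patients (MAR given age), while 'sodium' is missing roughly at random.
n = 1000
age = rng.integers(20, 90, size=n)
df = pd.DataFrame({
    "age": age,
    "lactate": rng.normal(2.0, 0.5, size=n),
    "sodium": rng.normal(140.0, 4.0, size=n),
})
df.loc[rng.random(n) < (age - 20) / 140, "lactate"] = np.nan
df.loc[rng.random(n) < 0.05, "sodium"] = np.nan

# Tabulate missing-data patterns (a text analogue of VIM's aggr() plot).
print(df.isna().sum())                    # per-column missing counts
print(df.isna().value_counts())           # joint missingness patterns

# Correlate a missingness indicator with observed covariates: a clear
# association argues against MCAR and suggests covariates to include in
# subsequent imputation models.
miss_ind = df["lactate"].isna().astype(float)
corr = miss_ind.corr(df["age"])
print(f"corr(lactate missing, age) = {corr:.2f}")
```

Here the positive correlation with age correctly flags that lactate is not MCAR.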

  3. Meta-analysis with missing study-level sample variance data.

    PubMed

    Chowdhry, Amit K; Dworkin, Robert H; McDermott, Michael P

    2016-07-30

    We consider a study-level meta-analysis with a normally distributed outcome variable and possibly unequal study-level variances, where the object of inference is the difference in means between a treatment and control group. A common complication in such an analysis is missing sample variances for some studies. A frequently used approach is to impute the weighted (by sample size) mean of the observed variances (mean imputation). Another approach is to include only those studies with variances reported (complete case analysis). Both mean imputation and complete case analysis are only valid under the missing-completely-at-random assumption, and even then the inverse variance weights produced are not necessarily optimal. We propose a multiple imputation method employing gamma meta-regression to impute the missing sample variances. Our method takes advantage of study-level covariates that may be used to provide information about the missing data. Through simulation studies, we show that multiple imputation, when the imputation model is correctly specified, is superior to competing methods in terms of confidence interval coverage probability and type I error probability when testing a specified group difference. Finally, we describe a similar approach to handling missing variances in cross-over studies. Copyright © 2016 John Wiley & Sons, Ltd.
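
For contrast with the proposed multiple-imputation method, the mean-imputation baseline criticized above takes only a few lines. This sketch uses simulated study summaries; the gamma meta-regression step, which needs study-level covariates, is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)

# Study-level summaries: mean difference d_i, sample size n_i, and a
# sampling variance that is unreported for some studies.
k = 12
n_i = rng.integers(30, 200, size=k)
var_i = 4.0 / n_i                            # true sampling variances
d_i = rng.normal(0.5, np.sqrt(var_i))        # true pooled effect is 0.5

missing = rng.random(k) < 0.3                # ~30% of variances missing
var_obs = np.where(missing, np.nan, var_i)

# Mean imputation: replace missing variances with the sample-size-
# weighted mean of the reported ones (valid only under MCAR, as noted).
w_mean = np.average(var_obs[~missing], weights=n_i[~missing])
var_imp = np.where(missing, w_mean, var_obs)

# Inverse-variance weighted pooled estimate of the mean difference.
w = 1.0 / var_imp
pooled = np.sum(w * d_i) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
print(f"pooled difference: {pooled:.3f} (SE {se:.3f})")
```

The paper's point is that the resulting weights are generally suboptimal even when this baseline is unbiased.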

  4. Impact of Delayed Time to Advanced Imaging on Missed Appointments Across Different Demographic and Socioeconomic Factors.

    PubMed

    Daye, Dania; Carrodeguas, Emmanuel; Glover, McKinley; Guerrier, Claude Emmanuel; Harvey, H Benjamin; Flores, Efrén J

    2018-05-01

    The aim of this study was to investigate the impact of wait days (WDs) on missed outpatient MRI appointments across different demographic and socioeconomic factors. An institutional review board-approved retrospective study was conducted among adult patients scheduled for outpatient MRI during a 12-month period. Scheduling data and demographic information were obtained. Imaging missed appointments were defined as missed scheduled imaging encounters. WDs were defined as the number of days from study order to appointment. Multivariate logistic regression was applied to assess the contribution of race and socioeconomic factors to missed appointments. Linear regression was performed to assess the relationship between missed appointment rates and WDs stratified by race, income, and patient insurance groups with analysis of covariance statistics. A total of 42,727 patients met the inclusion criteria. Mean WDs were 7.95 days. Multivariate regression showed increased odds ratio for missed appointments for patients with increased WDs (7-21 days: odds ratio [OR], 1.39; >21 days: OR, 1.77), African American patients (OR, 1.71), Hispanic patients (OR, 1.30), patients with noncommercial insurance (OR, 2.00-2.55), and those with imaging performed at the main hospital campus (OR, 1.51). Missed appointment rate linearly increased with WDs, with analysis of covariance revealing underrepresented minorities and Medicaid insurance as significant effect modifiers. Increased WDs for advanced imaging significantly increases the likelihood of missed appointments. This effect is most pronounced among underrepresented minorities and patients with lower socioeconomic status. Efforts to reduce WDs may improve equity in access to and utilization of advanced diagnostic imaging for all patients. Copyright © 2018. Published by Elsevier Inc.

  5. Multiple imputation of missing covariates for the Cox proportional hazards cure model

    PubMed Central

    Beesley, Lauren J; Bartlett, Jonathan W; Wolf, Gregory T; Taylor, Jeremy M G

    2016-01-01

    We explore several approaches for imputing partially observed covariates when the outcome of interest is a censored event time and when there is an underlying subset of the population that will never experience the event of interest. We call these subjects “cured,” and we consider the case where the data are modeled using a Cox proportional hazards (CPH) mixture cure model. We study covariate imputation approaches using fully conditional specification (FCS). We derive the exact conditional distribution and suggest a sampling scheme for imputing partially observed covariates in the CPH cure model setting. We also propose several approximations to the exact distribution that are simpler and more convenient to use for imputation. A simulation study demonstrates that the proposed imputation approaches outperform existing imputation approaches for survival data without a cure fraction in terms of bias in estimating CPH cure model parameters. We apply our multiple imputation techniques to a study of patients with head and neck cancer. PMID:27439726

  6. A Probability-Based Statistical Method to Extract Water Body of TM Images with Missing Information

    NASA Astrophysics Data System (ADS)

    Lian, Shizhong; Chen, Jiangping; Luo, Minghai

    2016-06-01

    Water information cannot be accurately extracted from TM images in which true information is lost because of cloud cover and missing data stripes. Because water is continuously distributed under natural conditions, this paper proposes a new water body extraction method based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different kinds of disturbance from clouds and missing data stripes are simulated, and water information is extracted from the simulated images using global histogram matching, local histogram matching, and the probability-based statistical method. Experiments show that this method achieves smaller Areal Error and higher Boundary Recall than the conventional methods.

  7. Treatment of Nuclear Data Covariance Information in Sample Generation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swiler, Laura Painton; Adams, Brian M.; Wieselquist, William

    This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on developing a sampling capability that can handle the challenges of generating samples from nuclear cross-section data. The covariance information between energy groups tends to be very ill-conditioned and thus poses a problem for traditional methods of generating correlated samples. This report outlines a method that addresses sample generation from cross-section covariance matrices.
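
One common way such ill-conditioning is handled, sketched here with a hypothetical near-singular group covariance and not necessarily the report's exact method, is to clip small eigenvalues and sample through the symmetric square root rather than a Cholesky factor:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical ill-conditioned group-to-group covariance: near-unit
# correlations between neighbouring energy groups make the matrix
# almost singular, so a plain Cholesky factorization is fragile.
g = 8                                         # number of energy groups
corr = np.full((g, g), 0.999)
np.fill_diagonal(corr, 1.0)
sigma = np.linspace(0.01, 0.05, g)            # group std deviations
C = np.outer(sigma, sigma) * corr
print(f"condition number: {np.linalg.cond(C):.1e}")

# Clip tiny or negative eigenvalues, then sample with the symmetric
# square root V * sqrt(D) instead of a Cholesky factor.
vals, vecs = np.linalg.eigh(C)
L = vecs @ np.diag(np.sqrt(np.clip(vals, 1e-12, None)))

samples = (L @ rng.standard_normal((g, 10000))).T
emp_cov = np.cov(samples, rowvar=False)
print(f"max abs covariance error: {np.abs(emp_cov - C).max():.2e}")
```

The empirical covariance of the generated samples reproduces the target matrix up to Monte Carlo noise.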

  8. Handling Missing Data With Multilevel Structural Equation Modeling and Full Information Maximum Likelihood Techniques.

    PubMed

    Schminkey, Donna L; von Oertzen, Timo; Bullock, Linda

    2016-08-01

    With increasing access to population-based data and electronic health records for secondary analysis, missing data are common. In the social and behavioral sciences, missing data frequently are handled with multiple imputation methods or full information maximum likelihood (FIML) techniques, but healthcare researchers have not embraced these methodologies to the same extent and more often use either traditional imputation techniques or complete case analysis, which can compromise power and introduce unintended bias. This article is a review of options for handling missing data, concluding with a case study demonstrating the utility of multilevel structural equation modeling using full information maximum likelihood (MSEM with FIML) to handle large amounts of missing data. MSEM with FIML is a parsimonious and hypothesis-driven strategy to cope with large amounts of missing data without compromising power or introducing bias. This technique is relevant for nurse researchers faced with ever-increasing amounts of electronic data and decreasing research budgets. © 2016 Wiley Periodicals, Inc.

  9. Sensitivity Analysis of Multiple Informant Models When Data are Not Missing at Random

    PubMed Central

    Blozis, Shelley A.; Ge, Xiaojia; Xu, Shu; Natsuaki, Misaki N.; Shaw, Daniel S.; Neiderhiser, Jenae; Scaramella, Laura; Leve, Leslie; Reiss, David

    2014-01-01

    Missing data are common in studies that rely on multiple informant data to evaluate relationships among variables for distinguishable individuals clustered within groups. Estimation of structural equation models using raw data allows for incomplete data, and so all groups may be retained even if only one member of a group contributes data. Statistical inference is based on the assumption that data are missing completely at random or missing at random. Importantly, whether or not data are missing is assumed to be independent of the missing data. A saturated correlates model that incorporates correlates of the missingness or the missing data into an analysis and multiple imputation that may also use such correlates offer advantages over the standard implementation of SEM when data are not missing at random because these approaches may result in a data analysis problem for which the missingness is ignorable. This paper considers these approaches in an analysis of family data to assess the sensitivity of parameter estimates to assumptions about missing data, a strategy that may be easily implemented using SEM software. PMID:25221420

  10. Statistical inference for Hardy-Weinberg proportions in the presence of missing genotype information.

    PubMed

    Graffelman, Jan; Sánchez, Milagros; Cook, Samantha; Moreno, Victor

    2013-01-01

    In genetic association studies, tests for Hardy-Weinberg proportions are often employed as a quality control checking procedure. Missing genotypes are typically discarded prior to testing. In this paper we show that inference for Hardy-Weinberg proportions can be biased when missing values are discarded. We propose to use multiple imputation of missing values in order to improve inference for Hardy-Weinberg proportions. For imputation we employ a multinomial logit model that uses information from allele intensities and/or neighbouring markers. Analysis of an empirical data set of single nucleotide polymorphisms possibly related to colon cancer reveals that missing genotypes are not missing completely at random. Deviation from Hardy-Weinberg proportions is mostly due to a lack of heterozygotes. Inbreeding coefficients estimated by multiple imputation of the missings are typically lowered with respect to inbreeding coefficients estimated by discarding the missings. Accounting for missings by multiple imputation qualitatively changed the results of 10 to 17% of the statistical tests performed. Estimates of inbreeding coefficients obtained by multiple imputation showed high correlation with estimates obtained by single imputation using an external reference panel. Our conclusion is that imputation of missing data leads to improved statistical inference for Hardy-Weinberg proportions.
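
As background for the record above, the standard complete-case chi-square test for Hardy-Weinberg proportions, and the inbreeding coefficient it examines, can be computed as follows (genotype counts are hypothetical):

```python
import numpy as np

# Hypothetical genotype counts (AA, Aa, aa) for one SNP after discarding
# missing genotypes, as in the usual complete-case procedure.
n_AA, n_Aa, n_aa = 410, 380, 120
n = n_AA + n_Aa + n_aa
p = (2 * n_AA + n_Aa) / (2 * n)              # frequency of allele A

# Expected counts under Hardy-Weinberg proportions (p^2, 2pq, q^2).
expected = np.array([p ** 2, 2 * p * (1 - p), (1 - p) ** 2]) * n
observed = np.array([n_AA, n_Aa, n_aa])
chi2 = np.sum((observed - expected) ** 2 / expected)   # 1 df test

# Inbreeding coefficient: f > 0 indicates a heterozygote deficit, the
# deviation pattern reported in the study above.
f = 1 - n_Aa / expected[1]
print(f"chi-square = {chi2:.2f}, inbreeding coefficient f = {f:.3f}")
```

The paper's contribution is to multiply impute the discarded genotypes before this test, rather than change the test itself.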

  11. An Upper Bound on High Speed Satellite Collision Probability When Only One Object has Position Uncertainty Information

    NASA Technical Reports Server (NTRS)

    Frisbee, Joseph H., Jr.

    2015-01-01

    Upper bounds on high speed satellite collision probability, Pc, have been investigated. Previous methods assume an individual position error covariance matrix is available for each object, the two matrices being combined into a single relative position error covariance matrix. Components of the combined error covariance are then varied to obtain a maximum Pc. If error covariance information is available for only one of the two objects, either some default shape has been used or nothing could be done. An alternative is presented that uses the known covariance information along with a critical value of the missing covariance to obtain an approximate but potentially useful Pc upper bound.

  12. CERAMIC: Case-Control Association Testing in Samples with Related Individuals, Based on Retrospective Mixed Model Analysis with Adjustment for Covariates

    PubMed Central

    Zhong, Sheng; McPeek, Mary Sara

    2016-01-01

    We consider the problem of genetic association testing of a binary trait in a sample that contains related individuals, where we adjust for relevant covariates and allow for missing data. We propose CERAMIC, an estimating equation approach that can be viewed as a hybrid of logistic regression and linear mixed-effects model (LMM) approaches. CERAMIC extends the recently proposed CARAT method to allow samples with related individuals and to incorporate partially missing data. In simulations, we show that CERAMIC outperforms existing LMM and generalized LMM approaches, maintaining high power and correct type 1 error across a wider range of scenarios. CERAMIC results in a particularly large power increase over existing methods when the sample includes related individuals with some missing data (e.g., when some individuals with phenotype and covariate information have missing genotype), because CERAMIC is able to make use of the relationship information to incorporate partially missing data in the analysis while correcting for dependence. Because CERAMIC is based on a retrospective analysis, it is robust to misspecification of the phenotype model, resulting in better control of type 1 error and higher power than that of prospective methods, such as GMMAT, when the phenotype model is misspecified. CERAMIC is computationally efficient for genomewide analysis in samples of related individuals of almost any configuration, including small families, unrelated individuals and even large, complex pedigrees. We apply CERAMIC to data on type 2 diabetes (T2D) from the Framingham Heart Study. In a genome scan, 9 of the 10 smallest CERAMIC p-values occur in or near either known T2D susceptibility loci or plausible candidates, verifying that CERAMIC is able to home in on the important loci in a genome scan. PMID:27695091

  13. Missing clinical information in NHS hospital outpatient clinics: prevalence, causes and effects on patient care.

    PubMed

    Burnett, Susan J; Deelchand, Vashist; Franklin, Bryony Dean; Moorthy, Krishna; Vincent, Charles

    2011-05-23

    In Britain over 39,000 reports were received by the National Patient Safety Agency relating to failures in documentation in 2007 and the UK Health Services Journal estimated in 2008 that over a million hospital outpatient visits each year might take place without the full record available. Despite these high numbers, the impact of missing clinical information has not been investigated for hospital outpatients in the UK. Studies in primary care in the USA have found 13.6% of patient consultations have missing clinical information, with this adversely affecting care in about half of cases, and in Australia 1.8% of medical errors were found to be due to the unavailability of clinical information. Our objectives were to assess the frequency, nature and potential impact on patient care of missing clinical information in NHS hospital outpatients and to assess the principal causes. This is the first study to present such figures for the UK and the first to look at how clinicians respond, including the associated impact on patient care. Prospective descriptive study of missing information reported by surgeons, supplemented by interviews on the causes. Data were collected by surgeons in general, gastrointestinal, colorectal and vascular surgical clinics in three teaching hospitals across the UK for over a thousand outpatient appointments. Fifteen interviews were conducted with those involved in collating clinical information for these clinics. The study had ethics approval (Hammersmith and Queen Charlotte's & Chelsea Research Ethics Committee), reference number (09/H0707/27). Participants involved in the interviews signed a consent form and were offered the opportunity to review and agree the transcript of their interview before analysis. No patients were involved in this research. In 15% of outpatient consultations key items of clinical information were missing. Of these patients, 32% experienced a delay or disruption to their care and 20% had a risk of harm. In over half of

  14. Bayesian analysis of longitudinal dyadic data with informative missing data using a dyadic shared-parameter model.

    PubMed

    Ahn, Jaeil; Morita, Satoshi; Wang, Wenyi; Yuan, Ying

    2017-01-01

    Analyzing longitudinal dyadic data is a challenging task due to the complicated correlations from repeated measurements and within-dyad interdependence, as well as potentially informative (or non-ignorable) missing data. We propose a dyadic shared-parameter model to analyze longitudinal dyadic data with ordinal outcomes and informative intermittent missing data and dropouts. We model the longitudinal measurement process using a proportional odds model, which accommodates the within-dyad interdependence using the concept of the actor-partner interdependence effects, as well as dyad-specific random effects. We model informative dropouts and intermittent missing data using a transition model, which shares the same set of random effects as the longitudinal measurement model. We evaluate the performance of the proposed method through extensive simulation studies. As our approach relies on some untestable assumptions on the missing data mechanism, we perform sensitivity analyses to evaluate how the analysis results change when the missing data mechanism is misspecified. We demonstrate our method using a longitudinal dyadic study of metastatic breast cancer.

  15. Covariance Based Pre-Filters and Screening Criteria for Conjunction Analysis

    NASA Astrophysics Data System (ADS)

    George, E.; Chan, K.

    2012-09-01

    Several relationships are developed relating object size, initial covariance and range at closest approach to probability of collision. These relationships address the following questions:
    - Given the objects' initial covariance and combined hard body size, what is the maximum possible value of the probability of collision (Pc)?
    - Given the objects' initial covariance, what is the maximum combined hard body radius for which the probability of collision does not exceed the tolerance limit?
    - Given the objects' initial covariance and the combined hard body radius, what is the minimum miss distance for which the probability of collision does not exceed the tolerance limit?
    - Given the objects' initial covariance and the miss distance, what is the maximum combined hard body radius for which the probability of collision does not exceed the tolerance limit?
    The first relationship above allows the elimination of object pairs from conjunction analysis (CA) on the basis of the initial covariance and hard-body sizes of the objects. The application of this pre-filter to present day catalogs with estimated covariance results in the elimination of approximately 35% of object pairs as unable to ever conjunct with a probability of collision exceeding 1×10⁻⁶. Because Pc is directly proportional to object size and inversely proportional to covariance size, this pre-filter will have a significantly larger impact on future catalogs, which are expected to contain a much larger fraction of small debris tracked only by a limited subset of available sensors. This relationship also provides a mathematically rigorous basis for eliminating objects from analysis entirely based on element set age or quality - a practice commonly done by rough rules of thumb today. Further, these relations can be used to determine the required geometric screening radius for all objects. This analysis reveals the screening volumes for small objects are much larger than needed, while the screening volumes for
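
The existence of a maximum possible Pc can be checked numerically in a simplified 2-D isotropic setting (miss distance and hard-body radius below are hypothetical; the paper's relations handle general covariances):

```python
import numpy as np

rng = np.random.default_rng(6)

# Encounter-plane model: relative position is Gaussian with isotropic
# standard deviation sigma, centred at miss distance d from the origin;
# a "collision" is a sample falling within the combined hard-body
# radius R. Scanning sigma shows Pc peaks at an intermediate covariance
# size, which is what makes a maximum-possible-Pc pre-filter work.
d, R = 1000.0, 10.0                     # metres, hypothetical values

def pc_mc(sigma, n=2_000_000):
    pts = rng.normal(scale=sigma, size=(n, 2))
    pts[:, 0] += d
    return np.mean(np.hypot(pts[:, 0], pts[:, 1]) < R)

sigmas = np.array([100.0, 300.0, 707.0, 2000.0, 6000.0])
pcs = np.array([pc_mc(s) for s in sigmas])
for s, v in zip(sigmas, pcs):
    print(f"sigma = {s:6.0f} m  ->  Pc ~ {v:.1e}")
# For isotropic uncertainty the analytic maximum is at sigma = d/sqrt(2),
# where Pc ~ (R^2/d^2) * exp(-1), small when R << d.
```

If this maximum falls below the tolerance limit, the object pair can be screened out regardless of the actual covariance, which is the pre-filter logic described above.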

  16. Responsiveness-informed multiple imputation and inverse probability-weighting in cohort studies with missing data that are non-monotone or not missing at random.

    PubMed

    Doidge, James C

    2018-02-01

    Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents opportunity for participants' responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement but conventional wisdom has stated that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.
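
The basic inverse probability-weighting idea the paper builds on can be demonstrated in a small simulation. For clarity this sketch weights by the true response probabilities; in practice they must be estimated, for example using responsiveness measures as auxiliary variables, and the paper's non-monotone extensions are more involved:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated cohort: y depends on x, and the probability of responding
# also depends on x (MAR given x), so the complete-case mean of y is
# biased while inverse probability-weighting recovers the truth.
n = 50_000
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)                 # true mean of y is 2.0

p_respond = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))
responded = rng.random(n) < p_respond

cc_mean = y[responded].mean()                     # biased upward
weights = 1.0 / p_respond[responded]              # inverse probabilities
ipw_mean = np.average(y[responded], weights=weights)

print(f"complete-case mean: {cc_mean:.3f}, IPW mean: {ipw_mean:.3f}")
```

Responders over-represent high-x (hence high-y) participants; up-weighting the under-represented responders removes that bias.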

  17. 20 CFR 364.3 - Publication of missing children information in the Railroad Retirement Board's in-house...

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... in the Railroad Retirement Board's in-house publications. 364.3 Section 364.3 Employees' Benefits... the Railroad Retirement Board's in-house publications. (a) All-A-Board. Information about missing... publication. (b) Other in-house publications. The Board may publish missing children information in other in...

  18. 20 CFR 364.3 - Publication of missing children information in the Railroad Retirement Board's in-house...

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... in the Railroad Retirement Board's in-house publications. 364.3 Section 364.3 Employees' Benefits... the Railroad Retirement Board's in-house publications. (a) All-A-Board. Information about missing... publication. (b) Other in-house publications. The Board may publish missing children information in other in...

  19. Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

    PubMed

    Sehgal, Muhammad Shoaib B; Gondal, Iqbal; Dooley, Laurence S

    2005-05-15

    Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algorithms have been proposed, more robust techniques need to be developed so that further analysis of biological data can be accurately undertaken. In this paper, an innovative missing value imputation algorithm called collateral missing value estimation (CMVE) is presented which uses multiple covariance-based imputation matrices for the final prediction of missing values. The matrices are computed and optimized using least square regression and linear programming methods. The new CMVE algorithm has been compared with existing estimation techniques including Bayesian principal component analysis imputation (BPCA), least square impute (LSImpute) and K-nearest neighbour (KNN). All these methods were rigorously tested to estimate missing values in three separate non-time series (ovarian cancer based) and one time series (yeast sporulation) dataset. Each method was quantitatively analyzed using the normalized root mean square (NRMS) error measure, covering a wide range of randomly introduced missing value probabilities from 0.01 to 0.2. Experiments were also undertaken on the yeast dataset, which comprised 1.7% actual missing values, to test the hypothesis that CMVE performed better not only for randomly occurring but also for a real distribution of missing values. The results confirmed that CMVE consistently demonstrated superior and robust estimation capability of missing values compared with other methods for both series types of data, for the same order of computational complexity. A concise theoretical framework has also been formulated to validate the improved performance of the CMVE
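
As a reference point for the comparison above, the KNN baseline and the NRMS error measure can be sketched as follows (CMVE itself, with its covariance-based imputation matrices and linear-programming step, is considerably more involved; all data here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy expression matrix (genes x arrays) with correlated rows, plus
# missing values introduced at random with probability 0.05.
genes, arrays = 200, 12
signal = rng.normal(size=(genes, 1))                 # shared gene effect
X = signal + 0.3 * rng.normal(size=(genes, arrays))
missing = rng.random(X.shape) < 0.05
X_miss = np.where(missing, np.nan, X)

def knn_impute(M, k=10):
    """Fill each missing entry with the column mean over the k rows most
    similar to the target row (distance over mutually observed columns)."""
    out = M.copy()
    for i, j in zip(*np.where(np.isnan(M))):
        donors = np.where(~np.isnan(M[:, j]))[0]
        donors = donors[donors != i]
        dist = np.array([np.nanmean((M[i] - M[r]) ** 2) for r in donors])
        nearest = donors[np.argsort(dist)[:k]]
        out[i, j] = M[nearest, j].mean()
    return out

X_imp = knn_impute(X_miss)

# Normalized root mean square (NRMS) error over the masked entries,
# the accuracy measure used in the study above.
nrms = np.sqrt(np.mean((X_imp[missing] - X[missing]) ** 2)) / np.std(X[missing])
print(f"KNN imputation NRMS error: {nrms:.3f}")
```

Because the true values are known for the artificially masked entries, NRMS can be computed exactly, which is how such imputation comparisons are typically scored.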

  20. Missed Opportunities for Hepatitis A Vaccination, National Immunization Survey-Child, 2013.

    PubMed

    Casillas, Shannon M; Bednarczyk, Robert A

    2017-08-01

    To quantify the number of missed opportunities for vaccination with hepatitis A vaccine in children and assess the association of missed opportunities for hepatitis A vaccination with covariates of interest. Weighted data from the 2013 National Immunization Survey of US children aged 19-35 months were used. Analysis was restricted to children with provider-verified vaccination history (n = 13 460). Missed opportunities for vaccination were quantified by determining the number of medical visits a child made when another vaccine was administered during eligibility for hepatitis A vaccine, but hepatitis A vaccine was not administered. Cross-sectional bivariate and multivariate polytomous logistic regression were used to assess the association of missed opportunities for vaccination with child and maternal demographic, socioeconomic, and geographic covariates. In 2013, 85% of children in our study population had initiated the hepatitis A vaccine series, and 60% received 2 or more doses. Children who received zero doses of hepatitis A vaccine had an average of 1.77 missed opportunities for vaccination compared with 0.43 missed opportunities for vaccination in those receiving 2 doses. Children with 2 or more missed opportunities for vaccination initiated the vaccine series 6 months later than children without missed opportunities. In the fully adjusted multivariate model, children who were younger, had ever received WIC benefits, or lived in a state with childcare entry mandates were at a reduced odds for 2 or more missed opportunities for vaccination; children living in the Northeast census region were at an increased odds. Missed opportunities for vaccination likely contribute to the poor coverage for hepatitis A vaccination in children; it is important to understand why children are not receiving the vaccine when eligible. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. The impact of missing sensor information on surgical workflow management.

    PubMed

    Liebmann, Philipp; Meixensberger, Jürgen; Wiedemann, Peter; Neumuth, Thomas

    2013-09-01

    Sensor systems in the operating room may encounter intermittent data losses that reduce the performance of surgical workflow management systems (SWFMS). Sensor data loss could impact SWFMS-based decision support, device parameterization, and information presentation. The purpose of this study was to understand the robustness of surgical process models when sensor information is partially missing. SWFMS changes caused by wrong or no data from the sensor system which tracks the progress of a surgical intervention were tested. The individual surgical process models (iSPMs) from 100 different cataract procedures of 3 ophthalmologic surgeons were used to select a randomized subset and create a generalized surgical process model (gSPM). A disjoint subset was selected from the iSPMs and used to simulate the surgical process against the gSPM. The loss of sensor data was simulated by removing some information from one task in the iSPM. The effect of missing sensor data was measured using several metrics: (a) successful relocation of the path in the gSPM, (b) the number of steps to find the converging point, and (c) the perspective with the highest occurrence of unsuccessful path findings. A gSPM built using 30% of the iSPMs successfully found the correct path in 90% of the cases. The most critical sensor data were the information regarding the instrument used by the surgeon. We found that use of a gSPM to provide input data for a SWFMS is robust and can be accurate despite missing sensor data. A surgical workflow management system can provide the surgeon with workflow guidance in the OR for most cases. Sensor systems for surgical process tracking can be evaluated based on the stability and accuracy of functional and spatial operative results.

  2. Dynamic Tasking of Networked Sensors Using Covariance Information

    DTIC Science & Technology

    2010-09-01

    has been created under an effort called TASMAN (Tasking Autonomous Sensors in a Multiple Application Network). One of the first studies utilizing this...environment was focused on a novel resource management approach, namely covariance-based tasking. Under this scheme, the state error covariance of...resident space objects (RSO), sensor characteristics, and sensor- target geometry were used to determine the effectiveness of future observations in

  3. Missing Children. Missing Children Data Collected by the National Crime Information Center. Fact Sheet for the Honorable Alfonse M. D'Amato, United States Senate.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC.

    This document is a fact sheet on missing children data collected by the National Crime Information Center (NCIC). The document contains details on the design of the NCIC system, state control for management and use of the system, access to the system, criteria for missing persons, number of active cases, and the unidentified persons file of the…

  4. OD Covariance in Conjunction Assessment: Introduction and Issues

    NASA Technical Reports Server (NTRS)

    Hejduk, M. D.; Duncan, M.

    2015-01-01

    Primary and secondary covariances are combined and projected into the conjunction plane (the plane perpendicular to the relative velocity vector at TCA). The primary object is placed on the x-axis at (miss distance, 0) and is represented by a circle of radius equal to the sum of both spacecraft circumscribing radii. The z-axis is perpendicular to the x-axis in the conjunction plane. Pc is the portion of the combined error ellipsoid that falls within the hard-body radius circle.

  5. Assessment of score- and Rasch-based methods for group comparison of longitudinal patient-reported outcomes with intermittent missing data (informative and non-informative).

    PubMed

    de Bock, Élodie; Hardouin, Jean-Benoit; Blanchin, Myriam; Le Neel, Tanguy; Kubis, Gildas; Sébille, Véronique

    2015-01-01

    The purpose of this study was to identify the most adequate strategy for group comparison of longitudinal patient-reported outcomes in the presence of possibly informative intermittent missing data. Models coming from classical test theory (CTT) and item response theory (IRT) were compared. Two groups of patients' responses to dichotomous items with three times of assessment were simulated. Different cases were considered: presence or absence of a group effect and/or a time effect, a total of 100 or 200 patients, 4 or 7 items and two different values for the correlation coefficient of the latent trait between two consecutive times (0.4 or 0.9). Cases including informative and non-informative intermittent missing data were compared at different rates (15, 30 %). These simulated data were analyzed with CTT using score and mixed model (SM) and with IRT using longitudinal Rasch mixed model (LRM). The type I error, the power and the bias of the group effect estimations were compared between the two methods. This study showed that LRM performs better than SM. When the rate of missing data rose to 30 %, estimations were biased with SM mainly for informative missing data. Otherwise, LRM and SM methods were comparable concerning biases. However, regardless of the rate of intermittent missing data, power of LRM was higher compared to power of SM. In conclusion, LRM should be favored when the rate of missing data is higher than 15 %. For other cases, SM and LRM provide similar results.

  6. A comparison of model-based imputation methods for handling missing predictor values in a linear regression model: A simulation study

    NASA Astrophysics Data System (ADS)

    Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah

    2017-08-01

    In regression analysis, missing covariate data is a common problem. Many researchers use ad hoc methods to overcome it because they are easy to implement, but these methods require assumptions about the data that rarely hold in practice. Model-based methods such as maximum likelihood (ML) via the expectation-maximization (EM) algorithm and multiple imputation (MI) are more promising for dealing with the difficulties caused by missing data. However, an inappropriate imputation method can introduce serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing-data concepts to help researchers select appropriate imputation methods. A simulation study was performed to assess the effects of different missing-data techniques on the performance of a regression model. The covariate data were generated from an underlying multivariate normal distribution, and the dependent variable was generated as a combination of the explanatory variables. Missing values in a covariate were simulated under a missing at random (MAR) mechanism, with four levels of missingness (10%, 20%, 30% and 40%) imposed. The ML and MI techniques available within SAS software were investigated. A linear regression model was fitted, and the performance measures MSE and R-squared were obtained. Results of the analysis showed that MI is superior in handling missing data, with the highest R-squared and lowest MSE, when the percentage of missingness is less than 30%. Both methods were unable to handle more than 30% missingness.
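
The design described above is straightforward to reproduce. The sketch below (NumPy only; illustrative, not the study's SAS code) imposes roughly 30% MAR missingness on one covariate and compares a complete-case fit with a simple multiple-imputation fit that draws the missing covariate from a regression on the observed data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Correlated covariates and a linear outcome (true coefficients: 1.0, 2.0, 3.0).
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=np.sqrt(0.75), size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# MAR mechanism: the probability that x2 is missing depends only on x1 (~30%).
miss = rng.random(n) < 1 / (1 + np.exp(-(x1 - 0.9)))
obs = ~miss

def ols(X, y):
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Complete-case fit: discards the ~30% of rows with a missing x2.
cc = ols(np.column_stack([x1, x2])[obs], y[obs])

# Simple multiple imputation: regress x2 on (x1, y) in complete cases,
# draw imputations with residual noise, refit, and pool by averaging.
a = ols(np.column_stack([x1[obs], y[obs]]), x2[obs])
resid_sd = np.std(x2[obs] - (a[0] + a[1] * x1[obs] + a[2] * y[obs]))
fits = []
for _ in range(20):
    x2_imp = x2.copy()
    x2_imp[miss] = (a[0] + a[1] * x1[miss] + a[2] * y[miss]
                    + rng.normal(scale=resid_sd, size=miss.sum()))
    fits.append(ols(np.column_stack([x1, x2_imp]), y))
mi = np.mean(fits, axis=0)
```

Note that the imputation model conditions on the outcome y; omitting it is a classic way to bias MI estimates toward the null.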

  7. An ICU Preanesthesia Evaluation Form Reduces Missing Preoperative Key Information.

    PubMed

    Chuy, Katherine; Yan, Zhe; Fleisher, Lee; Liu, Renyu

    2012-09-28

    A comprehensive preoperative evaluation is critical for providing anesthetic care for patients from the intensive care unit (ICU). There has been no preoperative evaluation form specific for ICU patients that allows for a rapid and focused evaluation by anesthesia providers, including junior residents. In this study, a specific preoperative form was designed for ICU patients and evaluated to allow residents to perform the most relevant and important preoperative evaluations efficiently. The following steps were utilized for developing the preoperative evaluation form: 1) designed a new preoperative form specific for ICU patients; 2) had the form reviewed by attending physicians and residents, followed by multiple revisions; 3) conducted test releases and revisions; 4) released the final version and conducted a survey; 5) compared data collection from new ICU form with that from a previously used generic form. Each piece of information on the forms was assigned a score, and the score for the total missing information was determined. The score for each form was presented as mean ± standard deviation (SD), and compared by unpaired t test. A P value < 0.05 was considered statistically significant. Of 52 anesthesiologists (19 attending physicians, 33 residents) responding to the survey, 90% preferred the final new form; and 56% thought the new form would reduce perioperative risk for ICU patients. Forty percent were unsure whether the form would reduce perioperative risk. Over a three month period, we randomly collected 32 generic forms and 25 new forms. The average score for missing data was 23 ± 10 for the generic form and 8 ± 4 for the new form (P = 2.58E-11). A preoperative evaluation form designed specifically for ICU patients is well accepted by anesthesia providers and helped to reduce missing key preoperative information. Such an approach is important for perioperative patient safety.

  8. Individual Information-Centered Approach for Handling Physical Activity Missing Data

    ERIC Educational Resources Information Center

    Kang, Minsoo; Rowe, David A.; Barreira, Tiago V.; Robinson, Terrance S.; Mahar, Matthew T.

    2009-01-01

    The purpose of this study was to validate individual information (II)-centered methods for handling missing data, using data samples of 118 middle-aged adults and 91 older adults equipped with Yamax SW-200 pedometers and Actigraph accelerometers for 7 days. We used a semisimulation approach to create six data sets: three physical activity outcome…

  9. Analyzing partially missing confounder information in comparative effectiveness and safety research of therapeutics.

    PubMed

    Toh, Sengwee; García Rodríguez, Luis A; Hernán, Miguel A

    2012-05-01

    Electronic healthcare databases are commonly used in comparative effectiveness and safety research of therapeutics. Many databases now include additional confounder information in a subset of the study population through data linkage or data collection. We described and compared existing methods for analyzing such datasets. Using data from The Health Improvement Network and the relation between non-steroidal anti-inflammatory drugs and upper gastrointestinal bleeding as an example, we employed several methods to handle partially missing confounder information. The crude odds ratio (OR) of upper gastrointestinal bleeding was 1.50 (95% confidence interval: 0.98, 2.28) among selective cyclo-oxygenase-2 inhibitor initiators (n = 43 569) compared with traditional non-steroidal anti-inflammatory drug initiators (n = 411 616). The OR dropped to 0.81 (0.52, 1.27) upon adjustment for confounders recorded for all patients. When further considering three additional variables missing in 22% of the study population (smoking, alcohol consumption, body mass index), the OR was between 0.80 and 0.83 for the missing-category approach, the missing-indicator approach, single imputation by the most common category, multiple imputation by chained equations, and propensity score calibration. The OR was 0.65 (0.39, 1.09) and 0.67 (0.38, 1.16) for the unweighted and the inverse probability weighted complete-case analysis, respectively. Existing methods for handling partially missing confounder data require different assumptions and may produce different results. The unweighted complete-case analysis, the missing-category/indicator approach, and single imputation require often unrealistic assumptions and should be avoided. In this study, differences across methods were not substantial, likely due to relatively low proportion of missingness and weak confounding effect by the three additional variables upon adjustment for other variables. Copyright © 2012 John Wiley & Sons, Ltd.
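
The warning above about the missing-indicator approach can be made concrete. In this self-contained sketch (simulated data with hypothetical variable names, not the THIN cohort), a confounder is missing completely at random; the complete-case fit stays near the true null treatment effect, while zero-filling plus a missingness indicator leaves the missing stratum unadjusted, so confounding leaks back into the estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
conf = rng.normal(size=n)                        # confounder (e.g., BMI-like)
treat = (rng.random(n) < 1 / (1 + np.exp(-conf))).astype(float)
y = 1.0 * conf + rng.normal(size=n)              # true treatment effect is 0

miss = rng.random(n) < 0.22                      # 22% missing, MCAR here

def ols(X, y):
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Complete-case analysis: unbiased here, at the cost of discarded records.
b_cc = ols(np.column_stack([treat, conf])[~miss], y[~miss])[1]

# Missing-indicator approach: zero-fill the confounder, add an indicator.
# Within the missing stratum the confounder is uncontrolled, biasing the
# treatment coefficient even though the missingness is MCAR.
conf_filled = np.where(miss, 0.0, conf)
b_ind = ols(np.column_stack([treat, conf_filled, miss.astype(float)]), y)[1]
```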

  10. Linear Regression with a Randomly Censored Covariate: Application to an Alzheimer's Study.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2017-01-01

    The association between maternal age of onset of dementia and amyloid deposition (measured by in vivo positron emission tomography (PET) imaging) in cognitively normal older offspring is of interest. In a regression model for amyloid, special methods are required due to the random right censoring of the covariate of maternal age of onset of dementia. Prior literature has proposed methods to address the problem of censoring due to assay limit of detection, but not random censoring. We propose imputation methods and a survival regression method that do not require parametric assumptions about the distribution of the censored covariate. Existing imputation methods address missing covariates, but not right censored covariates. In simulation studies, we compare these methods to the simple, but inefficient complete case analysis, and to thresholding approaches. We apply the methods to the Alzheimer's study.

  11. Robust covariance estimation of galaxy-galaxy weak lensing: validation and limitation of jackknife covariance

    NASA Astrophysics Data System (ADS)

    Shirasaki, Masato; Takada, Masahiro; Miyatake, Hironao; Takahashi, Ryuichi; Hamana, Takashi; Nishimichi, Takahiro; Murata, Ryoma

    2017-09-01

    We develop a method to simulate galaxy-galaxy weak lensing by utilizing all-sky, light-cone simulations and their inherent halo catalogues. Using the mock catalogue to study the error covariance matrix of galaxy-galaxy weak lensing, we compare the full covariance with the 'jackknife' (JK) covariance, a method often used in the literature that estimates the covariance from resamples of the data itself. We show that the JK covariance varies across realizations of mock lensing measurements, while the average JK covariance over mocks gives a reasonably accurate estimate of the true covariance up to separations comparable with the size of a JK subregion. The scatter in JK covariances is found to be ∼10 per cent after we subtract the lensing measurement around random points. However, the JK method tends to underestimate the covariance at larger separations, increasingly so for a survey with a higher number density of source galaxies. We apply our method to the Sloan Digital Sky Survey (SDSS) data and show that the 48 mock SDSS catalogues nicely reproduce the signals and the JK covariance measured from the real data. We then argue that use of the accurate covariance, rather than the JK covariance, allows us to use the lensing signals at large scales beyond the size of a JK subregion, which contain cleaner cosmological information in the linear regime.
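
For readers unfamiliar with the estimator being validated, a delete-one jackknife covariance for a vector statistic (here simply the sample mean) can be sketched as below; in the paper the "observations" deleted are spatial subregions of the survey rather than independent rows, which is exactly why the method degrades at separations approaching the subregion size.

```python
import numpy as np

def jackknife_cov(samples):
    """Delete-one jackknife covariance of the sample mean.

    samples : (n, d) array of n observations of a d-dimensional statistic.
    Returns a (d, d) covariance estimate of the mean.
    """
    n = len(samples)
    total = samples.sum(axis=0)
    loo_means = (total - samples) / (n - 1)      # leave-one-out means
    dev = loo_means - loo_means.mean(axis=0)
    return (n - 1) / n * dev.T @ dev             # (n-1)/n * sum of outer products

rng = np.random.default_rng(0)
s = rng.normal(size=(500, 2)) @ np.array([[1.0, 0.3], [0.0, 1.0]])
cov_jk = jackknife_cov(s)
```

For the mean of independent rows this reduces exactly to the sample covariance divided by n; the interesting (and problematic) behaviour in the paper comes from applying it to spatially correlated subregions.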

  12. The influence of the number of relevant causes on the processing of covariation information in causal reasoning.

    PubMed

    Kim, Kyungil; Markman, Arthur B; Kim, Tae Hoon

    2016-11-01

    Research on causal reasoning has focused on the influence of covariation between candidate causes and effects on causal judgments. We suggest that the type of covariation information to which people attend is affected by the task being performed. For this, we manipulated the test questions for the evaluation of contingency information and observed its influence on both contingency learning and subsequent causal selections. When people select one cause related to an effect, they focus on conditional contingencies assuming the absence of alternative causes. When people select two causes related to an effect, they focus on conditional contingencies assuming the presence of alternative causes. We demonstrated this use of contingency information in four experiments.
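
The conditional contingencies referred to above are ΔP values computed within strata of the alternative cause. A minimal sketch (toy trial data; the function name is illustrative, not from the paper):

```python
import numpy as np

def delta_p(effect, cause, mask=None):
    """Contingency ΔP = P(effect | cause) - P(effect | ~cause).

    If `mask` is given, ΔP is computed conditionally, within the
    stratum of trials selected by the boolean mask.
    """
    if mask is not None:
        effect, cause = effect[mask], cause[mask]
    return effect[cause].mean() - effect[~cause].mean()

# Toy trials: the effect occurs whenever either candidate cause is present.
c1 = np.array([1, 1, 1, 1, 0, 0, 0, 0], bool)
c2 = np.array([1, 0, 1, 0, 1, 0, 1, 0], bool)
e = c1 | c2

dp_c2_absent = delta_p(e, c1, mask=~c2)   # 1.0: c1 fully controls e when c2 is absent
dp_c2_present = delta_p(e, c1, mask=c2)   # 0.0: e occurs anyway when c2 is present
```

The contrast between the two conditional values is what the task manipulation in the paper is hypothesized to direct attention toward.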

  13. ICON: 3D reconstruction with 'missing-information' restoration in biological electron tomography.

    PubMed

    Deng, Yuchen; Chen, Yu; Zhang, Yan; Wang, Shengliu; Zhang, Fa; Sun, Fei

    2016-07-01

    Electron tomography (ET) plays an important role in revealing biological structures, ranging from the macromolecular to the subcellular scale. Due to limited tilt angles, ET reconstruction always suffers from 'missing wedge' artifacts, which severely weaken further biological interpretation. In this work, we developed an algorithm called Iterative Compressed-sensing Optimized Non-uniform fast Fourier transform reconstruction (ICON), based on compressed-sensing theory and the assumption that biological specimens are sparse. ICON can significantly restore the missing information in comparison with other reconstruction algorithms. More importantly, we used the leave-one-out method to verify the validity of the restored information for both simulated and experimental data. The significant improvement in sub-tomogram averaging by ICON indicates its great potential for future high-resolution structural determination of macromolecules in situ. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  14. Inverse-Probability-Weighted Estimation for Monotone and Nonmonotone Missing Data.

    PubMed

    Sun, BaoLuo; Perkins, Neil J; Cole, Stephen R; Harel, Ofer; Mitchell, Emily M; Schisterman, Enrique F; Tchetgen Tchetgen, Eric J

    2018-03-01

    Missing data is a common occurrence in epidemiologic research. In this paper, 3 data sets with induced missing values from the Collaborative Perinatal Project, a multisite US study conducted from 1959 to 1974, are provided as examples of prototypical epidemiologic studies with missing data. Our goal was to estimate the association of maternal smoking behavior with spontaneous abortion while adjusting for numerous confounders. At the same time, we did not necessarily wish to evaluate the joint distribution among potentially unobserved covariates, which is seldom the subject of substantive scientific interest. The inverse probability weighting (IPW) approach preserves the semiparametric structure of the underlying model of substantive interest and clearly separates the model of substantive interest from the model used to account for the missing data. However, IPW often will not result in valid inference if the missing-data pattern is nonmonotone, even if the data are missing at random. We describe a recently proposed approach to modeling nonmonotone missing-data mechanisms under missingness at random to use in constructing the weights in IPW complete-case estimation, and we illustrate the approach using 3 data sets described in a companion article (Am J Epidemiol. 2018;187(3):568-575).
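
As a minimal illustration of the IPW idea (simulated data, not the Collaborative Perinatal Project analysis), the sketch below estimates a mean when the outcome is missing at random given a fully observed covariate: the complete-case mean is biased, while weighting each complete case by the inverse of its observation probability removes the bias. In practice the observation probabilities are estimated from a fitted missingness model rather than known.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50000
x = rng.normal(size=n)                  # fully observed covariate
y = 2.0 + x + rng.normal(size=n)        # outcome with true mean 2.0

# MAR: y is observed with a probability that depends only on the observed x,
# so high-x (high-y) subjects are over-represented among complete cases.
p_obs = 1 / (1 + np.exp(-x))
obs = rng.random(n) < p_obs

naive = y[obs].mean()                   # complete-case mean: biased upward
w = 1 / p_obs[obs]                      # inverse probability of observation
ipw = np.sum(w * y[obs]) / np.sum(w)    # Hajek-type IPW mean
```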

  15. Multiple imputation of missing data in nested case-control and case-cohort studies.

    PubMed

    Keogh, Ruth H; Seaman, Shaun R; Bartlett, Jonathan W; Wood, Angela M

    2018-06-05

    The nested case-control and case-cohort designs are two main approaches for carrying out a substudy within a prospective cohort. This article adapts multiple imputation (MI) methods for handling missing covariates in full-cohort studies for nested case-control and case-cohort studies. We consider data missing by design and data missing by chance. MI analyses that make use of full-cohort data and MI analyses based on substudy data only are described, alongside an intermediate approach in which the imputation uses full-cohort data but the analysis uses only the substudy. We describe adaptations to two imputation methods: the approximate method (MI-approx) of White and Royston () and the "substantive model compatible" (MI-SMC) method of Bartlett et al. (). We also apply the "MI matched set" approach of Seaman and Keogh () to nested case-control studies, which does not require any full-cohort information. The methods are investigated using simulation studies and all perform well when their assumptions hold. Substantial gains in efficiency can be made by imputing data missing by design using the full-cohort approach or by imputing data missing by chance in analyses using the substudy only. The intermediate approach brings greater gains in efficiency relative to the substudy approach and is more robust to imputation model misspecification than the full-cohort approach. The methods are illustrated using the ARIC Study cohort. Supplementary Materials provide R and Stata code. © 2018, The International Biometric Society.

  16. Operation Reliability Assessment for Cutting Tools by Applying a Proportional Covariate Model to Condition Monitoring Information

    PubMed Central

    Cai, Gaigai; Chen, Xuefeng; Li, Bing; Chen, Baojia; He, Zhengjia

    2012-01-01

    The reliability of cutting tools is critical to machining precision and production efficiency. The conventional statistic-based reliability assessment method aims at providing a general and overall estimation of reliability for a large population of identical units under given and fixed conditions. However, it has limited effectiveness in depicting the operational characteristics of a cutting tool. To overcome this limitation, this paper proposes an approach to assess the operation reliability of cutting tools. A proportional covariate model is introduced to construct the relationship between operation reliability and condition monitoring information. The wavelet packet transform and an improved distance evaluation technique are used to extract sensitive features from vibration signals, and a covariate function is constructed based on the proportional covariate model. Ultimately, the failure rate function of the cutting tool being assessed is calculated using the baseline covariate function obtained from a small sample of historical data. Experimental results and a comparative study show that the proposed method is effective for assessing the operation reliability of cutting tools. PMID:23201980

  17. Dangers in Using Analysis of Covariance Procedures.

    ERIC Educational Resources Information Center

    Campbell, Kathleen T.

    Problems associated with the use of analysis of covariance (ANCOVA) as a statistical control technique are explained. Three problems relate to the use of "OVA" methods (analysis of variance, analysis of covariance, multivariate analysis of variance, and multivariate analysis of covariance) in general. These are: (1) the wasting of information when…

  18. Multivariate analysis of covariates of adherence among HIV-positive mothers with low viral suppression.

    PubMed

    Nsubuga-Nyombi, Tamara; Sensalire, Simon; Karamagi, Esther; Aloyo, Judith; Byabagambi, John; Rahimzai, Mirwais; Nabitaka, Linda Kisaakye; Calnan, Jacqueline

    2018-03-31

    As part of efforts to improve the prevention of mother-to-child transmission in Northern Uganda, we explored reasons for poor viral suppression among 122 pregnant and lactating women who were in care, received viral load tests, but had not achieved viral suppression and had more than 1000 copies/mL. Understanding the patient factors associated with low viral suppression was of interest to the Ministry of Health to guide the development of tools and interventions to achieve viral suppression for pregnant and lactating women newly initiating on ART as well as those on ART with unsuppressed viral load. A facility-based cross-sectional and mixed methods study design was used, with retrospective medical record review. We assessed 122 HIV-positive mothers with known low viral suppression across 31 health facilities in Northern Uganda. Adjusted odds ratios from logistic regression were used to determine the covariates of adherence among HIV-positive mothers. A study among health care providers shed further light on predictors of low viral suppression and a history of low early retention. This study was part of a larger national evaluation of the performance of integrated care services for mothers. Adherence, defined as taking antiretroviral medications correctly every day, was low at 67.2%. The covariates of low adherence were: taking other medications in addition to ART, missing appointments in the past 6 months, experiencing violence in the past 6 months, and facing obstacles to treatment. Mothers experiencing each of these covariates were less likely to adhere to treatment. These covariates were triangulated with perspectives of health providers as covariates of low adherence and included: long distances to the health facility, missed appointments, running out of pills, sharing antiretroviral drugs, violence, and social lifestyles such as multiple sexual partners coupled with non-disclosure to partners. Inadequate counseling, stigma, and lack of client identity are

  19. MISSE-6 hardware

    NASA Image and Video Library

    2009-09-02

    ISS020-E-037367 (1 Sept. 2009) --- A close-up view of a Materials International Space Station Experiment (MISSE-6) on the exterior of the Columbus laboratory is featured in this image photographed by a space walking astronaut during the STS-128 mission’s first session of extravehicular activity (EVA). MISSE collects information on how different materials weather in the environment of space. MISSE was later placed in Space Shuttle Discovery’s cargo bay for its return to Earth.

  1. Principled Missing Data Treatments.

    PubMed

    Lang, Kyle M; Little, Todd D

    2018-04-01

    We review a number of issues regarding missing data treatments for intervention and prevention researchers. Many of the common missing data practices in prevention research are still, unfortunately, ill-advised (e.g., use of listwise and pairwise deletion, insufficient use of auxiliary variables). Our goal is to promote better practice in the handling of missing data. We review the current state of missing data methodology and recent missing data reporting in prevention research. We describe antiquated, ad hoc missing data treatments and discuss their limitations. We discuss two modern, principled missing data treatments: multiple imputation and full information maximum likelihood, and we offer practical tips on how to best employ these methods in prevention research. The principled missing data treatments that we discuss are couched in terms of how they improve causal and statistical inference in the prevention sciences. Our recommendations are firmly grounded in missing data theory and well-validated statistical principles for handling the missing data issues that are ubiquitous in biosocial and prevention research. We augment our broad survey of missing data analysis with references to more exhaustive resources.
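
One concrete piece of the multiple-imputation workflow recommended above is the pooling step. A sketch of Rubin's rules for combining a scalar estimate and its variance across m imputed datasets (illustrative function name; numbers below are made-up fits):

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool a scalar estimate across m imputed datasets via Rubin's rules."""
    m = len(estimates)
    qbar = float(np.mean(estimates))        # pooled point estimate
    ubar = float(np.mean(variances))        # within-imputation variance
    b = float(np.var(estimates, ddof=1))    # between-imputation variance
    t = ubar + (1 + 1 / m) * b              # total variance
    df = (m - 1) * (1 + ubar / ((1 + 1 / m) * b)) ** 2   # Rubin's degrees of freedom
    return qbar, t, df

# e.g., the same coefficient estimated on three imputed datasets:
est, var, df = pool_rubin([1.0, 1.2, 0.8], [0.040, 0.050, 0.045])
```

The between-imputation term is what ad hoc single imputation discards, which is why single imputation understates uncertainty.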

  2. Missing persons-missing data: the need to collect antemortem dental records of missing persons.

    PubMed

    Blau, Soren; Hill, Anthony; Briggs, Christopher A; Cordner, Stephen M

    2006-03-01

    incorporated into the National Coroners Information System (NCIS) managed, on behalf of Australia's Coroners, by the Victorian Institute of Forensic Medicine. The existence of the NCIS would ensure operational collaboration in the implementation of the system and cost savings to Australian policing agencies involved in missing person inquiries. The implementation of such a database would facilitate timely and efficient reconciliation of clinical and postmortem dental records and have subsequent social and financial benefits.

  3. A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: a simulation study.

    PubMed

    De Silva, Anurika Priyanjali; Moreno-Betancur, Margarita; De Livera, Alysha Madhu; Lee, Katherine Jane; Simpson, Julie Anne

    2017-07-25

    Missing data is a common problem in epidemiological studies, and is particularly prominent in longitudinal data, which involve multiple waves of data collection. Traditional multiple imputation (MI) methods (fully conditional specification (FCS) and multivariate normal imputation (MVNI)) treat repeated measurements of the same time-dependent variable as just another 'distinct' variable for imputation and therefore do not make the most of the longitudinal structure of the data. Only a few studies have explored extensions to the standard approaches to account for the temporal structure of longitudinal data. One suggestion is the two-fold fully conditional specification (two-fold FCS) algorithm, which restricts the imputation of a time-dependent variable to time blocks where the imputation model includes measurements taken at the specified and adjacent times. To date, no study has investigated the performance of two-fold FCS and standard MI methods for handling missing data in a time-varying covariate with a non-linear trajectory over time - a commonly encountered scenario in epidemiological studies. We simulated 1000 datasets of 5000 individuals based on the Longitudinal Study of Australian Children (LSAC). Three missing data mechanisms: missing completely at random (MCAR), and a weak and a strong missing at random (MAR) scenarios were used to impose missingness on body mass index (BMI) for age z-scores; a continuous time-varying exposure variable with a non-linear trajectory over time. We evaluated the performance of FCS, MVNI, and two-fold FCS for handling up to 50% of missing data when assessing the association between childhood obesity and sleep problems. The standard two-fold FCS produced slightly more biased and less precise estimates than FCS and MVNI. We observed slight improvements in bias and precision when using a time window width of two for the two-fold FCS algorithm compared to the standard width of one. We recommend the use of FCS or MVNI in a similar
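
A heavily reduced sketch of the time-window idea behind two-fold FCS: each wave is imputed only from waves within ± one time block, iterating until the imputations stabilize. This is deterministic regression imputation for illustration only; the real algorithm draws from posterior predictive distributions and cycles over time-fixed covariates as well.

```python
import numpy as np

rng = np.random.default_rng(3)
n, waves = 2000, 5
t = np.arange(waves)
trajectory = 0.5 * t - 0.08 * t ** 2             # non-linear trend over time
person = rng.normal(size=(n, 1))                 # person-level random effect
bmi = trajectory + person + rng.normal(scale=0.5, size=(n, waves))
data = np.where(rng.random((n, waves)) < 0.3, np.nan, bmi)   # 30% missing

def impute_windowed(data, width=1, sweeps=5):
    """Iterative regression imputation of each wave from the waves within
    +/- `width` time blocks (the two-fold FCS restriction)."""
    out = np.where(np.isnan(data), np.nanmean(data, axis=0), data)  # mean init
    for _ in range(sweeps):
        for j in range(data.shape[1]):
            lo, hi = max(0, j - width), min(data.shape[1], j + width + 1)
            preds = [k for k in range(lo, hi) if k != j]
            X = np.column_stack([np.ones(len(out))] + [out[:, k] for k in preds])
            beta, *_ = np.linalg.lstsq(X, out[:, j], rcond=None)
            m = np.isnan(data[:, j])
            out[m, j] = (X @ beta)[m]                # refresh imputed cells only
    return out

completed = impute_windowed(data)
```

Because adjacent waves share the person-level effect, the windowed predictions track the unobserved values well even with 30% missingness.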

  4. Estimating Missing Features to Improve Multimedia Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bagherjeiran, A; Love, N S; Kamath, C

    Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features. In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.

  5. Improvement of Modeling HTGR Neutron Physics by Uncertainty Analysis with the Use of Cross-Section Covariance Information

    NASA Astrophysics Data System (ADS)

    Boyarinov, V. F.; Grol, A. V.; Fomichenko, P. A.; Ternovykh, M. Yu

    2017-01-01

    This work is aimed at improvement of HTGR neutron physics design calculations by application of uncertainty analysis with the use of cross-section covariance information. Methodology and codes for preparation of multigroup libraries of covariance information for individual isotopes from the basic 44-group library of SCALE-6 code system were developed. A 69-group library of covariance information in a special format for main isotopes and elements typical for high temperature gas cooled reactors (HTGR) was generated. This library can be used for estimation of uncertainties, associated with nuclear data, in analysis of HTGR neutron physics with design codes. As an example, calculations of one-group cross-section uncertainties for fission and capture reactions for main isotopes of the MHTGR-350 benchmark, as well as uncertainties of the multiplication factor (k∞) for the MHTGR-350 fuel compact cell model and fuel block model were performed. These uncertainties were estimated by the developed technology with the use of WIMS-D code and modules of SCALE-6 code system, namely, by TSUNAMI, KENO-VI and SAMS. Eight most important reactions on isotopes for MHTGR-350 benchmark were identified, namely: 10B(capt), 238U(n,γ), ν5, 235U(n,γ), 238U(el), natC(el), 235U(fiss)-235U(n,γ), 235U(fiss).

  6. MISSE-6 hardware

    NASA Image and Video Library

    2009-09-02

    ISS020-E-037372 (1 Sept. 2009) --- A close-up view of a Materials International Space Station Experiment (MISSE-6) on the exterior of the Columbus laboratory is featured in this image photographed by a space walking astronaut during the STS-128 mission’s first session of extravehicular activity (EVA). MISSE collects information on how different materials weather in the environment of space. MISSE was later placed in Space Shuttle Discovery’s payload bay for its return to Earth. A portion of a payload bay door is visible in the background.

  7. MISSE-6 hardware

    NASA Image and Video Library

    2009-09-02

    ISS020-E-037369 (1 Sept. 2009) --- A close-up view of a Materials International Space Station Experiment (MISSE-6) on the exterior of the Columbus laboratory is featured in this image photographed by a space walking astronaut during the STS-128 mission’s first session of extravehicular activity (EVA). MISSE collects information on how different materials weather in the environment of space. MISSE was later placed in Space Shuttle Discovery’s payload bay for its return to Earth. A portion of a payload bay door is visible in the background.

  8. Analysis of longitudinal data from animals with missing values using SPSS.

    PubMed

    Duricki, Denise A; Soleman, Sara; Moon, Lawrence D F

    2016-06-01

    Testing of therapies for disease or injury often involves the analysis of longitudinal data from animals. Modern analytical methods have advantages over conventional methods (particularly when some data are missing), yet they are not used widely by preclinical researchers. Here we provide an easy-to-use protocol for the analysis of longitudinal data from animals, and we present a click-by-click guide for performing suitable analyses using the statistical package IBM SPSS Statistics software (SPSS). We guide readers through the analysis of a real-life data set obtained when testing a therapy for brain injury (stroke) in elderly rats. If a few data points are missing, as in this example data set (for example, because of animal dropout), repeated-measures analysis of covariance may fail to detect a treatment effect. An alternative analysis method, such as the use of linear models (with various covariance structures), and analysis using restricted maximum likelihood estimation (to include all available data) can be used to better detect treatment effects. This protocol takes 2 h to carry out.

  9. Comparing multiple imputation methods for systematically missing subject-level data.

    PubMed

    Kline, David; Andridge, Rebecca; Kaizar, Eloise

    2017-06-01

    When conducting research synthesis, the studies to be combined often do not measure the same set of variables, which creates missing data. When the studies to combine are longitudinal, missing data can occur at the observation level (time-varying) or the subject level (non-time-varying). Traditionally, the focus of missing data methods for longitudinal data has been on missing observation-level variables. In this paper, we focus on missing subject-level variables and compare two multiple imputation approaches: a joint modeling approach and a sequential conditional modeling approach. We find the joint modeling approach to be preferable to the sequential conditional approach, except when the covariance structure of the repeated outcome for each individual has homogeneous variance and exchangeable correlation. Specifically, the regression coefficient estimates from an analysis incorporating imputed values based on the sequential conditional method are attenuated and less efficient than those from the joint method. Remarkably, the estimates from the sequential conditional method are often less efficient than a complete case analysis, which, in the context of research synthesis, implies that we lose efficiency by combining studies. Copyright © 2015 John Wiley & Sons, Ltd.

  10. Comparison of Modern Methods for Analyzing Repeated Measures Data with Missing Values

    ERIC Educational Resources Information Center

    Vallejo, G.; Fernandez, M. P.; Livacic-Rojas, P. E.; Tuero-Herrero, E.

    2011-01-01

    Missing data are a pervasive problem in many psychological applications in the real world. In this article we study the impact of dropout on the operational characteristics of several approaches that can be easily implemented with commercially available software. These approaches include the covariance pattern model based on an unstructured…

  11. A hierarchical nest survival model integrating incomplete temporally varying covariates

    PubMed Central

    Converse, Sarah J; Royle, J Andrew; Adler, Peter H; Urbanek, Richard P; Barzen, Jeb A

    2013-01-01

    Nest success is a critical determinant of the dynamics of avian populations, and nest survival modeling has played a key role in advancing avian ecology and management. Beginning with the development of daily nest survival models, and proceeding through subsequent extensions, the capacity for modeling the effects of hypothesized factors on nest survival has expanded greatly. We extend nest survival models further by introducing an approach to deal with incompletely observed, temporally varying covariates using a hierarchical model. Hierarchical modeling offers a way to separate process and observational components of demographic models to obtain estimates of the parameters of primary interest, and to evaluate structural effects of ecological and management interest. We built a hierarchical model for daily nest survival to analyze nest data from reintroduced whooping cranes (Grus americana) in the Eastern Migratory Population. This reintroduction effort has been beset by poor reproduction, apparently due primarily to nest abandonment by breeding birds. We used the model to assess support for the hypothesis that nest abandonment is caused by harassment from biting insects. We obtained indices of blood-feeding insect populations based on the spatially interpolated counts of insects captured in carbon dioxide traps. However, insect trapping was not conducted daily, and so we had incomplete information on a temporally variable covariate of interest. We therefore supplemented our nest survival model with a parallel model for estimating the values of the missing insect covariates. We used Bayesian model selection to identify the best predictors of daily nest survival. Our results suggest that the black fly Simulium annulus may be negatively affecting nest survival of reintroduced whooping cranes, with decreasing nest survival as abundance of S. annulus increases. The modeling framework we have developed will be applied in the future to a larger data set to evaluate the

  12. A hierarchical nest survival model integrating incomplete temporally varying covariates

    USGS Publications Warehouse

    Converse, Sarah J.; Royle, J. Andrew; Adler, Peter H.; Urbanek, Richard P.; Barzan, Jeb A.

    2013-01-01

    Nest success is a critical determinant of the dynamics of avian populations, and nest survival modeling has played a key role in advancing avian ecology and management. Beginning with the development of daily nest survival models, and proceeding through subsequent extensions, the capacity for modeling the effects of hypothesized factors on nest survival has expanded greatly. We extend nest survival models further by introducing an approach to deal with incompletely observed, temporally varying covariates using a hierarchical model. Hierarchical modeling offers a way to separate process and observational components of demographic models to obtain estimates of the parameters of primary interest, and to evaluate structural effects of ecological and management interest. We built a hierarchical model for daily nest survival to analyze nest data from reintroduced whooping cranes (Grus americana) in the Eastern Migratory Population. This reintroduction effort has been beset by poor reproduction, apparently due primarily to nest abandonment by breeding birds. We used the model to assess support for the hypothesis that nest abandonment is caused by harassment from biting insects. We obtained indices of blood-feeding insect populations based on the spatially interpolated counts of insects captured in carbon dioxide traps. However, insect trapping was not conducted daily, and so we had incomplete information on a temporally variable covariate of interest. We therefore supplemented our nest survival model with a parallel model for estimating the values of the missing insect covariates. We used Bayesian model selection to identify the best predictors of daily nest survival. Our results suggest that the black fly Simulium annulus may be negatively affecting nest survival of reintroduced whooping cranes, with decreasing nest survival as abundance of S. annulus increases. The modeling framework we have developed will be applied in the future to a larger data set to evaluate the

  13. Informatively missing quality of life and unmet needs sex data for immigrant and Anglo-Australian cancer patients and survivors.

    PubMed

    Bell, Melanie L; Butow, Phyllis N; Goldstein, David

    2013-12-01

    Although cancer can seriously affect people's sexual well-being, survivors and patients may be reluctant to answer questions about sex. This reluctance may be stronger for immigrants. This study aimed to investigate missing sex data rates and predictors of missingness in two large studies of immigrants and Anglo-Australian controls with cancer, and to investigate whether those with missing sex data may have worse sexual outcomes than those with complete data. We carried out two studies describing the quality of life (QoL) and unmet needs of Arabic, Chinese and Greek immigrant versus Anglo-Australian cancer survivors (n = 596, recruited from cancer registries) and patients (n = 845). Logistic regression was used to model the probability of having missing sex data in either of the questionnaires. We compared the mean of the unmet sex needs responses of those who had missing QoL sex data (but not needs data) with that of those who had completed both, and vice versa. Missing sex data rates were as high as 65%, with immigrants more likely to skip sex items than Anglo-Australians (p = 0.02 for the registry study, p < 0.0001 for the hospital study). Women, older participants and participants with more advanced disease had increased odds of missingness. There was evidence that data were informatively missing. Additionally, the questionnaire which stated that the sex questions were optional had higher missing data rates. High missing data rates and informatively missing data can lead to biased results. Using questionnaires that state that sex items may be skipped may lead to an underestimation of sexual problems or an overestimation of quality of life.

  14. 'Miss Frances', 'Miss Gail' and 'Miss Sandra' Crapemyrtles

    USDA-ARS?s Scientific Manuscript database

    The Agricultural Research Service, United States Department of Agriculture, announces the release to nurserymen of three new crapemyrtle cultivars named 'Miss Gail', 'Miss Frances', and 'Miss Sandra'. ‘Miss Gail’ resulted from a cross-pollination between ‘Catawba’ as the female parent and ‘Arapaho’ ...

  15. Analysis of longitudinal data from animals where some data are missing in SPSS

    PubMed Central

    Duricki, DA; Soleman, S; Moon, LDF

    2017-01-01

    Testing of therapies for disease or injury often involves analysis of longitudinal data from animals. Modern analytical methods have advantages over conventional methods (particularly where some data are missing) yet are not used widely by pre-clinical researchers. We provide here an easy to use protocol for analysing longitudinal data from animals and present a click-by-click guide for performing suitable analyses using the statistical package SPSS. We guide readers through analysis of a real-life data set obtained when testing a therapy for brain injury (stroke) in elderly rats. We show that repeated measures analysis of covariance failed to detect a treatment effect when a few data points were missing (due to animal drop-out) whereas analysis using an alternative method detected a beneficial effect of treatment; specifically, we demonstrate the superiority of linear models (with various covariance structures) analysed using Restricted Maximum Likelihood estimation (to include all available data). This protocol takes two hours to follow. PMID:27196723

  16. Covariance descriptor fusion for target detection

    NASA Astrophysics Data System (ADS)

    Cukur, Huseyin; Binol, Hamidullah; Bal, Abdullah; Yavuz, Fatih

    2016-05-01

    Target detection is one of the most important topics for military and civilian applications. To address such detection tasks, hyperspectral imaging sensors provide useful image data containing both spatial and spectral information. Target detection faces various challenging scenarios for hyperspectral images. To overcome these challenges, the covariance descriptor presents many advantages. The detection capability of the conventional covariance descriptor technique can be improved by fusion methods. In this paper, hyperspectral bands are clustered according to inter-band correlation. Target detection is then realized by fusing covariance descriptor results based on the band clusters. The proposed combination technique is denoted Covariance Descriptor Fusion (CDF). The efficiency of the CDF is evaluated by applying it to hyperspectral imagery to detect man-made objects. The obtained results show that the CDF presents better performance than the conventional covariance descriptor.

  17. Hypothesis tests for stratified mark-specific proportional hazards models with missing covariates, with application to HIV vaccine efficacy trials.

    PubMed

    Sun, Yanqing; Qi, Li; Yang, Guangren; Gilbert, Peter B

    2018-05-01

    This article develops hypothesis testing procedures for the stratified mark-specific proportional hazards model with missing covariates, where the baseline functions may vary with strata. The mark-specific proportional hazards model has been studied to evaluate mark-specific relative risks, where the mark is the genetic distance of an infecting HIV sequence to an HIV sequence represented inside the vaccine. This research is motivated by the analysis of the RV144 phase 3 HIV vaccine efficacy trial, to understand associations of immune response biomarkers with the mark-specific hazard of HIV infection, where the biomarkers are sampled via a two-phase sampling nested case-control design. We test whether the mark-specific relative risks are unity and how they change with the mark. The developed procedures enable assessment of whether the risk of HIV infection with HIV variants close to or far from the vaccine sequence is modified by immune responses induced by the HIV vaccine; this question is interesting because vaccine protection occurs through immune responses directed at specific HIV sequences. The test statistics are constructed based on augmented inverse probability weighted complete-case estimators. The asymptotic properties and finite-sample performance of the testing procedures are investigated, demonstrating double-robustness and the effectiveness of the predictive auxiliaries in recovering efficiency. The finite-sample performance of the proposed tests is examined through a comprehensive simulation study. The methods are applied to the RV144 trial. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Sensitivity Analysis of Multiple Informant Models When Data Are Not Missing at Random

    ERIC Educational Resources Information Center

    Blozis, Shelley A.; Ge, Xiaojia; Xu, Shu; Natsuaki, Misaki N.; Shaw, Daniel S.; Neiderhiser, Jenae M.; Scaramella, Laura V.; Leve, Leslie D.; Reiss, David

    2013-01-01

    Missing data are common in studies that rely on multiple informant data to evaluate relationships among variables for distinguishable individuals clustered within groups. Estimation of structural equation models using raw data allows for incomplete data, and so all groups can be retained for analysis even if only 1 member of a group contributes…

  19. Impact of Violation of the Missing-at-Random Assumption on Full-Information Maximum Likelihood Method in Multidimensional Adaptive Testing

    ERIC Educational Resources Information Center

    Han, Kyung T.; Guo, Fanmin

    2014-01-01

    The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…

  20. Terrestrial gross carbon dioxide uptake: global distribution and covariation with climate.

    PubMed

    Beer, Christian; Reichstein, Markus; Tomelleri, Enrico; Ciais, Philippe; Jung, Martin; Carvalhais, Nuno; Rödenbeck, Christian; Arain, M Altaf; Baldocchi, Dennis; Bonan, Gordon B; Bondeau, Alberte; Cescatti, Alessandro; Lasslop, Gitta; Lindroth, Anders; Lomas, Mark; Luyssaert, Sebastiaan; Margolis, Hank; Oleson, Keith W; Roupsard, Olivier; Veenendaal, Elmar; Viovy, Nicolas; Williams, Christopher; Woodward, F Ian; Papale, Dario

    2010-08-13

    Terrestrial gross primary production (GPP) is the largest global CO₂ flux driving several ecosystem functions. We provide an observation-based estimate of this flux at 123 ± 8 petagrams of carbon per year (Pg C year⁻¹) using eddy covariance flux data and various diagnostic models. Tropical forests and savannahs account for 60%. GPP over 40% of the vegetated land is associated with precipitation. State-of-the-art process-oriented biosphere models used for climate predictions exhibit a large between-model variation of GPP's latitudinal patterns and show higher spatial correlations between GPP and precipitation, suggesting the existence of missing processes or feedback mechanisms which attenuate the vegetation response to climate. Our estimates of spatially distributed GPP and its covariation with climate can help improve coupled climate-carbon cycle process models.

  1. MISSE-6 hardware

    NASA Image and Video Library

    2009-09-02

    ISS020-E-037371 (1 Sept. 2009) --- A close-up view of a Materials International Space Station Experiment (MISSE-6) on the exterior of the Columbus laboratory is featured in this image photographed by a space walking astronaut during the STS-128 mission’s first session of extravehicular activity (EVA). MISSE collects information on how different materials weather in the environment of space. MISSE was later placed in Space Shuttle Discovery’s payload bay for its return to Earth. A portion of a payload bay door is visible in the background. The blackness of space and Earth’s horizon provide the backdrop for the scene.

  2. Covariance Recovery from a Square Root Information Matrix for Data Association

    DTIC Science & Technology

    2009-07-02

    Data association is one of the core problems of simultaneous localization and mapping (SLAM), and it requires knowledge about the uncertainties of the estimated variables. The square root information matrix enables efficient retrieval of the solution by back-substitution as well as efficient access to marginal covariances.
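The algebra behind covariance recovery can be illustrated concretely: if R is the upper-triangular square-root information matrix, the information matrix is Λ = RᵀR and the covariance is Σ = Λ⁻¹ = R⁻¹R⁻ᵀ; marginal covariances are blocks of Σ. A naive NumPy sketch (the paper's contribution is recovering the required entries efficiently without forming all of Σ; function names here are illustrative):

```python
import numpy as np

def covariance_from_sqrt_info(R):
    """Full covariance from an upper-triangular square-root information
    matrix R, where Lambda = R^T R and Sigma = Lambda^{-1} = R^{-1} R^{-T}."""
    R_inv = np.linalg.inv(R)   # a triangular solve in efficient code
    return R_inv @ R_inv.T

def marginal_covariance(R, idx):
    """Marginal covariance of the variables in `idx`: the matching block
    of the full covariance (the naive route, for illustration)."""
    Sigma = covariance_from_sqrt_info(R)
    return Sigma[np.ix_(idx, idx)]
```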

  3. Covariance Matrix Evaluations for Independent Mass Fission Yields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Terranova, N., E-mail: nicholas.terranova@unibo.it; Serot, O.; Archier, P.

    2015-01-15

    Recent needs for more accurate fission product yields include covariance information to allow improved uncertainty estimations of the parameters used by design codes. The aim of this work is to investigate the possibility to generate more reliable and complete uncertainty information on independent mass fission yields. Mass yields covariances are estimated through a convolution between the multi-Gaussian empirical model based on Brosa's fission modes, which describe the pre-neutron mass yields, and the average prompt neutron multiplicity curve. The covariance generation task has been approached using the Bayesian generalized least squares method through the CONRAD code. Preliminary results on the mass yields variance-covariance matrix will be presented and discussed on physical grounds in the case of the ²³⁵U(n_th, f) and ²³⁹Pu(n_th, f) reactions.

  4. Miss Heroin.

    ERIC Educational Resources Information Center

    Riley, Bernice

    This script, with music, lyrics and dialog, was written especially for youngsters to inform them of the potential dangers of various drugs. The author, who teaches in an elementary school in Harlem, New York, offers Miss Heroin as her answer to the expressed opinion that most drug and alcohol information available is either too simplified and…

  5. Inadequacy of internal covariance estimation for super-sample covariance

    NASA Astrophysics Data System (ADS)

    Lacasa, Fabien; Kunz, Martin

    2017-08-01

    We give an analytical interpretation of how subsample-based internal covariance estimators lead to biased estimates of the covariance, due to underestimating the super-sample covariance (SSC). This includes the jackknife and bootstrap methods as estimators for the full survey area, and subsampling as an estimator of the covariance of subsamples. The limitations of the jackknife covariance have been previously presented in the literature because it is effectively a rescaling of the covariance of the subsample area. However we point out that subsampling is also biased, but for a different reason: the subsamples are not independent, and the corresponding lack of power results in SSC underprediction. We develop the formalism in the case of cluster counts that allows the bias of each covariance estimator to be exactly predicted. We find significant effects for a small-scale area or when a low number of subsamples is used, with auto-redshift biases ranging from 0.4% to 15% for subsampling and from 5% to 75% for jackknife covariance estimates. The cross-redshift covariance is even more affected; biases range from 8% to 25% for subsampling and from 50% to 90% for jackknife. Owing to the redshift evolution of the probe, the covariances cannot be debiased by a simple rescaling factor, and an exact debiasing has the same requirements as the full SSC prediction. These results thus disfavour the use of internal covariance estimators on data itself or a single simulation, leaving analytical prediction and simulations suites as possible SSC predictors.
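The delete-one jackknife criticised above takes only a few lines; for the mean of independent data it reproduces the familiar s²/n, which is precisely why it cannot capture variance contributed by modes larger than the surveyed area (illustration only; the function name is ours):

```python
import numpy as np

def jackknife_var(x):
    """Delete-one jackknife variance estimate of the sample mean."""
    n = len(x)
    loo_means = (x.sum() - x) / (n - 1)              # leave-one-out means
    return (n - 1) / n * ((loo_means - loo_means.mean()) ** 2).sum()
```

For independent data this equals the usual sample variance divided by n; any super-sample contribution to the true covariance is invisible to it.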

  6. Bispectrum supersample covariance

    NASA Astrophysics Data System (ADS)

    Chan, Kwan Chuen; Moradinezhad Dizgah, Azadeh; Noreña, Jorge

    2018-02-01

    Modes with wavelengths larger than the survey window can have significant impact on the covariance within the survey window. The supersample covariance has been recognized as an important source of covariance for the power spectrum on small scales, and it can potentially be important for the bispectrum covariance as well. In this paper, using the response function formalism, we model the supersample covariance contributions to the bispectrum covariance and the cross-covariance between the power spectrum and the bispectrum. The supersample covariances due to the long-wavelength density and tidal perturbations are investigated, and the tidal contribution is a few orders of magnitude smaller than the density one because in configuration space the bispectrum estimator involves angular averaging and the tidal response function is anisotropic. The impact of the super-survey modes is quantified using numerical measurements with periodic box and sub-box setups. For the matter bispectrum, the ratio between the supersample covariance correction and the small-scale covariance—which can be computed using a periodic box—is roughly an order of magnitude smaller than that for the matter power spectrum. This is because for the bispectrum, the small-scale non-Gaussian covariance is significantly larger than that for the power spectrum. For the cross-covariance, the supersample covariance is as important as for the power spectrum covariance. The supersample covariance prediction with the halo model response function is in good agreement with numerical results.

  7. A Two-Stage Approach to Missing Data: Theory and Application to Auxiliary Variables

    ERIC Educational Resources Information Center

    Savalei, Victoria; Bentler, Peter M.

    2009-01-01

    A well-known ad-hoc approach to conducting structural equation modeling with missing data is to obtain a saturated maximum likelihood (ML) estimate of the population covariance matrix and then to use this estimate in the complete data ML fitting function to obtain parameter estimates. This 2-stage (TS) approach is appealing because it minimizes a…

  8. Using Analysis of Covariance (ANCOVA) with Fallible Covariates

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Aguinis, Herman

    2011-01-01

    Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but…

  9. Background Error Covariance Estimation Using Information from a Single Model Trajectory with Application to Ocean Data Assimilation

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele; Kovach, Robin M.; Vernieres, Guillaume

    2014-01-01

    An attractive property of ensemble data assimilation methods is that they provide flow-dependent background error covariance estimates which can be used to update fields of observed variables as well as fields of unobserved model variables. Two methods to estimate background error covariances are introduced which share the above property with ensemble data assimilation methods but do not involve the integration of multiple model trajectories. Instead, all the necessary covariance information is obtained from a single model integration. The Space Adaptive Forecast error Estimation (SAFE) algorithm estimates error covariances from the spatial distribution of model variables within a single state vector. The Flow Adaptive error Statistics from a Time series (FAST) method constructs an ensemble sampled from a moving window along a model trajectory. SAFE and FAST are applied to the assimilation of Argo temperature profiles into version 4.1 of the Modular Ocean Model (MOM4.1) coupled to the GEOS-5 atmospheric model and to the CICE sea ice model. The results are validated against unassimilated Argo salinity data. They show that SAFE and FAST are competitive with the ensemble optimal interpolation (EnOI) used by the Global Modeling and Assimilation Office (GMAO) to produce its ocean analysis. Because of their reduced cost, SAFE and FAST hold promise for high-resolution data assimilation applications.
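The FAST idea, sampling an ensemble from a moving window along a single trajectory, can be sketched minimally (an assumption-laden illustration: the operational scheme involves considerably more machinery than a plain window sample covariance):

```python
import numpy as np

def fast_covariance(trajectory, window):
    """FAST-style covariance estimate: treat the last `window` states of a
    single model trajectory as ensemble members and compute the sample
    covariance of their anomalies about the window mean."""
    members = trajectory[-window:]                 # shape (window, nstate)
    anom = members - members.mean(axis=0)          # anomalies about window mean
    return anom.T @ anom / (window - 1)
```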

  10. Characteristics of patients with missing information on stage: a population-based study of patients diagnosed with colon, lung or breast cancer in England in 2013.

    PubMed

    Di Girolamo, Chiara; Walters, Sarah; Benitez Majano, Sara; Rachet, Bernard; Coleman, Michel P; Njagi, Edmund Njeru; Morris, Melanie

    2018-05-02

    Stage is a key predictor of cancer survival. Complete cancer staging is vital for understanding outcomes at the population level and for monitoring the efficacy of early diagnosis initiatives. Cancer registries usually collect details of the disease extent, but staging information may be missing because a stage was never assigned to a patient or because it was not included in cancer registration records. Missing stage information introduces methodological difficulties for analysis and interpretation of results. We describe the associations between missing stage and the socio-demographic and clinical characteristics of patients diagnosed with colon, lung or breast cancer in England in 2013. We assess how these associations change when completeness is high and administrative issues are assumed to be minimal. We estimate the amount of avoidable missing stage data if the high levels of completeness reached by some Clinical Commissioning Groups (CCGs) were achieved nationally. Individual cancer records were retrieved from the National Cancer Registration and linked to the Routes to Diagnosis and Hospital Episode Statistics datasets to obtain additional clinical information. We used multivariable beta-binomial regression models to estimate the strength of the association between socio-demographic and clinical characteristics of patients and missing stage, and to derive the amount of avoidable missing stage. Multivariable modelling showed that old age was associated with missing stage irrespective of cancer site and independent of comorbidity score, short-term mortality and patient characteristics. This remained true for patients in the CCGs with high completeness. Applying the results from these CCGs to the whole cohort showed that approximately 70% of missing stage information was potentially avoidable. Missing stage was more frequent in older patients, including those residing in CCGs with high completeness. This disadvantage for older patients was not explained fully by the

  11. Relying on Your Own Best Judgment: Imputing Values to Missing Information in Decision Making.

    ERIC Educational Resources Information Center

    Johnson, Richard D.; And Others

    Processes involved in making estimates of the value of missing information that could help in a decision making process were studied. Hypothetical purchases of ground beef were selected for the study as such purchases have the desirable property of quantifying both the price and quality. A total of 150 students at the University of Iowa rated the…

  12. Should multiple imputation be the method of choice for handling missing data in randomized trials?

    PubMed Central

    Sullivan, Thomas R; White, Ian R; Salter, Amy B; Ryan, Philip; Lee, Katherine J

    2016-01-01

    The use of multiple imputation has increased markedly in recent years, and journal reviewers may expect to see multiple imputation used to handle missing data. However, in randomized trials, where treatment group is always observed and independent of baseline covariates, other approaches may be preferable. Using data simulation, we evaluated multiple imputation, performed both overall and separately by randomized group, across a range of commonly encountered scenarios. We considered both missing outcome and missing baseline data, with missing outcome data induced under missing at random mechanisms. Provided the analysis model was correctly specified, multiple imputation produced unbiased treatment effect estimates, but alternative unbiased approaches were often more efficient. When the analysis model overlooked an interaction effect involving randomized group, multiple imputation produced biased estimates of the average treatment effect when applied to missing outcome data, unless imputation was performed separately by randomized group. Based on these results, we conclude that multiple imputation should not be seen as the only acceptable way to handle missing data in randomized trials. In settings where multiple imputation is adopted, we recommend that imputation is carried out separately by randomized group. PMID:28034175
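
The recommendation to impute separately by randomized group can be illustrated with a small simulation. This is a hedged sketch (a simple normal-linear imputation model drawn per arm, with crude pooling by averaging), not the paper's own evaluation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trial: outcome depends on baseline x and on treatment arm, with an
# arm-by-x interaction, and 30% of outcomes missing at random
n = 2000
arm = rng.integers(0, 2, n)                 # randomized group
x = rng.normal(size=n)                      # baseline covariate
y = 1.0 + 0.5 * x + 2.0 * arm + 1.0 * arm * x + rng.normal(size=n)
miss = rng.random(n) < 0.3
y_obs = np.where(miss, np.nan, y)

def impute_within_group(y_obs, x, arm, m=20, rng=rng):
    """Draw m imputed datasets, fitting the imputation model separately per arm."""
    imputations = []
    for _ in range(m):
        y_imp = y_obs.copy()
        for g in (0, 1):
            idx = arm == g
            obs = idx & ~np.isnan(y_obs)
            mis = idx & np.isnan(y_obs)
            # OLS of y on x within this arm, then draw from the fitted model
            X = np.column_stack([np.ones(obs.sum()), x[obs]])
            beta, res, *_ = np.linalg.lstsq(X, y_obs[obs], rcond=None)
            sigma = np.sqrt(res[0] / (obs.sum() - 2))
            pred = beta[0] + beta[1] * x[mis]
            y_imp[mis] = pred + rng.normal(scale=sigma, size=mis.sum())
        imputations.append(y_imp)
    return imputations

# Pool the treatment-effect estimate (difference in arm means) over imputations
effects = [y_imp[arm == 1].mean() - y_imp[arm == 0].mean()
           for y_imp in impute_within_group(y_obs, x, arm)]
pooled = float(np.mean(effects))
```

Fitting the imputation model within each arm lets the outcome-covariate slope differ by arm, so an interaction that the analysis model overlooks does not bias the imputed values.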

  14. Help for Finding Missing Children.

    ERIC Educational Resources Information Center

    McCormick, Kathleen

    1984-01-01

    Efforts to locate missing children have expanded from a federal law allowing for entry of information into an F.B.I. computer system to companion bills before Congress for establishing a national missing child clearinghouse and a Justice Department center to help in conducting searches. Private organizations are also involved. (KS)

  15. Storage and computationally efficient permutations of factorized covariance and square-root information arrays

    NASA Technical Reports Server (NTRS)

    Muellerschoen, R. J.

    1988-01-01

    A unified method to permute vector-stored upper-triangular diagonal (UD) factorized covariance arrays and vector-stored upper-triangular square-root information (SRIF) arrays is presented. The method involves cyclic permutation of the rows and columns of the arrays and retriangularization with fast Givens rotations or slow Givens reflections. Minimal computation is performed, and only a one-dimensional scratch array is required. To make the method efficient for large arrays on a virtual-memory machine, computations are arranged so as to avoid expensive paging faults. This method is potentially important for processing large volumes of radio metric data in the Deep Space Network.
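
The core idea (permute rows and columns, then restore triangularity with an orthogonal transform so the underlying information matrix is unchanged) can be sketched with NumPy, using a full QR factorization in place of the fast Givens rotations described above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Upper-triangular square-root information array R; the information
# matrix is Lambda = R'R
R = np.linalg.qr(rng.normal(size=(n, n)))[1]

# Cyclically permute states 2..4 (move state 2 after state 4)
perm = [0, 1, 3, 4, 2, 5]
R_perm = R[:, perm]               # column permutation destroys triangularity

# Retriangularize with an orthogonal transform (QR here, in place of the
# Givens rotations the paper uses for efficiency)
R_new = np.linalg.qr(R_perm)[1]

# Information is preserved: R_new'R_new equals Lambda with rows and
# columns permuted consistently
ok = np.allclose(R_new.T @ R_new, (R.T @ R)[np.ix_(perm, perm)])
```

Because the retriangularizing transform is orthogonal, it cancels in R'R, which is why the reordered factor still represents the same estimate.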

  16. Integrative missing value estimation for microarray data.

    PubMed

    Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine

    2006-10-12

    Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain less than eight samples. We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm by up to 15% in our benchmark tests. We demonstrated that the order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS and are especially good for imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.
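
A much-simplified neighbour-gene imputation (not iMISS itself, which integrates multiple reference datasets) can be sketched as follows: for each gene with missing entries, average the values of its most-correlated neighbour genes:

```python
import numpy as np

def neighbor_impute(X, k=5):
    """Fill NaNs in a genes-by-samples matrix X by averaging, per missing
    entry, the values of the k genes most correlated with the target gene
    (correlation computed over the target gene's observed samples)."""
    X = X.copy().astype(float)
    col_mean = np.nanmean(X, axis=0)
    for i in range(X.shape[0]):
        mis = np.isnan(X[i])
        if not mis.any():
            continue
        sims = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            common = ~mis & ~np.isnan(X[j])
            if common.sum() < 3:
                continue
            c = np.corrcoef(X[i, common], X[j, common])[0, 1]
            if np.isfinite(c):
                sims.append((abs(c), j))
        sims.sort(reverse=True)
        neighbours = [j for _, j in sims[:k]]
        for s in np.where(mis)[0]:
            vals = [X[j, s] for j in neighbours if not np.isnan(X[j, s])]
            X[i, s] = np.mean(vals) if vals else col_mean[s]
    return X

# Demo: eight strongly co-expressed genes, one missing entry
rng = np.random.default_rng(9)
base = rng.normal(size=20)
X = np.array([base + 0.05 * rng.normal(size=20) for _ in range(8)])
X[0, 0] = np.nan
filled = neighbor_impute(X)
```

iMISS extends this local-neighbour idea by drawing the neighbour list consistently from several reference datasets rather than from the target dataset alone.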

  17. On the role of covariance information for GRACE K-band observations in the Celestial Mechanics Approach

    NASA Astrophysics Data System (ADS)

    Bentel, Katrin; Meyer, Ulrich; Arnold, Daniel; Jean, Yoomin; Jäggi, Adrian

    2017-04-01

    The Astronomical Institute at the University of Bern (AIUB) derives static and time-variable gravity fields by means of the Celestial Mechanics Approach (CMA) from GRACE (level 1B) data. This approach makes use of the close link between orbit and gravity field determination. GPS-derived kinematic GRACE orbit positions, inter-satellite K-band observations, which are the core observations of GRACE, and accelerometer data are combined to rigorously estimate orbit and spherical harmonic gravity field coefficients in one adjustment step. Pseudo-stochastic orbit parameters are set up to absorb unmodeled noise. The K-band range measurements in along-track direction lead to a much higher correlation of the observations in this direction compared to the other directions and thus, to north-south stripes in the unconstrained gravity field solutions, so-called correlated errors. By using a full covariance matrix for the K-band observations the correlation can be taken into account. One possibility is to derive correlation information from post-processing K-band residuals. This is then used in a second iteration step to derive an improved gravity field solution. We study the effects of pre-defined covariance matrices and residual-derived covariance matrices on the final gravity field product with the CMA.
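
Accounting for correlated observations through a full covariance matrix amounts to generalized least squares. A toy NumPy sketch, with an AR(1)-style noise covariance standing in for the strong along-track correlation of K-band ranges:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 3

# AR(1)-style correlated observation noise along the track
rho = 0.8
idx = np.arange(n)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])   # full noise covariance

A = rng.normal(size=(n, p))                          # design/partials matrix
x_true = np.array([1.0, -2.0, 0.5])
L = np.linalg.cholesky(Sigma)
y = A @ x_true + L @ rng.normal(size=n)

# Ordinary least squares ignores the correlation structure
x_ols = np.linalg.lstsq(A, y, rcond=None)[0]

# Generalized least squares: whiten both sides with Sigma^(-1/2) = L^(-1)
W = np.linalg.inv(L)
x_gls = np.linalg.lstsq(W @ A, W @ y, rcond=None)[0]
```

Whitening with the (pre-defined or residual-derived) covariance downweights the redundant, correlated information, which is the mechanism by which the full covariance matrix suppresses the north-south striping described above.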

  18. On using summary statistics from an external calibration sample to correct for covariate measurement error.

    PubMed

    Guo, Ying; Little, Roderick J; McConnell, Daniel S

    2012-01-01

    Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.

  19. Potential value of health information exchange for people with epilepsy: crossover patterns and missing clinical data.

    PubMed

    Grinspan, Zachary M; Abramson, Erika L; Banerjee, Samprit; Kern, Lisa M; Kaushal, Rainu; Shapiro, Jason S

    2013-01-01

    For people with epilepsy, the potential value of health information exchange (HIE) is unknown. We reviewed two years of clinical encounters for 8055 people with epilepsy from seven Manhattan hospitals. We created network graphs illustrating crossover among these hospitals for multiple encounter types, and calculated a novel metric of care fragmentation: "encounters at risk for missing clinical data." Given two hospitals, a median of 109 [range 46 - 588] patients with epilepsy had visited both. Due to this crossover, recent, relevant clinical data may frequently be missing at the time of care (44.8% of ED encounters, 34.5% of inpatient, 24.9% of outpatient, and 23.2% of radiology encounters). Though a smaller percentage of outpatient encounters were at risk for missing data than ED encounters, the absolute number of outpatient encounters at risk was three times higher (14,579 vs. 5041). People with epilepsy may benefit from HIE. Future HIE initiatives should prioritize outpatient access.

  20. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey.

    PubMed

    Peyre, Hugo; Leplège, Alain; Coste, Joël

    2011-03-01

    Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias <2%) in all studied situations. Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
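
The personal mean score referred to above is simple to implement: a respondent's scale score is the mean of their answered items, subject to a minimum-response rule. A minimal sketch:

```python
import numpy as np

def personal_mean_score(items, min_answered=0.5):
    """Scale score = mean of a respondent's answered items, provided at
    least a fraction `min_answered` of the items were answered (a
    half-scale rule); otherwise the score stays missing."""
    items = np.asarray(items, dtype=float)
    answered = ~np.isnan(items)
    counts = answered.sum(axis=1)
    sums = np.where(answered, items, 0.0).sum(axis=1)
    enough = counts >= min_answered * items.shape[1]
    return np.where(enough, sums / np.maximum(counts, 1), np.nan)

# Respondent 1 skipped one item of four; respondent 2 skipped three
scores = personal_mean_score(np.array([[4.0, 5.0, np.nan, 4.0],
                                       [np.nan, np.nan, np.nan, 2.0]]))
```

The half-scale threshold here is illustrative; scoring manuals for specific instruments such as the SF-36 define their own rules.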

  1. Managing distance and covariate information with point-based clustering.

    PubMed

    Whigham, Peter A; de Graaf, Brandon; Srivastava, Rashmi; Glue, Paul

    2016-09-01

    Geographic perspectives of disease and the human condition often involve point-based observations and questions of clustering or dispersion within a spatial context. These problems involve a finite set of point observations and are constrained by a larger, but finite, set of locations where the observations could occur. Developing a rigorous method for pattern analysis in this context requires handling spatial covariates, a method for constrained finite spatial clustering, and addressing bias in geographic distance measures. An approach, based on Ripley's K and applied to the problem of clustering with deliberate self-harm (DSH), is presented. Point-based Monte-Carlo simulation of Ripley's K, accounting for socio-economic deprivation and sources of distance measurement bias, was developed to estimate clustering of DSH at a range of spatial scales. A rotated Minkowski L1 distance metric allowed variation in physical distance and clustering to be assessed. Self-harm data were derived from an audit of 2 years' emergency hospital presentations (n = 136) in a New Zealand town (population ~50,000). The study area was defined by residential (housing) land parcels representing a finite set of possible point addresses. Area-based deprivation was spatially correlated. Accounting for deprivation and distance bias showed evidence for clustering of DSH for spatial scales up to 500 m with a one-sided 95% CI, suggesting that social contagion may be present for this urban cohort. Many problems involve finite locations in geographic space that require estimates of distance-based clustering at many scales. A Monte-Carlo approach to Ripley's K, incorporating covariates and models for distance bias, is crucial when assessing health-related clustering. The case study showed that social network structure defined at the neighbourhood level may account for aspects of neighbourhood clustering of DSH. Accounting for covariate measures that exhibit spatial clustering, such as deprivation
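
The constrained Monte-Carlo logic (compare an observed clustering statistic against random relabelings over the finite set of candidate addresses) can be sketched as follows. This toy version uses a plain Euclidean distance, a single spatial scale, and no covariate adjustment, unlike the study above:

```python
import numpy as np

rng = np.random.default_rng(7)

def k_stat(points, r):
    """Mean number of other events within distance r of each event
    (an unnormalised Ripley's-K-style statistic for a finite point set)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return ((d > 0) & (d < r)).sum() / len(points)

# Finite set of candidate residential addresses; cases occur at a subset
addresses = rng.random((2000, 2))
# Simulated clustering: acceptance probability decays away from a hotspot
w = np.exp(-np.linalg.norm(addresses - 0.5, axis=1) ** 2 / 0.01)
cases = addresses[rng.random(2000) < 0.8 * w]

r = 0.1
k_obs = k_stat(cases, r)

# Constrained Monte-Carlo null: relabel the same number of cases
# uniformly over the candidate addresses, 199 times
k_null = np.array([
    k_stat(addresses[rng.choice(2000, size=len(cases), replace=False)], r)
    for _ in range(199)
])
p_upper = (1 + (k_null >= k_obs).sum()) / 200
```

Sampling the null from the candidate addresses, rather than from a homogeneous Poisson process, is what makes the test respect the finite set of possible locations; the study additionally weights the relabeling by deprivation and uses a rotated Minkowski metric.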

  2. Estimating hazard ratios in cohort data with missing disease information due to death.

    PubMed

    Binder, Nadine; Herrnböck, Anne-Sophie; Schumacher, Martin

    2017-03-01

    In clinical and epidemiological studies information on the primary outcome of interest, that is, the disease status, is usually collected at a limited number of follow-up visits. The disease status can often only be retrieved retrospectively in individuals who are alive at follow-up, but will be missing for those who died before. Right-censoring the death cases at the last visit (ad-hoc analysis) yields biased hazard ratio estimates of a potential risk factor, and the bias can be substantial and occur in either direction. In this work, we investigate three different approaches that use the same likelihood contributions derived from an illness-death multistate model in order to more adequately estimate the hazard ratio by including the death cases into the analysis: a parametric approach, a penalized likelihood approach, and an imputation-based approach. We investigate to which extent these approaches allow for an unbiased regression analysis by evaluating their performance in simulation studies and on a real data example. In doing so, we use the full cohort with complete illness-death data as reference and artificially induce missing information due to death by setting discrete follow-up visits. Compared to an ad-hoc analysis, all considered approaches provide less biased or even unbiased results, depending on the situation studied. In the real data example, the parametric approach is seen to be too restrictive, whereas the imputation-based approach could almost reconstruct the original event history information. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Imputing data that are missing at high rates using a boosting algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cauthen, Katherine Regina; Lambert, Gregory; Ray, Jaideep

    Traditional multiple imputation approaches may perform poorly for datasets with high rates of missingness unless a large number of imputations m is used. This paper implements an alternative machine learning-based approach to imputing data that are missing at high rates. Here, we use boosting to create a strong learner from a weak learner fitted to a dataset missing many observations. This approach may be applied to a variety of types of learners (models). The approach is demonstrated by application to a spatiotemporal dataset for predicting dengue outbreaks in India from meteorological covariates. A Bayesian spatiotemporal CAR model is boosted to produce imputations, and the overall RMSE from a k-fold cross-validation is used to assess imputation accuracy.
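
A from-scratch sketch of the boosting idea, with a one-split stump as the weak learner and squared-error gradient boosting used to fill in a heavily missing variable (the paper boosts a Bayesian spatiotemporal CAR model instead):

```python
import numpy as np

def fit_stump(x, r):
    """Best single-threshold split of residuals r on feature x (squared error)."""
    order = np.argsort(x)
    xs, rs = x[order], r[order]
    csum = np.cumsum(rs)
    total, n = csum[-1], len(rs)
    best_gain, best = -np.inf, None
    for i in range(1, n):
        left, right = csum[i - 1], total - csum[i - 1]
        gain = left**2 / i + right**2 / (n - i)   # SSE reduction, up to a constant
        if gain > best_gain:
            best_gain = gain
            best = ((xs[i - 1] + xs[i]) / 2, left / i, right / (n - i))
    return best

def boost_impute(X, y, mask, rounds=200, lr=0.3):
    """Fill y[mask] by L2 gradient boosting of one-split stumps fitted on X."""
    obs = ~mask
    f0 = y[obs].mean()
    pred_obs = np.full(obs.sum(), f0)
    pred_mis = np.full(mask.sum(), f0)
    for _ in range(rounds):
        resid = y[obs] - pred_obs
        best = None
        for j in range(X.shape[1]):               # greedy search over features
            thr, lv, rv = fit_stump(X[obs, j], resid)
            sse = ((resid - np.where(X[obs, j] <= thr, lv, rv)) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
        _, j, thr, lv, rv = best
        pred_obs = pred_obs + lr * np.where(X[obs, j] <= thr, lv, rv)
        pred_mis = pred_mis + lr * np.where(X[mask, j] <= thr, lv, rv)
    return pred_mis

# Demo: a piecewise-constant signal with 40% of values missing
rng = np.random.default_rng(10)
X = rng.random((600, 2)) * 3.0
y = 2.0 * (X[:, 0] > 1.5) + 1.0 * (X[:, 1] > 1.0)
mask = rng.random(600) < 0.4
imputed = boost_impute(X, y, mask)
rmse = np.sqrt(np.mean((imputed - y[mask]) ** 2))
```

Each boosting round fits the weak learner to the current residuals, so the ensemble can impute accurately even when the fraction of missing observations is large.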

  4. What Is Missing in Counseling Research? Reporting Missing Data

    ERIC Educational Resources Information Center

    Sterner, William R.

    2011-01-01

    Missing data have long been problematic in quantitative research. Despite the statistical and methodological advances made over the past 3 decades, counseling researchers fail to provide adequate information on this phenomenon. Interpreting the complex statistical procedures and esoteric language seems to be a contributing factor. An overview of…

  5. One-stage individual participant data meta-analysis models: estimation of treatment-covariate interactions must avoid ecological bias by separating out within-trial and across-trial information.

    PubMed

    Hua, Hairui; Burke, Danielle L; Crowther, Michael J; Ensor, Joie; Tudur Smith, Catrin; Riley, Richard D

    2017-02-28

    Stratified medicine utilizes individual-level covariates that are associated with a differential treatment effect, also known as treatment-covariate interactions. When multiple trials are available, meta-analysis is used to help detect true treatment-covariate interactions by combining their data. Meta-regression of trial-level information is prone to low power and ecological bias, and therefore, individual participant data (IPD) meta-analyses are preferable to examine interactions utilizing individual-level information. However, one-stage IPD models are often wrongly specified, such that interactions are based on amalgamating within- and across-trial information. We compare, through simulations and an applied example, fixed-effect and random-effects models for a one-stage IPD meta-analysis of time-to-event data where the goal is to estimate a treatment-covariate interaction. We show that it is crucial to centre patient-level covariates by their mean value in each trial, in order to separate out within-trial and across-trial information. Otherwise, bias and coverage of interaction estimates may be adversely affected, leading to potentially erroneous conclusions driven by ecological bias. We revisit an IPD meta-analysis of five epilepsy trials and examine age as a treatment effect modifier. The interaction is -0.011 (95% CI: -0.019 to -0.003; p = 0.004), and thus highly significant, when amalgamating within-trial and across-trial information. However, when separating within-trial from across-trial information, the interaction is -0.007 (95% CI: -0.019 to 0.005; p = 0.22), and thus its magnitude and statistical significance are greatly reduced. We recommend that meta-analysts should only use within-trial information to examine individual predictors of treatment effect and that one-stage IPD models should separate within-trial from across-trial information to avoid ecological bias. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd
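
The centring recommendation can be demonstrated with a small simulated IPD meta-analysis in which the within-trial interaction and the across-trial association deliberately differ (a fixed-effect sketch with made-up numbers, not the paper's time-to-event models):

```python
import numpy as np

rng = np.random.default_rng(3)
trials, n = 10, 200
beta_within, beta_across = -0.2, 0.5        # deliberately different

rows = []
for t in range(trials):
    mu = 40.0 + 3.0 * t                      # trial-specific mean age
    age = rng.normal(mu, 5.0, n)
    treat = rng.integers(0, 2, n).astype(float)
    y = (0.2 * t + 1.0 * treat
         + beta_within * treat * (age - mu)  # within-trial interaction
         + beta_across * treat * mu          # across-trial (ecological) association
         + rng.normal(0.0, 1.0, n))
    rows.append(np.column_stack([np.full(n, t, dtype=float), age, treat, y]))

trial, age, treat, y = np.vstack(rows).T
dummies = (trial[:, None] == np.arange(trials)).astype(float)  # trial intercepts
mu_t = np.array([age[trial == t].mean() for t in range(trials)])[trial.astype(int)]

# (a) amalgamated: a single treat*age interaction mixes the two sources
Xa = np.column_stack([dummies, treat, treat * age])
amalg = np.linalg.lstsq(Xa, y, rcond=None)[0][-1]

# (b) separated: centre age within each trial and add an across-trial term
Xb = np.column_stack([dummies, treat, treat * (age - mu_t), treat * mu_t])
within = np.linalg.lstsq(Xb, y, rcond=None)[0][-2]
```

Only `within` recovers the patient-level interaction; `amalg` is pulled toward the across-trial slope, which is the ecological bias the authors warn against.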

  6. Auto covariance computer

    NASA Technical Reports Server (NTRS)

    Hepner, T. E.; Meyers, J. F. (Inventor)

    1985-01-01

    A laser velocimeter covariance processor which calculates the auto covariance and cross covariance functions for a turbulent flow field based on Poisson sampled measurements in time from a laser velocimeter is described. The device will process a block of data that is up to 4096 data points in length and return a 512 point covariance function with 48-bit resolution along with a 512 point histogram of the interarrival times which is used to normalize the covariance function. The device is designed to interface and be controlled by a minicomputer from which the data is received and the results returned. A typical 4096 point computation takes approximately 1.5 seconds to receive the data, compute the covariance function, and return the results to the computer.

  7. A Strategy for Making Decisions and Evaluating Alternative Juvenile Offender Treatment Programs: Compensating for Missing Information.

    ERIC Educational Resources Information Center

    Roberts, Albert R.; Schervish, Phillip

    1988-01-01

    National survey data on the cost-effectiveness and cost-benefits of 11 different juvenile offender treatment program types are reviewed. Best-worst-midpoint, threshold, and convergency analyses are applied to these studies. Meaningful ways of dealing with missing information when evaluating alternative juvenile justice policies and programs are…

  8. Comparing the accuracy and precision of three techniques used for estimating missing landmarks when reconstructing fossil hominin crania.

    PubMed

    Neeser, Rudolph; Ackermann, Rebecca Rogers; Gain, James

    2009-09-01

    Various methodological approaches have been used for reconstructing fossil hominin remains in order to increase sample sizes and to better understand morphological variation. Among these, morphometric quantitative techniques for reconstruction are increasingly common. Here we compare the accuracy of three approaches--mean substitution, thin plate splines, and multiple linear regression--for estimating missing landmarks of damaged fossil specimens. Comparisons are made varying the number of missing landmarks, sample sizes, and the reference species of the population used to perform the estimation. The testing is performed on landmark data from individuals of Homo sapiens, Pan troglodytes and Gorilla gorilla, and nine hominin fossil specimens. Results suggest that when a small, same-species fossil reference sample is available to guide reconstructions, thin plate spline approaches perform best. However, if no such sample is available (or if the species of the damaged individual is uncertain), estimates of missing morphology based on a single individual (or even a small sample) of close taxonomic affinity are less accurate than those based on a large sample of individuals drawn from more distantly related extant populations using a technique (such as a regression method) able to leverage the information (e.g., variation/covariation patterning) contained in this large sample. Thin plate splines also show an unexpectedly large amount of error in estimating landmarks, especially over large areas. Recommendations are made for estimating missing landmarks under various scenarios. Copyright 2009 Wiley-Liss, Inc.

  9. Instrumental Variable Methods for Continuous Outcomes That Accommodate Nonignorable Missing Baseline Values.

    PubMed

    Ertefaie, Ashkan; Flory, James H; Hennessy, Sean; Small, Dylan S

    2017-06-15

    Instrumental variable (IV) methods provide unbiased treatment effect estimation in the presence of unmeasured confounders under certain assumptions. To provide valid estimates of treatment effect, treatment effect confounders that are associated with the IV (IV-confounders) must be included in the analysis, and not including observations with missing values may lead to bias. Missing covariate data are particularly problematic when the probability that a value is missing is related to the value itself, which is known as nonignorable missingness. In such cases, imputation-based methods are biased. Using health-care provider preference as an IV method, we propose a 2-step procedure with which to estimate a valid treatment effect in the presence of baseline variables with nonignorable missing values. First, the provider preference IV value is estimated by performing a complete-case analysis using a random-effects model that includes IV-confounders. Second, the treatment effect is estimated using a 2-stage least squares IV approach that excludes IV-confounders with missing values. Simulation results are presented, and the method is applied to an analysis comparing the effects of sulfonylureas versus metformin on body mass index, where the variables baseline body mass index and glycosylated hemoglobin have missing values. Our result supports the association of sulfonylureas with weight gain. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
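
The second step of the procedure is standard two-stage least squares. A generic 2SLS sketch with a simulated unmeasured confounder (not the paper's provider-preference construction or its random-effects first step):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000

# Unmeasured confounder u affects both treatment and outcome; the
# instrument z (e.g. provider preference) affects treatment only
u = rng.normal(size=n)
z = rng.normal(size=n)
treat = 0.8 * z + 0.7 * u + rng.normal(size=n)
y = 1.5 * treat + 1.0 * u + rng.normal(size=n)   # true causal effect: 1.5

def ols(x, y):
    X = np.column_stack([np.ones(len(y)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_ols = ols(treat, y)[1]                 # confounded: biased upward here

# 2SLS: stage 1 predicts treatment from the instrument alone;
# stage 2 regresses the outcome on the stage-1 prediction
treat_hat = np.column_stack([np.ones(n), z]) @ ols(z, treat)
b_2sls = ols(treat_hat, y)[1]
```

Because the stage-1 prediction varies only through the instrument, which is independent of the confounder, the stage-2 coefficient is protected from the confounding that biases the naive regression.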

  10. Best practices for missing data management in counseling psychology.

    PubMed

    Schlomer, Gabriel L; Bauman, Sheri; Card, Noel A

    2010-01-01

    This article urges counseling psychology researchers to recognize and report how missing data are handled, because consumers of research cannot accurately interpret findings without knowing the amount and pattern of missing data or the strategies that were used to handle those data. Patterns of missing data are reviewed, and some of the common strategies for dealing with them are described. The authors provide an illustration in which data were simulated and evaluate 3 methods of handling missing data: mean substitution, multiple imputation, and full information maximum likelihood. Results suggest that mean substitution is a poor method for handling missing data, whereas both multiple imputation and full information maximum likelihood are recommended alternatives to this approach. The authors suggest that researchers fully consider and report the amount and pattern of missing data and the strategy for handling those data in counseling psychology research and that editors advise researchers of this expectation.

  11. Are your covariates under control? How normalization can re-introduce covariate effects.

    PubMed

    Pain, Oliver; Dudbridge, Frank; Ronald, Angelica

    2018-04-30

    Many statistical tests rely on the assumption that the residuals of a model are normally distributed. Rank-based inverse normal transformation (INT) of the dependent variable is one of the most popular approaches to satisfy the normality assumption. When covariates are included in the analysis, a common approach is to first adjust for the covariates and then normalize the residuals. This study investigated the effect of regressing covariates against the dependent variable and then applying rank-based INT to the residuals. The correlation between the dependent variable and covariates at each stage of processing was assessed. An alternative approach was tested in which rank-based INT was applied to the dependent variable before regressing covariates. Analyses based on both simulated and real data examples demonstrated that applying rank-based INT to the dependent variable residuals after regressing out covariates re-introduces a linear correlation between the dependent variable and covariates, increasing type-I errors and reducing power. On the other hand, when rank-based INT was applied prior to controlling for covariate effects, residuals were normally distributed and linearly uncorrelated with covariates. This latter approach is therefore recommended in situations where normality of the dependent variable is required.
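
The order-of-operations effect is easy to reproduce. In the illustrative simulation below, skewed residuals in one covariate group make INT-after-residualizing reintroduce a correlation with the covariate, whereas INT-before-residualizing leaves the residuals exactly uncorrelated with it (Blom offset assumed):

```python
import numpy as np
from scipy.special import ndtri            # inverse standard-normal CDF

def rank_int(x, c=3.0 / 8.0):
    """Rank-based inverse normal transformation (Blom offset; no ties assumed)."""
    ranks = x.argsort().argsort() + 1
    return ndtri((ranks - c) / (len(x) - 2.0 * c + 1.0))

def residualise(y, z):
    Z = np.column_stack([np.ones(len(z)), z])
    return y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]

rng = np.random.default_rng(11)
n = 8000
z = rng.integers(0, 2, n).astype(float)    # a binary covariate
noise = np.where(z == 1.0,
                 rng.exponential(1.0, n) - 1.0,   # skewed residuals in one group
                 rng.normal(0.0, 0.5, n))
y = 1.0 * z + noise

# (a) regress out the covariate, then INT the residuals
corr_a = np.corrcoef(rank_int(residualise(y, z)), z)[0, 1]

# (b) INT the dependent variable first, then regress out the covariate
corr_b = np.corrcoef(residualise(rank_int(y), z), z)[0, 1]
```

OLS residuals are orthogonal to the regressors by construction, so `corr_b` is zero to machine precision, while the nonlinear rank transform in (a) breaks that orthogonality.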

  12. Improving chemical species tomography of turbulent flows using covariance estimation.

    PubMed

    Grauer, Samuel J; Hadwin, Paul J; Daun, Kyle J

    2017-05-01

    Chemical species tomography (CST) experiments can be divided into limited-data and full-rank cases. Both require solving ill-posed inverse problems, and thus the measurement data must be supplemented with prior information to carry out reconstructions. The Bayesian framework formalizes the role of additive information, expressed as the mean and covariance of a joint-normal prior probability density function. We present techniques for estimating the spatial covariance of a flow under limited-data and full-rank conditions. Our results show that incorporating a covariance estimate into CST reconstruction via a Bayesian prior increases the accuracy of instantaneous estimates. Improvements are especially dramatic in real-time limited-data CST, which is directly applicable to many industrially relevant experiments.
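
Incorporating a spatial covariance estimate as a Bayesian prior reduces, in the limited-data case, to a standard MAP linear inversion. A toy sketch with an assumed exponential covariance kernel on an 8x8 grid (the ray matrix is a random stand-in for real path integrals):

```python
import numpy as np

rng = np.random.default_rng(4)
n_pix, n_rays = 64, 20            # limited data: fewer rays than unknowns

# Spatially correlated prior covariance over an 8x8 pixel grid
coords = np.array([(i, j) for i in range(8) for j in range(8)], dtype=float)
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma_x = np.exp(-d / 2.0)        # assumed exponential covariance kernel

# Ground-truth smooth field drawn from the prior, plus noisy ray sums
x_true = 3.0 + np.linalg.cholesky(Sigma_x + 1e-9 * np.eye(n_pix)) @ rng.normal(size=n_pix)
A = rng.random((n_rays, n_pix))   # stand-in for the ray-path matrix
sigma_e = 0.1
b = A @ x_true + sigma_e * rng.normal(size=n_rays)

# MAP estimate: x = mu + (A'Se^-1 A + Sx^-1)^-1 A'Se^-1 (b - A mu)
mu = np.full(n_pix, 3.0)          # prior mean concentration
Se_inv = np.eye(n_rays) / sigma_e**2
H = A.T @ Se_inv @ A + np.linalg.inv(Sigma_x)
x_map = mu + np.linalg.solve(H, A.T @ Se_inv @ (b - A @ mu))

# For comparison: a naive identity-covariance (ridge-like) prior
H0 = A.T @ Se_inv @ A + np.eye(n_pix)
x_ridge = mu + np.linalg.solve(H0, A.T @ Se_inv @ (b - A @ mu))
err_map, err_ridge = (np.linalg.norm(x_map - x_true),
                      np.linalg.norm(x_ridge - x_true))
```

When the prior covariance matches the flow's actual spatial correlation, the MAP reconstruction typically tracks the true field more closely than the uninformative ridge-like prior, which is the accuracy gain reported above.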

  13. Realistic Covariance Prediction for the Earth Science Constellation

    NASA Technical Reports Server (NTRS)

    Duncan, Matthew; Long, Anne

    2006-01-01

    Routine satellite operations for the Earth Science Constellation (ESC) include collision risk assessment between members of the constellation and other orbiting space objects. One component of the risk assessment process is computing the collision probability between two space objects. The collision probability is computed using Monte Carlo techniques as well as by numerically integrating relative state probability density functions. Each algorithm takes as inputs state vector and state vector uncertainty information for both objects. The state vector uncertainty information is expressed in terms of a covariance matrix. The collision probability computation is only as good as the inputs. Therefore, to obtain a collision calculation that is a useful decision-making metric, realistic covariance matrices must be used as inputs to the calculation. This paper describes the process used by the NASA/Goddard Space Flight Center's Earth Science Mission Operations Project to generate realistic covariance predictions for three of the Earth Science Constellation satellites: Aqua, Aura and Terra.
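
The Monte Carlo component of the risk assessment can be sketched in a few lines: sample relative positions from the combined positional covariance and count samples falling inside the hard-body sphere (toy numbers, positions only; operational assessments use full propagated state covariances):

```python
import numpy as np

rng = np.random.default_rng(5)

# Relative position at closest approach (metres) and the combined
# positional covariance: the sum of the two objects' covariances
mu_rel = np.array([10.0, -5.0, 5.0])
P1 = np.diag([100.0, 64.0, 36.0])
P2 = np.diag([100.0, 81.0, 49.0])
P = P1 + P2
hard_body_radius = 20.0           # combined hard-body radius (metres)

# Monte Carlo: sample relative positions, count those whose miss
# distance lies inside the combined hard-body sphere
N = 200_000
L = np.linalg.cholesky(P)
samples = mu_rel + rng.normal(size=(N, 3)) @ L.T
p_collision = float((np.linalg.norm(samples, axis=1) < hard_body_radius).mean())
```

The estimate is only as good as the input covariances, which is exactly why the paper stresses generating realistic covariance predictions before computing collision probability.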

  14. What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing?

    PubMed

    Hamel, J F; Sebille, V; Le Neel, T; Kubis, G; Boyer, F C; Hardouin, J B

    2017-12-01

    Subjective health measurements using Patient Reported Outcomes (PRO) are increasingly used in randomized trials, particularly for comparisons of patient groups. Two main types of analytical strategies can be used for such data: Classical Test Theory (CTT) and Item Response Theory (IRT) models. These two strategies display very similar characteristics when data are complete, but in the common case when data are missing, whether IRT or CTT would be the most appropriate remains unknown and was investigated using simulations. We simulated PRO data such as quality of life data. Missing responses to items were simulated as being completely random, depending on an observable covariate, or depending on an unobserved latent trait. The considered CTT-based methods allowed comparing scores using complete-case analysis, personal mean imputations, or multiple imputations based on a two-way procedure. The IRT-based method was the Wald test on a Rasch model including a group covariate. The IRT-based method and the multiple-imputation-based method for CTT displayed the highest observed power and were the only unbiased methods whatever the kind of missing data. Online software and Stata® modules compatible with the built-in mi impute suite are provided for performing such analyses. Traditional procedures (listwise deletion and personal mean imputations) should be avoided, due to inevitable problems of bias and lack of power.

  15. Storage and computationally efficient permutations of factorized covariance and square-root information matrices

    NASA Technical Reports Server (NTRS)

    Muellerschoen, R. J.

    1988-01-01

    A unified method to permute vector-stored upper-triangular diagonal factorized covariance (UD) and vector stored upper-triangular square-root information filter (SRIF) arrays is presented. The method involves cyclical permutation of the rows and columns of the arrays and retriangularization with appropriate square-root-free fast Givens rotations or elementary slow Givens reflections. A minimal amount of computation is performed and only one scratch vector of size N is required, where N is the column dimension of the arrays. To make the method efficient for large SRIF arrays on a virtual memory machine, three additional scratch vectors each of size N are used to avoid expensive paging faults. The method discussed is compared with the methods and routines of Bierman's Estimation Subroutine Library (ESL).
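
    The effect of the permutation-and-retriangularization step can be sketched with dense matrices: reorder the columns (states) of an upper-triangular SRIF factor, then restore triangularity. This sketch substitutes a library QR factorization for the paper's vector-stored fast Givens rotations; it reproduces the invariant (R^T R is preserved under retriangularization) but none of the storage or operation-count savings the paper is about.

```python
import numpy as np

def permute_sqrt_info(R, perm):
    """Reorder the columns of an upper-triangular square-root information
    factor R according to `perm`, then retriangularize.

    Dense-matrix sketch: np.linalg.qr stands in for the fast Givens
    rotations used in vector-stored implementations such as Bierman's ESL.
    """
    Rp = R[:, perm]              # apply the state reordering
    _, R_new = np.linalg.qr(Rp)  # retriangularize; R_new^T R_new == Rp^T Rp
    return R_new
```

    The information matrix R^T R (with columns permuted) is unchanged by the retriangularization, which is what makes the reordered factor equivalent to the original filter state.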

  16. Covariant electromagnetic field lines

    NASA Astrophysics Data System (ADS)

    Hadad, Y.; Cohen, E.; Kaminer, I.; Elitzur, A. C.

    2017-08-01

    Faraday introduced electric field lines as a powerful tool for understanding the electric force, and these field lines are still used today in classrooms and textbooks teaching the basics of electromagnetism within the electrostatic limit. However, despite attempts at generalizing this concept beyond the electrostatic limit, such a fully relativistic field line theory still appears to be missing. In this work, we propose such a theory and define covariant electromagnetic field lines that naturally extend electric field lines to relativistic systems and general electromagnetic fields. We derive a closed-form formula for the field lines curvature in the vicinity of a charge, and show that it is related to the world line of the charge. This demonstrates how the kinematics of a charge can be derived from the geometry of the electromagnetic field lines. Such a theory may also provide new tools in modeling and analyzing electromagnetic phenomena, and may entail new insights regarding long-standing problems such as radiation-reaction and self-force. In particular, the electromagnetic field lines curvature has the attractive property of being non-singular everywhere, thus eliminating all self-field singularities without using renormalization techniques.

  17. MISSE PEC, on the ISS Airlock crewlock endcone

    NASA Image and Video Library

    2001-08-17

    STS105-E-5342 (17 August 2001) --- Backdropped by a sunrise, the newly installed Materials International Space Station Experiment (MISSE) is visible. The MISSE was installed on the outside of the Quest Airlock during the first extravehicular activity (EVA) of the STS-105 mission. MISSE will collect information on how different materials weather in the environment of space. This image was taken with a digital still camera.

  18. Examination of various roles for covariance matrices in the development, evaluation, and application of nuclear data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, D.L.

    The last decade has been a period of rapid development in the implementation of covariance-matrix methodology in nuclear data research. This paper offers some perspective on the progress which has been made, on some of the unresolved problems, and on the potential yet to be realized. These discussions address a variety of issues related to the development of nuclear data. Topics examined are: the importance of designing and conducting experiments so that error information is conveniently generated; the procedures for identifying error sources and quantifying their magnitudes and correlations; the combination of errors; the importance of consistent and well-characterized measurement standards; the role of covariances in data parameterization (fitting); the estimation of covariances for values calculated from mathematical models; the identification of abnormalities in covariance matrices and the analysis of their consequences; the problems encountered in representing covariance information in evaluated files; the role of covariances in the weighting of diverse data sets; the comparison of various evaluations; the influence of primary-data covariance in the analysis of covariances for derived quantities (sensitivity); and the role of covariances in the merging of diverse nuclear data information. 226 refs., 2 tabs.

  19. Short-time windowed covariance: A metric for identifying non-stationary, event-related covariant cortical sites

    PubMed Central

    Blakely, Timothy; Ojemann, Jeffrey G.; Rao, Rajesh P.N.

    2014-01-01

    Background Electrocorticography (ECoG) signals can provide high spatio-temporal resolution and high signal-to-noise ratio recordings of local neural activity from the surface of the brain. Previous studies have shown that broad-band, spatially focal, high-frequency increases in ECoG signals are highly correlated with movement and other cognitive tasks and can be volitionally modulated. However, significant additional information may be present in inter-electrode interactions, but incorporating higher-order inter-electrode interactions can be computationally impractical, if not impossible. New method In this paper we present a new method of calculating high-frequency interactions between electrodes called Short-Time Windowed Covariance (STWC) that builds on mathematical techniques currently used in neural signal analysis, along with an implementation that accelerates the algorithm by orders of magnitude by leveraging commodity, off-the-shelf graphics processing unit (GPU) hardware. Results Using the hardware-accelerated implementation of STWC, we identify many types of event-related inter-electrode interactions from human ECoG recordings on global and local scales that have not been identified by previous methods. Unique temporal patterns are observed for digit flexion in both low- (10 mm spacing) and high-resolution (3 mm spacing) electrode arrays. Comparison with existing methods Covariance is a commonly used metric for identifying correlated signals, but the standard covariance calculation does not allow for temporally varying covariance. In contrast, STWC allows for and identifies event-driven changes in covariance without identifying spurious noise correlations. Conclusions STWC can be used to identify event-related neural interactions, and its high computational load is well suited to GPU acceleration. PMID:24211499
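
    The core of STWC, covariance computed over a short sliding window rather than over the whole recording, can be sketched in a few lines. This CPU-only sketch omits the GPU acceleration and ECoG-specific preprocessing described in the abstract; the function name and parameters are illustrative.

```python
import numpy as np

def short_time_windowed_cov(x, y, win, step=1):
    """Sliding-window covariance between two signals.

    Returns one covariance value per window position, so event-driven
    changes in covariance over time become visible, unlike a single
    whole-record covariance.
    """
    starts = range(0, len(x) - win + 1, step)
    return np.array([np.cov(x[s:s + win], y[s:s + win])[0, 1]
                     for s in starts])
```

    Applied to every electrode pair, this yields a time-resolved interaction matrix; the quadratic growth in the number of pairs is what motivates the GPU implementation in the paper.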

  20. Breast Cancer and Modifiable Lifestyle Factors in Argentinean Women: Addressing Missing Data in a Case-Control Study

    PubMed Central

    Coquet, Julia Becaria; Tumas, Natalia; Osella, Alberto Ruben; Tanzi, Matteo; Franco, Isabella; Diaz, Maria Del Pilar

    2016-01-01

    A number of studies have evidenced the effect of modifiable lifestyle factors such as diet, breastfeeding and nutritional status on breast cancer risk. However, none have addressed the missing data problem in nutritional epidemiologic research in South America. Missing data are a frequent problem in breast cancer studies and epidemiological settings in general. Estimates of effect obtained from these studies may be biased if no appropriate method for handling missing data is applied. We performed multiple imputation for missing covariate values in a breast cancer case-control study of Córdoba (Argentina) to optimize risk estimates. Data were obtained from a breast cancer case-control study conducted from 2008 to 2015 (318 cases, 526 controls). Complete-case analysis and multiple imputation using chained equations were the methods applied to estimate the effects of a Traditional dietary pattern and other recognized factors associated with breast cancer. Physical activity and socioeconomic status were imputed. Logistic regression models were performed. When complete-case analysis was performed, only 31% of women were included. Although a positive association of the Traditional dietary pattern with breast cancer was observed from both approaches (complete-case analysis OR=1.3, 95%CI=1.0-1.7; multiple imputation OR=1.4, 95%CI=1.2-1.7), effects of other covariates, like BMI and breastfeeding, were only identified when multiple imputation was considered. A Traditional dietary pattern, BMI and breastfeeding are associated with the occurrence of breast cancer in this Argentinean population when multiple imputation is appropriately performed. Multiple imputation is suggested for Latin American epidemiologic studies to optimize effect estimates in the future. PMID:27892664
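
    Imputation by chained equations, as used in the study above, cycles over the incomplete variables, regressing each on the others and replacing its missing values with predictions. A minimal single-dataset numpy sketch follows; real MICE additionally draws from the posterior predictive distribution and produces several imputed datasets that are analyzed and pooled, which this deterministic illustration omits.

```python
import numpy as np

def chained_equations_impute(X, n_iter=10):
    """Minimal chained-equations imputation for a numeric matrix X with NaNs.

    Each incomplete column is repeatedly regressed (OLS with intercept) on
    the other columns, and its missing entries are replaced by the fitted
    predictions. Illustrative single-imputation sketch only.
    """
    X = X.copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):              # initialize with mean imputation
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            others = np.delete(X, j, axis=1)
            A = np.column_stack([np.ones(len(X)), others])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            X[miss[:, j], j] = A[miss[:, j]] @ beta   # update missing entries
    return X
```

    Complete-case analysis would instead drop every row containing a NaN, which is how the study above ended up using only 31% of its women before switching to multiple imputation.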

  1. Covariation assessment for neutral and emotional verbal stimuli in paranoid delusions.

    PubMed

    Díez-Alegría, Cristina; Vázquez, Carmelo; Hernández-Lloreda, María J

    2008-11-01

    Selective processing of emotion-relevant information is considered a central feature in various types of psychopathology, yet the mechanisms underlying these biases are not well understood. One of the first steps in processing information is to gather data to judge the covariation or association of events. The aim of this study was to explore whether patients with persecutory delusions would show a covariation bias when processing stimuli related to social threat. We assessed estimations of covariation in patients with current persecutory (CP) beliefs (N=40), patients with past persecutory (PP) beliefs (N=25), and a non-clinical control (NC) group (N=36). Covariation estimations were assessed under three different experimental conditions. The first two conditions focused on neutral behaviours (Condition 1) and psychological traits (Condition 2) for two distant cultural groups, while the third condition included self-relevant material by exposing the participant to either protective social (positive) or threatening social (negative) statements about the participant or a third person. Our results showed that all participants were accurate in their covariation estimations. However, when judging covariation for self-relevant sentences related to social statements (Condition 3), all groups showed a significant tendency to associate positive social interaction (protection-themed) sentences with the self. Yet, when using sentences related to social threat, the CP group showed a bias consisting of overestimating the number of self-referent sentences. Our results showed that there was no specific covariation assessment bias related to paranoid beliefs. Both NCs and participants with persecutory beliefs showed a similar pattern of results when processing neutral or social threat-related sentences. The implications for understanding the role of self-referent information processing biases in delusion formation are discussed.

  2. The impact of sociodemographic, treatment, and work support on missed work after breast cancer diagnosis

    PubMed Central

    Mujahid, Mahasin S.; Janz, Nancy K.; Hawley, Sarah T.; Griggs, Jennifer J.; Hamilton, Ann S.; Katz, Steven J.

    2016-01-01

    Work loss is a potential adverse consequence of cancer. There is limited research on patterns and correlates of paid work after diagnosis of breast cancer, especially among ethnic minorities. Women with non-metastatic breast cancer diagnosed from June 2005 to May 2006 who reported to the Los Angeles County SEER registry were identified and asked to complete the survey after initial treatment (median time from diagnosis = 8.9 months). Latina and African American women were over-sampled. Analyses were restricted to women working at the time of diagnosis, <65 years of age, and who had complete covariate information (N = 589). The outcome of the study was missed paid work (missed ≤1 month, missed >1 month, stopped altogether). Approximately 44, 24, and 32% of women missed ≤1 month, missed >1 month, or stopped working, respectively. African Americans and Latinas were more likely to stop working when compared with Whites [ORs for stopped working vs. missed ≤1 month: 3.0 and 3.4, respectively (P < 0.001)]. Women receiving mastectomy and those receiving chemotherapy were also more likely to stop working, independent of sociodemographic and treatment factors [ORs for stopped working vs. missed ≤1 month: 4.2, P < 0.001; 7.9, P < 0.001, respectively]. Not having a flexible work schedule available was detrimental to continuing work [OR for stopped working: 18.9, P < 0.001 after adjusting for sociodemographic and treatment factors]. Many women stop working altogether after a diagnosis of breast cancer, particularly racial/ethnic minorities, those receiving chemotherapy, and those employed in unsupportive work settings. Health care providers need to be aware of these adverse consequences of breast cancer diagnosis and initial treatment. PMID:19360466

  3. The impact of sociodemographic, treatment, and work support on missed work after breast cancer diagnosis.

    PubMed

    Mujahid, Mahasin S; Janz, Nancy K; Hawley, Sarah T; Griggs, Jennifer J; Hamilton, Ann S; Katz, Steven J

    2010-01-01

    Work loss is a potential adverse consequence of cancer. There is limited research on patterns and correlates of paid work after diagnosis of breast cancer, especially among ethnic minorities. Women with non-metastatic breast cancer diagnosed from June 2005 to May 2006 who reported to the Los Angeles County SEER registry were identified and asked to complete the survey after initial treatment (median time from diagnosis = 8.9 months). Latina and African American women were over-sampled. Analyses were restricted to women working at the time of diagnosis, <65 years of age, and who had complete covariate information (N = 589). The outcome of the study was missed paid work (missed ≤1 month, missed >1 month, stopped altogether). Approximately 44, 24, and 32% of women missed ≤1 month, missed >1 month, or stopped working, respectively. African Americans and Latinas were more likely to stop working when compared with Whites [ORs for stopped working vs. missed ≤1 month: 3.0 and 3.4, respectively (P < 0.001)]. Women receiving mastectomy and those receiving chemotherapy were also more likely to stop working, independent of sociodemographic and treatment factors [ORs for stopped working vs. missed ≤1 month: 4.2, P < 0.001; 7.9, P < 0.001, respectively]. Not having a flexible work schedule available was detrimental to continuing work [OR for stopped working: 18.9, P < 0.001 after adjusting for sociodemographic and treatment factors]. Many women stop working altogether after a diagnosis of breast cancer, particularly racial/ethnic minorities, those receiving chemotherapy, and those employed in unsupportive work settings. Health care providers need to be aware of these adverse consequences of breast cancer diagnosis and initial treatment.

  4. Partial covariance based functional connectivity computation using Ledoit-Wolf covariance regularization.

    PubMed

    Brier, Matthew R; Mitra, Anish; McCarthy, John E; Ances, Beau M; Snyder, Abraham Z

    2015-11-01

    Functional connectivity refers to shared signals among brain regions and is typically assessed in a task free state. Functional connectivity commonly is quantified between signal pairs using Pearson correlation. However, resting-state fMRI is a multivariate process exhibiting a complicated covariance structure. Partial covariance assesses the unique variance shared between two brain regions excluding any widely shared variance, hence is appropriate for the analysis of multivariate fMRI datasets. However, calculation of partial covariance requires inversion of the covariance matrix, which, in most functional connectivity studies, is not invertible owing to rank deficiency. Here we apply Ledoit-Wolf shrinkage (L2 regularization) to invert the high dimensional BOLD covariance matrix. We investigate the network organization and brain-state dependence of partial covariance-based functional connectivity. Although RSNs are conventionally defined in terms of shared variance, removal of widely shared variance, surprisingly, improved the separation of RSNs in a spring embedded graphical model. This result suggests that pair-wise unique shared variance plays a heretofore unrecognized role in RSN covariance organization. In addition, application of partial correlation to fMRI data acquired in the eyes open vs. eyes closed states revealed focal changes in uniquely shared variance between the thalamus and visual cortices. This result suggests that partial correlation of resting state BOLD time series reflect functional processes in addition to structural connectivity. Copyright © 2015 Elsevier Inc. All rights reserved.
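
    The pipeline described in the abstract, regularize the covariance so it becomes invertible, invert it, and convert the precision matrix to partial correlations, can be sketched as follows. For simplicity this sketch shrinks toward a scaled identity with a fixed intensity rather than the data-driven Ledoit-Wolf intensity used in the paper, and the function name is illustrative.

```python
import numpy as np

def partial_correlation(ts, shrinkage=0.1):
    """Partial correlation from an L2-regularized covariance inversion.

    ts: (n_timepoints, n_regions) time series, e.g. BOLD signals.
    The sample covariance is shrunk toward a scaled identity so that the
    inverse (precision matrix) exists even when the covariance is
    rank-deficient, then the precision matrix is rescaled into partial
    correlations.
    """
    S = np.cov(ts, rowvar=False)
    target = np.trace(S) / S.shape[0] * np.eye(S.shape[0])
    S_shrunk = (1 - shrinkage) * S + shrinkage * target  # always invertible
    P = np.linalg.inv(S_shrunk)                          # precision matrix
    d = np.sqrt(np.diag(P))
    pc = -P / np.outer(d, d)                             # partial correlations
    np.fill_diagonal(pc, 1.0)
    return pc
```

    Each off-diagonal entry estimates the variance uniquely shared by two regions after removing variance widely shared with all other regions, which is the quantity the study uses in place of pairwise Pearson correlation.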

  5. Partial covariance based functional connectivity computation using Ledoit-Wolf covariance regularization

    PubMed Central

    Brier, Matthew R.; Mitra, Anish; McCarthy, John E.; Ances, Beau M.; Snyder, Abraham Z.

    2015-01-01

    Functional connectivity refers to shared signals among brain regions and is typically assessed in a task free state. Functional connectivity commonly is quantified between signal pairs using Pearson correlation. However, resting-state fMRI is a multivariate process exhibiting a complicated covariance structure. Partial covariance assesses the unique variance shared between two brain regions excluding any widely shared variance, hence is appropriate for the analysis of multivariate fMRI datasets. However, calculation of partial covariance requires inversion of the covariance matrix, which, in most functional connectivity studies, is not invertible owing to rank deficiency. Here we apply Ledoit-Wolf shrinkage (L2 regularization) to invert the high dimensional BOLD covariance matrix. We investigate the network organization and brain-state dependence of partial covariance-based functional connectivity. Although RSNs are conventionally defined in terms of shared variance, removal of widely shared variance, surprisingly, improved the separation of RSNs in a spring embedded graphical model. This result suggests that pair-wise unique shared variance plays a heretofore unrecognized role in RSN covariance organization. In addition, application of partial correlation to fMRI data acquired in the eyes open vs. eyes closed states revealed focal changes in uniquely shared variance between the thalamus and visual cortices. This result suggests that partial correlation of resting state BOLD time series reflect functional processes in addition to structural connectivity. PMID:26208872

  6. Exosome-Mediated Genetic Information Transfer, a Missing Piece of Osteoblast-Osteoclast Communication Puzzle.

    PubMed

    Yin, Pengbin; Lv, Houchen; Li, Yi; Deng, Yuan; Zhang, Licheng; Tang, Peifu

    2017-01-01

    The skeletal system functions and maintains itself based on communication between cells of diverse origins, especially between osteoblasts (OBs) and osteoclasts (OCs), which account for bone formation and resorption, respectively. Previously, protein-level information exchange has been the research focus, and this has been discussed in detail. The regulative effects of microRNAs (miRNAs) on OBs and OCs raise the question of whether genetic information could be transferred between bone cells. Exosomes, extracellular membrane vesicles 30-100 nm in diameter, have recently been demonstrated to transfer functional proteins, mRNAs, and miRNAs, and to serve as mediators of intercellular communication. By reviewing the distinguishing features of exosomes, this article formulates and evaluates the hypothesis that exosome-mediated genetic information transfer may represent a novel strategy for OB-OC communication. Exosomes may coordinately regulate these two cell types under certain physiological conditions by transferring genetic information. Further research on exosome-shuttled miRNAs in OB-OC communication may add a missing piece to the bone cell communication "puzzle."

  7. Covariate-free and Covariate-dependent Reliability.

    PubMed

    Bentler, Peter M

    2016-12-01

    Classical test theory reliability coefficients are said to be population specific. Reliability generalization, a meta-analysis method, is the main procedure for evaluating the stability of reliability coefficients across populations. A new approach is developed to evaluate the degree of invariance of reliability coefficients to population characteristics. Factor or common variance of a reliability measure is partitioned into parts that are, and are not, influenced by control variables, resulting in a partition of reliability into a covariate-dependent and a covariate-free part. The approach can be implemented in a single sample and can be applied to a variety of reliability coefficients.

  8. Working with Missing Values

    ERIC Educational Resources Information Center

    Acock, Alan C.

    2005-01-01

    Less than optimum strategies for missing values can produce biased estimates, distorted statistical power, and invalid conclusions. After reviewing traditional approaches (listwise, pairwise, and mean substitution), selected alternatives are covered including single imputation, multiple imputation, and full information maximum likelihood…

  9. [Near miss outcomes in gambling games].

    PubMed

    Pecsenye, Zsuzsa; Kurucz, Gyozo

    2017-01-01

    Games of chance operate with an intermittent reinforcement schedule in which the number of games it takes the player to win differs from turn to turn, so the player cannot predict when the next positive reinforcement will arrive. The near miss outcome (close to winning, but actually a losing outcome) can be interpreted as a secondary (built-in) reinforcement within a variable-ratio reinforcement schedule that presumably contributes to the development and maintenance of gambling addiction. The aim of this publication is to introduce near miss outcomes and to summarize and critically analyze the literature connected to this issue. We searched internet databases using the term "near miss" and analyzed articles focusing on gambling games. Based on numerous authors' results, a near miss rate set at around 30% increases the desire to continue playing among gamblers and among players who have no former gambling experience. Some studies have demonstrated that this effect might be related to the extent to which the player has the situation under control during the gambling session. The hypothesized inhibiting effect of a 45% near miss ratio has not yet been proven. Neurobiological research shows mid-brain activity during near miss outcomes; furthermore, similar physiological patterns have been discovered following near miss and winning outcomes. Regarding the connection between intrapsychic variables (cognitive and personality factors) and near misses, there are very few studies. The fact that different authors interpret near miss outcomes differently, even when studying the same game, leads to problems in interpreting their results. It follows from the foregoing empirical results that near miss outcomes contribute to the development and maintenance of pathological gambling, but we have little information on the factors mediating this effect.

  10. Missing data and multiple imputation in clinical epidemiological research.

    PubMed

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.

  11. Missing data and multiple imputation in clinical epidemiological research

    PubMed Central

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data. PMID:28352203

  12. Earth Observing System Covariance Realism

    NASA Technical Reports Server (NTRS)

    Zaidi, Waqar H.; Hejduk, Matthew D.

    2016-01-01

    The purpose of covariance realism is to properly size a primary object's covariance in order to add validity to the calculation of the probability of collision. The covariance realism technique in this paper consists of three parts: collection/calculation of definitive state estimates through orbit determination, calculation of covariance realism test statistics at each covariance propagation point, and proper assessment of those test statistics. An empirical cumulative distribution function (ECDF) Goodness-of-Fit (GOF) method is employed to determine if a covariance is properly sized by comparing the empirical distribution of Mahalanobis distance calculations to the hypothesized parent 3-DoF chi-squared distribution. To realistically size a covariance for collision probability calculations, this study uses a state noise compensation algorithm that adds process noise to the definitive epoch covariance to account for uncertainty in the force model. Process noise is added until the GOF tests pass a group significance level threshold. The results of this study indicate that when outliers attributed to persistently high or extreme levels of solar activity are removed, the aforementioned covariance realism compensation method produces a tuned covariance with up to 80 to 90% of the covariance propagation timespan passing the GOF tests against a 60% minimum passing threshold, a quite satisfactory and useful result.
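
    The test-statistic step can be sketched directly: compute squared Mahalanobis distances of observed position residuals with respect to the propagated covariance, then compare their empirical distribution to the hypothesized 3-DoF chi-squared parent. This sketch uses a plain Kolmogorov-Smirnov distance and a closed-form 3-DoF chi-squared CDF; the paper's full procedure (process-noise tuning, group significance thresholds) is not reproduced, and the function names are illustrative.

```python
import math
import numpy as np

def chi2_cdf_3dof(x):
    """Closed-form CDF of the chi-squared distribution with 3 DoF."""
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def mahalanobis_sq(residuals, cov):
    """Squared Mahalanobis distances of (n, 3) position residuals
    with respect to the propagated covariance."""
    P_inv = np.linalg.inv(cov)
    return np.einsum('ij,jk,ik->i', residuals, P_inv, residuals)

def ks_statistic_vs_chi2_3(d2):
    """Kolmogorov-Smirnov distance between the empirical distribution of
    squared Mahalanobis distances and the 3-DoF chi-squared parent.
    Small values indicate a realistically sized covariance."""
    d2 = np.sort(d2)
    n = len(d2)
    cdf = np.array([chi2_cdf_3dof(x) for x in d2])
    ecdf_hi = np.arange(1, n + 1) / n
    ecdf_lo = np.arange(0, n) / n
    return max(np.max(ecdf_hi - cdf), np.max(cdf - ecdf_lo))
```

    A well-sized covariance yields distances that track the chi-squared CDF closely; an over-sized covariance compresses the distances and drives the statistic up, which is the signal used to decide how much process noise to add.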

  13. Identifying Heat Waves in Florida: Considerations of Missing Weather Data.

    PubMed

    Leary, Emily; Young, Linda J; DuClos, Chris; Jordan, Melissa M

    2015-01-01

    Using current climate models, regional-scale changes for Florida over the next 100 years are predicted to include warming over terrestrial areas and very likely increases in the number of high temperature extremes. No uniform definition of a heat wave exists. Most past research on heat waves has focused on evaluating the aftermath of known heat waves, with minimal consideration of missing exposure information. The aim was to identify and discuss methods of handling and imputing missing weather data and how those methods can affect identified periods of extreme heat in Florida. In addition to ignoring missing data, temporal, spatial, and spatio-temporal models are described and utilized to impute missing historical weather data from 1973 to 2012 from 43 Florida weather monitors. Calculated thresholds are used to define periods of extreme heat across Florida. Modeling of missing data and imputing missing values can affect the identified periods of extreme heat, either through the missing data themselves or through the computed thresholds. The differences observed are related to the amount of missingness during June, July, and August, the warmest months of the warm season (April through September). Missing data considerations are important when defining periods of extreme heat. Spatio-temporal methods are recommended for data imputation. A heat wave definition that incorporates information from all monitors is advised.

  14. Simultaneous Mean and Covariance Correction Filter for Orbit Estimation.

    PubMed

    Wang, Xiaoxu; Pan, Quan; Ding, Zhengtao; Ma, Zhengya

    2018-05-05

    This paper proposes a novel filtering design, from a viewpoint of identification instead of the conventional nonlinear estimation schemes (NESs), to improve the performance of orbit state estimation for a space target. First, a nonlinear perturbation is viewed or modeled as an unknown input (UI) coupled with the orbit state, to avoid the intractable nonlinear perturbation integral (INPI) required by NESs. Then, a simultaneous mean and covariance correction filter (SMCCF), based on a two-stage expectation maximization (EM) framework, is proposed to simply and analytically fit or identify the first two moments (FTM) of the perturbation (viewed as UI), instead of directly computing the INPI as NESs do. Orbit estimation performance is greatly improved by utilizing the fitted UI-FTM to simultaneously correct the state estimation and its covariance. Third, provided that enough information is mined, SMCCF should outperform existing NESs and standard identification algorithms (which view the UI as a constant independent of the state and only utilize the identified UI mean to correct the state estimation, regardless of its covariance), since it further incorporates the useful covariance information in addition to the mean of the UI. Finally, our simulations demonstrate the superior performance of SMCCF via an orbit estimation example.

  15. Restoration of HST images with missing data

    NASA Technical Reports Server (NTRS)

    Adorf, Hans-Martin

    1992-01-01

    Missing data are a fairly common problem when restoring Hubble Space Telescope observations of extended sources. On Wide Field and Planetary Camera images, cosmic ray hits and CCD hot spots are the prevalent causes of data losses, whereas on Faint Object Camera images data are lost due to reseaux marks, blemishes, areas of saturation and the omnipresent frame edges. This contribution discusses a technique for 'filling in' missing data by statistical inference using information from the surrounding pixels. The major gain consists in minimizing adverse spill-over effects on the restoration in areas neighboring those where data are missing. When the mask delineating the support of 'missing data' is made dynamic, cosmic ray hits, etc. can be detected on the fly during restoration.

  16. Using indirect covariance spectra to identify artifact responses in unsymmetrical indirect covariance calculated spectra.

    PubMed

    Martin, Gary E; Hilton, Bruce D; Blinov, Kirill A; Williams, Antony J

    2008-02-01

    Several groups of authors have reported studies in the areas of indirect and unsymmetrical indirect covariance NMR processing methods. Efforts have recently focused on the use of unsymmetrical indirect covariance processing methods to combine various discrete two-dimensional NMR spectra to afford the equivalent of the much less sensitive hyphenated 2D NMR experiments, for example indirect covariance (icv)-heteronuclear single quantum coherence (HSQC)-COSY and icv-HSQC-nuclear Overhauser effect spectroscopy (NOESY). Alternatively, unsymmetrical indirect covariance processing methods can be used to combine multiple heteronuclear 2D spectra to afford icv-13C-15N HSQC-HMBC correlation spectra. We now report the use of responses contained in indirect covariance processed HSQC spectra as a means for the identification of artifacts in both indirect covariance and unsymmetrical indirect covariance processed 2D NMR spectra. Copyright (c) 2007 John Wiley & Sons, Ltd.
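
    At its core, unsymmetrical indirect covariance reduces to a matrix product: two 2D spectra that share their direct (1H) dimension are combined by multiplying one transposed into the other, which correlates their indirect axes. A toy sketch follows; the spectra here are hypothetical synthetic matrices, and practical implementations typically add further steps (e.g. a matrix square root, or the HSQC-based artifact screening this paper proposes) that are omitted.

```python
import numpy as np

def unsymmetrical_indirect_covariance(spec_a, spec_b):
    """Combine two 2D spectra sharing their direct (1H) dimension.

    spec_a, spec_b: real matrices of shape (n_1H, n_a) and (n_1H, n_b),
    e.g. an HSQC and an HMBC. The product correlates their indirect axes,
    giving the equivalent of a hyphenated experiment.
    """
    return spec_a.T @ spec_b
```

    Responses appear in the combined spectrum wherever the two input spectra share a proton resonance, which is also why artifacts can arise from accidental 1H overlap; screening against responses in the indirect covariance processed HSQC spectrum is the artifact-identification idea of the paper.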

  17. Impact of teamwork on missed care in four Australian hospitals.

    PubMed

    Chapman, Rose; Rahman, Asheq; Courtney, Mary; Chalmers, Cheyne

    2017-01-01

    Investigate effects of teamwork on missed nursing care across a healthcare network in Australia. Missed care is universally used as an indicator of quality nursing care; however, little is known about the mitigating effects of teamwork on these events. A descriptive exploratory study. Missed Care and Team Work surveys were completed by 334 nurses. Using Stata software, nursing staff demographic information and components of missed care and teamwork were compared across the healthcare network. Statistical tests were performed to identify predictive factors for missed care. The most commonly reported components of missed care were as follows: ambulation three times per day (43·3%), turning the patient every two hours (29%) and mouth care (27·7%). The commonest reasons mentioned for missed care were as follows: inadequate labour resources (range 69·8-52·7%), followed by material resources (range 59·3-33·3%) and communication (range 39·3-27·2%). There were significant differences in missed care scores across units. Using the mean scores in a regression correlation matrix, the negative relationship of missed care and teamwork was supported (r = -0·34, p < 0·001). Controlling for occupation of the staff member and staff characteristics in multiple regression models, teamwork alone accounted for about 9% of missed nursing care. Similar to previous international research findings, our results showed nursing teamwork significantly impacted on missed nursing care. Teamwork may be a mitigating factor to address missed care and future research is needed. These results may provide administrators, educators and clinicians with information to develop practices and policies to improve patient care internationally. © 2016 John Wiley & Sons Ltd.

  18. Application of pattern mixture models to address missing data in longitudinal data analysis using SPSS.

    PubMed

    Son, Heesook; Friedmann, Erika; Thomas, Sue A

    2012-01-01

    Longitudinal studies are used in nursing research to examine changes over time in health indicators. Traditional approaches to longitudinal analysis of means, such as analysis of variance with repeated measures, are limited to analyzing complete cases. This limitation can lead to biased results due to withdrawal or data omission bias or to imputation of missing data, which can lead to bias toward the null if data are not missing completely at random. Pattern mixture models are useful to evaluate the informativeness of missing data and to adjust linear mixed model (LMM) analyses if missing data are informative. The aim of this study was to provide an example of statistical procedures for applying a pattern mixture model to evaluate the informativeness of missing data and conduct analyses of data with informative missingness in longitudinal studies using SPSS. The data set from the Patients' and Families' Psychological Response to Home Automated External Defibrillator Trial was used as an example to examine informativeness of missing data with pattern mixture models and to use a missing data pattern in analysis of longitudinal data. Prevention of withdrawal bias, omitted data bias, and bias toward the null in longitudinal LMMs requires the assessment of the informativeness of the occurrence of missing data. Missing data patterns can be incorporated as fixed effects into LMMs to evaluate the contribution of the presence of informative missingness to and control for the effects of missingness on outcomes. Pattern mixture models are a useful method to address the presence and effect of informative missingness in longitudinal studies.
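    The pattern-mixture idea described above — code each subject's missingness (dropout) pattern and enter it, with its interaction with time, as fixed effects — can be sketched outside SPSS with made-up numbers. An ordinary least-squares fit is used here for transparency; a full LMM adds random effects on the same design matrix:

```python
import numpy as np

# Hypothetical long-format data: outcome y over waves for two subjects,
# one of whom drops out after wave 1 (illustrative values only)
time    = np.array([0, 1, 2, 3, 0, 1], float)
dropout = np.array([0, 0, 0, 0, 1, 1], float)  # 1 = early-dropout pattern
y       = np.array([5.0, 6.0, 7.0, 8.0, 4.0, 4.5])

# design matrix: intercept, time, pattern, pattern x time
X = np.column_stack([np.ones_like(time), time, dropout, dropout * time])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[3] estimates how the slope differs for the dropout pattern;
# a nonzero value is evidence that the missingness is informative
```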

  19. Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models.

    PubMed

    Chen, Baojiang; Qin, Jing

    2014-05-10

    In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on a covariate, it may do so through some function of that covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
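    The pool-adjacent-violators algorithm that the abstract relies on is short enough to show in full. This is a standard textbook formulation (least-squares fit of a non-decreasing sequence), not the paper's empirical-likelihood extension:

```python
def pava(y, w=None):
    """Pool-adjacent-violators: least-squares fit of a non-decreasing
    sequence to y, with optional positive weights w."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    # each block holds [weighted mean, total weight, run length]
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # merge adjacent blocks while monotonicity is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    fit = []
    for m, _, ln in blocks:
        fit.extend([m] * ln)
    return fit

# pava([1, 3, 2, 4]) pools the violating pair (3, 2) into 2.5,
# giving [1, 2.5, 2.5, 4]
```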

  20. Observed Score Linear Equating with Covariates

    ERIC Educational Resources Information Center

    Branberg, Kenny; Wiberg, Marie

    2011-01-01

    This paper examined observed score linear equating in two different data collection designs, the equivalent groups design and the nonequivalent groups design, when information from covariates (i.e., background variables correlated with the test scores) was included. The main purpose of the study was to examine the effect (i.e., bias, variance, and…

  1. Fission yield covariances for JEFF: A Bayesian Monte Carlo method

    NASA Astrophysics Data System (ADS)

    Leray, Olivier; Rochman, Dimitri; Fleming, Michael; Sublet, Jean-Christophe; Koning, Arjan; Vasiliev, Alexander; Ferroukhi, Hakim

    2017-09-01

    The JEFF library does not contain fission yield covariances, but only best estimates and uncertainties. This situation is not unique, as all libraries face this deficiency, primarily due to the lack of a defined format. An alternative approach is to provide a set of random fission yields, themselves reflecting covariance information. In this work, these random files are obtained by combining the information from the JEFF library (fission yields and uncertainties) and the theoretical knowledge from the GEF code. Examples of this method are presented for the main actinides together with their impacts on simple burn-up and decay heat calculations.
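    The "random files carry the covariance" idea can be illustrated in a few lines: draw random yield sets from best estimates and uncertainties, apply a physics constraint (here, a simple sum normalisation, purely as an assumed stand-in for GEF model correlations), and recover the covariance empirically from the set. All numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical best-estimate yields and 1-sigma uncertainties
# for three fission products (illustrative values only)
y_best = np.array([0.060, 0.045, 0.030])
y_unc  = np.array([0.003, 0.002, 0.002])

# draw N random yield files; renormalising each draw to the sum of
# the best estimates induces (negative) correlations, mimicking how
# a physics constraint generates covariance among yields
N = 10000
samples = rng.normal(y_best, y_unc, size=(N, 3))
samples *= y_best.sum() / samples.sum(axis=1, keepdims=True)

# covariance information recovered from the set of random files
cov = np.cov(samples, rowvar=False)
```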

  2. Identifying Heat Waves in Florida: Considerations of Missing Weather Data

    PubMed Central

    Leary, Emily; Young, Linda J.; DuClos, Chris; Jordan, Melissa M.

    2015-01-01

    Background: Using current climate models, regional-scale changes for Florida over the next 100 years are predicted to include warming over terrestrial areas and very likely increases in the number of high temperature extremes. No uniform definition of a heat wave exists. Most past research on heat waves has focused on evaluating the aftermath of known heat waves, with minimal consideration of missing exposure information. Objectives: To identify and discuss methods of handling and imputing missing weather data and how those methods can affect identified periods of extreme heat in Florida. Methods: In addition to ignoring missing data, temporal, spatial, and spatio-temporal models are described and utilized to impute missing historical weather data from 1973 to 2012 from 43 Florida weather monitors. Calculated thresholds are used to define periods of extreme heat across Florida. Results: Modeling of missing data and imputing missing values can affect the identified periods of extreme heat, through the missing data itself or through the computed thresholds. The differences observed are related to the amount of missingness during June, July, and August, the warmest months of the warm season (April through September). Conclusions: Missing data considerations are important when defining periods of extreme heat. Spatio-temporal methods are recommended for data imputation. A heat wave definition that incorporates information from all monitors is advised. PMID:26619198
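    A minimal sketch of the pipeline the record describes — impute gaps in a temperature series, then apply a percentile threshold to flag extreme-heat days. Simple linear interpolation stands in for the paper's temporal models, and all temperatures are made up:

```python
import numpy as np

# hypothetical daily maximum temperatures (deg F) with two gaps
temps = np.array([91.0, 93.0, np.nan, 96.0, 97.0, np.nan, 92.0, 90.0])
days = np.arange(len(temps))

# temporal imputation: linear interpolation over days (a crude
# stand-in for the temporal/spatio-temporal models in the paper)
missing = np.isnan(temps)
filled = temps.copy()
filled[missing] = np.interp(days[missing], days[~missing], temps[~missing])

# extreme-heat threshold computed from the imputed series; note that
# the imputation feeds into the threshold, which is exactly why the
# choice of method can change which days are flagged
threshold = np.percentile(filled, 95)
hot_days = days[filled >= threshold]
```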

  3. OPAC Missing Record Retrieval.

    ERIC Educational Resources Information Center

    Johnson, Karl E.

    1996-01-01

    When the Higher Education Library Information Network of Rhode Island transferred members' bibliographic data into a shared online public access catalog (OPAC), 10% of the University of Rhode Island's monograph records were missing. This article describes the consortium's attempts to retrieve records from the database and the effectiveness of…

  4. Semiparametric Estimation of Treatment Effect in a Pretest–Posttest Study with Missing Data

    PubMed Central

    Davidian, Marie; Tsiatis, Anastasios A.; Leon, Selene

    2008-01-01

    The pretest–posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest–posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates. PMID:19081743

  5. Semiparametric Estimation of Treatment Effect in a Pretest-Posttest Study with Missing Data.

    PubMed

    Davidian, Marie; Tsiatis, Anastasios A; Leon, Selene

    2005-08-01

    The pretest-posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest-posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates.

  6. A Simulation Study of Missing Data with Multiple Missing X's

    ERIC Educational Resources Information Center

    Rubright, Jonathan D.; Nandakumar, Ratna; Glutting, Joseph J.

    2014-01-01

    When exploring missing data techniques in a realistic scenario, the current literature is limited: most studies only consider consequences with data missing on a single variable. This simulation study compares the relative bias of two commonly used missing data techniques when data are missing on more than one variable. Factors varied include type…

  7. Covariance mapping techniques

    NASA Astrophysics Data System (ADS)

    Frasinski, Leszek J.

    2016-08-01

    Recent technological advances in the generation of intense femtosecond pulses have made covariance mapping an attractive analytical technique. The laser pulses available are so intense that often thousands of ionisation and Coulomb explosion events will occur within each pulse. To understand the physics of these processes the photoelectrons and photoions need to be correlated, and covariance mapping is well suited for operating at the high counting rates of these laser sources. Partial covariance is particularly useful in experiments with x-ray free electron lasers, because it is capable of suppressing pulse fluctuation effects. A variety of covariance mapping methods is described: simple, partial (single- and multi-parameter), sliced, contingent and multi-dimensional. The relationship to coincidence techniques is discussed. Covariance mapping has been used in many areas of science and technology: inner-shell excitation and Auger decay, multiphoton and multielectron ionisation, time-of-flight and angle-resolved spectrometry, infrared spectroscopy, nuclear magnetic resonance imaging, stimulated Raman scattering, directional gamma ray sensing, welding diagnostics and brain connectivity studies (connectomics). This review gives practical advice for implementing the technique and interpreting the results, including its limitations and instrumental constraints. It also summarises recent theoretical studies, highlights unsolved problems and outlines a personal view on the most promising research directions.
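    The simplest of the covariance mapping methods listed above can be written in a few lines: over many laser shots, the map C(x, y) = <X_x X_y> - <X_x><X_y> reveals which spectral channels fluctuate together. The simulated shot data below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated time-of-flight spectra: 2000 laser shots, 64 channels.
# Channels 10 and 40 fire together (correlated fragments from the
# same explosion events); all other channels are independent noise.
n_shots, n_ch = 2000, 64
X = rng.poisson(1.0, size=(n_shots, n_ch)).astype(float)
pair = rng.poisson(2.0, size=n_shots)
X[:, 10] += pair
X[:, 40] += pair

# simple covariance map: C(x, y) = <X_x X_y> - <X_x><X_y>
C = (X.T @ X) / n_shots - np.outer(X.mean(0), X.mean(0))
# the correlated channel pair stands out as an off-diagonal
# island at (10, 40); uncorrelated channel pairs average to ~0
```

    Partial covariance (used with x-ray free electron lasers) extends this by additionally regressing out a measured fluctuating parameter such as pulse energy.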

  8. Impact of missing data strategies in studies of parental employment and health: Missing items, missing waves, and missing mothers.

    PubMed

    Nguyen, Cattram D; Strazdins, Lyndall; Nicholson, Jan M; Cooklin, Amanda R

    2018-07-01

    Understanding the long-term health effects of employment - a major social determinant - on population health is best understood via longitudinal cohort studies, yet missing data (attrition, item non-response) remain a ubiquitous challenge. Additionally, and unique to the work-family context, is the intermittent participation of parents, particularly mothers, in employment, yielding 'incomplete' data. Missing data are patterned by gender and social circumstances, and the extent and nature of resulting biases are unknown. This study investigates how estimates of the association between work-family conflict and mental health depend on the use of four different approaches to missing data treatment, each of which allows for progressive inclusion of more cases in the analyses. We used 5 waves of data from 4983 mothers participating in the Longitudinal Study of Australian Children. Only 23% had completely observed work-family conflict data across all waves. Participants with and without missing data differed such that complete cases were the most advantaged group. Comparison of the missing data treatments indicates the expected narrowing of confidence intervals when more of the sample was included. However, impact on the estimated strength of association varied by level of exposure: At the lower levels of work-family conflict, estimates strengthened (were larger); at higher levels they weakened (were smaller). Our results suggest that inadequate handling of missing data in extant longitudinal studies of work-family conflict and mental health may have misestimated the adverse effects of work-family conflict, particularly for mothers. Considerable caution should be exercised in interpreting analyses that fail to explore and account for biases arising from missing data. Copyright © 2018. Published by Elsevier Ltd.

  9. DISSCO: direct imputation of summary statistics allowing covariates

    PubMed Central

    Xu, Zheng; Duan, Qing; Yan, Song; Chen, Wei; Li, Mingyao; Lange, Ethan; Li, Yun

    2015-01-01

    Background: Imputation of individual level genotypes at untyped markers using an external reference panel of genotyped or sequenced individuals has become standard practice in genetic association studies. Direct imputation of summary statistics can also be valuable, for example in meta-analyses where individual level genotype data are not available. Two methods (DIST and ImpG-Summary/LD), that assume a multivariate Gaussian distribution for the association summary statistics, have been proposed for imputing association summary statistics. However, both methods assume that the correlations between association summary statistics are the same as the correlations between the corresponding genotypes. This assumption can be violated in the presence of confounding covariates. Methods: We analytically show that in the absence of covariates, correlation among association summary statistics is indeed the same as that among the corresponding genotypes, thus serving as a theoretical justification for the recently proposed methods. We continue to prove that in the presence of covariates, correlation among association summary statistics becomes the partial correlation of the corresponding genotypes controlling for covariates. We therefore develop direct imputation of summary statistics allowing covariates (DISSCO). Results: We consider two real-life scenarios where the correlation and partial correlation likely make practical difference: (i) association studies in admixed populations; (ii) association studies in presence of other confounding covariate(s). Application of DISSCO to real datasets under both scenarios shows at least comparable, if not better, performance compared with existing correlation-based methods, particularly for lower frequency variants. For example, DISSCO can reduce the absolute deviation from the truth by 3.9–15.2% for variants with minor allele frequency <5%. Availability and implementation: http://www.unc.edu/∼yunmli/DISSCO. Contact: yunli
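    The key theoretical point of the record above — with covariates, the correlation among summary statistics becomes the partial correlation of the genotypes controlling for covariates — can be illustrated by residualising each genotype on the covariates and correlating the residuals. The toy data (an ancestry covariate confounding two otherwise independent genotypes) are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

def partial_corr(g1, g2, Z):
    """Partial correlation of g1 and g2 controlling for covariates Z:
    regress each variable on Z (with intercept), correlate residuals."""
    Z = np.column_stack([np.ones(len(g1)), Z])
    r1 = g1 - Z @ np.linalg.lstsq(Z, g1, rcond=None)[0]
    r2 = g2 - Z @ np.linalg.lstsq(Z, g2, rcond=None)[0]
    return (r1 @ r2) / np.sqrt((r1 @ r1) * (r2 @ r2))

# toy data: an ancestry covariate drives both genotypes, inflating
# their raw correlation relative to the partial correlation
n = 5000
ancestry = rng.normal(size=n)
g1 = ancestry + rng.normal(size=n)
g2 = ancestry + rng.normal(size=n)

raw = np.corrcoef(g1, g2)[0, 1]           # ~0.5: confounded
partial = partial_corr(g1, g2, ancestry)  # ~0.0 after adjustment
```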

  10. Patient understanding of oral contraceptive pill instructions related to missed pills: a systematic review.

    PubMed

    Zapata, Lauren B; Steenland, Maria W; Brahmi, Dalia; Marchbanks, Polly A; Curtis, Kathryn M

    2013-05-01

    Instructions on what to do after pills are missed are critical to reducing unintended pregnancies resulting from patient non-adherence to oral contraceptive (OC) regimens. Missed pill instructions have previously been criticized for being too complex, lacking a definition of what is meant by "missed pills," and for being confusing to women who may not know the estrogen content of their formulation. To help inform the development of missed pill guidance to be included in the forthcoming US Selected Practice Recommendations, the objective of this systematic review was to evaluate the evidence on patient understanding of missed pill instructions. We searched the PubMed database for peer-reviewed articles that examined patient understanding of OC pill instructions that were published in any language from inception of the database through March 2012. We included studies that examined women's knowledge and understanding of missed pill instructions after exposure to some written material (e.g., patient package insert, brochure), as well as studies that compared different types of missed pill instructions on women's comprehension. We used standard abstract forms and grading systems to summarize and assess the quality of the evidence. From 1620 articles, nine studies met our inclusion criteria. Evidence from one randomized controlled trial (RCT) and two descriptive studies found that more women knew what to do after missing 1 pill than after missing 2 or 3 pills (Level I, good, to Level II-3, poor), and two descriptive studies found that more women knew what to do after missing 2 pills than after missing 3 pills (Level II-3, fair). Data from two descriptive studies documented the difficulty women have understanding missed pill instructions contained in patient package inserts (Level II-3, poor), and evidence from two RCTs found that providing written brochures with information on missed pill instructions in addition to contraceptive counseling significantly improved

  11. Communication: Three-fold covariance imaging of laser-induced Coulomb explosions

    NASA Astrophysics Data System (ADS)

    Pickering, James D.; Amini, Kasra; Brouard, Mark; Burt, Michael; Bush, Ian J.; Christensen, Lauge; Lauer, Alexandra; Nielsen, Jens H.; Slater, Craig S.; Stapelfeldt, Henrik

    2016-04-01

    We apply a three-fold covariance imaging method to analyse previously acquired data [C. S. Slater et al., Phys. Rev. A 89, 011401(R) (2014)] on the femtosecond laser-induced Coulomb explosion of spatially pre-aligned 3,5-dibromo-3',5'-difluoro-4'-cyanobiphenyl molecules. The data were acquired using the "Pixel Imaging Mass Spectrometry" camera. We show how three-fold covariance imaging of ionic photofragment recoil trajectories can be used to provide new information about the parent ion's molecular structure prior to its Coulomb explosion. In particular, we show how the analysis may be used to obtain information about molecular conformation and provide an alternative route for enantiomer determination.

  12. Does Missed Care in Isolated Rural Hospitals Matter?

    PubMed

    Smith, Jessica G

    2018-06-01

    Missed care is associated with adverse outcomes such as patient falls and decreased nurse job satisfaction. Although studied in populations of interest such as neonates, children, and heart failure patients, there are no studies about missed care in rural hospitals. Reducing care omissions in rural hospitals might help improve rural patient outcomes and ensure that rural hospitals can remain open in an era of hospital reimbursement dependent on care outcomes, such as through value-based purchasing. Understanding the extent of missed nursing care and its implications for rural populations might provide crucial information to alert rural hospital administrators and nurses about the incidence and influence of missed care on health outcomes. Focusing on missed care within rural hospitals and other rural health care settings is important to address the specific health needs of aging rural U.S. residents who are isolated from high-volume, urban health care facilities.

  13. Characteristics of HIV patients who missed their scheduled appointments

    PubMed Central

    Nagata, Delsa; Gutierrez, Eliana Battaggia

    2016-01-01

    ABSTRACT OBJECTIVE To analyze whether sociodemographic characteristics, consultations and care in special services are associated with scheduled infectious diseases appointments missed by people living with HIV. METHODS This cross-sectional and analytical study included 3,075 people living with HIV who had at least one scheduled appointment with an infectologist at a specialized health unit in 2007. A secondary data base from the Hospital Management & Information System was used. The outcome variable was missing a scheduled medical appointment. The independent variables were sex, age, appointments in specialized and available disciplines, hospitalizations at the Central Institute of the Clinical Hospital at the Faculdade de Medicina of the Universidade de São Paulo, antiretroviral treatment and change of infectologist. Crude and multiple association analyses were performed among the variables, with a statistical significance of p ≤ 0.05. RESULTS More than a third (38.9%) of the patients missed at least one of their scheduled infectious diseases appointments; 70.0% of the patients were male. The rate of missed appointments was 13.9%, albeit with no observed association between sex and absences. Age was inversely associated with missed appointments. Not undertaking anti-retroviral treatment, having unscheduled infectious diseases consultations or social services care and being hospitalized at the Central Institute were directly associated with missed appointments. CONCLUSIONS The Hospital Management & Information System proved to be a useful tool for developing indicators related to the quality of health care of people living with HIV. Other informational systems, which are often developed for administrative purposes, can also be useful for local and regional management and for evaluating the quality of care provided for patients living with HIV. PMID:26786472

  14. A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix.

    PubMed

    Hu, Zongliang; Dong, Kai; Dai, Wenlin; Tong, Tiejun

    2017-09-21

    The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of high-dimensional covariance matrix. Finally, from a perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
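    Why specialised estimators are needed in high dimensions is easy to demonstrate: with more variables than observations the sample covariance matrix is singular and its log-determinant is -inf. One of the simplest remedies (not necessarily among the eight compared in the paper) is shrinkage toward a scaled identity:

```python
import numpy as np

rng = np.random.default_rng(3)

def shrinkage_logdet(X, alpha=0.1):
    """Log-determinant of a shrunk covariance estimate
    S_alpha = (1 - alpha) * S + alpha * (tr(S)/p) * I.
    Shrinking toward a scaled identity keeps the estimate positive
    definite even when the dimension p approaches or exceeds n."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    S_alpha = (1 - alpha) * S + alpha * (np.trace(S) / p) * np.eye(p)
    sign, logdet = np.linalg.slogdet(S_alpha)  # stable log-determinant
    assert sign > 0
    return logdet

# p = 80 variables but only n = 60 observations: the raw sample
# covariance is singular, yet the shrunk estimate has a finite
# log-determinant
X = rng.normal(size=(60, 80))
ld = shrinkage_logdet(X)
```

    Working with `slogdet` rather than `det` is itself part of the practical guidance here: in high dimensions the determinant under- or overflows long before the log-determinant does.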

  15. Why are they missing? : Bioinformatics characterization of missing human proteins.

    PubMed

    Elguoshy, Amr; Magdeldin, Sameh; Xu, Bo; Hirao, Yoshitoshi; Zhang, Ying; Kinoshita, Naohiko; Takisawa, Yusuke; Nameta, Masaaki; Yamamoto, Keiko; El-Refy, Ali; El-Fiky, Fawzy; Yamamoto, Tadashi

    2016-10-21

    NeXtProt is a web-based protein knowledge platform that supports research on human proteins. NeXtProt (release 2015-04-28) lists 20,060 proteins; among them, 3373 canonical proteins (16.8%) lack credible experimental evidence at the protein level (PE2:PE5). Therefore, they are considered "missing proteins". A comprehensive bioinformatic workflow has been proposed to analyze these "missing" proteins. The aims of the current study were to analyze physicochemical properties, existence and distribution of the tryptic cleavage sites, and to pinpoint the signature peptides of the missing proteins. Our findings showed that 23.7% of missing proteins were hydrophobic proteins possessing transmembrane domains (TMD). Also, forty missing entries generated tryptic peptides that were either out of the mass detection range (>30 aa) or mapped to different proteins (<9 aa). Additionally, 21% of missing entries didn't generate any unique tryptic peptides. An in silico endopeptidase combination strategy increased the possibility of missing protein identification. Accordingly, using both a mature protein database and a signal peptidome database could be a promising option to identify some missing proteins, by targeting their unique N-terminal tryptic peptide from the mature protein database and/or their C-terminal tryptic peptide from the signal peptidome database. In conclusion, identification of missing proteins requires additional consideration during sample preparation, extraction, digestion and data analysis to increase their incidence of identification. Copyright © 2016. Published by Elsevier B.V.

  16. Propensity score analysis with partially observed covariates: How should multiple imputation be used?

    PubMed

    Leyrat, Clémence; Seaman, Shaun R; White, Ian R; Douglas, Ian; Smeeth, Liam; Kim, Joseph; Resche-Rigon, Matthieu; Carpenter, James R; Williamson, Elizabeth J

    2017-01-01

    Inverse probability of treatment weighting is a popular propensity score-based approach to estimate marginal treatment effects in observational studies at risk of confounding bias. A major issue when estimating the propensity score is the presence of partially observed covariates. Multiple imputation is a natural approach to handle missing data on covariates: covariates are imputed and a propensity score analysis is performed in each imputed dataset to estimate the treatment effect. The treatment effect estimates from each imputed dataset are then combined to obtain an overall estimate. We call this method MIte. However, an alternative approach has been proposed, in which the propensity scores are combined across the imputed datasets (MIps). Therefore, there are remaining uncertainties about how to implement multiple imputation for propensity score analysis: (a) should we apply Rubin's rules to the inverse probability of treatment weighting treatment effect estimates or to the propensity score estimates themselves? (b) does the outcome have to be included in the imputation model? (c) how should we estimate the variance of the inverse probability of treatment weighting estimator after multiple imputation? We studied the consistency and balancing properties of the MIte and MIps estimators and performed a simulation study to empirically assess their performance for the analysis of a binary outcome. We also compared the performance of these methods to complete case analysis and the missingness pattern approach, which uses a different propensity score model for each pattern of missingness, and a third multiple imputation approach in which the propensity score parameters are combined rather than the propensity scores themselves (MIpar). Under a missing at random mechanism, complete case and missingness pattern analyses were biased in most cases for estimating the marginal treatment effect, whereas multiple imputation approaches were approximately unbiased as long as the
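    The pooling step that question (a) above turns on is Rubin's rules, which are short enough to state in code. The estimate/variance pairs below are illustrative numbers, as in the MIte strategy of pooling treatment-effect estimates across imputed datasets:

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool M multiply-imputed estimates with Rubin's rules: the pooled
    point estimate is the mean, and the total variance is the
    within-imputation variance plus (1 + 1/M) times the
    between-imputation variance."""
    est = np.asarray(estimates, float)
    var = np.asarray(variances, float)
    M = len(est)
    q_bar = est.mean()            # pooled point estimate
    w_bar = var.mean()            # within-imputation variance
    b = est.var(ddof=1)           # between-imputation variance
    t = w_bar + (1 + 1 / M) * b   # total variance
    return q_bar, t

# treatment-effect estimates and their variances from M = 5
# imputed datasets (hypothetical values)
q, t = rubins_rules([0.50, 0.55, 0.48, 0.53, 0.49],
                    [0.010, 0.011, 0.009, 0.010, 0.010])
```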

  17. Revealing hidden covariation detection: evidence for implicit abstraction at study.

    PubMed

    Rossnagel, C S

    2001-09-01

    Four experiments in the brain scans paradigm (P. Lewicki, T. Hill, & I. Sasaki, 1989) investigated hidden covariation detection (HCD). In Experiment 1 HCD was found in an implicit- but not in an explicit-instruction group. In Experiment 2 HCD was impaired by nonholistic perception of stimuli but not by divided attention. In Experiment 3 HCD was eliminated by interspersing stimuli that deviated from the critical covariation. In Experiment 4 a transfer procedure was used. HCD was found with dissimilar test stimuli that preserved the covariation but was almost eliminated with similar stimuli that were neutral as to the covariation. Awareness was assessed both by objective and subjective tests in all experiments. Results suggest that HCD is an effect of implicit rule abstraction and that similarity processing plays only a minor role. HCD might be suppressed by intentional search strategies that induce inappropriate aggregation of stimulus information.

  18. Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data.

    PubMed

    Gottfredson, Nisha C; Sterba, Sonya K; Jackson, Kristina M

    2017-01-01

    Random coefficient-dependent (RCD) missingness is a non-ignorable mechanism through which missing data can arise in longitudinal designs. RCD, for which we cannot test, is a problematic form of missingness that occurs if subject-specific random effects correlate with propensity for missingness or dropout. Particularly when covariate missingness is a problem, investigators typically handle missing longitudinal data by using single-level multiple imputation procedures implemented with long-format data, which ignores within-person dependency entirely, or implemented with wide-format (i.e., multivariate) data, which ignores some aspects of within-person dependency. When either of these standard approaches to handling missing longitudinal data is used, RCD missingness leads to parameter bias and incorrect inference. We explain why multilevel multiple imputation (MMI) should alleviate bias induced by a RCD missing data mechanism under conditions that contribute to stronger determinacy of random coefficients. We evaluate our hypothesis with a simulation study. Three design factors are considered: intraclass correlation (ICC; ranging from .25 to .75), number of waves (ranging from 4 to 8), and percent of missing data (ranging from 20 to 50%). We find that MMI greatly outperforms the single-level wide-format (multivariate) method for imputation under a RCD mechanism. For the MMI analyses, bias was most alleviated when the ICC is high, there were more waves of data, and when there was less missing data. Practical recommendations for handling longitudinal missing data are suggested.

  19. Signs of depth-luminance covariance in 3-D cluttered scenes.

    PubMed

    Scaccia, Milena; Langer, Michael S

    2018-03-01

    In three-dimensional (3-D) cluttered scenes such as foliage, deeper surfaces often are more shadowed and hence darker, and so depth and luminance often have negative covariance. We examined whether the sign of depth-luminance covariance plays a role in depth perception in 3-D clutter. We compared scenes rendered with negative and positive depth-luminance covariance where positive covariance means that deeper surfaces are brighter and negative covariance means deeper surfaces are darker. For each scene, the sign of the depth-luminance covariance was given by occlusion cues. We tested whether subjects could use this sign information to judge the depth order of two target surfaces embedded in 3-D clutter. The clutter consisted of distractor surfaces that were randomly distributed in a 3-D volume. We tested three independent variables: the sign of the depth-luminance covariance, the colors of the targets and distractors, and the background luminance. An analysis of variance showed two main effects: Subjects performed better when the deeper surfaces were darker and when the color of the target surfaces was the same as the color of the distractors. There was also a strong interaction: Subjects performed better under a negative depth-luminance covariance condition when targets and distractors had different colors than when they had the same color. Our results are consistent with a "dark means deep" rule, but the use of this rule depends on the similarity between the color of the targets and color of the 3-D clutter.

  20. Point of care experience with pneumococcal and influenza vaccine documentation among persons aged ≥65 years: high refusal rates and missing information.

    PubMed

    Brownfield, Elisha; Marsden, Justin E; Iverson, Patty J; Zhao, Yumin; Mauldin, Patrick D; Moran, William P

    2012-09-01

    Missed opportunities to vaccinate and refusal of vaccine by patients have hindered the achievement of national health care goals. The meaningful use of electronic medical records should improve vaccination rates, but few studies have examined the content of these records. In our vaccine intervention program using an electronic record with physician prompts, paper prompts, and nursing standing orders, we were unable to achieve national vaccine goals, due in large part to missing information and patient refusal. Copyright © 2012 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Mosby, Inc. All rights reserved.

  1. View of MISSE on the Airlock taken during STS-112

    NASA Image and Video Library

    2002-10-10

    STS112-E-05104 (10 October 2002) --- Backdropped by a blue and white Earth, the Materials International Space Station Experiment (MISSE) and a portion of the Quest Airlock are visible. MISSE collects information on how different materials weather in the environment of space.

  2. Using an EM Covariance Matrix to Estimate Structural Equation Models with Missing Data: Choosing an Adjusted Sample Size to Improve the Accuracy of Inferences

    ERIC Educational Resources Information Center

    Enders, Craig K.; Peugh, James L.

    2004-01-01

    Two methods, direct maximum likelihood (ML) and the expectation maximization (EM) algorithm, can be used to obtain ML parameter estimates for structural equation models with missing data (MD). Although the 2 methods frequently produce identical parameter estimates, it may be easier to satisfy missing at random assumptions using EM. However, no…

  3. Covariance Bell inequalities

    NASA Astrophysics Data System (ADS)

    Pozsgay, Victor; Hirsch, Flavien; Branciard, Cyril; Brunner, Nicolas

    2017-12-01

    We introduce Bell inequalities based on covariance, one of the most common measures of correlation. Explicit examples are discussed, and violations in quantum theory are demonstrated. A crucial feature of these covariance Bell inequalities is their nonlinearity; this has nontrivial consequences for the derivation of their local bound, which is not reached by deterministic local correlations. For our simplest inequality, we derive analytically tight bounds for both local and quantum correlations. An interesting application of covariance Bell inequalities is that they can act as "shared randomness witnesses": specifically, the value of the Bell expression gives device-independent lower bounds on both the dimension and the entropy of the shared random variable in a local model.

  4. Missing drivers with dementia: antecedents and recovery.

    PubMed

    Rowe, Meredeth A; Greenblum, Catherine A; Boltz, Marie; Galvin, James E

    2012-11-01

    To determine the circumstances under which persons with dementia become lost while driving, how missing drivers are found, and how Silver Alert notifications are instrumental in those discoveries. A retrospective, descriptive study based on record review, conducted using 156 records from the Florida Silver Alert program for October 2008 through May 2010. These alerts were issued in Florida for missing drivers with dementia. Information was derived from the reports on characteristics of the missing driver, antecedents to the missing event, and discovery of the missing driver. The majority of missing drivers were men aged 58 to 94 who were being cared for by a spouse. Most drivers became lost on routine, caregiver-sanctioned trips to usual locations. Only 15% were driving when found, with most being found in or near a parked car. Law enforcement officers found the large majority. Only 40% were found in the county where they went missing, and 10% were found in a different state. Silver Alert notifications were most effective for law enforcement; citizen alerts resulted in only a few discoveries. There was 5% mortality in the study population, with those living alone more likely to be found dead than alive. An additional 15% were found in dangerous situations such as stopped on railroad tracks. Thirty-two percent had documented driving or other dangerous errors, such as driving the wrong way or into secluded areas or walking in or near roadways. © 2012, Copyright the Authors Journal compilation © 2012, The American Geriatrics Society.

  5. Bayes linear covariance matrix adjustment

    NASA Astrophysics Data System (ADS)

    Wilkinson, Darren J.

    1995-12-01

    In this thesis, a Bayes linear methodology for the adjustment of covariance matrices is presented and discussed. A geometric framework for quantifying uncertainties about covariance matrices is set up, and an inner-product for spaces of random matrices is motivated and constructed. The inner-product on this space captures aspects of our beliefs about the relationship between covariance matrices of interest to us, providing a structure rich enough for us to adjust beliefs about unknown matrices in the light of data such as sample covariance matrices, exploiting second-order exchangeability and related specifications to obtain representations allowing analysis. Adjustment is associated with orthogonal projection, and illustrated with examples of adjustments for some common problems. The problem of adjusting the covariance matrices underlying exchangeable random vectors is tackled and discussed. Learning about the covariance matrices associated with multivariate time series dynamic linear models is shown to be amenable to a similar approach. Diagnostics for matrix adjustments are also discussed.

  6. Efficient retrieval of landscape Hessian: Forced optimal covariance adaptive learning

    NASA Astrophysics Data System (ADS)

    Shir, Ofer M.; Roslund, Jonathan; Whitley, Darrell; Rabitz, Herschel

    2014-06-01

    Knowledge of the Hessian matrix at the landscape optimum of a controlled physical observable offers valuable information about the system robustness to control noise. The Hessian can also assist in physical landscape characterization, which is of particular interest in quantum system control experiments. The recently developed landscape theoretical analysis motivated the compilation of an automated method to learn the Hessian matrix about the global optimum without derivative measurements from noisy data. The current study introduces the forced optimal covariance adaptive learning (FOCAL) technique for this purpose. FOCAL relies on the covariance matrix adaptation evolution strategy (CMA-ES) that exploits covariance information amongst the control variables by means of principal component analysis. The FOCAL technique is designed to operate with experimental optimization, generally involving continuous high-dimensional search landscapes (≳30) with large Hessian condition numbers (≳10^4). This paper introduces the theoretical foundations of the inverse relationship between the covariance learned by the evolution strategy and the actual Hessian matrix of the landscape. FOCAL is presented and demonstrated to retrieve the Hessian matrix with high fidelity on both model landscapes and quantum control experiments, which are observed to possess nonseparable, nonquadratic search landscapes. The recovered Hessian forms were corroborated by physical knowledge of the systems. The implications of FOCAL extend beyond the investigated studies to potentially cover other physically motivated multivariate landscapes.
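
    The inverse covariance-Hessian relationship mentioned above can be sketched under a simplifying quadratic-landscape assumption. This is not the FOCAL/CMA-ES procedure itself: points sampled with density proportional to exp(-J(x)) for a quadratic J have covariance equal to the inverse Hessian, so inverting a sample covariance recovers the Hessian. The 2-D Hessian below is a made-up example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed quadratic landscape J(x) = 1/2 x^T H x near the optimum.
H_true = np.array([[4.0, 1.0],
                   [1.0, 3.0]])

# Samples distributed ~ exp(-J(x)) are Gaussian with covariance H^{-1}.
cov_true = np.linalg.inv(H_true)
samples = rng.multivariate_normal(np.zeros(2), cov_true, size=200_000)

# Invert the sample covariance to estimate the Hessian.
H_est = np.linalg.inv(np.cov(samples, rowvar=False))
print(np.round(H_est, 1))   # close to H_true
```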

  7. Weakly Informative Prior for Point Estimation of Covariance Matrices in Hierarchical Models

    ERIC Educational Resources Information Center

    Chung, Yeojin; Gelman, Andrew; Rabe-Hesketh, Sophia; Liu, Jingchen; Dorie, Vincent

    2015-01-01

    When fitting hierarchical regression models, maximum likelihood (ML) estimation has computational (and, for some users, philosophical) advantages compared to full Bayesian inference, but when the number of groups is small, estimates of the covariance matrix (S) of group-level varying coefficients are often degenerate. One can do better, even from…

  8. Background Error Covariance Estimation using Information from a Single Model Trajectory with Application to Ocean Data Assimilation into the GEOS-5 Coupled Model

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele M.; Kovach, Robin M.; Vernieres, Guillaume; Koster, Randal D. (Editor)

    2014-01-01

    An attractive property of ensemble data assimilation methods is that they provide flow dependent background error covariance estimates which can be used to update fields of observed variables as well as fields of unobserved model variables. Two methods to estimate background error covariances are introduced which share the above property with ensemble data assimilation methods but do not involve the integration of multiple model trajectories. Instead, all the necessary covariance information is obtained from a single model integration. The Space Adaptive Forecast error Estimation (SAFE) algorithm estimates error covariances from the spatial distribution of model variables within a single state vector. The Flow Adaptive error Statistics from a Time series (FAST) method constructs an ensemble sampled from a moving window along a model trajectory. SAFE and FAST are applied to the assimilation of Argo temperature profiles into version 4.1 of the Modular Ocean Model (MOM4.1) coupled to the GEOS-5 atmospheric model and to the CICE sea ice model. The results are validated against unassimilated Argo salinity data. They show that SAFE and FAST are competitive with the ensemble optimal interpolation (EnOI) used by the Global Modeling and Assimilation Office (GMAO) to produce its ocean analysis. Because of their reduced cost, SAFE and FAST hold promise for high-resolution data assimilation applications.
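
    The FAST idea of sampling an ensemble from a moving window along a single trajectory can be sketched with toy assumptions. The AR(1) surrogate trajectory, window length, and state dimension below are illustrative; this is not the GEOS-5/MOM4.1 implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy single-model trajectory: a 3-variable AR(1) process standing in for
# the model state evolving in time.
n_steps, n_state = 2000, 3
A = 0.95 * np.eye(n_state)
x = np.zeros(n_state)
traj = np.empty((n_steps, n_state))
for t in range(n_steps):
    x = A @ x + rng.normal(0.0, 0.1, n_state)
    traj[t] = x

# FAST-style ensemble: the most recent `window` states along the trajectory,
# whose sample covariance serves as a flow-dependent background error
# covariance without integrating multiple model trajectories.
window = 50
ens = traj[-window:]
B = np.cov(ens, rowvar=False)
print(B.shape)   # (3, 3)
```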

  9. Moderating the Covariance Between Family Member’s Substance Use Behavior

    PubMed Central

    Eaves, Lindon J.; Neale, Michael C.

    2014-01-01

    Twin and family studies implicitly assume that the covariation between family members remains constant across differences in age between the members of the family. However, age-specificity in gene expression for shared environmental factors could generate higher correlations between family members who are more similar in age. Cohort effects (cohort × genotype or cohort × common environment) could have the same effects, and both potentially reduce effect sizes estimated in genome-wide association studies where the subjects are heterogeneous in age. In this paper we describe a model in which the covariance between twins and non-twin siblings is moderated as a function of age difference. We describe the details of the model and simulate data using a variety of different parameter values to demonstrate that model fitting returns unbiased parameter estimates. Power analyses are then conducted to estimate the sample sizes required to detect the effects of moderation in a design of twins and siblings. Finally, the model is applied to data on cigarette smoking. We find that (1) the model effectively recovers the simulated parameters, (2) the power is relatively low and therefore requires large sample sizes before small to moderate effect sizes can be found reliably, and (3) the genetic covariance between siblings for smoking behavior decays very rapidly. Result 3 implies that, e.g., genome-wide studies of smoking behavior that use individuals assessed at different ages, or belonging to different birth-year cohorts may have had substantially reduced power to detect effects of genotype on cigarette use. It also implies that significant special twin environmental effects can be explained by age-moderation in some cases. This effect likely contributes to the missing heritability paradox. PMID:24647834

  10. Allowing for uncertainty due to missing continuous outcome data in pairwise and network meta-analysis.

    PubMed

    Mavridis, Dimitris; White, Ian R; Higgins, Julian P T; Cipriani, Andrea; Salanti, Georgia

    2015-02-28

    Missing outcome data are commonly encountered in randomized controlled trials and hence may need to be addressed in a meta-analysis of multiple trials. A common and simple approach to deal with missing data is to restrict analysis to individuals for whom the outcome was obtained (complete case analysis). However, estimated treatment effects from complete case analyses are potentially biased if informative missing data are ignored. We develop methods for estimating meta-analytic summary treatment effects for continuous outcomes in the presence of missing data for some of the individuals within the trials. We build on a method previously developed for binary outcomes, which quantifies the degree of departure from a missing at random assumption via the informative missingness odds ratio. Our new model quantifies the degree of departure from missing at random using either an informative missingness difference of means or an informative missingness ratio of means, both of which relate the mean value of the missing outcome data to that of the observed data. We propose estimating the treatment effects, adjusted for informative missingness, and their standard errors by a Taylor series approximation and by a Monte Carlo method. We apply the methodology to examples of both pairwise and network meta-analysis with multi-arm trials. © 2014 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
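
    The informative missingness difference of means (IMDoM) idea above can be sketched in a few lines: the mean of the missing outcomes is assumed to differ from the observed mean by some delta, so the arm mean shifts by the missing proportion times delta. The function name and numbers are illustrative, not from the paper.

```python
def adjusted_mean(mean_obs, n_obs, n_miss, delta):
    """Arm mean under the assumption mean_missing = mean_obs + delta (IMDoM)."""
    p_miss = n_miss / (n_obs + n_miss)
    # Weighted average of observed mean and assumed missing-data mean.
    return mean_obs + p_miss * delta

# Hypothetical arm: observed mean 10, 20% missing, missing assumed 2 units worse.
print(adjusted_mean(10.0, 80, 20, -2.0))  # 9.6
```

Setting delta = 0 recovers the complete-case (missing at random) estimate; varying delta over a plausible range gives the sensitivity analysis the abstract describes.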

  11. Missing data imputation: focusing on single imputation.

    PubMed

    Zhang, Zhongheng

    2016-01-01

    Complete case analysis is widely used for handling missing data, and it is the default method in many statistical packages. However, this method may introduce bias, and some useful information will be omitted from the analysis. Therefore, many imputation methods have been developed to fill this gap. The present article focuses on single imputation. Imputation with the mean, median or mode is simple but, like complete case analysis, can introduce bias in the mean and deviation. Furthermore, these methods ignore relationships with other variables. Regression imputation can preserve relationships between missing values and other variables. Many sophisticated methods exist to handle missing values in longitudinal data. This article focuses primarily on how to implement R code to perform single imputation, while avoiding complex mathematical calculations.
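
    The contrast drawn above between mean imputation and regression imputation can be demonstrated on simulated data (a Python sketch rather than the article's R code; the data and missingness rate are made up).

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data: y depends on a fully observed covariate x.
n = 1000
x = rng.normal(0, 1, n)
y = 2.0 * x + rng.normal(0, 1, n)
miss = rng.random(n) < 0.3               # 30% of y missing completely at random

# Mean imputation: every gap gets the observed mean, shrinking the variance
# and ignoring the relationship with x.
y_mean = y.copy()
y_mean[miss] = y[~miss].mean()

# Regression imputation: predict missing y from x using the observed cases,
# preserving the relationship with the covariate.
slope, intercept = np.polyfit(x[~miss], y[~miss], 1)
y_reg = y.copy()
y_reg[miss] = intercept + slope * x[miss]

print(y_mean.var() < y_reg.var() < y.var())  # True: mean imputation shrinks variance most
```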

  12. Missing data imputation: focusing on single imputation

    PubMed Central

    2016-01-01

    Complete case analysis is widely used for handling missing data, and it is the default method in many statistical packages. However, this method may introduce bias, and some useful information will be omitted from the analysis. Therefore, many imputation methods have been developed to fill this gap. The present article focuses on single imputation. Imputation with the mean, median or mode is simple but, like complete case analysis, can introduce bias in the mean and deviation. Furthermore, these methods ignore relationships with other variables. Regression imputation can preserve relationships between missing values and other variables. Many sophisticated methods exist to handle missing values in longitudinal data. This article focuses primarily on how to implement R code to perform single imputation, while avoiding complex mathematical calculations. PMID:26855945

  13. Covariance of fluid-turbulence theory.

    PubMed

    Ariki, Taketo

    2015-05-01

    Covariance of physical quantities in fluid-turbulence theory and their governing equations under generalized coordinate transformation is discussed. It is shown that the velocity fluctuation and its governing law have a covariance under a far wider group of coordinate transformations than conventional Euclidean invariance, and, as a natural consequence, various correlations and their governing laws are shown to be formulated in a covariant manner under this wider transformation group. In addition, it is also shown that the covariance of the Reynolds stress is tightly connected to the objectivity of the mean flow.

  14. Choosing relatives for DNA identification of missing persons.

    PubMed

    Ge, Jianye; Budowle, Bruce; Chakraborty, Ranajit

    2011-01-01

    DNA-based analysis is integral to missing person identification cases. When direct references are not available, indirect relative references can be used to identify missing persons by kinship analysis. Generally, more reference relatives render greater accuracy of identification. However, it is costly to type multiple references. Thus, at times, decisions may need to be made on which relatives to type. In this study, pedigrees for 37 common reference scenarios with 13 CODIS STRs were simulated to rank the information content of different combinations of relatives. The results confirm that first-order relatives (parents and full siblings) are the most preferred relatives for identifying missing persons; full siblings are also informative. Less genetic dependence between references provides a higher likelihood ratio on average. Distant relatives may not be helpful with autosomal markers alone, but lineage-based Y chromosome and mitochondrial DNA markers can increase the likelihood ratio or serve as filters to exclude putative relationships. © 2010 American Academy of Forensic Sciences.

  15. Reconstructing missing information on precipitation datasets: impact of tails on adopted statistical distributions.

    NASA Astrophysics Data System (ADS)

    Pedretti, Daniele; Beckie, Roger Daniel

    2014-05-01

    Missing data in hydrological time-series databases are ubiquitous in practical applications, yet it is of fundamental importance to make educated decisions in problems requiring exhaustive time-series knowledge. This includes precipitation datasets, since recording or human failures can produce gaps in these time series. For some applications, directly involving the ratio between precipitation and some other quantity, lack of complete information can result in poor understanding of basic physical and chemical dynamics involving precipitated water. For instance, the ratio between precipitation (recharge) and outflow rates at a discharge point of an aquifer (e.g. rivers, pumping wells, lysimeters) can be used to obtain aquifer parameters and thus to constrain model-based predictions. We tested a suite of methodologies to reconstruct missing information in rainfall datasets. The goal was to obtain a suitable and versatile method to reduce the errors caused by the lack of data in specific time windows. Our analyses included both a classical chronological-pairing approach between rainfall stations and a probability-based approach, which accounted for the probability of exceedance of rain depths measured at two or multiple stations. Our analyses showed that it is not clear a priori which method performs best; rather, the selection should be based on the specific statistical properties of the rainfall dataset. In this presentation, our emphasis is on the effects of a few typical parametric distributions used to model the behavior of rainfall. Specifically, we analyzed the role of distributional "tails", which have an important control on the occurrence of extreme rainfall events. The latter strongly affect several hydrological applications, including recharge-discharge relationships. The heavy-tailed distributions we considered were the parametric Log-Normal, Generalized Pareto, Generalized Extreme Value and Gamma distributions.

  16. The treatment of missing data in a large cardiovascular clinical outcomes study.

    PubMed

    Little, Roderick J; Wang, Julia; Sun, Xiang; Tian, Hong; Suh, Eun-Young; Lee, Michael; Sarich, Troy; Oppenheimer, Leonard; Plotnikov, Alexei; Wittes, Janet; Cook-Bruns, Nancy; Burton, Paul; Gibson, C Michael; Mohanty, Surya

    2016-06-01

    The potential impact of missing data on the results of clinical trials has received heightened attention recently. A National Research Council study provides recommendations for limiting missing data in clinical trial design and conduct, and principles for analysis, including the need for sensitivity analyses to assess robustness of findings to alternative assumptions about the missing data. A Food and Drug Administration advisory committee raised missing data as a serious concern in their review of results from the ATLAS ACS 2 TIMI 51 study, a large clinical trial that assessed rivaroxaban for its ability to reduce the risk of cardiovascular death, myocardial infarction or stroke in patients with acute coronary syndrome. This case study describes a variety of measures that were taken to address concerns about the missing data. A range of analyses are described to assess the potential impact of missing data on conclusions. In particular, measures of the amount of missing data are discussed, and the fraction of missing information from multiple imputation is proposed as an alternative measure. The sensitivity analysis in the National Research Council study is modified in the context of survival analysis where some individuals are lost to follow-up. The impact of deviations from ignorable censoring is assessed by differentially increasing the hazard of the primary outcome in the treatment groups and multiply imputing events between dropout and the end of the study. Tipping-point analyses are described, where the deviation from ignorable censoring that results in a reversal of significance of the treatment effect is determined. A study to determine the vital status of participants lost to follow-up was also conducted, and the results of including this additional information are assessed. 
Sensitivity analyses suggest that the findings of the ATLAS ACS 2 TIMI 51 study are robust to missing data; this robustness is reinforced by the follow-up study of the vital status of participants lost to follow-up.

  17. Constructing statistically unbiased cortical surface templates using feature-space covariance

    NASA Astrophysics Data System (ADS)

    Parvathaneni, Prasanna; Lyu, Ilwoo; Huo, Yuankai; Blaber, Justin; Hainline, Allison E.; Kang, Hakmook; Woodward, Neil D.; Landman, Bennett A.

    2018-03-01

    The choice of surface template plays an important role in cross-sectional subject analyses involving cortical brain surfaces because there is a tendency toward registration bias given variations in inter-individual and inter-group sulcal and gyral patterns. In order to account for the bias and spatial smoothing, we propose a feature-based unbiased average template surface. In contrast to prior approaches, we factor in the sample population covariance and assign weights based on feature information to minimize the influence of covariance in the sampled population. The mean surface is computed by applying the weights obtained from an inverse covariance matrix, which guarantees that multiple representations from similar groups (e.g., involving imaging, demographic, diagnosis information) are down-weighted to yield an unbiased mean in feature space. Results are validated by applying this approach in two different applications. For evaluation, the proposed unbiased weighted surface mean is compared with unweighted means both qualitatively and quantitatively (mean squared error and absolute relative distance of both means from the baseline). In the first application, we validated the stability of the proposed optimal mean on a scan-rescan reproducibility dataset by incrementally adding duplicate subjects. In the second application, we used clinical research data to evaluate the difference between the weighted and unweighted means when different numbers of subjects were included in the control versus schizophrenia groups. In both cases, the proposed method achieved greater stability, indicating reduced impact of sampling bias. The weighted mean is built on covariance information in feature space as opposed to spatial location, making this a generic approach applicable to any feature of interest.
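
    The inverse-covariance weighting described above can be sketched with a toy 3-subject covariance matrix (assumed numbers, not surface data): the minimum-variance unbiased weights are proportional to the row sums of the inverse covariance, which down-weights clusters of highly correlated subjects.

```python
import numpy as np

# Assumed covariance among three sampled subjects: subjects 0 and 1 are
# near-duplicates (correlation 0.9), subject 2 is independent.
C = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Generalized-least-squares weights: C^{-1} 1 / (1^T C^{-1} 1).
ones = np.ones(3)
w = np.linalg.solve(C, ones)
w = w / w.sum()

print(np.round(w, 3))  # the two near-duplicates split less weight than the independent subject
```

Each near-duplicate ends up with roughly half the weight of the independent subject, so the duplicated information does not dominate the mean.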

  18. Examining Solutions to Missing Data in Longitudinal Nursing Research

    PubMed Central

    Roberts, Mary B.; Sullivan, Mary C.; Winchester, Suzy B.

    2017-01-01

    Purpose Longitudinal studies are highly valuable in pediatrics because they provide useful data about developmental patterns of child health and behavior over time. When data are missing, the value of the research is impacted. The study’s purpose was to: (1) introduce a 3-step approach to assess and address missing data; (2) illustrate this approach using categorical and continuous level variables from a longitudinal study of premature infants. Methods A three-step approach with simulations was followed to assess the amount and pattern of missing data and to determine the most appropriate imputation method for the missing data. Patterns of missingness were Missing Completely at Random, Missing at Random, and Not Missing at Random. Missing continuous-level data were imputed using mean replacement, stochastic regression, multiple imputation, and fully conditional specification. Missing categorical-level data were imputed using last value carried forward, hot-decking, stochastic regression, and fully conditional specification. Simulations were used to evaluate these imputation methods under different patterns of missingness at different levels of missing data. Results The rate of missingness was 16–23% for continuous variables and 1–28% for categorical variables. Fully conditional specification imputation provided the least difference in mean and standard deviation estimates for continuous measures. Fully conditional specification imputation was acceptable for categorical measures. Results obtained through simulation reinforced and confirmed these findings. Practice Implications Significant investments are made in the collection of longitudinal data. The prudent handling of missing data can protect these investments and potentially improve the scientific information contained in pediatric longitudinal studies. PMID:28425202

  19. Examining solutions to missing data in longitudinal nursing research.

    PubMed

    Roberts, Mary B; Sullivan, Mary C; Winchester, Suzy B

    2017-04-01

    Longitudinal studies are highly valuable in pediatrics because they provide useful data about developmental patterns of child health and behavior over time. When data are missing, the value of the research is impacted. The study's purpose was to (1) introduce a three-step approach to assess and address missing data and (2) illustrate this approach using categorical and continuous-level variables from a longitudinal study of premature infants. A three-step approach with simulations was followed to assess the amount and pattern of missing data and to determine the most appropriate imputation method for the missing data. Patterns of missingness were Missing Completely at Random, Missing at Random, and Not Missing at Random. Missing continuous-level data were imputed using mean replacement, stochastic regression, multiple imputation, and fully conditional specification (FCS). Missing categorical-level data were imputed using last value carried forward, hot-decking, stochastic regression, and FCS. Simulations were used to evaluate these imputation methods under different patterns of missingness at different levels of missing data. The rate of missingness was 16-23% for continuous variables and 1-28% for categorical variables. FCS imputation provided the least difference in mean and standard deviation estimates for continuous measures. FCS imputation was acceptable for categorical measures. Results obtained through simulation reinforced and confirmed these findings. Significant investments are made in the collection of longitudinal data. The prudent handling of missing data can protect these investments and potentially improve the scientific information contained in pediatric longitudinal studies. © 2017 Wiley Periodicals, Inc.
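
    The missingness patterns named above can be generated for simulation purposes as follows. This is an illustrative sketch, not the study's simulation code: MCAR masks are independent of everything, while MAR masks depend only on an observed covariate (here a made-up logistic model on age).

```python
import numpy as np

rng = np.random.default_rng(4)

n = 10_000
age = rng.normal(50, 10, n)               # fully observed covariate
y = rng.normal(0, 1, n)                   # outcome to be masked

# MCAR: 20% missing, independent of both y and age.
mcar = rng.random(n) < 0.2

# MAR: probability of missingness rises with observed age (but not with y),
# via a logistic curve centered at age 50; ~20% missing on average.
p = 1 / (1 + np.exp(-(age - 50) / 10))
mar = rng.random(n) < 0.4 * p

print(round(mcar.mean(), 2), round(mar.mean(), 2))
```

Under the MAR mask, subjects with missing outcomes are systematically older, which is exactly the structure imputation methods conditioned on observed covariates can exploit.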

  20. A quantile regression model for failure-time data with time-dependent covariates

    PubMed Central

    Gorfine, Malka; Goldberg, Yair; Ritov, Ya’acov

    2017-01-01

    Summary Since survival data occur over time, important covariates that we wish to consider often also change over time. Such covariates are referred to as time-dependent covariates. Quantile regression offers flexible modeling of survival data by allowing the covariates to vary with quantiles. This article provides a novel quantile regression model accommodating time-dependent covariates, for analyzing survival data subject to right censoring. Our simple estimation technique assumes the existence of instrumental variables. In addition, we present a doubly-robust estimator in the sense of Robins and Rotnitzky (1992, Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell, N. P., Dietz, K. and Farewell, V. T. (editors), AIDS Epidemiology. Boston: Birkhäuser, pp. 297–331.). The asymptotic properties of the estimators are rigorously studied. Finite-sample properties are demonstrated by a simulation study. The utility of the proposed methodology is demonstrated using the Stanford heart transplant dataset. PMID:27485534

  1. The Concept of Missing Incidents in Persons with Dementia

    PubMed Central

    Rowe, Meredeth; Houston, Amy; Molinari, Victor; Bulat, Tatjana; Bowen, Mary Elizabeth; Spring, Heather; Mutolo, Sandra; McKenzie, Barbara

    2015-01-01

    Behavioral symptoms of dementia often present the greatest challenge for informal caregivers. One behavior that is a constant concern for caregivers is the person with dementia leaving a designated area such that their whereabouts become unknown to the caregiver, termed a missing incident. Based on an extensive literature review and published findings of their own research, members of the International Consortium on Wandering and Missing Incidents constructed a preliminary missing incidents model. Examining the evidence base, specific factors within each category of the model were further described, reviewed and modified until consensus was reached regarding the final model. The model begins to explain, in particular, the variety of antecedents that are related to missing incidents. The model presented in this paper is designed to be heuristic and may be used to stimulate discussion and the development of effective preventative and response strategies for missing incidents among persons with dementia. PMID:27417817

  2. The Concept of Missing Incidents in Persons with Dementia.

    PubMed

    Rowe, Meredeth; Houston, Amy; Molinari, Victor; Bulat, Tatjana; Bowen, Mary Elizabeth; Spring, Heather; Mutolo, Sandra; McKenzie, Barbara

    2015-11-10

    Behavioral symptoms of dementia often present the greatest challenge for informal caregivers. One behavior that is a constant concern for caregivers is the person with dementia leaving a designated area such that their whereabouts become unknown to the caregiver, termed a missing incident. Based on an extensive literature review and published findings of their own research, members of the International Consortium on Wandering and Missing Incidents constructed a preliminary missing incidents model. Examining the evidence base, specific factors within each category of the model were further described, reviewed and modified until consensus was reached regarding the final model. The model begins to explain in particular the variety of antecedents that are related to missing incidents. The model presented in this paper is designed to be heuristic and may be used to stimulate discussion and the development of effective preventative and response strategies for missing incidents among persons with dementia.

  3. Estimated Environmental Exposures for MISSE-3 and MISSE-4

    NASA Technical Reports Server (NTRS)

    Finckenor, Miria M.; Pippin, Gary; Kinard, William H.

    2008-01-01

    Describes the estimated environmental exposures for MISSE-3 and MISSE-4. These test beds, attached to the outside of the International Space Station, were planned for 3 years of exposure. This was changed to 1 year after MISSE-1 and -2 were in space for 4 years. MISSE-3 and -4 operate in a low Earth orbit space environment, which exposes them to a variety of assaults including atomic oxygen, ultraviolet radiation, particulate radiation, thermal cycling, and meteoroid/space debris impact, as well as contamination associated with proximity to an active space station. Measurements and determinations of atomic oxygen fluences, solar UV exposure levels, molecular contamination levels, and particulate radiation are included.

  4. A Review On Missing Value Estimation Using Imputation Algorithm

    NASA Astrophysics Data System (ADS)

    Armina, Roslan; Zain, Azlan Mohd; Azizah Ali, Nor; Sallehuddin, Roselina

    2017-09-01

    The presence of missing values in a data set has always been a major problem for precise prediction. Methods for imputing missing values need to minimize the effect of incomplete data sets on the prediction model. Many algorithms have been proposed as countermeasures for the missing value problem. In this review, we provide a comprehensive analysis of existing imputation algorithms, focusing on the techniques used and on whether global or local information from the data set is exploited for missing value estimation. In addition, validation methods for imputation results and ways to measure the performance of imputation algorithms are also described. The objective of this review is to highlight possible improvements on existing methods, and it is hoped that it gives readers a better understanding of trends in imputation methods.
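The review's distinction between global and local information can be illustrated with two toy imputers (a hedged numpy sketch; the function names and the choice of Euclidean distance are ours, not from the review):

```python
import numpy as np

def mean_impute(X):
    """Global strategy: fill each missing entry (NaN) with its column mean."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    idx = np.where(np.isnan(X))
    X[idx] = np.take(col_means, idx[1])
    return X

def knn_impute(X, k=2):
    """Local strategy: fill missing entries with the mean of the k nearest
    complete rows, with distance measured on the observed coordinates."""
    X = np.array(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]
    out = X.copy()
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        d = np.linalg.norm(complete[:, ~miss] - row[~miss], axis=1)
        nbrs = complete[np.argsort(d)[:k]]
        out[i, miss] = nbrs[:, miss].mean(axis=0)
    return out
```

The global imputer ignores the rest of the row; the local one uses it, which is why kNN-style methods can preserve structure that mean imputation destroys.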

  5. Covariant Uniform Acceleration

    NASA Astrophysics Data System (ADS)

    Friedman, Yaakov; Scarr, Tzvi

    2013-04-01

    We derive a 4D covariant Relativistic Dynamics Equation. This equation canonically extends the 3D relativistic dynamics equation F = dp/dt, where F is the 3D force and p = m0γv is the 3D relativistic momentum. The standard 4D equation is only partially covariant. To achieve full Lorentz covariance, we replace the four-force F by a rank 2 antisymmetric tensor acting on the four-velocity. By taking this tensor to be constant, we obtain a covariant definition of uniformly accelerated motion. This solves a problem of Einstein and Planck. We compute explicit solutions for uniformly accelerated motion. The solutions are divided into four Lorentz-invariant types: null, linear, rotational, and general. For null acceleration, the worldline is cubic in the time. Linear acceleration covariantly extends 1D hyperbolic motion, while rotational acceleration covariantly extends pure rotational motion. We use Generalized Fermi-Walker transport to construct a uniformly accelerated family of inertial frames which are instantaneously comoving to a uniformly accelerated observer. We explain the connection between our approach and that of Mashhoon. We show that our solutions of uniformly accelerated motion have constant acceleration in the comoving frame. Assuming the Weak Hypothesis of Locality, we obtain local spacetime transformations from a uniformly accelerated frame K' to an inertial frame K. The spacetime transformations between two uniformly accelerated frames with the same acceleration are Lorentz. We compute the metric at an arbitrary point of a uniformly accelerated frame. We obtain velocity and acceleration transformations from a uniformly accelerated system K' to an inertial frame K. We introduce the 4D velocity, an adaptation of Horwitz and Piron's notion of "off-shell." We derive the general formula for the time dilation between accelerated clocks. We obtain a formula for the angular velocity of a uniformly accelerated object. Every rest point of K' is uniformly accelerated, and
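The constant-antisymmetric-tensor construction can be checked numerically: with F antisymmetric, the Minkowski norm of the four-velocity is conserved, and a boost-like F reproduces 1D hyperbolic motion. A numpy sketch under the abstract's setup (the truncated-series matrix exponential is our own simplification):

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, signature (+,-,-,-)

def mat_exp(M, terms=40):
    """Matrix exponential by truncated power series (adequate for small ||M||)."""
    result, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

def propagate(F, u0, tau):
    """Evolve the four-velocity under du/dtau = A u with A = eta^{-1} F and
    F = -F^T. Antisymmetry of F conserves the Minkowski norm u.eta.u."""
    A = ETA @ F  # eta^{-1} equals eta for this signature
    return mat_exp(A * tau) @ u0

# Constant boost-like tensor: recovers 1D hyperbolic (linear) acceleration.
g = 0.5
F = np.zeros((4, 4))
F[0, 1], F[1, 0] = g, -g
u0 = np.array([1.0, 0.0, 0.0, 0.0])  # observer initially at rest
u = propagate(F, u0, tau=2.0)        # -> (cosh(g*tau), sinh(g*tau), 0, 0)
```

Rotational and null types correspond to other choices of the antisymmetric block structure of F; the norm-conservation check is the same in every case.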

  6. Evolutionary Characteristics of Missing Proteins: Insights into the Evolution of Human Chromosomes Related to Missing-Protein-Encoding Genes.

    PubMed

    Xu, Aishi; Li, Guang; Yang, Dong; Wu, Songfeng; Ouyang, Hongsheng; Xu, Ping; He, Fuchu

    2015-12-04

    Although the "missing protein" is a temporary concept in C-HPP, the biological reasons for their "missing" status could be an important clue in evolutionary studies. Here we classified missing-protein-encoding genes into two groups, the genes encoding PE2 proteins (with transcript evidence) and the genes encoding PE3/4 proteins (with no transcript evidence). These missing-protein-encoding genes distribute unevenly among different chromosomes, chromosomal regions, or gene clusters. In terms of evolutionary features, PE3/4 genes tend to be young, spreading in nonhomologous chromosomal regions and evolving at higher rates. Interestingly, there is a higher proportion of singletons in PE3/4 genes than in all genes (background) or in OTCSGs (organ, tissue, cell type-specific genes). More importantly, most of the paralogous PE3/4 genes belong to the newly duplicated members of the paralogous gene groups, which mainly contribute to special biological functions, such as "smell perception". These functions are heavily restricted to specific types of cells, tissues, or specific developmental stages, acting as the new functional requirements that facilitated the emergence of the missing-protein-encoding genes during evolution. In addition, criteria for proteins with extremely special physical-chemical properties were first set up based on the properties of PE2 proteins, and the evolutionary characteristics of those proteins were explored. Overall, the evolutionary analyses of missing-protein-encoding genes are expected to be highly instructive for proteomics and functional studies in the future.

  7. Reducing Misses and Near Misses Related to Multitasking on the Electronic Health Record: Observational Study and Qualitative Analysis.

    PubMed

    Ratanawongsa, Neda; Matta, George Y; Bohsali, Fuad B; Chisolm, Margaret S

    2018-02-06

    Clinicians' use of electronic health record (EHR) systems while multitasking may increase the risk of making errors, but silent EHR system use may lower patient satisfaction. Delaying EHR system use until after patient visits may increase clinicians' EHR workload, stress, and burnout. We aimed to describe the perspectives of clinicians, educators, administrators, and researchers about misses and near misses that they felt were related to clinician multitasking while using EHR systems. This observational study was a thematic analysis of perspectives elicited from 63 continuing medical education (CME) participants during 2 workshops and 1 interactive lecture about challenges and strategies for relationship-centered communication during clinician EHR system use. The workshop elicited reflection about memorable times when multitasking EHR use was associated with "misses" (errors that were not caught at the time) or "near misses" (mistakes that were caught before leading to errors). We conducted qualitative analysis using an editing analysis style to identify codes and then select representative themes and quotes. All workshop participants shared stories of misses or near misses in EHR system ordering and documentation or patient-clinician communication, wondering about "misses we don't even know about." Risk factors included the computer's position, EHR system usability, note content and style, information overload, problematic workflows, systems issues, and provider and patient communication behaviors and expectations. Strategies to reduce multitasking EHR system misses included clinician transparency when needing silent EHR system use (eg, for prescribing), narrating EHR system use, patient activation during EHR system use, adapting visit organization and workflow, improving EHR system design, and improving team support and systems. CME participants shared numerous stories of errors and near misses in EHR tasks and communication that they felt related to EHR

  8. Missing Drivers with Dementia: Antecedents and Recovery

    PubMed Central

    Rowe, Meredeth A.; Greenblum, Catherine A.; Boltz, Marie; Galvin, James E.

    2013-01-01

    OBJECTIVES To determine the circumstances in which persons with dementia become lost while driving, how missing drivers are found, and how Silver Alert notifications are instrumental in those discoveries. DESIGN A retrospective, descriptive study. SETTING Retrospective record review. PARTICIPANTS Conducted using 156 records from the Florida Silver Alert program for the period October 2008 through May 2010. These alerts were issued in Florida for a missing driver with dementia. MEASUREMENTS Information derived from the reports on characteristics of the missing driver, antecedents to the missing event, and discovery of the missing driver. RESULTS and CONCLUSION The majority of missing drivers were males, with ages ranging from 58 to 94, who were being cared for by a spouse. Most drivers became lost on routine, caregiver-sanctioned trips to usual locations. Only 15% were in the act of driving when found, with most being found in or near a parked car, and the large majority were found by law enforcement officers. Only 40% were found in the county in which they went missing, and 10% were found in a different state. Silver Alert notifications were most effective for law enforcement; citizen alerts resulted in a few discoveries. There was a 5% mortality rate in the study population, with those living alone more likely to be found dead than alive. An additional 15% were found in dangerous situations such as stopped on railroad tracks. Thirty-two percent had documented driving or dangerous errors such as driving the wrong way or into secluded areas, or walking in or near roadways. PMID:23134069

  9. Misconceptions on Missing Data in RAD-seq Phylogenetics with a Deep-scale Example from Flowering Plants.

    PubMed

    Eaton, Deren A R; Spriggs, Elizabeth L; Park, Brian; Donoghue, Michael J

    2017-05-01

    Restriction-site associated DNA (RAD) sequencing and related methods rely on the conservation of enzyme recognition sites to isolate homologous DNA fragments for sequencing, with the consequence that mutations disrupting these sites lead to missing information. There is thus a clear expectation for how missing data should be distributed, with fewer loci recovered between more distantly related samples. This observation has led to a related expectation: that RAD-seq data are insufficiently informative for resolving deeper scale phylogenetic relationships. Here we investigate the relationship between missing information among samples at the tips of a tree and information at edges within it. We re-analyze and review the distribution of missing data across ten RAD-seq data sets and carry out simulations to determine expected patterns of missing information. We also present new empirical results for the angiosperm clade Viburnum (Adoxaceae, with a crown age >50 Ma) for which we examine phylogenetic information at different depths in the tree and with varied sequencing effort. The total number of loci, the proportion that are shared, and phylogenetic informativeness varied dramatically across the examined RAD-seq data sets. Insufficient or uneven sequencing coverage accounted for similar proportions of missing data as dropout from mutation-disruption. Simulations reveal that mutation-disruption, which results in phylogenetically distributed missing data, can be distinguished from the more stochastic patterns of missing data caused by low sequencing coverage. In Viburnum, doubling sequencing coverage nearly doubled the number of parsimony informative sites, and increased by >10X the number of loci with data shared across >40 taxa. Our analysis leads to a set of practical recommendations for maximizing phylogenetic information in RAD-seq studies. [hierarchical redundancy; phylogenetic informativeness; quartet informativeness; Restriction-site associated DNA (RAD
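The mutation-disruption expectation described above is essentially a Poisson survival model for restriction sites: a locus drops out once any flanking recognition site is hit on either lineage. A deliberately idealized sketch (rates, site lengths, and locus counts are all hypothetical):

```python
import math

def p_locus_shared(mu, t, site_len=6, n_sites=2):
    """Idealized Poisson model: a RAD locus is recovered in both of two samples
    only if no substitution hits any of its n_sites recognition sites
    (site_len bp each) on either lineage; mu is the rate per bp per unit time."""
    return math.exp(-2.0 * mu * t * site_len * n_sites)

def expected_shared_loci(n_loci, mu, t, **kw):
    """Expected number of loci shared between two samples of divergence t."""
    return n_loci * p_locus_shared(mu, t, **kw)
```

Under this model missing data decay smoothly with divergence time, which is the phylogenetically structured pattern the authors contrast with the more stochastic dropout caused by low sequencing coverage.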

  10. Efficient Storage Scheme of Covariance Matrix during Inverse Modeling

    NASA Astrophysics Data System (ADS)

    Mao, D.; Yeh, T. J.

    2013-12-01

    During stochastic inverse modeling, the covariance matrix of geostatistics-based methods carries the information about the geologic structure. Its update during iterations reflects the decrease of uncertainty as observed data are incorporated. For large-scale problems, its storage and update cost too much memory and computational resources. In this study, we propose a new efficient scheme for storage and update. Compressed Sparse Column (CSC) format is utilized to store the covariance matrix, and users can assign how much data they prefer to store based on correlation scales, since data beyond several correlation scales are usually not very informative for inverse modeling. After every iteration, only the diagonal terms of the covariance matrix are updated. The off-diagonal terms are calculated and updated based on shortened correlation scales with a pre-assigned exponential model. The correlation scales are shortened by a coefficient, e.g., 0.95, every iteration to reflect the decrease of uncertainty. There is no universal coefficient for all problems, and users are encouraged to try several values. This new scheme is tested with 1D examples first. The estimated results and uncertainty are compared with those of the traditional full-storage method. In the end, a large-scale numerical model is used to validate the new scheme.
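The storage idea can be sketched directly: an exponential covariance whose entries beyond a few correlation scales are dropped, laid out in CSC arrays. A simplified 1D numpy illustration (the 3-scale cutoff is a stand-in for the user-assigned threshold, and the CSC conversion is hand-rolled for clarity):

```python
import numpy as np

def tapered_cov(x, sigma2, scale, cutoff=3.0):
    """Exponential covariance sigma2 * exp(-d/scale) on 1D locations x;
    entries beyond `cutoff` correlation scales are dropped as uninformative."""
    d = np.abs(x[:, None] - x[None, :])
    C = sigma2 * np.exp(-d / scale)
    C[d > cutoff * scale] = 0.0
    return C

def to_csc(A):
    """Compressed Sparse Column arrays: values, row indices, column pointers."""
    data, rows, indptr = [], [], [0]
    for j in range(A.shape[1]):
        nz = np.nonzero(A[:, j])[0]
        data.extend(A[nz, j])
        rows.extend(nz)
        indptr.append(len(data))
    return np.array(data), np.array(rows), np.array(indptr)
```

Shrinking `scale` by the pre-assigned coefficient each iteration and rebuilding the off-diagonals from the model, while updating the diagonal separately, reproduces the update step the abstract describes.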

  11. Inventory Uncertainty Quantification using TENDL Covariance Data in Fispact-II

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eastwood, J.W.; Morgan, J.G.; Sublet, J.-Ch., E-mail: jean-christophe.sublet@ccfe.ac.uk

    2015-01-15

    The new inventory code Fispact-II provides predictions of inventory, radiological quantities and their uncertainties using nuclear data covariance information. Central to the method is a novel fast pathways search algorithm using directed graphs. The pathways output provides (1) an aid to identifying important reactions, (2) fast estimates of uncertainties, (3) reduced models that retain important nuclides and reactions for use in the code's Monte Carlo sensitivity analysis module. Described are the methods that are being implemented for improving uncertainty predictions, quantification and propagation using the covariance data that the recent nuclear data libraries contain. In the TENDL library, above the upper energy of the resolved resonance range, a Monte Carlo method in which the covariance data come from uncertainties of the nuclear model calculations is used. The nuclear data files are read directly by FISPACT-II without any further intermediate processing. Variance and covariance data are processed and used by FISPACT-II to compute uncertainties in collapsed cross sections, and these are in turn used to predict uncertainties in inventories and all derived radiological data.
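Propagating covariance data into the uncertainty of a derived quantity follows the first-order "sandwich" rule; a toy numpy sketch (the covariance values and sensitivities are illustrative, not TENDL data or Fispact-II internals):

```python
import numpy as np

def propagate_uncertainty(jacobian, cov):
    """First-order ("sandwich") propagation: Var(f) ~ J @ Sigma @ J^T for a
    quantity f(x) linearized about nominal parameters x with covariance Sigma."""
    J = np.atleast_2d(jacobian)
    return J @ cov @ J.T

# Toy example: a derived quantity depending linearly on two correlated
# cross sections (variances 0.04 and 0.09, positive cross-covariance 0.01).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
J = np.array([2.0, 1.0])          # sensitivities df/dx_i
var = propagate_uncertainty(J, cov)[0, 0]
```

The off-diagonal covariance terms matter: ignoring the 0.01 cross-term here would understate the variance by 0.04.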

  12. A Semiparametric Approach to Simultaneous Covariance Estimation for Bivariate Sparse Longitudinal Data

    PubMed Central

    Das, Kiranmoy; Daniels, Michael J.

    2014-01-01

    Summary Estimation of the covariance structure for irregular sparse longitudinal data has been studied by many authors in recent years but typically using fully parametric specifications. In addition, when data are collected from several groups over time, it is known that assuming the same or completely different covariance matrices over groups can lead to loss of efficiency and/or bias. Nonparametric approaches have been proposed for estimating the covariance matrix for regular univariate longitudinal data by sharing information across the groups under study. For the irregular case, with longitudinal measurements that are bivariate or multivariate, modeling becomes more difficult. In this article, to model bivariate sparse longitudinal data from several groups, we propose a flexible covariance structure via a novel matrix stick-breaking process for the residual covariance structure and a Dirichlet process mixture of normals for the random effects. Simulation studies are performed to investigate the effectiveness of the proposed approach over more traditional approaches. We also analyze a subset of Framingham Heart Study data to examine how the blood pressure trajectories and covariance structures differ for the patients from different BMI groups (high, medium and low) at baseline. PMID:24400941

  13. Identifying patterns of item missing survey data using latent groups: an observational study

    PubMed Central

    Barnett, Adrian G; McElwee, Paul; Nathan, Andrea; Burton, Nicola W; Turrell, Gavin

    2017-01-01

    Objectives To examine whether respondents to a survey of health and physical activity and potential determinants could be grouped according to the questions they missed, known as ‘item missing’. Design Observational study of longitudinal data. Setting Residents of Brisbane, Australia. Participants 6901 people aged 40–65 years in 2007. Materials and methods We used a latent class model with a mixture of multinomial distributions and chose the number of classes using the Bayesian information criterion. We used logistic regression to examine if participants’ characteristics were associated with their modal latent class. We used logistic regression to examine whether the amount of item missing in a survey predicted wave missing in the following survey. Results Four per cent of participants missed almost one-fifth of the questions, and this group missed more questions in the middle of the survey. Eighty-three per cent of participants completed almost every question, but had a relatively high missing probability for a question on sleep time, a question which had an inconsistent presentation compared with the rest of the survey. Participants who completed almost every question were generally younger and more educated. Participants who completed more questions were less likely to miss the next longitudinal wave. Conclusions Examining patterns in item missing data has improved our understanding of how missing data were generated and has informed future survey design to help reduce missing data. PMID:29084795
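The latent class model with BIC-based selection can be sketched in a few lines of numpy. For binary miss/complete indicators, the mixture of multinomials reduces to a mixture of independent Bernoullis; the EM implementation and the synthetic two-class structure below are ours, not the Brisbane data:

```python
import numpy as np

def _log_comp(X, pi, theta):
    # log pi_k + sum_j [x_j log theta_kj + (1 - x_j) log(1 - theta_kj)]
    return np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T

def fit_latent_classes(X, k, n_iter=100, seed=0):
    """EM for a latent-class model of item-missingness indicators X (n x p,
    1 = item missed): a mixture of k independent-Bernoulli components.
    Returns the maximized log-likelihood."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    pi = np.full(k, 1.0 / k)
    theta = rng.uniform(0.25, 0.75, size=(k, p))
    for _ in range(n_iter):
        L = _log_comp(X, pi, theta)
        m = L.max(axis=1, keepdims=True)
        r = np.exp(L - m)
        r /= r.sum(axis=1, keepdims=True)      # responsibilities (E-step)
        nk = r.sum(axis=0)                     # M-step
        pi = nk / n
        theta = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    L = _log_comp(X, pi, theta)
    m = L.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(L - m).sum(axis=1))))

def bic(ll, k, n, p):
    """BIC = (free parameters) * ln n - 2 * log-likelihood; smaller is better."""
    return ((k - 1) + k * p) * np.log(n) - 2.0 * ll
```

Fitting over a range of k and keeping the smallest BIC mirrors the class-selection step the study describes.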

  14. Identifying patterns of item missing survey data using latent groups: an observational study.

    PubMed

    Barnett, Adrian G; McElwee, Paul; Nathan, Andrea; Burton, Nicola W; Turrell, Gavin

    2017-10-30

    To examine whether respondents to a survey of health and physical activity and potential determinants could be grouped according to the questions they missed, known as 'item missing'. Observational study of longitudinal data. Residents of Brisbane, Australia. 6901 people aged 40-65 years in 2007. We used a latent class model with a mixture of multinomial distributions and chose the number of classes using the Bayesian information criterion. We used logistic regression to examine if participants' characteristics were associated with their modal latent class. We used logistic regression to examine whether the amount of item missing in a survey predicted wave missing in the following survey. Four per cent of participants missed almost one-fifth of the questions, and this group missed more questions in the middle of the survey. Eighty-three per cent of participants completed almost every question, but had a relatively high missing probability for a question on sleep time, a question which had an inconsistent presentation compared with the rest of the survey. Participants who completed almost every question were generally younger and more educated. Participants who completed more questions were less likely to miss the next longitudinal wave. Examining patterns in item missing data has improved our understanding of how missing data were generated and has informed future survey design to help reduce missing data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  15. A fully covariant information-theoretic ultraviolet cutoff for scalar fields in expanding Friedmann Robertson Walker spacetimes

    NASA Astrophysics Data System (ADS)

    Kempf, A.; Chatwin-Davies, A.; Martin, R. T. W.

    2013-02-01

    While a natural ultraviolet cutoff, presumably at the Planck length, is widely assumed to exist in nature, it is nontrivial to implement a minimum length scale covariantly. This is because the presence of a fixed minimum length needs to be reconciled with the ability of Lorentz transformations to contract lengths. In this paper, we implement a fully covariant Planck scale cutoff by cutting off the spectrum of the d'Alembertian. In this scenario, consistent with Lorentz contractions, wavelengths that are arbitrarily smaller than the Planck length continue to exist. However, the dynamics of modes of wavelengths that are significantly smaller than the Planck length possess a very small bandwidth. This has the effect of freezing the dynamics of such modes. While both wavelengths and bandwidths are frame dependent, Lorentz contraction and time dilation conspire to make the freezing of modes of trans-Planckian wavelengths covariant. In particular, we show that this ultraviolet cutoff can be implemented covariantly also in curved spacetimes. We focus on Friedmann Robertson Walker spacetimes and their much-discussed trans-Planckian question: The physical wavelength of each comoving mode was smaller than the Planck scale at sufficiently early times. What was the mode's dynamics then? Here, we show that in the presence of the covariant UV cutoff, the dynamical bandwidth of a comoving mode is essentially zero up until its physical wavelength starts exceeding the Planck length. In particular, we show that under general assumptions, the number of dynamical degrees of freedom of each comoving mode all the way up to some arbitrary finite time is actually finite. Our results also open the way to calculating the impact of this natural UV cutoff on inflationary predictions for the cosmic microwave background.

  16. Missing doses in the life span study of Japanese atomic bomb survivors.

    PubMed

    Richardson, David B; Wing, Steve; Cole, Stephen R

    2013-03-15

    The Life Span Study of atomic bomb survivors is an important source of risk estimates used to inform radiation protection and compensation. Interviews with survivors in the 1950s and 1960s provided information needed to estimate radiation doses for survivors proximal to ground zero. Because of a lack of interview or the complexity of shielding, doses are missing for 7,058 of the 68,119 proximal survivors. Recent analyses excluded people with missing doses, and despite the protracted collection of interview information necessary to estimate some survivors' doses, defined start of follow-up as October 1, 1950, for everyone. We describe the prevalence of missing doses and its association with mortality, distance from hypocenter, city, age, and sex. Missing doses were more common among Nagasaki residents than among Hiroshima residents (prevalence ratio = 2.05; 95% confidence interval: 1.96, 2.14), among people who were closer to ground zero than among those who were far from it, among people who were younger at enrollment than among those who were older, and among males than among females (prevalence ratio = 1.22; 95% confidence interval: 1.17, 1.28). Missing dose was associated with all-cancer and leukemia mortality, particularly during the first years of follow-up (all-cancer rate ratio = 2.16, 95% confidence interval: 1.51, 3.08; and leukemia rate ratio = 4.28, 95% confidence interval: 1.72, 10.67). Accounting for missing dose and late entry should reduce bias in estimated dose-mortality associations.
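The prevalence ratios quoted above are of the standard ratio-with-log-scale-Wald-interval form; a small sketch (the counts are made up, not Life Span Study values):

```python
import math

def prevalence_ratio(a, n1, b, n2):
    """Prevalence ratio p1/p2 with a 95% Wald interval on the log scale;
    a of n1 and b of n2 have the attribute in the two groups."""
    p1, p2 = a / n1, b / n2
    pr = p1 / p2
    se = math.sqrt((1 - p1) / a + (1 - p2) / b)   # SE of log(pr)
    return pr, pr * math.exp(-1.96 * se), pr * math.exp(1.96 * se)
```

With, say, 20/100 versus 10/100 missing, the point estimate is 2.0 and the interval is asymmetric around it, as expected on the ratio scale.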

  17. To Covary or Not to Covary, That is the Question

    NASA Astrophysics Data System (ADS)

    Oehlert, A. M.; Swart, P. K.

    2016-12-01

    The meaning of covariation between the δ13C values of carbonate carbon and those of organic material is classically interpreted as reflecting original variations in the δ13C values of the dissolved inorganic carbon in the depositional environment. However, it has recently been shown, through examination of a core from Great Bahama Bank (Clino), that during exposure not only do the rocks become altered, acquiring negative δ13C values, but at the same time terrestrial vegetation adds organic carbon to the system, masking the original marine values. These processes yield a strong positive covariation between δ13Corg and δ13Ccar values even though the signals are clearly not original and are unrelated to the marine δ13C values. Examining the correlation between the organic and inorganic systems in a stratigraphic sense at Clino, and in a second, more proximally located core (Unda), using a windowed correlation coefficient technique reveals that the correlation is even more complex. Changes in the slope and magnitude of the correlation are associated with exposure surfaces, facies changes, dolomitized bodies, and non-depositional surfaces. Finally, other isotopic systems, such as the δ13C values of specific organic compounds as well as δ15N values of bulk material and individual compounds, can provide additional information. In the case of δ15N values, decreases reflect changes in the influence of terrestrial organic material and an increased contribution of organic material from the platform surface, where the main source of nitrogen is derived from the activities of cyanobacteria.
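A windowed correlation coefficient of the kind used here is just a Pearson correlation swept along the stratigraphic axis; a minimal numpy sketch (window length and the toy series are arbitrary):

```python
import numpy as np

def windowed_corr(x, y, window):
    """Pearson correlation of x and y in a sliding window; one coefficient
    per window position (e.g. per stratigraphic depth interval)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    out = []
    for i in range(len(x) - window + 1):
        out.append(np.corrcoef(x[i:i + window], y[i:i + window])[0, 1])
    return np.array(out)
```

Sign flips or magnitude changes in the resulting profile are what the authors tie to exposure surfaces, facies changes, and dolomitized intervals.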

  18. Reducing Misses and Near Misses Related to Multitasking on the Electronic Health Record: Observational Study and Qualitative Analysis

    PubMed Central

    Ratanawongsa, Neda; Matta, George Y; Bohsali, Fuad B; Chisolm, Margaret S

    2018-01-01

    Background Clinicians’ use of electronic health record (EHR) systems while multitasking may increase the risk of making errors, but silent EHR system use may lower patient satisfaction. Delaying EHR system use until after patient visits may increase clinicians’ EHR workload, stress, and burnout. Objective We aimed to describe the perspectives of clinicians, educators, administrators, and researchers about misses and near misses that they felt were related to clinician multitasking while using EHR systems. Methods This observational study was a thematic analysis of perspectives elicited from 63 continuing medical education (CME) participants during 2 workshops and 1 interactive lecture about challenges and strategies for relationship-centered communication during clinician EHR system use. The workshop elicited reflection about memorable times when multitasking EHR use was associated with “misses” (errors that were not caught at the time) or “near misses” (mistakes that were caught before leading to errors). We conducted qualitative analysis using an editing analysis style to identify codes and then select representative themes and quotes. Results All workshop participants shared stories of misses or near misses in EHR system ordering and documentation or patient-clinician communication, wondering about “misses we don’t even know about.” Risk factors included the computer’s position, EHR system usability, note content and style, information overload, problematic workflows, systems issues, and provider and patient communication behaviors and expectations. Strategies to reduce multitasking EHR system misses included clinician transparency when needing silent EHR system use (eg, for prescribing), narrating EHR system use, patient activation during EHR system use, adapting visit organization and workflow, improving EHR system design, and improving team support and systems. Conclusions CME participants shared numerous stories of errors and near misses

  19. Exploitation of Geometric Occlusion and Covariance Spectroscopy in a Gamma Sensor Array

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mukhopadhyay, Sanjoy; Maurer, Richard; Wolff, Ronald

    2013-09-01

    The National Security Technologies, LLC, Remote Sensing Laboratory has recently used an array of six small-footprint (1-inch diameter by 3-inch long) cylindrical crystals of thallium-doped sodium iodide scintillators to obtain angular information from discrete gamma ray–emitting point sources. Obtaining angular information in a near-field measurement for a field-deployed gamma sensor is a requirement for radiological emergency work. Three of the sensors sit at the vertices of a 2-inch isosceles triangle, while the other three sit on the circumference of a 3-inch-radius circle centered in this triangle. This configuration exploits occlusion of sensors, correlation from Compton scattering within a detector array, and covariance spectroscopy, a spectral coincidence technique. Careful placement and orientation of individual detectors with reference to other detectors in an array can provide improved angular resolution for determining the source position by the occlusion mechanism. By evaluating the values of, and the uncertainties in, the photopeak areas, efficiencies, branching ratios, peak area correction factors, and the correlations between these quantities, one can determine the precise activity of a particular radioisotope from a mixture of radioisotopes that have overlapping photopeaks that are ordinarily hard to deconvolve. The spectral coincidence technique, often known as covariance spectroscopy, examines the correlations and fluctuations in data that contain valuable information about radiation sources, transport media, and detection systems. Covariance spectroscopy enhances radionuclide identification techniques, provides directional information, and makes weaker gamma-ray emission—normally undetectable by common spectroscopic analysis—detectable. A series of experimental results using the concept of covariance spectroscopy are presented.

  20. Covariance Partition Priors: A Bayesian Approach to Simultaneous Covariance Estimation for Longitudinal Data.

    PubMed

    Gaskins, J T; Daniels, M J

    2016-01-02

    The estimation of the covariance matrix is a key concern in the analysis of longitudinal data. When data consist of multiple groups, it is often assumed that the covariance matrices are either equal across groups or completely distinct. We seek methodology to allow borrowing of strength across potentially similar groups to improve estimation. To that end, we introduce a covariance partition prior which proposes a partition of the groups at each measurement time. Groups in the same set of the partition share dependence parameters for the distribution of the current measurement given the preceding ones, and the sequence of partitions is modeled as a Markov chain to encourage similar structure at nearby measurement times. This approach additionally encourages a lower-dimensional structure of the covariance matrices by shrinking the parameters of the Cholesky decomposition toward zero. We demonstrate the performance of our model through two simulation studies and the analysis of data from a depression study. This article includes Supplementary Material available online.
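The (modified) Cholesky parameterization whose coefficients such priors share and shrink can be computed directly from a covariance matrix; a generic numpy sketch, not the authors' Bayesian model:

```python
import numpy as np

def modified_cholesky(Sigma):
    """Modified Cholesky decomposition T @ Sigma @ T.T = D: row t of the unit
    lower-triangular T holds minus the coefficients from regressing
    measurement t on its predecessors; D holds the innovation variances."""
    p = Sigma.shape[0]
    T = np.eye(p)
    d = np.zeros(p)
    d[0] = Sigma[0, 0]
    for t in range(1, p):
        phi = np.linalg.solve(Sigma[:t, :t], Sigma[:t, t])
        T[t, :t] = -phi
        d[t] = Sigma[t, t] - Sigma[:t, t] @ phi
    return T, np.diag(d)
```

The sub-diagonal entries of T are exactly the dependence parameters groups can share at each measurement time, and shrinking them toward zero yields the lower-dimensional structure the abstract mentions.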

  1. Bayesian correction for covariate measurement error: A frequentist evaluation and comparison with regression calibration.

    PubMed

    Bartlett, Jonathan W; Keogh, Ruth H

    2018-06-01

Bayesian approaches for handling covariate measurement error are well established, yet arguably still relatively little used by researchers. For some, this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others, a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
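Regression calibration, the comparator in this record, replaces the error-prone measurement W with E[X | W] before fitting. A simulation sketch with classical measurement error and a known error variance (an assumption made here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta = 1.0

x = rng.normal(0.0, 1.0, n)             # true covariate (unobserved)
w = x + rng.normal(0.0, 1.0, n)         # error-prone measurement, classical error
y = beta * x + rng.normal(0.0, 1.0, n)  # outcome

# Naive slope from regressing y on w is attenuated by the reliability
# ratio lambda = var(x) / var(w).
naive = np.cov(w, y)[0, 1] / np.var(w)

# Regression calibration: replace w by E[x | w] = lam * w (zero means here,
# error variance assumed known for the sketch), then regress y on the proxy.
lam = 1.0 / 2.0
x_hat = lam * w
calibrated = np.cov(x_hat, y)[0, 1] / np.var(x_hat)
```

The naive slope is attenuated toward 0.5, while the calibrated slope recovers the true coefficient of 1.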

  2. Depression and literacy are important factors for missed appointments.

    PubMed

    Miller-Matero, Lisa Renee; Clark, Kalin Burkhardt; Brescacin, Carly; Dubaybo, Hala; Willens, David E

    2016-09-01

Multiple variables are related to missed clinic appointments. However, the prevalence of missed appointments is still high, suggesting that other factors may play a role. The purpose of this study was to investigate the relationship between missed appointments and multiple variables simultaneously across a health care system, including patient demographics, psychiatric symptoms, cognitive functioning and literacy status. Chart reviews were conducted on 147 consecutive patients who were seen by a primary care psychologist over a six-month period and completed measures to determine levels of depression, anxiety, sleep, cognitive functioning and health literacy. Demographic information and rates of missed appointments were also collected from charts. The average rate of missed appointments was 15.38%. In univariate analyses, factors related to higher rates of missed appointments included younger age (p = .03), lower income (p = .05), probable depression (p = .05), sleep difficulty (p = .05) and limited reading ability (p = .003). There were trends toward higher rates of missed appointments for patients identifying as black (p = .06), having government insurance (p = .06) and having limited math ability (p = .06). In a multivariate model, probable depression (p = .02) and limited reading ability (p = .003) were the only independent predictors. Depression and literacy status may be the most important factors associated with missed appointments. Implications are discussed, including regular screening for depression and literacy status as well as interventions that can be utilized to help reduce the rate of missed appointments.

  3. What's Missing? Anti-Racist Sex Education!

    ERIC Educational Resources Information Center

    Whitten, Amanda; Sethna, Christabelle

    2014-01-01

    Contemporary sexual health curricula in Canada include information about sexual diversity and queer identities, but what remains missing is any explicit discussion of anti-racist sex education. Although there exists federal and provincial support for multiculturalism and anti-racism in schools, contemporary Canadian sex education omits crucial…

  4. Testing of NASA LaRC Materials under MISSE 6 and MISSE 7 Missions

    NASA Technical Reports Server (NTRS)

    Prasad, Narasimha S.

    2009-01-01

The objective of the Materials International Space Station Experiment (MISSE) is to study the performance of novel materials when subjected to the synergistic effects of the harsh space environment for several months. MISSE missions provide an opportunity for developing space-qualifiable materials. Two lasers and a few optical components from NASA Langley Research Center (LaRC) were included in the MISSE 6 mission for long-term exposure. MISSE 6 items were characterized and packed inside a ruggedized Passive Experiment Container (PEC) that resembles a suitcase. The PEC was tested for survivability under launch conditions. MISSE 6 was transported to the International Space Station (ISS) via STS-123 on March 11, 2008. The astronauts successfully attached the PEC to external handrails of the ISS and opened the PEC for long-term exposure to the space environment. The current plan is to bring the MISSE 6 PEC back to Earth via the STS-128 mission scheduled for launch in August 2009. Currently, preparations for launching the MISSE 7 mission are progressing. Laser and lidar components assembled on a flight-worthy platform are included from NASA LaRC. MISSE 7 is scheduled to launch on the STS-129 mission. This paper will briefly review recent efforts on the MISSE 6 and MISSE 7 missions at NASA Langley Research Center (LaRC).

  5. Recurrent Neural Networks for Multivariate Time Series with Missing Values.

    PubMed

    Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan

    2018-04-17

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a. informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop a novel deep learning model, namely GRU-D, as one of the early attempts. GRU-D is based on the Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series but also utilizes the missing patterns to achieve better prediction results. Experiments on time series classification tasks with real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
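The two inputs GRU-D consumes can be constructed in a few lines: a mask marking observed entries, and a per-variable time interval giving the time since each variable was last observed. A minimal sketch (the model's learned decay mechanisms are omitted):

```python
import numpy as np

def masking_and_delta(x, times):
    """Build GRU-D style inputs from a (timesteps, variables) array with NaNs."""
    m = ~np.isnan(x)                       # mask: 1 where observed
    delta = np.zeros_like(x, dtype=float)  # time since last observation
    for t in range(1, x.shape[0]):
        gap = times[t] - times[t - 1]
        # If the variable was observed at t-1, the interval resets to the
        # elapsed gap; otherwise the previous interval keeps accumulating.
        delta[t] = np.where(m[t - 1], gap, gap + delta[t - 1])
    return m.astype(float), delta

# Two variables sampled at irregular times, with missing entries as NaN.
x = np.array([[1.0, np.nan],
              [np.nan, 2.0],
              [3.0, np.nan],
              [np.nan, np.nan]])
times = np.array([0.0, 1.0, 2.5, 4.0])

m, delta = masking_and_delta(x, times)
```

For variable 0, unobserved at time 1.0, the interval at time 2.5 accumulates to 2.5; for variable 1, unobserved after time 1.0, it grows to 3.0 by time 4.0.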

  6. Covariant diagrams for one-loop matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Zhengkang

Here, we present a diagrammatic formulation of recently revived covariant functional approaches to one-loop matching from an ultraviolet (UV) theory to a low-energy effective field theory. Various terms following from a covariant derivative expansion (CDE) are represented by diagrams which, unlike conventional Feynman diagrams, involve gauge-covariant quantities and are thus dubbed "covariant diagrams." The use of covariant diagrams helps organize and simplify one-loop matching calculations, which we illustrate with examples. Of particular interest is the derivation of UV model-independent universal results, which reduce matching calculations of specific UV models to applications of master formulas. We also show how such derivation can be done in a more concise manner than in the previous literature, and discuss how additional structures that are not directly captured by existing universal results, including mixed heavy-light loops, open covariant derivatives, and mixed statistics, can be easily accounted for.

  7. Covariant diagrams for one-loop matching

    DOE PAGES

    Zhang, Zhengkang

    2017-05-30

Here, we present a diagrammatic formulation of recently revived covariant functional approaches to one-loop matching from an ultraviolet (UV) theory to a low-energy effective field theory. Various terms following from a covariant derivative expansion (CDE) are represented by diagrams which, unlike conventional Feynman diagrams, involve gauge-covariant quantities and are thus dubbed "covariant diagrams." The use of covariant diagrams helps organize and simplify one-loop matching calculations, which we illustrate with examples. Of particular interest is the derivation of UV model-independent universal results, which reduce matching calculations of specific UV models to applications of master formulas. We also show how such derivation can be done in a more concise manner than in the previous literature, and discuss how additional structures that are not directly captured by existing universal results, including mixed heavy-light loops, open covariant derivatives, and mixed statistics, can be easily accounted for.

  8. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study.

    PubMed

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

The univariate meta-analysis (UM) procedure, a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least squares (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of the four new approaches, namely zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC), with respect to estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazards model coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches under all of the above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  9. 78 FR 37598 - Missing Participants in Individual Account Plans

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-21

    ... PENSION BENEFIT GUARANTY CORPORATION Missing Participants in Individual Account Plans AGENCY: Pension Benefit Guaranty Corporation. ACTION: Request for information. SUMMARY: PBGC is soliciting information from the public to assist it in making decisions about implementing a new program to deal with...

  10. Improving data sharing in research with context-free encoded missing data.

    PubMed

    Hoevenaar-Blom, Marieke P; Guillemont, Juliette; Ngandu, Tiia; Beishuizen, Cathrien R L; Coley, Nicola; Moll van Charante, Eric P; Andrieu, Sandrine; Kivipelto, Miia; Soininen, Hilkka; Brayne, Carol; Meiller, Yannick; Richard, Edo

    2017-01-01

Lack of attention to missing data in research may result in biased results, loss of power and reduced generalizability. Registering reasons for missing values at the time of data collection, or, in the case of sharing existing data, before making data available to other teams, can save time and effort, improve scientific value and help to prevent erroneous assumptions and biased results. To ensure that encoding of missing data is sufficient to understand the reason why data are missing, it should ideally be context-free. Therefore, 11 context-free codes of missing data were carefully designed based on three completed randomized controlled clinical trials and tested in a new randomized controlled clinical trial by an international team consisting of clinical researchers and epidemiologists with extended experience in designing and conducting trials and an Information System expert. These codes can be divided into missing due to participant and/or participation characteristics (n = 6), missing by design (n = 4), and missing due to a procedural error (n = 1). Broad implementation of context-free missing data encoding may enhance the possibilities of data sharing and pooling, thus allowing more powerful analyses using existing data.

  11. Construction of Covariance Functions with Variable Length Fields

    NASA Technical Reports Server (NTRS)

    Gaspari, Gregory; Cohn, Stephen E.; Guo, Jing; Pawson, Steven

    2005-01-01

This article focuses on construction, directly in physical space, of three-dimensional covariance functions parametrized by a tunable length field, and on an application of this theory to reproduce the Quasi-Biennial Oscillation (QBO) in the Goddard Earth Observing System, Version 4 (GEOS-4) data assimilation system. These covariance models are referred to as multi-level or nonseparable, to associate them with the application where a multi-level covariance with a large troposphere-to-stratosphere length field gradient is used to reproduce the QBO from sparse radiosonde observations in the tropical lower stratosphere. The multi-level covariance functions extend well-known single-level covariance functions depending only on a length scale. Generalizations of the first- and third-order autoregressive covariances in three dimensions are given, providing multi-level covariances with zero and three derivatives at zero separation, respectively. Multi-level piecewise rational covariances with two continuous derivatives at zero separation are also provided. Multi-level powerlaw covariances are constructed with continuous derivatives of all orders. Additional multi-level covariance functions are constructed using the Schur product of single and multi-level covariance functions. A multi-level powerlaw covariance used to reproduce the QBO in GEOS-4 is described along with details of the assimilation experiments. The new covariance model is shown to represent the vertical wind shear associated with the QBO much more effectively than in the baseline GEOS-4 system.
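One classical way to build a valid covariance with a tunable length field is the Gibbs nonstationary kernel; the article's multi-level families differ in form, but share the ingredient of a length scale L(z) varying with height. A sketch of such a construction and a positive-semidefiniteness check:

```python
import numpy as np

# Gibbs nonstationary squared-exponential kernel with a variable length
# field (a classical construction, used here only to illustrate the idea
# of a covariance parametrized by a tunable length field L(z)):
#   C_ij = sqrt(2 L_i L_j / (L_i^2 + L_j^2)) * exp(-(z_i - z_j)^2 / (L_i^2 + L_j^2))
z = np.linspace(0.0, 10.0, 50)       # vertical levels
L = 0.5 + 0.25 * z                   # length field grows with height

Li, Lj = np.meshgrid(L, L, indexing="ij")
d2 = (z[:, None] - z[None, :]) ** 2
denom = Li**2 + Lj**2
C = np.sqrt(2.0 * Li * Lj / denom) * np.exp(-d2 / denom)

# Validity check: a covariance must be positive semidefinite.
eigvals = np.linalg.eigvalsh(C)
```

Because the length field grows with height, correlations between adjacent upper levels are stronger than between adjacent lower levels, mimicking the troposphere-to-stratosphere length gradient described above.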

  12. Missing Doses in the Life Span Study of Japanese Atomic Bomb Survivors

    PubMed Central

    Richardson, David B.; Wing, Steve; Cole, Stephen R.

    2013-01-01

    The Life Span Study of atomic bomb survivors is an important source of risk estimates used to inform radiation protection and compensation. Interviews with survivors in the 1950s and 1960s provided information needed to estimate radiation doses for survivors proximal to ground zero. Because of a lack of interview or the complexity of shielding, doses are missing for 7,058 of the 68,119 proximal survivors. Recent analyses excluded people with missing doses, and despite the protracted collection of interview information necessary to estimate some survivors' doses, defined start of follow-up as October 1, 1950, for everyone. We describe the prevalence of missing doses and its association with mortality, distance from hypocenter, city, age, and sex. Missing doses were more common among Nagasaki residents than among Hiroshima residents (prevalence ratio = 2.05; 95% confidence interval: 1.96, 2.14), among people who were closer to ground zero than among those who were far from it, among people who were younger at enrollment than among those who were older, and among males than among females (prevalence ratio = 1.22; 95% confidence interval: 1.17, 1.28). Missing dose was associated with all-cancer and leukemia mortality, particularly during the first years of follow-up (all-cancer rate ratio = 2.16, 95% confidence interval: 1.51, 3.08; and leukemia rate ratio = 4.28, 95% confidence interval: 1.72, 10.67). Accounting for missing dose and late entry should reduce bias in estimated dose-mortality associations. PMID:23429722

  13. Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading

    PubMed Central

    Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf

    2008-01-01

    Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in

  14. Schur Complement Inequalities for Covariance Matrices and Monogamy of Quantum Correlations

    NASA Astrophysics Data System (ADS)

    Lami, Ludovico; Hirche, Christoph; Adesso, Gerardo; Winter, Andreas

    2016-11-01

    We derive fundamental constraints for the Schur complement of positive matrices, which provide an operator strengthening to recently established information inequalities for quantum covariance matrices, including strong subadditivity. This allows us to prove general results on the monogamy of entanglement and steering quantifiers in continuous variable systems with an arbitrary number of modes per party. A powerful hierarchical relation for correlation measures based on the log-determinant of covariance matrices is further established for all Gaussian states, which has no counterpart among quantities based on the conventional von Neumann entropy.

  15. Schur Complement Inequalities for Covariance Matrices and Monogamy of Quantum Correlations.

    PubMed

    Lami, Ludovico; Hirche, Christoph; Adesso, Gerardo; Winter, Andreas

    2016-11-25

    We derive fundamental constraints for the Schur complement of positive matrices, which provide an operator strengthening to recently established information inequalities for quantum covariance matrices, including strong subadditivity. This allows us to prove general results on the monogamy of entanglement and steering quantifiers in continuous variable systems with an arbitrary number of modes per party. A powerful hierarchical relation for correlation measures based on the log-determinant of covariance matrices is further established for all Gaussian states, which has no counterpart among quantities based on the conventional von Neumann entropy.
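The object at the center of these inequalities can be checked numerically: for a positive definite matrix partitioned between two parties, the Schur complement is again positive definite, and its log-determinant enters through the classical factorization det M = det C · det(M/C). A sketch (the paper's operator inequalities are strictly stronger statements):

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.normal(size=(6, 6))
M = G @ G.T + 1e-6 * np.eye(6)       # random positive definite "covariance"

# Partition M = [[A, B], [B.T, C]] between two 3-mode parties.
A, B, C = M[:3, :3], M[:3, 3:], M[3:, 3:]

# Schur complement M/C = A - B C^{-1} B^T: positive definite whenever M is.
schur = A - B @ np.linalg.solve(C, B.T)
eigs = np.linalg.eigvalsh(schur)

# Determinant identity behind log-det correlation measures:
#   log det M = log det C + log det(M/C)
lhs = np.linalg.slogdet(M)[1]
rhs = np.linalg.slogdet(C)[1] + np.linalg.slogdet(schur)[1]
```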

  16. Treating Sample Covariances for Use in Strongly Coupled Atmosphere-Ocean Data Assimilation

    NASA Astrophysics Data System (ADS)

    Smith, Polly J.; Lawless, Amos S.; Nichols, Nancy K.

    2018-01-01

    Strongly coupled data assimilation requires cross-domain forecast error covariances; information from ensembles can be used, but limited sampling means that ensemble derived error covariances are routinely rank deficient and/or ill-conditioned and marred by noise. Thus, they require modification before they can be incorporated into a standard assimilation framework. Here we compare methods for improving the rank and conditioning of multivariate sample error covariance matrices for coupled atmosphere-ocean data assimilation. The first method, reconditioning, alters the matrix eigenvalues directly; this preserves the correlation structures but does not remove sampling noise. We show that it is better to recondition the correlation matrix rather than the covariance matrix as this prevents small but dynamically important modes from being lost. The second method, model state-space localization via the Schur product, effectively removes sample noise but can dampen small cross-correlation signals. A combination that exploits the merits of each is found to offer an effective alternative.
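Reconditioning by flooring eigenvalues can be sketched directly; following the comparison above, it is applied here to the correlation matrix and the result is mapped back to covariance scale (a hypothetical target condition number of 100 is assumed):

```python
import numpy as np

# Floor the eigenvalues of a correlation matrix so its condition number
# does not exceed kappa_max (hypothetical target chosen for illustration).
def recondition(corr, kappa_max=100.0):
    w, V = np.linalg.eigh(corr)
    floor = w.max() / kappa_max
    return V @ np.diag(np.maximum(w, floor)) @ V.T

rng = np.random.default_rng(3)
# Rank-deficient sample covariance: 5 ensemble members, 10 variables with
# very unequal variances (atmosphere-like vs ocean-like scales).
X = rng.normal(size=(5, 10)) * np.array([1.0] * 5 + [1e-3] * 5)
S = np.cov(X, rowvar=False)

# Recondition the correlation matrix rather than the covariance, so the
# small-variance (dynamically important) modes are not swamped, then map
# back to covariance scale with the original standard deviations.
std = np.sqrt(np.diag(S))
corr = S / np.outer(std, std)
corr_fixed = recondition(corr)
S_fixed = corr_fixed * np.outer(std, std)

cond = np.linalg.cond(corr_fixed)
```

Reconditioning the covariance directly would instead floor eigenvalues on the scale of the largest variances, wiping out the ocean-like modes entirely.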

  17. Evaluation of digital soil mapping approaches with large sets of environmental covariates

    NASA Astrophysics Data System (ADS)

    Nussbaum, Madlene; Spiess, Kay; Baltensweiler, Andri; Grob, Urs; Keller, Armin; Greiner, Lucie; Schaepman, Michael E.; Papritz, Andreas

    2018-01-01

The spatial assessment of soil functions requires maps of basic soil properties. Unfortunately, these are either missing for many regions or are not available at the desired spatial resolution or down to the required soil depth. The field-based generation of large soil datasets and conventional soil maps remains costly. Meanwhile, legacy soil data and comprehensive sets of spatial environmental data are available for many regions. Digital soil mapping (DSM) approaches relating soil data (responses) to environmental data (covariates) face the challenge of building statistical models from large sets of covariates originating, for example, from airborne imaging spectroscopy or multi-scale terrain analysis. We evaluated six approaches for DSM in three study regions in Switzerland (Berne, Greifensee, ZH forest) by mapping the effective soil depth available to plants (SD), pH, soil organic matter (SOM), effective cation exchange capacity (ECEC), clay, silt, gravel content and fine fraction bulk density for four soil depths (totalling 48 responses). Models were built from 300-500 environmental covariates by selecting linear models through (1) grouped lasso and (2) an ad hoc stepwise procedure for robust external-drift kriging (georob). For (3) geoadditive models we selected penalized smoothing spline terms by component-wise gradient boosting (geoGAM). We further used two tree-based methods: (4) boosted regression trees (BRTs) and (5) random forest (RF). Lastly, we computed (6) weighted model averages (MAs) from the predictions obtained from methods 1-5. Lasso, georob and geoGAM successfully selected strongly reduced sets of covariates (subsets of 3-6% of all covariates). Differences in predictive performance, tested on independent validation data, were mostly small and did not reveal a single best method for the 48 responses. Nevertheless, RF was often the best among methods 1-5 (28 of 48 responses), but was outcompeted by MA for 14 of these 28 responses. RF tended to over
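The weighted model averaging used as method 6 can be sketched generically: combine the methods' validation-set predictions with weights derived from their validation errors (inverse-MSE weights are assumed here; the study's exact weighting may differ, and the method names below are labels only):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
truth = rng.normal(size=n)

# Synthetic stand-ins for validation predictions from three fitted methods,
# with independent errors of different magnitudes.
preds = {
    "lasso": truth + rng.normal(0.0, 0.8, n),
    "georob": truth + rng.normal(0.0, 0.6, n),
    "rf": truth + rng.normal(0.0, 0.4, n),
}

# Inverse-MSE weights, normalized to sum to one.
mse = {k: np.mean((v - truth) ** 2) for k, v in preds.items()}
w = np.array([1.0 / mse[k] for k in preds])
w /= w.sum()

# Weighted model average and its validation error.
ma = sum(wi * preds[k] for wi, k in zip(w, preds))
mse_ma = np.mean((ma - truth) ** 2)
```

With independent errors, the average beats even the best single method, which is consistent with MA outcompeting RF for a share of the responses.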

  18. Forrester opens a MISSE PEC installed on the ISS Airlock

    NASA Image and Video Library

    2001-08-16

    STS105-346-007 (18 August 2001) --- Astronaut Patrick G. Forrester, during the second STS-105 extravehicular activity, prepares to work with the Materials International Space Station Experiment (MISSE). The experiment was installed on the outside of the Quest Airlock during the first extravehicular activity (EVA) of the STS-105 mission. MISSE will collect information on how different materials weather in the environment of space.

  19. Generalized Linear Covariance Analysis

    NASA Technical Reports Server (NTRS)

    Carpenter, James R.; Markley, F. Landis

    2014-01-01

This talk presents a comprehensive approach to filter modeling for generalized covariance analysis of both batch least-squares and sequential estimators. We review and extend in two directions the results of prior work that allowed for partitioning of the state space into "solve-for" and "consider" parameters, accounted for differences between the formal values and the true values of the measurement noise, process noise, and a priori solve-for and consider covariances, and explicitly partitioned the errors into subspaces containing only the influence of the measurement noise, process noise, and solve-for and consider covariances. In this work, we explicitly add sensitivity analysis to this prior work, and relax an implicit assumption that the batch estimator's epoch time occurs prior to the definitive span. We also apply the method to an integrated orbit and attitude problem, in which gyro and accelerometer errors, though not estimated, influence the orbit determination performance. We illustrate our results using two graphical presentations, which we call the "variance sandpile" and the "sensitivity mosaic," and we compare the linear covariance results to confidence intervals associated with ensemble statistics from a Monte Carlo analysis.

  20. An approximate generalized linear model with random effects for informative missing data.

    PubMed

    Follmann, D; Wu, M

    1995-03-01

    This paper develops a class of models to deal with missing data from longitudinal studies. We assume that separate models for the primary response and missingness (e.g., number of missed visits) are linked by a common random parameter. Such models have been developed in the econometrics (Heckman, 1979, Econometrica 47, 153-161) and biostatistics (Wu and Carroll, 1988, Biometrics 44, 175-188) literature for a Gaussian primary response. We allow the primary response, conditional on the random parameter, to follow a generalized linear model and approximate the generalized linear model by conditioning on the data that describes missingness. The resultant approximation is a mixed generalized linear model with possibly heterogeneous random effects. An example is given to illustrate the approximate approach, and simulations are performed to critique the adequacy of the approximation for repeated binary data.
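The shared random-parameter idea can be illustrated by simulation: a single latent effect drives both the primary response and the number of missed visits, which is precisely what makes the missingness informative. A sketch with hypothetical coefficients:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
b = rng.normal(0.0, 1.0, n)                 # shared random effect b_i

# Primary response: logistic in the shared effect (hypothetical coefficients).
p = 1.0 / (1.0 + np.exp(-(-0.5 + b)))
y = rng.binomial(1, p)

# Missingness: number of missed visits, Poisson with a rate driven by the
# same random effect; this link is what makes the missingness informative.
missed = rng.poisson(np.exp(-1.0 + 0.8 * b))

# Subjects with many missed visits carry systematically larger b_i, hence
# higher responses; a marginal analysis ignoring `missed` would be biased.
mean_y_low = y[missed == 0].mean()
mean_y_high = y[missed >= 2].mean()
```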

  1. Missing people, migrants, identification and human rights.

    PubMed

    Nuzzolese, E

    2012-11-30

The increasing volume and complexity of migratory flows have led to a range of problems involving human rights, public health, disease and border control, and regulatory processes. As a result of war or internal conflicts, missing-person cases and their management have to be regarded as a worldwide issue. Even in peacetime, the issue of missing persons remains relevant. In 2007 the Italian Ministry of Interior appointed an extraordinary commissioner to analyse and assess the total number of unidentified recovered bodies and verify the extent of the missing-persons phenomenon, reported as 24,912 people in Italy (updated 31 December 2011). Of these, 15,632 persons are of foreign nationality and are still missing. The census of unidentified bodies revealed a total of 832 cases recovered in Italy since 1974. These bodies/human remains received a regular autopsy and were buried as 'corpse without name'. In Italy a judicial autopsy is performed to establish cause of death and identity, but odontology and dental radiology are rarely employed in identification cases. Nevertheless, odontologists can substantiate identification through the 'biological profile', providing further information that can narrow the search to a smaller number of missing individuals even when no ante mortem dental data are available. The forensic dental community should put greater emphasis on the role of forensic odontology as a tool for humanitarian action on behalf of unidentified individuals and on best practice in human identification.

  2. Covariance Structure Models for Gene Expression Microarray Data

    ERIC Educational Resources Information Center

    Xie, Jun; Bentler, Peter M.

    2003-01-01

    Covariance structure models are applied to gene expression data using a factor model, a path model, and their combination. The factor model is based on a few factors that capture most of the expression information. A common factor of a group of genes may represent a common protein factor for the transcript of the co-expressed genes, and hence, it…

  3. Adjoints and Low-rank Covariance Representation

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; Cohn, Stephen E.

    2000-01-01

    Quantitative measures of the uncertainty of Earth System estimates can be as important as the estimates themselves. Second moments of estimation errors are described by the covariance matrix, whose direct calculation is impractical when the number of degrees of freedom of the system state is large. Ensemble and reduced-state approaches to prediction and data assimilation replace full estimation error covariance matrices by low-rank approximations. The appropriateness of such approximations depends on the spectrum of the full error covariance matrix, whose calculation is also often impractical. Here we examine the situation where the error covariance is a linear transformation of a forcing error covariance. We use operator norms and adjoints to relate the appropriateness of low-rank representations to the conditioning of this transformation. The analysis is used to investigate low-rank representations of the steady-state response to random forcing of an idealized discrete-time dynamical system.
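When the error covariance is a linear transformation of a forcing covariance, B = L Q L^T, a fast-decaying spectrum of B is what makes low-rank representations adequate. A generic numerical sketch (not the article's dynamical system):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 40, 5

# A linear response map whose action decays quickly across modes, so the
# resulting covariance B = L Q L^T has a rapidly decaying spectrum.
L_map = rng.normal(size=(n, n)) @ np.diag(0.5 ** np.arange(n))
Q = np.eye(n)                              # forcing error covariance
B = L_map @ Q @ L_map.T

# Rank-k representation from the k leading eigenpairs of B.
w, V = np.linalg.eigh(B)
order = np.argsort(w)[::-1]
w, V = w[order], V[:, order]
B_k = (V[:, :k] * w[:k]) @ V[:, :k].T

# Operator-norm relative error of the truncation: w[k] / w[0].
rel_err = np.linalg.norm(B - B_k, 2) / np.linalg.norm(B, 2)
```

The quality of the rank-k approximation is governed by the eigenvalue ratio, exactly the conditioning question the abstract raises.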

  4. Computation of transform domain covariance matrices

    NASA Technical Reports Server (NTRS)

    Fino, B. J.; Algazi, V. R.

    1975-01-01

    It is often of interest in applications to compute the covariance matrix of a random process transformed by a fast unitary transform. Here, the recursive definition of fast unitary transforms is used to derive recursive relations for the covariance matrices of the transformed process. These relations lead to fast methods of computation of covariance matrices and to substantial reductions of the number of arithmetic operations required.
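The quantity being computed is Cov(Tx) = T Cov(x) T^H for a unitary transform T. A direct, non-recursive sketch using the unitary DFT (the paper's contribution is the fast recursive evaluation of such products, which is not reproduced here):

```python
import numpy as np

n = 8
rho = 0.7

# AR(1) covariance of the original process.
idx = np.arange(n)
Cx = rho ** np.abs(idx[:, None] - idx[None, :])

# Unitary DFT matrix via the FFT applied to the identity.
T = np.fft.fft(np.eye(n), norm="ortho")

# Covariance of the transformed process y = T x.
Cy = T @ Cx @ T.conj().T
```

Because T is unitary, the total variance (trace) is preserved, and Cy is Hermitian like any covariance.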

  5. Regression analysis for bivariate gap time with missing first gap time data.

    PubMed

    Huang, Chia-Hui; Chen, Yi-Hau

    2017-01-01

We consider ordered bivariate gap times where data on the first gap time are unobservable. This study is motivated by the HIV infection and AIDS study, where the initial HIV contracting time is unavailable, but the diagnosis times for HIV and AIDS are available. We are interested in studying the risk factors for the gap time between initial HIV contraction and HIV diagnosis, and for the gap time between HIV and AIDS diagnoses. The association between the two gap times is also of interest. Accordingly, in the data analysis we are faced with a two-fold complexity: data on the first gap time are completely missing, and the second gap time is subject to induced informative censoring due to dependence between the two gap times. We propose a modeling framework for regression analysis of bivariate gap times under this complexity of the data. The estimating equations for the covariate effects on, as well as the association between, the two gap times are derived through maximum likelihood and suitable counting processes. Large-sample properties of the resulting estimators are developed by martingale theory. Simulations are performed to examine the performance of the proposed analysis procedure. An application to data from the HIV and AIDS study mentioned above is reported for illustration.

  6. Forrester with a MISSE PEC installed on the ISS Airlock

    NASA Image and Video Library

    2001-08-16

    STS105-346-011 (18 August 2001) --- Astronaut Patrick G. Forrester, during the second STS-105 extravehicular activity, prepares to work with the Materials International Space Station Experiment (MISSE, almost out of frame at left). The experiment was installed on the outside of the Quest Airlock during the first extravehicular activity (EVA) of the STS-105 mission. MISSE will collect information on how different materials weather in the environment of space.

  7. Relative-Error-Covariance Algorithms

    NASA Technical Reports Server (NTRS)

    Bierman, Gerald J.; Wolff, Peter J.

    1991-01-01

    Two algorithms compute error covariance of difference between optimal estimates, based on data acquired during overlapping or disjoint intervals, of state of discrete linear system. Provides quantitative measure of mutual consistency or inconsistency of estimates of states. Relative-error-covariance concept applied to determine degree of correlation between trajectories calculated from two overlapping sets of measurements and to construct real-time test of consistency of state estimates based upon recently acquired data.

  8. Statewide analysis of missed opportunities for human papillomavirus vaccination using vaccine registry data.

    PubMed

    Kepka, Deanna; Spigarelli, Michael G; Warner, Echo L; Yoneoka, Yukiko; McConnell, Nancy; Balch, Alfred

    2016-12-01

    Human papillomavirus (HPV) vaccine 3-dose completion rates among adolescent females in the US are low. Missed opportunities impede HPV vaccination coverage. A population-based secondary data analysis of de-identified vaccination and demographic data from the Utah Statewide Immunization Information System (USIIS) was conducted. Records were included from 25,866 females aged 11-26 years at any time during 2008-2012 who received at least one of the following adolescent vaccinations documented in the USIIS: Tdap (Tetanus, Diphtheria, Pertussis), meningococcal, and/or influenza. A missed opportunity for HPV vaccination was defined as a clinical encounter where the patient received at least one adolescent vaccination, but not an HPV vaccine. Of 47,665 eligible visits, there were 20,911 missed opportunities (43.87%). Age group, race/ethnicity, and rurality were significantly associated with missed opportunity (p<0.0001). In a multivariable mixed-effects logistic regression model that included ethnicity, location, and age as fixed effects and subject as a random effect, Hispanics were less likely to have a missed opportunity than whites (OR 0.59, 95% CI: 0.52-0.66), small rural youth were more likely to have a missed opportunity than urban youth (OR 1.8, 95% CI: 1.5-2.2), and preteens were more likely than teens (OR 2.4, 95% CI: 2.2-2.7). Missed clinical opportunities are a significant barrier to HPV vaccination among female adolescents. Interventions targeted at providers who serve patient groups with the highest missed opportunities are needed to achieve adequate protection from HPV-associated illnesses. This is one of the first studies to utilize state immunization information system data to assess missed opportunities for HPV vaccination.

  9. Prolonged grief and post-traumatic stress among relatives of missing persons and homicidally bereaved individuals: A comparative study.

    PubMed

    Lenferink, Lonneke I M; van Denderen, Mariëtte Y; de Keijser, Jos; Wessel, Ineke; Boelen, Paul A

    2017-02-01

    Traumatic loss (e.g., homicide) is associated with elevated prolonged grief disorder (PGD) and posttraumatic stress disorder (PTSD). Several studies comparing relatives of missing persons with homicidally bereaved individuals showed inconsistent results about the difference in PGD- and PTSD-levels between the groups. These studies were conducted in the context of armed conflict, which may confound the results. The current study aims to compare PGD- and PTSD-levels between the groups outside the context of armed conflict. Relatives of long-term missing persons (n=134) and homicidally bereaved individuals (n=331) completed self-report measures of PGD and PTSD. Multilevel regression modelling was used to compare symptom scores between the groups. Homicidally bereaved individuals reported significantly higher levels of PGD (d=0.86) and PTSD (d=0.28) than relatives of missing persons, when taking relevant covariates (i.e., gender, time since loss, and kinship to the disappeared/deceased person) into account. A limitation of this study is the use of self-report measures instead of clinical interviews. Prior studies among relatives of missing persons and homicidally bereaved individuals in the context of armed conflict may not be generalizable to similar samples outside these contexts. Future research is needed to further explore differences in bereavement-related psychopathology between different groups and correlates and treatment of this psychopathology. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Covariate analysis of bivariate survival data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bennett, L.E.

    1992-01-01

    The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey were analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models are compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.

  11. Survival analysis with functional covariates for partial follow-up studies.

    PubMed

    Fang, Hong-Bin; Wu, Tong Tong; Rapoport, Aaron P; Tan, Ming

    2016-12-01

    Predictive or prognostic analysis plays an increasingly important role in the era of personalized medicine to identify subsets of patients whom the treatment may benefit the most. Although various time-dependent covariate models are available, such models require that covariates be followed over the whole follow-up period. This article studies a new class of functional survival models where the covariates are only monitored in a time interval that is shorter than the whole follow-up period. This paper is motivated by the analysis of a longitudinal study on advanced myeloma patients who received stem cell transplants and T cell infusions after the transplants. The absolute lymphocyte cell counts were collected serially during hospitalization. Patients who are alive after hospitalization continue to be followed, but their absolute lymphocyte cell counts can no longer be measured. Another complication is that absolute lymphocyte cell counts are sparsely and irregularly measured. The conventional method using the Cox model with time-varying covariates is not applicable because of the different lengths of observation periods. Analysis based on each single observation obviously underutilizes available information and, more seriously, may yield misleading results. This so-called partial follow-up study design represents an increasingly common predictive modeling problem where we have serial multiple biomarkers up to a certain time point, which is shorter than the total length of follow-up. We therefore propose a solution to the partial follow-up design. The new method combines functional principal components analysis and survival analysis with selection of those functional covariates. It also has the advantage of handling sparse and irregularly measured longitudinal observations of covariates and measurement errors.
Our analysis based on functional principal components reveals that it is the patterns of the trajectories of absolute lymphocyte cell counts, instead of
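
A minimal numpy sketch of the functional-principal-components step on hypothetical trajectories sampled on a common regular grid (the paper's method additionally handles sparse, irregular measurements and couples the scores to a survival model, both omitted here; all data and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated lymphocyte-count-like trajectories on a common grid during hospitalization
n_patients, n_times = 50, 20
t = np.linspace(0.0, 1.0, n_times)
scores_true = rng.normal(size=(n_patients, 2))
X = (scores_true[:, [0]] * np.sin(np.pi * t)     # pattern 1: single peak
     + scores_true[:, [1]] * t                   # pattern 2: steady rise
     + 0.1 * rng.normal(size=(n_patients, n_times)))

# Functional PCA via eigendecomposition of the sample covariance of the curves
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / n_patients
evals, evecs = np.linalg.eigh(cov)
evals, evecs = evals[::-1], evecs[:, ::-1]       # reorder to descending variance

# FPC scores: low-dimensional summaries of each trajectory, usable downstream
# as scalar covariates in a survival (e.g. Cox) model
fpc_scores = Xc @ evecs[:, :2]
```

The point of the reduction is that each patient's whole curve collapses to a couple of scores describing its pattern, which is what a downstream survival model can select among.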

  12. Lorentz covariance of loop quantum gravity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rovelli, Carlo; Speziale, Simone

    2011-05-15

    The kinematics of loop gravity can be given a manifestly Lorentz-covariant formulation: the conventional SU(2)-spin-network Hilbert space can be mapped to a space K of SL(2,C) functions, where Lorentz covariance is manifest. K can be described in terms of a certain subset of the projected spin networks studied by Livine, Alexandrov and Dupuis. It is formed by SL(2,C) functions completely determined by their restriction on SU(2). These are square-integrable in the SU(2) scalar product, but not in the SL(2,C) one. Thus, SU(2)-spin-network states can be represented by Lorentz-covariant SL(2,C) functions, as two-component photons can be described in the Lorentz-covariant Gupta-Bleuler formalism. As shown by Wolfgang Wieland in a related paper, this manifestly Lorentz-covariant formulation can also be directly obtained from canonical quantization. We show that the spinfoam dynamics of loop quantum gravity is locally SL(2,C)-invariant in the bulk, and yields states that are precisely in K on the boundary. This clarifies how the SL(2,C) spinfoam formalism yields an SU(2) theory on the boundary. These structures define a tidy Lorentz-covariant formalism for loop gravity.

  13. Missing Data and Multiple Imputation: An Unbiased Approach

    NASA Technical Reports Server (NTRS)

    Foy, M.; VanBaalen, M.; Wear, M.; Mendez, C.; Mason, S.; Meyers, V.; Alexander, D.; Law, J.

    2014-01-01

    The default method of dealing with missing data in statistical analyses is to use only the complete observations (complete case analysis), which can lead to unexpected bias when data do not meet the assumption of missing completely at random (MCAR). For the assumption of MCAR to be met, missingness cannot be related to either the observed or unobserved variables. A less stringent assumption, missing at random (MAR), requires that missingness not be associated with the value of the missing variable itself, but allows it to be associated with the other observed variables. When data are truly MAR as opposed to MCAR, the default complete case analysis method can lead to biased results. There are statistical options available to adjust for data that are MAR, including multiple imputation (MI), which is consistent and efficient at estimating effects. Multiple imputation uses informing variables to determine statistical distributions for each piece of missing data. Then multiple datasets are created by randomly drawing on the distributions for each piece of missing data. Since MI is efficient, only a limited number of imputed datasets, usually fewer than 20, are required to get stable estimates. Each imputed dataset is analyzed using standard statistical techniques, and then results are combined to get overall estimates of effect. A simulation study demonstrates the results of using the default complete case analysis and MI in a linear regression of MCAR and MAR simulated data. Further, MI was successfully applied to the association study of CO2 levels and headaches when initial analysis showed there may be an underlying association between missing CO2 levels and reported headaches. Through MI, we were able to show that there is a strong association between average CO2 levels and the risk of headaches. Each unit increase in CO2 (mmHg) resulted in a doubling of the odds of reported headaches.
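
A small numpy sketch of the MI workflow on simulated MAR data: a covariate x is made missing with probability depending on the observed outcome y, each of M imputed datasets fills in x with draws from an imputation model fitted to complete cases, and the per-dataset slopes are pooled by averaging (a simplified version of Rubin's rules; all data and parameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate data and impose a MAR mechanism: whether x is missing depends on
# the fully observed outcome y, not on x itself
n = 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
miss = rng.random(n) < 1.0 / (1.0 + np.exp(-(y - 1.0)))   # more missing when y is large
x_obs = np.where(miss, np.nan, x)
obs = ~miss

def slope(xv, yv):
    A = np.column_stack([np.ones(len(xv)), xv])
    return np.linalg.lstsq(A, yv, rcond=None)[0][1]

# Complete-case analysis (the default)
slope_cc = slope(x_obs[obs], y[obs])

# Imputation model x | y, fitted on complete cases (valid under MAR given y)
A = np.column_stack([np.ones(obs.sum()), y[obs]])
coef = np.linalg.lstsq(A, x_obs[obs], rcond=None)[0]
sigma = (x_obs[obs] - A @ coef).std()

# M imputed datasets: fill missing x with a draw from the conditional model,
# analyze each, then pool the estimates
M = 20
slopes = []
for _ in range(M):
    x_imp = x_obs.copy()
    x_imp[miss] = coef[0] + coef[1] * y[miss] + rng.normal(0.0, sigma, miss.sum())
    slopes.append(slope(x_imp, y))
slope_mi = float(np.mean(slopes))
```

Note that proper MI would also draw the imputation-model parameters themselves from their posterior; the fixed-parameter version above is the simplest sketch of the mechanics.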

  14. IN CASE YOU MISSED IT: EPA Releases New Chemical Safety Guidelines Aimed at Curbing Animal Testing, Tracking Mercury Imports, and Facilitating the Sharing of Confidential Business Information

    EPA Pesticide Factsheets

    EPA News Release: IN CASE YOU MISSED IT: EPA Releases New Chemical Safety Guidelines Aimed at Curbing Animal Testing, Tracking Mercury Imports, and Facilitating the Sharing of Confidential Business Information

  15. Treatment decisions based on scalar and functional baseline covariates.

    PubMed

    Ciarleglio, Adam; Petkova, Eva; Ogden, R Todd; Tarpey, Thaddeus

    2015-12-01

    The amount and complexity of patient-level data being collected in randomized-controlled trials offer both opportunities and challenges for developing personalized rules for assigning treatment for a given disease or ailment. For example, trials examining treatments for major depressive disorder are not only collecting typical baseline data such as age, gender, or scores on various tests, but also data that measure the structure and function of the brain such as images from magnetic resonance imaging (MRI), functional MRI (fMRI), or electroencephalography (EEG). These latter types of data have an inherent structure and may be considered as functional data. We propose an approach that uses baseline covariates, both scalars and functions, to aid in the selection of an optimal treatment. In addition to providing information on which treatment should be selected for a new patient, the estimated regime has the potential to provide insight into the relationship between treatment response and the set of baseline covariates. Our approach can be viewed as an extension of "advantage learning" to include both scalar and functional covariates. We describe our method and how to implement it using existing software. Empirical performance of our method is evaluated with simulated data in a variety of settings and also applied to data arising from a study of patients with major depressive disorder from whom baseline scalar covariates as well as functional data from EEG are available. © 2015, The International Biometric Society.

  16. Levy Matrices and Financial Covariances

    NASA Astrophysics Data System (ADS)

    Burda, Zdzislaw; Jurkiewicz, Jerzy; Nowak, Maciej A.; Papp, Gabor; Zahed, Ismail

    2003-10-01

    In a given market, financial covariances capture the inter-stock correlations and can be used to address statistically the bulk nature of the market as a complex system. We provide a statistical analysis of three SP500 covariances with evidence for raw tail distributions. We study the stability of these tails against reshuffling for the SP500 data and show that the covariance with the strongest tails is robust, with a spectral density in remarkable agreement with random Lévy matrix theory. We study the inverse participation ratio for the three covariances. The strong localization observed at both ends of the spectral density is analogous to the localization exhibited in the random Lévy matrix ensemble. We discuss two competitive mechanisms responsible for the occurrence of an extensive and delocalized eigenvalue at the edge of the spectrum: (a) the Lévy character of the entries of the correlation matrix and (b) a sort of off-diagonal order induced by underlying inter-stock correlations. (b) can be destroyed by reshuffling, while (a) cannot. We show that the stocks with the largest scattering are the least susceptible to correlations, and are likely candidates for the localized states. We introduce a simple model for price fluctuations which captures the behavior of the SP500 covariances. It may be of importance for asset diversification.
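
The inverse-participation-ratio diagnostic can be sketched in numpy on a toy one-factor market with heavy-tailed (Student-t) returns; the factor loading and degrees of freedom are illustrative assumptions, not the SP500 data analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy market: N stocks over T days, one common "market" factor plus
# heavy-tailed (Student-t, df=3) idiosyncratic noise
N, T = 50, 1000
market = rng.standard_t(df=3, size=T)
returns = 0.5 * market + rng.standard_t(df=3, size=(N, T))

C = np.corrcoef(returns)            # empirical correlation matrix of the stocks
evals, evecs = np.linalg.eigh(C)    # eigenvalues in ascending order

def ipr(v):
    # Inverse participation ratio: ~1/N for a delocalized eigenvector,
    # O(1) for an eigenvector localized on a few stocks
    return float(np.sum(v ** 4))

ipr_top = ipr(evecs[:, -1])         # eigenvector of the largest ("market") eigenvalue
```

The common factor produces one extensive eigenvalue well above the bulk whose eigenvector is spread roughly evenly over all stocks, so its IPR stays near 1/N, matching the paper's delocalized edge eigenvalue.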

  17. Structural Analysis of Covariance and Correlation Matrices.

    ERIC Educational Resources Information Center

    Joreskog, Karl G.

    1978-01-01

    A general approach to analysis of covariance structures is considered, in which the variances and covariances or correlations of the observed variables are directly expressed in terms of the parameters of interest. The statistical problems of identification, estimation and testing of such covariance or correlation structures are discussed.…

  18. Missing data in trial-based cost-effectiveness analysis: An incomplete journey.

    PubMed

    Leurent, Baptiste; Gomes, Manuel; Carpenter, James R

    2018-06-01

    Cost-effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantive challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials. However, it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of and methods used to address missing data in recently published trial-based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched. Fifty-two eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost-effectiveness data was 63% (interquartile range: 47%-81%). The most common approach for the primary analysis was to restrict analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some sort of sensitivity analyses, but only 2 (4%) considered possible departures from the missing-at-random assumption. Further improvements are needed to address missing data in cost-effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing an appropriate method for the primary analysis that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing-at-random assumption. © 2018 The Authors Health Economics published by John Wiley & Sons Ltd.

  19. A Study on Mutil-Scale Background Error Covariances in 3D-Var Data Assimilation

    NASA Astrophysics Data System (ADS)

    Zhang, Xubin; Tan, Zhe-Min

    2017-04-01

    The construction of background error covariances is a key component of three-dimensional variational data assimilation. Background errors exist at different scales, and they interact, in numerical weather prediction. However, the influence of these errors and their interactions cannot be represented in the background error covariance statistics when estimated by the leading methods. It is therefore necessary to construct background error covariances influenced by multi-scale interactions among errors. With the NMC method, this article first estimates the background error covariances at given model-resolution scales. The information of errors whose scales are larger and smaller than the given ones is then introduced, respectively, using different nesting techniques, to estimate the corresponding covariances. The comparisons of three background error covariance statistics influenced by information of errors at different scales reveal that the background error variances are enhanced, particularly at large scales and higher levels, when introducing the information of larger-scale errors through the lateral boundary condition provided by a lower-resolution model. On the other hand, the variances are reduced at medium scales at the higher levels, while they show slight improvement at lower levels in the nested domain, especially at medium and small scales, when introducing the information of smaller-scale errors by nesting a higher-resolution model. In addition, the introduction of information of larger- (smaller-) scale errors leads to larger (smaller) horizontal and vertical correlation scales of background errors. Considering the multivariate correlations, the Ekman coupling increases (decreases) with the information of larger- (smaller-) scale errors included, whereas the geostrophic coupling in the free atmosphere weakens in both situations. The three covariances obtained in the above work are used in a data assimilation and model forecast system, respectively, and then the

  20. Computer codes for checking, plotting and processing of neutron cross-section covariance data and their application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sartori, E.; Roussin, R.W.

    This paper presents a brief review of computer codes concerned with checking, plotting, processing and using covariances of neutron cross-section data. It concentrates on those available from the computer code information centers of the United States and the OECD/Nuclear Energy Agency. Emphasis is also placed on codes using covariances for specific applications such as uncertainty analysis, data adjustment and data consistency analysis. Recent evaluations contain neutron cross section covariance information for all isotopes of major importance for technological applications of nuclear energy. It is therefore important that the available software tools needed for taking advantage of this information are widely known, as they permit the determination of better safety margins and allow the optimization of more economical designs of nuclear energy systems.

  1. IIB supergravity and the E 6(6) covariant vector-tensor hierarchy

    DOE PAGES

    Ciceri, Franz; de Wit, Bernard; Varela, Oscar

    2015-04-20

    IIB supergravity is reformulated with a manifest local USp(8) invariance that makes the embedding of five-dimensional maximal supergravities transparent. In this formulation the ten-dimensional theory exhibits all the 27 one-form fields and 22 of the 27 two-form fields that are required by the vector-tensor hierarchy of the five-dimensional theory. The missing 5 two-form fields must transform in the same representation as a descendant of the ten-dimensional ‘dual graviton’. The invariant E 6(6) symmetric tensor that appears in the vector-tensor hierarchy is reproduced. Generalized vielbeine are derived from the supersymmetry transformations of the vector fields, as well as consistent expressions for the USp(8) covariant fermion fields. Implications are further discussed for the consistency of the truncation of IIB supergravity compactified on the five-sphere to maximal gauged supergravity in five space-time dimensions with an SO(6) gauge group.

  2. Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets.

    PubMed

    Huang, Min-Wei; Lin, Wei-Chao; Tsai, Chih-Fong

    2018-01-01

    Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem, which is to provide estimations for the missing values by a reasoning process based on the (complete) observed data. However, if the observed data contain some noisy information or outliers, the estimations of the missing values may not be reliable or may even be quite different from the real values. The aim of this paper is to examine whether a combination of instance selection from the observed data and missing value imputation offers better performance than performing missing value imputation alone. In particular, three instance selection algorithms, DROP3, GA, and IB3, and three imputation algorithms, KNNI, MLP, and SVM, are used in order to find out the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation for medical datasets with numerical attributes, and that specific combinations of instance selection and imputation methods can improve the imputation results for medical datasets with mixed attribute types. However, instance selection does not have a definitely positive impact on the imputation result for categorical medical datasets.
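
A toy numpy sketch of the paper's question, with a crude median-distance outlier filter standing in for the instance selection algorithms (DROP3/GA/IB3) and a simple k-nearest-neighbour imputer standing in for KNNI; the dataset and thresholds are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "medical" dataset: two correlated numeric attributes, a few gross
# outliers among the observed values, and missing values in the second attribute
n = 200
a = rng.normal(size=n)
b = 2.0 * a + rng.normal(scale=0.3, size=n)
b[:5] = 50.0                                  # injected outliers
miss = rng.random(n) < 0.2
b_obs = np.where(miss, np.nan, b)

def knn_impute(a, b_obs, k=5, keep=None):
    # Impute each missing b from the k nearest observed neighbours in a;
    # `keep` optionally restricts the donor pool (the instance-selection step)
    donors = ~np.isnan(b_obs)
    if keep is not None:
        donors = donors & keep
    d_idx = np.where(donors)[0]
    out = b_obs.copy()
    for i in np.where(np.isnan(b_obs))[0]:
        nearest = d_idx[np.argsort(np.abs(a[d_idx] - a[i]))[:k]]
        out[i] = b_obs[nearest].mean()
    return out

imp_plain = knn_impute(a, b_obs)              # imputation alone

obs_idx = ~np.isnan(b_obs)                    # instance selection: drop observed
med = np.median(b_obs[obs_idx])               # values far from the median
keep = np.zeros(n, dtype=bool)
keep[obs_idx] = np.abs(b_obs[obs_idx] - med) < 10.0
imp_sel = knn_impute(a, b_obs, keep=keep)     # selection + imputation

eval_idx = miss.copy()
eval_idx[:5] = False                          # score only the genuine targets
mse_plain = float(np.mean((imp_plain[eval_idx] - b[eval_idx]) ** 2))
mse_sel = float(np.mean((imp_sel[eval_idx] - b[eval_idx]) ** 2))
```

Filtering the donor pool keeps the injected outliers from contaminating neighbourhood averages, which is the numerical-attribute effect the paper reports.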

  3. Missing data in FFQs: making assumptions about item non-response.

    PubMed

    Lamb, Karen E; Olstad, Dana Lee; Nguyen, Cattram; Milte, Catherine; McNaughton, Sarah A

    2017-04-01

    FFQs are a popular method of capturing dietary information in epidemiological studies and may be used to derive dietary exposures such as nutrient intake or overall dietary patterns and diet quality. As FFQs can involve large numbers of questions, participants may fail to respond to all questions, leaving researchers to decide how to deal with missing data when deriving intake measures. The aim of the present commentary is to discuss the current practice for dealing with item non-response in FFQs and to propose a research agenda for reporting and handling missing data in FFQs. Single imputation techniques, such as zero imputation (assuming no consumption of the item) or mean imputation, are commonly used to deal with item non-response in FFQs. However, single imputation methods make strong assumptions about the missing data mechanism and do not reflect the uncertainty created by the missing data. This can lead to incorrect inference about associations between diet and health outcomes. Although the use of multiple imputation methods in epidemiology has increased, these have seldom been used in the field of nutritional epidemiology to address missing data in FFQs. We discuss methods for dealing with item non-response in FFQs, highlighting the assumptions made under each approach. Researchers analysing FFQs should ensure that missing data are handled appropriately and clearly report how missing data were treated in analyses. Simulation studies are required to enable systematic evaluation of the utility of various methods for handling item non-response in FFQs under different assumptions about the missing data mechanism.
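
The contrast between zero imputation and a multiple-imputation-style approach can be sketched on a hypothetical FFQ item: zero imputation assumes non-response means no consumption, while drawing repeatedly from the observed distribution is a crude MI analogue, valid here because the simulated missingness is MCAR (all data and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical FFQ item: weekly servings, with 20% item non-response (MCAR here)
n = 1000
intake = rng.gamma(shape=2.0, scale=2.5, size=n)   # true mean of 5 servings
miss = rng.random(n) < 0.2
obs = np.where(miss, np.nan, intake)

# Zero imputation: non-response treated as no consumption, biasing the mean down
mean_zero = float(np.nan_to_num(obs, nan=0.0).mean())

# Minimal multiple-imputation analogue: draw each missing value from the observed
# empirical distribution, repeat M times, and pool the per-dataset means
M = 20
observed_vals = obs[~np.isnan(obs)]
means = []
for _ in range(M):
    filled = obs.copy()
    n_miss = int(np.isnan(filled).sum())
    filled[np.isnan(filled)] = rng.choice(observed_vals, size=n_miss)
    means.append(filled.mean())
mean_mi = float(np.mean(means))
```

Under a non-MCAR mechanism (e.g. heavy consumers skipping items), neither single draw-from-observed nor zero imputation is automatically right, which is the commentary's argument for stating assumptions explicitly.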

  4. Covariant harmonic oscillators: 1973 revisited

    NASA Technical Reports Server (NTRS)

    Noz, M. E.

    1993-01-01

    Using the relativistic harmonic oscillator, a physical basis is given to the phenomenological wave function of Yukawa which is covariant and normalizable. It is shown that this wave function can be interpreted in terms of the unitary irreducible representations of the Poincare group. The transformation properties of these covariant wave functions are also demonstrated.

  5. The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning

    ERIC Educational Resources Information Center

    Finch, W. Holmes

    2011-01-01

    Missing information is a ubiquitous aspect of data analysis, including responses to items on cognitive and affective instruments. Although the broader statistical literature describes missing data methods, relatively little work has focused on this issue in the context of differential item functioning (DIF) detection. Such prior research has…

  6. Adult myeloid leukaemia and radon exposure: a Bayesian model for a case-control study with error in covariates.

    PubMed

    Toti, Simona; Biggeri, Annibale; Forastiere, Francesco

    2005-06-30

    The possible association between radon exposure in dwellings and adult myeloid leukaemia had been explored in an Italian province by a case-control study. A total of 44 cases and 211 controls were selected from the death certificate file. No association had been found in the original study (OR = 0.58 for >185 vs ≤80 Bq/m3). Here we reanalyse the data taking into account the measurement error of radon concentration and the presence of missing data. A Bayesian hierarchical model with error in covariates is proposed which allows appropriate imputation of missing values. The general conclusion of no evidence of association with radon does not change, but a negative association is not observed anymore (OR = 0.99 for >185 vs ≤80 Bq/m3). After adjusting for residential house radon and gamma radiation, and for the multilevel data structure, geological features of the soil are associated with adult myeloid leukaemia risk (OR = 2.14, 95 per cent Cr.I. 1.0-5.5). Copyright 2005 John Wiley & Sons, Ltd.

  7. Smooth individual level covariates adjustment in disease mapping.

    PubMed

    Huque, Md Hamidul; Anderson, Craig; Walton, Richard; Woolford, Samuel; Ryan, Louise

    2018-05-01

    Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available "indiCAR" model fits the popular conditional autoregresssive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log-linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non-log-linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth-indiCAR, to fit an extension to the popular conditional autoregresssive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two-step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth-indiCAR through simulation. Our results indicate that the smooth-indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study.

    PubMed

    Shah, Anoop D; Bartlett, Jonathan W; Carpenter, James; Nicholas, Owen; Hemingway, Harry

    2014-03-15

    Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The "true" imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001-2010) with complete data on all covariates. Variables were artificially made "missing at random," and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data.
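
The second simulation's point, that a flexible imputer captures nonlinear dependence that a default linear imputation model misses, can be sketched with a k-nearest-neighbour step standing in for the random forest (numpy only; the data and constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)

# The partially observed variable y depends on the observed x non-linearly
n = 1000
x = rng.uniform(-2.0, 2.0, size=n)
y = x ** 2 + rng.normal(scale=0.2, size=n)
miss = rng.random(n) < 0.3
y_obs = np.where(miss, np.nan, y)
obs = ~np.isnan(y_obs)

# Parametric (linear) imputation step, as in a default MICE model
A = np.column_stack([np.ones(obs.sum()), x[obs]])
coef = np.linalg.lstsq(A, y_obs[obs], rcond=None)[0]
y_lin = y_obs.copy()
y_lin[miss] = coef[0] + coef[1] * x[miss]

# Nonparametric imputation step: mean of the k nearest observed neighbours in x
# (a crude stand-in for a random forest's adaptive, nonlinear fit)
def knn_predict(x0, k=10):
    idx = np.argsort(np.abs(x[obs] - x0))[:k]
    return y_obs[obs][idx].mean()

y_knn = y_obs.copy()
y_knn[miss] = [knn_predict(x0) for x0 in x[miss]]

mse_lin = float(np.mean((y_lin[miss] - y[miss]) ** 2))
mse_knn = float(np.mean((y_knn[miss] - y[miss]) ** 2))
```

Because y = x² is symmetric, the best linear fit in x is nearly flat and the linear imputations miss the curvature entirely, while the neighbourhood-based imputer tracks it.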

  9. Make the call, don't miss a beat: Heart Attack Information for Women

    MedlinePlus

    ... Other resources Learn more about heart disease and heart attacks. Make the Call, Don't Miss a Beat ... symptoms Learn the 7 most common signs of heart attack in men and women. Chest pain or discomfort " ...

  10. A nonparametric multiple imputation approach for missing categorical data.

    PubMed

    Zhou, Muhan; He, Yulei; Yu, Mandi; Hsu, Chiu-Hsieh

    2017-06-06

    Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value and the non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from the two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented. The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method.
We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with more than two levels for assessing the distribution of the outcome.
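The two-working-model, nearest-neighbour scheme described above can be sketched as follows. This is a simplified illustration (simulated data, an arbitrary weight, donor-set size, and number of imputations), not the authors' implementation:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 1000
    x = rng.normal(size=n)
    # Three-category outcome depending on x.
    logits = np.column_stack([np.zeros(n), 1.5 * x, -1.5 * x])
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    y = np.array([rng.choice(3, p=pi) for pi in p])
    # Missing at random: missingness depends on observed x only.
    miss = rng.random(n) < 1.0 / (1.0 + np.exp(1 - x))

    # Working model 1 (outcome): multinomial logistic for y given x.
    om = LogisticRegression(max_iter=1000).fit(x[~miss, None], y[~miss])
    # Working model 2 (missingness): logistic for the missingness indicator.
    mm = LogisticRegression(max_iter=1000).fit(x[:, None], miss.astype(int))

    # Predictive score combining both working models via a weight w.
    s_out = om.predict_proba(x[:, None])          # n x 3 class probabilities
    s_mis = mm.predict_proba(x[:, None])[:, 1]    # missingness probability
    w = 0.8                                       # illustrative weight choice
    score = np.column_stack([w * s_out, (1 - w) * s_mis[:, None]])

    K, M = 10, 5          # donor-set size, number of multiple imputations
    obs_idx = np.flatnonzero(~miss)
    props = []
    for _ in range(M):
        y_imp = y.copy()
        for i in np.flatnonzero(miss):
            d = np.linalg.norm(score[obs_idx] - score[i], axis=1)
            donors = obs_idx[np.argsort(d)[:K]]   # K nearest non-missing values
            y_imp[i] = y[rng.choice(donors)]      # draw one donor at random
        props.append(np.bincount(y_imp, minlength=3) / n)
    prop_hat = np.mean(props, axis=0)             # estimated category proportions
    ```

    Combining proportion estimates across the M completed datasets follows the usual multiple-imputation rules; only the point estimate is shown here.
    
    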

  11. HSQC-1,n-ADEQUATE: a new approach to long-range 13C-13C correlation by covariance processing.

    PubMed

    Martin, Gary E; Hilton, Bruce D; Willcott, M Robert; Blinov, Kirill A

    2011-10-01

    Long-range, two-dimensional heteronuclear shift correlation NMR methods play a pivotal role in the assembly of novel molecular structures. The well-established GHMBC method is a high-sensitivity mainstay technique, affording connectivity information via (n)J(CH) coupling pathways. Unfortunately, there is no simple way of determining the value of n and hence no way of differentiating two-bond from three- and occasionally four-bond correlations. Three-bond correlations, however, generally predominate. Recent work has shown that the unsymmetrical indirect covariance or generalized indirect covariance processing of multiplicity edited GHSQC and 1,1-ADEQUATE spectra provides high-sensitivity access to a (13)C-(13)C connectivity map in the form of an HSQC-1,1-ADEQUATE spectrum. Covariance processing of these data allows the 1,1-ADEQUATE connectivity information to be exploited with the inherent sensitivity of the GHSQC spectrum rather than the intrinsically lower sensitivity of the 1,1-ADEQUATE spectrum itself. Data acquisition times and/or sample size can be substantially reduced when covariance processing is to be employed. In an extension of that work, 1,n-ADEQUATE spectra can likewise be subjected to covariance processing to afford high-sensitivity access to the equivalent of (4)J(CH) GHMBC connectivity information. The method is illustrated using strychnine as a model compound. Copyright © 2011 John Wiley & Sons, Ltd.

  12. Precomputing Process Noise Covariance for Onboard Sequential Filters

    NASA Technical Reports Server (NTRS)

    Olson, Corwin G.; Russell, Ryan P.; Carpenter, J. Russell

    2017-01-01

    Process noise is often used in estimation filters to account for unmodeled and mismodeled accelerations in the dynamics. The process noise covariance acts to inflate the state covariance over propagation intervals, increasing the uncertainty in the state. In scenarios where the acceleration errors change significantly over time, the standard process noise covariance approach can fail to provide effective representation of the state and its uncertainty. Consider covariance analysis techniques provide a method to precompute a process noise covariance profile along a reference trajectory using known model parameter uncertainties. The process noise covariance profile allows significantly improved state estimation and uncertainty representation over the traditional formulation. As a result, estimation performance on par with the consider filter is achieved for trajectories near the reference trajectory without the additional computational cost of the consider filter. The new formulation also has the potential to significantly reduce the trial-and-error tuning currently required of navigation analysts. A linear estimation problem as described in several previous consider covariance analysis studies is used to demonstrate the effectiveness of the precomputed process noise covariance, as well as a nonlinear descent scenario at the asteroid Bennu with optical navigation.
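    The core mechanism the abstract describes, inflating the state covariance with a process noise covariance that follows a precomputed profile along the trajectory rather than a single hand-tuned constant, can be sketched minimally. The dynamics, noise model, and profile values below are illustrative assumptions, not the paper's Bennu scenario:

    ```python
    import numpy as np

    # 1-D constant-velocity state [position, velocity]; discrete step dt.
    dt = 1.0
    F = np.array([[1.0, dt],
                  [0.0, 1.0]])

    def propagate(P, Q):
        """Standard covariance time update: inflate state covariance by Q."""
        return F @ P @ F.T + Q

    def q_profile(sigma_a):
        """Discrete white-noise-acceleration Q for acceleration sigma sigma_a.
        In a precomputed-profile scheme, sigma_a (hence Q) varies along the
        reference trajectory instead of being one hand-tuned constant."""
        G = np.array([[0.5 * dt ** 2], [dt]])
        return sigma_a ** 2 * (G @ G.T)

    P = np.eye(2)
    # Illustrative precomputed profile: model uncertainty grows near a maneuver.
    sigmas = [0.01, 0.01, 0.5, 0.5, 0.01]
    for s in sigmas:
        P = propagate(P, q_profile(s))
    ```

    The filter's measurement update (not shown) then shrinks P; the profile controls how much uncertainty is reintroduced at each step.
    
    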

  14. VIGAN: Missing View Imputation with Generative Adversarial Networks.

    PubMed

    Shang, Chao; Palmer, Aaron; Sun, Jiangwen; Chen, Ko-Shin; Lu, Jin; Bi, Jinbo

    2017-01-01

    In an era when big data are becoming the norm, there is less concern with the quantity of data and more with its quality and completeness. In many disciplines, data are collected from heterogeneous sources, resulting in multi-view or multi-modal datasets. The missing data problem has been challenging to address in multi-view data analysis. In particular, when certain samples miss an entire view of data, the missing view problem arises. Classic multiple imputation or matrix completion methods are hardly effective here, because the affected view contains no information on which to base imputation for such samples. The commonly used simple method of removing samples with a missing view can dramatically reduce sample size, thus diminishing the statistical power of a subsequent analysis. In this paper, we propose a novel approach for view imputation via generative adversarial networks (GANs), which we name VIGAN. This approach first treats each view as a separate domain and identifies domain-to-domain mappings via a GAN using randomly-sampled data from each view, and then employs a multi-modal denoising autoencoder (DAE) to reconstruct the missing view from the GAN outputs based on paired data across the views. Then, by optimizing the GAN and DAE jointly, our model enables the knowledge integration for domain mappings and view correspondences to effectively recover the missing view. Empirical results on benchmark datasets validate the VIGAN approach by comparing against the state of the art. The evaluation of VIGAN in a genetic study of substance use disorders further proves the effectiveness and usability of this approach in life science.

  15. Management of a congenitally missing maxillary central incisor. A case study.

    PubMed

    Tichler, Howard M; Abraham, Jenny E

    2007-03-01

    When a maxillary lateral incisor is missing, often the treatment options can be clearly defined, that is, substitute an adjacent tooth for the missing one; open the space for an implant, a bonded bridge or fixed bridge. When a maxillary central incisor is missing and the space for the tooth is absent, the treatment choices become complicated, especially in a growing child. There must be multi-disciplinary coordination among the restorative dentist, the oral surgeon or periodontist, and the orthodontist to obtain the optimum result. At the initiation of treatment, this information must be relayed and the treatment plan agreed upon by the patient or the parents of the patient.

  16. Meaning of missing values in eyewitness recall and accident records.

    PubMed

    Uttl, Bob; Kisinger, Kelly

    2010-09-02

    Eyewitness recalls and accident records frequently do not mention the conditions and behaviors of interest to researchers and lead to missing values and to uncertainty about the prevalence of these conditions and behaviors surrounding accidents. Missing values may occur because eyewitnesses report the presence but not the absence of obvious clues/accident features. We examined this possibility. Participants watched car accident videos and were asked to recall as much information as they could remember about each accident. The results showed that eyewitnesses were far more likely to report the presence of present obvious clues than the absence of absent obvious clues even though they were aware of their absence. One of the principal mechanisms causing missing values may be eyewitnesses' tendency to not report the absence of obvious features. We discuss the implications of our findings for both retrospective and prospective analyses of accident records, and illustrate the consequences of adopting inappropriate assumptions about the meaning of missing values using the Avaluator Avalanche Accident Prevention Card.

  17. Modeling missing data in knowledge space theory.

    PubMed

    de Chiusole, Debora; Stefanutti, Luca; Anselmi, Pasquale; Robusto, Egidio

    2015-12-01

    Missing data are a well known issue in statistical inference, because some responses may be missing, even when data are collected carefully. The problem that arises in these cases is how to deal with missing data. In this article, the missingness is analyzed in knowledge space theory, and in particular when the basic local independence model (BLIM) is applied to the data. Two extensions of the BLIM to missing data are proposed: The former, called ignorable missing BLIM (IMBLIM), assumes that missing data are missing completely at random; the latter, called missing BLIM (MissBLIM), introduces specific dependencies of the missing data on the knowledge states, thus assuming that the missing data are missing not at random. The IMBLIM and the MissBLIM modeled the missingness in a satisfactory way, in both a simulation study and an empirical application, depending on the process that generates the missingness: If the missing data-generating process is missing completely at random, then either IMBLIM or MissBLIM provides adequate fit to the data. However, if the pattern of missingness is functionally dependent upon unobservable features of the data (e.g., missing answers are more likely to be wrong), then only a correctly specified model of the missingness distribution provides an adequate fit to the data. (c) 2015 APA, all rights reserved.

  18. UDU(T) covariance factorization for Kalman filtering

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.; Bierman, G. J.

    1980-01-01

    There has been strong motivation to produce numerically stable formulations of the Kalman filter algorithms because it has long been known that the original discrete-time Kalman formulas are numerically unreliable. Numerical instability can be avoided by propagating certain factors of the estimate error covariance matrix rather than the covariance matrix itself. This paper documents filter algorithms that correspond to the covariance factorization P = UDU(T), where U is a unit upper triangular matrix and D is diagonal. Emphasis is on computational efficiency and numerical stability, since these properties are of key importance in real-time filter applications. The history of square-root and U-D covariance filters is reviewed. Simple examples are given to illustrate the numerical inadequacy of the Kalman covariance filter algorithms; these examples show how factorization techniques can give improved computational reliability.
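    The factorization P = UDU(T) that the paper builds its filter algorithms on can be computed directly for a symmetric positive-definite matrix. The sketch below follows the standard Bierman/Thornton-style recursion; the function name and test matrix are illustrative:

    ```python
    import numpy as np

    def udu_factorize(P):
        """Factor a symmetric positive-definite P as P = U D U^T, with
        U unit upper triangular and D diagonal (returned as a vector d)."""
        P = P.astype(float).copy()
        n = P.shape[0]
        U = np.eye(n)
        d = np.zeros(n)
        for j in range(n - 1, -1, -1):
            d[j] = P[j, j]
            for i in range(j):
                U[i, j] = P[i, j] / d[j]
            # Downdate the leading j x j block (upper triangle only).
            for i in range(j):
                for k in range(i + 1):
                    P[k, i] -= U[k, j] * U[i, j] * d[j]
        return U, d

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4))
    P = A @ A.T + 4 * np.eye(4)     # symmetric positive definite test matrix
    U, d = udu_factorize(P)
    ```

    Propagating U and d instead of P avoids the loss of symmetry and positive definiteness that destabilizes the conventional Kalman covariance update, since P reconstructed as U @ diag(d) @ U.T is symmetric and (for d > 0) positive definite by construction.
    
    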

  19. Accounting for one-channel depletion improves missing value imputation in 2-dye microarray data.

    PubMed

    Ritz, Cecilia; Edén, Patrik

    2008-01-19

    For 2-dye microarray platforms, some missing values may arise from an un-measurably low RNA expression in one channel only. Information of such "one-channel depletion" is so far not included in algorithms for imputation of missing values. Calculating the mean deviation between imputed values and duplicate controls in five datasets, we show that KNN-based imputation gives a systematic bias of the imputed expression values of one-channel depleted spots. Evaluating the correction of this bias by cross-validation showed that the mean square deviation between imputed values and duplicates were reduced up to 51%, depending on dataset. By including more information in the imputation step, we more accurately estimate missing expression values.

  20. Accounting for informatively missing data in logistic regression by means of reassessment sampling.

    PubMed

    Lin, Ji; Lyles, Robert H

    2015-05-20

    We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.
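    The joint-likelihood construction described above can be sketched for a simple case: a logistic outcome model, non-ignorable missingness in the outcome (response probability depends on y itself), and a reassessment wave that recovers y for a random fraction of initial nonrespondents. The data-generating values, response model, and optimizer choice are illustrative assumptions, not the authors' programs:

    ```python
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    n = 2000
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))

    # Simulate outcome and a non-ignorable first-wave response mechanism.
    b0_t, b1_t, g0_t, g1_t = -0.5, 1.0, 0.5, 1.0
    x = rng.normal(size=n)
    y = (rng.random(n) < sig(b0_t + b1_t * x)).astype(int)
    r1 = rng.random(n) < sig(g0_t + g1_t * y)      # wave-1 response indicator
    reass = (~r1) & (rng.random(n) < 0.3)          # reassessment subsample
    mis = (~r1) & (~reass)                         # y never observed

    def nll(theta):
        b0, b1, g0, g1 = theta
        py1 = sig(b0 + b1 * x)           # P(y = 1 | x)
        q1, q0 = sig(g0 + g1), sig(g0)   # P(respond | y = 1), P(respond | y = 0)
        eps = 1e-12
        # Wave-1 respondents contribute P(y|x) * P(respond|y).
        l_obs = np.where(y == 1, py1 * q1, (1 - py1) * q0)
        # Reassessed nonrespondents: y known, contribute P(y|x)*P(nonrespond|y).
        l_re = np.where(y == 1, py1 * (1 - q1), (1 - py1) * (1 - q0))
        # Still-missing units: marginalize the likelihood over y.
        l_mis = py1 * (1 - q1) + (1 - py1) * (1 - q0)
        return -(np.log(l_obs[r1] + eps).sum()
                 + np.log(l_re[reass] + eps).sum()
                 + np.log(l_mis[mis] + eps).sum())

    res = minimize(nll, np.zeros(4), method="L-BFGS-B")
    b1_hat = res.x[1]    # estimated exposure effect on the logit scale
    ```

    Without the reassessment subsample the non-ignorable model is only weakly identified, which is exactly the motivation for the second wave of sampling.
    
    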

  1. Low-dimensional Representation of Error Covariance

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan

    2000-01-01

    Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.

  2. View of MISSE PEC taken during STS-118/Expedition 15 Joint Operations

    NASA Image and Video Library

    2007-08-13

    ISS015-E-22410 (13 Aug. 2007) --- Backdropped by a blue and white Earth, a Materials International Space Station Experiment (MISSE) on the exterior of the station is featured in this image photographed by a crewmember during the STS-118 mission's second planned session of extravehicular activity (EVA). MISSE collects information on how different materials weather in the environment of space.

  3. Covariance hypotheses for LANDSAT data

    NASA Technical Reports Server (NTRS)

    Decell, H. P.; Peters, C.

    1983-01-01

    Two covariance hypotheses are considered for LANDSAT data acquired by sampling fields, one an autoregressive covariance structure and the other the hypothesis of exchangeability. A minimum entropy approximation of the first structure by the second is derived and shown to have desirable properties for incorporation into a mixture density estimation procedure. Results of a rough test of the exchangeability hypothesis are presented.

  4. Why near-miss events can decrease an individual's protective response to hurricanes.

    PubMed

    Dillon, Robin L; Tinsley, Catherine H; Cronin, Matthew

    2011-03-01

    Prior research shows that when people perceive the risk of some hazardous event to be low, they are unlikely to engage in mitigation activities for the potential hazard. We believe one factor that can inappropriately (from a normative perspective) lower people's perception of the risk of a hazard is information about prior near-miss events. A near-miss occurs when an event (such as a hurricane), which had some nontrivial probability of ending in disaster (loss of life, property damage), does not because good fortune intervenes. People appear to mistake such good fortune for an indicator of resiliency. In our first study, people with near-miss information were less likely to purchase flood insurance, and this was shown for both participants from the general population and individuals with specific interests in risk and natural disasters. In our second study, we consider a different mitigation decision, that is, to evacuate from a hurricane, and vary the level of statistical probability of hurricane damage. We still found a strong effect for near-miss information. Our research thus shows how people who have experienced a similar situation but escaped damage because of chance will make decisions consistent with a perception that the situation is less risky than those without the past experience. We end by discussing the implications for risk communication. © 2010 Society for Risk Analysis.

  5. 20 CFR 364.4 - Placement of missing children posters in Board field offices.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 20 Employees' Benefits 1 2010-04-01 2010-04-01 false Placement of missing children posters in... CHILDREN § 364.4 Placement of missing children posters in Board field offices. (a) Poster content. The... information about that child, which may include a photograph of the child, that will appear on the poster. The...

  6. 20 CFR 364.4 - Placement of missing children posters in Board field offices.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 20 Employees' Benefits 1 2012-04-01 2012-04-01 false Placement of missing children posters in... CHILDREN § 364.4 Placement of missing children posters in Board field offices. (a) Poster content. The... information about that child, which may include a photograph of the child, that will appear on the poster. The...

  7. 20 CFR 364.4 - Placement of missing children posters in Board field offices.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 20 Employees' Benefits 1 2011-04-01 2011-04-01 false Placement of missing children posters in... CHILDREN § 364.4 Placement of missing children posters in Board field offices. (a) Poster content. The... information about that child, which may include a photograph of the child, that will appear on the poster. The...

  8. On normality, ethnicity, and missing values in quantitative trait locus mapping

    PubMed Central

    Labbe, Aurélie; Wormald, Hanna

    2005-01-01

    Background This paper deals with the detection of significant linkage for quantitative traits using a variance components approach. Microsatellite markers were obtained for the Genetic Analysis Workshop 14 Collaborative Study on the Genetics of Alcoholism data. Ethnic heterogeneity, highly skewed quantitative measures, and a high rate of missing values are all present in this dataset and well known to impact upon linkage analysis. This makes it a good candidate for investigation. Results As expected, we observed a number of changes in LOD scores, especially for chromosomes 1, 7, and 18, along with the three factors studied. A dramatic example of such changes can be found in chromosome 7. Highly significant linkage to one of the quantitative traits became insignificant when a proper normalizing transformation of the trait was used and when analysis was carried out on an ethnically homogeneous subset of the original pedigrees. Conclusion In agreement with existing literature, transforming a trait to ensure normality using a Box-Cox transformation is highly recommended in order to avoid false-positive linkages. Furthermore, pedigrees should be sorted by ethnic groups and analyses should be carried out separately. Finally, one should be aware that the inclusion of covariates with a high rate of missing values reduces considerably the number of subjects included in the model. In such a case, the loss in power may be large. Imputation methods are then recommended. PMID:16451664

  9. Matched samples logistic regression in case-control studies with missing values: when to break the matches.

    PubMed

    Hansson, Lisbeth; Khamis, Harry J

    2008-12-01

    Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.

  10. The MISSE-9 Polymers and Composites Experiment Being Flown on the MISSE-Flight Facility

    NASA Technical Reports Server (NTRS)

    De Groh, Kim K.; Banks, Bruce A.

    2017-01-01

    Materials on the exterior of spacecraft in low Earth orbit (LEO) are subject to extremely harsh environmental conditions, including various forms of radiation (cosmic rays, ultraviolet, x-ray, and charged particle radiation), micrometeoroids and orbital debris, temperature extremes, thermal cycling, and atomic oxygen (AO). These environmental exposures can result in erosion, embrittlement and optical property degradation of susceptible materials, threatening spacecraft performance and durability. To increase our understanding of space environmental effects such as AO erosion and radiation induced embrittlement of spacecraft materials, NASA Glenn has developed a series of experiments flown as part of the Materials International Space Station Experiment (MISSE) missions on the exterior of the International Space Station (ISS). These experiments have provided critical LEO space environment durability data such as AO erosion yield values for many materials and mechanical properties changes after long term space exposure. In continuing these studies, a new Glenn experiment has been proposed, and accepted, for flight on the new MISSE-Flight Facility (MISSE-FF). This experiment is called the Polymers and Composites Experiment and it will be flown as part of the MISSE-9 mission, the inaugural mission of MISSE-FF. Figure 1 provides an artist rendition of MISSE-FF ISS external platform. The MISSE-FF is manifested for launch on SpaceX-13.

  11. Multiple imputation of missing fMRI data in whole brain analysis

    PubMed Central

    Vaden, Kenneth I.; Gebregziabher, Mulugeta; Kuchinsky, Stefanie E.; Eckert, Mark A.

    2012-01-01

    Whole brain fMRI analyses rarely include the entire brain because of missing data that result from data acquisition limits and susceptibility artifact, in particular. This missing data problem is typically addressed by omitting voxels from analysis, which may exclude brain regions that are of theoretical interest and increase the potential for Type II error at cortical boundaries or Type I error when spatial thresholds are used to establish significance. Imputation could significantly expand statistical map coverage, increase power, and enhance interpretations of fMRI results. We examined multiple imputation for group level analyses of missing fMRI data using methods that leverage the spatial information in fMRI datasets for both real and simulated data. Available case analysis, neighbor replacement, and regression based imputation approaches were compared in a general linear model framework to determine the extent to which these methods quantitatively (effect size) and qualitatively (spatial coverage) increased the sensitivity of group analyses. In both real and simulated data analysis, multiple imputation provided 1) variance that was most similar to estimates for voxels with no missing data, 2) fewer false positive errors in comparison to mean replacement, and 3) fewer false negative errors in comparison to available case analysis. Compared to the standard analysis approach of omitting voxels with missing data, imputation methods increased brain coverage in this study by 35% (from 33,323 to 45,071 voxels). In addition, multiple imputation increased the size of significant clusters by 58% and number of significant clusters across statistical thresholds, compared to the standard voxel omission approach. While neighbor replacement produced similar results, we recommend multiple imputation because it uses an informed sampling distribution to deal with missing data across subjects that can include neighbor values and other predictors. Multiple imputation is therefore a promising approach for whole brain fMRI analyses affected by missing data.

  12. Determinants of Maternal Near-Miss in Morocco: Too Late, Too Far, Too Sloppy?

    PubMed Central

    Assarag, Bouchra; Dujardin, Bruno; Delamou, Alexandre; Meski, Fatima-Zahra; De Brouwere, Vincent

    2015-01-01

    Background In Morocco, there is little information on the circumstances surrounding maternal near misses. This study aimed to determine the incidence, characteristics, and determinants of maternal near misses in Morocco. Method A prospective case-control study was conducted at 3 referral maternity hospitals in the Marrakech region of Morocco between February and July 2012. Near-miss cases included severe hemorrhage, hypertensive disorders, and prolonged obstructed labor. Three unmatched controls were selected for each near-miss case. Three categories of risk factors (sociodemographics, reproductive history, and delays), as well as perinatal outcomes, were assessed, and bivariate and multivariate analyses of the determinants were performed. A sample of 30 near misses and 30 non-near misses was interviewed. Results The incidence of near misses was 12‰ of births. Hypertensive disorders during pregnancy (45%) and severe hemorrhage (39%) were the most frequent direct causes of near miss. The main risk factors were illiteracy [OR = 2.35; 95% CI: (1.07–5.15)], lack of antenatal care [OR = 3.97; 95% CI: (1.42–11.09)], complications during pregnancy [OR = 2.81; 95% CI:(1.26–6.29)], and having experienced a first phase delay [OR = 8.71; 95% CI: (3.97–19.12)] and a first phase of third delay [OR = 4.03; 95% CI: (1.75–9.25)]. The main reasons for the first delay were lack of a family authority figure who could make a decision, lack of sufficient financial resources, lack of a vehicle, and fear of health facilities. The majority of near misses demonstrated a third delay with many referrals. The women’s perceptions of the quality of their care highlighted the importance of information, good communication, and attitude. Conclusion Women and newborns with serious obstetric complications have a greater chance of successful outcomes if they are immediately directed to a functioning referral hospital and if the providers are responsive. PMID:25612095

  13. Defining habitat covariates in camera-trap based occupancy studies

    PubMed Central

    Niedballa, Jürgen; Sollmann, Rahel; Mohamed, Azlan bin; Bender, Johannes; Wilting, Andreas

    2015-01-01

    In species-habitat association studies, both the type and spatial scale of habitat covariates need to match the ecology of the focal species. We assessed the potential of high-resolution satellite imagery for generating habitat covariates using camera-trapping data from Sabah, Malaysian Borneo, within an occupancy framework. We tested the predictive power of covariates generated from satellite imagery at different resolutions and extents (focal patch sizes, 10–500 m around sample points) on estimates of occupancy patterns of six small to medium sized mammal species/species groups. High-resolution land cover information had considerably more model support for small, patchily distributed habitat features, whereas it had no advantage for large, homogeneous habitat features. A comparison of different focal patch sizes including remote sensing data and an in-situ measure showed that patches with a 50-m radius had most support for the target species. Thus, high-resolution satellite imagery proved to be particularly useful in heterogeneous landscapes, and can be used as a surrogate for certain in-situ measures, reducing field effort in logistically challenging environments. Additionally, remote sensed data provide more flexibility in defining appropriate spatial scales, which we show to impact estimates of wildlife-habitat associations. PMID:26596779

  14. Analyzing time-ordered event data with missed observations.

    PubMed

    Dokter, Adriaan M; van Loon, E Emiel; Fokkema, Wimke; Lameris, Thomas K; Nolet, Bart A; van der Jeugd, Henk P

    2017-09-01

    A common problem with observational datasets is that not all events of interest may be detected. For example, observing animals in the wild can be difficult when animals move, hide, or cannot be closely approached. We consider time series of events recorded in conditions where events are occasionally missed by observers or observational devices. These time series are not restricted to behavioral protocols, but can be any cyclic or recurring process where discrete outcomes are observed. Undetected events cause biased inferences on the process of interest, and statistical analyses are needed that can identify and correct the compromised detection processes. Missed observations in time series lead to observed time intervals between events at multiples of the true inter-event time, which conveys information on their detection probability. We derive the theoretical probability density function for observed intervals between events that includes a probability of missed detection. Methodology and software tools are provided for analysis of event data with potential observation bias and its removal. The methodology was applied to simulation data and a case study of defecation rate estimation in geese, which is commonly used to estimate their digestive throughput and energetic uptake, or to calculate goose usage of a feeding site from dropping density. Simulations indicate that at a moderate chance to miss arrival events (p = 0.3), uncorrected arrival intervals were biased upward by up to a factor 3, while parameter values corrected for missed observations were within 1% of their true simulated value. A field case study shows that not accounting for missed observations leads to substantial underestimates of the true defecation rate in geese, and spurious rate differences between sites, which are introduced by differences in observational conditions. These results show that the derived methodology can be used to effectively remove observational biases in time-ordered event data.
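
    The interval-multiplication idea in the abstract is easy to check numerically. Below is a minimal simulation sketch (not the authors' software): if each event is missed independently with probability p, the observed interval between consecutive detections is k times the true interval with probability p^(k-1)(1-p), so the naive mean interval is inflated by 1/(1-p) and can be corrected once p is known. The function names are illustrative; in the paper the detection probability is itself estimated by fitting the derived density, whereas here it is assumed known for brevity.

```python
import random

def simulate_observed_intervals(true_interval, n_events, p_miss, seed=0):
    """Generate inter-event intervals as seen by an observer who
    independently misses each event with probability p_miss."""
    rng = random.Random(seed)
    times = [i * true_interval for i in range(n_events)]
    seen = [t for t in times if rng.random() > p_miss]
    return [b - a for a, b in zip(seen, seen[1:])]

def corrected_interval(observed_intervals, p_miss):
    """Undo the missed-detection bias: E[observed] = tau / (1 - p_miss),
    hence tau = mean(observed) * (1 - p_miss)."""
    mean_obs = sum(observed_intervals) / len(observed_intervals)
    return mean_obs * (1 - p_miss)

intervals = simulate_observed_intervals(true_interval=4.0, n_events=20000, p_miss=0.3)
naive = sum(intervals) / len(intervals)       # biased upward, roughly 4 / 0.7
tau_hat = corrected_interval(intervals, 0.3)  # close to the true value of 4.0
```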

  15. Multi-task Gaussian process for imputing missing data in multi-trait and multi-environment trials.

    PubMed

    Hori, Tomoaki; Montcho, David; Agbangla, Clement; Ebana, Kaworu; Futakuchi, Koichi; Iwata, Hiroyoshi

    2016-11-01

    A method based on a multi-task Gaussian process using self-measuring similarity gave increased accuracy for imputing missing phenotypic data in multi-trait and multi-environment trials. Multi-environment trial (MET) data often encounter the problem of missing data. Accurate imputation of missing data makes subsequent analysis more effective and the results easier to understand. Moreover, accurate imputation may help to reduce the cost of phenotyping for thinned-out lines tested in METs. METs are generally performed for multiple traits that are correlated to each other. Correlation among traits can be useful information for imputation, but single-trait-based methods cannot utilize information shared by traits that are correlated. In this paper, we propose imputation methods based on a multi-task Gaussian process (MTGP) using self-measuring similarity kernels reflecting relationships among traits, genotypes, and environments. This framework allows us to use genetic correlation among multi-trait multi-environment data and also to combine MET data and marker genotype data. We compared the accuracy of three MTGP methods and iterative regularized PCA using rice MET data. Two scenarios for the generation of missing data at various missing rates were considered. The MTGP methods achieved better imputation accuracy than regularized PCA, especially at high missing rates. Under the 'uniform' scenario, in which missing data arise randomly, inclusion of marker genotype data in the imputation increased the imputation accuracy at high missing rates. Under the 'fiber' scenario, in which missing data arise in all traits for some combinations between genotypes and environments, the inclusion of marker genotype data decreased the imputation accuracy for most traits while increasing the accuracy in a few traits remarkably. The proposed methods will be useful for solving the missing data problem in MET data.

  16. Effects of growth rate, size, and light availability on tree survival across life stages: a demographic analysis accounting for missing values and small sample sizes.

    PubMed

    Moustakas, Aristides; Evans, Matthew R

    2015-02-28

    Plant survival is a key factor in forest dynamics and survival probabilities often vary across life stages. Studies specifically aimed at assessing tree survival are unusual and so data initially designed for other purposes often need to be used; such data are more likely to contain errors than data collected for this specific purpose. We investigate the survival rates of ten tree species in a dataset designed to monitor growth rates. As some individuals were not included in the census at some time points, we use capture-mark-recapture methods both to allow us to account for missing individuals and to estimate relocation probabilities. Growth rates, size, and light availability were included as covariates in the model predicting survival rates. The study demonstrates that, for most of the UK hardwood species examined, tree mortality is best described as constant between years, size-dependent at early life stages, and size-independent at later life stages. We have demonstrated that even with a twenty-year dataset it is possible to discern variability both between individuals and between species. Our work illustrates the potential utility of the method applied here for calculating plant population dynamics parameters in time-replicated datasets with small sample sizes and missing individuals, without any loss of sample size, and including explanatory covariates.

  17. Convex Banding of the Covariance Matrix.

    PubMed

    Bien, Jacob; Bunea, Florentina; Xiao, Luo

    2016-01-01

    We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
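
    The convex banding estimator itself is the solution of a convex optimization problem, but the classical hard-banding baseline it improves upon can be sketched in a few lines, which clarifies what "banding with a known variable ordering" means. This is an illustrative sketch only, not the paper's estimator:

```python
import numpy as np

def band_covariance(S, bandwidth):
    """Classical hard banding: zero every entry of the sample covariance S
    lying more than `bandwidth` off the diagonal. (Bien et al.'s convex
    banding estimator instead tapers S with a data-adaptive, sparsely
    banded Toeplitz matrix; this baseline only illustrates the notion
    of a banded estimate.)"""
    p = S.shape[0]
    mask = np.abs(np.subtract.outer(np.arange(p), np.arange(p))) <= bandwidth
    return np.where(mask, S, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # 200 samples of 8 ordered variables
S = np.cov(X, rowvar=False)
S_banded = band_covariance(S, bandwidth=2)
```

    Hard banding keeps the diagonal band untouched and discards everything else, which is why it is not adaptive: the bandwidth must be chosen externally, whereas the convex formulation recovers it from the data.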

  18. Form of the manifestly covariant Lagrangian

    NASA Astrophysics Data System (ADS)

    Johns, Oliver Davis

    1985-10-01

    The preferred form for the manifestly covariant Lagrangian function of a single, charged particle in a given electromagnetic field is the subject of some disagreement in the textbooks. Some authors use a ``homogeneous'' Lagrangian and others use a ``modified'' form in which the covariant Hamiltonian function is made to be nonzero. We argue in favor of the ``homogeneous'' form. We show that the covariant Lagrangian theories can be understood only if one is careful to distinguish quantities evaluated on the varied (in the sense of the calculus of variations) world lines from quantities evaluated on the unvaried world lines. By making this distinction, we are able to derive the Hamilton-Jacobi and Klein-Gordon equations from the ``homogeneous'' Lagrangian, even though the covariant Hamiltonian function is identically zero on all world lines. The derivation of the Klein-Gordon equation in particular gives Lagrangian theoretical support to the derivations found in standard quantum texts, and is also shown to be consistent with the Feynman path-integral method. We conclude that the ``homogeneous'' Lagrangian is a completely adequate basis for covariant Lagrangian theory both in classical and quantum mechanics. The article also explores the analogy with the Fermat theorem of optics, and illustrates a simple invariant notation for the Lagrangian and other four-vector equations.

  19. Convex Banding of the Covariance Matrix

    PubMed Central

    Bien, Jacob; Bunea, Florentina; Xiao, Luo

    2016-01-01

    We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189

  20. Missing data in trial‐based cost‐effectiveness analysis: An incomplete journey

    PubMed Central

    Gomes, Manuel; Carpenter, James R.

    2018-01-01

    Cost‐effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantive challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials. However, it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of and methods used to address missing data in recently published trial‐based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched. Fifty‐two eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost‐effectiveness data was 63% (interquartile range: 47%–81%). The most common approach for the primary analysis was to restrict analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some sort of sensitivity analyses, but only 2 (4%) considered possible departures from the missing‐at‐random assumption. Further improvements are needed to address missing data in cost‐effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing an appropriate method for the primary analysis that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing‐at‐random assumption. PMID:29573044

  1. Cross-population myelination covariance of human cerebral cortex.

    PubMed

    Ma, Zhiwei; Zhang, Nanyin

    2017-09-01

    Cross-population covariance of brain morphometric quantities provides a measure of interareal connectivity, as it is believed to be determined by the coordinated neurodevelopment of connected brain regions. Although useful, structural covariance analysis predominantly employed bulky morphological measures with mixed compartments, whereas studies of the structural covariance of any specific subdivisions such as myelin are rare. Characterizing myelination covariance is of interest, as it will reveal connectivity patterns determined by coordinated development of myeloarchitecture between brain regions. Using myelin content MRI maps from the Human Connectome Project, here we showed that the cortical myelination covariance was highly reproducible, and exhibited a brain organization similar to that previously revealed by other connectivity measures. Additionally, the myelination covariance network shared common topological features of human brain networks such as small-worldness. Furthermore, we found that the correlation between myelination covariance and resting-state functional connectivity (RSFC) was uniform within each resting-state network (RSN), but could considerably vary across RSNs. Interestingly, this myelination covariance-RSFC correlation was appreciably stronger in sensory and motor networks than cognitive and polymodal association networks, possibly due to their different circuitry structures. This study has established a new brain connectivity measure specifically related to axons, and this measure can be valuable to investigating coordinated myeloarchitecture development. Hum Brain Mapp 38:4730-4743, 2017. © 2017 Wiley Periodicals, Inc.

  2. Improving record linkage performance in the presence of missing linkage data.

    PubMed

    Ong, Toan C; Mannino, Michael V; Schilling, Lisa M; Kahn, Michael G

    2014-12-01

    Existing record linkage methods do not handle missing linking field values in an efficient and effective manner. The objective of this study is to investigate three novel methods for improving the accuracy and efficiency of record linkage when record linkage fields have missing values. By extending the Fellegi-Sunter scoring implementations available in the open-source Fine-grained Record Linkage (FRIL) software system we developed three novel methods to solve the missing data problem in record linkage, which we refer to as: Weight Redistribution, Distance Imputation, and Linkage Expansion. Weight Redistribution removes fields with missing data from the set of quasi-identifiers and redistributes the weight from the missing attribute based on relative proportions across the remaining available linkage fields. Distance Imputation imputes the distance between the missing data fields rather than imputing the missing data value. Linkage Expansion adds previously considered non-linkage fields to the linkage field set to compensate for the missing information in a linkage field. We tested the linkage methods using simulated data sets with varying field value corruption rates. The methods developed had sensitivity ranging from .895 to .992 and positive predictive values (PPV) ranging from .865 to 1 in data sets with low corruption rates. Increased corruption rates lead to decreased sensitivity for all methods. These new record linkage algorithms show promise in terms of accuracy and efficiency and may be valuable for combining large data sets at the patient level to support biomedical and clinical research. Copyright © 2014 Elsevier Inc. All rights reserved.
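
    The Weight Redistribution idea can be sketched in a few lines of toy code. This is an illustrative sketch, not FRIL's implementation: the field names and weights below are made up, and a real Fellegi-Sunter scorer would use per-field agreement and disagreement weights derived from m- and u-probabilities rather than a single agreement weight.

```python
def fs_score(rec_a, rec_b, weights):
    """Simplified match score with Weight Redistribution: fields missing
    in either record are dropped, and their weight is redistributed over
    the remaining fields in proportion to those fields' own weights, so
    the total attainable score is preserved."""
    available = [f for f in weights
                 if rec_a.get(f) is not None and rec_b.get(f) is not None]
    if not available:
        return 0.0
    total = sum(weights.values())
    avail_total = sum(weights[f] for f in available)
    score = 0.0
    for f in available:
        w = weights[f] * total / avail_total   # redistributed weight
        if rec_a[f] == rec_b[f]:
            score += w
    return score

weights = {"last_name": 5.0, "birth_date": 3.0, "zip": 2.0}
a = {"last_name": "smith", "birth_date": "1980-01-02", "zip": None}
b = {"last_name": "smith", "birth_date": "1980-01-02", "zip": "80045"}
score = fs_score(a, b, weights)  # last_name and birth_date carry zip's weight
```

    Because the missing field's weight is spread across the fields that are present, two records that agree on all observable fields still reach the full score of 10.0 rather than being penalized for the missing zip value.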

  3. Covariate Imbalance and Precision in Measuring Treatment Effects

    ERIC Educational Resources Information Center

    Liu, Xiaofeng Steven

    2011-01-01

    Covariate adjustment can increase the precision of estimates by removing unexplained variance from the error in randomized experiments, although chance covariate imbalance tends to counteract the improvement in precision. The author develops an easy measure to examine chance covariate imbalance in randomization by standardizing the average…

  4. Dimension from covariance matrices.

    PubMed

    Carroll, T L; Byers, J M

    2017-02-01

    We describe a method to estimate embedding dimension from a time series. This method includes an estimate of the probability that the dimension estimate is valid. Such validity estimates are not common in algorithms for calculating the properties of dynamical systems. The algorithm described here compares the eigenvalues of covariance matrices created from an embedded signal to the eigenvalues for a covariance matrix of a Gaussian random process with the same dimension and number of points. A statistical test gives the probability that the eigenvalues for the embedded signal did not come from the Gaussian random process.
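
    The core of the method, counting how many covariance eigenvalues of the embedded signal rise above what noise alone would produce, can be sketched as follows. This is a crude version of the idea only: the paper compares against the eigenvalue distribution of an equal-sized Gaussian random process and attaches a validity probability, whereas this sketch simply thresholds relative to the largest eigenvalue.

```python
import numpy as np

def embed(x, dim, delay=1):
    """Time-delay embedding of a 1-D signal into `dim` dimensions."""
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay: i * delay + n] for i in range(dim)])

def estimate_dimension(x, embed_dim=10, tol=1e-6):
    """Count covariance eigenvalues of the embedded signal that stand
    clearly above zero; the count estimates the embedding dimension."""
    E = embed(x, embed_dim)
    eigvals = np.linalg.eigvalsh(np.cov(E, rowvar=False))
    return int(np.sum(eigvals > tol * eigvals.max()))

t = np.linspace(0, 40 * np.pi, 4000)
dim = estimate_dimension(np.sin(t))  # a pure sinusoid embeds on an ellipse
```

    A pure sinusoid delay-embeds onto a two-dimensional ellipse, so exactly two eigenvalues are non-negligible regardless of the embedding dimension chosen.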

  5. Learning through Feature Prediction: An Initial Investigation into Teaching Categories to Children with Autism through Predicting Missing Features

    ERIC Educational Resources Information Center

    Sweller, Naomi

    2015-01-01

    Individuals with autism have difficulty generalising information from one situation to another, a process that requires the learning of categories and concepts. Category information may be learned through: (1) classifying items into categories, or (2) predicting missing features of category items. Predicting missing features has to this point been…

  6. Nuclear Forensics Analysis with Missing and Uncertain Data

    DOE PAGES

    Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent

    2015-10-05

    We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
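
    The resampling step at the heart of such an approach can be sketched in miniature. This is only a rough illustration of the "draw missing values from the empirical distribution, generate multiple completed copies" idea described in the abstract; the MCBDG method additionally propagates measurement uncertainty through an error distribution, which is omitted here.

```python
import random

def mc_complete(rows, n_copies=5, seed=0):
    """Fill each missing entry (None) by drawing from the empirical
    distribution of the observed values in the same column, producing
    several completed copies of the table for downstream training."""
    rng = random.Random(seed)
    cols = list(zip(*rows))
    pools = [[v for v in col if v is not None] for col in cols]
    copies = []
    for _ in range(n_copies):
        completed = [[v if v is not None else rng.choice(pools[j])
                      for j, v in enumerate(row)] for row in rows]
        copies.append(completed)
    return copies

table = [[1.0, None], [2.0, 5.0], [None, 6.0], [4.0, None]]
versions = mc_complete(table)
```

    Training a model on each completed copy and pooling the results is what turns a single uncertain database into an ensemble that reflects the imputation uncertainty.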

  7. Cyclists' Anger As Determinant of Near Misses Involving Different Road Users.

    PubMed

    Marín Puchades, Víctor; Prati, Gabriele; Rondinella, Gianni; De Angelis, Marco; Fassina, Filippo; Fraboni, Federico; Pietrantoni, Luca

    2017-01-01

    Road anger constitutes one of the determinant factors related to safety outcomes (e.g., accidents, near misses). Although cyclists are considered vulnerable road users due to their relatively high rate of fatalities in traffic, previous research has solely focused on car drivers, and no study has yet investigated the effect of anger on cyclists' safety outcomes. The present research aims to investigate, for the first time, the effects of cycling anger toward different types of road users on near misses involving such road users and near misses in general. Using a daily diary web-based questionnaire, we collected data about daily trips, bicycle use, near misses experienced, cyclists' anger and demographic information from 254 Spanish cyclists. Poisson regression was used to assess the association of cycling anger with near misses, which is a count variable. No relationship was found between general cycling anger and near-miss occurrence. Anger toward specific road users had different effects on the probability of near misses with different road users. Anger toward the interaction with car drivers increased the probability of near misses involving cyclists and pedestrians. Anger toward interaction with pedestrians was associated with higher probability of near misses with pedestrians. Anger toward cyclists exerted no effect on the probability of near misses with any road user (i.e., car drivers, cyclists or pedestrians), whereas anger toward the interactions with the police had a diminishing effect on the occurrence of near misses involving all types of road users. The present study demonstrated that the effect of road anger on safety outcomes among cyclists is different from that of motorists. Moreover, the target of anger played an important role in safety both for the cyclist and the specific road users. Possible explanations for these differences are based on the difference in status and power with motorists, as well as on the potential for displaced aggression.

  8. Condition Number Regularized Covariance Estimation.

    PubMed

    Won, Joong-Ho; Lim, Johan; Kim, Seung-Jean; Rajaratnam, Bala

    2013-06-01

    Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p, small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumptions on either the covariance matrix or its inverse are imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required.
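
    A simplified version of condition-number regularization can be sketched by truncating the sample eigenvalues into an interval whose endpoints differ by the target condition number. Note the hedge: the paper's contribution includes the maximum-likelihood choice of the truncation level u; the fixed choice u = lambda_max / kappa below is only an illustrative shortcut, not the ML solution.

```python
import numpy as np

def condreg(S, kappa_max):
    """Simplified condition-number regularization: clip the sample
    eigenvalues into [u, kappa_max * u] and reassemble the matrix.
    Here u = lambda_max / kappa_max is a fixed illustrative choice;
    the ML approach of Won et al. selects u optimally."""
    eigvals, eigvecs = np.linalg.eigh(S)
    u = eigvals.max() / kappa_max
    clipped = np.clip(eigvals, u, kappa_max * u)
    return eigvecs @ np.diag(clipped) @ eigvecs.T

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 50))   # "large p, small n": S is rank-deficient
S = np.cov(X, rowvar=False)
S_reg = condreg(S, kappa_max=30.0)
cond = np.linalg.cond(S_reg)    # bounded by roughly kappa_max
```

    Because the zero eigenvalues of the rank-deficient sample covariance are raised to u, the regularized estimate is invertible and well-conditioned by construction, which is exactly the property the abstract emphasizes.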

  9. Differential Age-Related Changes in Structural Covariance Networks of Human Anterior and Posterior Hippocampus.

    PubMed

    Li, Xinwei; Li, Qiongling; Wang, Xuetong; Li, Deyu; Li, Shuyu

    2018-01-01

    The hippocampus plays an important role in memory function relying on information interaction between distributed brain areas. The hippocampus can be divided into the anterior and posterior sections with different structure and function along its long axis. The aim of this study is to investigate the effects of normal aging on the structural covariance of the anterior hippocampus (aHPC) and the posterior hippocampus (pHPC). In this study, 240 healthy subjects aged 18-89 years were selected and subdivided into young (18-23 years), middle-aged (30-58 years), and older (61-89 years) groups. The aHPC and pHPC were divided based on the location of the uncal apex in the MNI space. Then, the structural covariance networks were constructed by examining their covariance in gray matter volumes with other brain regions. Finally, the influence of age on the structural covariance of these hippocampal sections was explored. We found that the aHPC and pHPC had different structural covariance patterns, but both of them were associated with the medial temporal lobe and insula. Moreover, both increased and decreased covariances were found with the aHPC but only increased covariance was found with the pHPC with age (p < 0.05, family-wise error corrected). These decreased connections occurred within the default mode network, while the increased connectivity mainly occurred in other memory systems that differ from the hippocampus. This study reveals different age-related influences on the structural networks of the aHPC and pHPC, providing an essential insight into the mechanisms of the hippocampus in normal aging.

  10. Parametric Covariance Model for Horizon-Based Optical Navigation

    NASA Technical Reports Server (NTRS)

    Hikes, Jacob; Liounis, Andrew J.; Christian, John A.

    2016-01-01

    This Note presents an entirely parametric version of the covariance for horizon-based optical navigation measurements. The covariance can be written as a function of only the spacecraft position, two sensor design parameters, the illumination direction, the size of the observed planet, the size of the lit arc to be used, and the total number of observed horizon points. As a result, one may now more clearly understand the sensitivity of horizon-based optical navigation performance as a function of these key design parameters, which is insight that was obscured in previous (and nonparametric) versions of the covariance. Finally, the new parametric covariance is shown to agree with both the nonparametric analytic covariance and results from a Monte Carlo analysis.

  11. Meaning of Missing Values in Eyewitness Recall and Accident Records

    PubMed Central

    Uttl, Bob; Kisinger, Kelly

    2010-01-01

    Background Eyewitness recalls and accident records frequently do not mention the conditions and behaviors of interest to researchers and lead to missing values and to uncertainty about the prevalence of these conditions and behaviors surrounding accidents. Missing values may occur because eyewitnesses report the presence but not the absence of obvious clues/accident features. We examined this possibility. Methodology/Principal Findings Participants watched car accident videos and were asked to recall as much information as they could remember about each accident. The results showed that eyewitnesses were far more likely to report the presence of present obvious clues than the absence of absent obvious clues even though they were aware of their absence. Conclusions One of the principal mechanisms causing missing values may be eyewitnesses' tendency to not report the absence of obvious features. We discuss the implications of our findings for both retrospective and prospective analyses of accident records, and illustrate the consequences of adopting inappropriate assumptions about the meaning of missing values using the Avaluator Avalanche Accident Prevention Card. PMID:20824054

  12. A Note on the Use of Missing Auxiliary Variables in Full Information Maximum Likelihood-Based Structural Equation Models

    ERIC Educational Resources Information Center

    Enders, Craig K.

    2008-01-01

    Recent missing data studies have argued in favor of an "inclusive analytic strategy" that incorporates auxiliary variables into the estimation routine, and Graham (2003) outlined methods for incorporating auxiliary variables into structural equation analyses. In practice, the auxiliary variables often have missing values, so it is reasonable to…

  13. Missing Data and Institutional Research

    ERIC Educational Resources Information Center

    Croninger, Robert G.; Douglas, Karen M.

    2005-01-01

    Many do not consider the effect that missing data have on their survey results nor do they know how to handle missing data. This chapter offers strategies for handling item-missing data and provides a practical example of how these strategies may affect results. The chapter concludes with recommendations for preventing and dealing with missing…

  14. A Primer for Handling Missing Values in the Analysis of Education and Training Data

    ERIC Educational Resources Information Center

    Gemici, Sinan; Bednarz, Alice; Lim, Patrick

    2012-01-01

    Quantitative research in vocational education and training (VET) is routinely affected by missing or incomplete information. However, the handling of missing data in published VET research is often sub-optimal, leading to a real risk of generating results that can range from being slightly biased to being plain wrong. Given that the growing…

  15. Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study

    PubMed Central

    Shah, Anoop D.; Bartlett, Jonathan W.; Carpenter, James; Nicholas, Owen; Hemingway, Harry

    2014-01-01

    Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The “true” imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001–2010) with complete data on all covariates. Variables were artificially made “missing at random,” and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data. PMID:24589914
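
    The chained-equations loop that both MICE variants share can be sketched in miniature. The study used R's mice package with parametric and random-forest imputation models; the self-contained sketch below uses an ordinary least-squares fit for each partially observed column, and replacing that per-column fit with a random forest is what yields the random-forest variant compared in the paper. It is a sketch only: a real MICE run adds noise to the imputed draws and produces multiple imputed datasets, not a single completed matrix.

```python
import numpy as np

def mice_linear(X, n_iter=10):
    """Miniature chained-equations imputation: initialize missing cells
    with column means, then repeatedly regress each partially observed
    column on the others and refresh its imputations."""
    X = X.astype(float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])   # mean initialization
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            m = miss[:, j]
            if not m.any():
                continue
            others = np.delete(X, j, axis=1)
            A = np.column_stack([np.ones(len(X)), others])
            beta, *_ = np.linalg.lstsq(A[~m], X[~m, j], rcond=None)
            X[m, j] = A[m] @ beta                      # refresh imputations
    return X

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
data = np.column_stack([z, 2 * z + 0.1 * rng.normal(size=(200, 1))])
data[rng.random(200) < 0.2, 1] = np.nan               # ~20% missing in column 2
completed = mice_linear(data)
```

    With a strong linear relationship between the columns, the OLS model recovers the missing values well; when the true relationship is nonlinear, as in the paper's second simulation, a random forest per-column model is the natural drop-in replacement.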

  16. UV imaging reveals facial areas that are prone to skin cancer are disproportionately missed during sunscreen application.

    PubMed

    Pratt, Harry; Hassanin, Kareem; Troughton, Lee D; Czanner, Gabriela; Zheng, Yalin; McCormick, Austin G; Hamill, Kevin J

    2017-01-01

    Application of sunscreen is a widely used mechanism for protecting skin from the harmful effects of UV light. However, protection can only be achieved through effective application, and areas that are routinely missed are likely at increased risk of UV damage. Here we sought to determine if specific areas of the face are missed during routine sunscreen application, and whether provision of public health information is sufficient to improve coverage. To investigate this, 57 participants were imaged with a UV sensitive camera before and after sunscreen application: first visit; minimal pre-instruction, second visit; provided with a public health information statement. Images were scored using a custom automated image analysis process designed to identify areas of high UV reflectance, i.e. missed during sunscreen application, and analysed for 5% significance. Analyses revealed eyelid and periorbital regions to be disproportionately missed during routine sunscreen application (median 14% missed in eyelid region vs 7% in rest of face, p<0.01). Provision of health information caused a significant improvement in coverage to eyelid areas in general however, the medial canthal area was still frequently missed. These data reveal that a public health announcement-type intervention could be effective at improving coverage of high risk areas of the face, however high risk areas are likely to remain unprotected therefore other mechanisms of sun protection should be widely promoted such as UV blocking sunglasses.

  17. Tackling Missing Data in Community Health Studies Using Additive LS-SVM Classifier.

    PubMed

    Wang, Guanjin; Deng, Zhaohong; Choi, Kup-Sze

    2018-03-01

    Missing data is a common issue in community health and epidemiological studies. Direct removal of samples with missing data can lead to reduced sample size and information bias, which deteriorates the significance of the results. While data imputation methods are available to deal with missing data, they are limited in performance and could introduce noise into the dataset. Instead of data imputation, a novel method based on the additive least square support vector machine (LS-SVM) is proposed in this paper for predictive modeling when the input features of the model contain missing data. The method also determines simultaneously the influence of the features with missing values on the classification accuracy using the fast leave-one-out cross-validation strategy. The performance of the method is evaluated by applying it to predict the quality of life (QOL) of elderly people using health data collected in the community. The dataset involves demographics, socioeconomic status, health history, and the outcomes of health assessments of 444 community-dwelling elderly people, with 5% to 60% of data missing in some of the input features. The QOL is measured using a standard questionnaire of the World Health Organization. Results show that the proposed method outperforms four conventional methods for handling missing data: case deletion, feature deletion, mean imputation, and K-nearest-neighbor imputation, with the average QOL prediction accuracy reaching 0.7418. It is potentially a promising technique for tackling missing data in community health research and other applications.
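
    Two of the baseline strategies compared above, mean and K-nearest-neighbor imputation, can be sketched with scikit-learn as follows; the additive LS-SVM itself has no standard library implementation, so it is not shown. The toy matrix is invented.

```python
# Sketch: mean vs. K-nearest-neighbor imputation of a single missing value.
import numpy as np
from sklearn.impute import SimpleImputer, KNNImputer

X = np.array([[1.0, 2.0],
              [3.0, np.nan],   # second feature missing for this sample
              [5.0, 6.0],
              [7.0, 8.0]])

mean_filled = SimpleImputer(strategy="mean").fit_transform(X)
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)

print(mean_filled[1, 1])  # mean of the observed values in that column
print(knn_filled[1, 1])   # average over the 2 nearest rows (by observed features)
```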

  18. Toward a hybrid brain-computer interface based on repetitive visual stimuli with missing events.

    PubMed

    Wu, Yingying; Li, Man; Wang, Jing

    2016-07-26

    Steady-state visually evoked potentials (SSVEPs) can be elicited by repetitive stimuli and extracted in the frequency domain with satisfactory performance. However, the temporal information of such stimuli is often ignored. In this study, we utilized repetitive visual stimuli with missing events to present a novel hybrid BCI paradigm based on SSVEP and the omitted stimulus potential (OSP). Four discs flickering from black to white with missing flickers served as visual stimulators to simultaneously elicit the subject's SSVEPs and OSPs. Key parameters in the new paradigm, including flicker frequency, optimal electrodes, missing flicker duration, and intervals of missing events, were qualitatively discussed with offline data. Two omitted flicker patterns, missing black disc and missing white disc, were proposed and compared. Averaging times were optimized with the Information Transfer Rate (ITR) in online experiments, where SSVEPs and OSPs were identified using Canonical Correlation Analysis in the frequency domain and Support Vector Machine (SVM)-Bayes fusion in the time domain, respectively. The online accuracy and ITR (mean ± standard deviation) over nine healthy subjects were 79.29 ± 18.14 % and 19.45 ± 11.99 bits/min with the missing black disc pattern, and 86.82 ± 12.91 % and 24.06 ± 10.95 bits/min with the missing white disc pattern, respectively. The proposed BCI paradigm demonstrated, for the first time, that SSVEPs and OSPs can be simultaneously elicited by a single visual stimulus pattern and recognized in real time with satisfactory performance. Besides frequency features such as the SSVEP elicited by repetitive stimuli, we found a new time-domain feature (the OSP) with which to design a novel hybrid BCI paradigm by adding missing events to repetitive stimuli.

  19. Knowledge of consequences of missing teeth in patients attending prosthetic clinic in U.C.H. Ibadan.

    PubMed

    Dosumu, O O; Ogunrinde, J T; Bamigboye, S A

    2014-06-01

    Various causes of tooth loss, such as caries, trauma, periodontal diseases, and cancer, have been documented in the literature. In addition, factors that can modify these causes, such as level of education, age, and sex, have been studied. There is, however, a paucity of information on whether patients with missing teeth are aware of the side effects of tooth loss on them or on the remaining teeth. This study investigated the knowledge of the consequences of missing teeth among partially edentulous patients in a teaching hospital. Self-administered questionnaires were distributed to the patients to collect information relating to demography, cause and duration of tooth loss, awareness of the consequences of tooth loss, and their sources of information. Four clinical conditions, including supra-eruption, mastication, teeth drifting, and facial collapse, were used to assess the level of awareness of the consequences of missing teeth. Two hundred and three participants were included in the study. Their mean age was 45.5±1.8 years. Knowledge of the consequences of missing teeth did not differ significantly by sex or level of education (p > 0.05). Dentists constituted the largest source of information to these patients (25.6%) while the media constituted the least (0.5%). The result of this study showed poor knowledge of the consequences of missing teeth among partially edentulous patients, and the media that should be of assistance were equally unaware, signifying an urgent need for public awareness on this subject.

  20. Selecting a separable parametric spatiotemporal covariance structure for longitudinal imaging data.

    PubMed

    George, Brandon; Aban, Inmaculada

    2015-01-15

    Longitudinal imaging studies allow great insight into how the structure and function of a subject's internal anatomy changes over time. Unfortunately, the analysis of longitudinal imaging data is complicated by inherent spatial and temporal correlation: the temporal from the repeated measures and the spatial from the outcomes of interest being observed at multiple points in a patient's body. We propose the use of a linear model with a separable parametric spatiotemporal error structure for the analysis of repeated imaging data. The model makes use of spatial (exponential, spherical, and Matérn) and temporal (compound symmetric, autoregressive-1, Toeplitz, and unstructured) parametric correlation functions. A simulation study, inspired by a longitudinal cardiac imaging study on mitral regurgitation patients, compared different information criteria for selecting a particular separable parametric spatiotemporal correlation structure as well as the effects on types I and II error rates for inference on fixed effects when the specified model is incorrect. Information criteria were found to be highly accurate at choosing between separable parametric spatiotemporal correlation structures. Misspecification of the covariance structure was found to have the ability to inflate the type I error or have an overly conservative test size, which corresponded to decreased power. An example with clinical data is given illustrating how the covariance structure procedure can be performed in practice, as well as how covariance structure choice can change inferences about fixed effects. Copyright © 2014 John Wiley & Sons, Ltd.
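
    A separable spatiotemporal covariance of the kind selected above can be sketched as the Kronecker product of a temporal AR(1) block and a spatial exponential block; grid sizes, range, and correlation parameters below are illustrative assumptions.

```python
# Sketch: separable spatiotemporal covariance = Kronecker(temporal, spatial).
import numpy as np

n_time, n_space = 4, 5
rho = 0.6                                  # AR(1) temporal correlation
phi = 10.0                                 # spatial range parameter
sigma2 = 2.0                               # overall variance

times = np.arange(n_time)
locs = np.linspace(0.0, 30.0, n_space)     # 1-D spatial coordinates

# Temporal AR(1): corr(t_i, t_j) = rho ** |i - j|
T = rho ** np.abs(times[:, None] - times[None, :])
# Spatial exponential: corr(s_i, s_j) = exp(-|s_i - s_j| / phi)
S = np.exp(-np.abs(locs[:, None] - locs[None, :]) / phi)

Sigma = sigma2 * np.kron(T, S)             # full (n_time*n_space)^2 covariance
# Separability keeps estimation cheap: only the AR(1) and exponential
# parameters are fit, rather than every entry of the full matrix.
print(np.linalg.eigvalsh(Sigma).min() > 0) # positive definite by construction
```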

  1. Selecting a Separable Parametric Spatiotemporal Covariance Structure for Longitudinal Imaging Data

    PubMed Central

    George, Brandon; Aban, Inmaculada

    2014-01-01

    Longitudinal imaging studies allow great insight into how the structure and function of a subject’s internal anatomy changes over time. Unfortunately, the analysis of longitudinal imaging data is complicated by inherent spatial and temporal correlation: the temporal from the repeated measures, and the spatial from the outcomes of interest being observed at multiple points in a patient’s body. We propose the use of a linear model with a separable parametric spatiotemporal error structure for the analysis of repeated imaging data. The model makes use of spatial (exponential, spherical, and Matérn) and temporal (compound symmetric, autoregressive-1, Toeplitz, and unstructured) parametric correlation functions. A simulation study, inspired by a longitudinal cardiac imaging study on mitral regurgitation patients, compared different information criteria for selecting a particular separable parametric spatiotemporal correlation structure as well as the effects on Type I and II error rates for inference on fixed effects when the specified model is incorrect. Information criteria were found to be highly accurate at choosing between separable parametric spatiotemporal correlation structures. Misspecification of the covariance structure was found to have the ability to inflate the Type I error or have an overly conservative test size, which corresponded to decreased power. An example with clinical data is given illustrating how the covariance structure procedure can be done in practice, as well as how covariance structure choice can change inferences about fixed effects. PMID:25293361

  2. Students’ Covariational Reasoning in Solving Integrals’ Problems

    NASA Astrophysics Data System (ADS)

    Harini, N. V.; Fuad, Y.; Ekawati, R.

    2018-01-01

    Covariational reasoning plays an important role in recognizing how quantities vary in learning calculus. This study investigates students’ covariational reasoning concerning two covarying quantities in integral problems. Six undergraduate students were chosen to solve problems that involved interpreting and representing how quantities change in tandem. Interviews were conducted to reveal the students’ reasoning while solving covariational problems. The results emphasize that the undergraduate students were able to construct the relation between dependent variables that change in tandem with the independent variable. However, students had difficulty forming images of continuously changing rates and could not accurately apply the concept of integrals. These findings suggest that the learning of calculus should place increased emphasis on coordinating images of two quantities changing in tandem, on the instantaneous rate of change, and on promoting conceptual knowledge of integration techniques.

  3. Are Low-order Covariance Estimates Useful in Error Analyses?

    NASA Astrophysics Data System (ADS)

    Baker, D. F.; Schimel, D.

    2005-12-01

    Atmospheric trace gas inversions, using modeled atmospheric transport to infer surface sources and sinks from measured concentrations, are most commonly done using least-squares techniques that return not only an estimate of the state (the surface fluxes) but also the covariance matrix describing the uncertainty in that estimate. Besides allowing one to place error bars around the estimate, the covariance matrix may be used in simulation studies to learn what uncertainties would be expected from various hypothetical observing strategies. This error analysis capability is routinely used in designing instrumentation, measurement campaigns, and satellite observing strategies. For example, Rayner et al. (2002) examined the ability of satellite-based column-integrated CO2 measurements to constrain monthly-average CO2 fluxes for about 100 emission regions using this approach. Exact solutions for both state vector and covariance matrix become computationally infeasible, however, when the surface fluxes are solved at finer resolution (e.g., daily in time, under 500 km in space). It is precisely at these finer scales, however, that one would hope to be able to estimate fluxes using high-density satellite measurements. Non-exact estimation methods such as variational data assimilation or the ensemble Kalman filter could be used, but they achieve their computational savings by obtaining an only approximate state estimate and a low-order approximation of the true covariance. One would like to be able to use this covariance matrix to do the same sort of error analyses as are done with the full-rank covariance, but is it correct to do so? Here we compare uncertainties and 'information content' derived from full-rank covariance matrices obtained from a direct, batch least-squares inversion to those from the incomplete-rank covariance matrices given by a variational data assimilation approach solved with a variable metric minimization technique (the Broyden-Fletcher-Goldfarb-Shanno method).
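
    The full-rank error analysis described above can be sketched for a toy linear inversion: the posterior covariance (H^T R^-1 H + B^-1)^-1 supplies the error bars, with H, R, and B as invented stand-ins for the linearized transport operator and the observation and prior error covariances.

```python
# Sketch: full-rank posterior covariance of a batch least-squares inversion.
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_flux = 30, 10
H = rng.normal(size=(n_obs, n_flux))       # linearized transport operator
R = 0.5 * np.eye(n_obs)                    # observation-error covariance
B = 4.0 * np.eye(n_flux)                   # prior (background) covariance

Rinv = np.linalg.inv(R)
Binv = np.linalg.inv(B)
post_cov = np.linalg.inv(H.T @ Rinv @ H + Binv)

# Posterior uncertainty never exceeds the prior: observations only add
# information. A low-rank scheme approximates post_cov instead of forming it.
print(np.sqrt(np.diag(post_cov)))          # 1-sigma error bars on each flux
```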

  4. Using Concurrent Cardiovascular Information to Augment Survival Time Data from Orthostatic Tilt Tests

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Fiedler, James; Lee, Stuart M. M.; Westby, Christian M.; Stenger, Michael B.; Platts, Steven H.

    2014-01-01

    Orthostatic intolerance (OI) is the propensity to develop symptoms of fainting during upright standing. OI is associated with changes in heart rate, blood pressure, and other measures of cardiac function. Problem: NASA astronauts have shown increased susceptibility to OI on return from space missions. Current methods for counteracting OI in astronauts include fluid loading and the use of compression garments. Multivariate trajectory spread is greater as OI increases. Pairwise comparisons at the same time point within subjects allow incorporation of pass/fail outcomes. Path length, convex hull area, and the determinant of the covariance matrix serve well as statistics to summarize this spread. Open issues include missing data, time series analysis (many more time points per OTT session are needed, and how should trend be treated?), and how to incorporate survival information.

  5. Causal evidence in risk and policy perceptions: Applying the covariation/mechanism framework.

    PubMed

    Baucum, Matt; John, Richard

    2018-05-01

    Today's information-rich society demands constant evaluation of cause-effect relationships; behaviors and attitudes ranging from medical choices to voting decisions to policy preferences typically entail some form of causal inference ("Will this policy reduce crime?", "Will this activity improve my health?"). Cause-effect relationships such as these can be thought of as depending on two qualitatively distinct forms of evidence: covariation-based evidence (e.g., "states with this policy have fewer homicides") or mechanism-based (e.g., "this policy will reduce crime by discouraging repeat offenses"). Some psychological work has examined how people process these two forms of causal evidence in instances of "everyday" causality (e.g., assessing why a car will not start), but it is not known how these two forms of evidence contribute to causal judgments in matters of public risk or policy. Three studies (n = 715) investigated whether judgments of risk and policy scenarios would be affected by covariation and mechanism evidence and whether the evidence types interacted with one another (as suggested by past studies). Results showed that causal judgments varied linearly with mechanism strength and logarithmically with covariation strength, and that the evidence types produced only additive effects (but no interaction). We discuss the results' implications for risk communication and policy information dissemination. Copyright © 2018 Elsevier B.V. All rights reserved.

  6. Gram-Schmidt algorithms for covariance propagation

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.; Bierman, G. J.

    1977-01-01

    This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root free factorization, P = UD(transpose of U), where U is unit upper triangular and D is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and coloured process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.
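
    A minimal sketch of the U-D factorization P = UD(transpose of U) underlying the propagation above, using a Bierman-style backward recursion on an arbitrary symmetric positive definite example matrix:

```python
# Sketch: square-root-free U-D factorization of an SPD covariance matrix.
import numpy as np

def ud_factor(P):
    """Factor SPD P as U @ diag(d) @ U.T, with U unit upper triangular."""
    P = P.astype(float).copy()
    n = P.shape[0]
    U = np.eye(n)
    d = np.zeros(n)
    for j in range(n - 1, -1, -1):
        d[j] = P[j, j]
        U[:j, j] = P[:j, j] / d[j]
        # Peel off the rank-1 contribution of column j from the leading block
        P[:j, :j] -= d[j] * np.outer(U[:j, j], U[:j, j])
    return U, d

A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 3.0],
              [1.0, 3.0, 6.0]])
U, d = ud_factor(A)
print(np.allclose(U @ np.diag(d) @ U.T, A))  # True: factors reproduce P
```

    Working with U and D avoids the square roots of a Cholesky factorization while retaining its numerical advantages over propagating P directly.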

  7. Gram-Schmidt algorithms for covariance propagation

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.; Bierman, G. J.

    1975-01-01

    This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root free factorization, P = UDU^T, where U is unit upper triangular and D is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and colored process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.

  8. Condition Number Regularized Covariance Estimation*

    PubMed Central

    Won, Joong-Ho; Lim, Johan; Kim, Seung-Jean; Rajaratnam, Bala

    2012-01-01

    Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called “large p, small n” setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumption on either the covariance matrix or its inverse is imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required. PMID:23730197
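
    A simplified sketch of the idea: clipping the sample eigenvalues into an interval [tau, kappa * tau] yields a well-conditioned estimate even when the sample covariance is singular. This shortcut fixes tau = lambda_max / kappa rather than solving the paper's maximum likelihood problem for the optimal truncation level.

```python
# Sketch: condition-number regularization by eigenvalue clipping (simplified).
import numpy as np

def clip_condition(sample_cov, kappa):
    """Return an SPD estimate whose condition number is at most kappa."""
    vals, vecs = np.linalg.eigh(sample_cov)
    tau = vals.max() / kappa           # simplified choice of truncation level
    vals = np.clip(vals, tau, kappa * tau)
    return vecs @ np.diag(vals) @ vecs.T

rng = np.random.default_rng(3)
p, n = 50, 20                          # "large p, small n": S is singular
X = rng.normal(size=(n, p))
S = X.T @ X / n                        # rank-deficient sample covariance
Sigma_hat = clip_condition(S, kappa=100.0)

print(np.linalg.cond(Sigma_hat))       # bounded by kappa, hence invertible
```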

  9. An alternative covariance estimator to investigate genetic heterogeneity in populations.

    PubMed

    Heslot, Nicolas; Jannink, Jean-Luc

    2015-11-26

    For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship. We propose an alternative covariance estimator named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model and parameters are estimated by REML and, in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the problem of the design of training populations. K-kernel was compared to other covariance estimators or kernels to examine its fit to the data, cross-validated accuracy and suitability for GWAS on several datasets. It provides a significantly better fit to the data than the genomic best linear unbiased prediction model and, in some cases it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. No or small gains were observed in cross-validated prediction accuracy. This alternative

  10. Missing clinical and behavioral health data in a large electronic health record (EHR) system.

    PubMed

    Madden, Jeanne M; Lakoma, Matthew D; Rusinak, Donna; Lu, Christine Y; Soumerai, Stephen B

    2016-11-01

    Recent massive investment in electronic health records (EHRs) was predicated on the assumption of improved patient safety, research capacity, and cost savings. However, most US health systems and health records are fragmented and do not share patient information. Our study compared information available in a typical EHR with more complete data from insurance claims, focusing on diagnoses, visits, and hospital care for depression and bipolar disorder. We included insurance plan members aged 12 and over, assigned throughout 2009 to a large multispecialty medical practice in Massachusetts, with diagnoses of depression (N = 5140) or bipolar disorder (N = 462). We extracted insurance claims and EHR data from the primary care site and compared diagnoses of interest, outpatient visits, and acute hospital events (overall and behavioral) between the 2 sources. Patients with depression and bipolar disorder, respectively, averaged 8.4 and 14.0 days of outpatient behavioral care per year; 60% and 54% of these, respectively, were missing from the EHR because they occurred offsite. Total outpatient care days were 20.5 for those with depression and 25.0 for those with bipolar disorder, with 45% and 46% missing, respectively, from the EHR. The EHR missed 89% of acute psychiatric services. Study diagnoses were missing from the EHR's structured event data for 27.3% and 27.7% of patients. EHRs inadequately capture mental health diagnoses, visits, specialty care, hospitalizations, and medications. Missing clinical information raises concerns about medical errors and research integrity. Given the fragmentation of health care and poor EHR interoperability, information exchange, and usability, priorities for further investment in health IT will need thoughtful reconsideration. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. Earth Observation System Flight Dynamics System Covariance Realism

    NASA Technical Reports Server (NTRS)

    Zaidi, Waqar H.; Tracewell, David

    2016-01-01

    This presentation applies a covariance realism technique to the National Aeronautics and Space Administration (NASA) Earth Observation System (EOS) Aqua and Aura spacecraft based on inferential statistics. The technique consists of three parts: collection and calculation of definitive state estimates through orbit determination, calculation of covariance realism test statistics at each covariance propagation point, and proper assessment of those test statistics.
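
    Realism tests of this kind are commonly built on Mahalanobis-type statistics: if the propagated covariance P is realistic, the squared Mahalanobis distance between the predicted and definitive states follows a chi-square distribution with n degrees of freedom. A hedged sketch, with simulated state errors standing in for definitive-minus-predicted orbit states:

```python
# Sketch: Mahalanobis-based covariance realism check on simulated errors.
import numpy as np

rng = np.random.default_rng(8)
n, trials = 6, 500                         # 6-element orbital state
P = np.diag([1.0, 2.0, 0.5, 0.1, 0.1, 0.2])  # claimed (propagated) covariance

# Simulate prediction errors whose true covariance actually equals P
err = rng.multivariate_normal(np.zeros(n), P, size=trials)
m2 = np.einsum('ij,jk,ik->i', err, np.linalg.inv(P), err)

# A realistic covariance gives a mean statistic near n (mean of chi^2_6 is 6)
print(m2.mean())
```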

  12. Markov modulated Poisson process models incorporating covariates for rainfall intensity.

    PubMed

    Thayakaran, R; Ramesh, N I

    2013-01-01

    Time series of rainfall bucket tip times at the Beaufort Park station, Bracknell, in the UK are modelled by a class of Markov modulated Poisson processes (MMPP) which may be thought of as a generalization of the Poisson process. Our main focus in this paper is to investigate the effects of including covariate information into the MMPP model framework on statistical properties. In particular, we look at three types of time-varying covariates namely temperature, sea level pressure, and relative humidity that are thought to be affecting the rainfall arrival process. Maximum likelihood estimation is used to obtain the parameter estimates, and likelihood ratio tests are employed in model comparison. Simulated data from the fitted model are used to make statistical inferences about the accumulated rainfall in the discrete time interval. Variability of the daily Poisson arrival rates is studied.
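
    A two-state MMPP of the kind fitted above can be simulated directly: a continuous-time Markov chain switches between states, and tip arrivals occur at the current state's Poisson rate. The switching and arrival rates below are invented, not the Beaufort Park estimates.

```python
# Sketch: simulating a two-state Markov modulated Poisson process (MMPP).
import numpy as np

rng = np.random.default_rng(4)
switch = np.array([0.5, 2.0])   # rate of leaving each state ("dry", "wet")
arrival = np.array([0.1, 5.0])  # Poisson tip rate (per hour) in each state

def simulate_mmpp(t_end):
    t, state, tips = 0.0, 0, []
    while t < t_end:
        dwell = rng.exponential(1.0 / switch[state])   # sojourn in this state
        dur = min(dwell, t_end - t)                    # truncate at horizon
        n = rng.poisson(arrival[state] * dur)
        tips.extend(t + np.sort(rng.uniform(0.0, dur, n)))
        t += dwell
        state = 1 - state                              # switch state
    return np.array(tips)

tips = simulate_mmpp(t_end=1000.0)
# Stationary distribution is (0.8, 0.2), so the long-run mean rate is
# 0.8 * 0.1 + 0.2 * 5.0 = 1.08 tips/hour.
print(len(tips) / 1000.0)
```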

  13. Threshold regression to accommodate a censored covariate.

    PubMed

    Qian, Jing; Chiou, Sy Han; Maye, Jacqueline E; Atem, Folefac; Johnson, Keith A; Betensky, Rebecca A

    2018-06-22

    In several common study designs, regression modeling is complicated by the presence of censored covariates. Examples of such covariates include maternal age of onset of dementia that may be right censored in an Alzheimer's amyloid imaging study of healthy subjects, metabolite measurements that are subject to limit of detection censoring in a case-control study of cardiovascular disease, and progressive biomarkers whose baseline values are of interest, but are measured post-baseline in longitudinal neuropsychological studies of Alzheimer's disease. We propose threshold regression approaches for linear regression models with a covariate that is subject to random censoring. Threshold regression methods allow for immediate testing of the significance of the effect of a censored covariate. In addition, they provide for unbiased estimation of the regression coefficient of the censored covariate. We derive the asymptotic properties of the resulting estimators under mild regularity conditions. Simulations demonstrate that the proposed estimators have good finite-sample performance, and often offer improved efficiency over existing methods. We also derive a principled method for selection of the threshold. We illustrate the approach in application to an Alzheimer's disease study that investigated brain amyloid levels in older individuals, as measured through positron emission tomography scans, as a function of maternal age of dementia onset, with adjustment for other covariates. We have developed an R package, censCov, for implementation of our method, available at CRAN. © 2018, The International Biometric Society.

  14. Bayesian source term determination with unknown covariance of measurements

    NASA Astrophysics Data System (ADS)

    Belal, Alkomiet; Tichý, Ondřej; Šmídl, Václav

    2017-04-01

    Determination of the source term of a release of hazardous material into the atmosphere is a very important task for emergency response. We are concerned with the problem of estimation of the source term in the conventional linear inverse problem, y = Mx, where the relationship between the vector of observations y and the unknown source term x is described using the source-receptor-sensitivity (SRS) matrix M. Since the system is typically ill-conditioned, the problem is recast as the optimization problem min_x (y - Mx)^T R^(-1) (y - Mx) + x^T B^(-1) x. The first term minimizes the error of the measurements with covariance matrix R, and the second term is a regularization of the source term. Different types of regularization arise for different choices of the matrices R and B; for example, Tikhonov regularization assumes the covariance matrix B to be the identity matrix multiplied by a scalar parameter. In this contribution, we adopt a Bayesian approach to make inference on the unknown source term x as well as the unknown R and B. We assume the prior on x to be Gaussian with zero mean and unknown diagonal covariance matrix B. The covariance matrix of the likelihood, R, is also unknown. We consider two potential choices of the structure of the matrix R: first, a diagonal matrix, and second, a locally correlated structure using information on the topology of the measuring network. Since inference in the model is intractable, an iterative variational Bayes algorithm is used for simultaneous estimation of all model parameters. The practical usefulness of our contribution is demonstrated on an application of the resulting algorithm to real data from the European Tracer Experiment (ETEX). This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
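
    The closed-form minimizer of the quadratic objective above is x = (M^T R^-1 M + B^-1)^-1 M^T R^-1 y. The sketch below solves a toy version with fixed R and B (the paper instead infers R and B by variational Bayes); the SRS matrix, noise levels, and the true source are invented.

```python
# Sketch: regularized linear source-term inversion with fixed R and B.
import numpy as np

rng = np.random.default_rng(5)
n_obs, n_src = 40, 15
M = np.abs(rng.normal(size=(n_obs, n_src)))   # stand-in SRS matrix (nonnegative)
x_true = np.zeros(n_src)
x_true[3] = 10.0                              # a single true release term

R = 0.01 * np.eye(n_obs)                      # measurement-error covariance
B = 25.0 * np.eye(n_src)                      # prior covariance on the source
y = M @ x_true + rng.multivariate_normal(np.zeros(n_obs), R)

Rinv, Binv = np.linalg.inv(R), np.linalg.inv(B)
x_hat = np.linalg.solve(M.T @ Rinv @ M + Binv, M.T @ Rinv @ y)
print(np.argmax(x_hat))                       # dominant estimated source term
```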

  15. Large Covariance Estimation by Thresholding Principal Orthogonal Complements.

    PubMed

    Fan, Jianqing; Liao, Yuan; Mincheva, Martina

    2013-09-01

    This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
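
    A bare-bones POET sketch: retain K principal components of the sample covariance and hard-threshold the off-diagonal entries of the residual (the principal orthogonal complement). The factor count, threshold, and simulated factor model are illustrative assumptions.

```python
# Sketch: POET-style covariance estimation (low rank + thresholded residual).
import numpy as np

def poet(X, K, thresh):
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(S)
    vals, vecs = vals[::-1], vecs[:, ::-1]           # descending eigenvalues
    low_rank = (vecs[:, :K] * vals[:K]) @ vecs[:, :K].T
    resid = S - low_rank                             # orthogonal complement
    # Hard-threshold off-diagonal residual entries; keep the diagonal intact
    mask = (np.abs(resid) >= thresh) | np.eye(p, dtype=bool)
    return low_rank + resid * mask

rng = np.random.default_rng(6)
n, p, K = 200, 30, 2
F = rng.normal(size=(n, K))                          # latent common factors
L = rng.normal(size=(p, K))                          # factor loadings
X = F @ L.T + rng.normal(scale=0.5, size=(n, p))     # approximate factor model

Sigma_hat = poet(X, K=2, thresh=0.1)
print(Sigma_hat.shape)
```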

  16. Covariance NMR Processing and Analysis for Protein Assignment.

    PubMed

    Harden, Bradley J; Frueh, Dominique P

    2018-01-01

    During NMR resonance assignment it is often necessary to relate nuclei to one another indirectly, through their common correlations to other nuclei. Covariance NMR has emerged as a powerful technique to correlate such nuclei without relying on error-prone peak picking. However, false-positive artifacts in covariance spectra have impeded a general application to proteins. We recently introduced pre- and postprocessing steps to reduce the prevalence of artifacts in covariance spectra, allowing for the calculation of a variety of 4D covariance maps obtained from diverse combinations of pairs of 3D spectra, and we have employed them to assign backbone and sidechain resonances in two large and challenging proteins. In this chapter, we present a detailed protocol describing how to (1) properly prepare existing 3D spectra for covariance, (2) understand and apply our processing script, and (3) navigate and interpret the resulting 4D spectra. We also provide solutions to a number of errors that may occur when using our script, and we offer practical advice for assigning difficult signals. We believe such 4D spectra, and covariance NMR in general, can play an integral role in the assignment of NMR signals.
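
    The core covariance operation, relating nuclei through a shared spectral dimension without peak picking, can be sketched with two toy spectra that share one axis; the matrix sizes and peak positions are invented.

```python
# Sketch: covariance of two spectra over a shared dimension.
import numpy as np

n_shared, n_a, n_b = 64, 32, 40
A = np.zeros((n_shared, n_a))   # spectrum 1: shared axis x its indirect axis
B = np.zeros((n_shared, n_b))   # spectrum 2: shared axis x a different axis

A[20, 5] = 1.0                  # peak at shared index 20 in spectrum 1
B[20, 30] = 1.0                 # peak at the same shared index in spectrum 2

C = A.T @ B                     # covariance map over the two unshared axes
# The product correlates indices 5 and 30 through their common partner (20):
# exactly the indirect correlation used for assignment.
print(np.unravel_index(np.argmax(C), C.shape))  # -> (5, 30)
```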

  17. The impact of missing trauma data on predicting massive transfusion

    PubMed Central

    Trickey, Amber W.; Fox, Erin E.; del Junco, Deborah J.; Ning, Jing; Holcomb, John B.; Brasel, Karen J.; Cohen, Mitchell J.; Schreiber, Martin A.; Bulger, Eileen M.; Phelan, Herb A.; Alarcon, Louis H.; Myers, John G.; Muskat, Peter; Cotton, Bryan A.; Wade, Charles E.; Rahbar, Mohammad H.

    2013-01-01

    INTRODUCTION Missing data are inherent in clinical research and may be especially problematic for trauma studies. This study describes a sensitivity analysis to evaluate the impact of missing data on clinical risk prediction algorithms. Three blood transfusion prediction models were evaluated utilizing an observational trauma dataset with valid missing data. METHODS The PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study included patients requiring ≥ 1 unit of red blood cells (RBC) at 10 participating U.S. Level I trauma centers from July 2009 – October 2010. Physiologic, laboratory, and treatment data were collected prospectively up to 24h after hospital admission. Subjects who received ≥ 10 RBC units within 24h of admission were classified as massive transfusion (MT) patients. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation. A sensitivity analysis for missing data was conducted to determine the upper and lower bounds for correct classification percentages. RESULTS PROMMTT enrolled 1,245 subjects. MT was received by 297 patients (24%). Missing percentage ranged from 2.2% (heart rate) to 45% (respiratory rate). Proportions of complete cases utilized in the MT prediction models ranged from 41% to 88%. All models demonstrated similar correct classification percentages using complete case analysis and multiple imputation. In the sensitivity analysis, correct classification upper-lower bound ranges per model were 4%, 10%, and 12%. Predictive accuracy for all models using PROMMTT data was lower than reported in the original datasets. CONCLUSIONS Evaluating the accuracy of clinical prediction models with missing data can be misleading, especially with many predictor variables and moderate levels of missingness per variable. The proposed sensitivity analysis describes the influence of missing data on risk prediction algorithms. Reporting upper/lower bounds
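
    The upper/lower-bound idea behind such a sensitivity analysis can be sketched for a single binary classification with one partially missing predictor: fill the missing values in the most and least favorable way and recompute accuracy. The data below are simulated, not PROMMTT values.

```python
# Sketch: best-case/worst-case accuracy bounds under missing predictor data.
import numpy as np

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
y = (x + 0.5 * rng.normal(size=n) > 0).astype(int)  # noisy binary outcome
miss = rng.random(n) < 0.3                          # ~30% of predictor missing

def accuracy(x_filled):
    pred = (x_filled > 0).astype(int)               # simple threshold rule
    return (pred == y).mean()

x_best, x_worst = x.copy(), x.copy()
x_best[miss] = np.where(y[miss] == 1, 1.0, -1.0)    # fill: always classified right
x_worst[miss] = np.where(y[miss] == 1, -1.0, 1.0)   # fill: always classified wrong

print(accuracy(x_worst), accuracy(x_best))          # lower and upper bounds
```

    The gap between the bounds equals the missingness fraction here, which is why moderate missingness across many predictors can make reported accuracy hard to interpret.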

  18. Design, implementation and reporting strategies to reduce the instance and impact of missing patient-reported outcome (PRO) data: a systematic review

    PubMed Central

    Mercieca-Bebber, Rebecca; Palmer, Michael J; Brundage, Michael; Stockler, Martin R; King, Madeleine T

    2016-01-01

    Objectives Patient-reported outcomes (PROs) provide important information about the impact of treatment from the patients' perspective. However, missing PRO data may compromise the interpretability and value of the findings. We aimed to report: (1) a non-technical summary of problems caused by missing PRO data; and (2) a systematic review by collating strategies to: (A) minimise rates of missing PRO data, and (B) facilitate transparent interpretation and reporting of missing PRO data in clinical research. Our systematic review does not address statistical handling of missing PRO data. Data sources MEDLINE and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases (inception to 31 March 2015), and citing articles and reference lists from relevant sources. Eligibility criteria English articles providing recommendations for reducing missing PRO data rates, or strategies to facilitate transparent interpretation and reporting of missing PRO data were included. Methods 2 reviewers independently screened articles against eligibility criteria. Discrepancies were resolved with the research team. Recommendations were extracted and coded according to framework synthesis. Results 117 sources (55% discussion papers, 26% original research) met the eligibility criteria. Design and methodological strategies for reducing rates of missing PRO data included: incorporating PRO-specific information into the protocol; carefully designing PRO assessment schedules and defining termination rules; minimising patient burden; appointing a PRO coordinator; PRO-specific training for staff; ensuring PRO studies are adequately resourced; and continuous quality assurance. Strategies for transparent interpretation and reporting of missing PRO data include utilising auxiliary data to inform analysis; transparently reporting baseline PRO scores, rates and reasons for missing data; and methods for handling missing PRO data. Conclusions The instance of missing PRO data and its

  19. 40 CFR 98.85 - Procedures for estimating missing data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... to determine combined process and combustion CO2 emissions, the missing data procedures in § 98.35 apply. (b) For CO2 process emissions from cement manufacturing facilities calculated according to § 98... best available estimate of the monthly clinker production based on information used for accounting...

  20. UV imaging reveals facial areas that are prone to skin cancer are disproportionately missed during sunscreen application

    PubMed Central

    Troughton, Lee D.; Czanner, Gabriela; Zheng, Yalin; McCormick, Austin G.

    2017-01-01

    Application of sunscreen is a widely used mechanism for protecting skin from the harmful effects of UV light. However, protection can only be achieved through effective application, and areas that are routinely missed are likely at increased risk of UV damage. Here we sought to determine whether specific areas of the face are missed during routine sunscreen application, and whether provision of public health information is sufficient to improve coverage. To investigate this, 57 participants were imaged with a UV-sensitive camera before and after sunscreen application: at the first visit with minimal pre-instruction, and at the second visit after being provided with a public health information statement. Images were scored using a custom automated image analysis process designed to identify areas of high UV reflectance, i.e. areas missed during sunscreen application, and analysed at the 5% significance level. Analyses revealed the eyelid and periorbital regions to be disproportionately missed during routine sunscreen application (median 14% missed in the eyelid region vs 7% in the rest of the face, p<0.01). Provision of health information produced a significant improvement in coverage of the eyelid areas in general; however, the medial canthal area was still frequently missed. These data reveal that a public health announcement-type intervention could be effective at improving coverage of high-risk areas of the face. However, because high-risk areas are likely to remain unprotected, other mechanisms of sun protection, such as UV-blocking sunglasses, should also be widely promoted. PMID:28968413

  1. Structural covariance of neostriatal and limbic regions in patients with obsessive-compulsive disorder.

    PubMed

    Subirà, Marta; Cano, Marta; de Wit, Stella J; Alonso, Pino; Cardoner, Narcís; Hoexter, Marcelo Q; Kwon, Jun Soo; Nakamae, Takashi; Lochner, Christine; Sato, João R; Jung, Wi Hoon; Narumoto, Jin; Stein, Dan J; Pujol, Jesus; Mataix-Cols, David; Veltman, Dick J; Menchón, José M; van den Heuvel, Odile A; Soriano-Mas, Carles

    2016-03-01

    Frontostriatal and frontoamygdalar connectivity alterations in patients with obsessive-compulsive disorder (OCD) have been typically described in functional neuroimaging studies. However, structural covariance, or volumetric correlations across distant brain regions, also provides network-level information. Altered structural covariance has been described in patients with different psychiatric disorders, including OCD, but to our knowledge, alterations within frontostriatal and frontoamygdalar circuits have not been explored. We performed a mega-analysis pooling structural MRI scans from the Obsessive-compulsive Brain Imaging Consortium and assessed whole-brain voxel-wise structural covariance of 4 striatal regions (dorsal and ventral caudate nucleus, and dorsal-caudal and ventral-rostral putamen) and 2 amygdalar nuclei (basolateral and centromedial-superficial). Images were preprocessed with the standard pipeline of voxel-based morphometry studies using Statistical Parametric Mapping software. Our analyses involved 329 patients with OCD and 316 healthy controls. Patients showed increased structural covariance between the left ventral-rostral putamen and the left inferior frontal gyrus/frontal operculum region. This finding had a significant interaction with age; the association held only in the subgroup of older participants. Patients with OCD also showed increased structural covariance between the right centromedial-superficial amygdala and the ventromedial prefrontal cortex. This was a cross-sectional study. Because this is a multisite data set analysis, participant recruitment and image acquisition were performed in different centres. Most patients were taking medication, and treatment protocols differed across centres. Our results provide evidence for structural network-level alterations in patients with OCD involving 2 frontosubcortical circuits of relevance for the disorder and indicate that structural covariance contributes to fully characterizing brain

  2. Missed Diagnosis of Syrinx

    PubMed Central

    Oh, Chang Hyun; Kim, Chan Gyu; Lee, Jae-Hwan; Park, Hyeong-Chun; Park, Chong Oon

    2012-01-01

    Study Design Prospective, randomized, controlled human study. Purpose We checked the proportion of missed syrinx diagnoses among the examinees of the Korean military conscription. Overview of Literature A syrinx is a fluid-filled cavity within the spinal cord or brain stem that causes various neurological symptoms. A syrinx can easily be diagnosed by magnetic resonance imaging (MRI), yet missed diagnoses do occur. Methods In this study, we reviewed 103 cases using cervical images, cervical MRI, or whole-spine sagittal MRI; syrinxes were observed in 18 of these cases. A review of medical certificates or interviews was conducted, and the proportion of syrinx diagnoses was calculated. Results The proportion of syrinx diagnoses was about 66.7% (12 of 18 cases). Missed diagnoses were not related to the length of the syrinx, but to the type of image used for the initial diagnosis. Conclusions The proportion of missed syrinx diagnoses is relatively high; therefore, a more careful imaging review is recommended. PMID:22439081

  3. Covariance Matrix Estimation for Massive MIMO

    NASA Astrophysics Data System (ADS)

    Upadhya, Karthik; Vorobyov, Sergiy A.

    2018-04-01

    We propose a novel pilot structure for covariance matrix estimation in massive multiple-input multiple-output (MIMO) systems in which each user transmits two pilot sequences, with the second pilot sequence multiplied by a random phase-shift. The covariance matrix of a particular user is obtained by computing the sample cross-correlation of the channel estimates obtained from the two pilot sequences. This approach relaxes the requirement that all the users transmit their uplink pilots over the same set of symbols. We derive expressions for the achievable rate and the mean-squared error of the covariance matrix estimate when the proposed method is used with staggered pilots. The performance of the proposed method is compared with existing methods through simulations.
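    The key idea, that the cross-correlation of two independently noisy estimates of the same channel is free of the noise bias carried by a single estimate's sample variance, can be illustrated with a toy real-valued scalar simulation. This is a sketch of the principle only; the random phase-shift and staggered-pilot machinery of the actual proposal is omitted, and all parameter values are arbitrary.

    ```python
    import random

    random.seed(7)

    def variance_estimates(n_samples, sigma_h=1.0, sigma_n=0.5):
        """Toy scalar-channel illustration: two noisy pilot-based estimates
        y1, y2 of the same channel h.  Their cross-correlation averages out
        the (independent) noise, while the auto-correlation of one estimate
        retains the noise power as a bias."""
        acc_cross = acc_auto = 0.0
        for _ in range(n_samples):
            h = random.gauss(0.0, sigma_h)       # channel realisation
            y1 = h + random.gauss(0.0, sigma_n)  # estimate from pilot 1
            y2 = h + random.gauss(0.0, sigma_n)  # estimate from pilot 2
            acc_cross += y1 * y2  # E[y1*y2] = sigma_h^2          (unbiased)
            acc_auto += y1 * y1   # E[y1^2]  = sigma_h^2 + sigma_n^2 (biased)
        return acc_cross / n_samples, acc_auto / n_samples

    cross, auto = variance_estimates(200_000)
    ```

    With the defaults, the cross-correlation estimate converges to the channel variance (1.0) while the single-pilot sample variance converges to 1.25, the channel variance plus the noise variance.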

  4. Performance of internal covariance estimators for cosmic shear correlation functions

    DOE PAGES

    Friedrich, O.; Seitz, S.; Eifler, T. F.; ...

    2015-12-31

    Data re-sampling methods such as the delete-one jackknife are a common tool for estimating the covariance of large scale structure probes. In this paper we investigate the concepts of internal covariance estimation in the context of cosmic shear two-point statistics. We demonstrate how to use log-normal simulations of the convergence field and the corresponding shear field to carry out realistic tests of internal covariance estimators and find that most estimators such as jackknife or sub-sample covariance can reach a satisfactory compromise between bias and variance of the estimated covariance. In a forecast for the complete, 5-year DES survey we show that internally estimated covariance matrices can provide a large fraction of the true uncertainties on cosmological parameters in a 2D cosmic shear analysis. The volume inside contours of constant likelihood in the $\Omega_m$-$\sigma_8$ plane as measured with internally estimated covariance matrices is on average $\gtrsim 85\%$ of the volume derived from the true covariance matrix. The uncertainty on the parameter combination $\Sigma_8 \sim \sigma_8 \Omega_m^{0.5}$ derived from internally estimated covariances is $\sim 90\%$ of the true uncertainty.
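    The delete-one jackknife mentioned above can be sketched in pure Python. Each "sample" stands in for a measurement of a binned statistic in one sky patch (the data below are hypothetical); the estimator removes one patch at a time, recomputes the mean, and scales the scatter of the leave-one-out means by the jackknife prefactor $(n-1)/n$.

    ```python
    def jackknife_covariance(samples):
        """Delete-one jackknife estimate of the covariance of the mean of
        vector-valued measurements (e.g. a correlation function measured
        in n sky patches)."""
        n, d = len(samples), len(samples[0])
        total = [sum(s[k] for s in samples) for k in range(d)]
        # Leave-one-out means: the mean with each patch deleted in turn.
        loo = [[(total[k] - s[k]) / (n - 1) for k in range(d)] for s in samples]
        centre = [sum(row[k] for row in loo) / n for k in range(d)]
        cov = [[0.0] * d for _ in range(d)]
        for row in loo:
            for i in range(d):
                for j in range(d):
                    cov[i][j] += (row[i] - centre[i]) * (row[j] - centre[j])
        scale = (n - 1) / n  # jackknife prefactor
        return [[scale * c for c in r] for r in cov]

    cov = jackknife_covariance([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
    ```

    For the mean of independent measurements this reproduces the familiar $s^2/n$ variance, which is a quick sanity check on any jackknife implementation.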

  5. Sophie's story: writing missing journeys.

    PubMed

    Parr, Hester; Stevenson, Olivia

    2014-10-01

    'Sophie's story' is a creative rendition of an interview narrative gathered in a research project on missing people. The paper explains why Sophie's story was written and details the wider intention to provide new narrative resources for police officer training, families of missing people and returned missing people. We contextualize this cultural intervention with an argument about the transformative potential of writing trauma stories. It is suggested that trauma stories produce difficult and unknown affects, but ones that may provide new ways of talking about unspeakable events. Sophie's story is thus presented as a hopeful cultural geography in process, and one that seeks to help rewrite existing social scripts about missing people.

  6. Galaxy-galaxy lensing estimators and their covariance properties

    NASA Astrophysics Data System (ADS)

    Singh, Sukhdeep; Mandelbaum, Rachel; Seljak, Uroš; Slosar, Anže; Vazquez Gonzalez, Jose

    2017-11-01

    We study the covariance properties of real space correlation function estimators - primarily galaxy-shear correlations, or galaxy-galaxy lensing - using SDSS data for both shear catalogues and lenses (specifically the BOSS LOWZ sample). Using mock catalogues of lenses and sources, we disentangle the various contributions to the covariance matrix and compare them with a simple analytical model. We show that not subtracting the lensing measurement around random points from the measurement around the lens sample is equivalent to performing the measurement using the lens density field instead of the lens overdensity field. While the measurement using the lens density field is unbiased (in the absence of systematics), its error is significantly larger due to an additional term in the covariance. Therefore, this subtraction should be performed regardless of its beneficial effects on systematics. Comparing the error estimates from data and mocks for estimators that involve the overdensity, we find that the errors are dominated by the shape noise and lens clustering, that empirically estimated covariances (jackknife and standard deviation across mocks) are consistent with theoretical estimates, and that both the connected parts of the four-point function and the supersample covariance can be neglected for the current levels of noise. While the trade-off between different terms in the covariance depends on the survey configuration (area, source number density), the diagnostics that we use in this work should be useful for future works to test their empirically determined covariances.

  7. Inclusion of dosimetric data as covariates in toxicity-related radiogenomic studies : A systematic review.

    PubMed

    Yahya, Noorazrul; Chua, Xin-Jane; Manan, Hanani A; Ismail, Fuad

    2018-05-17

    This systematic review evaluates the completeness of dosimetric features and their inclusion as covariates in genetic-toxicity association studies. Original research studies associating genetic features and normal tissue complications following radiotherapy were identified from PubMed. The use of dosimetric data was determined by mining the statement of prescription dose, dose fractionation, target volume selection or arrangement and dose distribution. The consideration of the dosimetric data as covariates was based on the statement mentioned in the statistical analysis section. The significance of these covariates was extracted from the results section. Descriptive analyses were performed to determine their completeness and inclusion as covariates. A total of 174 studies were found to satisfy the inclusion criteria. Studies published ≥2010 showed increased use of dose distribution information (p = 0.07). 33% of studies did not include any dose features in the analysis of gene-toxicity associations. Only 29% included dose distribution features as covariates and reported the results. 59% of studies which included dose distribution features found significant associations to toxicity. A large proportion of studies on the correlation of genetic markers with radiotherapy-related side effects considered no dosimetric parameters. Significance of dose distribution features was found in more than half of the studies including these features, emphasizing their importance. Completeness of radiation-specific clinical data may have increased in recent years which may improve gene-toxicity association studies.

  8. AFCI-2.0 Library of Neutron Cross Section Covariances

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herman, M.; Oblozinsky, P.

    2011-06-26

    A neutron cross section covariance library has been under development by a BNL-LANL collaborative effort over the last three years. The primary purpose of the library is to provide covariances for the Advanced Fuel Cycle Initiative (AFCI) data adjustment project, which is focusing on the needs of fast advanced burner reactors. The covariances refer to central values given in the 2006 release of the U.S. neutron evaluated library ENDF/B-VII. The preliminary version (AFCI-2.0beta) was completed in October 2010 and made available to users for comments. In the final 2.0 release, covariances for a few materials were updated; in particular, new LANL evaluations for $^{238,240}$Pu and $^{241}$Am were adopted. BNL was responsible for covariances for structural materials and fission products, management of the library and coordination of the work, while LANL was in charge of covariances for light nuclei and for actinides.

  9. Joint Modeling Approach for Semicompeting Risks Data with Missing Nonterminal Event Status

    PubMed Central

    Hu, Chen; Tsodikov, Alex

    2014-01-01

    Semicompeting risks data, where a subject may experience sequential non-terminal and terminal events, and the terminal event may censor the non-terminal event but not vice versa, are widely available in many biomedical studies. We consider the situation when a proportion of subjects’ non-terminal events is missing, such that the observed data become a mixture of “true” semicompeting risks data and partially observed terminal event only data. An illness-death multistate model with proportional hazards assumptions is proposed to study the relationship between non-terminal and terminal events, and provide covariate-specific global and local association measures. Maximum likelihood estimation based on semiparametric regression analysis is used for statistical inference, and asymptotic properties of proposed estimators are studied using empirical process and martingale arguments. We illustrate the proposed method with simulation studies and data analysis of a follicular cell lymphoma study. PMID:24430204

  10. Large Covariance Estimation by Thresholding Principal Orthogonal Complements

    PubMed Central

    Fan, Jianqing; Liao, Yuan; Mincheva, Martina

    2012-01-01

    This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented. PMID:24348088
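    The sparsity step of POET, entry-wise thresholding of the residual covariance after the leading factors have been removed, can be sketched as follows. This shows only the thresholding; the principal-component (factor-removal) step and the adaptive choice of threshold are omitted, and the matrix is a hypothetical example.

    ```python
    def threshold_covariance(cov, tau):
        """Hard-threshold the off-diagonal entries of a covariance matrix:
        entries smaller in magnitude than tau are set to zero, diagonal
        entries (variances) are always kept.  This is the sparsity step
        POET applies to the principal-orthogonal-complement covariance."""
        d = len(cov)
        return [[cov[i][j] if (i == j or abs(cov[i][j]) >= tau) else 0.0
                 for j in range(d)] for i in range(d)]

    sparse = threshold_covariance(
        [[2.00, 0.05, 0.80],
         [0.05, 1.50, -0.02],
         [0.80, -0.02, 1.00]], tau=0.1)
    ```

    Small off-diagonal entries, attributed to noise once the common factors are gone, are zeroed, while the strong residual correlation between the first and third variables survives.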

  11. Alterations in Anatomical Covariance in the Prematurely Born

    PubMed Central

    Scheinost, Dustin; Kwon, Soo Hyun; Lacadie, Cheryl; Vohr, Betty R.; Schneider, Karen C.; Papademetris, Xenophon; Constable, R. Todd; Ment, Laura R.

    2017-01-01

    Abstract Preterm (PT) birth results in long-term alterations in functional and structural connectivity, but the related changes in anatomical covariance are just beginning to be explored. To test the hypothesis that PT birth alters patterns of anatomical covariance, we investigated brain volumes of 25 PTs and 22 terms at young adulthood using magnetic resonance imaging. Using regional volumetrics, seed-based analyses, and whole brain graphs, we show that PT birth is associated with reduced volume in bilateral temporal and inferior frontal lobes, left caudate, left fusiform, and posterior cingulate for prematurely born subjects at young adulthood. Seed-based analyses demonstrate altered patterns of anatomical covariance for PTs compared with terms. PTs exhibit reduced covariance with R Brodmann area (BA) 47, Broca's area, and L BA 21, Wernicke's area, and white matter volume in the left prefrontal lobe, but increased covariance with R BA 47 and left cerebellum. Graph theory analyses demonstrate that measures of network complexity are significantly less robust in PTs compared with term controls. Volumes in regions showing group differences are significantly correlated with phonological awareness, the fundamental basis for reading acquisition, for the PTs. These data suggest both long-lasting and clinically significant alterations in the covariance in the PTs at young adulthood. PMID:26494796

  12. Fluctuations in Conjunction Miss Distance Projections as Time Approaches Time of Closest Approach

    NASA Technical Reports Server (NTRS)

    Christian, John A., III

    2005-01-01

    A responsibility of the Trajectory Operations Officer is to ensure that the International Space Station (ISS) avoids colliding with debris. United States Space Command (USSPACECOM) tracks and catalogs a portion of the debris in Earth orbit, but only objects with a perigee less than 600 km and a radar cross section (RCS) greater than 10 cm; these objects, in fact, represent only a small fraction of the objects in Earth orbit. To compensate, the ISS uses shielding to protect against collisions with smaller objects. This study provides a better understanding of how quickly, and to what degree, USSPACECOM projections tend to converge to the final, true miss distance. The information included is formulated to better predict the behavior of miss distance data during real-time operations. It was determined that the driving components, in order of impact on miss distance fluctuations, are energy dissipation rate (EDR), RCS, and inclination. Data used in this analysis, calculations made, and conclusions drawn are stored in Microsoft Excel log sheets. A separate log sheet, created for each conjunction, contains information such as predicted miss distances, apogee and perigee of the debris orbit, EDR, RCS, inclination, tracks and observations, statistical data, and other evaluation/orbital parameters.

  13. Robust Covariate-Adjusted Log-Rank Statistics and Corresponding Sample Size Formula for Recurrent Events Data

    PubMed Central

    Song, Rui; Kosorok, Michael R.; Cai, Jianwen

    2009-01-01

    Summary Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. It reduces to the Kong and Slud (1997, Biometrika 84, 847–862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499–503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study. PMID:18162107

  14. On the Likely Utility of Hybrid Weights Optimized for Variances in Hybrid Error Covariance Models

    NASA Astrophysics Data System (ADS)

    Satterfield, E.; Hodyss, D.; Kuhl, D.; Bishop, C. H.

    2017-12-01

    Because of imperfections in ensemble data assimilation schemes, one cannot assume that the ensemble covariance is equal to the true error covariance of a forecast. Previous work demonstrated how information about the distribution of true error variances given an ensemble sample variance can be revealed from an archive of (observation-minus-forecast, ensemble-variance) data pairs. Here, we derive a simple and intuitively compelling formula to obtain the mean of this distribution of true error variances given an ensemble sample variance from (observation-minus-forecast, ensemble-variance) data pairs produced by a single run of a data assimilation system. This formula takes the form of a Hybrid weighted average of the climatological forecast error variance and the ensemble sample variance. Here, we test the extent to which these readily obtainable weights can be used to rapidly optimize the covariance weights used in Hybrid data assimilation systems that employ weighted averages of static covariance models and flow-dependent ensemble based covariance models. Univariate data assimilation and multi-variate cycling ensemble data assimilation are considered. In both cases, it is found that our computationally efficient formula gives Hybrid weights that closely approximate the optimal weights found through the simple but computationally expensive process of testing every plausible combination of weights.
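    The hybrid weighted average described above has a very simple form: the predicted error variance is a convex combination of the flow-dependent ensemble sample variance and the static climatological error variance. The sketch below uses an arbitrary weight; the paper's contribution is a data-driven formula for that weight obtained from (observation-minus-forecast, ensemble-variance) pairs, which is not reproduced here.

    ```python
    def hybrid_variance(ensemble_var, clim_var, w):
        """Hybrid estimate of the true forecast-error variance: a weighted
        average of the ensemble sample variance (flow-dependent) and the
        climatological forecast-error variance (static).  The weight w is
        treated as a free parameter in this sketch."""
        return w * ensemble_var + (1.0 - w) * clim_var

    # Hypothetical values: ensemble spread 2.0, climatological variance 1.0.
    v = hybrid_variance(ensemble_var=2.0, clim_var=1.0, w=0.25)
    ```

    The same convex-combination structure underlies hybrid background-error covariance models in data assimilation, which is why weights optimized for variances are a plausible proxy for the covariance weights.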

  15. Measuring continuous baseline covariate imbalances in clinical trial data

    PubMed Central

    Ciolino, Jody D.; Martin, Renee’ H.; Zhao, Wenle; Hill, Michael D.; Jauch, Edward C.; Palesch, Yuko Y.

    2014-01-01

    This paper presents and compares several methods of measuring continuous baseline covariate imbalance in clinical trial data. Simulations illustrate that though the t-test is an inappropriate method of assessing continuous baseline covariate imbalance, the test statistic itself is a robust measure in capturing imbalance in continuous covariate distributions. Guidelines to assess effects of imbalance on bias, type I error rate, and power for hypothesis test for treatment effect on continuous outcomes are presented, and the benefit of covariate-adjusted analysis (ANCOVA) is also illustrated. PMID:21865270
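    The distinction drawn above, using the t statistic as a descriptive measure of imbalance rather than as a hypothesis test, can be made concrete with a short sketch (Welch form, hypothetical arm data):

    ```python
    import math

    def imbalance_t(xs, ys):
        """Welch two-sample t statistic, used here descriptively to
        quantify baseline covariate imbalance between two trial arms,
        not to test a hypothesis about it."""
        nx, ny = len(xs), len(ys)
        mx, my = sum(xs) / nx, sum(ys) / ny
        vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
        vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
        return (mx - my) / math.sqrt(vx / nx + vy / ny)

    t = imbalance_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
    ```

    A larger |t| signals a larger standardized difference between arms; the point of the paper is that this magnitude is informative even though the p-value attached to it is not an appropriate way to judge randomization.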

  16. Establishing a threshold for the number of missing days using 7 d pedometer data.

    PubMed

    Kang, Minsoo; Hart, Peter D; Kim, Youngdeok

    2012-11-01

    The purpose of this study was to examine the threshold of the number of missing days of recovery using the individual information (II)-centered approach. Data for this study came from 86 participants, aged 17 to 79 years, who had 7 consecutive days of complete pedometer (Yamax SW 200) wear. Missing datasets (1 d through 5 d missing) were created by a SAS random process 10,000 times each. All missing values were replaced using the II-centered approach. A 7 d average was calculated for each dataset, including the complete dataset. Repeated measures ANOVA was used to determine the differences between the 1 d through 5 d missing datasets and the complete dataset. Mean absolute percentage error (MAPE) was also computed. Mean (SD) daily step count for the complete 7 d dataset was 7979 (3084). Mean (SD) values for the 1 d through 5 d missing datasets were 8072 (3218), 8066 (3109), 7968 (3273), 7741 (3050) and 8314 (3529), respectively (p > 0.05). The lower MAPEs were estimated for 1 d missing (5.2%, 95% confidence interval (CI) 4.4-6.0) and 2 d missing (8.4%, 95% CI 7.0-9.8), while all others were greater than 10%. The results of this study show that the 1 d through 5 d missing datasets, with replaced values, were not significantly different from the complete dataset. Based on the MAPE results, it is not recommended to replace more than two days of missing step counts.
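    The replacement-and-evaluation loop can be sketched as follows. Filling a missing day with the individual's own mean over observed days is a simplified stand-in for the II-centered approach (the study's actual replacement rule may differ), and the step counts are hypothetical.

    ```python
    def fill_missing_days(steps):
        """Replace missing daily step counts (None) with the individual's
        mean over observed days -- a simplified stand-in for the
        individual information (II)-centered replacement."""
        observed = [s for s in steps if s is not None]
        mean = sum(observed) / len(observed)
        return [mean if s is None else s for s in steps]

    def mape(true_value, estimate):
        """Absolute percentage error of the estimated weekly average."""
        return 100.0 * abs(true_value - estimate) / true_value

    # Hypothetical week: day 2 is missing; the true (unobserved) count was 7600.
    week = [8000, None, 6000, 7000, 9000, 7000, 8000]
    filled = fill_missing_days(week)
    weekly_avg = sum(filled) / 7
    true_avg = (sum(s for s in week if s is not None) + 7600) / 7
    err = mape(true_avg, weekly_avg)
    ```

    Repeating this over many randomly generated missing patterns, as the study does 10,000 times per condition, yields the MAPE distributions used to set the two-day threshold.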

  17. Information matrix estimation procedures for cognitive diagnostic models.

    PubMed

    Liu, Yanlou; Xin, Tao; Andersson, Björn; Tian, Wei

    2018-03-06

    Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance. © 2018 The British Psychological Society.
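    The sandwich-type estimator named above has the generic form $A^{-1} B A^{-1}$, with $A$ the observed information matrix and $B$ the empirical cross-product (score) matrix. The sketch below shows only that algebraic shell with toy 2x2 plain-list matrices; the CDM-specific likelihood quantities that populate $A$ and $B$ are not reproduced here.

    ```python
    def matmul(X, Y):
        """Multiply two plain-list matrices."""
        return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                 for j in range(len(Y[0]))] for i in range(len(X))]

    def sandwich_covariance(A_inv, B):
        """Sandwich-type covariance A^{-1} B A^{-1}: the inverse observed
        information 'bread' around the cross-product 'meat'.  When the
        model is correctly specified, A = B and this collapses to A^{-1}."""
        return matmul(matmul(A_inv, B), A_inv)

    # Toy inputs: A^{-1} = 0.5*I, B = 4*I, so the sandwich gives the identity.
    cov = sandwich_covariance([[0.5, 0.0], [0.0, 0.5]],
                              [[4.0, 0.0], [0.0, 4.0]])
    ```

    The robustness result in the abstract follows from this structure: under misspecification $A \ne B$, so the plain inverse-information estimator is wrong while the sandwich remains consistent.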

  18. Galaxy–galaxy lensing estimators and their covariance properties

    DOE PAGES

    Singh, Sukhdeep; Mandelbaum, Rachel; Seljak, Uros; ...

    2017-07-21

    Here, we study the covariance properties of real space correlation function estimators – primarily galaxy–shear correlations, or galaxy–galaxy lensing – using SDSS data for both shear catalogues and lenses (specifically the BOSS LOWZ sample). Using mock catalogues of lenses and sources, we disentangle the various contributions to the covariance matrix and compare them with a simple analytical model. We show that not subtracting the lensing measurement around random points from the measurement around the lens sample is equivalent to performing the measurement using the lens density field instead of the lens overdensity field. While the measurement using the lens density field is unbiased (in the absence of systematics), its error is significantly larger due to an additional term in the covariance. Therefore, this subtraction should be performed regardless of its beneficial effects on systematics. Comparing the error estimates from data and mocks for estimators that involve the overdensity, we find that the errors are dominated by the shape noise and lens clustering, that empirically estimated covariances (jackknife and standard deviation across mocks) are consistent with theoretical estimates, and that both the connected parts of the four-point function and the supersample covariance can be neglected for the current levels of noise. While the trade-off between different terms in the covariance depends on the survey configuration (area, source number density), the diagnostics that we use in this work should be useful for future works to test their empirically determined covariances.


  20. A review of the handling of missing longitudinal outcome data in clinical trials

    PubMed Central

    2014-01-01

    The aim of this review was to establish the frequency with which trials take into account missingness, and to discover what methods trialists use for adjustment in randomised controlled trials with longitudinal measurements. Failing to address the problems that can arise from missing outcome data can result in misleading conclusions. Missing data should be addressed as a sensitivity analysis of the complete case analysis results. One hundred publications of randomised controlled trials with longitudinal measurements were selected randomly from trial publications from the years 2005 to 2012. Information was extracted from these trials, including whether reasons for dropout were reported, what methods were used for handling the missing data, whether there was any explanation of the methods for missing data handling, and whether a statistician was involved in the analysis. The main focus of the review was on missing data post dropout rather than missing interim data. Of all the papers in the study, 9 (9%) had no missing data. More than half of the papers included in the study failed to make any attempt to explain the reasons for their choice of missing data handling method. Of the papers with clear missing data handling methods, 44 papers (50%) used adequate methods of missing data handling, whereas 30 (34%) of the papers used missing data methods which may not have been appropriate. In the remaining 17 papers (19%), it was difficult to assess the validity of the methods used. An imputation method was used in 18 papers (20%). Multiple imputation methods were introduced in 1987 and are an efficient way of accounting for missing data in general, and yet only 4 papers used these methods. Out of the 18 papers which used imputation, only 7 displayed the results as a sensitivity analysis of the complete case analysis results. 61% of the papers that used an imputation method explained the reasons for their chosen method.
Just under a third of the papers made no reference

  1. Missed nursing care: a concept analysis.

    PubMed

    Kalisch, Beatrice J; Landstrom, Gay L; Hinshaw, Ada Sue

    2009-07-01

    This paper is a report of the analysis of the concept of missed nursing care. According to patient safety literature, missed nursing care is an error of omission. This concept has been conspicuously absent in quality and patient safety literature, with individual aspects of nursing care left undone given only occasional mention. An 8-step method of concept analysis - select concept, determine purpose, identify uses, define attributes, identify model case, describe related and contrary cases, identify antecedents and consequences and define empirical referents - was used to examine the concept of missed nursing care. The sources for the analysis were identified by systematic searches of the World Wide Web, MEDLINE, CINAHL and reference lists of related journal articles with a timeline of 1970 to April 2008. Missed nursing care, conceptualized within the Missed Nursing Care Model, is defined as any aspect of required patient care that is omitted (either in part or in whole) or delayed. Various attribute categories reported by nurses in acute care settings contribute to missed nursing care: (1) antecedents that catalyse the need for a decision about priorities; (2) elements of the nursing process and (3) internal perceptions and values of the nurse. Multiple elements in the nursing environment and internal to nurses influence whether needed nursing care is provided. Missed care as conceptualized within the Missed Care Model is a universal phenomenon. The concept is expected to occur across all cultures and countries, thus being international in scope.

  2. Facilitated assignment of large protein NMR signals with covariance sequential spectra using spectral derivatives.

    PubMed

    Harden, Bradley J; Nichols, Scott R; Frueh, Dominique P

    2014-09-24

    Nuclear magnetic resonance (NMR) studies of larger proteins are hampered by difficulties in assigning NMR resonances. Human intervention is typically required to identify NMR signals in 3D spectra, and subsequent procedures depend on the accuracy of this so-called peak picking. We present a method that provides sequential connectivities through correlation maps constructed with covariance NMR, bypassing the need for preliminary peak picking. We introduce two novel techniques to minimize false correlations and merge the information from all original 3D spectra. First, we take spectral derivatives prior to performing covariance to emphasize coincident peak maxima. Second, we multiply covariance maps calculated with different 3D spectra to destroy erroneous sequential correlations. The maps are easy to use and can readily be generated from conventional triple-resonance experiments. Advantages of the method are demonstrated on a 37 kDa nonribosomal peptide synthetase domain subject to spectral overlap.

  3. Protein-protein interaction site predictions with minimum covariance determinant and Mahalanobis distance.

    PubMed

    Qiu, Zhijun; Zhou, Bo; Yuan, Jiangfeng

    2017-11-21

    Protein-protein interaction site (PPIS) prediction must deal with the diversity of interaction sites that limits their prediction accuracy. Use of proteins with unknown or unidentified interactions can also lead to missing interfaces. Such data errors are often brought into the training dataset. In response to these two problems, we used the minimum covariance determinant (MCD) method to refine the training data to build a predictor with better performance, utilizing its ability to remove outliers. In order to predict test data in practice, a method based on Mahalanobis distance was devised to select proper test data as input for the predictor. With leave-one-out validation and an independent test, after the Mahalanobis distance screening, our method achieved higher performance according to the Matthews correlation coefficient (MCC), although only a part of the test data could be predicted. These results indicate that data refinement is an efficient approach to improving protein-protein interaction site prediction. By further optimizing our method, we hope to develop predictors with better performance and a wider range of application. Copyright © 2017 Elsevier Ltd. All rights reserved.
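
    scikit-learn's `MinCovDet` implements the MCD estimator used in the abstract, and exposes squared Mahalanobis distances under the robust fit. The toy data, the contamination level and the distance cutoff below are all assumptions for illustration, not the authors' pipeline.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(42)

# Hypothetical training set: 200 inliers plus 10 gross outliers, mimicking
# mislabelled examples (e.g. unidentified interfaces) in the training data.
inliers = rng.normal(0.0, 1.0, size=(200, 3))
outliers = rng.normal(8.0, 1.0, size=(10, 3))
X = np.vstack([inliers, outliers])

# Robust location/scatter via the minimum covariance determinant.
mcd = MinCovDet(random_state=0).fit(X)

# Squared Mahalanobis distances under the robust fit.
d2 = mcd.mahalanobis(X)

# Screen out points beyond an illustrative cutoff (assumed threshold).
cutoff = 25.0
kept = d2 < cutoff
```

    Because MCD fits on the densest half of the data, the gross outliers receive very large distances and are rejected, while nearly all inliers survive the screen.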

  4. Genomic Variants Revealed by Invariably Missing Genotypes in Nelore Cattle

    PubMed Central

    da Silva, Joaquim Manoel; Giachetto, Poliana Fernanda; da Silva, Luiz Otávio Campos; Cintra, Leandro Carrijo; Paiva, Samuel Rezende; Caetano, Alexandre Rodrigues; Yamagishi, Michel Eduardo Beleza

    2015-01-01

    High density genotyping panels have been used in a wide range of applications. From population genetics to genome-wide association studies, this technology still offers the lowest cost and the most consistent solution for generating SNP data. However, regardless of the application, part of the generated data is always discarded from final datasets based on quality control criteria used to remove unreliable markers. Some discarded data consists of markers that failed to generate genotypes, labeled as missing genotypes. A subset of missing genotypes that occur in the whole population under study may be caused by technical issues but can also be explained by the presence of genomic variations that are in the vicinity of the assayed SNP and that prevent genotyping probes from annealing. The latter case may contain relevant information because these missing genotypes might be used to identify population-specific genomic variants. In order to assess which case is more prevalent, we used Illumina HD Bovine chip genotypes from 1,709 Nelore (Bos indicus) samples. We found 3,200 missing genotypes among the whole population. NGS re-sequencing data from 8 sires were used to verify the presence of genomic variations within their flanking regions in 81.56% of these missing genotypes. Furthermore, we discovered 3,300 novel SNPs/Indels, 31% of which are located in genes that may affect traits of importance for the genetic improvement of cattle production. PMID:26305794

  5. Missing value imputation strategies for metabolomics data.

    PubMed

    Armitage, Emily Grace; Godzien, Joanna; Alonso-Herranz, Vanesa; López-Gonzálvez, Ángeles; Barbas, Coral

    2015-12-01

    Missing values can arise for different reasons, and depending on their origin they should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise error rate or false discovery rate, and how they work with the different strategies for missing value imputation, have been evaluated. Missing values were found to affect normality and variance of data, and k-means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives, and it was observed that as low as 40% missing data could be truly missing. The range between 40 and 70% missing values was defined as a "gray area", and therefore a strategy has been proposed that provides a balance between the optimal imputation strategy, which was k-means nearest neighbour, and the best approximation of positioning real zeros. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
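
    The k-means nearest neighbour imputation discussed in the abstract is closely related to standard k-nearest-neighbour imputation, available off the shelf in scikit-learn. The matrix and the choice of k below are hypothetical; this is a mechanics sketch, not the paper's exact method.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical metabolomics-style matrix: samples x metabolites, with NaNs
# standing in for missing intensity values.
X = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 1.9, 3.0],
    [0.9, 2.1, 3.2],
    [5.0, 5.2, np.nan],
    [5.1, 5.0, 8.0],
    [4.9, 5.1, 7.8],
])

# Each NaN is replaced by the mean of that feature over the k most similar
# samples (k=2 here, an assumption), using distances on the observed features.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
```

    For the first sample, the two nearest donors are the second and third rows, so the missing value is imputed as (3.0 + 3.2) / 2 = 3.1.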

  6. Nonrelativistic fluids on scale covariant Newton-Cartan backgrounds

    NASA Astrophysics Data System (ADS)

    Mitra, Arpita

    2017-12-01

    The nonrelativistic covariant framework for fields is extended to investigate fields and fluids on scale covariant curved backgrounds. The scale covariant Newton-Cartan background is constructed using the localization of space-time symmetries of nonrelativistic fields in flat space. Following this, we provide a Weyl covariant formalism which can be used to study scale invariant fluids. By considering ideal fluids as an example, we describe its thermodynamic and hydrodynamic properties and explicitly demonstrate that it satisfies the local second law of thermodynamics. As a further application, we consider the low energy description of Hall fluids. Specifically, we find that the gauge fields for scale transformations lead to corrections of the Wen-Zee and Berry phase terms contained in the effective action.

  7. Empirical State Error Covariance Matrix for Batch Estimation

    NASA Technical Reports Server (NTRS)

    Frisbee, Joe

    2015-01-01

    State estimation techniques effectively provide mean state estimates. However, the theoretical state error covariance matrices provided as part of these techniques often suffer from a lack of confidence in their ability to describe the uncertainty in the estimated states. By a reinterpretation of the equations involved in the weighted batch least squares algorithm, it is possible to directly arrive at an empirical state error covariance matrix. The proposed empirical state error covariance matrix will contain the effect of all error sources, known or not. This empirical error covariance matrix may be calculated as a side computation for each unique batch solution. Results based on the proposed technique are presented for a simple two-observer problem with measurement error only.
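
    A minimal numpy sketch of the idea, not the author's exact construction: run a weighted batch least squares fit, then form an empirical state error covariance from the realised residuals and compare it with the formal one. The linear model and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear batch problem: y = H x + noise, weighted least squares.
n_obs, n_state = 500, 2
H = np.column_stack([np.ones(n_obs), np.linspace(0.0, 1.0, n_obs)])
x_true = np.array([1.0, -0.5])
sigma = 0.1
y = H @ x_true + rng.normal(0.0, sigma, n_obs)

W = np.eye(n_obs) / sigma**2                 # assumed measurement weights
N = H.T @ W @ H                              # normal matrix
x_hat = np.linalg.solve(N, H.T @ W @ y)

P_theory = np.linalg.inv(N)                  # formal state error covariance

# Empirical covariance built from the realised weighted residuals (one
# flavour of the idea in the abstract, assumed for illustration):
r = y - H @ x_hat
G = np.linalg.solve(N, H.T @ W)              # gain mapping residuals to state
P_emp = G @ np.diag(r**2) @ G.T
```

    When the assumed weights match the true noise, the empirical and formal covariances agree to within sampling scatter; when unmodelled errors are present, the empirical matrix picks them up while the formal one does not.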

  8. Hidden Covariation Detection Produces Faster, Not Slower, Social Judgments

    ERIC Educational Resources Information Center

    Barker, Lynne A.; Andrade, Jackie

    2006-01-01

    In P. Lewicki's (1986b) demonstration of hidden covariation detection (HCD), responses of participants were slower to faces that corresponded with a covariation encountered previously than to faces with novel covariations. This slowing contrasts with the typical finding that priming leads to faster responding and suggests that HCD is a unique type…

  9. The Effect of Missing Data Handling Methods on Goodness of Fit Indices in Confirmatory Factor Analysis

    ERIC Educational Resources Information Center

    Köse, Alper

    2014-01-01

    The primary objective of this study was to examine the effect of missing data on goodness of fit statistics in confirmatory factor analysis (CFA). For this aim, four missing data handling methods; listwise deletion, full information maximum likelihood, regression imputation and expectation maximization (EM) imputation were examined in terms of…

  10. Handling missing values in the MDS-UPDRS.

    PubMed

    Goetz, Christopher G; Luo, Sheng; Wang, Lu; Tilley, Barbara C; LaPelle, Nancy R; Stebbins, Glenn T

    2015-10-01

    This study was undertaken to define the number of missing values permissible to render valid total scores for each Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) part. To handle missing values, imputation strategies serve as guidelines to reject an incomplete rating or create a surrogate score. We tested a rigorous, scale-specific, data-based approach to handling missing values for the MDS-UPDRS. From two large MDS-UPDRS datasets, we sequentially deleted item scores, either consistently (same items) or randomly (different items) across all subjects. Lin's Concordance Correlation Coefficient (CCC) compared scores calculated without missing values with prorated scores based on sequentially increasing missing values. The maximal number of missing values retaining a CCC greater than 0.95 determined the threshold for rendering a valid prorated score. A second confirmatory sample was selected from the MDS-UPDRS international translation program. To provide valid part scores applicable across all Hoehn and Yahr (H&Y) stages when the same items are consistently missing, one missing item from Part I, one from Part II, three from Part III, but none from Part IV can be allowed. To provide valid part scores applicable across all H&Y stages when random item entries are missing, one missing item from Part I, two from Part II, seven from Part III, but none from Part IV can be allowed. All cutoff values were confirmed in the validation sample. These analyses are useful for constructing valid surrogate part scores for MDS-UPDRS when missing items fall within the identified threshold and give scientific justification for rejecting partially completed ratings that fall below the threshold. © 2015 International Parkinson and Movement Disorder Society.
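
    The prorating logic the abstract describes can be sketched as follows. The item counts per part are the published MDS-UPDRS part lengths, the missing-item limits are the random-missingness thresholds from the abstract, and the scale-up-by-observed-mean proration rule is an assumption.

```python
# Minimal sketch of prorating MDS-UPDRS part scores with missing items.
N_ITEMS = {"I": 13, "II": 13, "III": 33, "IV": 6}      # items per part
MAX_MISSING = {"I": 1, "II": 2, "III": 7, "IV": 0}     # random-missing limits

def prorated_part_score(part, item_scores):
    """Return a surrogate part score, or None if too many items are missing."""
    observed = [s for s in item_scores if s is not None]
    n_missing = N_ITEMS[part] - len(observed)
    if n_missing > MAX_MISSING[part]:
        return None  # reject the partially completed rating
    if n_missing == 0:
        return sum(observed)
    # Prorate: scale the observed total up to the full item count.
    return sum(observed) * N_ITEMS[part] / len(observed)
```

    For example, a Part II rating with one item missing and the remaining 12 items summing to 24 prorates to 26, while any missing Part IV item rejects the rating.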

  11. Bayes Factor Covariance Testing in Item Response Models.

    PubMed

    Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip

    2017-12-01

    Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted-inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Based on that, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.
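
    The compound symmetry structure and the Helmert-style diagonalisation mentioned in the abstract can be checked numerically; the number of items k and the variance components below are arbitrary choices for illustration.

```python
import numpy as np

# Compound symmetry: common variance on the diagonal, common covariance off it.
k = 5                       # number of items (assumption)
sigma2, tau2 = 1.0, 0.3     # residual and shared-component variances (assumed)
Sigma = sigma2 * np.eye(k) + tau2 * np.ones((k, k))

# An orthogonal matrix whose first column is proportional to the ones vector
# (a Helmert-like contrast basis) diagonalises this structure.
M = np.column_stack([np.ones(k), np.eye(k)[:, : k - 1]])
H, _ = np.linalg.qr(M)
D = H.T @ Sigma @ H
```

    The transformed matrix is diagonal with one "common" eigenvalue sigma2 + k * tau2 and k - 1 repeated eigenvalues sigma2, which is what makes the closed-form posterior for the covariance components possible.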

  12. Eddy Covariance Method: Overview of General Guidelines and Conventional Workflow

    NASA Astrophysics Data System (ADS)

    Burba, G. G.; Anderson, D. J.; Amen, J. L.

    2007-12-01

    received from new users of the Eddy Covariance method and relevant instrumentation, and employs non-technical language to be of practical use to those new to this field. Information is provided on theory of the method (including state of methodology, basic derivations, practical formulations, major assumptions and sources of errors, error treatment, and use in non- traditional terrains), practical workflow (e.g., experimental design, implementation, data processing, and quality control), alternative methods and applications, and the most frequently overlooked details of the measurements. References and access to an extended 141-page Eddy Covariance Guideline in three electronic formats are also provided.

  13. Alterations in Anatomical Covariance in the Prematurely Born.

    PubMed

    Scheinost, Dustin; Kwon, Soo Hyun; Lacadie, Cheryl; Vohr, Betty R; Schneider, Karen C; Papademetris, Xenophon; Constable, R Todd; Ment, Laura R

    2017-01-01

    Preterm (PT) birth results in long-term alterations in functional and structural connectivity, but the related changes in anatomical covariance are just beginning to be explored. To test the hypothesis that PT birth alters patterns of anatomical covariance, we investigated brain volumes of 25 PTs and 22 terms at young adulthood using magnetic resonance imaging. Using regional volumetrics, seed-based analyses, and whole brain graphs, we show that PT birth is associated with reduced volume in bilateral temporal and inferior frontal lobes, left caudate, left fusiform, and posterior cingulate for prematurely born subjects at young adulthood. Seed-based analyses demonstrate altered patterns of anatomical covariance for PTs compared with terms. PTs exhibit reduced covariance with R Brodmann area (BA) 47, Broca's area, and L BA 21, Wernicke's area, and white matter volume in the left prefrontal lobe, but increased covariance with R BA 47 and left cerebellum. Graph theory analyses demonstrate that measures of network complexity are significantly less robust in PTs compared with term controls. Volumes in regions showing group differences are significantly correlated with phonological awareness, the fundamental basis for reading acquisition, for the PTs. These data suggest both long-lasting and clinically significant alterations in the covariance in the PTs at young adulthood. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Massive data compression for parameter-dependent covariance matrices

    NASA Astrophysics Data System (ADS)

    Heavens, Alan F.; Sellentin, Elena; de Mijolla, Damien; Vianello, Alvise

    2017-12-01

    We show how the massive data compression algorithm MOPED can be used to reduce, by orders of magnitude, the number of simulated data sets which are required to estimate the covariance matrix required for the analysis of Gaussian-distributed data. This is relevant when the covariance matrix cannot be calculated directly. The compression is especially valuable when the covariance matrix varies with the model parameters. In this case, it may be prohibitively expensive to run enough simulations to estimate the full covariance matrix throughout the parameter space. This compression may be particularly valuable for the next generation of weak lensing surveys, such as those proposed for Euclid and the Large Synoptic Survey Telescope, for which the number of summary data (such as band power or shear correlation estimates) is very large, ∼10^4, due to the large number of tomographic redshift bins into which the data will be divided. In the pessimistic case where the covariance matrix is estimated separately for all points in a Markov Chain Monte Carlo analysis, this may require an unfeasible 10^9 simulations. We show here that MOPED can reduce this number by a factor of 1000, or a factor of ∼10^6 if some regularity in the covariance matrix is assumed, reducing the number of simulations required to a manageable 10^3, making an otherwise intractable analysis feasible.
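
    The core of the compression can be sketched in a few lines. The toy model below (linear mean, known identity data covariance, two parameters) is purely illustrative and omits MOPED's orthogonalisation of the weight vectors.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy Gaussian data: the mean depends linearly on two parameters (assumption).
n_data, n_sims = 300, 50
A = rng.normal(size=(n_data, 2))          # dmu/dtheta for the two parameters
C = np.eye(n_data)                        # true data covariance (known here)

# MOPED-style weight vectors: b_a proportional to C^{-1} dmu/dtheta_a
# (un-orthogonalised sketch of the real algorithm).
B = np.linalg.solve(C, A)                 # (n_data, 2)

# Compress each simulated data vector from n_data numbers down to 2.
sims = rng.multivariate_normal(np.zeros(n_data), C, size=n_sims)
compressed = sims @ B                     # (n_sims, 2)

# Only a 2x2 covariance now needs estimating, so far fewer sims suffice.
C_small = np.cov(compressed, rowvar=False)
```

    This is the source of the quoted gains: the number of quantities whose covariance must be simulated drops from the size of the data vector to the number of model parameters.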

  15. Design, implementation and reporting strategies to reduce the instance and impact of missing patient-reported outcome (PRO) data: a systematic review.

    PubMed

    Mercieca-Bebber, Rebecca; Palmer, Michael J; Brundage, Michael; Calvert, Melanie; Stockler, Martin R; King, Madeleine T

    2016-06-15

    Patient-reported outcomes (PROs) provide important information about the impact of treatment from the patients' perspective. However, missing PRO data may compromise the interpretability and value of the findings. We aimed to report: (1) a non-technical summary of problems caused by missing PRO data; and (2) a systematic review by collating strategies to: (A) minimise rates of missing PRO data, and (B) facilitate transparent interpretation and reporting of missing PRO data in clinical research. Our systematic review does not address statistical handling of missing PRO data. MEDLINE and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases (inception to 31 March 2015), and citing articles and reference lists from relevant sources. English articles providing recommendations for reducing missing PRO data rates, or strategies to facilitate transparent interpretation and reporting of missing PRO data were included. 2 reviewers independently screened articles against eligibility criteria. Discrepancies were resolved with the research team. Recommendations were extracted and coded according to framework synthesis. 117 sources (55% discussion papers, 26% original research) met the eligibility criteria. Design and methodological strategies for reducing rates of missing PRO data included: incorporating PRO-specific information into the protocol; carefully designing PRO assessment schedules and defining termination rules; minimising patient burden; appointing a PRO coordinator; PRO-specific training for staff; ensuring PRO studies are adequately resourced; and continuous quality assurance. Strategies for transparent interpretation and reporting of missing PRO data include utilising auxiliary data to inform analysis; transparently reporting baseline PRO scores, rates and reasons for missing data; and methods for handling missing PRO data. The instance of missing PRO data and its potential to bias clinical research can be minimised by implementing

  16. Bi-level Multi-Source Learning for Heterogeneous Block-wise Missing Data

    PubMed Central

    Xiang, Shuo; Yuan, Lei; Fan, Wei; Wang, Yalin; Thompson, Paul M.; Ye, Jieping

    2013-01-01

    Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer’s Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified “bi-level” learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches. PMID:23988272

  17. Bi-level multi-source learning for heterogeneous block-wise missing data.

    PubMed

    Xiang, Shuo; Yuan, Lei; Fan, Wei; Wang, Yalin; Thompson, Paul M; Ye, Jieping

    2014-11-15

    Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified "bi-level" learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches. © 2013 Elsevier Inc. All rights reserved.

  18. Hawking radiation, covariant boundary conditions, and vacuum states

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Banerjee, Rabin; Kulkarni, Shailesh

    2009-04-15

    The basic characteristics of the covariant chiral current and the covariant chiral energy-momentum tensor are obtained from a chiral effective action. These results are used to justify the covariant boundary condition used in recent approaches of computing the Hawking flux from chiral gauge and gravitational anomalies. We also discuss a connection of our results with the conventional calculation of nonchiral currents and stress tensors in different (Unruh, Hartle-Hawking and Boulware) states.

  19. Search Methods Used to Locate Missing Cats and Locations Where Missing Cats Are Found

    PubMed Central

    Huang, Liyan; Coradini, Marcia; Rand, Jacquie; Morton, John; Albrecht, Kat; Wasson, Brigid; Robertson, Danielle

    2018-01-01

    Simple Summary At least 15% of cat owners lose their pet in a five-year period and some are never found. This paper reports on data gathered from an online questionnaire that asked questions regarding search methods used to locate missing cats and locations where missing cats were found. The most important finding from this retrospective case series was that approximately one third of cats were recovered within 7 days. Secondly, a physical search increased the chances of finding cats alive and 75% of cats were found within a 500 m radius of their point of escape. Thirdly, those cats that were indoor-outdoor and allowed outside unsupervised traveled longer distances compared with indoor cats that were never allowed outside. Lastly, cats considered to be highly curious in nature were more likely to be found inside someone else’s house compared to other personality types. These findings suggest that a physical search within the first week of a cat going missing could be a useful strategy. In light of these findings, further research into this field may show whether programs such as shelter, neuter and return would improve the chances of owners searching and finding their missing cats as well as decreasing euthanasia rates in shelters. Abstract Missing pet cats are often not found by their owners, with many being euthanized at shelters. This study aimed to describe times that lost cats were missing for, search methods associated with their recovery, locations where found and distances travelled. A retrospective case series was conducted where self-selected participants whose cat had gone missing provided data in an online questionnaire. Of the 1210 study cats, only 61% were found within one year, with 34% recovered alive by the owner within 7 days. Few cats were found alive after 90 days. There was evidence that physical searching increased the chance of finding the cat alive (p = 0.073), and 75% of cats were found within 500 m of the point of escape. Up to 75% of

  20. Attrition Bias Related to Missing Outcome Data: A Longitudinal Simulation Study.

    PubMed

    Lewin, Antoine; Brondeel, Ruben; Benmarhnia, Tarik; Thomas, Frédérique; Chaix, Basile

    2018-01-01

    Most longitudinal studies do not address potential selection biases due to selective attrition. Using empirical data and simulating additional attrition, we investigated the effectiveness of common approaches to handle missing outcome data from attrition in the association between individual education level and change in body mass index (BMI). Using data from the two waves of the French RECORD Cohort Study (N = 7,172), we first examined how inverse probability weighting (IPW) and multiple imputation handled missing outcome data from attrition in the observed data (stage 1). Second, simulating additional missing data in BMI at follow-up under various missing-at-random scenarios, we quantified the impact of attrition and assessed how multiple imputation performed compared to complete case analysis and to a perfectly specified IPW model as a gold standard (stage 2). With the observed data in stage 1, we found an inverse association between individual education and change in BMI, with complete case analysis, as well as with IPW and multiple imputation. When we simulated additional attrition under a missing-at-random pattern (stage 2), the bias increased with the magnitude of selective attrition, and multiple imputation could not correct it. Our simulations revealed that selective attrition in the outcome heavily biased the association of interest. The present article contributes to raising awareness that for missing outcome data, multiple imputation does not do better than complete case analysis. More effort is thus needed during the design phase to understand attrition mechanisms by collecting information on the reasons for dropout.
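
    The IPW step that the study treats as a gold standard can be sketched under a hypothetical missing-at-random mechanism. All coefficients and the single-covariate dropout model below are assumptions; in the study, the selection model would include all predictors of attrition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Hypothetical cohort: education predicts both dropout and the outcome.
n = 5000
edu = rng.normal(0.0, 1.0, n)
bmi_change = -0.3 * edu + rng.normal(0.0, 1.0, n)

# Selective attrition: lower education -> more likely to drop out (MAR).
p_observed = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * edu)))
observed = rng.random(n) < p_observed

# Step 1: model the probability of remaining observed at follow-up.
clf = LogisticRegression().fit(edu.reshape(-1, 1), observed)
p_hat = clf.predict_proba(edu.reshape(-1, 1))[:, 1]

# Step 2: weight complete cases by 1 / P(observed) in the outcome model.
w = 1.0 / p_hat[observed]
X = np.column_stack([np.ones(observed.sum()), edu[observed]])
y = bmi_change[observed]
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
```

    With a correctly specified selection model, the weighted slope recovers the assumed education effect on BMI change despite the selective dropout.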

  1. Near miss and minor occupational injury: Does it share a common causal pathway with major injury?

    PubMed

    Alamgir, Hasanat; Yu, Shicheng; Gorman, Erin; Ngan, Karen; Guzman, Jaime

    2009-01-01

    An essential assumption of injury prevention programs is the common cause hypothesis that the causal pathways of near misses and minor injuries are similar to those of major injuries. The rates of near miss, minor injury and major injury of all reported incidents and musculoskeletal incidents (MSIs) were calculated for three health regions using information from a surveillance database and productive hours from payroll data. The relative distribution of individual causes and activities involved in near miss, minor injury and major injury were then compared. For all reported incidents, there were significant differences in the relative distribution of causes for near miss, minor, and major injury. However, the relative distribution of causes and activities involved in minor and major MSIs were similar. The top causes and activities involved were the same across near miss, minor, and major injury. Findings from this study support the use of near miss and minor injury data as potential outcome measures for injury prevention programs. (c) 2008 Wiley-Liss, Inc.

  2. Predictors of missed appointments in patients referred for congenital or pediatric cardiac magnetic resonance.

    PubMed

    Lu, Jimmy C; Lowery, Ray; Yu, Sunkyung; Ghadimi Mahani, Maryam; Agarwal, Prachi P; Dorfman, Adam L

    2017-07-01

    Congenital cardiac magnetic resonance is a limited resource because of scanner and physician availability. Missed appointments decrease scheduling efficiency, have financial implications and represent missed care opportunities. To characterize the rate of missed appointments and identify modifiable predictors. This single-center retrospective study included all patients with outpatient congenital or pediatric cardiac MR appointments from Jan. 1, 2014, through Dec. 31, 2015. We identified missed appointments (no-shows or same-day cancellations) from the electronic medical record. We obtained demographic and clinical factors from the medical record and assessed socioeconomic factors by U.S. Census block data by patient ZIP code. Statistically significant variables (P<0.05) were included in a multivariable analysis. Of 795 outpatients (median age 18.5 years, interquartile range 13.4-27.1 years) referred for congenital cardiac MR, a total of 91 patients (11.4%) missed appointments; 28 (3.5%) missed multiple appointments. A reason for the missed appointment could be identified in only 38 patients (42%), but of these, 28 (74%) were preventable or could have been identified prior to the appointment. In multivariable analysis, independent predictors of missed appointments were referral by a non-cardiologist (adjusted odds ratio [AOR] 5.8, P=0.0002), referral for research (AOR 3.6, P=0.01), having public insurance (AOR 2.1, P=0.004), and having scheduled cardiac MR from November to April (AOR 1.8, P=0.01). Demographic factors can identify patients at higher risk for missing appointments. These data may inform initiatives to limit missed appointments, such as targeted education of referring providers and patients. Further data are needed to evaluate the efficacy of potential interventions.

  3. HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.

    PubMed

    Fan, Jianqing; Liao, Yuan; Mincheva, Martina

    2011-01-01

    The variance-covariance matrix plays a central role in the inferential theories of high-dimensional factor models in finance and economics. Popular regularization methods that directly exploit sparsity are not applicable to many financial problems. Classical methods of estimating the covariance matrices are based on strict factor models, which assume independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming a sparse error covariance matrix, we allow the presence of cross-sectional correlation even after taking out common factors, and this enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on covariance matrix estimation based on the factor structure is then studied.
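The two-step idea described here — remove estimated common factors, then adaptively threshold the residual (idiosyncratic) covariance — can be sketched as follows (an illustrative approximation using principal components; `k` and `tau` are assumed tuning inputs, not values from the paper):

```python
import numpy as np

def factor_thresholded_cov(X, k=2, tau=0.1):
    """Sketch of a factor-based covariance estimate: take out the top-k
    principal components as common factors, then soft-threshold the
    off-diagonal of the residual covariance. Illustrative only, not the
    authors' exact estimator."""
    S = np.cov(X, rowvar=False)                 # p x p sample covariance
    vals, vecs = np.linalg.eigh(S)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:k]
    lam, V = vals[idx], vecs[:, idx]
    low_rank = (V * lam) @ V.T                  # factor part of covariance
    R = S - low_rank                            # idiosyncratic part
    # adaptive soft-threshold: scale by the entry's own standard deviations
    thr = tau * np.sqrt(np.outer(np.diag(R), np.diag(R)))
    R_thr = np.sign(R) * np.maximum(np.abs(R) - thr, 0.0)
    np.fill_diagonal(R_thr, np.diag(R))         # never threshold variances
    return low_rank + R_thr
```

The returned matrix keeps the sample variances on the diagonal while shrinking weak cross-sectional residual correlations toward zero.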

  4. Comparing the performance of geostatistical models with additional information from covariates for sewage plume characterization.

    PubMed

    Del Monego, Maurici; Ribeiro, Paulo Justiniano; Ramos, Patrícia

    2015-04-01

    In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign aiming to distinguish the effluent plume from the receiving waters and characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by the Matèrn models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges which have limited the kriging process. So, at least for these data sets, weighted least squares showed to be the most appropriate estimation method for variogram fitting. The kriged maps show clearly the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results obtained show some guidelines for sewage monitoring if a geostatistical analysis of the data is in mind. It is important to treat properly the existence of anomalous values and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.

  5. The role of the dentist in identifying missing and unidentified persons.

    PubMed

    Riley, Amber D

    2015-01-01

    The longer a person is missing, the more profound the need for dental records becomes. In 2013, there were >84,000 missing persons and >8,000 unidentified persons registered in the National Crime Information Center (NCIC) database. Tens of thousands of families are left without answers or closure, always maintaining hope that their relative will be located. Law enforcement needs the cooperation of organized dentistry to procure dental records, translate their findings, and upload them into the NCIC database for cross-matching with unidentified person records created by medical examiner and coroner departments across the United States and Canada.

  6. Modeling spatiotemporal covariance for magnetoencephalography or electroencephalography source analysis.

    PubMed

    Plis, Sergey M; George, J S; Jun, S C; Paré-Blagoev, J; Ranken, D M; Wood, C C; Schmidt, D M

    2007-01-01

    We propose a new model to approximate spatiotemporal noise covariance for use in neural electromagnetic source analysis, which better captures temporal variability in background activity. As with other existing formalisms, our model employs a Kronecker product of matrices representing temporal and spatial covariance. In our model, spatial components are allowed to have differing temporal covariances. Variability is represented as a series of Kronecker products of spatial component covariances and corresponding temporal covariances. Unlike previous attempts to model covariance through a sum of Kronecker products, our model is designed to have a computationally manageable inverse. Despite increased descriptive power, inversion of the model is fast, making it useful in source analysis. We have explored two versions of the model. One is estimated based on the assumption that spatial components of background noise have uncorrelated time courses. Another version, which gives a closer approximation, is based on the assumption that time courses are statistically independent. The accuracy of the structural approximation is compared to that of an existing model based on a single Kronecker product, using both the Frobenius norm of the difference between the spatiotemporal sample covariance and the model, and scatter plots. The performance of our models and of previous ones is compared in source analysis of a large number of single-dipole problems with simulated time courses and with background from authentic magnetoencephalography data.
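The model class described — a sum of Kronecker products pairing each spatial component covariance with its own temporal covariance — can be written compactly (a structural sketch only; the paper's estimation procedure and fast inverse are not reproduced here):

```python
import numpy as np

def spatiotemporal_cov(spatial_covs, temporal_covs):
    """Noise covariance as a sum of Kronecker products: each spatial
    component covariance (s x s) is paired with its own temporal
    covariance (t x t), yielding an (s*t) x (s*t) matrix. Sketch of the
    model class only, not the paper's estimator."""
    return sum(np.kron(S, T) for S, T in zip(spatial_covs, temporal_covs))
```

With a single term, this reduces to the conventional single-Kronecker-product model (one spatial covariance times one temporal covariance); additional terms let components carry differing temporal structure.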

  7. Continuous Covariate Imbalance and Conditional Power for Clinical Trial Interim Analyses

    PubMed Central

    Ciolino, Jody D.; Martin, Renee' H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    Oftentimes valid statistical analyses for clinical trials involve adjustment for known influential covariates, regardless of imbalance observed in these covariates at baseline across treatment groups. Thus, it must be the case that valid interim analyses also properly adjust for these covariates. There are situations, however, in which covariate adjustment is not possible, not planned, or simply carries less merit as it makes inferences less generalizable and less intuitive. In this case, covariate imbalance between treatment groups can have a substantial effect on both interim and final primary outcome analyses. This paper illustrates the effect of influential continuous baseline covariate imbalance on unadjusted conditional power (CP), and thus, on trial decisions based on futility stopping bounds. The robustness of the relationship is illustrated for normal, skewed, and bimodal continuous baseline covariates that are related to a normally distributed primary outcome. Results suggest that unadjusted CP calculations in the presence of influential covariate imbalance require careful interpretation and evaluation. PMID:24607294
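Unadjusted conditional power for a normal endpoint is commonly computed with the Brownian-motion formulation sketched below (a generic textbook form, not the paper's exact simulation setup; `theta` is the assumed drift for the final z-statistic, e.g. `z_t / sqrt(t)` under the current trend):

```python
from math import sqrt
from statistics import NormalDist

def conditional_power(z_t, t, theta, alpha=0.025):
    """Conditional power at information fraction t, given interim
    z-value z_t, assumed drift theta (expected final z-statistic), and
    one-sided significance level alpha. Generic sketch: the remaining
    increment B(1) - B(t) is N(theta * (1 - t), 1 - t)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    num = z_alpha - z_t * sqrt(t) - theta * (1 - t)
    return 1 - nd.cdf(num / sqrt(1 - t))
```

Because an imbalanced influential covariate shifts the interim z-value itself, any bias in `z_t` propagates directly into this calculation, which is the mechanism the abstract warns about for futility decisions.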

  8. Remediating Non-Positive Definite State Covariances for Collision Probability Estimation

    NASA Technical Reports Server (NTRS)

    Hall, Doyle T.; Hejduk, Matthew D.; Johnson, Lauren C.

    2017-01-01

    The NASA Conjunction Assessment Risk Analysis team estimates the probability of collision (Pc) for a set of Earth-orbiting satellites. The Pc estimation software processes satellite position+velocity states and their associated covariance matrices. On occasion, the software encounters non-positive definite (NPD) state covariances, which can adversely affect or prevent the Pc estimation process. Interpolation inaccuracies appear to account for the majority of such covariances, although other mechanisms contribute as well. This paper investigates the origin of NPD state covariance matrices, three different methods for remediating these covariances when and if necessary, and the associated effects on the Pc estimation process.
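One standard remediation for an NPD covariance — symmetrize, then clip negative eigenvalues — can be sketched as follows (a generic method shown for illustration; the paper compares three remediation approaches, which are not necessarily this one):

```python
import numpy as np

def clip_to_psd(cov, floor=0.0):
    """Eigenvalue clipping: symmetrize the matrix, raise any eigenvalue
    below `floor` to `floor`, and reassemble. Generic sketch of a common
    NPD remediation, not an operational CARA method."""
    sym = 0.5 * (cov + cov.T)            # enforce exact symmetry first
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * np.maximum(vals, floor)) @ vecs.T
```

The result is the nearest-in-spirit positive semidefinite matrix obtained by discarding the negative spectral components; a small positive `floor` yields strict positive definiteness.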

  9. Missing Aircraft Crash Sites and Spatial Relationships to the Last Radar Fix.

    PubMed

    Koester, Robert J; Greatbatch, Ian

    2016-02-01

    Few studies have examined the spatial characteristics of missing aircraft in actual distress. No previous studies have looked at the distance from the last radar plot to the crash site. The purpose of this study was to characterize this distance and then identify environmental and flight characteristics that might be used to predict the spatial relationship and, therefore, aid search and rescue planners. Detailed records were obtained from the U.S. Air Force Rescue Coordination Center for missing aircraft in distress from 2002 to 2008. These data were combined with information from the National Transportation Safety Board (NTSB) Accident Database. The spatial relationship between the last radar plot and crash site was then determined using GIS analysis. A total of 260 missing aircraft incidents involving 509 people were examined, of which 216 (83%) contained radar information. Among the missing aircraft the mortality rate was 89%; most occurred in mountainous terrain (57%); Part 91 flights accounted for 95% of the incidents; and 50% of the aircraft were found within 0.8 nmi of the last radar plot. Flight characteristics, descent rate, icing conditions, and instrument flight rule vs. visual flight rule flight could be used to predict spatial characteristics. In most circumstances, the last radar position is an excellent predictor of the crash site. However, 5% of aircraft are found farther than 45.4 nmi. The flight and environmental conditions were identified and placed into an algorithm to aid search planners in determining how factors should be prioritized.

  10. Structural covariance and cortical reorganisation in schizophrenia: a MRI-based morphometric study.

    PubMed

    Palaniyappan, Lena; Hodgson, Olha; Balain, Vijender; Iwabuchi, Sarina; Gowland, Penny; Liddle, Peter

    2018-05-06

    In patients with schizophrenia, distributed abnormalities are observed in grey matter volume. A recent hypothesis posits that these distributed changes are indicative of a plastic reorganisation process occurring in response to a functional defect in neuronal information transmission. We investigated the structural covariance across various brain regions in early-stage schizophrenia to determine whether the observed patterns of volumetric loss conform to a coordinated pattern of structural reorganisation. Structural magnetic resonance imaging scans were obtained from 40 healthy adults and 41 patients with schizophrenia matched for age, gender and parental socioeconomic status. Volumes of grey matter tissue were estimated at the regional level across 90 atlas-based parcellations. Group-level structural covariance was studied using a graph theoretical framework. Patients had distributed reduction in grey matter volume, with a high degree of localised covariance (clustering) compared with controls. Patients with schizophrenia had reduced centrality of the anterior cingulate and insula but increased centrality of the fusiform cortex, compared with controls. Simulating targeted removal of highly central nodes resulted in significant loss of the overall covariance patterns in patients compared with controls. Regional volumetric deficits in schizophrenia are not a result of random, mutually independent processes. Our observations support the occurrence of a spatially interconnected reorganisation with the systematic de-escalation of conventional 'hub' regions. This raises the question of whether the morphological architecture in schizophrenia is primed for compensatory functions, albeit with a high risk of inefficiency.

  11. Frame covariant nonminimal multifield inflation

    NASA Astrophysics Data System (ADS)

    Karamitsos, Sotirios; Pilaftsis, Apostolos

    2018-02-01

    We introduce a frame-covariant formalism for inflation of scalar-curvature theories by adopting a differential geometric approach which treats the scalar fields as coordinates living on a field-space manifold. This ensures that our description of inflation is both conformally and reparameterization covariant. Our formulation gives rise to extensions of the usual Hubble and potential slow-roll parameters to generalized fully frame-covariant forms, which allow us to provide manifestly frame-invariant predictions for cosmological observables, such as the tensor-to-scalar ratio r, the spectral indices nR and nT, their runnings αR and αT, the non-Gaussianity parameter fNL, and the isocurvature fraction βiso. We examine the role of the field space curvature in the generation and transfer of isocurvature modes, and we investigate the effect of boundary conditions for the scalar fields at the end of inflation on the observable inflationary quantities. We explore the stability of the trajectories with respect to the boundary conditions by using a suitable sensitivity parameter. To illustrate our approach, we first analyze a simple minimal two-field scenario before studying a more realistic nonminimal model inspired by Higgs inflation. We find that isocurvature effects are greatly enhanced in the latter scenario and must be taken into account for certain values in the parameter space such that the model is properly normalized to the observed scalar power spectrum PR. Finally, we outline how our frame-covariant approach may be extended beyond the tree-level approximation through the Vilkovisky-De Witt formalism, which we generalize to take into account conformal transformations, thereby leading to a fully frame-invariant effective action at the one-loop level.

  12. 40 CFR 98.245 - Procedures for estimating missing data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 21 2014-07-01 2014-07-01 false Procedures for estimating missing data... estimating missing data. For missing feedstock and product flow rates, use the same procedures as for missing... contents and missing molecular weights for fuels as specified in § 98.35(b)(1). For missing flare data...

  13. DISSCO: direct imputation of summary statistics allowing covariates.

    PubMed

    Xu, Zheng; Duan, Qing; Yan, Song; Chen, Wei; Li, Mingyao; Lange, Ethan; Li, Yun

    2015-08-01

    Imputation of individual level genotypes at untyped markers using an external reference panel of genotyped or sequenced individuals has become standard practice in genetic association studies. Direct imputation of summary statistics can also be valuable, for example in meta-analyses where individual level genotype data are not available. Two methods (DIST and ImpG-Summary/LD), that assume a multivariate Gaussian distribution for the association summary statistics, have been proposed for imputing association summary statistics. However, both methods assume that the correlations between association summary statistics are the same as the correlations between the corresponding genotypes. This assumption can be violated in the presence of confounding covariates. We analytically show that in the absence of covariates, correlation among association summary statistics is indeed the same as that among the corresponding genotypes, thus serving as a theoretical justification for the recently proposed methods. We continue to prove that in the presence of covariates, correlation among association summary statistics becomes the partial correlation of the corresponding genotypes controlling for covariates. We therefore develop direct imputation of summary statistics allowing covariates (DISSCO). We consider two real-life scenarios where the correlation and partial correlation likely make practical difference: (i) association studies in admixed populations; (ii) association studies in presence of other confounding covariate(s). Application of DISSCO to real datasets under both scenarios shows at least comparable, if not better, performance compared with existing correlation-based methods, particularly for lower frequency variants. For example, DISSCO can reduce the absolute deviation from the truth by 3.9-15.2% for variants with minor allele frequency <5%. © The Author 2015. Published by Oxford University Press. All rights reserved. 
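The multivariate-Gaussian step shared by DIST, ImpG and DISSCO — predicting untyped z-scores as a linear combination of typed ones — can be sketched as below; under DISSCO the correlation inputs would be partial correlations controlling for covariates (hypothetical inputs; the ridge term `eps` is a common numerical stabilizer, not the published tuning):

```python
import numpy as np

def impute_z(z_typed, R_tt, R_ut, eps=1e-3):
    """Impute association z-scores at untyped markers from typed ones,
    assuming jointly Gaussian summary statistics whose correlation equals
    the genotype correlation (or, with covariates, the partial
    correlation). Sketch of the shared DIST/ImpG/DISSCO idea, not the
    published implementation.
    R_tt: typed-typed correlation matrix; R_ut: untyped-typed
    correlations; eps: assumed ridge regularizer for stability."""
    R_reg = R_tt + eps * np.eye(R_tt.shape[0])
    return R_ut @ np.linalg.solve(R_reg, z_typed)
```

Swapping `R_tt`/`R_ut` from plain genotype correlations to partial correlations given confounders is precisely the change the abstract derives for the covariate-adjusted setting.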

  14. Real-time probabilistic covariance tracking with efficient model update.

    PubMed

    Wu, Yi; Cheng, Jian; Wang, Jinqiao; Lu, Hanqing; Wang, Jun; Ling, Haibin; Blasch, Erik; Bai, Li

    2012-05-01

    The recently proposed covariance region descriptor has been proven robust and versatile for a modest computational cost. The covariance matrix enables efficient fusion of different types of features, where the spatial and statistical properties, as well as their correlation, are characterized. The similarity between two covariance descriptors is measured on Riemannian manifolds. Based on the same metric but with a probabilistic framework, we propose a novel tracking approach on Riemannian manifolds with a novel incremental covariance tensor learning (ICTL). To address the appearance variations, ICTL incrementally learns a low-dimensional covariance tensor representation and efficiently adapts online to appearance changes of the target with only O(1) computational complexity, resulting in a real-time performance. The covariance-based representation and the ICTL are then combined with the particle filter framework to allow better handling of background clutter, as well as the temporary occlusions. We test the proposed probabilistic ICTL tracker on numerous benchmark sequences involving different types of challenges including occlusions and variations in illumination, scale, and pose. The proposed approach demonstrates excellent real-time performance, both qualitatively and quantitatively, in comparison with several previously proposed trackers.
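A region covariance descriptor of the kind this tracker builds on can be computed from simple per-pixel features (a minimal sketch with an assumed feature set of position, intensity, and gradient magnitudes; the paper's feature choice, Riemannian metric, and ICTL update are not reproduced):

```python
import numpy as np

def region_covariance(patch):
    """Region covariance descriptor sketch: summarize a grayscale patch
    by the covariance of per-pixel feature vectors
    (x, y, intensity, |Ix|, |Iy|). Feature set is illustrative."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]                  # pixel coordinates
    Iy, Ix = np.gradient(patch.astype(float))    # image gradients
    F = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                  np.abs(Ix).ravel(), np.abs(Iy).ravel()])
    return np.cov(F)                             # 5 x 5 descriptor
```

The appeal noted in the abstract is visible here: the descriptor is low-dimensional (5x5 regardless of patch size) and fuses heterogeneous features along with their correlations.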

  15. A three domain covariance framework for EEG/MEG data.

    PubMed

    Roś, Beata P; Bijma, Fetsje; de Gunst, Mathisca C M; de Munck, Jan C

    2015-10-01

    In this paper we introduce a covariance framework for the analysis of single subject EEG and MEG data that takes into account observed temporal stationarity on small time scales and trial-to-trial variations. We formulate a model for the covariance matrix, which is a Kronecker product of three components that correspond to space, time and epochs/trials, and consider maximum likelihood estimation of the unknown parameter values. An iterative algorithm that finds approximations of the maximum likelihood estimates is proposed. Our covariance model is applicable in a variety of cases where spontaneous EEG or MEG acts as source of noise and realistic noise covariance estimates are needed, such as in evoked activity studies, or where the properties of spontaneous EEG or MEG are themselves the topic of interest, like in combined EEG-fMRI experiments in which the correlation between EEG and fMRI signals is investigated. We use a simulation study to assess the performance of the estimator and investigate the influence of different assumptions about the covariance factors on the estimated covariance matrix and on its components. We apply our method to real EEG and MEG data sets. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS

    PubMed Central

    Fan, Jianqing; Liao, Yuan; Mincheva, Martina

    2012-01-01

    The variance-covariance matrix plays a central role in the inferential theories of high-dimensional factor models in finance and economics. Popular regularization methods that directly exploit sparsity are not applicable to many financial problems. Classical methods of estimating the covariance matrices are based on strict factor models, which assume independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming a sparse error covariance matrix, we allow the presence of cross-sectional correlation even after taking out common factors, and this enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on covariance matrix estimation based on the factor structure is then studied. PMID:22661790

  17. Multiple feature fusion via covariance matrix for visual tracking

    NASA Astrophysics Data System (ADS)

    Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui

    2018-04-01

    Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on the covariance matrix is proposed to improve the robustness of the tracking algorithm. In the framework of a quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of the region covariance descriptor, the fast convergence speed and strong global optimization ability of the quantum genetic algorithm, and the fast computation of the fast covariance intersection algorithm are used to improve the computational efficiency of the fusion, matching, and updating process, so that the algorithm achieves fast and effective multi-feature fusion tracking. The experiments show that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference from occlusion, rotation, deformation, motion blur and so on.

  18. Structural Effects of Network Sampling Coverage I: Nodes Missing at Random.

    PubMed

    Smith, Jeffrey A; Moody, James

    2013-10-01

    Network measures assume a census of a well-bounded population. This level of coverage is rarely achieved in practice, however, and we have only limited information on the robustness of network measures to incomplete coverage. This paper examines the effect of node-level missingness on 4 classes of network measures: centrality, centralization, topology and homophily across a diverse sample of 12 empirical networks. We use a Monte Carlo simulation process to generate data with known levels of missingness and compare the resulting network scores to their known starting values. As with past studies (Borgatti et al 2006; Kossinets 2006), we find that measurement bias generally increases with more missing data. The exact rate and nature of this increase, however, varies systematically across network measures. For example, betweenness and Bonacich centralization are quite sensitive to missing data while closeness and in-degree are robust. Similarly, while the tau statistic and distance are difficult to capture with missing data, transitivity shows little bias even with very high levels of missingness. The results are also clearly dependent on the features of the network. Larger, more centralized networks are generally more robust to missing data, but this is especially true for centrality and centralization measures. More cohesive networks are robust to missing data when measuring topological features but not when measuring centralization. Overall, the results suggest that missing data may have quite large or quite small effects on network measurement, depending on the type of network and the question being posed.
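The simulation design — delete nodes at random, recompute a measure, and compare with the full-network value — can be sketched for one measure, transitivity (adjacency-matrix formulation; `frac_missing` and `reps` are illustrative parameters, not the study's settings):

```python
import numpy as np

def transitivity(A):
    """Global clustering coefficient from a 0/1 adjacency matrix:
    closed length-2 paths (trace of A^3) over all length-2 paths."""
    A = np.asarray(A, dtype=float)
    closed = np.trace(A @ A @ A)     # counts each triangle 6 times
    A2 = A @ A
    paths = A2.sum() - np.trace(A2)  # ordered length-2 paths, i != j
    return closed / paths if paths else 0.0

def missingness_bias(A, frac_missing, reps=200, seed=0):
    """Monte Carlo sketch of the study design: drop a random fraction of
    nodes, recompute the measure, and return the mean bias."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    true_val = transitivity(A)
    keep = int(round(n * (1 - frac_missing)))
    vals = []
    for _ in range(reps):
        idx = rng.choice(n, size=keep, replace=False)
        vals.append(transitivity(A[np.ix_(idx, idx)]))
    return float(np.mean(vals)) - true_val
```

Repeating this over a grid of missingness levels and a battery of measures reproduces the paper's basic comparison of measurement bias against known starting values.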

  19. True covariance simulation of the EUVE update filter

    NASA Technical Reports Server (NTRS)

    Bar-Itzhack, Itzhack Y.; Harman, R. R.

    1989-01-01

    A covariance analysis of the performance and sensitivity of the attitude determination Extended Kalman Filter (EKF) used by the On Board Computer (OBC) of the Extreme Ultraviolet Explorer (EUVE) spacecraft is presented. The linearized dynamics and measurement equations of the error states are derived; these constitute the truth model describing the real behavior of the systems involved. The design model used by the OBC EKF is then obtained by reducing the order of the truth model. The covariance matrix of the EKF which uses the reduced-order model is not the correct covariance of the EKF estimation error. A true covariance analysis has to be carried out in order to evaluate the correct accuracy of the OBC-generated estimates. The results of such an analysis are presented, indicating both the performance and the sensitivity of the OBC EKF.

  20. Integrated approach to estimate the ocean's time variable dynamic topography including its covariance matrix

    NASA Astrophysics Data System (ADS)

    Müller, Silvia; Brockmann, Jan Martin; Schuh, Wolf-Dieter

    2015-04-01

    The ocean's dynamic topography as the difference between the sea surface and the geoid reflects many characteristics of the general ocean circulation. Consequently, it provides valuable information for evaluating or tuning ocean circulation models. The sea surface is directly observed by satellite radar altimetry while the geoid cannot be observed directly. The satellite-based gravity field determination requires different measurement principles (satellite-to-satellite tracking (e.g. GRACE), satellite-gravity-gradiometry (GOCE)). In addition, hydrographic measurements (salinity, temperature and pressure; near-surface velocities) provide information on the dynamic topography. The observation types have different representations and spatial as well as temporal resolutions. Therefore, the determination of the dynamic topography is not straightforward. Furthermore, the integration of the dynamic topography into ocean circulation models requires not only the dynamic topography itself but also its inverse covariance matrix on the ocean model grid. We developed a rigorous combination method in which the dynamic topography is parameterized in space as well as in time. The altimetric sea surface heights are expressed as a sum of geoid heights represented in terms of spherical harmonics and the dynamic topography parameterized by a finite element method which can be directly related to the particular ocean model grid. Besides the difficult task of combining altimetry data with a gravity field model, a major aspect is the consistent combination of satellite data and in-situ observations. The particular characteristics and the signal content of the different observations must be adequately considered requiring the introduction of auxiliary parameters. Within our model the individual observation groups are combined in terms of normal equations considering their full covariance information; i.e. a rigorous variance/covariance propagation from the original measurements to the

  1. Missing binary data extraction challenges from Cochrane reviews in mental health and Campbell reviews with implications for empirical research.

    PubMed

    Spineli, Loukia M

    2017-12-01

    To report challenges encountered during the extraction process from Cochrane reviews in mental health and Campbell reviews and to indicate their implications for the empirical performance of different methods to handle missingness. We used a collection of meta-analyses on binary outcomes collated from a previous work on missing outcome data. To evaluate the accuracy of their extraction, we developed specific criteria pertaining to the reporting of missing outcome data in systematic reviews. Using the most popular methods to handle missing binary outcome data, we investigated the implications of the accuracy of the extracted meta-analysis on the random-effects meta-analysis results. Of 113 meta-analyses from Cochrane reviews, 60 (53%) were judged as "unclearly" extracted (ie, no information on the outcome of completers but available information on how missing participants were handled) and 42 (37%) as "unacceptably" extracted (ie, no information on the outcome of completers as well as no information on how missing participants were handled). For the remaining meta-analyses, it was judged that data were "acceptably" extracted (ie, information on the completers' outcome was provided for all trials). Overall, "unclear" extraction overestimated the magnitude of the summary odds ratio and the between-study variance and additionally inflated the uncertainty of both meta-analytical parameters. The only eligible Campbell review was judged as "unclear." Depending on the extent of missingness, the reporting quality of the systematic reviews can greatly affect the accuracy of the extracted meta-analyses and, by extension, the empirical performance of different methods to handle missingness. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Strategies for Dealing with Missing Accelerometer Data.

    PubMed

    Stephens, Samantha; Beyene, Joseph; Tremblay, Mark S; Faulkner, Guy; Pullnayegum, Eleanor; Feldman, Brian M

    2018-05-01

    Missing data is a universal research problem that can affect studies examining the relationship between physical activity measured with accelerometers and health outcomes. Statistical techniques are available to deal with missing data; however, available techniques have not been synthesized. A scoping review was conducted to summarize the advantages and disadvantages of identified methods of dealing with missing data from accelerometers. Missing data poses a threat to the validity and interpretation of trials using physical activity data from accelerometry. Imputation using multiple imputation techniques is recommended to deal with missing data and improve the validity and interpretation of studies using accelerometry. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. Bayesian hierarchical model for large-scale covariance matrix estimation.

    PubMed

    Zhu, Dongxiao; Hero, Alfred O

    2007-12-01

    Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.

  4. Mind the Gap: The Prospects of Missing Data.

    PubMed

    McConnell, Meghan; Sherbino, Jonathan; Chan, Teresa M

    2016-12-01

    The increasing use of workplace-based assessments (WBAs) in competency-based medical education has led to large data sets that assess resident performance longitudinally. With large data sets, problems that arise from missing data are increasingly likely. The purpose of this study is to examine (1) whether data are missing at random across various WBAs, and (2) the relationship between resident performance and the proportion of missing data. During 2012-2013, a total of 844 WBAs of CanMEDS Roles were completed for 9 second-year emergency medicine residents. To identify whether missing data were randomly distributed across various WBAs, the total number of missing data points was calculated for each Role. To examine whether the amount of missing data was related to resident performance, 5 faculty members rank-ordered the residents based on performance. A median rank score was calculated for each resident and was correlated with the proportion of missing data. More data were missing for Health Advocate and Professional WBAs relative to other competencies (P < .001). Furthermore, resident rankings were not related to the proportion of missing data points (r = 0.29, P > .05). The results of the present study illustrate that some CanMEDS Roles are less likely to be assessed than others. At the same time, the amount of missing data did not correlate with resident performance, suggesting lower-performing residents are no more likely to have missing data than their higher-performing peers. This article discusses several approaches to dealing with missing data.

  5. Search Methods Used to Locate Missing Cats and Locations Where Missing Cats Are Found.

    PubMed

    Huang, Liyan; Coradini, Marcia; Rand, Jacquie; Morton, John; Albrecht, Kat; Wasson, Brigid; Robertson, Danielle

    2018-01-02

    Missing pet cats are often not found by their owners, with many being euthanized at shelters. This study aimed to describe how long lost cats were missing, the search methods associated with their recovery, the locations where they were found, and the distances they travelled. A retrospective case series was conducted in which self-selected participants whose cat had gone missing provided data in an online questionnaire. Of the 1210 study cats, only 61% were found within one year, with 34% recovered alive by the owner within 7 days. Few cats were found alive after 90 days. There was evidence that physical searching increased the chance of finding the cat alive (p = 0.073), and 75% of cats were found within 500 m of the point of escape. Up to 75% of cats with outdoor access travelled 1609 m, further than the distance travelled by indoor-only cats (137 m; p ≤ 0.001). Cats considered to be highly curious were more likely to be found inside someone else's house compared to other personality types. These findings suggest that thorough physical searching is a useful strategy, and that it should be conducted within the first week after cats go missing. They also support further investigation into whether shelter, neuter and return programs improve the chance of owners recovering missing cats and decrease the numbers of cats euthanized in shelters.

  6. A systematic review and development of a classification framework for factors associated with missing patient-reported outcome data.

    PubMed

    Palmer, Michael J; Mercieca-Bebber, Rebecca; King, Madeleine; Calvert, Melanie; Richardson, Harriet; Brundage, Michael

    2018-02-01

    Missing patient-reported outcome data can lead to biased results, to loss of power to detect between-treatment differences, and to research waste. Awareness of factors may help researchers reduce missing patient-reported outcome data through study design and trial processes. The aim was to construct a Classification Framework of factors associated with missing patient-reported outcome data in the context of comparative studies. The first step in this process was informed by a systematic review. Two databases (MEDLINE and CINAHL) were searched from inception to March 2015 for English articles. Inclusion criteria were (a) relevant to patient-reported outcomes, (b) discussed missing data or compliance in prospective medical studies, and (c) examined predictors or causes of missing data, including reasons identified in actual trial datasets and reported on cover sheets. Two reviewers independently screened titles and abstracts. Discrepancies were discussed with the research team prior to finalizing the list of eligible papers. In completing the systematic review, four particular challenges to synthesizing the extracted information were identified. To address these challenges, operational principles were established by consensus to guide the development of the Classification Framework. A total of 6027 records were screened. In all, 100 papers were eligible and included in the review. Of these, 57% focused on cancer, 23% did not specify disease, and 20% reported for patients with a variety of non-cancer conditions. In total, 40% of the papers offered a descriptive analysis of possible factors associated with missing data, but some papers used other methods. In total, 663 excerpts of text (units), each describing a factor associated with missing patient-reported outcome data, were extracted verbatim. Redundant units were identified and sequestered. 
Similar units were grouped, and an iterative process of consensus among the investigators was used to reduce these units to a

  7. Information Gaps: The Missing Links to Learning.

    ERIC Educational Resources Information Center

    Adams, Carl R.

    Communication takes place when a speaker conveys new information to the listener. In second language teaching, information gaps motivate students to use and learn the target language in order to obtain information. The resulting interactive language use may develop affective bonds among the students. A variety of classroom techniques are available…

  8. Perturbative approach to covariance matrix of the matter power spectrum

    NASA Astrophysics Data System (ADS)

    Mohammed, Irshad; Seljak, Uroš; Vlah, Zvonimir

    2017-04-01

    We evaluate the covariance matrix of the matter power spectrum using perturbation theory up to dominant terms at 1-loop order and compare it to numerical simulations. We decompose the covariance matrix into the disconnected (Gaussian) part, trispectrum from the modes outside the survey (supersample variance) and trispectrum from the modes inside the survey, and show how the different components contribute to the overall covariance matrix. We find the agreement with the simulations is at a 10 per cent level up to k ~ 1 h Mpc⁻¹. We show that all the connected components are dominated by the large-scale modes (k < 0.1 h Mpc⁻¹), regardless of the value of the wave vectors k, k′ of the covariance matrix, suggesting that one must be careful in applying the jackknife or bootstrap methods to the covariance matrix. We perform an eigenmode decomposition of the connected part of the covariance matrix, showing that at higher k, it is dominated by a single eigenmode. The full covariance matrix can be approximated as the disconnected part only, with the connected part being treated as an external nuisance parameter with a known scale dependence, and a known prior on its variance for a given survey volume. Finally, we provide a prescription for how to evaluate the covariance matrix from small box simulations without the need to simulate large volumes.

  9. Treatment of Missing Data in Workforce Education Research

    ERIC Educational Resources Information Center

    Gemici, Sinan; Rojewski, Jay W.; Lee, In Heok

    2012-01-01

    Most quantitative analyses in workforce education are affected by missing data. Traditional approaches to remedy missing data problems often result in reduced statistical power and biased parameter estimates due to systematic differences between missing and observed values. This article examines the treatment of missing data in pertinent…

  10. Interspecific analysis of covariance structure in the masticatory apparatus of galagos.

    PubMed

    Vinyard, Christopher J

    2007-01-01

    The primate masticatory apparatus (MA) is a functionally integrated set of features, each of which performs important functions in biting, ingestive, and chewing behaviors. A comparison of morphological covariance structure among species for these MA features will help us to further understand the evolutionary history of this region. In this exploratory analysis, the covariance structure of the MA is compared across seven galago species to investigate 1) whether there are differences in covariance structure in this region, and 2) if so, how has this covariation changed with respect to size, MA form, diet, and/or phylogeny? Ten measurements of the MA functionally related to bite force production and load resistance were obtained from 218 adults of seven galago species. Correlation matrices were generated for these 10 dimensions and compared among species via matrix correlations and Mantel tests. Subsequently, pairwise covariance disparity in the MA was estimated as a measure of difference in covariance structure between species. Covariance disparity estimates were correlated with pairwise distances related to differences in body size, MA size and shape, genetic distance (based on cytochrome-b sequences) and percentage of dietary foods to determine whether one or more of these factors is linked to differences in covariance structure. Galagos differ in MA covariance structure. Body size appears to be a major factor correlated with differences in covariance structure among galagos. The largest galago species, Otolemur crassicaudatus, exhibits large differences in body mass and covariance structure relative to other galagos, and thus plays a primary role in creating this association. MA size and shape do not correlate with covariance structure when body mass is held constant. Diet also shows no association. 
Genetic distance is significantly negatively correlated with covariance disparity when body mass is held constant, but this correlation appears to be a function of

  11. Position Error Covariance Matrix Validation and Correction

    NASA Technical Reports Server (NTRS)

    Frisbee, Joe, Jr.

    2016-01-01

    In order to calculate operationally accurate collision probabilities, the position error covariance matrices predicted at times of closest approach must be sufficiently accurate representations of the position uncertainties. This presentation will discuss why the Gaussian distribution is a reasonable expectation for the position uncertainty and how this assumed distribution type is used in the validation and correction of position error covariance matrices.

  12. The missing impact craters on Venus

    NASA Technical Reports Server (NTRS)

    Speidel, D. H.

    1993-01-01

    The size-frequency pattern of the 842 impact craters on Venus measured to date can be well described (across four standard deviation units) as a single log normal distribution with a mean crater diameter of 14.5 km. This result was predicted in 1991 on examination of the initial Magellan analysis. If this observed distribution is close to the real distribution, the 'missing' 90 percent of the small craters and the 'anomalous' lack of surface splotches may thus be neither missing nor anomalous. I think that the missing craters and missing splotches can be satisfactorily explained by accepting that the observed distribution approximates the real one, that it is not craters that are missing but the impactors. What you see is what you got. The implication that Venus crossing impactors would have the same type of log normal distribution is consistent with recently described distribution for terrestrial craters and Earth crossing asteroids.

  13. Covariant fields on anti-de Sitter spacetimes

    NASA Astrophysics Data System (ADS)

    Cotăescu, Ion I.

    2018-02-01

    The covariant free fields of any spin on anti-de Sitter (AdS) spacetimes are studied, pointing out that these transform under isometries according to covariant representations (CRs) of the AdS isometry group, induced by those of the Lorentz group. Applying the method of ladder operators, it is shown that the CRs with unique spin are equivalent to discrete unitary irreducible representations (UIRs) of positive energy of the universal covering group of the isometry group. The action of the Casimir operators is studied, showing how the weights of these representations may depend on the mass and spin of the covariant field. The conclusion is that on AdS spacetime one cannot formulate a universal mass condition as in special relativity.

  14. Incomplete Early Childhood Immunization Series and Missing Fourth DTaP Immunizations; Missed Opportunities or Missed Visits?

    PubMed

    Robison, Steve G

    2013-01-01

    The successful completion of early childhood immunizations is a proxy for overall quality of early care. Immunization status is usually assessed by up-to-date (UTD) rates covering combined series of different immunizations. However, series UTD rates often hinge on a single missing immunization rather than reflecting the success of all immunizations. In the US, most series UTD rates are limited by missing fourth DTaP-containing immunizations (diphtheria/tetanus/pertussis) due at 15 to 18 months of age. Missing fourth DTaP immunizations are associated either with a lack of visits at 15 to 18 months of age or with visits at which no immunization was given. Typical immunization data, however, cannot distinguish between these two reasons. This study compared immunization records from the Oregon ALERT IIS with medical encounter records for two-year-olds in the Oregon Health Plan. Among those with 3 valid DTaPs by 9 months of age, 31.6% failed to receive a timely fourth DTaP; of those without a fourth DTaP, 42.1% did not have any provider visits from 15 through 18 months of age, while 57.9% had at least one provider visit. Those with a fourth DTaP averaged 2.45 encounters, while those with encounters but without a fourth DTaP averaged 2.23 encounters.

  15. Sophie’s story: writing missing journeys

    PubMed Central

    Stevenson, Olivia

    2014-01-01

    ‘Sophie’s story’ is a creative rendition of an interview narrative gathered in a research project on missing people. The paper explains why Sophie’s story was written and details the wider intention to provide new narrative resources for police officer training, families of missing people and returned missing people. We contextualize this cultural intervention with an argument about the transformative potential of writing trauma stories. It is suggested that trauma stories produce difficult and unknown affects, but ones that may provide new ways of talking about unspeakable events. Sophie’s story is thus presented as a hopeful cultural geography in process, and one that seeks to help rewrite existing social scripts about missing people. PMID:29710880

  16. Spectral measurements of Terrestrial Mars Analogues: support for the ExoMars - Ma_Miss instrument

    NASA Astrophysics Data System (ADS)

    De Angelis, S.; De Sanctis, M. C.; Ammannito, E.; Di Iorio, T.; Carli, C.; Frigeri, A.; Capria, M. T.; Federico, C.; Boccaccini, A.; Capaccioni, F.; Giardino, M.; Cerroni, P.; Palomba, E.; Piccioni, G.

    2013-09-01

    The Ma_Miss (Mars Multispectral Imager for Subsurface Studies) instrument onboard the ExoMars 2018 mission to Mars will investigate the Martian subsoil down to a depth of 2 meters [1]. Ma_Miss is a miniaturized spectrometer, completely integrated within the drilling system of the ExoMars Pasteur rover; it will acquire spectra in the range 0.4-2.2 μm from the excavated borehole wall. The spectroscopic investigation of the subsurface materials will provide valuable information about mineralogical, petrologic and geological processes, and will give insights into materials that have not been modified by surface processes such as erosion, weathering or oxidation. Spectroscopic measurements have been performed on Terrestrial Mars Analogues with the Ma_Miss laboratory model (breadboard). Moreover, spectroscopic investigation of different sets of Terrestrial Mars Analogues is being carried out with different laboratory setups, as a support for the ExoMars-Ma_Miss instrument.

  17. Tonic and phasic co-variation of peripheral arousal indices in infants

    PubMed Central

    Wass, S.V.; de Barbaro, K.; Clackson, K.

    2015-01-01

    Tonic and phasic differences in peripheral autonomic nervous system (ANS) indicators strongly predict differences in attention and emotion regulation in developmental populations. However, virtually all previous research has been based on individual ANS measures, which poses a variety of conceptual and methodological challenges to comparing results across studies. Here we recorded heart rate, electrodermal activity (EDA), pupil size, head movement velocity and peripheral accelerometry concurrently while a cohort of 37 typical 12-month-old infants completed a mixed assessment battery lasting approximately 20 min per participant. We analysed covariation of these autonomic indices in three ways: first, tonic (baseline) arousal; second, co-variation in spontaneous (phasic) changes during testing; third, phasic co-variation relative to an external stimulus event. We found that heart rate, head velocity and peripheral accelerometry showed strong positive co-variation across all three analyses. EDA showed no co-variation in tonic activity levels but did show phasic positive co-variation with other measures, which appeared limited to sections of high but not low general arousal. Tonic pupil size showed significant positive covariation, but phasic pupil changes were inconsistent. We conclude that: (i) there is high covariation between autonomic indices in infants, but that EDA may only be sensitive at extreme arousal levels, (ii) that tonic pupil size covaries with other indices, but does not show predicted patterns of phasic change and (iii) that motor activity appears to be a good proxy measure of ANS activity. The strongest patterns of covariation were observed using epoch durations of 40 s per epoch, although significant covariation between indices was also observed using shorter epochs (1 and 5 s). PMID:26316360

  18. 40 CFR 75.31 - Initial missing data procedures.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 16 2010-07-01 2010-07-01 false Initial missing data procedures. 75.31... (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.31 Initial missing data.... For each hour of missing SO2, Hg, or CO2 emissions concentration data (including CO2 data converted...

  19. 40 CFR 75.31 - Initial missing data procedures.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 16 2011-07-01 2011-07-01 false Initial missing data procedures. 75.31... (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.31 Initial missing data..., or O2 concentration data, and moisture data. For each hour of missing SO2 or CO2 emissions...

  20. 40 CFR 75.31 - Initial missing data procedures.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 17 2013-07-01 2013-07-01 false Initial missing data procedures. 75.31... (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.31 Initial missing data..., or O2 concentration data, and moisture data. For each hour of missing SO2 or CO2 emissions...

  1. 40 CFR 75.31 - Initial missing data procedures.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 17 2012-07-01 2012-07-01 false Initial missing data procedures. 75.31... (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.31 Initial missing data..., or O2 concentration data, and moisture data. For each hour of missing SO2 or CO2 emissions...

  2. 40 CFR 75.31 - Initial missing data procedures.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 17 2014-07-01 2014-07-01 false Initial missing data procedures. 75.31... (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.31 Initial missing data..., or O2 concentration data, and moisture data. For each hour of missing SO2 or CO2 emissions...

  3. Reconstruction of missing daily streamflow data using dynamic regression models

    NASA Astrophysics Data System (ADS)

    Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault

    2015-12-01

    River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even short gaps in this information can lead to markedly different analysis results. Reconstructing the missing portions of incomplete data sets is therefore an important, and challenging, step for environmental modelling, engineering, and research applications. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access only to daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average (ARIMA) models, called a dynamic regression model. This model uses the linear relationship with neighbouring, correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow series from the Durance river watershed showed that it yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
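A minimal sketch of the two-step dynamic regression idea, using synthetic data and an AR(1) fit on the residuals as a stand-in for the full ARIMA structure (station values, coefficients, and series length are all invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic discharge at a neighbor station and a correlated target.
n = 200
neighbor = 50 + 10 * np.sin(np.linspace(0, 8, n)) + rng.normal(0, 1, n)
noise = np.zeros(n)
for t in range(1, n):                 # AR(1) residual process
    noise[t] = 0.7 * noise[t - 1] + rng.normal(0, 0.5)
target = 5 + 1.2 * neighbor + noise

# Step 1: linear regression of the target on the neighbor station.
A = np.column_stack([np.ones(n), neighbor])
beta, *_ = np.linalg.lstsq(A, target, rcond=None)
resid = target - A @ beta

# Step 2: fit an AR(1) coefficient to the regression residuals.
phi = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)

# Reconstruct a "missing" day t from the regression prediction plus
# the AR-propagated residual from the previous day.
t = 100
estimate = A[t] @ beta + phi * resid[t - 1]
```

The residual adjustment is what distinguishes this from plain regression: serial correlation left over after the station-to-station fit is carried into the reconstruction.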

  4. Multivariate longitudinal data analysis with censored and intermittent missing responses.

    PubMed

    Lin, Tsung-I; Lachos, Victor H; Wang, Wan-Lun

    2018-05-08

    The multivariate linear mixed model (MLMM) has emerged as an important analytical tool for longitudinal data with multiple outcomes. However, the analysis of multivariate longitudinal data could be complicated by the presence of censored measurements because of a detection limit of the assay in combination with unavoidable missing values arising when subjects miss some of their scheduled visits intermittently. This paper presents a generalization of the MLMM approach, called the MLMM-CM, for a joint analysis of the multivariate longitudinal data with censored and intermittent missing responses. A computationally feasible expectation maximization-based procedure is developed to carry out maximum likelihood estimation within the MLMM-CM framework. Moreover, the asymptotic standard errors of fixed effects are explicitly obtained via the information-based method. We illustrate our methodology by using simulated data and a case study from an AIDS clinical trial. Experimental results reveal that the proposed method is able to provide more satisfactory performance as compared with the traditional MLMM approach. Copyright © 2018 John Wiley & Sons, Ltd.

  5. The impact of covariance misspecification in group-based trajectory models for longitudinal data with non-stationary covariance structure.

    PubMed

    Davies, Christopher E; Glonek, Gary Fv; Giles, Lynne C

    2017-08-01

    One purpose of a longitudinal study is to gain a better understanding of how an outcome of interest changes among a given population over time. In what follows, a trajectory will be taken to mean the series of measurements of the outcome variable for an individual. Group-based trajectory modelling methods seek to identify subgroups of trajectories within a population, such that trajectories that are grouped together are more similar to each other than to trajectories in distinct groups. Group-based trajectory models generally assume a certain structure in the covariances between measurements, for example conditional independence, homogeneous variance between groups or stationary variance over time. Violations of these assumptions could be expected to result in poor model performance. We used simulation to investigate the effect of covariance misspecification on misclassification of trajectories in commonly used models under a range of scenarios. To do this we defined a measure of performance relative to the ideal Bayesian correct classification rate. We found that the more complex models generally performed better over a range of scenarios. In particular, incorrectly specified covariance matrices could significantly bias the results but using models with a correct but more complicated than necessary covariance matrix incurred little cost.

  6. NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES

    PubMed Central

    He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.

    2017-01-01

    Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
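For reference, the univariate Kruskal-Wallis statistic that the MKW test generalizes can be computed directly. This sketch handles complete, tie-free data only; the paper's multivariate and missing-data extensions are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(5)

def kruskal_wallis(*groups):
    """Kruskal-Wallis H statistic for k groups (no tie correction)."""
    pooled = np.concatenate(groups)
    N = len(pooled)
    ranks = np.empty(N)
    ranks[np.argsort(pooled)] = np.arange(1, N + 1)  # ranks 1..N
    H, start = 0.0, 0
    for g in groups:
        r = ranks[start:start + len(g)]
        H += len(g) * (r.mean() - (N + 1) / 2) ** 2
        start += len(g)
    return 12.0 / (N * (N + 1)) * H

# Two groups whose locations differ by one standard deviation;
# H is referred to a chi-square distribution with k-1 df.
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(1.0, 1.0, 40)
H = kruskal_wallis(a, b)
```

Deleting partially observed cases before ranking, as the complete-case version requires, shrinks N and weakens this statistic; retaining those cases is the motivation for the proposed extension.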

  7. Part Marking and Identification Materials on MISSE

    NASA Technical Reports Server (NTRS)

    Finckenor, Miria M.; Roxby, Donald L.

    2008-01-01

    Many different spacecraft materials were flown as part of the Materials on International Space Station Experiment (MISSE), including several materials used in part marking and identification. The experiment contained Data Matrix symbols applied using laser bonding, vacuum arc vapor deposition, gas assisted laser etch, chemical etch, mechanical dot peening, laser shot peening, and laser induced surface improvement. The effects of ultraviolet radiation on nickel acetate seal versus hot water seal on sulfuric acid anodized aluminum are discussed. These samples were exposed on the International Space Station to the low Earth orbital environment of atomic oxygen, ultraviolet radiation, thermal cycling, and hard vacuum, though atomic oxygen exposure was very limited for some samples. Results from the one-year exposure on MISSE-3 and MISSE-4 are compared to those from MISSE-1 and MISSE-2, which were exposed for four years. Part marking and identification materials on the current MISSE-6 experiment are also discussed.

  8. Perturbative approach to covariance matrix of the matter power spectrum

    DOE PAGES

    Mohammed, Irshad; Seljak, Uros; Vlah, Zvonimir

    2016-12-14

    Here, we evaluate the covariance matrix of the matter power spectrum using perturbation theory up to dominant terms at 1-loop order and compare it to numerical simulations. We decompose the covariance matrix into the disconnected (Gaussian) part, trispectrum from the modes outside the survey (beat coupling or super-sample variance), and trispectrum from the modes inside the survey, and show how the different components contribute to the overall covariance matrix. We find the agreement with the simulations is at a 10% level up to k ~ 1 h Mpc⁻¹. We also show that all the connected components are dominated by the large-scale modes (k < 0.1 h Mpc⁻¹), regardless of the value of the wavevectors k, k′ of the covariance matrix, suggesting that one must be careful in applying the jackknife or bootstrap methods to the covariance matrix. We perform an eigenmode decomposition of the connected part of the covariance matrix, showing that at higher k it is dominated by a single eigenmode. Furthermore, the full covariance matrix can be approximated as the disconnected part only, with the connected part being treated as an external nuisance parameter with a known scale dependence, and a known prior on its variance for a given survey volume. Finally, we provide a prescription for how to evaluate the covariance matrix from small box simulations without the need to simulate large volumes.

  9. Missing value imputation: with application to handwriting data

    NASA Astrophysics Data System (ADS)

    Xu, Zhen; Srihari, Sargur N.

    2015-01-01

    Missing values make pattern analysis difficult, particularly when the available data are limited. In longitudinal research, missing values accumulate, thereby aggravating the problem. Here we consider how to deal with temporal data with missing values in handwriting analysis. In studying the development of individuality of handwriting, we encountered the fact that feature values are missing for several individuals at several time instances. Six algorithms, i.e., random imputation, mean imputation, most likely independent value imputation, and three methods based on Bayesian networks (static Bayesian network, parameter EM, and structural EM), are compared on children's handwriting data. We evaluate the accuracy and robustness of the algorithms under different ratios of missing data and missing values, and useful conclusions are given. Specifically, the static Bayesian network is used for our data, which contain around 5% missing values, providing adequate accuracy at low computational cost.
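Two of the simplest strategies compared above, mean imputation and random imputation, can be sketched as follows. The data are synthetic; only the roughly 5% missingness rate mirrors the study:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic ground truth for one handwriting feature, with ~5% of
# the values masked out as missing.
truth = rng.normal(10.0, 2.0, size=500)
mask = rng.random(500) < 0.05
observed = truth[~mask]

# Mean imputation: every missing value gets the observed mean.
mean_imp = np.full(mask.sum(), observed.mean())

# Random imputation: each missing value is a draw from the observed
# values (a crude hot-deck draw).
rand_imp = rng.choice(observed, size=mask.sum())

def rmse(est):
    """Error of the imputed values against the held-out truth."""
    return np.sqrt(np.mean((est - truth[mask]) ** 2))
```

Mean imputation minimizes pointwise squared error but collapses the variability of the imputed values; random imputation preserves the spread at the cost of higher pointwise error, which is why model-based methods such as Bayesian networks can do better than either.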

  10. Pu239 Cross-Section Variations Based on Experimental Uncertainties and Covariances

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sigeti, David Edward; Williams, Brian J.; Parsons, D. Kent

    2016-10-18

    Algorithms and software have been developed for producing variations in plutonium-239 neutron cross sections based on experimental uncertainties and covariances. The varied cross-section sets may be produced as random samples from the multivariate normal distribution defined by an experimental mean vector and covariance matrix, or they may be produced as Latin-Hypercube/Orthogonal-Array samples (based on the same means and covariances) for use in parametrized studies. The variations obey two classes of constraints that are obligatory for cross-section sets and which put related constraints on the mean vector and covariance matrix that determine the sampling. Because the experimental means and covariances do not obey some of these constraints to sufficient precision, imposing the constraints requires modifying the experimental mean vector and covariance matrix. Modification is done with an algorithm based on linear algebra that minimizes changes to the means and covariances while ensuring that the operations that impose the different constraints do not conflict with each other.
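In the unconstrained case, producing varied parameter sets reduces to sampling the multivariate normal defined by the experimental mean vector and covariance matrix. The numbers below are illustrative placeholders, not evaluated Pu-239 data, and the constraint-adjustment step described in the abstract is omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical means and covariance for three cross-section values
# (illustrative numbers only; the covariance must be positive definite).
mean = np.array([1.80, 1.75, 1.62])
cov = np.array([[0.010, 0.006, 0.002],
                [0.006, 0.012, 0.004],
                [0.002, 0.004, 0.009]])

# Draw random cross-section sets from the multivariate normal defined
# by the mean vector and covariance matrix.
samples = rng.multivariate_normal(mean, cov, size=10_000)

# Sanity check: the sample statistics should recover the inputs.
print(np.allclose(samples.mean(axis=0), mean, atol=0.05))  # True
```

For parametrized studies, the same means and covariances can instead drive a Latin-Hypercube design, which spreads a small number of samples more evenly over the distribution than independent random draws do.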

  11. Planned Missing Data Designs in Educational Psychology Research

    ERIC Educational Resources Information Center

    Rhemtulla, Mijke; Hancock, Gregory R.

    2016-01-01

    Although missing data are often viewed as a challenge for applied researchers, in fact missing data can be highly beneficial. Specifically, when the amount of missing data on specific variables is carefully controlled, a balance can be struck between statistical power and research costs. This article presents the issue of planned missing data by…

  12. Covariant information-density cutoff in curved space-time.

    PubMed

    Kempf, Achim

    2004-06-04

    In information theory, the link between continuous information and discrete information is established through well-known sampling theorems. Sampling theory explains, for example, how frequency-filtered music signals are reconstructible perfectly from discrete samples. In this Letter, sampling theory is generalized to pseudo-Riemannian manifolds. This provides a new set of mathematical tools for the study of space-time at the Planck scale: theories formulated on a differentiable space-time manifold can be equivalent to lattice theories. There is a close connection to generalized uncertainty relations which have appeared in string theory and other studies of quantum gravity.

  13. Missed Lesions at CT Colonography: Lessons Learned

    PubMed Central

    Pickhardt, Perry J.

    2017-01-01

    Misinterpretation at CT colonography (CTC) can result in either a colorectal lesion being missed (false negative) or a false-positive diagnosis. This review will largely focus on potential missed lesions – and ways to avoid such misses. The general causes of false-negative interpretation at CTC can be broadly characterized and grouped into discrete categories related to suboptimal study technique, specific lesion characteristics, anatomic location, and imaging artifacts. Overlapping causes further increase the likelihood of missing a clinically relevant lesion. In the end, if the technical factors of bowel preparation, colonic distention, and robust CTC software are adequately addressed on a consistent basis, and the reader is aware of all the potential pitfalls at CTC, important lesions will seldom be missed. PMID:22539045

  14. Eliciting Systematic Rule Use in Covariation Judgment [the Early Years].

    ERIC Educational Resources Information Center

    Shaklee, Harriet; Paszek, Donald

    Related research suggests that children may show some simple understanding of event covariations by the early elementary school years. The present experiments use a rule analysis methodology to investigate covariation judgments of children in this age range. In Experiment 1, children in second, third and fourth grade judged covariations on 12…

  15. Robust infrared targets tracking with covariance matrix representation

    NASA Astrophysics Data System (ADS)

    Cheng, Jian

    2009-07-01

    Robust infrared target tracking is an important and challenging research topic in many military and security applications, such as infrared imaging guidance, infrared reconnaissance, scene surveillance, etc. To effectively tackle the nonlinear and non-Gaussian state estimation problems, particle filtering is introduced to construct the theoretical framework of infrared target tracking. Under this framework, the observation probabilistic model is one of the main factors determining infrared target tracking performance. In order to improve the tracking performance, covariance matrices are introduced to represent infrared targets with multiple features. The observation probabilistic model can be constructed by computing the distance between the reference target's and the target samples' covariance matrices. Because the covariance matrix provides a natural tool for integrating multiple features, and is scale and illumination independent, target representation with covariance matrices offers strong discriminating ability and robustness. Two experimental results demonstrate that the proposed method is effective and robust for different infrared target tracking scenarios, such as sensor ego-motion and sea-clutter scenes.
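A hedged sketch of the covariance-matrix representation is given below. The pixel feature vector (coordinates, intensity, gradient magnitudes) and the generalized-eigenvalue (Förstner-style) distance are common choices in the region-covariance literature, not necessarily the exact ones used in this paper:

```python
import numpy as np

def region_covariance(patch):
    """Covariance-matrix descriptor of an image patch.

    Each pixel contributes a feature vector [x, y, intensity, |dI/dx|, |dI/dy|];
    the patch is summarized by the covariance matrix of these features.
    """
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = np.gradient(patch.astype(float))
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(dx).ravel(), np.abs(dy).ravel()], axis=1)
    return np.cov(feats, rowvar=False)

def covariance_distance(c1, c2):
    # Förstner-Moonen metric: sqrt of the sum of squared logs of the
    # generalized eigenvalues of the matrix pair (c1, c2).
    lam = np.linalg.eigvals(np.linalg.solve(c1, c2)).real
    return np.sqrt(np.sum(np.log(lam) ** 2))

rng = np.random.default_rng(1)
target_patch = rng.random((16, 16))   # stand-in for an infrared target region
C = region_covariance(target_patch)
```

The observation likelihood of a particle can then be made a decreasing function of the distance between its region covariance and the reference target's.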

  16. MISSE-8

    NASA Image and Video Library

    2013-05-24

    iss036e004042 (5/24/2013) --- View of Materials on International Space Station Experiment - 8 (MISSE-8), which is installed on the ExPRESS Logistics Carrier 2 (ELC-2), located on the S3 Truss Outboard Zenith site.

  17. Longitudinal data analysis with non-ignorable missing data.

    PubMed

    Tseng, Chi-hong; Elashoff, Robert; Li, Ning; Li, Gang

    2016-02-01

    A common problem in longitudinal data analysis is missing data. Two types of missing patterns are generally considered in the statistical literature: monotone and non-monotone missing data. Non-monotone missing data occur when study participants intermittently miss scheduled visits, while monotone missing data can arise from discontinued participation, loss to follow-up, and mortality. Although many novel statistical approaches have been developed to handle missing data in recent years, few methods are available to provide inferences that handle both types of missing data simultaneously. In this article, a latent random effects model is proposed to analyze longitudinal outcomes with both monotone and non-monotone missingness in the context of missing not at random. Another significant contribution of this article is a new computational algorithm for latent random effects models. To reduce the computational burden of the high-dimensional integration problem in latent random effects models, we develop a new computational algorithm that uses an adaptive quadrature approach in conjunction with a Taylor series approximation of the likelihood function to simplify the E-step computation in the expectation-maximization algorithm. A simulation study is performed, and data from the scleroderma lung study are used to demonstrate the effectiveness of this method. © The Author(s) 2012.

  18. Methods for Mediation Analysis with Missing Data

    ERIC Educational Resources Information Center

    Zhang, Zhiyong; Wang, Lijuan

    2013-01-01

    Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis including list wise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum…

  19. Active life expectancy from annual follow-up data with missing responses.

    PubMed

    Izmirlian, G; Brock, D; Ferrucci, L; Phillips, C

    2000-03-01

    Active life expectancy (ALE) at a given age is defined as the expected remaining years free of disability. In this study, three categories of health status are defined according to the ability to perform activities of daily living independently. Several studies have used increment-decrement life tables to estimate ALE, without error analysis, from only a baseline and one follow-up interview. The present work conducts an individual-level covariate analysis using a three-state Markov chain model for multiple follow-up data. Using a logistic link, the model estimates single-year transition probabilities among states of health, accounting for missing interviews. This approach has the advantages of smoothing the resulting estimates and of increased power from using all follow-ups. We compute ALE and total life expectancy from these estimated single-year transition probabilities. Variance estimates are computed using the delta method. Data from the Iowa Established Population for the Epidemiologic Study of the Elderly are used to test the effects of smoking on ALE for all 5-year age groups past 65 years, controlling for sex and education.
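The step from single-year transition probabilities to ALE can be sketched as follows. The three states and the age-dependent transition probabilities below are hypothetical stand-ins for the model's logistic-link estimates:

```python
import numpy as np

# States: 0 = active (no disability), 1 = disabled, 2 = dead.
# These single-year transition probabilities are purely illustrative.
def transition(age):
    q_dis = 0.02 + 0.002 * (age - 65)   # active -> disabled
    q_die = 0.01 + 0.003 * (age - 65)   # active -> dead
    r_die = 0.05 + 0.004 * (age - 65)   # disabled -> dead
    return np.array([
        [1 - q_dis - q_die, q_dis,          q_die],
        [0.05,              0.95 - r_die,   r_die],   # small recovery probability
        [0.0,               0.0,            1.0],
    ])

def life_expectancies(start_age=65, horizon=50):
    state = np.array([1.0, 0.0, 0.0])       # start active at start_age
    ale = tle = 0.0
    for k in range(horizon):
        state = state @ transition(start_age + k)
        ale += state[0]                     # expected years spent active
        tle += state[0] + state[1]          # expected years alive
    return ale, tle

ale, tle = life_expectancies()
```

ALE is the accumulated probability of being in the active state; total life expectancy additionally accumulates time spent disabled.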

  20. Bayesian sensitivity analysis methods to evaluate bias due to misclassification and missing data using informative priors and external validation data.

    PubMed

    Luta, George; Ford, Melissa B; Bondy, Melissa; Shields, Peter G; Stamey, James D

    2013-04-01

    Recent research suggests that the Bayesian paradigm may be useful for modeling biases in epidemiological studies, such as those due to misclassification and missing data. We used Bayesian methods to perform sensitivity analyses for assessing the robustness of study findings to the potential effect of these two important sources of bias. We used data from a study of the joint associations of radiotherapy and smoking with primary lung cancer among breast cancer survivors. We used Bayesian methods to provide an operational way to combine both validation data and expert opinion to account for misclassification of the two risk factors and missing data. For comparative purposes we considered a "full model" that allowed for both misclassification and missing data, along with alternative models that considered only misclassification or missing data, and the naïve model that ignored both sources of bias. We identified noticeable differences between the four models with respect to the posterior distributions of the odds ratios that described the joint associations of radiotherapy and smoking with primary lung cancer. Despite those differences we found that the general conclusions regarding the pattern of associations were the same regardless of the model used. Overall, our results indicate a nonsignificantly decreased lung cancer risk due to radiotherapy among nonsmokers, and a mildly increased risk among smokers. We described easy-to-implement Bayesian methods to perform sensitivity analyses for assessing the robustness of study findings to misclassification and missing data. Copyright © 2012 Elsevier Ltd. All rights reserved.
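One simple building block of such a misclassification sensitivity analysis, sketched here with hypothetical informative priors, is the Rogan-Gladen correction of an observed proportion, with prior uncertainty in sensitivity and specificity propagated by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(3)

p_obs = 0.30                        # observed (misclassified) exposure prevalence
sens = rng.beta(40, 4, size=5000)   # hypothetical informative prior: sensitivity
spec = rng.beta(45, 3, size=5000)   # hypothetical informative prior: specificity

# Rogan-Gladen correction applied draw-by-draw, so the corrected prevalence
# inherits the prior uncertainty about the misclassification rates.
p_true = (p_obs + spec - 1.0) / (sens + spec - 1.0)
p_true = p_true[(p_true > 0.0) & (p_true < 1.0)]   # keep admissible draws
```

The spread of `p_true` across draws shows how sensitive the corrected estimate is to what is assumed about the misclassification process.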

  1. Space-Time Modelling of Groundwater Level Using Spartan Covariance Function

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil; Hristopulos, Dionissios

    2014-05-01

    Geostatistical models often need to handle variables that change in space and in time, such as the groundwater level of aquifers. A major advantage of space-time observations is that a higher number of data supports parameter estimation and prediction. In a statistical context, space-time data can be considered as realizations of random fields that are spatially extended and evolve in time. The combination of spatial and temporal measurements in sparsely monitored watersheds can provide very useful information by incorporating spatiotemporal correlations. Spatiotemporal interpolation is usually performed by applying the standard Kriging algorithms extended in a space-time framework. Spatiotemporal covariance functions for groundwater level modelling, however, have not been widely developed. We present a new non-separable theoretical spatiotemporal variogram function which is based on the Spartan covariance family and evaluate its performance in spatiotemporal Kriging (STRK) interpolation. The original spatial expression (Hristopulos and Elogne 2007) that has been successfully used for the spatial interpolation of groundwater level (Varouchakis and Hristopulos 2013) is modified by defining the normalized space-time distance h = √(h_r² + α h_τ²), with h_r = r/ξ_r and h_τ = τ/ξ_τ, where r is the spatial lag vector, τ the temporal lag vector, ξ_r the correlation length in position space, ξ_τ the correlation time, and α the coefficient that determines the relative weight of the time lag; h = |h| denotes the Euclidean norm of the normalized space-time lag vector h. The space-time experimental semivariogram is determined from the biannual (wet and dry period) time series of groundwater level residuals (obtained from the original series after trend removal) between the years 1981 and 2003 at ten sampling stations located in the Mires hydrological basin on the island of Crete (Greece). After the hydrological year 2002-2003 there is a significant

  2. Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

    PubMed

    Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

    2018-06-29

    A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
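A minimal sketch of the penalized-regression idea follows, assuming the error covariance matrix enters as a quadratic penalty on the coefficients (the exact ECPR formulation may differ); all data and the penalty weight are simulated placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10

# Simulated calibration data: n spectra with p channels.
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)

# Hypothetical error covariance matrix (ECM) with correlated, non-iid
# channel noise, here an AR(1)-style structure.
idx = np.arange(p)
Sigma = 0.1 * (0.5 ** np.abs(np.subtract.outer(idx, idx)))

y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Penalized normal equations: minimize ||y - Xb||^2 + lam * b' Sigma b,
# i.e. solve (X'X + lam * Sigma) b = X'y.
lam = 1.0
b = np.linalg.solve(X.T @ X + lam * Sigma, X.T @ y)
```

With Sigma replaced by the identity this reduces to ridge regression, which is the comparison the abstract draws.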

  3. Adaptive error covariances estimation methods for ensemble Kalman filters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhen, Yicun, E-mail: zhen@math.psu.edu; Harlim, John, E-mail: jharlim@psu.edu

    2015-08-01

    This paper presents a computationally fast algorithm for estimating both the system and observation noise covariances of nonlinear dynamics, which can be used in an ensemble Kalman filtering framework. The new method is a modification of Belanger's recursive method, designed to avoid the expensive computational cost of inverting error covariance matrices of products of innovation processes of different lags when the number of observations becomes large. When we use only products of innovation processes up to one lag, the computational cost is indeed comparable to the recently proposed method of Berry and Sauer. However, our method is more flexible since it allows for using information from products of innovation processes of more than one lag. Extensive numerical comparisons between the proposed method and both the original Belanger and the Berry–Sauer schemes are shown in various examples, ranging from low-dimensional linear and nonlinear systems of SDEs to the 40-dimensional stochastically forced Lorenz-96 model. Our numerical results suggest that the proposed scheme is as accurate as the original Belanger scheme on low-dimensional problems and has a wider range of accurate estimates compared to the Berry–Sauer method on the L-96 example.

  4. Empirical Performance of Covariates in Education Observational Studies

    ERIC Educational Resources Information Center

    Wong, Vivian C.; Valentine, Jeffrey C.; Miller-Bains, Kate

    2017-01-01

    This article summarizes results from 12 empirical evaluations of observational methods in education contexts. We look at the performance of three common covariate-types in observational studies where the outcome is a standardized reading or math test. They are: pretest measures, local geographic matching, and rich covariate sets with a strong…

  5. Missing: Children and Young People with SEBD

    ERIC Educational Resources Information Center

    Visser, John; Daniels, Harry; Macnab, Natasha

    2005-01-01

    This article explores the issue of missing from and missing out on education. It argues that too little is known with regard to the characteristics of children and young people missing from schooling. It postulates that many of these pupils will have social, emotional and behavioural difficulties which are largely unrecognized and thus not…

  6. Empirical likelihood method for non-ignorable missing data problems.

    PubMed

    Guan, Zhong; Qin, Jing

    2017-01-01

    The missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missingness is the most difficult missing data problem, in which the missingness of a response depends on its own value. In the statistical literature, unlike for the ignorable missing data problem, few papers on non-ignorable missing data are available beyond the fully parametric model-based approach. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen's (1988) empirical likelihood method, we obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial dataset shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on observed data only is biased.

  7. Ma_MISS on ExoMars: Mineralogical Characterization of the Martian Subsurface

    NASA Astrophysics Data System (ADS)

    De Sanctis, Maria Cristina; Altieri, Francesca; Ammannito, Eleonora; Biondi, David; De Angelis, Simone; Meini, Marco; Mondello, Giuseppe; Novi, Samuele; Paolinetti, Riccardo; Soldani, Massimo; Mugnuolo, Raffaele; Pirrotta, Simone; Vago, Jorge L.; Ma_MISS Team

    2017-07-01

    The Ma_MISS (Mars Multispectral Imager for Subsurface Studies) experiment is the visible and near infrared (VNIR) miniaturized spectrometer hosted by the drill system of the ExoMars 2020 rover. Ma_MISS will perform IR spectral reflectance investigations in the 0.4-2.2 μm range to characterize the mineralogy of excavated borehole walls at different depths (between 0 and 2 m). The spectral sampling is about 20 nm, whereas the spatial resolution over the target is 120 μm. Making use of the drill's movement, the instrument slit can scan a ring and build up hyperspectral images of a borehole. The main goal of the Ma_MISS instrument is to study the martian subsurface environment. Access to the martian subsurface is crucial to our ability to constrain the nature, timing, and duration of alteration and sedimentation processes on Mars, as well as habitability conditions. Subsurface deposits likely host and preserve H2O ice and hydrated materials that will contribute to our understanding of the H2O geochemical environment (both in the liquid and in the solid state) at the ExoMars 2020 landing site. The Ma_MISS spectral range and sampling capabilities have been carefully selected to allow the study of minerals and ices in situ before the collection of samples. Ma_MISS will be implemented to accomplish the following scientific objectives: (1) determine the composition of subsurface materials, (2) map the distribution of subsurface H2O and volatiles, (3) characterize important optical and physical properties of materials (e.g., grain size), and (4) produce a stratigraphic column that will inform with regard to subsurface geological processes. The Ma_MISS findings will help to refine essential criteria that will aid in our selection of the most interesting subsurface formations from which to collect samples.

  8. Robust information propagation through noisy neural circuits

    PubMed Central

    Pouget, Alexandre

    2017-01-01

    Sensory neurons give highly variable responses to stimulation, which can limit the amount of stimulus information available to downstream circuits. Much work has investigated the factors that affect the amount of information encoded in these population responses, leading to insights about the role of covariability among neurons, tuning curve shape, etc. However, the informativeness of neural responses is not the only relevant feature of population codes; of potentially equal importance is how robustly that information propagates to downstream structures. For instance, to quantify the retina’s performance, one must consider not only the informativeness of the optic nerve responses, but also the amount of information that survives the spike-generating nonlinearity and noise corruption in the next stage of processing, the lateral geniculate nucleus. Our study identifies the set of covariance structures for the upstream cells that optimize the ability of information to propagate through noisy, nonlinear circuits. Within this optimal family are covariances with “differential correlations”, which are known to reduce the information encoded in neural population activities. Thus, covariance structures that maximize information in neural population codes, and those that maximize the ability of this information to propagate, can be very different. Moreover, redundancy is neither necessary nor sufficient to make population codes robust against corruption by noise: redundant codes can be very fragile, and synergistic codes can—in some cases—optimize robustness against noise. PMID:28419098

  9. Incorrect support and missing center tolerances of phasing algorithms

    DOE PAGES

    Huang, Xiaojing; Nelson, Johanna; Steinbrener, Jan; ...

    2010-01-01

    In x-ray diffraction microscopy, iterative algorithms retrieve reciprocal space phase information, and a real space image, from an object's coherent diffraction intensities through the use of a priori information such as a finite support constraint. In many experiments, the object's shape or support is not well known, and the diffraction pattern is incompletely measured. We describe here computer simulations to look at the effects of both of these possible errors when using several common reconstruction algorithms. Overly tight object supports prevent successful convergence; however, we show that this can often be recognized through pathological behavior of the phase retrieval transfer function. Dynamic range limitations often make it difficult to record the central speckles of the diffraction pattern. We show that this leads to increasing artifacts in the image when the number of missing central speckles exceeds about 10, and that the removal of unconstrained modes from the reconstructed image is helpful only when the number of missing central speckles is less than about 50. In conclusion, this simulation study helps in judging the reconstructability of experimentally recorded coherent diffraction patterns.

  10. Crisis communication: learning from the 1998 LPG near miss in Stockholm.

    PubMed

    Castenfors, K; Svedin, L

    2001-12-14

    The authors examine current trends in urban risks and resilience in relation to hazardous material transports in general, and crisis communication and the Stockholm liquefied petroleum gas (LPG) near miss in 1998 in particular. The article discusses how current dynamics affecting urban areas, such as the decay in terms of increased condensation and limited expansion alternatives combined with industry site contamination and transports of hazardous materials on old worn-out physical infrastructure, work together to produce high-risk factors and increase urban vulnerability in large parts of the world today. Crisis communication takes a particularly pronounced role in the article as challenges in communication and confidence maintenance under conditions of information uncertainty and limited information control are explored. The LPG near miss case illustrates a Swedish case of urban risk and the tight coupling to hazardous material transports. The case also serves as a current example of Swedish resilience and lack of preparedness in urban crises, with particular observations and lessons learned in regards to crisis communication.

  11. Discovery of a missing disease spreader

    NASA Astrophysics Data System (ADS)

    Maeno, Yoshiharu

    2011-10-01

    This study presents a method to discover an outbreak of an infectious disease in a region for which data are missing, but which is at work as a disease spreader. Node discovery for the spread of an infectious disease is defined as discriminating between the nodes which are neighboring to a missing disease spreader node, and the rest, given a dataset on the number of cases. The spread is described by stochastic differential equations. A perturbation theory quantifies the impact of the missing spreader on the moments of the number of cases. Statistical discriminators examine the mid-body or tail-ends of the probability density function, and search for the disturbance from the missing spreader. They are tested with computationally synthesized datasets, and applied to the SARS outbreak and flu pandemic.

  12. Mixed-Poisson Point Process with Partially-Observed Covariates: Ecological Momentary Assessment of Smoking.

    PubMed

    Neustifter, Benjamin; Rathbun, Stephen L; Shiffman, Saul

    2012-01-01

    Ecological Momentary Assessment is an emerging method of data collection in behavioral research that may be used to capture the times of repeated behavioral events on electronic devices, and information on subjects' psychological states through the electronic administration of questionnaires at times selected from a probability-based design as well as the event times. A method for fitting a mixed Poisson point process model is proposed for the impact of partially-observed, time-varying covariates on the timing of repeated behavioral events. A random frailty is included in the point-process intensity to describe variation among subjects in baseline rates of event occurrence. Covariate coefficients are estimated using estimating equations constructed by replacing the integrated intensity in the Poisson score equations with a design-unbiased estimator. An estimator is also proposed for the variance of the random frailties. Our estimators are robust in the sense that no model assumptions are made regarding the distribution of the time-varying covariates or the distribution of the random effects. However, subject effects are estimated under gamma frailties using an approximate hierarchical likelihood. The proposed approach is illustrated using smoking data.
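The gamma-frailty idea can be illustrated with a small simulation. The shape/scale values and the single time-constant covariate below are hypothetical simplifications of the paper's time-varying setting; the point is that subject-level frailties make event counts overdispersed relative to a plain Poisson model:

```python
import numpy as np

rng = np.random.default_rng(2)
n_subj = 2000
beta = 0.5                                   # hypothetical covariate effect

x = rng.normal(size=n_subj)                  # one time-constant covariate
frailty = rng.gamma(shape=2.0, scale=0.5,    # mean-1 gamma frailties capture
                    size=n_subj)             # subject variation in baseline rates

# Subject-specific Poisson rates for event counts over a unit observation window.
mu = frailty * np.exp(0.2 + beta * x)
counts = rng.poisson(mu)
```

Marginally over the frailty, the variance of the counts exceeds the mean, which is the overdispersion the random frailty is introduced to describe.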

  13. An interference account of the missing-VP effect

    PubMed Central

    Häussler, Jana; Bader, Markus

    2015-01-01

    Sentences with doubly center-embedded relative clauses in which a verb phrase (VP) is missing are sometimes perceived as grammatical, thus giving rise to an illusion of grammaticality. In this paper, we provide a new account of why missing-VP sentences, which are both complex and ungrammatical, lead to an illusion of grammaticality, the so-called missing-VP effect. We propose that the missing-VP effect in particular, and processing difficulties with multiply center-embedded clauses more generally, are best understood as resulting from interference during cue-based retrieval. When processing a sentence with double center-embedding, a retrieval error due to interference can cause the verb of an embedded clause to be erroneously attached into a higher clause. This can lead to an illusion of grammaticality in the case of missing-VP sentences and to processing complexity in the case of complete sentences with double center-embedding. Evidence for an interference account of the missing-VP effect comes from experiments that have investigated the missing-VP effect in German using a speeded grammaticality judgments procedure. We review this evidence and then present two new experiments that show that the missing-VP effect can be found in German also with less restricting procedures. One experiment was a questionnaire study which required grammaticality judgments from participants without imposing any time constraints. The second experiment used a self-paced reading procedure and did not require any judgments. Both experiments confirm the prior findings of missing-VP effects in German and also show that the missing-VP effect is subject to a primacy effect as known from the memory literature. Based on this evidence, we argue that an account of missing-VP effects in terms of interference during cue-based retrieval is superior to accounts in terms of limited memory resources or in terms of experience with embedded structures. PMID:26136698

  14. Conditional Covariance Theory and Detect for Polytomous Items

    ERIC Educational Resources Information Center

    Zhang, Jinming

    2007-01-01

    This paper extends the theory of conditional covariances to polytomous items. It has been proven that under some mild conditions, commonly assumed in the analysis of response data, the conditional covariance of two items, dichotomously or polytomously scored, given an appropriately chosen composite is positive if, and only if, the two items…

  15. Great apes and children infer causal relations from patterns of variation and covariation.

    PubMed

    Völter, Christoph J; Sentís, Inés; Call, Josep

    2016-10-01

    We investigated whether nonhuman great apes (N=23), 2.5-year-old (N=20), and 3-year-old children (N=40) infer causal relations from patterns of variation and covariation by adapting the blicket detector paradigm for apes. We presented chimpanzees (Pan troglodytes), bonobos (Pan paniscus), orangutans (Pongo abelii), gorillas (Gorilla gorilla), and children (Homo sapiens) with a novel reward dispenser, the blicket detector. The detector was activated by inserting specific (yet randomly determined) objects, the so-called blickets. Once activated, a reward was released, accompanied by lights and a short tone. Participants were shown different patterns of variation and covariation between two different objects and the activation of the detector. When subsequently choosing one of the two objects to activate the detector on their own, all species, except gorillas (who failed the training), took these patterns of correlation into account. In particular, apes and 2.5-year-old children ignored objects whose effect on the detector completely depended on the presence of another object. Follow-up experiments explored whether the apes and children were also able to re-evaluate evidence retrospectively. Only children (3-year-olds in particular) were able to make such retrospective inferences about causal structures from observing the effects of the experimenter's actions. Apes succeeded here only when they observed the effects of their own interventions. Together, this study provides evidence that apes, like young children, accurately infer causal structures from patterns of (co)variation and that they use this information to inform their own interventions. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. Phenotypic covariance at species' borders.

    PubMed

    Caley, M Julian; Cripps, Edward; Game, Edward T

    2013-05-28

    Understanding the evolution of species limits is important in ecology, evolution, and conservation biology. Despite its likely importance in the evolution of these limits, little is known about phenotypic covariance in geographically marginal populations, and the degree to which it constrains, or facilitates, responses to selection. We investigated phenotypic covariance in morphological traits at species' borders by comparing phenotypic covariance matrices (P), including the degree of shared structure, the distribution of strengths of pair-wise correlations between traits, the degree of morphological integration of traits, and the ranks of the matrices, between central and marginal populations of three species-pairs of coral reef fishes. Greater structural differences in P were observed between populations close to range margins and conspecific populations toward range centres, than between pairs of conspecific populations that were both more centrally located within their ranges. Approximately 80% of all pair-wise trait correlations within populations were greater in the north, but these differences were unrelated to the position of the sampled population with respect to the geographic range of the species. Neither the degree of morphological integration, nor the ranks of P, indicated greater evolutionary constraint at range edges. Characteristics of P observed here provide no support for constraint contributing to the formation of these species' borders, but may instead reflect structural change in P caused by selection or drift, and their potential to evolve in the future.

  17. [Prevention and handling of missing data in clinical trials].

    PubMed

    Jiang, Zhi-wei; Li, Chan-juan; Wang, Ling; Xia, Jie-lai

    2015-11-01

    Missing data is a common but unavoidable issue in clinical trials. It not only lowers the trial's power but also biases the trial results. Therefore, on one hand, missing data handling methods are employed in data analysis; on the other hand, it is vital to prevent missing data in the trials, and prevention should come first. From the data perspective, firstly, measures should be taken at the stages of protocol design, data collection and data checking to enhance patients' compliance and reduce unnecessary missing data. Secondly, the causes of confirmed missing data in the trials should be noted and recorded in detail, which is very important for determining the missing data mechanism and choosing suitable missing data handling methods, e.g., last observation carried forward (LOCF), multiple imputation (MI), and the mixed-effect model repeated measure (MMRM).
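Of the handling methods listed, LOCF is the easiest to sketch; the series below is illustrative:

```python
import numpy as np

def locf(series):
    """Last observation carried forward: fill each NaN with the most
    recent preceding observed value (a leading NaN stays missing)."""
    out = series.copy()
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out

# A subject's repeated measurements with two missed visits.
y = np.array([3.0, np.nan, 2.5, np.nan, np.nan, 4.0])
filled = locf(y)   # -> [3.0, 3.0, 2.5, 2.5, 2.5, 4.0]
```

Note that LOCF's single-value fill understates uncertainty, which is one reason MI and MMRM are generally preferred in modern practice.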

  18. Defining near misses: towards a sharpened definition based on empirical data about error handling processes.

    PubMed

    Kessels-Habraken, Marieke; Van der Schaaf, Tjerk; De Jonge, Jan; Rutte, Christel

    2010-05-01

    Medical errors in health care still occur frequently. Unfortunately, errors cannot be completely prevented and 100% safety can never be achieved. Therefore, in addition to error reduction strategies, health care organisations could also implement strategies that promote timely error detection and correction. Reporting and analysis of so-called near misses - usually defined as incidents without adverse consequences for patients - are necessary to gather information about successful error recovery mechanisms. This study establishes the need for a clearer and more consistent definition of near misses to enable large-scale reporting and analysis in order to obtain such information. Qualitative incident reports and interviews were collected on four units of two Dutch general hospitals. Analysis of the 143 accompanying error handling processes demonstrated that different incident types each provide unique information about error handling. Specifically, error handling processes underlying incidents that did not reach the patient differed significantly from those of incidents that reached the patient, irrespective of harm, because of successful countermeasures that had been taken after error detection. We put forward two possible definitions of near misses and argue that, from a practical point of view, the optimal definition may be contingent on organisational context. Both proposed definitions could yield large-scale reporting of near misses. Subsequent analysis could enable health care organisations to improve the safety and quality of care proactively by (1) eliminating failure factors before real accidents occur, (2) enhancing their ability to intercept errors in time, and (3) improving their safety culture. Copyright 2010 Elsevier Ltd. All rights reserved.

  19. The Performance Analysis Based on SAR Sample Covariance Matrix

    PubMed Central

    Erten, Esra

    2012-01-01

    Multi-channel systems appear in several fields of application in science. In the Synthetic Aperture Radar (SAR) context, multi-channel systems may refer to different domains, such as multi-polarization, multi-interferometric or multi-temporal data, or even a combination of them. Due to the inherent speckle phenomenon present in SAR images, a statistical description of the data is almost mandatory for its utilization. The complex images acquired over natural media present in general zero-mean circular Gaussian characteristics. In this case, second-order statistics such as the multi-channel covariance matrix fully describe the data. In practical situations, however, the covariance matrix has to be estimated using a limited number of samples, and this sample covariance matrix follows the complex Wishart distribution. In this context, the eigendecomposition of the multi-channel covariance matrix has been shown to be of high relevance in different areas, reflecting the physical properties of the imaged scene. Specifically, the maximum eigenvalue of the covariance matrix has been frequently used in applications such as target or change detection, estimation of the dominant scattering mechanism in polarimetric data, moving target indication, etc. In this paper, the statistical behavior of the maximum eigenvalue derived from the eigendecomposition of the sample multi-channel covariance matrix of multi-channel SAR images is presented in a simplified form for the SAR community. Validation is performed against simulated data, and examples of estimation and detection problems using the analytical expressions are given as well. PMID:22736976
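    The setting described above, a sample covariance matrix of zero-mean circular complex Gaussian data and its maximum eigenvalue, can be simulated directly. A minimal numpy sketch with illustrative parameters (not the paper's analytical expressions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n samples of a zero-mean circular complex Gaussian vector with p
# channels, form the sample covariance matrix (complex Wishart, scaled by n),
# and take its maximum eigenvalue.
p, n = 3, 1000
true_cov = np.diag([3.0, 2.0, 1.0]).astype(complex)

# Circular complex Gaussian: i.i.d. real and imaginary parts, unit power.
L = np.linalg.cholesky(true_cov)
z = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
x = L @ z

# Sample covariance matrix from the n looks.
S = x @ x.conj().T / n

max_eig = np.linalg.eigvalsh(S).max()
print(max_eig)  # close to the largest true eigenvalue, 3.0
```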

  20. A consistent covariant quantization of the Brink-Schwarz superparticle

    NASA Astrophysics Data System (ADS)

    Eisenberg, Yeshayahu

    1992-02-01

    We perform the covariant quantization of the ten-dimensional Brink-Schwarz superparticle by reducing it to a system whose constraints are all first class, covariant and have only two levels of reducibility. Research supported by the Rothschild Fellowship.

  1. Missed doses of oral antihyperglycemic medications in US adults with type 2 diabetes mellitus: prevalence and self-reported reasons.

    PubMed

    Vietri, Jeffrey T; Wlodarczyk, Catherine S; Lorenzo, Rose; Rajpathak, Swapnil

    2016-09-01

    Adherence to antihyperglycemic medication is thought to be suboptimal, but the proportion of patients missing doses, the number of doses missed, and the reasons for missing are not well described. This survey was conducted to estimate the prevalence of and reasons for missed doses of oral antihyperglycemic medications among US adults with type 2 diabetes mellitus, and to explore associations between missed doses and health outcomes. The study was a cross-sectional patient survey. Respondents were contacted via a commercial survey panel and completed an on-line questionnaire via the Internet. Respondents provided information about their use of oral antihyperglycemic medications, including doses missed in the prior 4 weeks, personal characteristics, and health outcomes. Weights were calculated to project the prevalence to the US adult population with type 2 diabetes mellitus. Outcomes were compared according to the number of doses missed in the past 4 weeks using bivariate statistics and generalized linear models. Approximately 30% of adult patients with type 2 diabetes mellitus reported missing or reducing ≥1 dose of oral antihyperglycemic medication in the prior 4 weeks. Accidental missing was more commonly reported than purposeful skipping, with forgetting being the most commonly reported reason. The timing of missed doses suggested respondents had also forgotten about doses missed, so the prevalence of missed doses is likely higher than reported. Outcomes were poorer among those who reported missing three or more doses in the prior 4 weeks. A substantial number of US adults with type 2 diabetes mellitus miss doses of their oral antihyperglycemic medications.

  2. Measuring evapotranspiration using an eddy covariance system over the Albany Thicket of the Eastern Cape, South Africa

    NASA Astrophysics Data System (ADS)

    Gwate, O.; Mantel, Sukhmani K.; Palmer, Anthony R.; Gibson, Lesley A.

    2016-10-01

    Determining water and carbon fluxes over a vegetated surface is important in the context of global environmental change, and the fluxes help in understanding ecosystem functioning. Pursuant to this, the study measured evapotranspiration (ET) using an eddy covariance (EC) system installed over an intact example of the Albany Thicket (AT) vegetation in the Eastern Cape, South Africa. Environmental constraints on ET were also assessed by examining the response of ET to biotic and abiotic factors. The EC system comprised an open-path infrared gas analyser and a sonic anemometer, with an attendant weather station to measure biometeorological variables. Post-processing of the eddy covariance data was conducted using the EddyPro software. Quality assessment of the fluxes was performed, and rejected and missing data were filled using the method of mean diurnal variations (MDV). Much of the variation in ET was accounted for by the leaf area index (LAI, p < 0.001, 41%) and soil moisture content (SWC, p < 0.001, 32%). Total measured ET during the experiment was greater than total rainfall received, owing to the high water storage capacity of the vegetation and the possibility of vegetation accessing ground water. Most of the net radiation was consumed by the sensible heat flux, which means that ET in the area is essentially water-limited, since abundant energy was available to drive turbulent transfers of energy. Understanding the environmental constraints on ET is crucial in predicting the ecosystem response to environmental forces such as climate change.
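    Gap filling by mean diurnal variation, as used above, replaces a missing record with the mean of observed records at the same time of day. A minimal pandas sketch on synthetic half-hourly data (the series and the gap below are invented, not the tower record):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic half-hourly ET-like series over 10 days: daytime sine + noise.
idx = pd.date_range("2015-01-01", periods=48 * 10, freq="30min")
hour = np.asarray(idx.hour + idx.minute / 60, dtype=float)
et = np.clip(np.sin((hour - 6) / 12 * np.pi), 0, None) + rng.normal(0, 0.05, len(idx))
s = pd.Series(et, index=idx)

s.iloc[100:110] = np.nan  # simulate rejected/missing records

# MDV: for each time of day, average the observed values across all days,
# then use that diurnal climatology to fill the gaps.
climatology = s.groupby([s.index.hour, s.index.minute]).transform("mean")
filled = s.fillna(climatology)

print(s.isna().sum(), filled.isna().sum())
```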

  3. Estimation for the Linear Model With Uncertain Covariance Matrices

    NASA Astrophysics Data System (ADS)

    Zachariah, Dave; Shariati, Nafiseh; Bengtsson, Mats; Jansson, Magnus; Chatterjee, Saikat

    2014-03-01

    We derive a maximum a posteriori estimator for the linear observation model, where the signal and noise covariance matrices are both uncertain. The uncertainties are treated probabilistically by modeling the covariance matrices with prior inverse-Wishart distributions. The nonconvex problem of jointly estimating the signal of interest and the covariance matrices is tackled by a computationally efficient fixed-point iteration as well as an approximate variational Bayes solution. The statistical performance of estimators is compared numerically to state-of-the-art estimators from the literature and shown to perform favorably.
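    A toy version of the joint estimation problem can be written as a fixed-point iteration in the spirit of the paper's approach, though it is not the authors' exact estimator: alternate a weighted least-squares update of the signal given the current noise covariance with the posterior-mode (MAP) update of an inverse-Wishart noise covariance given the residuals. All settings below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Model: y_i = H x + n_i, n_i ~ N(0, R), with R unknown and given an
# inverse-Wishart prior IW(Psi, nu).
p, d, N = 3, 2, 200
H = rng.standard_normal((p, d))
x_true = np.array([1.0, -2.0])
R_true = np.diag([0.5, 1.0, 2.0])
noise = rng.multivariate_normal(np.zeros(p), R_true, N)
Y = (H @ x_true) + noise            # shape (N, p)

Psi, nu = np.eye(p), p + 2          # weak inverse-Wishart prior
R = np.eye(p)
for _ in range(50):
    # Weighted least squares for x given the current noise covariance R.
    Rinv = np.linalg.inv(R)
    x = np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ Y.mean(axis=0))
    # MAP (posterior-mode) update of R from the residual scatter matrix.
    resid = Y - H @ x
    S = resid.T @ resid
    R = (Psi + S) / (nu + N + p + 1)

print(x)  # close to x_true
```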

  4. MISSE 1 and 2 Tray Temperature Measurements

    NASA Technical Reports Server (NTRS)

    Harvey, Gale A.; Kinard, William H.

    2006-01-01

    The Materials International Space Station Experiment (MISSE 1 & 2) was deployed August 10,2001 and retrieved July 30,2005. This experiment is a co-operative endeavor by NASA-LaRC. NASA-GRC, NASA-MSFC, NASA-JSC, the Materials Laboratory at the Air Force Research Laboratory, and the Boeing Phantom Works. The objective of the experiment is to evaluate performance, stability, and long term survivability of materials and components planned for use by NASA and DOD on future LEO, synchronous orbit, and interplanetary space missions. Temperature is an important parameter in the evaluation of space environmental effects on materials. The MISSE 1 & 2 had autonomous temperature data loggers to measure the temperature of each of the four experiment trays. The MISSE tray-temperature data loggers have one external thermistor data channel, and a 12 bit digital converter. The MISSE experiment trays were exposed to the ISS space environment for nearly four times the nominal design lifetime for this experiment. Nevertheless, all of the data loggers provided useful temperature measurements of MISSE. The temperature measurement system has been discussed in a previous paper. This paper presents temperature measurements of MISSE payload experiment carriers (PECs) 1 and 2 experiment trays.

  5. Quantification of Covariance in Tropical Cyclone Activity across Teleconnected Basins

    NASA Astrophysics Data System (ADS)

    Tolwinski-Ward, S. E.; Wang, D.

    2015-12-01

    Rigorous statistical quantification of natural hazard covariance across regions has important implications for risk management, and is also of fundamental scientific interest. We present a multivariate Bayesian Poisson regression model for inferring the covariance in tropical cyclone (TC) counts across multiple ocean basins and across Saffir-Simpson intensity categories. Such covariability results from the influence of large-scale modes of climate variability on local environments that can alternately suppress or enhance TC genesis and intensification, and our model also simultaneously quantifies the covariance of TC counts with various climatic modes in order to deduce the source of inter-basin TC covariability. The model explicitly treats the time-dependent uncertainty in observed maximum sustained wind data, and hence the nominal intensity category of each TC. Differences in annual TC counts as measured by different agencies are also formally addressed. The probabilistic output of the model can be probed for probabilistic answers to such questions as:
    - Does the relationship between different categories of TCs differ statistically by basin?
    - Which climatic predictors have significant relationships with TC activity in each basin?
    - Are the relationships between counts in different basins conditionally independent given the climatic predictors, or are there other factors at play affecting inter-basin covariability?
    - How can a portfolio of insured property be optimized across space to minimize risk?
    Although we present results of our model applied to TCs, the framework is generalizable to covariance estimation between multivariate counts of natural hazards across regions and/or across peril types.
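    A stripped-down, non-Bayesian analogue of one ingredient of such a model, Poisson regression of annual counts on a climate-mode index, can be sketched with a hand-rolled Newton solver (synthetic data; the index name and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate annual TC counts whose log-rate depends on a climate-mode index.
n = 200
index = rng.standard_normal(n)                 # hypothetical climate index
beta_true = np.array([1.5, 0.4])               # intercept, index effect
X = np.column_stack([np.ones(n), index])
counts = rng.poisson(np.exp(X @ beta_true))

# Fit the Poisson regression by Newton-Raphson (equivalently IRLS).
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    grad = X.T @ (counts - mu)                 # score of the Poisson log-lik
    hess = X.T @ (mu[:, None] * X)             # Fisher information
    beta = beta + np.linalg.solve(hess, grad)

print(beta)  # close to beta_true
```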

  6. Using machine learning to assess covariate balance in matching studies.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2016-12-01

    In order to assess the effectiveness of matching approaches in observational studies, investigators typically present summary statistics for each observed pre-intervention covariate, with the objective of showing that matching reduces the difference in means (or proportions) between groups to as close to zero as possible. In this paper, we introduce a new approach to distinguish between study groups based on their distributions of the covariates using a machine-learning algorithm called optimal discriminant analysis (ODA). Assessing covariate balance using ODA as compared with the conventional method has several key advantages: the ability to ascertain how individuals self-select based on optimal (maximum-accuracy) cut-points on the covariates; the application to any variable metric and number of groups; its insensitivity to skewed data or outliers; and the use of accuracy measures that can be widely applied to all analyses. Moreover, ODA accepts analytic weights, thereby extending the assessment of covariate balance to any study design where weights are used for covariate adjustment. By comparing the two approaches using empirical data, we are able to demonstrate that using measures of classification accuracy as balance diagnostics produces highly consistent results to those obtained via the conventional approach (in our matched-pairs example, ODA revealed a weak statistically significant relationship not detected by the conventional approach). Thus, investigators should consider ODA as a robust complement, or perhaps alternative, to the conventional approach for assessing covariate balance in matching studies. © 2016 John Wiley & Sons, Ltd.
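    The two balance diagnostics contrasted above can be illustrated side by side. The sketch below computes the conventional standardized mean difference and a simple best-single-cutpoint classification accuracy; the latter is only a caricature of ODA, which optimizes accuracy far more generally, and the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical covariate values in two (imperfectly matched) study groups.
treated = rng.normal(0.3, 1.0, 300)
control = rng.normal(0.0, 1.0, 300)

# Conventional diagnostic: standardized mean difference.
smd = (treated.mean() - control.mean()) / np.sqrt(
    (treated.var(ddof=1) + control.var(ddof=1)) / 2)

# Accuracy-style diagnostic: best accuracy over all single cut-points,
# predicting "treated" when the covariate exceeds the cut-point.
x = np.concatenate([treated, control])
y = np.concatenate([np.ones(300), np.zeros(300)])
acc = max(((x >= c) == y).mean() for c in np.sort(x))

print(round(smd, 2), round(acc, 2))
```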

  7. Missing observations in multiyear rotation sampling designs

    NASA Technical Reports Server (NTRS)

    Gbur, E. E.; Sielken, R. L., Jr. (Principal Investigator)

    1982-01-01

    Because multiyear estimation of at-harvest stratum crop proportions is more efficient than single-year estimation, the behavior of multiyear estimators in the presence of missing acquisitions was studied. Only the (worst) case in which a segment proportion cannot be estimated for the entire year is considered. The effect of these missing segments on the variance of the at-harvest stratum crop proportion estimator is considered both when missing segments are not replaced and when they are replaced by segments not sampled in previous years. The principal recommendations are to replace missing segments according to some specified strategy, and to use a sequential procedure for selecting a sampling design; i.e., choose an optimal two-year design and then, based on the observed two-year design after segment losses have been taken into account, choose the best possible three-year design having the observed two-year parent design.

  8. 40 CFR 75.37 - Missing data procedures for moisture.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 17 2012-07-01 2012-07-01 false Missing data procedures for moisture... PROGRAMS (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.37 Missing... system shall substitute for missing moisture data using the procedures of this section. (b) Where no...

  9. 40 CFR 75.37 - Missing data procedures for moisture.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 16 2010-07-01 2010-07-01 false Missing data procedures for moisture... PROGRAMS (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.37 Missing... system shall substitute for missing moisture data using the procedures of this section. (b) Where no...

  10. 40 CFR 75.37 - Missing data procedures for moisture.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 17 2014-07-01 2014-07-01 false Missing data procedures for moisture... PROGRAMS (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.37 Missing... system shall substitute for missing moisture data using the procedures of this section. (b) Where no...

  11. 40 CFR 75.37 - Missing data procedures for moisture.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 17 2013-07-01 2013-07-01 false Missing data procedures for moisture... PROGRAMS (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.37 Missing... system shall substitute for missing moisture data using the procedures of this section. (b) Where no...

  12. 40 CFR 75.37 - Missing data procedures for moisture.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 16 2011-07-01 2011-07-01 false Missing data procedures for moisture... PROGRAMS (CONTINUED) CONTINUOUS EMISSION MONITORING Missing Data Substitution Procedures § 75.37 Missing... system shall substitute for missing moisture data using the procedures of this section. (b) Where no...

  13. Covariance and correlation estimation in electron-density maps.

    PubMed

    Altomare, Angela; Cuocci, Corrado; Giacovazzo, Carmelo; Moliterni, Anna; Rizzi, Rosanna

    2012-03-01

    Quite recently two papers have been published [Giacovazzo & Mazzone (2011). Acta Cryst. A67, 210-218; Giacovazzo et al. (2011). Acta Cryst. A67, 368-382] which calculate the variance at any point of an electron-density map at any stage of the phasing process. The main aim of those papers was to associate a standard deviation with each pixel of the map, in order to obtain a better estimate of the map's reliability. This paper deals with the estimate of the covariance between points of an electron-density map in any space group, centrosymmetric or non-centrosymmetric, whatever the correlation between the model and target structures. The aim is as follows: to verify whether the electron density at one point of the map is amplified or depressed as an effect of the electron density at one or more other points of the map. High values of the covariances are usually connected with undesired features of the map. The phases are the primitive random variables of our probabilistic model; the covariance changes with the quality of the model and therefore with the quality of the phases. The conclusive formulas show that the covariance is also influenced by the Patterson map. Uncertainty in the measurements may influence the covariance, particularly in the final stages of structure refinement; a general formula is obtained taking into account both phase and measurement uncertainty, valid at any stage of the crystal structure solution.

  14. Inverse modeling of the terrestrial carbon flux in China with flux covariance among inverted regions

    NASA Astrophysics Data System (ADS)

    Wang, H.; Jiang, F.; Chen, J. M.; Ju, W.; Wang, H.

    2011-12-01

    Quantitative understanding of the role of the ocean and terrestrial biosphere in the global carbon cycle, and of their response and feedback to climate change, is required for future projections of the global climate. China has the largest amount of anthropogenic CO2 emission, diverse terrestrial ecosystems and an unprecedented rate of urbanization. Thus information on the spatial and temporal distributions of the terrestrial carbon flux in China is of great importance in understanding the global carbon cycle. We developed a nested inversion with a focus on China. Based on the 22 Transcom regions for the globe, we divide China and its neighboring countries into 17 regions, making 39 regions in total for the globe. A Bayesian synthesis inversion is made to estimate the terrestrial carbon flux based on GlobalView CO2 data. In the inversion, GEOS-Chem is used as the transport model to develop the transport matrix. A terrestrial ecosystem model named BEPS is used to produce the prior surface flux to constrain the inversion. However, the sparseness of available observation stations in Asia poses a challenge to the inversion for the 17 small regions. To obtain an additional constraint on the inversion, a prior flux covariance matrix is constructed using the BEPS model by analyzing the correlation in the net carbon flux among regions under variable climate conditions. The use of the covariance among different regions in the inversion effectively extends the information content of CO2 observations to more regions. The carbon fluxes over the 39 land and ocean regions are inverted for the period from 2004 to 2009. In order to investigate the impact of introducing a covariance matrix with non-zero off-diagonal values into the inversion, the inverted terrestrial carbon flux over China is evaluated against ChinaFlux eddy-covariance observations after applying an upscaling methodology.
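    The Bayesian synthesis step can be sketched with a minimal linear inversion (synthetic numbers, not the study's setup): with off-diagonal structure in the prior flux covariance B, an observation constraining one region also updates the flux of a correlated, unobserved region.

```python
import numpy as np

rng = np.random.default_rng(5)

# Observations c = T f + e constrain regional fluxes f around a prior f0.
nreg, nobs = 3, 2
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])      # region 3 is unobserved
f_true = np.array([2.0, -1.0, 1.5])
R = 0.01 * np.eye(nobs)              # observation-error covariance
c = T @ f_true + rng.multivariate_normal(np.zeros(nobs), R)

f0 = np.zeros(nreg)
B = np.array([[1.0, 0.0, 0.8],
              [0.0, 1.0, 0.0],
              [0.8, 0.0, 1.0]])      # regions 1 and 3 strongly correlated

# Standard Bayesian (Kalman-type) update of the prior fluxes.
K = B @ T.T @ np.linalg.inv(T @ B @ T.T + R)
f_post = f0 + K @ (c - T @ f0)

# The unobserved region 3 moves away from its prior of zero purely through
# its prior correlation with the observed region 1.
print(f_post)
```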

  15. Obstetric near-miss and maternal mortality in maternity university hospital, Damascus, Syria: a retrospective study

    PubMed Central

    2010-01-01

    Background Investigating severe maternal morbidity (near-miss) is a newly recognised tool that identifies women at highest risk of maternal death and helps allocate resources, especially in low-income countries. This study aims to i. document the frequency and nature of maternal near-miss at hospital level in Damascus, capital of Syria; ii. evaluate the level of care at maternal life-saving emergency services by comparatively analysing near-misses and maternal mortalities. Methods Retrospective facility-based review of cases of near-miss and maternal mortality that took place in the years 2006-2007 at Damascus Maternity University Hospital, Syria. Near-miss cases were defined based on disease-specific criteria (Filippi 2005) including: haemorrhage, hypertensive disorders in pregnancy, dystocia, infection and anaemia. Main outcomes included the maternal mortality ratio (MMR), the maternal near-miss ratio (MNMR), mortality indices and the proportions of near-miss and mortality cases among hospital admissions. Results There were 28 025 deliveries, 15 maternal deaths and 901 near-miss cases. The study showed an MNMR of 32.9/1000 live births, an MMR of 54.8/100 000 live births and a relatively low mortality index of 1.7%. Hypertensive disorders (52%) and haemorrhage (34%) were the top causes of near-misses. Late pregnancy haemorrhage was the leading cause of maternal mortality (60%), while sepsis had the highest mortality index (7.4%). Most cases (93%) were referred in critical condition from other facilities, namely traditional birth attendants' homes (67%), primary (5%) and secondary (10%) healthcare units and private practices (11%). 26% of near-miss cases were admitted to the Intensive Care Unit (ICU). Conclusion Near-miss analyses provide valuable information on obstetric care. The study highlights the need to improve antenatal care, which would help early identification of high-risk pregnancies. It also emphasises the importance of both: developing protocols to prevent/manage post
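    The summary indicators reported above follow directly from the raw counts; the live-birth total below is back-calculated (approximately) from the reported ratios and is therefore an assumption, not a figure from the study:

```python
# Raw counts from the abstract; live births back-calculated from the ratios.
deaths = 15
near_misses = 901
live_births = 27_396          # approximate, implied by the reported ratios

mnmr = near_misses / live_births * 1000        # near-miss ratio per 1000
mmr = deaths / live_births * 100_000           # mortality ratio per 100 000
mortality_index = deaths / (deaths + near_misses) * 100  # ~ the reported 1.7%

print(round(mnmr, 1), round(mmr, 1), round(mortality_index, 1))
```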

  16. Effect of correlation on covariate selection in linear and nonlinear mixed effect models.

    PubMed

    Bonate, Peter L

    2017-01-01

    The effect of correlation among covariates on covariate selection was examined with linear and nonlinear mixed effect models. Demographic covariates were extracted from the National Health and Nutrition Examination Survey III database. Concentration-time profiles were Monte Carlo simulated where only one covariate affected apparent oral clearance (CL/F). A series of univariate covariate population pharmacokinetic models was fit to the data and compared with the reduced model without the covariate. The "best" covariate was identified using either the likelihood ratio test (LRT) statistic or AIC. Weight and body surface area (calculated using the Gehan and George equation, 1970) were highly correlated (r = 0.98). Body surface area was often selected as a better covariate than weight, sometimes as often as 1 in 5 times, when weight was the covariate used in the data-generating mechanism. In a second simulation, parent drug concentration and three metabolites were simulated from a thorough QT study and used as covariates in a series of univariate linear mixed effects models of ddQTc interval prolongation. The covariate with the largest significant LRT statistic was deemed the "best" predictor. When the metabolite was formation-rate limited and only parent concentrations affected ddQTc intervals, the metabolite was chosen as a better predictor as often as 1 in 5 times, depending on the slope of the relationship between parent concentrations and ddQTc intervals. A correlated covariate can be chosen as being a better predictor than another covariate in a linear or nonlinear population analysis by sheer correlation. These results explain why for the same drug different covariates may be identified in different analyses. Copyright © 2016 John Wiley & Sons, Ltd.
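    The selection phenomenon described above can be reproduced in miniature with ordinary linear regression rather than mixed-effect models (synthetic data, invented coefficients): when two covariates correlate at r ≈ 0.98, their AICs are nearly tied, so the covariate that did not generate the data is sometimes selected.

```python
import numpy as np

rng = np.random.default_rng(6)

# Only x1 generates y, but x2 is almost collinear with x1 (r ≈ 0.98),
# mimicking the weight / body-surface-area pair in the abstract.
n = 50
x1 = rng.standard_normal(n)
x2 = 0.98 * x1 + np.sqrt(1 - 0.98**2) * rng.standard_normal(n)
y = 1.0 + 0.5 * x1 + rng.normal(0, 1.0, n)

def aic(x, y):
    """AIC of a one-covariate linear model fit by least squares."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = ((y - X @ beta) ** 2).sum()
    return len(y) * np.log(rss / len(y)) + 2 * 3   # k = 2 coefs + sigma^2

print(aic(x1, y), aic(x2, y))  # often nearly tied; x2 sometimes "wins"
```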

  17. 40 CFR 98.265 - Procedures for estimating missing data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 21 2011-07-01 2011-07-01 false Procedures for estimating missing data... estimating missing data. (a) For each missing value of the inorganic carbon content of phosphate rock or... immediately preceding and immediately following the missing data incident. You must document and keep records...

  18. 40 CFR 98.265 - Procedures for estimating missing data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 22 2013-07-01 2013-07-01 false Procedures for estimating missing data... estimating missing data. (a) For each missing value of the inorganic carbon content of phosphate rock or... immediately preceding and immediately following the missing data incident. You must document and keep records...

  19. 40 CFR 98.265 - Procedures for estimating missing data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 22 2012-07-01 2012-07-01 false Procedures for estimating missing data... estimating missing data. (a) For each missing value of the inorganic carbon content of phosphate rock or... immediately preceding and immediately following the missing data incident. You must document and keep records...

  20. Stable Estimation of a Covariance Matrix Guided by Nuclear Norm Penalties

    PubMed Central

    Chi, Eric C.; Lange, Kenneth

    2014-01-01

    Estimation of a covariance matrix or its inverse plays a central role in many statistical methods. For these methods to work reliably, estimated matrices must not only be invertible but also well-conditioned. The current paper introduces a novel prior to ensure a well-conditioned maximum a posteriori (MAP) covariance estimate. The prior shrinks the sample covariance estimator towards a stable target and leads to a MAP estimator that is consistent and asymptotically efficient. Thus, the MAP estimator gracefully transitions towards the sample covariance matrix as the number of samples grows relative to the number of covariates. The utility of the MAP estimator is demonstrated in two standard applications – discriminant analysis and EM clustering – in this sampling regime. PMID:25143662
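    The conditioning problem and the shrinkage remedy can be illustrated with a simple linear shrinkage toward a scaled identity, in the spirit of, but not identical to, the paper's nuclear-norm-penalized MAP estimator:

```python
import numpy as np

rng = np.random.default_rng(7)

# With the dimension p comparable to the sample size n, the raw sample
# covariance is ill-conditioned even when the true covariance is identity.
p, n = 40, 50
X = rng.standard_normal((n, p))          # true covariance = identity
S = np.cov(X, rowvar=False)

# Shrink toward a scaled-identity target; alpha is an illustrative choice,
# not the paper's data-driven penalty weight.
alpha = 0.3
target = (np.trace(S) / p) * np.eye(p)
S_shrunk = (1 - alpha) * S + alpha * target

print(np.linalg.cond(S), np.linalg.cond(S_shrunk))
```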