Use of the AIC with the EM algorithm: A demonstration of a probability model selection technique
Glosup, J.G.; Axelrod M.C.
1994-11-15
The problem of discriminating between two potential probability models, a Gaussian distribution and a mixture of Gaussian distributions, is considered. The focus of our interest is a case where the models are potentially non-nested and the parameters of the mixture model are estimated through the EM algorithm. The AIC, which is frequently used as a criterion for discriminating between non-nested models, is modified to work with the EM algorithm and is shown to provide a model selection tool for this situation. A particular problem involving an infinite mixture distribution known as Middleton's Class A model is used to demonstrate the effectiveness and limitations of this method.
Vrieze, Scott I.
2012-01-01
This article reviews the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) in model selection and the appraisal of psychological theory. The focus is on latent variable models given their growing use in theory testing and construction. We discuss theoretical statistical results in regression and illustrate more important issues with novel simulations involving latent variable models including factor analysis, latent profile analysis, and factor mixture models. Asymptotically, the BIC is consistent, in that it will select the true model if, among other assumptions, the true model is among the candidate models considered. The AIC is not consistent under these circumstances. When the true model is not in the candidate model set, the AIC is efficient, in that it will asymptotically choose whichever model minimizes the mean squared error of prediction/estimation. The BIC is not efficient under these circumstances. Unlike the BIC, the AIC also has a minimax property, in that it can minimize the maximum possible risk in finite sample sizes. In sum, the AIC and BIC have quite different properties that require different assumptions, and applied researchers and methodologists alike will benefit from improved understanding of the asymptotic and finite-sample behavior of these criteria. The ultimate decision to use AIC or BIC depends on many factors, including the loss function employed, the study's methodological design, the substantive research question, and the notion of a true model and its applicability to the study at hand. PMID:22309957
Vrieze, Scott I
2012-06-01
This article reviews the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in model selection and the appraisal of psychological theory. The focus is on latent variable models, given their growing use in theory testing and construction. Theoretical statistical results in regression are discussed, and more important issues are illustrated with novel simulations involving latent variable models including factor analysis, latent profile analysis, and factor mixture models. Asymptotically, the BIC is consistent, in that it will select the true model if, among other assumptions, the true model is among the candidate models considered. The AIC is not consistent under these circumstances. When the true model is not in the candidate model set the AIC is efficient, in that it will asymptotically choose whichever model minimizes the mean squared error of prediction/estimation. The BIC is not efficient under these circumstances. Unlike the BIC, the AIC also has a minimax property, in that it can minimize the maximum possible risk in finite sample sizes. In sum, the AIC and BIC have quite different properties that require different assumptions, and applied researchers and methodologists alike will benefit from improved understanding of the asymptotic and finite-sample behavior of these criteria. The ultimate decision to use the AIC or BIC depends on many factors, including the loss function employed, the study's methodological design, the substantive research question, and the notion of a true model and its applicability to the study at hand.
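The consistency-versus-efficiency contrast reviewed above follows directly from the two criteria's definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below uses hypothetical log-likelihoods (not values from the article) to show how BIC's log(n) penalty can favor a smaller model that AIC rejects:

```python
import math

def aic(log_lik, k):
    # Akaike information criterion: flat penalty of 2 per parameter
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    # Bayesian information criterion: penalty grows with log(sample size)
    return k * math.log(n) - 2 * log_lik

# Two hypothetical candidate models fit to n = 100 observations:
# model A (3 parameters) and a richer model B (7 parameters).
n = 100
ll_a, k_a = -210.0, 3
ll_b, k_b = -205.0, 7

print(aic(ll_a, k_a), aic(ll_b, k_b))        # AIC prefers B (424 < 426)
print(bic(ll_a, k_a, n), bic(ll_b, k_b, n))  # BIC prefers A
```

With the same fits and the same data, the two criteria disagree; which disagreement matters depends on whether the goal is recovering a true model or minimizing prediction error, exactly the distinction the abstract draws.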
Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective
Barker, Richard J.; Link, William A.
2015-01-01
Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self-consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
AIC649 Induces a Bi-Phasic Treatment Response in the Woodchuck Model of Chronic Hepatitis B
Paulsen, Daniela; Weber, Olaf; Ruebsamen-Schaeff, Helga; Tennant, Bud C.; Menne, Stephan
2015-01-01
AIC649 has been shown to directly address the antigen-presenting cell arm of the host immune defense, leading to a regulated cytokine release and activation of T cell responses. In the present study we analyzed the antiviral efficacy of AIC649 as well as its potential to induce functional cure in animal models for chronic hepatitis B. Hepatitis B virus transgenic mice and chronically woodchuck hepatitis virus (WHV)-infected woodchucks were treated with AIC649. In the mouse system, AIC649 decreased the hepatitis B virus titer as effectively as the “gold standard”, Tenofovir. Interestingly, AIC649-treated chronically WHV-infected woodchucks displayed a bi-phasic pattern of response: the marker for functional cure, hepatitis surface antigen, first increased but subsequently decreased, even after cessation of treatment, to significantly reduced levels. We hypothesize that the observed bi-phasic response pattern to AIC649 treatment reflects a physiologically “concerted”, reconstituted immune response against WHV and therefore may indicate a potential for inducing functional cure in HBV-infected patients. PMID:26656974
NASA Astrophysics Data System (ADS)
Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael
2016-04-01
Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades, much research has focused on its evaluation by means of stochastic approaches, under the assumption that 'the past is the key to the future': if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e., unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often little importance is given to the assessment of multicollinearity, whose effect is that the coefficient estimates are unstable, with opposite signs and therefore difficult to interpret. Multicollinearity should therefore be evaluated every time in order to obtain a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area shows an increasing trend of urbanization and has very high cultural heritage potential, being the place of discovery of the largest settlement of the Cucuteni Culture in Eastern Europe (which led to the development of the great Cucuteni-Trypillia complex). Therefore, identifying the areas susceptible to
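The VIF screening that this abstract contrasts with AIC stepwise selection has a standard definition: for predictor j, VIF_j = 1/(1 − R²_j), where R²_j comes from regressing predictor j on all the others. A minimal sketch on synthetic data (not the study's dataset) follows; the implementation via least squares is an illustrative assumption, not the authors' code:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor
    matrix X (n samples x p predictors): regress each column on the
    remaining ones and convert the R-squared into 1 / (1 - R^2)."""
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
X = np.column_stack([x1, x2, x3])
print([round(v, 1) for v in vif(X)])    # x1 and x3 show inflated VIFs
```

A common rule of thumb drops predictors with VIF above 5 or 10 before fitting the logistic regression, which is the spirit of the VIF stepwise selection described above.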
Warszycki, Dawid; Śmieja, Marek; Kafel, Rafał
2017-02-09
The Average Information Content Maximization algorithm (AIC-MAX), based on mutual information maximization, was recently introduced to select the most discriminatory features. Here, this methodology was applied to select the most significant bits from the Klekota-Roth fingerprint for serotonin receptor ligands, as well as to select the most important features for distinguishing ligands with activity for one receptor versus another. Machine-learning experiments performed using the reduced fingerprints outperformed those using the raw fingerprints, and interpretation of the selected bits indicated the most important structural features of the analyzed ligands in terms of activity and selectivity. Moreover, the AIC-MAX methodology applied here to serotonin receptor ligands can also be applied to other target classes.
Model Selection for Geostatistical Models
Hoeting, Jennifer A.; Davis, Richard A.; Merton, Andrew A.; Thompson, Sandra E.
2006-02-01
We consider the problem of model selection for geospatial data. Spatial correlation is typically ignored in the selection of explanatory variables and this can influence model selection results. For example, the inclusion or exclusion of particular explanatory variables may not be apparent when spatial correlation is ignored. To address this problem, we consider the Akaike Information Criterion (AIC) as applied to a geostatistical model. We offer a heuristic derivation of the AIC in this context and provide simulation results that show that using AIC for a geostatistical model is superior to the often used approach of ignoring spatial correlation in the selection of explanatory variables. These ideas are further demonstrated via a model for lizard abundance. We also employ the principle of minimum description length (MDL) to variable selection for the geostatistical model. The effect of sampling design on the selection of explanatory covariates is also explored.
Model selection for geostatistical models.
Hoeting, Jennifer A; Davis, Richard A; Merton, Andrew A; Thompson, Sandra E
2006-02-01
We consider the problem of model selection for geospatial data. Spatial correlation is often ignored in the selection of explanatory variables, and this can influence model selection results. For example, the importance of particular explanatory variables may not be apparent when spatial correlation is ignored. To address this problem, we consider the Akaike Information Criterion (AIC) as applied to a geostatistical model. We offer a heuristic derivation of the AIC in this context and provide simulation results that show that using AIC for a geostatistical model is superior to the often-used traditional approach of ignoring spatial correlation in the selection of explanatory variables. These ideas are further demonstrated via a model for lizard abundance. We also apply the principle of minimum description length (MDL) to variable selection for the geostatistical model. The effect of sampling design on the selection of explanatory covariates is also explored. R software to implement the geostatistical model selection methods described in this paper is available in the Supplement.
AIC and the challenge of complexity: A case study from ecology.
Moll, Remington J; Steel, Daniel; Montgomery, Robert A
2016-12-01
Philosophers and scientists alike have suggested that Akaike's Information Criterion (AIC), and other similar model selection methods, show that predictive accuracy justifies a preference for simplicity in model selection. This epistemic justification of simplicity is limited by an assumption of AIC, which requires that the same probability distribution generate both the data used to fit the model and the data about which predictions are made. This limitation has been noted previously but appears to often go unnoticed by philosophers and scientists, and has not been analyzed in relation to complexity. If predictions are about future observations, we argue that this assumption is unlikely to hold for models of complex phenomena. That in turn creates a practical limitation for simplicity's AIC-based justification, because scientists modeling such phenomena are often interested in predicting the future. We support our argument with an ecological case study concerning the reintroduction of wolves into Yellowstone National Park, U.S.A. We suggest that AIC might still lend epistemic support for simplicity by leading to better explanations of complex phenomena.
ERIC Educational Resources Information Center
Beretvas, S. Natasha; Murphy, Daniel L.
2013-01-01
The authors assessed correct model identification rates of Akaike's information criterion (AIC), corrected criterion (AICC), consistent AIC (CAIC), Hannan and Quinn's information criterion (HQIC), and Bayesian information criterion (BIC) for selecting among cross-classified random effects models. Performance of default values for the 5…
Mazerolle, M.J.
2006-01-01
In ecology, researchers frequently use observational studies to explain a given pattern, such as the number of individuals in a habitat patch, with a large number of explanatory (i.e., independent) variables. To elucidate such relationships, ecologists have long relied on hypothesis testing to include or exclude variables in regression models, although the conclusions often depend on the approach used (e.g., forward, backward, stepwise selection). Though better tools surfaced in the mid-1970s, they are still underutilized in certain fields, particularly in herpetology. This is the case of the Akaike information criterion (AIC), which is remarkably superior for model selection (i.e., variable selection) to hypothesis-based approaches. It is simple to compute and easy to understand, but more importantly, for a given data set, it provides a measure of the strength of evidence for each model that represents a plausible biological hypothesis relative to the entire set of models considered. Using this approach, one can then compute a weighted average of the estimate and standard error for any given variable of interest across all the models considered. This procedure, termed model averaging or multimodel inference, yields precise and robust estimates. In this paper, I illustrate the use of the AIC in model selection and inference, as well as the interpretation of results analysed in this framework, with two real herpetological data sets. The AIC and measures derived from it should be routinely adopted by herpetologists. © Koninklijke Brill NV 2006.
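The model-averaging procedure this abstract describes rests on Akaike weights: each model's AIC difference from the best model, Δᵢ = AICᵢ − min(AIC), is converted to a relative likelihood exp(−Δᵢ/2) and normalized. A small sketch with hypothetical AIC values (not from the paper's herpetological data sets):

```python
import math

def akaike_weights(aics):
    """Akaike weights: normalized relative likelihoods of the models.
    delta_i = AIC_i - min(AIC); w_i is proportional to exp(-delta_i / 2)."""
    best = min(aics)
    rel = [math.exp(-(a - best) / 2) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AIC values for three candidate habitat models
aics = [310.2, 312.0, 318.5]
w = akaike_weights(aics)
print([round(x, 3) for x in w])

# Model-averaged estimate of a slope appearing in all three models,
# weighting each model's estimate by its Akaike weight
betas = [0.42, 0.37, 0.55]
beta_avg = sum(wi * bi for wi, bi in zip(w, betas))
```

The weights sum to one and can be read as the strength of evidence for each model; the weighted estimate `beta_avg` is the multimodel-inference analogue of reporting a single best model's coefficient.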
On Model Selection Criteria in Multimodel Analysis
NASA Astrophysics Data System (ADS)
Meyer, P. D.; Ye, M.; Neuman, S. P.
2007-12-01
Hydrologic systems are open and complex, rendering them prone to multiple conceptualizations and mathematical descriptions. There has been a growing tendency to postulate several alternative hydrologic models for a site and use model selection criteria to (a) rank these models, (b) eliminate some of them and/or (c) weigh and average predictions and statistics generated by multiple models. This has led to some debate among hydrogeologists about the merits and demerits of common model selection (also known as model discrimination or information) criteria such as AIC, AICc, BIC, and KIC and some lack of clarity about the proper interpretation and mathematical representation of each criterion. In particular, whereas we [Neuman, 2003; Ye et al., 2004, 2005; Meyer et al., 2007] have based our approach to multimodel hydrologic ranking and inference on the Bayesian criterion KIC (which reduces asymptotically to BIC), Poeter and Anderson [2005] have voiced a strong preference for the information-theoretic criterion AICc (which reduces asymptotically to AIC). Their preference stems in part from a perception that KIC and BIC require a "true" or "quasi-true" model to be in the set of alternatives while AIC and AICc are free of such an unreasonable requirement. We examine the model selection literature to find that (a) all published rigorous derivations of AIC and AICc require that the (true) model having generated the observational data be in the set of candidate models; (b) though BIC and KIC were originally derived by assuming that such a model is in the set, BIC has been rederived by Cavanaugh and Neath [1999] without the need for such an assumption; (c) KIC reduces to BIC as the number of observations becomes large relative to the number of adjustable model parameters, implying that it likewise does not require the existence of a true model in the set of alternatives; (d) if a true model is in the set, BIC and KIC select with probability one the true model as sample size
Appropriate model selection methods for nonstationary generalized extreme value models
NASA Astrophysics Data System (ADS)
Kim, Hanbeen; Kim, Sooyoung; Shin, Hongjoon; Heo, Jun-Haeng
2017-04-01
Considerable evidence has accumulated that hydrologic data series are nonstationary in nature. This has motivated many studies in the area of nonstationary frequency analysis. Nonstationary probability distribution models involve parameters that vary over time; it is therefore not straightforward to apply conventional goodness-of-fit tests to the selection of an appropriate nonstationary probability distribution model. Tests generally recommended for such a selection include the Akaike information criterion (AIC), corrected Akaike information criterion (AICc), Bayesian information criterion (BIC), and likelihood ratio test (LRT). In this study, Monte Carlo simulations were performed to compare the performances of these four tests for nonstationary as well as stationary generalized extreme value (GEV) distributions. Proper model selection ratios and sample sizes were taken into account to evaluate the performances of all four tests. The BIC demonstrated the best performance for stationary GEV models. For nonstationary GEV models, the AIC proved better than the other three methods when relatively small sample sizes were considered; with larger sample sizes, the AIC, BIC, and LRT presented the best performances for GEV models with nonstationary location and/or scale parameters. Simulation results were then evaluated by applying all four tests to annual maximum rainfall data of selected sites, as observed by the Korea Meteorological Administration.
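Because a stationary GEV (3 parameters) is nested in a nonstationary GEV with a linear trend in location (4 parameters), the AIC comparison and the LRT used in this study reduce to simple formulas once the maximized log-likelihoods are in hand. The sketch below uses hypothetical log-likelihood values, not the study's rainfall data:

```python
# Hypothetical maximized log-likelihoods for a stationary GEV
# (3 parameters) and a nonstationary GEV with a linear trend in the
# location parameter (4 parameters), fit to the same series.
ll_stat, k_stat = -152.8, 3
ll_nonstat, k_nonstat = -149.1, 4

aic_stat = 2 * k_stat - 2 * ll_stat
aic_nonstat = 2 * k_nonstat - 2 * ll_nonstat

# Likelihood ratio test for the nested pair: the statistic is
# asymptotically chi-square with df = 1; 3.84 is the 5% critical value.
lrt = 2 * (ll_nonstat - ll_stat)
trend_supported = lrt > 3.84
print(aic_stat, aic_nonstat, lrt, trend_supported)
```

Here both criteria point the same way: the AIC of the trend model is lower and the LRT exceeds its critical value, so the nonstationary model would be retained. The simulations in the abstract compare how reliably such decisions recover the true model across sample sizes.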
Model selection bias and Freedman's paradox
Lukacs, P.M.; Burnham, K.P.; Anderson, D.R.
2010-01-01
In situations where limited knowledge of a system exists and the ratio of data points to variables is small, variable selection methods can often be misleading. Freedman (Am Stat 37:152-155, 1983) demonstrated how common it is to select completely unrelated variables as highly "significant" when the number of data points is similar in magnitude to the number of variables. A new type of model averaging estimator based on model selection with Akaike's AIC is used with linear regression to investigate the problems of likely inclusion of spurious effects and model selection bias, the bias introduced while using the data to select a single seemingly "best" model from a (often large) set of models employing many predictor variables. The new model averaging estimator helps reduce these problems and provides confidence interval coverage at the nominal level, while traditional stepwise selection has poor inferential properties. © The Institute of Statistical Mathematics, Tokyo 2009.
Autonomic Intelligent Cyber Sensor (AICS) Version 1.0.1
2015-03-01
The Autonomic Intelligent Cyber Sensor (AICS) provides cyber security and industrial network state awareness for Ethernet-based control network implementations. The AICS utilizes collaborative mechanisms based on Autonomic Research and a Service Oriented Architecture (SOA) to: 1) identify anomalous network traffic; 2) discover network entity information; 3) deploy deceptive virtual hosts; and 4) implement self-configuring modules. AICS achieves these goals by dynamically reacting to the industrial human-digital ecosystem in which it resides. Information is transported internally and externally on a standards-based, flexible two-level communication structure.
Posada, David; Buckley, Thomas R
2004-10-01
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
Model weights and the foundations of multimodel inference
Link, W.A.; Barker, R.J.
2006-01-01
Statistical thinking in wildlife biology and ecology has been profoundly influenced by the introduction of AIC (Akaike's information criterion) as a tool for model selection and as a basis for model averaging. In this paper, we advocate the Bayesian paradigm as a broader framework for multimodel inference, one in which model averaging and model selection are naturally linked, and in which the performance of AIC-based tools is naturally evaluated. Prior model weights implicitly associated with the use of AIC are seen to highly favor complex models: in some cases, all but the most highly parameterized models in the model set are virtually ignored a priori. We suggest the usefulness of the weighted BIC (Bayesian information criterion) as a computationally simple alternative to AIC, based on explicit selection of prior model probabilities rather than acceptance of default priors associated with AIC. We note, however, that both procedures are only approximations to the use of exact Bayes factors. We discuss and illustrate technical difficulties associated with Bayes factors, and suggest approaches to avoiding these difficulties in the context of model selection for a logistic regression. Our example highlights the predisposition of AIC weighting to favor complex models and suggests a need for caution in using the BIC for computing approximate posterior model weights.
Mission science value-cost savings from the Advanced Imaging Communication System (AICS)
NASA Technical Reports Server (NTRS)
Rice, R. F.
1984-01-01
An Advanced Imaging Communication System (AICS) was proposed in the mid-1970s as an alternative to the Voyager data/communication system architecture. The AICS achieved virtually error free communication with little loss in the downlink data rate by concatenating a powerful Reed-Solomon block code with the Voyager convolutionally coded, Viterbi decoded downlink channel. The clean channel allowed AICS sophisticated adaptive data compression techniques. Both Voyager and the Galileo mission have implemented AICS components, and the concatenated channel itself is heading for international standardization. An analysis that assigns a dollar value/cost savings to AICS mission performance gains is presented. A conservative value or savings of $3 million for Voyager, $4.5 million for Galileo, and as much as $7 to 9.5 million per mission for future projects such as the proposed Mariner Mar 2 series is shown.
Towards a Model Selection Rule for Quantum State Tomography
NASA Astrophysics Data System (ADS)
Scholten, Travis; Blume-Kohout, Robin
Quantum tomography on large and/or complex systems will rely heavily on model selection techniques, which permit on-the-fly selection of small efficient statistical models (e.g. small Hilbert spaces) that accurately fit the data. Many model selection tools, such as hypothesis testing or Akaike's AIC, rely implicitly or explicitly on the Wilks Theorem, which predicts the behavior of the loglikelihood ratio statistic (LLRS) used to choose between models. We used Monte Carlo simulations to study the behavior of the LLRS in quantum state tomography, and found that it disagrees dramatically with Wilks' prediction. We propose a simple explanation for this behavior; namely, that boundaries (in state space and between models) play a significant role in determining the distribution of the LLRS. The resulting distribution is very complex, depending strongly both on the true state and the nature of the data. We consider a simplified model that neglects anisotropy in the Fisher information, derive an almost analytic prediction for the mean value of the LLRS, and compare it to numerical experiments. While our simplified model outperforms the Wilks Theorem, it still does not predict the LLRS accurately, implying that alternative methods may be necessary for tomographic model selection. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE.
Thomas, D.L.; Johnson, D.; Griffith, B.
2006-01-01
We present a Bayesian random-effects model to assess resource selection, modeling the probability of use of land units characterized by discrete and continuous measures. This model provides simultaneous estimation of both individual- and population-level selection. The deviance information criterion (DIC), a Bayesian alternative to AIC that is sample-size specific, is used for model selection. Aerial radiolocation data from 76 adult female caribou (Rangifer tarandus) and calf pairs during 1 year on an Arctic coastal plain calving ground were used to illustrate models and assess population-level selection of landscape attributes, as well as individual heterogeneity of selection. Landscape attributes included elevation, NDVI (a measure of forage greenness), and land cover-type classification. Results from the first of a 2-stage model-selection procedure indicated that there is substantial heterogeneity among cow-calf pairs with respect to selection of the landscape attributes. In the second stage, selection of models with heterogeneity included indicated that, at the population level, NDVI and land cover class were significant attributes for selection of different landscapes by pairs on the calving ground. Population-level selection coefficients indicate that the pairs generally select landscapes with higher levels of NDVI, but the relationship is quadratic. The highest rate of selection occurs at values of NDVI less than the maximum observed. Results for land cover-class selection coefficients indicate that wet sedge, moist sedge, herbaceous tussock tundra, and shrub tussock tundra are selected at approximately the same rate, while alpine and sparsely vegetated landscapes are selected at a lower rate. Furthermore, the variability in selection by individual caribou for moist sedge and sparsely vegetated landscapes is large relative to the variability in selection of other land cover types. The example analysis illustrates that, while sometimes computationally intense, a
Model selection for the extraction of movement primitives
Endres, Dominik M.; Chiovetto, Enrico; Giese, Martin A.
2013-01-01
A wide range of blind source separation methods have been used in motor control research for the extraction of movement primitives from EMG and kinematic data. Popular examples are principal component analysis (PCA), independent component analysis (ICA), anechoic demixing, and the time-varying synergy model (d'Avella and Tresch, 2002). However, choosing the parameters of these models, or indeed choosing the type of model, is often done in a heuristic fashion, driven by result expectations as much as by the data. We propose an objective criterion which allows selection of the model type, the number of primitives, and the temporal smoothness prior. Our approach is based on a Laplace approximation to the posterior distribution of the parameters of a given blind source separation model, re-formulated as a Bayesian generative model. We first validate our criterion on ground truth data, showing that it performs at least as well as traditional model selection criteria [the Bayesian information criterion, BIC (Schwarz, 1978), and the Akaike Information Criterion, AIC (Akaike, 1974)]. Then, we analyze human gait data, finding that an anechoic mixture model with a temporal smoothness constraint on the sources can best account for the data. PMID:24391580
Quantitative Rheological Model Selection
NASA Astrophysics Data System (ADS)
Freund, Jonathan; Ewoldt, Randy
2014-11-01
The more parameters in a rheological model, the better it will reproduce available data, though this does not mean that it is necessarily a better-justified model. Good fits are only part of model selection. We employ a Bayesian inference approach that quantifies model suitability by balancing closeness to data against both the number of model parameters and their a priori uncertainty. The penalty depends upon the prior-to-calibration expectation of the viable range of values that model parameters might take, which we discuss as an essential aspect of the selection criterion. Models that are physically grounded are usually accompanied by tighter physical constraints on their respective parameters. The analysis reflects a basic principle: models grounded in physics can be expected to enjoy greater generality and perform better away from where they are calibrated. In contrast, purely empirical models can provide comparable fits, but the model selection framework penalizes their a priori uncertainty. We demonstrate the approach by selecting the best-justified number of modes in a multi-mode Maxwell description of PVA-Borax. We also quantify the relative merits of the Maxwell model relative to power-law fits and purely empirical fits for PVA-Borax, a viscoelastic liquid, and gluten.
Markon, Kristian E; Krueger, Robert F
2004-11-01
Information theory provides an attractive basis for statistical inference and model selection. However, little is known about the relative performance of different information-theoretic criteria in covariance structure modeling, especially in behavioral genetic contexts. To explore these issues, information-theoretic fit criteria were compared with regard to their ability to discriminate between multivariate behavioral genetic models under various model, distribution, and sample size conditions. Results indicate that performance depends on sample size, model complexity, and distributional specification. The Bayesian Information Criterion (BIC) is more robust to distributional misspecification than Akaike's Information Criterion (AIC) under certain conditions, and outperforms AIC in larger samples and when comparing more complex models. An approximation to the Minimum Description Length (MDL; Rissanen, J. (1996). IEEE Transactions on Information Theory 42:40-47, Rissanen, J. (2001). IEEE Transactions on Information Theory 47:1712-1717) criterion, involving the empirical Fisher information matrix, exhibits variable patterns of performance due to the complexity of estimating Fisher information matrices. Results indicate that a relatively new information-theoretic criterion, Draper's Information Criterion (DIC; Draper, 1995), which shares features of the Bayesian and MDL criteria, performs similarly to or better than BIC. Results emphasize the importance of further research into theory and computation of information-theoretic criteria.
Double point source W-phase inversion: Real-time implementation and automated model selection
Nealy, Jennifer; Hayes, Gavin
2015-01-01
Rapid and accurate characterization of an earthquake source is an extremely important and ever evolving field of research. Within this field, source inversion of the W-phase has recently been shown to be an effective technique, which can be efficiently implemented in real-time. An extension to the W-phase source inversion is presented in which two point sources are derived to better characterize complex earthquakes. A single source inversion followed by a double point source inversion with centroid locations fixed at the single source solution location can be efficiently run as part of earthquake monitoring network operational procedures. In order to determine the most appropriate solution, i.e., whether an earthquake is most appropriately described by a single source or a double source, an Akaike information criterion (AIC) test is performed. Analyses of all earthquakes of magnitude 7.5 and greater occurring since January 2000 were performed with extended analyses of the September 29, 2009 magnitude 8.1 Samoa earthquake and the April 19, 2014 magnitude 7.5 Papua New Guinea earthquake. The AIC test is shown to be able to accurately select the most appropriate model and the selected W-phase inversion is shown to yield reliable solutions that match published analyses of the same events.
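The single-versus-double source decision described here reduces to comparing penalized likelihoods. A minimal sketch of such an AIC test follows; the log-likelihoods and parameter counts below are invented for illustration and are not taken from the W-phase study.

```python
def aic(loglik, k):
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2.0 * k - 2.0 * loglik

# Hypothetical fit results: a double point source roughly doubles the
# number of source parameters, so it must improve the likelihood enough
# to offset the extra 2k penalty before it is selected.
aic_single = aic(loglik=120.0, k=6)
aic_double = aic(loglik=131.0, k=12)
preferred = "double" if aic_double < aic_single else "single"
```

With these invented numbers the likelihood gain of the second source outweighs the penalty, so the double-source model is preferred; a complex earthquake that is genuinely well described by one source would fail this test and keep the single-source solution.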
Complexity vs. simplicity: groundwater model ranking using information criteria.
Engelhardt, I; De Aguinaga, J G; Mikat, H; Schüth, C; Liedl, R
2014-01-01
A groundwater model characterized by a lack of field data on hydraulic model parameters and boundary conditions, combined with many observation data sets for calibration purposes, was investigated with respect to model uncertainty. Seven different conceptual models with a stepwise increase from 0 to 30 adjustable parameters were calibrated using PEST. Residuals, sensitivities, the Akaike information criterion (AIC and AICc), the Bayesian information criterion (BIC), and Kashyap's information criterion (KIC) were calculated for the set of seven inversely calibrated models of increasing complexity. Finally, the likelihood of each model was computed. Comparing only the residuals of the different conceptual models leads to overparameterization and a loss of confidence in the conceptual model approach. The model employing only uncalibrated hydraulic parameters, estimated from sedimentological information, obtained the worst AIC, BIC, and KIC values. Using only sedimentological data to derive hydraulic parameters introduces a systematic error into the simulation results and cannot be recommended for generating a valuable model. For numerical investigations with large numbers of calibration data, the BIC and KIC select a simpler optimal model than the AIC. The model with 15 adjusted parameters was evaluated by the AIC as the best option and obtained a likelihood of 98%. The AIC disregards potential model structure error, and selection by the KIC is therefore more appropriate. Sensitivities to piezometric heads were highest for the model with only five adjustable parameters, and sensitivity coefficients were directly influenced by changes in extracted groundwater volumes.
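Most of the criteria compared in this study can be written down in a few lines from the maximized log-likelihood, the parameter count, and the number of observations. A generic sketch follows (these are the standard textbook formulas, not code from the study; KIC additionally requires the Fisher information matrix and is omitted):

```python
import numpy as np

def info_criteria(loglik, k, n):
    """AIC, small-sample-corrected AICc, and BIC for a model with k
    adjustable parameters calibrated against n observations."""
    aic = 2 * k - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # second-order correction
    bic = k * np.log(n) - 2 * loglik              # heavier penalty for large n
    return aic, aicc, bic
```

Because BIC's penalty grows as log(n) while AIC charges a flat 2 per parameter, BIC (like KIC) tends to select simpler models than AIC once the number of calibration data is large, consistent with the behavior reported above.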
Selecting among competing models of electro-optic, infrared camera system range performance
Nichols, Jonathan M.; Hines, James E.; Nichols, James D.
2013-01-01
Range performance is often the key requirement around which electro-optical and infrared camera systems are designed. This work presents an objective framework for evaluating competing range performance models. Model selection based on the Akaike’s Information Criterion (AIC) is presented for the type of data collected during a typical human observer and target identification experiment. These methods are then demonstrated on observer responses to both visible and infrared imagery in which one of three maritime targets was placed at various ranges. We compare the performance of a number of different models, including those appearing previously in the literature. We conclude that our model-based approach offers substantial improvements over the traditional approach to inference, including increased precision and the ability to make predictions for some distances other than the specific set for which experimental trials were conducted.
Test procedures, AN/AIC-27 system and component units. [for space shuttle
NASA Technical Reports Server (NTRS)
Reiff, F. H.
1975-01-01
The AN/AIC-27 (v) intercommunication system is a 30-channel audio distribution system which consists of: air crew station units, maintenance station units, and a central control unit. A test procedure for each of the above units, as well as a test procedure for the overall system, is presented. The intent of the tests is to provide data for use in shuttle audio subsystem design.
Individual Influence on Model Selection
ERIC Educational Resources Information Center
Sterba, Sonya K.; Pek, Jolynn
2012-01-01
Researchers in psychology are increasingly using model selection strategies to decide among competing models, rather than evaluating the fit of a given model in isolation. However, such interest in model selection outpaces an awareness that one or a few cases can have disproportionate impact on the model ranking. Though case influence on the fit…
[Preparation and characterization of poly-Si films on different topography substrates by AIC].
Wang, Cheng-Long; Fan, Duo-Wang; Liu, Hong-Zhong; Zhang, Fu-Jia; Xing, Da; Liu, Song-Hao
2009-03-01
Polycrystalline silicon (poly-Si) thin films were made on planar and textured glass substrates by aluminum-induced crystallization (AIC) of in situ amorphous silicon (a-Si) deposited by DC magnetron sputtering. The poly-Si films were characterized by Raman spectroscopy, X-ray diffraction (XRD) and atomic force microscopy (AFM). A narrow and symmetrical Raman peak at a wave number of about 521 cm(-1) was observed for all samples, indicating that the films were fully crystallized. XRD results show that the crystallites in the authors' AIC poly-Si films were preferentially (111) oriented. Measurement of the full width at half maximum (FWHM) of the (111) XRD peaks showed that the quality of the films was affected by the a-Si deposition temperature and the surface morphology of the glass substrates. An a-Si deposition temperature of 200 degrees C appears to be ideal for the preparation of poly-Si films by AIC.
A new method for arrival time determination of impact signal based on HHT and AIC
NASA Astrophysics Data System (ADS)
Liu, Mingzhou; Yang, Jiangxin; Cao, Yanpeng; Fu, Weinan; Cao, Yanlong
2017-03-01
The time-difference method is commonly used to locate loose parts in nuclear power plants; the key to this method is estimating the arrival time of the impact signal caused by the crash of a loose part. However, the dispersion behavior of the impact signal and the noise of the nuclear power station primary circuit have a negative effect on arrival time determination. In this paper, a method for arrival time determination of impact signals based on the Hilbert-Huang Transform (HHT) and the Akaike Information Criterion (AIC) is proposed. First, the impact signal is decomposed by Empirical Mode Decomposition (EMD). Then the instantaneous frequency of the first intrinsic mode function (IMF) is calculated, which characterizes the difference between the background noise and the impact signal. The arrival time is finally determined by an AIC function. The proposed method is tested through a simulation experiment using steel balls as the loose parts. The deviation between the arrival time determined by the proposed method and the true arrival time is stably distributed under different SNRs and different sensor-to-drop-point distances, mostly within the range ±0.5 ms. The proposed method is also compared with another AIC technique and an RMS approach, both of which show a more dispersed distribution of deviations, with many values outside the range ±1 ms.
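The AIC onset picker at the core of such methods fits two variance regimes around every candidate split point and takes the minimum. A minimal sketch follows, applied directly to a raw two-variance signal; in the paper the AIC function is evaluated on the instantaneous frequency of the first IMF rather than on the waveform itself, and the function name and test signal here are ours.

```python
import numpy as np

def aic_pick(x):
    """Maeda-type AIC picker for an onset in a 1-D signal:
    AIC(k) = k*ln(var(x[:k])) + (n-k-1)*ln(var(x[k:])).
    The global minimum marks the change from noise to signal."""
    n = len(x)
    aic = np.full(n, np.inf)
    for k in range(2, n - 2):
        v1 = np.var(x[:k])    # variance of the pre-onset segment
        v2 = np.var(x[k:])    # variance of the post-onset segment
        if v1 > 0 and v2 > 0:
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.argmin(aic))
```

On a synthetic trace of quiet background noise followed by a louder impact segment, the picked index lands at the variance changepoint, which is exactly the arrival-time estimate the time-difference method needs.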
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-12-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible.
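The brute-force Monte Carlo reference used in this study estimates BME by averaging the likelihood over draws from the prior. A minimal sketch, with a log-mean-exp trick for numerical stability (function and argument names are ours):

```python
import numpy as np

def bme_monte_carlo(log_likelihood, prior_sampler, n_samples=100_000):
    """Brute-force Monte Carlo estimate of log Bayesian model evidence:
    BME = integral of likelihood * prior = E_prior[likelihood]."""
    theta = prior_sampler(n_samples)                  # (n_samples, n_params)
    ll = np.array([log_likelihood(t) for t in theta])
    m = ll.max()
    # log of the mean likelihood, computed without overflow/underflow
    return m + np.log(np.mean(np.exp(ll - m)))
```

In a conjugate Gaussian toy problem (one datum y = 0 with unit noise and a standard normal prior on the mean) the exact log evidence is -0.5*ln(4*pi), which the estimator recovers; for expensive hydrological models this integral is precisely what becomes unfeasible, motivating the IC approximations whose bias the study quantifies.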
Selecting a distributional assumption for modelling relative densities of benthic macroinvertebrates
Gray, B.R.
2005-01-01
The selection of a distributional assumption suitable for modelling macroinvertebrate density data is typically challenging. Macroinvertebrate data often exhibit substantially larger variances than expected under a standard count assumption, that of the Poisson distribution. Such overdispersion may derive from multiple sources, including heterogeneity of habitat (historically and spatially), differing life histories for organisms collected within a single collection in space and time, and autocorrelation. Taken to the extreme, heterogeneity of habitat may be argued to explain the frequently large proportions of zero observations in macroinvertebrate data. Sampling locations may consist of habitats defined qualitatively as either suitable or unsuitable. The former category may yield random or stochastic zeroes and the latter structural zeroes. Heterogeneity among counts may be accommodated by treating the count mean itself as a random variable, while extra zeroes may be accommodated using zero-modified count assumptions, including zero-inflated and two-stage (or hurdle) approaches. These and linear assumptions (following log- and square root-transformations) were evaluated using 9 years of mayfly density data from a 52 km, ninth-order reach of the Upper Mississippi River (n = 959). The data exhibited substantial overdispersion relative to that expected under a Poisson assumption (i.e. variance:mean ratio = 23 ≫ 1), and 43% of the sampling locations yielded zero mayflies. Based on the Akaike Information Criterion (AIC), count models were improved most by treating the count mean as a random variable (via a Poisson-gamma distributional assumption) and secondarily by zero modification (improvements in AIC values of 9184 and 47-48 units, respectively). Zeroes were underestimated by the Poisson, log-transform, and square root-transform models (by 61%, 24%, and 32%, respectively), slightly by the standard negative binomial model (7%), but not by the zero-modified models (0%).
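The Poisson versus Poisson-gamma comparison reported above can be reproduced in miniature with scipy: simulating overdispersed counts by drawing the Poisson mean from a gamma distribution yields a negative binomial marginal, which AIC then strongly prefers over plain Poisson. This is a sketch with invented simulation settings, not the mayfly data.

```python
import numpy as np
from scipy import stats, optimize

def aic_poisson(y):
    lam = y.mean()                               # Poisson MLE of the mean
    return 2 * 1 - 2 * stats.poisson.logpmf(y, lam).sum()

def aic_nbinom(y):
    """AIC of a negative binomial (Poisson-gamma) fit: profile the
    likelihood over the dispersion r, with p tied to the sample mean."""
    def nll(log_r):
        r = np.exp(log_r)
        p = r / (r + y.mean())
        return -stats.nbinom.logpmf(y, r, p).sum()
    res = optimize.minimize_scalar(nll, bounds=(-5.0, 5.0), method="bounded")
    return 2 * 2 + 2 * res.fun                   # k = 2 (mean and dispersion)

# overdispersed counts: the Poisson mean is itself gamma-distributed,
# so the marginal distribution is negative binomial
rng = np.random.default_rng(7)
y = rng.poisson(rng.gamma(shape=0.5, scale=10.0, size=500))
```

On such data the extra dispersion parameter buys a far larger likelihood gain than its AIC penalty costs, mirroring the 9184-unit improvement the study reports for the random-mean assumption.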
ERIC Educational Resources Information Center
Bogiages, Christopher A.; Lotter, Christine
2011-01-01
In their research, scientists generate, test, and modify scientific models. These models can be shared with others and demonstrate a scientist's understanding of how the natural world works. Similarly, students can generate and modify models to gain a better understanding of the content, process, and nature of science (Kenyon, Schwarz, and Hug…
Empirical evaluation of scoring functions for Bayesian network model selection.
Liu, Zhifa; Malone, Brandon; Yuan, Changhe
2012-01-01
In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets. We also
Individual influence on model selection.
Sterba, Sonya K; Pek, Jolynn
2012-12-01
Researchers in psychology are increasingly using model selection strategies to decide among competing models, rather than evaluating the fit of a given model in isolation. However, such interest in model selection outpaces an awareness that one or a few cases can have disproportionate impact on the model ranking. Though case influence on the fit of a single model in isolation has been often studied, case influence on model selection results is greatly underappreciated in psychology. This article introduces the issue of case influence on model selection and proposes 3 influence diagnostics for commonly used selection indices: the chi-square difference test, Bayesian information criterion, and Akaike's information criterion. These 3 diagnostics can be obtained simply from the byproducts of full information maximum likelihood estimation without heavy computational burden. We provide practical information on the interpretation and behavior of these diagnostics for applied researchers and provide software code to facilitate their use. Simulated and empirical examples involving different kinds of model comparison scenarios encountered in cross-sectional, longitudinal, and multilevel research as well as involving different kinds of outcome distributions illustrate the generality of the proposed diagnostics. An awareness of how cases influence model selection results is shown to aid researchers in understanding how representative their sample level results are at the case level.
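One way to see how a single case can drive a model ranking: an AIC difference between two models decomposes into casewise log-likelihood contributions, so each case's term approximates its influence on the selection result. The sketch below does this for two Gaussian linear models; it is our simplification of the idea, not the article's diagnostics, and exact case influence would refit both models with the case deleted.

```python
import numpy as np

def casewise_loglik(y, X, beta, sigma2):
    """Per-case Gaussian log-likelihood contributions of a linear model."""
    r = y - X @ beta
    return -0.5 * (np.log(2 * np.pi * sigma2) + r**2 / sigma2)

def delta_aic_influence(y, XA, XB):
    """Casewise contributions to the AIC difference between two linear
    models: large entries flag cases that drive the ranking.  (A one-step
    approximation from full-sample fits, without refitting per case.)"""
    def fit(X):
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        return beta, np.mean((y - X @ beta) ** 2)
    bA, s2A = fit(XA)
    bB, s2B = fit(XB)
    return 2 * (casewise_loglik(y, XB, bB, s2B)
                - casewise_loglik(y, XA, bA, s2A))
```

Summing the returned array recovers (up to the parameter-count penalty) the overall AIC gap, so a case whose term rivals that gap can single-handedly flip which model is selected.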
Launch vehicle selection model
NASA Technical Reports Server (NTRS)
Montoya, Alex J.
1990-01-01
Over the next 50 years, humans will be heading for the Moon and Mars to build scientific bases to gain further knowledge about the universe and to develop rewarding space activities. These large scale projects will last many years and will require large amounts of mass to be delivered to Low Earth Orbit (LEO). It will take a great deal of planning to complete these missions in an efficient manner. The planning of a future Heavy Lift Launch Vehicle (HLLV) will significantly impact the overall multi-year launching cost for the vehicle fleet depending upon when the HLLV will be ready for use. It is desirable to develop a model in which many trade studies can be performed. In one sample multi-year space program analysis, the total launch vehicle cost of implementing the program was reduced from 50 percent to 25 percent. This indicates how critical it is to reduce space logistics costs. A linear programming model has been developed to answer such questions. The model is now in its second phase of development, and this paper will address the capabilities of the model and its intended uses. The main emphasis over the past year was to make the model user friendly and to incorporate additional realistic constraints that are difficult to represent mathematically. We have developed a methodology in which the user has to be knowledgeable about the mission model and the requirements of the payloads. We have found a representation that will cut down the solution space of the problem by inserting some preliminary tests to eliminate some infeasible vehicle solutions. The paper will address the handling of these additional constraints and the methodology for incorporating new costing information utilizing learning curve theory. The paper will review several test cases that will explore the preferred vehicle characteristics and the preferred period of construction, i.e., within the next decade, or in the first decade of the next century. Finally, the paper will explore the interaction
Regularization Parameter Selections via Generalized Information Criterion
Zhang, Yiyun; Li, Runze; Tsai, Chih-Ling
2009-01-01
We apply the nonconcave penalized likelihood approach to obtain variable selections as well as shrinkage estimators. This approach relies heavily on the choice of regularization parameter, which controls the model complexity. In this paper, we propose employing the generalized information criterion (GIC), encompassing the commonly used Akaike information criterion (AIC) and Bayesian information criterion (BIC), for selecting the regularization parameter. Our proposal makes a connection between the classical variable selection criteria and the regularization parameter selections for the nonconcave penalized likelihood approaches. We show that the BIC-type selector enables identification of the true model consistently, and the resulting estimator possesses the oracle property in the terminology of Fan and Li (2001). In contrast, however, the AIC-type selector tends to overfit with positive probability. We further show that the AIC-type selector is asymptotically loss efficient, while the BIC-type selector is not. Our simulation results confirm these theoretical findings, and an empirical example is presented. Some technical proofs are given in the online supplementary material. PMID:20676354
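In the orthonormal-design special case the penalized fit has a closed form, which makes a GIC-based selector easy to sketch: setting a_n = 2 gives the AIC-type selector that the paper shows tends to overfit, while a_n = log(n) gives the consistent BIC-type selector. All names below are ours, and the soft-threshold shortcut assumes X has orthonormal columns.

```python
import numpy as np

def gic_select_lambda(y, X, lambdas, a_n):
    """Pick the regularization parameter minimizing the generalized
    information criterion GIC = n*log(RSS/n) + a_n * df, where df is
    the number of nonzero coefficients."""
    n = len(y)
    best_lam, best_gic = None, np.inf
    for lam in lambdas:
        # closed-form lasso fit for orthonormal columns: soft-threshold
        # the least-squares coefficients at lam
        b_ols = X.T @ y
        b = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)
        rss = np.sum((y - X @ b) ** 2)
        gic = n * np.log(rss / n) + a_n * np.count_nonzero(b)
        if gic < best_gic:
            best_lam, best_gic = lam, gic
    return best_lam
```

With a_n = log(n), a lambda large enough to zero out a near-null coefficient wins over lambda = 0 because the df saving outweighs the small RSS increase, which is the consistency behavior the BIC-type selector is shown to have.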
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of a set of potential regressors have an effect and hence should be included in the final model. A second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions are answered with classical as well as Bayesian methods. The application shows some results of recent research projects in medicine and business administration.
Comparison of Two Gas Selection Methodologies: An Application of Bayesian Model Averaging
Renholds, Andrea S.; Thompson, Sandra E.; Anderson, Kevin K.; Chilton, Lawrence K.
2006-03-31
One goal of hyperspectral imagery analysis is the detection and characterization of plumes. Characterization includes identifying the gases in the plumes, which is a model selection problem. Two gas selection methods compared in this report are Bayesian model averaging (BMA) and minimum Akaike information criterion (AIC) stepwise regression (SR). Simulated spectral data from a three-layer radiance transfer model were used to compare the two methods. Test gases were chosen to span the types of spectra observed, which exhibit peaks ranging from broad to sharp. The size and complexity of the search libraries were varied. Background materials were chosen to either replicate a remote area of eastern Washington or feature many common background materials. For many cases, BMA and SR performed the detection task comparably in terms of the receiver operating characteristic curves. For some gases, BMA performed better than SR when the size and complexity of the search library increased. This is encouraging because we expect improved BMA performance upon incorporation of prior information on background materials and gases.
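With an information-criterion approximation to each candidate model's evidence, BMA weights reduce to the familiar exp(-delta/2) form, where delta is each model's criterion value minus the minimum. A minimal sketch (the input values below are illustrative, not from the report):

```python
import numpy as np

def model_weights(criterion_values):
    """Normalized model weights from AIC or BIC values:
    w_i is proportional to exp(-0.5 * (IC_i - IC_min))."""
    c = np.asarray(criterion_values, dtype=float)
    delta = c - c.min()          # subtract the minimum for stability
    w = np.exp(-0.5 * delta)
    return w / w.sum()
```

A criterion gap of 2 between two models makes the better one e times as heavily weighted; averaging gas-presence indicators under these weights is what distinguishes BMA from the winner-take-all choice made by stepwise regression.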
Multidimensional Rasch Model Information-Based Fit Index Accuracy
ERIC Educational Resources Information Center
Harrell-Williams, Leigh M.; Wolfe, Edward W.
2013-01-01
Most research on confirmatory factor analysis using information-based fit indices (Akaike information criterion [AIC], Bayesian information criteria [BIC], bias-corrected AIC [AICc], and consistent AIC [CAIC]) has used a structural equation modeling framework. Minimal research has been done concerning application of these indices to item response…
Selected Tether Applications Cost Model
NASA Technical Reports Server (NTRS)
Keeley, Michael G.
1988-01-01
Diverse cost-estimating techniques and data combined into single program. Selected Tether Applications Cost Model (STACOM 1.0) is interactive accounting software tool providing means for combining several independent cost-estimating programs into fully-integrated mathematical model capable of assessing costs, analyzing benefits, providing file-handling utilities, and putting out information in text and graphical forms to screen, printer, or plotter. Program based on Lotus 1-2-3, version 2.0. Developed to provide clear, concise traceability and visibility into methodology and rationale for estimating costs and benefits of operations of Space Station tether deployer system.
Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data
Xu, Lizhen; Paterson, Andrew D.; Turpin, Williams; Xu, Wei
2015-01-01
Typical data in a microbiome study consist of the operational taxonomic unit (OTU) counts that have the characteristic of excess zeros, which are often ignored by investigators. In this paper, we compare the performance of different competing methods to model data with zero inflated features through extensive simulations and application to a microbiome study. These methods include standard parametric and non-parametric models, hurdle models, and zero inflated models. We examine varying degrees of zero inflation, with or without dispersion in the count component, as well as different magnitude and direction of the covariate effect on structural zeros and the count components. We focus on the assessment of type I error, power to detect the overall covariate effect, measures of model fit, and bias and effectiveness of parameter estimations. We also evaluate the abilities of model selection strategies using Akaike information criterion (AIC) or Vuong test to identify the correct model. The simulation studies show that hurdle and zero inflated models have well controlled type I errors, higher power, better goodness of fit measures, and are more accurate and efficient in the parameter estimation. Besides that, the hurdle models have similar goodness of fit and parameter estimation for the count component as their corresponding zero inflated models. However, the estimation and interpretation of the parameters for the zero components differs, and hurdle models are more stable when structural zeros are absent. We then discuss the model selection strategy for zero inflated data and implement it in a gut microbiome study of > 400 independent subjects. PMID:26148172
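The Vuong test mentioned above compares two non-nested models through the pointwise differences of their log-likelihood contributions: under the null that the models are equally close to the truth, the standardized mean difference is asymptotically standard normal. A minimal sketch (array names are ours):

```python
import numpy as np
from scipy import stats

def vuong_test(ll_a, ll_b):
    """Vuong closeness test from per-observation log-likelihood
    contributions ll_a, ll_b (length-n arrays).  Returns (z, p);
    a large positive z favors model A, a large negative z model B."""
    m = np.asarray(ll_a) - np.asarray(ll_b)
    n = len(m)
    z = np.sqrt(n) * m.mean() / m.std(ddof=1)
    p = 2 * stats.norm.sf(abs(z))   # two-sided p-value
    return z, p
```

Unlike an AIC comparison, which always produces a ranking, the Vuong test can conclude that neither model (for example, zero-inflated versus hurdle) fits significantly better, which is useful when the two give similar count-component fits as in the simulations above.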
Model selection for modified gravity.
Kitching, T D; Simpson, F; Heavens, A F; Taylor, A N
2011-12-28
In this article, we review model selection predictions for modified gravity scenarios as an explanation for the observed acceleration of the expansion history of the Universe. We present analytical procedures for calculating expected Bayesian evidence values in two cases: (i) that modified gravity is a simple parametrized extension of general relativity (GR; two nested models), such that a Bayes' factor can be calculated, and (ii) that we have a class of non-nested models where a rank-ordering of evidence values is required. We show that, in the case of a minimal modified gravity parametrization, we can expect large area photometric and spectroscopic surveys, using three-dimensional cosmic shear and baryonic acoustic oscillations, to 'decisively' distinguish modified gravity models over GR (or vice versa), with odds of ≫1:100. It is apparent that the potential discovery space for modified gravity models is large, even in a simple extension to gravity models, where Newton's constant G is allowed to vary as a function of time and length scale. On the time and length scales where dark energy dominates, it is only through large-scale cosmological experiments that we can hope to understand the nature of gravity.
Model selection for pion photoproduction
NASA Astrophysics Data System (ADS)
Landay, J.; Döring, M.; Fernández-Ramírez, C.; Hu, B.; Molina, R.
2017-01-01
Partial-wave analysis of meson- and photon-induced reactions is needed to enable the comparison of many theoretical approaches to data. In both energy-dependent and energy-independent parametrizations of partial waves, the selection of the model amplitude is crucial. Principles of the S matrix are implemented to differing degrees in different approaches, but an often overlooked aspect concerns the selection of undetermined coefficients and functional forms for fitting, leading to a minimal yet sufficient parametrization. We present an analysis of low-energy neutral pion photoproduction using the least absolute shrinkage and selection operator (LASSO) in combination with criteria from information theory and K-fold cross validation. These methods are not yet widely known in the analysis of excited hadrons but will become relevant in the era of precision spectroscopy. The principle is first illustrated with synthetic data; then, its feasibility for real data is demonstrated by analyzing the latest available measurements of differential cross sections (dσ/dΩ), photon-beam asymmetries (Σ), and target asymmetry differential cross sections (dσ_T/dΩ ≡ T dσ/dΩ) in the low-energy regime.
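The LASSO-plus-information-criterion strategy can be sketched generically (this is not the partial-wave amplitude parametrization of the paper): solve the ℓ1-penalized least-squares problem by iterative soft thresholding (ISTA) over a grid of penalties, then pick the penalty whose fit minimizes AIC, using the active-set size as the effective number of parameters. The data, penalty grid, and degrees-of-freedom convention below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)   # only two active coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=100)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||y - Xb||^2 + lam*||b||_1 by iterative soft thresholding."""
    L = np.linalg.norm(X, 2) ** 2               # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - X.T @ (X @ b - y) / L           # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return b

def aic(X, y, b):
    rss = np.sum((y - X @ b) ** 2)
    k = np.count_nonzero(b)                     # active set size as effective df
    return len(y) * np.log(rss / len(y)) + 2 * k

lams = [0.1, 1.0, 5.0, 20.0, 80.0]
fits = {lam: lasso_ista(X, y, lam) for lam in lams}
best = min(lams, key=lambda lam: aic(X, y, fits[lam]))
print(best, np.count_nonzero(fits[best]))
```

Whichever penalty AIC selects, the two genuinely active coefficients survive the shrinkage; in a partial-wave fit the same mechanism prunes undetermined expansion coefficients toward a minimal yet sufficient parametrization.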
Selected Logistics Models and Techniques.
1984-09-01
[Scanned table-of-contents fragment; most of the text is unrecoverable OCR residue. Recoverable entries: TI-59 Programmable Calculator LCC (life-cycle cost) Model and Unmanned Spacecraft Cost Model; the TI-59 LCC Model is indexed as a cost-estimating model/technique.]
IRT Model Selection Methods for Dichotomous Items
ERIC Educational Resources Information Center
Kang, Taehoon; Cohen, Allan S.
2007-01-01
Fit of the model to the data is important if the benefits of item response theory (IRT) are to be obtained. In this study, the authors compared model selection results using the likelihood ratio test, two information-based criteria, and two Bayesian methods. An example illustrated the potential for inconsistency in model selection depending on…
Variable Selection in Semiparametric Regression Modeling.
Li, Runze; Liang, Hua
2008-01-01
In this paper, we are concerned with how to select significant variables in semiparametric modeling. Variable selection for semiparametric regression models consists of two components: model selection for the nonparametric components and selection of significant variables for the parametric portion. Thus, it is much more challenging than variable selection for parametric models such as linear models and generalized linear models, because traditional variable selection procedures, including stepwise regression and best subset selection, require model selection for the nonparametric components of each submodel. This leads to a very heavy computational burden. In this paper, we propose a class of variable selection procedures for semiparametric regression models using nonconcave penalized likelihood. The newly proposed procedures are distinguished from the traditional ones in that they delete insignificant variables and estimate the coefficients of significant variables simultaneously. This allows us to establish the sampling properties of the resulting estimate. We first establish the rate of convergence of the resulting estimate. With proper choices of penalty functions and regularization parameters, we then establish the asymptotic normality of the resulting estimate and further demonstrate that the proposed procedures perform as well as an oracle procedure. A semiparametric generalized likelihood ratio test is proposed to select significant variables in the nonparametric component. We investigate the asymptotic behavior of the proposed test and demonstrate that its limiting null distribution is a chi-squared distribution independent of the nuisance parameters. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedures.
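The nonconcave (SCAD) penalty of Fan and Li that underlies such procedures has a closed-form univariate thresholding rule in the orthonormal-design case, sketched below. The three regimes show why SCAD deletes insignificant variables (soft thresholding near zero) yet leaves large coefficients unshrunken (near-unbiasedness); the test values are illustrative.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD-penalized least-squares solution (Fan & Li's rule)."""
    z = np.asarray(z, dtype=float)
    return np.where(
        np.abs(z) <= 2 * lam,
        np.sign(z) * np.maximum(np.abs(z) - lam, 0.0),       # soft threshold
        np.where(
            np.abs(z) <= a * lam,
            ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),  # linear transition
            z,                                               # no shrinkage at all
        ),
    )

print(scad_threshold(np.array([0.5, 1.5, 3.0, 10.0]), lam=1.0))
```

Small inputs (0.5) are set exactly to zero, moderate ones are shrunken, and large ones (10.0) pass through unchanged; this is the mechanism behind simultaneous deletion and estimation described in the abstract.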
Student learning using the natural selection model
NASA Astrophysics Data System (ADS)
Mesmer, Karen Luann
Students often have difficulty in learning natural selection, a major model in biology. This study examines what middle school students are capable of learning when taught about natural selection using a modeling approach. Students were taught the natural selection model including the components of population, variation, selective advantage, survival, heredity and reproduction. They then used the model to solve three case studies. Their learning was evaluated from responses on a pretest, a posttest and interviews. The results suggest that middle school students can identify components of the natural selection model in a Darwinian explanation, explain the significance of the components and relate them to each other as well as solve evolutionary problems using the model.
MODEL SELECTION FOR SPECTROPOLARIMETRIC INVERSIONS
Asensio Ramos, A.; Manso Sainz, R.; Martinez Gonzalez, M. J.; Socas-Navarro, H.; Viticchie, B.
2012-04-01
Inferring magnetic and thermodynamic information from spectropolarimetric observations relies on the assumption of a parameterized model atmosphere whose parameters are tuned by comparison with observations. Often, the choice of the underlying atmospheric model is based on subjective reasons. In other cases, complex models are chosen based on objective reasons (for instance, the necessity to explain asymmetries in the Stokes profiles) but it is not clear what degree of complexity is needed. The lack of an objective way of comparing models has, sometimes, led to opposing views of the solar magnetism because the inferred physical scenarios are essentially different. We present the first quantitative model comparison based on the computation of the Bayesian evidence ratios for spectropolarimetric observations. Our results show that there is not a single model appropriate for all profiles simultaneously. Data with moderate signal-to-noise ratios (S/Ns) favor models without gradients along the line of sight. If the observations show clear circular and linear polarization signals above the noise level, models with gradients along the line are preferred. As a general rule, observations with large S/Ns favor more complex models. We demonstrate that the evidence ratios correlate well with simple proxies. Therefore, we propose to calculate these proxies when carrying out standard least-squares inversions to allow for model comparison in the future.
An Economic Model for Selective Admissions
ERIC Educational Resources Information Center
Haglund, Alma
1978-01-01
The author presents an economic model for selective admissions to postsecondary nursing programs. Primary determinants of the admissions model are employment needs, availability of educational resources, and personal resources (ability and learning potential). As there are more applicants than resources, selective admission practices are…
A Computational Model of Selection by Consequences
ERIC Educational Resources Information Center
McDowell, J. J.
2004-01-01
Darwinian selection by consequences was instantiated in a computational model that consisted of a repertoire of behaviors undergoing selection, reproduction, and mutation over many generations. The model in effect created a digital organism that emitted behavior continuously. The behavior of this digital organism was studied in three series of…
Estimation and Accuracy after Model Selection
Efron, Bradley
2013-01-01
Classical statistical theory ignores model selection in assessing estimation accuracy. Here we consider bootstrap methods for computing standard errors and confidence intervals that take model selection into account. The methodology involves bagging, also known as bootstrap smoothing, to tame the erratic discontinuities of selection-based estimators. A useful new formula for the accuracy of bagging then provides standard errors for the smoothed estimators. Two examples, nonparametric and parametric, are carried through in detail: a regression model where the choice of degree (linear, quadratic, cubic, …) is determined by the Cp criterion, and a Lasso-based estimation problem. PMID:25346558
Review and selection of unsaturated flow models
Reeves, M.; Baker, N.A.; Duguid, J.O.
1994-04-04
Since the 1960s, ground-water flow models have been used for analysis of water resources problems. In the 1970s, emphasis began to shift to analysis of waste management problems. This shift in emphasis was largely brought about by site selection activities for geologic repositories for disposal of high-level radioactive wastes. Model development during the 1970s and well into the 1980s focused primarily on saturated ground-water flow because geologic repositories in salt, basalt, granite, shale, and tuff were envisioned to be below the water table. Selection of the unsaturated zone at Yucca Mountain, Nevada, for potential disposal of waste began to shift model development toward unsaturated flow models. Under the US Department of Energy (DOE), the Civilian Radioactive Waste Management System Management and Operating Contractor (CRWMS M&O) has the responsibility to review, evaluate, and document existing computer models; to conduct performance assessments; and to develop performance assessment models, where necessary. This document describes the CRWMS M&O approach to model review and evaluation (Chapter 2), and the requirements for unsaturated flow models which are the bases for selection from among the current models (Chapter 3). Chapter 4 identifies existing models, and their characteristics. Through a detailed examination of characteristics, Chapter 5 presents the selection of models for testing. Chapter 6 discusses the testing and verification of selected models. Chapters 7 and 8 give conclusions and make recommendations, respectively. Chapter 9 records the major references for each of the models reviewed. Appendix A, a collection of technical reviews for each model, contains a more complete list of references. Finally, Appendix B characterizes the problems used for model testing.
Model Selection with the Linear Mixed Model for Longitudinal Data
ERIC Educational Resources Information Center
Ryoo, Ji Hoon
2011-01-01
Model building or model selection with linear mixed models (LMMs) is complicated by the presence of both fixed effects and random effects. The fixed effects structure and random effects structure are codependent, so selection of one influences the other. Most presentations of LMM in psychology and education are based on a multilevel or…
Objective Bayesian model selection for Cox regression.
Held, Leonhard; Gravestock, Isaac; Sabanés Bové, Daniel
2016-12-20
There is now a large literature on objective Bayesian model selection in the linear model based on the g-prior. The methodology has been recently extended to generalized linear models using test-based Bayes factors. In this paper, we show that test-based Bayes factors can also be applied to the Cox proportional hazards model. If the goal is to select a single model, then both the maximum a posteriori and the median probability model can be calculated. For clinical prediction of survival, we shrink the model-specific log hazard ratio estimates with subsequent calculation of the Breslow estimate of the cumulative baseline hazard function. A Bayesian model average can also be employed. We illustrate the proposed methodology with the analysis of survival data on primary biliary cirrhosis patients and the development of a clinical prediction model for future cardiovascular events based on data from the Second Manifestations of ARTerial disease (SMART) cohort study. Cross-validation is applied to compare the predictive performance with alternative model selection approaches based on Harrell's c-Index, the calibration slope and the integrated Brier score. Finally, a novel application of Bayesian variable selection to optimal conditional prediction via landmarking is described. Copyright © 2016 John Wiley & Sons, Ltd.
Image Modeling and Enhancement via Structured Sparse Model Selection
2010-01-01
signal estimation is then calculated with the selected model. The model selection leads to a guaranteed near-optimal denoising estimator. The degree...are adapted to the image of interest and are computed with a simple and fast procedure. State-of-the-art results are shown in image denoising, deblurring, and inpainting. Index Terms—Model selection, structured sparsity, best basis, denoising, deblurring, inpainting
Bacheler, N.M.; Hightower, J.E.; Burdick, S.M.; Paramore, L.M.; Buckel, J.A.; Pollock, K.H.
2010-01-01
Estimating the selectivity patterns of various fishing gears is a critical component of fisheries stock assessment due to the difficulty in obtaining representative samples from most gears. We used short-term recoveries (n = 3587) of tagged red drum Sciaenops ocellatus to directly estimate age- and length-based selectivity patterns using generalized linear models. The most parsimonious models were selected using AIC, and standard deviations were estimated using simulations. Selectivity of red drum was dependent upon the regulation period in which the fish was caught, the gear used to catch the fish (i.e., hook-and-line, gill nets, pound nets), and the fate of the fish upon recovery (i.e., harvested or released); models including all first-order interactions between main effects outperformed models without interactions. Selectivity of harvested fish was generally dome-shaped and shifted toward larger, older fish in response to regulation changes. Selectivity of caught-and-released red drum was highest on the youngest and smallest fish in the early and middle regulation periods, but increased on larger, legal-sized fish in the late regulation period. These results suggest that catch-and-release mortality has consistently been high for small, young red drum, but has recently become more common in larger, older fish. This method of estimating selectivity from short-term tag recoveries is valuable because it is simpler than full tag-return models, and may be more robust because yearly fishing and natural mortality rates do not need to be modeled and estimated. © 2009 Elsevier B.V.
Modelling with words: Narrative and natural selection.
Dimech, Dominic K
2017-02-18
I argue that verbal models should be included in a philosophical account of the scientific practice of modelling. Weisberg (2013) has directly opposed this thesis on the grounds that verbal structures, if they are used in science, only merely describe models. I look at examples from Darwin's On the Origin of Species (1859) of verbally constructed narratives that I claim model the general phenomenon of evolution by natural selection. In each of the cases I look at, a particular scenario is described that involves at least some fictitious elements but represents the salient causal components of natural selection. I pronounce the importance of prioritising observation of scientific practice for the philosophy of modelling and I suggest that there are other likely model types that are excluded from philosophical accounts.
An Ss Model with Adverse Selection.
ERIC Educational Resources Information Center
House, Christopher L.; Leahy, John V.
2004-01-01
We present a model of the market for a used durable in which agents face fixed costs of adjustment, the magnitude of which depends on the degree of adverse selection in the secondary market. We find that, unlike typical models, the sS bands in our model contract as the variance of the shock increases. We also analyze a dynamic version of the model…
NASA Astrophysics Data System (ADS)
Olbert, Kai; Meier, Thomas; Cristiano, Luigia
2015-04-01
A quick picking procedure is an important tool for processing large datasets in seismology. Identifying phases and determining precise onset times at seismological stations is essential not just for localization procedures but also for seismic body-wave tomography. The automated picking procedure should be fast, robust, precise, and consistent. In manual processing, speed and consistency are not guaranteed, and therefore unreproducible errors may be introduced, especially for large amounts of data. In this work an offline P- and S-phase picker based on an autoregressive-prediction approach is optimized and applied to different data sets. The onset time can be described as the sum of the event source time, the theoretical travel time according to a reference velocity model, and a deviation from the theoretical travel time due to lateral heterogeneity or errors in the source location. With this approach the onset time at each station can be found around the theoretical travel time within a time window smaller than the maximum lateral heterogeneity. Around the theoretical travel time an autoregressive prediction error is calculated from one or several components as a characteristic function (CF) of the waveform. The minimum of the Akaike Information Criterion of the characteristic function identifies the phase. As was shown by Küperkoch et al. (2012), the Akaike Information Criterion has a tendency to pick too late. Therefore, an additional processing step is needed for precise picking. In the vicinity of the minimum of the Akaike Information Criterion a cost function is defined and used to find the optimal estimate of the arrival time. The cost function is composed of the CF and three side conditions. The idea behind the use of a cost function is to find the phase pick in the last minimum before the CF rises due to the phase onset. The final onset time is picked at the minimum of the cost function. The automatic picking procedure is applied to datasets recorded at stations of the
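A minimal stand-in for AIC-based onset detection is the classic two-segment AIC picker (Maeda-type), which scans a split point k and minimizes AIC(k) = k·log var(x[1..k]) + (N−k−1)·log var(x[k+1..N]); the minimum falls where the variance changes, i.e. at the phase arrival. This sketch omits the autoregressive characteristic function and the cost-function refinement described in the abstract, and the synthetic trace is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic trace: 400 samples of background noise, then a higher-amplitude arrival
noise = rng.normal(scale=1.0, size=400)
signal = rng.normal(scale=5.0, size=200)
x = np.concatenate([noise, signal])

def aic_picker(x):
    """Two-segment AIC picker: split point k minimizing
    AIC(k) = k*log(var(x[:k])) + (N-k-1)*log(var(x[k:]))."""
    N = len(x)
    ks = np.arange(10, N - 10)      # keep both segments non-trivial
    aic = np.array([k * np.log(np.var(x[:k])) + (N - k - 1) * np.log(np.var(x[k:]))
                    for k in ks])
    return ks[np.argmin(aic)]

pick = aic_picker(x)
print(pick)   # expected near sample 400, the true onset
```

Because this picker tends to trigger slightly late on emergent onsets (the behavior Küperkoch et al. noted), a refinement step such as the cost function described above is used in practice.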
On spatial mutation-selection models
Kondratiev, Yuri; Kutoviy, Oleksandr (E-mail: kutovyi@mit.edu); Minlos, Robert; Pirogov, Sergey
2013-11-15
We discuss the selection procedure in the framework of mutation models. We study the regulation for stochastically developing systems based on a transformation of the initial Markov process which includes a cost functional. The transformation of initial Markov process by cost functional has an analytic realization in terms of a Kimura-Maruyama type equation for the time evolution of states or in terms of the corresponding Feynman-Kac formula on the path space. The state evolution of the system including the limiting behavior is studied for two types of mutation-selection models.
Bayesian variable selection for latent class models.
Ghosh, Joyee; Herring, Amy H; Siega-Riz, Anna Maria
2011-09-01
In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.
Bayesian model selection and isocurvature perturbations
NASA Astrophysics Data System (ADS)
Beltrán, María; García-Bellido, Juan; Lesgourgues, Julien; Liddle, Andrew R.; Slosar, Anže
2005-03-01
Present cosmological data are well explained assuming purely adiabatic perturbations, but an admixture of isocurvature perturbations is also permitted. We use a Bayesian framework to compare the performance of cosmological models including isocurvature modes with the purely adiabatic case; this framework automatically and consistently penalizes models which use more parameters to fit the data. We compute the Bayesian evidence for fits to a data set comprised of WMAP and other microwave anisotropy data, the galaxy power spectrum from 2dFGRS and SDSS, and Type Ia supernovae luminosity distances. We find that Bayesian model selection favors the purely adiabatic models, but so far only at low significance.
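Bayesian evidence comparison of a nested model pair can be made concrete with a toy Gaussian example (not the isocurvature computation itself): for unit-variance data, M0 fixes the mean at zero while M1 gives it a N(0, τ²) prior. Both evidences are available in closed form, so the Bayes factor automatically penalizes the extra parameter, which is the mechanism that favors the simpler (purely adiabatic) model when the data do not demand more.

```python
import numpy as np

def log_bf10(y, tau2=1.0):
    """log Bayes factor for M1: mu ~ N(0, tau2) vs M0: mu = 0,
    with y_i ~ N(mu, 1). Analytic marginal likelihoods, no sampling:
    log BF10 = -0.5*log(1 + n*tau2) + s^2 * tau2 / (2*(1 + n*tau2)), s = sum(y)."""
    n, s = len(y), np.sum(y)
    return -0.5 * np.log(1 + n * tau2) + s**2 * tau2 / (2 * (1 + n * tau2))

rng = np.random.default_rng(4)
y_null = rng.normal(0.0, 1.0, 100)   # generated under M0
y_alt = rng.normal(0.5, 1.0, 100)    # generated with a genuinely nonzero mean
print(log_bf10(y_null), log_bf10(y_alt))
```

The first term is the Occam penalty for the extra parameter; it is overcome only when the data carry enough signal, so M1 is favored for `y_alt` but typically not for `y_null`.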
A model for plant lighting system selection
NASA Technical Reports Server (NTRS)
Ciolkosz, D. E.; Albright, L. D.; Sager, J. C.; Langhans, R. W.
2002-01-01
A decision model is presented that compares lighting systems for a plant growth scenario and chooses the most appropriate system from a given set of possible choices. The model utilizes a Multiple Attribute Utility Theory approach, and incorporates expert input and performance simulations to calculate a utility value for each lighting system being considered. The system with the highest utility is deemed the most appropriate system. The model was applied to a greenhouse scenario, and analyses were conducted to test the model's output for validity. Parameter variation indicates that the model performed as expected. Analysis of model output indicates that differences in utility among the candidate lighting systems were sufficiently large to give confidence that the model's order of selection was valid.
Observability in strategic models of viability selection.
Gámez, M; Carreño, R; Kósa, A; Varga, Z
2003-10-01
Strategic models of frequency-dependent viability selection, in terms of mathematical systems theory, are considered as a dynamic observation system. Using a general sufficient condition for observability of nonlinear systems with invariant manifold, it is studied whether, observing certain phenotypic characteristics of the population, the development of its genetic state can be recovered, at least near equilibrium.
The Critical Infrastructure Portfolio Selection Model
2008-06-13
Gregory Ehlers ties together two concepts that are fundamental to enabling a thorough understanding of the Critical Infrastructure Portfolio Selection...work of world-renowned economists, Paul Collier and Anke Hoeffler, and the econometric models that these scholars have developed in an effort to...
Electronic Delivery Systems: A Selection Model.
ERIC Educational Resources Information Center
Pallesen, Peter J.; Haley, Paul; Jones, Edward S.; Moore, Bobbie; Widlake, Dina E.; Medsker, Karen L.
1999-01-01
Discussion of electronic learning delivery systems focuses on a delivery system selection model that is designed for use by performance improvement professionals who are choosing between satellite networks, teleconferencing, Internet/Intranet networks, desktop multimedia, electronic performance support systems, transportable audio/video, and the…
A Theoretical Model for Selective Exposure Research.
ERIC Educational Resources Information Center
Roloff, Michael E.; Noland, Mark
This study tests the basic assumptions underlying Fishbein's Model of Attitudes by correlating an individual's selective exposure to types of television programs (situation comedies, family drama, and action/adventure) with the attitudinal similarity between individual attitudes and attitudes characterized on the programs. Twenty-three college…
A computational model of selection by consequences.
McDowell, J J
2004-01-01
Darwinian selection by consequences was instantiated in a computational model that consisted of a repertoire of behaviors undergoing selection, reproduction, and mutation over many generations. The model in effect created a digital organism that emitted behavior continuously. The behavior of this digital organism was studied in three series of computational experiments that arranged reinforcement according to random-interval (RI) schedules. The quantitative features of the model were varied over wide ranges in these experiments, and many of the qualitative features of the model also were varied. The digital organism consistently showed a hyperbolic relation between response and reinforcement rates, and this hyperbolic description of the data was consistently better than the description provided by other, similar, function forms. In addition, the parameters of the hyperbola varied systematically with the quantitative, and some of the qualitative, properties of the model in ways that were consistent with findings from biological organisms. These results suggest that the material events responsible for an organism's responding on RI schedules are computationally equivalent to Darwinian selection by consequences. They also suggest that the computational model developed here is worth pursuing further as a possible dynamic account of behavior. PMID:15357512
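A drastically simplified sketch of such a "digital organism" is given below: integer-coded behaviors, a target class that is intermittently reinforced, fitness-weighted parent selection after reinforcement, midpoint "recombination", and small mutations. Every quantitative choice here (population size, schedule, recombination and mutation rules) is an illustrative assumption, not McDowell's actual specification.

```python
import random

random.seed(5)
POP, LOW, HIGH = 100, 471, 512      # target class: behaviors in [471, 512)

def step(pop, reinforced):
    """One generation: after reinforcement, parents are drawn preferentially
    near the reinforced behavior; otherwise parents are drawn at random."""
    if reinforced is not None:
        weights = [1.0 / (1 + abs(b - reinforced)) for b in pop]
        parents = random.choices(pop, weights=weights, k=POP)
    else:
        parents = random.choices(pop, k=POP)
    # "reproduction" as midpoint recombination, then a small mutation
    children = [(a + b) // 2 + random.randint(-10, 10)
                for a, b in zip(parents, random.sample(parents, POP))]
    return [min(1023, max(0, c)) for c in children]

pop = [random.randrange(1024) for _ in range(POP)]
hits = 0
for t in range(300):
    emitted = random.choice(pop)    # the organism emits one behavior per tick
    # crude interval-schedule stand-in: reinforce target-class behavior on some ticks
    reinforced = emitted if (LOW <= emitted < HIGH and t % 3 == 0) else None
    if reinforced is not None:
        hits += 1
    pop = step(pop, reinforced)
print(hits)
```

Over generations the repertoire drifts toward the reinforced class; studying how the emission rate of reinforced behavior grows with reinforcement rate is what yields the hyperbolic relation reported in the abstract.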
Marami Milani, Mohammad Reza; Hense, Andreas; Rahmani, Elham; Ploeger, Angelika
2016-07-23
This study focuses on multiple linear regression models relating six climate indices (temperature humidity THI, environmental stress ESI, equivalent temperature index ETI, heat load HLI, modified HLI (HLI new), and respiratory rate predictor RRP) with three main components of cow's milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty estimation is employed by applying bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010 are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors with p-value < 0.001 and R² (0.50, 0.49) respectively. In summer, milk yield with independent variables of THI, ETI, and ESI show the highest relation (p-value < 0.001) with R² (0.69). For fat and protein the results are only marginal. This method is suggested for the impact studies of climate variability/change on agriculture and food science fields when short-time series or data with large uncertainty are available.
SCAN: A Scalable Model of Attentional Selection.
Hudson, Patrick T.W.; van den Herik, H Jaap; Postma, Eric O.
1997-08-01
This paper describes the SCAN (Signal Channelling Attentional Network) model, a scalable neural network model for attentional scanning. The building block of SCAN is a gating lattice, a sparsely-connected neural network defined as a special case of the Ising lattice from statistical mechanics. The process of spatial selection through covert attention is interpreted as a biological solution to the problem of translation-invariant pattern processing. In SCAN, a sequence of pattern translations combines active selection with translation-invariant processing. Selected patterns are channelled through a gating network, formed by a hierarchical fractal structure of gating lattices, and mapped onto an output window. We show how the incorporation of an expectation-generating classifier network (e.g. Carpenter and Grossberg's ART network) into SCAN allows attentional selection to be driven by expectation. Simulation studies show the SCAN model to be capable of attending and identifying object patterns that are part of a realistically sized natural image. Copyright 1997 Elsevier Science Ltd.
Review and selection of unsaturated flow models
1993-09-10
Under the US Department of Energy (DOE), the Civilian Radioactive Waste Management System Management and Operating Contractor (CRWMS M&O) has the responsibility to review, evaluate, and document existing computer ground-water flow models; to conduct performance assessments; and to develop performance assessment models, where necessary. In the area of scientific modeling, the M&O CRWMS has the following responsibilities: To provide overall management and integration of modeling activities. To provide a framework for focusing modeling and model development. To identify areas that require increased or decreased emphasis. To ensure that the tools necessary to conduct performance assessment are available. These responsibilities are being initiated through a three-step process. It consists of a thorough review of existing models, testing of models which best fit the established requirements, and making recommendations for future development that should be conducted. Future model enhancement will then focus on the models selected during this activity. Furthermore, in order to manage future model development, particularly in those areas requiring substantial enhancement, the three-step process will be updated and reported periodically in the future.
Bayesian Model Selection for Group Studies
Stephan, Klaas Enno; Penny, Will D.; Daunizeau, Jean; Moran, Rosalyn J.; Friston, Karl J.
2009-01-01
Bayesian model selection (BMS) is a powerful method for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. BMS has recently found widespread application in neuroimaging, particularly in the context of dynamic causal modelling (DCM). However, so far, combining BMS results from several subjects has relied on simple (fixed effects) metrics, e.g. the group Bayes factor (GBF), that do not account for group heterogeneity or outliers. In this paper, we compare the GBF with two random effects methods for BMS at the between-subject or group level. These methods provide inference on model space from classical and Bayesian perspectives, respectively. First, a classical (frequentist) approach uses the log model evidence as a subject-specific summary statistic. This enables one to use analysis of variance to test for differences in log-evidences over models, relative to inter-subject differences. We then consider the same problem in Bayesian terms and describe a novel hierarchical model, which is optimised to furnish a probability density on the models themselves. This new variational Bayes method rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution which describes the probabilities for all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject as well as the exceedance probability of one model being more likely than any other model. Using empirical and synthetic data, we show that optimising a conditional density of the model probabilities, given the log-evidences for each model over subjects, is more informative and appropriate than both the GBF and frequentist tests of the log-evidences. In particular, we found that the hierarchical Bayesian approach is considerably more robust than either of the other approaches.
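The hierarchical scheme summarized above admits a compact numerical sketch. The following is a minimal reimplementation of a variational Dirichlet update over model probabilities and sampling-based exceedance probabilities; it is an illustration under stated assumptions, not the authors' published code, and the synthetic log-evidence matrix and all variable names are assumptions for the example.

```python
import numpy as np
from scipy.special import digamma

def vb_bms(log_evidence, alpha0=1.0, tol=1e-6, max_iter=500):
    """Random-effects BMS sketch: fit a Dirichlet over model
    probabilities from a subjects-by-models matrix of log evidences."""
    n_subj, n_mod = log_evidence.shape
    alpha = np.full(n_mod, alpha0, dtype=float)
    for _ in range(max_iter):
        # Per-subject posterior model assignment (softmax for stability)
        log_u = log_evidence + digamma(alpha) - digamma(alpha.sum())
        log_u -= log_u.max(axis=1, keepdims=True)
        g = np.exp(log_u)
        g /= g.sum(axis=1, keepdims=True)
        alpha_new = alpha0 + g.sum(axis=0)
        if np.max(np.abs(alpha_new - alpha)) < tol:
            alpha = alpha_new
            break
        alpha = alpha_new
    return alpha

def exceedance_prob(alpha, n_samples=100_000, seed=0):
    """P(model k has the largest probability), by Dirichlet sampling."""
    rng = np.random.default_rng(seed)
    samples = rng.dirichlet(alpha, size=n_samples)
    wins = np.bincount(samples.argmax(axis=1), minlength=len(alpha))
    return wins / n_samples

# Synthetic log-evidences: model 0 is better for most of 20 subjects
rng = np.random.default_rng(1)
lme = rng.normal(0.0, 1.0, size=(20, 2))
lme[:, 0] += 2.0
alpha = vb_bms(lme)
xp = exceedance_prob(alpha)
```

With these synthetic evidences, the fitted Dirichlet concentrates on model 0 and its exceedance probability dominates.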
Image Discrimination Models With Stochastic Channel Selection
NASA Technical Reports Server (NTRS)
Ahumada, Albert J., Jr.; Beard, Bettina L.; Null, Cynthia H. (Technical Monitor)
1995-01-01
Many models of human image processing feature a large fixed number of channels representing cortical units varying in spatial position (visual field direction and eccentricity) and spatial frequency (radial frequency and orientation). The values of these parameters are usually sampled at fixed values selected to ensure adequate overlap given the bandwidth and/or spread parameters, which are usually fixed. Even high levels of overlap do not always ensure that the performance of the model will vary smoothly with image translation or scale changes. Physiological measurements of bandwidth and/or spread parameters yield a broad distribution of estimated parameter values, and the prediction of some psychophysical results is facilitated by the assumption that these parameters also take on a range of values. Selecting a sample of channels from a continuum of channels rather than using a fixed set can make model performance vary smoothly with changes in image position, scale, and orientation. It also facilitates the addition of spatial inhomogeneity, nonlinear feature channels, and focus of attention to channel models.
Inflation model selection meets dark radiation
NASA Astrophysics Data System (ADS)
Tram, Thomas; Vallance, Robert; Vennin, Vincent
2017-01-01
We investigate how inflation model selection is affected by the presence of additional free-streaming relativistic degrees of freedom, i.e. dark radiation. We perform a full Bayesian analysis of both inflation parameters and cosmological parameters taking reheating into account self-consistently. We compute the Bayesian evidence for a few representative inflation scenarios in both the standard ΛCDM model and an extension including dark radiation parametrised by its effective number of relativistic species Neff. Using a minimal dataset (Planck low-l polarisation, temperature power spectrum and lensing reconstruction), we find that the observational status of most inflationary models is unchanged. The exceptions are potentials such as power-law inflation that predict large values for the scalar spectral index that can only be realised when Neff is allowed to vary. Adding baryon acoustic oscillations data and the B-mode data from BICEP2/Keck makes power-law inflation disfavoured, while adding local measurements of the Hubble constant H0 makes power-law inflation slightly favoured compared to the best single-field plateau potentials. This illustrates how the dark radiation solution to the H0 tension would have deep consequences for inflation model selection.
Model selection for radiochromic film dosimetry.
Méndez, I
2015-05-21
The purpose of this study was to find the most accurate model for radiochromic film dosimetry by comparing different channel independent perturbation models. A model selection approach based on (algorithmic) information theory was followed, and the results were validated using gamma-index analysis on a set of benchmark test cases. Several questions were addressed: (a) whether incorporating the information of the non-irradiated film, by scanning prior to irradiation, improves the results; (b) whether lateral corrections are necessary when using multichannel models; (c) whether multichannel dosimetry produces better results than single-channel dosimetry; (d) which multichannel perturbation model provides more accurate film doses. It was found that scanning prior to irradiation and applying lateral corrections improved the accuracy of the results. For some perturbation models, increasing the number of color channels did not result in more accurate film doses. Employing Truncated Normal perturbations was found to provide better results than using Micke-Mayer perturbation models. Among the models being compared, the triple-channel model with Truncated Normal perturbations, net optical density as the response and subject to the application of lateral corrections was found to be the most accurate model. The scope of this study was circumscribed by the limits under which the models were tested. In this study, the films were irradiated with megavoltage radiotherapy beams, with doses from about 20-600 cGy, entire (8 inch × 10 inch) films were scanned, the functional form of the sensitometric curves was a polynomial and the different lots were calibrated using the plane-based method.
Resampling methods for model fitting and model selection.
Babu, G Jogesh
2011-11-01
Resampling procedures for fitting models and model selection are considered in this article. Nonparametric goodness-of-fit statistics are generally based on the empirical distribution function. The distribution-free property of these statistics does not hold in the multivariate case or when some of the parameters are estimated. Bootstrap methods to estimate the underlying distributions are discussed in such cases. The results hold not only in the case of one-dimensional parameter space, but also for the vector parameters. Bootstrap methods for inference, when the data is from an unknown distribution that may or may not belong to a specified family of distributions, are also considered. Most of the information criteria-based model selection procedures such as the Akaike information criterion, Bayesian information criterion, and minimum description length use estimation of bias. The bias, which is inevitable in model selection problems, arises mainly from estimating the distance between the "true" model and an estimated model. A jackknife type procedure for model selection is discussed, which instead of bias estimation is based on bias reduction.
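The point above about distribution-free statistics failing when parameters are estimated can be illustrated with a small sketch: a Kolmogorov-Smirnov test of normality in which the mean and standard deviation are estimated from the data, so the null distribution of the statistic is rebuilt by parametric bootstrap. The candidate family, sample size and seeds below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def ks_parametric_bootstrap(x, n_boot=500, seed=0):
    """KS test of normality with mean/sd estimated from the data.
    The tabulated KS null distribution is invalid once parameters
    are estimated, so the null is rebuilt by parametric bootstrap."""
    rng = np.random.default_rng(seed)
    mu, sd = x.mean(), x.std(ddof=1)
    d_obs = stats.kstest(x, 'norm', args=(mu, sd)).statistic
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Re-estimate parameters on each bootstrap sample, as on the data
        xb = rng.normal(mu, sd, size=len(x))
        d_boot[b] = stats.kstest(xb, 'norm',
                                 args=(xb.mean(), xb.std(ddof=1))).statistic
    p_value = np.mean(d_boot >= d_obs)
    return d_obs, p_value

rng = np.random.default_rng(42)
x = rng.normal(5.0, 2.0, size=200)   # data truly normal here
d, p = ks_parametric_bootstrap(x)
```

The bootstrap p-value accounts for the estimation step; comparing `d_obs` against the tabulated KS critical values instead would be anti-conservative.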
Model selection versus model averaging in dose finding studies.
Schorning, Kirsten; Bornkamp, Björn; Bretz, Frank; Dette, Holger
2016-09-30
A key objective of Phase II dose finding studies in clinical drug development is to adequately characterize the dose response relationship of a new drug. An important decision is then on the choice of a suitable dose response function to support dose selection for the subsequent Phase III studies. In this paper, we compare different approaches for model selection and model averaging using mathematical properties as well as simulations. We review and illustrate asymptotic properties of model selection criteria and investigate their behavior when changing the sample size but keeping the effect size constant. In a simulation study, we investigate how the various approaches perform in realistically chosen settings. Finally, the different methods are illustrated with a recently conducted Phase II dose finding study in patients with chronic obstructive pulmonary disease. Copyright © 2016 John Wiley & Sons, Ltd.
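One common bridge between model selection and model averaging is the use of Akaike weights, which convert AIC values into relative model support. A minimal sketch, with hypothetical AIC values standing in for fitted dose-response candidates (the numbers are illustrative, not from the cited study):

```python
import numpy as np

def aic_weights(aic):
    """Akaike weights: relative support for each candidate model,
    usable directly as model-averaging weights."""
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()          # AIC differences vs. best model
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical AIC values for, say, Emax / linear / quadratic fits
w = aic_weights([210.3, 214.1, 215.0])
# A model-averaged dose-response estimate would be sum_i w[i] * fit_i
```

Selection corresponds to putting all weight on `argmin(aic)`; averaging keeps all three weights, which softens the winner-takes-all behavior the paper compares against.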
Model Selection for Cox Models with Time-Varying Coefficients
Yan, Jun; Huang, Jian
2011-01-01
Cox models with time-varying coefficients offer great flexibility in capturing the temporal dynamics of covariate effects on right-censored failure times. Since not all covariate coefficients are time-varying, model selection for such models presents an additional challenge: distinguishing covariates with time-varying coefficients from those with time-independent coefficients. We propose an adaptive group lasso method that not only selects important variables but also selects between time-independent and time-varying specifications of their presence in the model. Each covariate effect is partitioned into a time-independent part and a time-varying part, the latter of which is characterized by a group of coefficients of basis splines without intercept. Model selection and estimation are carried out through a fast, iterative group shooting algorithm. Our approach is shown to have good properties in a simulation study that mimics realistic situations with up to 20 variables. A real example illustrates the utility of the method. PMID:22506825
Improved modeling of GPS selective availability
NASA Technical Reports Server (NTRS)
Braasch, Michael S.; Fink, Annmarie; Duffus, Keith
1994-01-01
Selective Availability (SA) represents the dominant error source for stand-alone users of the Global Positioning System (GPS). Even for DGPS, SA mandates the update rate required for a desired level of accuracy in real-time applications. As witnessed in the recent literature, the ability to model this error source is crucial to the proper evaluation of GPS-based systems. A variety of SA models have been proposed to date; however, each has its own shortcomings. Most of these models were based on limited data sets or on data corrupted by additional error sources. A comprehensive treatment of the problem is presented. The phenomenon of SA is discussed and a technique is presented whereby both the clock and orbit components of SA are identifiable. Extensive SA data sets collected from Block II satellites are presented. System identification theory is then used to derive a robust model of SA from the data. This theory also allows for the statistical analysis of SA. The stationarity of SA over time and across different satellites is analyzed and its impact on the modeling problem is discussed.
Bayesian model selection analysis of WMAP3
Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.
2006-06-15
We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n_S and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n_S ≠ 1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n_S ≠ 1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n_S = 1, r = 0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.
A Selective Review of Group Selection in High-Dimensional Models.
Huang, Jian; Breheny, Patrick; Ma, Shuangge
2012-01-01
Grouping structures arise naturally in many statistical modeling problems. Several methods have been proposed for variable selection that respect grouping structure in variables. Examples include the group LASSO and several concave group selection methods. In this article, we give a selective review of group selection concerning methodological developments, theoretical properties and computational algorithms. We pay particular attention to group selection methods involving concave penalties. We address both group selection and bi-level selection methods. We describe several applications of these methods in nonparametric additive models, semiparametric regression, seemingly unrelated regressions, genomic data analysis and genome-wide association studies. We also highlight some issues that require further study.
Growth rate modeling for selective tungsten LPCVD
NASA Astrophysics Data System (ADS)
Wolf, H.; Streiter, R.; Schulz, S. E.; Gessner, T.
1995-10-01
Selective chemical vapor deposition of tungsten plugs on sputtered tungsten was performed in a single-wafer cold-wall reactor using silane (SiH4) and tungsten hexafluoride (WF6). Extensive SEM measurements of film thickness were carried out to study the dependence of growth rates on various process conditions, wafer loading, and via dimensions. The results have been interpreted by numerical calculations based on a simulation model which is also presented. Both continuum fluid dynamics and the ballistic line-of-sight approach are used for transport modeling. The reaction rate is described by an empirical rate expression using coefficients fitted from experimental data. In the range 0.2 < p(SiH4)/p(WF6) < 0.75, the reaction order was determined as 1.55 and -0.55 with respect to SiH4 and WF6, respectively. For higher partial pressure ratios the second-order rate dependence on p(SiH4) and the minus-first-order dependence on p(WF6) were confirmed.
NASA Astrophysics Data System (ADS)
Strupczewski, Witold G.; Bogdanowich, Ewa; Debele, Sisay
2016-04-01
Under Polish climate conditions the series of Annual Maxima (AM) flows are usually a mixture of peak flows from thaw- and rainfall-originated floods. The northern, lowland regions are dominated by snowmelt floods, whilst in mountainous regions rainfall floods predominate. In many stations the majority of AM flows can be of snowmelt origin, but the greatest peak flows come from rainfall floods, or vice versa. In a warming climate, precipitation is less likely to occur as snowfall. A shift from a snow- towards a rain-dominated regime results in a decreasing trend in the mean and standard deviation of winter peak flows, whilst rainfall floods do not exhibit any trace of non-stationarity. That is why simple forms of trend (e.g. linear trends) are more difficult to identify in AM time series than in Seasonal Maxima (SM), usually winter-season, time series. Hence it is recommended to analyse trends in SM, where a trend in the standard deviation strongly influences the time-dependent upper quantiles. The uncertainty associated with extrapolating the trend makes it necessary to apply a trend relationship whose time derivative tends to zero; e.g. we can assume that a new climate equilibrium epoch is approaching, or that the time horizon is limited by the validity of the trend model. For both winter and summer SM time series, at least three distribution functions with trend models in the location, scale and shape parameters are estimated by means of the GAMLSS package using ML techniques. The resulting trend estimates in mean and standard deviation are compared to the observed trends. Then, using AIC measures as weights, a multi-model distribution is constructed for each of the two seasons separately. Further, assuming mutual independence of the seasonal maxima, an AM model with time-dependent parameters can be obtained. The use of a multi-model approach can alleviate the effects of different and often contradictory trends obtained by using and identifying
Increasing selection response by Bayesian modeling of heterogeneous environmental variances
Technology Transfer Automated Retrieval System (TEKTRAN)
Heterogeneity of environmental variance among genotypes reduces selection response because genotypes with higher variance are more likely to be selected than low-variance genotypes. Modeling heterogeneous variances to obtain weighted means corrected for heterogeneous variances is difficult in likel...
Marami Milani, Mohammad Reza; Hense, Andreas; Rahmani, Elham; Ploeger, Angelika
2016-01-01
This study focuses on multiple linear regression models relating six climate indices (temperature humidity THI, environmental stress ESI, equivalent temperature index ETI, heat load HLI, modified HLI (HLI new), and respiratory rate predictor RRP) with three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty estimation is employed by applying bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010 are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors with p-value < 0.001 and R2 (0.50, 0.49) respectively. In summer, milk yield with independent variables of THI, ETI, and ESI show the highest relation (p-value < 0.001) with R2 (0.69). For fat and protein the results are only marginal. This method is suggested for the impact studies of climate variability/change on agriculture and food science fields when short-time series or data with large uncertainty are available. PMID:28231147
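The LASSO-plus-AIC selection step described above can be sketched with scikit-learn's LARS-based LASSO, which chooses the penalty strength by AIC. The data here are synthetic stand-ins for the six climate indices, not the study's milk data; which indices truly drive the response is an assumption made for the example.

```python
import numpy as np
from sklearn.linear_model import LassoLarsIC

# Synthetic stand-in: 6 predictors playing the role of the climate
# indices (THI, ESI, ETI, HLI, HLI_new, RRP); only the first three
# actually drive the simulated response
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 6))
y = 1.5*X[:, 0] - 1.0*X[:, 1] + 0.5*X[:, 2] + rng.normal(0.0, 0.5, size=n)

# LASSO path with the regularization strength picked by AIC
model = LassoLarsIC(criterion='aic').fit(X, y)
selected = np.flatnonzero(model.coef_)   # indices of retained predictors
```

In this setup the three informative predictors survive the shrinkage, mirroring how the study whittles the climate indices down to a small predictive subset.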
Model for personal computer system selection.
Blide, L
1987-12-01
Successful computer software and hardware selection is best accomplished by following an organized approach such as the one described in this article. The first step is to decide what you want to be able to do with the computer. Secondly, select software that is user friendly, well documented, bug free, and that does what you want done. Next, you select the computer, printer and other needed equipment from the group of machines on which the software will run. Key factors here are reliability and compatibility with other microcomputers in your facility. Lastly, you select a reliable vendor who will provide good, dependable service in a reasonable time. The ability to correctly select computer software and hardware is a key skill needed by medical record professionals today and in the future. Professionals can make quality computer decisions by selecting software and systems that are compatible with other computers in their facility, allow for future networking, ease of use, and adaptability for expansion as new applications are identified. The key to success is to not only provide for your present needs, but to be prepared for future rapid expansion and change in your computer usage as technology and your skills grow.
Validation subset selections for extrapolation oriented QSPAR models.
Szántai-Kis, Csaba; Kövesdi, István; Kéri, György; Orfi, László
2003-01-01
One of the most important features of QSPAR models is their predictive ability. The predictive ability of QSPAR models should be checked by external validation. In this work we examined three different types of external validation set selection methods for their usefulness in in-silico screening. The usefulness of the selection methods was studied in such a way that: 1) We generated thousands of QSPR models and stored them in 'model banks'. 2) We selected a final top model from the model banks based on three different validation set selection methods. 3) We predicted large data sets, which we called 'chemical universe sets', and calculated the corresponding SEPs. The models were generated from small fractions of the available water solubility data during a GA Variable Subset Selection procedure. The external validation sets were constructed by random selections, uniformly distributed selections or by perimeter-oriented selections. We found that the best performing models on the perimeter-oriented external validation sets usually gave the best validation results when the remaining part of the available data was overwhelmingly large, i.e., when the model had to make a lot of extrapolations. We also compared the top final models obtained from external validation set selection methods in three independent and different sizes of 'chemical universe sets'.
Selection of Temporal Lags When Modeling Economic and Financial Processes.
Matilla-Garcia, Mariano; Ojeda, Rina B; Marin, Manuel Ruiz
2016-10-01
This paper suggests new nonparametric statistical tools and procedures for modeling linear and nonlinear univariate economic and financial processes. In particular, the tools presented help in selecting relevant lags in the model description of a general linear or nonlinear time series; that is, nonlinear models are not a restriction. The tests seem to be robust to the selection of free parameters. We also show that the test can be used as a diagnostic tool for well-defined models.
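Although the tools proposed in this paper are nonparametric, the basic mechanics of lag selection can be illustrated with a simple AIC-based sweep over autoregressive orders, fitted by least squares on a common effective sample. The AR(2) toy series and all settings below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def select_ar_order(x, max_lag=8):
    """Pick an AR lag order by AIC, fitting each candidate order by
    least squares on the same effective sample of length n."""
    n = len(x) - max_lag
    y = x[max_lag:]
    best = (np.inf, None)
    for p in range(1, max_lag + 1):
        # Design matrix: intercept plus lags 1..p
        X = np.column_stack([np.ones(n)] +
                            [x[max_lag - k:-k] for k in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        aic = n * np.log(rss / n) + 2 * (p + 1)
        if aic < best[0]:
            best = (aic, p)
    return best[1]

# Toy AR(2) series: x_t = 0.6 x_{t-1} - 0.3 x_{t-2} + noise
rng = np.random.default_rng(3)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.6*x[t-1] - 0.3*x[t-2] + rng.normal()
order = select_ar_order(x)
```

Fitting every order on the same truncated sample keeps the AIC values comparable; comparing fits with different effective sample sizes would bias the selection.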
Model Selection for Monitoring CO2 Plume during Sequestration
2014-12-31
The model selection method developed as part of this project includes four main steps: (1) assessing the connectivity/dynamic characteristics of a large prior ensemble of models; (2) model clustering using multidimensional scaling coupled with k-means clustering; (3) model selection using Bayes' rule in the reduced model space; (4) model expansion using iterative resampling of the posterior models. The fourth step expresses one of the advantages of the method: it provides a built-in means of quantifying the uncertainty in predictions made with the selected models. In our application to plume monitoring, by expanding the posterior space of models, the final ensemble of geological model representations can be used to assess the uncertainty in predicting the future displacement of the CO2 plume. The software implementation of this approach is attached here.
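Step (2) of the workflow, multidimensional scaling followed by k-means, can be sketched as follows, assuming a precomputed model-to-model distance matrix. The 30 "models" here are synthetic response vectors standing in for the project's reservoir models; the two-cluster structure is built in for illustration.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

# Stand-in "distance between models": Euclidean distance between
# summary response vectors of 30 hypothetical prior models,
# constructed as two well-separated groups of 15
rng = np.random.default_rng(0)
responses = np.vstack([rng.normal(0.0, 1.0, size=(15, 4)),
                       rng.normal(5.0, 1.0, size=(15, 4))])
dist = np.linalg.norm(responses[:, None, :] - responses[None, :, :], axis=-1)

# Embed the distance matrix in 2-D, then cluster the embedding
coords = MDS(n_components=2, dissimilarity='precomputed',
             random_state=0).fit_transform(dist)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coords)
```

Each cluster would then be represented by one or a few members in step (3), so that Bayes' rule is applied in a drastically reduced model space.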
The Multilingual Lexicon: Modelling Selection and Control
ERIC Educational Resources Information Center
de Bot, Kees
2004-01-01
In this paper an overview of research on the multilingual lexicon is presented as the basis for a model for processing multiple languages. With respect to specific issues relating to the processing of more than two languages, it is suggested that there is no need to develop a specific model for such multilingual processing, but at the same time we…
On Optimal Input Design and Model Selection for Communication Channels
Li, Yanyan; Djouadi, Seddik M; Olama, Mohammed M
2013-01-01
In this paper, the optimal model (structure) selection and input design which minimize the worst-case identification error for communication systems are provided. The problem is formulated using metric complexity theory in a Hilbert space setting. It is pointed out that model selection and input design can be handled independently. The Kolmogorov n-width is used to characterize the representation error introduced by model selection, while the Gel'fand and time n-widths are used to represent the inherent error introduced by input design. After the model is selected, an optimal input which minimizes the worst-case identification error is shown to exist. In particular, it is proven that the optimal model for reducing the representation error is a Finite Impulse Response (FIR) model, and the optimal input is an impulse at the start of the observation interval. FIR models are widely popular in communication systems, for example in Orthogonal Frequency Division Multiplexing (OFDM) systems.
Using multilevel models to quantify heterogeneity in resource selection
Wagner, T.; Diefenbach, D.R.; Christensen, S.A.; Norton, A.S.
2011-01-01
Models of resource selection are being used increasingly to predict or model the effects of management actions rather than simply quantifying habitat selection. Multilevel, or hierarchical, models are an increasingly popular method to analyze animal resource selection because they impose a relatively weak stochastic constraint to model heterogeneity in habitat use and also account for unequal sample sizes among individuals. However, few studies have used multilevel models to model coefficients as a function of predictors that may influence habitat use at different scales or quantify differences in resource selection among groups. We used an example with white-tailed deer (Odocoileus virginianus) to illustrate how to model resource use as a function of distance to road that varies among deer by road density at the home range scale. We found that deer avoidance of roads decreased as road density increased. Also, we used multilevel models with sika deer (Cervus nippon) and white-tailed deer to examine whether resource selection differed between species. We failed to detect differences in resource use between these two species and showed how information-theoretic and graphical measures can be used to assess how resource use may have differed. Multilevel models can improve our understanding of how resource selection varies among individuals and provides an objective, quantifiable approach to assess differences or changes in resource selection. © The Wildlife Society, 2011.
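A crude two-stage stand-in for the multilevel analysis (per-individual selection coefficients, then a second-stage regression on road density) can be sketched as follows. This illustrates the idea only, not the authors' hierarchical model; all data are simulated and every variable name is an assumption for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(7)
n_deer, n_obs = 30, 400
road_density = rng.uniform(0.0, 2.0, size=n_deer)
# Simulated truth: a positive coefficient on distance-to-road means
# road avoidance, and avoidance weakens as road density rises
true_slope = 1.5 - 0.6 * road_density

slopes = np.empty(n_deer)
for i in range(n_deer):
    dist = rng.uniform(0.0, 3.0, size=n_obs)        # distance to road (km)
    p = 1.0 / (1.0 + np.exp(-(-1.0 + true_slope[i] * dist)))
    used = rng.binomial(1, p)                        # 1 = location used
    # Stage 1: essentially unpenalized per-deer logistic regression
    fit = LogisticRegression(C=1e6).fit(dist[:, None], used)
    slopes[i] = fit.coef_[0, 0]

# Stage 2: does road density explain variation in selection coefficients?
stage2 = LinearRegression().fit(road_density[:, None], slopes)
```

A true multilevel model would fit both stages jointly and shrink the noisy individual slopes; the two-stage version merely makes the varying-coefficient structure visible.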
Astrophysical Model Selection in Gravitational Wave Astronomy
NASA Technical Reports Server (NTRS)
Adams, Matthew R.; Cornish, Neil J.; Littenberg, Tyson B.
2012-01-01
Theoretical studies in gravitational wave astronomy have mostly focused on the information that can be extracted from individual detections, such as the mass of a binary system and its location in space. Here we consider how the information from multiple detections can be used to constrain astrophysical population models. This seemingly simple problem is made challenging by the high dimensionality and high degree of correlation in the parameter spaces that describe the signals, and by the complexity of the astrophysical models, which can also depend on a large number of parameters, some of which might not be directly constrained by the observations. We present a method for constraining population models using a hierarchical Bayesian modeling approach which simultaneously infers the source parameters and population model and provides the joint probability distributions for both. We illustrate this approach by considering the constraints that can be placed on population models for galactic white dwarf binaries using a future space-based gravitational wave detector. We find that a mission that is able to resolve approximately 5000 of the shortest period binaries will be able to constrain the population model parameters, including the chirp mass distribution and a characteristic galaxy disk radius to within a few percent. This compares favorably to existing bounds, where electromagnetic observations of stars in the galaxy constrain disk radii to within 20%.
Bayesian model selection for LISA pathfinder
NASA Astrophysics Data System (ADS)
Karnesis, Nikolaos; Nofrarias, Miquel; Sopuerta, Carlos F.; Gibert, Ferran; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Ferraioli, Luigi; Hewitson, Martin; Hueller, Mauro; Korsakova, Natalia; McNamara, Paul W.; Plagnol, Eric; Vitale, Stefano
2014-03-01
The main goal of the LISA Pathfinder (LPF) mission is to fully characterize the acceleration noise models and to test key technologies for future space-based gravitational-wave observatories similar to the eLISA concept. The data analysis team has developed complex three-dimensional models of the LISA Technology Package (LTP) experiment onboard the LPF. These models are used for simulations, but, more importantly, they will be used for parameter estimation purposes during flight operations. One of the tasks of the data analysis team is to identify the physical effects that contribute significantly to the properties of the instrument noise. A way of approaching this problem is to recover the essential parameters of a LTP model fitting the data. Thus, we want to define the simplest model that efficiently explains the observations. To do so, adopting a Bayesian framework, one has to estimate the so-called Bayes factor between two competing models. In our analysis, we use three main different methods to estimate it: the reversible jump Markov chain Monte Carlo method, the Schwarz criterion, and the Laplace approximation. They are applied to simulated LPF experiments in which the most probable LTP model that explains the observations is recovered. The same type of analysis presented in this paper is expected to be followed during flight operations. Moreover, the correlation of the output of the aforementioned methods with the design of the experiment is explored.
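Of the three Bayes-factor estimators mentioned, the Schwarz criterion is the simplest to state: it approximates the log Bayes factor from the maximized log-likelihoods, parameter counts, and sample size alone. A minimal sketch with hypothetical numbers (not LPF data):

```python
import numpy as np

def schwarz_log_bayes_factor(loglik1, k1, loglik2, k2, n):
    """Schwarz-criterion approximation to ln B_12:
    ln B_12 ≈ (loglik1 - loglik2) - 0.5*(k1 - k2)*ln(n)."""
    return (loglik1 - loglik2) - 0.5 * (k1 - k2) * np.log(n)

# Hypothetical fits: model 1 (3 params) vs model 2 (5 params), n = 1000
lnB = schwarz_log_bayes_factor(-520.4, 3, -519.0, 5, 1000)
# lnB > 0 favors the simpler model despite its lower likelihood,
# because the -0.5*(k1-k2)*ln(n) term penalizes the extra parameters
```

The reversible-jump MCMC and Laplace methods cost far more but capture prior information that this large-sample approximation ignores, which is why the paper compares all three.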
Methods for model selection in applied science and engineering.
Field, Richard V., Jr.
2004-10-01
Mathematical models are developed and used to study the properties of complex systems and/or modify these systems to satisfy some performance requirements in just about every area of applied science and engineering. A particular reason for developing a model, e.g., performance assessment or design, is referred to as the model use. Our objective is the development of a methodology for selecting a model that is sufficiently accurate for an intended use. Information on the system being modeled is, in general, incomplete, so that there may be two or more models consistent with the available information. The collection of these models is called the class of candidate models. Methods are developed for selecting the optimal member from a class of candidate models for the system. The optimal model depends on the available information, the selected class of candidate models, and the model use. Classical methods for model selection, including the method of maximum likelihood and Bayesian methods, as well as a method employing a decision-theoretic approach, are formulated to select the optimal model for numerous applications. There is no requirement that the candidate models be random. Classical methods for model selection ignore model use and require data to be available. Examples are used to show that these methods can be unreliable when data is limited. The decision-theoretic approach to model selection does not have these limitations, and model use is included through an appropriate utility function. This is especially important when modeling high risk systems, where the consequences of using an inappropriate model for the system can be disastrous. The decision-theoretic method for model selection is developed and applied for a series of complex and diverse applications. These include the selection of the: (1) optimal order of the polynomial chaos approximation for non-Gaussian random variables and stationary stochastic processes, (2) optimal pressure load model to be
Nonmathematical Models for Evolution of Altruism, and for Group Selection
Darlington, P. J.
1972-01-01
Mathematical biologists have failed to produce a satisfactory general model for evolution of altruism, i.e., of behaviors by which “altruists” benefit other individuals but not themselves; kin selection does not seem to be a sufficient explanation of nonreciprocal altruism. Nonmathematical (but mathematically acceptable) models are now proposed for evolution of negative altruism in dual-determinant and of positive altruism in tri-determinant systems. Peck orders, territorial systems, and an ant society are analyzed as examples. In all models, evolution is primarily by individual selection, probably supplemented by group selection. Group selection is differential extinction of populations. It can act only on populations preformed by selection at the individual level, but can either cancel individual selective trends (effecting evolutionary homeostasis) or supplement them; its supplementary effect is probably increasingly important in the evolution of increasingly organized populations. PMID:4501113
Gerretzen, Jan; Szymańska, Ewa; Bart, Jacob; Davies, Antony N; van Manen, Henk-Jan; van den Heuvel, Edwin R; Jansen, Jeroen J; Buydens, Lutgarde M C
2016-09-28
The aim of data preprocessing is to remove data artifacts, such as a baseline, scatter effects or noise, and to enhance the contextually relevant information. Many preprocessing methods exist to deliver one or more of these benefits, but selecting which method or combination of methods to use for the specific data being analyzed is difficult. Recently, we have shown that a preprocessing selection approach based on Design of Experiments (DoE) enables correct selection of highly appropriate preprocessing strategies within reasonable time frames. In that approach, the focus was solely on improving the predictive performance of the chemometric model. This is, however, only one of the two relevant criteria in modeling: interpretation of the model results can be just as important. Variable selection is often used to achieve such interpretation. Data artifacts, however, may hamper proper variable selection by masking the true relevant variables. The choice of preprocessing therefore has a huge impact on the outcome of variable selection methods and may thus hamper an objective interpretation of the final model. To enhance such objective interpretation, we here integrate variable selection into the preprocessing selection approach that is based on DoE. We show that the entanglement of preprocessing selection and variable selection not only improves the interpretation, but also the predictive performance of the model. This is achieved by analyzing several experimental data sets of which the true relevant variables are available as prior knowledge. We show that a selection of variables is provided that complies more with the true informative variables compared to individual optimization of both model aspects. Importantly, the approach presented in this work is generic. Different types of models (e.g. PCR, PLS, …) can be incorporated into it, as well as different variable selection methods and different preprocessing methods, according to the taste and experience of
Development of SPAWM: selection program for available watershed models.
Cho, Yongdeok; Roesner, Larry A
2014-01-01
A selection program for available watershed models (also known as SPAWM) was developed. Thirty-three commonly used watershed models were analyzed in depth and classified according to their attributes. These attributes consist of: (1) land use; (2) event or continuous; (3) time steps; (4) water quality; (5) distributed or lumped; (6) subsurface; (7) overland sediment; and (8) best management practices. Each of these attributes was further classified into sub-attributes. Based on user-selected sub-attributes, the most appropriate watershed model is selected from the library of watershed models. SPAWM is implemented using Excel Visual Basic and is designed for use by novices as well as by experts on watershed modeling. It ensures that the necessary sub-attributes required by the user are captured and made available in the selected watershed model.
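The attribute-matching step described above can be sketched as a simple filter over a tagged model library. The model names and attribute tags below are illustrative placeholders, not SPAWM's actual library or selection logic:

```python
# Hypothetical sketch of SPAWM-style attribute matching: each watershed
# model is described by a set of attribute tags, and the model covering
# the most user-selected attributes is recommended.
LIBRARY = {
    "SWMM":    {"urban", "continuous", "event", "water_quality", "bmp"},
    "HSPF":    {"mixed", "continuous", "water_quality", "subsurface"},
    "KINEROS": {"rural", "event", "overland_sediment", "distributed"},
}

def select_model(required):
    """Return the library model matching the most required attributes."""
    required = set(required)

    def coverage(item):
        name, attrs = item
        return len(required & attrs)

    name, attrs = max(LIBRARY.items(), key=coverage)
    return name, required & attrs

best, matched = select_model({"event", "overland_sediment"})
```

A real implementation would also report partial matches so the user can relax sub-attributes when no model covers every requirement.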
Use of generalised additive models to categorise continuous variables in clinical prediction
2013-01-01
Background In medical practice many, essentially continuous, clinical parameters tend to be categorised by physicians for ease of decision-making. Indeed, categorisation is a common practice both in medical research and in the development of clinical prediction rules, particularly where the ensuing models are to be applied in daily clinical practice to support clinicians in the decision-making process. Since the number of categories into which a continuous predictor must be categorised depends partly on the relationship between the predictor and the outcome, the need for more than two categories must be borne in mind. Methods We propose a categorisation methodology for clinical-prediction models, using Generalised Additive Models (GAMs) with P-spline smoothers to determine the relationship between the continuous predictor and the outcome. The proposed method consists of creating at least one average-risk category along with high- and low-risk categories based on the GAM smooth function. We applied this methodology to a prospective cohort of patients with exacerbated chronic obstructive pulmonary disease. The predictors selected were respiratory rate and partial pressure of carbon dioxide in the blood (PCO2), and the response variable was poor evolution. An additive logistic regression model was used to show the relationship between the covariates and the dichotomous response variable. The proposed categorisation was compared to the continuous predictor as the best option, using the AIC and AUC evaluation parameters. The sample was divided into a derivation (60%) and validation (40%) samples. The first was used to obtain the cut points while the second was used to validate the proposed methodology. Results The three-category proposal for the respiratory rate was ≤ 20;(20,24];> 24, for which the following values were obtained: AIC=314.5 and AUC=0.638. The respective values for the continuous predictor were AIC=317.1 and AUC=0.634, with no statistically
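The AIC comparison reported above follows directly from the definition AIC = 2k - 2 log L, with lower values indicating the preferred model. A minimal sketch, with log-likelihoods and parameter counts chosen purely for illustration (not taken from the study):

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fitted values for two logistic models: one with a
# three-category respiratory-rate predictor, one with the continuous
# predictor. The log-likelihoods and parameter counts are illustrative,
# chosen only to reproduce AIC values of the same magnitude as reported.
categorised = aic(log_likelihood=-154.25, n_params=3)  # 314.5
continuous = aic(log_likelihood=-156.55, n_params=2)   # ~317.1
preferred = "categorised" if categorised < continuous else "continuous"
```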
Modeling Selective Intergranular Oxidation of Binary Alloys
Xu, Zhijie; Li, Dongsheng; Schreiber, Daniel K.; Rosso, Kevin M.; Bruemmer, Stephen M.
2015-01-07
Intergranular attack of alloys under hydrothermal conditions is a complex problem that depends on metal and oxygen transport kinetics via solid-state and channel-like pathways to an advancing oxidation front. Experiments reveal very different rates of intergranular attack and minor element depletion distances ahead of the oxidation front for nickel-based binary alloys depending on the minor element. For example, significant Cr depletion up to 9 µm ahead of grain boundary crack tips was documented for the Ni-5Cr binary alloy, in contrast to relatively moderate Al depletion for Ni-5Al (~100s of nm). We present a mathematical kinetics model that adapts Wagner’s model for thick film growth to intergranular attack of binary alloys. The transport coefficients of elements O, Ni, Cr, and Al in bulk alloys and along grain boundaries were estimated from the literature. For planar surface oxidation, a critical concentration of the minor element can be determined from the model where the oxide of the minor element becomes dominant over the major element. This generic model for simple grain boundary oxidation can predict oxidation penetration velocities and minor element depletion distances ahead of the advancing front that are comparable to experimental data. The significant distance of depletion of Cr in Ni-5Cr in contrast to the localized Al depletion in Ni-5Al can be explained by the model due to the combination of the relatively faster diffusion of Cr along the grain boundary and slower diffusion in bulk grains, relative to Al.
The genealogy of samples in models with selection.
Neuhauser, C; Krone, S M
1997-02-01
We introduce the genealogy of a random sample of genes taken from a large haploid population that evolves according to random reproduction with selection and mutation. Without selection, the genealogy is described by Kingman's well-known coalescent process. In the selective case, the genealogy of the sample is embedded in a graph with a coalescing and branching structure. We describe this graph, called the ancestral selection graph, and point out differences and similarities with Kingman's coalescent. We present simulations for a two-allele model with symmetric mutation in which one of the alleles has a selective advantage over the other. We find that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case. This is supported by rigorous results. Furthermore, we describe the ancestral selection graph for other selective models with finitely many selection classes, such as the K-allele models, infinitely-many-alleles models, DNA sequence models, and infinitely-many-sites models, and briefly discuss the diploid case.
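The two-allele haploid model with symmetric mutation and a selective advantage can be simulated with a few lines of Wright-Fisher-style resampling. This sketch shows the forward-time dynamics whose genealogies the ancestral selection graph describes; all parameter values are arbitrary illustrations:

```python
import random

def wright_fisher(n=1000, p0=0.5, s=0.05, mu=1e-3, generations=200, seed=1):
    """Two-allele haploid Wright-Fisher model: allele A has selective
    advantage s over allele a; mutation is symmetric at rate mu."""
    random.seed(seed)
    p = p0
    for _ in range(generations):
        # selection: reweight allele A by fitness (1 + s)
        w = p * (1 + s) / (p * (1 + s) + (1 - p))
        # symmetric mutation A <-> a
        w = w * (1 - mu) + (1 - w) * mu
        # random reproduction: binomial resampling of n offspring
        p = sum(random.random() < w for _ in range(n)) / n
    return p

p_final = wright_fisher()
```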
Wildum, Steffen; Zimmermann, Holger
2015-01-01
Despite modern prevention and treatment strategies, human cytomegalovirus (HCMV) remains a common opportunistic pathogen associated with serious morbidity and mortality in immunocompromised individuals, such as transplant recipients and AIDS patients. All drugs currently licensed for the treatment of HCMV infection target the viral DNA polymerase and are associated with severe toxicity issues and the emergence of drug resistance. Letermovir (AIC246, MK-8228) is a new anti-HCMV agent in clinical development that acts via a novel mode of action and has demonstrated anti-HCMV activity in vitro and in vivo. For the future, drug combination therapies, including letermovir, might be indicated under special medical conditions, such as the emergence of multidrug-resistant virus strains in transplant recipients or in HCMV-HIV-coinfected patients. Accordingly, knowledge of the compatibility of letermovir with other HCMV or HIV antivirals is of medical importance. Here, we evaluated the inhibition of HCMV replication by letermovir in combination with all currently approved HCMV antivirals using cell culture checkerboard assays. In addition, the effects of letermovir on the antiviral activities of selected HIV drugs, and vice versa, were analyzed. Using two different mathematical techniques to analyze the experimental data, (i) additive effects were observed for the combination of letermovir with anti-HCMV drugs and (ii) no interaction was found between letermovir and anti-HIV drugs. Since none of the tested drug combinations significantly antagonized letermovir efficacy (or vice versa), our findings suggest that letermovir may offer the potential for combination therapy with the tested HCMV and HIV drugs. PMID:25779572
Model selection in systems biology depends on experimental design.
Silk, Daniel; Kirk, Paul D W; Barnes, Chris P; Toni, Tina; Stumpf, Michael P H
2014-06-01
Experimental design attempts to maximise the information available for modelling tasks. An optimal experiment allows the inferred models or parameters to be chosen with the highest expected degree of confidence. If the true system is faithfully reproduced by one of the models, the merit of this approach is clear - we simply wish to identify it and the true parameters with the most certainty. However, in the more realistic situation where all models are incorrect or incomplete, the interpretation of model selection outcomes and the role of experimental design needs to be examined more carefully. Using a novel experimental design and model selection framework for stochastic state-space models, we perform high-throughput in-silico analyses on families of gene regulatory cascade models, to show that the selected model can depend on the experiment performed. We observe that experimental design thus makes confidence a criterion for model choice, but that this does not necessarily correlate with a model's predictive power or correctness. Finally, in the special case of linear ordinary differential equation (ODE) models, we explore how wrong a model has to be before it influences the conclusions of a model selection analysis.
Modeling HIV-1 Drug Resistance as Episodic Directional Selection
Murrell, Ben; de Oliveira, Tulio; Seebregts, Chris; Kosakovsky Pond, Sergei L.; Scheffler, Konrad
2012-01-01
The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance, and neither one test for episodic diversifying selection nor another for constant directional selection are able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance. PMID:22589711
Remedial action selection using groundwater modeling
Haddad, B.I.; Parish, G.B.; Hauge, L.
1996-12-31
An environmental investigation uncovered petroleum contamination at a gasoline station in southern Wisconsin. The site was located in part of the ancestral Rock River valley in Rock County, Wisconsin where the valley is filled with sands and gravels. Groundwater pump tests were conducted for determination of aquifer properties needed to plan a remediation system; the results were indicative of a very high hydraulic conductivity. The site hydrogeology was modeled using the U.S. Geological Survey's groundwater model, Modflow. The calibrated model was used to determine the number, pumping rate, and configuration of recovery wells to remediate the site. The most effective configuration was three wells pumping at 303 liters per minute (l/min) (80 gallons per minute (gpm)), producing a total pumping rate of 908 l/min (240 gpm). Treating 908 l/min (240 gpm) or 1,308,240 liters per day (345,600 gallons per day) constituted a significant volume to be treated and discharged. It was estimated that pumping for the two year remediation would cost $375,000 while the air sparging would cost $200,000. The recommended remedial system consisted of eight air sparging wells and four vapor recovery laterals. The Wisconsin Department of Natural Resources (WDNR) approved the remedial action plan in March 1993. After 11 months of effective operation the concentrations of removed VOCs had decreased by 94 percent and groundwater sampling indicated no detectable concentrations of gasoline contaminants. Groundwater modeling was an effective technique to determine the economic feasibility of a groundwater remedial alternative.
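The reported daily volume follows from simple unit conversion; a quick check, assuming the standard US-gallon-to-liter factor:

```python
# Three recovery wells at 80 gpm each give 240 gpm total;
# scale by minutes per day to get the daily volume.
GPM_TOTAL = 3 * 80
gallons_per_day = GPM_TOTAL * 60 * 24            # 345,600 gpd, as reported
liters_per_day = gallons_per_day * 3.785411784   # ~1,308,238 L/d
# The article rounds the liters figure to 1,308,240 L/d.
```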
CHull: a generic convex-hull-based model selection method.
Wilderjans, Tom F; Ceulemans, Eva; Meers, Kristof
2013-03-01
When analyzing data, researchers are often confronted with a model selection problem (e.g., determining the number of components/factors in principal components analysis [PCA]/factor analysis or identifying the most important predictors in a regression analysis). To tackle such a problem, researchers may apply some objective procedure, like parallel analysis in PCA/factor analysis or stepwise selection methods in regression analysis. A drawback of these procedures is that they can only be applied to the model selection problem at hand. An interesting alternative is the CHull model selection procedure, which was originally developed for multiway analysis (e.g., multimode partitioning). However, the key idea behind the CHull procedure--identifying a model that optimally balances model goodness of fit/misfit and model complexity--is quite generic. Therefore, the procedure may also be used when applying many other analysis techniques. The aim of this article is twofold. First, we demonstrate the wide applicability of the CHull method by showing how it can be used to solve various model selection problems in the context of PCA, reduced K-means, best-subset regression, and partial least squares regression. Moreover, a comparison of CHull with standard model selection methods for these problems is performed. Second, we present the CHULL software, which may be downloaded from http://ppw.kuleuven.be/okp/software/CHULL/, to assist the user in applying the CHull procedure.
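The key idea behind CHull, balancing goodness of fit against complexity by keeping models on the convex boundary and picking the elbow with the largest scree ratio, can be sketched as follows. This is a simplified illustration of the idea, not the full published procedure or the CHULL software:

```python
def chull(models):
    """Simplified CHull sketch: models is a list of (complexity, fit)
    pairs with fit increasing in complexity. Keep points on the upper
    convex boundary, then pick the interior point whose scree ratio
    (fit gain per unit complexity before vs. after) is largest."""
    models = sorted(models)
    hull = []
    for c, f in models:
        while len(hull) >= 2:
            (c1, f1), (c2, f2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or below the chord
            if (f2 - f1) * (c - c2) <= (f - f2) * (c2 - c1):
                hull.pop()
            else:
                break
        hull.append((c, f))
    best, best_ratio = None, -1.0
    for i in range(1, len(hull) - 1):
        (c0, f0), (c1, f1), (c2, f2) = hull[i - 1], hull[i], hull[i + 1]
        ratio = ((f1 - f0) / (c1 - c0)) / ((f2 - f1) / (c2 - c1))
        if ratio > best_ratio:
            best, best_ratio = hull[i], ratio
    return best

# Elbow at complexity 3: large fit gains up to 3, tiny gains after.
pick = chull([(1, 0.50), (2, 0.70), (3, 0.85), (4, 0.86), (5, 0.865)])
```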
Model Selection and Accounting for Model Uncertainty in Graphical Models Using OCCAM’s Window
1991-07-22
There are also approaches based on information criteria and discrepancy measures (Gokhale and Kullback, 1978; Sakamoto, 1984; Linhart and Zucchini, 1986).
A mixed model reduction method for preserving selected physical information
NASA Astrophysics Data System (ADS)
Zhang, Jing; Zheng, Gangtie
2017-03-01
A new model reduction method in the frequency domain is presented. By mixedly using the model reduction techniques from both the time domain and the frequency domain, the dynamic model is condensed to selected physical coordinates, and the contribution of slave degrees of freedom is taken as a modification to the model in the form of effective modal mass of virtually constrained modes. The reduced model can preserve the physical information related to the selected physical coordinates such as physical parameters and physical space positions of corresponding structure components. For the cases of non-classical damping, the method is extended to the model reduction in the state space but still only contains the selected physical coordinates. Numerical results are presented to validate the method and show the effectiveness of the model reduction.
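Reduction to selected physical coordinates builds on the classic static (Guyan) condensation step, in which slave degrees of freedom are eliminated via K_red = K_mm - K_ms K_ss^-1 K_sm. The sketch below shows only this textbook step on a small stiffness matrix, not the paper's mixed time/frequency-domain method:

```python
def guyan_reduce(K, slave):
    """Static (Guyan) condensation of one slave DOF from a symmetric
    stiffness matrix K (list of lists): with a single slave, K_ss is a
    scalar and K_red = K_mm - K_ms * K_sm / K_ss."""
    n = len(K)
    masters = [i for i in range(n) if i != slave]
    kss = K[slave][slave]
    return [[K[i][j] - K[i][slave] * K[slave][j] / kss for j in masters]
            for i in masters]

# 3-DOF spring chain; condense out the middle DOF, keeping the two
# selected physical coordinates at the ends.
K = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
K_red = guyan_reduce(K, slave=1)
```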
Model selection in cognitive science as an inverse problem
NASA Astrophysics Data System (ADS)
Myung, Jay I.; Pitt, Mark A.; Navarro, Daniel J.
2005-03-01
How should we decide among competing explanations (models) of a cognitive phenomenon? This problem of model selection is at the heart of the scientific enterprise. Ideally, we would like to identify the model that actually generated the data at hand. However, this is an unachievable goal as it is fundamentally ill-posed. Information in a finite data sample is seldom sufficient to point to a single model. Multiple models may provide equally good descriptions of the data, a problem that is exacerbated by the presence of random error in the data. In fact, model selection bears a striking similarity to perception, in that both require solving an inverse problem. Just as perceptual ambiguity can be addressed only by introducing external constraints on the interpretation of visual images, the ill-posedness of the model selection problem requires us to introduce external constraints on the choice of the most appropriate model. Model selection methods differ in how these external constraints are conceptualized and formalized. In this review we discuss the development of the various approaches, the differences between them, and why the methods perform as they do. An application example of selection methods in cognitive modeling is also discussed.
Golbon, Reza; Ogutu, Joseph Ochieng; Cotter, Marc; Sauerborn, Joachim
2015-12-01
Linear mixed models were developed and used to predict rubber (Hevea brasiliensis) yield based on meteorological conditions to which rubber trees had been exposed for periods ranging from 1 day to 2 months prior to tapping events. Predictors included a range of moving averages of meteorological covariates spanning different windows of time before the date of the tapping events. Serial autocorrelation in the latex yield measurements was accounted for using random effects and a spatial generalization of the autoregressive error covariance structure suited to data sampled at irregular time intervals. Information theoretics, specifically the Akaike information criterion (AIC), AIC corrected for small sample size (AICc), and Akaike weights, was used to select models with the greatest strength of support in the data from a set of competing candidate models. The predictive performance of the selected best model was evaluated using both leave-one-out cross-validation (LOOCV) and an independent test set. Moving averages of precipitation, minimum and maximum temperature, and maximum relative humidity with a 30-day lead period were identified as the best yield predictors. Prediction accuracy expressed in terms of the percentage of predictions within a measurement error of 5 g for cross-validation and also for the test dataset was above 99 %.
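The information-theoretic machinery used above (AICc and Akaike weights) is straightforward to compute from each candidate model's maximized log-likelihood. A sketch with illustrative numbers, not the study's actual models:

```python
import math

def aicc(log_lik, k, n):
    """AIC corrected for small sample size: AICc = AIC + 2k(k+1)/(n-k-1)."""
    aic = 2 * k - 2 * log_lik
    return aic + 2 * k * (k + 1) / (n - k - 1)

def akaike_weights(scores):
    """Akaike weights: relative strength of support for each candidate."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

# Three hypothetical candidate yield models fitted to n = 60 observations
# (log-likelihoods and parameter counts are illustrative only).
scores = [aicc(-120.0, 4, 60), aicc(-118.5, 6, 60), aicc(-119.8, 5, 60)]
weights = akaike_weights(scores)
```

The model with the smallest AICc receives the largest weight, which is the basis for ranking the candidate set.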
A Conditional Logit Model of Collegiate Major Selection.
ERIC Educational Resources Information Center
Milley, Donald J.; Bee, Richard H.
1982-01-01
Hypothesizes a conditional logit model of decision making to explain collegiate major selection. Results suggest a link between student environment and preference structure, and between preference structures and student major selection. Suggests findings are limited by use of a largely commuter student population. (KMF)
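A conditional logit model assigns each alternative a choice probability proportional to the exponential of its systematic utility. A minimal sketch with hypothetical utilities for three majors:

```python
import math

def conditional_logit_probs(utilities):
    """Choice probabilities in a conditional logit model:
    P(j) = exp(V_j) / sum_k exp(V_k), where V_j is the systematic
    utility of alternative j (attributes times coefficients)."""
    m = max(utilities)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in utilities]
    s = sum(e)
    return [x / s for x in e]

# Hypothetical systematic utilities for three majors for one student
probs = conditional_logit_probs([1.2, 0.4, -0.3])
```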
A Working Model of Natural Selection Illustrated by Table Tennis
ERIC Educational Resources Information Center
Dinc, Muhittin; Kilic, Selda; Aladag, Caner
2013-01-01
Natural selection is one of the most important topics in biology and it helps to clarify the variety and complexity of organisms. However, students in almost every stage of education find it difficult to understand the mechanism of natural selection and they can develop misconceptions about it. This article provides an active model of natural…
A Model for Investigating Predictive Validity at Highly Selective Institutions.
ERIC Educational Resources Information Center
Gross, Alan L.; And Others
A statistical model for investigating predictive validity at highly selective institutions is described. When the selection ratio is small, one must typically deal with a data set containing relatively large amounts of missing data on both criterion and predictor variables. Standard statistical approaches are based on the strong assumption that…
Augmented Self-Modeling as an Intervention for Selective Mutism
ERIC Educational Resources Information Center
Kehle, Thomas J.; Bray, Melissa A.; Byer-Alcorace, Gabriel F.; Theodore, Lea A.; Kovac, Lisa M.
2012-01-01
Selective mutism is a rare disorder that is difficult to treat. It is often associated with oppositional defiant behavior, particularly in the home setting, social phobia, and, at times, autism spectrum disorder characteristics. The augmented self-modeling treatment has been relatively successful in promoting rapid diminishment of selective mutism…
Determinants of wood thrush nest success: A multi-scale, model selection approach
Driscoll, M.J.L.; Donovan, T.; Mickey, R.; Howard, A.; Fleming, K.K.
2005-01-01
We collected data on 212 wood thrush (Hylocichla mustelina) nests in central New York from 1998 to 2000 to determine the factors that most strongly influence nest success. We used an information-theoretic approach to assess and rank 9 models that examined the relationship between nest success (i.e., the probability that a nest would successfully fledge at least 1 wood thrush offspring) and habitat conditions at different spatial scales. We found that 4 variables were significant predictors of nesting success for wood thrushes: (1) total core habitat within 5 km of a study site, (2) distance to forest-field edge, (3) total forest cover within 5 km of the study site, and (4) density and variation in diameter of trees and shrubs surrounding the nest. The coefficients of these predictors were all positive. Of the 9 models evaluated, amount of core habitat in the 5-km landscape was the best-fit model, but the vegetation structure model (i.e., the density of trees and stems surrounding a nest) was also supported by the data. Based on AIC weights, enhancement of core area is likely to be a more effective management option than any other habitat-management options explored in this study. Bootstrap analysis generally confirmed these results; core and vegetation structure models were ranked 1, 2, or 3 in over 50% of 1,000 bootstrap trials. However, bootstrap results did not point to a decisive model, which suggests that multiple habitat factors are influencing wood thrush nesting success. Due to model uncertainty, we used a model averaging approach to predict the success or failure of each nest in our dataset. This averaged model was able to correctly predict 61.1% of nest outcomes.
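The model-averaging step described above weights each candidate model's prediction by its Akaike weight. A sketch with hypothetical AIC scores and per-model predictions, not the study's fitted values:

```python
import math

def model_average(predictions, aic_scores):
    """AIC-weighted model-averaged prediction: each model's prediction
    is weighted by its Akaike weight, computed from its AIC score."""
    best = min(aic_scores)
    w = [math.exp(-0.5 * (a - best)) for a in aic_scores]
    total = sum(w)
    w = [x / total for x in w]
    return sum(wi * p for wi, p in zip(w, predictions))

# Hypothetical fledging probabilities for one nest under three models
p_avg = model_average([0.62, 0.55, 0.48], aic_scores=[100.0, 101.2, 104.5])
```

Because the result is a convex combination, the averaged prediction always lies between the most optimistic and most pessimistic candidate models.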
Robust Decision-making Applied to Model Selection
Hemez, Francois M.
2012-08-06
The scientific and engineering communities are relying more and more on numerical models to simulate ever-increasingly complex phenomena. Selecting a model, from among a family of models that meets the simulation requirements, presents a challenge to modern-day analysts. To address this concern, a framework is adopted anchored in info-gap decision theory. The framework proposes to select models by examining the trade-offs between prediction accuracy and sensitivity to epistemic uncertainty. The framework is demonstrated on two structural engineering applications by asking the following question: Which model, of several numerical models, approximates the behavior of a structure when parameters that define each of those models are unknown? One observation is that models that are nominally more accurate are not necessarily more robust, and their accuracy can deteriorate greatly depending upon the assumptions made. It is posited that, as reliance on numerical models increases, establishing robustness will become as important as demonstrating accuracy.
Development, Selection, and Validation of Tumor Growth Models
NASA Astrophysics Data System (ADS)
Shahmoradi, Amir; Lima, Ernesto; Oden, J. Tinsley
In recent years, a multitude of different mathematical approaches have been taken to develop multiscale models of solid tumor growth. Prime successful examples include the lattice-based, agent-based (off-lattice), and phase-field approaches, or a hybrid of these models applied to multiple scales of tumor, from subcellular to tissue level. Of overriding importance is the predictive power of these models, particularly in the presence of uncertainties. This presentation describes our attempt at developing lattice-based, agent-based and phase-field models of tumor growth and assessing their predictive power through new adaptive algorithms for model selection and model validation embodied in the Occam Plausibility Algorithm (OPAL), that brings together model calibration, determination of sensitivities of outputs to parameter variances, and calculation of model plausibilities for model selection. Institute for Computational Engineering and Sciences.
RUC at TREC 2014: Select Resources Using Topic Models
Wang, Qiuyue; Shi, Shaochen; Cao, Wei (School of Information, Renmin University of China, Beijing)
2014-11-01
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Models of selection, isolation, and gene flow in speciation.
Hart, Michael W
2014-10-01
Many marine ecologists aspire to use genetic data to understand how selection and demographic history shape the evolution of diverging populations as they become reproductively isolated species. I propose combining two types of genetic analysis focused on this key early stage of the speciation process to identify the selective agents directly responsible for population divergence. Isolation-with-migration (IM) models can be used to characterize reproductive isolation between populations (low gene flow), while codon models can be used to characterize selection for population differences at the molecular level (especially positive selection for high rates of amino acid substitution). Accessible transcriptome sequencing methods can generate the large quantities of data needed for both types of analysis. I highlight recent examples (including our work on fertilization genes in sea stars) in which this confluence of interest, models, and data has led to taxonomically broad advances in understanding marine speciation at the molecular level. I also highlight new models that incorporate both demography and selection: simulations based on these theoretical advances suggest that polymorphisms shared among individuals (a key source of information in IM models) may lead to false-positive evidence of selection (in codon models), especially during the early stages of population divergence and speciation that are most in need of study. The false-positive problem may be resolved through a combination of model improvements plus experiments that document the phenotypic and fitness effects of specific polymorphisms for which codon models and IM models indicate selection and reproductive isolation (such as genes that mediate sperm-egg compatibility at fertilization).
Modeling of display color parameters and algorithmic color selection
NASA Astrophysics Data System (ADS)
Silverstein, Louis D.; Lepkowski, James S.; Carter, Robert C.; Carter, Ellen C.
1986-01-01
An algorithmic approach to color selection, which is based on psychophysical models of color processing, is described. The factors that affect color differentiation, such as wavelength separation, color stimulus size, and brightness adaptation level, are discussed. The use of the CIE system of colorimetry and the CIELUV color difference metric for display color modeling is examined. The computer program combines the selection algorithm with internally derived correction factors for color image field size, ambient lighting characteristics, and anomalous red-green color vision deficiencies of display operators. The performance of the program is evaluated and uniform chromaticity scale diagrams for six-color and seven-color selection problems are provided.
Fisher-Wright model with deterministic seed bank and selection.
Koopmann, Bendix; Müller, Johannes; Tellier, Aurélien; Živković, Daniel
2017-04-01
Seed banks are common to many plant species, allowing the storage of genetic diversity in the soil as dormant seeds for various periods of time. We investigate an above-ground population following a Fisher-Wright model with selection, coupled with a deterministic seed bank, assuming that the length of the seed bank is kept constant and the number of seeds is large. To assess the combined impact of seed banks and selection on genetic diversity, we derive a general diffusion model. The applied techniques outline a path for approximating a stochastic delay differential equation by an appropriately rescaled stochastic differential equation. We compute the equilibrium solution of the site-frequency spectrum and derive the times to fixation of an allele with and without selection. Finally, it is demonstrated that seed banks enhance the effect of selection on the site-frequency spectrum while slowing down the time until the mutation-selection equilibrium is reached.
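The core Fisher-Wright dynamics with selection (without the seed-bank component) can be sketched in a few lines; the population size, selection coefficient, and replicate count below are illustrative choices, not values from the paper.

```python
import random

def wright_fisher(n, p0, s, generations, rng):
    """Allele frequency under Fisher-Wright resampling with selection.

    n: number of gene copies resampled each generation
    p0: initial frequency of the favoured allele
    s: selection coefficient (relative fitness 1+s versus 1)
    """
    p = p0
    for _ in range(generations):
        # Selection deterministically shifts the expected frequency ...
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # ... and genetic drift resamples n copies binomially around it.
        p = sum(rng.random() < p_sel for _ in range(n)) / n
        if p in (0.0, 1.0):          # absorption: loss or fixation
            return p
    return p

rng = random.Random(42)
# With strong selection the favoured allele fixes in most replicates,
# far above the neutral fixation probability of p0 = 0.5.
fixed = sum(wright_fisher(100, 0.5, 0.5, 500, rng) == 1.0 for _ in range(200))
print(fixed, "of 200 replicates fixed")
```

Extending the sketch toward the paper's model would require adding the seed-bank buffer between generations, which dampens drift and, as the abstract notes, amplifies the relative effect of selection.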
Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romanach, Stephanie; Watling, James I.; Mazzotti, Frank J.
2017-01-01
Climate envelope models are widely used to describe the potential future distribution of species under different climate change scenarios. It is broadly recognized that there are both strengths and limitations to using climate envelope models and that outcomes are sensitive to initial assumptions, inputs, and modeling methods. Selection of predictor variables, a central step in modeling, is one of the areas where different techniques can yield varying results. Selection of climate variables to use as predictors is often done using statistical approaches that develop correlations between occurrences and climate data. These approaches have received criticism in that they rely on the statistical properties of the data rather than directly incorporating biological information about species responses to temperature and precipitation. We evaluated and compared models and prediction maps for 15 threatened or endangered species in Florida based on two variable selection techniques: expert opinion and a statistical method. We compared model performance between these two approaches for contemporary predictions, and the spatial correlation, spatial overlap and area predicted for contemporary and future climate predictions. In general, experts identified more variables as being important than the statistical method and there was low overlap in the variable sets (<40%) between the two methods. Despite these differences in variable sets (expert versus statistical), models had high performance metrics (>0.9 for area under the curve (AUC) and >0.7 for true skill statistic (TSS)). Spatial overlap, which compares the spatial configuration between maps constructed using the different variable selection techniques, was only moderate overall (about 60%), with a great deal of variability across species. Difference in spatial overlap was even greater under future climate projections, indicating additional divergence of model outputs from different variable selection techniques. Our work is in
Multicriteria framework for selecting a process modelling language
NASA Astrophysics Data System (ADS)
Scanavachi Moreira Campos, Ana Carolina; Teixeira de Almeida, Adiel
2016-01-01
The choice of process modelling language can affect business process management (BPM) since each modelling language shows different features of a given process and may limit the ways in which a process can be described and analysed. However, choosing the appropriate modelling language for process modelling has become a difficult task because of the availability of a large number of modelling languages and the lack of guidelines on evaluating and comparing languages so as to assist in selecting the most appropriate one. This paper proposes a framework for selecting a modelling language in accordance with the purposes of modelling. This framework is based on the semiotic quality framework (SEQUAL) for evaluating process modelling languages and a multicriteria decision aid (MCDA) approach in order to select the most appropriate language for BPM. This study does not attempt to set out new forms of assessment and evaluation criteria, but does attempt to demonstrate how two existing approaches can be combined so as to solve the problem of selecting a modelling language. The framework is described in this paper and then demonstrated by means of an example. Finally, the advantages and disadvantages of using SEQUAL and MCDA in an integrated manner are discussed.
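In its simplest additive form, the MCDA step of such a framework reduces to scoring each candidate language against weighted criteria. The sketch below uses hypothetical criteria, weights, and scores; they are not SEQUAL's actual quality dimensions nor the specific MCDA method of the paper.

```python
# Hypothetical quality scores (0-1) for three candidate modelling
# languages; the criteria names, weights, and scores are illustrative only.
scores = {
    "BPMN":       {"expressiveness": 0.9, "comprehensibility": 0.7, "tool_support": 0.9},
    "EPC":        {"expressiveness": 0.6, "comprehensibility": 0.8, "tool_support": 0.6},
    "Petri nets": {"expressiveness": 0.8, "comprehensibility": 0.5, "tool_support": 0.5},
}
weights = {"expressiveness": 0.5, "comprehensibility": 0.3, "tool_support": 0.2}

def weighted_sum(crit_scores, weights):
    """Additive aggregation: sum of weight * criterion score."""
    return sum(weights[c] * v for c, v in crit_scores.items())

ranking = sorted(scores, key=lambda lang: weighted_sum(scores[lang], weights),
                 reverse=True)
print(ranking)
```

A real application would elicit the weights from the decision maker, which is exactly where an MCDA method adds rigour over an ad hoc scoring table.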
Miao, Hongyu; Dykes, Carrie; Demeter, Lisa M; Wu, Hulin
2009-03-01
Many biological processes and systems can be described by a set of differential equation (DE) models. However, literature in statistical inference for DE models is very sparse. We propose statistical estimation, model selection, and multimodel averaging methods for HIV viral fitness experiments in vitro that can be described by a set of nonlinear ordinary differential equations (ODE). The parameter identifiability of the ODE models is also addressed. We apply the proposed methods and techniques to experimental data of viral fitness for HIV-1 mutant 103N. We expect that the proposed modeling and inference approaches for the DE models can be widely used for a variety of biomedical studies.
Optimal experiment design for model selection in biochemical networks
2014-01-01
Background: Mathematical modeling is often used to formalize hypotheses on how a biochemical network operates by discriminating between competing models. Bayesian model selection offers a way to determine the amount of evidence that data provide to support one model over the other while favoring simple models. In practice, the amount of experimental data is often insufficient to make a clear distinction between competing models. Often one would like to perform a new experiment which would discriminate between competing hypotheses. Results: We developed a novel method to perform Optimal Experiment Design to predict which experiments would most effectively allow model selection. A Bayesian approach is applied to infer model parameter distributions. These distributions are sampled and used to simulate from multivariate predictive densities. The method is based on a k-Nearest Neighbor estimate of the Jensen-Shannon divergence between the multivariate predictive densities of competing models. Conclusions: We show that the method successfully uses predictive differences to enable model selection by applying it to several test cases. Because the design criterion is based on predictive distributions, which can be computed for a wide range of model quantities, the approach is very flexible. The method reveals specific combinations of experiments which improve discriminability even in cases where data is scarce. The proposed approach can be used in conjunction with existing Bayesian methodologies where (approximate) posteriors have been determined, making use of relations that exist within the inferred posteriors. PMID:24555498
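The paper's design criterion is a k-Nearest Neighbor estimate of the Jensen-Shannon divergence between multivariate predictive densities; a simpler histogram-based estimate on univariate predictive samples illustrates the underlying idea (the two "models" and their predictive distributions here are hypothetical).

```python
import math, random

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (nats)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def hist(samples, edges):
    """Normalized histogram of samples over the given bin edges."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

rng = random.Random(0)
# Predictive samples from two hypothetical competing models for one
# measurable quantity; well-separated predictions give a JSD near its
# maximum of ln(2), meaning this experiment would discriminate the models.
model_a = [rng.gauss(0.0, 1.0) for _ in range(5000)]
model_b = [rng.gauss(3.0, 1.0) for _ in range(5000)]
edges = [i * 0.5 for i in range(-12, 17)]   # bins covering both sample sets
jsd = js_divergence(hist(model_a, edges), hist(model_b, edges))
print(round(jsd, 3))
```

In the optimal-design loop, this divergence would be computed for each candidate experiment and the experiment with the largest predictive divergence chosen.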
Efficient Regularized Regression with L0 Penalty for Variable Selection and Network Construction
2016-01-01
Variable selection for regression with high-dimensional big data has found many applications in bioinformatics and computational biology. One appealing approach is L0 regularized regression, which penalizes the number of nonzero features in the model directly. However, it is well known that L0 optimization is NP-hard and computationally challenging. In this paper, we propose efficient EM (L0EM) and dual L0EM (DL0EM) algorithms that directly approximate the L0 optimization problem. While L0EM is efficient with large sample sizes, DL0EM is efficient with high-dimensional (n ≪ m) data. They also provide a natural solution to all Lp problems with p ∈ [0,2], including lasso with p = 1 and elastic net with p ∈ [1,2]. The regularization parameter λ can be determined through cross validation or AIC and BIC. We demonstrate our methods through simulation and high-dimensional genomic data. The results indicate that L0 has better performance than lasso, SCAD, and MC+, and that L0 with AIC or BIC has performance similar to computationally intensive cross validation. The proposed algorithms are efficient in identifying the nonzero variables with less bias and in constructing biologically important networks with high-dimensional big data. PMID:27843486
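L0EM itself is beyond a short sketch, but for a handful of features the L0-penalized problem it approximates can be solved exactly by exhaustive subset search scored with BIC, as below (simulated data; a small pure-Python normal-equations solver keeps the example self-contained):

```python
import itertools, math, random

def lstsq(X, y):
    """Solve the normal equations X'X b = X'y by Gaussian elimination."""
    p = len(X[0])
    A = [[sum(X[i][j] * X[i][k] for i in range(len(X))) for k in range(p)]
         for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(len(X))) for j in range(p)]
    for j in range(p):                       # forward elimination
        for r in range(j + 1, p):
            f = A[r][j] / A[j][j]
            for k in range(j, p):
                A[r][k] -= f * A[j][k]
            b[r] -= f * b[j]
    beta = [0.0] * p
    for j in reversed(range(p)):             # back substitution
        beta[j] = (b[j] - sum(A[j][k] * beta[k]
                              for k in range(j + 1, p))) / A[j][j]
    return beta

def bic_subset(X, y, subset):
    """BIC of the least-squares fit restricted to the given feature subset."""
    n = len(y)
    Xs = [[row[j] for j in subset] for row in X]
    beta = lstsq(Xs, y)
    rss = sum((yi - sum(b * xij for b, xij in zip(beta, row))) ** 2
              for row, yi in zip(Xs, y))
    return n * math.log(rss / n) + len(subset) * math.log(n)

rng = random.Random(1)
n, p = 80, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# True model uses only features 0 and 2.
y = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0, 0.5) for row in X]

best = min((s for k in range(1, p + 1)
            for s in itertools.combinations(range(p), k)),
           key=lambda s: bic_subset(X, y, s))
print(best)
```

The 2^p cost of this exact search is precisely what L0EM avoids; the sketch only shows the objective that both approaches target.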
Novel web service selection model based on discrete group search.
Zhai, Jie; Shao, Zhiqing; Guo, Yi; Zhang, Haiteng
2014-01-01
In our earlier work, we present a novel formal method for the semiautomatic verification of specifications and for describing web service composition components by using abstract concepts. After verification, the instantiations of components were selected to satisfy the complex service performance constraints. However, selecting an optimal instantiation, which comprises different candidate services for each generic service, from a large number of instantiations is difficult. Therefore, we present a new evolutionary approach on the basis of the discrete group search service (D-GSS) model. With regard to obtaining the optimal multiconstraint instantiation of the complex component, the D-GSS model has competitive performance compared with other service selection models in terms of accuracy, efficiency, and ability to solve high-dimensional service composition component problems. We propose the cost function and the discrete group search optimizer (D-GSO) algorithm and study the convergence of the D-GSS model through verification and test cases.
Monthly streamflow prediction in the Volta Basin of West Africa: A SISO NARMAX polynomial modelling
NASA Astrophysics Data System (ADS)
Amisigo, B. A.; van de Giesen, N.; Rogers, C.; Andah, W. E. I.; Friesen, J.
Single-input-single-output (SISO) non-linear system identification techniques were employed to model monthly catchment runoff at selected gauging sites in the Volta Basin of West Africa. NARMAX (Non-linear Autoregressive Moving Average with eXogenous Input) polynomial models were fitted to basin monthly rainfall and gauging station runoff data for each of the selected sites and used to predict monthly runoff at the sites. An error reduction ratio (ERR) algorithm was used to order regressors for various combinations of input, output and noise lags (various model structures), and the significant regressors for each model were selected by applying an Akaike Information Criterion (AIC) to independent rainfall-runoff validation series. Model parameters were estimated with the Matlab REGRESS function (an orthogonal least squares method). In each case, the sub-model without noise terms was fitted first, followed by a fitting of the noise model. The coefficient of determination (R-squared), the Nash-Sutcliffe Efficiency criterion (NSE) and the F statistic for the estimation (training) series were used to evaluate the significance of fit of each model to this series, while model selection from the range of models fitted for each gauging site was done by examining the NSEs and the AICs of the validation series. Monthly runoff predictions from the selected models were very good, and the polynomial models appeared to have captured a good part of the rainfall-runoff non-linearity. The results indicate that the NARMAX modelling framework is suitable for monthly river runoff prediction in the Volta Basin. The several good models made available by the NARMAX modelling framework could be useful in the selection of model structures that also provide insights into the physical behaviour of the catchment rainfall-runoff system.
Standard Codon Substitution Models Overestimate Purifying Selection for Nonstationary Data
Yap, Von Bing; Huttley, Gavin A.
2017-01-01
Estimation of natural selection on protein-coding sequences is a key comparative genomics approach for de novo prediction of lineage-specific adaptations. Selective pressure is measured on a per-gene basis by comparing the rate of nonsynonymous substitutions to the rate of synonymous substitutions. All published codon substitution models have been time-reversible and thus assume that sequence composition does not change over time. We previously demonstrated that if time-reversible DNA substitution models are applied in the presence of changing sequence composition, the number of substitutions is systematically biased towards overestimation. We extend these findings to the case of codon substitution models and further demonstrate that the ratio of nonsynonymous to synonymous rates of substitution tends to be underestimated over three data sets of mammals, vertebrates, and insects. Our basis for comparison is a nonstationary codon substitution model that allows sequence composition to change. Goodness-of-fit results demonstrate that our new model tends to fit the data better. Direct measurement of nonstationarity shows that bias in estimates of natural selection and genetic distance increases with the degree of violation of the stationarity assumption. Additionally, inferences drawn under time-reversible models are systematically affected by compositional divergence. As genomic sequences accumulate at an accelerating rate, the importance of accurate de novo estimation of natural selection increases. Our results establish that our new model provides a more robust perspective on this fundamental quantity. PMID:28175284
Evolution models with base substitutions, insertions, deletions, and selection
NASA Astrophysics Data System (ADS)
Saakian, D. B.
2008-12-01
The evolution model with parallel mutation-selection scheme is solved for the case when selection is accompanied by base substitutions, insertions, and deletions. The fitness is assumed to be either a single-peak function (i.e., having one finite discontinuity) or a smooth function of the Hamming distance from the reference sequence. The mean fitness is calculated exactly in large-genome limit. In the case of insertions and deletions the evolution characteristics depend on the choice of reference sequence.
Genetic signatures of natural selection in a model invasive ascidian
NASA Astrophysics Data System (ADS)
Lin, Yaping; Chen, Yiyong; Yi, Changho; Fong, Jonathan J.; Kim, Won; Rius, Marc; Zhan, Aibin
2017-03-01
Invasive species represent promising models to study species’ responses to rapidly changing environments. Although local adaptation frequently occurs during contemporary range expansion, the associated genetic signatures at both population and genomic levels remain largely unknown. Here, we use genome-wide gene-associated microsatellites to investigate genetic signatures of natural selection in a model invasive ascidian, Ciona robusta. Population genetic analyses of 150 individuals sampled in Korea, New Zealand, South Africa and Spain showed significant genetic differentiation among populations. Based on outlier tests, we found a high incidence of signatures of directional selection at 19 loci. Hitchhiking mapping analyses identified 12 directional selective sweep regions, and all selective sweep windows on chromosomes were narrow (~8.9 kb). Further analyses identified 132 candidate genes under selection. When we compared our genetic data with six crucial environmental variables, 16 putatively selected loci showed significant correlation with these environmental variables. This suggests that local environmental conditions have left significant signatures of selection at both population and genomic levels. Finally, we identified “plastic” genomic regions and genes that are promising regions to investigate evolutionary responses to rapid environmental change in C. robusta.
IT vendor selection model by using structural equation model & analytical hierarchy process
NASA Astrophysics Data System (ADS)
Maitra, Sarit; Dominic, P. D. D.
2012-11-01
Selecting and evaluating the right vendors is imperative for an organization's global marketplace competitiveness. Improper selection and evaluation of potential vendors can severely degrade an organization's supply chain performance. Numerous studies have demonstrated that firms consider multiple criteria when selecting key vendors. This research intends to develop a new hybrid model for the vendor selection process with better decision making. The proposed model provides a suitable tool for assisting decision makers and managers to make the right decisions and select the most suitable vendor. This paper proposes a hybrid model based on Structural Equation Model (SEM) and Analytical Hierarchy Process (AHP) for long-term strategic vendor selection problems. The five-step framework of the model has been designed after a thorough literature study. The proposed hybrid model will be applied to a real-life case study to assess its effectiveness. In addition, the What-if analysis technique will be used for model validation purposes.
Cognitive niches: an ecological model of strategy selection.
Marewski, Julian N; Schooler, Lael J
2011-07-01
How do people select among different strategies to accomplish a given task? Across disciplines, the strategy selection problem represents a major challenge. We propose a quantitative model that predicts how selection emerges through the interplay among strategies, cognitive capacities, and the environment. This interplay carves out for each strategy a cognitive niche, that is, a limited number of situations in which the strategy can be applied, simplifying strategy selection. To illustrate our proposal, we consider selection in the context of 2 theories: the simple heuristics framework and the ACT-R (adaptive control of thought-rational) architecture of cognition. From the heuristics framework, we adopt the thesis that people make decisions by selecting from a repertoire of simple decision strategies that exploit regularities in the environment and draw on cognitive capacities, such as memory and time perception. ACT-R provides a quantitative theory of how these capacities adapt to the environment. In 14 simulations and 10 experiments, we consider the choice between strategies that operate on the accessibility of memories and those that depend on elaborate knowledge about the world. Based on Internet statistics, our model quantitatively predicts people's familiarity with and knowledge of real-world objects, the distributional characteristics of the associated speed of memory retrieval, and the cognitive niches of classic decision strategies, including those of the fluency, recognition, integration, lexicographic, and sequential-sampling heuristics. In doing so, the model specifies when people will be able to apply different strategies and how accurate, fast, and effortless people's decisions will be.
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study: This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
Robust model selection and the statistical classification of languages
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Viola, M. L. L.
2012-10-01
In this paper we address the problem of model selection for the set of finite memory stochastic processes with finite alphabet, when the data is contaminated. We consider m independent samples, with more than half of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We devise a model selection procedure such that, for a sample size large enough, the selected process is the one with law Q. Our model selection strategy is based on estimating relative entropies to select a subset of samples that are realizations of the same law. Although the procedure is valid for any family of finite order Markov models, we will focus on the family of variable length Markov chain models, which includes the fixed order Markov chain model family. We define the asymptotic breakdown point (ABDP) for a model selection procedure, and we show the ABDP for our procedure. This means that if the proportion of contaminated samples is smaller than the ABDP, then, as the sample size grows, our procedure selects a model for the process with law Q. We also use our procedure in a setting where we have one sample formed by the concatenation of sub-samples of two or more stochastic processes, with most of the sub-samples having law Q. We conducted a simulation study. In the application section we address the question of the statistical classification of languages according to their rhythmic features using speech samples. This is an important open problem in phonology. A persistent difficulty with this problem is that the speech samples correspond to several sentences produced by diverse speakers, corresponding to a mixture of distributions. The usual procedure for dealing with this problem has been to choose a subset of the original sample which seems to best represent each language. The selection is made by listening to the samples. In our application we use the full dataset without any preselection of samples. We apply our robust methodology estimating
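The idea of using estimated relative entropies to isolate the majority-law samples can be illustrated with first-order Markov chains over a binary alphabet; the chains, sample sizes, and the simple state-averaged divergence below are illustrative stand-ins for the paper's estimator and its variable-length models.

```python
import math, random

def transition_matrix(seq, k=2, prior=0.5):
    """Smoothed empirical transition matrix (prior avoids log of zero)."""
    counts = [[prior] * k for _ in range(k)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def markov_divergence(p, q):
    """Relative entropy between two chains, averaged uniformly over states
    (a crude stand-in for the paper's relative-entropy estimator)."""
    k = len(p)
    return sum(p[a][b] * math.log(p[a][b] / q[a][b])
               for a in range(k) for b in range(k)) / k

def simulate(p, n, rng):
    s, seq = 0, []
    for _ in range(n):
        s = 0 if rng.random() < p[s][0] else 1
        seq.append(s)
    return seq

rng = random.Random(3)
P = [[0.9, 0.1], [0.1, 0.9]]      # law Q: a sticky chain (the majority law)
C = [[0.5, 0.5], [0.5, 0.5]]      # contaminating law: an i.i.d. coin
samples = [simulate(P, 2000, rng) for _ in range(5)] + \
          [simulate(C, 2000, rng) for _ in range(2)]

# Score each sample by its summed divergence to all the others: the five
# law-Q samples agree with each other, so the two contaminated samples
# (indices 5 and 6) stand out with the largest scores.
mats = [transition_matrix(s) for s in samples]
score = [sum(markov_divergence(mats[i], mats[j])
             for j in range(len(mats)) if j != i) for i in range(len(mats))]
outliers = sorted(range(len(score)), key=score.__getitem__)[-2:]
print(sorted(outliers))
```

The retained majority subset would then be pooled to estimate the model for law Q, mirroring the paper's robust selection step.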
A neural network model for visual selection and shifting.
Qiao, Yuanhua; Liu, Xiaojie; Miao, Jun; Duan, Lijuan
2016-09-01
In this paper, a two-layer network is built to simulate the mechanism of visual selection and shifting based on a mapping dynamic model for instantaneous frequency. Unlike differential equation models that use a limit cycle to simulate neuron oscillation, we build an instantaneous frequency mapping dynamic model to describe the change in neuron frequency, avoiding the difficulty of generating a limit cycle. The activity of each neuron is rebuilt based on its instantaneous frequency. In this work, the first layer of neurons implements image segmentation and the second layer acts as a visual selector. The frequency of the second-layer (central) neuron changes continuously; while the central neuron resonates with the neurons corresponding to an object, that object is selected. As the central neuron's frequency continues to change, the selected object loses attention and the process repeats.
Screening of selective histone deacetylase inhibitors by proteochemometric modeling
2012-01-01
Background: Histone deacetylase (HDAC) is a novel target for the treatment of cancer, and it can be classified into three classes, i.e., classes I, II, and IV. Inhibitors selectively targeting individual HDACs have been shown to be better candidate antitumor drugs. To screen selective HDAC inhibitors, several proteochemometric (PCM) models based on different combinations of three kinds of protein descriptors, two kinds of ligand descriptors and multiplication cross-terms were constructed in our study. Results: The results show that structure similarity descriptors are better than sequence similarity descriptors and geometry descriptors in the characterization of HDACs. Furthermore, the predictive ability was not improved by introducing the cross-terms in our models. Finally, the best PCM model, based on protein structure similarity descriptors and 32-dimensional general descriptors, was derived (R2 = 0.9897, Qtest2 = 0.7542), which shows a powerful ability to screen selective HDAC inhibitors. Conclusions: Our best model not only predicts the activities of inhibitors for each HDAC isoform, but also screens and distinguishes class-selective inhibitors and even more isoform-selective inhibitors, thus providing a potential way to discover or design novel candidate antitumor drugs with reduced side effects. PMID:22913517
Uncertain programming models for portfolio selection with uncertain returns
NASA Astrophysics Data System (ADS)
Zhang, Bo; Peng, Jin; Li, Shengguo
2015-10-01
In an indeterminate economic environment, experts' knowledge about the returns of securities involves much uncertainty rather than randomness. This paper discusses the portfolio selection problem in an uncertain environment in which security returns cannot be well reflected by historical data, but can be evaluated by experts. In the paper, returns of securities are assumed to be given by uncertain variables. According to various decision criteria, the portfolio selection problem in an uncertain environment is formulated as an expected-variance-chance model and a chance-expected-variance model by using uncertain programming. Within the framework of uncertainty theory, for the convenience of solving the models, some crisp equivalents are discussed under different conditions. In addition, a hybrid intelligent algorithm is designed to provide a general method for solving the new models in general cases. Finally, two numerical examples are provided to show the performance and applications of the models and algorithm.
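As a point of comparison for the uncertain-variable models, the classical mean-variance trade-off that they generalize can be sketched by grid search over portfolio weights (all returns, covariances, and the risk-aversion level below are hypothetical):

```python
mu = [0.08, 0.12, 0.05]           # hypothetical expected returns
cov = [[0.04, 0.01, 0.00],        # hypothetical covariance matrix
       [0.01, 0.09, 0.00],
       [0.00, 0.00, 0.01]]
gamma = 3.0                       # risk-aversion coefficient

def utility(w):
    """Mean-variance objective: expected return minus gamma * variance."""
    ret = sum(wi * mi for wi, mi in zip(w, mu))
    var = sum(w[i] * cov[i][j] * w[j] for i in range(3) for j in range(3))
    return ret - gamma * var

# Grid search over long-only weight vectors summing to one.
grid = [i / 20 for i in range(21)]
candidates = [(a, b, 1 - a - b) for a in grid for b in grid
              if a + b <= 1 + 1e-9]
best = max(candidates, key=utility)
print([round(w, 2) for w in best], round(utility(best), 4))
```

In the paper's setting, expected value, variance, and chance are instead defined for uncertain variables, and the crisp equivalents play the role that this simple quadratic objective plays here.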
The E-MS Algorithm: Model Selection with Incomplete Data.
Jiang, Jiming; Nguyen, Thuan; Rao, J Sunil
2015-04-04
We propose a procedure associated with the idea of the E-M algorithm for model selection in the presence of missing data. The idea extends the concept of parameters to include both the model and the parameters under the model, and thus allows the model to be part of the E-M iterations. We develop the procedure, known as the E-MS algorithm, under the assumption that the class of candidate models is finite. Some special cases of the procedure are considered, including E-MS with the generalized information criteria (GIC), and E-MS with the adaptive fence (AF; Jiang et al. 2008). We prove numerical convergence of the E-MS algorithm as well as consistency in model selection of the limiting model of the E-MS convergence, for E-MS with GIC and E-MS with AF. We study the impact on model selection of different missing data mechanisms. Furthermore, we carry out extensive simulation studies on the finite-sample performance of the E-MS with comparisons to other procedures. The methodology is also illustrated on a real data analysis involving QTL mapping for an agricultural study on barley grains.
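The E-MS idea, alternating imputation of the missing data under the currently selected model with re-selection over the candidate class, can be sketched on a toy regression problem (two candidate models scored by BIC; all data simulated, and the candidate class is far simpler than the GIC/adaptive-fence settings of the paper):

```python
import math, random

rng = random.Random(5)
# Toy data: y = 2x + noise, with every third response missing.
x = [rng.uniform(0, 1) for _ in range(60)]
y = [2 * xi + rng.gauss(0, 0.3) for xi in x]
missing = set(range(0, 60, 3))

def fit(model, xs, ys):
    """Return a prediction function for model 'mean' or 'slope' (y = b*x)."""
    if model == "mean":
        m = sum(ys) / len(ys)
        return lambda xi: m
    b = sum(xi * yi for xi, yi in zip(xs, ys)) / sum(xi * xi for xi in xs)
    return lambda xi: b * xi

def bic(model, xs, ys):
    """BIC of the fitted model; both candidates have one mean parameter."""
    pred = fit(model, xs, ys)
    rss = sum((yi - pred(xi)) ** 2 for xi, yi in zip(xs, ys))
    return len(ys) * math.log(rss / len(ys)) + math.log(len(ys))

obs = [i for i in range(60) if i not in missing]
model = "mean"                    # deliberately poor starting model
for _ in range(10):
    # E step: impute the missing responses under the current model ...
    pred = fit(model, [x[i] for i in obs], [y[i] for i in obs])
    y_full = [pred(x[i]) if i in missing else y[i] for i in range(60)]
    # ... M-S step: re-select the model on the completed data.
    model = min(["mean", "slope"], key=lambda m: bic(m, x, y_full))
print("selected:", model)
```

The iteration quickly stabilizes on the slope model, illustrating the convergence behavior that the paper proves for E-MS with GIC under a finite candidate class.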
NASA Astrophysics Data System (ADS)
Fachruddin Syah, Achmad; Saitoh, Sei-Ichi; Alabia, Irene D.; Hirawake, Toru
2017-01-01
To evaluate the effects of oceanographic conditions on the formation of the potential fishing zones for Pacific saury in western North Pacific, fishing locations of Pacific saury from Defense Meteorological Satellite Program/Operating Linescan System (DMSP/OLS) and satellite-based oceanographic information were used to construct species habitat models. A 2-level slicing method was used to identify the bright regions as actual fishing areas from OLS images, collected during the peak fishing season of Pacific saury in the North Pacific. Statistical metrics, including the significance of model terms, and reduction in the Akaike’s Information Criterion (AIC) were used as the bases for model selection. The selected model was then used to visualize the basin scale distributions of the Pacific saury habitat. The predicted potential fishing zones exhibited spatial correspondence with the fishing locations. The results from generalized additive model revealed that the Pacific saury habitat selection was significantly influenced by the SST ranges from 13-18°C, SSC ranges from 0.5-1.8 mg.m-3, SSHA ranges from 5-17 cm and EKE ranges from 700-1200 cm2s-2. Moreover, among the set of oceanographic factors examined, SST explained the smallest AIC and is thus, considered to be the most significant variable in the geographic distribution of Pacific saury.
Tests of Bayesian model selection techniques for gravitational wave astronomy
Cornish, Neil J.; Littenberg, Tyson B.
2007-10-15
The analysis of gravitational wave data involves many model selection problems. The most important example is the detection problem of selecting between the data being consistent with instrument noise alone, or with instrument noise and a gravitational wave signal. The analysis of data from ground-based gravitational wave detectors is mostly conducted using classical statistics, and methods such as the Neyman-Pearson criterion are used for model selection. Future space-based detectors, such as the Laser Interferometer Space Antenna (LISA), are expected to produce rich data streams containing the signals of many millions of sources. Determining the number of sources that are resolvable, and the most appropriate description of each source, poses a challenging model selection problem that may best be addressed in a Bayesian framework. An important class of LISA sources comprises the millions of low-mass binary systems within our own galaxy, tens of thousands of which will be detectable. Not only is the number of sources unknown, but so is the number of parameters required to model the waveforms. For example, a significant subset of the resolvable galactic binaries will exhibit orbital frequency evolution, while a smaller number will have measurable eccentricity. In the Bayesian approach to model selection one needs to compute the Bayes factor between competing models. Here we explore various methods for computing Bayes factors in the context of determining which galactic binaries have measurable frequency evolution. The methods explored include a reversible jump Markov chain Monte Carlo algorithm, Savage-Dickey density ratios, the Schwarz-Bayes information criterion, and the Laplace approximation to the model evidence. We find good agreement between all of the approaches.
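Of the methods listed, the Schwarz (BIC) criterion is the cheapest to sketch as a Bayes-factor estimate. The log-likelihoods, parameter counts, and sample size below are invented numbers for illustration, not values from the paper.

```python
import math

def bic(log_likelihood, n_params, n_data):
    # Schwarz-Bayes information criterion
    return -2.0 * log_likelihood + n_params * math.log(n_data)

def approx_bayes_factor(ll_a, k_a, ll_b, k_b, n):
    """BIC approximation to the Bayes factor B_ab = p(d|M_a) / p(d|M_b)."""
    return math.exp(-0.5 * (bic(ll_a, k_a, n) - bic(ll_b, k_b, n)))

# a template with frequency evolution (one extra parameter) must raise the
# log-likelihood by more than log(n)/2 before this estimate favours it
bf = approx_bayes_factor(ll_a=-499.0, k_a=8, ll_b=-503.0, k_b=7, n=1000)
```

Here the extra parameter improves the log-likelihood by 4, slightly more than log(1000)/2 ≈ 3.45, so the frequency-evolving model is weakly favoured.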
Genome-Wide Heterogeneity of Nucleotide Substitution Model Fit
Arbiza, Leonardo; Patricio, Mateus; Dopazo, Hernán; Posada, David
2011-01-01
At a genomic scale, the patterns that have shaped molecular evolution are believed to be largely heterogeneous. Consequently, comparative analyses should use appropriate probabilistic substitution models that capture the main features under which different genomic regions have evolved. While efforts have concentrated on the development and understanding of model selection techniques, no descriptions of overall relative substitution model fit at the genome level have been reported. Here, we provide a characterization of best-fit substitution models across three genomic data sets including coding regions from mammals, vertebrates, and Drosophila (24,000 alignments). According to the Akaike Information Criterion (AIC), 82 of the 88 models considered were selected as best-fit models on at least one occasion, although with very different frequencies. Most parameter estimates also varied broadly among genes. Patterns found for vertebrates and Drosophila were quite similar and often more complex than those found in mammals. Phylogenetic trees derived from models in the 95% confidence interval set showed much less variance and were significantly closer to the tree estimated under the best-fit model than trees derived from models outside this interval. Although alternative criteria selected simpler models than the AIC, they suggested similar patterns. Altogether, our results show that at a genomic scale, different gene alignments for the same set of taxa are best explained by a large variety of substitution models, and that model choice has implications for different parameter estimates, including the inferred phylogenetic trees. After taking into account the differences related to sample size, our results suggest a noticeable diversity in the underlying evolutionary processes. We conclude that the use of model selection techniques is important for obtaining consistent phylogenetic estimates from real data at a genomic scale. PMID:21824869
Tamuri, Asif U.; dos Reis, Mario; Goldstein, Richard A.
2012-01-01
Estimation of the distribution of selection coefficients of mutations is a long-standing issue in molecular evolution. In addition to population-based methods, the distribution can be estimated from DNA sequence data by phylogenetic-based models. Previous models have generally found unimodal distributions where the probability mass is concentrated between mildly deleterious and nearly neutral mutations. Here we use a sitewise mutation–selection phylogenetic model to estimate the distribution of selection coefficients among novel and fixed mutations (substitutions) in a data set of 244 mammalian mitochondrial genomes and a set of 401 PB2 proteins from influenza. We find a bimodal distribution of selection coefficients for novel mutations in both the mitochondrial data set and for the influenza protein evolving in its natural reservoir, birds. Most of the mutations are strongly deleterious with the rest of the probability mass concentrated around mildly deleterious to neutral mutations. The distribution of the coefficients among substitutions is unimodal and symmetrical around nearly neutral substitutions for both data sets at adaptive equilibrium. About 0.5% of the nonsynonymous mutations and 14% of the nonsynonymous substitutions in the mitochondrial proteins are advantageous, with 0.5% and 24% observed for the influenza protein. Following a host shift of influenza from birds to humans, however, we find among novel mutations in PB2 a trimodal distribution with a small mode of advantageous mutations. PMID:22209901
Fixation probability in a two-locus intersexual selection model.
Durand, Guillermo; Lessard, Sabin
2016-06-01
We study a two-locus model of intersexual selection in a finite haploid population reproducing according to a discrete-time Moran model with a trait locus expressed in males and a preference locus expressed in females. We show that the probability of ultimate fixation of a single mutant allele for a male ornament introduced at random at the trait locus given any initial frequency state at the preference locus is increased by weak intersexual selection and recombination, weak or strong. Moreover, this probability exceeds the initial frequency of the mutant allele even in the case of a costly male ornament if intersexual selection is not too weak. On the other hand, the probability of ultimate fixation of a single mutant allele for a female preference towards a male ornament introduced at random at the preference locus is increased by weak intersexual selection and weak recombination if the female preference is not costly, and is strong enough in the case of a costly male ornament. The analysis relies on an extension of the ancestral recombination-selection graph for samples of haplotypes to take into account events of intersexual selection, while the symbolic calculation of the fixation probabilities is made possible in a reasonable time by an optimizing algorithm.
Improvement of hydrological model calibration by selecting multiple parameter ranges
NASA Astrophysics Data System (ADS)
Wu, Qiaofeng; Liu, Shuguang; Cai, Yi; Li, Xinjian; Jiang, Yangming
2017-01-01
The parameters of hydrological models are usually calibrated to achieve good performance, owing to the highly non-linear nature of hydrological process modelling. However, parameter calibration efficiency is directly related to parameter range, and parameter range selection is in turn affected by the probability distribution of parameter values and by parameter sensitivity and correlation. A newly proposed method is employed to determine the optimal combination of multi-parameter ranges for improving the calibration of hydrological models. First, the probability distribution was specified for each parameter of the model based on genetic algorithm (GA) calibration. Then, several ranges were selected for each parameter according to the corresponding probability distribution, and the optimal range was determined by comparing the model results calibrated with the different selected ranges. Next, parameter correlation and sensitivity were evaluated by quantifying two indices, RC(Y,X) and SE, which can be used to handle negatively correlated parameters when specifying the optimal combination of ranges of all parameters for calibrating models. The investigation shows that the probability distribution of calibrated values of any particular parameter in the Xinanjiang model approaches a normal or exponential distribution. The multi-parameter optimal range selection method is superior to the single-parameter one for calibrating hydrological models with multiple parameters. The combination of the optimal ranges of all parameters is not necessarily the optimum, inasmuch as some parameters have negative effects on others. The application of the proposed methodology gives rise to an increase of 0.01 in minimum Nash-Sutcliffe efficiency (ENS) compared with that of the pure GA method. The rise in minimum ENS, with little change in the maximum, may shrink the range of possible solutions, which can effectively reduce the uncertainty of model performance.
Uniform design based SVM model selection for face recognition
NASA Astrophysics Data System (ADS)
Li, Weihong; Liu, Lijuan; Gong, Weiguo
2010-02-01
Support vector machines (SVMs) have proven to be a powerful tool for face recognition. The generalization capacity of an SVM depends on a model with optimal hyperparameters, and the computational cost of SVM model selection makes its application to face recognition difficult. To overcome this shortcoming, we exploit the advantages of uniform design (space-filling designs and uniform scattering theory) to search for optimal SVM hyperparameters. We then propose a face recognition scheme based on an SVM with the optimal model, obtained by replacing grid and gradient-based methods with uniform design. Experimental results on the Yale and PIE face databases show that the proposed method significantly improves the efficiency of SVM model selection.
A topic evolution model with sentiment and selective attention
NASA Astrophysics Data System (ADS)
Si, Xia-Meng; Wang, Wen-Dong; Zhai, Chun-Qing; Ma, Yan
2017-04-01
Topic evolution is a hybrid dynamics of information propagation and opinion interaction. The dynamics of opinion interaction are inherently interwoven with those of information propagation in the network, owing to the bidirectional influence between interaction and diffusion. The degree of sentiment determines whether a topic can continue to spread from a node, and selective attention determines the direction of information flow and the selection of communicatees. To this end, we put forward a sentiment-based mixed dynamics model with selective attention and apply Bayesian updating rules to it. Our model can indirectly describe isolated users who appear uninvolved in a topic even though everybody around them has heard about it. Numerical simulations show that more initial insiders and fewer simultaneous spreaders lessen extremism. To promote topic diffusion or restrain the prevalence of extremism, fewer agents with constructive motivation and more agents with no motivation to get involved are encouraged.
Model-based sensor location selection for helicopter gearbox monitoring
NASA Technical Reports Server (NTRS)
Jammu, Vinay B.; Wang, Keming; Danai, Kourosh; Lewicki, David G.
1996-01-01
A new methodology is introduced to quantify the significance of accelerometer locations for fault diagnosis of helicopter gearboxes. The basis for this methodology is an influence model which represents the effect of various component faults on accelerometer readings. Based on this model, a set of selection indices are defined to characterize the diagnosability of each component, the coverage of each accelerometer, and the relative redundancy between the accelerometers. The effectiveness of these indices is evaluated experimentally by measurement-fault data obtained from an OH-58A main rotor gearbox. These data are used to obtain a ranking of individual accelerometers according to their significance in diagnosis. Comparison between the experimentally obtained rankings and those obtained from the selection indices indicates that the proposed methodology offers a systematic means for accelerometer location selection.
Selection and Retention in Teacher Education: A Model
ERIC Educational Resources Information Center
Brubaker, Harold A.
1976-01-01
With the stabilization of student population and a demand for improvement in the quality of prospective teachers, a model has been developed for improving selection-retention procedures, giving administrative organization, role of student personnel services, criteria for admission to the profession, and provisions for probationary status,…
Selecting Microcomputer Network Configurations: A Model for Technological Endurance.
ERIC Educational Resources Information Center
Drummond, Marshall; And Others
A model approach is suggested for the selection of a microcomputer network that will identify specific needs and arrive at solutions with maximum flexibility to avoid technological obsolescence. Chapter 1 specifies functional needs for a network design. This chapter discusses the process of evaluating whether a network is appropriate; examines and…
An Assessment-Based Model for Counseling Strategy Selection.
ERIC Educational Resources Information Center
Nelson, Mary Lee
2002-01-01
Presents a counseling strategy selection model grounded in technical eclecticism and based on thorough assessment of the client's problems. Assessment should consider client mental health, counseling goals, problem complexity, and capacity and desire for insight. Distinguishing between simple and complex problems can aid assessment and provide…
Research and Development into a Comprehensive Media Selection Model.
ERIC Educational Resources Information Center
Cantor, Jeffrey A.
1988-01-01
Describes and discusses an instructional systems media selection model based on training effectiveness and cost effectiveness prediction techniques that were developed to support the U.S. Navy's training programs. Highlights include instructional delivery systems (IDS); decision making; trainee characteristics; training requirements analysis; an…
A model selection approach to analysis of variance and covariance.
Alber, Susan A; Weiss, Robert E
2009-06-15
An alternative to analysis of variance is a model selection approach in which every partition of the treatment means into clusters with equal value is treated as a separate model. The null hypothesis that all treatments are equal corresponds to the partition with all means in a single cluster. The alternative hypothesis corresponds to the set of all other partitions of treatment means. A model selection approach can also be used for a treatment-by-covariate interaction, where the null hypothesis and each alternative correspond to a partition of treatments into clusters with equal covariate effects. We extend the partition-as-model approach to simultaneous inference for both the treatment main effect and the treatment interaction with a continuous covariate, with separate partitions for the intercepts and treatment-specific slopes. The model space is the Cartesian product of the intercept partition and the slope partition, and we develop five joint priors for this model space. In four of these priors the intercept and slope partitions are dependent. We advise on setting priors over models, and we use the model to analyze an orthodontic data set that compares the frictional resistance created by orthodontic fixtures.
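The partition-as-model space can be enumerated directly. This stdlib-only sketch (function and treatment labels are illustrative, not the authors') lists every partition of four treatment means, giving the Bell number B(4) = 15 candidate models, exactly one of which is the usual null.

```python
def partitions(items):
    """Yield every partition of a list of treatment labels into clusters."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # put `first` into each existing cluster in turn...
        for i in range(len(smaller)):
            yield smaller[:i] + [smaller[i] + [first]] + smaller[i + 1:]
        # ...or into a cluster of its own
        yield [[first]] + smaller

models = list(partitions(["A", "B", "C", "D"]))
# the partition with all means in one cluster is the usual null hypothesis
null_model = [m for m in models if len(m) == 1]
```

For the joint intercept/slope space described in the abstract, the model count is the Cartesian product, 15 x 15 = 225 models for four treatments.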
Goodenough, Anne E.; Hart, Adam G.; Stafford, Richard
2012-01-01
Despite recent papers on problems associated with full-model and stepwise regression, their use is still common throughout ecological and environmental disciplines. Alternative approaches, including generating multiple models and comparing them post-hoc using techniques such as Akaike's Information Criterion (AIC), are becoming more popular. However, these are problematic when there are numerous independent variables and interpretation is often difficult when competing models contain many different variables and combinations of variables. Here, we detail a new approach, REVS (Regression with Empirical Variable Selection), which uses all-subsets regression to quantify empirical support for every independent variable. A series of models is created; the first containing the variable with most empirical support, the second containing the first variable and the next most-supported, and so on. The comparatively small number of resultant models (n = the number of predictor variables) means that post-hoc comparison is comparatively quick and easy. When tested on a real dataset – habitat and offspring quality in the great tit (Parus major) – the optimal REVS model explained more variance (higher R2), was more parsimonious (lower AIC), and had greater significance (lower P values), than full, stepwise or all-subsets models; it also had higher predictive accuracy based on split-sample validation. Testing REVS on ten further datasets suggested that this is typical, with R2 values being higher than full or stepwise models (mean improvement = 31% and 7%, respectively). Results are ecologically intuitive as even when there are several competing models, they share a set of “core” variables and differ only in presence/absence of one or two additional variables. We conclude that REVS is useful for analysing complex datasets, including those in ecology and environmental disciplines. PMID:22479605
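The ranking step of REVS can be sketched under assumptions: here empirical support for a predictor is quantified as the summed Akaike weight over every subset containing it, using a stdlib-only least-squares fit. The function names, the support measure, and the simulated data are my own illustration, not the published implementation.

```python
import itertools
import math
import random

def ols_rss(X, y):
    # residual sum of squares via the normal equations (X includes an intercept)
    n, p = len(y), len(X[0])
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    c = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    for k in range(p):  # Gaussian elimination with partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv], c[k], c[piv] = A[piv], A[k], c[piv], c[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for col in range(k, p):
                A[r][col] -= f * A[k][col]
            c[r] -= f * c[k]
    beta = [0.0] * p
    for k in reversed(range(p)):
        beta[k] = (c[k] - sum(A[k][j] * beta[j] for j in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def aic(rss, n, k):
    return n * math.log(rss / n) + 2 * k

def revs_sequence(data, y):
    """Rank predictors by their summed Akaike weight over all subsets,
    then return the nested REVS model sequence."""
    names, n = sorted(data), len(y)
    scored = []
    for r in range(len(names) + 1):
        for subset in itertools.combinations(names, r):
            X = [[1.0] + [data[v][i] for v in subset] for i in range(n)]
            scored.append((subset, aic(ols_rss(X, y), n, r + 1)))
    best = min(a for _, a in scored)
    support = {v: sum(math.exp(-0.5 * (a - best)) for s, a in scored if v in s)
               for v in names}
    ranked = sorted(names, key=support.get, reverse=True)
    return [ranked[:i] for i in range(1, len(ranked) + 1)]

# simulated data: only x1 drives the response
random.seed(1)
x1 = [random.gauss(0, 1) for _ in range(40)]
x2 = [random.gauss(0, 1) for _ in range(40)]
x3 = [random.gauss(0, 1) for _ in range(40)]
y = [2.0 * a + random.gauss(0, 0.5) for a in x1]
seq = revs_sequence({"x1": x1, "x2": x2, "x3": x3}, y)
```

As in the abstract, only p nested models result (here three), so the post-hoc comparison step is cheap.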
How Many Separable Sources? Model Selection In Independent Components Analysis
Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen
2015-01-01
Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher-order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Simulation studies show that the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive, alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. The failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis, where all sources are assumed non-Gaussian. PMID:25811988
Bayesian Variable Selection on Model Spaces Constrained by Heredity Conditions.
Taylor-Rodriguez, Daniel; Womack, Andrew; Bliznyuk, Nikolay
2016-01-01
This paper investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending upon the inclusion of lower-order parent terms. We develop classes of priors on the model space, investigate their theoretical and finite sample properties, and provide a Metropolis-Hastings algorithm for searching the space of models. The tools proposed allow fast and thorough exploration of model spaces that account for hierarchical polynomial structure in the predictors and provide control of the inclusion of false positives in high posterior probability models.
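The heredity conditions themselves can be checked mechanically. In this minimal sketch (the term encoding and function name are my own, not the paper's), strong heredity requires every lower-order parent of an included term to be present, while weak heredity requires at least one.

```python
def satisfies_heredity(model, mode="strong"):
    """Check weak/strong heredity for a set of terms.

    A term is a sorted tuple of variable names: ('x1', 'x2') is the
    x1:x2 interaction, ('x1', 'x1') the quadratic term in x1."""
    terms = set(model)
    for term in terms:
        if len(term) < 2:
            continue  # main effects have no parents
        # parent terms: drop one factor at a time
        parents = {tuple(sorted(term[:i] + term[i + 1:])) for i in range(len(term))}
        present = [p in terms for p in parents]
        if mode == "strong" and not all(present):
            return False
        if mode == "weak" and not any(present):
            return False
    return True

m_strong = [("x1",), ("x2",), ("x1", "x2")]  # both parents of x1:x2 included
m_weak = [("x1",), ("x1", "x2")]             # only one parent included
```

A predicate like this is what restricts the model space: a Metropolis-Hastings proposal that adds or drops a term is simply rejected when the resulting model fails the chosen heredity condition.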
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in the medical literature. This article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to determine whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment for the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF); in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow test is the most widely used GOF test for logistic regression models.
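The likelihood ratio test for deleting a single covariate can be sketched as follows. For one degree of freedom the chi-square tail probability reduces to erfc(sqrt(G/2)), so the standard library suffices; the two log-likelihood values are hypothetical numbers, not from a real fit.

```python
import math

def lr_test_1df(loglik_full, loglik_reduced):
    """Likelihood ratio test for deleting one covariate (1 df).

    G = 2 * (llik_full - llik_reduced) is compared to chi-square(1),
    whose survival function is erfc(sqrt(G / 2))."""
    G = 2.0 * (loglik_full - loglik_reduced)
    p = math.erfc(math.sqrt(G / 2.0))
    return G, p

# hypothetical log-likelihoods from two nested logistic fits
G, p = lr_test_1df(loglik_full=-120.4, loglik_reduced=-123.9)
```

With G = 7.0 the p-value is about 0.008, so under purposeful selection the variable would be retained. For deletions of several covariates at once, the general chi-square survival function (e.g. R's `pchisq(G, df, lower.tail = FALSE)`) replaces the 1-df shortcut.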
Bayesian analysis. II. Signal detection and model selection
NASA Astrophysics Data System (ADS)
Bretthorst, G. Larry
In the preceding paper, Bayesian analysis was applied to the parameter estimation problem, given quadrature NMR data. Here Bayesian analysis is extended to the problem of selecting the model which is most probable in view of the data and all the prior information. In addition to the analytic calculation, two examples are given. The first example demonstrates how to use Bayesian probability theory to detect small signals in noise. The second example uses Bayesian probability theory to compute the probability of the number of decaying exponentials in simulated T1 data. The Bayesian answer to this question is essentially a microcosm of the scientific method and a quantitative statement of Ockham's razor: theorize about possible models, compare these to experiment, and select the simplest model that "best" fits the data.
Model selection as a science driver for dark energy surveys
NASA Astrophysics Data System (ADS)
Mukherjee, Pia; Parkinson, David; Corasaniti, Pier Stefano; Liddle, Andrew R.; Kunz, Martin
2006-07-01
A key science goal of upcoming dark energy surveys is to seek time-evolution of the dark energy. This problem is one of model selection, where the aim is to differentiate between cosmological models with different numbers of parameters. However, the power of these surveys is traditionally assessed by estimating their ability to constrain parameters, which is a different statistical problem. In this paper, we use Bayesian model selection techniques, specifically forecasting of the Bayes factors, to compare the abilities of different proposed surveys in discovering dark energy evolution. We consider six experiments - supernova luminosity measurements by the Supernova Legacy Survey, SNAP, JEDI and ALPACA, and baryon acoustic oscillation measurements by WFMOS and JEDI - and use Bayes factor plots to compare their statistical constraining power. The concept of Bayes factor forecasting has much broader applicability than dark energy surveys.
Supplier Selection in Virtual Enterprise Model of Manufacturing Supply Network
NASA Astrophysics Data System (ADS)
Kaihara, Toshiya; Opadiji, Jayeola F.
The market-based approach to manufacturing supply network planning focuses on the competitive attitudes of various enterprises in the network to generate plans that seek to maximize the throughput of the network. It is this competitive behaviour of the member units that we explore in proposing a solution model for a supplier selection problem in convergent manufacturing supply networks. We present a formulation of autonomous units of the network as trading agents in a virtual enterprise network interacting to deliver value to market consumers and discuss the effect of internal and external trading parameters on the selection of suppliers by enterprise units.
Lin, H. Y.; Desmond, R.; Liu, Y. H.; Bridges, S. L.; Soong, S. J.
2013-01-01
Many complex disease traits are associated with single nucleotide polymorphism (SNP) interactions. In testing small-scale SNP-SNP interactions, variable selection procedures in logistic regression are commonly used, but the empirical evidence on variable selection for testing interactions in logistic regression is limited. This simulation study was designed to compare nine variable selection procedures in logistic regression for testing SNP-SNP interactions. Data on 10 SNPs were simulated for 400 and 1000 subjects (case/control ratio = 1). The simulated model included one main effect and two 2-way interactions. The variable selection procedures included automatic selection (stepwise, forward, and backward), common 2-step selection, and AIC- and BIC-based selection. The effect of the hierarchical rule, in which all main effects and lower-order terms of the highest-order interaction term are included in the model regardless of their statistical significance, was also examined. We found that stepwise variable selection without the hierarchical rule, which had a reasonably high authentic (true-positive) proportion and a low noise (false-positive) proportion, is a better method than the other variable selection procedures. The procedure without the hierarchical rule requires fewer terms in testing interactions, so it can accommodate more SNPs than the procedure with the hierarchical rule. For testing interactions, the procedures without the hierarchical rule had a higher authentic proportion and a lower noise proportion than those with the hierarchical rule. These variable selection procedures were also applied and compared in a rheumatoid arthritis study. PMID:18231122
Information theoretic model selection applied to supernovae data
NASA Astrophysics Data System (ADS)
Biesiada, Marek
2007-02-01
Current advances in observational cosmology suggest that our Universe is flat and dominated by dark energy. Several different theoretical ideas have been invoked to explain the dark energy, with relatively little guidance as to which of them might be right. Therefore the emphasis of ongoing and forthcoming research in this field is shifting from estimating specific parameters of the cosmological model to model selection. In this paper we apply an information theoretic model selection approach based on the Akaike criterion as an estimator of Kullback-Leibler entropy. Although this approach has already been used by some authors in a similar context, this paper provides a more systematic introduction to the Akaike criterion. In particular, we present the proper way of ranking the competing models on the basis of Akaike weights (in Bayesian language: posterior probabilities of the models). This important ingredient is lacking from alternative studies dealing with cosmological applications of the Akaike criterion. Of the many particular models of dark energy we focus on four: quintessence, quintessence with a time-varying equation of state, the braneworld scenario, and the generalized Chaplygin gas model, and test them on Riess's gold sample. We find that the best model, in terms of the Akaike criterion, is the quintessence model. The odds suggest that although there are differences in the support given to specific scenarios by supernova data, most of the models considered receive similar support. The only exception is the Chaplygin gas, which is considerably less supported. One can also note that models similar in structure, e.g. ΛCDM, quintessence and quintessence with a variable equation of state, are closer to each other in terms of Kullback-Leibler entropy, while models with different structure, e.g. the Chaplygin gas and the braneworld scenario, are more distant (in the Kullback-Leibler sense) from the best one.
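The ranking via Akaike weights described above can be sketched as follows. The AIC values attached to the model names are invented for illustration, not taken from the paper's supernova fits.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights: relative support for each model given its AIC.

    w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j),
    with delta_i = AIC_i - min(AIC)."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# hypothetical AIC values for three competing dark-energy models
models = {"LCDM": 180.1, "quintessence": 179.6, "Chaplygin": 186.3}
w = akaike_weights(list(models.values()))
```

The weights sum to one and can be read, loosely, as posterior model probabilities under equal priors; a pattern like the abstract's emerges when two models receive comparable weight while a third is nearly excluded.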
Bayesian spatially dependent variable selection for small area health modeling.
Choi, Jungsoon; Lawson, Andrew B
2016-06-16
Statistical methods for spatial health data that identify the significant covariates associated with health outcomes are of critical importance. Most studies have developed variable selection approaches in which the covariates included appear throughout the spatial domain and their effects are fixed across space. However, the impact of covariates on health outcomes may change across space, and ignoring this behavior in spatial epidemiology may lead to misinterpretation of the relations. Thus, the development of a statistical framework for spatial variable selection is important to allow for the estimation of space-varying patterns of covariate effects as well as the early detection of disease over space. In this paper, we develop flexible spatial variable selection approaches to find the spatially varying subsets of covariates with significant effects. A Bayesian hierarchical latent model framework is applied to account for spatially varying covariate effects. We present a simulation example to examine the performance of the proposed models against competing models. We apply our models to a county-level low birth weight incidence dataset in Georgia.
Bayesian nonparametric centered random effects models with variable selection.
Yang, Mingan
2013-03-01
In a linear mixed effects model, it is common practice to assume that the random effects follow a parametric distribution such as a normal distribution with mean zero. However, in the case of variable selection, substantial violation of the normality assumption can potentially impact the subset selection and result in poor interpretation and even incorrect results. In nonparametric random effects models, the random effects generally have a nonzero mean, which causes an identifiability problem for the fixed effects that are paired with the random effects. In this article, we focus on a Bayesian method for variable selection. We characterize the subject-specific random effects nonparametrically with a Dirichlet process and resolve the bias simultaneously. In particular, we propose flexible modeling of the conditional distribution of the random effects with changes across the predictor space. The approach is implemented using a stochastic search Gibbs sampler to identify subsets of fixed effects and random effects to be included in the model. Simulations are provided to evaluate and compare the performance of our approach to the existing ones. We then apply the new approach to a real data example, cross-country and interlaboratory rodent uterotrophic bioassay.
Broken selection rule in the quantum Rabi model
Forn-Díaz, P.; Romero, G.; Harmans, C. J. P. M.; Solano, E.; Mooij, J. E.
2016-01-01
Understanding the interaction between light and matter is very relevant for fundamental studies of quantum electrodynamics and for the development of quantum technologies. The quantum Rabi model captures the physics of a single atom interacting with a single photon in all regimes of coupling strength. We report the spectroscopic observation of a resonant transition that breaks a selection rule in the quantum Rabi model, implemented using an LC resonator and an artificial atom, a superconducting qubit. The eigenstates of the system consist of a superposition of bare qubit-resonator states with a relative sign. When the qubit-resonator coupling strength is negligible compared to their own frequencies, the matrix element between excited eigenstates of different sign is very small in the presence of a resonator drive, establishing a sign-preserving selection rule. Here, our qubit-resonator system operates in the ultrastrong coupling regime, where the coupling strength is 10% of the resonator frequency, allowing sign-changing transitions to be activated and, therefore, detected. This work shows that sign-changing transitions are an unambiguous, distinctive signature of systems operating in the ultrastrong coupling regime of the quantum Rabi model. These results pave the way to further studies of sign-preserving selection rules in multiqubit and multiphoton models. PMID:27273346
Selection of Models for Ingestion Pathway and Relocation Radii Determination
Blanchard, A.
1998-12-17
The distance at which intermediate phase protective actions (such as food interdiction and relocation) may be needed following postulated accidents at three Savannah River Site nonreactor nuclear facilities will be determined by modeling. The criteria used to select dispersion/deposition models are presented. Several models were considered, including ARAC, MACCS, HOTSPOT, WINDS (coupled with PUFF-PLUME), and UFOTRI. Although ARAC and WINDS are expected to provide more accurate modeling of atmospheric transport following an actual release, analyses consistent with regulatory guidance for planning purposes may be accomplished with comparatively simple dispersion models such as HOTSPOT and UFOTRI. A recommendation is made to use HOTSPOT for non-tritium facilities and UFOTRI for tritium facilities.
Selection of Models for Ingestion Pathway and Relocation
Blanchard, A.; Thompson, J.M.
1998-11-01
The area in which intermediate phase protective actions (such as food interdiction and relocation) may be needed following postulated accidents at three Savannah River Site nonreactor nuclear facilities will be determined by modeling. The criteria used to select dispersion/deposition models are presented. Several models are considered, including ARAC, MACCS, HOTSPOT, WINDS (coupled with PUFF-PLUME), and UFOTRI. Although ARAC and WINDS are expected to provide more accurate modeling of atmospheric transport following an actual release, analyses consistent with regulatory guidance for planning purposes may be accomplished with comparatively simple dispersion models such as HOTSPOT and UFOTRI. A recommendation is made to use HOTSPOT for non-tritium facilities and UFOTRI for tritium facilities. The most recent Food and Drug Administration Derived Intervention Levels (August 1998) are adopted as evaluation guidelines for ingestion pathways.
Model selection applied to reconstruction of the Primordial Power Spectrum
NASA Astrophysics Data System (ADS)
Vázquez, J. Alberto; Bridges, M.; Hobson, M. P.; Lasenby, A. N.
2012-06-01
The preferred shape for the primordial spectrum of curvature perturbations is determined by performing a Bayesian model selection analysis of cosmological observations. We first reconstruct the spectrum modelled as piecewise linear in log k between nodes in k-space whose amplitudes and positions are allowed to vary. The number of nodes together with their positions are chosen by the Bayesian evidence, so that we can both determine the complexity supported by the data and locate any features present in the spectrum. In addition to the node-based reconstruction, we consider a set of parameterised models for the primordial spectrum: the standard power-law parameterisation, the spectrum produced from the Lasenby & Doran (LD) model and a simple variant parameterisation. By comparing the Bayesian evidence for different classes of spectra, we find the power-law parameterisation is significantly disfavoured by current cosmological observations, which show a preference for the LD model.
Selection between Linear Factor Models and Latent Profile Models Using Conditional Covariances
ERIC Educational Resources Information Center
Halpin, Peter F.; Maraun, Michael D.
2010-01-01
A method for selecting between K-dimensional linear factor models and (K + 1)-class latent profile models is proposed. In particular, it is shown that the conditional covariances of observed variables are constant under factor models but nonlinear functions of the conditioning variable under latent profile models. The performance of a convenient…
Modeling selective elimination of quiescent cancer cells from bone marrow.
Cavnar, Stephen P; Rickelmann, Andrew D; Meguiar, Kaille F; Xiao, Annie; Dosch, Joseph; Leung, Brendan M; Lesher-Perez, Sasha Cai; Chitta, Shashank; Luker, Kathryn E; Takayama, Shuichi; Luker, Gary D
2015-08-01
Patients with many types of malignancy commonly harbor quiescent disseminated tumor cells in bone marrow. These cells frequently resist chemotherapy and may persist for years before proliferating as recurrent metastases. To test for compounds that eliminate quiescent cancer cells, we established a new 384-well 3D spheroid model in which small numbers of cancer cells reversibly arrest in G1/G0 phase of the cell cycle when cultured with bone marrow stromal cells. Using dual-color bioluminescence imaging to selectively quantify viability of cancer and stromal cells in the same spheroid, we identified single compounds and combination treatments that preferentially eliminated quiescent breast cancer cells but not stromal cells. A treatment combination effective against malignant cells in spheroids also eliminated breast cancer cells from bone marrow in a mouse xenograft model. This research establishes a novel screening platform for therapies that selectively target quiescent tumor cells, facilitating identification of new drugs to prevent recurrent cancer.
Parameter Estimation and Model Selection in Computational Biology
Lillacci, Gabriele; Khammash, Mustafa
2010-01-01
A central challenge in computational modeling of biological systems is the determination of the model parameters. Typically, only a fraction of the parameters (such as kinetic rate constants) are experimentally measured, while the rest are often fitted. The fitting process is usually based on experimental time course measurements of observables, which are used to assign parameter values that minimize some measure of the error between these measurements and the corresponding model prediction. The measurements, which can come from immunoblotting assays, fluorescent markers, etc., tend to be very noisy and taken at a limited number of time points. In this work we present a new approach to the problem of parameter selection of biological models. We show how one can use a dynamic recursive estimator, known as the extended Kalman filter, to arrive at estimates of the model parameters. The proposed method proceeds as follows. First, we use a variation of the Kalman filter that is particularly well suited to biological applications to obtain a first guess for the unknown parameters. Second, we employ an a posteriori identifiability test to check the reliability of the estimates. Finally, we solve an optimization problem to refine the first guess if it is not accurate enough. The final estimates are guaranteed to be statistically consistent with the measurements. Furthermore, we show how the same tools can be used to discriminate among alternate models of the same biological process. We demonstrate these ideas by applying our methods to two examples, namely a model of the heat shock response in E. coli, and a model of a synthetic gene regulation system. The methods presented are quite general and may be applied to a wide class of biological systems where noisy measurements are used for parameter estimation or model selection. PMID:20221262
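The three-step procedure described above begins with a Kalman-filter-based first guess for the unknown parameters. A common way to do this with an extended Kalman filter is to append the parameters to the state vector and linearize at each step; the sketch below applies this to a hypothetical first-order decay model x' = -k*x with unknown rate k (the model, noise levels, and tuning values are all illustrative assumptions, not the paper's system):

```python
import numpy as np

rng = np.random.default_rng(0)
dt, k_true, T = 0.1, 0.5, 200

# Simulate a noisy time course of the decay x' = -k*x (Euler steps).
x, obs = 10.0, []
for _ in range(T):
    x -= k_true * x * dt
    obs.append(x + rng.normal(0.0, 0.1))

# Augmented state z = [x, k]; the parameter k is treated as a constant state.
z = np.array([obs[0], 0.2])      # initial guess: first observation, wrong k
P = np.diag([0.25, 0.25])        # initial uncertainty
Q = np.diag([1e-4, 1e-6])        # small process noise keeps the filter adaptive
R = np.array([[0.01]])           # measurement noise variance (std 0.1)
H = np.array([[1.0, 0.0]])       # only x is observed

for y in obs[1:]:
    # predict: f(z) = [x - k*x*dt, k], with Jacobian F evaluated at z
    F = np.array([[1.0 - z[1] * dt, -z[0] * dt],
                  [0.0, 1.0]])
    z = np.array([z[0] - z[1] * z[0] * dt, z[1]])
    P = F @ P @ F.T + Q
    # update with the scalar measurement y
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    z = z + (K @ (y - H @ z)).ravel()
    P = (np.eye(2) - K @ H) @ P

k_est = float(z[1])              # should approach the true rate k_true
```

The filter's final covariance P also gives a rough identifiability check: a parameter whose posterior variance barely shrinks is poorly constrained by the data, which is the role the paper's a posteriori test plays more formally.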
Velocity selection in the symmetric model of dendritic crystal growth
NASA Technical Reports Server (NTRS)
Barbieri, Angelo; Hong, Daniel C.; Langer, J. S.
1987-01-01
An analytic solution of the problem of velocity selection in a fully nonlocal model of dendritic crystal growth is presented. The analysis uses a WKB technique to derive and evaluate a solvability condition for the existence of steady-state needle-like solidification fronts in the limit of small under-cooling Delta. For the two-dimensional symmetric model with a capillary anisotropy of strength alpha, it is found that the velocity is proportional to (Delta to the 4th) times (alpha exp 7/4). The application of the method in three dimensions is also described.
Sutton, Steven C; Hu, Mingxiu
2006-05-05
Many mathematical models have been proposed for establishing an in vitro/in vivo correlation (IVIVC). The traditional IVIVC model building process consists of 5 steps: deconvolution, model fitting, convolution, prediction error evaluation, and cross-validation. This is a time-consuming process and typically a few models at most are tested for any given data set. The objectives of this work were to (1) propose a statistical tool to screen models for further development of an IVIVC, (2) evaluate the performance of each model under different circumstances, and (3) investigate the effectiveness of common statistical model selection criteria for choosing IVIVC models. A computer program was developed to explore which model(s) would be most likely to work well with a random variation from the original formulation. The process used Monte Carlo simulation techniques to build IVIVC models. Data-based model selection criteria (Akaike Information Criterion [AIC], R²) and the probability of passing the Food and Drug Administration "prediction error" requirement were calculated. Several real data sets representing a broad range of release profiles are used to illustrate the process and to demonstrate the advantages of this automated approach over the traditional one. The Hixson-Crowell and Weibull models were often preferred over the linear model. When evaluating whether a Level A IVIVC model was possible, the model selection criterion AIC generally selected the best model. We believe that the approach we proposed may be a rapid tool to determine which IVIVC model (if any) is the most applicable.
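The screening the authors automate rests on data-based criteria such as least-squares AIC, AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares and k the number of fitted parameters. A sketch comparing a linear release model against a Weibull profile on synthetic percent-dissolved data (all values are invented; the grid-search fit is a stand-in for a proper nonlinear optimizer):

```python
import math

# Hypothetical percent-dissolved data generated from a Weibull release profile
# F(t) = 100 * (1 - exp(-(t/a)^b)) plus fixed small perturbations.
times = [0.5, 1, 2, 3, 4, 6, 8, 10]
a_true, b_true = 3.0, 1.4
noise = [0.8, -0.5, 1.2, -0.9, 0.4, -1.1, 0.7, -0.3]
data = [100 * (1 - math.exp(-((t / a_true) ** b_true))) + e
        for t, e in zip(times, noise)]
n = len(times)

def aic(rss, n, k):
    # least-squares AIC (up to an additive constant shared by all models)
    return n * math.log(rss / n) + 2 * k

# Model 1: linear release F = m*t + c, closed-form least squares (k = 2)
tb, yb = sum(times) / n, sum(data) / n
m = (sum((t - tb) * (y - yb) for t, y in zip(times, data))
     / sum((t - tb) ** 2 for t in times))
c = yb - m * tb
rss_lin = sum((y - (m * t + c)) ** 2 for t, y in zip(times, data))

# Model 2: Weibull, fitted by a coarse grid search over (a, b) (k = 2)
rss_wei = float("inf")
for a in [x / 10 for x in range(10, 61)]:
    for b in [x / 10 for x in range(5, 31)]:
        rss = sum((y - 100 * (1 - math.exp(-((t / a) ** b)))) ** 2
                  for t, y in zip(times, data))
        rss_wei = min(rss_wei, rss)

aic_lin, aic_wei = aic(rss_lin, n, 2), aic(rss_wei, n, 2)
```

Because the synthetic profile is sigmoidal, the Weibull model attains a far lower AIC, matching the paper's observation that Hixson-Crowell and Weibull fits were often preferred over linear ones.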
Application of the modelling power approach to variable subset selection for GA-PLS QSAR models.
Sagrado, Salvador; Cronin, Mark T D
2008-02-25
A previously developed function, the Modelling Power Plot, has been applied to QSARs developed using partial least squares (PLS) following variable selection from a genetic algorithm (GA). Modelling power (Mp) integrates the predictive and descriptive capabilities of a QSAR. With regard to QSARs for narcotic toxic potency, Mp was able to guide the optimal selection of variables using a GA. The results emphasise the importance of Mp to assess the success of the variable selection and that techniques such as PLS are more robust following variable selection.
Selecting, weeding, and weighting biased climate model ensembles
NASA Astrophysics Data System (ADS)
Jackson, C. S.; Picton, J.; Huerta, G.; Nosedal Sanchez, A.
2012-12-01
In the Bayesian formulation, the "log-likelihood" is a test statistic for selecting, weeding, or weighting climate model ensembles with observational data. This statistic has the potential to synthesize the physical and data constraints on quantities of interest. One of the thorny issues for formulating the log-likelihood is how one should account for biases. While in the past we have included a generic discrepancy term, not all biases affect predictions of quantities of interest. We make use of a 165-member ensemble of CAM3.1/slab ocean climate models with different parameter settings to think through the issues that are involved with predicting each model's sensitivity to greenhouse gas forcing given what can be observed from the base state. In particular we use multivariate empirical orthogonal functions to decompose the differences that exist among this ensemble to discover what fields and regions matter to the model's sensitivity. We find that the differences that matter are a small fraction of the total discrepancy. Moreover, weighting members of the ensemble using this knowledge does a relatively poor job of adjusting the ensemble mean toward the known answer. This points out the shortcomings of using weights to correct for biases in climate model ensembles created by a selection process that does not emphasize the priorities of the log-likelihood.
The Impact of Varied Discrimination Parameters on Mixed-Format Item Response Theory Model Selection
ERIC Educational Resources Information Center
Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.
2013-01-01
Whittaker, Chang, and Dodd compared the performance of model selection criteria when selecting among mixed-format IRT models and found that the criteria did not perform adequately when selecting the more parameterized models. It was suggested by M. S. Johnson that the problems when selecting the more parameterized models may be because of the low…
Feature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates.
Liu, Jingyuan; Li, Runze; Wu, Rongling
2014-01-01
This paper is concerned with feature screening and variable selection for varying coefficient models with ultrahigh dimensional covariates. We propose a new feature screening procedure for these models based on conditional correlation coefficient. We systematically study the theoretical properties of the proposed procedure, and establish their sure screening property and the ranking consistency. To enhance the finite sample performance of the proposed procedure, we further develop an iterative feature screening procedure. Monte Carlo simulation studies were conducted to examine the performance of the proposed procedures. In practice, we advocate a two-stage approach for varying coefficient models. The two-stage approach consists of (a) reducing the ultrahigh dimensionality by using the proposed procedure and (b) applying regularization methods for dimension-reduced varying coefficient models to make statistical inferences on the coefficient functions. We illustrate the proposed two-stage approach by a real data example.
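The screening step of the two-stage approach can be illustrated with a simplified stand-in: rank candidate covariates by absolute marginal correlation with the response and keep the top d. The paper's procedure ranks by conditional correlation given the index variable of the varying coefficient model; the marginal version below only illustrates the screen-then-regularize idea (all data are simulated):

```python
import random

random.seed(7)

# Simulated design: n observations, p candidate covariates, only
# covariates 0 and 3 are active (coefficients 3 and -2, hypothetical).
n, p, d = 200, 50, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3 * row[0] - 2 * row[3] + random.gauss(0, 1) for row in X]

def corr(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    m = len(u)
    mu, mv = sum(u) / m, sum(v) / m
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

# Screen: score each covariate, keep the d highest-scoring indices.
scores = [abs(corr([row[j] for row in X], y)) for j in range(p)]
kept = sorted(range(p), key=lambda j: scores[j], reverse=True)[:d]
```

Regularized estimation (stage (b)) would then be applied only to the d retained covariates, which is what makes ultrahigh-dimensional problems tractable.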
Whelan, Simon; Allen, James E; Blackburne, Benjamin P; Talavera, David
2015-01-01
Molecular phylogenetics is a powerful tool for inferring both the process and pattern of evolution from genomic sequence data. Statistical approaches, such as maximum likelihood and Bayesian inference, are now established as the preferred methods of inference. The choice of models that a researcher uses for inference is of critical importance, and there are established methods for model selection conditioned on a particular type of data, such as nucleotides, amino acids, or codons. A major limitation of existing model selection approaches is that they can only compare models acting upon a single type of data. Here, we extend model selection to allow comparisons between models describing different types of data by introducing the idea of adapter functions, which project aggregated models onto the originally observed sequence data. These projections are implemented in the program ModelOMatic and used to perform model selection on 3722 families from the PANDIT database, 68 genes from an arthropod phylogenomic data set, and 248 genes from a vertebrate phylogenomic data set. For the PANDIT and arthropod data, we find that amino acid models are selected for the overwhelming majority of alignments; with progressively smaller numbers of alignments selecting codon and nucleotide models, and no families selecting RY-based models. In contrast, nearly all alignments from the vertebrate data set select codon-based models. The sequence divergence, the number of sequences, and the degree of selection acting upon the protein sequences may contribute to explaining this variation in model selection. Our ModelOMatic program is fast, with most families from PANDIT taking fewer than 150 s to complete, and should therefore be easily incorporated into existing phylogenetic pipelines. ModelOMatic is available at https://code.google.com/p/modelomatic/.
Selection of Representative Models for Decision Analysis Under Uncertainty
NASA Astrophysics Data System (ADS)
Meira, Luis A. A.; Coelho, Guilherme P.; Santos, Antonio Alberto S.; Schiozer, Denis J.
2016-03-01
The decision-making process in oil fields includes a step of risk analysis associated with the uncertainties present in the variables of the problem. Such uncertainties lead to hundreds, even thousands, of possible scenarios that are supposed to be analyzed so an effective production strategy can be selected. Given this high number of scenarios, a technique to reduce this set to a smaller, feasible subset of representative scenarios is imperative. The selected scenarios must be representative of the original set and also free of optimistic and pessimistic bias. This paper proposes an assisted methodology for identifying representative models in oil fields. To do so, first a mathematical function was developed to model the representativeness of a subset of models with respect to the full set that characterizes the problem. Then, an optimization tool was implemented to identify the representative models of any problem, considering not only the cross-plots of the main output variables, but also the risk curves and the probability distribution of the attribute-levels of the problem. The proposed technique was applied to two benchmark cases and the results, evaluated by experts in the field, indicate that the obtained solutions are richer than those identified by previously adopted manual approaches. The program bytecode is available upon request.
UQ-Guided Selection of Physical Parameterizations in Climate Models
NASA Astrophysics Data System (ADS)
Lucas, D. D.; Debusschere, B.; Ghan, S.; Rosa, D.; Bulaevskaya, V.; Anderson, G. J.; Chowdhary, K.; Qian, Y.; Lin, G.; Larson, V. E.; Zhang, G. J.; Randall, D. A.
2015-12-01
Given two or more parameterizations that represent the same physical process in a climate model, scientists are sometimes faced with difficult decisions about which scheme to choose for their simulations and analysis. These decisions are often based on subjective criteria, such as "which scheme is easier to use, is computationally less expensive, or produces results that look better?" Uncertainty quantification (UQ) and model selection methods can be used to objectively rank the performance of different physical parameterizations by increasing the preference for schemes that fit observational data better, while at the same time penalizing schemes that are overly complex or have excessive degrees-of-freedom. Following these principles, we are developing a perturbed-parameter UQ framework to assist in the selection of parameterizations for a climate model. Preliminary results will be presented on the application of the framework to assess the performance of two alternate schemes for simulating tropical deep convection (CLUBB-SILHS and ZM-trigmem) in the U.S. Dept. of Energy's ACME climate model. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, is supported by the DOE Office of Science through the Scientific Discovery Through Advanced Computing (SciDAC), and is released as LLNL-ABS-675799.
Model selection and inference for censored lifetime medical expenditures.
Johnson, Brent A; Long, Qi; Huang, Yijian; Chansky, Kari; Redman, Mary
2016-09-01
Identifying factors associated with increased medical cost is important for many micro- and macro-institutions, including the national economy, public health, insurers, and the insured. However, assembling comprehensive national databases that include both the cost and individual-level predictors can prove challenging. Alternatively, one can use data from smaller studies with the understanding that conclusions drawn from such analyses may be limited to the participant population. At the same time, smaller clinical studies have limited follow-up and lifetime medical cost may not be fully observed for all study participants. In this context, we develop new model selection methods and inference procedures for secondary analyses of clinical trial data when lifetime medical cost is subject to induced censoring. Our model selection methods extend a theory of penalized estimating function to a calibration regression estimator tailored for this data type. Next, we develop a novel inference procedure for the unpenalized regression estimator using perturbation and resampling theory. Then, we extend this resampling plan to accommodate regularized coefficient estimation of censored lifetime medical cost and develop postselection inference procedures for the final model. Our methods are motivated by data from Southwest Oncology Group Protocol 9509, a clinical trial of patients with advanced non-small cell lung cancer, and our models of lifetime medical cost are specific to this population. But the methods presented in this article are built on rather general techniques and could be applied to larger databases as those data become available.
Selecting global climate models for regional climate change studies.
Pierce, David W; Barnett, Tim P; Santer, Benjamin D; Gleckler, Peter J
2009-05-26
Regional or local climate change modeling studies currently require starting with a global climate model, then downscaling to the region of interest. How should global models be chosen for such studies, and what effect do such choices have? This question is addressed in the context of a regional climate detection and attribution (D&A) study of January-February-March (JFM) temperature over the western U.S. Models are often selected for a regional D&A analysis based on the quality of the simulated regional climate. Accordingly, 42 performance metrics based on seasonal temperature and precipitation, the El Niño/Southern Oscillation (ENSO), and the Pacific Decadal Oscillation are constructed and applied to 21 global models. However, no strong relationship is found between the score of the models on the metrics and results of the D&A analysis. Instead, the importance of having ensembles of runs with enough realizations to reduce the effects of natural internal climate variability is emphasized. Also, the superiority of the multimodel ensemble average (MM) to any one individual model, already found in global studies examining the mean climate, is true in this regional study that includes measures of variability as well. Evidence is shown that this superiority is largely caused by the cancellation of offsetting errors in the individual global models. Results with both the MM and models picked randomly confirm the original D&A results of anthropogenically forced JFM temperature changes in the western U.S. Future projections of temperature do not depend on model performance until the 2080s, after which the better performing models show warmer temperatures.
A new selection metric for multiobjective hydrologic model calibration
NASA Astrophysics Data System (ADS)
Asadzadeh, Masoud; Tolson, Bryan A.; Burn, Donald H.
2014-09-01
A novel selection metric called Convex Hull Contribution (CHC) is introduced for solving multiobjective (MO) optimization problems with Pareto fronts that can be accurately approximated by a convex curve. The hydrologic model calibration literature shows that many biobjective calibration problems with a proper setup result in such Pareto fronts. The CHC selection approach identifies a subset of archived nondominated solutions whose map in the objective space forms a convex approximation of the Pareto front. The optimization algorithm can sample solely from these solutions to more accurately approximate the convex shape of the Pareto front. It is empirically demonstrated that CHC improves the performance of Pareto Archived Dynamically Dimensioned Search (PA-DDS) when solving MO problems with convex Pareto fronts. This conclusion is based on the results of several benchmark mathematical problems and several hydrologic model calibration problems with two or three objective functions. The impact of CHC on PA-DDS performance is most evident when the computational budget is somewhat limited. It is also demonstrated that 1,000 solution evaluations (the limited budget in this study) are sufficient for PA-DDS with CHC-based selection to achieve very high quality calibration results relative to the results achieved after 10,000 solution evaluations.
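The CHC selection step, keeping only archived nondominated solutions whose image in objective space lies on the convex approximation of the Pareto front, can be sketched for two minimization objectives with a lower-hull scan (the archive points below are made up for illustration):

```python
def cross(o, a, b):
    # z-component of (a - o) x (b - o): positive for a counterclockwise turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_front(archive):
    """Keep only the archived points lying on the lower convex hull of the
    biobjective space (both objectives minimized): Andrew's monotone chain."""
    pts = sorted(set(archive))
    hull = []
    for p in pts:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical nondominated archive from a biobjective calibration run;
# (2, 8) is nondominated but sits above the convex approximation of the front.
archive = [(1, 10), (2, 8), (3, 4), (5, 2.5), (8, 2), (12, 1.8)]
selected = convex_front(archive)
```

Points such as (2, 8) are nondominated yet lie above the hull, so a CHC-style selection would not sample from them.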
On observation distributions for state space models of population survey data.
Knape, Jonas; Jonzén, Niclas; Sköld, Martin
2011-11-01
1. State space models are starting to replace simpler time series models in analyses of the temporal dynamics of populations that are not perfectly censused. By simultaneously modelling both the dynamics and the observations, consistent estimates of population dynamical parameters may be obtained. For many data sets, the distribution of observation errors is unknown and error models are typically chosen in an ad hoc manner. 2. To investigate the influence of the choice of observation error on inferences, we analyse the dynamics of a replicated time series of red kangaroo surveys using a state space model with linear state dynamics. Surveys were performed through aerial counts and Poisson, overdispersed Poisson, normal and log-normal distributions may all be adequate for modelling observation errors for the data. We fit each of these to the data and compare them using AIC. 3. The state space models were fitted with maximum likelihood methods using a recent importance sampling technique that relies on the Kalman filter. The method relaxes the assumption of Gaussian observation errors required by the basic Kalman filter. Matlab code for fitting linear state space models with Poisson observations is provided. 4. The ability of AIC to identify the correct observation model was investigated in a small simulation study. For the parameter values used in the study, without replicated observations, the correct observation distribution could sometimes be identified but model selection was prone to misclassification. On the other hand, when observations were replicated, the correct distribution could typically be identified. 5. Our results illustrate that inferences may differ markedly depending on the observation distributions used, suggesting that choosing an adequate observation model can be critical. Model selection and simulations show that for the models and parameter values in this study, a suitable observation model can typically be identified if observations are replicated.
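Point 2's comparison of candidate observation distributions by AIC reduces, for a single survey occasion with replicated counts, to maximizing each likelihood and penalizing the parameter count. A minimal sketch (the counts are invented and deliberately overdispersed relative to a Poisson, so the normal model should win):

```python
import math

# Replicated aerial counts from one survey occasion (hypothetical numbers,
# chosen to be overdispersed: the variance far exceeds the mean).
counts = [30, 72, 45, 90, 38, 65, 52, 81, 41, 60]
n = len(counts)

# Poisson observation model: the MLE of the rate is the sample mean (1 parameter).
lam = sum(counts) / n
loglik_pois = sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)

# Normal observation model: MLEs are the sample mean and variance (2 parameters).
mu = lam
var = sum((y - mu) ** 2 for y in counts) / n
loglik_norm = -0.5 * n * (math.log(2 * math.pi * var) + 1)

# AIC = -2 log L + 2k; the lower value indicates the preferred model.
aic_pois = -2 * loglik_pois + 2 * 1
aic_norm = -2 * loglik_norm + 2 * 2
```

With replication the dispersion is directly estimable, which is why the simulation study above finds the correct observation distribution much easier to identify from replicated surveys.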
A Dual-Stage Two-Phase Model of Selective Attention
ERIC Educational Resources Information Center
Hubner, Ronald; Steinhauser, Marco; Lehle, Carola
2010-01-01
The dual-stage two-phase (DSTP) model is introduced as a formal and general model of selective attention that includes both an early and a late stage of stimulus selection. Whereas at the early stage information is selected by perceptual filters whose selectivity is relatively limited, at the late stage stimuli are selected more efficiently on a…
Nonparametric Bayes Conditional Distribution Modeling With Variable Selection.
Chung, Yeonseung; Dunson, David B
2009-12-01
This article considers a methodology for flexibly characterizing the relationship between a response and multiple predictors. Goals are (1) to estimate the conditional response distribution addressing the distributional changes across the predictor space, and (2) to identify important predictors for the response distribution change both within local regions and globally. We first introduce the probit stick-breaking process (PSBP) as a prior for an uncountable collection of predictor-dependent random distributions and propose a PSBP mixture (PSBPM) of normal regressions for modeling the conditional distributions. A global variable selection structure is incorporated to discard unimportant predictors, while allowing estimation of posterior inclusion probabilities. Local variable selection is conducted relying on the conditional distribution estimates at different predictor points. An efficient stochastic search sampling algorithm is proposed for posterior computation. The methods are illustrated through simulation and applied to an epidemiologic study.
Refined homology model of monoacylglycerol lipase: toward a selective inhibitor
NASA Astrophysics Data System (ADS)
Bowman, Anna L.; Makriyannis, Alexandros
2009-11-01
Monoacylglycerol lipase (MGL) is primarily responsible for the hydrolysis of 2-arachidonoylglycerol (2-AG), an endocannabinoid with full agonist activity at both cannabinoid receptors. Increased tissue 2-AG levels consequent to MGL inhibition are considered therapeutic against pain, inflammation, and neurodegenerative disorders. However, the lack of MGL structural information has hindered the development of MGL-selective inhibitors. Here, we detail a fully refined homology model of MGL which preferentially identifies MGL inhibitors over druglike noninhibitors. We include for the first time insight into the active-site geometry and potential hydrogen-bonding interactions along with molecular dynamics simulations describing the opening and closing of the MGL helical-domain lid. Docked poses of both the natural substrate and known inhibitors are detailed. A comparison of the MGL active-site to that of the other principal endocannabinoid metabolizing enzyme, fatty acid amide hydrolase, demonstrates key differences which provide crucial insight toward the design of selective MGL inhibitors as potential drugs.
Competing metabolic strategies in a multilevel selection model
Amado, André; Fernández, Lenin; Huang, Weini; Ferreira, Fernando F.
2016-01-01
The evolutionary mechanisms of energy efficiency are addressed here. One important question is to understand how optimized energy usage can be selected in an evolutionary process, especially when the immediate advantage of gathering efficient individuals in an energetic context is not clear. We propose a model of two competing metabolic strategies that differ in their resource usage: an efficient strain, which converts resource into energy at high efficiency but displays a low rate of resource consumption, and an inefficient strain, which consumes resource at a high rate but at low yield. We explore the dynamics in both well-mixed and structured populations. The selection for optimized energy usage is measured by the likelihood that an efficient strain can invade a population of inefficient strains. We find that the parameter space in which the efficient strain can thrive in structured populations is always broader than that observed in well-mixed populations. PMID:28018642
NASA Astrophysics Data System (ADS)
Zhao, Pei; Shao, Ming-an; Horton, Robert
2011-02-01
Soil particle-size distributions (PSD) have been used to estimate soil hydraulic properties. Various parametric PSD models have been proposed to describe the soil PSD from sparse experimental data. It is important to determine which PSD model best represents specific soils. Fourteen PSD models were examined in order to determine the best model for representing the deposited soils adjacent to dams in the China Loess Plateau; these were: Skaggs (S-1, S-2, and S-3), fractal (FR), Jaky (J), Lima and Silva (LS), Morgan (M), Gompertz (G), logarithm (L), exponential (E), log-exponential (LE), Weibull (W), van Genuchten type (VG) and Fredlund (F) models. Four hundred and eighty samples were obtained from soils deposited in the Liudaogou catchment. The coefficient of determination (R²), Akaike's information criterion (AIC), and the modified AIC (mAIC) were used as selection criteria. Based upon R² and AIC, the three- and four-parameter models were both good at describing the PSDs of deposited soils, while the LE, FR, and E models were the poorest. However, the mAIC, used in conjunction with R² and AIC, indicated that the W model was optimal for describing the PSD of the deposited soils once the number of parameters was emphasized. This analysis also helps identify the best PSD model for specific soils. Our results are applicable to the China Loess Plateau.
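For least-squares fits like these PSD models, AIC can be computed directly from the residual sum of squares. The study's exact mAIC formula is not reproduced here, so the sketch below substitutes the common small-sample correction (AICc) as a stand-in penalty that weights parameter count more heavily; the RSS values are invented for illustration.

```python
import math

def aic_ls(rss, n, k):
    # AIC for a least-squares fit: n*ln(RSS/n) + 2k (lower is better)
    return n * math.log(rss / n) + 2 * k

def maic_ls(rss, n, k):
    # Stand-in for the study's modified AIC: the small-sample
    # correction (AICc), which adds 2k(k+1)/(n-k-1) and thus
    # penalizes extra parameters more strongly than plain AIC
    return aic_ls(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)

# Invented RSS values for a three- and a four-parameter PSD model,
# with the study's sample count n = 480
n = 480
aic3 = aic_ls(0.012, n, 3)
aic4 = aic_ls(0.011, n, 4)
maic4 = maic_ls(0.011, n, 4)
```

At large n the correction is tiny; for small calibration subsets it can reverse a ranking, which is the kind of parameter-count effect the mAIC is meant to capture.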
Development of Solar Drying Model for Selected Cambodian Fish Species
Hubackova, Anna; Kucerova, Iva; Chrun, Rithy; Chaloupkova, Petra; Banout, Jan
2014-01-01
Solar drying was investigated as a prospective technique for fish processing in Cambodia and was compared to conventional drying in an electric oven. Five typical Cambodian fish species were selected for this study. Mean solar drying temperature and drying air relative humidity were 55.6°C and 19.9%, respectively. The overall solar dryer efficiency was 12.37%, which is typical for natural convection solar dryers. The average evaporative capacity of the solar dryer was 0.049 kg·h−1. Based on the coefficient of determination (R2), chi-square (χ2) test, and root-mean-square error (RMSE), the most suitable models describing natural convection solar drying kinetics were the Logarithmic model for climbing perch and Nile tilapia, the Diffusion approximate model for swamp eel and walking catfish, and the Two-term model for Channa fish. In the case of electric oven drying, the Modified Page 1 model gave the best results for all investigated fish species except Channa fish, for which the Two-term model was best. Sensory evaluation showed that the most preferred fish was climbing perch, followed by Nile tilapia and walking catfish. This study brings new knowledge about the drying kinetics of freshwater fish species in Cambodia and confirms solar drying as an acceptable technology for fish processing. PMID:25250381
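Thin-layer drying model selection of this kind amounts to fitting each candidate curve to the measured moisture ratio and comparing goodness-of-fit statistics such as RMSE. The sketch below fits the Logarithmic model to synthetic data with a naive grid search; the study itself would use nonlinear least squares on measured drying curves, and all parameter values here are invented.

```python
import math

def log_model(t, a, k, c):
    # Logarithmic thin-layer drying model: MR(t) = a*exp(-k*t) + c
    return a * math.exp(-k * t) + c

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Synthetic moisture-ratio data generated from known parameters
times = [0, 1, 2, 3, 4, 5]
obs = [log_model(t, 0.95, 0.6, 0.05) for t in times]

# Coarse grid search over (a, k, c); crude, but it shows the
# goodness-of-fit criterion used for model comparison
best = min(
    ((a / 100, k / 100, c / 100)
     for a in range(80, 101, 5)
     for k in range(40, 81, 5)
     for c in range(0, 11, 5)),
    key=lambda p: rmse(obs, [log_model(t, *p) for t in times]),
)
```

The same RMSE (plus R2 and chi-square) comparison, repeated over every candidate model and species, yields the per-species rankings reported above.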
A Successive Selection Method for finite element model updating
NASA Astrophysics Data System (ADS)
Gou, Baiyong; Zhang, Weijie; Lu, Qiuhai; Wang, Bo
2016-03-01
Finite Element (FE) models can be updated effectively and efficiently by using the Response Surface Method (RSM). However, this often involves performance trade-offs, such as high computational cost for better accuracy, or loss of efficiency when many design parameters must be updated. This paper proposes a Successive Selection Method (SSM), which is based on the linear Response Surface (RS) function and orthogonal design. SSM rewrites the linear RS function into a number of linear equations in order to adjust the Design of Experiment (DOE) after every FE calculation. SSM aims to interpret the implicit information provided by the FE analysis, to locate the DOE points more quickly and accurately, and thereby to alleviate the computational burden. This paper introduces the SSM and its application, describes the solution steps of point selection for the DOE in detail, and analyzes SSM's high efficiency and accuracy in FE model updating. A numerical example of a simply supported beam and a practical example of a vehicle brake disc show that the SSM can provide higher speed and precision in FE model updating for engineering problems than the traditional RSM.
Selection Strategies for Social Influence in the Threshold Model
NASA Astrophysics Data System (ADS)
Karampourniotis, Panagiotis; Szymanski, Boleslaw; Korniss, Gyorgy
The ubiquity of online social networks makes the study of social influence extremely significant for its applications to marketing, politics and security. Maximizing the spread of influence by strategically selecting nodes as initiators of a new opinion or trend is a challenging problem. We study the performance of various strategies for selection of large fractions of initiators on a classical social influence model, the Threshold model (TM). Under the TM, a node adopts a new opinion only when the fraction of its first neighbors possessing that opinion exceeds a pre-assigned threshold. The strategies we study are of two kinds: strategies based solely on the initial network structure (Degree-rank, Dominating Sets, PageRank etc.) and strategies that take into account the change of the states of the nodes during the evolution of the cascade, e.g. the greedy algorithm. We find that the performance of these strategies depends largely on both the network structure properties, e.g. the assortativity, and the distribution of the thresholds assigned to the nodes. We conclude that the optimal strategy needs to combine the network specifics and the model specific parameters to identify the most influential spreaders. Supported in part by ARL NS-CTA, ARO, and ONR.
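Under the Threshold model, adoption spreads iteratively: a node flips once the fraction of its neighbors already holding the new opinion exceeds its threshold. A minimal sketch of the cascade and of the effect of initiator choice follows; the toy graph and the uniform thresholds are invented for illustration.

```python
def threshold_cascade(adj, thresholds, initiators):
    """Threshold model: a node adopts the new opinion once the fraction
    of its neighbors that have adopted exceeds its threshold."""
    active = set(initiators)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node in active or not nbrs:
                continue
            frac = sum(n in active for n in nbrs) / len(nbrs)
            if frac > thresholds[node]:
                active.add(node)
                changed = True
    return active

# Toy graph: node 0 is the hub; every node gets threshold 0.5
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
thresholds = {n: 0.5 for n in adj}

spread_one = threshold_cascade(adj, thresholds, {0})     # hub alone
spread_two = threshold_cascade(adj, thresholds, {0, 1})  # hub + one neighbor
```

Even on this four-node graph the hub alone stalls the cascade while adding a second, well-placed initiator completes it, which is exactly why initiator-selection strategy matters.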
Selection Experiments in the Penna Model for Biological Aging
NASA Astrophysics Data System (ADS)
Medeiros, G.; Idiart, M. A.; de Almeida, R. M. C.
We consider the Penna model for biological aging to investigate correlations between early fertility and late life survival rates in populations at equilibrium. We consider inherited initial reproduction ages together with a reproduction cost, translated into a probability that mother and offspring die at birth, depending on the mother's age. For suitable sets of parameters, the equilibrated populations present genetic variability with regard to both the genetically programmed death age and the initial reproduction age. In the asexual Penna model, a negative correlation between early life fertility and late life survival rates naturally emerges in the stationary solutions. In the sexual Penna model, selection experiments are performed in which individuals are sorted by initial reproduction age from the equilibrated populations and the separated populations are evolved independently. After a transient, a negative correlation between early fertility and late age survival rates also emerges, in the sense that populations that start reproducing earlier present a smaller average genetically programmed death age. These effects appear due to the age structure of populations in the steady state solution of the evolution equations. We claim that the same demographic effects may be playing an important role in selection experiments in the laboratory.
Continuum model for chiral induced spin selectivity in helical molecules
Medina, Ernesto; González-Arraga, Luis A.; Finkelstein-Shapiro, Daniel; Mujica, Vladimiro; Berche, Bertrand
2015-05-21
A minimal model is exactly solved for electron spin transport on a helix. Electron transport is assumed to be supported by well oriented p_z type orbitals on base molecules forming a staircase of definite chirality. In a tight binding interpretation, the spin-orbit coupling (SOC) opens up an effective π_z − π_z coupling via interbase p_{x,y} − p_z hopping, introducing spin coupled transport. The resulting continuum model spectrum shows two Kramers doublet transport channels with a gap proportional to the SOC. Each doubly degenerate channel satisfies time reversal symmetry; nevertheless, a bias chooses a transport direction and thus selects for spin orientation. The model predicts (i) which spin orientation is selected depending on chirality and bias, (ii) changes in spin preference as a function of input Fermi level and (iii) back-scattering suppression protected by the SO gap. We compute the spin current with a definite helicity and find it to be proportional to the torsion of the chiral structure and the non-adiabatic Aharonov-Anandan phase. To describe room temperature transport, we assume that the total transmission is the result of a product of coherent steps.
Zhang, Xinyu; Cao, Jiguo; Carroll, Raymond J
2015-03-01
We consider model selection and estimation in a context where there are competing ordinary differential equation (ODE) models, and all the models are special cases of a "full" model. We propose a computationally inexpensive approach that employs statistical estimation of the full model, followed by a combination of a least squares approximation (LSA) and the adaptive Lasso. We show the resulting method, here called the LSA method, to be an (asymptotically) oracle model selection method. The finite sample performance of the proposed LSA method is investigated with Monte Carlo simulations, in which we examine the percentage of selecting true ODE models, the efficiency of the parameter estimation compared to simply using the full and true models, and coverage probabilities of the estimated confidence intervals for ODE parameters, all of which have satisfactory performances. Our method is also demonstrated by selecting the best predator-prey ODE to model a lynx and hare population dynamical system among some well-known and biologically interpretable ODE models.
BUILDING ROBUST APPEARANCE MODELS USING ON-LINE FEATURE SELECTION
PORTER, REID B.; LOVELAND, ROHAN; ROSTEN, ED
2007-01-29
In many tracking applications, adapting the target appearance model over time can improve performance. This approach is most popular in high frame rate video applications where latent variables related to the object's appearance (e.g., orientation and pose) vary slowly from one frame to the next. In these cases the appearance model and the tracking system are tightly integrated, and latent variables are often included as part of the tracking system's dynamic model. In this paper we describe our efforts to track cars in low frame rate data (1 frame/second) acquired from a highly unstable airborne platform. Due to the low frame rate and poor image quality, the appearance of a particular vehicle varies greatly from one frame to the next. This leads us to a different problem: how can we build the best appearance model from all instances of a vehicle we have seen so far? The best appearance model should maximize the future performance of the tracking system and maximize the chances of reacquiring the vehicle once it leaves the field of view. We propose an online feature selection approach to this problem and investigate the performance and computational trade-offs with a real-world dataset.
A Neuronal Network Model for Pitch Selectivity and Representation
Huang, Chengcheng; Rinzel, John
2016-01-01
Pitch is a perceptual correlate of periodicity. Sounds with distinct spectra can elicit the same pitch. Despite the importance of pitch perception, understanding its cellular mechanism is still a major challenge and a mechanistic model of pitch is lacking. A multi-stage neuronal network model is developed for pitch frequency estimation using biophysically based, high-resolution coincidence detector neurons. The neuronal units respond only to highly coincident input among convergent auditory nerve fibers across frequency channels. Their selectivity for only very fast rising slopes of convergent input enables these slope-detectors to distinguish the most prominent coincidences in multi-peaked input time courses. Pitch can then be estimated from the first-order interspike intervals of the slope-detectors. The regular firing patterns of the slope-detector neurons are similar for sounds sharing the same pitch, despite their distinct timbres. The decoded pitch strengths also correlate well with the salience of pitch perception as reported by human listeners. Therefore, our model can serve as a neural representation for pitch. Our model performs successfully in estimating the pitch of missing-fundamental complexes and reproducing the pitch variation with respect to the frequency shift of inharmonic complexes. It also accounts for the phase sensitivity of pitch perception in the cases of Schroeder-phase, alternating-phase and random-phase relationships. Moreover, our model can also be applied to stochastic sound stimuli, such as iterated ripple noise, and account for their multiple pitch perceptions. PMID:27378900
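The decoding step, estimating pitch from the first-order interspike intervals of the slope-detectors, can be caricatured in a few lines. The sketch below takes the reciprocal of the median interspike interval; this is a drastically simplified stand-in for the model's readout, and the toy spike train is invented.

```python
def pitch_from_spikes(spike_times):
    """Estimate pitch (Hz) as the reciprocal of the median first-order
    interspike interval -- a simplified stand-in for the full decoder."""
    isis = sorted(b - a for a, b in zip(spike_times, spike_times[1:]))
    median_isi = isis[len(isis) // 2]
    return 1.0 / median_isi

# Toy spike train locked to a 5 ms periodicity, i.e. a 200 Hz pitch
spikes = [0.000, 0.005, 0.010, 0.015, 0.020]
f0 = pitch_from_spikes(spikes)
```

Any spike train whose dominant first-order interval is 5 ms would decode to the same 200 Hz pitch here, mirroring the model's claim that distinct timbres sharing a periodicity yield similar slope-detector firing.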
Parameter estimation and analysis model selections in fluorescence correlation spectroscopy
NASA Astrophysics Data System (ADS)
Dong, Shiqing; Zhou, Jie; Ding, Xuemei; Wang, Yuhua; Xie, Shusen; Yang, Hongqin
2016-10-01
Fluorescence correlation spectroscopy (FCS) is a powerful technique that provides high temporal resolution and sensitivity for detecting the diffusion of biomolecules at extremely low concentrations. The accuracy of this approach depends primarily on the experimental conditions and on the data analysis model. In this study, we set up a confocal-based FCS system and calibrated it with a Rhodamine 6G solution to obtain the relevant parameters. An experimental measurement was then carried out on a one-component solution to evaluate the relationship between the measured number of molecules and the concentration. The results showed that the FCS system we built was stable and valid. Finally, a two-component solution experiment was carried out to show the importance of analysis model selection. FCS is a promising method for studying single-molecule diffusion in living cells.
Ultrastructural model for size selectivity in glomerular filtration.
Edwards, A; Daniels, B S; Deen, W M
1999-06-01
A theoretical model was developed to relate the size selectivity of the glomerular barrier to the structural characteristics of the individual layers of the capillary wall. Thicknesses and other linear dimensions were evaluated, where possible, from previous electron microscopic studies. The glomerular basement membrane (GBM) was represented as a homogeneous material characterized by a Darcy permeability and by size-dependent hindrance coefficients for diffusion and convection, respectively; those coefficients were estimated from recent data obtained with isolated rat GBM. The filtration slit diaphragm was modeled as a single row of cylindrical fibers of equal radius but nonuniform spacing. The resistances of the remainder of the slit channel, and of the endothelial fenestrae, to macromolecule movement were calculated to be negligible. The slit diaphragm was found to be the most restrictive part of the barrier. Because of that, macromolecule concentrations in the GBM increased, rather than decreased, in the direction of flow. Thus the overall sieving coefficient (ratio of Bowman's space concentration to that in plasma) was predicted to be larger for the intact capillary wall than for a hypothetical structure with no GBM. In other words, because the slit diaphragm and GBM do not act independently, the overall sieving coefficient is not simply the product of those for GBM alone and the slit diaphragm alone. Whereas the calculated sieving coefficients were sensitive to the structural features of the slit diaphragm and to the GBM hindrance coefficients, variations in GBM thickness or filtration slit frequency were predicted to have little effect. The ability of the ultrastructural model to represent fractional clearance data in vivo was at least equal to that of conventional pore models with the same number of adjustable parameters. The main strength of the present approach, however, is that it provides a framework for relating structural findings to the size
Radial Domany-Kinzel models with mutation and selection
NASA Astrophysics Data System (ADS)
Lavrentovich, Maxim O.; Korolev, Kirill S.; Nelson, David R.
2013-01-01
We study the effect of spatial structure, genetic drift, mutation, and selective pressure on the evolutionary dynamics in a simplified model of asexual organisms colonizing a new territory. Under an appropriate coarse-graining, the evolutionary dynamics is related to the directed percolation processes that arise in voter models, the Domany-Kinzel (DK) model, contact process, and so on. We explore the differences between linear (flat front) expansions and the much less familiar radial (curved front) range expansions. For the radial expansion, we develop a generalized, off-lattice DK model that minimizes otherwise persistent lattice artifacts. With both simulations and analytical techniques, we study the survival probability of advantageous mutants, the spatial correlations between domains of neutral strains, and the dynamics of populations with deleterious mutations. “Inflation” at the frontier leads to striking differences between radial and linear expansions. For a colony with initial radius R0 expanding at velocity v, significant genetic demixing, caused by local genetic drift, occurs only up to a finite time t*=R0/v, after which portions of the colony become causally disconnected due to the inflating perimeter of the expanding front. As a result, the effect of a selective advantage is amplified relative to genetic drift, increasing the survival probability of advantageous mutants. Inflation also modifies the underlying directed percolation transition, introducing novel scaling functions and modifications similar to a finite-size effect. Finally, we consider radial range expansions with deflating perimeters, as might arise from colonization initiated along the shores of an island.
Cliff-edge model of obstetric selection in humans
Mitteroecker, Philipp; Huttegger, Simon M.; Fischer, Barbara; Pavlicev, Mihaela
2016-01-01
The strikingly high incidence of obstructed labor due to the disproportion of fetal size and the mother’s pelvic dimensions has puzzled evolutionary scientists for decades. Here we propose that these high rates are a direct consequence of the distinct characteristics of human obstetric selection. Neonatal size relative to the birth-relevant maternal dimensions is highly variable and positively associated with reproductive success until it reaches a critical value, beyond which natural delivery becomes impossible. As a consequence, the symmetric phenotype distribution cannot match the highly asymmetric, cliff-edged fitness distribution well: The optimal phenotype distribution that maximizes population mean fitness entails a fraction of individuals falling beyond the “fitness edge” (i.e., those with fetopelvic disproportion). Using a simple mathematical model, we show that weak directional selection for a large neonate, a narrow pelvic canal, or both is sufficient to account for the considerable incidence of fetopelvic disproportion. Based on this model, we predict that the regular use of Caesarean sections throughout the last decades has led to an evolutionary increase of fetopelvic disproportion rates by 10 to 20%. PMID:27930310
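The core argument, that the optimum of a symmetric phenotype distribution against a cliff-edged fitness function necessarily leaves a fraction of the population beyond the edge, can be checked numerically. The sketch below uses a toy fitness form of our own choosing (gently increasing below the cliff, zero beyond it), not the paper's exact model, with invented parameter values.

```python
import math

def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def mean_fitness(mu, sigma=1.0, cliff=3.0, slope=0.1):
    """Population mean fitness for a Gaussian phenotype N(mu, sigma^2)
    under a toy cliff-edged fitness w(x) = 1 + slope*x for x < cliff,
    and w(x) = 0 beyond the cliff (functional form is ours)."""
    below = norm_cdf(cliff, mu, sigma)
    z = (cliff - mu) / sigma
    # E[x * 1{x < cliff}] via the truncated-normal identity:
    # mu*Phi(z) - sigma*phi(z)
    partial_mean = mu * below - sigma * math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    return below + slope * partial_mean

# Grid search over the mean phenotype: the optimum sits below the
# cliff yet leaves a nonzero fraction of individuals beyond it --
# the analog of persistent fetopelvic disproportion
best_mu = max((m / 100.0 for m in range(0, 301)), key=mean_fitness)
risk = 1.0 - norm_cdf(3.0, best_mu, 1.0)
```

Even weak directional selection (small `slope`) pulls the optimal mean close enough to the cliff that the tail beyond it is non-negligible, which is the qualitative point of the cliff-edge model.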
Evaluation of Model Fit in Cognitive Diagnosis Models
ERIC Educational Resources Information Center
Hu, Jinxiang; Miller, M. David; Huggins-Manley, Anne Corinne; Chen, Yi-Hsin
2016-01-01
Cognitive diagnosis models (CDMs) estimate student ability profiles using latent attributes. Model fit to the data needs to be ascertained in order to determine whether inferences from CDMs are valid. This study investigated the usefulness of some popular model fit statistics to detect CDM fit including relative fit indices (AIC, BIC, and CAIC),…
Bayesian Model Selection with Network Based Diffusion Analysis
Whalen, Andrew; Hoppitt, William J. E.
2016-01-01
A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe-Akaike Information Criterion (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large-scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual-level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed. PMID:27092089
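WAIC is computed from the pointwise log-likelihoods of each observation under each posterior draw: a log-pointwise-predictive-density term minus an effective-parameter penalty given by the posterior variance of the log-likelihoods. A minimal sketch, with an invented two-draw, three-observation matrix:

```python
import math

def waic(loglik):
    """WAIC from pointwise log-likelihoods: loglik[s][i] is the
    log-likelihood of observation i under posterior draw s."""
    S, N = len(loglik), len(loglik[0])
    lppd, p_waic = 0.0, 0.0
    for i in range(N):
        col = [loglik[s][i] for s in range(S)]
        m = max(col)
        # log of the posterior-mean likelihood (log-mean-exp, stabilized)
        lppd += m + math.log(sum(math.exp(v - m) for v in col) / S)
        mean = sum(col) / S
        # effective number of parameters: posterior variance of log-lik
        p_waic += sum((v - mean) ** 2 for v in col) / (S - 1)
    return -2.0 * (lppd - p_waic)

# Two posterior draws, three observations (numbers are illustrative)
ll = [[-1.0, -2.0, -1.5],
      [-1.2, -1.8, -1.4]]
score = waic(ll)
```

In an NBDA application, competing transmission models would each yield such a matrix from their MCMC output, and the model with the lowest WAIC would be preferred.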
Impact of selected troposphere models on Precise Point Positioning convergence
NASA Astrophysics Data System (ADS)
Kalita, Jakub; Rzepecka, Zofia
2016-04-01
The Precise Point Positioning (PPP) absolute method is currently being intensively investigated in order to achieve fast convergence times. Among the various factors that influence PPP convergence, the tropospheric delay is one of the most important. Numerous models of tropospheric delay have been developed and applied to PPP processing. However, with rare exceptions, the quality of those models does not allow fixing the zenith path delay tropospheric parameter, leaving the difference between the nominal and final value to the estimation process. Here we present a comparison of several PPP result sets, each based on a different troposphere model. The respective nominal values are adopted from the VMF1, GPT2w, MOPS and ZERO-WET models. The PPP solution admitted as reference is based on the final troposphere product from the International GNSS Service (IGS). The VMF1 mapping function was used for all processing variants in order to make the impact of the applied nominal values comparable. The worst case initializes the zenith wet delay with a zero value (ZERO-WET). The impact of any model for tropospheric nominal values should fall between the IGS and ZERO-WET border variants. The analysis is based on data from seven IGS stations located in the mid-latitude European region from the year 2014. For each station, several days with the most active troposphere were selected for this study. All the PPP solutions were determined using the gLAB open-source software, with the Kalman filter implemented independently by the authors of this work. The processing was performed on 1-hour slices of observation data. In addition to the analysis of the output processing files, the presented study contains a detailed analysis of the tropospheric conditions for the selected data. The overall results show that for the height component the VMF1 model outperforms GPT2w and MOPS by 35-40% and the ZERO-WET variant by 150%. In most of the cases all solutions converge to the same values during first
COUNCIL FOR REGULATORY ENVIRONMENTAL MODELING (CREM) PILOT WATER QUALITY MODEL SELECTION TOOL
EPA's Council for Regulatory Environmental Modeling (CREM) is currently supporting the development of a pilot model selection tool that is intended to help the states and the regions implement the total maximum daily load (TMDL) program. This tool will be implemented within the ...
A Model for Selection of Eyespots on Butterfly Wings
Sekimura, Toshio; Venkataraman, Chandrasekhar; Madzvamuse, Anotida
2015-01-01
Unsolved Problem: The development of eyespots on the wing surface of butterflies of the family Nymphalidae is one of the most studied examples of biological pattern formation. However, little is known about the mechanism that determines the number and precise locations of eyespots on the wing. Eyespots develop around signaling centers, called foci, that are located equidistant from wing veins along the midline of a wing cell (an area bounded by veins). A fundamental question that remains unsolved is why certain wing cells develop an eyespot while other wing cells do not. Key Idea and Model: We illustrate that the key to understanding focus point selection may lie in the venation system of the wing disc. Our main hypothesis is that changes in morphogen concentration along the proximal boundary veins of wing cells govern focus point selection. Based on previous studies, we focus on a spatially two-dimensional reaction-diffusion system model, posed in the interior of each wing cell, that describes the formation of focus points. Using finite element based numerical simulations, we demonstrate that variation in the proximal boundary condition is sufficient to robustly select whether an eyespot focus point forms in otherwise identical wing cells. We also illustrate that this behavior is robust to small perturbations in the parameters and geometry and to moderate levels of noise. Hence, we suggest that an anterior-posterior pattern of morphogen concentration along the proximal vein may be the main determinant of the distribution of focus points on the wing surface. In order to complete our model, we propose a two-stage reaction-diffusion system model, in which a one-dimensional surface reaction-diffusion system, posed on the proximal vein, generates the morphogen concentrations that act as non-homogeneous Dirichlet (i.e., fixed) boundary conditions for the two-dimensional reaction-diffusion model posed in the wing cells. The two-stage model appears capable of generating focus
Improving permafrost distribution modelling using feature selection algorithms
NASA Astrophysics Data System (ADS)
Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail
2016-04-01
The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional datasets is the number of input features (variables) involved. Applying ML classification algorithms to this large number of variables carries a risk of overfitting, with the consequence of poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps reduce the number of factors required and improves knowledge of the adopted features and their relation with the studied phenomenon. Moreover, removing irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM-derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidence (geophysical and thermal data and rock glacier inventories) that serves as training permafrost data. The FS algorithms used indicate which variables appear less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its
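The Information Gain criterion mentioned above scores a feature by how much it reduces the entropy of the class labels. A minimal sketch follows; the toy 'aspect'/'noise' features and presence/absence labels are invented for illustration.

```python
import math

def entropy(labels):
    # Shannon entropy (bits) of a label list
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(feature, labels):
    """IG of a categorical feature with respect to class labels
    (here standing in for permafrost presence/absence)."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # expected entropy of the labels after splitting on the feature
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data: 'aspect' separates the classes perfectly, 'noise' not at all
labels = [1, 1, 0, 0]
aspect = ["N", "N", "S", "S"]
noise = ["a", "b", "a", "b"]
```

Ranking all candidate predictors by such a score is the filter step described in the abstract; wrapper methods like the CFS variant discussed there instead evaluate whole feature subsets.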
Multiphysics modeling of selective laser sintering/melting
NASA Astrophysics Data System (ADS)
Ganeriwala, Rishi Kumar
A significant percentage of total global employment is due to the manufacturing industry. However, manufacturing also accounts for nearly 20% of total energy usage in the United States according to the EIA. In fact, manufacturing accounted for 90% of industrial energy consumption and 84% of industry carbon dioxide emissions in 2002. Clearly, advances in manufacturing technology and efficiency are necessary to curb emissions and help society as a whole. Additive manufacturing (AM) refers to a relatively recent group of manufacturing technologies whereby one can 3D print parts; this capability has the potential to significantly reduce waste, reconfigure the supply chain, and generally disrupt the whole manufacturing industry. Selective laser sintering/melting (SLS/SLM) is one type of AM technology with the distinct advantage of being able to 3D print metals and rapidly produce net-shape parts with complicated geometries. In SLS/SLM, parts are built up layer by layer out of powder particles, which are selectively sintered/melted by a laser. However, in order to produce defect-free parts of sufficient strength, the process parameters (laser power, scan speed, layer thickness, powder size, etc.) must be carefully optimized. Obviously, these process parameters will vary depending on material, part geometry, and desired final part characteristics. Running experiments to optimize these parameters is costly, energy intensive, and extremely material specific. Thus, a computational model of this process would be highly valuable. In this work a three-dimensional, reduced-order, coupled discrete element/finite difference model is presented for simulating the deposition and subsequent laser heating of a layer of powder particles sitting on top of a substrate. Validation is provided and parameter studies are conducted showing the ability of this model to help determine appropriate process parameters and an optimal powder size distribution for a given material. Next, thermal stresses upon
Scaling limits of a model for selection at two scales
NASA Astrophysics Data System (ADS)
Luo, Shishi; Mattingly, Jonathan C.
2017-04-01
The dynamics of a population undergoing selection is a central topic in evolutionary biology. This question is particularly intriguing in the case where selective forces act in opposing directions at two population scales. For example, a fast-replicating virus strain outcompetes slower-replicating strains at the within-host scale. However, if the fast-replicating strain causes host morbidity and is less frequently transmitted, it can be outcompeted by slower-replicating strains at the between-host scale. Here we consider a stochastic ball-and-urn process which models this type of phenomenon. We prove the weak convergence of this process under two natural scalings. The first scaling leads to a deterministic nonlinear integro-partial differential equation on the interval [0,1] with dependence on a single parameter, λ. We show that the fixed points of this differential equation are Beta distributions and that their stability depends on λ and the behavior of the initial data around 1. The second scaling leads to a measure-valued Fleming–Viot process, an infinite-dimensional stochastic process that frequently arises in population genetics.
Robustness and epistasis in mutation-selection models
NASA Astrophysics Data System (ADS)
Wolff, Andrea; Krug, Joachim
2009-09-01
We investigate the fitness advantage associated with the robustness of a phenotype against deleterious mutations using deterministic mutation-selection models of a quasispecies type equipped with a mesa-shaped fitness landscape. We obtain analytic results for the robustness effect which become exact in the limit of infinite sequence length. Thereby, we are able to clarify a seeming contradiction between recent rigorous work and an earlier heuristic treatment based on mapping to a Schrödinger equation. We exploit the quantum mechanical analogy to calculate a correction term for finite sequence lengths and verify our analytic results by numerical studies. In addition, we investigate the occurrence of an error threshold for a general class of epistatic landscapes and show that diminishing epistasis is a necessary but not sufficient condition for error threshold behaviour.
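At finite sequence length, the mutation-selection balance on a mesa-shaped landscape can be explored numerically. The sketch below, a simplified single-step (Hamming-class) model with illustrative parameter values not taken from the paper, builds a column-stochastic mutation matrix and reads the equilibrium mean fitness off the leading eigenvalue of the mutation-selection operator:

```python
import numpy as np

L_seq, mu, s, k = 20, 0.002, 0.1, 3  # sequence length, per-site mutation rate, selection strength, mesa width

# Mesa-shaped fitness over Hamming classes d = 0..L_seq: flat plateau, then neutral.
w = np.where(np.arange(L_seq + 1) <= k, 1.0 + s, 1.0)

# Single-step mutation matrix between Hamming classes (column-stochastic).
M = np.zeros((L_seq + 1, L_seq + 1))
for d in range(L_seq + 1):
    up, down = (L_seq - d) * mu, d * mu   # gain / lose one mutation per generation
    M[d, d] = 1.0 - up - down
    if d < L_seq:
        M[d + 1, d] = up
    if d > 0:
        M[d - 1, d] = down

A = M @ np.diag(w)                        # selection, then mutation
eigvals, eigvecs = np.linalg.eig(A)
lead = int(np.argmax(eigvals.real))
mean_fitness = float(eigvals.real[lead])  # equilibrium mean fitness of the quasispecies
p = np.abs(eigvecs[:, lead].real)
p /= p.sum()                              # mutation-selection-balance class distribution
```

With mutation pressure well below the selective advantage, the equilibrium distribution stays concentrated on the plateau; raising mu toward s is the regime where error-threshold behaviour becomes relevant.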
Model catalysis by size-selected cluster deposition
Anderson, Scott
2015-11-20
This report summarizes the accomplishments during the last four years of the subject grant. Results are presented for experiments in which size-selected model catalysts were studied under surface science and aqueous electrochemical conditions. Strong effects of cluster size were found, and by correlating the size effects with size-dependent physical properties of the samples measured by surface science methods, it was possible to deduce mechanistic insights, such as the factors that control the rate-limiting step in the reactions. Results are presented for CO oxidation, CO binding energetics and geometries, and electronic effects under surface science conditions, and for the electrochemical oxygen reduction reaction, ethanol oxidation reaction, and for oxidation of carbon by water.
A Technical Guide to Ground-Water Model Selection at Sites Contaminated with Radioactive Substances
This report addresses the selection of ground-water flow and contaminant transport models and is intended to be used by hydrogeologists and geoscientists responsible for selecting transport models for use at sites containing radioactive materials.
2006-03-01
time periods. According to Peter Kennedy in “A Guide to Econometrics”, there are numerous appealing features of the panel model, of which the... Criteria (AIC) values. According to Kennedy, in A Guide to Econometrics, the use of the AIC and R2 to determine appropriate lag lengths in time-... two regressions are different (Stata, 2005:306-307). For the purpose of this research, failing to reject Ho, a large p-value, was the desired outcome
Agent-Based vs. Equation-based Epidemiological Models:A Model Selection Case Study
Sukumar, Sreenivas R; Nutaro, James J
2012-01-01
This paper is motivated by the need to design model validation strategies for epidemiological disease-spread models. We consider both agent-based and equation-based models of pandemic disease spread and study the nuances and complexities one has to consider from the perspective of model validation. For this purpose, we instantiate an equation-based model and an agent-based model of the 1918 Spanish flu and we leverage data published in the literature for our case study. We present our observations from the perspective of each implementation and discuss the application of model-selection criteria to compare the risk of choosing one modeling paradigm over another. We conclude with a discussion of our experience and document future ideas for a model validation framework.
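As a minimal illustration of the equation-based side of such a comparison, the sketch below integrates the classic SIR equations with forward Euler; the parameter values are hypothetical (R0 = 2), not the paper's calibrated 1918 values:

```python
def simulate_sir(beta, gamma, s0, i0, dt=0.1, days=200):
    """Forward-Euler integration of the SIR equations (population fractions)."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    infected = []
    for _ in range(int(days / dt)):
        ds = -beta * s * i           # susceptibles infected
        di = beta * s * i - gamma * i  # new infections minus recoveries
        dr = gamma * i               # recoveries
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        infected.append(i)
    return infected

# Hypothetical parameters: transmission beta = 0.5/day, recovery gamma = 0.25/day.
infected = simulate_sir(beta=0.5, gamma=0.25, s0=0.999, i0=0.001)
peak = max(infected)
```

An agent-based counterpart would replace the three aggregate compartments with per-individual state transitions; model-selection criteria can then be applied to each paradigm's fit to the same incidence data.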
Ponciano, José Miguel; Taper, Mark L; Dennis, Brian; Lele, Subhash R
2009-02-01
Hierarchical statistical models are increasingly being used to describe complex ecological processes. The data cloning (DC) method is a new general technique that uses Markov chain Monte Carlo (MCMC) algorithms to compute maximum likelihood (ML) estimates along with their asymptotic variance estimates for hierarchical models. Despite its generality, the method has two inferential limitations. First, it only provides Wald-type confidence intervals, known to be inaccurate in small samples. Second, it only yields ML parameter estimates, but not the maximized likelihood values used for profile likelihood intervals, likelihood ratio hypothesis tests, and information-theoretic model selection. Here we describe how to overcome these inferential limitations with a computationally efficient method for calculating likelihood ratios via data cloning. The ability to calculate likelihood ratios allows one to do hypothesis tests, construct accurate confidence intervals and undertake information-based model selection with hierarchical models in a frequentist context. To demonstrate the use of these tools with complex ecological models, we reanalyze part of Gause's classic Paramecium data with state-space population models containing both environmental noise and sampling error. The analysis results include improved confidence intervals for parameters, a hypothesis test of laboratory replication, and a comparison of the Beverton-Holt and the Ricker growth forms based on a model selection index.
Model Selection in the Analysis of Photoproduction Data
NASA Astrophysics Data System (ADS)
Landay, Justin
2017-01-01
Scattering experiments provide one of the most powerful and useful tools for probing matter to better understand its fundamental properties governed by the strong interaction. As the spectroscopy of the excited states of nucleons enters a new era of precision ushered in by improved experiments at Jefferson Lab and other facilities around the world, traditional partial-wave analysis methods must be adjusted accordingly. In this poster, we present a rigorous set of statistical tools and techniques that we implemented; most notably, the LASSO method, which serves to select the simplest model, allowing us to avoid overfitting. In the case of establishing the spectrum of excited baryons, it avoids overpopulation of the spectrum and thus the occurrence of false positives. This is a prerequisite to reliably compare theories like lattice QCD or quark models to experiments. Here, we demonstrate the principle by simultaneously fitting three observables in neutral pion photoproduction, namely the differential cross section, beam asymmetry and target polarization, across thousands of data points. Other authors include Michael Doring, Bin Hu, and Raquel Molina.
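The LASSO's ability to zero out spurious couplings can be shown in the simplest setting: for an orthonormal design, the LASSO solution is just the soft-thresholded least-squares estimate. The multipole names and coefficient values below are hypothetical, purely to illustrate the mechanism:

```python
def soft_threshold(x, lam):
    """Prox operator of lam * |x|: shrinks toward zero, exactly zeroing small values."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

# Hypothetical unpenalized (least-squares) estimates of partial-wave couplings.
ols_estimates = {"A_1/2": 1.30, "A_3/2": 0.45, "A_5/2": 0.04}
lam = 0.10  # penalty strength, chosen here by hand; in practice via information criteria

# Orthonormal-design LASSO: genuine couplings survive, the spurious one is set to 0.
lasso = {name: soft_threshold(b, lam) for name, b in ols_estimates.items()}
```

This is exactly the sparsity mechanism that keeps weakly supported resonances out of the fitted spectrum.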
NASA Astrophysics Data System (ADS)
Wentworth, Mami Tonoe
Uncertainty quantification plays an important role when making predictive estimates of model responses. In this context, uncertainty quantification is defined as quantifying and reducing uncertainties, and the objective is to quantify uncertainties in parameters, models and measurements, and to propagate those uncertainties through the model, so that one can make a predictive estimate with quantified uncertainties. Two of the aspects of uncertainty quantification that must be performed prior to propagating uncertainties are model calibration and parameter selection. There are several efficient techniques for these processes; however, the accuracy of these methods is often not verified. This is the motivation for our work, and in this dissertation, we present and illustrate verification frameworks for model calibration and parameter selection in the context of biological and physical models. First, HIV models, developed and improved by [2, 3, 8], describe the viral infection dynamics of HIV disease. These are also used to make predictive estimates of viral loads and T-cell counts and to construct an optimal control for drug therapy. Estimating input parameters is an essential step prior to uncertainty quantification. However, not all the parameters are identifiable, meaning that they cannot be uniquely determined from the observations. These unidentifiable parameters can be partially removed by performing parameter selection, a process in which parameters that have minimal impact on the model response are determined. We provide verification techniques for Bayesian model calibration and parameter selection for an HIV model. As an example of a physical model, we employ a heat model with experimental measurements presented in [10]. A steady-state heat model represents prototypical behavior for the heat conduction and diffusion processes involved in a thermal-hydraulic model, which is part of a nuclear reactor model. We employ this simple heat model to illustrate verification
Model-based fault detection and identification with online aerodynamic model structure selection
NASA Astrophysics Data System (ADS)
Lombaerts, T.
2013-12-01
This publication describes a recursive algorithm for the approximation of time-varying nonlinear aerodynamic models by means of a joint adaptive selection of the model structure and parameter estimation. This procedure is called adaptive recursive orthogonal least squares (AROLS) and is an extension and modification of the previously developed ROLS procedure. The algorithm is particularly useful for model-based fault detection and identification (FDI) of aerospace systems. After a failure, a completely new aerodynamic model can be identified recursively, with respect to structure as well as parameter values. The performance of the identification algorithm is demonstrated on a simulation data set.
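AROLS itself is not reproduced here, but its recursive parameter-estimation core can be sketched as standard recursive least squares with a forgetting factor, which is what lets the model track time-varying aerodynamics after a failure. The three-parameter linear aerodynamic model below is a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)

def rls_update(theta, P, phi, y, lam=0.99):
    """One recursive least-squares step with forgetting factor lam (0 < lam <= 1)."""
    Pphi = P @ phi
    gain = Pphi / (lam + phi @ Pphi)           # Kalman-style gain vector
    theta = theta + gain * (y - phi @ theta)   # correct estimate by prediction error
    P = (P - np.outer(gain, Pphi)) / lam       # inflate covariance to forget old data
    return theta, P

# Hypothetical aerodynamic model: C = theta0 + theta1*alpha + theta2*q
true_theta = np.array([0.1, 5.0, -2.0])
theta = np.zeros(3)
P = np.eye(3) * 1e3
for _ in range(500):
    phi = np.array([1.0, rng.uniform(-0.2, 0.2), rng.uniform(-0.1, 0.1)])
    y = phi @ true_theta + rng.normal(0, 1e-3)  # noisy measurement
    theta, P = rls_update(theta, P, phi, y)
```

The structure-selection layer of AROLS would additionally add or remove regressors in `phi` online; here the structure is fixed for brevity.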
Effects of Parceling on Model Selection: Parcel-Allocation Variability in Model Ranking.
Sterba, Sonya K; Rights, Jason D
2016-01-25
Research interest often lies in comparing structural model specifications implying different relationships among latent factors. In this context parceling is commonly accepted, assuming the item-level measurement structure is well known and, conservatively, assuming items are unidimensional in the population. Under these assumptions, researchers compare competing structural models, each specified using the same parcel-level measurement model. However, little is known about consequences of parceling for model selection in this context, including whether and when model ranking could vary across alternative item-to-parcel allocations within-sample. This article first provides a theoretical framework that predicts the occurrence of parcel-allocation variability (PAV) in model selection index values and its consequences for PAV in ranking of competing structural models. These predictions are then investigated via simulation. We show that conditions known to manifest PAV in absolute fit of a single model may or may not manifest PAV in model ranking. Thus, one cannot assume that low PAV in absolute fit implies a lack of PAV in ranking, and vice versa. PAV in ranking is shown to occur under a variety of conditions, including large samples. To provide an empirically supported strategy for selecting a model when PAV in ranking exists, we draw on relationships between structural model rankings in parcel- versus item-level solutions. This strategy employs the across-allocation modal ranking. We developed software tools for implementing this strategy in practice, and illustrate them with an example. Even if a researcher has substantive reason to prefer one particular allocation, investigating PAV in ranking within-sample still provides an informative sensitivity analysis.
Estimating seabed scattering mechanisms via Bayesian model selection.
Steininger, Gavin; Dosso, Stan E; Holland, Charles W; Dettmer, Jan
2014-10-01
A quantitative inversion procedure is developed and applied to determine the dominant scattering mechanism (surface roughness and/or volume scattering) from seabed scattering-strength data. The classification system is based on trans-dimensional Bayesian inversion with the deviance information criterion used to select the dominant scattering mechanism. Scattering is modeled using first-order perturbation theory as due to one of three mechanisms: Interface scattering from a rough seafloor, volume scattering from a heterogeneous sediment layer, or mixed scattering combining both interface and volume scattering. The classification system is applied to six simulated test cases where it correctly identifies the true dominant scattering mechanism as having greater support from the data in five cases; the remaining case is indecisive. The approach is also applied to measured backscatter-strength data where volume scattering is determined as the dominant scattering mechanism. Comparison of inversion results with core data indicates the method yields both a reasonable volume heterogeneity size distribution and a good estimate of the sub-bottom depths at which scatterers occur.
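The deviance information criterion used to pick the dominant mechanism can be computed directly from posterior samples. Below is a toy sketch for a Gaussian model with known unit variance, where the effective number of parameters p_D should come out near 1; the data and sampler are stand-ins, not the scattering inversion:

```python
import math
import random

random.seed(0)

# Toy setting: data from N(mu, 1); posterior samples of mu (here drawn directly,
# standing in for MCMC output from a trans-dimensional sampler).
data = [random.gauss(1.0, 1.0) for _ in range(50)]
xbar = sum(data) / len(data)
post_mu = [random.gauss(xbar, 1.0 / math.sqrt(len(data))) for _ in range(2000)]

def deviance(mu):
    """-2 * log-likelihood of the data under N(mu, 1), dropping constants."""
    return sum((x - mu) ** 2 for x in data)

d_bar = sum(deviance(m) for m in post_mu) / len(post_mu)   # posterior mean deviance
d_hat = deviance(sum(post_mu) / len(post_mu))              # deviance at posterior mean
p_d = d_bar - d_hat                                        # effective number of parameters
dic = d_bar + p_d                                          # lower DIC = better-supported model
```

Comparing the DIC of the interface-only, volume-only, and mixed models is then a matter of repeating this computation under each likelihood and taking the minimum.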
Alternating direction methods for latent variable gaussian graphical model selection.
Ma, Shiqian; Xue, Lingzhou; Zou, Hui
2013-08-01
Chandrasekaran, Parrilo, and Willsky (2012) proposed a convex optimization problem for graphical model selection in the presence of unobserved variables. This convex optimization problem aims to estimate an inverse covariance matrix that can be decomposed into a sparse matrix minus a low-rank matrix from sample data. Solving this convex optimization problem is very challenging, especially for large problems. In this letter, we propose two alternating direction methods for solving this problem. The first method is to apply the classic alternating direction method of multipliers to solve the problem as a consensus problem. The second method is a proximal gradient-based alternating-direction method of multipliers. Our methods take advantage of the special structure of the problem and thus can solve large problems very efficiently. A global convergence result is established for the proposed methods. Numerical results on both synthetic data and gene expression data show that our methods usually solve problems with 1 million variables in 1 to 2 minutes and are usually 5 to 35 times faster than a state-of-the-art Newton-CG proximal point algorithm.
Modeling neuron selectivity over simple midlevel features for image classification.
Shu Kong; Zhuolin Jiang; Qiang Yang
2015-08-01
We now know that good mid-level features can greatly enhance the performance of image classification, but how to efficiently learn image features is still an open question. In this paper, we present an efficient unsupervised mid-level feature learning approach (MidFea), which involves only simple operations, such as k-means clustering, convolution, pooling, vector quantization, and random projection. We show that this simple feature can also achieve good performance in traditional classification tasks. To further boost the performance, we model the neuron selectivity (NS) principle by building an additional layer over the mid-level features prior to the classifier. The NS-layer learns category-specific neurons in a supervised manner with both bottom-up inference and top-down analysis, and thus supports fast inference for a query image. Through extensive experiments, we demonstrate that this higher-level NS-layer notably improves the classification accuracy with our simple MidFea, achieving comparable performance for face recognition, gender classification, age estimation, and object categorization. In particular, our approach runs faster in inference by an order of magnitude than sparse-coding-based feature learning methods. In conclusion, we argue that not only do carefully learned features (MidFea) bring improved performance, but also a sophisticated mechanism (NS-layer) at a higher level boosts the performance further.
Applying optimal model selection in principal stratification for causal inference.
Odondi, Lang'o; McNamee, Roseanne
2013-05-20
Noncompliance to treatment allocation is a key source of complication for causal inference. Efficacy estimation is likely to be compounded by the presence of noncompliance in both treatment arms of clinical trials, where the intention-to-treat estimate provides a biased estimator for the true causal estimate even under the assumption of homogeneous treatment effects. The principal stratification method has been developed to address such posttreatment complications. The present work extends a principal stratification method that adjusts for noncompliance in two-arm trials by developing model selection for covariates predicting compliance to treatment in each arm. We apply the method to analyse data from the Esprit study, which was conducted to ascertain whether unopposed oestrogen (hormone replacement therapy) reduced the risk of further cardiac events in postmenopausal women who survive a first myocardial infarction. We adjust for noncompliance in both treatment arms under a Bayesian framework to produce causal risk ratio estimates for each principal stratum. For mild values of a sensitivity parameter and using separate predictors of compliance in each arm, principal stratification results suggested that compliance with hormone replacement therapy only would reduce the risk of death and myocardial reinfarction by about 47% and 25%, respectively, whereas compliance with either treatment would reduce the risk of death by 13% and of reinfarction by 60% among the most compliant. However, the results were sensitive to the user-defined sensitivity parameter.
Model catalysts prepared by size-selected nanocluster deposition
NASA Astrophysics Data System (ADS)
Aizawa, Masato
Catalytic activity and product selectivity of supported metal catalysts strongly depend on the size of the metal particles, the support material, and the preparation method. A novel instrument was employed to investigate the electronic structure, morphology, and chemical properties of model supported catalysts. Size- and energy-selected metal clusters containing fewer than 30 atoms on TiO2(110) were characterized by X-ray photoelectron spectroscopy (XPS), Auger electron spectroscopy (AES), ion scattering spectroscopy (ISS), and temperature programmed desorption (TPD). The performance of the instrument was checked by investigating the oxidation of vanadium and niobium clusters supported on TiO2(110). Ni clusters are in the zero oxidation state on the support at low impact energies. Oxidation, however, occurs either when the impact energy is increased or when chemisorbed oxygen is available at oxygen defect sites of TiO2. The small clusters bind preferentially to oxygen sites. The large clusters appear to retain some three-dimensional structure on the support. For these clusters, no obvious desorption features were observed in the temperature range above 140 K. The lack of CO desorption is interpreted in terms of strong Ni cluster-TiO2 binding. It seems that the nickel clusters sinter and/or are encapsulated during the TPD experiments. Ir clusters are also in the zero oxidation state on the support, irrespective of cluster size and impact energy. At low impact energies the clusters stay more or less intact on the support. The clusters embed themselves into the support at higher impact energies. The threshold energy for embedding is lower for the larger clusters. This system shows pronounced substrate-mediated adsorption (SMA) of CO at low doses, with the effect varying inversely with cluster size. CO adsorbed via SMA at low CO dose is bound differently from that at high CO dose. In experiments with sequential C16O and C18O doses, C16O → C18O
Bayesian model selection for a finite element model of a large civil aircraft
Hemez, F. M.; Rutherford, A. C.
2004-01-01
Nine aircraft stiffness parameters have been varied and used as inputs to a finite element model of an aircraft to generate natural frequency and deflection features (Goge, 2003). This data set (147 input parameter configurations and associated outputs) is now used to generate a metamodel, or a fast running surrogate model, using Bayesian model selection methods. Once a forward relationship is defined, the metamodel may be used in an inverse sense. That is, knowing the measured output frequencies and deflections, what were the input stiffness parameters that caused them?
Posada, David
2006-07-01
ModelTest server is a web-based application for the selection of models of nucleotide substitution using the program ModelTest. The server takes as input a text file with likelihood scores for the set of candidate models. Models can be selected with hierarchical likelihood ratio tests, or with the Akaike or Bayesian information criteria. The output includes several statistics for the assessment of model selection uncertainty, for model averaging or to estimate the relative importance of model parameters. The server can be accessed at http://darwin.uvigo.es/software/modeltest_server.html.
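The selection step ModelTest performs on a table of likelihood scores can be sketched directly. The substitution-model names, log-likelihoods, and parameter counts below are hypothetical; note how the AIC and the more heavily penalizing BIC can disagree, which is one reason the server reports model-selection uncertainty:

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical scores for three substitution models fit to n = 1000 sites:
# model name -> (log-likelihood, number of free parameters)
models = {"JC69": (-5231.4, 1), "HKY85": (-5114.9, 5), "GTR+G": (-5109.2, 9)}
n_sites = 1000

aics = {m: aic(ll, k) for m, (ll, k) in models.items()}
bics = {m: bic(ll, k, n_sites) for m, (ll, k) in models.items()}
best_aic = min(aics, key=aics.get)
best_bic = min(bics, key=bics.get)

# Akaike weights quantify model-selection uncertainty and support model averaging.
min_aic = min(aics.values())
raw = {m: math.exp(-(a - min_aic) / 2.0) for m, a in aics.items()}
total = sum(raw.values())
weights = {m: w / total for m, w in raw.items()}
```

With these made-up numbers the AIC prefers the richest model while the BIC settles on a smaller one, and the Akaike weights show how much of the evidence each candidate carries.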
Model selection and assessment for multi-species occupancy models
Broms, Kristin M.; Hooten, Mevin B.; Fitzpatrick, Ryan M.
2016-01-01
While multi-species occupancy models (MSOMs) are emerging as a popular method for analyzing biodiversity data, formal checking and validation approaches for this class of models have lagged behind. Concurrent with the rise in application of MSOMs among ecologists, a quiet regime shift is occurring in Bayesian statistics where predictive model comparison approaches are experiencing a resurgence. Unlike single-species occupancy models that use integrated likelihoods, MSOMs are usually couched in a Bayesian framework and contain multiple levels. Standard model checking and selection methods are often unreliable in this setting and there is only limited guidance in the ecological literature for this class of models. We examined several different contemporary Bayesian hierarchical approaches for checking and validating MSOMs and applied these methods to a freshwater aquatic study system in Colorado, USA, to better understand the diversity and distributions of plains fishes. Our findings indicated distinct differences among model selection approaches, with cross-validation techniques performing the best in terms of prediction.
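The cross-validation techniques found to perform best can be illustrated generically. The sketch below implements plain k-fold cross-validation and uses it to compare two simple predictive models; it is not an MSOM, just the selection machinery, and the data are toy values:

```python
def k_fold_splits(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cv_score(xs, ys, fit, predict, k=5):
    """Mean squared prediction error over k held-out folds."""
    total, count = 0.0, 0
    for train, test in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            total += (predict(model, xs[i]) - ys[i]) ** 2
            count += 1
    return total / count

# Toy comparison: a constant model vs. a straight-line model on a linear trend.
xs = list(range(20))
ys = [0.5 * x + 1.0 for x in xs]

fit_const = lambda X, Y: sum(Y) / len(Y)
pred_const = lambda m, x: m
def fit_line(X, Y):
    mx, my = sum(X) / len(X), sum(Y) / len(Y)
    b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
    return (my - b * mx, b)
pred_line = lambda m, x: m[0] + m[1] * x

score_const = cv_score(xs, ys, fit_const, pred_const)
score_line = cv_score(xs, ys, fit_line, pred_line)
```

For hierarchical Bayesian models the same held-out-prediction logic applies, with the per-fold fit replaced by an MCMC run and the squared error by a predictive score.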
Principal Selection in Rural School Districts: A Process Model.
ERIC Educational Resources Information Center
Richardson, M. D.; And Others
Recent research illustrates the increasingly important role of the school principal. As a result, procedures for selecting principals have also become more critical to rural school districts. School systems, particularly rural school districts, are encouraged to adopt systematic, rational means for selecting administrators. Such procedures will…
Dynamical modelling of NGC 6809: selecting the best model using Bayesian inference
NASA Astrophysics Data System (ADS)
Diakogiannis, Foivos I.; Lewis, Geraint F.; Ibata, Rodrigo A.
2014-02-01
The precise cosmological origin of globular clusters remains uncertain, a situation hampered by the struggle of observational approaches in conclusively identifying the presence, or not, of dark matter in these systems. In this paper, we address this question through an analysis of the particular case of NGC 6809. While previous studies have performed dynamical modelling of this globular cluster using a small number of available kinematic data, they did not perform appropriate statistical inference tests for the choice of best model description; such statistical inference for model selection is important since, in general, different models can result in significantly different inferred quantities. With the latest kinematic data, we use Bayesian inference tests for model selection and thus obtain the best-fitting models, as well as mass and dynamic mass-to-light ratio estimates. For this, we introduce a new likelihood function that provides more constrained distributions for the defining parameters of dynamical models. Initially, we consider models with a known distribution function, and then model the cluster using solutions of the spherically symmetric Jeans equation; this latter approach depends upon the mass density profile and the anisotropy parameter β. In order to find the best description for the cluster we compare these models by calculating their Bayesian evidence. We find smaller mass and dynamic mass-to-light ratio values than previous studies, with the best-fitting Michie model for a constant mass-to-light ratio giving Υ = 0.90^{+0.14}_{-0.14} and M_dyn = 6.10^{+0.51}_{-0.88} × 10^4 M_⊙. We exclude the significant presence of dark matter throughout the cluster, showing that no physically motivated distribution of dark matter can be present away from the cluster core.
Continuous time limits of the utterance selection model.
Michaud, Jérôme
2017-02-01
In this paper we derive alternative continuous time limits of the utterance selection model (USM) for language change [G. J. Baxter et al., Phys. Rev. E 73, 046118 (2006), 10.1103/PhysRevE.73.046118]. This is motivated by the fact that the Fokker-Planck continuous time limit derived in the original version of the USM is only valid for a small range of parameters. We investigate the consequences of relaxing these constraints on parameters. Using the normal approximation of the multinomial distribution, we derive a continuous time limit of the USM in the form of a weak-noise stochastic differential equation. We argue that this weak noise, not captured by the Kramers-Moyal expansion, cannot be neglected. We then propose a coarse-graining procedure, which takes the form of a stochastic version of the heterogeneous mean field approximation. This approximation groups the behavior of nodes of the same degree, reducing the complexity of the problem. With the help of this approximation, we study in detail two simple families of networks: the regular networks and the star-shaped networks. The analysis reveals and quantifies a finite-size effect of the dynamics. If we increase the size of the network by keeping all the other parameters constant, we transition from a state where conventions emerge to a state where no convention emerges. Furthermore, we show that the degree of a node acts as a time scale. For heterogeneous networks such as star-shaped networks, the time scale difference can become very large, leading to a noisier behavior of highly connected nodes.
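A weak-noise SDE of this kind can be explored numerically with the Euler-Maruyama scheme. The drift and diffusion below are illustrative choices for a variant frequency confined to [0, 1] (bistable drift, Wright-Fisher-style noise scaled by a small amplitude), not the USM's actual coefficients:

```python
import math
import random

random.seed(42)

def euler_maruyama(drift, diffusion, x0, dt, steps):
    """Simulate dx = drift(x) dt + diffusion(x) dW with the Euler-Maruyama scheme."""
    x = x0
    path = [x]
    for _ in range(steps):
        dw = random.gauss(0.0, math.sqrt(dt))      # Brownian increment
        x = x + drift(x) * dt + diffusion(x) * dw
        x = min(1.0, max(0.0, x))                  # frequencies live on [0, 1]
        path.append(x)
    return path

eps = 0.05                                    # weak-noise amplitude (illustrative)
drift = lambda x: x * (1 - x) * (2 * x - 1)   # bistable: conventions at x = 0 and x = 1
diffusion = lambda x: eps * math.sqrt(max(x * (1 - x), 0.0))
path = euler_maruyama(drift, diffusion, x0=0.6, dt=0.01, steps=20000)
```

Shrinking eps mimics growing the network: the noise that lets the system hop between conventions weakens, which is the finite-size effect the paper quantifies.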
Selecting single model in combination forecasting based on cointegration test and encompassing test.
Jiang, Chuanjin; Zhang, Jing; Song, Fugen
2014-01-01
Combination forecasting takes the characteristics of each single forecasting method into consideration and combines them to form a composite, which increases forecasting accuracy. Existing research on combination forecasting selects single models arbitrarily, neglecting the internal characteristics of the forecasting object. After discussing the role of the cointegration test and the encompassing test in the selection of single models, supplemented by empirical analysis, the paper gives the following guidance for single-model selection: no more than five suitable single models should be selected from the many alternatives for a given forecasting target, which increases accuracy and stability.
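Once the single models are selected, one common concrete combination scheme is inverse-MSE weighting; by convexity of the squared error, the combined forecast's MSE can never exceed the worst model's. The forecasts and actuals below are made-up numbers for illustration:

```python
def mse(errors):
    return sum(e * e for e in errors) / len(errors)

def inverse_mse_weights(error_series):
    """Weight each model inversely to its MSE (a simple, common combination scheme)."""
    inv = [1.0 / mse(e) for e in error_series]
    s = sum(inv)
    return [w / s for w in inv]

# Hypothetical hold-out forecasts from three selected single models.
actual = [10.0, 12.0, 11.0, 13.0, 12.5]
forecasts = [
    [9.5, 12.4, 10.8, 13.5, 12.0],   # model 1
    [10.6, 11.5, 11.4, 12.6, 13.1],  # model 2
    [11.0, 13.2, 10.0, 14.0, 13.5],  # model 3 (worst)
]
errors = [[f - a for f, a in zip(fc, actual)] for fc in forecasts]
weights = inverse_mse_weights(errors)
combined = [sum(w * fc[t] for w, fc in zip(weights, forecasts)) for t in range(len(actual))]
combined_mse = mse([c - a for c, a in zip(combined, actual)])
```

The cointegration and encompassing tests described above would sit upstream of this step, deciding which single models enter the `forecasts` list at all.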
Selecting Single Model in Combination Forecasting Based on Cointegration Test and Encompassing Test
Jiang, Chuanjin; Zhang, Jing; Song, Fugen
2014-01-01
Combination forecasting takes all characters of each single forecasting method into consideration, and combines them to form a composite, which increases forecasting accuracy. The existing researches on combination forecasting select single model randomly, neglecting the internal characters of the forecasting object. After discussing the function of cointegration test and encompassing test in the selection of single model, supplemented by empirical analysis, the paper gives the single model selection guidance: no more than five suitable single models can be selected from many alternative single models for a certain forecasting target, which increases accuracy and stability. PMID:24892061
Smith, Graham C; Delahay, Richard J; McDonald, Robbie A; Budgey, Richard
2016-01-01
Bovine tuberculosis (bTB) causes substantial economic losses to cattle farmers and taxpayers in the British Isles. Disease management in cattle is complicated by the role of the European badger (Meles meles) as a host of the infection. Proactive, non-selective culling of badgers can reduce the incidence of disease in cattle but may also have negative effects in the area surrounding culls that have been associated with social perturbation of badger populations. The selective removal of infected badgers would, in principle, reduce the number culled, but the effects of selective culling on social perturbation and disease outcomes are unclear. We used an established model to simulate non-selective badger culling, non-selective badger vaccination and a selective trap and vaccinate or remove (TVR) approach to badger management in two distinct areas: South West England and Northern Ireland. TVR was simulated with and without social perturbation in effect. The lower badger density in Northern Ireland caused no qualitative change in the effect of management strategies on badgers, although the absolute number of infected badgers was lower in all cases. However, probably due to differing herd density in Northern Ireland, the simulated badger management strategies caused greater variation in subsequent cattle bTB incidence. Selective culling in the model reduced the number of badgers killed by about 83% but this only led to an overall benefit for cattle TB incidence if there was no social perturbation of badgers. We conclude that the likely benefit of selective culling will be dependent on the social responses of badgers to intervention but that other population factors including badger and cattle density had little effect on the relative benefits of selective culling compared to other methods, and that this may also be the case for disease management in other wild host populations. PMID:27893809
ERIC Educational Resources Information Center
Olejnik, Stephen; Mills, Jamie; Keselman, Harvey
2000-01-01
Evaluated the use of Mallows' C(p) and Wherry's adjusted R squared (R. Wherry, 1931) statistics to select a final model from a pool of model solutions using computer-generated data. Neither statistic identified the underlying regression model any better than, and usually less well than, the stepwise selection method, which itself was poor for…
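As an illustration of the two statistics evaluated in this abstract, the following Python sketch computes Mallows' Cp and adjusted R² for a few candidate subsets on simulated data (the data-generating process and candidate subsets are invented for the example, not taken from the study):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 5
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=n)  # true predictors: columns 0 and 1

def rss_of(cols):
    """Residual sum of squares and column count of an OLS fit (with intercept)."""
    A = np.column_stack([np.ones(n), X[:, cols]])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return resid @ resid, A.shape[1]

rss_full, k_full = rss_of(list(range(p)))
sigma2 = rss_full / (n - k_full)          # error variance estimated from the full model
tss = ((y - y.mean()) ** 2).sum()

stats = {}
for cols in ([0], [0, 1], [0, 1, 2]):
    rss, k = rss_of(cols)
    cp = rss / sigma2 - n + 2 * k                    # Mallows' Cp
    adj_r2 = 1 - (rss / (n - k)) / (tss / (n - 1))   # adjusted R-squared
    stats[tuple(cols)] = (cp, adj_r2)
    print(cols, f"Cp={cp:.1f}", f"adjR2={adj_r2:.3f}")
```

A good subset has Cp close to the number of fitted coefficients; the underfit subset [0] shows a grossly inflated Cp and a lower adjusted R².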
Technology Transfer Automated Retrieval System (TEKTRAN)
Genomic selection (GS) models use genome-wide genetic information to predict genetic values of candidates for selection. Originally these models were developed without considering genotype × environment interaction (GE). Several authors have proposed extensions of the canonical GS model that accomm...
Selecting the Number of Principal Components in Functional Data
Li, Yehua; Wang, Naisyin; Carroll, Raymond J.
2013-01-01
Functional principal component analysis (FPCA) has become the most widely used dimension reduction tool for functional data analysis. We consider functional data measured at random, subject-specific time points, contaminated with measurement error, allowing for both sparse and dense functional data, and propose novel information criteria to select the number of principal components in such data. We propose a Bayesian information criterion based on marginal modeling that can consistently select the number of principal components for both sparse and dense functional data. For dense functional data, we also develop an Akaike information criterion (AIC) based on the expected Kullback-Leibler information under a Gaussian assumption. In connecting with factor analysis in multivariate time series data, we also consider the information criteria by Bai & Ng (2002) and show that they are still consistent for dense functional data, if a prescribed undersmoothing scheme is undertaken in the FPCA algorithm. We perform intensive simulation studies and show that the proposed information criteria vastly outperform existing methods for this type of data. Surprisingly, our empirical evidence shows that our information criteria proposed for dense functional data also perform well for sparse functional data. An empirical example using colon carcinogenesis data is also provided to illustrate the results. PMID:24376287
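A minimal illustration of information-criterion selection of the number of components: the sketch below applies a BIC under a probabilistic-PCA likelihood to simulated multivariate data. This is a simplified stand-in for the paper's functional-data criteria; the simulated data and the parameter count are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate n observations in p dimensions with a true rank-2 signal plus noise.
n, p, true_k = 500, 10, 2
scores = rng.normal(size=(n, true_k)) * np.array([4.0, 2.0])
loadings = rng.normal(size=(true_k, p))
X = scores @ loadings + rng.normal(scale=0.5, size=(n, p))

# Eigenvalues of the sample covariance, largest first.
lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

def bic(k):
    """BIC under a probabilistic-PCA likelihood: k retained eigenvalues plus an
    isotropic residual. Constants common to all k are dropped; the parameter
    count is a standard approximation."""
    resid = lam[k:].mean()
    ll = -0.5 * n * (np.log(lam[:k]).sum() + (p - k) * np.log(resid))
    n_par = p * k - k * (k - 1) / 2 + 1
    return -2 * ll + n_par * np.log(n)

best_k = min(range(1, p), key=bic)
print("selected number of components:", best_k)
```

With a clear eigenvalue gap, the criterion recovers the true rank: too few components leaves large eigenvalues in the residual (poor likelihood), and extra components buy almost no likelihood at the cost of the log(n) penalty.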
CORRELATION PURSUIT: FORWARD STEPWISE VARIABLE SELECTION FOR INDEX MODELS
Zhong, Wenxuan; Zhang, Tingting; Zhu, Yu; Liu, Jun S.
2012-01-01
In this article, a stepwise procedure, correlation pursuit (COP), is developed for variable selection under the sufficient dimension reduction framework, in which the response variable Y is influenced by the predictors X1, X2, …, Xp through an unknown function of a few linear combinations of them. Unlike linear stepwise regression, COP does not impose a special form of relationship (such as linear) between the response variable and the predictor variables. The COP procedure selects variables that attain the maximum correlation between the transformed response and the linear combination of the variables. Various asymptotic properties of the COP procedure are established, and in particular, its variable selection performance under a diverging number of predictors and sample size has been investigated. The excellent empirical performance of the COP procedure in comparison with existing methods is demonstrated by both extensive simulation studies and a real example in functional genomics. PMID:23243388
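The greedy flavor of a stepwise procedure like COP can be sketched with a purely linear stand-in: at each step, add the predictor that most increases the squared correlation between the response and the fitted linear combination. The full COP procedure works with transformed responses under sufficient dimension reduction; this simplification, with invented data, is only illustrative.

```python
import numpy as np

def forward_select(X, y, max_vars=3):
    """Greedy forward selection: at each step, add the predictor that most
    increases the squared correlation between y and the least-squares fit."""
    n, p = X.shape
    selected = []

    def r2(cols):
        A = np.column_stack([np.ones(n), X[:, cols]])
        fit = A @ np.linalg.lstsq(A, y, rcond=None)[0]
        return np.corrcoef(fit, y)[0, 1] ** 2

    while len(selected) < max_vars:
        best = max((c for c in range(p) if c not in selected),
                   key=lambda c: r2(selected + [c]))
        selected.append(best)
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = 2 * X[:, 0] - 3 * X[:, 4] + rng.normal(scale=0.1, size=200)
sel = forward_select(X, y, max_vars=2)
print(sel)  # recovers the two informative predictors (columns 4 and 0)
```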
Model-independent plot of dynamic PET data facilitates data interpretation and model selection.
Munk, Ole Lajord
2012-02-21
When testing new PET radiotracers or new applications of existing tracers, the blood-tissue exchange and the metabolism need to be examined. However, conventional plots of measured time-activity curves from dynamic PET do not reveal the inherent kinetic information. A novel model-independent volume-influx plot (vi-plot) was developed and validated. The new vi-plot shows the time course of the instantaneous distribution volume and the instantaneous influx rate. The vi-plot visualises physiological information that facilitates model selection, and it reveals when a quasi-steady state is reached, which is a prerequisite for the use of the graphical analyses by Logan and Gjedde-Patlak. Both axes of the vi-plot have direct physiological interpretation, and the plot shows kinetic parameters in close agreement with estimates obtained by non-linear kinetic modelling. The vi-plot is equally useful for analyses of PET data based on a plasma input function or a reference region input function. The vi-plot is a model-independent and informative plot for data exploration that facilitates the selection of an appropriate method for data analysis.
ERIC Educational Resources Information Center
Ravelo Hurtado, Nestor E.; Nitko, Anthony J.
This paper describes a modified lottery selection procedure and compares it with several popular unbiased candidate selection models in a Venezuelan academic selection situation. The procedure uses a modified version of F. S. Ellett's lottery method as a means of partially satisfying the principles of substantive fairness. Ellett's procedure…
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Illman, Walter A.; Wöhling, Thomas; Nowak, Wolfgang
2015-12-01
Groundwater modelers face the challenge of how to assign representative parameter values to the studied aquifer. Several approaches are available to parameterize spatial heterogeneity in aquifer parameters. They differ in their conceptualization and complexity, ranging from homogeneous models to heterogeneous random fields. While it is common practice to invest more effort in data collection for models with a finer resolution of heterogeneities, there is little guidance on how much data is required to justify a certain level of model complexity. In this study, we propose to use concepts related to Bayesian model selection to identify this balance. We demonstrate our approach on the characterization of a heterogeneous aquifer via hydraulic tomography in a sandbox experiment (Illman et al., 2010). We consider four increasingly complex parameterizations of hydraulic conductivity: (1) Effective homogeneous medium, (2) geology-based zonation, (3) interpolation by pilot points, and (4) geostatistical random fields. First, we investigate the shift in justified complexity with an increasing amount of available data by constructing a model confusion matrix. This matrix indicates the maximum level of complexity that can be justified given a specific experimental setup. Second, we determine which parameterization is most adequate given the observed drawdown data. Third, we test how the different parameterizations perform in a validation setup. The results of our test case indicate that aquifer characterization via hydraulic tomography does not necessarily require (or justify) a geostatistical description. Instead, a zonation-based model might be a more robust choice, but only if the zonation is geologically adequate.
NASA Astrophysics Data System (ADS)
Martin-StPaul, N. K.; Ay, J. S.; Guillemot, J.; Doyen, L.; Leadley, P.
2014-12-01
Species distribution models (SDMs) are widely used to study and predict the outcome of global changes on species. In human-dominated ecosystems the presence of a given species is the result of both its ecological suitability and the human footprint on nature, such as land use choices. Land use choices may thus be responsible for a selection bias in the presence/absence data used in SDM calibration. We present a structural modelling approach (i.e. based on structural equation modelling) that accounts for this selection bias. The new structural species distribution model (SSDM) estimates simultaneously land use choices and species responses to bioclimatic variables. A land use equation based on an econometric model of landowner choices was joined to an equation of species response to bioclimatic variables. SSDM allows the residuals of both equations to be dependent, taking into account the possibility of shared omitted variables and measurement errors. We provide a general description of the statistical theory and a set of applications on forest trees over France using databases of climate and forest inventory at different spatial resolutions (from 2 km to 8 km). We also compared the outputs of the SSDM with outputs of a classical SDM (i.e. Biomod ensemble modelling) in terms of bioclimatic response curves and potential distributions under current climate and climate change scenarios. The shapes of the bioclimatic response curves and the modelled species distribution maps differed markedly between the SSDM and classical SDMs, with contrasted patterns according to species and spatial resolutions. The magnitude and directions of these differences were dependent on the correlations between the errors from both equations and were highest for higher spatial resolutions. A first conclusion is that the use of classical SDMs can potentially lead to strong misestimation of the actual and future probability of presence modelled. Beyond this selection bias, the SSDM we propose represents
Young Children's Selective Learning of Rule Games from Reliable and Unreliable Models
ERIC Educational Resources Information Center
Rakoczy, Hannes; Warneken, Felix; Tomasello, Michael
2009-01-01
We investigated preschoolers' selective learning from models that had previously appeared to be reliable or unreliable. Replicating previous research, children from 4 years selectively learned novel words from reliable over unreliable speakers. Extending previous research, children also selectively learned other kinds of acts--novel games--from…
Selection of Authentic Modelling Practices as Contexts for Chemistry Education
ERIC Educational Resources Information Center
Prins, Gjalt T.; Bulte, Astrid M. W.; van Driel, Jan H.; Pilot, Albert
2008-01-01
In science education, students should come to understand the nature and significance of models. In the case of chemistry education it is argued that the present use of models is often not meaningful from the students' perspective. A strategy to overcome this problem is to use an authentic chemical modelling practice as a context for a curriculum…
Support interference of wind tunnel models: A selective annotated bibliography
NASA Technical Reports Server (NTRS)
Tuttle, M. H.; Lawing, P. L.
1984-01-01
This bibliography, with abstracts, consists of 143 citations arranged in chronological order by dates of publication. Selection of the citations was made for their relevance to the problems involved in understanding or avoiding support interference in wind tunnel testing throughout the Mach number range. An author index is included.
Support interference of wind tunnel models: A selective annotated bibliography
NASA Technical Reports Server (NTRS)
Tuttle, M. H.; Gloss, B. B.
1981-01-01
This bibliography, with abstracts, consists of 143 citations arranged in chronological order by dates of publication. Selection of the citations was made for their relevance to the problems involved in understanding or avoiding support interference in wind tunnel testing throughout the Mach number range. An author index is included.
Madrasi, Kumpal; Chaturvedula, Ayyappa; Haberer, Jessica E; Sale, Mark; Fossler, Michael J; Bangsberg, David; Baeten, Jared M; Celum, Connie; Hendrix, Craig W
2016-12-06
Adherence is a major factor in the effectiveness of preexposure prophylaxis (PrEP) for HIV prevention. Modeling patterns of adherence helps to identify influential covariates of different types of adherence as well as to enable clinical trial simulation so that appropriate interventions can be developed. We developed a Markov mixed-effects model to understand the covariates influencing adherence patterns to daily oral PrEP. Electronic adherence records (date and time of medication bottle cap opening) from the Partners PrEP ancillary adherence study with a total of 1147 subjects were used. This study included once-daily dosing regimens of placebo, oral tenofovir disoproxil fumarate (TDF), and TDF in combination with emtricitabine (FTC), administered to HIV-uninfected members of serodiscordant couples. One-coin and first- to third-order Markov models were fit to the data using NONMEM(®) 7.2. Model selection criteria included objective function value (OFV), Akaike information criterion (AIC), visual predictive checks, and posterior predictive checks. Covariates were included based on forward addition (α = 0.05) and backward elimination (α = 0.001). Markov models better described the data than 1-coin models. A third-order Markov model gave the lowest OFV and AIC, but the simpler first-order model was used for covariate model building because no additional benefit on prediction of target measures was observed for higher-order models. Female sex and older age had a positive impact on adherence, whereas Sundays, sexual abstinence, and sex with a partner other than the study partner had a negative impact on adherence. Our findings suggest adherence interventions should consider the role of these factors.
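The model comparison described in this abstract can be illustrated on simulated binary adherence data: the sketch below fits a one-coin (Bernoulli) model and a first-order Markov model by maximum likelihood and compares their AIC. The transition probabilities are invented for the example; the study's NONMEM models are far richer.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated daily adherence (1 = dose taken): today's probability depends on
# yesterday's outcome (hypothetical persistence/recovery values).
p_take = {1: 0.9, 0: 0.5}
seq = [1]
for _ in range(999):
    seq.append(int(rng.random() < p_take[seq[-1]]))
seq = np.array(seq)

def bernoulli_ll(s):
    """Maximized Bernoulli log-likelihood of a 0/1 sequence."""
    p = s.mean()
    return s.sum() * np.log(p) + (len(s) - s.sum()) * np.log(1 - p)

# One-coin model: a single adherence probability (1 parameter).
aic_coin = -2 * bernoulli_ll(seq) + 2 * 1

# First-order Markov model: one conditional probability per previous state (2 parameters).
ll = sum(bernoulli_ll(seq[1:][seq[:-1] == prev]) for prev in (0, 1))
aic_markov = -2 * ll + 2 * 2

print(f"one-coin AIC: {aic_coin:.1f}, first-order Markov AIC: {aic_markov:.1f}")
```

Because the simulated data really are serially dependent, the Markov model's likelihood gain far outweighs its extra parameter, mirroring the paper's finding that Markov models beat one-coin models.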
Yi, Nengjun; Shriner, Daniel; Banerjee, Samprit; Mehta, Tapan; Pomp, Daniel; Yandell, Brian S.
2007-01-01
We extend our Bayesian model selection framework for mapping epistatic QTL in experimental crosses to include environmental effects and gene–environment interactions. We propose a new, fast Markov chain Monte Carlo algorithm to explore the posterior distribution of unknowns. In addition, we take advantage of any prior knowledge about genetic architecture to increase posterior probability on more probable models. These enhancements have significant computational advantages in models with many effects. We illustrate the proposed method by detecting new epistatic and gene–sex interactions for obesity-related traits in two real data sets of mice. Our method has been implemented in the freely available package R/qtlbim (http://www.qtlbim.org) to facilitate the general usage of the Bayesian methodology for genomewide interacting QTL analysis. PMID:17483424
Fuel model selection for BEHAVE in midwestern oak savannas
Grabner, K.W.; Dwyer, J.P.; Cutter, B.E.
2001-01-01
BEHAVE, a fire behavior prediction system, can be a useful tool for managing areas with prescribed fire. However, the proper choice of fuel models can be critical in developing management scenarios. BEHAVE predictions were evaluated using four standardized fuel models that partially described oak savanna fuel conditions: Fuel Model 1 (Short Grass), 2 (Timber and Grass), 3 (Tall Grass), and 9 (Hardwood Litter). Although all four models yielded regressions with R² in excess of 0.8, Fuel Model 2 produced the most reliable fire behavior predictions.
Read, Mark N; Bailey, Jacqueline; Timmis, Jon; Chtanova, Tatyana
2016-09-01
The advent of two-photon microscopy now reveals unprecedented, detailed spatio-temporal data on cellular motility and interactions in vivo. Understanding cellular motility patterns is key to gaining insight into the development and possible manipulation of the immune response. Computational simulation has become an established technique for understanding immune processes and evaluating hypotheses in the context of experimental data, and there is clear scope to integrate microscopy-informed motility dynamics. However, determining which motility model best reflects in vivo motility is non-trivial: 3D motility is an intricate process requiring several metrics to characterize. This complicates model selection and parameterization, which must be performed against several metrics simultaneously. Here we evaluate Brownian motion, Lévy walk and several correlated random walks (CRWs) against the motility dynamics of neutrophils and lymph node T cells under inflammatory conditions by simultaneously considering cellular translational and turn speeds, and meandering indices. Heterogeneous cells exhibiting a continuum of inherent translational speeds and directionalities comprise both datasets, a feature significantly improving capture of in vivo motility when simulated as a CRW. Furthermore, translational and turn speeds are inversely correlated, and the corresponding CRW simulation again improves capture of our in vivo data, albeit to a lesser extent. In contrast, Brownian motion poorly reflects our data. Lévy walk is competitive in capturing some aspects of neutrophil motility, but T cell directional persistence only, therein highlighting the importance of evaluating models against several motility metrics simultaneously. This we achieve through novel application of multi-objective optimization, wherein each model is independently implemented and then parameterized to identify optimal trade-offs in performance against each metric. The resultant Pareto fronts of optimal
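One of the metrics used in this study, the meandering index, separates persistent from diffusive motion even in a toy setting. The sketch below simulates Brownian-like and correlated random walks and compares their mean meandering indices; the step counts and turn-angle spreads are arbitrary choices for the example, not fitted motility parameters.

```python
import numpy as np

rng = np.random.default_rng(4)

def walk(n_steps, turn_sd):
    """2D walk with unit steps; turn_sd controls directional persistence:
    a large turn_sd is Brownian-like, a small turn_sd gives a correlated random walk."""
    heading = rng.uniform(0, 2 * np.pi)
    pos = np.zeros((n_steps + 1, 2))
    for i in range(n_steps):
        heading += rng.normal(scale=turn_sd)
        pos[i + 1] = pos[i] + [np.cos(heading), np.sin(heading)]
    return pos

def meandering_index(path):
    """Net displacement divided by total path length (1 = perfectly straight)."""
    net = np.linalg.norm(path[-1] - path[0])
    total = np.linalg.norm(np.diff(path, axis=0), axis=1).sum()
    return net / total

brownian = np.mean([meandering_index(walk(200, turn_sd=2.5)) for _ in range(50)])
crw = np.mean([meandering_index(walk(200, turn_sd=0.2)) for _ in range(50)])
print(f"Brownian-like: {brownian:.2f}, CRW: {crw:.2f}")
```

As the abstract emphasizes, a single metric like this is not enough on its own; a full evaluation scores each candidate model against translational speed, turn speed, and meandering simultaneously.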
Guidance for selecting and preparing input values for OPP's aquatic exposure models, intended to improve consistency in modeling the fate of pesticides in the environment and the quality of OPP's aquatic risk assessments.
Edla, Shwetha; Kovvali, Narayan; Papandreou-Suppappola, Antonia
2012-01-01
Constructing statistical models of electrocardiogram (ECG) signals, whose parameters can be used for automated disease classification, is of great importance in precluding manual annotation and providing prompt diagnosis of cardiac diseases. ECG signals consist of several segments with different morphologies (namely the P wave, QRS complex and the T wave) in a single heart beat, which can vary across individuals and diseases. Also, existing statistical ECG models exhibit a reliance upon obtaining a priori information from the ECG data by using preprocessing algorithms to initialize the filter parameters, or to define the user-specified model parameters. In this paper, we propose an ECG modeling technique using the sequential Markov chain Monte Carlo (SMCMC) filter that can perform simultaneous model selection, by adaptively choosing from different representations depending upon the nature of the data. Our results demonstrate the ability of the algorithm to track various types of ECG morphologies, including intermittently occurring ECG beats. In addition, we use the estimated model parameters as the feature set to classify between ECG signals with normal sinus rhythm and four different types of arrhythmia.
Amine modeling for CO2 capture: internals selection.
Karpe, Prakash; Aichele, Clint P
2013-04-16
Traditionally, trays have been the mass-transfer device of choice in amine absorption units. However, the need to process large volumes of flue gas to capture CO2 and the resultant high costs of multiple trains of large trayed columns have prompted process licensors and vendors to investigate alternative mass-transfer devices. These alternatives include third-generation random packings and structured packings. Nevertheless, clear-cut guidelines for selection of packings for amine units are lacking. This paper provides well-defined guidelines and a consistent framework for the choice of mass-transfer devices for amine absorbers and regenerators. This work emphasizes the role played by the flow parameter, a measure of column liquid loading and pressure, in the type of packing selected. In addition, this paper demonstrates the significant economic advantage of packings over trays in terms of capital costs (CAPEX) and operating costs (OPEX).
Bayesian model selection validates a biokinetic model for zirconium processing in humans
2012-01-01
Background: In radiation protection, biokinetic models for zirconium processing are of crucial importance in dose estimation and further risk analysis for humans exposed to this radioactive substance. They provide limiting values of detrimental effects and build the basis for applications in internal dosimetry, the prediction for radioactive zirconium retention in various organs as well as retrospective dosimetry. Multi-compartmental models are the tool of choice for simulating the processing of zirconium. Although easily interpretable, determining the exact compartment structure and interaction mechanisms is generally daunting. In the context of observing the dynamics of multiple compartments, Bayesian methods provide efficient tools for model inference and selection. Results: We are the first to apply a Markov chain Monte Carlo approach to compute Bayes factors for the evaluation of two competing models for zirconium processing in the human body after ingestion. Based on in vivo measurements of human plasma and urine levels we were able to show that a recently published model is superior to the standard model of the International Commission on Radiological Protection. The Bayes factors were estimated by means of the numerically stable thermodynamic integration in combination with a recently developed copula-based Metropolis-Hastings sampler. Conclusions: In contrast to the standard model the novel model predicts lower accretion of zirconium in bones. This results in lower levels of noxious doses for exposed individuals. Moreover, the Bayesian approach allows for retrospective dose assessment, including credible intervals for the initially ingested zirconium, in a significantly more reliable fashion than previously possible. All methods presented here are readily applicable to many modeling tasks in systems biology. PMID:22863152
Diagnosing Hybrid Systems: a Bayesian Model Selection Approach
NASA Technical Reports Server (NTRS)
McIlraith, Sheila A.
2005-01-01
In this paper we examine the problem of monitoring and diagnosing noisy complex dynamical systems that are modeled as hybrid systems: models of continuous behavior, interleaved by discrete transitions. In particular, we examine continuous systems with embedded supervisory controllers that experience abrupt, partial or full failure of component devices. Building on our previous work in this area (MBCG99; MBCG00), our specific focus in this paper is on the mathematical formulation of the hybrid monitoring and diagnosis task as a Bayesian model tracking algorithm. The nonlinear dynamics of many hybrid systems present challenges to probabilistic tracking. Further, probabilistic tracking of a system for the purposes of diagnosis is problematic because the models of the system corresponding to failure modes are numerous and generally very unlikely. To focus tracking on these unlikely models and to reduce the number of potential models under consideration, we exploit logic-based techniques for qualitative model-based diagnosis to conjecture a limited initial set of consistent candidate models. In this paper we discuss alternative tracking techniques that are relevant to different classes of hybrid systems, focusing specifically on a method for tracking multiple models of nonlinear behavior simultaneously using factored sampling and conditional density propagation. To illustrate and motivate the approach described in this paper we examine the problem of monitoring and diagnosing NASA's Sprint AERCam, a small spherical robotic camera unit with 12 thrusters that enable both linear and rotational motion.
A Journal Selection Model and its Implications for a Library System
ERIC Educational Resources Information Center
Kraft, D. H.; Hill, T. W., Jr.
1973-01-01
The problem of selecting which journals to acquire in order to best satisfy library objectives is modeled as a zero-one linear programming problem and examined in detail. The model can be used to aid the librarian in making better selection decisions. (30 references) (Author/KE)
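The zero-one programming formulation can be shown at toy scale. The journal names, usefulness scores, costs, and budget below are invented; for such a small instance, exhaustive search stands in for the linear-programming solver.

```python
from itertools import combinations

# Hypothetical journals: (name, expected usefulness score, subscription cost).
journals = [("A", 90, 400), ("B", 60, 250), ("C", 55, 300),
            ("D", 40, 150), ("E", 20, 100)]
budget = 700

# Zero-one selection by exhaustive search over all subsets within budget;
# a real system would pass the same formulation to an integer-programming solver.
best = max(
    (s for r in range(len(journals) + 1) for s in combinations(journals, r)
     if sum(cost for _, _, cost in s) <= budget),
    key=lambda s: sum(value for _, value, _ in s))
print(sorted(name for name, _, _ in best))  # → ['B', 'C', 'D']
```

Note that the greedy choice (the single highest-value journal, A) is not in the optimal selection: the budget is better spent on three cheaper journals with higher combined value, which is exactly why the selection problem is posed as an optimization rather than a ranking.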
Computational approaches to parameter estimation and model selection in immunology
NASA Astrophysics Data System (ADS)
Baker, C. T. H.; Bocharov, G. A.; Ford, J. M.; Lumb, P. M.; Norton, S. J.; Paul, C. A. H.; Junt, T.; Krebs, P.; Ludewig, B.
2005-12-01
One of the significant challenges in biomathematics (and other areas of science) is to formulate meaningful mathematical models. Our problem is to decide on a parametrized model which is, in some sense, most likely to represent the information in a set of observed data. In this paper, we illustrate the computational implementation of an information-theoretic approach (associated with a maximum likelihood treatment) to modelling in immunology. The approach is illustrated by modelling LCMV infection using a family of models based on systems of ordinary differential and delay differential equations. The models (which use parameters that have a scientific interpretation) are chosen to fit data arising from experimental studies of virus-cytotoxic T lymphocyte kinetics; the parametrized models that result are arranged in a hierarchy by the computation of Akaike indices. The practical illustration is used to convey more general insight. Because the mathematical equations that comprise the models are solved numerically, the accuracy in the computation has a bearing on the outcome, and we address this and other practical details in our discussion.
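The ranking step described above (arranging fitted models in a hierarchy by Akaike indices) can be sketched in a few lines; the log-likelihoods and parameter counts below are invented stand-ins for fitted ODE/DDE models, not values from the paper.

```python
import math

def aic(log_lik, k):
    """Akaike index: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def akaike_weights(aics):
    """Relative plausibility of each model within the candidate set."""
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Invented (maximized log-likelihood, parameter count) pairs for three candidates.
models = {"ODE-2param": (-120.4, 2), "ODE-4param": (-118.9, 4), "DDE-3param": (-117.1, 3)}
aics = {name: aic(ll, k) for name, (ll, k) in models.items()}
ranking = sorted(aics, key=aics.get)              # best (lowest AIC) first
weights = akaike_weights([aics[m] for m in ranking])
```

The Akaike weights turn raw index differences into a normalized measure of support across the hierarchy.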
Beyond the List: Schools Selecting Alternative CSR Models.
ERIC Educational Resources Information Center
Clark, Gail; Apthorp, Helen; Van Buhler, Rebecca; Dean, Ceri; Barley, Zoe
A study was conducted to describe the population of alternative models for comprehensive school reform in the region served by Mid-continent Research for Education and Learning (McREL). The study addressed the questions of whether schools that did not propose to adopt widely known or implemented reform models were able to design a reform process…
Physics-based statistical learning approach to mesoscopic model selection
NASA Astrophysics Data System (ADS)
Taverniers, Søren; Haut, Terry S.; Barros, Kipton; Alexander, Francis J.; Lookman, Turab
2015-11-01
In materials science and many other research areas, models are frequently inferred without considering their generalization to unseen data. We apply statistical learning using cross-validation to obtain an optimally predictive coarse-grained description of a two-dimensional kinetic nearest-neighbor Ising model with Glauber dynamics (GD) based on the stochastic Ginzburg-Landau equation (sGLE). The latter is learned from GD "training" data using a log-likelihood analysis, and its predictive ability at various model complexities is tested on GD "test" data independent of the training data. Using two different error metrics, we perform a detailed analysis of the error between magnetization time trajectories simulated using the learned sGLE coarse-grained description and those obtained using the GD model. We show that both for equilibrium and out-of-equilibrium GD training trajectories, the standard phenomenological description using a quartic free energy does not always yield the most predictive coarse-grained model. Moreover, increasing the amount of training data can shift the optimal model complexity to higher values. Our results are promising in that they pave the way for the use of statistical learning as a general tool for materials modeling and discovery.
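A minimal sketch of the cross-validation idea on an invented toy problem (a constant versus a linear trend) rather than the paper's sGLE/Glauber setup: the preferred model is the one with the lowest error on held-out folds.

```python
import random

# Toy k-fold cross-validation sketch; the synthetic data and candidate
# models (constant vs. linear) are invented for illustration.
random.seed(1)
xs = [i / 10 for i in range(40)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.3) for x in xs]

def fit_linear(x, y):
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return lambda t: a + b * t

def fit_constant(x, y):
    c = sum(y) / len(y)
    return lambda t: c

def cv_error(fit, x, y, k=5):
    """Mean squared prediction error over k held-out folds."""
    err, n = 0.0, len(x)
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        xtr = [xi for i, xi in enumerate(x) if i not in test_idx]
        ytr = [yi for i, yi in enumerate(y) if i not in test_idx]
        model = fit(xtr, ytr)
        err += sum((model(x[i]) - y[i]) ** 2 for i in test_idx)
    return err / n

best = min([fit_constant, fit_linear], key=lambda f: cv_error(f, xs, ys))
```

The key point mirrors the abstract: the model is scored only on data it was not trained on.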
Achieving runtime adaptability through automated model evolution and variant selection
NASA Astrophysics Data System (ADS)
Mosincat, Adina; Binder, Walter; Jazayeri, Mehdi
2014-01-01
Dynamically adaptive systems propose adaptation by means of variants that are specified in the system model at design time and allow for a fixed set of different runtime configurations. However, in a dynamic environment, unanticipated changes may result in the inability of the system to meet its quality requirements. To allow the system to react to these changes, this article proposes a solution for automatically evolving the system model by integrating new variants and periodically validating the existing ones based on updated quality parameters. To illustrate this approach, the article presents a BPEL-based framework using a service composition model to represent the functional requirements of the system. The framework estimates quality of service (QoS) values based on information provided by a monitoring mechanism, ensuring that changes in QoS are reflected in the system model. The article shows how the evolved model can be used at runtime to increase the system's autonomic capabilities and delivered QoS.
Model selection, identification and validation in anaerobic digestion: a review.
Donoso-Bravo, Andres; Mailier, Johan; Martin, Cristina; Rodríguez, Jorge; Aceves-Lara, César Arturo; Vande Wouwer, Alain
2011-11-01
Anaerobic digestion enables waste (water) treatment and energy production in the form of biogas. The successful implementation of this process has led to an increasing interest worldwide. However, anaerobic digestion is a complex biological process, where hundreds of microbial populations are involved, and whose start-up and operation are delicate issues. In order to better understand the process dynamics and to optimize the operating conditions, the availability of dynamic models is of paramount importance. Such models have to be inferred from prior knowledge and experimental data collected from real plants. Modeling and parameter identification are vast subjects, offering a realm of approaches and methods, which can be difficult to fully understand by scientists and engineers dedicated to the plant operation and improvements. This review article discusses existing modeling frameworks and methodologies for parameter estimation and model validation in the field of anaerobic digestion processes. The point of view is pragmatic, intentionally focusing on simple but efficient methods.
Demographic modeling of selected fish species with RAMAS
Saila, S.; Martin, B.; Ferson, S.; Ginzburg, L.; Millstein, J.
1991-03-01
The microcomputer program RAMAS 3, developed for EPRI, has been used to model the intrinsic natural variability of seven important fish species: cod, Atlantic herring, yellowtail flounder, haddock, striped bass, American shad and white perch. Demographic data used to construct age-based population models included information on spawning biology, longevity, sex ratio and (age-specific) mortality and fecundity. These data were collected from published and unpublished sources. The natural risks of extinction and of falling below threshold population abundances (quasi-extinction) are derived for each of the seven fish species based on measured and estimated values for their demographic parameters. The analysis of these species provides evidence that including density-dependent compensation in the demographic model typically lowers the expected chance of extinction. This is because density dependence generally acts as a restoring force: models that include it exhibit less fluctuation than models without compensation, since density-dependent populations experience a pull toward equilibrium. Since extinction probabilities are determined by the size of the fluctuations in population abundance, models without density dependence will show higher risks of extinction, given identical circumstances. Thus, models without compensation can be used as conservative estimators of risk; that is, if a compensation-free model yields acceptable extinction risk, adding compensation will not increase this risk. Since it is usually difficult to estimate the parameters needed for a model with compensation, such conservative estimates of the risks of extinction based on a model without compensation are very useful in the methodology of impact assessment. 103 refs., 19 figs., 10 tabs.
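The quasi-extinction risk estimate can be sketched as a stochastic age-structured projection. This is not RAMAS; the vital rates, threshold, and noise level below are all invented. A compensation-free (density-independent) model like this one yields the conservative risk estimate discussed above.

```python
import random

# Invented three-age-class population with lognormal recruitment noise;
# risk = fraction of Monte Carlo runs that dip below a quasi-extinction threshold.
random.seed(42)

FECUNDITY = [0.0, 1.2, 2.0]   # per-capita offspring by age class (invented)
SURVIVAL = [0.5, 0.7]         # survival from age i to i+1 (invented)
THRESHOLD = 20.0              # quasi-extinction abundance

def project(n0, years=50, cv=0.3):
    """Project one trajectory; True if it falls below the threshold."""
    n = list(n0)
    for _ in range(years):
        noise = random.lognormvariate(0, cv)
        births = noise * sum(f * ni for f, ni in zip(FECUNDITY, n))
        n = [births] + [s * ni for s, ni in zip(SURVIVAL, n[:-1])]
        if sum(n) < THRESHOLD:
            return True
    return False

runs = 2000
risk = sum(project([40.0, 30.0, 20.0]) for _ in range(runs)) / runs
```

With the noise switched off (cv=0) this population grows deterministically, so any estimated risk comes entirely from the modeled variability.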
Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romanach, Stephanie; Watling, James I.; Mazzotti, Frank J.
2017-01-01
The data we used for this study include species occurrence data (n=15 species), climate data and predictions, an expert opinion questionnaire, and species masks that represented the model domain for each species. For this data release, we include the results of the expert opinion questionnaire and the species model domains (or masks). We developed a questionnaire to gather expert opinion on the importance of climate variables in determining a species' geographic range. The species masks, or model domains, were defined separately for each species using a variation of the “target-group” approach (Phillips et al. 2009), where the domain was determined using convex polygons including occurrence data for at least three phylogenetically related and similar species (Watling et al. 2012). The species occurrence data, climate data, and climate predictions are freely available online, and therefore not included in this data release. The species occurrence data were obtained from the online database Global Biodiversity Information Facility (GBIF; http://www.gbif.org/), and from scientific literature (Watling et al. 2011). Climate data were obtained from the WorldClim database (Hijmans et al. 2005) and climate predictions were obtained from the Center for Ocean-Atmosphere Prediction Studies (COAPS) at Florida State University (https://floridaclimateinstitute.org/resources/data-sets/regional-downscaling). See metadata for references.
Cardiac lineage selection: integrating biological complexity into computational models.
Foley, Ann
2009-01-01
The emergence of techniques to study developmental processes using systems biology approaches offers exciting possibilities for the developmental biologist. Cardiac lineage selection may be particularly amenable to these types of studies, since the heart is the first fully functional organ to form in vertebrates. However, there are many technical obstacles that need to be overcome for these studies to proceed. Here we present a brief overview of cardiomyocyte lineage determination and discuss how different aspects of this process either benefit from or present unique challenges for the development of systems biology approaches.
Four states magnetic dots: a design selection by micromagnetic modeling
NASA Astrophysics Data System (ADS)
Louis, D.; Hauet, T.; Petit-Watelot, S.; Lacour, D.; Hehn, M.; Montaigne, F.
2016-10-01
In a context where sub-micrometric magnetic dots are foreseen to play an active role in various new breeds of electronic components such as magnetic memories, magnetic logics or bio-sensors, the use of micromagnetic simulations to optimize their shapes and spatial arrangement with respect to a chosen application has become unavoidable. Prior to experimentally realizing magnetic dots presenting four stable magnetic states (4SMS), we performed a micromagnetic study to select a design providing not only four equivalent magnetic states in a single dot but also exhibiting mostly uniform magnetic states.
Model selection for athermal cross-linked fiber networks.
Shahsavari, A; Picu, R C
2012-07-01
Athermal random fiber networks are usually modeled by representing each fiber as a truss, a Euler-Bernoulli or a Timoshenko beam, and, in the case of cross-linked networks, each cross-link as a pinned, rotating, or welded joint. In this work we study the effect of these various modeling options on the dependence of the overall network stiffness on system parameters. We conclude that Timoshenko beams can be used for the entire range of density and beam stiffness parameters, while the Euler-Bernoulli model can be used only at relatively low network densities. In the high density-high bending stiffness range, strain energy is stored predominantly in the axial and shear deformation modes, while in the other extreme range of parameters, the energy is stored in the bending mode. The effect of the model size on the network stiffness is also discussed.
Back to basics for Bayesian model building in genomic selection.
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2012-07-01
Numerous Bayesian methods of phenotype prediction and genomic breeding value estimation based on multilocus association models have been proposed. Computationally the methods have been based either on Markov chain Monte Carlo or on faster maximum a posteriori estimation. The demand for more accurate and more efficient estimation has led to the rapid emergence of workable methods, unfortunately at the expense of well-defined principles for Bayesian model building. In this article we go back to the basics and build a Bayesian multilocus association model for quantitative and binary traits with carefully defined hierarchical parameterization of Student's t and Laplace priors. In this treatment we consider alternative model structures, using indicator variables and polygenic terms. We make the most of the conjugate analysis, enabled by the hierarchical formulation of the prior densities, by deriving the fully conditional posterior densities of the parameters and using the acquired known distributions in building fast generalized expectation-maximization estimation algorithms.
A Multistage R & D Project Selection Model with Multiple Objectives.
1979-01-01
Each feature of the proposed model is designed to incorporate realistic aspects of the R&D decision-making process, including the decentralized structure of organizational R&D decision making and the corresponding hierarchies.
Mass concentration in a nonlocal model of clonal selection.
Busse, J-E; Gwiazda, P; Marciniak-Czochra, A
2016-10-01
Self-renewal is a constitutive property of stem cells. Testing the cancer stem cell hypothesis requires investigation of the impact of self-renewal on cancer expansion. To better understand this impact, we propose a mathematical model describing the dynamics of a continuum of cell clones structured by the self-renewal potential. The model is an extension of the finite multi-compartment models of interactions between normal and cancer cells in acute leukemias. It takes the form of a system of integro-differential equations with a nonlinear and nonlocal coupling which describes regulatory feedback loops of cell proliferation and differentiation. We show that this coupling leads to mass concentration at points corresponding to the maxima of the self-renewal potential, and the solutions of the model tend asymptotically to Dirac measures multiplied by positive constants. Furthermore, using a Lyapunov function constructed for the finite-dimensional counterpart of the model, we prove that the total mass of the solution converges to a globally stable equilibrium. Additionally, we show stability of the model in the space of positive Radon measures equipped with the flat metric (bounded Lipschitz distance). Analytical results are illustrated by numerical simulations.
Model examination of selective media for isolation of Listeria strains.
Domján Kovács, H; Ralovich, B
1991-01-01
During the Tenth International Symposium on Listeriosis (Pécs, Hungary, 1988) the Working Party on Culture Media of IUMS-ICFMH suggested comparative examination of nine enrichment broths and nine solid selective media. On the basis of this proposal the following media were studied: LiCl-phenylethanol-moxalactam agar (LPM), polymyxin-acriflavine-LiCl-ceftazidime-aesculin-mannitol agar (PALCAM) No. 1 (home made) and No. 2 (Merck), acriflavine-ceftazidime agar (AC), Oxford agar, tripaflavine-nalidixic acid serum agar (TNSA) and Forray's agar. The study was performed as described in "Testing methods for use in quality assurance of culture media". Oxford agar proved to be the best medium. LPM, AC and Forray's agars were somewhat more inhibitory than Oxford medium. In productivity TNSA and PALCAM media were weakest but the latter one was more selective. When 43 sausage samples were enriched in UVM broths and subcultured on the above mentioned media the number of positive samples was the same on Oxford, LPM, AC and TNSA agars but it was lower on PALCAM agar No. 1. When 103 milk samples were subcultured on TNSA and PALCAM agar No. 2, the number of positive samples was the same.
A modelling framework for the analysis of artificial-selection time series.
Le Rouzic, Arnaud; Houle, David; Hansen, Thomas F
2011-04-01
Artificial-selection experiments constitute an important source of empirical information for breeders, geneticists and evolutionary biologists. Selected characters can generally be shifted far from their initial state, sometimes beyond what is usually considered as typical inter-specific divergence. A careful analysis of the data collected during such experiments may thus reveal the dynamical properties of the genetic architecture that underlies the trait under selection. Here, we propose a statistical framework describing the dynamics of selection-response time series. We highlight how both phenomenological models (which do not make assumptions on the nature of genetic phenomena) and mechanistic models (explaining the temporal trends in terms of e.g. mutations, epistasis or canalization) can be used to understand and interpret artificial-selection data. The practical use of the models and their implementation in a software package are demonstrated through the analysis of a selection experiment on the shape of the wing in Drosophila melanogaster.
Sale, Mark; Sherer, Eric A
2015-01-01
The current algorithm for selecting a population pharmacokinetic/pharmacodynamic model is based on the well-established forward addition/backward elimination method. A central strength of this approach is the opportunity for a modeller to continuously examine the data and postulate new hypotheses to explain observed biases. This algorithm has served the modelling community well, but the model selection process has essentially remained unchanged for the last 30 years. During this time, more robust approaches to model selection have been made feasible by new technology and dramatic increases in computation speed. We review these methods, with emphasis on genetic algorithm approaches and discuss the role these methods may play in population pharmacokinetic/pharmacodynamic model selection.
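A genetic-algorithm search over candidate model structures can be sketched as follows. This is a generic toy, not the authors' population pharmacokinetic/pharmacodynamic workflow: binary chromosomes encode which covariates enter the model, and the fitness function is an invented AIC-flavoured score with a known best subset.

```python
import random

# Toy GA for model-structure search; covariate encoding, "true" subset,
# and fitness are all invented for illustration.
random.seed(3)

N_COV, POP, GENS = 6, 30, 80
TRUE = (1, 0, 1, 0, 0, 1)        # invented best covariate subset

def fitness(chrom):
    # Heavily penalize omitting a needed covariate; lightly penalize
    # every included parameter (AIC-style complexity cost).
    missed = sum(t and not c for t, c in zip(TRUE, chrom))
    return -(10 * missed + 2 * sum(chrom))

def evolve():
    pop = [tuple(random.randint(0, 1) for _ in range(N_COV)) for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:POP // 2]                 # elitist selection
        children = []
        while len(children) < POP - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_COV)       # one-point crossover
            child = list(a[:cut] + b[cut:])
            if random.random() < 0.3:              # mutation: flip one bit
                i = random.randrange(N_COV)
                child[i] ^= 1
            children.append(tuple(child))
        pop = survivors + children
    return max(pop, key=fitness)

best_model = evolve()
```

Unlike forward addition/backward elimination, the population explores many covariate combinations in parallel rather than one change at a time.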
Model validity and frequency band selection in operational modal analysis
NASA Astrophysics Data System (ADS)
Au, Siu-Kui
2016-12-01
Experimental modal analysis aims at identifying the modal properties (e.g., natural frequencies, damping ratios, mode shapes) of a structure using vibration measurements. Two basic questions are encountered when operating in the frequency domain: Is there a mode near a particular frequency? If so, how much spectral data near the frequency can be included for modal identification without incurring significant modeling error? For data with high signal-to-noise (s/n) ratios these questions can be addressed using empirical tools such as the singular value spectrum. Otherwise they are generally open and can be challenging, e.g., for modes with low s/n ratios or close modes. In this work these questions are addressed using a Bayesian approach. The focus is on operational modal analysis, i.e., with 'output-only' ambient data, where identification uncertainty and modeling error can be significant and their control is most demanding. The approach leads to 'evidence ratios' quantifying the relative plausibility of competing sets of modeling assumptions. The latter involves modeling the 'what-if-not' situation, which is non-trivial but is resolved by systematic consideration of alternative models and using the maximum entropy principle. Synthetic and field data are considered to investigate the behavior of evidence ratios and how they should be interpreted in practical applications.
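A rough sketch of the evidence-ratio idea, using the BIC (Laplace) approximation to the model evidence, log p(D|M) ≈ log L_max − (k/2) log n. This is a generic stand-in, not the paper's full Bayesian treatment, and all numbers are invented.

```python
import math

def log_evidence(log_lik_max, k, n):
    """BIC-style approximation to the log model evidence."""
    return log_lik_max - 0.5 * k * math.log(n)

def evidence_ratio(m1, m2, n):
    """Relative plausibility of model 1 vs. model 2 given n data points."""
    return math.exp(log_evidence(*m1, n) - log_evidence(*m2, n))

# Invented numbers: 'mode present' (3 params) vs. 'noise only' (1 param),
# each given as (maximized log-likelihood, parameter count), n = 500 samples.
ratio = evidence_ratio((-1200.0, 3), (-1215.0, 1), 500)
```

A ratio far above one favors the richer model despite its complexity penalty, which is the sense in which evidence ratios answer "is there a mode here?".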
Turbulence Model Selection for Low Reynolds Number Flows
2016-01-01
One of the major flow phenomena associated with low Reynolds number flow is the formation of separation bubbles on an airfoil’s surface. The NACA4415 airfoil is commonly used in wind turbines and UAV applications. Its stall characteristics are gradual compared to those of thin airfoils. The primary criterion set for this work is the capture of the laminar separation bubble. Flow is simulated for a Reynolds number of 120,000. The numerical analysis carried out shows the advantages and disadvantages of a few turbulence models. The turbulence models tested were: the one-equation Spalart-Allmaras (S-A), the two-equation SST K-ω, the three-equation Intermittency (γ) SST, k-kl-ω and finally, the four-equation transition γ-Reθ SST. However, the flow physics predicted differs between these turbulence models. The procedure to establish the accuracy of the simulation, in accord with previous experimental results, has been discussed in detail. PMID:27104354
Catalog of selected heavy duty transport energy management models
NASA Technical Reports Server (NTRS)
Colello, R. G.; Boghani, A. B.; Gardella, N. C.; Gott, P. G.; Lee, W. D.; Pollak, E. C.; Teagan, W. P.; Thomas, R. G.; Snyder, C. M.; Wilson, R. P., Jr.
1983-01-01
A catalog of energy management models for heavy duty transport systems powered by diesel engines is presented. The catalog results from a literature survey, supplemented by telephone interviews and mailed questionnaires to discover the major computer models currently used in the transportation industry in the following categories: heavy duty transport systems, which consist of highway (vehicle simulation), marine (ship simulation), rail (locomotive simulation), and pipeline (pumping station simulation); and heavy duty diesel engines, which involve models that match the intake/exhaust system to the engine, fuel efficiency, emissions, combustion chamber shape, fuel injection system, heat transfer, intake/exhaust system, operating performance, and waste heat utilization devices, i.e., turbocharger, bottoming cycle.
Optimization of the selective frequency damping parameters using model reduction
NASA Astrophysics Data System (ADS)
Cunha, Guilherme; Passaggia, Pierre-Yves; Lazareff, Marc
2015-09-01
In the present work, an optimization methodology to compute the best control parameters, χ and Δ, for the selective frequency damping method is presented. The optimization does not presuppose any a priori knowledge of the flow physics, nor of the underlying numerical methods, and is especially suited for simulations requiring large numbers of grid elements and processors. It allows for obtaining an optimal convergence rate to a steady state of the damped Navier-Stokes system. This is achieved using the Dynamic Mode Decomposition, which is a snapshot-based method, to estimate the eigenvalues associated with global unstable dynamics. Validation test cases are presented for the numerical configurations of a laminar flow past a 2D cylinder, a separated boundary layer over a shallow bump, and a 3D turbulent stratified-Poiseuille flow.
Keith, Scott W; Allison, David B
2014-09-29
This paper details the design, evaluation, and implementation of a framework for detecting and modeling nonlinearity between a binary outcome and a continuous predictor variable adjusted for covariates in complex samples. The framework provides familiar-looking parameterizations of output in terms of linear slope coefficients and odds ratios. Estimation methods focus on maximum likelihood optimization of piecewise linear free-knot splines formulated as B-splines. Correctly specifying the optimal number and positions of the knots improves the model, but is marked by computational intensity and numerical instability. Our inference methods utilize both parametric and nonparametric bootstrapping. Unlike other nonlinear modeling packages, this framework is designed to incorporate multistage survey sample designs common to nationally representative datasets. We illustrate the approach and evaluate its performance in specifying the correct number of knots under various conditions with an example using body mass index (BMI; kg/m²) and the complex multi-stage sampling design from the Third National Health and Nutrition Examination Survey to simulate binary mortality outcomes data having realistic nonlinear sample-weighted risk associations with BMI. BMI and mortality data provide a particularly apt example and area of application since BMI is commonly recorded in large health surveys with complex designs, often categorized for modeling, and nonlinearly related to mortality. When complex sample design considerations were ignored, our method was generally similar to or more accurate than two common model selection procedures, Schwarz's Bayesian Information Criterion (BIC) and Akaike's Information Criterion (AIC), in terms of selecting the correct number of knots. Our approach provided accurate knot selections when complex sampling weights were incorporated, while AIC and BIC were not effective under these conditions.
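The AIC/BIC comparison for knot selection can be sketched with invented residual sums of squares: for Gaussian errors, −2 log L equals n log(RSS/n) up to a constant, so each criterion trades fit against its own complexity penalty.

```python
import math

def pick_knots(rss_by_knots, n, penalty):
    """Choose the knot count minimizing a penalized Gaussian deviance."""
    scores = {}
    for knots, rss in rss_by_knots.items():
        k = 2 + knots                       # intercept + slope + one per knot
        scores[knots] = n * math.log(rss / n) + penalty(k, n)
    return min(scores, key=scores.get)

aic_pen = lambda k, n: 2 * k
bic_pen = lambda k, n: k * math.log(n)

# Invented RSS values showing diminishing returns as knots are added.
rss = {0: 400.0, 1: 220.0, 2: 200.0, 3: 193.0}
n = 100
aic_choice = pick_knots(rss, n, aic_pen)
bic_choice = pick_knots(rss, n, bic_pen)
```

With these invented numbers AIC accepts one more knot than BIC, reflecting BIC's heavier log(n) penalty per parameter.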
Sexual selection under parental choice: a revision to the model.
Apostolou, Menelaos
2014-06-01
Across human cultures, parents exercise considerable influence over their children's mate choices. The model of parental choice provides a good account of these patterns, but its prediction that male parents exercise more control than female ones is not well founded in evolutionary theory. To address this shortcoming, the present article proposes a revision to the model. In particular, paternity uncertainty, residual reproductive value, reproductive variance, asymmetry in the control of resources, physical strength, and access to weaponry make control over mating more profitable for male parents than female ones; in turn, this produces an asymmetrical incentive for controlling mate choice. Several implications of this formulation are also explored.
The Applicability of Selected Evaluation Models to Evolving Investigative Designs.
ERIC Educational Resources Information Center
Smith, Nick L.; Hauer, Diane M.
1990-01-01
Ten evaluation models are examined in terms of their applicability to investigative, emergent design programs: Stake's portrayal, Wolf's adversary, Patton's utilization, Guba's investigative journalism, Scriven's goal-free, Scriven's modus operandi, Eisner's connoisseurial, Stufflebeam's CIPP, Tyler's objective based, and Levin's cost…
Thermal performance curves of Paramecium caudatum: a model selection approach.
Krenek, Sascha; Berendonk, Thomas U; Petzoldt, Thomas
2011-05-01
The ongoing climate change has motivated numerous studies investigating the temperature response of various organisms, especially that of ectotherms. To correctly describe the thermal performance of these organisms, functions are needed which sufficiently fit the complete optimum curve. Surprisingly, model comparisons for the temperature dependence of population growth rates of an important ectothermic group, the protozoa, are still missing. In this study, temperature reaction norms of natural isolates of the freshwater protist Paramecium caudatum were investigated, considering nearly the entire temperature range. These reaction norms were used to estimate thermal performance curves by applying a set of commonly used model functions. An information theory approach was used to compare models and to identify the best ones for describing these data. Our results indicate that models which can describe negative growth at the high- and low-temperature branches of an optimum curve are preferable. This is a prerequisite for accurately calculating the critical upper and lower thermal limits. While we detected a temperature optimum of around 29 °C for all investigated clonal strains, the critical thermal limits were considerably different between individual clones. Here, the tropical clone showed the narrowest thermal tolerance, with a shift of its critical thermal limits to higher temperatures.
SUPERCRITICAL WATER OXIDATION MODEL DEVELOPMENT FOR SELECTED EPA PRIORITY POLLUTANTS
Supercritical Water Oxidation (SCWO) was evaluated for five compounds: acetic acid, 2,4-dichlorophenol, pentachlorophenol, pyridine, and 2,4-dichlorophenoxyacetic acid (methyl ester). Kinetic models were developed for acetic acid, 2,4-dichlorophenol, and pyridine. The test compounds were e...
A Data Envelopment Analysis Model for Renewable Energy Technology Selection
Technology Transfer Automated Retrieval System (TEKTRAN)
Public and media interest in alternative energy sources, such as renewable fuels, has rapidly increased in recent years due to higher prices for oil and natural gas. However, the current body of research providing comparative decision making models that either rank these alternative energy sources a...
Factor selection and structural identification in the interaction ANOVA model.
Post, Justin B; Bondell, Howard D
2013-03-01
When faced with categorical predictors and a continuous response, the objective of an analysis often consists of two tasks: finding which factors are important and determining which levels of the factors differ significantly from one another. Often, these tasks are done separately using Analysis of Variance (ANOVA) followed by a post hoc hypothesis testing procedure such as Tukey's Honestly Significant Difference test. When interactions between factors are included in the model, the collapsing of levels of a factor becomes a more difficult problem. When testing for differences between two levels of a factor, claiming no difference would refer not only to equality of main effects, but also to equality of each interaction involving those levels. This structure between the main effects and interactions in a model is similar to the idea of heredity used in regression models. This article introduces a new method for accomplishing both of the common analysis tasks simultaneously in an interaction model while also adhering to the heredity-type constraint on the model. An appropriate penalization is constructed that encourages levels of factors to collapse and entire factors to be set to zero. It is shown that the procedure has the oracle property, implying that asymptotically it performs as well as if the exact structure were known beforehand. We also discuss the application to estimating interactions in the unreplicated case. Simulation studies show the procedure outperforms post hoc hypothesis testing procedures as well as similar methods that do not include a structural constraint. The method is also illustrated using a real data example.
Decomposition and model selection for large contingency tables.
Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter
2010-04-01
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.
Response to selection in finite locus models with non-additive effects.
Esfandyari, Hadi; Henryon, Mark; Berg, Peer; Thomasen, Jorn Rind; Bijma, Piter; Sørensen, Anders Christian
2017-01-12
Under the finite-locus model in the absence of mutation, the additive genetic variation is expected to decrease when directional selection is acting on a population, according to quantitative-genetic theory. However, some theoretical studies of selection suggest that the level of additive variance can be sustained or even increased when non-additive genetic effects are present. We tested the hypothesis that finite-locus models with both additive and non-additive genetic effects maintain more additive genetic variance (V_A) and realize larger medium-to-long term genetic gains than models with only additive effects when the trait under selection is subject to truncation selection. Four genetic models that included additive, dominance, and additive-by-additive epistatic effects were simulated. The simulated genome for individuals consisted of 25 chromosomes, each with a length of 1 M. One hundred bi-allelic QTL, four on each chromosome, were considered. In each generation, 100 sires and 100 dams were mated, producing five progeny per mating. The population was selected for a single trait (h(2)=0.1) for 100 discrete generations with selection on phenotype or BLUP-EBV. V_A decreased with directional truncation selection, even in the presence of non-additive genetic effects. Non-additive effects influenced the long-term response to selection, and among the genetic models, additive gene action had the highest response to selection. In addition, in all genetic models, BLUP-EBV resulted in a greater fixation of favourable and unfavourable alleles and a higher response than phenotypic selection. In conclusion, for the schemes we simulated, the presence of non-additive genetic effects had little effect on changes in additive variance, and V_A decreased under directional selection.
Sparse model selection in the highly under-sampled regime
NASA Astrophysics Data System (ADS)
Bulso, Nicola; Marsili, Matteo; Roudi, Yasser
2016-09-01
We propose a method for recovering the structure of a sparse undirected graphical model when very few samples are available. The method decides on the presence or absence of bonds between pairs of variables by considering one pair at a time and using a closed-form formula, analytically derived by calculating the posterior probability for every possible model explaining a two-body system using Jeffreys prior. The approach does not rely on the optimization of any cost function and consequently is much faster than existing algorithms. Despite this time and computational advantage, numerical results show that for several sparse topologies the algorithm is comparable to the best existing algorithms, and is more accurate in the presence of hidden variables. We apply this approach to the analysis of US stock market data and to neural data, in order to show its efficiency in recovering robust statistical dependencies in real data with non-stationary correlations in time and/or space.
Behavior changes in SIS STD models with selective mixing
Hyman, J.M.; Li, J.
1997-08-01
The authors propose and analyze a heterogeneous, multigroup, susceptible-infective-susceptible (SIS) sexually transmitted disease (STD) model where the desirability and acceptability in partnership formations are functions of the infected individuals. They derive explicit formulas for the epidemic thresholds, prove the existence and uniqueness of the equilibrium states for the two-group model and provide a complete analysis of their local and global stability. The authors then investigate the effects of behavior changes on the transmission dynamics and analyze the sensitivity of the epidemic to the magnitude of the behavior changes. They verify that if people modify their behavior to reduce the probability of infection with individuals in highly infected groups, through either reduced contacts, reduced partner formations, or using safe sex, the infection level may be decreased. However, if people continue to have intragroup and intergroup partnerships, then changing the desirability and acceptability formation cannot eradicate the epidemic once it exceeds the epidemic threshold.
Model Averaging and Dimension Selection for the Singular Value Decomposition
2006-01-10
the analysis of relational data (Harshman et al., 1982), biplots (Gabriel 1971, Gower and Hand 1996) and in reduced-rank interaction models for...numbers of random matrices,” SIAM J. Matrix Anal. Appl., 9, 543–560. Gabriel, K. R. (1971), “The biplot graphic display of matrices with application to...and Hand, D. J. (1996), Biplots , vol. 54 of Monographs on Statistics and Applied Probability, Chapman and Hall Ltd., London. Green, P. J. (2003
Buy or Lease Cost Model - Selected Railway Equipment.
1981-04-01
applications; specifically, there is no adequate buy or lease cost model applicable to railway car acquisition for the Defense Freight Railway Interchange...it would be less costly to lease cars rather than buy them. Lease/ buy calculations are labor intensive, and do not easily permit the sensitivity...corpora- tion using railway tank cars observes that their calculations always favor buy over lease options. To compensate for non-quantitative costs and
Model-Based Sensor Selection for Helicopter Gearbox Monitoring
1996-04-01
fault diagnosis of helicopter gearboxes is therefore necessary to prevent major breakdowns due to progression of undetected...in the gearbox . Once the presence of a fault is prompted by the fault detection network, fault diagnosis is performed by the Structure-Based...Components Figure 3: Overview of fault detection and diagnosis in the proposed model-based di- agnostic system for helicopter gearboxes . the OH-58A gearbox
CNTRICS final animal model task selection: Control of attention
Lustig, C.; Kozak, R.; Sarter, M.; Young, J.W.; Robbins, T.W.
2012-01-01
Schizophrenia is associated with impaired attention. The top-down control of attention, defined as the ability to guide and refocus attention in accordance with internal goals and representations, was identified by the Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia (CNTRICS) initiative as an important construct for task development and research. A recent CNTRICS meeting identified three tasks commonly used with rodent models as having high construct validity and promise for further development: The 5-choice serial reaction time task, the 5-choice continuous performance task, and the distractor condition sustained attention task. Here we describe their current status, including data on their neural substrates, evidence for sensitivity to neuropharmacological manipulations and genetic influences, and data from animal models of the cognitive deficits of schizophrenia. A common strength is the development of parallel human tasks to facilitate connections to the neural circuitry and drug development research done in these animal models. We conclude with recommendations for the steps needed to improve testing so that it better represents the complex biological and behavioral picture presented by schizophrenia. PMID:22683929
The selection of the starting field: General versus tailored model
NASA Astrophysics Data System (ADS)
Lerch, F. S.; Wagner, C. A.; Colombo, O. L.; Klosko, S. M.; Williamson, R. G.
1985-04-01
The E-mats (normal equations) were solved for both the tailor-made a priori model and the general a priori model. Errors in the solutions of the 21 geopotential coefficients were plotted to compare the two methods. An ideal Topex accuracy goal of 1/4 the errors in the GEM-9 model (1/4 GEM-9 sigmas) was also plotted as a significance level against which to compare the differences between the two solutions. Both simulation cases, with and without noise on the data, were plotted. The following additional information was plotted in Figure 1: (1) the general a priori starting values (GEM9 + or - 3 sigma), (2) the standard deviations (error estimates) of the recovered coefficients for the case where noise was applied to the data, and (3) the Topex accuracy goal of 1/4 GEM-9 error sigmas for comparison. A log scale was used since over 6 orders of magnitude are seen in the plots.
Compromise Approach-Based Genetic Algorithm for Constrained Multiobjective Portfolio Selection Model
NASA Astrophysics Data System (ADS)
Li, Jun
In this paper, fuzzy set theory is incorporated into a multiobjective portfolio selection model for investors, taking into account three criteria: return, risk, and liquidity. The cardinality constraint, the buy-in threshold constraint, and the round-lot constraints are considered in the proposed model. To overcome the difficulty of evaluating a large set of efficient solutions and selecting the best one on the non-dominated surface, a compromise approach-based genetic algorithm is presented to obtain a compromise solution for the proposed constrained multiobjective portfolio selection model.
Selective Cooperation in Early Childhood – How to Choose Models and Partners
Hermes, Jonas; Behne, Tanya; Studte, Kristin; Zeyen, Anna-Maria; Gräfenhain, Maria; Rakoczy, Hannes
2016-01-01
Cooperation is essential for human society, and children engage in cooperation from early on. It is unclear, however, how children select their partners for cooperation. We know that children choose selectively whom to learn from (e.g. preferring reliable over unreliable models) on a rational basis. The present study investigated whether children (and adults) also choose their cooperative partners selectively and what model characteristics they regard as important for cooperative partners and for informants about novel words. Three- and four-year-old children (N = 64) and adults (N = 14) saw contrasting pairs of models differing either in physical strength or in accuracy (in labeling known objects). Participants then performed different tasks (cooperative problem solving and word learning) requiring the choice of a partner or informant. Both children and adults chose their cooperative partners selectively. Moreover, they showed the same pattern of selective model choice, regarding a wide range of model characteristics as important for cooperation (preferring both the strong and the accurate model for a strength-requiring cooperation task), but only prior knowledge as important for word learning (preferring the knowledgeable but not the strong model for word learning tasks). Young children's selective model choice thus reveals an early rational competence: They infer characteristics from past behavior and flexibly consider what characteristics are relevant for certain tasks. PMID:27505043
Estimates of live-tree carbon stores in the Pacific Northwest are sensitive to model selection
2011-01-01
Background Estimates of live-tree carbon stores are influenced by numerous uncertainties. One of them is model-selection uncertainty: one has to choose among multiple empirical equations and conversion factors that can be plausibly justified as locally applicable to calculate the carbon store from inventory measurements such as tree height and diameter at breast height (DBH). Here we quantify the model-selection uncertainty for the five most numerous tree species in six counties of northwest Oregon, USA. Results The results of our study demonstrate that model-selection error may introduce 20 to 40% uncertainty into a live-tree carbon estimate, possibly making this form of error the largest source of uncertainty in estimation of live-tree carbon stores. The effect of model selection could be even greater if models are applied beyond the height and DBH ranges for which they were developed. Conclusions Model-selection uncertainty is potentially large enough that it could limit the ability to track forest carbon with the precision and accuracy required by carbon accounting protocols. Without local validation based on detailed measurements of usually destructively sampled trees, it is very difficult to choose the best model when there are several available. Our analysis suggests that considering tree form in equation selection may better match trees to existing equations and that substantial gaps exist, in terms of both species and diameter ranges, that are ripe for new model-building effort. PMID:21477353
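The kind of model-selection spread described above is easy to illustrate: apply several plausible equations to the same tree and compare. The power-law form is a common allometric convention, but all coefficients below are hypothetical placeholders, not the regional equations evaluated in the study:

```python
# Hypothetical allometric equations of the common form biomass = a * DBH^b.
# The coefficients are illustrative placeholders, not fitted values from the study.
EQUATIONS = {
    "eq_A": (0.10, 2.40),
    "eq_B": (0.15, 2.30),
    "eq_C": (0.08, 2.50),
}

CARBON_FRACTION = 0.5  # common convention: carbon is ~50% of dry biomass


def carbon_estimates(dbh_cm):
    """Carbon store (kg) for one tree, under each candidate equation."""
    return {name: CARBON_FRACTION * a * dbh_cm ** b
            for name, (a, b) in EQUATIONS.items()}


def selection_spread(dbh_cm):
    """Range of the estimates as a percentage of their mean: a simple
    measure of how much the choice of equation alone moves the answer."""
    est = list(carbon_estimates(dbh_cm).values())
    return 100.0 * (max(est) - min(est)) / (sum(est) / len(est))
```

Even these toy coefficients disagree by roughly 15% for a 40 cm DBH tree; with real regional equations the study reports 20 to 40% spreads.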
Schmidtmann, I; Elsäßer, A; Weinmann, A; Binder, H
2014-12-30
For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivated by a clinical cancer registry application, where complex event patterns have to be dealt with and variable selection is needed at the same time, we propose a general approach for linking variable selection between several Cox models. Specifically, we combine score statistics for each covariate across models by Fisher's method as a basis for variable selection. This principle is implemented for a stepwise forward selection approach as well as for a regularized regression technique. In an application to data from hepatocellular carcinoma patients, the coupled stepwise approach is seen to facilitate joint interpretation of the different cause-specific Cox models. In conditional survival models at landmark times, which address updates of prediction as time progresses and both treatment and other potential explanatory variables may change, the coupled regularized regression approach identifies potentially important, stably selected covariates together with their effect time pattern, despite having only a small number of events. These results highlight the promise of the proposed approach for coupling variable selection between Cox models, which is particularly relevant for modeling for clinical cancer registries with their complex event patterns.
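Fisher's method, the coupling principle described above, has a compact form. A minimal sketch in Python (the p-values are invented for illustration; the paper combines score statistics across cause-specific or landmark Cox models):

```python
import math


def fisher_combined(p_values):
    """Fisher's method: combine k independent p-values into one
    chi-square statistic with 2k degrees of freedom."""
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return stat, df


# Illustrative: score-test p-values for one covariate in three
# cause-specific Cox models (hypothetical numbers).
stat, df = fisher_combined([0.04, 0.20, 0.11])
```

The combined statistic is referred to a chi-square distribution with df degrees of freedom; uniformly small per-model p-values reinforce one another, which is what makes the statistic usable as a cross-model variable-selection score.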
Estimating the predictive quality of dose-response after model selection.
Hu, Chuanpu; Dong, Yingwen
2007-07-20
Prediction of dose-response is important in dose selection in drug development. As the true dose-response shape is generally unknown, model selection is frequently used, and predictions are based on the final selected model. Correctly assessing the quality of the predictions requires accounting for the uncertainties caused by the model selection process, which has been difficult. Recently, a new approach called data perturbation has emerged. It allows important predictive characteristics to be computed while taking model selection into consideration. We study, through simulation, the performance of data perturbation in estimating standard errors of parameter estimates and prediction errors. Data perturbation was found to give excellent prediction error estimates, although at times large Monte Carlo sizes were needed to obtain good standard error estimates. Overall, it is a useful tool to characterize uncertainties in dose-response predictions, with the potential of allowing more accurate dose selection in drug development. We also look at the influence of model selection on estimation bias. This leads to insights into candidate model choices that enable good dose-response prediction.
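The data-perturbation idea, rerunning the entire select-then-fit pipeline on noise-perturbed responses and reading prediction uncertainty off the replicates, can be pictured with a toy sketch. The candidate models (constant vs. linear), doses, responses, and noise level below are invented for illustration, not the paper's simulation settings:

```python
import math
import random

DOSES = [0.0, 1.0, 2.0, 4.0, 8.0]
RESP = [0.1, 0.9, 2.1, 4.2, 7.8]  # hypothetical observed responses


def fit_linear(x, y):
    """Ordinary least-squares intercept and slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b


def rss_constant(y):
    m = sum(y) / len(y)
    return sum((yi - m) ** 2 for yi in y)


def rss_linear(x, y):
    a, b = fit_linear(x, y)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def aic(rss, n, k):
    """Gaussian-error AIC up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def select_and_predict(x, y):
    """Full pipeline: AIC-select constant vs. linear, predict at top dose."""
    n = len(y)
    if aic(rss_linear(x, y), n, 2) < aic(rss_constant(y), n, 1):
        a, b = fit_linear(x, y)
        return "linear", a + b * x[-1]
    return "constant", sum(y) / n


rng = random.Random(0)
# Each replicate perturbs the responses, then reruns selection AND fitting.
preds = [select_and_predict(DOSES,
                            [yi + rng.gauss(0.0, 0.3) for yi in RESP])[1]
         for _ in range(200)]
```

The spread of `preds` across replicates estimates prediction uncertainty inclusive of the selection step, which a single post-selection fit would understate.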
Rodrigue, Nicolas; Philippe, Hervé; Lartillot, Nicolas
2010-01-01
Modeling the interplay between mutation and selection at the molecular level is key to evolutionary studies. To this end, codon-based evolutionary models have been proposed as pertinent means of studying long-range evolutionary patterns and are widely used. However, these approaches have not yet consolidated results from amino acid level phylogenetic studies showing that selection acting on proteins displays strong site-specific effects, which translate into heterogeneous amino acid propensities across the columns of alignments; related codon-level studies have instead focused on either modeling a single selective context for all codon columns, or a separate selective context for each codon column, with the former strategy deemed too simplistic and the latter deemed overparameterized. Here, we integrate recent developments in nonparametric statistical approaches to propose a probabilistic model that accounts for the heterogeneity of amino acid fitness profiles across the coding positions of a gene. We apply the model to a dozen real protein-coding gene alignments and find it to produce biologically plausible inferences, for instance, as pertaining to site-specific amino acid constraints, as well as distributions of scaled selection coefficients. In their account of mutational features as well as the heterogeneous regimes of selection at the amino acid level, the modeling approaches studied here can form a backdrop for several extensions, accounting for other selective features, for variable population size, or for subtleties of mutational features, all with parameterizations couched within population-genetic theory. PMID:20176949
Lessons for neurotoxicology from selected model compounds: SGOMSEC joint report.
Rice, D C; Evangelista de Duffard, A M; Duffard, R; Iregren, A; Satoh, H; Watanabe, C
1996-01-01
The ability to identify potential neurotoxicants depends upon the characteristics of our test instruments. The neurotoxic properties of lead, methylmercury, polychlorinated biphenyls, and organic solvents would all have been detected at some dose level by tests in current use, provided that the doses were high enough and administered at an appropriate time such as during gestation. The adequacy of animal studies, particularly rodent studies, to predict intake levels at which human health can be protected is disappointing, however. It is unlikely that the use of advanced behavioral methodology would alleviate the apparent lack of sensitivity of the rodent model for many agents. PMID:8860323
Causal Inference and Model Selection in Complex Settings
NASA Astrophysics Data System (ADS)
Zhao, Shandong
Propensity score methods have become part of the standard toolkit for applied researchers who wish to ascertain causal effects from observational data. While they were originally developed for binary treatments, several researchers have proposed generalizations of the propensity score methodology for non-binary treatment regimes. In this article, we first review three main methods that generalize propensity scores in this direction, namely, inverse propensity weighting (IPW), the propensity function (P-FUNCTION), and the generalized propensity score (GPS), along with recent extensions of the GPS that aim to improve its robustness. We compare the assumptions, theoretical properties, and empirical performance of these methods. We propose three new methods that provide robust causal estimation based on the P-FUNCTION and GPS. While our proposed P-FUNCTION-based estimator performs well, we generally advise caution in that all available methods can be biased by model misspecification and extrapolation. In a related line of research, we consider adjustment for posttreatment covariates in causal inference. Even in a randomized experiment, observations might have different compliance performance under treatment and control assignment. This posttreatment covariate cannot be adjusted for using standard statistical methods. We review the principal stratification framework, which allows for modeling this effect as part of its Bayesian hierarchical models. We generalize the current model to add the possibility of adjusting for pretreatment covariates. We also propose a new estimator of the average treatment effect over the entire population. In a third line of research, we discuss the spectral line detection problem in high energy astrophysics. We carefully review how this problem can be statistically formulated as a precise hypothesis test with a point null hypothesis, why a usual likelihood ratio test does not apply for a problem of this nature, and a doable fix to correctly
SELECTION OF CANDIDATE EUTROPHICATION MODELS FOR TOTAL MAXIMUM DAILY LOADS ANALYSES
A tiered approach was developed to evaluate candidate eutrophication models to select a common suite of models that could be used for Total Maximum Daily Loads (TMDL) analyses in estuaries, rivers, and lakes/reservoirs. Consideration for linkage to watershed models and ecologica...
An exponential ESS model and its application to frequency-dependent selection.
Li, J; Liu, L
1989-10-01
A nonlinear ESS model is put forward, that is, a nonnegative exponential ESS model. For a simple case, we discuss the existence, uniqueness, and stability of an ESS. As an application of the model, we give a quantitative analysis of frequency-dependent selection in population genetics when the rare type has an advantage.
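The authors' nonnegative exponential ESS model is not reproduced here, but the basic mechanics of frequency-dependent selection converging to an ESS can be illustrated with the textbook hawk-dove replicator iteration (payoff values and the background fitness term are arbitrary choices):

```python
# Hawk-dove game: V = benefit of the resource, C = cost of an escalated
# fight. For V < C, the mixed ESS is a hawk frequency of p* = V / C.
V, C = 2.0, 4.0
W0 = 5.0  # background fitness, kept large so all fitnesses stay positive


def step(p):
    """One generation of discrete replicator dynamics for hawk frequency p.

    Payoffs: hawk vs hawk (V-C)/2, hawk vs dove V,
             dove vs hawk 0,       dove vs dove V/2.
    """
    f_hawk = W0 + p * (V - C) / 2.0 + (1.0 - p) * V
    f_dove = W0 + (1.0 - p) * V / 2.0
    mean_f = p * f_hawk + (1.0 - p) * f_dove
    return p * f_hawk / mean_f


p = 0.1
for _ in range(500):
    p = step(p)
```

Starting from 10% hawks, the iteration settles at the mixed ESS p* = V/C = 0.5, where hawk and dove fitnesses are equal; the selective advantage of each strategy depends on its current frequency, which is the defining feature of frequency-dependent selection.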
An Evaluation Model To Select an Integrated Learning System in a Large, Suburban School District.
ERIC Educational Resources Information Center
Curlette, William L.; And Others
The systematic evaluation process used in Georgia's DeKalb County School System to purchase comprehensive instructional software--an integrated learning system (ILS)--is described, and the decision-making model for selection is presented. Selection and implementation of an ILS were part of an instructional technology plan for the DeKalb schools…
An Item Response Model for Nominal Data Based on the Rising Selection Ratios Criterion
ERIC Educational Resources Information Center
Revuelta, Javier
2005-01-01
Complete response vectors of all answer options in multiple-choice items can be used to estimate ability. The rising selection ratios criterion is necessary for scoring individuals because it implies that estimated ability always increases when the correct alternative is selected. This paper introduces the generalized DLT model, which assumes…
Augmented Self-Modeling as a Treatment for Children with Selective Mutism.
ERIC Educational Resources Information Center
Kehle, Thomas J.; Madaus, Melissa R.; Baratta, Victoria S.; Bray, Melissa A.
1998-01-01
Describes the treatment of three children experiencing selective mutism. The procedure utilized incorporated self-modeling, mystery motivators, self-reinforcement, stimulus fading, spacing, and antidepressant medication. All three children evidenced a complete cessation of selective mutism and maintained their treatment gains at follow-up.…
A Comparative Study of Item Selection Methods Utilizing Latent Trait Theoretic Models and Concepts.
Latent trait theory provides the practitioner interested in developing tests with a far more powerful method of item selection than that provided by...follows: 1) Provide some background on information curves for items and tests; 2) Develop several item selection methods and using a typical item pool...into three ability categories. Keywords: Latent trait theory, Psychological tests, Mathematical models, Aptitude tests. (SDW)
2011-01-01
We propose several models applicable to both selection and election processes when each selecting or electing subject has access to different information about the objects to choose from. We wrote special software to simulate these processes. We consider both the cases when the environment is neutral (natural process) as well as when the environment is involved (controlled process). PMID:21892959
NASA Astrophysics Data System (ADS)
Creaco, E.; Berardi, L.; Sun, Siao; Giustolisi, O.; Savic, D.
2016-04-01
The growing availability of field data, from information and communication technologies (ICTs) in "smart" urban infrastructures, allows data modeling to understand complex phenomena and to support management decisions. Among the analyzed phenomena, those related to storm water quality modeling have recently been gaining interest in the scientific literature. Nonetheless, the large amount of available data poses the problem of selecting relevant variables to describe a phenomenon and enable robust data modeling. This paper presents a procedure for the selection of relevant input variables using the multiobjective evolutionary polynomial regression (EPR-MOGA) paradigm. The procedure is based on scrutinizing the explanatory variables that appear inside the set of EPR-MOGA symbolic model expressions of increasing complexity and goodness of fit to target output. The strategy also enables the selection to be validated by engineering judgement. In such context, the multiple case study extension of EPR-MOGA, called MCS-EPR-MOGA, is adopted. The application of the proposed procedure to modeling storm water quality parameters in two French catchments shows that it was able to significantly reduce the number of explanatory variables for successive analyses. Finally, the EPR-MOGA models obtained after the input selection are compared with those obtained by using the same technique without benefitting from input selection and with those obtained in previous works where other data-modeling techniques were used on the same data. The comparison highlights the effectiveness of both EPR-MOGA and the input selection procedure.
Blanchard, A.; O'Kula, K.R.; East, J.M.
1998-06-01
This paper highlights the logic used to select a dispersion/consequence methodology, describes the collection of tritium models contained in the suite of analysis options (the 'tool kit'), and provides application examples.
Frisch, Simon; Dshemuchadse, Maja; Görner, Max; Goschke, Thomas; Scherbaum, Stefan
2015-11-01
Selective attention biases information processing toward stimuli that are relevant for achieving our goals. However, the nature of this bias is under debate: Does it solely rely on the amplification of goal-relevant information or is there a need for additional inhibitory processes that selectively suppress currently distracting information? Here, we explored the processes underlying selective attention with a dynamic, modeling-based approach that focuses on the continuous evolution of behavior over time. We present two dynamic neural field models incorporating the diverging theoretical assumptions. Simulations with both models showed that they make similar predictions with regard to response times but differ markedly with regard to their continuous behavior. Human data observed via mouse tracking as a continuous measure of performance revealed evidence for the model solely based on amplification but no indication of persisting selective distracter inhibition.
NASA Astrophysics Data System (ADS)
Zarindast, Atousa; Seyed Hosseini, Seyed Mohamad; Pishvaee, Mir Saman
2016-11-01
A robust supplier selection problem is proposed in a scenario-based approach, where demand and exchange rates are subject to uncertainty. First, a deterministic multi-objective mixed integer linear programming model is developed; then, the robust counterpart of the proposed mixed integer linear programming model is presented using recent extensions in robust optimization theory. We discuss the decision variables, respectively, via a two-stage stochastic planning model, a robust stochastic optimization planning model which integrates the worst-case scenario in the modeling approach, and finally an equivalent deterministic planning model. An experimental study is carried out to compare the performances of the three models. The robust model resulted in remarkable cost savings, illustrating that to cope with such uncertainties we should account for them in advance in our planning. In our case study, different suppliers were selected because of these uncertainties, and since supplier selection is a strategic decision, it is crucial to consider them in the planning approach.
Generative model selection using a scalable and size-independent complex network classifier
Motallebi, Sadegh; Aliakbary, Sadegh; Habibi, Jafar
2013-12-15
Real networks exhibit nontrivial topological features, such as heavy-tailed degree distribution, high clustering, and small-worldness. Researchers have developed several generative models for synthesizing artificial networks that are structurally similar to real networks. An important research problem is to identify the generative model that best fits a target network. In this paper, we investigate this problem and our goal is to select the model that is able to generate graphs similar to a given network instance. By means of generating synthetic networks with seven outstanding generative models, we have utilized machine learning methods to develop a decision tree for model selection. Our proposed method, which is named “Generative Model Selection for Complex Networks,” outperforms existing methods with respect to accuracy, scalability, and size-independence.
Li, Xingfeng; Coyle, Damien; Maguire, Liam; McGinnity, Thomas M; Benali, Habib
2011-07-01
In this paper a model selection algorithm for a nonlinear system identification method is proposed to study functional magnetic resonance imaging (fMRI) effective connectivity. Unlike most other methods, this method does not need a pre-defined structure/model for effective connectivity analysis. Instead, it relies on selecting significant nonlinear or linear covariates for the differential equations to describe the mapping relationship between brain output (fMRI response) and input (experiment design). These covariates, as well as their coefficients, are estimated based on a least angle regression (LARS) method. In the implementation of the LARS method, the corrected Akaike information criterion (AICc) and the leave-one-out (LOO) cross-validation method were employed and compared for model selection. Simulation comparisons between the dynamic causal model (DCM), the nonlinear identification method, and the model selection method for modelling single-input-single-output (SISO) and multiple-input multiple-output (MIMO) systems were conducted. Results show that the LARS model selection method is faster than DCM and achieves a compact and economic nonlinear model simultaneously. To verify the efficacy of the proposed approach, an analysis of the dorsal and ventral visual pathway networks was carried out based on three real datasets. The results show that LARS can be used for model selection in an fMRI effective connectivity study with phase-encoded, standard block, and random block designs. It is also shown that the LOO cross-validation method for nonlinear model selection yields a smaller residual sum of squares than the AICc algorithm for this study.
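The small-sample criterion compared against LOO above has a simple closed form when Gaussian errors are assumed, so that AIC can be written in terms of the residual sum of squares. A minimal sketch (candidate names and RSS values are hypothetical):

```python
import math


def aicc(n, k, rss):
    """Corrected Akaike information criterion for a model with k
    parameters fitted to n observations, given its residual sum of
    squares (Gaussian-error form, additive constants dropped)."""
    if n - k - 1 <= 0:
        return float("inf")  # correction term undefined: penalize heavily
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


def select(candidates, n):
    """Pick the candidate (name, k, rss) with the smallest AICc."""
    return min(candidates, key=lambda c: aicc(n, c[1], c[2]))[0]


# Illustrative: a larger model must reduce RSS enough to pay its penalty.
best = select([("small", 2, 20.0), ("large", 5, 18.0)], n=20)
```

Here the larger model's modest RSS improvement does not offset its extra parameters at n = 20, so the smaller candidate is kept; as n grows the correction term vanishes and AICc approaches plain AIC.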
Estimating animal resource selection from telemetry data using point process models
Johnson, Devin S.; Hooten, Mevin B.; Kuhn, Carey E.
2013-01-01
To demonstrate the analysis of telemetry data with the point process approach, we analysed a data set of telemetry locations from northern fur seals (Callorhinus ursinus) in the Pribilof Islands, Alaska. Both a space–time and an aggregated space-only model were fitted. At the individual level, the space–time analysis showed little selection relative to the habitat covariates. However, at the study area level, the space-only model showed strong selection relative to the covariates.
Molecular column densities in selected model atmospheres. [stellar atmospheres
NASA Technical Reports Server (NTRS)
Johnson, H. R.; Sneden, C.; Beebe, R. F.
1975-01-01
Molecular column densities are presented for 35 molecules in a variety of cool stellar model atmospheres. From an examination of the predicted column densities, we draw the following conclusions: (1) OH might be visible in carbon stars which have been generated from triplet-alpha burning, but will be absent from carbon stars generated from the CNO bi-cycle; (2) the TiO/ZrO ratio shows small but interesting variations as C/O is changed and as the effective temperature is changed; (3) the column density of silicon dicarbide (SiC2) is sensitive to abundance, temperature, and gravity; hence, all relationships between the strength of SiC2 and other stellar parameters will show appreciable scatter. There is, however, a substantial luminosity effect present in the SiC2 column densities; (4) unexpectedly, SiC2 is anticorrelated with C2; (5) the presence of SiC2 in a carbon star allows us to eliminate the possibility that these stars are both 'hot' (T sub eff greater than or equal to 3000 K) and have been produced through the CNO bi-cycle (so that C/H is less than solar).
A model of face selection in viewing video stories.
Suda, Yuki; Kitazawa, Shigeru
2015-01-19
When typical adults watch TV programs, they show surprisingly stereotyped gaze behaviours, as indicated by the almost simultaneous shifts of their gazes from one face to another. However, a standard saliency model based on low-level physical features alone failed to explain such typical gaze behaviours. To find rules that explain the typical gaze behaviours, we examined temporo-spatial gaze patterns in adults while they viewed video clips with human characters that were played with or without sound, and in the forward or reverse direction. We here show the following: 1) the "peak" face scanpath, which followed the face that attracted the largest number of views but ignored other objects in the scene, still retained the key features of actual scanpaths, 2) gaze behaviours remained unchanged whether the sound was provided or not, 3) the gaze behaviours were sensitive to time reversal, and 4) nearly 60% of the variance of gaze behaviours was explained by the face saliency that was defined as a function of its size, novelty, head movements, and mouth movements. These results suggest that humans share a face-oriented network that integrates several visual features of multiple faces, and directs our eyes to the most salient face at each moment.
Balancing Selection in Species with Separate Sexes: Insights from Fisher’s Geometric Model
Connallon, Tim; Clark, Andrew G.
2014-01-01
How common is balancing selection, and what fraction of phenotypic variance is attributable to balanced polymorphisms? Despite decades of research, answers to these questions remain elusive. Moreover, there is no clear theoretical prediction about the frequency with which balancing selection is expected to arise within a population. Here, we use an extension of Fisher’s geometric model of adaptation to predict the probability of balancing selection in a population with separate sexes, wherein polymorphism is potentially maintained by two forms of balancing selection: (1) heterozygote advantage, where heterozygous individuals at a locus have higher fitness than homozygous individuals, and (2) sexually antagonistic selection (a.k.a. intralocus sexual conflict), where the fitness of each sex is maximized by different genotypes at a locus. We show that balancing selection is common under biologically plausible conditions and that sex differences in selection or sex-by-genotype effects of mutations can each increase opportunities for balancing selection. Although heterozygote advantage and sexual antagonism represent alternative mechanisms for maintaining polymorphism, they mutually exist along a balancing selection continuum that depends on population and sex-specific parameters of selection and mutation. Sexual antagonism is the dominant mode of balancing selection across most of this continuum. PMID:24812306
Models of preconception care implementation in selected countries.
Ebrahim, Shahul H; Lo, Sue Seen-Tsing; Zhuo, Jiatong; Han, Jung-Yeol; Delvoye, Pierre; Zhu, Li
2006-09-01
Globally, maternal and child health faces diverse challenges depending on a country's developmental status. Some countries have introduced or explored preconception care for various reasons. Falling birth rates and increasing knowledge about risk factors for adverse pregnancy outcomes led to the introduction of preconception care in Hong Kong in 1998 and in South Korea in 2004. In Hong Kong, comprehensive preconception care, including laboratory tests, is provided to over 4000 women each year at a cost of $75 per person. In Korea, about 60% of the women served have a known medical risk history, and the challenge is to expand the program's capacity to all women who plan pregnancy and to conduct social marketing. Belgium has established an ad hoc committee to develop a comprehensive social marketing and professional training strategy for pilot testing preconception care models in the French-speaking part of Belgium, an area that represents 5 million people and 50,000 births per year, using prenatal care and pediatric clinics, gynecological departments, and the genetic centers. In China, Guangxi province piloted preconceptional HIV testing and counseling among couples who sought the then-mandatory premarital medical examination as a component of the three-pronged approach to reducing mother-to-child transmission of HIV. HIV testing rates among couples increased from 38% to 62% over a one-year period. In October 2003, China changed the legal requirement for the premarital medical examination from mandatory to "voluntary." Most women interpreted this change to mean that the premarital health examination was "unnecessary," and overall premarital health examination rates dropped. Social marketing efforts piloted in 2004 indicated that 95% of women were willing to pay up to RMB 100 (US$12) for preconception health care services. These case studies illustrate the programmatic feasibility of preconception care services to address maternal and child health and other public health
Neural Underpinnings of Decision Strategy Selection: A Review and a Theoretical Model
Wichary, Szymon; Smolen, Tomasz
2016-01-01
In multi-attribute choice, decision makers use decision strategies to arrive at the final choice. What are the neural mechanisms underlying decision strategy selection? The first goal of this paper is to provide a literature review on the neural underpinnings and cognitive models of decision strategy selection and thus set the stage for a neurocognitive model of this process. The second goal is to outline such a unifying, mechanistic model that can explain the impact of noncognitive factors (e.g., affect, stress) on strategy selection. To this end, we review the evidence for the factors influencing strategy selection, the neural basis of strategy use and the cognitive models of this process. We also present the Bottom-Up Model of Strategy Selection (BUMSS). The model assumes that the use of the rational Weighted Additive strategy and the boundedly rational heuristic Take The Best can be explained by one unifying, neurophysiologically plausible mechanism, based on the interaction of the frontoparietal network, orbitofrontal cortex, anterior cingulate cortex and the brainstem nucleus locus coeruleus. According to BUMSS, there are three processes that form the bottom-up mechanism of decision strategy selection and lead to the final choice: (1) cue weight computation, (2) gain modulation, and (3) weighted additive evaluation of alternatives. We discuss how these processes might be implemented in the brain, and how this knowledge allows us to formulate novel predictions linking strategy use and neural signals. PMID:27877103
Sauerbrei, Willi; Royston, Patrick; Binder, Harald
2007-12-30
In developing regression models, data analysts are often faced with many predictor variables that may influence an outcome variable. After more than half a century of research, the 'best' way of selecting a multivariable model is still unresolved. It is generally agreed that subject matter knowledge, when available, should guide model building. However, such knowledge is often limited, and data-dependent model building is required. We limit the scope of the modelling exercise to selecting important predictors and choosing interpretable and transportable functions for continuous predictors. Assuming linear functions, stepwise selection and all-subset strategies are discussed; the key tuning parameters are the nominal P-value for testing a variable for inclusion and the penalty for model complexity, respectively. We argue that stepwise procedures perform better than a literature-based assessment would suggest. Concerning selection of functional form for continuous predictors, the principal competitors are fractional polynomial functions and various types of spline techniques. We note that a rigorous selection strategy known as multivariable fractional polynomials (MFP) has been developed. No spline-based procedure for simultaneously selecting variables and functional forms has found wide acceptance. Results of FP and spline modelling are compared in two data sets. It is shown that spline modelling, while extremely flexible, can generate fitted curves with uninterpretable 'wiggles', particularly when automatic methods for choosing the smoothness are employed. We give general recommendations to practitioners for carrying out variable and function selection. While acknowledging that further research is needed, we argue why MFP is our preferred approach for multivariable model building with continuous covariates.
Differences between selection on sex versus recombination in red queen models with diploid hosts.
Agrawal, Aneil F
2009-08-01
The Red Queen hypothesis argues that parasites generate selection for genetic mixing (sex and recombination) in their hosts. A number of recent papers have examined this hypothesis using models with haploid hosts. In these haploid models, sex and recombination are selectively equivalent. However, sex and recombination are not equivalent in diploids because selection on sex depends on the consequences of segregation as well as recombination. Here I compare how parasites select on modifiers of sexual reproduction and modifiers of recombination rate. Across a wide set of parameters, parasites tend to select against both sex and recombination, though recombination is favored more often than is sex. There is little correspondence between the conditions favoring sex and those favoring recombination, indicating that the direction of selection on sex is often determined by the effects of segregation, not recombination. Moreover, when sex was favored it is usually due to a long-term advantage whereas short-term effects are often responsible for selection favoring recombination. These results strongly indicate that Red Queen models focusing exclusively on the effects of recombination cannot be used to infer the type of selection on sex that is generated by parasites on diploid hosts.
Structure-selection techniques applied to continuous-time nonlinear models
NASA Astrophysics Data System (ADS)
Aguirre, Luis A.; Freitas, Ubiratan S.; Letellier, Christophe; Maquet, Jean
2001-10-01
This paper addresses the problem of choosing the multinomials that should compose a polynomial mathematical model starting from data. The mathematical representation used is a nonlinear differential equation of the polynomial type. Some approaches that have been used in the context of discrete-time models are adapted and applied to continuous-time models. Two examples are included to illustrate the main ideas. Models obtained with and without structure selection are compared using topological analysis. The main differences between structure-selected models and complete structure models are: (i) the former are more parsimonious than the latter, (ii) a predefined fixed-point configuration can be guaranteed for the former, and (iii) the former set of models produce attractors that are topologically closer to the original attractor than those produced by the complete structure models.
NASA Astrophysics Data System (ADS)
Tomasi, G.; Kimberley, S.; Rosso, L.; Aboagye, E.; Turkheimer, F.
2012-04-01
In positron emission tomography (PET) studies involving organs different from the brain, ignoring the metabolite contribution to the tissue time-activity curves (TAC), as in the standard single-input (SI) models, may compromise the accuracy of the estimated parameters. We employed here double-input (DI) compartmental modeling (CM), previously used for [11C]thymidine, and a novel DI spectral analysis (SA) approach on the tracers 5-[18F]fluorouracil (5-[18F]FU) and [18F]fluorothymidine ([18F]FLT). CM and SA were performed initially with a SI approach using the parent plasma TAC as an input function. These methods were then employed using a DI approach with the metabolite plasma TAC as an additional input function. Regions of interest (ROIs) corresponding to healthy liver, kidneys and liver metastases for 5-[18F]FU and to tumor, vertebra and liver for [18F]FLT were analyzed. For 5-[18F]FU, the improvement of the fit quality with the DI approaches was remarkable; in CM, the Akaike information criterion (AIC) always selected the DI over the SI model. Volume of distribution estimates obtained with DI CM and DI SA were in excellent agreement, for both parent 5-[18F]FU (R2 = 0.91) and metabolite [18F]FBAL (R2 = 0.99). For [18F]FLT, the DI methods provided notable improvements but less substantial than for 5-[18F]FU due to the lower rate of metabolism of [18F]FLT. On the basis of the AIC values, agreement between [18F]FLT Ki estimated with the SI and DI models was good (R2 = 0.75) for the ROIs where the metabolite contribution was negligible, indicating that the additional input did not bias the parent tracer only-related estimates. When the AIC suggested a substantial contribution of the metabolite [18F]FLT-glucuronide, on the other hand, the change in the parent tracer only-related parameters was significant (R2 = 0.33 for Ki). Our results indicated that improvements of DI over SI approaches can range from moderate to substantial and are more significant for tracers with
Selecting Spatial Scale of Covariates in Regression Models of Environmental Exposures
Grant, Lauren P.; Gennings, Chris; Wheeler, David C.
2015-01-01
Environmental factors or socioeconomic status variables used in regression models to explain environmental chemical exposures or health outcomes are often in practice modeled at the same buffer distance or spatial scale. In this paper, we present four model selection algorithms that select the best spatial scale for each buffer-based or area-level covariate. Contamination of drinking water by nitrate is a growing problem in agricultural areas of the United States, as ingested nitrate can lead to the endogenous formation of N-nitroso compounds, which are potent carcinogens. We applied our methods to model nitrate levels in private wells in Iowa. We found that environmental variables were selected at different spatial scales and that a model allowing spatial scale to vary across covariates provided the best goodness of fit. Our methods can be applied to investigate the association between environmental risk factors available at multiple spatial scales or buffer distances and measures of disease, including cancers. PMID:25983543
Island-Model Genomic Selection for Long-Term Genetic Improvement of Autogamous Crops.
Yabe, Shiori; Yamasaki, Masanori; Ebana, Kaworu; Hayashi, Takeshi; Iwata, Hiroyoshi
2016-01-01
Acceleration of genetic improvement of autogamous crops such as wheat and rice is necessary to increase cereal production in response to the global food crisis. Population and pedigree methods of breeding, which are based on inbred line selection, are used commonly in the genetic improvement of autogamous crops. These methods, however, produce few novel combinations of genes in a breeding population. Recurrent selection promotes recombination among genes and produces novel combinations of genes in a breeding population, but it requires inaccurate single-plant evaluation for selection. Genomic selection (GS), which can predict the genetic potential of individuals based on their marker genotype, might offer highly reliable single-plant evaluation and might be effective in recurrent selection. To evaluate the efficiency of recurrent selection with GS, we conducted simulations using real marker genotype data of rice cultivars. Additionally, we introduced the concept of an "island model," inspired by evolutionary algorithms, which might be useful to maintain genetic variation through the breeding process. We conducted GS simulations using real marker genotype data of rice cultivars to evaluate the efficiency of recurrent selection and the island model in an autogamous species. Results demonstrated the importance of producing novel combinations of genes through recurrent selection. An initial population derived from admixture of multiple bi-parental crosses showed larger genetic gains than a population derived from a single bi-parental cross in whole cycles, suggesting the importance of genetic variation in an initial population. The island-model GS better maintained genetic improvement in later generations than the other GS methods, suggesting that the island-model GS can utilize genetic variation in breeding and can retain alleles with small effects in the breeding population. The island-model GS will become a new breeding method that enhances the potential of genomic
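The core GS step — predicting each line's genetic potential from marker genotypes and selecting on that prediction — can be sketched as a toy simulation (simulated 0/1 genotypes and ridge-estimated marker effects as a simple stand-in for GBLUP; all sizes and names are assumptions, not the authors' simulator):

```python
import numpy as np

rng = np.random.default_rng(2)
n_lines, n_markers = 200, 50
geno = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float)  # inbred 0/1 genotypes
true_eff = rng.normal(scale=0.5, size=n_markers)
pheno = geno @ true_eff + rng.normal(size=n_lines)  # phenotype = genetics + noise

# Ridge-regression estimates of marker effects (a simple stand-in for GBLUP):
lam = 1.0
est_eff = np.linalg.solve(geno.T @ geno + lam * np.eye(n_markers), geno.T @ pheno)
gebv = geno @ est_eff               # genomic estimated breeding values

parents = np.argsort(gebv)[-20:]    # select the top 10% on predicted value
diff = float(pheno[parents].mean() - pheno.mean())  # realized selection differential
print(diff)
```

Recurrent selection repeats this cycle — cross the selected parents, re-genotype the offspring, re-predict — and the island-model variant runs several such subpopulations with occasional migration between them.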
Reich, Brian J.; Storlie, Curtis B.; Bondell, Howard D.
2009-01-01
With many predictors, choosing an appropriate subset of the covariates is a crucial, and difficult, step in nonparametric regression. We propose a Bayesian nonparametric regression model for curve-fitting and variable selection. We use the smoothing spline ANOVA framework to decompose the regression function into interpretable main effect and interaction functions. Stochastic search variable selection via MCMC sampling is used to search for models that fit the data well. Also, we show that variable selection is highly sensitive to hyperparameter choice and develop a technique to select hyperparameters that control the long-run false positive rate. The method is used to build an emulator for a complex computer model for two-phase fluid flow. PMID:19789732
A model of two-way selection system for human behavior.
Zhou, Bin; Qin, Shujia; Han, Xiao-Pu; He, Zhe; Xie, Jia-Rong; Wang, Bing-Hong
2014-01-01
Two-way selection is a common phenomenon in nature and society. It appears in processes such as choosing a mate between men and women, making contracts between job hunters and recruiters, and trading between buyers and sellers. In this paper, we propose a model of a two-way selection system and present its analytical solution for the expected total number of successful matches, along with the regularity that the matching rate tends toward an inverse proportion to either the ratio between the two sides or the ratio of the total number of states to the size of the smaller group. The proposed model is verified against empirical data from matchmaking fairs. Results indicate that the model predicts this typical real-world two-way selection behavior well, within a bounded error, and is thus helpful for understanding the dynamic mechanism of real-world two-way selection systems.
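As a toy illustration of mutual choice (not the authors' analytical model), the sketch below simulates the simplest two-way selection rule: each member of side A picks one member of side B uniformly at random and vice versa, and a match forms only when the choice is mutual. Under this toy rule the expected number of matches is exactly n_a * n_b * (1/n_a) * (1/n_b) = 1, whatever the group sizes:

```python
import random

def simulate_matches(n_a, n_b, trials=20000, seed=42):
    """Monte Carlo estimate of the expected number of mutual matches when
    every member of side A picks one member of side B uniformly at random
    and vice versa; a pair matches only if the choices are mutual."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        picks_a = [rng.randrange(n_b) for _ in range(n_a)]  # each a -> chosen b
        picks_b = [rng.randrange(n_a) for _ in range(n_b)]  # each b -> chosen a
        total += sum(1 for a, b in enumerate(picks_a) if picks_b[b] == a)
    return total / trials

print(round(simulate_matches(10, 30), 2))
```

Richer variants, such as the state-based matching the paper analyzes, replace the uniform pick with agreement on a shared state, but the simulation skeleton is the same.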
Green, Christopher T.; Zhang, Yong; Jurgens, Bryant C.; Starn, J. Jeffrey; Landon, Matthew K.
2014-01-01
Analytical models of the travel time distribution (TTD) from a source area to a sample location are often used to estimate groundwater ages and solute concentration trends. The accuracies of these models are not well known for geologically complex aquifers. In this study, synthetic datasets were used to quantify the accuracy of four analytical TTD models as affected by TTD complexity, observation errors, model selection, and tracer selection. Synthetic TTDs and tracer data were generated from existing numerical models with complex hydrofacies distributions for one public-supply well and 14 monitoring wells in the Central Valley, California. Analytical TTD models were calibrated to synthetic tracer data, and prediction errors were determined for estimates of TTDs and conservative tracer (NO3−) concentrations. Analytical models included a new, scale-dependent dispersivity model (SDM) for two-dimensional transport from the water table to a well, and three other established analytical models. The relative influence of the error sources (TTD complexity, observation error, model selection, and tracer selection) depended on the type of prediction. Geological complexity gave rise to complex TTDs in monitoring wells that strongly affected errors of the estimated TTDs. However, prediction errors for NO3− and median age depended more on tracer concentration errors. The SDM tended to give the most accurate estimates of the vertical velocity and other predictions, although TTD model selection had minor effects overall. Adding tracers improved predictions if the new tracers had different input histories. Studies using TTD models should focus on the factors that most strongly affect the desired predictions.
Cross-validation pitfalls when selecting and assessing regression and classification models
2014-01-01
Background: We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing, which enables routine use of previously infeasible approaches. Methods: We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning, we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. Results: We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. Conclusions: We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error. PMID:24678909
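The repeated V-fold idea can be sketched in a few lines (a generic illustration with ridge regression and assumed names, not the paper's grid-search pipeline): the score is averaged over several independent re-splits of the data, so a single lucky or unlucky split does not drive model selection.

```python
import numpy as np

def repeated_kfold_scores(X, y, fit, score, k=5, repeats=10, seed=0):
    """Average a score over `repeats` independent V-fold splits of the data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(repeats):
        idx = rng.permutation(n)            # fresh random split each repeat
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)  # everything outside the held-out fold
            model = fit(X[train], y[train])
            scores.append(score(model, X[fold], y[fold]))
    return float(np.mean(scores)), float(np.std(scores))

def make_ridge(lam):
    def fit(X, y):
        p = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return fit

def mse(beta, X, y):
    return float(np.mean((y - X @ beta) ** 2))

# Toy tuning run: compare two ridge penalties on simulated data.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 4))
y = X @ np.array([1.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.3, size=80)

for lam in (0.1, 100.0):
    mean_mse, sd = repeated_kfold_scores(X, y, make_ridge(lam), mse)
    print(lam, round(mean_mse, 3), round(sd, 3))
```

Nested cross-validation for model *assessment* wraps this whole tuning loop inside an outer CV loop, so the reported error is computed on data never seen during tuning.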
Random forest (RF) is popular in ecological and environmental modeling, in part, because of its insensitivity to correlated predictors and resistance to overfitting. Although variable selection has been proposed to improve both performance and interpretation of RF models, it is u...
The Student-Selection Process: A Model of Student Courses in Higher Education.
ERIC Educational Resources Information Center
Saunders, J. A.; Lancaster, G. A.
Factors that affect college students' choice of studies and implications for colleges and universities that are competing for the declining numbers of students were assessed. A student-selection process model, derived from the innovation-decision model, provides some insights into the choice process and indicates the likely limitations of the…
Perturbation Selection and Local Influence Analysis for Nonlinear Structural Equation Model
ERIC Educational Resources Information Center
Chen, Fei; Zhu, Hong-Tu; Lee, Sik-Yum
2009-01-01
Local influence analysis is an important statistical method for studying the sensitivity of a proposed model to model inputs. One of its important issues is related to the appropriate choice of a perturbation vector. In this paper, we develop a general method to select an appropriate perturbation vector and a second-order local influence measure…
AN AGGREGATION AND EPISODE SELECTION SCHEME FOR EPA'S MODELS-3 CMAQ
The development of an episode selection and aggregation approach, designed to support distributional estimation for use with the Models-3 Community Multiscale Air Quality (CMAQ) model, is described. The approach utilized cluster analysis of the 700 hPa u and v wind field compo...
Model selection forecasts for the spectral index from the Planck satellite
Pahud, Cedric; Liddle, Andrew R.; Mukherjee, Pia; Parkinson, David
2006-06-15
The recent WMAP3 results have placed measurements of the spectral index n_S in an interesting position. While parameter estimation techniques indicate that the Harrison-Zel'dovich spectrum n_S = 1 is strongly excluded (in the absence of tensor perturbations), Bayesian model selection techniques reveal that the case against n_S = 1 is not yet conclusive. In this paper, we forecast the ability of the Planck satellite mission to use Bayesian model selection to convincingly exclude (or favor) the Harrison-Zel'dovich model.
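A common shortcut to the Bayesian model comparison described here is the BIC difference, where exp(ΔBIC/2) approximates the Bayes factor between the two models. The sketch below uses purely illustrative chi-squared and data-size values (assumptions for the example, not Planck forecasts):

```python
import math

def bic(chi2, k, n):
    """Bayesian information criterion: misfit plus a complexity penalty
    of k parameters times log of the number of data points."""
    return chi2 + k * math.log(n)

# Purely illustrative numbers (NOT Planck forecasts):
n_data    = 2500     # number of measured band powers, assumed
chi2_hz   = 2580.0   # Harrison-Zel'dovich model: n_S fixed at 1, 6 parameters
chi2_free = 2561.0   # n_S free: one extra parameter, better fit

delta = bic(chi2_hz, 6, n_data) - bic(chi2_free, 7, n_data)
bayes_factor = math.exp(delta / 2)  # BIC approximation to the Bayes factor
print(round(delta, 3), round(bayes_factor, 1))
```

In full Bayesian model selection the Bayes factor is computed from the evidence integrals rather than this BIC approximation, but the sketch shows the trade-off: the extra parameter must buy enough fit improvement to overcome the log(n) penalty.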
Supplier selection based on a neural network model using genetic algorithm.
Golmohammadi, Davood; Creese, Robert C; Valian, Haleh; Kolassa, John
2009-09-01
In this paper, a decision-making model was developed to select suppliers using neural networks (NNs). The model uses historical supplier performance data for supplier selection. Input and output were designed in a unique manner for training purposes. The managers' judgments about suppliers were simulated by using a pairwise comparison matrix for output estimation in the NN. To obtain the benefit of a search technique for the model structure and training, a genetic algorithm (GA) was applied to the initial weights and architecture of the network. The suppliers' database information (input) can be updated over time to change the suppliers' score estimation based on their performance. The case study presented illustrates how the model can be applied to supplier selection.
Mutation-selection dynamics and error threshold in an evolutionary model for Turing machines.
Musso, Fabio; Feverati, Giovanni
2012-01-01
We investigate the mutation-selection dynamics for an evolutionary computation model based on Turing machines. The use of Turing machines allows for very simple mechanisms of code growth and code activation/inactivation through point mutations. To any value of the point mutation probability corresponds a maximum amount of active code that can be maintained by selection, and the Turing machines that reach it are said to be at the error threshold. Simulations with our model show that the Turing machine population evolves toward the error threshold. Mathematical descriptions of the model point out that this behaviour is due more to the mutation-selection dynamics than to the intrinsic nature of the Turing machines. This indicates that this result is much more general than the model considered here and could play a role in biological evolution as well.
Bruni, Renato; Cesarone, Francesco; Scozzari, Andrea; Tardella, Fabio
2016-09-01
A large number of portfolio selection models have appeared in the literature since the pioneering work of Markowitz. However, even when computational and empirical results are described, they are often hard to replicate and compare due to the unavailability of the datasets used in the experiments. We provide here several datasets for portfolio selection generated using real-world price values from several major stock markets. The datasets contain weekly return values, adjusted for dividends and for stock splits, which are cleaned from errors as much as possible. The datasets are available in different formats, and can be used as benchmarks for testing the performances of portfolio selection models and for comparing the efficiency of the algorithms used to solve them. We also provide, for these datasets, the portfolios obtained by several selection strategies based on Stochastic Dominance models (see "On Exact and Approximate Stochastic Dominance Strategies for Portfolio Selection" (Bruni et al. [2])). We believe that testing portfolio models on publicly available datasets greatly simplifies the comparison of the different portfolio selection strategies.
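The weekly returns in such datasets are simple price relatives computed from dividend- and split-adjusted closes. A minimal sketch of the conversion (the data values are assumptions for illustration, not taken from the authors' datasets):

```python
def to_returns(prices):
    """Convert a series of adjusted closing prices to simple period returns:
    r_t = p_t / p_{t-1} - 1."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

# Four adjusted weekly closes -> three weekly returns:
weekly_closes = [100.0, 104.0, 98.8, 103.74]
returns = to_returns(weekly_closes)
print([round(r, 4) for r in returns])  # [0.04, -0.05, 0.05]
```

Using adjusted prices is what makes the returns comparable across dividend dates and stock splits; raw closes would show spurious jumps at those events.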
Teodoro, P E; Bhering, L L; Costa, R D; Rocha, R B; Laviola, B G
2016-08-19
The aim of this study was to estimate genetic parameters via mixed models and simultaneously to select Jatropha progenies grown in three regions of Brazil that combine high adaptability and stability. From a previous phenotypic selection, three progeny tests were installed in 2008 in the municipalities of Planaltina-DF (Midwest), Nova Porteirinha-MG (Southeast), and Pelotas-RS (South). We evaluated 18 half-sib families in a randomized block design with three replications. Genetic parameters were estimated using restricted maximum likelihood/best linear unbiased prediction. Selection was based on the harmonic mean of the relative performance of genetic values method in three strategies, considering: 1) performance in each environment (with interaction effect); 2) performance in the mean environment (without interaction effect); and 3) simultaneous selection for grain yield, stability, and adaptability. The accuracy obtained (91%) reveals excellent experimental quality and, consequently, safety and credibility in the selection of superior progenies for grain yield. The gain with the selection of the best five progenies was more than 20%, regardless of the selection strategy. Thus, based on the three selection strategies used in this study, progenies 4, 11, and 3 (selected in all environments and the mean environment and by adaptability and phenotypic stability methods) are the most suitable for growing in the three regions evaluated.
A selection model for accounting for publication bias in a full network meta-analysis.
Mavridis, Dimitris; Welton, Nicky J; Sutton, Alex; Salanti, Georgia
2014-12-30
Copas and Shi suggested a selection model to explore the potential impact of publication bias via sensitivity analysis based on assumptions for the probability of publication of trials conditional on the precision of their results. Chootrakool et al. extended this model to three-arm trials but did not fully account for the implications of the consistency assumption, and their model is difficult to generalize for complex network structures with more than three treatments. Fitting these selection models within a frequentist setting requires maximization of a complex likelihood function, and identification problems are common. We have previously presented a Bayesian implementation of the selection model when multiple treatments are compared with a common reference treatment. We now present a general model suitable for complex, full network meta-analysis that accounts for consistency when adjusting results for publication bias. We developed a design-by-treatment selection model to describe the mechanism by which studies with different designs (sets of treatments compared in a trial) and precision may be selected for publication. We fit the model in a Bayesian setting because it avoids the numerical problems encountered in the frequentist setting, it is generalizable with respect to the number of treatments and study arms, and it provides a flexible framework for sensitivity analysis using external knowledge. Our model accounts for the additional uncertainty arising from publication bias more successfully compared to the standard Copas model or its previous extensions. We illustrate the methodology using a published triangular network for the failure of vascular graft or arterial patency.
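In Copas-type selection models, the probability that a trial is published rises with its precision. A sketch of that selection mechanism (the gamma coefficients below are arbitrary illustrative choices, not fitted values):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def publication_prob(se, gamma0=-0.5, gamma1=0.3):
    """Copas-style selection: P(publish) = Phi(gamma0 + gamma1 / se),
    so small standard errors (precise trials) imply near-certain
    publication, while imprecise trials risk staying unpublished."""
    return norm_cdf(gamma0 + gamma1 / se)

probs = {se: publication_prob(se) for se in (0.05, 0.2, 0.5)}
```

Sensitivity analysis then amounts to varying the gamma parameters over plausible ranges and checking how the pooled estimates move.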
He, Xingrong; Yang, Yongqiang; Wu, Weihui; Wang, Di; Ding, Huanwen; Huang, Weihong
2010-06-01
To simplify surgery for comminuted distal femoral fractures and improve the accuracy with which the fragments are repositioned, a surgical orienting model was designed from computed tomography scan data and the three-dimensional reconstruction image. Using the DiMetal-280 selective laser melting rapid prototyping system, the surgical orienting model was made in 316L stainless steel, with processing parameters optimized through an orthogonal experiment. Direct manufacture of the orienting model by selective laser melting showed clear advantages over the conventional approach in speed, profile precision, and dimensional accuracy. The model was applied in a real surgical operation for femoral replacement and worked well. Its successful development provides a new method for the automatic manufacture of customized surgical models, laying a foundation for wider clinical application in the future.
A new fuzzy multi-objective higher order moment portfolio selection model for diversified portfolios
NASA Astrophysics Data System (ADS)
Yue, Wei; Wang, Yuping
2017-01-01
Due to the important effect of higher order moments on portfolio returns, this paper makes use of the third and fourth moments in a fuzzy multi-objective portfolio selection model. Firstly, to overcome the low diversity of the solution sets and the corner solutions produced by conventional higher moment portfolio selection models, a new entropy function based on the Minkowski measure is proposed as an additional objective, and a novel fuzzy multi-objective weighted possibilistic higher order moment portfolio model is presented. Secondly, to solve the proposed model efficiently, a new multi-objective evolutionary algorithm is designed. Thirdly, several portfolio performance evaluation techniques are used to assess the portfolio models. Finally, experiments on data from the Shanghai Stock Exchange indicate the efficiency and effectiveness of the proposed model and algorithm.
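For context, the "third and fourth moments" here are portfolio skewness and kurtosis, and an entropy term over the weights rewards diversification. A rough sketch (using Shannon entropy as a stand-in for the paper's Minkowski-based measure, with synthetic numbers):

```python
import math

def moments(returns):
    """Mean, variance, skewness, and kurtosis of a return series."""
    n = len(returns)
    mu = sum(returns) / n
    m2 = sum((r - mu) ** 2 for r in returns) / n
    m3 = sum((r - mu) ** 3 for r in returns) / n
    m4 = sum((r - mu) ** 4 for r in returns) / n
    return mu, m2, m3 / m2 ** 1.5, m4 / m2 ** 2

def weight_entropy(w):
    """Shannon entropy of portfolio weights: a stand-in for the paper's
    Minkowski-measure entropy; higher values mean more diversification."""
    return -sum(wi * math.log(wi) for wi in w if wi > 0)

concentrated = [0.97, 0.01, 0.01, 0.01]   # near-corner solution
uniform = [0.25] * 4                      # fully diversified
```

Maximizing the entropy objective pushes the optimizer away from corner solutions such as `concentrated` toward spread-out weight vectors.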
Model selection and change detection for a time-varying mean in process monitoring
NASA Astrophysics Data System (ADS)
Burr, Tom; Hamada, Michael S.; Ticknor, Larry; Weaver, Brian
2014-07-01
Process monitoring (PM) for nuclear safeguards sometimes requires estimation of thresholds corresponding to small false alarm rates. Threshold estimation is an old topic; however, because possible new roles for PM are being evaluated in nuclear safeguards, it is timely to consider modern model selection options in the context of alarm threshold estimation. One of the possible new PM roles involves PM residuals, where a residual is defined as residual = data − prediction. This paper briefly reviews alarm threshold estimation, introduces model selection options, and considers several assumptions regarding the data-generating mechanism for PM residuals. Four PM examples from nuclear safeguards are included. One example involves frequent by-batch material balance closures where a dissolution vessel has time-varying efficiency, leading to time-varying material holdup. Another example involves periodic partial cleanout of in-process inventory, leading to challenging structure in the time series of PM residuals. Our main focus is model selection to select a defensible model for normal behavior with a time-varying mean in a PM residual stream. We use approximate Bayesian computation to perform the model selection and parameter estimation for normal behavior. We then describe a simple lag-one-differencing option, similar to that used to monitor non-stationary time series, to monitor for off-normal behavior.
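The lag-one-differencing idea can be sketched as follows: differencing removes a slowly varying mean, so only abrupt shifts exceed a threshold calibrated on normal-behavior data. All numbers below are synthetic, and the 3-sigma threshold is a common but arbitrary choice:

```python
import statistics

def lag_one_alarms(residuals, sigma, k=3.0):
    """Flag time indices where the lag-one difference exceeds k*sigma,
    with sigma estimated from a normal-behavior training run."""
    diffs = [b - a for a, b in zip(residuals, residuals[1:])]
    return [i + 1 for i, d in enumerate(diffs) if abs(d) > k * sigma]

# Calibrate on a drifting-but-normal training segment:
train = [0.0, 0.1, 0.15, 0.3, 0.35, 0.5, 0.6]
train_diffs = [b - a for a, b in zip(train, train[1:])]
sigma = statistics.stdev(train_diffs)

# Monitored stream: slow drift (normal) plus one abrupt jump at t = 6:
stream = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 5.0, 5.1, 5.2]
alarms = lag_one_alarms(stream, sigma)  # flags the jump, not the drift
```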
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Nowak, Wolfgang
2014-05-01
Bayesian model averaging ranks the predictive capabilities of alternative conceptual models based on Bayes' theorem. The individual models are weighted with their posterior probability to be the best one in the considered set of models. Finally, their predictions are combined into a robust weighted average and the predictive uncertainty can be quantified. This rigorous procedure, however, does not yet account for possible instabilities due to measurement noise in the calibration data set. This is a major drawback, since posterior model weights may suffer a lack of robustness related to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new statistical concept to account for measurement noise as a source of uncertainty for the weights in Bayesian model averaging. Our suggested upgrade reflects the limited information content of data for the purpose of model selection. It allows us to assess the significance of the determined posterior model weights, the confidence in model selection, and the accuracy of the quantified predictive uncertainty. Our approach rests on a brute-force Monte Carlo framework. We determine the robustness of model weights against measurement noise by repeatedly perturbing the observed data with random realizations of measurement error. Then, we analyze the induced variability in posterior model weights and introduce this "weighting variance" as an additional term into the overall prediction uncertainty analysis scheme. We further determine the theoretical upper limit in performance of the model set which is imposed by measurement noise. As an extension to the merely relative model ranking, this analysis provides a measure of absolute model performance. To finally decide whether better data or longer time series are needed to ensure a robust basis for model selection, we resample the measurement time series and assess the convergence of model weights for increasing time series length. We illustrate
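The brute-force idea can be rendered as a toy sketch: repeatedly perturb the calibration data with fresh measurement noise, recompute model weights (here BIC-based weights for two simple regression models, standing in for a full Bayesian treatment), and summarize the induced "weighting variance":

```python
import math
import random

def gauss_loglik(resid):
    """Gaussian log-likelihood at the MLE error variance."""
    n = len(resid)
    s2 = sum(e * e for e in resid) / n
    return -0.5 * n * (math.log(2 * math.pi * s2) + 1.0)

def bic(loglik, k, n):
    return k * math.log(n) - 2.0 * loglik

def model_weights(x, y):
    """BIC-based weights for M1 (constant mean) vs M2 (straight line)."""
    n = len(y)
    mu = sum(y) / n
    bic1 = bic(gauss_loglik([yi - mu for yi in y]), 2, n)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    icept = ybar - slope * xbar
    bic2 = bic(gauss_loglik([yi - (icept + slope * xi)
                             for xi, yi in zip(x, y)]), 3, n)
    lo = min(bic1, bic2)
    w1, w2 = math.exp(-(bic1 - lo) / 2), math.exp(-(bic2 - lo) / 2)
    return w1 / (w1 + w2), w2 / (w1 + w2)

random.seed(1)
x = [float(i) for i in range(20)]
truth = [0.3 * xi for xi in x]          # true model is the straight line
w2_samples = []
for _ in range(200):                    # perturb data with fresh noise
    y = [t + random.gauss(0.0, 1.0) for t in truth]
    w2_samples.append(model_weights(x, y)[1])
mean_w2 = sum(w2_samples) / len(w2_samples)
weighting_sd = (sum((w - mean_w2) ** 2 for w in w2_samples)
                / len(w2_samples)) ** 0.5   # the "weighting variance" idea
```

A small `weighting_sd` means the ranking is robust to measurement noise; a large one signals that the data cannot reliably discriminate between the models.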
The Limits and Possibilities of Decision Models for Army Research and Development Project Selection
1992-03-01
Outline excerpts: Multiattribute Utility Theory; Assessment of the Strengths, Limitations, and Weaknesses of the MAUT Model; The Goal Programming Model. Text excerpts: "… several sources. We will then move to a development of the Analytic Hierarchy Process (AHP). Next, the Multiattribute Utility Theory (MAUT) will …"; "… decision model which has proven useful for supporting the evaluation of projects for R&D selection is multiattribute utility theory, or MAUT."
The effects of modeling contingencies in the treatment of food selectivity in children with autism.
Fu, Sherrene B; Penrod, Becky; Fernand, Jonathan K; Whelan, Colleen M; Griffith, Kristin; Medved, Shannon
2015-11-01
The current study investigated the effectiveness of stating and modeling contingencies in increasing food consumption for two children with food selectivity. Results suggested that stating and modeling a differential reinforcement (DR) contingency for food consumption was effective in increasing consumption of two target foods for one child, and stating and modeling a DR plus nonremoval of the spoon contingency was effective in increasing consumption of the remaining food for the first child and all target foods for the second child.
Selection and mutation in X-linked recessive diseases epidemiological model.
Verrilli, Francesca; Kebriaei, Hamed; Glielmo, Luigi; Corless, Martin; Del Vecchio, Carmen
2015-01-01
To describe the epidemiology of X-linked recessive diseases, we developed a discrete-time, structured, nonlinear mathematical model. The model allows for de novo mutations (i.e., affected siblings born to unaffected parents) and selection (i.e., distinct fitness rates depending on an individual's health condition). Applying Lyapunov's direct method, we found the domain of attraction of the model's equilibrium point and studied the convergence properties of the degenerate equilibrium where only affected individuals survive.
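A minimal version of such a recursion (one locus, discrete generations, selection s against affected hemizygous males, de novo mutation at rate mu; parameter values are illustrative, and this is far simpler than the structured model of the paper) converges to the classical mutation-selection balance:

```python
def next_gen(q_m, q_f, s=0.5, mu=1e-5):
    """One generation: q_m, q_f are mutant allele frequencies in males
    and females; affected (hemizygous) males have fitness 1 - s, and
    mu is the per-generation de novo mutation rate (illustrative values)."""
    # male allele frequency after selection against affected males
    q_m_sel = q_m * (1.0 - s) / (1.0 - s * q_m)
    # sons inherit their X from the mother; daughters get one X from each parent
    new_q_m = q_f + mu * (1.0 - q_f)
    mid = 0.5 * (q_m_sel + q_f)
    new_q_f = mid + mu * (1.0 - mid)
    return new_q_m, new_q_f

q_m = q_f = 0.0
for _ in range(2000):
    q_m, q_f = next_gen(q_m, q_f)
# Equilibrium is near the classic q ~ 3*mu/s for the male frequency.
```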
Vermeyen, T.
1995-07-01
Bureau of Reclamation conducted this hydraulic model study to provide Pacific Gas and Electric Company with an evaluation of several selective withdrawal structures that are being considered to reduce intake flow temperatures through the Prattville Intake at Lake Almanor, California. Release temperature control using selective withdrawal structures is being considered in an effort to improve the cold-water fishery in the North Fork of the Feather River.
Ghodrati, Masoud; Khaligh-Razavi, Seyed-Mahdi; Ebrahimpour, Reza; Rajaei, Karim; Pooyan, Mohammad
2012-01-01
Humans can effectively and swiftly recognize objects in complex natural scenes. This outstanding ability has motivated many computational object recognition models, most of which try to emulate the behavior of this remarkable system. The human visual system recognizes objects hierarchically, in several processing stages; along these stages, a set of features of increasing complexity is extracted by different parts of the visual system. Elementary features such as bars and edges are processed at earlier levels of the visual pathway, and increasingly complex features are detected further along it. Which features of an object are selected and represented by the visual cortex is an important question in the field of visual processing. To address this issue, we extended a biologically motivated hierarchical model for different object recognition tasks. In this model, a set of object parts, named patches, is extracted in the intermediate stages. These object parts are used in the model's training procedure and play an important role in object recognition. The patches are selected indiscriminately from different positions of an image, which can lead to the extraction of non-discriminating patches and may ultimately reduce performance. In the proposed model, we used an evolutionary algorithm to select a set of informative patches. Our results indicate that these patches are more informative than the usual random patches. We demonstrate the strength of the proposed model on a range of object recognition tasks, in which it outperforms the original model. The experiments show that the selected features are generally particular parts of the target images. Our results suggest that selected features which are parts of target objects provide an efficient set for robust object recognition. PMID:22384229
Selection on plasticity of seasonal life-history traits using random regression mixed model analysis
Brommer, Jon E; Kontiainen, Pekka; Pietiäinen, Hannu
2012-01-01
Theory considers the covariation of seasonal life-history traits as an optimal reaction norm, implying that deviating from this reaction norm reduces fitness. However, the estimation of reaction-norm properties (i.e., elevation, linear slope, and higher order slope terms) and the selection on these is statistically challenging. We here advocate the use of random regression mixed models to estimate reaction-norm properties and the use of bivariate random regression to estimate selection on these properties within a single model. We illustrate the approach by random regression mixed models on 1115 observations of clutch sizes and laying dates of 361 female Ural owls (Strix uralensis) collected over 31 years to show that (1) there is variation across individuals in the slope of their clutch size–laying date relationship, and that (2) there is selection on the slope of the reaction norm between these two traits. Hence, natural selection potentially drives the negative covariance in clutch size and laying date in this species. The random-regression approach is hampered by its inability to estimate nonlinear selection, but it avoids a number of disadvantages (stats-on-stats, connecting reaction-norm properties to fitness). The approach is of value in describing and studying selection on behavioral reaction norms (behavioral syndromes) or life-history reaction norms. The approach can also be extended to consider the genetic underpinning of reaction-norm properties. PMID:22837818
Cross Validation for Selection of Cortical Interaction Models From Scalp EEG or MEG
Cheung, Bing Leung Patrick; Nowak, Robert; Lee, Hyong Chol; van Drongelen, Wim; Van Veen, Barry D.
2012-01-01
A cross-validation (CV) method based on a state-space framework is introduced for comparing the fidelity of different cortical interaction models to the measured scalp electroencephalogram (EEG) or magnetoencephalography (MEG) data being modeled. A state equation models the cortical interaction dynamics and an observation equation represents the scalp measurement of cortical activity and noise. The measured data are partitioned into training and test sets. The training set is used to estimate model parameters and the model quality is evaluated by computing test data innovations for the estimated model. Two CV metrics, normalized mean square error and log-likelihood, are estimated by averaging over different training/test partitions of the data. The effectiveness of this method of model selection is illustrated by comparing two linear modeling methods and two nonlinear modeling methods on simulated EEG data derived using both known dynamic systems and measured electrocorticography data from an epilepsy patient. PMID:22084038
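The normalized-mean-square-error metric on held-out innovations can be illustrated with a far simpler model class than the paper's state-space setup, e.g. an AR(1) model fit on a training partition and scored on a test partition (synthetic data, not EEG):

```python
import random

def fit_ar1(train):
    """Least-squares AR(1) coefficient for a zero-mean series."""
    x0, x1 = train[:-1], train[1:]
    return sum(a * b for a, b in zip(x0, x1)) / sum(a * a for a in x0)

def nmse(test, phi):
    """Mean squared one-step innovation, normalized by the test variance."""
    innov = [test[t] - phi * test[t - 1] for t in range(1, len(test))]
    mse = sum(e * e for e in innov) / len(innov)
    mu = sum(test) / len(test)
    var = sum((v - mu) ** 2 for v in test) / len(test)
    return mse / var

random.seed(0)
x = [0.0]
for _ in range(399):                       # simulate AR(1) with phi = 0.8
    x.append(0.8 * x[-1] + random.gauss(0.0, 1.0))
train, test = x[:200], x[200:]
phi = fit_ar1(train)
score = nmse(test, phi)                    # well below 1 for a useful model
```

An NMSE near 1 would mean the model predicts no better than the test-set mean; averaging `score` over several train/test partitions gives the CV estimate.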
First Principles Molecular Modeling of Sensing Material Selection for Hybrid Biomimetic Nanosensors
NASA Astrophysics Data System (ADS)
Blanco, Mario; McAlpine, Michael C.; Heath, James R.
Hybrid biomimetic nanosensors use selective polymeric and biological materials that integrate flexible recognition moieties with nanometer size transducers. These sensors have the potential to offer the building blocks for a universal sensing platform. Their vast range of chemistries and high conformational flexibility present both a problem and an opportunity. Nonetheless, it has been shown that oligopeptide aptamers from sequenced genes can be robust substrates for the selective recognition of specific chemical species. Here we present first principles molecular modeling approaches tailored to peptide sequences suitable for the selective discrimination of small molecules on nanowire arrays. The modeling strategy is fully atomistic. The excellent performance of these sensors, their potential biocompatibility combined with advanced mechanistic modeling studies, could potentially lead to applications such as: unobtrusive implantable medical sensors for disease diagnostics, light weight multi-purpose sensing devices for aerospace applications, ubiquitous environmental monitoring devices in urban and rural areas, and inexpensive smart packaging materials for active in-situ food safety labeling.
Bagging linear sparse Bayesian learning models for variable selection in cancer diagnosis.
Lu, Chuan; Devos, Andy; Suykens, Johan A K; Arús, Carles; Van Huffel, Sabine
2007-05-01
This paper investigates variable selection (VS) and classification for biomedical datasets with a small sample size and a very high input dimension. The sequential sparse Bayesian learning methods with linear bases are used as the basic VS algorithm. Selected variables are fed to the kernel-based probabilistic classifiers: Bayesian least squares support vector machines (BayLS-SVMs) and relevance vector machines (RVMs). We employ the bagging techniques for both VS and model building in order to improve the reliability of the selected variables and the predictive performance. This modeling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a brain tumor multiclass classification problem using spectra acquired via magnetic resonance spectroscopy. The work is experimentally compared to other VS methods. It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.
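The bagging idea for variable selection can be sketched generically: draw bootstrap resamples, run a base selector on each (here a simple correlation filter standing in for the sequential sparse Bayesian learning of the paper), and score variables by their selection frequency. All data below are synthetic:

```python
import random

def select_top_k(X, y, k):
    """Rank variables by absolute Pearson correlation with y (a simple
    stand-in for the paper's sparse Bayesian learning selector)."""
    n = len(y)
    ybar = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        xbar = sum(col) / n
        cov = sum((a - xbar) * (b - ybar) for a, b in zip(col, y))
        sx = sum((a - xbar) ** 2 for a in col) ** 0.5
        sy = sum((b - ybar) ** 2 for b in y) ** 0.5
        scores.append(abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def bagged_selection_freq(X, y, k=2, n_bags=50, seed=7):
    """Fraction of bootstrap resamples in which each variable is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(X[0])
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        for j in select_top_k([X[i] for i in idx], [y[i] for i in idx], k):
            counts[j] += 1
    return [c / n_bags for c in counts]

rng = random.Random(3)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(60)]
y = [row[0] * 2.0 + rng.gauss(0.0, 0.5) for row in X]   # only feature 0 matters
freq = bagged_selection_freq(X, y)
```

Variables with high selection frequency across resamples are the stable ones; this is the sense in which bagging improves the reliability of the selected variable set.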
NASA Astrophysics Data System (ADS)
Gesing, Adam J.; Das, Subodh K.
2017-02-01
With funding from the United States Department of Energy Advanced Research Projects Agency, experimental proof-of-concept was demonstrated for the RE-12TM electrorefining process, which extracts a desired amount of Mg from molten secondary Al alloys recycled from scrap. The key enabling technology for this process was the selection of a suitable electrolyte composition and operating temperature. The selection was made using the FactSage thermodynamic modeling software and its light metal, molten salt, and oxide thermodynamic databases. Modeling allowed prediction of the chemical equilibria and of the impurity contents in the anode product, the cathode product, and the electrolyte. FactSage also provided data on the physical properties of the electrolyte and the molten metal phases, including electrical conductivity and density of the molten phases. Further modeling permitted selection of electrode and cell construction materials chemically compatible with the combination of molten metals and the electrolyte.
2011-01-01
Background There is an increasing recognition that modelling and simulation can assist in the process of designing health care policies, strategies and operations. However, the current use is limited and answers to questions such as what methods to use and when remain somewhat underdeveloped. Aim The aim of this study is to provide a mechanism for decision makers in health services planning and management to compare a broad range of modelling and simulation methods so that they can better select and use them or better commission relevant modelling and simulation work. Methods This paper proposes a modelling and simulation method comparison and selection tool developed from a comprehensive literature review, the research team's extensive expertise and inputs from potential users. Twenty-eight different methods were identified, characterised by their relevance to different application areas, project life cycle stages, types of output and levels of insight, and four input resources required (time, money, knowledge and data). Results The characterisation is presented in matrix forms to allow quick comparison and selection. This paper also highlights significant knowledge gaps in the existing literature when assessing the applicability of particular approaches to health services management, where modelling and simulation skills are scarce, let alone money and time. Conclusions A modelling and simulation method comparison and selection tool is developed to assist with the selection of methods appropriate to supporting specific decision making processes. In particular, it addresses the issue of which method is most appropriate to which specific health services management problem, what the user might expect to be obtained from the method, and what is required to use the method. In summary, we believe the tool adds value to the scarce existing literature on methods comparison and selection. PMID:21595946
A general linear model-based approach for inferring selection to climate
2013-01-01
Background Many efforts have been made to detect signatures of positive selection in the human genome, especially those associated with expansion from Africa and subsequent colonization of all other continents. However, most approaches have not directly probed the relationship between the environment and patterns of variation among humans. We have designed a method to identify regions of the genome under selection based on Mantel tests conducted within a general linear model framework, which we call MAntel-GLM to Infer Clinal Selection (MAGICS). MAGICS explicitly incorporates population-specific and genome-wide patterns of background variation as well as information from environmental values to provide an improved picture of selection and its underlying causes in human populations. Results Our results significantly overlap with those obtained by other published methodologies, but MAGICS has several advantages. These include improvements that: limit false positives by reducing the number of independent tests conducted and by correcting for geographic distance, which we found to be a major contributor to selection signals; yield absolute rather than relative estimates of significance; identify specific geographic regions linked most strongly to particular signals of selection; and detect recent balancing as well as directional selection. Conclusions We find evidence of selection associated with climate (P < 10^-5) in 354 genes, and among these observe a highly significant enrichment for directional positive selection. Two of our strongest 'hits', however, ADRA2A and ADRA2C, implicated in vasoconstriction in response to cold and pain stimuli, show evidence of balancing selection. Our results clearly demonstrate evidence of climate-related signals of directional and balancing selection. PMID:24053227
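A Mantel test itself (the building block that MAGICS embeds in a GLM framework) is straightforward to sketch: correlate the off-diagonal entries of two distance matrices and assess significance by permuting the labels of one matrix. The toy distances below are illustrative:

```python
import random

def mantel(d1, d2, n_perm=999, seed=0):
    """Permutation Mantel test for two square distance matrices."""
    n = len(d1)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

    def corr(m1, m2, perm):
        a = [m1[i][j] for i, j in pairs]
        b = [m2[perm[i]][perm[j]] for i, j in pairs]
        na = len(a)
        am, bm = sum(a) / na, sum(b) / na
        num = sum((x - am) * (y - bm) for x, y in zip(a, b))
        den = (sum((x - am) ** 2 for x in a)
               * sum((y - bm) ** 2 for y in b)) ** 0.5
        return num / den

    ident = list(range(n))
    r_obs = corr(d1, d2, ident)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = ident[:]
        rng.shuffle(perm)
        if corr(d1, d2, perm) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

coords = [0.0, 1.0, 2.0, 5.0, 9.0]                            # sampling sites
d1 = [[abs(a - b) for b in coords] for a in coords]           # geographic distance
d2 = [[2.0 * abs(a - b) for b in coords] for a in coords]     # "trait" distance
r_obs, p = mantel(d1, d2)
```

MAGICS's correction for geographic distance matters precisely because, as in this toy example, geography alone can generate a strong matrix correlation.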
The Pattern of Neutral Molecular Variation under the Background Selection Model
Charlesworth, D.; Charlesworth, B.; Morgan, M. T.
1995-01-01
Stochastic simulations of the infinite sites model were used to study the behavior of genetic diversity at a neutral locus in a genomic region without recombination, but subject to selection against deleterious alleles maintained by recurrent mutation (background selection). In large populations, the effect of background selection on the number of segregating sites approaches the effect on nucleotide site diversity, i.e., the reduction in genetic variability caused by background selection resembles that caused by a simple reduction in effective population size. We examined, by coalescence-based methods, the power of several tests for the departure from neutral expectation of the frequency spectra of alleles in samples from randomly mating populations (Tajima's, Fu and Li's, and Watterson's tests). All of the tests have low power unless the selection against mutant alleles is extremely weak. In Drosophila, significant Tajima's tests are usually not obtained with empirical data sets from loci in genomic regions with restricted recombination frequencies and that exhibit low genetic diversity. This is consistent with the operation of background selection as opposed to selective sweeps. It remains to be decided whether background selection is sufficient to explain the observed extent of reduction in diversity in regions of restricted recombination. PMID:8601499
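As a concrete anchor for "Tajima's test": the statistic compares mean pairwise diversity pi with the scaled number of segregating sites S/a1, and the excess of rare variants discussed above pushes it negative. A sketch using the standard constants from Tajima (1989), with made-up summary values:

```python
import math

def tajimas_d(n, S, pi):
    """Tajima's D from sample size n, segregating sites S, and mean
    pairwise diversity pi (constants follow Tajima 1989)."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

a1_10 = sum(1.0 / i for i in range(1, 10))
d_neutral = tajimas_d(10, 16, 16 / a1_10)   # pi = S/a1, so D = 0
d_negative = tajimas_d(10, 16, 3.0)         # excess of rare variants, D < 0
```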
The quantitative genetics of indirect genetic effects: a selective review of modelling issues.
Bijma, P
2014-01-01
Indirect genetic effects (IGE) occur when the genotype of an individual affects the phenotypic trait value of another conspecific individual. IGEs can have profound effects on both the magnitude and the direction of response to selection. Models of inheritance and response to selection in traits subject to IGEs have been developed within two frameworks: a trait-based framework in which IGEs are specified as a direct consequence of individual trait values, and a variance-component framework in which phenotypic variance is decomposed into a direct and an indirect additive genetic component. This work is a selective review of the quantitative genetics of traits affected by IGEs, with a focus on modelling, estimation and interpretation issues. It includes a discussion on variance-component vs trait-based models of IGEs, a review of issues related to the estimation of IGEs from field data, including the estimation of the interaction coefficient Ψ (psi), and a discussion on the relevance of IGEs for response to selection in cases where the strength of interaction varies among pairs of individuals. An investigation of the trait-based model shows that the interaction coefficient Ψ may deviate considerably from the corresponding regression coefficient when feedback occurs. The increasing research effort devoted to IGEs suggests that they are a widespread phenomenon, probably particularly in natural populations and plants. Further work in this field should considerably broaden our understanding of the quantitative genetics of inheritance and response to selection in relation to the social organisation of populations.
Liu, Siwei; Rovine, Michael J; Molenaar, Peter C M
2012-03-01
With increasing popularity, growth curve modeling is more and more often considered as the first choice for analyzing longitudinal data. Although the growth curve approach is often a good choice, other modeling strategies may more directly answer questions of interest. It is common to see researchers fit growth curve models without considering alternative modeling strategies. In this article we compare 3 approaches for analyzing longitudinal data: repeated measures analysis of variance, covariance pattern models, and growth curve models. As all are members of the general linear mixed model family, they represent somewhat different assumptions about the way individuals change. These assumptions result in different patterns of covariation among the residuals around the fixed effects. In this article, we first indicate the kinds of data that are appropriately modeled by each and use real data examples to demonstrate possible problems associated with the blanket selection of the growth curve model. We then present a simulation that indicates the utility of the Akaike information criterion and Bayesian information criterion in the selection of a proper residual covariance structure. The results cast doubt on the popular practice of automatically using growth curve modeling for longitudinal data without comparing the fit of different models. Finally, we provide some practical advice for assessing mean changes in the presence of correlated data.
Journal selection decisions: a biomedical library operations research model. I. The framework.
Kraft, D H; Polacsek, R A; Soergel, L; Burns, K; Klair, A
1976-01-01
The problem of deciding which journal titles to select for acquisition in a biomedical library is modeled. The approach taken is based on cost/benefit ratios. Measures of journal worth, methods of data collection, and journal cost data are considered. The emphasis is on the development of a practical process for selecting journal titles, based on the objectivity and rationality of the model; and on the collection of the approprate data and library statistics in a reasonable manner. The implications of this process towards an overall management information system (MIS) for biomedical serials handling are discussed. PMID:820391
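The cost/benefit approach can be sketched as a greedy ranking by worth-to-cost ratio under a subscription budget. The titles, worth scores, and costs below are hypothetical, and greedy ranking is only one simple way to operationalize the ratio idea:

```python
def select_journals(journals, budget):
    """Greedy selection by worth-to-cost ratio under a budget; `worth`
    could come from citation or usage counts (hypothetical numbers here)."""
    ranked = sorted(journals, key=lambda j: j["worth"] / j["cost"], reverse=True)
    chosen, spent = [], 0.0
    for j in ranked:
        if spent + j["cost"] <= budget:
            chosen.append(j["title"])
            spent += j["cost"]
    return chosen, spent

journals = [
    {"title": "J1", "worth": 90, "cost": 300.0},
    {"title": "J2", "worth": 40, "cost": 80.0},
    {"title": "J3", "worth": 10, "cost": 120.0},
]
chosen, spent = select_journals(journals, budget=400.0)
# J2 (ratio 0.5) and J1 (ratio 0.3) fit the budget; J3 (ratio ~0.08) is cut.
```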
Anxolabéhère, D
1976-11-15
During the study of experimental Drosophila melanogaster populations, the adaptive values of three genotypes at the sepia locus were measured, and a frequency-dependent selection model was proposed. This model and the overdominance model are compared against the experimental populations; the fit is better with the frequency-dependent selection model.
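A one-locus sketch of the contrast: under frequency dependence, each allele's genotype loses fitness as that allele becomes common (rare-allele advantage), which protects the polymorphism. The fitness function and coefficients below are illustrative, not the sepia-locus estimates:

```python
def next_freq(p, fitness):
    """One generation of viability selection at a diallelic locus;
    `fitness(p)` returns genotype fitnesses (w11, w12, w22)."""
    q = 1.0 - p
    w11, w12, w22 = fitness(p)
    wbar = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (p * p * w11 + p * q * w12) / wbar

def freq_dep(p):
    # Each homozygote's fitness drops as its own allele becomes common
    # (coefficients are illustrative).
    return 1.0 - 0.5 * p, 1.0, 1.0 - 0.5 * (1.0 - p)

p_low, p_high = 0.05, 0.95
for _ in range(500):
    p_low = next_freq(p_low, freq_dep)
    p_high = next_freq(p_high, freq_dep)
# Both trajectories converge to the protected equilibrium at p = 0.5.
```

A fixed-fitness overdominance model can produce the same equilibrium frequencies, which is why distinguishing the two requires fitting the observed genotype fitnesses across a range of frequencies, as in the experiments described above.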
Using the Animal Model to Accelerate Response to Selection in a Self-Pollinating Crop
Cowling, Wallace A.; Stefanova, Katia T.; Beeck, Cameron P.; Nelson, Matthew N.; Hargreaves, Bonnie L. W.; Sass, Olaf; Gilmour, Arthur R.; Siddique, Kadambot H. M.
2015-01-01
We used the animal model in S0 (F1) recurrent selection in a self-pollinating crop including, for the first time, phenotypic and relationship records from self progeny, in addition to cross progeny, in the pedigree. We tested the model in Pisum sativum, the autogamous annual species used by Mendel to demonstrate the particulate nature of inheritance. Resistance to ascochyta blight (Didymella pinodes complex) in segregating S0 cross progeny was assessed by best linear unbiased prediction over two cycles of selection. Genotypic concurrence across cycles was provided by pure-line ancestors. From cycle 1, 102/959 S0 plants were selected, and their S1 self progeny were intercrossed and selfed to produce 430 S0 and 575 S2 individuals that were evaluated in cycle 2. The analysis was improved by including all genetic relationships (with crossing and selfing in the pedigree), additive and nonadditive genetic covariances between cycles, fixed effects (cycles and spatial linear trends), and other random effects. Narrow-sense heritability for ascochyta blight resistance was 0.305 and 0.352 in cycles 1 and 2, respectively, calculated from variance components in the full model. The fitted correlation of predicted breeding values across cycles was 0.82. Average accuracy of predicted breeding values was 0.851 for S2 progeny of S1 parent plants and 0.805 for S0 progeny tested in cycle 2, and 0.878 for S1 parent plants for which no records were available. The forecasted response to selection was 11.2% in the next cycle with 20% S0 selection proportion. This is the first application of the animal model to cyclic selection in heterozygous populations of selfing plants. The method can be used in genomic selection, and for traits measured on S0-derived bulks such as grain yield. PMID:25943522
NASA Astrophysics Data System (ADS)
Wu, Yu-Shu; Forsyth, Peter A.
2001-04-01
Selecting the proper primary variables is a critical step in efficiently modeling the highly nonlinear problem of multiphase subsurface flow in heterogeneous porous-fractured media. Current simulation and ground modeling techniques consist of (1) spatial discretization of mass and/or heat conservation equations using finite difference or finite element methods; (2) fully implicit time discretization; and (3) solving the nonlinear, discrete algebraic equations using a Newton iterative scheme. Previous modeling efforts indicate that the choice of primary variables for a Newton iteration not only impacts the computational performance of a numerical code, but may also determine the feasibility of a numerical modeling study in many field applications. This paper presents an analysis and general recommendations for selecting primary variables in simulating multiphase subsurface flow for one-active-phase (Richards' equation), two-phase (gas and liquid), and three-phase (gas, water, and nonaqueous phase liquid or NAPL) conditions. In many cases, a dynamic variable switching or variable substitution scheme may have to be used in order to achieve optimal numerical performance and robustness. The selection of primary variables depends in general on the sensitivity of the system of equations to the variables selected at given phase and flow conditions. We present a series of numerical tests and large-scale field simulation examples, including modeling one (active)-phase, two-phase, and three-phase flow problems in multi-dimensional, porous-fractured subsurface systems.
Howell, Bryan; Lad, Shivanand P.; Grill, Warren M.
2014-01-01
Spinal cord stimulation (SCS) is an alternative or adjunct therapy to treat chronic pain, a prevalent and clinically challenging condition. Although SCS has substantial clinical success, the therapy is still prone to failures, including lead breakage, lead migration, and poor pain relief. The goal of this study was to develop a computational model of SCS and use the model to compare activation of neural elements during intradural and extradural electrode placement. We constructed five patient-specific models of SCS. Stimulation thresholds predicted by the model were compared to stimulation thresholds measured intraoperatively, and we used these models to quantify the efficiency and selectivity of intradural and extradural SCS. Intradural placement dramatically increased stimulation efficiency and reduced the power required to stimulate the dorsal columns by more than 90%. Intradural placement also increased selectivity, allowing activation of a greater proportion of dorsal column fibers before spread of activation to dorsal root fibers, as well as more selective activation of individual dermatomes at different lateral deviations from the midline. Further, the results suggest that current electrode designs used for extradural SCS are not optimal for intradural SCS, and a novel azimuthal tripolar design increased stimulation selectivity, even beyond that achieved with an intradural paddle array. Increased stimulation efficiency is expected to increase the battery life of implantable pulse generators, increase the recharge interval of rechargeable implantable pulse generators, and potentially reduce stimulator volume. The greater selectivity of intradural stimulation may improve the success rate of SCS by mitigating the sensitivity of pain relief to malpositioning of the electrode. The outcome of this effort is a better quantitative understanding of how intradural electrode placement can potentially increase the selectivity and efficiency of SCS, which, in turn
Discrete choice modeling of shovelnose sturgeon habitat selection in the Lower Missouri River
Bonnot, T.W.; Wildhaber, M.L.; Millspaugh, J.J.; DeLonay, A.J.; Jacobson, R.B.; Bryan, J.L.
2011-01-01
Substantive changes to physical habitat in the Lower Missouri River, resulting from intensive management, have been implicated in the decline of pallid (Scaphirhynchus albus) and shovelnose (S. platorynchus) sturgeon. To aid in habitat rehabilitation efforts, we evaluated habitat selection of gravid, female shovelnose sturgeon during the spawning season in two sections (lower and upper) of the Lower Missouri River in 2005 and in the upper section in 2007. We fit discrete choice models within an information-theoretic framework to identify selection of means and variability in three components of physical habitat. Characterizing habitat within divisions around fish better explained selection than habitat values at the fish locations. In general, female shovelnose sturgeon were negatively associated with mean velocity between them and the bank and positively associated with variability in surrounding depths. For example, in the upper section in 2005, a 0.5 m s-1 decrease in velocity within 10 m in the bank direction increased the relative probability of selection by 70%. In the upper section, fish also selected sites with surrounding structure in depth (e.g., change in relief). Differences in models between sections and years, which are reinforced by validation rates, suggest that changes in habitat due to geomorphology, hydrology, and their interactions over time need to be addressed when evaluating habitat selection. Because of the importance of variability in surrounding depths, these results support an emphasis on restoring channel complexity as an objective of habitat restoration for shovelnose sturgeon in the Lower Missouri River.
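Effect sizes like the one quoted follow directly from the discrete-choice (conditional logit) form, in which a covariate change of delta multiplies the relative selection probability by exp(beta * delta). A small sketch, with a hypothetical coefficient chosen only to reproduce the quoted 70% figure:

```python
import math

# Hedged sketch: in a conditional-logit (discrete choice) habitat model, the
# relative probability of selecting a site changes multiplicatively by
# exp(beta * delta_x) when a covariate changes by delta_x. The coefficient
# below is illustrative, back-solved from the abstract's "70% increase for a
# 0.5 m/s velocity decrease"; it is not an estimate from the study.
beta_velocity = -math.log(1.7) / 0.5   # hypothetical coefficient on mean velocity

def relative_selection_ratio(beta, delta):
    """Multiplicative change in relative selection probability for a covariate change."""
    return math.exp(beta * delta)

ratio = relative_selection_ratio(beta_velocity, -0.5)  # velocity drops by 0.5 m/s
print(f"relative probability multiplied by {ratio:.2f}")
```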
Genomic Response to Selection for Predatory Behavior in a Mammalian Model of Adaptive Radiation.
Konczal, Mateusz; Koteja, Paweł; Orlowska-Feuer, Patrycja; Radwan, Jacek; Sadowska, Edyta T; Babik, Wiesław
2016-09-01
If genetic architectures of various quantitative traits are similar, as studies on model organisms suggest, comparable selection pressures should produce similar molecular patterns for various traits. To test this prediction, we used a laboratory model of vertebrate adaptive radiation to investigate the genetic basis of the response to selection for predatory behavior and compare it with evolution of aerobic capacity reported in an earlier work. After 13 generations of selection, the proportion of bank voles (Myodes [=Clethrionomys] glareolus) showing predatory behavior was five times higher in selected lines than in controls. We analyzed the hippocampus and liver transcriptomes and found repeatable changes in allele frequencies and gene expression. Genes with the largest differences between predatory and control lines are associated with hunger, aggression, biological rhythms, and functioning of the nervous system. Evolution of predatory behavior could be meaningfully compared with evolution of high aerobic capacity, because the experiments and analyses were performed in the same methodological framework. The number of genes that changed expression was much smaller in predatory lines, and allele frequencies changed repeatably in predatory but not in aerobic lines. This suggests that more variants of smaller effects underlie variation in aerobic performance, whereas fewer variants of larger effects underlie variation in predatory behavior. Our results thus contradict the view that comparable selection pressures for different quantitative traits produce similar molecular patterns. Therefore, to gain knowledge about molecular-level response to selection for complex traits, we need to investigate not only multiple replicate populations but also multiple quantitative traits.
Dynamics of Genetic Variability in Two-Locus Models of Stabilizing Selection
Gavrilets, S.; Hastings, A.
1994-01-01
We study a two-locus model, with additive contributions to the phenotype, to explore the dynamics of different phenotypic characteristics under stabilizing selection and recombination. We demonstrate that the interaction of selection and recombination results in constraints on the mode of phenotypic evolution. Let V(g) be the genic variance of the trait and C(L) be the contribution of linkage disequilibrium to the genotypic variance. We demonstrate that, independent of the initial conditions, the dynamics of the system on the plane (V(g), C(L)) are typically characterized by a quick approach to a straight line with slow evolution along this line afterward. We analyze how the mode and the rate of phenotypic evolution depend on the strength of selection relative to recombination, on the form of the fitness function, and on the difference in allelic effect. We argue that if selection is not extremely weak relative to recombination, linkage disequilibrium generated by stabilizing selection influences the dynamics significantly. We demonstrate that under these conditions, which are plausible in nature and certainly the case in artificial stabilizing selection experiments, the model can have a polymorphic equilibrium with positive linkage disequilibrium that is stable simultaneously with monomorphic equilibria. PMID:7828833
Using maximum entropy modeling for optimal selection of sampling sites for monitoring networks
Stohlgren, Thomas J.; Kumar, Sunil; Barnett, David T.; Evangelista, Paul H.
2011-01-01
Environmental monitoring programs must efficiently describe state shifts. We propose using maximum entropy modeling to select dissimilar sampling sites to capture environmental variability at low cost, and demonstrate a specific application: sample site selection for the Central Plains domain (453,490 km2) of the National Ecological Observatory Network (NEON). We relied on four environmental factors: mean annual temperature and precipitation, elevation, and vegetation type. A “sample site” was defined as a 20 km × 20 km area (equal to NEON’s airborne observation platform [AOP] footprint), within which each 1 km2 cell was evaluated for each environmental factor. After each model run, the most environmentally dissimilar site was selected from all potential sample sites. The iterative selection of eight sites captured approximately 80% of the environmental envelope of the domain, an improvement over stratified random sampling and simple random designs for sample site selection. This approach can be widely used for cost-efficient selection of survey and monitoring sites.
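The iterative "most dissimilar site" step can be sketched as a greedy max-min distance search in standardized environmental space. This is a simplified stand-in for the paper's maximum entropy modeling, run on synthetic data:

```python
import numpy as np

# Hedged sketch of the greedy idea: iteratively add the candidate site most
# dissimilar (largest minimum Euclidean distance in standardized environmental
# space) to the sites already chosen. Simplified stand-in for the paper's
# MaxEnt-based procedure; the data here are random toy values.
rng = np.random.default_rng(0)
env = rng.normal(size=(200, 4))                  # 200 candidate sites x 4 env. factors
env = (env - env.mean(0)) / env.std(0)           # standardize each factor

def select_dissimilar_sites(env, n_sites):
    chosen = [0]                                 # arbitrary seed site
    while len(chosen) < n_sites:
        d = np.linalg.norm(env[:, None, :] - env[None, chosen, :], axis=2)
        min_d = d.min(axis=1)                    # distance to nearest chosen site
        min_d[chosen] = -1.0                     # never re-pick a chosen site
        chosen.append(int(min_d.argmax()))       # most dissimilar candidate
    return chosen

sites = select_dissimilar_sites(env, 8)          # eight sites, as in the abstract
print("selected site indices:", sites)
```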
Zigler, Corwin Matthew; Dominici, Francesca
2014-01-01
Causal inference with observational data frequently relies on the notion of the propensity score (PS) to adjust treatment comparisons for observed confounding factors. As decisions in the era of “big data” are increasingly reliant on large and complex collections of digital data, researchers are frequently confronted with decisions regarding which of a high-dimensional covariate set to include in the PS model in order to satisfy the assumptions necessary for estimating average causal effects. Typically, simple or ad-hoc methods are employed to arrive at a single PS model, without acknowledging the uncertainty associated with the model selection. We propose three Bayesian methods for PS variable selection and model averaging that 1) select relevant variables from a set of candidate variables to include in the PS model and 2) estimate causal treatment effects as weighted averages of estimates under different PS models. The associated weight for each PS model reflects the data-driven support for that model’s ability to adjust for the necessary variables. We illustrate features of our proposed approaches with a simulation study, and ultimately use our methods to compare the effectiveness of surgical vs. nonsurgical treatment for brain tumors among 2,606 Medicare beneficiaries. Supplementary materials are available online. PMID:24696528
Variable selection in multivariate modeling of drug product formula and manufacturing process.
Cui, Yong; Song, Xiling; Chuang, King; Venkatramani, Cadapakam; Lee, Sueanne; Gallegos, Gregory; Venkateshwaran, Thirunellai; Xie, Minli
2012-12-01
Multivariate data analysis methods such as partial least square (PLS) modeling have been increasingly applied to pharmaceutical product development. This study applied the PLS modeling to analyze a product development dataset generated from a design of experiment and historical batch data. Attention was paid in particular to the assessment of the importance of predictor variables, and subsequently the variable selection in the PLS modeling. The assessment indicated that irrelevant and collinear predictors could be extensively present in the initial PLS model. Therefore, variable selection is an important step in the optimization of the pharmaceutical product process model. The variable importance for projections (VIP) and coefficient values can be employed to rank the importance of predictors. On the basis of this ranking, the irrelevant predictors can be removed. To further reduce collinear predictors, multiple rounds of PLS modeling on different combinations of predictors may be necessary. To this end, stepwise reduction of predictors based on their VIP/coefficient ranking was introduced and was proven to be an effective approach to identify and remove redundant collinear predictors. Overall, the study demonstrated that the variable selection procedure implemented herein can effectively evaluate the importance of variables and optimize models of drug product processes.
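VIP-based ranking of predictors can be illustrated with a one-component PLS fit, for which the VIP formula collapses to sqrt(p) * |w_j| for the unit-norm weight vector w. A hedged numpy sketch on synthetic data (not the study's formulation dataset, and simpler than a full multi-component PLS):

```python
import numpy as np

# One-component PLS weights are proportional to X'y after centering; with a
# single component and unit-norm w, VIP_j = sqrt(p) * |w_j|. Predictors with
# VIP below 1 are dropped, a common retention heuristic.
rng = np.random.default_rng(1)
n, p = 60, 6
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=n)  # cols 0 and 3 relevant

Xc, yc = X - X.mean(0), y - y.mean()
w = Xc.T @ yc
w /= np.linalg.norm(w)               # PLS component-1 weight vector
vip = np.sqrt(p) * np.abs(w)         # one-component VIP scores

keep = np.where(vip >= 1.0)[0]       # "VIP >= 1" retention heuristic
print("VIP:", np.round(vip, 2))
print("retained predictors:", keep)
```

Repeating the fit after each removal round mimics the stepwise reduction the study describes for weeding out collinear predictors.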
Novel approach to evolutionary neural network based descriptor selection and QSAR model development
NASA Astrophysics Data System (ADS)
Debeljak, Željko; Marohnić, Viktor; Srečnik, Goran; Medić-Šarić, Marica
2005-12-01
The capability of an evolutionary neural network (ENN) based QSAR approach to direct the descriptor selection process towards a stable descriptor subset (DS) composition characterized by acceptable generalization, as well as the influence of descriptor subset stability on QSAR model interpretation, have been examined. In order to analyze DS stability and QSAR model generalization properties, multiple random dataset partitions into training and test sets were made. The acceptability criteria proposed by Golbraikh et al. [J. Comput.-Aided Mol. Des., 17 (2003) 241] were chosen for selection of highly predictive QSAR models from the set of all models produced by ENN for each dataset splitting. All QSAR models generated by ENN for each dataset partition that pass Golbraikh's filter were collected. Two principles for forming the final DS were compared. The standard principle is based on selection of the descriptors with the highest frequencies among all descriptors that appear in the pool [J. Chem. Inf. Comput. Sci., 43 (2003) 949]. The basis of the novel approach is a search across the model pool for DSs that are stable against multiple dataset subsampling, i.e., universal DS solutions. Based on the described principles, a benzodiazepine QSAR has been proposed and evaluated against results reported by others in terms of final DS composition and model predictive performance.
The Sim-SEQ Project: Comparison of Selected Flow Models for the S-3 Site
Mukhopadhyay, Sumit; Doughty, Christine A.; Bacon, Diana H.; Li, Jun; Wei, Lingli; Yamamoto, Hajime; Gasda, Sarah E.; Hosseini, Seyyed; Nicot, Jean-Philippe; Birkholzer, Jens
2015-05-23
Sim-SEQ is an international initiative on model comparison for geologic carbon sequestration, with an objective to understand and, if possible, quantify model uncertainties. Model comparison efforts in Sim-SEQ are at present focusing on one specific field test site, hereafter referred to as the Sim-SEQ Study site (or S-3 site). Within Sim-SEQ, different modeling teams are developing conceptual models of CO2 injection at the S-3 site. In this paper, we select five flow models of the S-3 site and provide a qualitative comparison of their attributes and predictions. These models are based on five different simulators or modeling approaches: TOUGH2/EOS7C, STOMP-CO2e, MoReS, TOUGH2-MP/ECO2N, and VESA. In addition to model-to-model comparison, we perform a limited model-to-data comparison, and illustrate how model choices impact model predictions. We conclude the paper by making recommendations for model refinement that are likely to result in less uncertainty in model predictions.
Hodge, N. E.; Ferencz, R. M.; Vignes, R. M.
2016-05-30
Selective laser melting (SLM) is an additive manufacturing process in which multiple, successive layers of metal powders are heated via laser in order to build a part. Modeling of SLM requires consideration of the complex interaction between heat transfer and solid mechanics. The present work describes the authors' initial efforts to validate their first-generation model. In particular, a comparison of model-generated solid mechanics results, including both deformation and stresses, is presented. Additionally, results of various perturbations of the process parameters and modeling strategies are discussed.
A Model-Based Approach for Identifying Signatures of Ancient Balancing Selection in Genetic Data
DeGiorgio, Michael; Lohmueller, Kirk E.; Nielsen, Rasmus
2014-01-01
While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates. PMID:25144706
Novel harmonic regularization approach for variable selection in Cox's proportional hazards model.
Chu, Ge-Jin; Liang, Yong; Wang, Jia-Xuan
2014-01-01
Variable selection is an important issue in regression, and a number of variable selection methods involving nonconvex penalty functions have been proposed. In this paper, we investigate a novel harmonic regularization method, which can approximate nonconvex Lq (1/2 < q < 1) regularizations, to select key risk factors in Cox's proportional hazards model using microarray gene expression data. The harmonic regularization method can be efficiently solved using our proposed direct path seeking approach, which can produce solutions that closely approximate those for the convex loss function and the nonconvex regularization. Simulation results based on artificial datasets and four real microarray gene expression datasets, such as the diffuse large B-cell lymphoma (DLBCL), lung cancer, and AML datasets, show that the harmonic regularization method can be more accurate for variable selection than existing Lasso-series methods.
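The qualitative behavior of nonconvex Lq penalties, aggressive thresholding of small coefficients with less bias on large ones than the Lasso, can be seen from the scalar shrinkage problem. An illustrative brute-force sketch (not the authors' harmonic regularization or path-seeking solver):

```python
import numpy as np

# Compare the scalar shrinkage operators b* = argmin_b 0.5*(z - b)^2 + lam*|b|^q
# for the convex L1 penalty (q = 1) and a nonconvex Lq penalty (q = 0.75),
# solved here by brute-force grid search for illustration.
def shrink(z, lam, q, grid=np.linspace(-5, 5, 200001)):
    obj = 0.5 * (z - grid) ** 2 + lam * np.abs(grid) ** q
    return grid[np.argmin(obj)]

lam = 1.0
for z in (0.8, 3.0):
    b_l1 = shrink(z, lam, 1.0)
    b_lq = shrink(z, lam, 0.75)
    print(f"z={z}: L1 -> {b_l1:.2f}, L0.75 -> {b_lq:.2f}")
```

Both penalties send the small input to zero, but for the large input the Lq solution stays closer to z than the soft-thresholded L1 solution, i.e., less shrinkage bias on strong risk factors.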
Tourret, Damien; Clarke, Amy J.; Imhoff, Seth D.; Gibbs, Paul J.; Gibbs, John W.; Karma, Alain
2015-05-27
We present a three-dimensional extension of the multiscale dendritic needle network (DNN) model. This approach enables quantitative simulations of the unsteady dynamics of complex hierarchical networks in spatially extended dendritic arrays. We apply the model to directional solidification of Al-9.8 wt.%Si alloy and directly compare the model predictions with measurements from experiments with in situ x-ray imaging. The focus is on the dynamical selection of primary spacings over a range of growth velocities, and the influence of sample geometry on the selection of spacings. Simulation results show good agreement with experiments. The computationally efficient DNN model opens new avenues for investigating the dynamics of large dendritic arrays at scales relevant to solidification experiments and processes.
Peterson, Christine B; Stingo, Francesco C; Vannucci, Marina
2016-03-30
In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications because it allows the identification of pathways of functionally related genes or proteins that impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival.
Pattern selection in a boundary-layer model of dendritic growth in the presence of impurities
NASA Technical Reports Server (NTRS)
Karma, A.; Kotliar, B. G.
1985-01-01
Presently analyzed, in the context of a boundary-layer model, is the problem of pattern selection in dendritic growth in a situation where impurities are present in the undercooled liquid. It is found that the tip-velocity selection criterion that has been proposed recently for the geometrical model and the boundary-layer model of a pure substance can be extended, in a nontrivial way, to this more complex situation where two coupled diffusion fields (temperature and solute) determine the interface dynamics. This model predicts a sharp enhancement of tip velocity in good qualitative agreement with experiment. This agreement is consistent with the conjecture that a solvability condition can be used to determine the operating point of the dendrite in the full nonlocal problem.
Bayesian parameter inference and model selection by population annealing in systems biology.
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of a parameter with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that it can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with the un-identifiability of the representative values of parameters, we proposed running the simulations with the parameter ensemble sampled from the posterior distribution, named the "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm for generating the posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and to conduct model selection based on the Bayes factor. PMID:25089832
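A much-simplified sketch of the core idea, filtering a particle population through a decreasing tolerance schedule with resampling and jitter, ABC-style, on a toy Gaussian-mean problem. The schedule, jitter width, and model are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

# Annealed rejection-ABC sketch (inspired by, not identical to, population
# annealing): particles drawn from the prior are repeatedly filtered through
# tighter tolerances on a summary statistic, with resampling and jittering.
rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.0, size=100)   # observed toy data
obs_stat = data.mean()                            # summary statistic

n_particles = 2000
particles = rng.uniform(-10, 10, n_particles)     # prior: Uniform(-10, 10) on the mean

for eps in (2.0, 1.0, 0.5, 0.2):                  # annealed tolerance schedule
    sims = rng.normal(loc=particles, scale=1.0, size=(100, n_particles)).mean(axis=0)
    accept = np.abs(sims - obs_stat) < eps        # keep particles whose simulations match
    survivors = particles[accept]
    idx = rng.integers(0, len(survivors), n_particles)
    particles = survivors[idx] + rng.normal(0, 0.1, n_particles)  # resample + jitter

post_mean = particles.mean()                      # posterior parameter ensemble summary
print(f"posterior mean estimate: {post_mean:.2f}")
```

The final `particles` array plays the role of the "posterior parameter ensemble": simulations re-run over it propagate posterior uncertainty into predictions.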
Zawbaa, Hossam M; Szlȩk, Jakub; Grosan, Crina; Jachowicz, Renata; Mendyk, Aleksander
2016-01-01
Poly-lactide-co-glycolide (PLGA) is a copolymer of lactic and glycolic acid. Drug release from PLGA microspheres depends not only on polymer properties but also on drug type, particle size, morphology of microspheres, release conditions, etc. Selecting a subset of relevant properties for PLGA is a challenging machine learning task, as there are over three hundred features to consider. In this work, we formulate the selection of critical attributes for PLGA as a multiobjective optimization problem with the aim of minimizing the error of predicting the dissolution profile while reducing the number of attributes selected. Four bio-inspired optimization algorithms: antlion optimization, a binary version of antlion optimization, grey wolf optimization, and social spider optimization are used to select the optimal feature set for predicting the dissolution profile of PLGA. Besides these, the LASSO algorithm is also used for comparison. Selection of crucial variables is performed under the assumption that both predictability and model simplicity are of equal importance to the final result. During the feature selection process, a set of input variables is employed to find the minimum generalization error across different predictive models and their settings/architectures. The methodology is evaluated using predictive modeling for which various tools are chosen, such as Cubist, random forests, artificial neural networks (monotonic MLP, deep learning MLP), multivariate adaptive regression splines, classification and regression trees, and hybrid systems of fuzzy logic and evolutionary computations (fugeR). The experimental results are compared with the results reported by Szlȩk. We obtain a normalized root mean square error (NRMSE) of 15.97% versus 15.4%, and the number of selected input features is smaller, nine versus eleven. PMID:27315205
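For reference, one common definition of the NRMSE quoted above normalizes the RMSE by the observed range; the paper may use a different normalization (e.g., by 100% dissolved), and the profile values below are toy numbers:

```python
import math

# Hedged sketch: range-normalized root mean square error between an observed
# and a predicted dissolution profile. Illustrative definition and toy data,
# not necessarily the exact normalization used in the study.
def nrmse(observed, predicted):
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))
    return 100.0 * rmse / (max(observed) - min(observed))

obs = [5.0, 20.0, 45.0, 70.0, 90.0]    # % drug dissolved over time (toy profile)
pred = [8.0, 18.0, 50.0, 66.0, 92.0]
print(f"NRMSE = {nrmse(obs, pred):.2f}%")
```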
Zawbaa, Hossam M.; Szlȩk, Jakub; Grosan, Crina; Jachowicz, Renata; Mendyk, Aleksander
2016-01-01
Poly-lactide-co-glycolide (PLGA) is a copolymer of lactic and glycolic acid. Drug release from PLGA microspheres depends not only on polymer properties but also on drug type, particle size, morphology of microspheres, release conditions, etc. Selecting a subset of relevant properties for PLGA is a challenging machine learning task as there are over three hundred features to consider. In this work, we formulate the selection of critical attributes for PLGA as a multiobjective optimization problem with the aim of minimizing the error of predicting the dissolution profile while reducing the number of attributes selected. Four bio-inspired optimization algorithms (antlion optimization, a binary version of antlion optimization, grey wolf optimization, and social spider optimization) are used to select the optimal feature set for predicting the dissolution profile of PLGA. Besides these, the LASSO algorithm is also used for comparison. Selection of crucial variables is performed under the assumption that both predictability and model simplicity are of equal importance to the final result. During the feature selection process, a set of input variables is employed to find the minimum generalization error across different predictive models and their settings/architectures. The methodology is evaluated using predictive modeling for which various tools are chosen, such as Cubist, random forests, artificial neural networks (monotonic MLP, deep learning MLP), multivariate adaptive regression splines, classification and regression trees, and hybrid systems of fuzzy logic and evolutionary computations (fugeR). The experimental results are compared with the results reported by Szlȩk. We obtain a normalized root mean square error (NRMSE) of 15.97% versus 15.4%, and the number of selected input features is smaller, nine versus eleven. PMID:27315205
Application Of Decision Tree Approach To Student Selection Model- A Case Study
NASA Astrophysics Data System (ADS)
Harwati; Sudiya, Amby
2016-01-01
The main purpose of the institution is to provide quality education to its students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's admission routes is administrative screening based on prospective students' high-school records, without a written test. Currently, this kind of selection has no standard model or criteria. Selection is done only by comparing candidates' application files, so subjective assessment is very likely to occur because there are no standard criteria to differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes criteria with defined standards, such as region of origin, school status, average grade, and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students in previous years who entered the university through the same route. The decision tree method with the C4.5 algorithm is used here. The results show that students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average grade above 75, and earned at least one achievement during high school.
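The heart of C4.5 is choosing, at each tree node, the attribute with the highest gain ratio (information gain normalized by split entropy). A minimal sketch of that criterion, with purely hypothetical applicant records standing in for the admission data:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """C4.5 splitting criterion: information gain divided by split entropy."""
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical applicant records; "region" perfectly separates the outcomes.
rows = [
    {"region": "Java", "school": "public"},
    {"region": "Java", "school": "private"},
    {"region": "other", "school": "public"},
    {"region": "other", "school": "private"},
]
labels = ["admit", "admit", "reject", "reject"]
best = max(["region", "school"], key=lambda a: gain_ratio(rows, labels, a))
```

C4.5 then recurses on each branch of the chosen attribute until the leaves are (nearly) pure, which is how rule sets like the one quoted in the abstract emerge.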
The C1C2: A framework for simultaneous model selection and assessment
Eklund, Martin; Spjuth, Ola; Wikberg, Jarl ES
2008-01-01
Background There has been recent concern regarding the inability of predictive modeling approaches to generalize to new data. Some of the problems can be attributed to improper methods for model selection and assessment. Here, we have addressed this issue by introducing a novel and general framework, the C1C2, for simultaneous model selection and assessment. The framework relies on a partitioning of the data in order to separate model choice from model assessment in terms of the data used. Since the number of conceivable models is in general vast, it was also of interest to investigate the employment of two automatic search methods, a genetic algorithm and a brute-force method, for model choice. As a demonstration, the C1C2 was applied to simulated and real-world datasets. A penalized linear model was assumed to reasonably approximate the true relation between the dependent and independent variables, thus reducing the model choice problem to a matter of variable selection and choice of penalizing parameter. We also studied the impact of assuming prior knowledge about the number of relevant variables on model choice and generalization error estimates. The results obtained with the C1C2 were compared to those obtained by employing repeated K-fold cross-validation for choosing and assessing a model. Results The C1C2 framework performed well at finding the true model in terms of choosing the correct variable subset and producing reasonable choices for the penalizing parameter, even in situations when the independent variables were highly correlated and when the number of observations was less than the number of variables. The C1C2 framework was also found to give accurate estimates of the generalization error. Prior information about the number of important independent variables improved the variable subset choice but reduced the accuracy of generalization error estimates. Using the genetic algorithm worsened the model choice but not the generalization error estimates.
Russon, A E; Galdikas, B M
1995-03-01
We discuss selectivity in great ape imitation, on the basis of an observational study of spontaneous imitation in free-ranging rehabilitant orangutans (Pongo pygmaeus). Research on great ape imitation has neglected selectivity, although comparative evidence suggests it may be important. We observed orangutans in central Indonesian Borneo and assessed patterns in the models and actions they spontaneously imitated. The patterns we found resembled those reported in humans. Orangutans preferred models with whom they had positive affective relationships (e.g., important caregiver or older sibling) and actions that reflected their current competence, were receptively familiar, and were relevant to tasks that faced them. Both developmental and individual variability were found. We discuss the probable functions of imitation for great apes and the role of selectivity in directing it. We also make suggestions for more effective elicitation of imitation.
A semiparametric graphical modelling approach for large-scale equity selection
Liu, Han; Mulvey, John; Zhao, Tianqi
2016-01-01
We propose a new stock selection strategy that exploits rebalancing returns and improves portfolio performance. To effectively harvest rebalancing gains, we apply ideas from elliptical-copula graphical modelling and stability inference to select stocks that are as independent as possible. The proposed elliptical-copula graphical model has a latent Gaussian representation; its structure can be effectively inferred using the regularized rank-based estimators. The resulting algorithm is computationally efficient and scales to large data-sets. To show the efficacy of the proposed method, we apply it to conduct equity selection based on a 16-year health care stock data-set and a large 34-year stock data-set. Empirical tests show that the proposed method is superior to alternative strategies including a principal component analysis-based approach and the classical Markowitz strategy based on the traditional buy-and-hold assumption. PMID:28316507
Modeling the Temperature Fields of Copper Powder Melting in the Process of Selective Laser Melting
NASA Astrophysics Data System (ADS)
Saprykin, A. A.; Ibragimov, E. A.; Babakova, E. V.
2016-08-01
Various process variables influence the quality of the end product when synthesizing items from powder materials by SLM (Selective Laser Melting). The authors of the paper suggest using a model of the temperature field distribution when forming single tracks and layers of copper powder PMS-1. Based on the modeling results, it is proposed to reduce the melting of powder particles outside the scanning area.
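Temperature-field models of this kind ultimately discretize the heat equation. As a minimal sketch only, here is an explicit finite-difference step for 1-D heat conduction with a hypothetical hot spot standing in for the laser track; material constants and grid sizes are illustrative, not the paper's:

```python
def heat_step(u, alpha, dx, dt):
    """One explicit finite-difference step of the 1-D heat equation with fixed
    (Dirichlet) boundaries; stable when alpha*dt/dx**2 <= 0.5."""
    r = alpha * dt / dx ** 2
    inner = [u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]

# Hypothetical powder-bed slice (K): ambient 300 K with a laser hot spot mid-track.
u = [300.0] * 21
u[10] = 1400.0
for _ in range(200):
    u = heat_step(u, alpha=1e-5, dx=1e-4, dt=2e-4)   # r = 0.2, within stability limit
```

After the loop the peak has cooled and the heat has spread into neighbouring cells; 2-D/3-D SLM models follow the same stencil logic with a moving source term.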
Liu, Fei; Sun, Guang-ming; He, Yong
2010-01-01
Near infrared (NIR) spectroscopy combined with variable selection method of modeling power was investigated for the fast and accurate geographical origin discrimination of auricularia auricula. A total of 240 samples of auriculari auricula were collected in the market, and the spectra of all samples were scanned within the spectral region of 1100-2500 nm. The calibration set was composed of 180 (45 samples for each origin) samples, and the remaining 60 samples were employed as the validation set. The optimal partial least squares (PLS) discriminant model was achieved after performance comparison of different preprocessing (Savitzky-Golay smoothing, standard normal variate, 1-derivative, and 2-derivative). The effective wavelengths, which were selected by modeling power (MP) and used as input data matrix of least squares-support vector machine (LS-SVM), were employed for the development of modeling power-least squares-support vector machine (MP-LS-SVM) model. Radial basis function (RBF) kernel was applied as kernel function. Three threshold methods for variable selection by modeling power were applied in MP-LSSVM models, and there were the values of modeling power higher than 0.95, higher than 0.90, and higher than 0.90 combined with peak location (0.90+Peak). The correct recognition ratio in the validation set was used as evaluation standards. The absolute error of prediction was set as 0.1, 0.2 and 0.5, which showed the wrong recognition threshold value. The results indicated that the MP-LS-SVM (0.90+Peak) model could achieve the optimal performance in all three absolute error standards (0.1, 0.2 and 0.5), and the correct recognition ratio was 98.3%, 100% and 100%, respectively. The variable selection threshold (0.90+Peak) was the most suitable one in the application of modeling power. It was concluded that modeling power was an effective variable selection method, and near infrared spectroscopy combined with MP-LS-SVM model was successfully applied for the origin
On Selective Harvesting of an Inshore-Offshore Fishery: A Bioeconomic Model
ERIC Educational Resources Information Center
Purohit, D.; Chaudhuri, K. S.
2004-01-01
A bioeconomic model is developed for the selective harvesting of a single species, inshore-offshore fishery, assuming that the growth of the species is governed by the Gompertz law. The dynamical system governing the fishery is studied in depth; the local and global stability of its non-trivial steady state are examined. Existence of a bionomic…
An Associative Index Model for the Results List Based on Vannevar Bush's Selection Concept
ERIC Educational Resources Information Center
Cole, Charles; Julien, Charles-Antoine; Leide, John E.
2010-01-01
Introduction: We define the results list problem in information search and suggest the "associative index model", an ad-hoc, user-derived indexing solution based on Vannevar Bush's description of an associative indexing approach for his memex machine. We further define what selection means in indexing terms with reference to Charles…
ERIC Educational Resources Information Center
Moses, Tim; Holland, Paul W.
2010-01-01
In this study, eight statistical strategies were evaluated for selecting the parameterizations of loglinear models for smoothing the bivariate test score distributions used in nonequivalent groups with anchor test (NEAT) equating. Four of the strategies were based on significance tests of chi-square statistics (Likelihood Ratio, Pearson,…
Mathematical analysis and modeling of motion direction selectivity in the retina.
Escobar, María-José; Pezo, Danilo; Orio, Patricio
2013-11-01
Motion detection is one of the most important and primitive computations performed by our visual system. Specifically in the retina, ganglion cells producing motion direction-selective responses have been addressed by different disciplines, such as mathematics, neurophysiology and computational modeling, since the beginnings of vision science. Although a number of studies have analyzed theoretical and mathematical considerations for such responses, a clear picture of the underlying cellular mechanisms is only recently emerging. In general, motion direction selectivity is based on a non-linear asymmetric computation inside a receptive field differentiating cell responses between preferred and null direction stimuli. To what extent can biological findings match these considerations? In this review, we outline theoretical and mathematical studies of motion direction selectivity, aiming to map the properties of the models onto the neural circuitry and synaptic connectivity found in the retina. Additionally, we review several compartmental models that have tried to fill this gap. Finally, we discuss the remaining challenges that computational models will have to tackle in order to fully understand the retinal motion direction-selective circuitry.
NASA Astrophysics Data System (ADS)
Taormina, R.; Galelli, S.; Karakaya, G.; Ahipasaoglu, S. D.
2016-11-01
This work investigates the uncertainty associated to the presence of multiple subsets of predictors yielding data-driven models with the same, or similar, predictive accuracy. To handle this uncertainty effectively, we introduce a novel input variable selection algorithm, called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS), specifically conceived to identify all alternate subsets of predictors in a given dataset. The search process is based on a four-objective optimization problem that minimizes the number of selected predictors, maximizes the predictive accuracy of a data-driven model and optimizes two information theoretic metrics of relevance and redundancy, which guarantee that the selected subsets are highly informative and with little intra-subset similarity. The algorithm is first tested on two synthetic test problems and then demonstrated on a real-world streamflow prediction problem in the Yampa River catchment (US). Results show that complex hydro-meteorological datasets are characterized by a large number of alternate subsets of predictors, which provides useful insights on the underlying physical processes. Furthermore, the presence of multiple subsets of predictors (and associated models) helps find a better trade-off between different measures of predictive accuracy commonly adopted for hydrological modelling problems.
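Multi-objective subset selection of this kind returns a Pareto front rather than a single winner. A minimal sketch of the dominance filter underlying any such search, on hypothetical (subset size, prediction error) pairs, both minimized:

```python
def pareto_front(points):
    """Keep the points for which no other point is at least as good in
    every objective (all objectives minimized)."""
    front = []
    for p in points:
        dominated = any(
            all(q[i] <= p[i] for i in range(len(p))) and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidate subsets: (number of predictors, validation error).
subsets = [(2, 0.10), (3, 0.08), (5, 0.08), (4, 0.12)]
front = pareto_front(subsets)
```

W-QEISS optimizes four objectives rather than two, but the same dominance test decides which quasi-equally-informative subsets survive.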
A supplier-selection model with classification and joint replenishment of inventory items
NASA Astrophysics Data System (ADS)
Mohammaditabar, Davood; Hassan Ghodsypour, Seyed
2016-06-01
Since inventory costs are closely related to suppliers, many models in the literature have selected the suppliers and also allocated orders, simultaneously. Such models usually consider either a single inventory item or multiple inventory items which have independent holding and ordering costs. However, in practice, ordering multiple items from the same supplier leads to a reduction in ordering costs. This paper presents a model for the capacity-constrained supplier-selection and order-allocation problem, which considers the joint replenishment of inventory items with a direct grouping approach. In such supplier-selection problems, the following costs are considered: a fixed major ordering cost for each supplier, which is independent of the items in the order; a minor ordering cost for each item ordered from each supplier; and the inventory holding and purchasing costs. To solve the developed NP-hard problem, a simulated annealing algorithm was proposed and then compared to a modified genetic algorithm from the literature. A numerical example showed that the number of groups and of selected suppliers was reduced when the major ordering cost increased relative to the other costs. There were also greater savings when the number of groups was determined by the model rather than by a predetermined number of groups or a no-grouping scenario.
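The simulated annealing metaheuristic mentioned above can be sketched generically: worse moves are accepted with probability exp(-delta/T) and the temperature is gradually lowered. The toy one-dimensional objective below merely stands in for the paper's (much richer) supplier/grouping cost function; all names are ours:

```python
import math
import random

def simulated_annealing(init, cost, neighbour, t0=1.0, cooling=0.995, steps=5000, seed=1):
    """Generic SA loop: always accept improvements, accept worse moves with
    probability exp(-delta/T), and geometrically cool the temperature."""
    rng = random.Random(seed)
    state = best = init
    t = t0
    for _ in range(steps):
        cand = neighbour(state, rng)
        delta = cost(cand) - cost(state)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            state = cand
            if cost(state) < cost(best):
                best = state
        t *= cooling
    return best

# Toy stand-in objective: an integer "allocation" with minimum cost at x = 3.
best = simulated_annealing(
    init=10,
    cost=lambda x: (x - 3) ** 2,
    neighbour=lambda x, rng: x + rng.choice([-1, 1]),
)
```

In the supplier-selection setting, a state would encode the item-to-group and group-to-supplier assignments, and the neighbour move would reassign one item or swap one supplier.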
An Actuarial Model for Selecting Participants for a Special Medical Education Program.
ERIC Educational Resources Information Center
Walker-Bartnick, Leslie; And Others
An actuarial model applied to the selection process of a special medical school program at the University of Maryland School of Medicine was tested. The 77 students in the study sample were admitted to the university's Fifth Pathway Program, which is designed for U.S. citizens who completed their medical school training, except for internship and…
Bocedi, Greta; Reid, Jane M
2015-01-01
Explaining the evolution and maintenance of polyandry remains a key challenge in evolutionary ecology. One appealing explanation is the sexually selected sperm (SSS) hypothesis, which proposes that polyandry evolves due to indirect selection stemming from positive genetic covariance with male fertilization efficiency, and hence with a male's success in postcopulatory competition for paternity. However, the SSS hypothesis relies on verbal analogy with “sexy-son” models explaining coevolution of female preferences for male displays, and explicit models that validate the basic SSS principle are surprisingly lacking. We developed analogous genetically explicit individual-based models describing the SSS and “sexy-son” processes. We show that the analogy between the two is only partly valid, such that the genetic correlation arising between polyandry and fertilization efficiency is generally smaller than that arising between preference and display, resulting in less reliable coevolution. Importantly, indirect selection was too weak to cause polyandry to evolve in the presence of negative direct selection. Negatively biased mutations on fertilization efficiency did not generally rescue runaway evolution of polyandry unless realized fertilization was highly skewed toward a single male, and coevolution was even weaker given random mating order effects on fertilization. Our models suggest that the SSS process is, on its own, unlikely to generally explain the evolution of polyandry. PMID:25330405
Aggressive Adolescents in Residential Care: A Selective Review of Treatment Requirements and Models
ERIC Educational Resources Information Center
Knorth, Erik J.; Klomp, Martin; Van den Bergh, Peter M.; Noom, Marc J.
2007-01-01
This article presents a selective inventory of treatment methods of aggressive behavior. Special attention is paid to types of intervention that, according to research, are frequently used in Dutch residential youth care. These methods are based on (1) principles of (cognitive) behavior management and control, (2) the social competence model, and…
Variable selection in subdistribution hazard frailty models with competing risks data
Do Ha, Il; Lee, Minjung; Oh, Seungyoung; Jeong, Jong-Hyeon; Sylvester, Richard; Lee, Youngjo
2014-01-01
The proportional subdistribution hazards model (i.e. Fine-Gray model) has been widely used for analyzing univariate competing risks data. Recently, this model has been extended to clustered competing risks data via frailty. To the best of our knowledge, however, there has been no literature on variable selection method for such competing risks frailty models. In this paper, we propose a simple but unified procedure via a penalized h-likelihood (HL) for variable selection of fixed effects in a general class of subdistribution hazard frailty models, in which random effects may be shared or correlated. We consider three penalty functions (LASSO, SCAD and HL) in our variable selection procedure. We show that the proposed method can be easily implemented using a slight modification to existing h-likelihood estimation approaches. Numerical studies demonstrate that the proposed procedure using the HL penalty performs well, providing a higher probability of choosing the true model than LASSO and SCAD methods without losing prediction accuracy. The usefulness of the new method is illustrated using two actual data sets from multi-center clinical trials. PMID:25042872
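The LASSO penalty used above performs variable selection through soft-thresholding: coefficients whose magnitude falls below the penalty level are set exactly to zero. A minimal sketch of that operator (the paper's actual procedure embeds it in penalized h-likelihood estimation; the coefficient values here are illustrative):

```python
def soft_threshold(z, lam):
    """LASSO proximal step: shrink z toward zero by lam, zeroing small values."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Coefficients below the penalty level are dropped from the model entirely:
# this exact zeroing is how an L1 penalty performs variable selection.
coefs = [2.5, -0.3, 0.8, -1.7, 0.05]
selected = [soft_threshold(b, 0.5) for b in coefs]
```

SCAD behaves similarly near zero but relaxes the shrinkage for large coefficients, which is why it (and the HL penalty) can retain true effects with less bias.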
A coalescent dual process in a Moran model with genic selection.
Etheridge, A M; Griffiths, R C
2009-06-01
A coalescent dual process for a multi-type Moran model with genic selection is derived using a generator approach. This leads to an expansion of the transition functions in the Moran model and the Wright-Fisher diffusion process limit in terms of the transition functions for the coalescent dual. A graphical representation of the Moran model (in the spirit of Harris) identifies the dual as a strong dual process following typed lines backwards in time. An application is made to the harmonic measure problem of finding the joint probability distribution of the time to the first loss of an allele from the population and the distribution of the surviving alleles at the time of loss. Our dual process mirrors the Ancestral Selection Graph of [Krone, S. M., Neuhauser, C., 1997. Ancestral processes with selection. Theoret. Popul. Biol. 51, 210-237; Neuhauser, C., Krone, S. M., 1997. The genealogy of samples in models with selection. Genetics 145, 519-534], which allows one to reconstruct the genealogy of a random sample from a population subject to genic selection. In our setting, we follow [Stephens, M., Donnelly, P., 2002. Ancestral inference in population genetics models with selection. Aust. N. Z. J. Stat. 45, 395-430] in assuming that the types of individuals in the sample are known. There are also close links to [Fearnhead, P., 2002. The common ancestor at a nonneutral locus. J. Appl. Probab. 39, 38-54]. However, our methods and applications are quite different. This work can also be thought of as extending a dual process construction in a Wright-Fisher diffusion in [Barbour, A.D., Ethier, S.N., Griffiths, R.C., 2000. A transition function expansion for a diffusion model with selection. Ann. Appl. Probab. 10, 123-162]. The application to the harmonic measure problem extends a construction provided in the setting of a neutral diffusion process model in [Ethier, S.N., Griffiths, R.C., 1991. Harmonic measure for random genetic drift. In: Pinsky, M.A. (Ed.), Diffusion
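Forward in time, the two-allele Moran model with genic selection is easy to simulate directly: at each event a fitness-weighted individual reproduces and a uniformly chosen individual dies. This sketch (our own illustration, not the paper's dual-process machinery) estimates the fixation frequency of a selected allele starting from a single copy:

```python
import random

def moran_step(pop, fitness, rng):
    """One Moran event: a fitness-weighted birth replaces a uniform death."""
    parent = rng.choices(pop, weights=[fitness[a] for a in pop])[0]
    pop[rng.randrange(len(pop))] = parent

def fixation_frequency(n=50, s=0.05, trials=200, seed=2):
    """Fraction of runs in which allele 'A' (selective advantage s) fixes,
    starting from a single copy in a population of size n."""
    rng = random.Random(seed)
    fitness = {"A": 1.0 + s, "a": 1.0}
    fixed = 0
    for _ in range(trials):
        pop = ["A"] + ["a"] * (n - 1)
        while 0 < pop.count("A") < n:
            moran_step(pop, fitness, rng)
        fixed += pop.count("A") == n
    return fixed / trials

p_fix = fixation_frequency()
```

The coalescent dual runs the same dynamics backwards in time, which is what makes quantities like the harmonic-measure distribution at the first allele loss tractable analytically.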
Unbiased descriptor and parameter selection confirms the potential of proteochemometric modelling
Freyhult, Eva; Prusis, Peteris; Lapinsh, Maris; Wikberg, Jarl ES; Moulton, Vincent; Gustafsson, Mats G
2005-01-01
Background Proteochemometrics is a new methodology that allows prediction of protein function directly from real interaction measurement data without the need of 3D structure information. Several reported proteochemometric models of ligand-receptor interactions have already yielded significant insights into various forms of bio-molecular interactions. The proteochemometric models are multivariate regression models that predict binding affinity for a particular combination of features of the ligand and protein. Although proteochemometric models have already offered interesting results in various studies, no detailed statistical evaluation of their average predictive power has been performed. In particular, variable subset selection performed to date has always relied on using all available examples, a situation also encountered in microarray gene expression data analysis. Results A methodology for an unbiased evaluation of the predictive power of proteochemometric models was implemented and results from applying it to two of the largest proteochemometric data sets yet reported are presented. A double cross-validation loop procedure is used to estimate the expected performance of a given design method. The unbiased performance estimates (P2) obtained for the data sets that we consider confirm that properly designed single proteochemometric models have useful predictive power, but that a standard design based on cross validation may yield models with quite limited performance. The results also show that different commercial software packages employed for the design of proteochemometric models may yield very different and therefore misleading performance estimates. In addition, the differences in the models obtained in the double CV loop indicate that detailed chemical interpretation of a single proteochemometric model is uncertain when data sets are small. Conclusion The double CV loop employed offers unbiased performance estimates about a given proteochemometric
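The double (nested) cross-validation loop can be sketched generically: the inner loop chooses the design parameter using only the outer training portion, and the outer loop scores the *whole selection procedure* on data it never touched. The toy estimator and shrinkage knob `p` below are ours for illustration:

```python
import random

def k_folds(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def double_cv(X, y, params, fit, error, k_outer=5, k_inner=4):
    """Nested CV: inner loop selects a parameter, outer loop estimates the
    generalization error of the selection procedure itself."""
    outer_errors = []
    for test_idx in k_folds(len(y), k_outer):
        train_idx = [i for i in range(len(y)) if i not in test_idx]

        def inner_cv_error(p):
            errs = []
            for val_pos in k_folds(len(train_idx), k_inner, seed=1):
                val = [train_idx[i] for i in val_pos]
                tr = [i for i in train_idx if i not in val]
                model = fit([X[i] for i in tr], [y[i] for i in tr], p)
                errs.append(error(model, [X[i] for i in val], [y[i] for i in val]))
            return sum(errs) / len(errs)

        best_p = min(params, key=inner_cv_error)   # chosen without seeing test data
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], best_p)
        outer_errors.append(error(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(outer_errors) / len(outer_errors)

# Toy estimator: predict a (possibly shrunk) training mean; p is a hypothetical knob.
fit = lambda X, y, p: sum(y) / (len(y) + p)
error = lambda m, X, y: sum((yi - m) ** 2 for yi in y) / len(y)
X = list(range(40))
y = [1.0] * 40
err = double_cv(X, y, params=[0.0, 5.0], fit=fit, error=error)
```

Selecting the parameter on all available examples and then cross-validating with it fixed, by contrast, leaks information and biases the estimate optimistically, which is exactly the pitfall the abstract describes.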
Estimation and Model Selection for Finite Mixtures of Latent Interaction Models
ERIC Educational Resources Information Center
Hsu, Jui-Chen
2011-01-01
Latent interaction models and mixture models have received considerable attention in social science research recently, but little is known about how to handle if unobserved population heterogeneity exists in the endogenous latent variables of the nonlinear structural equation models. The current study estimates a mixture of latent interaction…
NASA Astrophysics Data System (ADS)
Sahul Hameed, Ruzanna; Thiruchelvam, Sivadass; Nasharuddin Mustapha, Kamal; Che Muda, Zakaria; Mat Husin, Norhayati; Ezanee Rusli, Mohd; Yong, Lee Choon; Ghazali, Azrul; Itam, Zarina; Hakimie, Hazlinda; Beddu, Salmia; Liyana Mohd Kamal, Nur
2016-03-01
This paper proposes a conceptual framework to compare the criteria/factors that influence supplier selection. A mixed methods approach comprising qualitative and quantitative surveys will be used. The study intends to identify and define the metrics that key stakeholders at the Public Works Department (PWD) believe should be used for supplier selection. The outcomes would suggest possible initiatives to bring procurement in PWD to a strategic level. The results will provide a deeper understanding of the drivers of supplier selection in the construction industry. The obtained output will benefit the many parties involved in supplier selection decision-making. The findings provide useful information and a greater understanding of the perceptions that PWD executives hold regarding supplier selection and the extent to which these perceptions are consistent with findings from prior studies. The findings from this paper can be utilized as input for policy makers to outline any changes in the current procurement code of practice in order to enhance the degree of transparency and integrity in decision-making.
Relevance popularity: A term event model based feature selection scheme for text classification.
Feng, Guozhong; An, Baiguo; Yang, Fengqin; Wang, Han; Zhang, Libiao
2017-01-01
Feature selection is a practical approach for improving the performance of text classification methods by optimizing the feature subsets input to classifiers. In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e. the document frequency) is often used. However, the frequency of a given term appearing in each document has not been fully investigated, even though it is a promising feature to produce accurate classifications. In this paper, we propose a new feature selection scheme based on a term event multinomial naive Bayes probabilistic model. According to the model assumptions, the matching score function, which is based on the prediction probability ratio, can be factorized. Finally, we derive a feature selection measurement for each term after replacing inner parameters by their estimators. On a benchmark English text dataset (20 Newsgroups) and a Chinese text dataset (MPH-20), numerical experiment results obtained using two widely used text classifiers (naive Bayes and support vector machine) demonstrate that our method outperforms representative feature selection methods.
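The flavour of a term-event multinomial scoring scheme can be conveyed with a minimal sketch: score each term by the magnitude of the log-ratio of its smoothed class-conditional multinomial probabilities, so that terms whose within-document frequencies separate the classes best score highest. This is our simplified illustration of the idea, not the paper's exact measurement; the tiny corpus is hypothetical:

```python
from math import log

def term_scores(docs, labels, alpha=1.0):
    """Score terms by |log p(t|c0) / p(t|c1)| under Laplace-smoothed
    multinomial (term event) class models; assumes exactly two classes."""
    counts, totals = {}, {}
    for doc, c in zip(docs, labels):
        bag = counts.setdefault(c, {})
        for term in doc:
            bag[term] = bag.get(term, 0) + 1
            totals[c] = totals.get(c, 0) + 1
    vocab = sorted({t for d in docs for t in d})
    c0, c1 = sorted(totals)
    v = len(vocab)
    scores = {}
    for t in vocab:
        p0 = (counts[c0].get(t, 0) + alpha) / (totals[c0] + alpha * v)
        p1 = (counts[c1].get(t, 0) + alpha) / (totals[c1] + alpha * v)
        scores[t] = abs(log(p0 / p1))
    return scores

# Hypothetical two-class corpus; each document is a token list.
docs = [["ball", "goal", "team"], ["goal", "match"], ["stock", "market"], ["market", "team"]]
labels = ["sport", "sport", "finance", "finance"]
scores = term_scores(docs, labels)
```

Note that the score uses total term frequencies within each class, not just document frequencies, which is the distinction the abstract draws against information gain and chi-square.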
Immune selection in neoplasia: towards a microevolutionary model of cancer development
Pettit, S J; Seymour, K; O'Flaherty, E; Kirby, J A
2000-01-01
The dual properties of genetic instability and clonal expansion allow the development of a tumour to occur in a microevolutionary fashion. A broad range of pressures are exerted upon a tumour during neoplastic development. Such pressures are responsible for the selection of adaptations which provide a growth or survival advantage to the tumour. The nature of such selective pressures is implied in the phenotype of tumours that have undergone selection. We have reviewed a range of immunologically relevant adaptations that are frequently exhibited by common tumours. Many of these have the potential to function as mechanisms of immune response evasion by the tumour. Thus, such adaptations provide evidence for both the existence of immune surveillance, and the concept of immune selection in neoplastic development. This line of reasoning is supported by experimental evidence from murine models of immune involvement in neoplastic development. The process of immune selection has serious implications for the development of clinical immunotherapeutic strategies and our understanding of current in vivo models of tumour immunotherapy. © 2000 Cancer Research Campaign PMID:10864195
Bayesian Covariate Selection in Mixed-Effects Models For Longitudinal Shape Analysis
Muralidharan, Prasanna; Fishbaugh, James; Kim, Eun Young; Johnson, Hans J.; Paulsen, Jane S.; Gerig, Guido; Fletcher, P. Thomas
2016-01-01
The goal of longitudinal shape analysis is to understand how anatomical shape changes over time, in response to biological processes, including growth, aging, or disease. In many imaging studies, it is also critical to understand how these shape changes are affected by other factors, such as sex, disease diagnosis, IQ, etc. Current approaches to longitudinal shape analysis have focused on modeling age-related shape changes, but have not included the ability to handle covariates. In this paper, we present a novel Bayesian mixed-effects shape model that incorporates simultaneous relationships between longitudinal shape data and multiple predictors or covariates into the model. Moreover, we place an Automatic Relevance Determination (ARD) prior on the parameters, which lets us automatically select which covariates are most relevant to the model based on observed data. We evaluate our proposed model and inference procedure on a longitudinal study of Huntington's disease from PREDICT-HD. We first show the utility of the ARD prior for model selection in a univariate modeling of striatal volume, and next we apply the full high-dimensional longitudinal shape model to putamen shapes. PMID:28090246
NASA Astrophysics Data System (ADS)
Guo, Danlu; Westra, Seth; Maier, Holger R.
2015-04-01
Projected changes to near-surface atmospheric temperature, wind, humidity and solar radiation are expected to lead to changes in evaporative demand, and thus to changes in the catchment water balance, in many catchments worldwide. To quantify the likely implications for runoff, a modelling chain is commonly used in which the meteorological variables are first converted to potential evapotranspiration (PET), followed by the conversion of PET to runoff using one or more rainfall-runoff models. The role of PET model and rainfall-runoff model selection on changes to the catchment water balance is assessed using a sensitivity analysis applied to data from five climatologically different catchments in Australia. Changes to temperature have the strongest influence on both evapotranspiration and runoff for all models and catchments, whereas the relative role of the remaining variables depends on both the catchment location and the PET and rainfall-runoff model choice. Importantly, sensitivity experiments show that (1) the distributions of climate variables differ between dry and wet conditions; and (2) the seasonal distribution of changes to PET differs across driving variables. These findings suggest possible interactions between PET model selection and the way that evapotranspiration processes are represented within the rainfall-runoff model. For a constant percentage change to PET, this effect can lead to a five-fold difference in runoff changes depending on which meteorological variable is being perturbed.
Yang, Kai-Fu; Li, Chao-Yi; Li, Yong-Jie
2015-01-01
Neurons with orientation-selective and with non-selective surround inhibition have both been observed in the primary visual cortex (V1) of primates and cats. Although the inhibition arising from the surround region (termed the non-classical receptive field, nCRF) is considered to play a critical role in visual perception, the specific roles of orientation-selective and non-selective inhibition in the task of contour detection are less well understood. To clarify this question, we first carried out a computational analysis of the contour detection performance of V1 neurons with different types of surround inhibition, on the basis of which we then proposed two integrated models that combine the two types of surround inhibition in two different ways and evaluated their roles in this perceptual task. The two models were evaluated on synthetic images and a set of challenging natural images, and the results show that both integrated models outperform the typical models with orientation-selective or non-selective inhibition alone. The findings of this study suggest that V1 neurons with different types of center-surround interaction work in cooperative and adaptive ways, at least when extracting organized structures from cluttered natural scenes. This work is expected to inspire efficient phenomenological models for engineering applications in the field of machine vision.
Gonzalo Cogno, Soledad; Mato, Germán
2015-01-01
Orientation selectivity is ubiquitous in the primary visual cortex (V1) of mammals. In cats and monkeys, V1 displays spatially ordered maps of orientation preference. In contrast, in mice, squirrels, and rats, orientation selective neurons in V1 are not spatially organized, giving rise to a seemingly random pattern usually referred to as a salt-and-pepper layout. The fact that such different organizations can sharpen orientation tuning raises the question of the structural role of the intracortical connections, specifically the influence of plasticity and the generation of functional connectivity. In this work, we analyze the effect of plasticity processes on orientation selectivity for both scenarios. We study a computational model of layer 2/3 and a reduced one-dimensional model of orientation selective neurons, both in the balanced state. We analyze two plasticity mechanisms. The first involves spike-timing-dependent plasticity (STDP), while the second considers the reconnection of the interactions according to the preferred orientations of the neurons. We find that under certain conditions STDP can indeed improve selectivity, but it works in a somewhat unexpected way, that is, by effectively decreasing the modulated part of the intracortical connectivity relative to the non-modulated part. For the reconnection mechanism, we find that increasing functional connectivity in fact leads to a decrease in orientation selectivity if the network is in a stable balanced state. Both counterintuitive results are a consequence of the dynamics of the balanced state. We also find that selectivity can increase due to a reconnection process if the resulting connections give rise to an unstable balanced state. We compare these findings with recent experimental results. PMID:26347615
Torres, F E; Teodoro, P E; Rodrigues, E V; Santos, A; Corrêa, A M; Ceccon, G
2016-04-29
The aim of this study was to select erect cowpea (Vigna unguiculata L.) genotypes simultaneously for high adaptability, stability, and grain yield in Mato Grosso do Sul, Brazil, using mixed models. We conducted six trials of different cowpea genotypes in 2005 and 2006 in Aquidauana, Chapadão do Sul, Dourados, and Primavera do Leste. The experimental design was randomized complete blocks with four replications and 20 genotypes. Genetic parameters were estimated by restricted maximum likelihood/best linear unbiased prediction, and selection was based on the harmonic mean of the relative performance of genetic values method using three strategies: selection based on the predicted breeding value considering the mean performance of the genotypes across all environments (no interaction effect); selection based on performance in each environment (with an interaction effect); and simultaneous selection for grain yield, stability, and adaptability. The MNC99542F-5 and MNC99-537F-4 genotypes could be grown in various environments, as they exhibited high grain yield, adaptability, and stability. The average heritability of the genotypes was moderate to high and the selective accuracy was 82%, indicating excellent potential for selection.
A Biologically Inspired Computational Model of Basal Ganglia in Action Selection
Baston, Chiara; Ursino, Mauro
2015-01-01
The basal ganglia (BG) are a subcortical structure implicated in action selection. The aim of this work is to present a new cognitive neuroscience model of the BG, which aspires to represent a parsimonious balance between simplicity and completeness. The model includes the 3 main pathways operating in the BG circuitry, that is, the direct (Go), indirect (NoGo), and hyperdirect pathways. The main original aspects, compared with previous models, are the use of a two-term Hebb rule to train synapses in the striatum, based exclusively on neuronal activity changes caused by dopamine peaks or dips, and the role of the cholinergic interneurons (themselves affected by dopamine) during learning. Some examples are displayed, concerning a few paradigmatic cases: action selection in basal conditions, action selection in the presence of a strong conflict (where the role of the hyperdirect pathway emerges), synapse changes induced by phasic dopamine, and learning new actions based on a previous history of rewards and punishments. Finally, some simulations show the model working under conditions of altered dopamine levels, to illustrate pathological cases (dopamine depletion in parkinsonian subjects or dopamine hypermedication). Due to its parsimonious approach, the model may represent a straightforward tool to analyze BG functionality in behavioral experiments. PMID:26640481
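A two-term, dopamine-gated Hebb rule of the kind this abstract describes can be sketched in a few lines. The rule below is a generic illustration of the idea only: the learning rate, the weight clipping, and the sign convention for dopamine peaks/dips are our assumptions, not the paper's exact equations.

```python
import numpy as np

def hebb_update(w, pre, post, da, lr=0.05):
    """Two-term Hebbian rule gated by phasic dopamine: a dopamine peak
    (da > 0) potentiates synapses between coactive pre/post neurons,
    while a dip (da < 0) depresses them. Weights stay in [0, 1]."""
    return np.clip(w + lr * da * np.outer(post, pre), 0.0, 1.0)

# Two striatal units, three cortical inputs, all weights at 0.5.
w = np.full((2, 3), 0.5)
pre = np.array([1.0, 0.0, 1.0])    # active cortical inputs 0 and 2
post = np.array([1.0, 0.0])        # striatal unit 0 fires

w_peak = hebb_update(w, pre, post, da=+1.0)  # reward: strengthen active pair
w_dip = hebb_update(w, pre, post, da=-1.0)   # punishment: weaken it
```

Only synapses where both the presynaptic and postsynaptic units are active move; all other weights are untouched, which is the essence of a Hebbian, activity-dependent update.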
Badouin, H; Gladieux, P; Gouzy, J; Siguenza, S; Aguileta, G; Snirc, A; Le Prieur, S; Jeziorski, C; Branca, A; Giraud, T
2017-04-01
Identifying the genes underlying adaptation, their distribution in genomes and the evolutionary forces shaping genomic diversity are key challenges in evolutionary biology. Very few studies have investigated the abundance and distribution of selective sweeps in species with high-quality reference genomes, outside a handful of model species. Pathogenic fungi are tractable eukaryote models for investigating the genomics of adaptation. By sequencing 53 genomes of two species of anther-smut fungi and mapping them against a high-quality reference genome, we showed that selective sweeps were abundant and scattered throughout the genome in one species, affecting nearly 17% of the genome, but far fewer and in different genomic regions in its sister species, where they left footprints in only 1% of the genome. Polymorphism was negatively correlated with linkage disequilibrium levels in the genomes, consistent with recurrent positive and/or background selection. Differential expression in planta and in vitro, and functional annotation, suggested that many of the selective sweeps were probably involved in adaptation to the host plant. Examples include glycoside hydrolases, pectin lyases and an extracellular membrane protein with a CFEM domain. This study thus provides candidate genes for involvement in plant-pathogen interactions (effectors), which have long remained elusive in this otherwise well-studied system. Their identification will foster future functional and evolutionary studies in the plant and in the anther-smut pathogens, which are model species of natural plant-pathogen associations. In addition, our results suggest that positive selection can have a pervasive impact in shaping genomic variability in pathogens and selfing species, broadening our knowledge of the occurrence and frequency of selective events in natural populations.
NASA Astrophysics Data System (ADS)
Placek, Ben; Knuth, Kevin H.; Angerhausen, Daniel
2014-11-01
EXONEST is an algorithm dedicated to detecting and characterizing the photometric signatures of exoplanets, which include reflection and thermal emission, Doppler boosting, and ellipsoidal variations. Using Bayesian inference, we can test between competing models that describe the data as well as estimate model parameters. We demonstrate this approach by testing circular versus eccentric planetary orbital models, as well as testing for the presence or absence of four photometric effects. In addition to using Bayesian model selection, a unique aspect of EXONEST is the potential capability to distinguish between reflective and thermal contributions to the light curve. A case study is presented using Kepler data recorded from the transiting planet KOI-13b. By considering only the nontransiting portions of the light curve, we demonstrate that it is possible to estimate the photometrically relevant model parameters of KOI-13b. Furthermore, Bayesian model testing confirms that the orbit of KOI-13b has a detectable eccentricity.
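The Bayesian model testing step described in the EXONEST abstract can be sketched on a toy signal: compute each model's evidence by integrating its likelihood over the prior, then compare the two evidences. Below, a first harmonic stands in for an eccentricity-like signature on top of a fundamental; the data, priors, and grids are our assumptions, not the EXONEST implementation.

```python
import numpy as np

# Synthetic "light curve": fundamental plus a first harmonic, plus noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0 * np.pi, 120)
sigma = 0.3
y = 1.2 * np.sin(t) + 0.5 * np.sin(2.0 * t) + rng.normal(scale=sigma, size=t.size)
norm = -t.size * np.log(sigma * np.sqrt(2.0 * np.pi))  # Gaussian normalisation

def log_trapz(logf, x):
    """Trapezoid rule in log space (avoids likelihood underflow)."""
    m = logf.max()
    f = np.exp(logf - m)
    return m + np.log(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# M1 "circular": y = A sin t, prior A ~ U(0, 2)
A = np.linspace(0.0, 2.0, 201)
L1 = norm - 0.5 * np.sum((y - A[:, None] * np.sin(t)) ** 2, axis=1) / sigma**2
logZ1 = log_trapz(L1, A) - np.log(2.0)          # divide by prior volume 2

# M2 "eccentric": y = A sin t + B sin 2t, priors A ~ U(0, 2), B ~ U(-1, 1)
B = np.linspace(-1.0, 1.0, 201)
pred = A[:, None, None] * np.sin(t) + B[None, :, None] * np.sin(2.0 * t)
L2 = norm - 0.5 * np.sum((y - pred) ** 2, axis=2) / sigma**2
inner = np.array([log_trapz(L2[i], B) for i in range(A.size)])
logZ2 = log_trapz(inner, A) - np.log(4.0)       # prior volume 2 x 2

log_bayes = logZ2 - logZ1   # > 0: the evidence favours the harmonic term
```

Because the harmonic is genuinely present in the synthetic data, the log Bayes factor comes out large and positive; with `B = 0` in the data, the Occam penalty built into the evidence would instead favour the simpler model.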
Scheel, Ida; Ferkingstad, Egil; Frigessi, Arnoldo; Haug, Ola; Hinnerichsen, Mikkel; Meze-Hausken, Elisabeth
2013-01-01
Climate change will affect the insurance industry. We develop a Bayesian hierarchical statistical approach to explain and predict insurance losses due to weather events at a local geographic scale. The number of weather-related insurance claims is modelled by combining generalized linear models with spatially smoothed variable selection. Using Gibbs sampling and reversible jump Markov chain Monte Carlo methods, this model is fitted on daily weather and insurance data from each of the 319 municipalities which constitute southern and central Norway for the period 1997–2006. Precise out-of-sample predictions validate the model. Our results show interesting regional patterns in the effect of different weather covariates. In addition to being useful for insurance pricing, our model can be used for short-term predictions based on weather forecasts and for long-term predictions based on downscaled climate models. PMID:23396890
Bayesian model selection for incomplete data using the posterior predictive distribution.
Daniels, Michael J; Chatterjee, Arkendu S; Wang, Chenguang
2012-12-01
We explore the use of a posterior predictive loss criterion for model selection for incomplete longitudinal data. We begin by identifying a property that most model selection criteria for incomplete data should satisfy. We then show that a straightforward extension of the Gelfand and Ghosh (1998, Biometrika, 85, 1-11) criterion to incomplete data has two problems. First, it introduces an extra term (in addition to the goodness of fit and penalty terms) that compromises the criterion. Second, it does not satisfy the aforementioned property. We propose an alternative and explore its properties via simulations and on a real dataset, and compare it to the deviance information criterion (DIC). In general, the DIC outperforms the posterior predictive criterion, but the latter appears to work well overall and, unlike the DIC, is very easy to compute in certain classes of models for missing data.
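For complete data, the Gelfand and Ghosh criterion itself is straightforward to compute from posterior predictive replicates: a goodness-of-fit term (squared distance of the data to the predictive mean) plus a penalty term (total predictive variance). Below is a minimal sketch for a Gaussian linear model with a flat prior and known noise variance; this is our simplification for illustration, not the incomplete-data extension the paper discusses.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=n)

def gelfand_ghosh(y, rep):
    """D = G + P: G = squared distance of the data to the posterior
    predictive mean, P = total posterior predictive variance (penalty)."""
    mu = rep.mean(axis=0)
    return np.sum((y - mu) ** 2) + np.sum(rep.var(axis=0))

def predictive_draws(X, n_draws=4000):
    """Posterior predictive replicates for a Gaussian linear model with
    flat prior and known noise sd = 1 (conjugate, so sampling is exact)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    betas = rng.multivariate_normal(beta_hat, XtX_inv, size=n_draws)
    return betas @ X.T + rng.normal(size=(n_draws, y.size))

X1 = np.ones((n, 1))                    # intercept-only model
X2 = np.column_stack([np.ones(n), x])   # intercept + slope
D1 = gelfand_ghosh(y, predictive_draws(X1))
D2 = gelfand_ghosh(y, predictive_draws(X2))
```

On these data the slope model attains the smaller criterion: its fit term drops far more than its penalty term rises, which is exactly the trade-off the criterion is designed to score.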
Shen, Chung-Wei; Chen, Yi-Hau
2015-10-01
Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.
Bayesian evidence computation for model selection in non-linear geoacoustic inference problems.
Dettmer, Jan; Dosso, Stan E; Osler, John C
2010-12-01
This paper applies a general Bayesian inference approach, based on Bayesian evidence computation, to geoacoustic inversion of interface-wave dispersion data. Quantitative model selection is carried out by computing the evidence (normalizing constants) for several model parameterizations using annealed importance sampling. The resulting posterior probability density estimate is compared to estimates obtained from Metropolis-Hastings sampling to ensure consistent results. The approach is applied to invert interface-wave dispersion data collected on the Scotian Shelf, off the east coast of Canada, for the sediment shear-wave velocity profile. Results are consistent with previous work on these data but extend the analysis to a rigorous approach including model selection and uncertainty analysis. The results are also consistent with core samples and seismic reflection measurements carried out in the area.
NASA Astrophysics Data System (ADS)
Mutoh, Atsuko; Tokuhara, Shinya; Kanoh, Masayoshi; Oboshi, Tamon; Kato, Shohei; Itoh, Hidenori
It is generally thought that living things exhibit trends in their preferences, and the emergence of new trends in successive periods is related to conformity. According to social impact theory, a minority always exists within a group, and this minority may become the majority through the conforming actions of agents: as agents increasingly conform, the majority can shift. We previously proposed an evolutionary model with both genes and memes and elucidated the interaction between genes and memes in sexual selection. In this paper, we propose an agent model for sexual selection that incorporates the concept of conformity. Applying this model to an environment containing male and female agents, we find that periodic, fashion-like phenomena emerge, and we report the influence of conformity and differentiation on the transition of the agents' preferences.
Models Used to Select Strategic Planning Experts for High Technology Productions
NASA Astrophysics Data System (ADS)
Zakharova, Alexandra A.; Grigorjeva, Antonina A.; Tseplit, Anna P.; Ozgogov, Evgenij V.
2016-04-01
The article deals with the problems and specific aspects of organizing the work of experts involved in the assessment of companies that manufacture complex high-technology products. A model is presented for evaluating the competences of experts in individual functional areas of expertise. Experts are selected to build a group on the basis of tables used to determine a competence level. An expert selection model based on fuzzy logic is proposed, in which additional requirements for the expert group composition can be taken into account with regard to the needed quality and the competence-related preferences of decision-makers. A Web-based information system model is developed for the interaction between experts and decision-makers when carrying out online examinations.
Frey, H.C.; Rubin, E.S.
1990-06-01
This report documents cost models developed for selected integrated gasification combined cycle (IGCC) systems. The objective is to obtain a series of capital and operating cost models that can be integrated with an existing set of IGCC process performance models developed at the US Department of Energy Morgantown Energy Technology Center. These models are implemented in ASPEN, a Fortran-based process simulator. Under a separate task, a probabilistic modeling capability has been added to the ASPEN simulator, facilitating analysis of uncertainties in new process performance and cost (Diwekar and Rubin, 1989). One application of the cost models presented here is to explicitly characterize uncertainties in capital and annual costs, supplanting the traditional approach of incorporating uncertainty via a contingency factor. The IGCC systems selected by DOE/METC for cost model development include the following: KRW gasifier with cold gas cleanup; KRW gasifier with hot gas cleanup; and Lurgi gasifier with hot gas cleanup. For each technology, the cost model includes both capital and annual costs. The capital cost models estimate the costs of each major plant section as a function of key performance and design parameters. A standard cost method based on the Electric Power Research Institute (EPRI) Technical Assessment Guide (1986) was adopted. The annual cost models are based on operating and maintenance labor requirements, maintenance material requirements, the costs of utilities and reagent consumption, and credits from byproduct sales. Uncertainties in cost parameters are identified for both capital and operating cost models. Appendices contain cost models for the above three IGCC systems, a number of operating trains subroutines, range checking subroutines, and financial subroutines. 88 refs., 69 figs., 21 tabs.
NASA Technical Reports Server (NTRS)
Holms, A. G.
1977-01-01
As many as three iterated statistical model deletion procedures were considered for an experiment. Population model coefficients were chosen to simulate a saturated 2^4 experiment having an unfavorable distribution of parameter values. Using random number studies, three model selection strategies were developed, namely: (1) a strategy to be used in anticipation of large coefficients of variation (approximately 65 percent); (2) a strategy to be used in anticipation of small coefficients of variation (4 percent or less); and (3) a security-regret strategy to be used in the absence of such prior knowledge.
Pérez-Figueroa, A; Cruz, F; Carvajal-Rodríguez, A; Rolán-Alvarez, E; Caballero, A
2005-01-01
Two rocky shore ecotypes of Littorina saxatilis from north-west Spain live at different shore levels and habitats and have developed an incomplete reproductive isolation through size assortative mating. The system is regarded as an example of sympatric ecological speciation. Several experiments have indicated that different evolutionary forces (migration, assortative mating and habitat-dependent selection) play a role in maintaining the polymorphism. However, an assessment of the combined contributions of these forces supporting the observed pattern in the wild is absent. A model selection procedure using computer simulations was used to investigate the contribution of the different evolutionary forces towards the maintenance of the polymorphism. The agreement between alternative models and experimental estimates for a number of parameters was quantified by a least squares method. The results of the analysis show that the fittest evolutionary model for the observed polymorphism is characterized by a high gene flow, intermediate-high reproductive isolation between ecotypes, and a moderate to strong selection against the nonresident ecotypes on each shore level. In addition, a substantial number of additive loci contributing to the selected trait and a narrow hybrid definition with respect to the phenotype are scenarios that better explain the polymorphism, whereas the ecotype fitnesses at the mid-shore, the level of phenotypic plasticity, and environmental effects are not key parameters.
A mathematical model for the rational design of chimeric ligands in selective drug therapies.
Doldán-Martelli, V; Guantes, R; Míguez, D G
2013-02-13
Chimeric drugs with selective potential toward specific cell types constitute one of the most promising forefronts of modern Pharmacology. We present a mathematical model to test and optimize these synthetic constructs, as an alternative to conventional empirical design. We take as a case study a chimeric construct composed of epidermal growth factor (EGF) linked to different mutants of interferon (IFN). Our model quantitatively reproduces all the experimental results, illustrating how chimeras using mutants of IFN with reduced affinity exhibit enhanced selectivity against cells overexpressing the EGF receptor. We also investigate how chimeric selectivity can be improved based on the balance between affinity rates, receptor abundance, activity of ligand subunits, and linker length between subunits. The simplicity and generality of the model facilitate a straightforward application to other chimeric constructs, providing a quantitative systematic design and optimization of these selective drugs against certain cell-based diseases, such as Alzheimer's and cancer. CPT: Pharmacometrics & Systems Pharmacology (2013) 2, e26; doi:10.1038/psp.2013.2; advance online publication 13 February 2013.
Guo, Hua; Jiang, Bin
2014-12-16
CONSPECTUS: Mode specificity is defined by the differences in reactivity due to excitations in various reactant modes, while bond selectivity refers to selective bond breaking in a reaction. These phenomena not only shed light on reaction dynamics but also open the door for laser control of reactions. The existence of mode specificity and bond selectivity in a reaction indicates that not all forms of energy are equivalent in promoting the reactivity, thus defying a statistical treatment. They also allow enhancement of reactivity and control of product branching ratios. As a result, they are of central importance in chemistry. This Account discusses recent advances in our understanding of these nonstatistical phenomena. In particular, the newly proposed sudden vector projection (SVP) model and its applications are reviewed. The SVP model is based on the premise that the collision in many direct reactions is much faster than intramolecular vibrational energy redistribution in the reactants. In such a sudden limit, the coupling of a reactant mode with the reaction coordinate at the transition state, which dictates its ability to promote the reaction, is approximately quantified by the projection of the former onto the latter. The SVP model can be considered as a generalization of the venerable Polanyi's rules, which are based on the location of the barrier. The SVP model is instead based on properties of the saddle point and as a result is capable of treating the translational, rotational, and multiple vibrational modes in reactions involving polyatomic reactants. In the case of surface reactions, the involvement of surface atoms can also be examined. Taking advantage of microscopic reversibility, the SVP model has also been used to predict product energy disposal in reactions. This simple yet powerful rule of thumb has been successfully demonstrated in many reactions including uni- and bimolecular reactions in the gas phase and gas-surface reactions. The success of the SVP
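The central quantity of the SVP model is a simple geometric projection: the overlap between a reactant normal-mode vector and the reaction-coordinate vector at the saddle point. A minimal numerical sketch follows, with made-up two-dimensional vectors for a collinear A + BC picture rather than data from any real reaction.

```python
import numpy as np

def svp_overlap(mode, rc):
    """Sudden vector projection overlap: |cos| of the angle between a
    reactant normal-mode displacement vector and the reaction-coordinate
    vector at the transition state. Larger overlap suggests the mode is
    more effective at promoting the reaction."""
    mode = np.asarray(mode, dtype=float)
    rc = np.asarray(rc, dtype=float)
    return abs(mode @ rc) / (np.linalg.norm(mode) * np.linalg.norm(rc))

# Toy collinear A + BC system in coordinates (r_AB, r_BC): the reaction
# coordinate at the TS mixes both distances (hypothetical unit vector).
rc = np.array([0.8, -0.6])
bc_stretch = np.array([0.0, 1.0])   # BC stretch: couples to the rc
spectator = np.array([0.6, 0.8])    # mode constructed orthogonal to the rc

overlap_stretch = svp_overlap(bc_stretch, rc)   # ~0.6: promoting mode
overlap_spectator = svp_overlap(spectator, rc)  # ~0.0: spectator mode
```

The two overlaps rank the modes: depositing energy in the BC stretch should promote this toy reaction, while the spectator-like mode should not, which is the qualitative prediction the SVP model makes from saddle-point properties alone.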
A model for particle-selective transport of tracers in sediments with conveyor belt deposit feeders
NASA Astrophysics Data System (ADS)
Robbins, John A.
1986-07-01
Conveyor belt deposit-feeding organisms prevalent in both marine and freshwater systems have a profound effect on sediment properties and transport processes. These organisms ingest sediments over a range of depths while depositing gut contents from tails protruding above the sediment surface. This action results in particle-selective transfer of buried materials to the sediment surface and imposes an accelerated rate of sediment and pore water burial within the feeding zone. Most previous efforts to combine sediment diagenesis with the effects of biogenic reworking characterize mixing as exclusively diffusive and ignore such major advective effects. Here a model is developed, based on fundamental diagenetic equations for transport and reaction, for the distribution of tracers in accumulating sediments subject to compaction and diffusive as well as advective redistribution by benthic organisms. Conveyor belt (CB) feeding is characterized as a first-order process with a depth-dependent rate constant which is either localized (Gaussian) or distributed (integrated Gaussian). Biogenic diffusivity of bulk sediments is allowed these alternative depth dependences as well. The model assumes simple linear adsorption of the tracer between solid and solution phases and uses a time-dependent flux at (x = 0) which is a combination of that originating externally and the depth-integrated contributions from feeding. Particle selectivity is introduced by applying mass conservation separately to transport of the tracer and bulk sediments. Properties of the model are illustrated for tracers in nondispersive systems strongly bound to sediment solids. CB recycling gives rise to transient reflections on passage of a tracer pulse through the zone of bioturbation. Reflections readily disappear in the presence of various integrative processes. The system time resolution is defined in terms of the downward propagation of dual tracer pulses and shown to be systematically degraded by
Garcia, Andres; Wang, Jing; Windus, Theresa L.; ...
2016-05-20
Statistical mechanical modeling is developed to describe a catalytic conversion reaction A → Bc or Bt with concentration-dependent selectivity of the products, Bc or Bt, where reaction occurs inside catalytic particles traversed by narrow linear nanopores. The associated restricted diffusive transport, which in the extreme case is described by single-file diffusion, naturally induces strong concentration gradients. Hence, by comparing kinetic Monte Carlo simulation results with analytic treatments, selectivity is shown to be impacted by strong spatial correlations induced by restricted diffusivity in the presence of reaction and also by a subtle clustering of reactants, A.
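The single-file constraint at the heart of this restricted-transport model is easy to demonstrate with a small kinetic Monte Carlo sketch. The toy below is our own simplification (no reaction step, no selectivity): particles on a 1-D lattice hop only into empty adjacent sites, so they can never pass one another, and the initial ordering is preserved for the whole trajectory.

```python
import random

def single_file_kmc(length=20, n_particles=8, steps=20000, seed=3):
    """Single-file diffusion on a 1-D lattice: pick a random particle and a
    random direction each step; the hop is rejected if the target site is
    occupied or outside the pore, so particles cannot exchange places."""
    rng = random.Random(seed)
    pos = sorted(rng.sample(range(length), n_particles))
    occupied = set(pos)
    for _ in range(steps):
        i = rng.randrange(n_particles)
        target = pos[i] + rng.choice((-1, 1))
        if 0 <= target < length and target not in occupied:
            occupied.remove(pos[i])
            pos[i] = target
            occupied.add(target)
    return pos

final = single_file_kmc()
# The file never lets particles pass: positions stay in their initial order.
assert final == sorted(final)
```

This no-passing property is what makes concentration gradients so strong in narrow pores: a particle deep inside the pore can only reach the exterior if every particle between it and the pore mouth moves out of the way first.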
Martínez, Isabel; Wiegand, Thorsten; Camarero, J Julio; Batllori, Enric; Gutiérrez, Emilia
2011-05-01
Alpine tree-line ecotones are characterized by marked changes at small spatial scales that may result in a variety of physiognomies. A set of alternative individual-based models was tested with data from four contrasting Pinus uncinata ecotones in the central Spanish Pyrenees to reveal the minimal subset of processes required for tree-line formation. A Bayesian approach combined with Markov chain Monte Carlo methods was employed to obtain the posterior distribution of model parameters, allowing the use of model selection procedures. The main features of real tree lines emerged only in models considering nonlinear responses in individual rates of growth or mortality with respect to the altitudinal gradient. Variation in tree-line physiognomy reflected mainly changes in the relative importance of these nonlinear responses, while other processes, such as dispersal limitation and facilitation, played a secondary role. Different nonlinear responses also determined the presence or absence of krummholz, in agreement with recent findings highlighting a different response of diffuse and abrupt or krummholz tree lines to climate change. The method presented here can be widely applied in individual-based simulation models and will turn model selection and evaluation in this type of models into a more transparent, effective, and efficient exercise.
NASA Astrophysics Data System (ADS)
Brunetti, Carlotta; Linde, Niklas; Vrugt, Jasper A.
2017-04-01
Geophysical data can help to discriminate among multiple competing subsurface hypotheses (conceptual models). Here, we explore the merits of Bayesian model selection in hydrogeophysics using crosshole ground-penetrating radar data from the South Oyster Bacterial Transport Site in Virginia, USA. Implementation of Bayesian model selection requires computation of the marginal likelihood of the measured data, or evidence, for each conceptual model being used. In this paper, we compare three different evidence estimators, including (1) the brute force Monte Carlo method, (2) the Laplace-Metropolis method, and (3) the numerical integration method proposed by Volpi et al. (2016). The three types of subsurface models that we consider differ in their treatment of the porosity distribution and use (a) horizontal layering with fixed layer thicknesses, (b) vertical layering with fixed layer thicknesses and (c) a multi-Gaussian field. Our results demonstrate that all three estimators provide equivalent results in low parameter dimensions, yet in higher dimensions the brute force Monte Carlo method is inefficient. The isotropic multi-Gaussian model is most supported by the travel time data with Bayes factors that are larger than 10^100 compared to conceptual models that assume horizontal or vertical layering of the porosity field.
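The Laplace-Metropolis estimator mentioned in this abstract approximates the evidence from the posterior mode and covariance, both of which can be read off an MCMC chain. A sketch on a conjugate Gaussian toy model follows, where the exact evidence is available in closed form for comparison; the posterior samples below stand in for a Metropolis chain, and all numbers are our assumptions.

```python
import numpy as np

# Model: y_i ~ N(mu, 1) with prior mu ~ N(0, 1); n observations.
rng = np.random.default_rng(4)
y = rng.normal(loc=0.7, scale=1.0, size=50)
n, s = y.size, y.sum()

def log_joint(mu):
    """log p(y | mu) + log p(mu): the unnormalised posterior in mu."""
    return (-0.5 * np.sum((y - mu) ** 2) - 0.5 * n * np.log(2 * np.pi)
            - 0.5 * mu**2 - 0.5 * np.log(2 * np.pi))

# Laplace-Metropolis: mode and curvature estimated from posterior samples.
# Here the exact posterior N(s/(n+1), 1/(n+1)) stands in for an MCMC chain.
samples = rng.normal(s / (n + 1), np.sqrt(1.0 / (n + 1)), size=20000)
mu_star = samples.mean()    # sample-based stand-in for the posterior mode
var_star = samples.var()    # sample-based stand-in for the inverse Hessian
logZ_laplace = log_joint(mu_star) + 0.5 * np.log(2 * np.pi * var_star)

# Exact evidence for this conjugate Gaussian model (complete the square):
logZ_exact = (-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(n + 1)
              - 0.5 * (np.sum(y**2) - s**2 / (n + 1)))
```

Because the posterior here is exactly Gaussian, the Laplace approximation is exact up to sampling noise in the mode and variance estimates, which makes this toy a convenient correctness check before applying the estimator to a real chain.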
Heikkinen, Risto K.; Bocedi, Greta; Kuussaari, Mikko; Heliölä, Janne; Leikola, Niko; Pöyry, Juha; Travis, Justin M. J.
2014-01-01
Dynamic models for range expansion provide a promising tool for assessing species’ capacity to respond to climate change by shifting their ranges to new areas. However, these models include a number of uncertainties which may affect how successfully they can be applied to climate change oriented conservation planning. We used RangeShifter, a novel dynamic and individual-based modelling platform, to study two potential sources of such uncertainties: the selection of land cover data and the parameterization of key life-history traits. As an example, we modelled the range expansion dynamics of two butterfly species, one habitat specialist (Maniola jurtina) and one generalist (Issoria lathonia). Our results show that projections of total population size, number of occupied grid cells and the mean maximal latitudinal range shift were all clearly dependent on the choice made between using CORINE land cover data vs. using more detailed grassland data from three alternative national databases. Range expansion was also sensitive to the parameterization of the four considered life-history traits (magnitude and probability of long-distance dispersal events, population growth rate and carrying capacity), with carrying capacity and magnitude of long-distance dispersal showing the strongest effect. Our results highlight the sensitivity of dynamic species population models to the selection of existing land cover data and to uncertainty in the model parameters and indicate that these need to be carefully evaluated before the models are applied to conservation planning. PMID:25265281
An Optimization Model for the Selection of Bus-Only Lanes in a City.
Chen, Qun
2015-01-01
The planning of urban bus-only lane networks is an important measure to improve bus service and bus priority. To determine the effective arrangement of bus-only lanes, a bi-level programming model for urban bus lane layout is developed in this study that considers accessibility and budget constraints. The goal of the upper-level model is to minimize the total travel time, and the lower-level model is a capacity-constrained traffic assignment model that describes the passenger flow assignment on bus lines, in which the priority sequence of the transfer times is reflected in the passengers' route-choice behaviors. Using the proposed bi-level programming model, optimal bus lines are selected from a set of candidate bus lines; thus, the corresponding bus lane network on which the selected bus lines run is determined. The solution method using a genetic algorithm in the bi-level programming model is developed, and two numerical examples are investigated to demonstrate the efficacy of the proposed model.
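The genetic-algorithm search over candidate bus lines can be sketched with a binary chromosome (one gene per candidate line). The sketch below replaces the lower-level traffic assignment with fixed, hypothetical travel-time savings per line (a knapsack-style stand-in under a budget), so it illustrates only the encoding, selection, crossover, and mutation loop, not the paper's bi-level model.

```python
import random

# Hypothetical candidate bus lines: (construction cost, travel-time saving).
LINES = [(4, 9), (3, 6), (5, 11), (2, 3), (6, 12), (1, 2), (4, 7), (3, 5)]
BUDGET = 12

def fitness(mask):
    """Total saving of the selected lines; infeasible plans are penalised."""
    cost = sum(c for (c, _), m in zip(LINES, mask) if m)
    save = sum(s for (_, s), m in zip(LINES, mask) if m)
    return save if cost <= BUDGET else -1

def ga(pop_size=40, gens=60, seed=5):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in LINES] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)       # one-point crossover of elites
            cut = rng.randrange(1, len(LINES))
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:            # mutation: flip one gene
                i = rng.randrange(len(LINES))
                child[i] ^= 1
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

best = ga()
```

In the full bi-level formulation, `fitness` would instead run the lower-level capacity-constrained assignment for each chromosome, which is why the GA (rather than exact enumeration) is attractive: each evaluation is expensive and the search space grows as 2 to the number of candidate lines.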
An Optimization Model for the Selection of Bus-Only Lanes in a City
Chen, Qun
2015-01-01
The planning of urban bus-only lane networks is an important measure to improve bus service and bus priority. To determine the effective arrangement of bus-only lanes, a bi-level programming model for urban bus lane layout is developed in this study that considers accessibility and budget constraints. The goal of the upper-level model is to minimize the total travel time, and the lower-level model is a capacity-constrained traffic assignment model that describes the passenger flow assignment on bus lines, in which the priority sequence of the transfer times is reflected in the passengers’ route-choice behaviors. Using the proposed bi-level programming model, optimal bus lines are selected from a set of candidate bus lines; thus, the corresponding bus lane network on which the selected bus lines run is determined. The solution method using a genetic algorithm in the bi-level programming model is developed, and two numerical examples are investigated to demonstrate the efficacy of the proposed model. PMID:26214001
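The genetic-algorithm solution approach can be illustrated on a pared-down version of the upper-level selection problem alone. The sketch below is a minimal, hypothetical stand-in: candidate bus lines with assumed travel-time savings and lane costs, a single budget constraint in place of the full bi-level structure with traffic assignment, and a basic select/crossover/mutate loop. All numbers and the fitness function are illustrative assumptions, not the paper's model.

```python
import random

random.seed(0)

# Hypothetical candidates: (travel-time saving, lane cost) per candidate bus line.
lines = [(14, 5), (9, 4), (16, 7), (6, 2), (11, 5), (8, 3), (13, 6), (4, 2)]
BUDGET = 15

def fitness(bits):
    # Maximize total saving subject to the budget; infeasible plans are rejected.
    saving = sum(s for b, (s, c) in zip(bits, lines) if b)
    cost = sum(c for b, (s, c) in zip(bits, lines) if b)
    return saving if cost <= BUDGET else -1

def evolve(pop_size=30, gens=60):
    pop = [[random.randint(0, 1) for _ in lines] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(lines))  # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(len(lines))       # point mutation
            child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

In the paper the fitness evaluation would itself require solving the lower-level capacity-constrained assignment; here it is collapsed to a fixed per-line saving to keep the sketch self-contained.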
Rational selection of training and test sets for the development of validated QSAR models
NASA Astrophysics Data System (ADS)
Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander
2003-02-01
Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
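The distinction between LOO q2 on the training set and external predictive R2 can be made concrete with a small sketch. Synthetic data and an ordinary least-squares model stand in for the kNN QSAR method here; all values are illustrative assumptions, not the paper's datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "activity" data (hypothetical, for illustration only).
X_train = rng.normal(size=(30, 3))
y_train = X_train @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.5, size=30)
X_test = rng.normal(size=(15, 3))
y_test = X_test @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.5, size=15)

def fit(X, y):
    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Leave-one-out cross-validated q2 on the training set: q2 = 1 - PRESS/SS.
press = 0.0
for i in range(len(y_train)):
    mask = np.arange(len(y_train)) != i
    coef = fit(X_train[mask], y_train[mask])
    press += (y_train[i] - predict(coef, X_train[i:i + 1])[0]) ** 2
q2 = 1.0 - press / np.sum((y_train - y_train.mean()) ** 2)

# External R2 on a held-out test set -- the check the authors recommend.
coef = fit(X_train, y_train)
resid = y_test - predict(coef, X_test)
r2_ext = 1.0 - np.sum(resid ** 2) / np.sum((y_test - y_test.mean()) ** 2)

print(round(q2, 3), round(r2_ext, 3))
```

On real QSAR data the paper's point is precisely that a high q2 does not guarantee a high r2_ext, which is why both must be reported.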
Selecting a CSR Model: Quality and Implications of the Model Adoption Process
ERIC Educational Resources Information Center
Le Floch, Kerstin Carlson; Zhang, Yu; Kurki, Anja; Herrmann, Suzannah
2006-01-01
The process through which a school adopts a comprehensive school reform (CSR) model has been suggested to be a key element in the lifecycle of school reform, contributing to stakeholder buy in and subsequent implementation. We studied the model adoption process, both on a national scale with survey data and in more depth with qualitative case…
ERIC Educational Resources Information Center
Samuelsen, Karen
2012-01-01
The notion that there is often no clear distinction between factorial and typological models (von Davier, Naemi, & Roberts, this issue) is sound. As von Davier et al. state, theory often indicates a preference between these models; however, the statistical criteria by which these are delineated offer much less clarity. In many ways the procedure…
NASA Astrophysics Data System (ADS)
Demyanov, V.; Backhouse, L.; Christie, M.
2015-12-01
There is a continuous challenge in identifying and propagating geologically realistic features into reservoir models. Many contemporary geostatistical algorithms are limited by various modelling assumptions, like stationarity or Gaussianity. Another related challenge is to ensure that the realistic geological features introduced into a geomodel are preserved during the model update in history matching studies, when the model properties are tuned to fit the flow response to production data. The above challenges motivate exploration and application of other statistical approaches to build and calibrate reservoir models, in particular, methods based on statistical learning. The paper proposes a novel data-driven approach - Multiple Kernel Learning (MKL) - for modelling porous property distributions in sub-surface reservoirs. Multiple Kernel Learning aims to extract relevant spatial features from spatial patterns and to combine them in a non-linear way. This ability allows it to handle multiple geological scenarios, which represent different spatial scales and a range of modelling concepts/assumptions. Multiple Kernel Learning is not restricted by deterministic or statistical modelling assumptions and, therefore, is more flexible for modelling heterogeneity at different scales and integrating data and knowledge. We demonstrate an MKL application to a problem of history matching based on diverse prior information embedded into a range of possible geological scenarios. MKL was able to select the most influential prior geological scenarios and fuse the selected spatial features into a multi-scale property model. MKL was applied to the Brugge history matching benchmark example by calibrating the parameters of the MKL reservoir model to production data. The history matching results were compared to the ones obtained from other contemporary approaches - EnKF and kernel PCA with stochastic optimisation.
Dang, Tran Ngoc; Seposo, Xerxes T; Duc, Nguyen Huu Chau; Thang, Tran Binh; An, Do Dang; Hang, Lai Thi Minh; Long, Tran Thanh; Loan, Bui Thi Hong; Honda, Yasushi
2016-01-01
Background The relationship between temperature and mortality has been found to be U-, V-, or J-shaped in developed temperate countries; however, in developing tropical/subtropical cities, it remains unclear. Objectives Our goal was to investigate the relationship between temperature and mortality in Hue, a subtropical city in Viet Nam. Design We collected daily mortality data from the Vietnamese A6 mortality reporting system for 6,214 deceased persons between 2009 and 2013. A distributed lag non-linear model was used to examine the temperature effects on all-cause and cause-specific mortality by assuming negative binomial distribution for count data. We developed an objective-oriented model selection with four steps following the Akaike information criterion (AIC) rule (i.e. a smaller AIC value indicates a better model). Results High temperature-related mortality was more strongly associated with short lags, whereas low temperature-related mortality was more strongly associated with long lags. Low temperatures increased the risk of all-cause mortality more than high temperatures did. We observed elevated temperature-mortality risk in vulnerable groups: elderly people (high temperature effect, relative risk [RR]=1.42, 95% confidence interval [CI]=1.11-1.83; low temperature effect, RR=2.0, 95% CI=1.13-3.52), females (low temperature effect, RR=2.19, 95% CI=1.14-4.21), people with respiratory disease (high temperature effect, RR=2.45, 95% CI=0.91-6.63), and those with cardiovascular disease (high temperature effect, RR=1.6, 95% CI=1.15-2.22; low temperature effect, RR=1.99, 95% CI=0.92-4.28). Conclusions In Hue, temperature significantly increased the risk of mortality, especially in vulnerable groups (i.e. the elderly, females, and people with respiratory and cardiovascular diseases). These findings may provide a foundation for developing adequate policies to address the effects of temperature on health in Hue City.
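The AIC rule the authors follow (a smaller AIC indicates a better model) can be sketched in a generic least-squares setting, where for Gaussian errors AIC reduces to n·ln(RSS/n) + 2k with k the number of fitted parameters. The data and candidate models below are toy assumptions, not the distributed lag non-linear model used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy exposure-response data (hypothetical): quadratic truth plus noise.
x = np.linspace(-2, 2, 80)
y = 1.0 + 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

def aic(x, y, degree):
    # Gaussian least-squares AIC: n*ln(RSS/n) + 2k, with k = degree + 1 parameters.
    coef = np.polyfit(x, y, degree)
    rss = np.sum((y - np.polyval(coef, x)) ** 2)
    n, k = y.size, degree + 1
    return n * np.log(rss / n) + 2 * k

# Score a family of candidate models and keep the one with the smallest AIC.
scores = {d: aic(x, y, d) for d in range(6)}
best = min(scores, key=scores.get)
print(best)
```

The study applies the same comparison logic, but across four structured modelling steps rather than polynomial degrees.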
Identification of landscape features influencing gene flow: How useful are habitat selection models?
Roffler, Gretchen H; Schwartz, Michael K; Pilgrim, Kristy L; Talbot, Sandra L; Sage, George K; Adams, Layne G; Luikart, Gordon
2016-07-01
Understanding how dispersal patterns are influenced by landscape heterogeneity is critical for modeling species connectivity. Resource selection function (RSF) models are increasingly used in landscape genetics approaches. However, because the ecological factors that drive habitat selection may be different from those influencing dispersal and gene flow, it is important to consider explicit assumptions and spatial scales of measurement. We calculated pairwise genetic distance among 301 Dall's sheep (Ovis dalli dalli) in southcentral Alaska using an intensive noninvasive sampling effort and 15 microsatellite loci. We used multiple regression of distance matrices to assess the correlation of pairwise genetic distance and landscape resistance derived from an RSF, and combinations of landscape features hypothesized to influence dispersal. Dall's sheep gene flow was positively correlated with steep slopes, moderate peak normalized difference vegetation indices (NDVI), and open land cover. Whereas RSF covariates were significant in predicting genetic distance, the RSF model itself was not significantly correlated with Dall's sheep gene flow, suggesting that certain habitat features important during summer (rugged terrain, mid-range elevation) were not influential to effective dispersal. This work underscores that consideration of both habitat selection and landscape genetics models may be useful in developing management strategies to both meet the immediate survival of a species and allow for long-term genetic connectivity.
Identification of landscape features influencing gene flow: How useful are habitat selection models?
Roffler, Gretchen H.; Schwartz, Michael K.; Pilgrim, Kristy L.; Talbot, Sandra; Sage, Kevin; Adams, Layne G.; Luikart, Gordon
2016-01-01
Understanding how dispersal patterns are influenced by landscape heterogeneity is critical for modeling species connectivity. Resource selection function (RSF) models are increasingly used in landscape genetics approaches. However, because the ecological factors that drive habitat selection may be different from those influencing dispersal and gene flow, it is important to consider explicit assumptions and spatial scales of measurement. We calculated pairwise genetic distance among 301 Dall's sheep (Ovis dalli dalli) in southcentral Alaska using an intensive noninvasive sampling effort and 15 microsatellite loci. We used multiple regression of distance matrices to assess the correlation of pairwise genetic distance and landscape resistance derived from an RSF, and combinations of landscape features hypothesized to influence dispersal. Dall's sheep gene flow was positively correlated with steep slopes, moderate peak normalized difference vegetation indices (NDVI), and open land cover. Whereas RSF covariates were significant in predicting genetic distance, the RSF model itself was not significantly correlated with Dall's sheep gene flow, suggesting that certain habitat features important during summer (rugged terrain, mid-range elevation) were not influential to effective dispersal. This work underscores that consideration of both habitat selection and landscape genetics models may be useful in developing management strategies to both meet the immediate survival of a species and allow for long-term genetic connectivity.
A linear model fails to predict orientation selectivity of cells in the cat visual cortex.
Volgushev, M; Vidyasagar, T R; Pei, X
1996-01-01
1. Postsynaptic potentials (PSPs) evoked by visual stimulation in simple cells in the cat visual cortex were recorded using in vivo whole-cell technique. Responses to small spots of light presented at different positions over the receptive field and responses to elongated bars of different orientations centred on the receptive field were recorded. 2. To test whether a linear model can account for orientation selectivity of cortical neurones, responses to elongated bars were compared with responses predicted by a linear model from the receptive field map obtained from flashing spots. 3. The linear model faithfully predicted the preferred orientation, but not the degree of orientation selectivity or the sharpness of orientation tuning. The ratio of optimal to non-optimal responses was always underestimated by the model. 4. Thus non-linear mechanisms, which can include suppression of non-optimal responses and/or amplification of optimal responses, are involved in the generation of orientation selectivity in the primary visual cortex. PMID:8930828
2016-01-01
Estrogen receptor β (ERβ) selective agonists are considered potential therapeutic agents for a variety of pathological conditions, including several types of cancer. Their development is particularly challenging, since differences in the ligand binding cavities of the two ER subtypes α and β are minimal. We have carried out a rational design of new salicylketoxime derivatives which display unprecedentedly high levels of ERβ selectivity for this class of compounds, both in binding affinity and in cell-based functional assays. An endogenous gene expression assay was used to further characterize the pharmacological action of these compounds. Finally, these ERβ-selective agonists were found to inhibit proliferation of a glioma cell line in vitro. Most importantly, one of these compounds also proved to be active in an in vivo xenograft model of human glioma, thus demonstrating the high potential of this type of compounds against this devastating disease. PMID:25559213
Forward-in-Time, Spatially Explicit Modeling Software to Simulate Genetic Lineages Under Selection
Currat, Mathias; Gerbault, Pascale; Di, Da; Nunes, José M.; Sanchez-Mazas, Alicia
2015-01-01
SELECTOR is a software package for studying the evolution of multiallelic genes under balancing or positive selection while simulating complex evolutionary scenarios that integrate demographic growth and migration in a spatially explicit population framework. Parameters can be varied both in space and time to account for geographical, environmental, and cultural heterogeneity. SELECTOR can be used within an approximate Bayesian computation estimation framework. We first describe the principles of SELECTOR and validate the algorithms by comparing its outputs for simple models with theoretical expectations. Then, we show how it can be used to investigate genetic differentiation of loci under balancing selection in interconnected demes with spatially heterogeneous gene flow. We identify situations in which balancing selection reduces genetic differentiation between population groups compared with neutrality and explain conflicting outcomes observed for human leukocyte antigen loci. These results and three previously published applications demonstrate that SELECTOR is efficient and robust for building insight into human settlement history and evolution. PMID:26949332
An experimental model for the spatial structuring and selection of bacterial communities.
Thomas, Torsten; Kindinger, Ilona; Yu, Dan; Esvaran, Meera; Blackall, Linda; Forehead, Hugh; Johnson, Craig R; Manefield, Mike
2011-11-01
Community-level selection is an important concept in evolutionary biology and has been predicted to arise in systems that are spatially structured. Here we develop an experimental model for spatially-structured bacterial communities based on coaggregating strains and test their relative fitness under a defined selection pressure. As selection we apply protozoan grazing in a defined, continuous culturing system. We demonstrate that a slow-growing bacterial strain Blastomonas natatoria 2.1, which forms coaggregates with Micrococcus luteus, can outcompete a fast-growing, closely related strain Blastomonas natatoria 2.8 under conditions of protozoan grazing. The competitive benefit provided by spatial structuring has implications for the evolution of natural bacterial communities in the environment.
Feature Selection Based on High Dimensional Model Representation for Hyperspectral Images.
Taskin Kaya, Gulsen; Kaya, Huseyin; Bruzzone, Lorenzo
2017-03-24
In hyperspectral image analysis, the classification task has generally been addressed jointly with dimensionality reduction due to both the high correlation between the spectral features and the noise present in spectral bands, which might significantly degrade classification performance. In supervised classification, limited training instances in proportion to the number of spectral features have negative impacts on the classification accuracy, which is known as the Hughes effect or curse of dimensionality in the literature. In this paper, we focus on the dimensionality reduction problem and propose a novel feature-selection algorithm based on the method called High Dimensional Model Representation. The proposed algorithm is tested on some toy examples and hyperspectral datasets in comparison to conventional feature-selection algorithms in terms of classification accuracy, stability of the selected features and computational time. The results showed that the proposed approach provides both high classification accuracy and robust features with a satisfactory computational time.
Evolution of recombination rates in a multi-locus, haploid-selection, symmetric-viability model.
Chasnov, J R; Ye, Felix Xiaofeng
2013-02-01
A fast algorithm for computing multi-locus recombination is extended to include a recombination-modifier locus. This algorithm and a linear stability analysis is used to investigate the evolution of recombination rates in a multi-locus, haploid-selection, symmetric-viability model for which stable equilibria have recently been determined. When the starting equilibrium is symmetric with two selected loci, we show analytically that modifier alleles that reduce recombination always invade. When the starting equilibrium is monomorphic, and there is a fixed nonzero recombination rate between the modifier locus and the selected loci, we determine analytical conditions for which a modifier allele can invade. In particular, we show that a gap exists between the recombination rates of modifiers that can invade and the recombination rate that specifies the lower stability boundary of the monomorphic equilibrium. A numerical investigation shows that a similar gap exists in a weakened form when the starting equilibrium is fully polymorphic but asymmetric.
NASA Technical Reports Server (NTRS)
Hidalgo, Homero, Jr.
2000-01-01
An innovative methodology for selecting structural target modes based on a specific criterion is presented. An effective approach to single out modes which interact with specific locations on a structure has been developed for the X-33 Launch Vehicle Finite Element Model (FEM). The Root-Sum-Square (RSS) displacement method presented here computes the resultant modal displacement for each mode at selected degrees of freedom (DOF) and sorts the results to locate the modes with the highest values. This method was used to determine the modes which most influenced specific locations/points on the X-33 flight vehicle, such as avionics control components, aero-surface control actuators, propellant valves and engine points, for use in flight control stability analysis and flight POGO stability analysis. Additionally, the modal RSS method allows primary or global target vehicle modes to be identified in an accurate and efficient manner.
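The RSS displacement ranking is straightforward to sketch. A random mode-shape matrix stands in for real FEM output here; the dimensions and values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical mode-shape matrix: 10 modes x 6 selected-DOF displacements.
phi = rng.normal(size=(10, 6))

# Root-sum-square resultant displacement per mode over the selected DOFs.
rss = np.sqrt(np.sum(phi**2, axis=1))

# Sort modes by descending RSS: the modes most active at these locations.
ranking = np.argsort(rss)[::-1]
print(ranking[:3])   # indices of the three most influential modes
```

In practice phi would hold mass-normalized eigenvector components extracted at the DOFs of interest (actuator attach points, valve locations, and so on).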
NASA Astrophysics Data System (ADS)
Coakley, Kevin J.; Qu, Jifeng
2017-04-01
In the electronic measurement of the Boltzmann constant based on Johnson noise thermometry, the ratio of the power spectral densities of thermal noise across a resistor at the triple point of water, and pseudo-random noise synthetically generated by a quantum-accurate voltage-noise source is constant to within 1 part in a billion for frequencies up to 1 GHz. Given knowledge of this ratio, and the values of other parameters that are known or measured, one can determine the Boltzmann constant. Due, in part, to mismatch between transmission lines, the experimental ratio spectrum varies with frequency. We model this spectrum as an even polynomial function of frequency where the constant term in the polynomial determines the Boltzmann constant. When determining this constant (offset) from experimental data, the assumed complexity of the ratio spectrum model and the maximum frequency analyzed (fitting bandwidth) dramatically affects results. Here, we select the complexity of the model by cross-validation—a data-driven statistical learning method. For each of many fitting bandwidths, we determine the component of uncertainty of the offset term that accounts for random and systematic effects associated with imperfect knowledge of model complexity. We select the fitting bandwidth that minimizes this uncertainty. In the most recent measurement of the Boltzmann constant, results were determined, in part, by application of an earlier version of the method described here. Here, we extend the earlier analysis by considering a broader range of fitting bandwidths and quantify an additional component of uncertainty that accounts for imperfect performance of our fitting bandwidth selection method. For idealized simulated data with additive noise similar to experimental data, our method correctly selects the true complexity of the ratio spectrum model for all cases considered. A new analysis of data from the recent experiment yields evidence for a temporal trend in the offset
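A minimal sketch of the complexity-selection idea: fit an even polynomial in frequency, choose the number of even terms by plain k-fold cross-validation, and read off the constant term. The toy ratio spectrum, noise level, and fold scheme below are assumptions; the actual analysis is considerably more elaborate.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy ratio spectrum (hypothetical): even polynomial in frequency plus noise.
f = np.linspace(0, 1, 120)
y = 2.0 - 0.3 * f**2 + 0.05 * f**4 + rng.normal(scale=0.01, size=f.size)

def cv_error(f, y, n_even_terms, n_folds=5):
    # Design matrix of even powers: 1, f^2, f^4, ...
    X = np.column_stack([f ** (2 * j) for j in range(n_even_terms)])
    idx = np.arange(f.size)
    err = 0.0
    for k in range(n_folds):
        test = idx % n_folds == k
        coef, *_ = np.linalg.lstsq(X[~test], y[~test], rcond=None)
        err += np.sum((y[test] - X[test] @ coef) ** 2)
    return err

# Pick the model complexity that minimizes out-of-fold squared error.
scores = {m: cv_error(f, y, m) for m in range(1, 6)}
m_best = min(scores, key=scores.get)

# Refit at the selected complexity; the constant term is the offset of interest.
X = np.column_stack([f ** (2 * j) for j in range(m_best)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(m_best, round(coef[0], 3))
```

The paper additionally sweeps the fitting bandwidth and propagates the uncertainty of the selection step itself, which this sketch omits.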
Numerical algebraic geometry for model selection and its application to the life sciences
Gross, Elizabeth; Davis, Brent; Ho, Kenneth L.; Bates, Daniel J.
2016-01-01
Researchers working with mathematical models are often confronted by the related problems of parameter estimation, model validation and model selection. These are all optimization problems, well known to be challenging due to nonlinearity, non-convexity and multiple local optima. Furthermore, the challenges are compounded when only partial data are available. Here, we consider polynomial models (e.g. mass-action chemical reaction networks at steady state) and describe a framework for their analysis based on optimization using numerical algebraic geometry. Specifically, we use probability-one polynomial homotopy continuation methods to compute all critical points of the objective function, then filter to recover the global optima. Our approach exploits the geometrical structures relating models and data, and we demonstrate its utility on examples from cell signalling, synthetic biology and epidemiology. PMID:27733697
Numerical algebraic geometry for model selection and its application to the life sciences.
Gross, Elizabeth; Davis, Brent; Ho, Kenneth L; Bates, Daniel J; Harrington, Heather A
2016-10-01
Researchers working with mathematical models are often confronted by the related problems of parameter estimation, model validation and model selection. These are all optimization problems, well known to be challenging due to nonlinearity, non-convexity and multiple local optima. Furthermore, the challenges are compounded when only partial data are available. Here, we consider polynomial models (e.g. mass-action chemical reaction networks at steady state) and describe a framework for their analysis based on optimization using numerical algebraic geometry. Specifically, we use probability-one polynomial homotopy continuation methods to compute all critical points of the objective function, then filter to recover the global optima. Our approach exploits the geometrical structures relating models and data, and we demonstrate its utility on examples from cell signalling, synthetic biology and epidemiology.
Consideration in selecting crops for the human-rated life support system: a linear programming model
NASA Astrophysics Data System (ADS)
Wheeler, E. F.; Kossowski, J.; Goto, E.; Langhans, R. W.; White, G.; Albright, L. D.; Wilcox, D.
A Linear Programming model has been constructed which aids in selecting appropriate crops for CELSS (Controlled Environment Life Support System) food production. A team of Controlled Environment Agriculture (CEA) faculty, staff, graduate students and invited experts representing more than a dozen disciplines, provided a wide range of expertise in developing the model and the crop production program. The model incorporates nutritional content and controlled-environment based production yields of carefully chosen crops into a framework where a crop mix can be constructed to suit the astronauts' needs. The crew's nutritional requirements can be adequately satisfied with only a few crops (assuming vitamin mineral supplements are provided) but this will not be satisfactory from a culinary standpoint. This model is flexible enough that taste and variety driven food choices can be built into the model.
Consideration in selecting crops for the human-rated life support system: a Linear Programming model
NASA Technical Reports Server (NTRS)
Wheeler, E. F.; Kossowski, J.; Goto, E.; Langhans, R. W.; White, G.; Albright, L. D.; Wilcox, D.; Henninger, D. L. (Principal Investigator)
1996-01-01
A Linear Programming model has been constructed which aids in selecting appropriate crops for CELSS (Controlled Environment Life Support System) food production. A team of Controlled Environment Agriculture (CEA) faculty, staff, graduate students and invited experts representing more than a dozen disciplines, provided a wide range of expertise in developing the model and the crop production program. The model incorporates nutritional content and controlled-environment based production yields of carefully chosen crops into a framework where a crop mix can be constructed to suit the astronauts' needs. The crew's nutritional requirements can be adequately satisfied with only a few crops (assuming vitamin mineral supplements are provided) but this will not be satisfactory from a culinary standpoint. This model is flexible enough that taste and variety driven food choices can be built into the model.
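The crop-selection idea can be sketched as a small constrained optimization: meet the crew's nutrient requirements while minimizing total growing area. The sketch substitutes brute-force grid enumeration for a real Linear Programming solver, and the crops, per-area yields, and requirements are invented for illustration only.

```python
from itertools import product

# Hypothetical per-m^2 daily yields: (protein g, energy kcal, vitamin units).
crops = {
    "wheat":   (1.2, 30.0, 0.2),
    "soybean": (3.0, 18.0, 0.5),
    "lettuce": (0.3,  4.0, 2.0),
}
need = (60.0, 800.0, 40.0)   # crew daily requirement (illustrative numbers)

best = None
# Enumerate candidate growing areas (m^2) on a coarse grid -- a stand-in for
# the simplex method an actual LP solver would use.
for areas in product(range(0, 81, 2), repeat=3):
    totals = [sum(a * c[i] for a, c in zip(areas, crops.values()))
              for i in range(3)]
    if all(t >= n for t, n in zip(totals, need)):
        total_area = sum(areas)
        if best is None or total_area < best[0]:
            best = (total_area, dict(zip(crops, areas)))

print(best)
```

In the actual model the objective and constraints would be linear in continuous crop areas, with extra rows encoding the taste- and variety-driven food choices mentioned in the abstract.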
NASA Technical Reports Server (NTRS)
Noor, A. K.; Peters, J. M.
1981-01-01
Simple mixed models are developed for use in the geometrically nonlinear analysis of deep arches. A total Lagrangian description of the arch deformation is used, the analytical formulation being based on a form of the nonlinear deep arch theory with the effects of transverse shear deformation included. The fundamental unknowns comprise the six internal forces and generalized displacements of the arch, and the element characteristic arrays are obtained by using Hellinger-Reissner mixed variational principle. The polynomial interpolation functions employed in approximating the forces are one degree lower than those used in approximating the displacements, and the forces are discontinuous at the interelement boundaries. Attention is given to the equivalence between the mixed models developed herein and displacement models based on reduced integration of both the transverse shear and extensional energy terms. The advantages of mixed models over equivalent displacement models are summarized. Numerical results are presented to demonstrate the high accuracy and effectiveness of the mixed models developed and to permit a comparison of their performance with that of other mixed models reported in the literature.
Influence of model selection on the predicted distribution of the seagrass Zostera marina
NASA Astrophysics Data System (ADS)
Downie, Anna-Leena; von Numers, Mikael; Boström, Christoffer
2013-04-01
There is an increasing need to model the distribution of species and habitats for effective conservation planning, but there is a paucity of models for the marine environment. We used presence (131) and absence (219) records of the marine angiosperm Zostera marina L. from the archipelago of SW Finland, northern Baltic Sea, to model its distribution in a 5400 km2 area. We used depth, slope, turbidity, wave exposure and distance to sandy shores as environmental predictors, and compared a presence-absence method: generalised additive model (GAM), with a presence only method: maximum entropy (Maxent). Models were validated using semi-independent data sets. Both models performed well and described the niche of Z. marina fairly consistently, although there were differences in the way the models weighted the environmental variables, and consequently the spatial predictions differed somewhat. A notable outcome from the process was that with relatively equal model performance, the area actually predicted in geographical space can vary by twofold. The area predicted as suitable for Z. marina by the ensemble was almost half of that predicted by the GAM model by itself. The ensemble of model predictions increased the model predictive capability marginally and clearly shifted the model towards a more conservative prediction, increasing specificity, but at the same time sacrificing sensitivity. The environmental predictors selected into the final models described the potential distribution of Z. marina well and showed that in the northern Baltic the species occupies a narrow niche, typically thriving in shallow and moderately exposed to exposed locations near sandy shores. We conclude that a prediction based on a combination of model results provides a more realistic estimate of the core area suitable for Z. marina and should be the modelling approach implemented in conservation planning and management.
Modulation Depth Estimation and Variable Selection in State-Space Models for Neural Interfaces
Hochberg, Leigh R.; Donoghue, John P.; Brown, Emery N.
2015-01-01
Rapid developments in neural interface technology are making it possible to record increasingly large signal sets of neural activity. Various factors such as asymmetrical information distribution and across-channel redundancy may, however, limit the benefit of high-dimensional signal sets, and the increased computational complexity may not yield corresponding improvement in system performance. High-dimensional system models may also lead to overfitting and lack of generalizability. To address these issues, we present a generalized modulation depth measure using the state-space framework that quantifies the tuning of a neural signal channel to relevant behavioral covariates. For a dynamical system, we develop computationally efficient procedures for estimating modulation depth from multivariate data. We show that this measure can be used to rank neural signals and select an optimal channel subset for inclusion in the neural decoding algorithm. We present a scheme for choosing the optimal subset based on model order selection criteria. We apply this method to neuronal ensemble spike-rate decoding in neural interfaces, using our framework to relate motor cortical activity with intended movement kinematics. With offline analysis of intracortical motor imagery data obtained from individuals with tetraplegia using the BrainGate neural interface, we demonstrate that our variable selection scheme is useful for identifying and ranking the most information-rich neural signals. We demonstrate that our approach offers several orders of magnitude lower complexity but virtually identical decoding performance compared to greedy search and other selection schemes. Our statistical analysis shows that the modulation depth of human motor cortical single-unit signals is well characterized by the generalized Pareto distribution. Our variable selection scheme has wide applicability in problems involving multisensor signal modeling and estimation in biomedical engineering systems. PMID
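A toy version of the ranking-and-subset-selection scheme: squared correlation with a kinematic covariate stands in for the state-space modulation depth measure, and a Gaussian AIC stands in for the model order selection criterion. All data are simulated and every parameter below is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy session (hypothetical): 200 time steps, 12 channels, 1 kinematic covariate.
T, C = 200, 12
v = rng.normal(size=T)                       # intended velocity
depth = np.linspace(1.5, 0.0, C)             # true per-channel tuning strengths
rates = v[:, None] * depth[None, :] + rng.normal(scale=1.0, size=(T, C))

# Modulation-depth proxy: squared correlation of each channel with the covariate.
r2 = np.array([np.corrcoef(v, rates[:, c])[0, 1] ** 2 for c in range(C)])
order = np.argsort(r2)[::-1]                 # rank channels, most informative first

def aic_for_subset(k):
    # Linear decode of v from the top-k channels; Gaussian AIC on the residual.
    X = np.column_stack([np.ones(T), rates[:, order[:k]]])
    coef, *_ = np.linalg.lstsq(X, v, rcond=None)
    rss = np.sum((v - X @ coef) ** 2)
    return T * np.log(rss / T) + 2 * (k + 1)

# Choose the channel-subset size by a model order selection criterion.
k_best = min(range(1, C + 1), key=aic_for_subset)
print(k_best, order[:k_best])
```

The point of the paper's scheme is the same: ranking by a tuning measure lets the decoder keep the information-rich channels at a fraction of the cost of a greedy subset search.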
NASA Astrophysics Data System (ADS)
Tang, Baobao
Different positions of a material used in structures experience different stresses, sometimes at both extremes, during processing, manufacturing, and service. Taking three-point bending as an example, the plate experiences higher stress in the mid-span area and lower stress near both ends of the plate. In order to ensure the performance and reduce the cost of the composite, placement of different composite materials with different mechanical properties, i.e. selective reinforcement, is proposed. Very few studies have been conducted on selective reinforcement. Therefore, basic understanding of the relationship between the selective reinforcing variables and the overall properties of the composite material is still lacking, and there is still no clear methodology to design composite materials under different types of loads. This study started from the analysis of a composite laminate under the three-point bending test. From the mechanical analysis and simulation results for homogeneously reinforced composite materials, it was found that the stress is not evenly distributed across the plate in either the through-thickness or the longitudinal direction. Based on these results, a map of the stress distribution under three-point bending was developed. Next, the composite plate was selectively designed using two types of configurations. Mathematical and finite element analysis (FEA) models were built based on these designs. Experimental data from tests of hybrid composite materials were used to verify the mathematical and FEA models. Analysis of the mathematical model indicates that increasing the stiffness of the material at the top and bottom surfaces and in the mid-span area is the most effective way to improve the flexural modulus in the three-point bending test. At the end of this study, a complete methodology to perform the selective design was developed.
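The uneven stress distribution that motivates selective reinforcement follows from classical beam theory: sigma(x, z) = M(x)·z/I, with a peak value of 3FL/(2bh^2) at the mid-span surfaces. The sketch below evaluates this map for assumed load and plate dimensions (not the study's specimens).

```python
import numpy as np

# Assumed three-point bending parameters: load (N), span, width, thickness (m).
F, L, b, h = 100.0, 0.2, 0.02, 0.004
I = b * h**3 / 12.0                        # second moment of area

x = np.linspace(0, L, 101)                 # position along the span
z = np.linspace(-h / 2, h / 2, 21)         # position through the thickness
M = F * np.minimum(x, L - x) / 2.0         # bending moment, peaks at mid-span
sigma = M[:, None] * z[None, :] / I        # bending stress map (Pa)

# Maximum stress occurs at mid-span, on the top and bottom surfaces -- the
# locations the study identifies as most worth reinforcing.
i, j = np.unravel_index(np.abs(sigma).argmax(), sigma.shape)
print(x[i], z[j], sigma.max())
```

This is exactly why stiffening only the outer plies over the mid-span region raises the flexural modulus so efficiently: the material is placed where sigma(x, z) is largest.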
A dislocation-based model for variant selection during the γ-to-α‧ transformation
NASA Astrophysics Data System (ADS)
Wittridge, N. J.; Jonas, J. J.; Root, J. H.
2001-04-01
A phase transformation model is described for variant selection during the austenite-to-martensite transformation. The model depends entirely on the presence of glide dislocations in the deformed austenite. The direct correlation between the 24 slip systems of the Bishop and Hill (B-H) crystal plasticity model and the 24 <112> rotation axes of the Kurdjumov-Sachs (K-S) orientation relationship is employed. Two selection criteria, based on slip activity and permissible dislocation reactions, govern the variants that are chosen to represent the final transformation texture. The development of the model via analysis of the experimental results of Liu and Bunge is described. The model is applied to four distinct strain paths: (1) plane strain rolling, (2) axisymmetric extension, (3) axisymmetric compression, and (4) simple shear. Experimental deformation and transformation textures were produced for comparison purposes via appropriate deformation and quenching procedures. In each case, the transformation texture predicted using the dislocation reaction model is in excellent agreement with the experimental findings.
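The count of 24 <112> rotation axes cited above can be verified by enumerating the signed permutations of (1, 1, 2), which is all the <112> family contains:

```python
from itertools import permutations, product

# Enumerate the <112> direction family: every signed permutation of
# (1, 1, 2).  Duplicates from the repeated "1" are removed by the set.
axes = set()
for perm in permutations((1, 1, 2)):
    for signs in product((1, -1), repeat=3):
        axes.add(tuple(s * v for s, v in zip(signs, perm)))
print(len(axes))  # 24
```

Three distinct placements of the "2" times eight sign combinations give 24 axes, matching the 24 K-S variants and the 24 B-H slip systems the model pairs them with.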
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
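One standard statistic for the near-linear dependency screening recommended above is the variance inflation factor (VIF); the candidate terms and the deliberately collinear column below are synthetic illustrations, not the MK40 calibration data:

```python
import numpy as np

# Synthetic candidate regression terms: two loads plus higher-order
# terms, one of which is nearly collinear with N1.
rng = np.random.default_rng(1)
n = 120
N1 = rng.uniform(-1, 1, n)
N2 = rng.uniform(-1, 1, n)
terms = {
    "N1": N1,
    "N2": N2,
    "N1*N2": N1 * N2,
    "N1^2": N1**2,
    "near_dup": N1 + 1e-3 * rng.standard_normal(n),  # near-linear dependency
}
X = np.column_stack(list(terms.values()))

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing term j on
# all remaining terms; a large VIF flags a (near-)linear dependency.
def vif(X, j):
    y = X[:, j]
    A = np.column_stack([np.delete(X, j, axis=1), np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r2 = 1 - np.sum((y - A @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
    return 1.0 / (1.0 - r2)

vifs = {name: vif(X, j) for j, name in enumerate(terms)}
for name, v in vifs.items():
    print(f"{name:8s} VIF = {v:.1f}")
```

Terms with very large VIFs would be dropped before the significance tests, since their coefficients are not separately identifiable from the calibration data.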
Monniaux, Danielle; Michel, Philippe; Postel, Marie; Clément, Frédérique
2016-06-01
In this review, we present multi-scale mathematical models of ovarian follicular development that are based on the embedding of physiological mechanisms into the cell scale. During basal follicular development, follicular growth operates through an increase in the oocyte size concomitant with the proliferation of its surrounding granulosa cells. We have developed a spatio-temporal model of follicular morphogenesis explaining how the interactions between the oocyte and granulosa cells need to be properly balanced to shape the follicle. During terminal follicular development, the ovulatory follicle is selected amongst a cohort of simultaneously growing follicles. To address this process of follicle selection, we have developed a model giving a continuous and deterministic description of follicle development, adapted to high numbers of cells and based on the dynamical, hormonally regulated partitioning of granulosa cells among different cell states, namely proliferation, differentiation and apoptosis. This model takes into account the hormonal feedback loop involving the growing ovarian follicles and the pituitary gland, and enables the exploration of mechanisms regulating the number of ovulations at each ovarian cycle. Both models are useful for addressing physiological and pathological situations in the ovary. Moreover, they can be proposed as generic modelling environments to study various developmental processes and cell interaction mechanisms.
Baldassarre, Luca; Pontil, Massimiliano; Mourão-Miranda, Janaina
2017-01-01
Structured sparse methods have received significant attention in neuroimaging. These methods allow the incorporation of domain knowledge through additional spatial and temporal constraints in the predictive model and carry the promise of being more interpretable than non-structured sparse methods, such as LASSO.
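As a point of comparison for the structured methods discussed above, the non-structured LASSO baseline can be sketched with plain coordinate descent on synthetic sparse data; the regularization strength, data, and dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]                 # sparse ground truth
y = X @ true_beta + 0.1 * rng.standard_normal(n)

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1: cycle over
# coordinates, soft-thresholding each against its partial residual.
def lasso_cd(X, y, lam, n_iter=200):
    beta = np.zeros(X.shape[1])
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            resid = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ resid, lam) / col_sq[j]
    return beta

beta_hat = lasso_cd(X, y, lam=20.0)
print(np.flatnonzero(np.abs(beta_hat) > 1e-6))
```

The l1 penalty recovers the sparse support but encodes no spatial or temporal structure; the structured variants add exactly such constraints on top of this objective.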