NASA Astrophysics Data System (ADS)
Strader, Anne; Schneider, Max; Schorlemmer, Danijel; Liukis, Maria
2016-04-01
The Collaboratory for the Study of Earthquake Predictability (CSEP) was developed to rigorously test earthquake forecasts retrospectively and prospectively through reproducible, completely transparent experiments within a controlled environment (Zechar et al., 2010). During 2006-2011, thirteen five-year time-invariant prospective earthquake mainshock forecasts developed by the Regional Earthquake Likelihood Models (RELM) working group were evaluated through the CSEP testing center (Schorlemmer and Gerstenberger, 2007). The number, spatial, and magnitude components of the forecasts were compared to the respective observed seismicity components using a set of consistency tests (Schorlemmer et al., 2007, Zechar et al., 2010). In the initial experiment, all but three forecast models passed every test at the 95% significance level, with all forecasts displaying log-likelihoods (L-test) and magnitude distributions (M-test) consistent with the observed seismicity. In the ten-year RELM experiment update, we reevaluate these earthquake forecasts over an eight-year period from 2008 to 2016 to determine the consistency of previous likelihood testing results over longer time intervals. Additionally, we test the Uniform California Earthquake Rupture Forecast (UCERF2), developed by the U.S. Geological Survey (USGS), and the earthquake rate model developed by the California Geological Survey (CGS) and the USGS for the National Seismic Hazard Mapping Program (NSHMP) against the RELM forecasts. Both the UCERF2 and NSHMP forecasts pass all consistency tests, though the Helmstetter et al. (2007) and Shen et al. (2007) models exhibit greater information gain per earthquake according to the T- and W-tests (Rhoades et al., 2011). Though all but three RELM forecasts pass the spatial likelihood test (S-test), multiple forecasts fail the M-test due to overprediction of the number of earthquakes during the target period. Though there is no significant difference between the UCERF2 and NSHMP models, residual scores show that the NSHMP model is preferred in locations with earthquake occurrence, due to the lower seismicity rates forecasted by the UCERF2 model.
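As an illustration of the number-consistency idea behind these tests, the sketch below checks whether an observed count of target earthquakes is plausible under a forecast's total expected rate, assuming a Poisson distribution of event counts; the function and variable names are ours, not those of the CSEP software, and this is a simplified stand-in rather than the official test implementation.

```python
import numpy as np
from scipy.stats import poisson

def n_test(forecast_rates, n_observed, alpha=0.05):
    """Poisson N-test sketch: is the observed number of target earthquakes
    consistent with the total expected rate of a gridded forecast?"""
    n_expected = float(np.sum(forecast_rates))               # sum over all space-magnitude bins
    delta1 = 1.0 - poisson.cdf(n_observed - 1, n_expected)   # P(N >= n_observed)
    delta2 = poisson.cdf(n_observed, n_expected)             # P(N <= n_observed)
    consistent = (delta1 > alpha / 2.0) and (delta2 > alpha / 2.0)
    return n_expected, delta1, delta2, consistent

# toy usage: a forecast expecting about 25 events over the test period, 31 observed
rates = np.full(7682, 25.0 / 7682)
print(n_test(rates, 31))
```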
Lee, Ya-Ting; Turcotte, Donald L; Holliday, James R; Sachs, Michael K; Rundle, John B; Chen, Chien-Chih; Tiampo, Kristy F
2011-10-04
The Regional Earthquake Likelihood Models (RELM) test of earthquake forecasts in California was the first competitive evaluation of forecasts of future earthquake occurrence. Participants submitted expected probabilities of occurrence of M ≥ 4.95 earthquakes in 0.1° × 0.1° cells for the period January 1, 2006, to December 31, 2010. Probabilities were submitted for 7,682 cells in California and adjacent regions. During this period, 31 M ≥ 4.95 earthquakes occurred in the test region. These earthquakes occurred in 22 test cells. This seismic activity was dominated by earthquakes associated with the M = 7.2, April 4, 2010, El Mayor-Cucapah earthquake in northern Mexico. This earthquake occurred in the test region, and 16 of the other 30 earthquakes in the test region could be associated with it. Nine complete forecasts were submitted by six participants. In this paper, we present the forecasts in a way that allows the reader to evaluate which forecast is the most "successful" in terms of the locations of future earthquakes. We conclude that the RELM test was a success and suggest ways in which the results can be used to improve future forecasts.
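To make the cell-counting setup concrete, the following sketch bins target epicentres into 0.1° × 0.1° forecast cells on a regular latitude-longitude grid; the grid origin, dimensions, and function name are illustrative assumptions, not values from the RELM specification.

```python
import numpy as np

def bin_events(lons, lats, lon0, lat0, n_lon, n_lat, cell=0.1):
    """Count target earthquakes per 0.1-degree forecast cell; events falling
    outside the grid are ignored."""
    counts = np.zeros((n_lat, n_lon), dtype=int)
    i = np.floor((np.asarray(lats) - lat0) / cell).astype(int)
    j = np.floor((np.asarray(lons) - lon0) / cell).astype(int)
    ok = (i >= 0) & (i < n_lat) & (j >= 0) & (j < n_lon)
    np.add.at(counts, (i[ok], j[ok]), 1)
    return counts

# toy usage: three hypothetical M >= 4.95 epicentres on a small sub-grid
c = bin_events([-115.31, -115.30, -116.02], [32.26, 32.27, 32.70],
               lon0=-117.0, lat0=32.0, n_lon=30, n_lat=20)
print(c.sum(), np.count_nonzero(c))   # total events, number of occupied cells
```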
Lee, Ya-Ting; Turcotte, Donald L.; Holliday, James R.; Sachs, Michael K.; Rundle, John B.; Chen, Chien-Chih; Tiampo, Kristy F.
2011-01-01
The Regional Earthquake Likelihood Models (RELM) test of earthquake forecasts in California was the first competitive evaluation of forecasts of future earthquake occurrence. Participants submitted expected probabilities of occurrence of M≥4.95 earthquakes in 0.1° × 0.1° cells for the period January 1, 2006, to December 31, 2010. Probabilities were submitted for 7,682 cells in California and adjacent regions. During this period, 31 M≥4.95 earthquakes occurred in the test region. These earthquakes occurred in 22 test cells. This seismic activity was dominated by earthquakes associated with the M = 7.2, April 4, 2010, El Mayor–Cucapah earthquake in northern Mexico. This earthquake occurred in the test region, and 16 of the other 30 earthquakes in the test region could be associated with it. Nine complete forecasts were submitted by six participants. In this paper, we present the forecasts in a way that allows the reader to evaluate which forecast is the most “successful” in terms of the locations of future earthquakes. We conclude that the RELM test was a success and suggest ways in which the results can be used to improve future forecasts. PMID:21949355
First Results of the Regional Earthquake Likelihood Models Experiment
Schorlemmer, D.; Zechar, J.D.; Werner, M.J.; Field, E.H.; Jackson, D.D.; Jordan, T.H.
2010-01-01
The ability to successfully predict the future behavior of a system is a strong indication that the system is well understood. Certainly many details of the earthquake system remain obscure, but several hypotheses related to earthquake occurrence and seismic hazard have been proffered, and predicting earthquake behavior is a worthy goal and demanded by society. Along these lines, one of the primary objectives of the Regional Earthquake Likelihood Models (RELM) working group was to formalize earthquake occurrence hypotheses in the form of prospective earthquake rate forecasts in California. RELM members, working in small research groups, developed more than a dozen 5-year forecasts; they also outlined a performance evaluation method and provided a conceptual description of a Testing Center in which to perform predictability experiments. Subsequently, researchers working within the Collaboratory for the Study of Earthquake Predictability (CSEP) have begun implementing Testing Centers in different locations worldwide, and the RELM predictability experiment, a truly prospective earthquake prediction effort, is underway within the U.S. branch of CSEP. The experiment, designed to compare time-invariant 5-year earthquake rate forecasts, is now approximately halfway to its completion. In this paper, we describe the models under evaluation and present, for the first time, preliminary results of this unique experiment. While these results are preliminary (the forecasts were meant for an application of 5 years), we find interesting results: most of the models are consistent with the observation and one model forecasts the distribution of earthquakes best. We discuss the observed sample of target earthquakes in the context of historical seismicity within the testing region, highlight potential pitfalls of the current tests, and suggest plans for future revisions to experiments such as this one. © 2010 The Author(s).
First Results of the Regional Earthquake Likelihood Models Experiment
NASA Astrophysics Data System (ADS)
Schorlemmer, Danijel; Zechar, J. Douglas; Werner, Maximilian J.; Field, Edward H.; Jackson, David D.; Jordan, Thomas H.
2010-08-01
The ability to successfully predict the future behavior of a system is a strong indication that the system is well understood. Certainly many details of the earthquake system remain obscure, but several hypotheses related to earthquake occurrence and seismic hazard have been proffered, and predicting earthquake behavior is a worthy goal and demanded by society. Along these lines, one of the primary objectives of the Regional Earthquake Likelihood Models (RELM) working group was to formalize earthquake occurrence hypotheses in the form of prospective earthquake rate forecasts in California. RELM members, working in small research groups, developed more than a dozen 5-year forecasts; they also outlined a performance evaluation method and provided a conceptual description of a Testing Center in which to perform predictability experiments. Subsequently, researchers working within the Collaboratory for the Study of Earthquake Predictability (CSEP) have begun implementing Testing Centers in different locations worldwide, and the RELM predictability experiment—a truly prospective earthquake prediction effort—is underway within the U.S. branch of CSEP. The experiment, designed to compare time-invariant 5-year earthquake rate forecasts, is now approximately halfway to its completion. In this paper, we describe the models under evaluation and present, for the first time, preliminary results of this unique experiment. While these results are preliminary—the forecasts were meant for an application of 5 years—we find interesting results: most of the models are consistent with the observation and one model forecasts the distribution of earthquakes best. We discuss the observed sample of target earthquakes in the context of historical seismicity within the testing region, highlight potential pitfalls of the current tests, and suggest plans for future revisions to experiments such as this one.
Mavi, Parm; Niranjan, Rituraj; Dutt, Parmesh; Zaidi, Asifa; Shukla, Jai Shankar; Korfhagen, Thomas
2014-01-01
Resistin-like molecule (Relm)-α is a secreted, cysteine-rich protein belonging to a newly defined family of proteins, including resistin, Relm-β, and Relm-γ. Although resistin was initially defined based on its insulin-resistance activity, the family members are highly induced in various inflammatory states. Earlier studies implicated Relm-α in insulin resistance, asthmatic responses, and intestinal inflammation; however, its function still remains an enigma. We now report that Relm-α is strongly induced in the esophagus in an allergen-challenged murine model of eosinophilic esophagitis (EoE). Furthermore, to understand the in vivo role of Relm-α, we generated Relm-α gene-inducible bitransgenic mice by using lung-specific CC-10 promoter (CC10-rtTA-Relm-α). We found Relm-α protein is significantly induced in the esophagus of CC10-rtTA-Relm-α bitransgenic mice exposed to doxycycline food. The most prominent effect observed by the induction of Relm-α is epithelial cell hyperplasia, basal layer thickness, accumulation of activated CD4+ and CD4− T cell subsets, and eosinophilic inflammation in the esophagus. The in vitro experiments further confirm that Relm-α promotes primary epithelial cell proliferation but has no chemotactic activity for eosinophils. Taken together, our studies report for the first time that Relm-α induction in the esophagus has a major role in promoting epithelial cell hyperplasia and basal layer thickness, and the accumulation of activated CD4+ and CD4− T cell subsets may be responsible for partial esophageal eosinophilia in the mouse models of EoE. Notably, the epithelial cell hyperplasia and basal layer thickness are the characteristic features commonly observed in human EoE. PMID:24994859
Mavi, Parm; Niranjan, Rituraj; Dutt, Parmesh; Zaidi, Asifa; Shukla, Jai Shankar; Korfhagen, Thomas; Mishra, Anil
2014-09-01
Resistin-like molecule (Relm)-α is a secreted, cysteine-rich protein belonging to a newly defined family of proteins, including resistin, Relm-β, and Relm-γ. Although resistin was initially defined based on its insulin-resistance activity, the family members are highly induced in various inflammatory states. Earlier studies implicated Relm-α in insulin resistance, asthmatic responses, and intestinal inflammation; however, its function still remains an enigma. We now report that Relm-α is strongly induced in the esophagus in an allergen-challenged murine model of eosinophilic esophagitis (EoE). Furthermore, to understand the in vivo role of Relm-α, we generated Relm-α gene-inducible bitransgenic mice by using lung-specific CC-10 promoter (CC10-rtTA-Relm-α). We found Relm-α protein is significantly induced in the esophagus of CC10-rtTA-Relm-α bitransgenic mice exposed to doxycycline food. The most prominent effect observed by the induction of Relm-α is epithelial cell hyperplasia, basal layer thickness, accumulation of activated CD4(+) and CD4(-) T cell subsets, and eosinophilic inflammation in the esophagus. The in vitro experiments further confirm that Relm-α promotes primary epithelial cell proliferation but has no chemotactic activity for eosinophils. Taken together, our studies report for the first time that Relm-α induction in the esophagus has a major role in promoting epithelial cell hyperplasia and basal layer thickness, and the accumulation of activated CD4(+) and CD4(-) T cell subsets may be responsible for partial esophageal eosinophilia in the mouse models of EoE. Notably, the epithelial cell hyperplasia and basal layer thickness are the characteristic features commonly observed in human EoE. Copyright © 2014 the American Physiological Society.
Earthquake likelihood model testing
Schorlemmer, D.; Gerstenberger, M.C.; Wiemer, S.; Jackson, D.D.; Rhoades, D.A.
2007-01-01
INTRODUCTION: The Regional Earthquake Likelihood Models (RELM) project aims to produce and evaluate alternate models of earthquake potential (probability per unit volume, magnitude, and time) for California. Based on differing assumptions, these models are produced to test the validity of their assumptions and to explore which models should be incorporated in seismic hazard and risk evaluation. Tests based on physical and geological criteria are useful but we focus on statistical methods using future earthquake catalog data only. We envision two evaluations: a test of consistency with observed data and a comparison of all pairs of models for relative consistency. Both tests are based on the likelihood method, and both are fully prospective (i.e., the models are not adjusted to fit the test data). To be tested, each model must assign a probability to any possible event within a specified region of space, time, and magnitude. For our tests the models must use a common format: earthquake rates in specified "bins" with location, magnitude, time, and focal mechanism limits. Seismology cannot yet deterministically predict individual earthquakes; however, it should seek the best possible models for forecasting earthquake occurrence. This paper describes the statistical rules of an experiment to examine and test earthquake forecasts. The primary purposes of the tests described below are to evaluate physical models for earthquakes, assure that source models used in seismic hazard and risk studies are consistent with earthquake data, and provide quantitative measures by which models can be assigned weights in a consensus model or be judged as suitable for particular regions. In this paper we develop a statistical method for testing earthquake likelihood models. A companion paper (Schorlemmer and Gerstenberger 2007, this issue) discusses the actual implementation of these tests in the framework of the RELM initiative. Statistical testing of hypotheses is a common task and a wide range of possible testing procedures exist. Jolliffe and Stephenson (2003) present different forecast verifications from atmospheric science, among them likelihood testing of probability forecasts and testing the occurrence of binary events. Testing binary events requires that for each forecasted event, the spatial, temporal and magnitude limits be given. Although major earthquakes can be considered binary events, the models within the RELM project express their forecasts on a spatial grid and in 0.1 magnitude units; thus the results are a distribution of rates over space and magnitude. These forecasts can be tested with likelihood tests. In general, likelihood tests assume a valid null hypothesis against which a given hypothesis is tested. The outcome is either a rejection of the null hypothesis in favor of the test hypothesis or a nonrejection, meaning the test hypothesis cannot outperform the null hypothesis at a given significance level. Within RELM, there is no accepted null hypothesis and thus the likelihood test needs to be expanded to allow comparable testing of equipollent hypotheses. To test models against one another, we require that forecasts are expressed in a standard format: the average rate of earthquake occurrence within pre-specified limits of hypocentral latitude, longitude, depth, magnitude, time period, and focal mechanisms. Focal mechanisms should either be described as the inclination of P-axis, declination of P-axis, and inclination of the T-axis, or as strike, dip, and rake angles.
Schorlemmer and Gerstenberger (2007, this issue) designed classes of these parameters such that similar models will be tested against each other. These classes make the forecasts comparable between models. Additionally, we are limited to testing only what is precisely defined and consistently reported in earthquake catalogs. Therefore it is currently not possible to test such information as fault rupture length or area, asperity location, etc. Also, to account for data quality issues, we allow for location and magnitude uncertainties as well as the probability that an event is dependent on another event. As we mentioned above, only models with comparable forecasts can be tested against each other. Our current tests are designed to examine grid-based models. This requires that any fault-based model be adapted to a grid before testing is possible. While this is a limitation of the testing, it is an inherent difficulty in any such comparative testing. Please refer to appendix B for a statistical evaluation of the application of the Poisson hypothesis to fault-based models. The testing suite we present consists of three different tests: L-Test, N-Test, and R-Test. These tests are defined similarly to Kagan and Jackson (1995). The first two tests examine the consistency of the hypotheses with the observations while the last test compares the spatial performances of the models.
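For illustration, a minimal sketch of the simulation step behind such a consistency (L-) test follows; it assumes strictly positive, independent Poisson rates per bin and reports the quantile of the observed catalog's joint log-likelihood among catalogs simulated from the forecast. The names are ours, and the sketch omits the location and magnitude uncertainties and focal-mechanism classes discussed above.

```python
import numpy as np
from scipy.special import gammaln

def l_test(cell_rates, observed_counts, n_sim=10000, seed=0):
    """Consistency (L-) test sketch: compare the observed joint Poisson
    log-likelihood with its distribution under catalogs simulated from the
    forecast itself (assumes strictly positive rates)."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(cell_rates, dtype=float)

    def loglik(counts):
        return np.sum(-rates + counts * np.log(rates) - gammaln(counts + 1.0))

    ll_obs = loglik(np.asarray(observed_counts, dtype=float))
    ll_sim = np.array([loglik(rng.poisson(rates).astype(float)) for _ in range(n_sim)])
    gamma = np.mean(ll_sim <= ll_obs)   # quantile score; very small gamma flags inconsistency
    return ll_obs, gamma
```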
Hogan, Simon P.; Seidu, Luqman; Blanchard, Carine; Groschwitz, Katherine; Mishra, Anil; Karow, Margaret L.; Ahrens, Richard; Artis, David; Murphy, Andrew J.; Valenzuela, David M.; Yancopoulos, George D.; Rothenberg, Marc E.
2007-01-01
Background: Resistin-like molecule (RELM) β is a cysteine-rich cytokine expressed in the gastrointestinal tract and implicated in insulin resistance and gastrointestinal nematode immunity; however, its function primarily remains an enigma. Objective: We sought to elucidate the function of RELM-β in the gastrointestinal tract. Methods: We generated RELM-β gene-targeted mice and examined colonic epithelial barrier function, gene expression profiles, and susceptibility to acute colonic inflammation. Results: We show that RELM-β is constitutively expressed in the colon by goblet cells and enterocytes and has a role in homeostasis, as assessed by alterations in colon mRNA transcripts and epithelial barrier function in the absence of RELM-β. Using acute colonic inflammatory models, we demonstrate that RELM-β has a central role in the regulation of susceptibility to colonic inflammation. Mechanistic studies identify that RELM-β regulates expression of type III regenerating gene (REG) (REG3β and γ), molecules known to influence nuclear factor κB signaling. Conclusions: These data define a critical role for RELM-β in the maintenance of colonic barrier function and gastrointestinal innate immunity. Clinical implications: These findings identify RELM-β as an important molecule in homeostatic gastrointestinal function and colonic inflammation, and as such, these results have implications for a variety of human inflammatory gastrointestinal conditions, including allergic gastroenteropathies. PMID:16815164
Pine, Gabrielle M; Batugedara, Hashini M; Nair, Meera G
2018-06-01
The Resistin-Like Molecules (RELM) α, β, and γ and their namesake, resistin, share structural and sequence homology but exhibit significant diversity in expression and function within their mammalian host. RELM proteins are expressed in a wide range of diseases, such as microbial infections (e.g., bacterial and helminth), inflammatory diseases (e.g., asthma, fibrosis), and metabolic disorders (e.g., diabetes). While the expression pattern and molecular regulation of RELM proteins are well characterized, much controversy remains over their proposed functions, with evidence of host-protective and pathogenic roles. Moreover, the receptors for RELM proteins are unclear, although three receptors for resistin (decorin, adenylyl cyclase-associated protein 1 (CAP1), and Toll-like Receptor 4 (TLR4)) have recently been proposed. In this review, we will first summarize the molecular regulation of the RELM gene family, including transcription regulation and tissue expression in humans and mouse disease models. Second, we will outline the function and receptor-mediated signaling associated with RELM proteins. Finally, we will discuss recent studies suggesting that, despite early misconceptions that these proteins are pathogenic, RELM proteins have a more nuanced and potentially beneficial role for the host in certain disease settings. Copyright © 2018 Elsevier Ltd. All rights reserved.
Regional Earthquake Likelihood Models: A realm on shaky grounds?
NASA Astrophysics Data System (ADS)
Kossobokov, V.
2005-12-01
Seismology is juvenile and its appropriate statistical tools to date may have a "medieval flavor" for those who rush to apply the fuzzy language of a highly developed probability theory. To become "quantitatively probabilistic," earthquake forecasts/predictions must be defined with scientific accuracy. Following the most popular objectivists' viewpoint on probability, we cannot claim "probabilities" to be adequate without a long series of "yes/no" forecast/prediction outcomes. Without the "antiquated binary language" of "yes/no" certainty we cannot judge an outcome ("success/failure"), and, therefore, cannot quantify objectively a forecast/prediction method's performance. Likelihood scoring is one of the delicate tools of statistics, which could be worthless or even misleading when inappropriate probability models are used. This is a basic loophole for misuse of likelihood, as well as other statistical methods, in practice. The flaw could be avoided by an accurate verification of generic probability models against the empirical data. This is not an easy task within the framework of the Regional Earthquake Likelihood Models (RELM) methodology, which neither defines the forecast precision nor allows a means to judge the ultimate success or failure in specific cases. Hopefully, the RELM group realizes the problem and its members do their best to close the hole with an adequate, data-supported choice. Regretfully, this is not the case with the erroneous choice of Gerstenberger et al., who started the public web site with forecasts of expected ground shaking for "tomorrow" (Nature 435, 19 May 2005). Gerstenberger et al. have inverted the critical evidence of their study, i.e., the 15 years of recent seismic record accumulated in just one figure, which suggests rejecting with confidence above 97% "the generic California clustering model" used in the automatic calculations. As a result, since the date of publication in Nature, the United States Geological Survey website has delivered to the public, emergency planners, and the media a forecast product that is based on wrong assumptions that violate the best-documented earthquake statistics in California, whose accuracy was not investigated, and whose forecasts were not tested in a rigorous way.
RELM-β promotes human pulmonary artery smooth muscle cell proliferation via FAK-stimulated survivin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Chunlong, E-mail: lclmd@sina.com; Li, Xiaohui; Luo, Qiong
Resistin-like molecule-β (RELM-β), focal adhesion kinase (FAK), and survivin may be involved in the proliferation of cultured human pulmonary artery smooth muscle cells (HPASMCs), which is involved in pulmonary hypertension. HPASMCs were treated with human recombinant RELM-β (rhRELM-β). siRNAs against FAK and survivin were transfected into cultured HPASMCs. Expression of FAK and survivin was examined by RT-PCR and western blot. Immunofluorescence was used to localize FAK. Flow cytometry was used to examine cell cycle distribution and cell death. Compared to the control group, all rhRELM-β-treated groups demonstrated significant increases in the expression of FAK and survivin (P<0.05). rhRELM-β significantly increased the proportion of HPASMCs in the S phase and decreased the proportion in G0/G1. FAK siRNA down-regulated survivin expression while survivin siRNA did not affect FAK expression. FAK siRNA effectively inhibited FAK and survivin expression in RELM-β-treated HPASMCs and partially suppressed cell proliferation. RELM-β promoted HPASMC proliferation and upregulated FAK and survivin expression. In conclusion, results suggested that FAK is upstream of survivin in the signaling pathway mediating cell proliferation. FAK seems to be important in RELM-β-induced HPASMC proliferation, partially by upregulating survivin expression. - Highlights: • rhRELM-β increased the expression of FAK and survivin. • rhRELM-β increased the proportion of HPASMCs in the S phase. • FAK is upstream of survivin in the signaling pathway mediating cell proliferation. • FAK is important in RELM-β-induced HPASMC proliferation, partly via survivin.
Prospective Tests of Southern California Earthquake Forecasts
NASA Astrophysics Data System (ADS)
Jackson, D. D.; Schorlemmer, D.; Gerstenberger, M.; Kagan, Y. Y.; Helmstetter, A.; Wiemer, S.; Field, N.
2004-12-01
We are testing earthquake forecast models prospectively using likelihood ratios. Several investigators have developed such models as part of the Southern California Earthquake Center's project called Regional Earthquake Likelihood Models (RELM). Various models are based on fault geometry and slip rates, seismicity, geodetic strain, and stress interactions. Here we describe the testing procedure and present preliminary results. Forecasts are expressed as the yearly rate of earthquakes within pre-specified bins of longitude, latitude, magnitude, and focal mechanism parameters. We test models against each other in pairs, which requires that both forecasts in a pair be defined over the same set of bins. For this reason we specify a standard "menu" of bins and ground rules to guide forecasters in using common descriptions. One menu category includes five-year forecasts of magnitude 5.0 and larger. Contributors will be requested to submit forecasts in the form of a vector of yearly earthquake rates on a 0.1 degree grid at the beginning of the test. Focal mechanism forecasts, when available, are also archived and used in the tests. Interim progress will be evaluated yearly, but final conclusions would be made on the basis of cumulative five-year performance. The second category includes forecasts of earthquakes above magnitude 4.0 on a 0.1 degree grid, evaluated and renewed daily. Final evaluation would be based on cumulative performance over five years. Other types of forecasts with different magnitude, space, and time sampling are welcome and will be tested against other models with shared characteristics. Tests are based on the log likelihood scores derived from the probability that future earthquakes would occur where they do if a given forecast were true [Kagan and Jackson, J. Geophys. Res., 100, 3943-3959, 1995]. For each pair of forecasts, we compute alpha, the probability that the first would be wrongly rejected in favor of the second, and beta, the probability that the second would be wrongly rejected in favor of the first. Computing alpha and beta requires knowing the theoretical distribution of likelihood scores under each hypothesis, which we estimate by simulations. In this scheme, each forecast is given equal status; there is no "null hypothesis" which would be accepted by default. Forecasts and test results will be archived and posted on the RELM web site. Major problems under discussion include how to treat aftershocks, which clearly violate the variable-rate Poissonian hypotheses that we employ, and how to deal with the temporal variations in catalog completeness that follow large earthquakes.
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms in bioinformatics applications is to achieve better classification accuracy. However, these algorithms cannot meet the requirement of minimising the average misclassification cost. In this paper, a new cost-sensitive regularised extreme learning machine (CS-RELM) algorithm was proposed, using probability estimation and misclassification costs to reconstruct the classification results. By improving the classification accuracy on small-sample classes that carry higher misclassification costs, the new CS-RELM can minimise the overall classification cost. A 'rejection cost' was integrated into the CS-RELM algorithm to further reduce the average misclassification cost. Using the Colon Tumour dataset and the SRBCT (Small Round Blue Cell Tumour) dataset, CS-RELM was compared with other algorithms such as the extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, and cost-sensitive support vector machine (SVM). The experimental results show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and make more credible classification decisions than the others.
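A minimal sketch of a cost-weighted, regularised ELM in the spirit of CS-RELM is shown below; the per-sample cost weighting and all parameter names are illustrative assumptions, not the authors' exact formulation (the probability-estimation step and the rejection cost are omitted).

```python
import numpy as np

def train_cs_relm(X, T, sample_costs, n_hidden=100, C=1.0, seed=0):
    """Regularised ELM with per-sample misclassification costs.
    X: (n, d) inputs, T: (n, k) one-hot targets,
    sample_costs: (n,) larger values make costly samples harder to misfit."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))      # random input weights (never trained)
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid hidden-layer outputs
    D = np.diag(np.asarray(sample_costs, float))     # cost-weighting matrix
    # weighted ridge solution: beta = (H' D H + I/C)^-1 H' D T
    beta = np.linalg.solve(H.T @ D @ H + np.eye(n_hidden) / C, H.T @ D @ T)
    return W, b, beta

def predict_cs_relm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)
```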
Chronic features of allergic asthma are enhanced in the absence of resistin-like molecule-beta.
LeMessurier, Kim S; Palipane, Maneesha; Tiwary, Meenakshi; Gavin, Brian; Samarasinghe, Amali E
2018-05-04
Asthma is characterized by inflammation and architectural changes in the lungs. A number of immune cells and mediators are recognized as initiators of asthma, although therapeutics based on these are not always effective. The multifaceted nature of this syndrome necessitates continued exploration of immunomodulators that may play a role in pathogenesis. We investigated the role of resistin-like molecule-beta (RELM-β), a gut antibacterial, in the development and pathogenesis of Aspergillus-induced allergic airways disease. Age- and gender-matched C57BL/6J and Retnlb -/- mice rendered allergic to Aspergillus fumigatus were used to measure canonical markers of allergic asthma at early and late time points. Inflammatory cells in airways were similar, although Retnlb -/- mice had reduced tissue inflammation. The absence of RELM-β elevated serum IgA and pro-inflammatory cytokines in the lungs at homeostasis. Markers of chronic disease including goblet cell numbers, Muc genes, airway wall remodelling, and hyperresponsiveness were greater in the absence of RELM-β. Specific inflammatory mediators important in antimicrobial defence in allergic asthma were also increased in the absence of RELM-β. These data suggest that while characteristics of allergic asthma develop in the absence of RELM-β, RELM-β may reduce the development of chronic markers of allergic airways disease.
Next-Day Earthquake Forecasts for California
NASA Astrophysics Data System (ADS)
Werner, M. J.; Jackson, D. D.; Kagan, Y. Y.
2008-12-01
We implemented a daily forecast of m > 4 earthquakes for California in the format suitable for testing in community-based earthquake predictability experiments: Regional Earthquake Likelihood Models (RELM) and the Collaboratory for the Study of Earthquake Predictability (CSEP). The forecast is based on near-real time earthquake reports from the ANSS catalog above magnitude 2 and will be available online. The model used to generate the forecasts is based on the Epidemic-Type Earthquake Sequence (ETES) model, a stochastic model of clustered and triggered seismicity. Our particular implementation is based on the earlier work of Helmstetter et al. (2006, 2007), but we extended the forecast to all of California, use more data to calibrate the model and its parameters, and made some modifications. Our forecasts will compete against the Short-Term Earthquake Probabilities (STEP) forecasts of Gerstenberger et al. (2005) and other models in the next-day testing class of the CSEP experiment in California. We illustrate our forecasts with examples and discuss preliminary results.
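The clustering model family referred to above (ETES/ETAS-type) can be summarised by a conditional intensity of roughly the form sketched below: a background rate plus Omori-law contributions from past events, each scaled by an exponential magnitude productivity term. The parameter values are placeholders, not the calibration of Helmstetter et al.

```python
import numpy as np

def etas_rate(t, event_times, event_mags, mu=0.2, K=0.02, alpha=1.0, c=0.01, p=1.1, m0=2.0):
    """Conditional intensity of an ETAS/ETES-type clustering model at time t
    (events per day), given past event times (days) and magnitudes."""
    event_times = np.asarray(event_times, float)
    event_mags = np.asarray(event_mags, float)
    past = event_times < t
    triggered = K * np.exp(alpha * (event_mags[past] - m0)) \
                  * (t - event_times[past] + c) ** (-p)
    return mu + np.sum(triggered)

# next-day forecast sketch: expected number of events between day 3 and day 4,
# approximated by a Riemann sum of the rate on an hourly grid
times, mags = [0.0, 1.5, 2.2], [4.1, 5.3, 3.2]
grid = np.arange(3.0, 4.0, 1.0 / 24.0)
print(sum(etas_rate(t, times, mags) for t in grid) * (1.0 / 24.0))
```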
Heddam, Salim; Kisi, Ozgur
2017-07-01
In this paper, several extreme learning machine (ELM) models, including standard extreme learning machine with sigmoid activation function (S-ELM), extreme learning machine with radial basis activation function (R-ELM), online sequential extreme learning machine (OS-ELM), and optimally pruned extreme learning machine (OP-ELM), are newly applied for predicting dissolved oxygen concentration with and without water quality variables as predictors. Firstly, using data from eight United States Geological Survey (USGS) stations located in different river basins, USA, the S-ELM, R-ELM, OS-ELM, and OP-ELM were compared against the measured dissolved oxygen (DO) using four water quality variables (water temperature, specific conductance, turbidity, and pH) as predictors. For each station, we used data measured at an hourly time step for a period of 4 years. The dataset was divided into a training set (70%) and a validation set (30%). We selected several combinations of the water quality variables as inputs for each ELM model and six different scenarios were compared. Secondly, an attempt was made to predict DO concentration without water quality variables. To achieve this goal, we used the year numbers, 2008, 2009, etc., month numbers from (1) to (12), day numbers from (1) to (31) and hour numbers from (00:00) to (24:00) as predictors. Thirdly, the best ELM models were trained using the validation dataset and tested with the training dataset. The performances of the four ELM models were evaluated using four statistical indices: the coefficient of correlation (R), the Nash-Sutcliffe efficiency (NSE), the root mean squared error (RMSE), and the mean absolute error (MAE). Results obtained from the eight stations indicated that: (i) the best results were obtained by the S-ELM, R-ELM, OS-ELM, and OP-ELM models having four water quality variables as predictors; (ii) out of eight stations, the OP-ELM performed better than the other three ELM models at seven stations while the R-ELM performed the best at one station. The OS-ELM models performed the worst and provided the lowest accuracy; (iii) for predicting DO without water quality variables, the R-ELM performed the best at seven stations followed by the S-ELM in the second place, and the OP-ELM performed the worst with low accuracy; (iv) for the final application, in which the ELM models were trained with the validation dataset and tested with the training dataset, the OP-ELM provided the best accuracy using water quality variables and the R-ELM performed the best at all eight stations without water quality variables. Fourthly, and finally, we compared the results obtained from different ELM models with those obtained using multiple linear regression (MLR) and multilayer perceptron neural network (MLPNN). Results obtained using the MLPNN and MLR models reveal that: (i) using water quality variables as predictors, the MLR performed the worst and provided the lowest accuracy at all stations; (ii) MLPNN was ranked in the second place at two stations, in the third place at four stations, and finally, in the fourth place at two stations; (iii) for predicting DO without water quality variables, MLPNN was ranked in the second place at five stations, and ranked in the third, fourth, and fifth places in the remaining three stations, while MLR was ranked in the last place with very low accuracy at all stations. Overall, the results suggest that the ELM is more effective than the MLPNN and MLR for modelling DO concentration in river ecosystems.
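The four evaluation indices used above can be computed as in the brief sketch below, applied to an observed versus predicted DO series; the formulas are the standard definitions and the function name is ours.

```python
import numpy as np

def evaluate(obs, pred):
    """Correlation (R), Nash-Sutcliffe efficiency (NSE), RMSE, and MAE for an
    observed vs. predicted dissolved-oxygen series."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    r = np.corrcoef(obs, pred)[0, 1]
    nse = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    mae = np.mean(np.abs(obs - pred))
    return {"R": r, "NSE": nse, "RMSE": rmse, "MAE": mae}

print(evaluate([8.1, 7.9, 8.4, 7.2], [8.0, 8.1, 8.3, 7.5]))
```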
Testing hypotheses of earthquake occurrence
NASA Astrophysics Data System (ADS)
Kagan, Y. Y.; Jackson, D. D.; Schorlemmer, D.; Gerstenberger, M.
2003-12-01
We present a relatively straightforward likelihood method for testing those earthquake hypotheses that can be stated as vectors of earthquake rate density in defined bins of area, magnitude, and time. We illustrate the method as it will be applied to the Regional Earthquake Likelihood Models (RELM) project of the Southern California Earthquake Center (SCEC). Several earthquake forecast models are being developed as part of this project, and additional contributed forecasts are welcome. Various models are based on fault geometry and slip rates, seismicity, geodetic strain, and stress interactions. We would test models in pairs, requiring that both forecasts in a pair be defined over the same set of bins. Thus we offer a standard "menu" of bins and ground rules to encourage standardization. One menu category includes five-year forecasts of magnitude 5.0 and larger. Forecasts would be in the form of a vector of yearly earthquake rates on a 0.05 degree grid at the beginning of the test. Focal mechanism forecasts, when available, would also be archived and used in the tests. The five-year forecast category may be appropriate for testing hypotheses of stress shadows from large earthquakes. Interim progress will be evaluated yearly, but final conclusions would be made on the basis of cumulative five-year performance. The second category includes forecasts of earthquakes above magnitude 4.0 on a 0.05 degree grid, evaluated and renewed daily. Final evaluation would be based on cumulative performance over five years. Other types of forecasts with different magnitude, space, and time sampling are welcome and will be tested against other models with shared characteristics. All earthquakes would be counted, and no attempt made to separate foreshocks, main shocks, and aftershocks. Earthquakes would be considered as point sources located at the hypocenter. For each pair of forecasts, we plan to compute alpha, the probability that the first would be wrongly rejected in favor of the second, and beta, the probability that the second would be wrongly rejected in favor of the first. Computing alpha and beta requires knowing the theoretical distribution of likelihood scores under each hypothesis, which we will estimate by simulations. Each forecast is given equal status; there is no "null hypothesis" which would be accepted by default. Forecasts and test results would be archived and posted on the RELM web site. The same methods can be applied to any region with adequate monitoring and sufficient earthquakes. If fewer than ten events are forecasted, the likelihood tests may not give definitive results. The tests do force certain requirements on the forecast models. Because the tests are based on absolute rates, stress models must be explicit about how stress increments affect past seismicity rates. Aftershocks of triggered events must be accounted for. Furthermore, the tests are sensitive to magnitude, so forecast models must specify the magnitude distribution of triggered events. Models should account for probable errors in magnitude and location by appropriate smoothing of the probabilities, as the tests will be "cold-hearted": near misses won't count.
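The pairwise comparison described above can be sketched as follows, assuming strictly positive, independent Poisson bins: the observed log-likelihood ratio between two forecasts is placed within its simulated distribution under the first forecast, giving alpha (swapping the arguments gives beta). This is an illustration of the simulation idea only, not the RELM reference implementation.

```python
import numpy as np

def r_test(rates_a, rates_b, observed_counts, n_sim=5000, seed=0):
    """Pairwise (R-test-style) sketch: alpha is the fraction of catalogs simulated
    under forecast A whose log-likelihood ratio (A vs. B) falls at or below the
    observed ratio; a small alpha argues for rejecting A in favour of B."""
    rng = np.random.default_rng(seed)
    la, lb = np.asarray(rates_a, float), np.asarray(rates_b, float)

    def llr(counts):                      # log L(A) - log L(B) for Poisson bins
        return np.sum(-(la - lb) + counts * (np.log(la) - np.log(lb)))

    llr_obs = llr(np.asarray(observed_counts, float))
    llr_sim = np.array([llr(rng.poisson(la)) for _ in range(n_sim)])
    alpha = np.mean(llr_sim <= llr_obs)
    return llr_obs, alpha
```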
A family of tissue-specific resistin-like molecules
Steppan, Claire M.; Brown, Elizabeth J.; Wright, Christopher M.; Bhat, Savitha; Banerjee, Ronadip R.; Dai, Charlotte Y.; Enders, Gregory H.; Silberg, Debra G.; Wen, Xiaoming; Wu, Gary D.; Lazar, Mitchell A.
2001-01-01
We have identified a family of resistin-like molecules (RELMs) in rodents and humans. Resistin is a hormone produced by fat cells. RELMα is a secreted protein that has a restricted tissue distribution with highest levels in adipose tissue. Another family member, RELMβ, is a secreted protein expressed only in the gastrointestinal tract, particularly the colon, in both mouse and human. RELMβ gene expression is highest in proliferative epithelial cells and is markedly increased in tumors, suggesting a role in intestinal proliferation. Resistin and the RELMs share a cysteine composition and other signature features. Thus, the RELMs together with resistin comprise a class of tissue-specific signaling molecules. PMID:11209052
A family of tissue-specific resistin-like molecules.
Steppan, C M; Brown, E J; Wright, C M; Bhat, S; Banerjee, R R; Dai, C Y; Enders, G H; Silberg, D G; Wen, X; Wu, G D; Lazar, M A
2001-01-16
We have identified a family of resistin-like molecules (RELMs) in rodents and humans. Resistin is a hormone produced by fat cells. RELMalpha is a secreted protein that has a restricted tissue distribution with highest levels in adipose tissue. Another family member, RELMbeta, is a secreted protein expressed only in the gastrointestinal tract, particularly the colon, in both mouse and human. RELMbeta gene expression is highest in proliferative epithelial cells and is markedly increased in tumors, suggesting a role in intestinal proliferation. Resistin and the RELMs share a cysteine composition and other signature features. Thus, the RELMs together with resistin comprise a class of tissue-specific signaling molecules.
Ma, Wan-li; Cai, Peng-cheng; Xiong, Xian-zhi; Ye, Hong
2013-02-01
FIZZ/RELM is a new gene family named "found in inflammatory zone" (FIZZ) or "resistin-like molecule" (RELM). FIZZ1/RELMα is specifically expressed in lung tissue and associated with pulmonary inflammation. Chronic cigarette smoking up-regulates FIZZ1/RELMα expression in rat lung tissues, the mechanism of which is related to cigarette smoking-induced airway hyperresponsiveness. To investigate the effect of exercise training on chronic cigarette smoking-induced airway hyperresponsiveness and up-regulation of FIZZ1/RELMα, a rat chronic cigarette smoking model was established. The rats were treated with regular exercise training and their airway responsiveness was measured. Hematoxylin and eosin (HE) staining, immunohistochemistry, and in situ hybridization of lung tissues were performed to detect the expression of FIZZ1/RELMα. Results revealed that proper exercise training decreased airway hyperresponsiveness and pulmonary inflammation in the rat chronic cigarette smoking model. Cigarette smoking increased the mRNA and protein levels of FIZZ1/RELMα, which were reversed by proper exercise. It is concluded that proper exercise training prevents up-regulation of FIZZ1/RELMα induced by cigarette smoking, which may be involved in the mechanism by which proper exercise training modulates airway hyperresponsiveness.
Antitumor and Wound Healing Properties of Rubus ellipticus Smith.
George, Blassan Plackal; Parimelazhagan, Thangaraj; Kumar, Yamini T; Sajeesh, Thankarajan
2015-06-01
The present investigation has been undertaken to study the antioxidant, antitumor, and wound healing properties of Rubus ellipticus. The R. ellipticus leaves were extracted using organic solvents in a Soxhlet apparatus and were subjected to in vitro antioxidant assays. R. ellipticus leaf methanol (RELM) extract, which showed higher in vitro antioxidant activity, was taken for the evaluation of in vivo antioxidant, antitumor, and wound healing properties. Acute oral and dermal toxicity studies showed the safety of RELM up to a dose of 2 g/kg. A significant wound healing property was observed in incision, excision, and Staphylococcus aureus-induced infected wound models in the treatment groups compared to the control group. A complete epithelialization period was noticed during the 13th day and the 19th day. A 250-mg/kg treatment was found to prolong the life span of mice with Ehrlich ascites carcinoma (EAC; 46.76%) and to reduce the volume of Dalton's lymphoma ascites (DLA) solid tumors (2.56 cm3). The present study suggests that R. ellipticus is a valuable natural antioxidant and that it is immensely effective for treating skin diseases, wounds, and tumors. Copyright © 2015. Published by Elsevier B.V.
Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques.
Wang, Guanjin; Lam, Kin-Man; Deng, Zhaohong; Choi, Kup-Sze
2015-08-01
Bladder cancer is a common cancer in genitourinary malignancy. For muscle invasive bladder cancer, surgical removal of the bladder, i.e. radical cystectomy, is in general the definitive treatment which, unfortunately, carries significant morbidities and mortalities. Accurate prediction of the mortality of radical cystectomy is therefore needed. Statistical methods have conventionally been used for this purpose, despite the complex interactions of high-dimensional medical data. Machine learning has emerged as a promising technique for handling high-dimensional data, with increasing application in clinical decision support, e.g. cancer prediction and prognosis. Its ability to reveal the hidden nonlinear interactions and interpretable rules between dependent and independent variables is favorable for constructing models of effective generalization performance. In this paper, seven machine learning methods are utilized to predict the 5-year mortality of radical cystectomy, including back-propagation neural network (BPN), radial basis function (RBFN), extreme learning machine (ELM), regularized ELM (RELM), support vector machine (SVM), naive Bayes (NB) classifier and k-nearest neighbour (KNN), on a clinicopathological dataset of 117 patients of the urology unit of a hospital in Hong Kong. The experimental results indicate that RELM achieved the highest average prediction accuracy of 0.8 at a fast learning speed. The research findings demonstrate the potential of applying machine learning techniques to support clinical decision making. Copyright © 2015 Elsevier Ltd. All rights reserved.
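A sketch of how such a comparison can be set up with off-the-shelf learners is given below, using scikit-learn as an assumed stand-in for several of the classifier families named in the abstract; the study's actual implementations, hyperparameters, and the ELM/RELM/RBFN variants are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

def compare_classifiers(X, y, cv=5):
    """Cross-validated accuracy for several classifier families
    (BPN approximated by MLPClassifier, plus SVM, NB, and KNN)."""
    models = {
        "BPN": MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
        "SVM": SVC(kernel="rbf", C=1.0),
        "NB": GaussianNB(),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    return {name: cross_val_score(m, X, y, cv=cv, scoring="accuracy").mean()
            for name, m in models.items()}

# toy usage with random data shaped like the 117-patient dataset
rng = np.random.default_rng(0)
X, y = rng.normal(size=(117, 10)), rng.integers(0, 2, size=117)
print(compare_classifiers(X, y))
```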
Intestinal epithelial cell secretion of RELM-beta protects against gastrointestinal worm infection
USDA-ARS?s Scientific Manuscript database
IL-4 and IL-13 protect against parasitic helminths, but little is known about the mechanism of host protection. We show that IL-4/IL-13 confer immunity against worms by inducing intestinal epithelial cells (IEC) to differentiate into goblet cells that secrete resistin-like molecule beta (RELM-beta). R...
Prediction model of dissolved oxygen in ponds based on ELM neural network
NASA Astrophysics Data System (ADS)
Li, Xinfei; Ai, Jiaoyan; Lin, Chunhuan; Guan, Haibin
2018-02-01
Dissolved oxygen in ponds is affected by many factors, and its distribution is unbalanced. In this paper, in order to address this imbalance more effectively, a dissolved oxygen prediction model based on the Extreme Learning Machine (ELM) algorithm is established, building on the method of improving the dissolved oxygen distribution by artificial push flow. Lake Jing at Guangxi University was selected as the experimental area. Using the model to predict the dissolved oxygen concentration under different pump voltages, the results show that the ELM prediction accuracy is higher than that of the BP algorithm, with a mean square error MSE_ELM = 0.0394 and a correlation coefficient R_ELM = 0.9823. The prediction results for the 24 V pump push flow show that the discrete prediction curve approximates the measured values well. The model can provide a basis for decisions on artificially improving the dissolved oxygen distribution.
Angelini, Daniel J; Su, Qingning; Kolosova, Irina A; Fan, Chunling; Skinner, John T; Yamaji-Kegan, Kazuyo; Collector, Michael; Sharkis, Saul J; Johns, Roger A
2010-06-22
Pulmonary hypertension (PH) is a disease of multiple etiologies with several common pathological features, including inflammation and pulmonary vascular remodeling. Recent evidence has suggested a potential role for the recruitment of bone marrow-derived (BMD) progenitor cells to this remodeling process. We recently demonstrated that hypoxia-induced mitogenic factor (HIMF/FIZZ1/RELM alpha) is chemotactic to murine bone marrow cells in vitro and involved in pulmonary vascular remodeling in vivo. We used a mouse bone marrow transplant model in which lethally irradiated mice were rescued with bone marrow transplanted from green fluorescent protein (GFP)(+) transgenic mice to determine the role of HIMF in recruiting BMD cells to the lung vasculature during PH development. Exposure to chronic hypoxia and pulmonary gene transfer of HIMF were used to induce PH. Both models resulted in markedly increased numbers of BMD cells in and around the pulmonary vasculature; in several neomuscularized small (approximately 20 microm) capillary-like vessels, an entirely new medial wall was made up of these cells. We found these GFP(+) BMD cells to be positive for stem cell antigen-1 and c-kit, but negative for CD31 and CD34. Several of the GFP(+) cells that localized to the pulmonary vasculature were alpha-smooth muscle actin(+) and localized to the media layer of the vessels. This finding suggests that these cells are of mesenchymal origin and differentiate toward myofibroblast and vascular smooth muscle. Structural location in the media of small vessels suggests a functional role in the lung vasculature. To examine a potential mechanism for HIMF-dependent recruitment of mesenchymal stem cells to the pulmonary vasculature, we performed a cell migration assay using cultured human mesenchymal stem cells (HMSCs). The addition of recombinant HIMF induced migration of HMSCs in a phosphoinositide-3-kinase-dependent manner. These results demonstrate HIMF-dependent recruitment of BMD mesenchymal-like cells to the remodeling pulmonary vasculature.
Analysis of the Perceived Adequacy of Air Force Civil Engineering Prime Beef Training
1985-09-01
ANALYSIS OF THE PERCEIVED ADEQUACY OF AIR FORCE CIVIL ENGINEERING PRIME BEEF TRAINING. THESIS. William C. Morris, Captain, USAF. AFIT, Ohio. Approved for public release; distribution unlimited.
Monk, Jennifer M; Lepp, Dion; Zhang, Claire P; Wu, Wenqing; Zarepoor, Leila; Lu, Jenifer T; Pauls, K Peter; Tsao, Rong; Wood, Geoffrey A; Robinson, Lindsay E; Power, Krista A
2016-02-01
Common beans are rich in phenolic compounds and nondigestible fermentable components, which may help alleviate intestinal diseases. We assessed the gut health priming effect of a 20% cranberry bean flour diet from two bean varieties with differing profiles of phenolic compounds [darkening (DC) and nondarkening (NDC) cranberry beans vs. basal diet control (BD)] on critical aspects of gut health in unchallenged mice, and during dextran sodium sulfate (DSS)-induced colitis (2% DSS wt/vol, 7 days). In unchallenged mice, NDC and DC increased (i) cecal short-chain fatty acids, (ii) colon crypt height, (iii) crypt goblet cell number and mucus content and (iv) Muc1, Klf4, Relmβ and Reg3γ gene expression vs. BD, indicative of enhanced microbial activity and gut barrier function. Fecal 16S rRNA sequencing determined that beans reduced abundance of the Lactobacillaceae (Ruminococcus gnavus), Clostridiaceae (Clostridium perfringens), Peptococcaceae, Peptostreptococcaceae, Rikenellaceae and Porphyromonadaceae families, and increased abundance of S24-7 and Prevotellaceae. During colitis, beans reduced (i) disease severity and colonic histological damage, (ii) increased gene expression of barrier function promoting genes (Muc1-3, Relmβ, and Reg3γ) and (iii) reduced colonic and circulating inflammatory cytokines (IL-1β, IL-6, IFNγ and TNFα). Therefore, prior to disease induction, bean supplementation enhanced multiple concurrent gut health promoting parameters that translated into reduced colitis severity. Moreover, both bean diets exerted similar effects, indicating that differing phenolic content did not influence the endpoints assessed. These data demonstrate a proof-of-concept regarding the gut-priming potential of beans in colitis, which could be extended to mitigate the severity of other gut barrier-associated pathologies. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Gleason, Ann Whitney
2015-01-01
Gaming as a means of delivering online education continues to gain in popularity. Online games provide an engaging and enjoyable way of learning. Gaming is especially appropriate for case-based teaching, and provides a conducive environment for adult independent learning. With funding from the National Network of Libraries of Medicine, Pacific Northwest Region (NN/LM PNR), the University of Washington (UW) Health Sciences Library, and the UW School of Medicine are collaborating to create an interactive, self-paced online game that teaches players to employ the steps in practicing evidence-based medicine. The game encourages life-long learning and literacy skills and could be used for providing continuing medical education.
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
General Aviation Activity and Avionics Survey 1985
1987-03-01
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions is required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated from each model's marginal likelihood and prior probability. The heavy computation burden hinders the implementation of BMA prediction, especially for elaborate marginal likelihood estimators. To overcome the computational burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models in a numerical experiment with a synthetic groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including the arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods. BMA-TIE has better predictive performance than the other BMA predictions. TIE is highly stable for estimating a conceptual model's marginal likelihood: the marginal likelihoods repeatedly estimated by TIE show significantly less variability than those estimated by the other estimators. In addition, the SG surrogates efficiently facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the model executions required by BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
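As a compact illustration of how posterior model weights follow from estimated marginal likelihoods, the sketch below uses the harmonic mean estimator (HME), the simplest of the four estimators named above; the function names are ours and the sketch is not the study's implementation.

```python
import numpy as np
from scipy.special import logsumexp

def log_marginal_likelihood_hme(log_liks_posterior):
    """Harmonic mean estimator of log p(D | M): 1/p(D) is approximated by the
    average of 1/p(D | theta_i) over posterior samples theta_i."""
    ll = np.asarray(log_liks_posterior, float)
    return -(logsumexp(-ll) - np.log(ll.size))

def bma_weights(log_marginals, log_priors=None):
    """Posterior model probabilities p(M_k | D) from log marginal likelihoods
    and optional log prior model probabilities."""
    lm = np.asarray(log_marginals, float)
    if log_priors is not None:
        lm = lm + np.asarray(log_priors, float)
    return np.exp(lm - logsumexp(lm))

# toy usage: three conceptual models with equal priors
print(bma_weights([-120.4, -118.9, -125.0]))
```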
Assessment of parametric uncertainty for groundwater reactive transport modeling
Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun
2014-01-01
The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(ZS)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.
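A simplified stand-in for a non-Gaussian residual model is sketched below: AR(1)-correlated errors with a standard deviation that grows with the simulated value. It only illustrates the idea of relaxing the iid-Gaussian residual assumption and does not reproduce the formal generalized likelihood of Schoups and Vrugt (2010); all parameter names are ours, and the first residual's contribution is omitted for brevity.

```python
import numpy as np

def log_likelihood_ar1_hetero(obs, sim, phi=0.5, sigma0=0.1, sigma1=0.05):
    """Gaussian AR(1) log-likelihood with heteroscedastic scale
    sigma_t = sigma0 + sigma1 * |sim_t| (conditional on the first residual)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    e = obs - sim                               # raw residuals
    eta = e[1:] - phi * e[:-1]                  # AR(1)-decorrelated innovations
    sigma = sigma0 + sigma1 * np.abs(sim[1:])   # heteroscedastic standard deviation
    return np.sum(-0.5 * np.log(2.0 * np.pi * sigma**2) - eta**2 / (2.0 * sigma**2))
```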
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243), in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates a complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram
2017-02-01
In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero counts. Some of these are "true zeros," indicating that the drug-adverse event pair cannot occur; they are distinguished from the other zero counts, which are modeled zeros and simply indicate that the drug-adverse event pair has not yet occurred or has not yet been reported. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
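A minimal sketch of the zero-inflated Poisson machinery is given below: EM estimation of the mixing proportion and Poisson rate, plus a likelihood ratio statistic against a plain Poisson fit. The signal-detection test of the paper compares relative reporting rates across drug-adverse event pairs and handles stratification, neither of which is reproduced here; the data and settings are simulated for illustration.

    import numpy as np
    from scipy import stats

    def fit_zip_em(y, n_iter=200):
        """EM for a zero-inflated Poisson: y ~ pi * delta_0 + (1 - pi) * Poisson(lam)."""
        pi, lam = 0.5, max(y.mean(), 1e-6)
        for _ in range(n_iter):
            # E-step: posterior probability that a zero is a structural ("true") zero.
            tau = np.where(y == 0, pi / (pi + (1 - pi) * np.exp(-lam)), 0.0)
            # M-step: update the mixing proportion and the Poisson rate.
            pi = tau.mean()
            lam = np.sum((1 - tau) * y) / np.sum(1 - tau)
        return pi, lam

    def zip_loglik(y, pi, lam):
        p0 = pi + (1 - pi) * np.exp(-lam)
        return np.sum(np.where(y == 0, np.log(p0),
                               np.log(1 - pi) + stats.poisson.logpmf(y, lam)))

    rng = np.random.default_rng(1)
    y = np.where(rng.random(5000) < 0.3, 0, rng.poisson(2.0, 5000))   # ~30% structural zeros

    pi_hat, lam_hat = fit_zip_em(y)
    ll_zip = zip_loglik(y, pi_hat, lam_hat)
    ll_pois = np.sum(stats.poisson.logpmf(y, y.mean()))
    print(f"pi={pi_hat:.2f} lam={lam_hat:.2f}  LRT statistic={2 * (ll_zip - ll_pois):.1f}")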
Paninski, Liam; Haith, Adrian; Szirtes, Gabor
2008-02-01
We recently introduced likelihood-based methods for fitting stochastic integrate-and-fire models to spike train data. The key component of this method involves the likelihood that the model will emit a spike at a given time t. Computing this likelihood is equivalent to computing a Markov first passage time density (the probability that the model voltage crosses threshold for the first time at time t). Here we detail an improved method for computing this likelihood, based on solving a certain integral equation. This integral equation method has several advantages over the techniques discussed in our previous work: in particular, the new method has fewer free parameters and is easily differentiable (for gradient computations). The new method is also easily adaptable for the case in which the model conductance, not just the input current, is time-varying. Finally, we describe how to incorporate large deviations approximations to very small likelihoods.
An Improved Nested Sampling Algorithm for Model Selection and Assessment
NASA Astrophysics Data System (ADS)
Zeng, X.; Ye, M.; Wu, J.; WANG, D.
2017-12-01
The multimodel strategy is a general approach to treating model structure uncertainty in recent research. The unknown groundwater system is represented by several plausible conceptual models, and each alternative conceptual model is assigned a weight that represents its plausibility. In the Bayesian framework, the posterior model weight is computed as the product of the model prior weight and the marginal likelihood (also termed the model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. The nested sampling estimator (NSE) is a recently proposed algorithm for marginal likelihood estimation. NSE gradually searches the parameter space from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampling algorithm for high-dimensional or complex likelihood functions. To improve the performance of NSE, a more efficient and elaborate sampling algorithm, DREAMzs, is integrated into the local sampling. In addition, to overcome the computational burden of the large number of repeated model executions required for marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build surrogates for the original groundwater model.
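The basic nested sampling loop the abstract refers to can be sketched as follows; this is a toy two-dimensional problem with a simple random-walk local sampler standing in for DREAMzs, and nothing here reproduces the groundwater models or surrogates of the study.

    import numpy as np

    rng = np.random.default_rng(2)

    # Toy 2-D problem: uniform prior on [-5, 5]^2, standard bivariate Gaussian likelihood,
    # so the true evidence is approximately the prior density 1/100.
    def log_lik(theta):
        return -0.5 * np.sum(theta ** 2) - np.log(2 * np.pi)

    def sample_prior(n):
        return rng.uniform(-5, 5, size=(n, 2))

    def constrained_walk(start, logl_min, steps=30, scale=0.5):
        """Stand-in for the local sampler (the paper uses DREAMzs): a random walk that only
        accepts moves staying inside the prior box and above the current likelihood floor."""
        x, lx = start.copy(), log_lik(start)
        for _ in range(steps):
            prop = x + rng.normal(0, scale, size=2)
            if np.all(np.abs(prop) <= 5) and log_lik(prop) > logl_min:
                x, lx = prop, log_lik(prop)
        return x, lx

    n_live, n_iter = 200, 2000
    live = sample_prior(n_live)
    live_ll = np.array([log_lik(t) for t in live])
    log_z = -np.inf

    for i in range(n_iter):
        worst = np.argmin(live_ll)
        log_wi = -i / n_live + np.log(1 - np.exp(-1 / n_live))   # prior-volume shell weight
        log_z = np.logaddexp(log_z, live_ll[worst] + log_wi)
        seed = live[rng.integers(n_live)]                        # start the walk from a random live point
        live[worst], live_ll[worst] = constrained_walk(seed, live_ll[worst])

    # add the contribution of the remaining live points
    log_z = np.logaddexp(log_z, np.logaddexp.reduce(live_ll) - np.log(n_live) - n_iter / n_live)
    print(f"nested-sampling log-evidence ~ {log_z:.3f}  (analytic: {np.log(1 / 100):.3f})")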
Likelihood Ratio Tests for Special Rasch Models
ERIC Educational Resources Information Center
Hessen, David J.
2010-01-01
In this article, a general class of special Rasch models for dichotomous item scores is considered. Although Andersen's likelihood ratio test can be used to test whether a Rasch model fits to the data, the test does not differentiate between special Rasch models. Therefore, in this article, new likelihood ratio tests are proposed for testing…
ERIC Educational Resources Information Center
Criss, Amy H.; McClelland, James L.
2006-01-01
The subjective likelihood model [SLiM; McClelland, J. L., & Chappell, M. (1998). Familiarity breeds differentiation: a subjective-likelihood approach to the effects of experience in recognition memory. "Psychological Review," 105(4), 734-760.] and the retrieving effectively from memory model [REM; Shiffrin, R. M., & Steyvers, M. (1997). A model…
NASA Astrophysics Data System (ADS)
Nourali, Mahrouz; Ghahraman, Bijan; Pourreza-Bilondi, Mohsen; Davary, Kamran
2016-09-01
In the present study, DREAM(ZS), Differential Evolution Adaptive Metropolis combined with both formal and informal likelihood functions, is used to investigate the uncertainty of the parameters of the HEC-HMS model in the Tamar watershed, Golestan province, Iran. In order to assess the uncertainty of the 24 parameters used in HMS, three flood events were used to calibrate and one flood event was used to validate the posterior distributions. Moreover, the performance of seven different likelihood functions (L1-L7) was assessed by means of the DREAM(ZS) approach. Four likelihood functions (L1-L4), the Nash-Sutcliffe (NS) efficiency, normalized absolute error (NAE), index of agreement (IOA), and Chiew-McMahon efficiency (CM), are informal, whereas the remaining three (L5-L7) fall in the formal category. L5 is based on the relationship between traditional least squares fitting and Bayesian inference, and L6 is a heteroscedastic maximum likelihood error (HMLE) estimator. Finally, in likelihood function L7, serial dependence of residual errors is accounted for using a first-order autoregressive (AR) model of the residuals. According to the results, the sensitivities of the parameters strongly depend on the choice of likelihood function and vary for different likelihood functions. Most of the parameters were better defined by the formal likelihood functions L5 and L7 and showed a high sensitivity to model performance. Posterior cumulative distributions corresponding to the informal likelihood functions L1, L2, L3, and L4 and the formal likelihood function L6 are approximately the same for most of the sub-basins, and these likelihood functions have almost the same effect on the sensitivity of the parameters. The 95% total prediction uncertainty bounds bracketed most of the observed data. Considering all the statistical indicators and criteria of uncertainty assessment, including RMSE, KGE, NS, P-factor and R-factor, the results showed that the DREAM(ZS) algorithm performed better under the formal likelihood functions L5 and L7, but likelihood function L5 may result in biased and unreliable parameter estimates due to violation of the residual-error assumptions. Thus, likelihood function L7 provides credible posterior distributions of the model parameters and can therefore be employed in further applications.
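For orientation, the sketch below writes out two informal and two formal likelihood functions of the kind the study compares; the NAE definition and all settings are assumptions, and the paper's exact formulations (for example the HMLE estimator L6) are not reproduced.

    import numpy as np

    def nash_sutcliffe(obs, sim):
        """Informal likelihood of the L1 type: Nash-Sutcliffe efficiency used directly as a score."""
        return 1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

    def normalized_abs_error(obs, sim):
        """Informal likelihood of the L2 type (one common NAE definition; the paper's may differ)."""
        return np.sum(np.abs(obs - sim)) / np.sum(np.abs(obs - obs.mean()))

    def formal_ls_loglik(obs, sim, sigma):
        """Formal likelihood of the L5 type: standard least-squares (i.i.d. Gaussian) log-likelihood."""
        r = obs - sim
        return -0.5 * len(r) * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(r**2) / sigma**2

    def formal_ar1_loglik(obs, sim, sigma, phi):
        """Formal likelihood of the L7 type: Gaussian residuals with first-order AR correlation."""
        r = obs - sim
        innov = r[1:] - phi * r[:-1]
        ll = -0.5 * (np.log(2 * np.pi * sigma**2 / (1 - phi**2)) + r[0] ** 2 * (1 - phi**2) / sigma**2)
        ll += -0.5 * np.sum(np.log(2 * np.pi * sigma**2) + innov**2 / sigma**2)
        return ll

    # tiny demonstration on synthetic data
    rng = np.random.default_rng(2)
    sim = np.sin(np.linspace(0, 6, 150)) + 2
    obs = sim + 0.1 * rng.normal(size=sim.size)
    print(nash_sutcliffe(obs, sim), formal_ar1_loglik(obs, sim, sigma=0.1, phi=0.2))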
How much to trust the senses: Likelihood learning
Sato, Yoshiyuki; Kording, Konrad P.
2014-01-01
Our brain often needs to estimate unknown variables from imperfect information. Our knowledge about the statistical distributions of quantities in our environment (called priors) and currently available information from sensory inputs (called likelihood) are the basis of all Bayesian models of perception and action. While we know that priors are learned, most studies of prior-likelihood integration simply assume that subjects know about the likelihood. However, as the quality of sensory inputs changes over time, we also need to learn about new likelihoods. Here, we show that human subjects readily learn the distribution of visual cues (likelihood function) in a way that can be predicted by models of statistically optimal learning. Using a likelihood that depended on color context, we found that a learned likelihood generalized to new priors. Thus, we conclude that subjects learn about the likelihood. PMID:25398975
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model with a finite number of components. Such models provide a natural representation of heterogeneity across a finite number of latent classes and are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that provides consistent estimates as the sample size increases to infinity. Maximum likelihood estimation is therefore used in the present paper to fit a finite mixture model and explore the relationship between nonlinear economic data. A two-component normal mixture model is fitted by maximum likelihood estimation to investigate the relationship between stock market prices and rubber prices for the sampled countries. The results show a negative relationship between rubber price and stock market price for Malaysia, Thailand, the Philippines, and Indonesia.
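A two-component normal mixture can be fitted by maximum likelihood with the EM algorithm, as in the univariate sketch below; the data are simulated, and the paper's bivariate stock-price/rubber-price application is not reproduced.

    import numpy as np
    from scipy import stats

    def fit_two_component_normal(x, n_iter=300):
        """EM algorithm maximizing the likelihood of a two-component normal mixture."""
        w = 0.5
        mu = np.array([x.mean() - x.std(), x.mean() + x.std()])
        sd = np.array([x.std(), x.std()])
        for _ in range(n_iter):
            # E-step: responsibility of the second component for each observation
            d0 = (1 - w) * stats.norm.pdf(x, mu[0], sd[0])
            d1 = w * stats.norm.pdf(x, mu[1], sd[1])
            r = d1 / (d0 + d1)
            # M-step: weighted maximum likelihood updates
            w = r.mean()
            mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
            sd = np.array([np.sqrt(np.average((x - mu[0]) ** 2, weights=1 - r)),
                           np.sqrt(np.average((x - mu[1]) ** 2, weights=r))])
        loglik = np.sum(np.log((1 - w) * stats.norm.pdf(x, mu[0], sd[0])
                               + w * stats.norm.pdf(x, mu[1], sd[1])))
        return w, mu, sd, loglik

    rng = np.random.default_rng(3)
    x = np.concatenate([rng.normal(-1, 0.5, 400), rng.normal(2, 1.0, 600)])
    print(fit_two_component_normal(x))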
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the water cycle, an increasing number of studies focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining the predictions of the plausible models, each weighted by a model weight determined by the model's prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a recently proposed method for marginal likelihood estimation. NSE gradually searches the parameter space from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling, but it is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, the robust and efficient sampling algorithm DREAMzs is incorporated into the local sampling of NSE. The comparison results demonstrate that the improved NSE increases the efficiency of marginal likelihood estimation significantly, although both the improved and original NSEs suffer from considerable instability. In addition, the heavy computational cost of the large number of model executions is overcome by using adaptive sparse grid surrogates.
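Once marginal likelihoods are available, the BMA weighting step itself is short; the sketch below (hypothetical log marginal likelihoods, equal prior weights) shows the log-space computation.

    import numpy as np

    def bma_weights(log_marginal_likelihoods, prior_weights=None):
        """Posterior model weights: prior weight times marginal likelihood, normalized.
        Computed in log space for numerical stability."""
        logml = np.asarray(log_marginal_likelihoods, dtype=float)
        if prior_weights is None:
            prior_weights = np.ones_like(logml) / len(logml)
        logw = np.log(prior_weights) + logml
        logw -= np.logaddexp.reduce(logw)
        return np.exp(logw)

    def bma_prediction(predictions, weights):
        """Ensemble mean and between-model variance of the weighted predictions."""
        predictions = np.asarray(predictions, dtype=float)   # shape (n_models, n_outputs)
        mean = weights @ predictions
        var = weights @ (predictions - mean) ** 2            # between-model spread only
        return mean, var

    # hypothetical log marginal likelihoods for three conceptual models
    w = bma_weights([-120.4, -118.9, -125.1])
    print(w)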
Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies
Rukhin, Andrew L.
2011-01-01
A formulation of the problem of combining data from several sources is discussed in terms of random effects models. The unknown measurement precision is assumed not to be the same for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as simultaneous polynomial equations, the exact form of the Groebner basis for their stationary points is derived when there are two methods. A parametrization of these solutions which allows their comparison is suggested. A numerical method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted maximum likelihood, is studied. In the situation when methods variances are considered to be known an upper bound on the between-method variance is obtained. The relationship between likelihood equations and moment-type equations is also discussed. PMID:26989583
Hurdle models for multilevel zero-inflated data via h-likelihood.
Molas, Marek; Lesaffre, Emmanuel
2010-12-30
Count data often exhibit overdispersion. One type of overdispersion arises when there is an excess of zeros in comparison with the standard Poisson distribution. Zero-inflated Poisson and hurdle models have been proposed to perform a valid likelihood-based analysis to account for the surplus of zeros. Further, data often arise in clustered, longitudinal or multiple-membership settings. The proper analysis needs to reflect the design of a study. Typically random effects are used to account for dependencies in the data. We examine the h-likelihood estimation and inference framework for hurdle models with random effects for complex designs. We extend the h-likelihood procedures to fit hurdle models, thereby extending h-likelihood to truncated distributions. Two applications of the methodology are presented. Copyright © 2010 John Wiley & Sons, Ltd.
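For reference, the fixed-effects core of a Poisson hurdle likelihood (a logistic part for zero versus positive and a zero-truncated Poisson part for the positives) can be written and maximized as below; the h-likelihood treatment of the paper additionally includes random effects and a penalty term for them, which this sketch omits, and the simulated data are only illustrative.

    import numpy as np
    from scipy import stats, optimize

    def hurdle_negloglik(params, y, X):
        """Negative log-likelihood of a fixed-effects Poisson hurdle model: a logistic part
        for P(y > 0) and a zero-truncated Poisson part for y | y > 0."""
        k = X.shape[1]
        eta = X @ params[:k]                    # linear predictor of the zero part
        lam = np.exp(X @ params[k:])            # rate of the truncated Poisson part
        log_p_pos = -np.logaddexp(0.0, -eta)    # log P(y > 0)
        log_p_zero = -np.logaddexp(0.0, eta)    # log P(y = 0)
        ll = np.where(y == 0, log_p_zero,
                      log_p_pos + stats.poisson.logpmf(y, lam) - np.log1p(-np.exp(-lam)))
        return -np.sum(ll)

    rng = np.random.default_rng(4)
    n = 1000
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = rng.poisson(1.5, n) * (rng.random(n) < 0.6)      # crude zero-heavy counts for illustration

    fit = optimize.minimize(hurdle_negloglik, np.zeros(2 * X.shape[1]), args=(y, X), method="BFGS")
    print(fit.x)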
MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS
Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ˜104 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
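The compression idea can be illustrated on a toy linear-Gaussian simulator: a score-based linear compression maps the data vector to one summary per parameter, and a plain rejection step is then run in the compressed space. DELFI itself would instead fit a conditional density to the simulated parameter-summary pairs; everything below is an assumed toy setup, not a cosmological analysis.

    import numpy as np

    rng = np.random.default_rng(5)

    # Toy forward model: d = A @ theta + Gaussian noise, with 2 parameters and 50 data points.
    n_data, n_params, sigma = 50, 2, 0.3
    A = rng.normal(size=(n_data, n_params))

    def simulate(theta):
        return A @ theta + sigma * rng.normal(size=n_data)

    # Score compression at a fiducial parameter value: t(d) = dmu/dtheta^T C^{-1} (d - mu_fid).
    # This maps the 50-dimensional data vector to 2 numbers (one per parameter).
    theta_fid = np.zeros(n_params)
    mu_fid = A @ theta_fid
    def compress(d):
        return A.T @ (d - mu_fid) / sigma**2

    theta_true = np.array([0.8, -0.5])
    t_obs = compress(simulate(theta_true))

    # Rejection step in the compressed space (kept deliberately simple for the sketch).
    theta_sim = rng.uniform(-2, 2, size=(50000, n_params))
    t_sim = np.array([compress(simulate(th)) for th in theta_sim])
    dist = np.linalg.norm((t_sim - t_obs) / t_sim.std(axis=0), axis=1)
    posterior = theta_sim[dist < np.quantile(dist, 0.002)]
    print(posterior.mean(axis=0), theta_true)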
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Risk prediction and aversion by anterior cingulate cortex.
Brown, Joshua W; Braver, Todd S
2007-12-01
The recently proposed error-likelihood hypothesis suggests that anterior cingulate cortex (ACC) and surrounding areas will become active in proportion to the perceived likelihood of an error. The hypothesis was originally derived from a computational model prediction. The same computational model now makes a further prediction that ACC will be sensitive not only to predicted error likelihood, but also to the predicted magnitude of the consequences, should an error occur. The product of error likelihood and predicted error consequence magnitude collectively defines the general "expected risk" of a given behavior in a manner analogous but orthogonal to subjective expected utility theory. New fMRI results from an incentive change signal task now replicate the error-likelihood effect, validate the further predictions of the computational model, and suggest why some segments of the population may fail to show an error-likelihood effect. In particular, error-likelihood effects and expected risk effects in general indicate greater sensitivity to earlier predictors of errors and are seen in risk-averse but not risk-tolerant individuals. Taken together, the results are consistent with an expected risk model of ACC and suggest that ACC may generally contribute to cognitive control by recruiting brain activity to avoid risk.
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its desirable asymptotic properties. In addition, it is consistent as the sample size increases to infinity and is asymptotically unbiased. Moreover, as the sample size increases, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model and explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines, and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
Ratmann, Oliver; Andrieu, Christophe; Wiuf, Carsten; Richardson, Sylvia
2009-06-30
Mathematical models are an important tool to explain and comprehend complex phenomena, and unparalleled computational advances enable us to easily explore them with little or no understanding of their global properties. In fact, the likelihood of the data under complex stochastic models is often analytically or numerically intractable in many areas of science. This makes it even more important to simultaneously investigate the adequacy of these models (in absolute terms, against the data, rather than relative to the performance of other models), but no such procedure has been formally discussed when the likelihood is intractable. We provide a statistical interpretation to current developments in likelihood-free Bayesian inference that explicitly accounts for discrepancies between the model and the data, termed Approximate Bayesian Computation under model uncertainty (ABCmicro). We augment the likelihood of the data with unknown error terms that correspond to freely chosen checking functions, and provide Monte Carlo strategies for sampling from the associated joint posterior distribution without the need of evaluating the likelihood. We discuss the benefit of incorporating model diagnostics within an ABC framework, and demonstrate how this method diagnoses model mismatch and guides model refinement by contrasting three qualitative models of protein network evolution to the protein interaction datasets of Helicobacter pylori and Treponema pallidum. Our results make a number of model deficiencies explicit, and suggest that the T. pallidum network topology is inconsistent with evolution dominated by link turnover or lateral gene transfer alone.
A general methodology for maximum likelihood inference from band-recovery data
Conroy, M.J.; Williams, B.K.
1984-01-01
A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band-recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band-recovery models, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
Beerli, Peter; Palczewski, Michal
2010-05-01
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.
2014-01-01
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface. PMID:25329157
Multilevel and Latent Variable Modeling with Composite Links and Exploded Likelihoods
ERIC Educational Resources Information Center
Rabe-Hesketh, Sophia; Skrondal, Anders
2007-01-01
Composite links and exploded likelihoods are powerful yet simple tools for specifying a wide range of latent variable models. Applications considered include survival or duration models, models for rankings, small area estimation with census information, models for ordinal responses, item response models with guessing, randomized response models,…
Hock, Sabrina; Hasenauer, Jan; Theis, Fabian J
2013-01-01
Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. Recently, it has become possible to assess the spatial gradients caused by diffusion in vitro and in vivo using microscopy-based imaging techniques. The resulting time series of two-dimensional, high-resolution images, in combination with mechanistic models, enable quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties in the model parameters. We introduce a likelihood function for image-based measurements with log-normal distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes. As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example of haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data, as well as the proposed identifiability analysis approach, is widely applicable to diffusion processes. The profile-likelihood-based method provides more rigorous uncertainty bounds than local approximation methods.
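Profile likelihoods themselves are straightforward to compute once a likelihood is available; the generic sketch below profiles one parameter of a simple exponential-decay regression (it does not attempt the PDE-constrained estimation of the paper) and reads off a likelihood-based confidence interval.

    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(6)

    # Simple surrogate problem: y = a * exp(-k * t) with Gaussian noise of known sd.
    t = np.linspace(0, 5, 40)
    a_true, k_true, sigma = 2.0, 0.7, 0.1
    y = a_true * np.exp(-k_true * t) + rng.normal(0, sigma, t.size)

    def negloglik(params):
        a, k = params
        r = y - a * np.exp(-k * t)
        return 0.5 * np.sum(r**2) / sigma**2 + 0.5 * t.size * np.log(2 * np.pi * sigma**2)

    full_fit = optimize.minimize(negloglik, x0=[1.0, 1.0], method="Nelder-Mead")

    # Profile likelihood for k: fix k on a grid and re-optimize the nuisance parameter a.
    k_grid = np.linspace(0.4, 1.0, 61)
    profile = np.array([optimize.minimize_scalar(lambda a, kk=k: negloglik([a, kk]),
                                                 bounds=(0.1, 5.0), method="bounded").fun
                        for k in k_grid])

    # Likelihood-based 95% confidence interval: where 2*(profile - min) <= chi2_{1,0.95}.
    inside = 2 * (profile - full_fit.fun) <= stats.chi2.ppf(0.95, df=1)
    print("k_hat =", round(full_fit.x[1], 3), " 95% CI ~", (k_grid[inside].min(), k_grid[inside].max()))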
A long-term earthquake rate model for the central and eastern United States from smoothed seismicity
Moschetti, Morgan P.
2015-01-01
I present a long-term earthquake rate model for the central and eastern United States from adaptive smoothed seismicity. By employing pseudoprospective likelihood testing (L-test), I examined the effects of fixed and adaptive smoothing methods and the effects of catalog duration and composition on the ability of the models to forecast the spatial distribution of recent earthquakes. To stabilize the adaptive smoothing method for regions of low seismicity, I introduced minor modifications to the way that the adaptive smoothing distances are calculated. Across all smoothed seismicity models, the use of adaptive smoothing and the use of earthquakes from the recent part of the catalog optimize the likelihood for tests with M≥2.7 and M≥4.0 earthquake catalogs. The smoothed seismicity models optimized by likelihood testing with M≥2.7 catalogs also produce the highest likelihood values for M≥4.0 likelihood testing, thus substantiating the hypothesis that the locations of moderate-size earthquakes can be forecast by the locations of smaller earthquakes. The likelihood test does not, however, maximize the fraction of earthquakes that are better forecast than a seismicity rate model with uniform rates in all cells. In this regard, fixed smoothing models perform better than adaptive smoothing models. The preferred model of this study is the adaptive smoothed seismicity model, based on its ability to maximize the joint likelihood of predicting the locations of recent small-to-moderate-size earthquakes across eastern North America. The preferred rate model delineates 12 regions where the annual rate of M≥5 earthquakes exceeds 2 × 10^−3. Although these seismic regions have been previously recognized, the preferred forecasts are more spatially concentrated than the rates from fixed smoothed seismicity models, with rate increases of up to a factor of 10 near clusters of high seismic activity.
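The two ingredients of such a test, a smoothed seismicity forecast and the Poisson joint log-likelihood of a target catalog, can be sketched as follows; the catalogs are synthetic, a fixed Gaussian kernel is used rather than the paper's adaptive smoothing, and the 0.1-degree grid on a unit box is arbitrary.

    import numpy as np

    rng = np.random.default_rng(7)

    # Hypothetical "learning" and "testing" catalogs: epicenters in a 1-degree by 1-degree box.
    learn = rng.normal(loc=[0.5, 0.5], scale=0.08, size=(300, 2)) % 1.0
    test = rng.normal(loc=[0.5, 0.5], scale=0.08, size=(30, 2)) % 1.0

    # 0.1-degree forecast cells.
    edges = np.linspace(0, 1, 11)
    centers = (edges[:-1] + edges[1:]) / 2
    cx, cy = np.meshgrid(centers, centers, indexing="ij")
    cells = np.column_stack([cx.ravel(), cy.ravel()])

    def smoothed_rates(catalog, bandwidth, total_rate):
        """Fixed-bandwidth Gaussian smoothed seismicity, scaled to a total forecast rate.
        (Adaptive smoothing would instead set the bandwidth per event, e.g. from the
        distance to its Nth nearest neighbour.)"""
        d2 = ((cells[:, None, :] - catalog[None, :, :]) ** 2).sum(axis=2)
        density = np.exp(-0.5 * d2 / bandwidth**2).sum(axis=1)
        return total_rate * density / density.sum()

    def joint_log_likelihood(rates, catalog):
        """Poisson joint log-likelihood of the observed cell counts given the forecast
        (the quantity compared in L-test-style likelihood testing)."""
        ix = np.searchsorted(edges, catalog[:, 0], side="right") - 1
        iy = np.searchsorted(edges, catalog[:, 1], side="right") - 1
        counts = np.zeros(len(cells))
        np.add.at(counts, ix * 10 + iy, 1)
        return np.sum(counts * np.log(rates) - rates)   # dropping the log(n!) constant

    for bw in (0.02, 0.05, 0.15):
        rates = smoothed_rates(learn, bw, total_rate=len(test))   # total rate fixed for illustration
        print(f"bandwidth {bw:.2f}  joint log-likelihood {joint_log_likelihood(rates, test):.1f}")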
Likelihood testing of seismicity-based rate forecasts of induced earthquakes in Oklahoma and Kansas
Moschetti, Morgan P.; Hoover, Susan M.; Mueller, Charles
2016-01-01
Likelihood testing of induced earthquakes in Oklahoma and Kansas has identified the parameters that optimize the forecasting ability of smoothed seismicity models and quantified the recent temporal stability of the spatial seismicity patterns. Use of the most recent 1-year period of earthquake data and use of 10–20-km smoothing distances produced the greatest likelihood. The likelihood that the locations of January–June 2015 earthquakes were consistent with optimized forecasts decayed with increasing elapsed time between the catalogs used for model development and testing. Likelihood tests with two additional sets of earthquakes from 2014 exhibit a strong sensitivity of the rate of decay to the smoothing distance. Marked reductions in likelihood are caused by the nonstationarity of the induced earthquake locations. Our results indicate a multiple-fold benefit from smoothed seismicity models in developing short-term earthquake rate forecasts for induced earthquakes in Oklahoma and Kansas, relative to the use of seismic source zones.
Determining the accuracy of maximum likelihood parameter estimates with colored residuals
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; Klein, Vladislav
1994-01-01
An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
Tests for detecting overdispersion in models with measurement error in covariates.
Yang, Yingsi; Wong, Man Yu
2015-11-30
Measurement error in covariates can affect the accuracy in count data modeling and analysis. In overdispersion identification, the true mean-variance relationship can be obscured under the influence of measurement error in covariates. In this paper, we propose three tests for detecting overdispersion when covariates are measured with error: a modified score test and two score tests based on the proposed approximate likelihood and quasi-likelihood, respectively. The proposed approximate likelihood is derived under the classical measurement error model, and the resulting approximate maximum likelihood estimator is shown to have superior efficiency. Simulation results also show that the score test based on approximate likelihood outperforms the test based on quasi-likelihood and other alternatives in terms of empirical power. By analyzing a real dataset containing the health-related quality-of-life measurements of a particular group of patients, we demonstrate the importance of the proposed methods by showing that the analyses with and without measurement error correction yield significantly different results. Copyright © 2015 John Wiley & Sons, Ltd.
Mixture Rasch Models with Joint Maximum Likelihood Estimation
ERIC Educational Resources Information Center
Willse, John T.
2011-01-01
This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…
Multiple robustness in factorized likelihood models.
Molina, J; Rotnitzky, A; Sued, M; Robins, J M
2017-09-01
We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because they ignore the correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotically optimal estimating function theory, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives that balance the trade-off between computational complexity and estimation efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation with estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. The original second-order quasi-likelihood is barely feasible, however, due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of regular geometric patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter, H. Third, we study the quasi-likelihood-type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to more general setups than the original quasi-likelihood method.
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped into the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and with classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared with maximum likelihood and by 34% compared with random classification. We found that the original DRG, the coder, and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational setting.
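As a minimal sketch of the modelling choice (weakly informative Gaussian priors on logistic regression coefficients versus plain maximum likelihood), the code below computes MAP estimates on simulated data; the prior scale, covariates, and data are assumptions, and a full Bayesian fit would sample the posterior rather than maximize it.

    import numpy as np
    from scipy import optimize

    def neg_log_posterior(beta, X, y, prior_sd=2.5):
        """Logistic regression with independent Normal(0, prior_sd^2) priors on the coefficients
        (a common weakly informative choice); returns the negative unnormalized log posterior.
        Maximum likelihood corresponds to letting prior_sd grow very large."""
        eta = X @ beta
        loglik = np.sum(y * eta - np.logaddexp(0, eta))
        logprior = -0.5 * np.sum(beta**2) / prior_sd**2
        return -(loglik + logprior)

    rng = np.random.default_rng(8)
    n = 500
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
    beta_true = np.array([-1.0, 0.8, 0.0, -0.5])
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

    map_fit = optimize.minimize(neg_log_posterior, np.zeros(4), args=(X, y), method="BFGS")
    ml_fit = optimize.minimize(neg_log_posterior, np.zeros(4), args=(X, y, 1e6), method="BFGS")
    print("MAP:", np.round(map_fit.x, 2), "  ML:", np.round(ml_fit.x, 2))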
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples-thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities-to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
Wernstedt Asterholm, Ingrid; Kim-Muller, Ja Young; Rutkowski, Joseph M; Crewe, Clair; Tao, Caroline; Scherer, Philipp E
2016-09-01
Resistin, and its closely related homologs, the resistin-like molecules (RELMs) have been implicated in metabolic dysregulation, inflammation, and cancer. Specifically, RELMβ, expressed predominantly in the goblet cells in the colon, is released both apically and basolaterally, and is hence found in both the intestinal lumen in the mucosal layer as well as in the circulation. RELMβ has been linked to both the pathogenesis of colon cancer and type 2 diabetes. RELMβ plays a complex role in immune system regulation, and the impact of loss of function of RELMβ on colon cancer and metabolic regulation has not been fully elucidated. We therefore tested whether Retnlβ (mouse ortholog of human RETNLβ) null mice have an enhanced or reduced susceptibility for colon cancer as well as metabolic dysfunction. We found that the lack of RELMβ leads to increased colonic expression of T helper cell type-2 cytokines and IL-17, associated with a reduced ability to maintain intestinal homeostasis. This defect leads to an enhanced susceptibility to the development of inflammation, colorectal cancer, and glucose intolerance. In conclusion, the phenotype of the Retnlβ null mice unravels new aspects of inflammation-mediated diseases and strengthens the notion that a proper intestinal barrier function is essential to sustain a healthy phenotype. Copyright © 2016 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Yiu, Sean; Tom, Brian Dm
2017-01-01
Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicates model fitting. Thus, non-standard computationally intensive procedures based on simulating the marginal likelihood have so far only been proposed. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and when it is of interest to directly model the overall marginal mean. The methodology is applied on a psoriatic arthritis data set concerning functional disability.
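The computational point of the paper is that the awkward high-dimensional integral collapses to a multivariate normal CDF, which standard software can evaluate; the sketch below shows only that evaluation step, with an assumed AR(1)-type covariance and hypothetical thresholds standing in for the paper's actual transformation of the two-part model.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)

    # A correlated latent Gaussian process at 8 follow-up times (the AR(1)-type correlation
    # is purely a stand-in for the patient-specific stochastic process).
    n_times, rho, sigma = 8, 0.8, 1.0
    times = np.arange(n_times)
    cov = sigma**2 * rho ** np.abs(times[:, None] - times[None, :])
    mean = np.zeros(n_times)

    # Suppose one patient's marginal likelihood contribution reduces, after the paper's
    # transformation, to P(Z_1 <= a_1, ..., Z_8 <= a_8) for thresholds a_t that depend on
    # the observed zero/positive pattern and the fixed effects (hypothetical values here).
    thresholds = rng.normal(0.3, 0.5, size=n_times)

    contribution = stats.multivariate_normal.cdf(thresholds, mean=mean, cov=cov)
    print(f"one patient's marginal likelihood factor: {contribution:.4f}")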
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
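A bare-bones stepping-stone estimator is sketched below on a conjugate toy model where each power posterior can be sampled exactly; in practice these samples come from MCMC, and generalized stepping-stone would further replace the prior endpoint with a working distribution, which is not shown here.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(10)

    # Conjugate normal-normal toy model so each power posterior is tractable.
    y = rng.normal(1.0, 1.0, size=30)
    prior_mu, prior_sd, obs_sd = 0.0, 3.0, 1.0
    n = len(y)

    def log_lik(theta):
        return (-0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1) / obs_sd**2
                - n * np.log(obs_sd * np.sqrt(2 * np.pi)))

    def power_posterior_draws(beta, size):
        prec = 1 / prior_sd**2 + beta * n / obs_sd**2
        mean = (prior_mu / prior_sd**2 + beta * y.sum() / obs_sd**2) / prec
        return rng.normal(mean, 1 / np.sqrt(prec), size=size)

    betas = np.linspace(0, 1, 33) ** 3     # ladder concentrated near the prior, as commonly advised
    log_ml = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        ll = log_lik(power_posterior_draws(b_prev, 2000))
        # stepping-stone ratio: z(b_next) / z(b_prev) estimated from draws at b_prev
        log_ml += np.logaddexp.reduce((b_next - b_prev) * ll) - np.log(ll.size)

    analytic = stats.multivariate_normal.logpdf(
        y, mean=np.full(n, prior_mu), cov=obs_sd**2 * np.eye(n) + prior_sd**2 * np.ones((n, n)))
    print(f"stepping-stone log marginal likelihood {log_ml:.3f}   analytic {analytic:.3f}")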
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models, and a statistical extension of the methods with application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
Reconceptualizing Social Influence in Counseling: The Elaboration Likelihood Model.
ERIC Educational Resources Information Center
McNeill, Brian W.; Stoltenberg, Cal D.
1989-01-01
Presents Elaboration Likelihood Model (ELM) of persuasion (a reconceptualization of the social influence process) as alternative model of attitude change. Contends ELM unifies conflicting social psychology results and can potentially account for inconsistent research findings in counseling psychology. Provides guidelines on integrating…
Royle, J. Andrew; Sutherland, Christopher S.; Fuller, Angela K.; Sun, Catherine C.
2015-01-01
We develop a likelihood analysis framework for fitting spatial capture-recapture (SCR) models to data collected on class structured or stratified populations. Our interest is motivated by the necessity of accommodating the problem of missing observations of individual class membership. This is particularly problematic in SCR data arising from DNA analysis of scat, hair or other material, which frequently yields individual identity but fails to identify the sex. Moreover, this can represent a large fraction of the data and, given the typically small sample sizes of many capture-recapture studies based on DNA information, utilization of the data with missing sex information is necessary. We develop the class structured likelihood for the case of missing covariate values, and then we address the scaling of the likelihood so that models with and without class structured parameters can be formally compared regardless of missing values. We apply our class structured model to black bear data collected in New York in which sex could be determined for only 62 of 169 uniquely identified individuals. The models containing sex-specificity of both the intercept of the SCR encounter probability model and the distance coefficient, and including a behavioral response are strongly favored by log-likelihood. Estimated population sex ratio is strongly influenced by sex structure in model parameters illustrating the importance of rigorous modeling of sex differences in capture-recapture models.
Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin
2016-01-01
In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on it, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in order to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood ratio, the within-source distribution is assumed to be normal and constant across sources, and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As will be shown, this approach provides better-calibrated likelihood ratios, as measured by the log-likelihood-ratio cost (Cllr), in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments, and car paints. PMID:26901680
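A heavily simplified, single-measurement version of the idea is sketched below: the between-source density is fitted once with a Gaussian mixture and once with a kernel density function, and each is used as the denominator of a likelihood ratio. The full multivariate kernel likelihood-ratio formula integrates over the unknown source mean, which is not reproduced; all data and covariances are hypothetical.

    import numpy as np
    from scipy import stats
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(11)

    # Hypothetical background database: 2-D features from many different sources,
    # deliberately multimodal so a single Gaussian may fit poorly.
    background = np.vstack([rng.normal([0, 0], 0.5, size=(300, 2)),
                            rng.normal([3, 1], 0.7, size=(300, 2)),
                            rng.normal([1, 4], 0.6, size=(300, 2))])

    within_cov = 0.1**2 * np.eye(2)          # within-source covariance, assumed known and constant

    # Between-source density modelled by a Gaussian mixture (the paper's proposal)
    # and, for comparison, by a kernel density function (the usual KDF choice).
    gmm = GaussianMixture(n_components=3, random_state=0).fit(background)
    kdf = stats.gaussian_kde(background.T)

    def log_lr(trace, control_mean, between_logpdf):
        """Simplified single-measurement likelihood ratio:
        numerator  = density of the trace around the control mean (within-source model),
        denominator = density of the trace in the relevant population (between-source model)."""
        num = stats.multivariate_normal.logpdf(trace, mean=control_mean, cov=within_cov)
        return num - between_logpdf(trace)

    trace = np.array([3.05, 1.1])
    control_mean = np.array([3.0, 1.0])
    print("log LR (GMM):", log_lr(trace, control_mean, lambda x: gmm.score_samples(x[None, :])[0]))
    print("log LR (KDF):", log_lr(trace, control_mean, lambda x: kdf.logpdf(x)[0]))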
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE, whereas the present paper introduces an innovative method to compute the NLSE using principles of multivariate calculus. The study is concerned with new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure for obtaining a linear pseudo model for a nonlinear regression model. In this article a new technique, based on multivariate calculus, is developed to obtain the linear pseudo model for the nonlinear regression model; the linear pseudo model of Edmond Malinvaud [4] is thereby explained in a different way. David Pollard et al. used empirical process techniques to study the asymptotics of the least-squares estimator (LSE) for fitting nonlinear regression functions in 2006. In Jae Myung [13] provided a conceptual introduction to maximum likelihood estimation in his work "Tutorial on maximum likelihood estimation".
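The abstract is purely analytical, but a small numerical sketch may help fix ideas about how the NLSE and the (Gaussian-error) MLE of a nonlinear regression model are obtained in practice: the former with scipy's curve_fit, the latter by minimising the negative log-likelihood directly. The exponential-saturation model, the simulated data and the parameter names are invented for illustration and are not the models treated in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit, minimize

rng = np.random.default_rng(1)

def f(x, a, b):
    """Illustrative nonlinear regression function y = a*(1 - exp(-b*x))."""
    return a * (1.0 - np.exp(-b * x))

x = np.linspace(0.1, 10, 80)
y = f(x, 2.5, 0.7) + rng.normal(scale=0.15, size=x.size)

# Nonlinear least squares estimator (NLSE).
nlse, _ = curve_fit(f, x, y, p0=[1.0, 1.0])

# Maximum likelihood estimator under i.i.d. Gaussian errors: estimate (a, b, log sigma).
def neg_log_lik(theta):
    a, b, log_sigma = theta
    sigma = np.exp(log_sigma)
    resid = y - f(x, a, b)
    return 0.5 * np.sum(resid**2) / sigma**2 + y.size * (log_sigma + 0.5 * np.log(2 * np.pi))

mle = minimize(neg_log_lik, x0=[1.0, 1.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000, "maxfev": 5000}).x

print("NLSE (a, b):       ", nlse)
print("MLE  (a, b, sigma):", mle[0], mle[1], np.exp(mle[2]))
```

With homoscedastic Gaussian errors the NLSE and the MLE of (a, b) coincide up to numerical tolerance, which is a useful sanity check on either implementation.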
Tsuboi, Koichiro; Nishitani, Mayo; Takakura, Atsushi; Imai, Yasuyuki; Komatsu, Masaaki; Kawashima, Hiroto
2015-01-01
Genome-wide association studies of inflammatory bowel diseases have identified susceptibility loci containing an autophagy-related gene. However, the role of autophagy in the colon, a major affected area in inflammatory bowel diseases, is not clear. Here, we show that colonic epithelial cell-specific autophagy-related gene 7 (Atg7) conditional knock-out (cKO) mice showed exacerbation of experimental colitis with more abundant bacterial invasion into the colonic epithelium. Quantitative PCR analysis revealed that cKO mice had abnormal microflora, with an increase in some genera. Consistently, expression of antimicrobial or antiparasitic peptides such as angiogenin-4, Relmβ, intelectin-1, and intelectin-2, as well as that of their inducer cytokines, was significantly reduced in the cKO mice. Furthermore, secretion of colonic mucins that function as a mucosal barrier against bacterial invasion was also significantly diminished in cKO mice. Taken together, our results indicate that autophagy in colonic epithelial cells protects against colitis by maintaining normal gut microflora and mucus secretion. PMID:26149685
Su, Jingjun; Du, Xinzhong; Li, Xuyong
2018-05-16
Uncertainty analysis is an important prerequisite for model application; however, existing phosphorus (P) loss indexes or indicators have rarely been evaluated in this respect. This study applied the generalized likelihood uncertainty estimation (GLUE) method to assess the uncertainty of the parameters and outputs of a non-point source (NPS) P indicator implemented in the R language, and also examined how the subjective choices of likelihood formulation and acceptability threshold in GLUE influence the model outputs. The results indicated the following. (1) The parameters RegR2, RegSDR2, PlossDPfer, PlossDPman, DPDR, and DPR were highly sensitive to the overall TP simulation, and their value ranges could be reduced by GLUE. (2) The Nash efficiency likelihood (L1) appeared better able to accentuate high-likelihood simulations than the exponential function (L2). (3) A combined likelihood integrating the criteria of multiple outputs performed better than a single likelihood in model uncertainty assessment, reducing the uncertainty band widths while preserving the goodness of fit of the overall model outputs. (4) A value of 0.55 appeared to be a reasonable threshold to balance high modeling efficiency against high bracketing efficiency. The results of this study provide (1) an option for conducting NPS modeling on a single computing platform, (2) references for parameter setting in NPS model development in similar regions, (3) suggestions for applying the GLUE method in studies with different emphases, and (4) insights into watershed P management in similar regions.
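For readers unfamiliar with GLUE, the sketch below shows the bare mechanics on a toy model: sample parameters from priors, score each sample with a Nash-Sutcliffe-type likelihood, keep the "behavioral" samples above an acceptability threshold (0.55 is used here only to echo the value discussed in the abstract), and form likelihood-weighted prediction bounds. The exponential-decay toy model and parameter ranges are invented; this is not the R-language P indicator of the study.

```python
import numpy as np

rng = np.random.default_rng(2)

def toy_model(a, b, t):
    """Stand-in simulator: exponential decay with an offset."""
    return a * np.exp(-b * t) + 0.2

t = np.linspace(0, 10, 50)
obs = toy_model(1.5, 0.4, t) + rng.normal(scale=0.05, size=t.size)

# 1. Monte Carlo sampling from uniform priors.
n = 20000
a_s = rng.uniform(0.5, 3.0, n)
b_s = rng.uniform(0.05, 1.0, n)
sims = toy_model(a_s[:, None], b_s[:, None], t)          # shape (n, len(t))

# 2. Nash-Sutcliffe efficiency as the informal GLUE likelihood (L1-type).
nse = 1.0 - np.sum((sims - obs)**2, axis=1) / np.sum((obs - obs.mean())**2)

# 3. Keep behavioral runs above the acceptability threshold and renormalize weights.
threshold = 0.55
keep = nse > threshold
weights = nse[keep] / nse[keep].sum()

# 4. Likelihood-weighted 5-95% uncertainty band for the predictions.
def weighted_quantile(values, q, w):
    order = np.argsort(values)
    cw = np.cumsum(w[order])
    return np.interp(q, cw / cw[-1], values[order])

band = np.array([[weighted_quantile(sims[keep, j], q, weights) for j in range(t.size)]
                 for q in (0.05, 0.95)])
print("behavioral runs:", keep.sum(), "of", n)
print("retained range of parameter a:", a_s[keep].min(), "to", a_s[keep].max())
print("width of 5-95% band at t=5:", band[1, 25] - band[0, 25])
```

Changing the likelihood formulation or the threshold in this loop directly changes the retained parameter ranges and band widths, which is exactly the sensitivity the abstract examines.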
A strategy for improved computational efficiency of the method of anchored distributions
NASA Astrophysics Data System (ADS)
Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram
2013-06-01
This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function, called bundling, that relaxes the requirement for large numbers of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability that a "bundle" of similar model parametrizations replicates the field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and the computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation, we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on the predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.
A Bayesian Alternative for Multi-objective Ecohydrological Model Specification
NASA Astrophysics Data System (ADS)
Tang, Y.; Marshall, L. A.; Sharma, A.; Ajami, H.
2015-12-01
Process-based ecohydrological models combine the study of hydrological, physical, biogeochemical and ecological processes in catchments and are usually more complex and more heavily parameterized than conceptual hydrological models. Appropriate calibration objectives and model uncertainty analysis are therefore essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying uncertainty in hydrological modeling, aided by the development of Markov chain Monte Carlo (MCMC) techniques. Our study aims to develop prior distributions and likelihood functions that minimize model uncertainty and bias within a Bayesian ecohydrological framework. A formal Bayesian approach is implemented in an ecohydrological model that couples a hydrological model (HyMOD) with a dynamic vegetation model (DVM). Simulations based on a single-objective likelihood (streamflow or LAI) and on multi-objective likelihoods (streamflow and LAI) with different weights are compared. Uniform, weakly informative and strongly informative prior distributions are used in different simulations. The Kullback-Leibler divergence (KLD) is used to measure the (dis)similarity between priors and the corresponding posterior distributions, to examine parameter sensitivity. Results show that different prior distributions can strongly influence the posterior distributions of parameters, especially when the available data are limited or parameters are insensitive to the available data. We demonstrate differences in optimized parameters and uncertainty limits obtained with multi-objective versus single-objective likelihoods, and the importance of appropriately defining the weights of the objectives in multi-objective calibration for different data types.
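To make the multi-objective idea concrete, here is a minimal sketch of how one might combine streamflow and LAI residuals into a single weighted Gaussian log-likelihood for use inside an MCMC sampler. The weights, error standard deviations and the placeholder data are assumptions of this sketch, not the HyMOD-DVM setup of the study.

```python
import numpy as np

def combined_log_likelihood(sim_q, obs_q, sim_lai, obs_lai,
                            sigma_q=1.0, sigma_lai=0.2, w_q=0.5, w_lai=0.5):
    """Weighted sum of two independent Gaussian log-likelihoods (streamflow and LAI).

    Assumes i.i.d. Gaussian errors for each variable; the weights control the relative
    influence of each objective on the posterior, as discussed in the abstract.
    """
    ll_q = -0.5 * np.sum(((obs_q - sim_q) / sigma_q) ** 2 + np.log(2 * np.pi * sigma_q**2))
    ll_lai = -0.5 * np.sum(((obs_lai - sim_lai) / sigma_lai) ** 2 + np.log(2 * np.pi * sigma_lai**2))
    return w_q * ll_q + w_lai * ll_lai

# Example call with made-up series; in practice sim_q and sim_lai would come from the
# coupled ecohydrological model evaluated at a proposed parameter set.
rng = np.random.default_rng(3)
obs_q, obs_lai = rng.gamma(2.0, 1.0, 365), rng.uniform(0.5, 5.0, 365)
print(combined_log_likelihood(obs_q * 1.05, obs_q, obs_lai + 0.1, obs_lai))
```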
Two models for evaluating landslide hazards
Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.
2006-01-01
Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence, while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of landslide occurrence were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of the total map area predicted to already contain or to be the site of future landslides. The stability of the estimates was checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and logistic discriminant analysis models are robust with respect to the conditional independence assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards.
Ning, Jing; Chen, Yong; Piao, Jin
2017-07-01
Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure perform well and avoid the non-convergence problem encountered when maximizing the observed likelihood.
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE)
Boker, Steven M.; Brick, Timothy R.; Pritikin, Joshua N.; Wang, Yang; von Oertzen, Timo; Brown, Donald; Lach, John; Estabrook, Ryne; Hunter, Michael D.; Maes, Hermine H.; Neale, Michael C.
2015-01-01
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and a vector of parameters to each participant's personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual's data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies. PMID:26717128
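The core computational loop of the MIDDLE idea is easy to caricature: the optimizer ships a parameter vector to each participant's device, each device returns only the log-likelihood of its private data, and the optimizer sums those scalars and iterates. The sketch below fakes the devices as in-memory objects and fits a simple normal mean/variance model; the class names and the model are illustrative assumptions, not the actual MIDDLE implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(4)

class ParticipantDevice:
    """Holds private data; only log-likelihood values ever leave the 'device'."""
    def __init__(self, data):
        self._data = data                     # never transmitted to the researcher

    def local_log_likelihood(self, params):
        mu, log_sigma = params
        return float(np.sum(norm.logpdf(self._data, loc=mu, scale=np.exp(log_sigma))))

# Simulated participants, each with a small private sample.
devices = [ParticipantDevice(rng.normal(10.0, 2.0, size=rng.integers(5, 30)))
           for _ in range(200)]

def aggregated_neg_log_likelihood(params):
    # The central optimizer only ever sees the sum of the returned scalars.
    return -sum(dev.local_log_likelihood(params) for dev in devices)

fit = minimize(aggregated_neg_log_likelihood, x0=[0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 2000, "maxfev": 2000})
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print("estimated mean and sd:", mu_hat, sigma_hat)
```

In a real deployment the loop over devices would be replaced by network calls, and opt-in/opt-out simply changes which devices respond at each iteration.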
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.
ERIC Educational Resources Information Center
Baldwin, Beatrice
The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses
ERIC Educational Resources Information Center
Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini
2012-01-01
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…
Robust analysis of semiparametric renewal process models
Lin, Feng-Chang; Truong, Young K.; Fine, Jason P.
2013-01-01
A rate model is proposed for a modulated renewal process comprising a single long sequence, where the covariate process may not capture the dependencies in the sequence as in standard intensity models. We consider partial likelihood-based inferences under a semiparametric multiplicative rate model, which has been widely studied in the context of independent and identical data. Under an intensity model, gap times in a single long sequence may be used naively in the partial likelihood, with variance estimation utilizing the observed information matrix. Under a rate model, the gap times cannot be treated as independent and studying the partial likelihood is much more challenging. We employ a mixing condition in the application of limit theory for stationary sequences to obtain consistency and asymptotic normality. The estimator's variance is quite complicated owing to the unknown gap-time dependence structure. We adapt block bootstrapping and cluster variance estimators to the partial likelihood. Simulation studies and an analysis of a semiparametric extension of a popular model for neural spike train data demonstrate the practical utility of the rate approach in comparison with the intensity approach. PMID:24550568
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of the dependency structure, local versus global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, the correlation curves of Doksum et al., and the local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood: at stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood; at stage 2, we estimate the dependency structure with the margins fixed at their estimates. The third procedure is two-stage partially parametric maximum likelihood; it is similar to the second procedure, but the margins are estimated by the Kaplan-Meier estimator. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness-of-fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer-generated data.
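The second estimation procedure described (two-stage maximum likelihood, often called inference functions for margins) is easy to demonstrate on uncensored data: fit the margins first, then fix them and maximise the copula likelihood for the dependence parameter. The sketch below uses a Gaussian copula with exponential margins; censoring, survival-specific copulas and the Kaplan-Meier variant of the third procedure are omitted, and every distributional choice here is an assumption made for illustration.

```python
import numpy as np
from scipy.stats import norm, expon
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)

# Simulate bivariate failure times: Gaussian copula (rho=0.6), exponential margins.
rho_true, rate1, rate2, n = 0.6, 0.5, 1.5, 2000
z = rng.multivariate_normal([0, 0], [[1, rho_true], [rho_true, 1]], size=n)
t1 = expon.ppf(norm.cdf(z[:, 0]), scale=1/rate1)
t2 = expon.ppf(norm.cdf(z[:, 1]), scale=1/rate2)

# Stage 1: marginal MLEs (for an exponential margin the MLE of the rate is 1/mean).
rate1_hat, rate2_hat = 1/t1.mean(), 1/t2.mean()

# Stage 2: plug in the fitted margins and maximise the Gaussian copula log-likelihood.
u1 = expon.cdf(t1, scale=1/rate1_hat)
u2 = expon.cdf(t2, scale=1/rate2_hat)
z1, z2 = norm.ppf(u1), norm.ppf(u2)

def neg_copula_loglik(rho):
    # Log-density of the bivariate Gaussian copula evaluated at the normal scores.
    return -np.sum(-0.5*np.log(1 - rho**2)
                   + (2*rho*z1*z2 - rho**2*(z1**2 + z2**2)) / (2*(1 - rho**2)))

rho_hat = minimize_scalar(neg_copula_loglik, bounds=(-0.99, 0.99), method="bounded").x
print("marginal rates:", rate1_hat, rate2_hat, " copula rho:", rho_hat)
```

Full maximum likelihood would instead optimise the marginal and dependence parameters jointly, at the cost of a higher-dimensional optimisation.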
Constrained Maximum Likelihood Estimation for Two-Level Mean and Covariance Structure Models
ERIC Educational Resources Information Center
Bentler, Peter M.; Liang, Jiajuan; Tang, Man-Lai; Yuan, Ke-Hai
2011-01-01
Maximum likelihood is commonly used for the estimation of model parameters in the analysis of two-level structural equation models. Constraints on model parameters could be encountered in some situations such as equal factor loadings for different factors. Linear constraints are the most common ones and they are relatively easy to handle in…
Maximum Likelihood Dynamic Factor Modeling for Arbitrary "N" and "T" Using SEM
ERIC Educational Resources Information Center
Voelkle, Manuel C.; Oud, Johan H. L.; von Oertzen, Timo; Lindenberger, Ulman
2012-01-01
This article has 3 objectives that build on each other. First, we demonstrate how to obtain maximum likelihood estimates for dynamic factor models (the direct autoregressive factor score model) with arbitrary "T" and "N" by means of structural equation modeling (SEM) and compare the approach to existing methods. Second, we go beyond standard time…
Eng, Kenny; Carlisle, Daren M.; Wolock, David M.; Falcone, James A.
2013-01-01
An approach is presented in this study to aid water-resource managers in characterizing streamflow alteration at ungauged rivers. Such approaches can be used to take advantage of the substantial amounts of biological data collected at ungauged rivers to evaluate the potential ecological consequences of altered streamflows. National-scale random forest statistical models are developed to predict the likelihood that ungauged rivers have altered streamflows (relative to expected natural condition) for five hydrologic metrics (HMs) representing different aspects of the streamflow regime. The models use human disturbance variables, such as number of dams and road density, to predict the likelihood of streamflow alteration. For each HM, separate models are derived to predict the likelihood that the observed metric is greater than (‘inflated’) or less than (‘diminished’) natural conditions. The utility of these models is demonstrated by applying them to all river segments in the South Platte River in Colorado, USA, and for all 10-digit hydrologic units in the conterminous United States. In general, the models successfully predicted the likelihood of alteration to the five HMs at the national scale as well as in the South Platte River basin. However, the models predicting the likelihood of diminished HMs consistently outperformed models predicting inflated HMs, possibly because of fewer sites across the conterminous United States where HMs are inflated. The results of these analyses suggest that the primary predictors of altered streamflow regimes across the Nation are (i) the residence time of annual runoff held in storage in reservoirs, (ii) the degree of urbanization measured by road density and (iii) the extent of agricultural land cover in the river basin.
Clavel, Julien; Aristide, Leandro; Morlon, Hélène
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n, and because computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large-p-small-n scenario, but their use and performance remain limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and to model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the new framework by evaluating evolutionary model fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
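The central point of the abstract, that penalising the likelihood keeps the trait covariance matrix well conditioned when p approaches n, can be illustrated with a generic linear-shrinkage penalty while ignoring the phylogenetic structure entirely. In the sketch below the penalty intensity is chosen by held-out Gaussian log-likelihood; the specific penalties, the GIC and the RPANDA/mvMORPH implementations of the paper are not reproduced, and the data are simulated.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(6)

# High-dimensional setting: p close to n (species x traits), phylogeny ignored.
n, p = 60, 50
A = rng.normal(size=(p, p))
true_cov = A @ A.T / p + 0.5 * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)

train, test = X[:40], X[40:]
S = np.cov(train, rowvar=False)            # rank-deficient sample covariance (40 rows, 50 traits)

def shrunk(S, lam):
    """Linear shrinkage toward a scaled identity; a stand-in for a ridge-type penalty."""
    return (1 - lam) * S + lam * np.trace(S) / S.shape[0] * np.eye(S.shape[0])

def heldout_loglik(cov):
    return multivariate_normal(mean=np.zeros(p), cov=cov).logpdf(test).sum()

lams = np.linspace(0.05, 0.95, 19)
scores = [heldout_loglik(shrunk(S, lam)) for lam in lams]
best = lams[int(np.argmax(scores))]
print("best shrinkage intensity:", best)
print("held-out log-likelihood at lambda=0.05 vs best:", scores[0], "vs", max(scores))
```

The same logic (regularize, then score penalized fits with a criterion) is what allows models such as BM, EB and OU to be compared even when the unpenalized likelihood is degenerate.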
Maximum Likelihood Item Easiness Models for Test Theory Without an Answer Key
Batchelder, William H.
2014-01-01
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce two extensions to the basic model in order to account for item rating easiness/difficulty. The first extension is a multiplicative model and the second is an additive model. We show how the multiplicative model is related to the Rasch model. We describe several maximum-likelihood estimation procedures for the models and discuss issues of model fit and identifiability. We describe how the CCT models could be used to give alternative consensus-based measures of reliability. We demonstrate the utility of both the basic and extended models on a set of essay rating data and give ideas for future research. PMID:29795812
Modeling gene expression measurement error: a quasi-likelihood approach
Strimmer, Korbinian
2003-01-01
Background Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both of these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In the case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions The quasi-likelihood framework provides a simple and versatile approach to analyzing gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real datasets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression. PMID:12659637
Model averaging techniques for quantifying conceptual model uncertainty.
Singh, Abhishek; Mishra, Srikanta; Ruskauff, Greg
2010-01-01
In recent years a growing understanding has emerged regarding the need to expand the modeling paradigm to include conceptual model uncertainty for groundwater models. Conceptual model uncertainty is typically addressed by formulating alternative model conceptualizations and assessing their relative likelihoods using statistical model averaging approaches. Several model averaging techniques and likelihood measures have been proposed in the recent literature for this purpose, falling into two broad categories: Monte Carlo-based techniques such as Generalized Likelihood Uncertainty Estimation or GLUE (Beven and Binley 1992), and criterion-based techniques that use metrics such as the Bayesian and Kashyap Information Criteria (e.g., the Maximum Likelihood Bayesian Model Averaging or MLBMA approach proposed by Neuman 2003) and Akaike Information Criterion-based model averaging (AICMA) (Poeter and Anderson 2005). These different techniques can often lead to significantly different relative model weights and ranks because of differences in the underlying statistical assumptions about the nature of model uncertainty. This paper provides a comparative assessment of four model averaging techniques (GLUE, MLBMA with KIC, MLBMA with BIC, and AIC-based model averaging) for the purpose of quantifying the impacts of model uncertainty on groundwater model predictions. Pros and cons of each model averaging technique are examined from a practitioner's perspective using two groundwater modeling case studies. Recommendations are provided regarding the use of these techniques in groundwater modeling practice.
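For the criterion-based branch of the comparison, a compact example of how relative model weights are formed from information criteria may help: fit competing models, compute AIC (or BIC, or KIC) values, and convert the differences to weights via exp(-ΔIC/2). The polynomial regression toy below is generic and deliberately ignores groundwater specifics; the likelihood measures of the cited approaches are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 1, 60)
y = 1.0 + 2.0*x - 1.5*x**2 + rng.normal(scale=0.1, size=x.size)

def fit_poly(deg):
    """Least-squares polynomial fit; returns the maximized log-likelihood and parameter count."""
    coef = np.polyfit(x, y, deg)
    resid = y - np.polyval(coef, x)
    sigma2 = resid @ resid / len(y)
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    return loglik, deg + 2                  # deg+1 coefficients plus the error variance

models = {f"poly deg {d}": fit_poly(d) for d in (1, 2, 5)}
aic = {m: -2*ll + 2*k for m, (ll, k) in models.items()}
bic = {m: -2*ll + k*np.log(len(y)) for m, (ll, k) in models.items()}

def ic_weights(ic):
    d = np.array(list(ic.values())) - min(ic.values())
    w = np.exp(-0.5 * d)
    return dict(zip(ic.keys(), w / w.sum()))

print("AIC weights:", ic_weights(aic))
print("BIC weights:", ic_weights(bic))
```

Because AIC and BIC penalize complexity differently, the resulting weights (and hence the averaged predictions) can differ, which is one source of the divergent model ranks discussed in the abstract.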
ERIC Educational Resources Information Center
Levy, Roy
2010-01-01
SEMModComp, a software package for conducting likelihood ratio tests for mean and covariance structure modeling, is described. The package is written in R and is freely available for download or on request.
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct errors in mean and variance and are capable of forecasting a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been used in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules, when used as optimization criteria, should locate the same (unknown) optimum; discrepancies can result from a wrong distributional assumption about the observed quantity. To address this theoretical point, this study compares maximum likelihood and minimum CRPS estimation under different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients, with the log-likelihood estimator being slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
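A compact way to see the two estimators side by side is to fit a Gaussian non-homogeneous regression to synthetic ensemble forecasts, once by maximising the likelihood and once by minimising the closed-form CRPS of a normal distribution. The data-generating process, link functions and coefficient names below are invented for illustration and do not reproduce the study's setup.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(8)

# Synthetic ensemble statistics: ensemble mean m and ensemble spread s.
n = 4000
m = rng.normal(5, 3, n)
s = rng.uniform(0.5, 2.0, n)
# Observations whose error mean and spread depend on (m, s).
y = 0.5 + 0.9*m + rng.normal(scale=np.sqrt(0.3 + 0.8*s**2))

def moments(theta):
    a0, a1, b0, b1 = theta
    mu = a0 + a1*m
    sigma = np.sqrt(np.exp(b0) + np.exp(b1)*s**2)   # positive by construction
    return mu, sigma

def neg_log_lik(theta):
    mu, sigma = moments(theta)
    return -np.sum(norm.logpdf(y, loc=mu, scale=sigma))

def mean_crps(theta):
    # Closed-form CRPS of a normal predictive distribution, averaged over the sample.
    mu, sigma = moments(theta)
    z = (y - mu) / sigma
    crps = sigma * (z*(2*norm.cdf(z) - 1) + 2*norm.pdf(z) - 1/np.sqrt(np.pi))
    return np.mean(crps)

x0 = np.zeros(4)
opts = {"maxiter": 20000, "maxfev": 20000}
ml_fit = minimize(neg_log_lik, x0, method="Nelder-Mead", options=opts)
crps_fit = minimize(mean_crps, x0, method="Nelder-Mead", options=opts)
print("ML coefficients:  ", ml_fit.x)
print("CRPS coefficients:", crps_fit.x)
```

With a correctly specified Gaussian model the two coefficient vectors should agree closely, mirroring the synthetic result described in the abstract; systematic disagreement is a hint of a misspecified predictive distribution.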
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
Maximum Likelihood Analysis of Nonlinear Structural Equation Models with Dichotomous Variables
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lee, Sik-Yum
2005-01-01
In this article, a maximum likelihood approach is developed to analyze structural equation models with dichotomous variables that are common in behavioral, psychological and social research. To assess nonlinear causal effects among the latent variables, the structural equation in the model is defined by a nonlinear function. The basic idea of the…
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program Linear SCIDNT which evaluates rotorcraft stability and control coefficients from flight or wind tunnel test data is described. It implements the maximum likelihood method to maximize the likelihood function of the parameters based on measured input/output time histories. Linear SCIDNT may be applied to systems modeled by linear constant-coefficient differential equations. This restriction in scope allows the application of several analytical results which simplify the computation and improve its efficiency over the general nonlinear case.
Janssen, Eva; van Osch, Liesbeth; Lechner, Lilian; Candel, Math; de Vries, Hein
2012-01-01
Despite the increased recognition of affect in guiding probability estimates, perceived risk has been mainly operationalised in a cognitive way and the differentiation between rational and intuitive judgements is largely unexplored. This study investigated the validity of a measurement instrument differentiating cognitive and affective probability beliefs and examined whether behavioural decision making is mainly guided by cognition or affect. Data were obtained from four surveys focusing on smoking (N=268), fruit consumption (N=989), sunbed use (N=251) and sun protection (N=858). Correlational analyses showed that affective likelihood was more strongly correlated with worry compared to cognitive likelihood and confirmatory factor analysis provided support for a two-factor model of perceived likelihood instead of a one-factor model (i.e. cognition and affect combined). Furthermore, affective likelihood was significantly associated with the various outcome variables, whereas the association for cognitive likelihood was absent in three studies. The findings provide support for the construct validity of the measures used to assess cognitive and affective likelihood. Since affective likelihood might be a better predictor of health behaviour than the commonly used cognitive operationalisation, both dimensions should be considered in future research.
Dong, Yi; Mihalas, Stefan; Russell, Alexander; Etienne-Cummings, Ralph; Niebur, Ernst
2012-01-01
When a neuronal spike train is observed, what can we say about the properties of the neuron that generated it? A natural way to answer this question is to make an assumption about the type of neuron, select an appropriate model for this type, and then to choose the model parameters as those that are most likely to generate the observed spike train. This is the maximum likelihood method. If the neuron obeys simple integrate and fire dynamics, Paninski, Pillow, and Simoncelli (2004) showed that its negative log-likelihood function is convex and that its unique global minimum can thus be found by gradient descent techniques. The global minimum property requires independence of spike time intervals. Lack of history dependence is, however, an important constraint that is not fulfilled in many biological neurons which are known to generate a rich repertoire of spiking behaviors that are incompatible with history independence. Therefore, we expanded the integrate and fire model by including one additional variable, a variable threshold (Mihalas & Niebur, 2009) allowing for history-dependent firing patterns. This neuronal model produces a large number of spiking behaviors while still being linear. Linearity is important as it maintains the distribution of the random variables and still allows for maximum likelihood methods to be used. In this study we show that, although convexity of the negative log-likelihood is not guaranteed for this model, the minimum of the negative log-likelihood function yields a good estimate for the model parameters, in particular if the noise level is treated as a free parameter. Furthermore, we show that a nonlinear function minimization method (r-algorithm with space dilation) frequently reaches the global minimum. PMID:21851282
Likelihoods for fixed rank nomination networks
HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE
2014-01-01
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586
Inferring the parameters of a Markov process from snapshots of the steady state
NASA Astrophysics Data System (ADS)
Dettmer, Simon L.; Berg, Johannes
2018-02-01
We seek to infer the parameters of an ergodic Markov process from samples taken independently from the steady state. Our focus is on non-equilibrium processes, where the steady state is not described by the Boltzmann measure, but is generally unknown and hard to compute, which prevents the application of established equilibrium inference methods. We propose a quantity we call propagator likelihood, which takes on the role of the likelihood in equilibrium processes. This propagator likelihood is based on fictitious transitions between those configurations of the system which occur in the samples. The propagator likelihood can be derived by minimising the relative entropy between the empirical distribution and a distribution generated by propagating the empirical distribution forward in time. Maximising the propagator likelihood leads to an efficient reconstruction of the parameters of the underlying model in different systems, both with discrete configurations and with continuous configurations. We apply the method to non-equilibrium models from statistical physics and theoretical biology, including the asymmetric simple exclusion process (ASEP), the kinetic Ising model, and replicator dynamics.
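A stripped-down version of the propagator likelihood can be written in a few lines for a two-state Markov chain: propagate the empirical distribution of the steady-state samples one step forward under candidate parameters and score the samples against the propagated distribution. Because only the ratio of the two rates is identifiable from steady-state snapshots, one rate is held fixed at its true value in this toy; everything here is an illustrative reduction of the method, not the ASEP or kinetic Ising applications of the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(9)

# Two-state discrete-time chain: P(0->1) = a, P(1->0) = b.  Stationary P(state=1) = a/(a+b).
a_true, b_fixed, n = 0.30, 0.10, 5000
samples = (rng.random(n) < a_true / (a_true + b_fixed)).astype(int)   # i.i.d. steady-state snapshots

# Empirical distribution of the snapshots.
pi_emp = np.array([np.mean(samples == 0), np.mean(samples == 1)])

def propagator_log_likelihood(a):
    """Sum over samples of the log probability of each sample under the one-step-propagated empirical law."""
    P = np.array([[1 - a, a],
                  [b_fixed, 1 - b_fixed]])
    propagated = pi_emp @ P                  # fictitious one-step transition starting from the data
    return np.sum(np.log(propagated[samples]))

a_hat = minimize_scalar(lambda a: -propagator_log_likelihood(a),
                        bounds=(1e-3, 0.999), method="bounded").x
print("true a:", a_true, " estimated a:", a_hat)
```

Maximising this quantity drives the propagated distribution toward the empirical one, which is the relative-entropy view of the method described in the abstract.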
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, defined as the probability of the observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error-generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods currently in use employ Gaussian processes as a likelihood function because of their favourable analytical properties. A Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data, which are typically more uncertain in high flows than in periods with low flows. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this work we suggest learning the nature of the error distribution from the errors made by the model in "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. (1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. Based on the results from a didactical example of predicting rainfall runoff, (2) we demonstrate that the copula captures the predictive uncertainty of the model. (3) Finally, we find that the properties of autocorrelation and heteroscedasticity of the errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions, and they could help us to better capture the statistical properties of errors and make more reliable predictions.
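As a concrete reduction of the idea, the snippet below builds a log-likelihood for a residual time series from (i) a kernel density estimate of the marginal error distribution learned from past residuals and (ii) a Gaussian copula with an AR(1) correlation structure for the dependence, so no Box-Cox transform is needed. The AR(1) form, the KDE margin and all synthetic data are assumptions of this sketch, not the exact construction of the study.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, multivariate_normal

rng = np.random.default_rng(10)

# "Past" model errors used to learn the marginal error distribution (skewed, non-Gaussian).
past_errors = rng.gamma(2.0, 1.0, 2000) - 2.0
kde = gaussian_kde(past_errors)

def copula_log_likelihood(errors, phi):
    """Log-density of an error vector under KDE margins plus an AR(1) Gaussian copula."""
    # Marginal part: log f_m(e_t) from the KDE.
    log_marg = np.log(kde(errors))
    # Transform to uniform scores via the KDE's CDF, then to normal scores.
    u = np.clip([kde.integrate_box_1d(-np.inf, e) for e in errors], 1e-6, 1 - 1e-6)
    z = norm.ppf(u)
    # Gaussian copula density with AR(1) correlation R[i, j] = phi**|i-j|.
    idx = np.arange(len(errors))
    R = phi ** np.abs(idx[:, None] - idx[None, :])
    log_copula = (multivariate_normal(mean=np.zeros(len(errors)), cov=R).logpdf(z)
                  - norm.logpdf(z).sum())
    return log_marg.sum() + log_copula

# Evaluate the likelihood of a short "current" residual series for two candidate
# autocorrelation values; in calibration this would sit inside the Bayesian inference loop.
current = rng.gamma(2.0, 1.0, 30) - 2.0
print("phi=0.1:", copula_log_likelihood(current, 0.1))
print("phi=0.6:", copula_log_likelihood(current, 0.6))
```

The marginal and dependence parts are separable here, which is what lets the skewness and heteroscedasticity live in the margins while the copula handles autocorrelation.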
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if the distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of the underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and the multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome the adverse effects of violations of distributional assumptions on the error terms in random utility functions.
Maximum Likelihood Analysis of a Two-Level Nonlinear Structural Equation Model with Fixed Covariates
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan
2005-01-01
In this article, a maximum likelihood (ML) approach for analyzing a rather general two-level structural equation model is developed for hierarchically structured data that are very common in educational and/or behavioral research. The proposed two-level model can accommodate nonlinear causal relations among latent variables as well as effects…
Detecting Growth Shape Misspecifications in Latent Growth Models: An Evaluation of Fit Indexes
ERIC Educational Resources Information Center
Leite, Walter L.; Stapleton, Laura M.
2011-01-01
In this study, the authors compared the likelihood ratio test and fit indexes for detection of misspecifications of growth shape in latent growth models through a simulation study and a graphical analysis. They found that the likelihood ratio test, MFI, and root mean square error of approximation performed best for detecting model misspecification…
ERIC Educational Resources Information Center
Petty, Richard E.; And Others
1987-01-01
Answers James Stiff's criticism of the Elaboration Likelihood Model (ELM) of persuasion. Corrects certain misperceptions of the ELM and criticizes Stiff's meta-analysis that compares ELM predictions with those derived from Kahneman's elastic capacity model. Argues that Stiff's presentation of the ELM and the conclusions he draws based on the data…
Maximum Likelihood Item Easiness Models for Test Theory without an Answer Key
ERIC Educational Resources Information Center
France, Stephen L.; Batchelder, William H.
2015-01-01
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
ERIC Educational Resources Information Center
Kelderman, Henk
1992-01-01
Describes algorithms used in the computer program LOGIMO for obtaining maximum likelihood estimates of the parameters in loglinear models. These algorithms are also useful for the analysis of loglinear item-response theory models. Presents modified versions of the iterative proportional fitting and Newton-Raphson algorithms. Simulated data…
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
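Outside of IRT software, the mechanics of a profile-likelihood CI are easy to show on a toy two-parameter model: profile out the nuisance parameter, then find the parameter values where twice the drop in log-likelihood from its maximum equals the 0.95 chi-square quantile with one degree of freedom. The normal-mean example below is only a stand-in for the item-parameter case treated in the article.

```python
import numpy as np
from scipy.stats import chi2
from scipy.optimize import brentq

rng = np.random.default_rng(11)
x = rng.normal(loc=3.0, scale=2.0, size=40)
n = x.size

def profile_loglik(mu):
    """Profile out sigma: its conditional MLE is sqrt(mean((x - mu)^2))."""
    s2 = np.mean((x - mu) ** 2)
    return -0.5 * n * (np.log(2 * np.pi * s2) + 1.0)

mu_hat = x.mean()
l_max = profile_loglik(mu_hat)
cutoff = l_max - 0.5 * chi2.ppf(0.95, df=1)

# Roots of profile_loglik(mu) - cutoff on either side of the MLE give the 95% PL CI.
g = lambda mu: profile_loglik(mu) - cutoff
lower = brentq(g, mu_hat - 10 * x.std(), mu_hat)
upper = brentq(g, mu_hat, mu_hat + 10 * x.std())

# Wald interval for comparison (usual sigma_hat / sqrt(n) standard error).
se = x.std() / np.sqrt(n)
print("95% profile-likelihood CI:", (lower, upper))
print("95% Wald CI:              ", (mu_hat - 1.96 * se, mu_hat + 1.96 * se))
```

Unlike the Wald interval, the PL CI need not be symmetric about the estimate, which is one reason it tends to behave better for transformed or boundary-adjacent parameters.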
2010-06-01
GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML) and Exponential Maximum Likelihood (EML) ... accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a ... to the Gaussian Maximum Likelihood (GML) and Exponential Maximum Likelihood (EML) estimators for clock offset estimation in non-Gaussian or non...
Comparison of two weighted integration models for the cueing task: linear and likelihood
NASA Technical Reports Server (NTRS)
Shimozaki, Steven S.; Eckstein, Miguel P.; Abbey, Craig K.
2003-01-01
In a task in which the observer must detect a signal at two locations, presenting a precue that predicts the location of the signal leads to improved performance with a valid cue (the signal location matches the cue) compared to an invalid cue (the signal location does not match the cue). The cue validity effect has often been explained by a limited-capacity attentional mechanism improving the perceptual quality at the cued location. Alternatively, the cueing effect can also be explained by unlimited-capacity models that assume a weighted combination of noisy responses across the two locations. We compare two weighted integration models, a linear model and a sum-of-weighted-likelihoods model based on a Bayesian observer. While these models are qualitatively similar, they quantitatively predict different cue validity effects as the signal-to-noise ratio (SNR) increases. To test these models, three observers performed a cued discrimination task of Gaussian targets with an 80% valid precue across a broad range of SNRs. A limited-capacity attentional switching model was also analyzed and rejected. The sum-of-weighted-likelihoods model best described the psychophysical results, suggesting that human observers approximate a weighted combination of likelihoods, and not a weighted linear combination.
Liu, Peigui; Elshall, Ahmed S.; Ye, Ming; ...
2016-02-05
Evaluating the marginal likelihood is the most critical and computationally expensive task when conducting Bayesian model averaging to quantify parametric and model uncertainties. The evaluation is commonly done by using Laplace approximations to evaluate semianalytical expressions of the marginal likelihood, or by using Monte Carlo (MC) methods to evaluate the arithmetic or harmonic mean of a joint likelihood function. This study introduces a new MC method, thermodynamic integration, which has not previously been attempted in environmental modeling. Instead of using samples only from the prior parameter space (as in arithmetic mean evaluation) or the posterior parameter space (as in harmonic mean evaluation), the thermodynamic integration method uses samples generated gradually from the prior to the posterior parameter space. This is done through a path sampling that conducts Markov chain Monte Carlo simulation with different power coefficient values applied to the joint likelihood function. The thermodynamic integration method is evaluated using three analytical functions by comparing it with two variants of the Laplace approximation method and three MC methods, including the nested sampling method that has recently been introduced into environmental modeling. The thermodynamic integration method outperforms the other methods in terms of accuracy, convergence, and consistency. The method is also applied to a synthetic case of groundwater modeling with four alternative models. The application shows that model probabilities obtained using the thermodynamic integration method improve the predictive performance of Bayesian model averaging. The thermodynamic integration method is thus mathematically rigorous, and its MC implementation is computationally general for a wide range of environmental problems.
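The path-sampling identity behind thermodynamic integration (the log marginal likelihood equals the integral over t in [0, 1] of the expected log-likelihood under the power posterior proportional to L(theta)^t p(theta)) can be checked on a conjugate toy problem where the evidence is known in closed form. The sketch runs a short random-walk Metropolis chain at each rung of a temperature ladder; the Gaussian model, the ladder and the tuning constants are all assumptions of the illustration, not the groundwater application.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(12)

# Model: y_i ~ N(theta, sigma^2) with known sigma; prior theta ~ N(0, tau^2).
sigma, tau, n = 1.0, 3.0, 20
y = rng.normal(1.5, sigma, n)

def log_lik(theta):
    return float(np.sum(norm.logpdf(y, loc=theta, scale=sigma)))

def log_prior(theta):
    return float(norm.logpdf(theta, loc=0.0, scale=tau))

# Analytical log-evidence for checking: y ~ N(0, sigma^2 I + tau^2 11^T).
cov = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
log_z_exact = multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

def power_posterior_mean_loglik(t, n_iter=6000):
    """E_t[log L(theta)] under the power posterior proportional to L^t * prior, via Metropolis."""
    # Proposal scale tuned from the known conjugate power-posterior spread (a toy-only shortcut;
    # in general the step size would be tuned adaptively).
    step = 2.4 / np.sqrt(t * n / sigma**2 + 1.0 / tau**2)
    theta = 0.0
    lp = t * log_lik(theta) + log_prior(theta)
    vals = []
    for i in range(n_iter):
        prop = theta + step * rng.normal()
        lp_prop = t * log_lik(prop) + log_prior(prop)
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if i >= n_iter // 3:                 # discard burn-in
            vals.append(log_lik(theta))
    return np.mean(vals)

ts = np.linspace(0.0, 1.0, 21) ** 5          # ladder concentrated near t = 0
e_loglik = np.array([power_posterior_mean_loglik(t) for t in ts])
log_z_ti = np.sum(0.5 * (e_loglik[1:] + e_loglik[:-1]) * np.diff(ts))   # trapezoid rule

print("thermodynamic integration estimate:", log_z_ti)
print("exact log-evidence:                ", log_z_exact)
```

The two printed values should agree to within Monte Carlo error, which is the kind of consistency check the analytical test functions in the study provide on a larger scale.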
Maximum Likelihood Estimation of Nonlinear Structural Equation Models.
ERIC Educational Resources Information Center
Lee, Sik-Yum; Zhu, Hong-Tu
2002-01-01
Developed an EM type algorithm for maximum likelihood estimation of a general nonlinear structural equation model in which the E-step is completed by a Metropolis-Hastings algorithm. Illustrated the methodology with results from a simulation study and two real examples using data from previous studies. (SLD)
ERIC Educational Resources Information Center
Hamaker, Ellen L.; Dolan, Conor V.; Molenaar, Peter C. M.
2003-01-01
Demonstrated, through simulation, that stationary autoregressive moving average (ARMA) models may be fitted readily when T>N, using normal theory raw maximum likelihood structural equation modeling. Also provides some illustrations based on real data. (SLD)
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger the evidence, the more support a model receives among a collection of hypotheses, as its simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones favored by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
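As a rough illustration of the underlying idea only: the sketch below fits a Gaussian mixture to MCMC posterior samples and uses it as a plain importance-sampling proposal for the evidence of a toy linear model with a known analytic marginal likelihood. A random-walk Metropolis sampler stands in for DREAM, and plain importance sampling stands in for the bridge-sampling step of GMIS, so this is not the GMIS estimator itself.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Toy linear model y = X @ beta + noise, with Gaussian prior beta ~ N(0, tau^2 I)
n, sigma, tau = 30, 1.0, 3.0
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=sigma, size=n)

def log_prior(b):
    return -0.5 * np.sum(b ** 2) / tau**2 - np.log(2 * np.pi * tau**2)

def log_like(b):
    r = y - X @ b
    return -0.5 * np.sum(r ** 2) / sigma**2 - 0.5 * n * np.log(2 * np.pi * sigma**2)

# 1) crude random-walk Metropolis to obtain posterior samples (stand-in for DREAM)
b, samples = np.zeros(2), []
lp = log_prior(b) + log_like(b)
for _ in range(20000):
    prop = b + 0.2 * rng.normal(size=2)
    lp_prop = log_prior(prop) + log_like(prop)
    if np.log(rng.random()) < lp_prop - lp:
        b, lp = prop, lp_prop
    samples.append(b)
post = np.array(samples[10000:])

# 2) fit a Gaussian mixture to the posterior samples and use it as an
#    importance-sampling proposal for the evidence (simplified stand-in for GMIS)
gm = GaussianMixture(n_components=2, random_state=0).fit(post)
draws, _ = gm.sample(20000)
log_w = (np.array([log_prior(d) + log_like(d) for d in draws])
         - gm.score_samples(draws))
log_evidence_is = np.logaddexp.reduce(log_w) - np.log(len(log_w))

# Analytic evidence: y ~ N(0, sigma^2 I + tau^2 X X^T)
log_evidence_exact = multivariate_normal.logpdf(
    y, mean=np.zeros(n), cov=sigma**2 * np.eye(n) + tau**2 * X @ X.T)
print(log_evidence_is, log_evidence_exact)
```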
Bayesian experimental design for models with intractable likelihoods.
Drovandi, Christopher C; Pettitt, Anthony N
2013-12-01
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
Poisson point process modeling for polyphonic music transcription.
Peeling, Paul; Li, Chung-fai; Godsill, Simon
2007-04-01
Peaks detected in the frequency domain spectrum of a musical chord are modeled as realizations of a nonhomogeneous Poisson point process. When several notes are superimposed to make a chord, the processes for individual notes combine to give another Poisson process, whose likelihood is easily computable. This avoids a data association step linking individual harmonics explicitly with detected peaks in the spectrum. The likelihood function is ideal for Bayesian inference about the unknown note frequencies in a chord. Here, maximum likelihood estimation of fundamental frequencies shows very promising performance on real polyphonic piano music recordings.
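A small sketch of the superposition property and the resulting point-process likelihood: each note contributes an intensity over frequency (here an illustrative set of Gaussian bumps at its harmonics), the chord intensity is their sum, and the nonhomogeneous Poisson log-likelihood of the detected peaks is the usual sum of log-intensities minus the integrated intensity. The intensity model, peak list, and candidate chords are all assumptions for illustration.

```python
import numpy as np

def note_intensity(freqs, f0, n_harm=8, amp=5.0, width=3.0):
    """Intensity of spectral-peak occurrences for a single note: Gaussian bumps
    at the harmonics of f0 (a deliberately simple parametric choice)."""
    lam = np.zeros_like(freqs, dtype=float)
    for k in range(1, n_harm + 1):
        lam += amp * np.exp(-0.5 * ((freqs - k * f0) / width) ** 2)
    return lam + 1e-3                      # small background rate

def chord_log_likelihood(peaks, f0s, grid=np.linspace(20, 4000, 8000)):
    """Log-likelihood of detected peak frequencies under a nonhomogeneous Poisson
    process whose intensity is the superposition of the note intensities."""
    lam_peaks = sum(note_intensity(np.asarray(peaks, dtype=float), f0) for f0 in f0s)
    lam_grid = sum(note_intensity(grid, f0) for f0 in f0s)
    integral = np.trapz(lam_grid, grid)    # expected number of peaks
    return np.sum(np.log(lam_peaks)) - integral

# Toy example: peaks from a C4+E4 dyad, scored against candidate note pairs
peaks = [261, 330, 523, 659, 785, 1046]
candidates = [(261.6, 329.6), (261.6, 392.0), (293.7, 329.6)]
for f0s in candidates:
    print(f0s, chord_log_likelihood(peaks, f0s))
```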
Maximum Likelihood Estimation of Nonlinear Structural Equation Models with Ignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Lee, John C. K.
2003-01-01
The existing maximum likelihood theory and its computer software in structural equation modeling are established on the basis of linear relationships among latent variables with fully observed data. However, in social and behavioral sciences, nonlinear relationships among the latent variables are important for establishing more meaningful models…
The Elaboration Likelihood Model: Implications for the Practice of School Psychology.
ERIC Educational Resources Information Center
Petty, Richard E.; Heesacker, Martin; Hughes, Jan N.
1997-01-01
Reviews a contemporary theory of attitude change, the Elaboration Likelihood Model (ELM) of persuasion, and addresses its relevance to school psychology. Claims that a key postulate of ELM is that attitude change results from thoughtful (central route) or nonthoughtful (peripheral route) processes. Illustrations of ELM's utility for school…
Counseling Pretreatment and the Elaboration Likelihood Model of Attitude Change.
ERIC Educational Resources Information Center
Heesacker, Martin
1986-01-01
Results of the application of the Elaboration Likelihood Model (ELM) to a counseling context revealed that more favorable attitudes toward counseling occurred as subjects' ego involvement increased and as intervention quality improved. Counselor credibility affected the degree to which subjects' attitudes reflected argument quality differences.…
Application of the Elaboration Likelihood Model of Attitude Change to Assertion Training.
ERIC Educational Resources Information Center
Ernst, John M.; Heesacker, Martin
1993-01-01
College students (n=113) participated in study comparing effects of elaboration likelihood model (ELM) based assertion workshop with those of typical assertion workshop. ELM-based workshop was significantly better at producing favorable attitude change, greater intention to act assertively, and more favorable evaluations of workshop content.…
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional (simultaneous) maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to the conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Model uncertainty estimation and risk assessment is essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
Modeling abundance effects in distance sampling
Royle, J. Andrew; Dawson, D.K.; Bates, S.
2004-01-01
Distance-sampling methods are commonly used in studies of animal populations to estimate population density. A common objective of such studies is to evaluate the relationship between abundance or density and covariates that describe animal habitat or other environmental influences. However, little attention has been focused on methods of modeling abundance covariate effects in conventional distance-sampling models. In this paper we propose a distance-sampling model that accommodates covariate effects on abundance. The model is based on specifying the distance-sampling likelihood at the level of the sample unit in terms of local abundance. This model is augmented with a Poisson regression model for local abundance that is parameterized in terms of available covariates. Maximum-likelihood estimation of detection and density parameters is based on the integrated likelihood, wherein local abundance is removed from the likelihood by integration. We provide an example using avian point-transect data of Ovenbirds (Seiurus aurocapillus) collected using a distance-sampling protocol and two measures of habitat structure (understory cover and basal area of overstory trees). The model yields a sensible description (positive effect of understory cover, negative effect of basal area) of the relationship between habitat and Ovenbird density that can be used to evaluate the effects of habitat management on Ovenbird populations.
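A simplified point-transect sketch of the integrated-likelihood construction, assuming half-normal detection and a single illustrative habitat covariate; because Poisson local abundance thinned by detection yields a Poisson count, the integration over local abundance is available in closed form here. The toy data and parameter values are assumptions, not the Ovenbird analysis.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

B = 100.0   # point-count radius (m)

def pbar(sigma):
    """Average detection probability within radius B for half-normal detection
    g(d) = exp(-d^2 / (2 sigma^2)) and uniform animal locations in the circle."""
    return (2.0 * sigma**2 / B**2) * (1.0 - np.exp(-B**2 / (2.0 * sigma**2)))

def negloglik(params, counts, z, distances):
    """Integrated likelihood: local abundance is Poisson with a log-linear
    covariate effect, so each observed count is Poisson(lambda_i * pbar)."""
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)
    lam = np.exp(b0 + b1 * z)
    ll_counts = poisson.logpmf(counts, lam * pbar(sigma)).sum()
    # conditional density of observed distances given detection
    ll_dist = np.sum(-distances**2 / (2.0 * sigma**2)
                     + np.log(2.0 * distances / B**2) - np.log(pbar(sigma)))
    return -(ll_counts + ll_dist)

# Toy data: z is a habitat covariate (e.g., understory cover), counts per point,
# and pooled detection distances (Rayleigh draws approximate the detected-distance
# density; truncation at B is ignored in this rough simulation)
rng = np.random.default_rng(3)
z = rng.uniform(0, 1, 60)
counts = rng.poisson(np.exp(0.2 + 1.0 * z) * pbar(40.0))
distances = rng.rayleigh(40.0, counts.sum())
distances = distances[distances < B]

fit = minimize(negloglik, x0=[0.0, 0.0, np.log(30.0)],
               args=(counts, z, distances), method="Nelder-Mead")
print(fit.x)   # [b0, b1, log sigma]
```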
Modeling regional variation in riverine fish biodiversity in the Arkansas-White-Red River basin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schweizer, Peter E; Jager, Yetta
The patterns of biodiversity in freshwater systems are shaped by biogeography, environmental gradients, and human-induced factors. In this study, we developed empirical models to explain fish species richness in subbasins of the Arkansas-White-Red River basin as a function of discharge, elevation, climate, land cover, water quality, dams, and longitudinal position. We used information-theoretic criteria to compare generalized linear mixed models and identified well-supported models. Subbasin attributes that were retained as predictors included discharge, elevation, number of downstream dams, percent forest, percent shrubland, nitrate, total phosphorus, and sediment. The random component of our models, which assumed a negative binomial distribution, included spatial correlation within larger river basins and overdispersed residual variance. This study differs from previous biodiversity modeling efforts in several ways. First, obtaining likelihoods for negative binomial mixed models, and thereby avoiding reliance on quasi-likelihoods, has only recently become practical. We found the ranking of models based on these likelihood estimates to be more believable than that produced using quasi-likelihoods. Second, because we had access to a regional-scale watershed model for this river basin, we were able to include model-estimated water quality attributes as predictors. Thus, the resulting models have potential value as tools with which to evaluate the benefits of water quality improvements to fish.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and to account for features typical of such data, including a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One- and two-part beta regression models provide flexible approaches to regress outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
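A minimal sketch of the single-equation beta regression component (logit mean link, constant precision), fitted by maximum likelihood on toy data strictly inside (0, 1); the two-part extension, quasi-likelihood, and Bayesian estimators described in the abstract are not reproduced here, and all data and settings are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import beta as beta_dist

def negloglik(params, X, y):
    """Single-equation beta regression: logit mean link, constant precision phi."""
    coefs, log_phi = params[:-1], params[-1]
    mu = expit(X @ coefs)
    phi = np.exp(log_phi)
    return -np.sum(beta_dist.logpdf(y, mu * phi, (1.0 - mu) * phi))

# Toy HRQoL-like data strictly inside (0, 1); a two-part extension would model
# the spike at 1 separately with a logistic component.
rng = np.random.default_rng(4)
n = 400
X = np.column_stack([np.ones(n), rng.integers(0, 2, n)])   # intercept + treatment arm
mu_true = expit(X @ np.array([0.5, 0.8]))
y = rng.beta(mu_true * 10.0, (1.0 - mu_true) * 10.0)

fit = minimize(negloglik, x0=np.zeros(X.shape[1] + 1), args=(X, y),
               method="Nelder-Mead")
print(fit.x[:-1], np.exp(fit.x[-1]))    # coefficients on the logit scale, phi
```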
Measurement of CIB power spectra with CAM-SPEC from Planck HFI maps
NASA Astrophysics Data System (ADS)
Mak, Suet Ying; Challinor, Anthony; Efstathiou, George; Lagache, Guilaine
2015-08-01
We present new measurements of the cosmic infrared background (CIB) anisotropies and its first likelihood using Planck HFI data at 353, 545, and 857 GHz. The measurements are based on cross-frequency power spectra and likelihood analysis using the CAM-SPEC package, rather than map-based template removal of foregrounds as done in the previous Planck CIB analysis. We construct the likelihood of the CIB temperature fluctuations, an extension of the CAM-SPEC likelihood used in CMB analysis to higher frequency, and use it to derive the best estimate of the CIB power spectrum over the multipole range 50 ≤ l ≤ 2500. We adopt parametric models of the CIB and foreground contaminants (Galactic cirrus, infrared point sources, and cosmic microwave background anisotropies), and calibrate the dataset uniformly across frequencies with known Planck beam and noise properties in the likelihood construction. We validate our likelihood through simulations and an extensive suite of consistency tests, and assess the impact of instrumental and data selection effects on the final CIB power spectrum constraints. Two approaches are developed for interpreting the CIB power spectrum. The first approach is based on a simple parametric model that describes the cross-frequency power using amplitudes, correlation coefficients, and a known multipole dependence. The second approach is based on physical models for galaxy clustering and the evolution of infrared emission of galaxies. The new approaches fit all auto- and cross-power spectra very well, with a best fit of χ²ν = 1.04 (parametric model). Using the best foreground solution, we find that the cleaned CIB power spectra are in good agreement with previous Planck and Herschel measurements.
ERIC Educational Resources Information Center
Wothke, Werner; Burket, George; Chen, Li-Sue; Gao, Furong; Shu, Lianghua; Chia, Mike
2011-01-01
It has been known for some time that item response theory (IRT) models may exhibit a likelihood function of a respondent's ability which may have multiple modes, flat modes, or both. These conditions, often associated with guessing of multiple-choice (MC) questions, can introduce uncertainty and bias to ability estimation by maximum likelihood…
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
Evangelista, Laura; Panunzio, Annalori; Polverosi, Roberta; Pomerri, Fabio; Rubello, Domenico
2014-03-01
The purpose of this study was to determine the likelihood of malignancy for indeterminate lung nodules identified on CT by comparing two standardized models with (18)F-FDG PET/CT. Fifty-nine cancer patients with indeterminate lung nodules (solid tumors; diameter ≥5 mm) on CT underwent FDG PET/CT for lesion characterization. The Mayo Clinic and Veterans Affairs Cooperative Study models of likelihood of malignancy were applied to solitary pulmonary nodules. High probability of malignancy was assigned a priori for multiple nodules. Low (<5%), intermediate (5-60%), and high (>60%) pretest malignancy probabilities were analyzed separately. Patients were reclassified with PET/CT. Histopathology or 2-year imaging follow-up established the diagnosis. Outcome-based reclassification differences were defined as net reclassification improvement. An asymptotic test of the null hypothesis was applied. Thirty-one patients had histology-proven malignancy. PET/CT was true-positive in 24 and true-negative in 25 cases. Negative predictive value was 78% and positive predictive value was 89%. On the basis of the Mayo Clinic model (n=31), 18 patients had low, 12 had intermediate, and one had high pretest likelihood; on the basis of the Veterans Affairs model (n=26), 5 patients had low, 20 had intermediate, and one had high pretest likelihood. Because of multiple lung nodules, 28 patients were classified as having high malignancy risk. PET/CT showed 32 negative and 27 positive scans. Net reclassification improvements were 0.95 and 1.6 for the Mayo Clinic and Veterans Affairs models, respectively (both p<0.0001). Fourteen of 31 (45.2%) and 12 of 26 (46.2%) patients with low or intermediate pretest likelihood had positive PET/CT findings under the Mayo Clinic and Veterans Affairs models, respectively. Of 15 patients with high pretest likelihood and negative findings on PET/CT, 13 (86.7%) did not have lung malignancy. PET/CT improves stratification of cancer patients with indeterminate pulmonary nodules. A substantial number of patients considered at low or intermediate pretest likelihood of malignancy, but with histology-proven lung malignancy, showed abnormal PET/CT findings.
Cross-validation to select Bayesian hierarchical models in phylogenetics.
Duchêne, Sebastián; Duchêne, David A; Di Giallonardo, Francesca; Eden, John-Sebastian; Geoghegan, Jemma L; Holt, Kathryn E; Ho, Simon Y W; Holmes, Edward C
2016-05-26
Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation that is widely used in evolutionary analysis. This involves comparing models according to their predictive performance. We analysed simulated data and a range of viral and bacterial data sets using a cross-validation approach to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation was able to match that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Cross-validation is a useful method for Bayesian phylogenetic model selection. This method can be readily implemented even when considering complex models where selecting an appropriate prior for all parameters may be difficult.
Inverse Ising problem in continuous time: A latent variable approach
NASA Astrophysics Data System (ADS)
Donner, Christian; Opper, Manfred
2017-12-01
We consider the inverse Ising problem: the inference of network couplings from observed spin trajectories for a model with continuous time Glauber dynamics. By introducing two sets of auxiliary latent random variables we render the likelihood into a form which allows for simple iterative inference algorithms with analytical updates. The variables are (1) Poisson variables to linearize an exponential term which is typical for point process likelihoods and (2) Pólya-Gamma variables, which make the likelihood quadratic in the coupling parameters. Using the augmented likelihood, we derive an expectation-maximization (EM) algorithm to obtain the maximum likelihood estimate of network parameters. Using a third set of latent variables we extend the EM algorithm to sparse couplings via L1 regularization. Finally, we develop an efficient approximate Bayesian inference algorithm using a variational approach. We demonstrate the performance of our algorithms on data simulated from an Ising model. For data which are simulated from a more biologically plausible network with spiking neurons, we show that the Ising model captures well the low order statistics of the data and how the Ising couplings are related to the underlying synaptic structure of the simulated network.
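For orientation, a short sketch of the data-generating process that the inference targets: continuous-time Glauber dynamics for ±1 spins simulated with the Gillespie algorithm. The couplings, fields, and the particular flip-rate normalization are illustrative assumptions; the Pólya-Gamma augmentation and EM updates themselves are beyond a short sketch.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_glauber(J, theta, T=50.0):
    """Continuous-time Glauber dynamics for +/-1 spins via the Gillespie algorithm.
    Spin i flips with rate sigmoid(-2 s_i h_i), where h_i = sum_j J_ij s_j + theta_i."""
    n = len(theta)
    s = rng.choice([-1, 1], size=n)
    t, events = 0.0, []
    while t < T:
        h = J @ s + theta
        rates = 1.0 / (1.0 + np.exp(2.0 * s * h))    # flip rates
        total = rates.sum()
        t += rng.exponential(1.0 / total)            # waiting time to next flip
        i = rng.choice(n, p=rates / total)           # which spin flips
        s[i] *= -1
        events.append((t, i, s.copy()))
    return events

# Small random network of 5 spins with sparse symmetric couplings
n = 5
J = rng.normal(scale=0.5, size=(n, n)) * (rng.random((n, n)) < 0.4)
J = (J + J.T) / 2.0
np.fill_diagonal(J, 0.0)
trajectory = simulate_glauber(J, theta=np.zeros(n))
print(len(trajectory), "flip events")
```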
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
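A minimal sketch of the objective being maximized: the logistic log-likelihood plus Firth's Jeffreys-prior penalty (half the log-determinant of the Fisher information) minus a ridge penalty. The toy data, the ridge weight, and the choice to leave the intercept unpenalized are illustrative assumptions, and the paper's estimator may differ in such details.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_double_penalized_loglik(beta, X, y, ridge):
    """Negative of: logistic log-likelihood + Firth penalty (0.5 log|X'WX|)
    - 0.5 * ridge * ||beta||^2 (intercept left unpenalized by the ridge term)."""
    p = np.clip(expit(X @ beta), 1e-9, 1 - 1e-9)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    W = p * (1 - p)
    _, logdet = np.linalg.slogdet(X.T @ (W[:, None] * X))   # Fisher information
    firth = 0.5 * logdet
    ridge_pen = 0.5 * ridge * np.sum(beta[1:] ** 2)
    return -(loglik + firth - ridge_pen)

# Toy screening-style data with highly correlated items (multicollinearity)
rng = np.random.default_rng(6)
n = 150
z = rng.normal(size=n)
items = np.column_stack([z + 0.1 * rng.normal(size=n) for _ in range(4)])
X = np.column_stack([np.ones(n), items])
y = (rng.random(n) < expit(2.0 * z)).astype(float)

fit = minimize(neg_double_penalized_loglik, x0=np.zeros(X.shape[1]),
               args=(X, y, 1.0), method="BFGS")
print(fit.x)
```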
Quasar microlensing models with constraints on the Quasar light curves
NASA Astrophysics Data System (ADS)
Tie, S. S.; Kochanek, C. S.
2018-01-01
Quasar microlensing analyses implicitly generate a model of the variability of the source quasar. The implied source variability may be unrealistic, yet its likelihood is generally not evaluated. We used the damped random walk (DRW) model for quasar variability to evaluate the likelihood of the source variability and applied the revised algorithm to a microlensing analysis of the lensed quasar RX J1131-1231. We compared estimates of the size of the quasar disc and the average stellar mass of the lens galaxy with and without applying the DRW likelihoods for the source variability model and found no significant effect on the estimated physical parameters. The most likely explanation is that unrealistic source light-curve models are generally associated with poor microlensing fits that already make a negligible contribution to the probability distributions of the derived parameters.
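A direct-covariance sketch of the DRW (Ornstein-Uhlenbeck) likelihood used to score a candidate source light curve: the covariance between epochs decays exponentially with time lag, and measurement variance is added on the diagonal. The toy light curve and parameter values are illustrative; the microlensing analysis itself is not reproduced.

```python
import numpy as np
from scipy.stats import multivariate_normal

def drw_loglike(times, mags, errs, sigma, tau, mean):
    """Gaussian log-likelihood of a light curve under the damped random walk
    (Ornstein-Uhlenbeck) model: cov_ij = sigma^2 exp(-|t_i - t_j| / tau),
    plus measurement variance on the diagonal."""
    dt = np.abs(times[:, None] - times[None, :])
    cov = sigma**2 * np.exp(-dt / tau) + np.diag(errs**2)
    return multivariate_normal.logpdf(mags, mean=np.full(len(times), mean), cov=cov)

# Toy light curve: irregular sampling, arbitrary DRW parameters
rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 1000, 80))
truth = {"sigma": 0.2, "tau": 150.0, "mean": 18.0}
cov = truth["sigma"]**2 * np.exp(-np.abs(t[:, None] - t[None, :]) / truth["tau"])
mags = rng.multivariate_normal(np.full(len(t), truth["mean"]), cov)
errs = np.full(len(t), 0.02)
mags += rng.normal(scale=errs)

# Compare the likelihood of a plausible vs an implausible source-variability model
print(drw_loglike(t, mags, errs, 0.2, 150.0, 18.0),
      drw_loglike(t, mags, errs, 2.0, 5.0, 18.0))
```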
Chen, Feng; Chen, Suren; Ma, Xiaoxiang
2018-06-01
Driving environment, including road surface conditions and traffic states, often changes over time and influences crash probability considerably. Traditional crash frequency models developed at large temporal scales struggle to capture the time-varying characteristics of these factors, which may cause substantial loss of critical driving-environment information for crash prediction. Crash prediction models with refined temporal data (hourly records) are developed to characterize the time-varying nature of these contributing factors. Unbalanced panel data mixed logit models are developed to analyze hourly crash likelihood of highway segments. The refined temporal driving environmental data, including road surface and traffic condition, obtained from the Road Weather Information System (RWIS), are incorporated into the models. Model estimation results indicate that traffic speed, traffic volume, curvature and the chemically wet road surface indicator are better modeled as random parameters. The estimation results of the mixed logit models based on unbalanced panel data show that a number of factors are related to crash likelihood on I-25. Specifically, the weekend indicator, November indicator, low speed limit and long remaining service life of rutting indicator are found to increase crash likelihood, while the 5-am indicator and the number of merging ramps per lane per mile are found to decrease crash likelihood. The study underscores and confirms the unique and significant impacts on crashes imposed by real-time weather, road surface, and traffic conditions. With the unbalanced panel data structure, the rich information from real-time driving environmental big data can be well incorporated. Copyright © 2018 National Safety Council and Elsevier Ltd. All rights reserved.
A Maximum Likelihood Approach to Functional Mapping of Longitudinal Binary Traits
Wang, Chenguang; Li, Hongying; Wang, Zhong; Wang, Yaqun; Wang, Ningtao; Wang, Zuoheng; Wu, Rongling
2013-01-01
Despite their importance in biology and biomedicine, genetic mapping of binary traits that change over time has not been well explored. In this article, we develop a statistical model for mapping quantitative trait loci (QTLs) that govern longitudinal responses of binary traits. The model is constructed within the maximum likelihood framework by which the association between binary responses is modeled in terms of conditional log odds-ratios. With this parameterization, the maximum likelihood estimates (MLEs) of marginal mean parameters are robust to the misspecification of time dependence. We implement an iterative procedure to obtain the MLEs of QTL genotype-specific parameters that define longitudinal binary responses. The usefulness of the model was validated by analyzing a real example in rice. Simulation studies were performed to investigate the statistical properties of the model, showing that the model has power to identify and map specific QTLs responsible for the temporal pattern of binary traits. PMID:23183762
NASA Astrophysics Data System (ADS)
Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith
2018-01-01
This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
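A minimal ABC-rejection sketch for parameter estimation of a cubic-stiffness (Duffing-type) oscillator, in the spirit of the examples listed above; the uniform priors, the tolerance, and the use of the raw response as the distance metric are illustrative assumptions, and the tolerance or number of draws may need tuning.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(8)

def simulate(k, k3, c=0.05, x0=1.0, t_eval=np.linspace(0, 10, 200)):
    """Free response of a cubic (Duffing-type) oscillator x'' + c x' + k x + k3 x^3 = 0."""
    def rhs(t, state):
        x, v = state
        return [v, -c * v - k * x - k3 * x**3]
    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), [x0, 0.0], t_eval=t_eval)
    return sol.y[0]

# Synthetic "measured" response from true parameters, plus observation noise
theta_true = (2.0, 0.5)
data = simulate(*theta_true) + rng.normal(scale=0.05, size=200)

# ABC rejection: sample from the priors, simulate, keep draws whose response
# is close to the data in root-mean-square distance
def abc_rejection(n_draws=3000, eps=0.15):
    accepted = []
    for _ in range(n_draws):
        k, k3 = rng.uniform(0.5, 5.0), rng.uniform(0.0, 2.0)   # uniform priors
        dist = np.sqrt(np.mean((simulate(k, k3) - data) ** 2))
        if dist < eps:
            accepted.append((k, k3))
    return np.array(accepted)

post = abc_rejection()
print(len(post), post.mean(axis=0) if len(post) else "no acceptances; relax eps")
```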
New prior sampling methods for nested sampling - Development and testing
NASA Astrophysics Data System (ADS)
Stokes, Barrie; Tuyl, Frank; Hudson, Irene
2017-06-01
Nested Sampling is a powerful algorithm for fitting models to data in the Bayesian setting, introduced by Skilling [1]. The nested sampling algorithm proceeds by carrying out a series of compressive steps, involving successively nested iso-likelihood boundaries, starting with the full prior distribution of the problem parameters. The "central problem" of nested sampling is to draw at each step a sample from the prior distribution whose likelihood is greater than the current likelihood threshold, i.e., a sample falling inside the current likelihood-restricted region. For both flat and informative priors this ultimately requires uniform sampling restricted to the likelihood-restricted region. We present two new methods of carrying out this sampling step, and illustrate their use with the lighthouse problem [2], a bivariate likelihood used by Gregory [3] and a trivariate Gaussian mixture likelihood. All the algorithm development and testing reported here has been done with Mathematica® [4].
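A minimal sketch of the nested sampling loop on a toy problem, a bivariate Gaussian likelihood with a flat unit-square prior, where plain rejection sampling from the prior suffices for the likelihood-restricted draws; it is exactly this replacement step that the proposed prior-sampling methods target in harder problems. The number of live points and iterations are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(9)
sigma = 0.05

def log_like(theta):
    """Bivariate Gaussian likelihood centred in the unit square (flat prior)."""
    return -0.5 * np.sum((theta - 0.5) ** 2) / sigma**2 - np.log(2 * np.pi * sigma**2)

def nested_sampling(n_live=100, n_iter=900):
    live = rng.random((n_live, 2))                 # draws from the flat prior
    live_ll = np.array([log_like(p) for p in live])
    log_z, log_x_prev = -np.inf, 0.0
    for i in range(1, n_iter + 1):
        worst = np.argmin(live_ll)
        log_x = -i / n_live                        # expected prior-mass shrinkage
        log_w = np.log(np.exp(log_x_prev) - np.exp(log_x))
        log_z = np.logaddexp(log_z, live_ll[worst] + log_w)
        log_x_prev = log_x
        # the "central problem": draw a new prior sample inside the current
        # likelihood-restricted region (plain rejection is adequate in this toy)
        while True:
            cand = rng.random(2)
            if log_like(cand) > live_ll[worst]:
                break
        live[worst], live_ll[worst] = cand, log_like(cand)
    # contribution of the remaining live points
    log_z = np.logaddexp(log_z, np.logaddexp.reduce(live_ll)
                         + log_x_prev - np.log(n_live))
    return log_z

# Analytic evidence: integral of the Gaussian over the unit square
log_z_exact = 2 * np.log(norm.cdf(0.5, 0, sigma) - norm.cdf(-0.5, 0, sigma))
print(nested_sampling(), log_z_exact)
```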
ERIC Educational Resources Information Center
Paek, Insu; Wilson, Mark
2011-01-01
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
ERIC Educational Resources Information Center
Klein, Andreas G.; Muthen, Bengt O.
2007-01-01
In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Counseling Pretreatment and the Elaboration Likelihood Model of Attitude Change.
ERIC Educational Resources Information Center
Heesacker, Martin
The importance of high levels of involvement in counseling has been related to theories of interpersonal influence. To examine differing effects of counselor credibility as a function of how personally involved counselors are, the Elaboration Likelihood Model (ELM) of attitude change was applied to counseling pretreatment. Students (N=256) were…
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth
ERIC Educational Resources Information Center
Jeon, Minjeong
2012-01-01
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
Evaluation of Smoking Prevention Television Messages Based on the Elaboration Likelihood Model
ERIC Educational Resources Information Center
Flynn, Brian S.; Worden, John K.; Bunn, Janice Yanushka; Connolly, Scott W.; Dorwaldt, Anne L.
2011-01-01
Progress in reducing youth smoking may depend on developing improved methods to communicate with higher risk youth. This study explored the potential of smoking prevention messages based on the Elaboration Likelihood Model (ELM) to address these needs. Structured evaluations of 12 smoking prevention messages based on three strategies derived from…
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
An Effective Palmprint Recognition Approach for Visible and Multispectral Sensor Images.
Gumaei, Abdu; Sammouda, Rachid; Al-Salman, Abdul Malik; Alsanad, Ahmed
2018-05-15
Among several palmprint feature extraction methods, the HOG-based method is attractive and performs well against changes in illumination and shadowing of palmprint images. However, it still lacks the robustness to extract palmprint features at different rotation angles. To solve this problem, this paper presents a hybrid feature extraction method, named HOG-SGF, that combines the histogram of oriented gradients (HOG) with a steerable Gaussian filter (SGF) to develop an effective palmprint recognition approach. The approach starts by processing all palmprint images by David Zhang's method to segment only the regions of interest. Next, palmprint features are extracted using the hybrid HOG-SGF feature extraction method. Then, an optimized auto-encoder (AE) is utilized to reduce the dimensionality of the extracted features. Finally, a fast and robust regularized extreme learning machine (RELM) is applied for the classification task. In the evaluation phase of the proposed approach, a number of experiments were conducted on three publicly available palmprint databases, namely MS-PolyU of multispectral palmprint images and CASIA and Tongji of contactless palmprint images. The experimental results reveal that the proposed approach outperforms the existing state-of-the-art approaches even when a small number of training samples are used.
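A generic sketch of the last two stages only, HOG descriptors followed by a ridge-regularized extreme learning machine, on random stand-in images; the steerable Gaussian filter, ROI segmentation, and auto-encoder stages are omitted, and every setting (cell sizes, hidden-layer width, regularization) is an illustrative assumption rather than the paper's configuration.

```python
import numpy as np
from skimage.feature import hog

rng = np.random.default_rng(10)

def hog_features(images):
    """HOG descriptors for a batch of grayscale palmprint ROIs (generic settings)."""
    return np.array([hog(img, orientations=9, pixels_per_cell=(16, 16),
                         cells_per_block=(2, 2)) for img in images])

class RELM:
    """Regularized extreme learning machine: random hidden layer + ridge solution."""
    def __init__(self, n_hidden=500, reg=1.0):
        self.n_hidden, self.reg = n_hidden, reg

    def fit(self, X, y):
        n_classes = y.max() + 1
        T = np.eye(n_classes)[y]                         # one-hot targets
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        self.beta = np.linalg.solve(H.T @ H + self.reg * np.eye(self.n_hidden),
                                    H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(np.tanh(X @ self.W + self.b) @ self.beta, axis=1)

# Toy stand-in data: random 128x128 "palmprint ROIs" for 3 subjects
images = rng.random((30, 128, 128))
labels = np.repeat(np.arange(3), 10)
X = hog_features(images)
clf = RELM().fit(X, labels)
print((clf.predict(X) == labels).mean())
```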
Stoermer, Kristina A.; Burrack, Adam; Oko, Lauren; Montgomery, Stephanie A.; Borst, Luke B.; Gill, Ronald G.; Morrison, Thomas E.
2012-01-01
Chikungunya virus (CHIKV) and Ross River virus (RRV) cause a debilitating, and often chronic, musculoskeletal inflammatory disease in humans. Macrophages constitute the major inflammatory infiltrates in musculoskeletal tissues during these infections. However, the precise macrophage effector functions that affect the pathogenesis of arthritogenic alphaviruses have not been defined. We hypothesized that the severe damage to musculoskeletal tissues observed in RRV- or CHIKV-infected mice would promote a wound healing response characterized by M2-like macrophages. Indeed, we found that RRV- and CHIKV-induced musculoskeletal inflammatory lesions, and macrophages present in these lesions, have a unique gene expression pattern characterized by high expression of arginase 1 and Ym1/Chi3l3 in the absence of FIZZ1/Relmα that is consistent with an M2-like activation phenotype. Strikingly, mice specifically deleted for Arg1 in neutrophils and macrophages had dramatically reduced viral loads and improved pathology in musculoskeletal tissues at late times post-RRV infection. These findings indicate that arthritogenic alphavirus infection drives a unique myeloid cell activation program in inflamed musculoskeletal tissues that inhibits virus clearance and impedes disease resolution in an Arg1-dependent manner. PMID:22972923
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Posada, David
2006-01-01
ModelTest server is a web-based application for the selection of models of nucleotide substitution using the program ModelTest. The server takes as input a text file with likelihood scores for the set of candidate models. Models can be selected with hierarchical likelihood ratio tests, or with the Akaike or Bayesian information criteria. The output includes several statistics for the assessment of model selection uncertainty, for model averaging or to estimate the relative importance of model parameters. The server can be accessed at . PMID:16845102
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
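A toy logistic-regression sketch of the basic construction, predicted debris-flow likelihood as a function of basin and storm attributes; the predictors, coefficients, and design-storm value are illustrative stand-ins, not the published USGS model or its geospatial inputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)

# Toy basin-level predictors standing in for the geospatial inputs:
# proportion of basin burned at moderate/high severity on steep slopes,
# a soil erodibility index, and peak 15-minute rainfall intensity (mm/h)
n = 500
burn_steep = rng.uniform(0, 1, n)
erodibility = rng.uniform(0.1, 0.6, n)
i15 = rng.uniform(5, 60, n)
logit_true = -4.0 + 3.0 * burn_steep + 4.0 * erodibility + 0.05 * i15
occurred = rng.random(n) < 1.0 / (1.0 + np.exp(-logit_true))

X = np.column_stack([burn_steep, erodibility, i15])
model = LogisticRegression().fit(X, occurred)

# Predicted debris-flow likelihood for one basin under a 40 mm/h design storm
print(model.predict_proba([[0.6, 0.35, 40.0]])[0, 1])
```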
Hey, Jody; Nielsen, Rasmus
2007-01-01
In 1988, Felsenstein described a framework for assessing the likelihood of a genetic data set in which all of the possible genealogical histories of the data are considered, each in proportion to their probability. Although not analytically solvable, several approaches, including Markov chain Monte Carlo methods, have been developed to find approximate solutions. Here, we describe an approach in which Markov chain Monte Carlo simulations are used to integrate over the space of genealogies, whereas other parameters are integrated out analytically. The result is an approximation to the full joint posterior density of the model parameters. For many purposes, this function can be treated as a likelihood, thereby permitting likelihood-based analyses, including likelihood ratio tests of nested models. Several examples, including an application to the divergence of chimpanzee subspecies, are provided. PMID:17301231
Explaining the effect of event valence on unrealistic optimism.
Gold, Ron S; Brown, Mark G
2009-05-01
People typically exhibit 'unrealistic optimism' (UO): they believe they have a lower chance of experiencing negative events and a higher chance of experiencing positive events than does the average person. UO has been found to be greater for negative than positive events. This 'valence effect' has been explained in terms of motivational processes. An alternative explanation is provided by the 'numerosity model', which views the valence effect simply as a by-product of a tendency for likelihood estimates pertaining to the average member of a group to increase with the size of the group. Predictions made by the numerosity model were tested in two studies. In each, UO for a single event was assessed. In Study 1 (n = 115 students), valence was manipulated by framing the event either negatively or positively, and participants estimated their own likelihood and that of the average student at their university. In Study 2 (n = 139 students), valence was again manipulated and participants again estimated their own likelihood; additionally, group size was manipulated by having participants estimate the likelihood of the average student in a small, medium-sized, or large group. In each study, the valence effect was found, but was due to an effect on estimates of own likelihood, not the average person's likelihood. In Study 2, valence did not interact with group size. The findings contradict the numerosity model, but are in accord with the motivational explanation. Implications for health education are discussed.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Harrell-Williams, Leigh; Wolfe, Edward W
2014-01-01
Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.
ERIC Educational Resources Information Center
Eaves, Michael
This paper provides a literature review of the elaboration likelihood model (ELM) as applied in persuasion. Specifically, the paper addresses distraction with regard to effects on persuasion. In addition, the application of proxemic violations as peripheral cues in message processing is discussed. Finally, the paper proposes to shed new light on…
ERIC Educational Resources Information Center
Andrews, Lester W.; Gutkin, Terry B.
1994-01-01
Investigates variables drawn from the Elaboration Likelihood Model (ELM) that might be manipulated to enhance the persuasiveness of a psychoeducational report. Results showed teachers in training were more persuaded by reports with high message quality. Findings are discussed in terms of the ELM and professional school psychology practice. (RJM)
ERIC Educational Resources Information Center
Heppner, Mary J.; And Others
1995-01-01
Intervention sought to improve first-year college students' attitudes about rape. Used the Elaboration Likelihood Model to examine men's and women's attitude change process. Found numerous sex differences in ways men and women experienced and changed during and after intervention. Women's attitude showed more lasting change while men's was more…
Likelihood Methods for Adaptive Filtering and Smoothing. Technical Report #455.
ERIC Educational Resources Information Center
Butler, Ronald W.
The dynamic linear model or Kalman filtering model provides a useful methodology for predicting the past, present, and future states of a dynamic system, such as an object in motion or an economic or social indicator that is changing systematically with time. Recursive likelihood methods for adaptive Kalman filtering and smoothing are developed.…
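A minimal sketch of the underlying machinery: a local-level Kalman filter whose prediction-error decomposition yields the log-likelihood, with the two variances then chosen by maximizing that likelihood, one simple sense in which the filtering is made adaptive. The model, data, and optimizer are illustrative assumptions, not the report's recursive likelihood methods.

```python
import numpy as np
from scipy.optimize import minimize

def kalman_loglik(params, y):
    """Log-likelihood of a local-level model via the prediction-error
    decomposition: y_t = x_t + eps_t, x_t = x_{t-1} + eta_t."""
    obs_var, state_var = np.exp(params)          # optimize on the log scale
    x, P, loglik = y[0], 1e4, 0.0                # diffuse-ish initialization
    for t in range(1, len(y)):
        P = P + state_var                        # predict
        F = P + obs_var                          # innovation variance
        v = y[t] - x                             # innovation
        loglik += -0.5 * (np.log(2 * np.pi * F) + v**2 / F)
        K = P / F                                # update
        x, P = x + K * v, (1 - K) * P
    return loglik

# Toy series: slowly drifting level observed with noise; "adaptive" filtering
# here means choosing the two variances by maximizing the likelihood
rng = np.random.default_rng(12)
level = np.cumsum(rng.normal(scale=0.1, size=300))
y = level + rng.normal(scale=0.5, size=300)

fit = minimize(lambda p: -kalman_loglik(p, y), x0=[0.0, 0.0], method="Nelder-Mead")
print(np.exp(fit.x))   # estimated observation and state variances
```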
ERIC Educational Resources Information Center
Han, Kyung T.; Guo, Fanmin
2014-01-01
The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
ERIC Educational Resources Information Center
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Baele, Guy; Lemey, Philippe; Vansteelandt, Stijn
2013-03-06
Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods
NASA Astrophysics Data System (ADS)
Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.
2014-12-01
Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probability for quantifying model uncertainty. In a general procedure, MCMC simulations are first conducted for each individual model, and the MCMC parameter samples are then used to approximate the marginal likelihood of the model by calculating the geometric mean of the joint likelihood of the model and its parameters. It has been found that this geometric-mean approach suffers from the numerical problem of a low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield an accurate estimate of the marginal likelihood. To resolve this problem, a thermodynamic method is used in which multiple MCMC runs are conducted with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with an analytical form of the marginal likelihood, the thermodynamic method yields a more accurate estimate than the method of using the geometric mean. This is also demonstrated for a case of groundwater modeling with consideration of four alternative models postulated based on different conceptualizations of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric method. The thermodynamic method is general, and can be used for a wide range of environmental problems for model uncertainty quantification.
Exclusion probabilities and likelihood ratios with applications to mixtures.
Slooten, Klaas-Jan; Egeland, Thore
2016-01-01
The statistical evidence obtained from mixed DNA profiles can be summarised in several ways in forensic casework, including the likelihood ratio (LR) and the Random Man Not Excluded (RMNE) probability. The literature has seen a discussion of the advantages and disadvantages of likelihood ratios and exclusion probabilities, and part of our aim is to bring some clarification to this debate. In a previous paper, we proved that there is a general mathematical relationship between these statistics: RMNE can be expressed as a certain average of the LR, implying that the expected value of the LR, when applied to an actual contributor to the mixture, is at least equal to the inverse of the RMNE. While the mentioned paper presented applications for kinship problems, the current paper demonstrates the relevance for mixture cases, and for this purpose, we prove some new general properties. We also demonstrate how to use the distribution of the likelihood ratio for donors of a mixture to obtain estimates for exceedance probabilities of the LR for non-donors, of which the RMNE is a special case corresponding to LR > 0. In order to derive these results, we need to view the likelihood ratio as a random variable. In this paper, we describe how such a randomization can be achieved. The RMNE is usually invoked only for mixtures without dropout. In mixtures, artefacts like dropout and drop-in are commonly encountered and we address this situation too, illustrating our results with a basic but widely implemented model, a so-called binary model. The precise definitions, modelling and interpretation of the required concepts of dropout and drop-in are not entirely obvious, and we attempt to clarify them here in a general likelihood framework for a binary model.
2015-08-01
Maximum Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model (ECBC-TN-068); Kyong H. Park and Steven J. Lagan, Research and Technology Directorate, August 2015; approved for public release.
Modeling Goal-Directed User Exploration in Human-Computer Interaction
2011-02-01
In addition to information scent, other factors including the layout position and grouping of options in the user interface also affect user exploration and the likelihood of success. This dissertation contributes a new model of goal-directed user exploration to better inform UI design.
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application to real experimental data is often hindered by slow computation of the likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with a non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves a substantial improvement in computational speed and is applicable to arbitrarily large numbers of mutants. In addition, it retains good accuracy in point estimation. Published by Elsevier Ltd.
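For orientation only, here is a minimal sketch of the conventional estimator that MLE-BD is compared against above: maximizing the Luria-Delbrück (Lea-Coulson) likelihood, with the mutant-count probabilities computed by the Ma-Sandri-Sarkar recursion. The paper's birth-death estimator is not reproduced, and the example counts are made up.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ld_pmf(m, nmax):
    """Luria-Delbruck (Lea-Coulson) mutant-count probabilities p_0..p_nmax
    for expected number of mutations m, via the Ma-Sandri-Sarkar recursion."""
    p = np.zeros(nmax + 1)
    p[0] = np.exp(-m)
    for r in range(1, nmax + 1):
        j = np.arange(r)
        p[r] = (m / r) * np.sum(p[j] / ((r - j) * (r - j + 1)))
    return p

def neg_log_lik(m, counts):
    p = ld_pmf(m, counts.max())
    return -np.sum(np.log(p[counts] + 1e-300))  # small guard against underflow

# Hypothetical mutant counts from parallel cultures (illustrative only).
counts = np.array([0, 1, 0, 3, 12, 2, 0, 7, 1, 45, 4, 0, 2, 9, 1])
res = minimize_scalar(neg_log_lik, args=(counts,), bounds=(1e-3, 20.0), method="bounded")
print("ML estimate of expected mutations per culture:", res.x)
```

The recursion is the computational bottleneck the abstract refers to: its cost grows with the largest observed mutant count, which is what the birth-death formulation is designed to avoid.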
NASA Astrophysics Data System (ADS)
Cheng, Qin-Bo; Chen, Xi; Xu, Chong-Yu; Reinhardt-Imjela, Christian; Schulte, Achim
2014-11-01
In this study, the likelihood functions for uncertainty analysis of hydrological models are compared and improved through the following steps: (1) the equivalence between the Nash-Sutcliffe Efficiency coefficient (NSE) and the likelihood function with Gaussian independent and identically distributed residuals is proved; (2) a new estimation method for the Box-Cox transformation (BC) parameter is developed to more effectively eliminate the heteroscedasticity of model residuals; and (3) three likelihood functions - NSE, Generalized Error Distribution with BC (BC-GED), and Skew Generalized Error Distribution with BC (BC-SGED) - are applied for SWAT-WB-VSA (Soil and Water Assessment Tool - Water Balance - Variable Source Area) model calibration in the Baocun watershed, Eastern China. Performances of the calibrated models are compared using the observed river discharges and groundwater levels. The results show that the minimum variance constraint can effectively estimate the BC parameter. The form of the likelihood function significantly affects the calibrated parameters and the simulated high- and low-flow components. SWAT-WB-VSA with the NSE approach simulates floods well but baseflow poorly, owing to the assumption of a Gaussian error distribution, which assigns low probability to large errors while treating small errors around zero as nearly equiprobable. By contrast, SWAT-WB-VSA with the BC-GED or BC-SGED approach reproduces baseflow well, which is confirmed by the groundwater level simulation. The assumption of skewness in the error distribution may be unnecessary, because all results of the BC-SGED approach are nearly the same as those of the BC-GED approach.
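The NSE-likelihood equivalence asserted in step (1) can be sketched as follows (notation assumed here: observations $y_t$, simulations $\hat{y}_t(\theta)$, $T$ time steps):

$$\mathrm{NSE}(\theta) = 1 - \frac{\sum_{t=1}^{T}\bigl(y_t - \hat{y}_t(\theta)\bigr)^2}{\sum_{t=1}^{T}\bigl(y_t - \bar{y}\bigr)^2}, \qquad \ell(\theta,\sigma^2) = -\frac{T}{2}\ln\bigl(2\pi\sigma^2\bigr) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\bigl(y_t - \hat{y}_t(\theta)\bigr)^2 .$$

Profiling out $\sigma^2$ gives $\hat{\sigma}^2 = \mathrm{SSE}(\theta)/T$ and $\ell_p(\theta) = -\tfrac{T}{2}\bigl[\ln\bigl(2\pi\,\mathrm{SSE}(\theta)/T\bigr)+1\bigr]$; both $\ell_p$ and NSE are decreasing functions of $\mathrm{SSE}(\theta)$, so they are maximized by the same parameter set, which is the sense in which the two objectives are equivalent.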
A unifying framework for marginalized random intercept models of correlated binary outcomes
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian M.
2013-01-01
We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data with exchangeable correlation structures. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized random intercept models for binary data. A diverse collection of didactic mathematical and numerical examples are given to illustrate concepts. PMID:25342871
Likelihood ratio decisions in memory: three implied regularities.
Glanzer, Murray; Hilford, Andrew; Maloney, Laurence T
2009-06-01
We analyze four general signal detection models for recognition memory that differ in their distributional assumptions. Our analyses show that a basic assumption of signal detection theory, the likelihood ratio decision axis, implies three regularities in recognition memory: (1) the mirror effect, (2) the variance effect, and (3) the z-ROC length effect. For each model, we present the equations that produce the three regularities and show, in computed examples, how they do so. We then show that the regularities appear in data from a range of recognition studies. The analyses and data in our study support the following generalization: Individuals make efficient recognition decisions on the basis of likelihood ratios.
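A minimal sketch of the decision rule underlying these analyses, assuming memory-strength densities $f_O$ for old (studied) items and $f_N$ for new items:

$$\lambda(x) = \frac{f_O(x)}{f_N(x)}, \qquad \text{respond "old" if and only if } \lambda(x) \ge \beta .$$

The three regularities described in the abstract then follow from properties of the distributions of $\log\lambda$ under $f_O$ and $f_N$ for each of the four distributional assumptions considered.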
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2012-01-01
This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Chen, Yong; Liu, Yulun; Ning, Jing; Cormier, Janice; Chu, Haitao
2014-01-01
Systematic reviews of diagnostic tests often involve a mixture of case-control and cohort studies. The standard methods for evaluating diagnostic accuracy only focus on sensitivity and specificity and ignore the information on disease prevalence contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as population averaged or overall positive and negative predictive values, which reflect the clinical utility of a diagnostic test. In this paper, we propose a hybrid approach that jointly models the disease prevalence along with the diagnostic test sensitivity and specificity in cohort studies, and the sensitivity and specificity in case-control studies. In order to overcome the potential computational difficulties in the standard full likelihood inference of the proposed hybrid model, we propose an alternative inference procedure based on the composite likelihood. Such composite likelihood based inference does not suffer computational problems and maintains high relative efficiency. In addition, it is more robust to model mis-specifications compared to the standard full likelihood inference. We apply our approach to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma. PMID:25897179
Equivalence of binormal likelihood-ratio and bi-chi-squared ROC curve models
Hillis, Stephen L.
2015-01-01
A basic assumption for a meaningful diagnostic decision variable is that there is a monotone relationship between it and its likelihood ratio. This relationship, however, generally does not hold for a decision variable that results in a binormal ROC curve. As a result, receiver operating characteristic (ROC) curve estimation based on the assumption of a binormal ROC-curve model produces improper ROC curves that have “hooks,” are not concave over the entire domain, and cross the chance line. Although in practice this “improperness” is usually not noticeable, sometimes it is evident and problematic. To avoid this problem, Metz and Pan proposed basing ROC-curve estimation on the assumption of a binormal likelihood-ratio (binormal-LR) model, which states that the decision variable is an increasing transformation of the likelihood-ratio function of a random variable having normal conditional diseased and nondiseased distributions. However, their development is not easy to follow. I show that the binormal-LR model is equivalent to a bi-chi-squared model in the sense that the families of corresponding ROC curves are the same. The bi-chi-squared formulation provides an easier-to-follow development of the binormal-LR ROC curve and its properties in terms of well-known distributions. PMID:26608405
Williamson, Ross S.; Sahani, Maneesh; Pillow, Jonathan W.
2015-01-01
Stimulus dimensionality-reduction methods in neuroscience seek to identify a low-dimensional space of stimulus features that affect a neuron’s probability of spiking. One popular method, known as maximally informative dimensions (MID), uses an information-theoretic quantity known as “single-spike information” to identify this space. Here we examine MID from a model-based perspective. We show that MID is a maximum-likelihood estimator for the parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical single-spike information corresponds to the normalized log-likelihood under a Poisson model. This equivalence implies that MID does not necessarily find maximally informative stimulus dimensions when spiking is not well described as Poisson. We provide several examples to illustrate this shortcoming, and derive a lower bound on the information lost when spiking is Bernoulli in discrete time bins. To overcome this limitation, we introduce model-based dimensionality reduction methods for neurons with non-Poisson firing statistics, and show that they can be framed equivalently in likelihood-based or information-theoretic terms. Finally, we show how to overcome practical limitations on the number of stimulus dimensions that MID can estimate by constraining the form of the non-parametric nonlinearity in an LNP model. We illustrate these methods with simulations and data from primate visual cortex. PMID:25831448
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
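The kind of approximation described above can be sketched as a second-order Taylor expansion of the log-likelihood around a reference point $\hat{t}$ (for example, the maximum likelihood branch lengths), evaluated once and then reused during MCMC; the transforms mentioned in the abstract are applied to $t$ before expanding:

$$\ell(t) \approx \ell(\hat{t}) + g^{\mathsf{T}}(t - \hat{t}) + \tfrac{1}{2}(t - \hat{t})^{\mathsf{T}} H\,(t - \hat{t}),$$

where $g$ and $H$ are the gradient and Hessian of $\ell$ at $\hat{t}$. The approximation is cheap to evaluate but degrades far from $\hat{t}$, which is the failure mode the authors report when the global clock is seriously violated.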
ELASTIC NET FOR COX'S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM.
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox's proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox's proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems.
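For concreteness, the objective being traced along the solution path can be sketched as follows (in one common parameterization, which may differ in details from the paper's):

$$\max_{\beta}\; \ell_p(\beta) - \lambda\Bigl(\alpha \lVert\beta\rVert_1 + \tfrac{1-\alpha}{2}\lVert\beta\rVert_2^2\Bigr), \qquad \ell_p(\beta) = \sum_{i:\,\delta_i = 1}\Bigl[x_i^{\mathsf{T}}\beta - \ln \sum_{j \in R(t_i)}\exp\bigl(x_j^{\mathsf{T}}\beta\bigr)\Bigr],$$

where $\ell_p$ is Cox's log partial likelihood, $\delta_i$ the event indicator, $R(t_i)$ the risk set at time $t_i$, and the small fixed ridge term of the first step corresponds to an $\ell_2$ component with small weight.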
Evaluation of risk from acts of terrorism :the adversary/defender model using belief and fuzzy sets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Darby, John L.
Risk from an act of terrorism is a combination of the likelihood of an attack, the likelihood of success of the attack, and the consequences of the attack. The considerable epistemic uncertainty in each of these three factors can be addressed using the belief/plausibility measure of uncertainty from the Dempster/Shafer theory of evidence. The adversary determines the likelihood of the attack. The success of the attack and the consequences of the attack are determined by the security system and mitigation measures put in place by the defender. This report documents a process for evaluating risk of terrorist acts using an adversary/defender model with belief/plausibility as the measure of uncertainty. Also, the adversary model is a linguistic model that applies belief/plausibility to fuzzy sets used in an approximate reasoning rule base.
Pseudomonas aeruginosa dose response and bathing water infection.
Roser, D J; van den Akker, B; Boase, S; Haas, C N; Ashbolt, N J; Rice, S A
2014-03-01
Pseudomonas aeruginosa is the opportunistic pathogen mostly implicated in folliculitis and acute otitis externa in pools and hot tubs. Nevertheless, infection risks remain poorly quantified. This paper reviews disease aetiologies and bacterial skin colonization science to advance dose-response theory development. Three model forms are identified for predicting disease likelihood from pathogen density. Two are based on Furumoto & Mickey's exponential 'single-hit' model and predict infection likelihood and severity (lesions/m2), respectively. 'Third-generation', mechanistic, dose-response algorithm development is additionally scoped. The proposed formulation integrates dispersion, epidermal interaction, and follicle invasion. The review also details uncertainties needing consideration which pertain to water quality, outbreaks, exposure time, infection sites, biofilms, cerumen, environmental factors (e.g. skin saturation, hydrodynamics), and whether P. aeruginosa is endogenous or exogenous. The review's findings are used to propose a conceptual infection model and identify research priorities including pool dose-response modelling, epidermis ecology and infection likelihood-based hygiene management.
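The exponential "single-hit" form referred to above is usually written as follows (a sketch; the paper's lesion-severity variant is not reproduced here):

$$P_{\mathrm{inf}}(d) = 1 - e^{-k d},$$

where $d$ is the dose (for example, organisms reaching the skin) and $k$ is the probability that a single organism initiates infection. The model assumes independent action of organisms, so the probability of at least one "hit" follows from the Poisson distribution.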
Cosmological parameter estimation using Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Prasad, J.; Souradeep, T.
2014-03-01
Constraining the parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models that demand a greater number of cosmological parameters than the standard model of cosmology uses, making the problem of parameter estimation challenging. It is common practice to employ the Bayesian formalism for parameter estimation, for which, in general, the likelihood surface is probed. For the standard cosmological model with six parameters, the likelihood surface is quite smooth and has no local maxima, and sampling-based methods such as Markov Chain Monte Carlo (MCMC) are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we demonstrate the application of another method, inspired by artificial intelligence, called Particle Swarm Optimization (PSO), for estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken by the WMAP satellite.
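A minimal sketch of the PSO idea applied to likelihood maximization, with a stand-in quadratic objective in place of a real CMB likelihood; the parameter names, bounds, swarm size, and inertia/acceleration coefficients below are illustrative assumptions, not the settings of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def neg_log_like(theta):
    # Stand-in objective: replace with the (negative) log-likelihood of interest.
    return np.sum((theta - np.array([0.3, 0.8])) ** 2 / np.array([0.01, 0.04]))

ndim, nparticles, niter = 2, 30, 200
w, c1, c2 = 0.72, 1.2, 1.2                        # inertia and acceleration coefficients
lo, hi = np.array([0.0, 0.0]), np.array([1.0, 2.0])

x = rng.uniform(lo, hi, size=(nparticles, ndim))  # particle positions
v = np.zeros_like(x)                              # particle velocities
pbest, pbest_f = x.copy(), np.array([neg_log_like(p) for p in x])
gbest = pbest[np.argmin(pbest_f)].copy()

for _ in range(niter):
    r1, r2 = rng.random((nparticles, ndim)), rng.random((nparticles, ndim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = np.clip(x + v, lo, hi)
    f = np.array([neg_log_like(p) for p in x])
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    gbest = pbest[np.argmin(pbest_f)].copy()      # swarm-wide best position so far

print("PSO estimate:", gbest)
```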
Negotiating Multicollinearity with Spike-and-Slab Priors.
Ročková, Veronika; George, Edward I
2014-08-01
In multiple regression under the normal linear model, the presence of multicollinearity is well known to lead to unreliable and unstable maximum likelihood estimates. This can be particularly troublesome for the problem of variable selection where it becomes more difficult to distinguish between subset models. Here we show how adding a spike-and-slab prior mitigates this difficulty by filtering the likelihood surface into a posterior distribution that allocates the relevant likelihood information to each of the subset model modes. For identification of promising high posterior models in this setting, we consider three EM algorithms, the fast closed form EMVS version of Rockova and George (2014) and two new versions designed for variants of the spike-and-slab formulation. For a multimodal posterior under multicollinearity, we compare the regions of convergence of these three algorithms. Deterministic annealing versions of the EMVS algorithm are seen to substantially mitigate this multimodality. A single simple running example is used for illustration throughout.
Dixon, Helen G; Warne, Charles D; Scully, Maree L; Wakefield, Melanie A; Dobbinson, Suzanne J
2011-04-01
Content analysis data on the tans of 4,422 female Caucasian models sampled from spring and summer magazine issues were combined with readership data to generate indices of potential exposure to social modeling of tanning via popular women's magazines over a 15-year period (1987 to 2002). Associations between these indices and cross-sectional telephone survey data from the same period on 5,675 female teenagers' and adults' tanning attitudes, beliefs, and behavior were examined using logistic regression models. Among young women, greater exposure to tanning in young women's magazines was associated with increased likelihood of endorsing pro-tan attitudes and beliefs. Among women of all ages, greater exposure to tanned models via the most popular women's magazines was associated with increased likelihood of attempting to get a tan but lower likelihood of endorsing pro-tan attitudes. Popular women's magazines may promote and reflect real women's tanning beliefs and behavior.
Maximum likelihood convolutional decoding (MCD) performance due to system losses
NASA Technical Reports Server (NTRS)
Webster, L.
1976-01-01
A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
1990-11-01
$(Q + aa')^{-1} = Q^{-1} - \dfrac{Q^{-1}aa'Q^{-1}}{1 + a'Q^{-1}a}$. This is a simple case of a general formula called Woodbury's formula by some authors; see, for example, Phadke and... [Table-of-contents fragments list sections on the first-order moving average model and on some approaches to the iterative evaluation of] ...the approximate likelihood function in some time series models. Useful suggestions have been the Cholesky decomposition of the covariance matrix and...
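A quick numerical check of the rank-one identity reconstructed above, with an arbitrary positive-definite Q and vector a (illustrative values only):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
Q = rng.normal(size=(n, n))
Q = Q @ Q.T + n * np.eye(n)      # a positive-definite stand-in for a covariance matrix
a = rng.normal(size=(n, 1))

Qinv = np.linalg.inv(Q)
lhs = np.linalg.inv(Q + a @ a.T)
rhs = Qinv - (Qinv @ a @ a.T @ Qinv) / (1.0 + (a.T @ Qinv @ a).item())
print(np.allclose(lhs, rhs))     # True: the Sherman-Morrison special case of Woodbury
```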
NASA Technical Reports Server (NTRS)
Cash, W.
1979-01-01
Many problems in the experimental estimation of parameters for models can be solved through use of the likelihood ratio test. Applications of the likelihood ratio, with particular attention to photon counting experiments, are discussed. The procedures presented solve a greater range of problems than those currently in use, yet are no more difficult to apply. The procedures are proved analytically, and examples from current problems in astronomy are discussed.
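For photon-counting data the likelihood-ratio machinery is typically built on the Poisson log-likelihood; a sketch of the statistic involved (often quoted, up to model-independent terms, as the C statistic):

$$\ln L(\theta) = \sum_i \bigl[N_i \ln E_i(\theta) - E_i(\theta) - \ln N_i!\bigr], \qquad C(\theta) = 2\sum_i \bigl[E_i(\theta) - N_i \ln E_i(\theta)\bigr],$$

where $N_i$ are observed counts in bin $i$ and $E_i(\theta)$ the model expectation. Differences of $C$ between nested models play the role of the likelihood-ratio statistic and are approximately $\chi^2$ distributed in the large-count limit.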
On the Existence and Uniqueness of JML Estimates for the Partial Credit Model
ERIC Educational Resources Information Center
Bertoli-Barsotti, Lucio
2005-01-01
A necessary and sufficient condition is given in this paper for the existence and uniqueness of the maximum likelihood (the so-called joint maximum likelihood) estimate of the parameters of the Partial Credit Model. This condition is stated in terms of a structural property of the pattern of the data matrix that can be easily verified on the basis…
ERIC Educational Resources Information Center
Lee, Woong-Kyu
2012-01-01
The principal objective of this study was to gain insight into attitude changes occurring during IT acceptance from the perspective of elaboration likelihood model (ELM). In particular, the primary target of this study was the process of IT acceptance through an education program. Although the Internet and computers are now quite ubiquitous, and…
Zarnetske, Phoebe L.; Edwards, Thomas C., Jr.; Moisen, Gretchen G.
2007-01-01
Estimating species likelihood of occurrence across extensive landscapes is a powerful management tool. Unfortunately, available occurrence data for landscape-scale modeling is often lacking and usually only in the form of observed presences. Ecologically based pseudo-absence points were generated from within habitat envelopes to accompany presence-only data in habitat...
ERIC Educational Resources Information Center
Andersen, Erling B.
A computer program for solving the conditional likelihood equations arising in the Rasch model for questionnaires is described. The estimation method and the computational problems involved are described in a previous research report by Andersen, but a summary of those results is given in two sections of this paper. A working example is also…
Moral Identity Predicts Doping Likelihood via Moral Disengagement and Anticipated Guilt.
Kavussanu, Maria; Ring, Christopher
2017-08-01
In this study, we integrated elements of social cognitive theory of moral thought and action and the social cognitive model of moral identity to better understand doping likelihood in athletes. Participants (N = 398) recruited from a variety of team sports completed measures of moral identity, moral disengagement, anticipated guilt, and doping likelihood. Moral identity predicted doping likelihood indirectly via moral disengagement and anticipated guilt. Anticipated guilt about potential doping mediated the relationship between moral disengagement and doping likelihood. Our findings provide novel evidence to suggest that athletes, who feel that being a moral person is central to their self-concept, are less likely to use banned substances due to their lower tendency to morally disengage and the more intense feelings of guilt they expect to experience for using banned substances.
NASA Astrophysics Data System (ADS)
De Santis, Alberto; Dellepiane, Umberto; Lucidi, Stefano
2012-11-01
In this paper we investigate the estimation problem for a model of commodity prices. The model is a stochastic state-space dynamical model, and the problem unknowns are the state variables and the system parameters. The data consist of commodity spot prices; time series of futures contracts are very seldom freely available. Both the joint likelihood function of the system (state variables and parameters) and the marginal likelihood function (with the state variables eliminated) are addressed.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments
Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J
2014-01-01
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
2013-01-01
Background Falls among the elderly are a major public health concern. Therefore, a modeling technique that could better estimate fall probability is both timely and needed. Using biomedical, pharmacological and demographic variables as predictors, latent class analysis (LCA) is demonstrated as a tool for the prediction of falls among community-dwelling elderly. Methods Using a retrospective data set, a two-step LCA modeling approach was employed. First, we looked for the optimal number of latent classes for the seven medical indicators, along with the patients' prescription medications and three covariates (age, gender, and number of medications). Second, the appropriate latent class structure, with the covariates, was modeled on the distal outcome (fall/no fall). The default estimator was maximum likelihood with robust standard errors. The Pearson chi-square, likelihood ratio chi-square, BIC, Lo-Mendell-Rubin Adjusted Likelihood Ratio test and the bootstrap likelihood ratio test were used for model comparisons. Results A review of the model fit indices with covariates shows that a six-class solution was preferred. The predictive probability for latent classes ranged from 84% to 97%. Entropy, a measure of classification accuracy, was good at 90%. Specific prescription medications were found to strongly influence group membership. Conclusions The LCA method was effective at finding relevant subgroups within a heterogeneous population at risk of falling. This study demonstrated that LCA offers researchers a valuable tool for modeling medical data. PMID:23705639
On the occurrence of false positives in tests of migration under an isolation with migration model
Hey, Jody; Chung, Yujin; Sethuraman, Arun
2015-01-01
The population genetic study of divergence is often done using a Bayesian genealogy sampler, like those implemented in IMa2 and related programs, and these analyses frequently include a likelihood-ratio test of the null hypothesis of no migration between populations. Cruickshank and Hahn (2014, Molecular Ecology, 23, 3133–3157) recently reported a high rate of false positive test results with IMa2 for data simulated with small numbers of loci under models with no migration and recent splitting times. We confirm these findings and discover that they are caused by a failure of the assumptions underlying likelihood ratio tests that arises when using marginal likelihoods for a subset of model parameters. We also show that for small data sets, with little divergence between samples from two populations, an excellent fit can often be found by a model with a low migration rate and recent splitting time and a model with a high migration rate and a deep splitting time. PMID:26456794
Withers, Giselle F; Wertheim, Eleanor H
2004-01-01
This study applied principles from the Elaboration Likelihood Model of Persuasion to the prevention of disordered eating. Early adolescent girls watched either a preventive videotape only (n=114) or video plus post-video activity (verbal discussion, written exercises, or control discussion) (n=187); or had no intervention (n=104). Significantly more body image and knowledge improvements occurred at post video and follow-up in the intervention groups compared to no intervention. There were no outcome differences among intervention groups, or between girls with high or low elaboration likelihood. Further research is needed in integrating the videotape into a broader prevention package.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
A new model to predict weak-lensing peak counts. II. Parameter constraint strategies
NASA Astrophysics Data System (ADS)
Lin, Chieh-An; Kilbinger, Martin
2015-11-01
Context. Peak counts have been shown to be an excellent tool for extracting the non-Gaussian part of the weak lensing signal. Recently, we developed a fast stochastic forward model to predict weak-lensing peak counts. Our model is able to reconstruct the underlying distribution of observables for analysis. Aims: In this work, we explore and compare various strategies for constraining a parameter using our model, focusing on the matter density Ωm and the density fluctuation amplitude σ8. Methods: First, we examine the impact from the cosmological dependency of covariances (CDC). Second, we perform the analysis with the copula likelihood, a technique that makes a weaker assumption than does the Gaussian likelihood. Third, direct, non-analytic parameter estimations are applied using the full information of the distribution. Fourth, we obtain constraints with approximate Bayesian computation (ABC), an efficient, robust, and likelihood-free algorithm based on accept-reject sampling. Results: We find that neglecting the CDC effect enlarges parameter contours by 22% and that the covariance-varying copula likelihood is a very good approximation to the true likelihood. The direct techniques work well in spite of noisier contours. Concerning ABC, the iterative process converges quickly to a posterior distribution that is in excellent agreement with results from our other analyses. The time cost for ABC is reduced by two orders of magnitude. Conclusions: The stochastic nature of our weak-lensing peak count model allows us to use various techniques that approach the true underlying probability distribution of observables, without making simplifying assumptions. Our work can be generalized to other observables where forward simulations provide samples of the underlying distribution.
NASA Astrophysics Data System (ADS)
Feeney, Stephen M.; Mortlock, Daniel J.; Dalmasso, Niccolò
2018-05-01
Estimates of the Hubble constant, H0, from the local distance ladder and from the cosmic microwave background (CMB) are discrepant at the ˜3σ level, indicating a potential issue with the standard Λ cold dark matter (ΛCDM) cosmology. A probabilistic (i.e. Bayesian) interpretation of this tension requires a model comparison calculation, which in turn depends strongly on the tails of the H0 likelihoods. Evaluating the tails of the local H0 likelihood requires the use of non-Gaussian distributions to faithfully represent anchor likelihoods and outliers, and simultaneous fitting of the complete distance-ladder data set to ensure correct uncertainty propagation. We have hence developed a Bayesian hierarchical model of the full distance ladder that does not rely on Gaussian distributions and allows outliers to be modelled without arbitrary data cuts. Marginalizing over the full ˜3000-parameter joint posterior distribution, we find H0 = (72.72 ± 1.67) km s-1 Mpc-1 when applied to the outlier-cleaned Riess et al. data, and (73.15 ± 1.78) km s-1 Mpc-1 with supernova outliers reintroduced (the pre-cut Cepheid data set is not available). Using our precise evaluation of the tails of the H0 likelihood, we apply Bayesian model comparison to assess the evidence for deviation from ΛCDM given the distance-ladder and CMB data. The odds against ΛCDM are at worst ˜10:1 when considering the Planck 2015 XIII data, regardless of outlier treatment, considerably less dramatic than naïvely implied by the 2.8σ discrepancy. These odds become ˜60:1 when an approximation to the more-discrepant Planck Intermediate XLVI likelihood is included.
Statistical methods for the beta-binomial model in teratology.
Yamamoto, E; Yanagimoto, T
1994-01-01
The beta-binomial model is widely used for analyzing teratological data involving littermates. Recent developments in statistical analyses of teratological data are briefly reviewed with emphasis on the model. For statistical inference on the parameters of the beta-binomial distribution, separation of the likelihood yields a likelihood-based inference. This reduces the biases of estimators and also improves the accuracy of the empirical significance levels of tests. Separate inference on the parameters can be conducted in a unified way. PMID:8187716
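For reference, the distribution in question can be sketched in the standard $(\alpha,\beta)$ parameterization (littermate applications usually re-express it via the mean $p=\alpha/(\alpha+\beta)$ and an overdispersion parameter):

$$P(Y = y \mid n, \alpha, \beta) = \binom{n}{y}\frac{B(y+\alpha,\, n-y+\beta)}{B(\alpha, \beta)}, \qquad y = 0, 1, \dots, n,$$

where $B(\cdot,\cdot)$ is the beta function and $n$ is the litter size. The "separation of the likelihood" mentioned above presumably refers to factoring the likelihood so that the mean and the overdispersion can be handled separately.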
Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement.
McCoy, Airlie J; Oeffner, Robert D; Millán, Claudia; Sammito, Massimo; Usón, Isabel; Read, Randy J
2018-04-01
Descriptions are given of the maximum-likelihood gyre method implemented in Phaser for optimizing the orientation and relative position of rigid-body fragments of a model after the orientation of the model has been identified, but before the model has been positioned in the unit cell, and also the related gimble method for the refinement of rigid-body fragments of the model after positioning. Gyre refinement helps to lower the root-mean-square atomic displacements between model and target molecular-replacement solutions for the test case of antibody Fab(26-10) and improves structure solution with ARCIMBOLDO_SHREDDER.
Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano
2015-01-01
We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
He, Ye; Lin, Huazhen; Tu, Dongsheng
2018-06-04
In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer. Copyright © 2018 John Wiley & Sons, Ltd.
Donato, David I.
2012-01-01
This report presents the mathematical expressions and the computational techniques required to compute maximum-likelihood estimates for the parameters of the National Descriptive Model of Mercury in Fish (NDMMF), a statistical model used to predict the concentration of methylmercury in fish tissue. The expressions and techniques reported here were prepared to support the development of custom software capable of computing NDMMF parameter estimates more quickly and using less computer memory than is currently possible with available general-purpose statistical software. Computation of maximum-likelihood estimates for the NDMMF by numerical solution of a system of simultaneous equations through repeated Newton-Raphson iterations is described. This report explains the derivation of the mathematical expressions required for computational parameter estimation in sufficient detail to facilitate future derivations for any revised versions of the NDMMF that may be developed.
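The NDMMF-specific equations are not reproduced in this summary, so the sketch below only illustrates the general Newton-Raphson pattern the report describes, on a toy Poisson regression with made-up data: repeatedly update the parameters using the score vector and the Hessian of the log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: Poisson regression with log mean X @ beta (a stand-in for the NDMMF likelihood).
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([0.5, -0.3, 0.8])
y = rng.poisson(np.exp(X @ beta_true))

beta = np.zeros(p)
for _ in range(25):
    mu = np.exp(X @ beta)
    score = X.T @ (y - mu)                 # gradient of the log-likelihood
    hess = -(X * mu[:, None]).T @ X        # Hessian of the log-likelihood
    step = np.linalg.solve(hess, score)
    beta = beta - step                     # Newton-Raphson update
    if np.max(np.abs(step)) < 1e-10:
        break

print(beta)
```

Each iteration solves a linear system rather than inverting the Hessian explicitly, which is also how a memory-lean custom implementation would typically proceed.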
Cramer-Rao Bound, MUSIC, and Maximum Likelihood. Effects of Temporal Phase Difference
1990-11-01
Technical Report 1373, November 1990: Cramer-Rao Bound, MUSIC, and Maximum Likelihood: Effects of Temporal Phase Difference, by C. V. Tran (approved for public release). Only fragments of the abstract survived extraction: the report compares the Cramer-Rao bound with the MUSIC and Maximum Likelihood (ML) asymptotic variances for two-source direction-of-arrival estimation, where the sources were modeled as... Figure-list fragments refer to MUSIC for two equipowered signals impinging on a 5-element uniform linear array (ULA) at |p| = 1.00 and |p| = 0.50, SNR = 20 dB.
Population Synthesis of Radio and Gamma-ray Pulsars using the Maximum Likelihood Approach
NASA Astrophysics Data System (ADS)
Billman, Caleb; Gonthier, P. L.; Harding, A. K.
2012-01-01
We present the results of a pulsar population synthesis of normal pulsars from the Galactic disk using a maximum likelihood method. We seek to maximize the likelihood of a set of parameters in a Monte Carlo population statistics code to better understand their uncertainties and the confidence region of the model's parameter space. The maximum likelihood method allows for the use of more applicable Poisson statistics in the comparison of distributions of small numbers of detected gamma-ray and radio pulsars. Our code simulates pulsars at birth using Monte Carlo techniques and evolves them to the present assuming initial spatial, kick velocity, magnetic field, and period distributions. Pulsars are spun down to the present and given radio and gamma-ray emission characteristics. We select measured distributions of radio pulsars from the Parkes Multibeam survey and Fermi gamma-ray pulsars to perform a likelihood analysis of the assumed model parameters such as initial period and magnetic field, and radio luminosity. We present the results of a grid search of the parameter space as well as a search for the maximum likelihood using a Markov Chain Monte Carlo method. We express our gratitude for the generous support of the Michigan Space Grant Consortium, of the National Science Foundation (REU and RUI), the NASA Astrophysics Theory and Fundamental Program and the NASA Fermi Guest Investigator Program.
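The Poisson comparison referred to above can be sketched as follows, assuming binned detected-pulsar counts $n_i$ and model-predicted counts $\lambda_i(\theta)$ from the synthesis code:

$$\ln L(\theta) = \sum_i \bigl[n_i \ln \lambda_i(\theta) - \lambda_i(\theta) - \ln n_i!\bigr],$$

which remains well defined for bins with small or zero counts, where a Gaussian (chi-square) comparison would not be appropriate; the grid and Markov Chain Monte Carlo searches then explore $\ln L$ over the model parameters.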
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate different sources of uncertainty, multimodel analyses such as model combination, model selection, model elimination or model discrimination are becoming more popular. To illustrate the theoretical and practical challenges of multimodel analysis, we use an example from microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions. Thus, improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomenon: soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid savanna characterized by pulsed precipitation. These five models evolve step-wise such that the first model includes none of the three mechanisms, while the fifth model includes all three. The basic component of Bayesian multimodel analysis is the estimation of the marginal likelihood to rank the candidate models based on their overall likelihood with respect to the observation data. The first part of the study focuses on using this Bayesian scheme to discriminate between the five candidate models. The second part discusses some theoretical and practical challenges, mainly the effects of the choice of likelihood function and of the marginal likelihood estimation method on both model ranking and Bayesian model averaging. The study shows that making valid inference from scientific data is not a trivial task, since we are uncertain not only about the candidate scientific models, but also about the statistical methods used to discriminate between them.
Efficient simulation and likelihood methods for non-neutral multi-allele models.
Joyce, Paul; Genz, Alan; Buzbas, Erkan Ozge
2012-06-01
Throughout the 1980s, Simon Tavaré made numerous significant contributions to population genetics theory. As genetic data, in particular DNA sequence, became more readily available, a need to connect population-genetic models to data became the central issue. The seminal work of Griffiths and Tavaré (1994a, 1994b, 1994c) was among the first to develop a likelihood method to estimate the population-genetic parameters using full DNA sequences. Now, we are in the genomics era where methods need to scale up to handle massive data sets, and Tavaré has led the way to new approaches. However, performing statistical inference under non-neutral models has proved elusive. In tribute to Simon Tavaré, we present an article in the spirit of his work that provides a computationally tractable method for simulating and analyzing data under a class of non-neutral population-genetic models. Computational methods for approximating likelihood functions and generating samples under a class of allele-frequency based non-neutral parent-independent mutation models were proposed by Donnelly, Nordborg, and Joyce (DNJ) (Donnelly et al., 2001). DNJ (2001) simulated samples of allele frequencies from non-neutral models using neutral models as the auxiliary distribution in a rejection algorithm. However, patterns of allele frequencies produced by neutral models are dissimilar to patterns of allele frequencies produced by non-neutral models, making the rejection method inefficient. For example, in some cases the methods in DNJ (2001) require 10^9 rejections before a sample from the non-neutral model is accepted. Our method simulates samples directly from the distribution of non-neutral models, making simulation methods a practical tool to study the behavior of the likelihood and to perform inference on the strength of selection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pražnikar, Jure (University of Primorska); Turk, Dušan, E-mail: dusan.turk@ijs.si
2014-12-01
The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. They utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R_free or may leave it out completely.
Brown, Joshua W.
2009-01-01
The error likelihood computational model of anterior cingulate cortex (ACC) (Brown & Braver, 2005) has successfully predicted error likelihood effects, risk prediction effects, and how individual differences in conflict and error likelihood effects vary with trait differences in risk aversion. The same computational model now makes a further prediction that apparent conflict effects in ACC may result in part from an increasing number of simultaneously active responses, regardless of whether or not the cued responses are mutually incompatible. In Experiment 1, the model prediction was tested with a modification of the Eriksen flanker task, in which some task conditions require two otherwise mutually incompatible responses to be generated simultaneously. In that case, the two response processes are no longer in conflict with each other. The results showed small but significant medial PFC effects in the incongruent vs. congruent contrast, despite the absence of response conflict, consistent with model predictions. This is the multiple response effect. Nonetheless, actual response conflict led to greater ACC activation, suggesting that conflict effects are specific to particular task contexts. In Experiment 2, results from a change signal task suggested that the context dependence of conflict signals does not depend on error likelihood effects. Instead, inputs to ACC may reflect complex and task specific representations of motor acts, such as bimanual responses. Overall, the results suggest the existence of a richer set of motor signals monitored by medial PFC and are consistent with distinct effects of multiple responses, conflict, and error likelihood in medial PFC. PMID:19375509
Estimation of parameters of dose volume models and their confidence limits
NASA Astrophysics Data System (ADS)
van Luijk, P.; Delvigne, T. C.; Schilstra, C.; Schippers, J. M.
2003-07-01
Predictions of the normal-tissue complication probability (NTCP) for the ranking of treatment plans are based on fits of dose-volume models to clinical and/or experimental data. In the literature several different fit methods are used. In this work frequently used methods and techniques to fit NTCP models to dose response data for establishing dose-volume effects, are discussed. The techniques are tested for their usability with dose-volume data and NTCP models. Different methods to estimate the confidence intervals of the model parameters are part of this study. From a critical-volume (CV) model with biologically realistic parameters a primary dataset was generated, serving as the reference for this study and describable by the NTCP model. The CV model was fitted to this dataset. From the resulting parameters and the CV model, 1000 secondary datasets were generated by Monte Carlo simulation. All secondary datasets were fitted to obtain 1000 parameter sets of the CV model. Thus the 'real' spread in fit results due to statistical spreading in the data is obtained and has been compared with estimates of the confidence intervals obtained by different methods applied to the primary dataset. The confidence limits of the parameters of one dataset were estimated using the methods, employing the covariance matrix, the jackknife method and directly from the likelihood landscape. These results were compared with the spread of the parameters, obtained from the secondary parameter sets. For the estimation of confidence intervals on NTCP predictions, three methods were tested. Firstly, propagation of errors using the covariance matrix was used. Secondly, the meaning of the width of a bundle of curves that resulted from parameters that were within the one standard deviation region in the likelihood space was investigated. Thirdly, many parameter sets and their likelihood were used to create a likelihood-weighted probability distribution of the NTCP. It is concluded that for the type of dose response data used here, only a full likelihood analysis will produce reliable results. The often-used approximations, such as the usage of the covariance matrix, produce inconsistent confidence limits on both the parameter sets and the resulting NTCP values.
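The "directly from the likelihood landscape" approach can be sketched with the standard profile-likelihood confidence region (a sketch only; the paper's full-likelihood analysis of NTCP predictions weights curves by likelihood rather than thresholding):

$$\Bigl\{\theta : 2\bigl[\ell(\hat{\theta}) - \ell_p(\theta)\bigr] \le \chi^2_{k,\,1-\alpha}\Bigr\},$$

where $\ell_p$ is the log-likelihood profiled over nuisance parameters, $\hat{\theta}$ the maximum likelihood estimate, and $k$ the number of parameters of interest. Covariance-matrix intervals correspond to a quadratic approximation of $\ell_p$ and can be misleading when the likelihood surface is strongly asymmetric, which is consistent with the authors' conclusion that only a full likelihood analysis is reliable here.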
Boden, Lauren M; Boden, Stephanie A; Premkumar, Ajay; Gottschalk, Michael B; Boden, Scott D
2018-02-09
Retrospective analysis of prospectively collected data. To create a data-driven triage system stratifying patients by likelihood of undergoing spinal surgery within one year of presentation. Low back pain (LBP) and radicular lower extremity (LE) symptoms are common musculoskeletal problems. There is currently no standard data-derived triage process based on information that can be obtained prior to the initial physician-patient encounter to direct patients to the optimal physician type. We analyzed patient-reported data from 8006 patients with a chief complaint of LBP and/or LE radicular symptoms who presented to surgeons at a large multidisciplinary spine center between September 1, 2005 and June 30, 2016. Univariate and multivariate analysis identified independent risk factors for undergoing spinal surgery within one year of initial visit. A model incorporating these risk factors was created using a random sample of 80% of the total patients in our cohort, and validated on the remaining 20%. The baseline one-year surgery rate within our cohort was 39% for all patients and 42% for patients with LE symptoms. Those identified as high likelihood by the center's existing triage process had a surgery rate of 45%. The new triage scoring system proposed in this study was able to identify a high likelihood group in which 58% underwent surgery, which is a 46% higher surgery rate than in non-triaged patients and a 29% improvement from our institution's existing triage system. The data-driven triage model and scoring system derived and validated in this study (Spine Surgery Likelihood model [SSL-11]) significantly improved existing processes in predicting the likelihood of undergoing spinal surgery within one year of initial presentation. This triage system will allow centers to more selectively screen for surgical candidates and more effectively direct patients to surgeons or non-operative spine specialists. Level of Evidence: 4.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Ring, Christopher; Kavussanu, Maria
2018-03-01
Given the concern over doping in sport, researchers have begun to explore the role played by self-regulatory processes in the decision whether to use banned performance-enhancing substances. Grounded in Bandura's (1991) theory of moral thought and action, this study examined the role of self-regulatory efficacy, moral disengagement and anticipated guilt on the likelihood to use a banned substance among college athletes. Doping self-regulatory efficacy was associated with doping likelihood both directly (b = -.16, P < .001) and indirectly (b = -.29, P < .001) through doping moral disengagement. Moral disengagement also contributed directly to higher doping likelihood and lower anticipated guilt about doping, which was associated with higher doping likelihood. Overall, the present findings provide evidence to support a model of doping based on Bandura's social cognitive theory of moral thought and action, in which self-regulatory efficacy influences the likelihood to use banned performance-enhancing substances both directly and indirectly via moral disengagement.
Synthesizing Regression Results: A Factored Likelihood Method
ERIC Educational Resources Information Center
Wu, Meng-Jia; Becker, Betsy Jane
2013-01-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
NASA Technical Reports Server (NTRS)
Switzer, Eric Ryan; Watts, Duncan J.
2016-01-01
The B-mode polarization of the cosmic microwave background provides a unique window into tensor perturbations from inflationary gravitational waves. Survey effects complicate the estimation and description of the power spectrum on the largest angular scales. The pixel-space likelihood yields parameter distributions without the power spectrum as an intermediate step, but it does not have the large suite of tests available to power spectral methods. Searches for primordial B-modes must rigorously reject and rule out contamination. Many forms of contamination vary or are uncorrelated across epochs, frequencies, surveys, or other data treatment subsets. The cross power and the power spectrum of the difference of subset maps provide approaches to reject and isolate excess variance. We develop an analogous joint pixel-space likelihood. Contamination not modeled in the likelihood produces parameter-dependent bias and complicates the interpretation of the difference map. We describe a null test that consistently weights the difference map. Excess variance should either be explicitly modeled in the covariance or be removed through reprocessing the data.
NASA Astrophysics Data System (ADS)
Moschetti, M. P.; Mueller, C. S.; Boyd, O. S.; Petersen, M. D.
2013-12-01
In anticipation of the update of the Alaska seismic hazard maps (ASHMs) by the U. S. Geological Survey, we report progress on the comparison of smoothed seismicity models developed using fixed and adaptive smoothing algorithms, and investigate the sensitivity of seismic hazard to the models. While fault-based sources, such as those for great earthquakes in the Alaska-Aleutian subduction zone and for the ~10 shallow crustal faults within Alaska, dominate the seismic hazard estimates for locations near to the sources, smoothed seismicity rates make important contributions to seismic hazard away from fault-based sources and where knowledge of recurrence and magnitude is not sufficient for use in hazard studies. Recent developments in adaptive smoothing methods and statistical tests for evaluating and comparing rate models prompt us to investigate the appropriateness of adaptive smoothing for the ASHMs. We develop smoothed seismicity models for Alaska using fixed and adaptive smoothing methods and compare the resulting models by calculating and evaluating the joint likelihood test. We use the earthquake catalog, and associated completeness levels, developed for the 2007 ASHM to produce fixed-bandwidth-smoothed models with smoothing distances varying from 10 to 100 km and adaptively smoothed models. Adaptive smoothing follows the method of Helmstetter et al. and defines a unique smoothing distance for each earthquake epicenter from the distance to the nth nearest neighbor. The consequence of the adaptive smoothing methods is to reduce smoothing distances, causing locally increased seismicity rates, where seismicity rates are high and to increase smoothing distances where seismicity is sparse. We follow guidance from previous studies to optimize the neighbor number (n-value) by comparing model likelihood values, which estimate the likelihood that the observed earthquake epicenters from the recent catalog are derived from the smoothed rate models. We compare likelihood values from all rate models to rank the smoothing methods. We find that adaptively smoothed seismicity models yield better likelihood values than the fixed smoothing models. Holding all other (source and ground motion) models constant, we calculate seismic hazard curves for all points across Alaska on a 0.1 degree grid, using the adaptively smoothed and fixed smoothed seismicity models separately. Because adaptively smoothed models concentrate seismicity near the earthquake epicenters where seismicity rates are high, the corresponding hazard values are higher, locally, but reduced with distance from observed seismicity, relative to the hazard from fixed-bandwidth models. We suggest that adaptively smoothed seismicity models be considered for implementation in the update to the ASHMs because of their improved likelihood estimates relative to fixed smoothing methods; however, concomitant increases in seismic hazard will cause significant changes in regions of high seismicity, such as near the subduction zone, northeast of Kotzebue, and along the NNE trending zone of seismicity in the Alaskan interior.
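As a rough illustration of the fixed- versus adaptive-bandwidth smoothing described in this abstract, the following sketch builds Gaussian-kernel rate surfaces from a synthetic catalog, with adaptive bandwidths set to the distance to the n-th nearest neighboring epicenter. The catalog, grid, bandwidths, and flat-plane distances are all placeholders, not the authors' data or code.

```python
# Minimal sketch of fixed- vs adaptive-bandwidth Gaussian smoothing of epicenters
# (Helmstetter-style nearest-neighbor bandwidths). Synthetic catalog and grid;
# distances in km on a flat plane for simplicity.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
epicenters = rng.uniform(0, 500, size=(200, 2))   # hypothetical catalog (x, y in km)
grid = np.stack(np.meshgrid(np.linspace(0, 500, 50),
                            np.linspace(0, 500, 50)), -1).reshape(-1, 2)

def smoothed_rate(epicenters, grid, bandwidths):
    """Sum of 2-D Gaussian kernels, one per epicenter, evaluated on the grid."""
    d2 = ((grid[:, None, :] - epicenters[None, :, :]) ** 2).sum(-1)
    kern = np.exp(-d2 / (2.0 * bandwidths ** 2)) / (2.0 * np.pi * bandwidths ** 2)
    return kern.sum(axis=1)

# Fixed bandwidth (e.g., 50 km) vs adaptive: distance to the n-th nearest neighbor.
fixed = smoothed_rate(epicenters, grid, np.full(len(epicenters), 50.0))
tree = cKDTree(epicenters)
n = 3
d_nn = tree.query(epicenters, k=n + 1)[0][:, n]   # k=n+1: the closest hit is the point itself
adaptive = smoothed_rate(epicenters, grid, np.maximum(d_nn, 1.0))

# A Poisson joint log-likelihood of observed counts per cell under each rate model
# could then be compared to rank the two smoothing choices, as in the abstract.
```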
Flassig, Robert J; Migal, Iryna; der Zalm, Esther van; Rihko-Struckmann, Liisa; Sundmacher, Kai
2015-01-16
Understanding the dynamics of biological processes can be substantially supported by computational models in the form of nonlinear ordinary differential equations (ODE). Typically, this model class contains many unknown parameters, which are estimated from inadequate and noisy data. Depending on the ODE structure, predictions based on unmeasured states and associated parameters are highly uncertain, even undetermined. For given data, profile likelihood analysis has proven to be one of the most practically relevant approaches for analyzing the identifiability of an ODE structure, and thus of model predictions. In the case of highly uncertain or non-identifiable parameters, rational experimental design based on various approaches has been shown to significantly reduce parameter uncertainties with minimal effort. In this work we illustrate how to use profile likelihood samples for quantifying the individual contribution of parameter uncertainty to prediction uncertainty. For the uncertainty quantification we introduce the profile likelihood sensitivity (PLS) index. Additionally, for the case of several uncertain parameters, we introduce the PLS entropy to quantify individual contributions to the overall prediction uncertainty. We show how to use these two criteria as an experimental design objective for selecting new, informative readouts in combination with intervention site identification. The characteristics of the proposed multi-criterion objective are illustrated with an in silico example. We further illustrate how an existing, practically non-identifiable model for chlorophyll fluorescence induction in a photosynthetic organism, D. salina, can be rendered identifiable by additional experiments with new readouts. With data and profile likelihood samples at hand, the proposed uncertainty quantification based on prediction samples from the profile likelihood provides a simple way to determine the individual contributions of parameter uncertainties to uncertainties in model predictions. The uncertainty quantification of specific model predictions allows identifying regions where model predictions have to be considered with care. Such uncertain regions can be used for rational experimental design to turn initially highly uncertain model predictions into well-constrained ones. Finally, our uncertainty quantification directly accounts for parameter interdependencies and parameter sensitivities of the specific prediction.
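The basic profile-likelihood mechanic underlying this abstract can be sketched on a toy nonlinear model: fix one parameter on a grid and re-optimize the others at each value. This is only the generic idea, not the paper's ODE setting or its PLS index/entropy; model, data, and bounds below are invented for illustration.

```python
# Toy profile likelihood: profile parameter b of y = a*exp(-b*t) + noise by
# re-optimizing a at each fixed b. Not the paper's PLS machinery, just the idea.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
t = np.linspace(0, 5, 30)
y = 2.0 * np.exp(-0.7 * t) + rng.normal(0, 0.1, t.size)   # synthetic data

def neg_log_lik(a, b, sigma=0.1):
    resid = y - a * np.exp(-b * t)
    return 0.5 * np.sum(resid ** 2) / sigma ** 2

b_grid = np.linspace(0.3, 1.2, 50)
profile = []
for b in b_grid:
    # profile out 'a': minimize over a with b held fixed
    res = minimize_scalar(lambda a: neg_log_lik(a, b), bounds=(0.1, 5.0), method="bounded")
    profile.append(res.fun)
profile = np.array(profile)

# Approximate 95% likelihood-based interval for b: values whose profile deviance
# stays within chi2(1, 0.95)/2 ~= 1.92 of the minimum.
inside = b_grid[profile - profile.min() <= 1.92]
print(f"approx. 95% profile interval for b: [{inside.min():.2f}, {inside.max():.2f}]")
```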
Spectral likelihood expansions for Bayesian inference
NASA Astrophysics Data System (ADS)
Nagel, Joseph B.; Sudret, Bruno
2016-03-01
A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
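A minimal one-dimensional illustration of the spectral-expansion idea follows: the likelihood is fit by least squares in a Legendre basis over a uniform prior, and the evidence and posterior mean are then read off the coefficients without any sampling. The Gaussian toy likelihood, prior support, node count, and expansion degree are all assumptions for the sketch, not the authors' setup.

```python
# 1-D sketch of a spectral likelihood expansion: least-squares fit of the likelihood
# in a Legendre basis over a uniform prior on [-1, 1], then evidence and posterior
# mean obtained semi-analytically from the coefficients (no MCMC).
import numpy as np
from numpy.polynomial import legendre as leg

rng = np.random.default_rng(2)
data = rng.normal(0.3, 0.5, size=8)              # synthetic observations
def likelihood(theta):                           # Gaussian likelihood, known sigma = 0.5
    theta = np.atleast_1d(theta)[:, None]
    return np.exp(-0.5 * ((data - theta) / 0.5) ** 2).prod(axis=1)

# Least-squares spectral expansion L(theta) ~ sum_k c_k P_k(theta); accuracy
# depends on the chosen degree.
nodes = np.linspace(-1, 1, 400)
coef = leg.legfit(nodes, likelihood(nodes), deg=30)

# For the exact expansion with a uniform prior (density 1/2 on [-1, 1]), Legendre
# orthogonality gives Z = c_0 and E[theta | data] = c_1 / (3 c_0); the least-squares
# coefficients approximate these quantities.
evidence = coef[0]
post_mean = coef[1] / (3.0 * coef[0])
print(evidence, post_mean, data.mean())          # post_mean should roughly track the sample mean
```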
NASA Astrophysics Data System (ADS)
Harris, Adam
2014-05-01
The Intergovernmental Panel on Climate Change (IPCC) prescribes that risk and uncertainty information pertaining to scientific reports, model predictions, etc. be communicated with a set of 7 likelihood expressions. These range from "Extremely likely" (intended to communicate a likelihood of greater than 99%) through "As likely as not" (33-66%) to "Extremely unlikely" (less than 1%). Psychological research has investigated the degree to which these expressions are interpreted as intended by the IPCC, both within and across cultures. I will present a selection of this research and demonstrate some problems associated with communicating likelihoods in this way, as well as suggesting some potential improvements.
An Effective Palmprint Recognition Approach for Visible and Multispectral Sensor Images
Sammouda, Rachid; Al-Salman, Abdul Malik; Alsanad, Ahmed
2018-01-01
Among several palmprint feature extraction methods, the HOG-based method is attractive and performs well against changes in illumination and shadowing of palmprint images. However, it still lacks the robustness to extract palmprint features at different rotation angles. To solve this problem, this paper presents a hybrid feature extraction method, named HOG-SGF, that combines the histogram of oriented gradients (HOG) with a steerable Gaussian filter (SGF) to develop an effective palmprint recognition approach. The approach starts by processing all palmprint images by David Zhang's method to segment only the regions of interest. Next, we extracted palmprint features based on the hybrid HOG-SGF feature extraction method. Then, an optimized auto-encoder (AE) was utilized to reduce the dimensionality of the extracted features. Finally, a fast and robust regularized extreme learning machine (RELM) was applied for the classification task. In the evaluation phase of the proposed approach, a number of experiments were conducted on three publicly available palmprint databases, namely MS-PolyU of multispectral palmprint images and CASIA and Tongji of contactless palmprint images. Experimentally, the results reveal that the proposed approach outperforms the existing state-of-the-art approaches even when a small number of training samples are used. PMID:29762519
Lirio, R B; Dondériz, I C; Pérez Abalo, M C
1992-08-01
The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
ERIC Educational Resources Information Center
Kelderman, Henk
In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual counts in the full contingency table. This is…
Accounting for informatively missing data in logistic regression by means of reassessment sampling.
Lin, Ji; Lyles, Robert H
2015-05-20
We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.
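The joint-likelihood structure described here can be sketched in a simplified form: a logistic outcome model multiplied by a logistic missingness model that depends on the (possibly unobserved) outcome, with the missing outcomes marginalized out and the whole thing maximized by a general-purpose optimizer. The sketch omits the paper's second "reassessment" sampling wave, and all variable names and data are synthetic.

```python
# Minimal sketch of a joint likelihood for a logistic outcome model with a
# non-ignorable missing-outcome mechanism, maximized numerically. Identification
# here leans on missingness depending on y but not directly on x; in practice the
# reassessment wave described in the abstract supplies additional information.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
y = rng.binomial(1, expit(-0.5 + 1.0 * x))              # true outcome model
observed = rng.binomial(1, expit(1.0 - 1.5 * y)) == 1   # missingness depends on y
y_obs = np.where(observed, y, -1)                       # -1 marks a missing outcome

def neg_joint_loglik(theta):
    b0, b1, a0, a1 = theta
    p = expit(b0 + b1 * x)                              # P(Y = 1 | x)
    r1 = expit(a0 + a1 * 1.0)                           # P(observed | Y = 1)
    r0 = expit(a0 + a1 * 0.0)                           # P(observed | Y = 0)
    ll_obs = np.where(y_obs == 1, np.log(p * r1), np.log((1 - p) * r0))
    ll_mis = np.log(p * (1 - r1) + (1 - p) * (1 - r0))  # marginalize the missing Y
    return -np.sum(np.where(observed, ll_obs, ll_mis))

fit = minimize(neg_joint_loglik, x0=np.zeros(4), method="BFGS")
print(fit.x)   # estimates of (b0, b1, a0, a1)
```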
Hybrid pairwise likelihood analysis of animal behavior experiments.
Cattelan, Manuela; Varin, Cristiano
2013-12-01
The study of the determinants of fights between animals is an important issue in understanding animal behavior. For this purpose, tournament experiments among a set of animals are often used by zoologists. The results of these tournament experiments are naturally analyzed by paired comparison models. Proper statistical analysis of these models is complicated by the presence of dependence between the outcomes of fights because the same animal is involved in different contests. This paper discusses two different model specifications to account for between-fights dependence. Models are fitted through the hybrid pairwise likelihood method that iterates between optimal estimating equations for the regression parameters and pairwise likelihood inference for the association parameters. This approach requires the specification of means and covariances only. For this reason, the method can be applied also when the computation of the joint distribution is difficult or inconvenient. The proposed methodology is investigated by simulation studies and applied to real data about adult male Cape Dwarf Chameleons. © 2013, The International Biometric Society.
ELASTIC NET FOR COX’S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox’s proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox’s proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems. PMID:23226932
Negotiating Multicollinearity with Spike-and-Slab Priors
Ročková, Veronika
2014-01-01
In multiple regression under the normal linear model, the presence of multicollinearity is well known to lead to unreliable and unstable maximum likelihood estimates. This can be particularly troublesome for the problem of variable selection where it becomes more difficult to distinguish between subset models. Here we show how adding a spike-and-slab prior mitigates this difficulty by filtering the likelihood surface into a posterior distribution that allocates the relevant likelihood information to each of the subset model modes. For identification of promising high posterior models in this setting, we consider three EM algorithms, the fast closed form EMVS version of Rockova and George (2014) and two new versions designed for variants of the spike-and-slab formulation. For a multimodal posterior under multicollinearity, we compare the regions of convergence of these three algorithms. Deterministic annealing versions of the EMVS algorithm are seen to substantially mitigate this multimodality. A single simple running example is used for illustration throughout. PMID:25419004
The Atacama Cosmology Telescope: Likelihood for Small-Scale CMB Data
NASA Technical Reports Server (NTRS)
Dunkley, J.; Calabrese, E.; Sievers, J.; Addison, G. E.; Battaglia, N.; Battistelli, E. S.; Bond, J. R.; Das, S.; Devlin, M. J.; Dunner, R.;
2013-01-01
The Atacama Cosmology Telescope has measured the angular power spectra of microwave fluctuations to arcminute scales at frequencies of 148 and 218 GHz, from three seasons of data. At small scales the fluctuations in the primordial Cosmic Microwave Background (CMB) become increasingly obscured by extragalactic foregrounds and secondary CMB signals. We present results from a nine-parameter model describing these secondary effects, including the thermal and kinematic Sunyaev-Zel'dovich (tSZ and kSZ) power; the clustered and Poisson-like power from Cosmic Infrared Background (CIB) sources, and their frequency scaling; the tSZ-CIB correlation coefficient; the extragalactic radio source power; and thermal dust emission from Galactic cirrus in two different regions of the sky. In order to extract cosmological parameters, we describe a likelihood function for the ACT data, fitting this model to the multi-frequency spectra in the multipole range 500 < l < 10000. We extend the likelihood to include spectra from the South Pole Telescope at frequencies of 95, 150, and 220 GHz. Accounting for different radio source levels and Galactic cirrus emission, the same model provides an excellent fit to both datasets simultaneously, with χ2/dof = 675/697 for ACT, and 96/107 for SPT. We then use the multi-frequency likelihood to estimate the CMB power spectrum from ACT in bandpowers, marginalizing over the secondary parameters. This provides a simplified 'CMB-only' likelihood in the range 500 < l < 3500 for use in cosmological parameter estimation.
Two stochastic models useful in petroleum exploration
NASA Technical Reports Server (NTRS)
Kaufman, G. M.; Bradley, P. G.
1972-01-01
A model of the petroleum exploration process that tests empirically the hypothesis that, at an early stage in the exploration of a basin, the process behaves like sampling without replacement is proposed, along with a model of the spatial distribution of petroleum reservoirs that conforms to observed facts. In developing the model of discovery, the following topics are discussed: probabilistic proportionality, the likelihood function, and maximum likelihood estimation. In addition, the spatial model is described, which is defined as a stochastic process generating values of a sequence of random variables in a way that simulates the frequency distribution of areal extent, the geographic location, and the shape of oil deposits.
Maximum likelihood estimation for periodic autoregressive moving average models
Vecchia, A.V.
1985-01-01
A useful class of models for seasonal time series that cannot be filtered or standardized to achieve second-order stationarity is that of periodic autoregressive moving average (PARMA) models, which are extensions of ARMA models that allow periodic (seasonal) parameters. An approximation to the exact likelihood for Gaussian PARMA processes is developed, and a straightforward algorithm for its maximization is presented. The algorithm is tested on several periodic ARMA(1, 1) models through simulation studies and is compared to moment estimation via the seasonal Yule-Walker equations. Applicability of the technique is demonstrated through an analysis of a seasonal stream-flow series from the Rio Caroni River in Venezuela.
A maximum likelihood convolutional decoder model vs experimental data comparison
NASA Technical Reports Server (NTRS)
Chen, R. Y.
1979-01-01
This article describes the comparison of a maximum likelihood convolutional decoder (MCD) prediction model and the actual performance of the MCD at the Madrid Deep Space Station. The MCD prediction model is used to develop a subroutine that has been utilized by the Telemetry Analysis Program (TAP) to compute the MCD bit error rate for a given signal-to-noise ratio. The results indicate that the TAP predictions compare quite well with the experimental measurements. An optimal modulation index can also be found through TAP.
Individual, team, and coach predictors of players' likelihood to aggress in youth soccer.
Chow, Graig M; Murray, Kristen E; Feltz, Deborah L
2009-08-01
The purpose of this study was to examine personal and socioenvironmental factors of players' likelihood to aggress. Participants were youth soccer players (N = 258) and their coaches (N = 23) from high school and club teams. Players completed the Judgments About Moral Behavior in Youth Sports Questionnaire (JAMBYSQ; Stephens, Bredemeier, & Shields, 1997), which assessed athletes' stage of moral development, team norm for aggression, and self-described likelihood to aggress against an opponent. Coaches were administered the Coaching Efficacy Scale (CES; Feltz, Chase, Moritz, & Sullivan, 1999). Using multilevel modeling, results demonstrated that the team norm for aggression at both the athlete and the team level was a significant predictor of athletes' self-described likelihood-to-aggress scores. Further, coaches' game strategy efficacy emerged as a positive predictor of their players' self-described likelihood to aggress. The findings contribute to previous research examining the socioenvironmental predictors of athletic aggression in youth sport by demonstrating the importance of coaching efficacy beliefs.
Posterior propriety for hierarchical models with log-likelihoods that have norm bounds
Michalak, Sarah E.; Morris, Carl N.
2015-07-17
Statisticians often use improper priors to express ignorance or to provide good frequency properties, requiring that posterior propriety be verified. Our paper addresses generalized linear mixed models, GLMMs, when Level I parameters have Normal distributions, with many commonly-used hyperpriors. It provides easy-to-verify sufficient posterior propriety conditions based on dimensions, matrix ranks, and exponentiated norm bounds, ENBs, for the Level I likelihood. Since many familiar likelihoods have ENBs, which is often verifiable via log-concavity and MLE finiteness, our novel use of ENBs permits unification of posterior propriety results and posterior MGF/moment results for many useful Level I distributions, including those commonly used with multilevel generalized linear models, e.g., GLMMs and hierarchical generalized linear models, HGLMs. Furthermore, those who need to verify existence of posterior distributions or of posterior MGFs/moments for a multilevel generalized linear model given a proper or improper multivariate F prior as in Section 1 should find the required results in Sections 1 and 2 and Theorem 3 (GLMMs), Theorem 4 (HGLMs), or Theorem 5 (posterior MGFs/moments).
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations, we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations, we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
On Muthen's Maximum Likelihood for Two-Level Covariance Structure Models
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Hayashi, Kentaro
2005-01-01
Data in social and behavioral sciences are often hierarchically organized. Special statistical procedures that take into account the dependence of such observations have been developed. Among procedures for 2-level covariance structure analysis, Muthen's maximum likelihood (MUML) has the advantage of easier computation and faster convergence. When…
ERIC Educational Resources Information Center
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
Likelihood-Ratio DIF Testing: Effects of Nonnormality
ERIC Educational Resources Information Center
Woods, Carol M.
2008-01-01
Differential item functioning (DIF) occurs when an item has different measurement properties for members of one group versus another. Likelihood-ratio (LR) tests for DIF based on item response theory (IRT) involve statistically comparing IRT models that vary with respect to their constraints. A simulation study evaluated how violation of the…
A Study of Item Bias for Attitudinal Measurement Using Maximum Likelihood Factor Analysis.
ERIC Educational Resources Information Center
Mayberry, Paul W.
A technique for detecting item bias that is responsive to attitudinal measurement considerations is a maximum likelihood factor analysis procedure comparing multivariate factor structures across various subpopulations, often referred to as SIFASP. The SIFASP technique allows for factorial model comparisons in the testing of various hypotheses…
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models
ERIC Educational Resources Information Center
Lee, Taehun
2010-01-01
In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Robust Multipoint Water-Fat Separation Using Fat Likelihood Analysis
Yu, Huanzhou; Reeder, Scott B.; Shimakawa, Ann; McKenzie, Charles A.; Brittain, Jean H.
2016-01-01
Fat suppression is an essential part of routine MRI scanning. Multiecho chemical-shift-based water-fat separation methods estimate and correct for B0 field inhomogeneity. However, they must contend with the intrinsic challenge of water-fat ambiguity that can result in water-fat swapping. This problem arises because the signals from two chemical species, when both are modeled as a single discrete spectral peak, may appear indistinguishable in the presence of B0 off-resonance. In conventional methods, the water-fat ambiguity is typically removed by enforcing field map smoothness using region growing based algorithms. In reality, the fat spectrum has multiple spectral peaks. Using this spectral complexity, we introduce a novel concept that identifies water and fat for multiecho acquisitions by exploiting the spectral differences between water and fat. A fat likelihood map is produced to indicate if a pixel is likely to be water-dominant or fat-dominant by comparing the fitting residuals of two different signal models. The fat likelihood analysis and field map smoothness provide complementary information, and we designed an algorithm (Fat Likelihood Analysis for Multiecho Signals) to exploit both mechanisms. It is demonstrated in a wide variety of data that the Fat Likelihood Analysis for Multiecho Signals algorithm offers highly robust water-fat separation for 6-echo acquisitions, particularly in some previously challenging applications. PMID:21842498
Validation of DNA-based identification software by computation of pedigree likelihood ratios.
Slooten, K
2011-08-01
Disaster victim identification (DVI) can be aided by DNA-evidence, by comparing the DNA-profiles of unidentified individuals with those of surviving relatives. The DNA-evidence is used optimally when such a comparison is done by calculating the appropriate likelihood ratios. Though conceptually simple, the calculations can be quite involved, especially with large pedigrees, precise mutation models etc. In this article we describe a series of test cases designed to check if software designed to calculate such likelihood ratios computes them correctly. The cases include both simple and more complicated pedigrees, among which inbred ones. We show how to calculate the likelihood ratio numerically and algebraically, including a general mutation model and possibility of allelic dropout. In Appendix A we show how to derive such algebraic expressions mathematically. We have set up these cases to validate new software, called Bonaparte, which performs pedigree likelihood ratio calculations in a DVI context. Bonaparte has been developed by SNN Nijmegen (The Netherlands) for the Netherlands Forensic Institute (NFI). It is available free of charge for non-commercial purposes (see www.dnadvi.nl for details). Commercial licenses can also be obtained. The software uses Bayesian networks and the junction tree algorithm to perform its calculations. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model
NASA Astrophysics Data System (ADS)
Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel
2011-03-01
This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Approximate Bayesian computation in large-scale structure: constraining the galaxy-halo connection
NASA Astrophysics Data System (ADS)
Hahn, ChangHoon; Vakili, Mohammadjavad; Walsh, Kilian; Hearin, Andrew P.; Hogg, David W.; Campbell, Duncan
2017-08-01
Standard approaches to Bayesian parameter inference in large-scale structure assume a Gaussian functional form (chi-squared form) for the likelihood. This assumption, in detail, cannot be correct. Likelihood-free inference methods such as approximate Bayesian computation (ABC) relax these restrictions and make inference possible without making any assumptions on the likelihood. Instead, ABC relies on a forward generative model of the data and a metric for measuring the distance between the model and data. In this work, we demonstrate that ABC is feasible for LSS parameter inference by using it to constrain parameters of the halo occupation distribution (HOD) model for populating dark matter haloes with galaxies. Using a specific implementation of ABC supplemented with population Monte Carlo importance sampling, a generative forward model using the HOD and a distance metric based on galaxy number density, the two-point correlation function and the galaxy group multiplicity function, we constrain the HOD parameters of a mock observation generated from selected 'true' HOD parameters. The parameter constraints we obtain from ABC are consistent with the 'true' HOD parameters, demonstrating that ABC can be reliably used for parameter inference in LSS. Furthermore, we compare our ABC constraints to constraints we obtain using a pseudo-likelihood function of Gaussian form with MCMC and find consistent HOD parameter constraints. Ultimately, our results suggest that ABC can and should be applied in parameter inference for LSS analyses.
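The ingredients listed in this abstract (a prior, a forward generative model, a distance on summary statistics, an acceptance threshold) can be shown with a bare-bones ABC rejection sampler on a toy problem; the HOD/clustering machinery of the paper is replaced here by inferring the mean of a Gaussian, and all settings are invented.

```python
# Bare-bones ABC rejection sampler for a toy problem: infer the mean of a Gaussian
# from summary statistics, with a uniform prior and a Euclidean distance on summaries.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(2.0, 1.0, size=100)            # "observed" data
summary_obs = np.array([data.mean(), data.std()])

def forward_model(theta, size=100):
    return rng.normal(theta, 1.0, size=size)     # generative model

def distance(sim):
    s = np.array([sim.mean(), sim.std()])
    return np.linalg.norm(s - summary_obs)       # distance between summaries

accepted = []
epsilon = 0.2                                    # acceptance threshold
for _ in range(20000):
    theta = rng.uniform(-5, 5)                   # draw from the prior
    if distance(forward_model(theta)) < epsilon:
        accepted.append(theta)

posterior_samples = np.array(accepted)
print(posterior_samples.mean(), posterior_samples.std())  # ABC posterior for the mean
```

In practice, population Monte Carlo importance sampling (as the abstract mentions) replaces plain rejection to shrink epsilon adaptively and reuse accepted particles.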
Bayesian inference for OPC modeling
NASA Astrophysics Data System (ADS)
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades, making better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) and thereby reveals champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not, and outline continued experiments to vet the method.
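The workflow described (Student's-t likelihood, priors on model parameters, an affine-invariant ensemble sampler) can be sketched with the emcee package, which is a common implementation of that sampler; the abstract does not name a package, so emcee is a stand-in, and the toy linear model, priors, and settings below are assumptions rather than the lithography model itself.

```python
# Sketch: affine-invariant ensemble sampling of a toy model with a Student's-t
# likelihood and weak Gaussian priors, using emcee as a stand-in AIES.
import numpy as np
import emcee
from scipy import stats

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 40)
y = 1.5 * x + 0.2 + stats.t.rvs(df=4, scale=0.05, size=x.size, random_state=rng)

def log_prob(theta):
    slope, intercept = theta
    log_prior = stats.norm.logpdf(slope, 0, 10) + stats.norm.logpdf(intercept, 0, 10)
    resid = y - (slope * x + intercept)
    log_like = stats.t.logpdf(resid, df=4, scale=0.05).sum()   # heavy-tailed errors
    return log_prior + log_like

nwalkers, ndim = 16, 2
p0 = rng.normal(size=(nwalkers, ndim)) * 0.1 + [1.0, 0.0]      # walkers near an initial guess
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 2000, progress=False)
chain = sampler.get_chain(discard=500, flat=True)
print(np.percentile(chain, [2.5, 50, 97.5], axis=0))           # credible intervals per parameter
```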
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
What are hierarchical models and how do we analyze them?
Royle, Andy
2016-01-01
In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10).
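The first canonical model mentioned, the single-season site-occupancy model, has a likelihood simple enough to maximize directly; the sketch below does that on simulated detection histories. The simulation settings and parameterization are mine, not the book's.

```python
# Minimal single-season site-occupancy likelihood maximized directly:
# occupancy probability psi, detection probability p, J repeat visits per site.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(6)
S, J = 200, 4
psi_true, p_true = 0.6, 0.3
z = rng.binomial(1, psi_true, size=S)                  # latent occupancy state
y = rng.binomial(1, p_true * z[:, None], size=(S, J))  # detection histories

def neg_log_lik(theta):
    psi, p = expit(theta)                              # unconstrained -> (0, 1)
    det = y.sum(axis=1)
    # site likelihood: occupied and detected as observed, plus the extra mass an
    # all-zero history gets from the site simply being unoccupied
    l_occ = psi * p ** det * (1 - p) ** (J - det)
    l_unocc = (1 - psi) * (det == 0)
    return -np.sum(np.log(l_occ + l_unocc))

fit = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS")
print(expit(fit.x))   # MLEs of (psi, p)
```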
NASA Astrophysics Data System (ADS)
Balbi, Stefano; Villa, Ferdinando; Mojtahed, Vahid; Hegetschweiler, Karin Tessa; Giupponi, Carlo
2016-06-01
This article presents a novel methodology to assess flood risk to people by integrating people's vulnerability and ability to cushion hazards through coping and adapting. The proposed approach extends traditional risk assessments beyond material damages; complements quantitative and semi-quantitative data with subjective and local knowledge, improving the use of commonly available information; and produces estimates of model uncertainty by providing probability distributions for all of its outputs. Flood risk to people is modeled using a spatially explicit Bayesian network model calibrated on expert opinion. Risk is assessed in terms of (1) likelihood of non-fatal physical injury, (2) likelihood of post-traumatic stress disorder and (3) likelihood of death. The study area covers the lower part of the Sihl valley (Switzerland) including the city of Zurich. The model is used to estimate the effect of improving an existing early warning system, taking into account the reliability, lead time and scope (i.e., coverage of people reached by the warning). Model results indicate that the potential benefits of an improved early warning in terms of avoided human impacts are particularly relevant in case of a major flood event.
A simulation study on Bayesian Ridge regression models for several collinearity levels
NASA Astrophysics Data System (ADS)
Efendi, Achmad; Effrihan
2017-12-01
When analyzing data with a multiple regression model, if collinearities are present then one or several predictor variables are usually omitted from the model. Sometimes, however, for instance for medical or economic reasons, all the predictors are important and should be included in the model. Ridge regression is commonly used to cope with such collinearity. In this modeling approach, weights for the predictor variables are used when estimating parameters. The estimation can follow the likelihood principle; alternatively, a Bayesian version can be used. The Bayesian approach has historically been less popular than the likelihood approach because of computational and other difficulties, but with recent improvements in computational methodology this is no longer a serious obstacle. This paper discusses a simulation study for evaluating the characteristics of Bayesian Ridge regression parameter estimates. Several simulation settings are considered, based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method gives better performance for relatively small sample sizes, while for the other settings it performs similarly to the likelihood method.
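A small simulation in the spirit of this abstract can be run with scikit-learn, comparing an ordinary least-squares fit with a Bayesian ridge fit on deliberately collinear predictors. BayesianRidge is used here as a convenient stand-in for the Bayesian estimator; the settings do not reproduce the paper's simulation design.

```python
# Collinear predictors, small sample: OLS coefficients tend to be unstable,
# while Bayesian ridge shrinks them and stabilizes the fit.
import numpy as np
from sklearn.linear_model import LinearRegression, BayesianRidge

rng = np.random.default_rng(7)
n = 30                                              # deliberately small sample
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)            # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 1.0 * x2 + rng.normal(scale=1.0, size=n)

ols = LinearRegression().fit(X, y)
bayes = BayesianRidge().fit(X, y)
print("OLS coefficients:     ", ols.coef_)          # typically unstable under collinearity
print("Bayesian ridge coefs: ", bayes.coef_)        # shrunk, more stable
```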
How users adopt healthcare information: An empirical study of an online Q&A community.
Jin, Jiahua; Yan, Xiangbin; Li, Yijun; Li, Yumei
2016-02-01
The emergence of social media technology has led to the creation of many online healthcare communities, where patients can easily share and look for healthcare-related information from peers who have experienced a similar problem. However, with increased user-generated content, there is a need to constantly analyse which content should be trusted as one sifts through enormous amounts of healthcare information. This study aims to explore patients' healthcare information seeking behavior in online communities. Based on dual-process theory and the knowledge adoption model, we proposed a healthcare information adoption model for online communities. This model highlights that information quality, emotional support, and source credibility are antecedent variables of the adoption likelihood of healthcare information, and that competition among repliers and involvement of recipients moderate the relationship between the antecedent variables and adoption likelihood. Empirical data were collected from the healthcare module of China's biggest Q&A community, Baidu Knows. Text mining techniques were adopted to calculate the information quality and emotional support contained in each reply text. A binary logistic regression model and a hierarchical regression approach were employed to test the proposed conceptual model. Information quality, emotional support, and source credibility have a significant and positive impact on healthcare information adoption likelihood, and among these factors, information quality has the biggest impact on a patient's adoption decision. In addition, competition among repliers and involvement of recipients were tested as moderating effects between these antecedent factors and the adoption likelihood. Results indicate competition among repliers positively moderates the relationship between source credibility and adoption likelihood, and recipients' involvement positively moderates the relationship between information quality, source credibility, and adoption decision. In addition to information quality and source credibility, emotional support has a significant positive impact on individuals' healthcare information adoption decisions. Moreover, the relationships between information quality, source credibility, emotional support, and adoption decision are moderated by competition among repliers and involvement of recipients. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
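Maximum likelihood estimation of the β-binomial parameters described in this abstract can be sketched with scipy's beta-binomial distribution and a numerical optimizer; the plot counts below are made up for illustration, not the dogwood anthracnose data.

```python
# ML fit of a beta-binomial model to dichotomous counts (k damaged out of n per plot).
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(8)
n_trees = rng.integers(10, 30, size=40)                        # trees examined per plot
k_damaged = stats.betabinom.rvs(n_trees, 2.0, 6.0, random_state=rng)

def neg_log_lik(log_ab):
    a, b = np.exp(log_ab)                                      # keep a, b positive
    return -np.sum(stats.betabinom.logpmf(k_damaged, n_trees, a, b))

fit = optimize.minimize(neg_log_lik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(fit.x)
print(a_hat, b_hat, a_hat / (a_hat + b_hat))                   # MLEs and implied mean infection rate
```

A likelihood ratio test against the plain binomial (the a, b → ∞ limit with fixed mean) can then be built from the difference in maximized log-likelihoods, in the spirit of the asymptotic tests the abstract mentions.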
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
Five Methods for Estimating Angoff Cut Scores with IRT
ERIC Educational Resources Information Center
Wyse, Adam E.
2017-01-01
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data
NASA Technical Reports Server (NTRS)
Iliff, K. W.; Maine, R. E.
1985-01-01
Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
optBINS: Optimal Binning for histograms
NASA Astrophysics Data System (ADS)
Knuth, Kevin H.
2018-03-01
optBINS (optimal binning) determines the optimal number of bins in a uniform bin-width histogram by deriving the posterior probability for the number of bins in a piecewise-constant density model after assigning a multinomial likelihood and a non-informative prior. The maximum of the posterior probability occurs at a point where the prior probability and the joint likelihood are balanced. The interplay between these opposing factors effectively implements Occam's razor by selecting the simplest model that best describes the data.
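The criterion can be sketched as a one-line log-posterior over the number of bins; the formula below is my reconstruction (up to an additive constant) of the multinomial-likelihood/Jeffreys-prior derivation and should be checked against the original paper or code before serious use. To the best of my knowledge, astropy's knuth_bin_width implements a closely related rule.

```python
# Sketch of the optBINS idea: evaluate a (relative) log posterior over the number
# of equal-width bins M and pick the maximizer.
import numpy as np
from scipy.special import gammaln

def log_posterior_nbins(data, m):
    n = data.size
    counts, _ = np.histogram(data, bins=m)
    # relative log posterior for M bins under a multinomial likelihood with a
    # Jeffreys-type prior on the bin heights (assumed form; verify against source)
    return (n * np.log(m)
            + gammaln(m / 2.0) - m * gammaln(0.5) - gammaln(n + m / 2.0)
            + np.sum(gammaln(counts + 0.5)))

rng = np.random.default_rng(9)
data = rng.normal(size=1000)
m_grid = np.arange(2, 100)
logp = np.array([log_posterior_nbins(data, m) for m in m_grid])
print("optimal number of bins:", m_grid[np.argmax(logp)])
```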
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
Bayesian hierarchical modeling for detecting safety signals in clinical trials.
Xia, H Amy; Ma, Haijun; Carlin, Bradley P
2011-09-01
Detection of safety signals from clinical trial adverse event data is critical in drug development, but carries a challenging statistical multiplicity problem. Bayesian hierarchical mixture modeling is appealing for its ability to borrow strength across subgroups in the data, as well as moderate extreme findings most likely due merely to chance. We implement such a model for subject incidence (Berry and Berry, 2004 ) using a binomial likelihood, and extend it to subject-year adjusted incidence rate estimation under a Poisson likelihood. We use simulation to choose a signal detection threshold, and illustrate some effective graphics for displaying the flagged signals.
PERIODIC AUTOREGRESSIVE-MOVING AVERAGE (PARMA) MODELING WITH APPLICATIONS TO WATER RESOURCES.
Vecchia, A.V.
1985-01-01
Results involving correlation properties and parameter estimation for autoregressive-moving average models with periodic parameters are presented. A multivariate representation of the PARMA model is used to derive parameter space restrictions and difference equations for the periodic autocorrelations. Close approximation to the likelihood function for Gaussian PARMA processes results in efficient maximum-likelihood estimation procedures. Terms in the Fourier expansion of the parameters are sequentially included, and a selection criterion is given for determining the optimal number of harmonics to be included. Application of the techniques is demonstrated through analysis of a monthly streamflow time series.
Heersink, Daniel K; Caley, Peter; Paini, Dean R; Barry, Simon C
2016-05-01
The cost of an uncontrolled incursion of invasive alien species (IAS) arising from undetected entry through ports can be substantial, and knowledge of port-specific risks is needed to help allocate limited surveillance resources. Quantifying the establishment likelihood of such an incursion requires quantifying the ability of a species to enter, establish, and spread. Estimation of the approach rate of IAS into ports provides a measure of likelihood of entry. Data on the approach rate of IAS are typically sparse, and the combinations of risk factors relating to country of origin and port of arrival diverse. This presents challenges to making formal statistical inference on establishment likelihood. Here we demonstrate how these challenges can be overcome with judicious use of mixed-effects models when estimating the incursion likelihood into Australia of the European (Apis mellifera) and Asian (A. cerana) honeybees, along with the invasive parasites of biosecurity concern they host (e.g., Varroa destructor). Our results demonstrate how skewed the establishment likelihood is, with one-tenth of the ports accounting for 80% or more of the likelihood for both species. These results have been utilized by biosecurity agencies in the allocation of resources to the surveillance of maritime ports. © 2015 Society for Risk Analysis.
Competition between learned reward and error outcome predictions in anterior cingulate cortex.
Alexander, William H; Brown, Joshua W
2010-02-15
The anterior cingulate cortex (ACC) is implicated in performance monitoring and cognitive control. Non-human primate studies of ACC show prominent reward signals, but these are elusive in human studies, which instead show mainly conflict and error effects. Here we demonstrate distinct appetitive and aversive activity in human ACC. The error likelihood hypothesis suggests that ACC activity increases in proportion to the likelihood of an error, and ACC is also sensitive to the consequence magnitude of the predicted error. Previous work further showed that error likelihood effects reach a ceiling as the potential consequences of an error increase, possibly due to reductions in the average reward. We explored this issue by independently manipulating reward magnitude of task responses and error likelihood while controlling for potential error consequences in an Incentive Change Signal Task. The fMRI results ruled out a modulatory effect of expected reward on error likelihood effects in favor of a competition effect between expected reward and error likelihood. Dynamic causal modeling showed that error likelihood and expected reward signals are intrinsic to the ACC rather than received from elsewhere. These findings agree with interpretations of ACC activity as signaling both perceptions of risk and predicted reward. Copyright 2009 Elsevier Inc. All rights reserved.
Extending the Applicability of the Generalized Likelihood Function for Zero-Inflated Data Series
NASA Astrophysics Data System (ADS)
Oliveira, Debora Y.; Chaffe, Pedro L. B.; Sá, João. H. M.
2018-03-01
Proper uncertainty estimation for data series with a high proportion of zero and near zero observations has been a challenge in hydrologic studies. This technical note proposes a modification to the Generalized Likelihood function that accounts for zero inflation of the error distribution (ZI-GL). We compare the performance of the proposed ZI-GL with the original Generalized Likelihood function using the entire data series (GL) and by simply suppressing zero observations (GLy>0). These approaches were applied to two interception modeling examples characterized by data series with a significant number of zeros. The ZI-GL produced better uncertainty ranges than the GL as measured by the precision, reliability and volumetric bias metrics. The comparison between ZI-GL and GLy>0 highlights the need for further improvement in the treatment of residuals from near zero simulations when a linear heteroscedastic error model is considered. Aside from the interception modeling examples illustrated herein, the proposed ZI-GL may be useful for other hydrologic studies, such as for the modeling of the runoff generation in hillslopes and ephemeral catchments.
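To make the idea of zero inflation concrete, the sketch below evaluates a toy error likelihood that mixes a point mass at zero with a Gaussian density for nonzero residuals. The mixing weight pi0, the homoscedastic Gaussian error model, and the toy data are assumptions for illustration; the ZI-GL proposed in the note is more general (e.g., it handles heteroscedastic errors).

```python
import numpy as np
from scipy.stats import norm

def zero_inflated_loglik(obs, sim, pi0, sigma):
    """Mixture log-likelihood: probability pi0 of an exact zero observation,
    otherwise a Gaussian error model on the residual (obs - sim)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    is_zero = obs == 0.0
    ll_zero = np.log(pi0) * is_zero.sum()
    resid = obs[~is_zero] - sim[~is_zero]
    ll_nonzero = np.sum(np.log(1.0 - pi0) + norm.logpdf(resid, scale=sigma))
    return ll_zero + ll_nonzero

# Example: an interception-like series with many zeros.
obs = np.array([0.0, 0.0, 1.2, 0.0, 3.4, 0.1, 0.0])
sim = np.array([0.0, 0.1, 1.0, 0.0, 3.0, 0.3, 0.2])
print(zero_inflated_loglik(obs, sim, pi0=0.5, sigma=0.5))
```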
Linking Illness in Parents to Health Anxiety in Offspring: Do Beliefs about Health Play a Role?
Alberts, Nicole M; Hadjistavropoulos, Heather D; Sherry, Simon B; Stewart, Sherry H
2016-01-01
The cognitive behavioural (CB) model of health anxiety proposes parental illness leads to elevated health anxiety in offspring by promoting the acquisition of specific health beliefs (e.g. overestimation of the likelihood of illness). Our study tested this central tenet of the CB model. Participants were 444 emerging adults (18-25-years-old) who completed online measures and were categorized into those with healthy parents (n = 328) or seriously ill parents (n = 116). Small (d = .21), but significant, elevations in health anxiety, and small to medium (d = .40) elevations in beliefs about the likelihood of illness were found among those with ill vs. healthy parents. Mediation analyses indicated the relationship between parental illness and health anxiety was mediated by beliefs regarding the likelihood of future illness. Our study incrementally advances knowledge by testing and supporting a central proposition of the CB model. The findings add further specificity to the CB model by highlighting the importance of a specific health belief as a central contributor to health anxiety among offspring with a history of serious parental illness.
Lehmann, A; Scheffler, Ch; Hermanussen, M
2010-02-01
Recent progress in modelling individual growth has been achieved by combining principal component analysis and the maximum likelihood principle. This combination models growth even in incomplete sets of data and in data obtained at irregular intervals. We re-analysed late 18th century longitudinal growth of German boys from the boarding school Carlsschule in Stuttgart. The boys, aged 6-23 years, were measured at irregular 3-12 monthly intervals during the period 1771-1793. At the age of 18 years, mean height was 1652 mm, but height variation was large. The shortest boy reached 1474 mm, the tallest 1826 mm. Measured height closely paralleled modelled height, with a mean difference of 4 mm (SD 7 mm). Seasonal height variation was found: low growth rates occurred in spring and high growth rates in summer and autumn. The present study demonstrates that combining principal component analysis and the maximum likelihood principle also enables growth modelling of historical height data. Copyright (c) 2009 Elsevier GmbH. All rights reserved.
Zhao, Xing; Zhou, Xiao-Hua; Feng, Zijian; Guo, Pengfei; He, Hongyan; Zhang, Tao; Duan, Lei; Li, Xiaosong
2013-01-01
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for binary outcomes was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis is an alternative and indirect way to identify potential clusters, and the test statistic is the extreme value of the likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test for significance. Both methods are applied to detecting spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation study on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise Kulldorff's statistics are superior.
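A minimal sketch of the ingredients described here: score one candidate window by its log-likelihood under the hypergeometric null and assess significance with a Monte Carlo test. The window, the counts, and the use of the negative null log-likelihood as the scan score are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np
from scipy.stats import hypergeom

rng = np.random.default_rng(1)

def window_null_loglik(cases_in, n_in, total_cases, total_pop):
    """Log-likelihood of the observed case count in a candidate window under
    the hypergeometric null (cases allocated at random w.r.t. population)."""
    return hypergeom.logpmf(cases_in, total_pop, total_cases, n_in)

# Toy data: one candidate circular window.
total_pop, total_cases = 10_000, 200
n_in, cases_in = 800, 40                      # window looks case-heavy
obs_stat = -window_null_loglik(cases_in, n_in, total_cases, total_pop)

# Monte Carlo significance: redistribute cases at random and recompute.
reps = 999
sim_counts = rng.hypergeometric(total_cases, total_pop - total_cases, n_in, size=reps)
sim_stat = -hypergeom.logpmf(sim_counts, total_pop, total_cases, n_in)
p_value = (1 + np.sum(sim_stat >= obs_stat)) / (reps + 1)
print(obs_stat, p_value)
```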
NASA Astrophysics Data System (ADS)
Hiemer, S.; Woessner, J.; Basili, R.; Danciu, L.; Giardini, D.; Wiemer, S.
2014-08-01
We present a time-independent gridded earthquake rate forecast for the European region including Turkey. The spatial component of our model is based on kernel density estimation techniques, which we applied to both past earthquake locations and fault moment release on mapped crustal faults and subduction zone interfaces with assigned slip rates. Our forecast relies on the assumption that the locations of past seismicity are a good guide to future seismicity, and that future large-magnitude events are more likely to occur in the vicinity of known faults. We show that the optimal weighted sum of the corresponding two spatial densities depends on the magnitude range considered. The kernel bandwidths and density weighting function are optimized using retrospective likelihood-based forecast experiments. We computed earthquake activity rates (a- and b-value) of the truncated Gutenberg-Richter distribution separately for crustal and subduction seismicity based on a maximum likelihood approach that considers the spatial and temporal completeness history of the catalogue. The final annual rate of our forecast is purely driven by the maximum likelihood fit of activity rates to the catalogue data, whereas its spatial component incorporates contributions from both earthquake and fault moment-rate densities. Our model constitutes one branch of the earthquake source model logic tree of the 2013 European seismic hazard model released by the EU-FP7 project `Seismic HAzard haRmonization in Europe' (SHARE) and contributes to the assessment of epistemic uncertainties in earthquake activity rates. We performed retrospective and pseudo-prospective likelihood consistency tests to underline the reliability of our model and SHARE's area source model (ASM) using the testing algorithms applied in the Collaboratory for the Study of Earthquake Predictability (CSEP). We comparatively tested our model's forecasting skill against the ASM and find statistically significantly better performance for testing periods of 10-20 yr. The testing results suggest that our model is a viable candidate model to serve for long-term forecasting on timescales of years to decades for the European region.
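The activity-rate estimation step rests on maximum-likelihood fitting of the Gutenberg-Richter b-value. As a much simpler illustration than the completeness-aware, truncated fit used in the paper, the sketch below applies the classical Aki/Utsu estimator to a synthetic catalogue that is complete above a fixed magnitude threshold; the synthetic catalogue and the 0.1-unit binning are assumptions.

```python
import numpy as np

def b_value_mle(mags, mc, dm=0.1):
    """Aki/Utsu maximum-likelihood b-value for magnitudes >= mc,
    with the standard half-bin correction for binned magnitudes."""
    m = np.asarray(mags, float)
    m = m[m >= mc]
    b = np.log10(np.e) / (m.mean() - (mc - dm / 2.0))
    b_err = b / np.sqrt(m.size)          # first-order (Aki) uncertainty
    return b, b_err, m.size

# Toy catalogue drawn from an exponential (Gutenberg-Richter) magnitude distribution.
rng = np.random.default_rng(2)
mags = 2.0 + rng.exponential(scale=np.log10(np.e) / 1.0, size=2000)  # true b = 1
mags = np.round(mags, 1)                 # bin to 0.1 magnitude units
print(b_value_mle(mags, mc=2.0))
```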
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million daily streamflow values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
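Each of the report's fitted equations reduces, for a given basin, month, and threshold, to a logistic form. The sketch below only evaluates such an equation; the coefficients and the winter-flow predictor are hypothetical placeholders, not values from the report's tables.

```python
import numpy as np

def drought_flow_probability(winter_flow, b0, b1):
    """Logistic model: probability that summer flow falls below a drought
    threshold, given an observed winter streamflow statistic.
    Coefficients b0 and b1 are hypothetical placeholders."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * winter_flow)))

# Example: lower winter flows imply a higher drought probability (b1 < 0).
winter_flows = np.array([5.0, 20.0, 60.0])   # e.g. mean November-February flow
print(drought_flow_probability(winter_flows, b0=1.5, b1=-0.05))
```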
Grinband, Jack; Savitskaya, Judith; Wager, Tor D; Teichert, Tobias; Ferrera, Vincent P; Hirsch, Joy
2011-07-15
The dorsal medial frontal cortex (dMFC) is highly active during choice behavior. Though many models have been proposed to explain dMFC function, the conflict monitoring model is the most influential. It posits that dMFC is primarily involved in detecting interference between competing responses thus signaling the need for control. It accurately predicts increased neural activity and response time (RT) for incompatible (high-interference) vs. compatible (low-interference) decisions. However, it has been shown that neural activity can increase with time on task, even when no decisions are made. Thus, the greater dMFC activity on incompatible trials may stem from longer RTs rather than response conflict. This study shows that (1) the conflict monitoring model fails to predict the relationship between error likelihood and RT, and (2) the dMFC activity is not sensitive to congruency, error likelihood, or response conflict, but is monotonically related to time on task. Copyright © 2010 Elsevier Inc. All rights reserved.
Probabilistic Modeling of the Renal Stone Formation Module
NASA Technical Reports Server (NTRS)
Best, Lauren M.; Myers, Jerry G.; Goodenow, Debra A.; McRae, Michael P.; Jackson, Travis C.
2013-01-01
The Integrated Medical Model (IMM) is a probabilistic tool, used in mission planning decision making and medical systems risk assessments. The IMM project maintains a database of over 80 medical conditions that could occur during a spaceflight, documenting an incidence rate and end case scenarios for each. In some cases, where observational data are insufficient to adequately define the inflight medical risk, the IMM utilizes external probabilistic modules to model and estimate the event likelihoods. One such medical event of interest is an unpassed renal stone. Due to a high salt diet and high concentrations of calcium in the blood (due to bone depletion caused by unloading in the microgravity environment), astronauts are at a considerably elevated risk for developing renal calculi (nephrolithiasis) while in space. Lack of observed incidences of nephrolithiasis has led HRP to initiate the development of the Renal Stone Formation Module (RSFM) to create a probabilistic simulator capable of estimating the likelihood of symptomatic renal stone presentation in astronauts on exploration missions. The model consists of two major parts. The first is the probabilistic component, which utilizes probability distributions to assess the range of urine electrolyte parameters and a multivariate regression to transform estimated crystal density and size distributions into the likelihood of the presentation of nephrolithiasis symptoms. The second is a deterministic physical and chemical model of renal stone growth in the kidney developed by Kassemi et al. The probabilistic component of the renal stone model couples the input probability distributions describing the urine chemistry, astronaut physiology, and system parameters with the physical and chemical outputs and inputs of the deterministic stone growth model. These two parts of the model are necessary to capture the uncertainty in the likelihood estimate. The model will be driven by Monte Carlo simulations, repeatedly drawing random samples from the probability distributions of the electrolyte concentrations and system parameters that are inputs to the deterministic model. The total urine chemistry concentrations are used to determine the urine chemistry activity using the Joint Expert Speciation System (JESS), a biochemistry model. Information from JESS is then fed into the deterministic growth model. Outputs from JESS and the deterministic model are passed back to the probabilistic model, where a multivariate regression is used to assess the likelihood of a stone forming and the likelihood of a stone requiring clinical intervention. The parameters used to quantify these risks include: relative supersaturation (RS) of calcium oxalate, citrate/calcium ratio, crystal number density, total urine volume, pH, magnesium excretion, maximum stone width, and ureteral location. Methods and Validation: The RSFM is designed to perform a Monte Carlo simulation to generate probability distributions of clinically significant renal stones, as well as provide an associated uncertainty in the estimate. Initially, early versions will be used to test integration of the components and assess component validation and verification (V&V), with later versions used to address questions regarding design reference mission scenarios. Once integrated with the deterministic component, the credibility assessment of the integrated model will follow NASA STD 7009 requirements.
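The coupling pattern described above can be sketched in a few lines: draw urine-chemistry inputs from probability distributions, push each draw through a deterministic growth model, and map the result to a likelihood of clinical significance. Everything named below (the input distributions, the stand-in growth model, and the logistic risk map) is an invented placeholder rather than the RSFM, JESS, or the Kassemi et al. model.

```python
import numpy as np

rng = np.random.default_rng(3)

def stand_in_growth_model(rs_caox, urine_volume_l):
    """Placeholder for the deterministic stone-growth model: returns a
    notional maximum stone width (mm) as a function of supersaturation
    and urine volume. Not the Kassemi et al. model."""
    return 2.0 * rs_caox / urine_volume_l

def symptomatic_probability(width_mm):
    """Placeholder logistic map from stone width to clinical significance."""
    return 1.0 / (1.0 + np.exp(-(width_mm - 3.0)))

n = 100_000
rs_caox = rng.lognormal(mean=0.5, sigma=0.4, size=n)               # relative supersaturation
urine_volume = rng.normal(loc=1.5, scale=0.3, size=n).clip(0.5)    # litres/day

widths = stand_in_growth_model(rs_caox, urine_volume)
p_symptomatic = symptomatic_probability(widths)
print("mean likelihood of clinically significant stone:", p_symptomatic.mean())
print("95% interval:", np.percentile(p_symptomatic, [2.5, 97.5]))
```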
Lee, E Henry; Wickham, Charlotte; Beedlow, Peter A; Waschmann, Ronald S; Tingey, David T
2017-10-01
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for climate and forest disturbances (i.e., pests, diseases, fire). The statistical method is illustrated with a tree-ring width time series for a mature closed-canopy Douglas-fir stand on the west slopes of the Cascade Mountains of Oregon, USA that is impacted by Swiss needle cast disease caused by the foliar fungus, Phaecryptopus gaeumannii (Rhode) Petrak. The likelihood-based TSIA method is proposed for the field of dendrochronology to understand the interaction of temperature, water, and forest disturbances that are important in forest ecology and climate change studies.
Multiple Cognitive Control Effects of Error Likelihood and Conflict
Brown, Joshua W.
2010-01-01
Recent work on cognitive control has suggested a variety of performance monitoring functions of the anterior cingulate cortex, such as errors, conflict, error likelihood, and others. Given the variety of monitoring effects, a corresponding variety of control effects on behavior might be expected. This paper explores whether conflict and error likelihood produce distinct cognitive control effects on behavior, as measured by response time. A change signal task (Brown & Braver, 2005) was modified to include conditions of likely errors due to tardy as well as premature responses, in conditions with and without conflict. The results discriminate between competing hypotheses of independent vs. interacting conflict and error likelihood control effects. Specifically, the results suggest that the likelihood of premature vs. tardy response errors can lead to multiple distinct control effects, which are independent of cognitive control effects driven by response conflict. As a whole, the results point to the existence of multiple distinct cognitive control mechanisms and challenge existing models of cognitive control that incorporate only a single control signal. PMID:19030873
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including the resort either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
The skewed weak lensing likelihood: why biases arise, despite data and theory being sound
NASA Astrophysics Data System (ADS)
Sellentin, Elena; Heymans, Catherine; Harnois-Déraps, Joachim
2018-07-01
We derive the essentials of the skewed weak lensing likelihood via a simple hierarchical forward model. Our likelihood passes four objective and cosmology-independent tests which a standard Gaussian likelihood fails. We demonstrate that sound weak lensing data are naturally biased low, since they are drawn from a skewed distribution. This occurs already in the framework of Lambda cold dark matter. Mathematically, the biases arise because noisy two-point functions follow skewed distributions. This form of bias is already known from cosmic microwave background analyses, where the low multipoles have asymmetric error bars. Weak lensing is more strongly affected by this asymmetry as galaxies form a discrete set of shear tracer particles, in contrast to a smooth shear field. We demonstrate that the biases can be up to 30 per cent of the standard deviation per data point, dependent on the properties of the weak lensing survey and the employed filter function. Our likelihood provides a versatile framework with which to address this bias in future weak lensing analyses.
The skewed weak lensing likelihood: why biases arise, despite data and theory being sound.
NASA Astrophysics Data System (ADS)
Sellentin, Elena; Heymans, Catherine; Harnois-Déraps, Joachim
2018-04-01
We derive the essentials of the skewed weak lensing likelihood via a simple Hierarchical Forward Model. Our likelihood passes four objective and cosmology-independent tests which a standard Gaussian likelihood fails. We demonstrate that sound weak lensing data are naturally biased low, since they are drawn from a skewed distribution. This occurs already in the framework of ΛCDM. Mathematically, the biases arise because noisy two-point functions follow skewed distributions. This form of bias is already known from CMB analyses, where the low multipoles have asymmetric error bars. Weak lensing is more strongly affected by this asymmetry as galaxies form a discrete set of shear tracer particles, in contrast to a smooth shear field. We demonstrate that the biases can be up to 30% of the standard deviation per data point, dependent on the properties of the weak lensing survey and the employed filter function. Our likelihood provides a versatile framework with which to address this bias in future weak lensing analyses.
Grummer, Jared A; Bryson, Robert W; Reeder, Tod W
2014-03-01
Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provide a useful and complementary alternative to existing species delimitation methods.
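Once marginal log-likelihoods have been estimated for competing delimitation models (by PS or SS, which the study finds far more reliable than harmonic-mean estimators), the Bayes factor comparison itself is a one-line calculation, sketched below with invented values.

```python
import numpy as np

# Marginal log-likelihoods (e.g. from stepping-stone analyses); values invented.
log_ml = {"lump_two_lineages": -14230.7, "split_three_lineages": -14221.2}

log_bf = log_ml["split_three_lineages"] - log_ml["lump_two_lineages"]
print("2 ln BF =", 2.0 * log_bf)   # values above 10 are conventionally read as decisive support
```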
Calibration of two complex ecosystem models with different likelihood functions
NASA Astrophysics Data System (ADS)
Hidy, Dóra; Haszpra, László; Pintér, Krisztina; Nagy, Zoltán; Barcza, Zoltán
2014-05-01
The biosphere is a sensitive carbon reservoir. Terrestrial ecosystems were approximately carbon neutral during the past centuries, but they became net carbon sinks due to climate change induced environmental change and the associated CO2 fertilization effect of the atmosphere. Model studies and measurements indicate that the biospheric carbon sink can saturate in the future due to ongoing climate change, which can act as a positive feedback. Robustness of carbon cycle models is a key issue when trying to choose the appropriate model for decision support. The input parameters of process-based models are decisive regarding the model output. At the same time there are several input parameters for which accurate values are hard to obtain directly from experiments or no local measurements are available. Due to the uncertainty associated with the unknown model parameters, significant bias can be experienced if the model is used to simulate the carbon and nitrogen cycle components of different ecosystems. In order to improve model performance, the unknown model parameters have to be estimated. We developed a multi-objective, two-step calibration method based on a Bayesian approach in order to estimate the unknown parameters of the PaSim and Biome-BGC models. Biome-BGC and PaSim are widely used biogeochemical models that simulate the storage and flux of water, carbon, and nitrogen between the ecosystem and the atmosphere, and within the components of terrestrial ecosystems (in this research the developed version of Biome-BGC is used, which is referred to as BBGC MuSo). Both models were calibrated regardless of the simulated processes and the type of model parameters. The calibration procedure is based on the comparison of measured data with simulated results via calculating a likelihood function (degree of goodness-of-fit between simulated and measured data). In our research, different likelihood function formulations were used in order to examine the effect of different model goodness metrics on calibration. The different likelihoods are different functions of RMSE (root mean squared error) weighted by measurement uncertainty: exponential / linear / quadratic / linear normalized by correlation. As a first calibration step, sensitivity analysis was performed in order to select the influential parameters which have a strong effect on the output data. In the second calibration step, only the sensitive parameters were calibrated (optimal values and confidence intervals were calculated). In the case of PaSim, more parameters were found responsible for 95% of the output data variance than in the case of BBGC MuSo. Analysis of the results of the optimized models revealed that the exponential likelihood estimation proved to be the most robust (best model simulation with optimized parameters, highest confidence interval increase). The cross-validation of the model simulations can help in constraining the highly uncertain greenhouse gas budget of grasslands.
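A minimal sketch of the kind of goodness-of-fit-to-likelihood mappings being compared: exponential, linear, and quadratic functions of an uncertainty-weighted RMSE. The exact formulations and normalizations used in the study may differ; the data and measurement uncertainties below are illustrative.

```python
import numpy as np

def weighted_rmse(sim, obs, obs_sigma):
    """RMSE of residuals weighted by measurement uncertainty."""
    r = (np.asarray(sim) - np.asarray(obs)) / np.asarray(obs_sigma)
    return np.sqrt(np.mean(r ** 2))

def likelihoods(sim, obs, obs_sigma):
    e = weighted_rmse(sim, obs, obs_sigma)
    return {
        "exponential": np.exp(-e),          # the formulation the study found most robust
        "linear": max(0.0, 1.0 - e),
        "quadratic": max(0.0, 1.0 - e ** 2),
    }

obs = np.array([2.1, 2.9, 4.2, 5.0])        # e.g. measured carbon fluxes
sim = np.array([2.0, 3.1, 4.0, 5.3])        # model output for a parameter set
sigma = np.full_like(obs, 0.3)              # measurement uncertainty
print(likelihoods(sim, obs, sigma))
```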
DarkBit: a GAMBIT module for computing dark matter observables and likelihoods
NASA Astrophysics Data System (ADS)
Bringmann, Torsten; Conrad, Jan; Cornell, Jonathan M.; Dal, Lars A.; Edsjö, Joakim; Farmer, Ben; Kahlhoefer, Felix; Kvellestad, Anders; Putze, Antje; Savage, Christopher; Scott, Pat; Weniger, Christoph; White, Martin; Wild, Sebastian
2017-12-01
We introduce DarkBit, an advanced software code for computing dark matter constraints on various extensions to the Standard Model of particle physics, comprising both new native code and interfaces to external packages. This release includes a dedicated signal yield calculator for gamma-ray observations, which significantly extends current tools by implementing a cascade-decay Monte Carlo, as well as a dedicated likelihood calculator for current and future experiments ( gamLike). This provides a general solution for studying complex particle physics models that predict dark matter annihilation to a multitude of final states. We also supply a direct detection package that models a large range of direct detection experiments ( DDCalc), and that provides the corresponding likelihoods for arbitrary combinations of spin-independent and spin-dependent scattering processes. Finally, we provide custom relic density routines along with interfaces to DarkSUSY, micrOMEGAs, and the neutrino telescope likelihood package nulike. DarkBit is written in the framework of the Global And Modular Beyond the Standard Model Inference Tool ( GAMBIT), providing seamless integration into a comprehensive statistical fitting framework that allows users to explore new models with both particle and astrophysics constraints, and a consistent treatment of systematic uncertainties. In this paper we describe its main functionality, provide a guide to getting started quickly, and show illustrative examples for results obtained with DarkBit (both as a stand-alone tool and as a GAMBIT module). This includes a quantitative comparison between two of the main dark matter codes ( DarkSUSY and micrOMEGAs), and application of DarkBit 's advanced direct and indirect detection routines to a simple effective dark matter model.
Empirical likelihood inference in randomized clinical trials.
Zhang, Biao
2017-01-01
In individually randomized controlled trials, in addition to the primary outcome, information is often available on a number of covariates prior to randomization. This information is frequently utilized to undertake adjustment for baseline characteristics in order to increase the precision of the estimation of average treatment effects; such adjustment is usually performed via covariate adjustment in outcome regression models. Although the use of covariate adjustment is widely seen as desirable for making treatment effect estimates more precise and the corresponding hypothesis tests more powerful, there are considerable concerns that objective inference in randomized clinical trials can potentially be compromised. In this paper, we study an empirical likelihood approach to covariate adjustment and propose two unbiased estimating functions that automatically decouple evaluation of average treatment effects from regression modeling of covariate-outcome relationships. The resulting empirical likelihood estimator of the average treatment effect is as efficient as the existing efficient adjusted estimators [1] when separate treatment-specific working regression models are correctly specified, yet is at least as efficient as the existing efficient adjusted estimators [1] for any given treatment-specific working regression models, whether or not they coincide with the true treatment-specific covariate-outcome relationships. We present a simulation study to compare the finite sample performance of various methods along with some results on analysis of a data set from an HIV clinical trial. The simulation results indicate that the proposed empirical likelihood approach is more efficient and powerful than its competitors when the working covariate-outcome relationships by treatment status are misspecified.
NASA Technical Reports Server (NTRS)
Lei, Ning; Chiang, Kwo-Fu; Oudrari, Hassan; Xiong, Xiaoxiong
2011-01-01
Optical sensors aboard Earth orbiting satellites such as the next generation Visible/Infrared Imager/Radiometer Suite (VIIRS) assume that the sensor's radiometric response in the Reflective Solar Bands (RSB) is described by a quadratic polynomial, in relating the aperture spectral radiance to the sensor Digital Number (DN) readout. For VIIRS Flight Unit 1, the coefficients are to be determined before launch by an attenuation method, although the linear coefficient will be further determined on-orbit through observing the Solar Diffuser. In determining the quadratic polynomial coefficients by the attenuation method, a Maximum Likelihood approach is applied in carrying out the least-squares procedure. Crucial to the Maximum Likelihood least-squares procedure is the computation of the weight. The weight not only has a contribution from the noise of the sensor's digital count, with an important contribution from digitization error, but also is affected heavily by the mathematical expression used to predict the value of the dependent variable, because both the independent and the dependent variables contain random noise. In addition, model errors have a major impact on the uncertainties of the coefficients. The Maximum Likelihood approach demonstrates the inadequacy of the attenuation method model with a quadratic polynomial for the retrieved spectral radiance. We show that using the inadequate model dramatically increases the uncertainties of the coefficients. We compute the coefficient values and their uncertainties, considering both measurement and model errors.
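The least-squares core of such a calibration can be sketched as a weighted quadratic polynomial fit of radiance against digital number. The simulated calibration data and the simple 1/sigma weighting below are assumptions; the full Maximum Likelihood treatment described above also propagates noise in the independent variable and model error, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated calibration: radiance L related to digital number dn by a quadratic,
# L = c0 + c1*dn + c2*dn**2, with noise on the measured radiance.
true_c = np.array([0.5, 2.0e-2, 3.0e-6])
dn = np.linspace(100, 4000, 40)
sigma_L = 0.02 * (true_c[0] + true_c[1] * dn + true_c[2] * dn ** 2)
L = true_c[0] + true_c[1] * dn + true_c[2] * dn ** 2 + rng.normal(0, sigma_L)

# Weighted least squares (weights = 1/sigma, per numpy's polyfit convention).
c2, c1, c0 = np.polyfit(dn, L, deg=2, w=1.0 / sigma_L)
print([c0, c1, c2])
```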
On the Likelihood Ratio Test for the Number of Factors in Exploratory Factor Analysis
ERIC Educational Resources Information Center
Hayashi, Kentaro; Bentler, Peter M.; Yuan, Ke-Hai
2007-01-01
In the exploratory factor analysis, when the number of factors exceeds the true number of factors, the likelihood ratio test statistic no longer follows the chi-square distribution due to a problem of rank deficiency and nonidentifiability of model parameters. As a result, decisions regarding the number of factors may be incorrect. Several…
Nowakowska, Marzena
2017-04-01
The development of a Bayesian logistic regression model classifying road accident severity is discussed. Previously exploited informative priors (method of moments, maximum likelihood estimation, and two-stage Bayesian updating), along with an original Boot prior proposal, are investigated for the case where no expert opinion is available. In addition, two possible approaches to updating the priors, in the form of unbalanced and balanced training data sets, are presented. The obtained Bayesian logistic models are assessed on the basis of a deviance information criterion (DIC), highest probability density (HPD) intervals, and coefficients of variation estimated for the model parameters. The verification of model accuracy is based on sensitivity, specificity, and the harmonic mean of sensitivity and specificity, all calculated from a test data set. The models obtained from the balanced training data set have a better classification quality than the ones obtained from the unbalanced training data set. The two-stage Bayesian updating prior model and the Boot prior model, both identified with the use of the balanced training data set, outperform the non-informative, method of moments, and maximum likelihood estimation prior models. It is important to note that one should be careful when interpreting the parameters, since different priors can lead to different models. Copyright © 2017 Elsevier Ltd. All rights reserved.
Cox Regression Models with Functional Covariates for Survival Data.
Gellar, Jonathan E; Colantuoni, Elizabeth; Needham, Dale M; Crainiceanu, Ciprian M
2015-06-01
We extend the Cox proportional hazards model to cases when the exposure is a densely sampled functional process, measured at baseline. The fundamental idea is to combine penalized signal regression with methods developed for mixed effects proportional hazards models. The model is fit by maximizing the penalized partial likelihood, with smoothing parameters estimated by a likelihood-based criterion such as AIC or EPIC. The model may be extended to allow for multiple functional predictors, time varying coefficients, and missing or unequally-spaced data. Methods were inspired by and applied to a study of the association between time to death after hospital discharge and daily measures of disease severity collected in the intensive care unit, among survivors of acute respiratory distress syndrome.
Silveira, Maria J; Copeland, Laurel A; Feudtner, Chris
2006-07-01
We tested whether local cultural and social values regarding the use of health care are associated with the likelihood of home death, using variation in local rates of home births as a proxy for geographic variation in these values. For each of 351110 adult decedents in Washington state who died from 1989 through 1998, we calculated the home birth rate in each zip code during the year of death and then used multivariate regression modeling to estimate the relation between the likelihood of home death and the local rate of home births. Individuals residing in local areas with higher home birth rates had greater adjusted likelihood of dying at home (odds ratio [OR]=1.04 for each percentage point increase in home birth rate; 95% confidence interval [CI] = 1.03, 1.05). Moreover, the likelihood of dying at home increased with local wealth (OR=1.04 per $10000; 95% CI=1.02, 1.06) but decreased with local hospital bed availability (OR=0.96 per 1000 beds; 95% CI=0.95, 0.97). The likelihood of home death is associated with local rates of home births, suggesting the influence of health care use preferences.
Wang, Jiun-Hao; Chang, Hung-Hao
2010-10-26
In contrast to the considerable body of literature concerning the disabilities of the general population, little information exists pertaining to the disabilities of the farm population. Focusing on the disability issue for insurants in the Farmers' Health Insurance (FHI) program in Taiwan, this paper examines the associations among socio-demographic characteristics, insured factors, and the introduction of the national health insurance program, as well as the types and payments of disabilities among the insurants. A unique dataset containing 1,594,439 insurants in 2008 was used in this research. A logistic regression model was estimated for the likelihood of received disability payments. Focusing on the recipients, a disability payment equation and a disability type equation were estimated using the ordinary least squares method and a multinomial logistic model, respectively, to investigate the effects of the exogenous factors on their received payments and the likelihood of having different types of disabilities. Age and different job categories are significantly associated with the likelihood of receiving disability payments. Compared to those under age 45, the likelihood is higher among recipients aged 85 and above (the odds ratio is 8.04). Compared to hired workers, the odds ratios for self-employed workers and spouses of farm operators who were not members of farmers' associations are 0.97 and 0.85, respectively. In addition, older insurants are more likely to have eye problems; few differences in disability types are related to insured job categories. Results indicate that older farmers are more likely to receive disability payments, but the likelihood is not much different among insurants of various job categories. Among all of the selected types of disability, the highest likelihood is found for eye disability. In addition, the introduction of the national health insurance program decreases the likelihood of receiving disability payments. The experience in Taiwan can be valuable for other countries that are in the initial stages of implementing a universal health insurance program.
NASA Astrophysics Data System (ADS)
Balbi, S.; Villa, F.; Mojtahed, V.; Hegetschweiler, K. T.; Giupponi, C.
2015-10-01
This article presents a novel methodology to assess flood risk to people by integrating people's vulnerability and ability to cushion hazards through coping and adapting. The proposed approach extends traditional risk assessments beyond material damages; complements quantitative and semi-quantitative data with subjective and local knowledge, improving the use of commonly available information; produces estimates of model uncertainty by providing probability distributions for all of its outputs. Flood risk to people is modeled using a spatially explicit Bayesian network model calibrated on expert opinion. Risk is assessed in terms of: (1) likelihood of non-fatal physical injury; (2) likelihood of post-traumatic stress disorder; (3) likelihood of death. The study area covers the lower part of the Sihl valley (Switzerland) including the city of Zurich. The model is used to estimate the benefits of improving an existing Early Warning System, taking into account the reliability, lead-time and scope (i.e. coverage of people reached by the warning). Model results indicate that the potential benefits of an improved early warning in terms of avoided human impacts are particularly relevant in case of a major flood event: about 75 % of fatalities, 25 % of injuries and 18 % of post-traumatic stress disorders could be avoided.
Likelihood of achieving air quality targets under model uncertainties.
Digar, Antara; Cohan, Daniel S; Cox, Dennis D; Kim, Byeong-Uk; Boylan, James W
2011-01-01
Regulatory attainment demonstrations in the United States typically apply a bright-line test to predict whether a control strategy is sufficient to attain an air quality standard. Photochemical models are the best tools available to project future pollutant levels and are a critical part of regulatory attainment demonstrations. However, because photochemical models are uncertain and future meteorology is unknowable, future pollutant levels cannot be predicted perfectly and attainment cannot be guaranteed. This paper introduces a computationally efficient methodology for estimating the likelihood that an emission control strategy will achieve an air quality objective in light of uncertainties in photochemical model input parameters (e.g., uncertain emission and reaction rates, deposition velocities, and boundary conditions). The method incorporates Monte Carlo simulations of a reduced form model representing pollutant-precursor response under parametric uncertainty to probabilistically predict the improvement in air quality due to emission control. The method is applied to recent 8-h ozone attainment modeling for Atlanta, Georgia, to assess the likelihood that additional controls would achieve fixed (well-defined) or flexible (due to meteorological variability and uncertain emission trends) targets of air pollution reduction. The results show that in certain instances ranking of the predicted effectiveness of control strategies may differ between probabilistic and deterministic analyses.
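A minimal sketch of the probabilistic attainment test: perturb a reduced-form control sensitivity by sampled parametric uncertainty and report the fraction of draws in which the controlled design value meets the standard. The baseline value, the standard, the nominal sensitivity, and the lognormal uncertainty factor are illustrative numbers, not values from the Atlanta application.

```python
import numpy as np

rng = np.random.default_rng(5)

baseline_dv = 0.088          # baseline 8-h ozone design value (ppm), illustrative
standard = 0.084             # attainment target (ppm)
nominal_sensitivity = 0.004  # nominal ppm reduction achieved by the control strategy

# Parametric uncertainty (e.g. emissions, rate constants) enters here as a
# multiplicative factor on the reduced-form sensitivity.
n = 50_000
uncertainty_factor = rng.lognormal(mean=0.0, sigma=0.3, size=n)
controlled_dv = baseline_dv - nominal_sensitivity * uncertainty_factor

likelihood_of_attainment = np.mean(controlled_dv <= standard)
print(f"Estimated likelihood of attainment: {likelihood_of_attainment:.2f}")
```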
Efficient Exploration of the Space of Reconciled Gene Trees
Szöllősi, Gergely J.; Rosikiewicz, Wojciech; Boussau, Bastien; Tannier, Eric; Daubin, Vincent
2013-01-01
Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes, we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family, respectively. The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. [amalgamation; gene tree reconciliation; gene tree reconstruction; lateral gene transfer; phylogeny.] PMID:23925510
Reuter, Tabea; Renner, Britta
2011-01-01
Background: In order to fight the spread of the novel H1N1 influenza, health authorities worldwide called for a change in hygiene behavior. Within a longitudinal study, we examined who collected a free bottle of hand sanitizer towards the end of the first swine flu pandemic wave in December 2009. Methods: 629 participants took part in a longitudinal study assessing perceived likelihood and severity of an H1N1 infection, and H1N1 influenza related negative affect (i.e., feelings of threat, concern, and worry) at T1 (October 2009, week 43–44) and T2 (December 2009, week 51–52). Importantly, all participants received a voucher for a bottle of hand sanitizer at T2 which could be redeemed in a university office newly established for this occasion at T3 (ranging between 1–4 days after T2). Results: Both a sequential longitudinal model (M2) as well as a change score model (M3) showed that greater perceived likelihood and severity at T1 (M2) or changes in perceived likelihood and severity between T1 and T2 (M3) did not directly drive protective behavior (T3), but showed a significant indirect impact on behavior through H1N1 influenza related negative affect. Specifically, increases in perceived likelihood (β = .12), severity (β = .24) and their interaction (β = .13) were associated with a more pronounced change in negative affect (M3). The more threatened, concerned and worried people felt (T2), the more likely they were to redeem the voucher at T3 (OR = 1.20). Conclusions: Affective components need to be considered in health behavior models. Perceived likelihood and severity of an influenza infection represent necessary but not sufficient self-referential knowledge for paving the way for preventive behaviors. PMID:21789224
Reuter, Tabea; Renner, Britta
2011-01-01
In order to fight the spread of the novel H1N1 influenza, health authorities worldwide called for a change in hygiene behavior. Within a longitudinal study, we examined who collected a free bottle of hand sanitizer towards the end of the first swine flu pandemic wave in December 2009. 629 participants took part in a longitudinal study assessing perceived likelihood and severity of an H1N1 infection, and H1N1 influenza related negative affect (i.e., feelings of threat, concern, and worry) at T1 (October 2009, week 43-44) and T2 (December 2009, week 51-52). Importantly, all participants received a voucher for a bottle of hand sanitizer at T2 which could be redeemed in a university office newly established for this occasion at T3 (ranging between 1-4 days after T2). Both a sequential longitudinal model (M2) as well as a change score model (M3) showed that greater perceived likelihood and severity at T1 (M2) or changes in perceived likelihood and severity between T1 and T2 (M3) did not directly drive protective behavior (T3), but showed a significant indirect impact on behavior through H1N1 influenza related negative affect. Specifically, increases in perceived likelihood (β = .12), severity (β = .24) and their interaction (β = .13) were associated with a more pronounced change in negative affect (M3). The more threatened, concerned and worried people felt (T2), the more likely they were to redeem the voucher at T3 (OR = 1.20). Affective components need to be considered in health behavior models. Perceived likelihood and severity of an influenza infection represent necessary but not sufficient self-referential knowledge for paving the way for preventive behaviors.
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model
Xu, Bo; Yang, Ziheng
2016-01-01
The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour.
Edwards, Andrew M; Freeman, Mervyn P; Breed, Greg A; Jonsen, Ian D
2012-01-01
Ecologists are collecting extensive data concerning movements of animals in marine ecosystems. Such data need to be analysed with valid statistical methods to yield meaningful conclusions. We demonstrate methodological issues in two recent studies that reached similar conclusions concerning movements of marine animals (Nature 451:1098; Science 332:1551). The first study analysed vertical movement data to conclude that diverse marine predators (Atlantic cod, basking sharks, bigeye tuna, leatherback turtles and Magellanic penguins) exhibited "Lévy-walk-like behaviour", close to a hypothesised optimal foraging strategy. By reproducing the original results for the bigeye tuna data, we show that the likelihood of tested models was calculated from residuals of regression fits (an incorrect method), rather than from the likelihood equations of the actual probability distributions being tested. This resulted in erroneous Akaike Information Criteria, and the testing of models that do not correspond to valid probability distributions. We demonstrate how this led to overwhelming support for a model that has no biological justification and that is statistically spurious because its probability density function goes negative. Re-analysis of the bigeye tuna data, using standard likelihood methods, overturns the original result and conclusion for that data set. The second study observed Lévy walk movement patterns by mussels. We demonstrate several issues concerning the likelihood calculations (including the aforementioned residuals issue). Re-analysis of the data rejects the original Lévy walk conclusion. We consequently question the claimed existence of scaling laws of the search behaviour of marine predators and mussels, since such conclusions were reached using incorrect methods. We discourage the suggested potential use of "Lévy-like walks" when modelling consequences of fishing and climate change, and caution that any resulting advice to managers of marine ecosystems would be problematic. For reproducibility and future work we provide R source code for all calculations.
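The central correction advocated here is to compute model-selection criteria from the log-likelihood of the actual candidate distribution evaluated at its maximum-likelihood fit, not from regression residuals. A minimal sketch, comparing exponential and lognormal step-length models on toy data (the candidate set and the data are illustrative, not the bigeye tuna analysis):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
steps = rng.exponential(scale=3.0, size=500)    # toy "movement step length" data

def aic_from_mle(dist, data, floc=0):
    """Fit by maximum likelihood and return AIC computed from the actual
    log-likelihood of the candidate distribution (not regression residuals)."""
    params = dist.fit(data, floc=floc)           # MLE with location fixed at 0
    loglik = np.sum(dist.logpdf(data, *params))
    k = len(params) - 1                          # fixed location is not estimated
    return 2 * k - 2 * loglik

print("exponential AIC:", aic_from_mle(stats.expon, steps))
print("lognormal AIC:  ", aic_from_mle(stats.lognorm, steps))
```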
Factors associated with persons with disability employment in India: a cross-sectional study.
Naraharisetti, Ramya; Castro, Marcia C
2016-10-07
Over twenty million persons with disability in India are increasingly being offered poverty alleviation strategies, including employment programs. This study employs a spatial analytic approach to identify correlates of employment among persons with disability in India, considering sight, speech, hearing, movement, and mental disabilities. Based on 2001 Census data, this study utilizes linear regression and spatial autoregressive models to identify factors associated with the proportion employed among persons with disability at the district level. Models stratified by rural and urban areas were also considered. Spatial autoregressive models revealed that different factors contribute to employment of persons with disability in rural and urban areas. In rural areas, having mental disability decreased the likelihood of employment, while being female and having movement, or sight impairment (compared to other disabilities) increased the likelihood of employment. In urban areas, being female and illiterate decreased the likelihood of employment but having sight, mental and movement impairment (compared to other disabilities) increased the likelihood of employment. Poverty alleviation programs designed for persons with disability in India should account for differences in employment by disability types and should be spatially targeted. Since persons with disability in rural and urban areas have different factors contributing to their employment, it is vital that government and service-planning organizations account for these differences when creating programs aimed at livelihood development.
A likelihood ratio anomaly detector for identifying within-perimeter computer network attacks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grana, Justin; Wolpert, David; Neil, Joshua
The rapid detection of attackers within firewalls of enterprise computer networks is of paramount importance. Anomaly detectors address this problem by quantifying deviations from baseline statistical models of normal network behavior and signaling an intrusion when the observed data deviates significantly from the baseline model. But, many anomaly detectors do not take into account plausible attacker behavior. As a result, anomaly detectors are prone to a large number of false positives due to unusual but benign activity. Our paper first introduces a stochastic model of attacker behavior which is motivated by real world attacker traversal. Then, we develop a likelihood ratio detector that compares the probability of observed network behavior under normal conditions against the case when an attacker has possibly compromised a subset of hosts within the network. Since the likelihood ratio detector requires integrating over the time each host becomes compromised, we illustrate how to use Monte Carlo methods to compute the requisite integral. We then present Receiver Operating Characteristic (ROC) curves for various network parameterizations that show for any rate of true positives, the rate of false positives for the likelihood ratio detector is no higher than that of a simple anomaly detector and is often lower. Finally, we demonstrate the superiority of the proposed likelihood ratio detector when the network topologies and parameterizations are extracted from real-world networks.
A likelihood ratio anomaly detector for identifying within-perimeter computer network attacks
Grana, Justin; Wolpert, David; Neil, Joshua; ...
2016-03-11
The rapid detection of attackers within firewalls of enterprise computer networks is of paramount importance. Anomaly detectors address this problem by quantifying deviations from baseline statistical models of normal network behavior and signaling an intrusion when the observed data deviates significantly from the baseline model. But, many anomaly detectors do not take into account plausible attacker behavior. As a result, anomaly detectors are prone to a large number of false positives due to unusual but benign activity. Our paper first introduces a stochastic model of attacker behavior which is motivated by real world attacker traversal. Then, we develop a likelihood ratio detector that compares the probability of observed network behavior under normal conditions against the case when an attacker has possibly compromised a subset of hosts within the network. Since the likelihood ratio detector requires integrating over the time each host becomes compromised, we illustrate how to use Monte Carlo methods to compute the requisite integral. We then present Receiver Operating Characteristic (ROC) curves for various network parameterizations that show for any rate of true positives, the rate of false positives for the likelihood ratio detector is no higher than that of a simple anomaly detector and is often lower. Finally, we demonstrate the superiority of the proposed likelihood ratio detector when the network topologies and parameterizations are extracted from real-world networks.
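A minimal sketch of the likelihood-ratio idea for a single host's per-window event counts: under the attack hypothesis the event rate jumps at an unknown compromise time, so the attack likelihood is a Monte Carlo average over that time. The Poisson traffic model, the known baseline and compromised rates, and the uniform prior on compromise time are simplifying assumptions, not the paper's network-wide model.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(7)

def log_lik_normal(counts, lam0):
    """Log-likelihood of the counts under the baseline (normal) Poisson rate."""
    return poisson.logpmf(counts, lam0).sum()

def log_lik_attack_mc(counts, lam0, lam1, n_mc=2000):
    """Monte Carlo average of the attack likelihood over the unknown
    compromise time tau (uniform prior over observation windows)."""
    T = len(counts)
    taus = rng.integers(0, T, size=n_mc)
    logs = np.array([
        poisson.logpmf(counts[:tau], lam0).sum() +
        poisson.logpmf(counts[tau:], lam1).sum()
        for tau in taus
    ])
    m = logs.max()                                   # log-sum-exp for stability
    return m + np.log(np.mean(np.exp(logs - m)))

# Per-window event counts on one host; the last windows look elevated.
counts = np.array([3, 2, 4, 3, 2, 3, 9, 11, 10])
lam0, lam1 = 3.0, 10.0            # baseline vs. compromised rates (assumed known)
log_lr = log_lik_attack_mc(counts, lam0, lam1) - log_lik_normal(counts, lam0)
print("log likelihood ratio:", log_lr)   # large positive values flag an attack
```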
NASA Astrophysics Data System (ADS)
Morse, Brad S.; Pohll, Greg; Huntington, Justin; Rodriguez Castillo, Ramiro
2003-06-01
In 1992, Mexican researchers discovered concentrations of arsenic in excess of World Health Organization (WHO) standards in several municipal wells in the Zimapan Valley of Mexico. This study describes a method to delineate a capture zone for one of the most highly contaminated wells to aid in future well siting. A stochastic approach was used to model the capture zone because of the high level of uncertainty in several input parameters. Two stochastic techniques were performed and compared: "standard" Monte Carlo analysis and the generalized likelihood uncertainty estimator (GLUE) methodology. The GLUE procedure differs from standard Monte Carlo analysis in that it incorporates a goodness of fit (termed a likelihood measure) in evaluating the model. This allows for more information (in this case, head data) to be used in the uncertainty analysis, resulting in smaller prediction uncertainty. Two likelihood measures are tested in this study to determine which is in better agreement with the observed heads. While the standard Monte Carlo approach does not aid in parameter estimation, the GLUE methodology indicates best fit models when hydraulic conductivity is approximately 10^-6.5 m/s, with vertically isotropic conditions and large quantities of interbasin flow entering the basin. Probabilistic isochrones (capture zone boundaries) are then presented, and as predicted, the GLUE-derived capture zones are significantly smaller in area than those from the standard Monte Carlo approach.
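The GLUE workflow contrasted with plain Monte Carlo above can be sketched as: sample parameters, score each run with a likelihood measure against observed heads, discard non-behavioural runs, and form likelihood-weighted summaries. The toy head "model", the inverse-error-variance likelihood measure, and the behavioural threshold below are placeholders, not the Zimapan application.

```python
import numpy as np

rng = np.random.default_rng(8)

obs_heads = np.array([102.0, 100.5, 99.0])        # observed heads at three wells

def toy_flow_model(log10_k):
    """Placeholder for the groundwater model: heads as a function of log10(K)."""
    return np.array([104.0, 102.0, 100.0]) + 1.5 * (log10_k + 6.5)

# 1) Sample the uncertain parameter (log10 hydraulic conductivity, m/s).
log10_k = rng.uniform(-8.0, -5.0, size=5000)
sims = np.array([toy_flow_model(k) for k in log10_k])

# 2) Likelihood measure: inverse error variance (one common GLUE choice).
err_var = np.mean((sims - obs_heads) ** 2, axis=1)
likelihood = 1.0 / err_var

# 3) Keep behavioural runs and normalize their weights.
behavioural = likelihood > np.percentile(likelihood, 90)   # ad hoc threshold
w = likelihood[behavioural] / likelihood[behavioural].sum()

# 4) Likelihood-weighted posterior summary of the parameter.
k_post = log10_k[behavioural]
print("weighted mean log10(K):", np.sum(w * k_post))
```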
Kinematic Structural Modelling in Bayesian Networks
NASA Astrophysics Data System (ADS)
Schaaf, Alexander; de la Varga, Miguel; Florian Wellmann, J.
2017-04-01
We commonly capture our knowledge about the spatial distribution of distinct geological lithologies in the form of 3-D geological models. Several methods exist to create these models, each with its own strengths and limitations. We present here an approach to combine the functionalities of two modeling approaches - implicit interpolation and kinematic modelling methods - into one framework, while explicitly considering parameter uncertainties and thus model uncertainty. In recent work, we proposed an approach to implement implicit modelling algorithms into Bayesian networks. This was done to address the issues of input data uncertainty and integration of geological information from varying sources in the form of geological likelihood functions. However, one general shortcoming of implicit methods is that they usually do not take any physical constraints into consideration, which can result in unrealistic model outcomes and artifacts. On the other hand, kinematic structural modelling intends to reconstruct the history of a geological system based on physically driven kinematic events. This type of modelling incorporates simplified, physical laws into the model, at the cost of a substantial increment of usable uncertain parameters. In the work presented here, we show an integration of these two different modelling methodologies, taking advantage of the strengths of both of them. First, we treat the two types of models separately, capturing the information contained in the kinematic models and their specific parameters in the form of likelihood functions, in order to use them in the implicit modelling scheme. We then go further and combine the two modelling approaches into one single Bayesian network. This enables the direct flow of information between the parameters of the kinematic modelling step and the implicit modelling step and links the exclusive input data and likelihoods of the two different modelling algorithms into one probabilistic inference framework. In addition, we use the capabilities of Noddy to analyze the topology of structural models to demonstrate how topological information, such as the connectivity of two layers across an unconformity, can be used as a likelihood function. In an application to a synthetic case study, we show that our approach leads to a successful combination of the two different modelling concepts. Specifically, we show that we derive ensemble realizations of implicit models that now incorporate the knowledge of the kinematic aspects, representing an important step forward in the integration of knowledge and a corresponding estimation of uncertainties in structural geological models.
A composite likelihood approach for spatially correlated survival data
Paik, Jane; Ying, Zhiliang
2013-01-01
The aim of this paper is to provide a composite likelihood approach to handle spatially correlated survival data using pairwise joint distributions. With e-commerce data, a recent question of interest in marketing research has been to describe spatially clustered purchasing behavior and to assess whether geographic distance is the appropriate metric to describe purchasing dependence. We present a model for the dependence structure of time-to-event data subject to spatial dependence to characterize purchasing behavior from the motivating example from e-commerce data. We assume the Farlie-Gumbel-Morgenstern (FGM) distribution and then model the dependence parameter as a function of geographic and demographic pairwise distances. For estimation of the dependence parameters, we present pairwise composite likelihood equations. We prove that the resulting estimators exhibit key properties of consistency and asymptotic normality under certain regularity conditions in the increasing-domain framework of spatial asymptotic theory. PMID:24223450
A composite likelihood approach for spatially correlated survival data.
Paik, Jane; Ying, Zhiliang
2013-01-01
The aim of this paper is to provide a composite likelihood approach to handle spatially correlated survival data using pairwise joint distributions. With e-commerce data, a recent question of interest in marketing research has been to describe spatially clustered purchasing behavior and to assess whether geographic distance is the appropriate metric to describe purchasing dependence. We present a model for the dependence structure of time-to-event data subject to spatial dependence to characterize purchasing behavior from the motivating example from e-commerce data. We assume the Farlie-Gumbel-Morgenstern (FGM) distribution and then model the dependence parameter as a function of geographic and demographic pairwise distances. For estimation of the dependence parameters, we present pairwise composite likelihood equations. We prove that the resulting estimators exhibit key properties of consistency and asymptotic normality under certain regularity conditions in the increasing-domain framework of spatial asymptotic theory.
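The pairwise composite likelihood for the FGM copula can be sketched as follows; the exponential marginals (treated as known), the distance-decay form of the dependence parameter, and the independently simulated data are illustrative assumptions, not the e-commerce model of the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from itertools import combinations

rng = np.random.default_rng(2)

# Illustrative data: exponential event times at spatial sites; the marginals are
# treated as known so the sketch focuses on the dependence parameter only.
n, rate = 60, 0.5
times = rng.exponential(1.0 / rate, size=n)
coords = rng.uniform(0, 10, size=(n, 2))

U = 1.0 - np.exp(-rate * times)                      # marginal CDF values F(t_i)
pairs = list(combinations(range(n), 2))
dists = np.array([np.linalg.norm(coords[i] - coords[j]) for i, j in pairs])
prod = np.array([(1 - 2 * U[i]) * (1 - 2 * U[j]) for i, j in pairs])

def neg_pairwise_cl(beta):
    # Assumed form: FGM dependence decays with distance and must stay in [-1, 1];
    # marginal density terms are constant in beta and are dropped.
    theta = np.clip(beta * np.exp(-dists / 2.0), -1.0, 1.0)
    # FGM copula density: c(u, v) = 1 + theta * (1 - 2u) * (1 - 2v)
    return -np.sum(np.log(1.0 + theta * prod))

fit = minimize_scalar(neg_pairwise_cl, bounds=(-1.0, 1.0), method="bounded")
# The simulated sites are actually independent, so the estimate should be near 0.
print("estimated dependence scale beta:", round(fit.x, 3))
```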
Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2014-01-01
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2010-01-01
In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
ERIC Educational Resources Information Center
Khattab, Ali-Maher; And Others
1982-01-01
A causal modeling system, using confirmatory maximum likelihood factor analysis with the LISREL IV computer program, evaluated the construct validity underlying the higher order factor structure of a given correlation matrix of 46 structure-of-intellect tests emphasizing the product of transformations. (Author/PN)
Empirical likelihood method for non-ignorable missing data problems.
Guan, Zhong; Qin, Jing
2017-01-01
The missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missingness is the most difficult missing data problem, where the missingness of a response depends on its own value. In the statistical literature, unlike for the ignorable missing data problem, few papers on non-ignorable missing data are available beyond fully parametric model based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen (1988)'s empirical likelihood method we obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial data set shows that the missingness of CD4 counts around two years is non-ignorable and the sample mean based on observed data only is biased.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Montero-Barrera, Daniel; Valderrama-Carvajal, Héctor; Terrazas, César A.; Rojas-Hernández, Saúl; Ledesma-Soto, Yadira; Vera-Arias, Laura; Carrasco-Yépez, Maricela; Gómez-García, Lorena; Martínez-Saucedo, Diana; Becerra-Díaz, Mireya; Terrazas, Luis I.
2015-01-01
C-type lectins are multifunctional sugar-binding molecules expressed on dendritic cells (DCs) and macrophages that internalize antigens for processing and presentation. Macrophage galactose-type lectin 1 (MGL1) recognizes glycoconjugates expressing Lewis X structures which contain galactose residues, and it is selectively expressed on immature DCs and macrophages. Helminth parasites contain large amounts of glycosylated components, which play a role in the immune regulation induced by such infections. Macrophages from MGL1−/− mice showed less binding ability toward parasite antigens than their wild-type (WT) counterparts. Exposure of WT macrophages to T. crassiceps antigens triggered tyrosine phosphorylation signaling activity, which was diminished in MGL1−/− macrophages. Following T. crassiceps infection, MGL1−/− mice failed to produce significant levels of inflammatory cytokines early in the infection compared to WT mice. In contrast, MGL1−/− mice developed a Th2-dominant immune response that was associated with significantly higher parasite loads, whereas WT mice were resistant. Flow cytometry and RT-PCR analyses showed overexpression of the mannose receptors, IL-4Rα, PDL2, arginase-1, Ym1, and RELM-α on MGL1−/− macrophages. These studies indicate that MGL1 is involved in T. crassiceps recognition and subsequent innate immune activation and resistance. PMID:25664320
Exponential series approaches for nonparametric graphical models
NASA Astrophysics Data System (ADS)
Janofsky, Eric
Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
Heumann, Benjamin W.; Walsh, Stephen J.; Verdery, Ashton M.; McDaniel, Phillip M.; Rindfuss, Ronald R.
2012-01-01
Understanding the pattern-process relations of land use/land cover change is an important area of research that provides key insights into human-environment interactions. The suitability or likelihood of occurrence of land use such as agricultural crop types across a human-managed landscape is a central consideration. Recent advances in niche-based, geographic species distribution modeling (SDM) offer a novel approach to understanding land suitability and land use decisions. SDM links species presence-location data with geospatial information and uses machine learning algorithms to develop non-linear and discontinuous species-environment relationships. Here, we apply the MaxEnt (Maximum Entropy) model for land suitability modeling by adapting niche theory to a human-managed landscape. In this article, we use data from an agricultural district in Northeastern Thailand as a case study for examining the relationships between the natural, built, and social environments and the likelihood of crop choice for the commonly grown crops that occur in the Nang Rong District – cassava, heavy rice, and jasmine rice, as well as an emerging crop, fruit trees. Our results indicate that while the natural environment (e.g., elevation and soils) is often the dominant factor in crop likelihood, the likelihood is also influenced by household characteristics, such as household assets and conditions of the neighborhood or built environment. Furthermore, the shape of the land use-environment curves illustrates the non-continuous and non-linear nature of these relationships. This approach demonstrates a novel method of understanding non-linear relationships between land and people. The article concludes with a proposed method for integrating the niche-based rules of land use allocation into a dynamic land use model that can address both allocation and quantity of agricultural crops. PMID:24187378
Multiple-hit parameter estimation in monolithic detectors.
Hunter, William C J; Barrett, Harrison H; Lewellen, Tom K; Miyaoka, Robert S
2013-02-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%-12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied.
Framework for adaptive multiscale analysis of nonhomogeneous point processes.
Helgason, Hannes; Bartroff, Jay; Abry, Patrice
2011-01-01
We develop the methodology for hypothesis testing and model selection in nonhomogeneous Poisson processes, with an eye toward the application of modeling and variability detection in heart beat data. Modeling the process' non-constant rate function using templates of simple basis functions, we develop the generalized likelihood ratio statistic for a given template and a multiple testing scheme to model-select from a family of templates. A dynamic programming algorithm inspired by network flows is used to compute the maximum likelihood template in a multiscale manner. In a numerical example, the proposed procedure is nearly as powerful as the super-optimal procedures that know the true template size and true partition, respectively. Extensions to general history-dependent point processes are discussed.
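For a piecewise-constant template, the generalized likelihood ratio against a homogeneous Poisson null has a simple closed form in the segment counts. The sketch below assumes such a template and invented event data; it does not reproduce the basis functions, multiple testing scheme, or dynamic-programming search of the paper.

```python
import numpy as np

def poisson_loglik(counts, lengths):
    """Maximized log-likelihood of a piecewise-constant-rate Poisson process
    (event-independent terms dropped); the MLE on each segment is count / length."""
    rates = np.where(counts > 0, counts / lengths, 1e-12)
    return np.sum(counts * np.log(rates) - rates * lengths)

def glr_statistic(event_times, T, breakpoints):
    """Generalized likelihood ratio for a given template (set of breakpoints)
    against a homogeneous (constant-rate) null on [0, T]."""
    edges = np.concatenate(([0.0], np.sort(breakpoints), [T]))
    counts = np.histogram(event_times, bins=edges)[0]
    lengths = np.diff(edges)
    ll_alt = poisson_loglik(counts, lengths)
    ll_null = poisson_loglik(np.array([len(event_times)]), np.array([T]))
    return 2.0 * (ll_alt - ll_null)

# Example: the rate doubles halfway through the observation window.
rng = np.random.default_rng(3)
T = 100.0
t1 = np.sort(rng.uniform(0, 50, rng.poisson(50 * 1.0)))      # rate 1.0
t2 = np.sort(rng.uniform(50, 100, rng.poisson(50 * 2.0)))    # rate 2.0
events = np.concatenate([t1, t2])
print(glr_statistic(events, T, breakpoints=[50.0]))   # large: reject constant rate
print(glr_statistic(events, T, breakpoints=[10.0]))   # poorly placed template
```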
Phylogenetic evidence for cladogenetic polyploidization in land plants.
Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay
2016-07-01
Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
Savalei, Victoria; Rhemtulla, Mijke
2017-01-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data—that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study. PMID:29276371
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level.
Savalei, Victoria; Rhemtulla, Mijke
2017-08-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data-that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study.
Reducing weapon-carrying among urban American Indian young people.
Bearinger, Linda H; Pettingell, Sandra L; Resnick, Michael D; Potthoff, Sandra J
2010-07-01
To examine the likelihood of weapon-carrying among urban American Indian young people, given the presence of salient risk and protective factors. The study used data from a confidential, self-report Urban Indian Youth Health Survey with 200 forced-choice items examining risk and protective factors and social, contextual, and demographic information. Between 1995 and 1998, 569 American Indian youths, aged 9-15 years, completed surveys administered in public schools and an after-school program. Using logistic regression, probability profiles compared the likelihood of weapon-carrying, given the combinations of salient risk and protective factors. In the final models, weapon-carrying was associated significantly with one risk factor (substance use) and two protective factors (school connectedness, perceiving peers as having prosocial behavior attitudes/norms). With one risk factor and two protective factors, in various combinations in the models, the likelihood of weapon carrying ranged from 4% (with two protective factors and no risk factor in the model) to 80% of youth (with the risk factor and no protective factors in the model). Even in the presence of the risk factor, the two protective factors decreased the likelihood of weapon-carrying to 25%. This analysis highlights the importance of protective factors in comprehensive assessments and interventions for vulnerable youth. In that the risk factor and two protective factors significantly related to weapon-carrying are amenable to intervention at both individual and population-focused levels, study findings offer a guide for prioritizing strategies for decreasing weapon-carrying among urban American Indian young people. Copyright (c) 2010 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopich, Irina V.
2015-01-21
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated.
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
Gopich, Irina V.
2015-01-01
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated. PMID:25612692
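A much simplified version of such a photon-by-photon likelihood can be written as a product of state-space propagation steps. The sketch below assumes the photon count rate is state independent and ignores background, so it is only a schematic of the full treatment; the colors, interphoton times, and rate values in the example call are invented.

```python
import numpy as np
from scipy.linalg import expm

def two_state_photon_loglik(colors, dts, E1, E2, k12, k21):
    """Log-probability of a photon color sequence (1 = acceptor, 0 = donor)
    under a two-state model with FRET efficiencies E1, E2 and interconversion
    rates k12, k21.  Simplification: photon arrival times are taken as given
    and the count rate is assumed state independent."""
    K = np.array([[-k12, k21],
                  [k12, -k21]])                      # rate matrix for state dynamics
    p_eq = np.array([k21, k12]) / (k12 + k21)        # equilibrium state populations
    E = np.array([E1, E2])

    def F(c):                                        # color probabilities per state
        return np.diag(E if c == 1 else 1.0 - E)

    v = F(colors[0]) @ p_eq
    loglik = np.log(v.sum())
    v = v / v.sum()
    for c, dt in zip(colors[1:], dts):
        v = F(c) @ expm(K * dt) @ v                  # propagate, then weight by color
        s = v.sum()                                  # renormalize to avoid underflow
        loglik += np.log(s)
        v = v / s
    return loglik

# Tiny illustrative call (three photons, 1 ms apart); a scan over k12, k21 could
# then locate the maximum of this likelihood.
print(two_state_photon_loglik([1, 0, 1], [1e-3, 1e-3], 0.9, 0.1, 500.0, 500.0))
```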
NASA Astrophysics Data System (ADS)
Perlovsky, Leonid I.; Webb, Virgil H.; Bradley, Scott R.; Hansen, Christopher A.
1998-07-01
An advanced detection and tracking system is being developed for the U.S. Navy's Relocatable Over-the-Horizon Radar (ROTHR) to provide improved tracking performance against small aircraft typically used in drug-smuggling activities. The development is based on the Maximum Likelihood Adaptive Neural System (MLANS), a model-based neural network that combines advantages of neural network and model-based algorithmic approaches. The objective of the MLANS tracker development effort is to address user requirements for increased detection and tracking capability in clutter and improved track position, heading, and speed accuracy. The MLANS tracker is expected to outperform other approaches to detection and tracking for the following reasons. It incorporates adaptive internal models of target return signals, target tracks and maneuvers, and clutter signals, which leads to concurrent clutter suppression, detection, and tracking (track-before-detect). It is not combinatorial and thus does not require any thresholding or peak picking and can track in low signal-to-noise conditions. It incorporates superresolution spectrum estimation techniques exceeding the performance of conventional maximum likelihood and maximum entropy methods. The unique spectrum estimation method is based on the Einsteinian interpretation of the ROTHR received energy spectrum as a probability density of signal frequency. The MLANS neural architecture and learning mechanism are founded on spectrum models and maximization of the "Einsteinian" likelihood, allowing knowledge of the physical behavior of both targets and clutter to be injected into the tracker algorithms. The paper describes the addressed requirements and expected improvements, theoretical foundations, engineering methodology, and results of the development effort to date.
Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua
2016-01-01
(1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran’s I statistic and Anselin’s local Moran’s I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R2 = 0.0741, log likelihood = −1819.69, AIC = 3665.38), SLM (R2 = 0.0786, log likelihood = −1819.04, AIC = 3665.08) and SEM (R2 = 0.0743, log likelihood = −1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide (p = 0.027), rainfall (p = 0.036) and sunshine hour (p = 0.048), while the relative humidity (p = 0.034) had an adverse association with scarlet fever incidence in SLM. (4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention. PMID:27827946
Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua
2016-11-04
(1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran's I statistic and Anselin's local Moran's I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R² = 0.0741, log likelihood = -1819.69, AIC = 3665.38), SLM (R² = 0.0786, log likelihood = -1819.04, AIC = 3665.08) and SEM (R² = 0.0743, log likelihood = -1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide ( p = 0.027), rainfall ( p = 0.036) and sunshine hour ( p = 0.048), while the relative humidity ( p = 0.034) had an adverse association with scarlet fever incidence in SLM. (4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention.
Modeling Adversaries in Counterterrorism Decisions Using Prospect Theory.
Merrick, Jason R W; Leclerc, Philip
2016-04-01
Counterterrorism decisions have been an intense area of research in recent years. Both decision analysis and game theory have been used to model such decisions, and more recently approaches have been developed that combine the techniques of the two disciplines. However, each of these approaches assumes that the attacker is maximizing its utility. Experimental research shows that human beings do not make decisions by maximizing expected utility without aid, but instead deviate in specific ways such as loss aversion or likelihood insensitivity. In this article, we modify existing methods for counterterrorism decisions. We keep expected utility as the defender's paradigm to seek for the rational decision, but we use prospect theory to solve for the attacker's decision to descriptively model the attacker's loss aversion and likelihood insensitivity. We study the effects of this approach in a critical decision, whether to screen containers entering the United States for radioactive materials. We find that the defender's optimal decision is sensitive to the attacker's levels of loss aversion and likelihood insensitivity, meaning that understanding such descriptive decision effects is important in making such decisions. © 2014 Society for Risk Analysis.
A New Monte Carlo Method for Estimating Marginal Likelihoods.
Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O
2018-06-01
Evaluating the marginal likelihood in Bayesian analysis is essential for model selection. Estimators based on a single Markov chain Monte Carlo sample from the posterior distribution include the harmonic mean estimator and the inflated density ratio estimator. We propose a new class of Monte Carlo estimators based on this single Markov chain Monte Carlo sample. This class can be thought of as a generalization of the harmonic mean and inflated density ratio estimators using a partition weighted kernel (likelihood times prior). We show that our estimator is consistent and has better theoretical properties than the harmonic mean and inflated density ratio estimators. In addition, we provide guidelines on choosing optimal weights. Simulation studies were conducted to examine the empirical performance of the proposed estimator. We further demonstrate the desirable features of the proposed estimator with two real data sets: one is from a prostate cancer study using an ordinal probit regression model with latent variables; the other is for the power prior construction from two Eastern Cooperative Oncology Group phase III clinical trials using the cure rate survival model with similar objectives.
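For orientation, the harmonic mean baseline that the proposed estimator generalizes can be checked on a conjugate toy model with a known marginal likelihood. The model, sample size, and seed below are illustrative choices, and the partition weighted kernel estimator itself is not implemented here.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.special import logsumexp

rng = np.random.default_rng(4)

# Conjugate toy model: y_i ~ N(mu, 1) with prior mu ~ N(0, 1), so the exact
# marginal likelihood is a multivariate normal density with covariance I + J.
n = 20
y = rng.normal(0.3, 1.0, size=n)
exact = multivariate_normal.logpdf(y, mean=np.zeros(n),
                                   cov=np.eye(n) + np.ones((n, n)))

# The posterior is N(n*ybar/(n+1), 1/(n+1)); draw a posterior sample as a
# stand-in for a single MCMC run.
S = 20_000
post_mean, post_var = n * y.mean() / (n + 1), 1.0 / (n + 1)
mu_draws = rng.normal(post_mean, np.sqrt(post_var), size=S)
loglik = norm.logpdf(y[None, :], mu_draws[:, None], 1.0).sum(axis=1)

# Harmonic mean estimator of the marginal likelihood (known to be unstable,
# which is what improved single-sample estimators aim to fix).
log_m_hm = np.log(S) - logsumexp(-loglik)
print(f"exact: {exact:.3f}   harmonic mean: {log_m_hm:.3f}")
```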
NASA Astrophysics Data System (ADS)
Núñez, M.; Robie, T.; Vlachos, D. G.
2017-10-01
Kinetic Monte Carlo (KMC) simulation provides insights into catalytic reactions unobtainable with either experiments or mean-field microkinetic models. Sensitivity analysis of KMC models assesses the robustness of the predictions to parametric perturbations and identifies rate determining steps in a chemical reaction network. Stiffness in the chemical reaction network, a ubiquitous feature, demands lengthy run times for KMC models and renders efficient sensitivity analysis based on the likelihood ratio method unusable. We address the challenge of efficiently conducting KMC simulations and performing accurate sensitivity analysis in systems with unknown time scales by employing two acceleration techniques: rate constant rescaling and parallel processing. We develop statistical criteria that ensure sufficient sampling of non-equilibrium steady state conditions. Our approach provides the twofold benefit of accelerating the simulation itself and enabling likelihood ratio sensitivity analysis, which provides further speedup relative to finite difference sensitivity analysis. As a result, the likelihood ratio method can be applied to real chemistry. We apply our methodology to the water-gas shift reaction on Pt(111).
NASA Astrophysics Data System (ADS)
Wang, Z.
2015-12-01
For decades, distributed and lumped hydrological models have furthered our understanding of hydrological systems. Large-scale, high-resolution hydrological simulation has refined spatial descriptions of hydrological behavior, but this trend is accompanied by increasing model complexity and numbers of parameters, which brings new challenges for uncertainty quantification. Generalized Likelihood Uncertainty Estimation (GLUE) has been widely used in uncertainty analysis for hydrological models, combining Monte Carlo sampling with Bayesian estimation. However, the random sampling of prior parameters adopted by GLUE is inefficient, especially in high-dimensional parameter spaces. Heuristic optimization algorithms based on iterative evolution show better convergence speed and optimality-searching performance. In light of these features, this study adopted a genetic algorithm, differential evolution, and the shuffled complex evolution algorithm to search the parameter space and obtain parameter sets with large likelihoods. Based on this multi-algorithm sampling, hydrological model uncertainty analysis is conducted within the typical GLUE framework. To demonstrate the superiority of the new method, two hydrological models of different complexity are examined. The results show that the adaptive method is efficient in sampling and effective in uncertainty analysis, providing an alternative path for uncertainty quantification.
Variational Bayesian Parameter Estimation Techniques for the General Linear Model
Starke, Ludger; Ostwald, Dirk
2017-01-01
Variational Bayes (VB), variational maximum likelihood (VML), restricted maximum likelihood (ReML), and maximum likelihood (ML) are cornerstone parametric statistical estimation techniques in the analysis of functional neuroimaging data. However, the theoretical underpinnings of these model parameter estimation techniques are rarely covered in introductory statistical texts. Because of the widespread practical use of VB, VML, ReML, and ML in the neuroimaging community, we reasoned that a theoretical treatment of their relationships and their application in a basic modeling scenario may be helpful for both neuroimaging novices and practitioners alike. In this technical study, we thus revisit the conceptual and formal underpinnings of VB, VML, ReML, and ML and provide a detailed account of their mathematical relationships and implementational details. We further apply VB, VML, ReML, and ML to the general linear model (GLM) with non-spherical error covariance as commonly encountered in the first-level analysis of fMRI data. To this end, we explicitly derive the corresponding free energy objective functions and ensuing iterative algorithms. Finally, in the applied part of our study, we evaluate the parameter and model recovery properties of VB, VML, ReML, and ML, first in an exemplary setting and then in the analysis of experimental fMRI data acquired from a single participant under visual stimulation. PMID:28966572
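In the spherical-error special case, the ML and ReML estimates of the error variance differ only by a degrees-of-freedom correction; the toy sketch below illustrates just that distinction and is not the free-energy machinery of the study.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simple linear model with spherical errors: the ML variance estimate RSS / n
# is biased downward, while the ReML estimate RSS / (n - p) corrects for the
# p fitted regression coefficients.
n, p, sigma = 40, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 0.5, -0.8])
y = X @ beta_true + rng.normal(0.0, sigma, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)

sigma2_ml = rss / n            # maximum likelihood
sigma2_reml = rss / (n - p)    # restricted maximum likelihood
print(sigma2_ml, sigma2_reml)
```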
Estimating ambiguity preferences and perceptions in multiple prior models: Evidence from the field.
Dimmock, Stephen G; Kouwenberg, Roy; Mitchell, Olivia S; Peijnenburg, Kim
2015-12-01
We develop a tractable method to estimate multiple prior models of decision-making under ambiguity. In a representative sample of the U.S. population, we measure ambiguity attitudes in the gain and loss domains. We find that ambiguity aversion is common for uncertain events of moderate to high likelihood involving gains, but ambiguity seeking prevails for low likelihoods and for losses. We show that choices made under ambiguity in the gain domain are best explained by the α-MaxMin model, with one parameter measuring ambiguity aversion (ambiguity preferences) and a second parameter quantifying the perceived degree of ambiguity (perceptions about ambiguity). The ambiguity aversion parameter α is constant and prior probability sets are asymmetric for low and high likelihood events. The data reject several other models, such as MaxMin and MaxMax, as well as symmetric probability intervals. Ambiguity aversion and the perceived degree of ambiguity are both higher for men and for the college-educated. Ambiguity aversion (but not perceived ambiguity) is also positively related to risk aversion. In the loss domain, we find evidence of reflection, implying that ambiguity aversion for gains tends to reverse into ambiguity seeking for losses. Our model's estimates for preferences and perceptions about ambiguity can be used to analyze the economic and financial implications of such preferences.
Janssen, Eva; van Osch, Liesbeth; de Vries, Hein; Lechner, Lilian
2013-01-01
This study aimed to extricate the influence of rational (e.g., 'I think …') and intuitive (e.g., 'I feel …') probability beliefs in the behavioural decision-making process regarding skin cancer prevention practices. Structural equation modelling was used in two longitudinal surveys (sun protection during winter sports [N = 491]; sun protection during summer [N = 277]) to examine direct and indirect behavioural effects of affective and cognitive likelihood (i.e. unmediated or mediated by intention), controlled for attitude, social influence and self-efficacy. Affective likelihood was directly related to sun protection in both studies, whereas no direct effects were found for cognitive likelihood. After accounting for past sun protective behaviour, affective likelihood was only directly related to sun protection in Study 1. No support was found for the indirect effects of affective and cognitive likelihood through intention. The findings underscore the importance of feelings of (cancer) risk in the decision-making process and should be acknowledged by health behaviour theories and risk communication practices. Suggestions for future research are discussed.
ATAC Autocuer Modeling Analysis.
1981-01-01
the analysis of the simple rectangular segmentation (1) is based on detection and estimation theory (2). This approach uses the concept of maximum ...continuous wave forms. In order to develop the principles of maximum likelihood, it is convenient to develop the principles for the "classical...the concept of maximum likelihood is significant in that it provides the optimum performance of the detection/estimation problem. With a knowledge of
A quasi-likelihood approach to non-negative matrix factorization
Devarajan, Karthik; Cheung, Vincent C.K.
2017-01-01
A unified approach to non-negative matrix factorization based on the theory of generalized linear models is proposed. This approach embeds a variety of statistical models, including the exponential family, within a single theoretical framework and provides a unified view of such factorizations from the perspective of quasi-likelihood. Using this framework, a family of algorithms for handling signal-dependent noise is developed and its convergence proven using the Expectation-Maximization algorithm. In addition, a measure to evaluate the goodness-of-fit of the resulting factorization is described. The proposed methods allow modeling of non-linear effects via appropriate link functions and are illustrated using an application in biomedical signal processing. PMID:27348511
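The Poisson (generalized Kullback-Leibler) member of this quasi-likelihood family leads to the familiar multiplicative updates of Lee and Seung; a minimal sketch follows, with an arbitrary rank, iteration count, and simulated Poisson data.

```python
import numpy as np

rng = np.random.default_rng(6)

def nmf_kl(V, r, n_iter=500, eps=1e-10):
    """Multiplicative updates for NMF under the generalized KL divergence,
    i.e. the Poisson member of the quasi-likelihood family: V ~= W @ H with
    all factors non-negative."""
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Example: recover a rank-3 structure from Poisson-noise data.
W0, H0 = rng.random((30, 3)), rng.random((3, 40))
V = rng.poisson(10.0 * W0 @ H0).astype(float)
W, H = nmf_kl(V, r=3)
print("mean absolute reconstruction error:", np.abs(V - W @ H).mean())
```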
Uncertainty analysis of signal deconvolution using a measured instrument response function
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hartouni, E. P.; Beeman, B.; Caggiano, J. A.
2016-10-05
A common analysis procedure minimizes the ln-likelihood that a set of experimental observables matches a parameterized model of the observation. The model includes a description of the underlying physical process as well as the instrument response function (IRF). Here, we investigate the National Ignition Facility (NIF) neutron time-of-flight (nTOF) spectrometers, for which the IRF is constructed from measurements and models. IRF measurements have a finite precision that can make significant contributions to the uncertainty estimate of the physical model’s parameters. Finally, we apply a Bayesian analysis to properly account for IRF uncertainties in calculating the ln-likelihood function used to find the optimum physical parameters.
Do's and Don'ts of Computer Models for Planning
ERIC Educational Resources Information Center
Hammond, John S., III
1974-01-01
Concentrates on the managerial issues involved in computer planning models. Describes what computer planning models are and the process by which managers can increase the likelihood of computer planning models being successful in their organizations. (Author/DN)
Modifying high-order aeroelastic math model of a jet transport using maximum likelihood estimation
NASA Technical Reports Server (NTRS)
Anissipour, Amir A.; Benson, Russell A.
1989-01-01
The design of control laws to damp flexible structural modes requires accurate math models. Unlike the design of control laws for rigid body motion (e.g., where robust control is used to compensate for modeling inaccuracies), structural mode damping usually employs narrow band notch filters. In order to obtain the required accuracy in the math model, maximum likelihood estimation technique is employed to improve the accuracy of the math model using flight data. Presented here are all phases of this methodology: (1) pre-flight analysis (i.e., optimal input signal design for flight test, sensor location determination, model reduction technique, etc.), (2) data collection and preprocessing, and (3) post-flight analysis (i.e., estimation technique and model verification). In addition, a discussion is presented of the software tools used and the need for future study in this field.
Score Estimating Equations from Embedded Likelihood Functions under Accelerated Failure Time Model
Ning, Jing; Qin, Jing; Shen, Yu
2014-01-01
The semiparametric accelerated failure time (AFT) model is one of the most popular models for analyzing time-to-event outcomes. One appealing feature of the AFT model is that the observed failure time data can be transformed to independent and identically distributed random variables without covariate effects. We describe a class of estimating equations based on the score functions for the transformed data, which are derived from the full likelihood function under commonly used semiparametric models such as the proportional hazards or proportional odds model. The methods of estimating regression parameters under the AFT model can be applied to traditional right-censored survival data as well as more complex time-to-event data subject to length-biased sampling. We establish the asymptotic properties and evaluate the small sample performance of the proposed estimators. We illustrate the proposed methods through applications in two examples. PMID:25663727
A parimutuel gambling perspective to compare probabilistic seismicity forecasts
NASA Astrophysics Data System (ADS)
Zechar, J. Douglas; Zhuang, Jiancang
2014-10-01
Using analogies to gaming, we consider the problem of comparing multiple probabilistic seismicity forecasts. To measure relative model performance, we suggest a parimutuel gambling perspective which addresses shortcomings of other methods such as likelihood ratio, information gain and Molchan diagrams. We describe two variants of the parimutuel approach for a set of forecasts: head-to-head, in which forecasts are compared in pairs, and round table, in which all forecasts are compared simultaneously. For illustration, we compare the 5-yr forecasts of the Regional Earthquake Likelihood Models experiment for M4.95+ seismicity in California.
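A toy head-to-head variant can be sketched as follows, with each forecast staking one unit per cell on the two outcomes and the pot split among stakes on the realized outcome; the exact wagering rules of the paper may differ, and the grid and probabilities here are synthetic rather than the RELM forecasts.

```python
import numpy as np

def head_to_head_gain(pA, pB, observed):
    """Simplified head-to-head parimutuel score for two gridded forecasts.
    pA, pB: per-cell probabilities of at least one target event; observed:
    0/1 indicator per cell.  In each cell both players stake one unit spread
    over the two outcomes according to their forecast, and the two-unit pot
    is split among stakes on the realized outcome.  Returns A's net gain
    (B's gain is the negative)."""
    aw = np.where(observed == 1, pA, 1.0 - pA)   # A's stake on what happened
    bw = np.where(observed == 1, pB, 1.0 - pB)
    return np.sum(2.0 * aw / (aw + bw) - 1.0)

# Toy example on 1000 cells: forecast A tracks the generating rates, B is flat.
rng = np.random.default_rng(7)
true_p = rng.uniform(0.0, 0.1, size=1000)
obs = rng.binomial(1, true_p)
pA = np.clip(true_p + rng.normal(0, 0.01, 1000), 1e-3, 1 - 1e-3)
pB = np.full(1000, true_p.mean())
print("A's net parimutuel gain over B:", round(head_to_head_gain(pA, pB, obs), 2))
```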
Elghafghuf, Adel; Dufour, Simon; Reyher, Kristen; Dohoo, Ian; Stryhn, Henrik
2014-12-01
Mastitis is a complex disease affecting dairy cows and is considered to be the most costly disease of dairy herds. The hazard of mastitis is a function of many factors, both managerial and environmental, making its control a difficult issue to milk producers. Observational studies of clinical mastitis (CM) often generate datasets with a number of characteristics which influence the analysis of those data: the outcome of interest may be the time to occurrence of a case of mastitis, predictors may change over time (time-dependent predictors), the effects of factors may change over time (time-dependent effects), there are usually multiple hierarchical levels, and datasets may be very large. Analysis of such data often requires expansion of the data into the counting-process format - leading to larger datasets - thus complicating the analysis and requiring excessive computing time. In this study, a nested frailty Cox model with time-dependent predictors and effects was applied to Canadian Bovine Mastitis Research Network data in which 10,831 lactations of 8035 cows from 69 herds were followed through lactation until the first occurrence of CM. The model was fit to the data as a Poisson model with nested normally distributed random effects at the cow and herd levels. Risk factors associated with the hazard of CM during the lactation were identified, such as parity, calving season, herd somatic cell score, pasture access, fore-stripping, and proportion of treated cases of CM in a herd. The analysis showed that most of the predictors had a strong effect early in lactation and also demonstrated substantial variation in the baseline hazard among cows and between herds. A small simulation study for a setting similar to the real data was conducted to evaluate the Poisson maximum likelihood estimation approach with both Gaussian quadrature method and Laplace approximation. Further, the performance of the two methods was compared with the performance of a widely used estimation approach for frailty Cox models based on the penalized partial likelihood. The simulation study showed good performance for the Poisson maximum likelihood approach with Gaussian quadrature and biased variance component estimates for both the Poisson maximum likelihood with Laplace approximation and penalized partial likelihood approaches. Copyright © 2014. Published by Elsevier B.V.
Recreating a functional ancestral archosaur visual pigment.
Chang, Belinda S W; Jönsson, Karolina; Kazmi, Manija A; Donoghue, Michael J; Sakmar, Thomas P
2002-09-01
The ancestors of the archosaurs, a major branch of the diapsid reptiles, originated more than 240 MYA near the dawn of the Triassic Period. We used maximum likelihood phylogenetic ancestral reconstruction methods and explored different models of evolution for inferring the amino acid sequence of a putative ancestral archosaur visual pigment. Three different types of maximum likelihood models were used: nucleotide-based, amino acid-based, and codon-based models. Where possible, within each type of model, likelihood ratio tests were used to determine which model best fit the data. Ancestral reconstructions of the ancestral archosaur node using the best-fitting models of each type were found to be in agreement, except for three amino acid residues at which one reconstruction differed from the other two. To determine if these ancestral pigments would be functionally active, the corresponding genes were chemically synthesized and then expressed in a mammalian cell line in tissue culture. The expressed artificial genes were all found to bind to 11-cis-retinal to yield stable photoactive pigments with lambda(max) values of about 508 nm, which is slightly redshifted relative to that of extant vertebrate pigments. The ancestral archosaur pigments also activated the retinal G protein transducin, as measured in a fluorescence assay. Our results show that ancestral genes from ancient organisms can be reconstructed de novo and tested for function using a combination of phylogenetic and biochemical methods.
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
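The quadrature estimators mentioned above approximate each cluster's marginal likelihood by integrating out the random intercept. The sketch below uses ordinary (non-adaptive) Gauss-Hermite quadrature on simulated data, so it illustrates the idea rather than reproducing glmer, xtlogit, or Proc NLMIXED; the cluster sizes, true parameters, and 15-point rule are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(8)

# Simulated clustered binary data: 15 clusters of 20 subjects, random-intercept
# standard deviation 1, fixed effects (intercept, slope) = (-0.5, 1.0).
J, n_j = 15, 20
cluster = np.repeat(np.arange(J), n_j)
x = rng.normal(size=J * n_j)
b = rng.normal(0.0, 1.0, size=J)
y = rng.binomial(1, expit(-0.5 + 1.0 * x + b[cluster]))

nodes, weights = np.polynomial.hermite.hermgauss(15)   # Gauss-Hermite rule

def neg_loglik(params):
    beta0, beta1, log_sigma = params
    sigma = np.exp(log_sigma)
    total = 0.0
    for j in range(J):
        idx = cluster == j
        # Integrate the cluster likelihood over the random intercept:
        # int N(b; 0, sigma^2) f(b) db ~= (1/sqrt(pi)) * sum_k w_k f(sqrt(2)*sigma*z_k)
        eta = beta0 + beta1 * x[idx][:, None] + np.sqrt(2.0) * sigma * nodes[None, :]
        p = expit(eta)
        lik_given_b = np.prod(np.where(y[idx][:, None] == 1, p, 1.0 - p), axis=0)
        total += np.log(np.sum(weights * lik_given_b) / np.sqrt(np.pi))
    return -total

fit = minimize(neg_loglik, x0=np.zeros(3), method="Nelder-Mead")
print("beta0, beta1, sigma:", fit.x[0], fit.x[1], np.exp(fit.x[2]))
```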
Estimation of submarine mass failure probability from a sequence of deposits with age dates
Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.
2013-01-01
The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
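For the exponential (Poisson-process) candidate model, the mean return time can be estimated while counting the open intervals before the first and after the last deposit as right-censored, and age-dating uncertainty can be propagated by Monte Carlo over the chronology. The deposit ages, dating uncertainties, and observation window below are invented for illustration, not the Ursa Basin data.

```python
import numpy as np

rng = np.random.default_rng(9)

def mean_return_time(dates, record_start, record_end):
    """Exponential (Poisson-process) MLE of the mean return time from a dated
    sequence of deposits, treating the open intervals before the first and
    after the last event as right-censored inter-event times."""
    dates = np.sort(dates)
    complete = np.diff(dates)                        # fully observed intervals
    open_intervals = np.array([dates[0] - record_start,
                               record_end - dates[-1]])
    lam = len(complete) / (complete.sum() + open_intervals.sum())
    return 1.0 / lam

# Hypothetical deposit ages (ka) with 1-sigma dating uncertainties.
ages = np.array([12.0, 27.5, 33.0, 51.0, 58.5])
age_sd = np.array([1.0, 1.5, 1.2, 2.0, 2.5])
record_start, record_end = 0.0, 70.0                 # observation window (ka)

# Propagate age-dating uncertainty by Monte Carlo over the chronology.
draws = [mean_return_time(np.clip(rng.normal(ages, age_sd),
                                  record_start, record_end),
                          record_start, record_end)
         for _ in range(5000)]
print("mean return time (16th, 50th, 84th pct):", np.percentile(draws, [16, 50, 84]))
```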
Markov modulated Poisson process models incorporating covariates for rainfall intensity.
Thayakaran, R; Ramesh, N I
2013-01-01
Time series of rainfall bucket tip times at the Beaufort Park station, Bracknell, in the UK are modelled by a class of Markov modulated Poisson processes (MMPP) which may be thought of as a generalization of the Poisson process. Our main focus in this paper is to investigate the effects of including covariate information into the MMPP model framework on statistical properties. In particular, we look at three types of time-varying covariates namely temperature, sea level pressure, and relative humidity that are thought to be affecting the rainfall arrival process. Maximum likelihood estimation is used to obtain the parameter estimates, and likelihood ratio tests are employed in model comparison. Simulated data from the fitted model are used to make statistical inferences about the accumulated rainfall in the discrete time interval. Variability of the daily Poisson arrival rates is studied.
Johnson, Timothy R; Kuhn, Kristine M
2015-12-01
This paper introduces the ltbayes package for R. This package includes a suite of functions for investigating the posterior distribution of latent traits of item response models. These include functions for simulating realizations from the posterior distribution, profiling the posterior density or likelihood function, calculation of posterior modes or means, Fisher information functions and observed information, and profile likelihood confidence intervals. Inferences can be based on individual response patterns or sets of response patterns such as sum scores. Functions are included for several common binary and polytomous item response models, but the package can also be used with user-specified models. This paper introduces some background and motivation for the package, and includes several detailed examples of its use.
Ahn, Jaeil; Mukherjee, Bhramar; Banerjee, Mousumi; Cooney, Kathleen A.
2011-01-01
The stereotype regression model for categorical outcomes, proposed by Anderson (1984), is nested between the baseline-category logits and adjacent-category logits models with proportional odds structure. The stereotype model is more parsimonious than the ordinary baseline-category (or multinomial logistic) model due to a product representation of the log odds-ratios in terms of a common parameter corresponding to each predictor and category-specific scores. The model can be used for both ordered and unordered outcomes. For ordered outcomes, the stereotype model allows more flexibility than the popular proportional odds model in capturing highly subjective ordinal scales that do not result from categorization of a single latent variable but are inherently multidimensional in nature. As pointed out by Greenland (1994), an additional advantage of the stereotype model is that it provides unbiased and valid inference under outcome-stratified sampling as in case-control studies. In addition, for matched case-control studies, the stereotype model is amenable to the classical conditional likelihood principle, whereas there is no reduction due to sufficiency under the proportional odds model. In spite of these attractive features, the model has seen limited application, as there are issues with maximum likelihood estimation and likelihood-based testing approaches due to non-linearity and lack of identifiability of the parameters. We present a comprehensive Bayesian inference and model comparison procedure for this class of models as an alternative to the classical frequentist approach. We illustrate our methodology by analyzing data from The Flint Men’s Health Study, a case-control study of prostate cancer in African-American men aged 40 to 79 years. We use clinical staging of prostate cancer in terms of Tumors, Nodes and Metastasis (TNM) as the categorical response of interest. PMID:19731262
Assessing performance and validating finite element simulations using probabilistic knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolin, Ronald M.; Rodriguez, E. A.
Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability that each event causes failure, along with the event's likelihood of occurrence, contributes to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests that failure does not occur, while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-04-06
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
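A minimal sketch of a Poisson maximum-likelihood (Richardson-Lucy-type) multi-frame deconvolution update in the spirit of the algorithm described above; the frames and point spread functions are synthetic placeholders, and the paper's frame selection, PSF estimation model, and regularization are omitted.

```python
import numpy as np
from scipy.signal import fftconvolve

def rl_multiframe(frames, psfs, n_iter=30, eps=1e-12):
    """Multiplicative Poisson-ML (Richardson-Lucy) update averaged over frames."""
    est = np.full_like(frames[0], frames[0].mean())
    for _ in range(n_iter):
        update = np.zeros_like(est)
        for y, h in zip(frames, psfs):
            h_flip = h[::-1, ::-1]                      # adjoint of convolution with h
            blurred = fftconvolve(est, h, mode="same")
            update += fftconvolve(y / (blurred + eps), h_flip, mode="same")
        est *= update / len(frames)
    return est

# Toy usage with two synthetic frames and Gaussian PSFs (not AO data)
rng = np.random.default_rng(0)
truth = np.zeros((64, 64)); truth[30:34, 30:34] = 50.0
xx, yy = np.meshgrid(np.arange(-7, 8), np.arange(-7, 8))
psfs = [np.exp(-(xx**2 + yy**2) / (2 * s**2)) for s in (1.5, 2.5)]
psfs = [p / p.sum() for p in psfs]
frames = [rng.poisson(fftconvolve(truth, p, mode="same") + 1.0).astype(float) for p in psfs]
restored = rl_multiframe(frames, psfs, n_iter=50)
```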
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches-the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation-with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
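A hedged illustration of the Firth-type correction evaluated in the study, applied here to an ordinary (unconditional) Poisson regression rather than the conditional SCCS likelihood: the penalty 0.5*log|I(beta)| is added to the log-likelihood before maximization. The data are simulated and deliberately small.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 40                                                       # deliberately small sample
X = np.column_stack([np.ones(n), rng.binomial(1, 0.3, n)])   # intercept + a rare binary exposure
y = rng.poisson(np.exp(-1.0 + 0.7 * X[:, 1]))

def neg_penalized_loglik(beta):
    mu = np.exp(X @ beta)
    loglik = np.sum(y * (X @ beta) - mu)           # Poisson log-likelihood (constant term dropped)
    fisher = X.T @ (mu[:, None] * X)               # I(beta) = X' diag(mu) X
    penalty = 0.5 * np.linalg.slogdet(fisher)[1]   # Firth penalty: 0.5 * log|I(beta)|
    return -(loglik + penalty)

fit = minimize(neg_penalized_loglik, x0=np.zeros(2), method="BFGS")
print("Firth-penalized estimates:", fit.x)
```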
NASA Technical Reports Server (NTRS)
Watson, Clifford
2010-01-01
Traditional hazard analysis techniques utilize a two-dimensional representation of the results determined by relative likelihood and severity of the residual risk. These matrices present a quick-look at the Likelihood (Y-axis) and Severity (X-axis) of the probable outcome of a hazardous event. A three-dimensional method, described herein, utilizes the traditional X and Y axes, while adding a new, third dimension, shown as the Z-axis, and referred to as the Level of Control. The elements of the Z-axis are modifications of the Hazard Elimination and Control steps (also known as the Hazard Reduction Precedence Sequence). These steps are: 1. Eliminate risk through design. 2. Substitute less risky materials for more hazardous materials. 3. Install safety devices. 4. Install caution and warning devices. 5. Develop administrative controls (to include special procedures and training.) 6. Provide protective clothing and equipment. When added to the two-dimensional models, the level of control adds a visual representation of the risk associated with the hazardous condition, creating a tall-pole for the least-well-controlled failure while establishing the relative likelihood and severity of all causes and effects for an identified hazard. Computer modeling of the analytical results, using spreadsheets and three-dimensional charting, gives a visual confirmation of the relationship between causes and their controls.
NASA Technical Reports Server (NTRS)
Watson, Clifford C.
2011-01-01
Traditional hazard analysis techniques utilize a two-dimensional representation of the results determined by relative likelihood and severity of the residual risk. These matrices present a quick-look at the Likelihood (Y-axis) and Severity (X-axis) of the probable outcome of a hazardous event. A three-dimensional method, described herein, utilizes the traditional X and Y axes, while adding a new, third dimension, shown as the Z-axis, and referred to as the Level of Control. The elements of the Z-axis are modifications of the Hazard Elimination and Control steps (also known as the Hazard Reduction Precedence Sequence). These steps are: 1. Eliminate risk through design. 2. Substitute less risky materials for more hazardous materials. 3. Install safety devices. 4. Install caution and warning devices. 5. Develop administrative controls (to include special procedures and training.) 6. Provide protective clothing and equipment. When added to the two-dimensional models, the level of control adds a visual representation of the risk associated with the hazardous condition, creating a tall-pole for the least-well-controlled failure while establishing the relative likelihood and severity of all causes and effects for an identified hazard. Computer modeling of the analytical results, using spreadsheets and three-dimensional charting gives a visual confirmation of the relationship between causes and their controls.
Risk Presentation Using the Three Dimensions of Likelihood, Severity, and Level of Control
NASA Technical Reports Server (NTRS)
Watson, Clifford
2010-01-01
Traditional hazard analysis techniques utilize a two-dimensional representation of the results determined by relative likelihood and severity of the residual risk. These matrices present a quick-look at the Likelihood (Y-axis) and Severity (X-axis) of the probable outcome of a hazardous event. A three-dimensional method, described herein, utilizes the traditional X and Y axes, while adding a new, third dimension, shown as the Z-axis, and referred to as the Level of Control. The elements of the Z-axis are modifications of the Hazard Elimination and Control steps (also known as the Hazard Reduction Precedence Sequence). These steps are: 1. Eliminate risk through design. 2. Substitute less risky materials for more hazardous materials. 3. Install safety devices. 4. Install caution and warning devices. 5. Develop administrative controls (to include special procedures and training.) 6. Provide protective clothing and equipment. When added to the two-dimensional models, the level of control adds a visual representation of the risk associated with the hazardous condition, creating a tall-pole for the least-well-controlled failure while establishing the relative likelihood and severity of all causes and effects for an identified hazard. Computer modeling of the analytical results, using spreadsheets and three-dimensional charting, gives a visual confirmation of the relationship between causes and their controls.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-01-01
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
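A toy simulation illustrating the flavor of the result described above: with randomized treatment, the treatment coefficient from a main-terms Poisson working model tracks the marginal log rate ratio even though the covariate part of the model is misspecified. The data-generating process below is hypothetical and is only a sanity check, not the paper's formal argument.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20000
w = rng.normal(size=n)                              # baseline covariate
a = rng.binomial(1, 0.5, n)                         # randomized treatment
rate = np.exp(0.2 + 0.5 * a + np.abs(w))            # true rate is NOT log-linear in w
y = rng.poisson(rate)

marginal_log_rr = np.log(y[a == 1].mean() / y[a == 0].mean())

X = sm.add_constant(np.column_stack([a, w]))        # misspecified working model: main terms only
working_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

print("marginal log rate ratio:", round(marginal_log_rr, 3))
print("working-model treatment coefficient:", round(working_fit.params[1], 3))
```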
Wang, Shijun; Liu, Peter; Turkbey, Baris; Choyke, Peter; Pinto, Peter; Summers, Ronald M
2012-01-01
In this paper, we propose a new pharmacokinetic model for parameter estimation of dynamic contrast-enhanced (DCE) MRI by using Gaussian process inference. Our model is based on the Tofts dual-compartment model for the description of tracer kinetics and the observed time series from DCE-MRI is treated as a Gaussian stochastic process. The parameter estimation is done through a maximum likelihood approach and we propose a variant of the coordinate descent method to solve this likelihood maximization problem. The new model was shown to outperform a baseline method on simulated data. Parametric maps generated on prostate DCE data with the new model also provided better enhancement of tumors, lower intensity on false positives, and better boundary delineation when compared with the baseline method. New statistical parameter maps from the process model were also found to be informative, particularly when paired with the PK parameter maps.
Model selection for multi-component frailty models.
Ha, Il Do; Lee, Youngjo; MacKenzie, Gilbert
2007-11-20
Various frailty models have been developed and are now widely used for analysing multivariate survival data. It is therefore important to develop an information criterion for model selection. However, in frailty models there are several alternative ways of forming a criterion and the particular criterion chosen may not be uniformly best. In this paper, we study an Akaike information criterion (AIC) on selecting a frailty structure from a set of (possibly) non-nested frailty models. We propose two new AIC criteria, based on a conditional likelihood and an extended restricted likelihood (ERL) given by Lee and Nelder (J. R. Statist. Soc. B 1996; 58:619-678). We compare their performance using well-known practical examples and demonstrate that the two criteria may yield rather different results. A simulation study shows that the AIC based on the ERL is recommended, when attention is focussed on selecting the frailty structure rather than the fixed effects.
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Multiple-Hit Parameter Estimation in Monolithic Detectors
Barrett, Harrison H.; Lewellen, Tom K.; Miyaoka, Robert S.
2014-01-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%–12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied. PMID:23193231
NASA Astrophysics Data System (ADS)
Staley, Dennis; Negri, Jacquelyn; Kean, Jason
2016-04-01
Population expansion into fire-prone steeplands has resulted in an increase in post-fire debris-flow risk in the western United States. Logistic regression methods for determining debris-flow likelihood and the calculation of empirical rainfall intensity-duration thresholds for debris-flow initiation represent two common approaches for characterizing hazard and reducing risk. Logistic regression models are currently being used to rapidly assess debris-flow hazard in response to design storms of known intensities (e.g. a 10-year recurrence interval rainstorm). Empirical rainfall intensity-duration thresholds comprise a major component of the United States Geological Survey (USGS) and the National Weather Service (NWS) debris-flow early warning system at a regional scale in southern California. However, these two modeling approaches remain independent, with each approach having limitations that do not allow for synergistic local-scale (e.g. drainage-basin scale) characterization of debris-flow hazard during intense rainfall. The current logistic regression equations treat rainfall as a standalone independent variable, which prevents the direct calculation of the relation between rainfall intensity and debris-flow likelihood. Regional (e.g. mountain range or physiographic province scale) rainfall intensity-duration thresholds fail to provide insight into the basin-scale variability of post-fire debris-flow hazard and require an extensive database of historical debris-flow occurrence and rainfall characteristics. Here, we present a new approach that combines traditional logistic regression and intensity-duration threshold methodologies. This method allows for local characterization of the likelihood that a debris flow will occur at a given rainfall intensity, direct calculation of the rainfall rates that will result in a given likelihood, and calculation of spatially explicit rainfall intensity-duration thresholds for debris-flow generation in recently burned areas. Our approach synthesizes the two methods by incorporating measured rainfall intensity into each model variable (based on measures of topographic steepness, burn severity and surface properties) within the logistic regression equation. This approach provides a more realistic representation of the relation between rainfall intensity and debris-flow likelihood, as likelihood values asymptotically approach zero when rainfall intensity approaches 0 mm/h, and increase with more intense rainfall. Model performance was evaluated by comparing predictions to several existing regional thresholds. The model, based upon training data collected in southern California, USA, has proven to accurately predict rainfall intensity-duration thresholds for other areas in the western United States not included in the original training dataset. In addition, the improved logistic regression model shows promise for emergency planning purposes and real-time, site-specific early warning. With further validation, this model may permit the prediction of spatially explicit intensity-duration thresholds for debris-flow generation in areas where empirically derived regional thresholds do not exist. This improvement would permit the expansion of the early-warning system into other regions susceptible to post-fire debris flow.
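A schematic sketch of the combined formulation described above, with made-up coefficients and predictor values rather than the published model: rainfall intensity multiplies each basin predictor inside a logistic regression, so the fitted model can be inverted for the intensity that yields a chosen debris-flow likelihood.

```python
import numpy as np

def debris_flow_likelihood(intensity_mm_h, terrain, burn, soil, beta):
    """P(debris flow); with a negative intercept the likelihood is near zero as intensity -> 0."""
    b0, b1, b2, b3 = beta
    eta = b0 + intensity_mm_h * (b1 * terrain + b2 * burn + b3 * soil)
    return 1.0 / (1.0 + np.exp(-eta))

def threshold_intensity(p, terrain, burn, soil, beta):
    """Invert the model: rainfall intensity at which the likelihood equals p."""
    b0, b1, b2, b3 = beta
    return (np.log(p / (1.0 - p)) - b0) / (b1 * terrain + b2 * burn + b3 * soil)

beta = (-3.6, 0.07, 0.05, 0.04)          # hypothetical fitted coefficients
basin = dict(terrain=0.8, burn=0.6, soil=0.5)   # hypothetical basin predictor values
print(debris_flow_likelihood(24.0, beta=beta, **basin))     # likelihood at 24 mm/h
print(threshold_intensity(0.5, beta=beta, **basin))         # intensity for 50% likelihood
```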
Predicting Rotator Cuff Tears Using Data Mining and Bayesian Likelihood Ratios
Lu, Hsueh-Yi; Huang, Chen-Yuan; Su, Chwen-Tzeng; Lin, Chen-Chiang
2014-01-01
Objectives Rotator cuff tear is a common cause of shoulder diseases. Correct diagnosis of rotator cuff tears can save patients from further invasive, costly and painful tests. This study used predictive data mining and Bayesian theory to improve the accuracy of diagnosing rotator cuff tears by clinical examination alone. Methods In this retrospective study, 169 patients who had a preliminary diagnosis of rotator cuff tear on the basis of clinical evaluation followed by confirmatory MRI between 2007 and 2011 were identified. MRI was used as a reference standard to classify rotator cuff tears. The predictor variable was the clinical assessment results, which consisted of 16 attributes. This study employed 2 data mining methods (ANN and the decision tree) and a statistical method (logistic regression) to classify the rotator cuff diagnosis into “tear” and “no tear” groups. Likelihood ratio and Bayesian theory were applied to estimate the probability of rotator cuff tears based on the results of the prediction models. Results Our proposed data mining procedures outperformed the classic statistical method. The correction rate, sensitivity, specificity and area under the ROC curve of predicting a rotator cuff tear were statistical better in the ANN and decision tree models compared to logistic regression. Based on likelihood ratios derived from our prediction models, Fagan's nomogram could be constructed to assess the probability of a patient who has a rotator cuff tear using a pretest probability and a prediction result (tear or no tear). Conclusions Our predictive data mining models, combined with likelihood ratios and Bayesian theory, appear to be good tools to classify rotator cuff tears as well as determine the probability of the presence of the disease to enhance diagnostic decision making for rotator cuff tears. PMID:24733553
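The Bayesian updating step behind Fagan's nomogram reduces to simple odds arithmetic; the sketch below shows that calculation with illustrative numbers (the pretest probability and likelihood ratio are not taken from the study).

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability and a likelihood ratio into a post-test probability."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# e.g. 60% pretest probability of a tear and a positive prediction with LR+ = 4.2 (hypothetical)
print(post_test_probability(0.60, 4.2))   # ~0.86
```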
A score to estimate the likelihood of detecting advanced colorectal neoplasia at colonoscopy
Kaminski, Michal F; Polkowski, Marcin; Kraszewska, Ewa; Rupinski, Maciej; Butruk, Eugeniusz; Regula, Jaroslaw
2014-01-01
Objective This study aimed to develop and validate a model to estimate the likelihood of detecting advanced colorectal neoplasia in Caucasian patients. Design We performed a cross-sectional analysis of database records for 40-year-old to 66-year-old patients who entered a national primary colonoscopy-based screening programme for colorectal cancer in 73 centres in Poland in the year 2007. We used multivariate logistic regression to investigate the associations between clinical variables and the presence of advanced neoplasia in a randomly selected test set, and confirmed the associations in a validation set. We used model coefficients to develop a risk score for detection of advanced colorectal neoplasia. Results Advanced colorectal neoplasia was detected in 2544 of the 35 918 included participants (7.1%). In the test set, a logistic-regression model showed that independent risk factors for advanced colorectal neoplasia were: age, sex, family history of colorectal cancer, cigarette smoking (p<0.001 for these four factors), and Body Mass Index (p=0.033). In the validation set, the model was well calibrated (ratio of expected to observed risk of advanced neoplasia: 1.00 (95% CI 0.95 to 1.06)) and had moderate discriminatory power (c-statistic 0.62). We developed a score that estimated the likelihood of detecting advanced neoplasia in the validation set, from 1.32% for patients scoring 0, to 19.12% for patients scoring 7–8. Conclusions Developed and internally validated score consisting of simple clinical factors successfully estimates the likelihood of detecting advanced colorectal neoplasia in asymptomatic Caucasian patients. Once externally validated, it may be useful for counselling or designing primary prevention studies. PMID:24385598
A score to estimate the likelihood of detecting advanced colorectal neoplasia at colonoscopy.
Kaminski, Michal F; Polkowski, Marcin; Kraszewska, Ewa; Rupinski, Maciej; Butruk, Eugeniusz; Regula, Jaroslaw
2014-07-01
This study aimed to develop and validate a model to estimate the likelihood of detecting advanced colorectal neoplasia in Caucasian patients. We performed a cross-sectional analysis of database records for 40-year-old to 66-year-old patients who entered a national primary colonoscopy-based screening programme for colorectal cancer in 73 centres in Poland in the year 2007. We used multivariate logistic regression to investigate the associations between clinical variables and the presence of advanced neoplasia in a randomly selected test set, and confirmed the associations in a validation set. We used model coefficients to develop a risk score for detection of advanced colorectal neoplasia. Advanced colorectal neoplasia was detected in 2544 of the 35,918 included participants (7.1%). In the test set, a logistic-regression model showed that independent risk factors for advanced colorectal neoplasia were: age, sex, family history of colorectal cancer, cigarette smoking (p<0.001 for these four factors), and Body Mass Index (p=0.033). In the validation set, the model was well calibrated (ratio of expected to observed risk of advanced neoplasia: 1.00 (95% CI 0.95 to 1.06)) and had moderate discriminatory power (c-statistic 0.62). We developed a score that estimated the likelihood of detecting advanced neoplasia in the validation set, from 1.32% for patients scoring 0, to 19.12% for patients scoring 7-8. Developed and internally validated score consisting of simple clinical factors successfully estimates the likelihood of detecting advanced colorectal neoplasia in asymptomatic Caucasian patients. Once externally validated, it may be useful for counselling or designing primary prevention studies.
Evaluation of Dynamic Coastal Response to Sea-level Rise Modifies Inundation Likelihood
NASA Technical Reports Server (NTRS)
Lentz, Erika E.; Thieler, E. Robert; Plant, Nathaniel G.; Stippa, Sawyer R.; Horton, Radley M.; Gesch, Dean B.
2016-01-01
Sea-level rise (SLR) poses a range of threats to natural and built environments, making assessments of SLR-induced hazards essential for informed decision making. We develop a probabilistic model that evaluates the likelihood that an area will inundate (flood) or dynamically respond (adapt) to SLR. The broad-area applicability of the approach is demonstrated by producing 30x30m resolution predictions for more than 38,000 sq km of diverse coastal landscape in the northeastern United States. Probabilistic SLR projections, coastal elevation and vertical land movement are used to estimate likely future inundation levels. Then, conditioned on future inundation levels and the current land-cover type, we evaluate the likelihood of dynamic response versus inundation. We find that nearly 70% of this coastal landscape has some capacity to respond dynamically to SLR, and we show that inundation models over-predict land likely to submerge. This approach is well suited to guiding coastal resource management decisions that weigh future SLR impacts and uncertainty against ecological targets and economic constraints.
Schminkey, Donna L; von Oertzen, Timo; Bullock, Linda
2016-08-01
With increasing access to population-based data and electronic health records for secondary analysis, missing data are common. In the social and behavioral sciences, missing data frequently are handled with multiple imputation methods or full information maximum likelihood (FIML) techniques, but healthcare researchers have not embraced these methodologies to the same extent and more often use either traditional imputation techniques or complete case analysis, which can compromise power and introduce unintended bias. This article is a review of options for handling missing data, concluding with a case study demonstrating the utility of multilevel structural equation modeling using full information maximum likelihood (MSEM with FIML) to handle large amounts of missing data. MSEM with FIML is a parsimonious and hypothesis-driven strategy to cope with large amounts of missing data without compromising power or introducing bias. This technique is relevant for nurse researchers faced with ever-increasing amounts of electronic data and decreasing research budgets. © 2016 Wiley Periodicals, Inc.
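A bare-bones sketch of the casewise ("full information") likelihood idea behind FIML, assuming multivariate normal data with values missing at random; it estimates only an unstructured mean and covariance, whereas MSEM software imposes a structural model on these quantities.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)
p = 3
mu_true = np.array([0.0, 1.0, -1.0])
cov_true = np.array([[1.0, 0.5, 0.2],
                     [0.5, 1.0, 0.3],
                     [0.2, 0.3, 1.0]])
data = rng.multivariate_normal(mu_true, cov_true, size=150)
data[rng.random(data.shape) < 0.25] = np.nan           # ~25% of values missing at random

tril = np.tril_indices(p)

def unpack(theta):
    mu = theta[:p]
    L = np.zeros((p, p))
    L[tril] = theta[p:]
    return mu, L @ L.T + 1e-8 * np.eye(p)              # Cholesky-style parameterization keeps cov valid

def neg_loglik(theta):
    mu, cov = unpack(theta)
    ll = 0.0
    for row in data:                                   # each case contributes over its observed entries only
        obs = ~np.isnan(row)
        if obs.any():
            ll += multivariate_normal.logpdf(row[obs], mu[obs], cov[np.ix_(obs, obs)])
    return -ll

theta0 = np.concatenate([np.zeros(p), np.eye(p)[tril]])
fit = minimize(neg_loglik, theta0, method="BFGS")
mu_hat, cov_hat = unpack(fit.x)
print(np.round(mu_hat, 2))
print(np.round(cov_hat, 2))
```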
NASA Technical Reports Server (NTRS)
Ganga, Ken; Page, Lyman; Cheng, Edward; Meyer, Stephan
1994-01-01
In many cosmological models, the large angular scale anisotropy in the cosmic microwave background is parameterized by a spectral index, n, and a quadrupolar amplitude, Q. For a Harrison-Peebles-Zel'dovich spectrum, n = 1. Using data from the Far Infrared Survey (FIRS) and a new statistical measure, a contour plot of the likelihood for cosmological models for which -1 < n < 3 and 0 ≤ Q ≤ 50 μK is obtained. Depending upon the details of the analysis, the maximum likelihood occurs at n between 0.8 and 1.4 and Q between 18 and 21 μK. Regardless of Q, the likelihood is always less than half its maximum for n < -0.4 and for n > 2.2, as it is for Q < 8 μK and Q > 44 μK.
Vermunt, Neeltje P C A; Westert, Gert P; Olde Rikkert, Marcel G M; Faber, Marjan J
2018-03-01
To assess the impact of patient characteristics, patient-professional engagement, communication and context on the probability that healthcare professionals will discuss goals or priorities with older patients. Secondary analysis of cross-sectional data from the 2014 Commonwealth Fund International Health Policy Survey of Older Adults. 11 western countries. Community-dwelling adults, aged 55 or older. Assessment of goals and priorities. The final sample size consisted of 17,222 respondents, 54% of whom reported an assessment of their goals and priorities (AGP) by healthcare professionals. In logistic regression model 1, which was used to analyse the entire population, the determinants found to have moderate to large effects on the likelihood of AGP were information exchange on stress, diet or exercise, or both. Country (living in Sweden) and continuity of care (no regular professional or organisation) had moderate to large negative effects on the likelihood of AGP. In model 2, which focussed on respondents who experienced continuity of care, country and information exchange on stress and lifestyle were the main determinants of AGP, with comparable odds ratios to model 1. Furthermore, a professional asking questions also increased the likelihood of AGP. Continuity of care and information exchange is associated with a higher probability of AGP, while people living in Sweden are less likely to experience these assessments. Further study is required to determine whether increasing information exchange and professionals asking more questions may improve goal setting with older patients. Key points A patient goal-oriented approach can be beneficial for older patients with chronic conditions or multimorbidity; however, discussing goals with these patients is not a common practice. The likelihood of discussing goals varies by country, occurring most commonly in the USA, and least often in Sweden. Country-level differences in continuity of care and questions asked by a regularly visited professional affect the goal discussion probability. Patient characteristics, including age, have less impact than expected on the likelihood of sharing goals.
Measurement Model Specification Error in LISREL Structural Equation Models.
ERIC Educational Resources Information Center
Baldwin, Beatrice; Lomax, Richard
This LISREL study examines the robustness of the maximum likelihood estimates under varying degrees of measurement model misspecification. A true model containing five latent variables (two endogenous and three exogenous) and two indicator variables per latent variable was used. Measurement model misspecification considered included errors of…
PyEvolve: a toolkit for statistical modelling of molecular evolution.
Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A
2004-01-05
Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences - ignoring the biological significance of sequence differences. A suite of sophisticated likelihood based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpG's, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real world performance for parameter rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter rich likelihood functions solvable within hours on multi-cpu hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.
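A generic sketch of the likelihood ratio test used to compare nested substitution models (for example, with and without the CpG mutation parameter mentioned above). The log-likelihood values are placeholders, since in practice they come from fitting each model to the alignment, and the code does not use the PyEvolve API.

```python
from scipy.stats import chi2

def likelihood_ratio_test(lnL_null, lnL_alt, df):
    """p-value for 2*(lnL_alt - lnL_null) against a chi-square with df degrees of freedom."""
    stat = 2.0 * (lnL_alt - lnL_null)
    return stat, chi2.sf(stat, df)

# Placeholder log-likelihoods for a nested pair of models differing by one parameter
stat, p = likelihood_ratio_test(lnL_null=-10234.7, lnL_alt=-10228.1, df=1)
print("LRT statistic %.2f, p = %.4f" % (stat, p))
```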
Copula based flexible modeling of associations between clustered event times.
Geerdens, Candida; Claeskens, Gerda; Janssen, Paul
2016-07-01
Multivariate survival data are characterized by the presence of correlation between event times within the same cluster. First, we build multi-dimensional copulas with flexible and possibly symmetric dependence structures for such data. In particular, clustered right-censored survival data are modeled using mixtures of max-infinitely divisible bivariate copulas. Second, these copulas are fit by a likelihood approach where the vast amount of copula derivatives present in the likelihood is approximated by finite differences. Third, we formulate conditions for clustered right-censored survival data under which an information criterion for model selection is either weakly consistent or consistent. Several of the familiar selection criteria are included. A set of four-dimensional data on time-to-mastitis is used to demonstrate the developed methodology.
Anatomy of the ATLAS diboson anomaly
NASA Astrophysics Data System (ADS)
Allanach, B. C.; Gripaios, Ben; Sutherland, Dave
2015-09-01
We perform a general analysis of new physics interpretations of the recent ATLAS diboson excesses over standard model expectations in LHC Run I collisions. First, we estimate a likelihood function in terms of the truth signal in the WW, WZ, and ZZ channels, finding that the maximum has zero events in the WZ channel, though the likelihood is sufficiently flat to allow other scenarios. Second, we survey the possible effective field theories containing the standard model plus a new resonance that could explain the data, identifying two possibilities, viz. a vector that is either a left- or right-handed SU(2) triplet. Finally, we compare these models with other experimental data and determine the parameter regions in which they provide a consistent explanation.
Using Qualitative Hazard Analysis to Guide Quantitative Safety Analysis
NASA Technical Reports Server (NTRS)
Shortle, J. F.; Allocco, M.
2005-01-01
Quantitative methods can be beneficial in many types of safety investigations. However, there are many difficulties in using quantitative methods. For example, there may be little relevant data available. This paper proposes a framework for using qualitative hazard analysis to prioritize hazard scenarios most suitable for quantitative analysis. The framework first categorizes hazard scenarios by severity and likelihood. We then propose another metric, "modeling difficulty", that describes the complexity in modeling a given hazard scenario quantitatively. The combined metrics of severity, likelihood, and modeling difficulty help to prioritize hazard scenarios for which quantitative analysis should be applied. We have applied this methodology to proposed concepts of operations for reduced wake separation for airplane operations at closely spaced parallel runways.
NASA Astrophysics Data System (ADS)
Aartsen, M. G.; Abraham, K.; Ackermann, M.; Adams, J.; Aguilar, J. A.; Ahlers, M.; Ahrens, M.; Altmann, D.; Anderson, T.; Ansseau, I.; Anton, G.; Archinger, M.; Arguelles, C.; Arlen, T. C.; Auffenberg, J.; Bai, X.; Barwick, S. W.; Baum, V.; Bay, R.; Beatty, J. J.; Becker Tjus, J.; Becker, K.-H.; Beiser, E.; BenZvi, S.; Berghaus, P.; Berley, D.; Bernardini, E.; Bernhard, A.; Besson, D. Z.; Binder, G.; Bindig, D.; Bissok, M.; Blaufuss, E.; Blumenthal, J.; Boersma, D. J.; Bohm, C.; Börner, M.; Bos, F.; Bose, D.; Böser, S.; Botner, O.; Braun, J.; Brayeur, L.; Bretz, H.-P.; Buzinsky, N.; Casey, J.; Casier, M.; Cheung, E.; Chirkin, D.; Christov, A.; Clark, K.; Classen, L.; Coenders, S.; Collin, G. H.; Conrad, J. M.; Cowen, D. F.; Cruz Silva, A. H.; Danninger, M.; Daughhetee, J.; Davis, J. C.; Day, M.; de André, J. P. A. M.; De Clercq, C.; del Pino Rosendo, E.; Dembinski, H.; De Ridder, S.; Desiati, P.; de Vries, K. D.; de Wasseige, G.; de With, M.; DeYoung, T.; Díaz-Vélez, J. C.; di Lorenzo, V.; Dumm, J. P.; Dunkman, M.; Eberhardt, B.; Edsjö, J.; Ehrhardt, T.; Eichmann, B.; Euler, S.; Evenson, P. A.; Fahey, S.; Fazely, A. R.; Feintzeig, J.; Felde, J.; Filimonov, K.; Finley, C.; Flis, S.; Fösig, C.-C.; Fuchs, T.; Gaisser, T. K.; Gaior, R.; Gallagher, J.; Gerhardt, L.; Ghorbani, K.; Gier, D.; Gladstone, L.; Glagla, M.; Glüsenkamp, T.; Goldschmidt, A.; Golup, G.; Gonzalez, J. G.; Góra, D.; Grant, D.; Griffith, Z.; Groß, A.; Ha, C.; Haack, C.; Haj Ismail, A.; Hallgren, A.; Halzen, F.; Hansen, E.; Hansmann, B.; Hanson, K.; Hebecker, D.; Heereman, D.; Helbing, K.; Hellauer, R.; Hickford, S.; Hignight, J.; Hill, G. C.; Hoffman, K. D.; Hoffmann, R.; Holzapfel, K.; Homeier, A.; Hoshina, K.; Huang, F.; Huber, M.; Huelsnitz, W.; Hulth, P. O.; Hultqvist, K.; In, S.; Ishihara, A.; Jacobi, E.; Japaridze, G. S.; Jeong, M.; Jero, K.; Jones, B. J. P.; Jurkovic, M.; Kappes, A.; Karg, T.; Karle, A.; Katz, U.; Kauer, M.; Keivani, A.; Kelley, J. L.; Kemp, J.; Kheirandish, A.; Kiryluk, J.; Klein, S. R.; Kohnen, G.; Koirala, R.; Kolanoski, H.; Konietz, R.; Köpke, L.; Kopper, C.; Kopper, S.; Koskinen, D. J.; Kowalski, M.; Krings, K.; Kroll, G.; Kroll, M.; Krückl, G.; Kunnen, J.; Kurahashi, N.; Kuwabara, T.; Labare, M.; Lanfranchi, J. L.; Larson, M. J.; Lesiak-Bzdak, M.; Leuermann, M.; Leuner, J.; Lu, L.; Lünemann, J.; Madsen, J.; Maggi, G.; Mahn, K. B. M.; Mandelartz, M.; Maruyama, R.; Mase, K.; Matis, H. S.; Maunu, R.; McNally, F.; Meagher, K.; Medici, M.; Meier, M.; Meli, A.; Menne, T.; Merino, G.; Meures, T.; Miarecki, S.; Middell, E.; Mohrmann, L.; Montaruli, T.; Morse, R.; Nahnhauer, R.; Naumann, U.; Neer, G.; Niederhausen, H.; Nowicki, S. C.; Nygren, D. R.; Obertacke Pollmann, A.; Olivas, A.; Omairat, A.; O'Murchadha, A.; Palczewski, T.; Pandya, H.; Pankova, D. V.; Paul, L.; Pepper, J. A.; Pérez de los Heros, C.; Pfendner, C.; Pieloth, D.; Pinat, E.; Posselt, J.; Price, P. B.; Przybylski, G. T.; Quinnan, M.; Raab, C.; Rädel, L.; Rameez, M.; Rawlins, K.; Reimann, R.; Relich, M.; Resconi, E.; Rhode, W.; Richman, M.; Richter, S.; Riedel, B.; Robertson, S.; Rongen, M.; Rott, C.; Ruhe, T.; Ryckbosch, D.; Sabbatini, L.; Sander, H.-G.; Sandrock, A.; Sandroos, J.; Sarkar, S.; Savage, C.; Schatto, K.; Schimp, M.; Schlunder, P.; Schmidt, T.; Schoenen, S.; Schöneberg, S.; Schönwald, A.; Schulte, L.; Schumacher, L.; Scott, P.; Seckel, D.; Seunarine, S.; Silverwood, H.; Soldin, D.; Song, M.; Spiczak, G. M.; Spiering, C.; Stahlberg, M.; Stamatikos, M.; Stanev, T.; Stasik, A.; Steuer, A.; Stezelberger, T.; Stokstad, R. 
G.; Stößl, A.; Ström, R.; Strotjohann, N. L.; Sullivan, G. W.; Sutherland, M.; Taavola, H.; Taboada, I.; Tatar, J.; Ter-Antonyan, S.; Terliuk, A.; Te{š}ić, G.; Tilav, S.; Toale, P. A.; Tobin, M. N.; Toscano, S.; Tosi, D.; Tselengidou, M.; Turcati, A.; Unger, E.; Usner, M.; Vallecorsa, S.; Vandenbroucke, J.; van Eijndhoven, N.; Vanheule, S.; van Santen, J.; Veenkamp, J.; Vehring, M.; Voge, M.; Vraeghe, M.; Walck, C.; Wallace, A.; Wallraff, M.; Wandkowsky, N.; Weaver, Ch.; Wendt, C.; Westerhoff, S.; Whelan, B. J.; Wiebe, K.; Wiebusch, C. H.; Wille, L.; Williams, D. R.; Wills, L.; Wissing, H.; Wolf, M.; Wood, T. R.; Woschnagg, K.; Xu, D. L.; Xu, X. W.; Xu, Y.; Yanez, J. P.; Yodh, G.; Yoshida, S.; Zoll, M.
2016-04-01
We present an improved event-level likelihood formalism for including neutrino telescope data in global fits to new physics. We derive limits on spin-dependent dark matter-proton scattering by employing the new formalism in a re-analysis of data from the 79-string IceCube search for dark matter annihilation in the Sun, including explicit energy information for each event. The new analysis excludes a number of models in the weak-scale minimal supersymmetric standard model (MSSM) for the first time. This work is accompanied by the public release of the 79-string IceCube data, as well as an associated computer code for applying the new likelihood to arbitrary dark matter models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
West, R. Derek; Gunther, Jacob H.; Moon, Todd K.
In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts to a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates. We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid. We demonstrate the ML image formation and the CRLB for synthetically generated data.
West, R. Derek; Gunther, Jacob H.; Moon, Todd K.
2016-12-01
In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts to a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates. We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid. We demonstrate the ML image formation and the CRLB for synthetically generated data.
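A compact sketch of the two-step structure and the CRLB described above for a generic linear-Gaussian forward model y = Ax + n; the matrix A is a random stand-in for the stripmap SAR system matrix rather than a physical model.

```python
import numpy as np

rng = np.random.default_rng(5)
m, p, sigma = 200, 20, 0.1
A = (rng.normal(size=(m, p)) + 1j * rng.normal(size=(m, p))) / np.sqrt(2)   # stand-in system matrix
x_true = rng.normal(size=p) + 1j * rng.normal(size=p)                       # complex reflectivities
noise = sigma * (rng.normal(size=m) + 1j * rng.normal(size=m)) / np.sqrt(2)
y = A @ x_true + noise

AhA = A.conj().T @ A
correlated = A.conj().T @ y                 # step 1: correlation (back-projection-like) step
x_ml = np.linalg.solve(AhA, correlated)     # step 2: full system inversion
crlb = sigma**2 * np.linalg.inv(AhA)        # CRLB on the covariance of any unbiased estimator

print(np.abs(x_ml - x_true).max(), np.real(np.diag(crlb)).min())
```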
Bayesian Hierarchical Random Effects Models in Forensic Science.
Aitken, Colin G G
2018-01-01
Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package, and references to SAILR are made as appropriate.
NASA Technical Reports Server (NTRS)
Bundick, W. T.
1985-01-01
The application of the Generalized Likelihood Ratio technique to the detection and identification of aircraft control element failures has been evaluated in a linear digital simulation of the longitudinal dynamics of a B-737 aircraft. Simulation results show that the technique has potential but that the effects of wind turbulence and Kalman filter model errors are problems which must be overcome.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, Jonathan; Thompson, Sandra E.; Brothers, Alan J.
The ability to estimate the likelihood of future events based on current and historical data is essential to the decision making process of many government agencies. Successful predictions related to terror events and characterizing the risks will support development of options for countering these events. The predictive tasks involve both technical and social component models. The social components have presented a particularly difficult challenge. This paper outlines some technical considerations of this modeling activity. Both data and predictions associated with the technical and social models will likely be known with differing certainties or accuracies – a critical challenge is linking across these model domains while respecting this fundamental difference in certainty level. This paper will describe the technical approach being taken to develop the social model and identification of the significant interfaces between the technical and social modeling in the context of analysis of diversion of nuclear material.
Link, William A; Barker, Richard J
2005-03-01
We present a hierarchical extension of the Cormack-Jolly-Seber (CJS) model for open population capture-recapture data. In addition to recaptures of marked animals, we model first captures of animals and losses on capture. The parameter set includes capture probabilities, survival rates, and birth rates. The survival rates and birth rates are treated as a random sample from a bivariate distribution, thus the model explicitly incorporates correlation in these demographic rates. A key feature of the model is that the likelihood function, which includes a CJS model factor, is expressed entirely in terms of identifiable parameters; losses on capture can be factored out of the model. Since the computational complexity of classical likelihood methods is prohibitive, we use Markov chain Monte Carlo in a Bayesian analysis. We describe an efficient candidate-generation scheme for Metropolis-Hastings sampling of CJS models and extensions. The procedure is illustrated using mark-recapture data for the moth Gonodontis bidentata.
Link, William A.; Barker, Richard J.
2005-01-01
We present a hierarchical extension of the Cormack–Jolly–Seber (CJS) model for open population capture–recapture data. In addition to recaptures of marked animals, we model first captures of animals and losses on capture. The parameter set includes capture probabilities, survival rates, and birth rates. The survival rates and birth rates are treated as a random sample from a bivariate distribution, thus the model explicitly incorporates correlation in these demographic rates. A key feature of the model is that the likelihood function, which includes a CJS model factor, is expressed entirely in terms of identifiable parameters; losses on capture can be factored out of the model. Since the computational complexity of classical likelihood methods is prohibitive, we use Markov chain Monte Carlo in a Bayesian analysis. We describe an efficient candidate-generation scheme for Metropolis–Hastings sampling of CJS models and extensions. The procedure is illustrated using mark-recapture data for the moth Gonodontis bidentata.
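A deliberately simplified Metropolis-Hastings sketch showing the sampling machinery referred to above, applied to a single survival probability with binomial-style data; the full CJS likelihood, the hierarchical structure, and the paper's candidate-generation scheme are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(6)
released, resighted = 120, 78                    # hypothetical marked animals and survivors resighted

def log_post(phi):
    """Binomial log-likelihood with a flat prior on the survival probability phi."""
    if not 0.0 < phi < 1.0:
        return -np.inf
    return resighted * np.log(phi) + (released - resighted) * np.log(1.0 - phi)

samples, phi = [], 0.5
for _ in range(20000):
    prop = phi + rng.normal(0.0, 0.05)           # random-walk candidate generation
    if np.log(rng.random()) < log_post(prop) - log_post(phi):
        phi = prop                               # accept the candidate
    samples.append(phi)

print("posterior mean survival:", np.mean(samples[2000:]))   # discard burn-in
```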
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso
Kong, Shengchun; Nan, Bin
2013-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. PMID:24516328
Maximum likelihood conjoint measurement of lightness and chroma.
Rogers, Marie; Knoblauch, Kenneth; Franklin, Anna
2016-03-01
Color varies along dimensions of lightness, hue, and chroma. We used maximum likelihood conjoint measurement to investigate how lightness and chroma influence color judgments. Observers judged lightness and chroma of stimuli that varied in both dimensions in a paired-comparison task. We modeled how changes in one dimension influenced judgment of the other. An additive model best fit the data in all conditions except for judgment of red chroma where there was a small but significant interaction. Lightness negatively contributed to perception of chroma for red, blue, and green hues but not for yellow. The method permits quantification of lightness and chroma contributions to color appearance.
Nakamura, M; Saito, K; Wakabayashi, M
1990-04-01
The purpose of this study was to investigate how attitude change is generated by the recipient's degree of attitude formation, evaluative-emotional elements contained in the persuasive messages, and source expertise as a peripheral cue in the persuasion context. Hypotheses based on the Attitude Formation Theory of Mizuhara (1982) and the Elaboration Likelihood Model of Petty and Cacioppo (1981, 1986) were examined. Eighty undergraduate students served as subjects in the experiment, the first stage of which involved manipulating the degree of attitude formation with respect to nuclear power development. The experimenter then presented persuasive messages with varying combinations of evaluative-emotional elements from a source with either high or low expertise on the subject. Results revealed a significant interaction effect on attitude change among attitude formation, persuasive message, and the expertise of the message source. That is, high attitude formation subjects resisted evaluative-emotional persuasion from the high expertise source, while low attitude formation subjects changed their attitude when exposed to the same persuasive message from a low expertise source. Results exceeded initial predictions based on the Attitude Formation Theory and the Elaboration Likelihood Model.
Partially incorrect fossil data augment analyses of discrete trait evolution in living species.
Puttick, Mark N
2016-08-01
Ancestral state reconstruction of discrete character traits is often vital when attempting to understand the origins and homology of traits in living species. The addition of fossils has been shown to alter our understanding of trait evolution in extant taxa, but researchers may avoid using fossils alongside extant species if only a few are known, or if the designation of the trait of interest is uncertain. Here, I investigate the impacts of fossils and incorrectly coded fossils in the ancestral state reconstruction of discrete morphological characters under a likelihood model. Under simulated phylogenies and data, likelihood-based models are generally accurate when estimating ancestral node values. Analyses with combined fossil and extant data always outperform analyses with extant species alone, even when around one quarter of the fossil information is incorrect. These results are especially pronounced when model assumptions are violated, such as when there is a trend away from the root value. Fossil data are of particular importance when attempting to estimate the root node character state. Attempts should be made to include fossils in analysis of discrete traits under likelihood, even if there is uncertainty in the fossil trait data. © 2016 The Authors.
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
Kimura, Akatsuki; Celani, Antonio; Nagao, Hiromichi; Stasevich, Timothy; Nakamura, Kazuyuki
2015-01-01
Construction of quantitative models is a primary goal of quantitative biology, which aims to understand cellular and organismal phenomena in a quantitative manner. In this article, we introduce optimization procedures to search for parameters in a quantitative model that can reproduce experimental data. The aim of optimization is to minimize the sum of squared errors (SSE) in a prediction or to maximize likelihood. A (local) maximum of likelihood or (local) minimum of the SSE can efficiently be identified using gradient approaches. Addition of a stochastic process enables us to identify the global maximum/minimum without becoming trapped in local maxima/minima. Sampling approaches take advantage of increasing computational power to test numerous sets of parameters in order to determine the optimum set. By combining Bayesian inference with gradient or sampling approaches, we can estimate both the optimum parameters and the form of the likelihood function related to the parameters. Finally, we introduce four examples of research that utilize parameter optimization to obtain biological insights from quantified data: transcriptional regulation, bacterial chemotaxis, morphogenesis, and cell cycle regulation. With practical knowledge of parameter optimization, cell and developmental biologists can develop realistic models that reproduce their observations and thus, obtain mechanistic insights into phenomena of interest.
Semiparametric time-to-event modeling in the presence of a latent progression event.
Rice, John D; Tsodikov, Alex
2017-06-01
In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood-based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. © 2016, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Fenicia, Fabrizio; Reichert, Peter; Kavetski, Dmitri; Albert, Carlo
2016-04-01
The calibration of hydrological models based on signatures (e.g. Flow Duration Curves - FDCs) is often advocated as an alternative to model calibration based on the full time series of system responses (e.g. hydrographs). Signature based calibration is motivated by various arguments. From a conceptual perspective, calibration on signatures is a way to filter out errors that are difficult to represent when calibrating on the full time series. Such errors may for example occur when observed and simulated hydrographs are shifted, either on the "time" axis (i.e. left or right), or on the "streamflow" axis (i.e. above or below). These shifts may be due to errors in the precipitation input (time or amount), and if not properly accounted for in the likelihood function, may cause biased parameter estimates (e.g. estimated model parameters that do not reproduce the recession characteristics of a hydrograph). From a practical perspective, signature based calibration is seen as a possible solution for making predictions in ungauged basins. Where streamflow data are not available, it may in fact be possible to reliably estimate streamflow signatures. Previous research has for example shown how FDCs can be reliably estimated at ungauged locations based on climatic and physiographic influence factors. Typically, the goal of signature based calibration is not the prediction of the signatures themselves, but the prediction of the system responses. Ideally, the prediction of system responses should be accompanied by a reliable quantification of the associated uncertainties. Previous approaches for signature based calibration, however, do not allow reliable estimates of streamflow predictive distributions. Here, we illustrate how the Bayesian approach can be employed to obtain reliable streamflow predictive distributions based on signatures. A case study is presented, where a hydrological model is calibrated on FDCs and additional signatures. We propose an approach where the likelihood function for the signatures is derived from the likelihood for streamflow (rather than using an "ad-hoc" likelihood for the signatures as done in previous approaches). This likelihood is not easily tractable analytically and we therefore cannot apply "simple" MCMC methods. This numerical problem is solved using Approximate Bayesian Computation (ABC). Our results indicate that the proposed approach is suitable for producing reliable streamflow predictive distributions based on calibration to signature data. Moreover, our results provide indications on which signatures are more appropriate to represent the information content of the hydrograph.
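As a rough illustration of the ABC step described above, the following Python sketch runs rejection ABC on a toy one-parameter stand-in for a hydrological model, with a few flow-duration-curve quantiles as the signature; the model, prior, and tolerance are assumptions for illustration, not the study's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_model(k, n=365):
    """Stand-in 'hydrological model': simple storage recession driven by random rain."""
    rain = rng.exponential(2.0, n)
    q = np.zeros(n)
    for t in range(1, n):
        q[t] = k * q[t - 1] + (1 - k) * rain[t]
    return q

def fdc_signature(q, quantiles=(0.1, 0.5, 0.9)):
    """Summary statistic: a few points of the flow duration curve."""
    return np.quantile(q, quantiles)

# 'observed' signature from a synthetic truth (k = 0.8)
obs = fdc_signature(toy_model(0.8))

# rejection ABC: keep prior draws whose simulated signature is close to the observed one
accepted = []
for _ in range(5000):
    k = rng.uniform(0.0, 1.0)                 # prior
    sim = fdc_signature(toy_model(k))
    if np.linalg.norm(sim - obs) < 0.3:       # tolerance (a tuning choice)
        accepted.append(k)

accepted = np.array(accepted)
print(f"posterior mean ~ {accepted.mean():.2f}, accepted draws = {accepted.size}")
```

The tolerance controls the trade-off between acceptance rate and approximation quality; the paper's signature likelihood is derived from the streamflow likelihood rather than chosen ad hoc.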
MIXOR: a computer program for mixed-effects ordinal regression analysis.
Hedeker, D; Gibbons, R D
1996-03-01
MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
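MIXOR itself uses Fisher scoring for ordinal outcomes; the sketch below only illustrates the marginal maximum likelihood idea for the dichotomous special case, integrating a random intercept out with Gauss-Hermite quadrature. All names and the simulated data are mine, not MIXOR's.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def marginal_nll(params, y, X, cluster, n_quad=15):
    """Negative marginal log-likelihood of a random-intercept logistic model.
    params = (beta..., log_sigma); the random intercept is integrated out
    cluster by cluster with Gauss-Hermite quadrature."""
    beta, sigma = params[:-1], np.exp(params[-1])
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    z = np.sqrt(2.0) * nodes                 # quadrature nodes for a standard normal
    w = weights / np.sqrt(np.pi)
    eta = X @ beta
    ll = 0.0
    for c in np.unique(cluster):
        idx = cluster == c
        # P(y_ij | b = sigma * z_q) for every quadrature node q
        p = expit(eta[idx, None] + sigma * z[None, :])
        lik_given_b = np.prod(np.where(y[idx, None] == 1, p, 1 - p), axis=0)
        ll += np.log(lik_given_b @ w)
    return -ll

# toy clustered data (illustrative only)
rng = np.random.default_rng(2)
n_clusters, m = 50, 10
cluster = np.repeat(np.arange(n_clusters), m)
X = np.column_stack([np.ones(n_clusters * m), rng.standard_normal(n_clusters * m)])
b = 0.8 * rng.standard_normal(n_clusters)            # true random intercepts, sd 0.8
y = (rng.random(n_clusters * m) < expit(X @ np.array([-0.5, 1.0]) + b[cluster])).astype(int)

fit = minimize(marginal_nll, x0=np.zeros(3), args=(y, X, cluster), method="BFGS")
print("beta:", np.round(fit.x[:-1], 2), "sigma:", np.round(np.exp(fit.x[-1]), 2))
```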
Stochastic Ordering Using the Latent Trait and the Sum Score in Polytomous IRT Models.
ERIC Educational Resources Information Center
Hemker, Bas T.; Sijtsma, Klaas; Molenaar, Ivo W.; Junker, Brian W.
1997-01-01
Stochastic ordering properties are investigated for a broad class of item response theory (IRT) models for which the monotone likelihood ratio does not hold. A taxonomy is given for nonparametric and parametric polytomous models based on the hierarchical relationship between the models. (SLD)
Lammers, H B
2000-04-01
From an Elaboration Likelihood Model perspective, it was hypothesized that postexposure awareness of deceptive packaging claims would have a greater negative effect on scores for purchase intention by consumers lowly involved rather than highly involved with a product (n = 40). Undergraduates who were classified as either highly or lowly (ns = 20 and 20) involved with M&Ms examined either a deceptive or non-deceptive package design for M&Ms candy and were subsequently informed of the deception employed in the packaging before finally rating their intention to purchase. As anticipated, highly deceived subjects who were low in involvement rated intention to purchase lower than their highly involved peers. Overall, the results attest to the robustness of the model and suggest that the model has implications beyond advertising effects and into packaging effects.
Discerning the clinical relevance of biomarkers in early stage breast cancer.
Ballinger, Tarah J; Kassem, Nawal; Shen, Fei; Jiang, Guanglong; Smith, Mary Lou; Railey, Elda; Howell, John; White, Carol B; Schneider, Bryan P
2017-07-01
Prior data suggest that breast cancer patients accept significant toxicity for small benefit. It is unclear whether personalized estimations of risk or benefit likelihood that could be provided by biomarkers alter treatment decisions in the curative setting. A choice-based conjoint (CBC) survey was conducted in 417 HER2-negative breast cancer patients who received chemotherapy in the curative setting. The survey presented pairs of treatment choices derived from common taxane- and anthracycline-based regimens, varying in degree of benefit by risk of recurrence and in toxicity profile, including peripheral neuropathy (PN) and congestive heart failure (CHF). Hypothetical biomarkers shifting benefit and toxicity risk were modeled to determine whether this knowledge alters choice. Previously identified biomarkers were evaluated using this model. Based on CBC analysis, a non-anthracycline regimen was the most preferred. Patients with prior PN had a similar preference for a taxane regimen as those who were PN naïve, but more dramatically shifted preference away from taxanes when PN was described as severe/irreversible. When modeled after hypothetical biomarkers, as the likelihood of PN increased, the preference for taxane-containing regimens decreased; similarly, as the likelihood of CHF increased, the preference for anthracycline regimens decreased. When evaluating validated biomarkers for PN and CHF, this knowledge did alter regimen preference. Patients faced with multi-faceted decisions consider personal experience and perceived risk of recurrent disease. Biomarkers providing information on likelihood of toxicity risk do influence treatment choices, and patients may accept reduced benefit when faced with higher risk of toxicity in the curative setting.
Cluver, Lucie; Orkin, Mark
2009-10-01
Research shows that AIDS-orphaned children are more likely to experience clinical-range psychological problems. Little is known about possible interactions between factors mediating these high distress levels. We assessed how food insecurity, bullying, and AIDS-related stigma interacted with each other and with likelihood of experiencing clinical-range disorder. In South Africa, 1025 adolescents completed standardised measures of depression, anxiety and post-traumatic stress. 52 potential mediators were measured, including AIDS-orphanhood status. Logistic regressions and hierarchical log-linear modelling were used to identify interactions among significant risk factors. Food insecurity, stigma and bullying all independently increased likelihood of disorder. Poverty and stigma were found to interact strongly, and with both present, likelihood of disorder rose from 19% to 83%. Similarly, bullying interacted with AIDS-orphanhood status, and with both present, likelihood of disorder rose from 12% to 76%. Approaches to alleviating psychological distress amongst AIDS-affected children must address cumulative risk effects.
Factors associated with single-vehicle and multi-vehicle road traffic collision injuries in Ireland.
Donnelly-Swift, Erica; Kelly, Alan
2016-12-01
Generalised linear regression models were used to identify factors associated with fatal/serious road traffic collision injuries for single- and multi-vehicle collisions. Single-vehicle collisions and multi-vehicle collisions occurring during the hours of darkness or on a wet road surface had reduced likelihood of a fatal/serious injury. Single-vehicle 'driver with passengers' collisions occurring at junctions or on a hill/gradient were less likely to result in a fatal/serious injury. Multi-vehicle rear-end/angle collisions had reduced likelihood of a fatal/serious injury. Single-vehicle 'driver only' collisions and multi-vehicle collisions occurring on a public/bank holiday or on a hill/gradient were more likely to result in a fatal/serious injury. Single-vehicle collisions involving male drivers had increased likelihood of a fatal/serious injury, and single-vehicle 'driver with passengers' collisions involving drivers under the age of 25 years also had increased likelihood of a fatal/serious injury. The findings can enlighten decision-makers about the circumstances leading to fatal/serious injuries.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
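The program described is in Pascal; purely to contrast the two estimation routes it implements, here is a Python sketch of the 1:1 matched-pair case, where the conditional likelihood reduces to a logistic model on within-pair covariate differences with no intercept. The data are simulated and the setup is deliberately simplified.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n_pairs, beta_true = 400, 1.0

# one case and one control per matched pair; under a logistic model with pair-specific
# intercepts, P(member A is the case | exactly one case) depends only on the covariate
# difference, so we simulate which member is the case directly from that law
xA = rng.standard_normal(n_pairs)
xB = rng.standard_normal(n_pairs)
A_is_case = rng.random(n_pairs) < expit(beta_true * (xA - xB))
x_case = np.where(A_is_case, xA, xB)
x_ctrl = np.where(A_is_case, xB, xA)

def conditional_nll(beta):
    # conditional likelihood for 1:1 matching: logistic on differences, no intercept
    z = beta * (x_case - x_ctrl)
    return np.sum(np.logaddexp(0.0, -z))

def unconditional_nll(params):
    # ordinary (unconditional) logistic regression pooling cases and controls
    a, b = params
    x = np.concatenate([x_case, x_ctrl])
    y = np.r_[np.ones(n_pairs), np.zeros(n_pairs)]
    eta = a + b * x
    return np.sum(np.logaddexp(0.0, eta) - y * eta)

b_cond = minimize(conditional_nll, x0=[0.0]).x[0]
b_unc = minimize(unconditional_nll, x0=[0.0, 0.0]).x[1]
print(f"conditional beta ~ {b_cond:.2f}, unconditional beta ~ {b_unc:.2f}")
```

Conditioning on the pair makes the matching (stratum) effects cancel, which is why the conditional fit needs no intercept; in real matched data the two analyses can diverge, which is exactly the comparison the program above was written to automate.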
Zhan, Tingting; Chervoneva, Inna; Iglewicz, Boris
2010-01-01
The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are robust to data contamination compared to the MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function, and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first and second order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of three-component normal mixture models with large overlaps and heavy contamination. A real data example is also provided. PMID:20835375
A Sequential Ensemble Prediction System at Convection Permitting Scales
NASA Astrophysics Data System (ADS)
Milan, M.; Simmer, C.
2012-04-01
A Sequential Assimilation Method (SAM) following some aspects of particle filtering with resampling, also called SIR (Sequential Importance Resampling), is introduced and applied in the framework of an Ensemble Prediction System (EPS) for weather forecasting on convection-permitting scales, with a focus on precipitation forecasting. At this scale and beyond, the atmosphere increasingly exhibits chaotic behaviour and nonlinear state-space evolution due to convectively driven processes. Particle filter methods are one way to take full account of nonlinear state developments; their basic idea is to represent the model probability density function by a number of ensemble members weighted by their likelihood given the observations. In particular, particle filtering with resampling abandons ensemble members (particles) with low weights and restores the original number of particles by adding multiple copies of the members with high weights. In our SIR-like implementation we replace the likelihood-based definition of the weights with a metric which quantifies the "distance" between the observed atmospheric state and the states simulated by the ensemble members. We also introduce a methodology to counteract filter degeneracy, i.e. the collapse of the simulated state space. To this end we propose a combination of resampling that accounts for clustering of the simulated state space, and nudging. By keeping cluster representatives during resampling and filtering, the method maintains the potential for nonlinear system state development. We assume that a particle cluster with initially low likelihood may evolve into a state-space region with higher likelihood at a subsequent filter time, thus mimicking nonlinear system state developments (e.g. sudden convection initiation) and remedying timing errors for convection due to model errors and/or imperfect initial conditions. We apply a simplified version of the resampling: the particles with the highest weights in each cluster are duplicated; for the model evolution of each particle pair, one particle evolves using the forward model, while the second particle is nudged towards the radar and satellite observations during its evolution based on the forward model.
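For reference, a textbook SIR cycle (predict, weight, resample) for a toy one-dimensional state looks like the sketch below; it uses an ordinary Gaussian likelihood for the weights rather than the distance metric or cluster-aware resampling proposed by the authors.

```python
import numpy as np

rng = np.random.default_rng(4)

def sir_step(particles, observation, obs_std=1.0, proc_std=0.5):
    """One predict / weight / resample cycle of a basic SIR particle filter."""
    # 1. propagate each particle with the (toy) forward model
    particles = particles + proc_std * rng.standard_normal(particles.size)
    # 2. weight by the likelihood of the observation given each particle
    w = np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    w /= w.sum()
    # 3. resample: duplicate high-weight particles, drop low-weight ones
    idx = rng.choice(particles.size, size=particles.size, p=w)
    return particles[idx]

# toy run: track a slowly drifting 'true' state from noisy observations
n_particles, n_steps = 500, 50
particles = rng.standard_normal(n_particles)
truth = 0.0
for t in range(n_steps):
    truth += 0.1
    obs = truth + rng.normal(0, 1.0)
    particles = sir_step(particles, obs)
print(f"truth = {truth:.2f}, filter mean = {particles.mean():.2f}")
```

The resampling in step 3 is where degeneracy arises when a few particles take nearly all the weight, which is the problem the cluster-plus-nudging scheme above is designed to counteract.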
SEPARABLE FACTOR ANALYSIS WITH APPLICATIONS TO MORTALITY DATA
Fosdick, Bailey K.; Hoff, Peter D.
2014-01-01
Human mortality data sets can be expressed as multiway data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array and, for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks. We apply this methodology to the analysis of data from the Human Mortality Database, and show in a cross-validation experiment how it outperforms simpler methods. Additionally, we use this model to impute mortality rates for countries that have no mortality data for several years. Unlike other approaches, our methodology is able to estimate similarities between the mortality rates of countries, time periods and sexes, and use this information to assist with the imputations. PMID:25489353
Tree-Based Global Model Tests for Polytomous Rasch Models
ERIC Educational Resources Information Center
Komboz, Basil; Strobl, Carolin; Zeileis, Achim
2018-01-01
Psychometric measurement models are only valid if measurement invariance holds between test takers of different groups. Global model tests, such as the well-established likelihood ratio (LR) test, are sensitive to violations of measurement invariance, such as differential item functioning and differential step functioning. However, these…
Maximum likelihood-based analysis of single-molecule photon arrival trajectories
NASA Astrophysics Data System (ADS)
Hajdziona, Marta; Molski, Andrzej
2011-02-01
In this work we explore the statistical properties of the maximum likelihood-based analysis of one-color photon arrival trajectories. This approach does not involve binning and, therefore, all of the information contained in an observed photon trajectory is used. We study the accuracy and precision of parameter estimates and the efficiency of the Akaike information criterion and the Bayesian information criterion (BIC) in selecting the true kinetic model. We focus on the low excitation regime where photon trajectories can be modeled as realizations of Markov modulated Poisson processes. The number of observed photons is the key parameter in determining model selection and parameter estimation. For example, the BIC can select the true three-state model from competing two-, three-, and four-state kinetic models even for relatively short trajectories made up of 2 × 10³ photons. When the intensity levels are well-separated and 10⁴ photons are observed, the two-state model parameters can be estimated with about 10% precision and those for a three-state model with about 20% precision.
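The paper's models are Markov-modulated Poisson processes; the sketch below only illustrates the BIC bookkeeping referred to above, comparing a one-component and a two-component exponential model fitted by maximum likelihood to simulated inter-photon times. The rates and sample sizes are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
# simulated inter-photon times from two intensity levels (a crude two-state stand-in)
dt = np.concatenate([rng.exponential(1.0, 5000), rng.exponential(0.2, 5000)])

def nll_one_exp(params, x):
    rate = np.exp(params[0])
    return -np.sum(np.log(rate) - rate * x)

def nll_two_exp(params, x):
    # two-component exponential mixture: mixing weight and two rates
    w = 1.0 / (1.0 + np.exp(-params[0]))
    r1, r2 = np.exp(params[1]), np.exp(params[2])
    lik = w * r1 * np.exp(-r1 * x) + (1 - w) * r2 * np.exp(-r2 * x)
    return -np.sum(np.log(lik))

def bic(nll, k, n):
    # BIC = -2 log L + k log n, lower is better
    return 2 * nll + k * np.log(n)

fit1 = minimize(nll_one_exp, x0=[0.0], args=(dt,))
fit2 = minimize(nll_two_exp, x0=[0.0, 0.5, -0.5], args=(dt,))
print("BIC one-state :", round(bic(fit1.fun, 1, dt.size), 1))
print("BIC two-state :", round(bic(fit2.fun, 3, dt.size), 1))
```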
Can obviously intoxicated patrons still easily buy alcohol at on-premise establishments?
Toomey, Traci L.; Lenk, Kathleen M.; Nederhoff, Dawn M.; Nelson, Toben F.; Ecklund, Alexandra M.; Horvath, Keith J.; Erickson, Darin J.
2015-01-01
Background Excessive alcohol consumption at licensed alcohol establishments (i.e., bars and restaurants) has been directly linked to alcohol-related problems such as traffic crashes and violence. Historically, alcohol establishments have had a high likelihood of selling alcohol to obviously intoxicated patrons (also referred to as “overservice”) despite laws prohibiting these sales. Given the risks associated with overservice and the need for up-to-date data, it is critical that we monitor the likelihood of sales to obviously intoxicated patrons. Methods To assess the current likelihood of a licensed alcohol establishment selling alcohol to an obviously intoxicated patron, we conducted pseudo-intoxicated purchase attempts (i.e., actors attempt to purchase alcohol while acting out obvious signs of intoxication) at 340 establishments in one Midwestern metropolitan area. We also measured characteristics of the establishments, the pseudo-intoxicated patrons, the servers, the managers, and the neighborhoods to assess whether these characteristics were associated with the likelihood of sales to obviously intoxicated patrons. We assessed these associations with bivariate and multivariate regression models. Results Pseudo-intoxicated buyers were able to purchase alcohol at 82% of the establishments. In the fully adjusted multivariate regression model, only one of the characteristics we assessed was significantly associated with likelihood of selling to intoxicated patrons: establishments owned by a corporate entity had 3.6 times greater odds of selling alcohol to a pseudo-intoxicated buyer compared to independently-owned establishments. Discussion Given the risks associated with overservice of alcohol, more resources should be devoted first to identify effective interventions for decreasing overservice of alcohol and then to educate practitioners who are working in their communities to address this public health problem. PMID:26891204
Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management
A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...
Colder, Craig R; O'Connor, Roisin M; Read, Jennifer P; Eiden, Rina D; Lengua, Liliana J; Hawk, Larry W; Wieczorek, William F
2014-09-01
This longitudinal study provided a comprehensive examination of age-related changes in alcohol outcome expectancies, subjective evaluation of alcohol outcomes, and automatic alcohol associations in early adolescence. A community sample (52% female, 75% White/non-Hispanic) was assessed annually for 3 years (mean age at the first assessment = 11.6 years). Results from growth modeling suggested that perceived likelihood of positive outcomes increased and that subjective evaluations of these outcomes were more positive with age. Perceived likelihood of negative outcomes declined with age. Automatic alcohol associations were assessed with an Implicit Association Task (IAT), and were predominantly negative, but these negative associations weakened with age. High initial levels of perceived likelihood of positive outcomes at age 11 were associated with escalation of drinking. Perceived likelihood of negative outcomes was associated with low risk for drinking at age 11, but not with changes in drinking. Increases in positive evaluations of positive outcomes were associated with increases in alcohol use. Overall, findings suggest that at age 11, youth maintain largely negative attitudes and perceptions about alcohol, but with the transition into adolescence, there is a shift toward a more neutral or ambivalent view of alcohol. Some features of this shift are associated with escalation of drinking. Our findings point to the importance of delineating multiple aspects of alcohol information processing for extending cognitive models of alcohol use to the early stages of drinking.
Walsh, Kate; Messman-Moore, Terri; Zerubavel, Noga; Chandley, Rachel B.; DeNardi, Kathleen A.; Walker, Dave P.
2013-01-01
Objectives Although numerous studies have documented linkages between childhood sexual abuse (CSA) and later sexual revictimization, mechanisms underlying revictimization, particularly assaults occurring in the context of substance use, are not well-understood. Consistent with Traumagenic Dynamics theory, the present study tested a path model positing that lowered perceptions of sexual control resulting from CSA may be associated with increased sex-related alcohol expectancies and heightened likelihood of risky sexual behavior, which in turn, may predict adult substance-related rape. Methods Participants were 546 female college students who completed anonymous surveys regarding CSA and adult rape, perceptions of sexual control, sex-related alcohol expectancies, and likelihood of engaging in risky sexual behavior. Results The data fit the hypothesized model well and all hypothesized path coefficients were significant and in the expected directions. As expected, sex-related alcohol expectancies and likelihood of risky sexual behavior only predicted substance-related rape, not forcible rape. Conclusions Findings suggested that low perceived sexual control stemming from CSA is associated with increased sex-related alcohol expectancies and a higher likelihood of engaging in sexual behavior in the context of alcohol use. In turn these proximal risk factors heighten vulnerability to substance-related rape. Programs which aim to reduce risk for substance-related rape could be improved by addressing expectancies and motivations for risky sexual behavior in the context of substance use. Implications and future directions are discussed. PMID:23312991
Oviedo-Trespalacios, Oscar; Haque, Md Mazharul; King, Mark; Washington, Simon
2018-05-29
This study investigated how situational characteristics typically encountered in the transport system influence drivers' perceived likelihood of engaging in mobile phone multitasking. The impacts of mobile phone tasks, perceived environmental complexity/risk, and drivers' individual differences were evaluated as relevant individual predictors within the behavioral adaptation framework. An innovative questionnaire, which includes randomized textual and visual scenarios, was administered to collect data from a sample of 447 drivers in South East Queensland-Australia (66% females; n = 296). The likelihood of engaging in a mobile phone task across various scenarios was modeled by a random parameters ordered probit model. Results indicated that drivers who are female, are frequent users of phones for texting/answering calls, have less favorable attitudes towards safety, and are highly disinhibited were more likely to report stronger intentions of engaging in mobile phone multitasking. However, more years with a valid driving license, self-efficacy toward self-regulation in demanding traffic conditions and police enforcement, texting tasks, and demanding traffic conditions were negatively related to self-reported likelihood of mobile phone multitasking. The unobserved heterogeneity warned of riskier groups among female drivers and participants who need a lot of convincing to believe that multitasking while driving is dangerous. This research concludes that behavioral adaptation theory is a robust framework explaining self-regulation of distracted drivers. © 2018 Society for Risk Analysis.
Maximal likelihood correspondence estimation for face recognition across pose.
Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang
2014-10-01
Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in the across-pose scenario. To address this problem, many image matching-based methods are proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) they fail to fully exploit face-specific structure information in correspondence estimation and 2) they fail to learn personalized correspondence for each probe image. To this end, we first build a model, termed as morphable displacement field (MDF), to encode face-specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on the maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using a linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., the CMU-PIE, FERET, and MultiPIE databases. Owing to the rational MDF regularization and the usage of the novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in a complex wild environment, i.e., the Labeled Faces in the Wild database.
A data fusion approach to indications and warnings of terrorist attacks
NASA Astrophysics Data System (ADS)
McDaniel, David; Schaefer, Gregory
2014-05-01
Indications and Warning (I&W) of terrorist attacks, particularly IED attacks, require detection of networks of agents and patterns of behavior. Social Network Analysis tries to detect a network; activity analysis tries to detect anomalous activities. This work builds on both to detect elements of an activity model of terrorist attack activity: the agents, resources, networks, and behaviors. The activity model is expressed as RDF triple statements where the tuple positions are elements or subsets of a formal ontology for activity models. The advantage of a model is that elements are interdependent and evidence for or against one will influence others, so that there is a multiplier effect. The advantage of the formality is that detection could occur hierarchically, that is, at different levels of abstraction. The model matching is expressed as a likelihood ratio between input text and the model triples. The likelihood ratio is designed to be analogous to the track correlation likelihood ratios common in JDL fusion level 1. This required development of a semantic distance metric for positive and null hypotheses as well as for complex objects. The metric uses the Web 1T (one-terabyte) database of one- to five-gram frequencies for priors. This size requires the use of big data technologies, so a Hadoop cluster is used in conjunction with OpenNLP natural language and Mahout clustering software. Distributed data fusion MapReduce jobs distribute parts of the data fusion problem to the Hadoop nodes. For the purposes of this initial testing, open source models and text inputs of similar complexity to terrorist events were used as surrogates for the intended counter-terrorist application.
Bates, S E; Sansom, M S; Ball, F G; Ramsey, R L; Usherwood, P N
1990-01-01
Gigaohm recordings have been made from glutamate receptor channels in excised, outside-out patches of collagenase-treated locust muscle membrane. The channels in the excised patches exhibit the kinetic state switching first seen in megaohm recordings from intact muscle fibers. Analysis of channel dwell time distributions reveals that the gating mechanism contains at least four open states and at least four closed states. Dwell time autocorrelation function analysis shows that there are at least three gateways linking the open states of the channel with the closed states. A maximum likelihood procedure has been used to fit six different gating models to the single channel data. Of these models, a cooperative model yields the best fit, and accurately predicts most features of the observed channel gating kinetics. PMID:1696510
Henschel, Volkmar; Engel, Jutta; Hölzel, Dieter; Mansmann, Ulrich
2009-02-10
Multivariate analysis of interval censored event data based on classical likelihood methods is notoriously cumbersome. Likelihood inference for models which additionally include random effects is not available at all. Existing algorithms pose problems for practical users, such as matrix inversion, slow convergence, and no assessment of statistical uncertainty. MCMC procedures combined with imputation are used to implement hierarchical models for interval censored data within a Bayesian framework. Two examples from clinical practice demonstrate the handling of clustered interval censored event times as well as multilayer random effects for inter-institutional quality assessment. The software developed is called survBayes and is freely available at CRAN. The proposed software supports the solution of complex analyses in many fields of clinical epidemiology as well as health services research.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Development of advanced techniques for rotorcraft state estimation and parameter identification
NASA Technical Reports Server (NTRS)
Hall, W. E., Jr.; Bohn, J. G.; Vincent, J. H.
1980-01-01
An integrated methodology for rotorcraft system identification consists of rotorcraft mathematical modeling, three distinct data processing steps, and a technique for designing inputs to improve the identifiability of the data. These elements are as follows: (1) a Kalman filter smoother algorithm which estimates states and sensor errors from error-corrupted data; gust time histories and statistics may also be estimated; (2) a model structure estimation algorithm for isolating a model which adequately explains the data; (3) a maximum likelihood algorithm for estimating the parameters and the variances of these estimates; and (4) an input design algorithm, based on a maximum likelihood approach, which provides inputs to improve the accuracy of parameter estimates. Each step is discussed with examples from both flight and simulated data.
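As a much-reduced analogue of steps (1) and (3), the following Python sketch runs a scalar Kalman filter whose innovations define a likelihood that is then maximized over the noise variances; the toy state-space model and parameter values are assumptions, not the rotorcraft formulation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)

def kalman_neg_loglik(params, y, a=0.9):
    """Scalar state-space model x_t = a x_{t-1} + w_t, y_t = x_t + v_t.
    Returns the negative log-likelihood built from the filter innovations."""
    q, r = np.exp(params)            # process and measurement noise variances
    x, P = 0.0, 1.0
    nll = 0.0
    for obs in y:
        # predict
        x, P = a * x, a * a * P + q
        # innovation and its variance
        v, S = obs - x, P + r
        nll += 0.5 * (np.log(2 * np.pi * S) + v * v / S)
        # update
        K = P / S
        x, P = x + K * v, (1 - K) * P
    return nll

# simulate data with q = 0.25, r = 1.0 and recover them by maximum likelihood
T, a = 500, 0.9
x = 0.0
y = np.empty(T)
for t in range(T):
    x = a * x + rng.normal(0, 0.5)
    y[t] = x + rng.normal(0, 1.0)

fit = minimize(kalman_neg_loglik, x0=[0.0, 0.0], args=(y,))
print("estimated (q, r):", np.round(np.exp(fit.x), 2))
```

The key point is that the filter's innovation sequence supplies the likelihood, so the same recursion serves both state estimation and parameter estimation.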
Use of QuakeSim and UAVSAR for Earthquake Damage Mitigation and Response
NASA Technical Reports Server (NTRS)
Donnellan, A.; Parker, J. W.; Bawden, G.; Hensley, S.
2009-01-01
Spaceborne, airborne, and modeling and simulation techniques are being applied to earthquake risk assessment and response for mitigation from this natural disaster. QuakeSim is a web-based portal for modeling interseismic strain accumulation using paleoseismic and crustal deformation data. The models are used for understanding strain accumulation and release from earthquakes as well as stress transfer to neighboring faults. Simulations of the fault system can be used for understanding the likelihood and patterns of earthquakes as well as the likelihood of large aftershocks from events. UAVSAR is an airborne L-band InSAR system for collecting crustal deformation data. QuakeSim, UAVSAR, and DESDynI (following launch) can be used for monitoring earthquakes, the associated rupture and damage, and postseismic motions for prediction of aftershock locations.
Sand, Andreas; Kristiansen, Martin; Pedersen, Christian N S; Mailund, Thomas
2013-11-22
Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood hundreds of evaluations are often needed. A time efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library. We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a pre-processing step our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models, so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes compared to 9.6 hours for the previously fastest library. We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
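zipHMM accelerates the forward algorithm by reusing computations on repeated substrings; the plain log-space forward recursion it builds on can be sketched as follows (a generic illustration in Python, not the zipHMM API).

```python
import numpy as np
from scipy.special import logsumexp

def forward_loglik(obs, log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under a discrete HMM.
    obs     : integer observation symbols, shape (T,)
    log_pi  : log initial state probabilities, shape (K,)
    log_A   : log transition matrix, shape (K, K)
    log_B   : log emission matrix, shape (K, M)
    """
    log_alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # alpha_t(j) = B[j, o] * sum_i alpha_{t-1}(i) * A[i, j]
        log_alpha = logsumexp(log_alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return logsumexp(log_alpha)

# tiny two-state example
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.1, 0.9]])
obs = np.array([0, 0, 1, 1, 1, 0])
print(forward_loglik(obs, np.log(pi), np.log(A), np.log(B)))
```

Each step of this loop costs O(K²); zipHMM's preprocessing amortizes that cost across repeated observation substrings, which is where the reported speedups come from.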
Klim, Søren; Mortensen, Stig Bousgaard; Kristensen, Niels Rode; Overgaard, Rune Viig; Madsen, Henrik
2009-06-01
The extension from ordinary to stochastic differential equations (SDEs) in pharmacokinetic and pharmacodynamic (PK/PD) modelling is an emerging field and has been motivated in a number of articles [N.R. Kristensen, H. Madsen, S.H. Ingwersen, Using stochastic differential equations for PK/PD model development, J. Pharmacokinet. Pharmacodyn. 32 (February(1)) (2005) 109-141; C.W. Tornøe, R.V. Overgaard, H. Agersø, H.A. Nielsen, H. Madsen, E.N. Jonsson, Stochastic differential equations in NONMEM: implementation, application, and comparison with ordinary differential equations, Pharm. Res. 22 (August(8)) (2005) 1247-1258; R.V. Overgaard, N. Jonsson, C.W. Tornøe, H. Madsen, Non-linear mixed-effects models with stochastic differential equations: implementation of an estimation algorithm, J. Pharmacokinet. Pharmacodyn. 32 (February(1)) (2005) 85-107; U. Picchini, S. Ditlevsen, A. De Gaetano, Maximum likelihood estimation of a time-inhomogeneous stochastic differential model of glucose dynamics, Math. Med. Biol. 25 (June(2)) (2008) 141-155]. PK/PD models are traditionally based on ordinary differential equations (ODEs) with an observation link that incorporates noise. This state-space formulation only allows for observation noise and not for system noise. Extending to SDEs allows for a Wiener noise component in the system equations. This additional noise component enables handling of autocorrelated residuals originating from natural variation or systematic model error. Autocorrelated residuals are often partly ignored in PK/PD modelling although they violate the assumptions of many standard statistical tests. This article presents a package for the statistical program R that is able to handle SDEs in a mixed-effects setting. The estimation method implemented is the FOCE(1) approximation to the population likelihood, which is generated from the individual likelihoods that are approximated using the Extended Kalman Filter's one-step predictions.
Local Solutions in the Estimation of Growth Mixture Models
ERIC Educational Resources Information Center
Hipp, John R.; Bauer, Daniel J.
2006-01-01
Finite mixture models are well known to have poorly behaved likelihood functions featuring singularities and multiple optima. Growth mixture models may suffer from fewer of these problems, potentially benefiting from the structure imposed on the estimated class means and covariances by the specified growth model. As demonstrated here, however,…
Estimation Methods for One-Parameter Testlet Models
ERIC Educational Resources Information Center
Jiao, Hong; Wang, Shudong; He, Wei
2013-01-01
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
Fitting ARMA Time Series by Structural Equation Models.
ERIC Educational Resources Information Center
van Buuren, Stef
1997-01-01
This paper outlines how the stationary ARMA (p,q) model (G. Box and G. Jenkins, 1976) can be specified as a structural equation model. Maximum likelihood estimates for the parameters in the ARMA model can be obtained by software for fitting structural equation models. The method is applied to three problem types. (SLD)
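Not the structural-equation specification the paper describes, but for comparison, a direct maximum likelihood fit of an ARMA(1,1) model can be obtained with statsmodels; the simulated series and parameter values below are illustrative only.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# simulate an ARMA(1,1) series: y_t = 0.7 y_{t-1} + e_t + 0.4 e_{t-1}
rng = np.random.default_rng(7)
T, phi, theta = 1000, 0.7, 0.4
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + e[t] + theta * e[t - 1]

# ARMA(1,1) fitted by maximum likelihood (d = 0, so no differencing)
fit = ARIMA(y, order=(1, 0, 1)).fit()
print(fit.summary())
```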
On the Relation between the Linear Factor Model and the Latent Profile Model
ERIC Educational Resources Information Center
Halpin, Peter F.; Dolan, Conor V.; Grasman, Raoul P. P. P.; De Boeck, Paul
2011-01-01
The relationship between linear factor models and latent profile models is addressed within the context of maximum likelihood estimation based on the joint distribution of the manifest variables. Although the two models are well known to imply equivalent covariance decompositions, in general they do not yield equivalent estimates of the…
Comparisons of Four Methods for Estimating a Dynamic Factor Model
ERIC Educational Resources Information Center
Zhang, Zhiyong; Hamaker, Ellen L.; Nesselroade, John R.
2008-01-01
Four methods for estimating a dynamic factor model, the direct autoregressive factor score (DAFS) model, are evaluated and compared. The first method estimates the DAFS model using a Kalman filter algorithm based on its state space model representation. The second one employs the maximum likelihood estimation method based on the construction of a…
Schwappach, David L. B.; Gehring, Katrin
2014-01-01
Purpose To investigate the likelihood of speaking up about patient safety in oncology and to clarify the effect of clinical and situational context factors on the likelihood of voicing concerns. Patients and Methods 1013 nurses and doctors in oncology rated four clinical vignettes describing coworkers’ errors and rule violations in a self-administered factorial survey (65% response rate). Multiple regression analysis was used to model the likelihood of speaking up as an outcome of vignette attributes, responders’ evaluations of the situation, and personal characteristics. Results Respondents reported a high likelihood of speaking up about patient safety, but the variation between and within types of errors and rule violations was substantial. Staff without managerial function reported significantly higher levels of decision difficulty and discomfort with speaking up. Based on the information presented in the vignettes, 74%–96% would speak up towards a supervisor failing to check a prescription, 45%–81% would point a coworker to a missed hand disinfection, 82%–94% would speak up towards nurses who violate a safety rule in medication preparation, and 59%–92% would question a doctor violating a safety rule in lumbar puncture. Several vignette attributes predicted the likelihood of speaking up: perceived potential harm, anticipated discomfort, and decision difficulty were significant predictors. Conclusions Clinicians’ willingness to speak up about patient safety is considerably affected by contextual factors. Physicians and nurses without managerial function report substantial discomfort with speaking up. Oncology departments should provide staff with clear guidance and trainings on when and how to voice safety concerns. PMID:25116338
Bivariate categorical data analysis using normal linear conditional multinomial probability model.
Sun, Bingrui; Sutradhar, Brajendra
2015-02-10
Bivariate multinomial data such as the left and right eye retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio-based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Also, this latter odds ratio-based model does not provide any easy interpretations of the correlations between two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.
Thakur, Jyoti; Pahuja, Sharvan Kumar; Pahuja, Roop
2017-01-01
In 2005, an international pediatric sepsis consensus conference defined systemic inflammatory response syndrome (SIRS) for children <18 years of age, but excluded premature infants. In 2012, Hofer et al. investigated the predictive power of SIRS for term neonates. In this paper, we examined the accuracy of SIRS in predicting sepsis in neonates, irrespective of their gestational age (i.e., pre-term, term, and post-term). We also created two prediction models, named Model A and Model B, using binary logistic regression. Both models performed better than SIRS. We also developed an android application so that physicians can easily use Model A and Model B in real-world scenarios. The sensitivity, specificity, positive likelihood ratio (PLR) and negative likelihood ratio (NLR) in cases of SIRS were 16.15%, 95.53%, 3.61, and 0.88, respectively, whereas they were 29.17%, 97.82%, 13.36, and 0.72, respectively, in the case of Model A, and 31.25%, 97.30%, 11.56, and 0.71, respectively, in the case of Model B. All models were significant with p < 0.001. PMID:29257099
Using the Health Belief Model for Bulimia Prevention.
ERIC Educational Resources Information Center
Grodner, Michele
1991-01-01
Discusses application of the Health Belief Model to the prevention of bulimia, describing each model component. The article considers the individual's beliefs about bulimia and bulimic-like behaviors as a means of predicting the likelihood of behavior change to prevent clinically diagnosable bulimia. (SM)
A Statistical Test for Comparing Nonnested Covariance Structure Models.
ERIC Educational Resources Information Center
Levy, Roy; Hancock, Gregory R.
While statistical procedures are well known for comparing hierarchically related (nested) covariance structure models, statistical tests for comparing nonhierarchically related (nonnested) models have proven more elusive. While isolated attempts have been made, none exists within the commonly used maximum likelihood estimation framework, thereby…
Parental Modeling and Deidentification in Romantic Relationships Among Mexican-origin Youth.
Kuo, Sally I-Chun; Wheeler, Lorey A; Updegraff, Kimberly A; McHale, Susan M; Umaña-Taylor, Adriana J; Perez-Brena, Norma J
2017-10-01
This study investigated youth's modeling of and de-identification from parents in romantic relationships, using two phases of data from adolescent siblings, mothers, and fathers in 246 Mexican-origin families. Each parent reported his/her marital satisfaction and conflict, and youth reported on parent-adolescent warmth and conflict at Time 1. Youth's reports of modeling of and de-identification from their mothers and fathers and three romantic relationship outcomes were assessed at Time 2. Findings revealed that higher parental marital satisfaction, lower marital conflict, and higher warmth and lower conflict in parent-adolescent relationships were associated with more modeling and less de-identification from parents. Moreover, higher de-identification was linked to a greater likelihood of youth being involved in a romantic relationship and cohabitation, whereas more modeling was linked to a lower likelihood of cohabitation and older age of first sex. Discussion underscores the importance of assessing parental modeling and de-identification and understanding correlates of these processes.
Elashoff, Robert M.; Li, Gang; Li, Ning
2009-01-01
Summary In this article we study a joint model for longitudinal measurements and competing risks survival data. Our joint model provides a flexible approach to handle possible nonignorable missing data in the longitudinal measurements due to dropout. It is also an extension of previous joint models with a single failure type, offering a possible way to model informatively censored events as a competing risk. Our model consists of a linear mixed effects submodel for the longitudinal outcome and a proportional cause-specific hazards frailty submodel (Prentice et al., 1978, Biometrics 34, 541-554) for the competing risks survival data, linked together by some latent random effects. We propose to obtain the maximum likelihood estimates of the parameters by an expectation maximization (EM) algorithm and estimate their standard errors using a profile likelihood method. The developed method works well in our simulation studies and is applied to a clinical trial for the scleroderma lung disease. PMID:18162112
Balakrishnan, Narayanaswamy; Pal, Suvra
2016-08-01
Recently, a flexible cure rate survival model has been developed by assuming the number of competing causes of the event of interest to follow the Conway-Maxwell-Poisson distribution. This model includes some of the well-known cure rate models discussed in the literature as special cases. Data obtained from cancer clinical trials are often right censored, and the expectation maximization (EM) algorithm can be used in this case to efficiently estimate the model parameters based on right censored data. In this paper, we consider the competing cause scenario and, assuming the time-to-event follows the Weibull distribution, derive the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models. The standard errors of the maximum likelihood estimates are obtained by inverting the observed information matrix. The method of inference developed here is examined by means of an extensive Monte Carlo simulation study. Finally, we illustrate the proposed methodology with real data on cancer recurrence. © The Author(s) 2013.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fox, Zachary; Neuert, Gregor; Department of Pharmacology, School of Medicine, Vanderbilt University, Nashville, Tennessee 37232
2016-08-21
Emerging techniques now allow for precise quantification of distributions of biological molecules in single cells. These rapidly advancing experimental methods have created a need for more rigorous and efficient modeling tools. Here, we derive new bounds on the likelihood that observations of single-cell, single-molecule responses come from a discrete stochastic model, posed in the form of the chemical master equation. These strict upper and lower bounds are based on a finite state projection approach, and they converge monotonically to the exact likelihood value. These bounds allow one to discriminate rigorously between models and with a minimum level of computational effort. In practice, these bounds can be incorporated into stochastic model identification and parameter inference routines, which improve the accuracy and efficiency of endeavors to analyze and predict single-cell behavior. We demonstrate the applicability of our approach using simulated data for three example models as well as for experimental measurements of a time-varying stochastic transcriptional response in yeast.
A baseline-free procedure for transformation models under interval censorship.
Gu, Ming Gao; Sun, Liuquan; Zuo, Guoxin
2005-12-01
An important property of the Cox regression model is that the estimation of regression parameters using the partial likelihood procedure does not depend on its baseline survival function. We call such a procedure baseline-free. Using marginal likelihood, we show that a baseline-free procedure can be derived for a class of general transformation models under an interval censoring framework. The baseline-free procedure results in a simplified and stable computational algorithm for some complicated and important semiparametric models, such as frailty models and heteroscedastic hazard/rank regression models, where the estimation procedures available so far involve estimation of the infinite-dimensional baseline function. A detailed computational algorithm using Markov Chain Monte Carlo stochastic approximation is presented. The proposed procedure is demonstrated through extensive simulation studies, showing the validity of asymptotic consistency and normality. We also illustrate the procedure with a real data set from a study of breast cancer. A heuristic argument showing that the score function is a mean zero martingale is provided.
Change-in-ratio estimators for populations with more than two subclasses
Udevitz, Mark S.; Pollock, Kenneth H.
1991-01-01
Change-in-ratio methods have been developed to estimate the size of populations with two or three population subclasses. Most of these methods require the often unreasonable assumption of equal sampling probabilities for individuals in all subclasses. This paper presents new models based on the weaker assumption that ratios of sampling probabilities are constant over time for populations with three or more subclasses. Estimation under these models requires that a value be assumed for one of these ratios when there are two samples. Explicit expressions are given for the maximum likelihood estimators under models for two samples with three or more subclasses and for three samples with two subclasses. A numerical method using readily available statistical software is described for obtaining the estimators and their standard errors under all of the models. Likelihood ratio tests that can be used in model selection are discussed. Emphasis is on the two-sample, three-subclass models for which Monte-Carlo simulation results and an illustrative example are presented.
Failure dynamics of the global risk network.
Szymanski, Boleslaw K; Lin, Xin; Asztalos, Andrea; Sreenivasan, Sameet
2015-06-18
Risks threatening modern societies form an intricately interconnected network that often underlies crisis situations. Yet, little is known about how risk materializations in distinct domains influence each other. Here we present an approach in which expert assessments of likelihoods and influence of risks underlie a quantitative model of the global risk network dynamics. The modeled risks range from environmental to economic and technological, and include difficult to quantify risks, such as geo-political and social. Using the maximum likelihood estimation, we find the optimal model parameters and demonstrate that the model including network effects significantly outperforms the others, uncovering full value of the expert collected data. We analyze the model dynamics and study its resilience and stability. Our findings include such risk properties as contagion potential, persistence, roles in cascades of failures and the identity of risks most detrimental to system stability. The model provides quantitative means for measuring the adverse effects of risk interdependencies and the materialization of risks in the network.
Pal, Suvra; Balakrishnan, N
2017-10-01
In this paper, we consider a competing cause scenario and assume the number of competing causes to follow a Conway-Maxwell Poisson distribution, which can capture both the over-dispersion and under-dispersion usually encountered in discrete data. Assuming that the population of interest contains a cured component and that the data are interval censored, as opposed to the usually considered right-censored data, the main contribution is in developing the steps of the expectation maximization algorithm for the determination of the maximum likelihood estimates of the model parameters of the flexible Conway-Maxwell Poisson cure rate model with Weibull lifetimes. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination within the Conway-Maxwell Poisson distribution is addressed using the likelihood ratio test and information-based criteria to select a suitable competing cause distribution that provides the best fit to the data. A simulation study is also carried out to demonstrate the loss in efficiency when an improper competing cause distribution is selected, which justifies the use of a flexible family of distributions for the number of competing causes. Finally, the proposed methodology and the flexibility of the Conway-Maxwell Poisson distribution are illustrated with two known data sets from the literature: smoking cessation data and breast cosmesis data.
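For readers unfamiliar with the Conway-Maxwell Poisson distribution, a minimal sketch of its probability mass function is given below; the truncation point jmax and the parameter names lam and nu are illustrative choices, not part of the paper.

```python
import math

def com_poisson_pmf(j, lam, nu, jmax=200):
    """P(N = j) for a Conway-Maxwell Poisson(lam, nu); the normalizing
    constant Z(lam, nu) = sum_k lam^k / (k!)^nu is truncated at jmax."""
    log_terms = [k * math.log(lam) - nu * math.lgamma(k + 1) for k in range(jmax + 1)]
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))  # log-sum-exp
    return math.exp(j * math.log(lam) - nu * math.lgamma(j + 1) - log_z)

# nu = 1 recovers the Poisson; nu > 1 is under-dispersed, nu < 1 over-dispersed.
print(com_poisson_pmf(2, lam=1.5, nu=1.0))   # ~ Poisson(1.5) pmf at 2, about 0.2510
```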
Estimating ambiguity preferences and perceptions in multiple prior models: Evidence from the field
Dimmock, Stephen G.; Kouwenberg, Roy; Mitchell, Olivia S.; Peijnenburg, Kim
2016-01-01
We develop a tractable method to estimate multiple prior models of decision-making under ambiguity. In a representative sample of the U.S. population, we measure ambiguity attitudes in the gain and loss domains. We find that ambiguity aversion is common for uncertain events of moderate to high likelihood involving gains, but ambiguity seeking prevails for low likelihoods and for losses. We show that choices made under ambiguity in the gain domain are best explained by the α-MaxMin model, with one parameter measuring ambiguity aversion (ambiguity preferences) and a second parameter quantifying the perceived degree of ambiguity (perceptions about ambiguity). The ambiguity aversion parameter α is constant and prior probability sets are asymmetric for low and high likelihood events. The data reject several other models, such as MaxMin and MaxMax, as well as symmetric probability intervals. Ambiguity aversion and the perceived degree of ambiguity are both higher for men and for the college-educated. Ambiguity aversion (but not perceived ambiguity) is also positively related to risk aversion. In the loss domain, we find evidence of reflection, implying that ambiguity aversion for gains tends to reverse into ambiguity seeking for losses. Our model’s estimates for preferences and perceptions about ambiguity can be used to analyze the economic and financial implications of such preferences. PMID:26924890
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers
Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.
2004-01-01
LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
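A hedged sketch of the kind of calibration regression LOADEST constructs, using simulated streamflow, decimal time, and concentration data; ordinary least squares stands in for (A)MLE in the uncensored, normal-residual case, and the specific variable names and coefficients are hypothetical.

```python
import numpy as np

# Illustrative only: regress log load on functions of streamflow and decimal
# time, mirroring the type of explanatory variables LOADEST uses.
rng = np.random.default_rng(0)
q = rng.lognormal(mean=3.0, sigma=0.8, size=200)          # streamflow
dtime = np.linspace(-0.5, 0.5, 200)                        # centered decimal time
conc = np.exp(0.3 * np.log(q) + 0.2 * np.sin(2 * np.pi * dtime)
              + rng.normal(scale=0.3, size=200))           # concentration
load = conc * q                                            # instantaneous load

lnq = np.log(q) - np.log(q).mean()                         # centered ln(Q)
X = np.column_stack([np.ones_like(lnq), lnq, lnq**2,
                     np.sin(2 * np.pi * dtime), np.cos(2 * np.pi * dtime),
                     dtime, dtime**2])
beta, *_ = np.linalg.lstsq(X, np.log(load), rcond=None)
print("fitted coefficients:", np.round(beta, 3))
```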
Robust Gaussian Graphical Modeling via l1 Penalization
Sun, Hokeun; Li, Hongzhe
2012-01-01
Summary Gaussian graphical models have been widely used as an effective method for studying the conditional independence structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose an l1 penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, with the deviation measured by the observation's own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified likelihood, where nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for the Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has better biological interpretation than that obtained from the graphical Lasso. PMID:23020775
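The robust weighted-likelihood estimator itself is not sketched here; as a point of reference, the (non-robust) graphical lasso that the paper compares against can be run on simulated data as follows (the penalty alpha and the data are illustrative).

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Baseline (non-robust) sparse Gaussian graphical model: the graphical lasso.
# Nonzero off-diagonal entries of the estimated precision (concentration)
# matrix correspond to edges of the gene network. Data here are simulated.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))            # 100 samples, 20 "genes"
model = GraphicalLasso(alpha=0.2).fit(X)
precision = model.precision_
edges = np.argwhere(np.triu(np.abs(precision) > 1e-6, k=1))
print(f"{len(edges)} edges selected")
```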
MultiPhyl: a high-throughput phylogenomics webserver using distributed computing
Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.
2007-01-01
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837
Adaptive Quadrature for Item Response Models. Research Report. ETS RR-06-29
ERIC Educational Resources Information Center
Haberman, Shelby J.
2006-01-01
Adaptive quadrature is applied to marginal maximum likelihood estimation for item response models with normal ability distributions. Even in one dimension, significant gains in speed and accuracy of computation may be achieved.
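A minimal sketch of (non-adaptive) Gauss-Hermite quadrature for the marginal likelihood of one respondent under a 2PL item response model with a standard-normal ability distribution; adaptive quadrature would additionally recenter and rescale the nodes around each respondent's posterior mode. The item parameters and responses below are made up.

```python
import numpy as np

# Marginal likelihood of one respondent's 0/1 responses under a 2PL model,
# integrating ability theta ~ N(0, 1) with Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite_e.hermegauss(21)   # probabilists' Hermite
a = np.array([1.2, 0.8, 1.5])        # discriminations (illustrative)
b = np.array([-0.5, 0.0, 1.0])       # difficulties (illustrative)
y = np.array([1, 1, 0])              # observed item responses

def item_prob(theta):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def resp_likelihood(theta):
    p = item_prob(theta)
    return np.prod(p**y * (1 - p)**(1 - y))

# With the probabilists' weight exp(-x^2/2), dividing by sqrt(2*pi) gives the
# expectation over a standard normal ability distribution.
marginal = sum(w * resp_likelihood(t) for t, w in zip(nodes, weights)) / np.sqrt(2 * np.pi)
print(f"marginal likelihood ~ {marginal:.4f}")
```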
Preserving Flow Variability in Watershed Model Calibrations
Background/Question/Methods Although watershed modeling flow calibration techniques often emphasize a specific flow mode, ecological conditions that depend on flow-ecology relationships often emphasize a range of flow conditions. We used informal likelihood methods to investig...
Likelihood-Based Random-Effect Meta-Analysis of Binary Events.
Amatya, Anup; Bhaumik, Dulal K; Normand, Sharon-Lise; Greenhouse, Joel; Kaizar, Eloise; Neelon, Brian; Gibbons, Robert D
2015-01-01
Meta-analysis has been used extensively for evaluation of efficacy and safety of medical interventions. Its advantages and utilities are well known. However, recent studies have raised questions about the accuracy of the commonly used moment-based meta-analytic methods in general and for rare binary outcomes in particular. The issue is further complicated for studies with heterogeneous effect sizes. Likelihood-based mixed-effects modeling provides an alternative to moment-based methods such as inverse-variance weighted fixed- and random-effects estimators. In this article, we compare and contrast different mixed-effect modeling strategies in the context of meta-analysis. Their performance in estimation and testing of overall effect and heterogeneity are evaluated when combining results from studies with a binary outcome. Models that allow heterogeneity in both baseline rate and treatment effect across studies have low type I and type II error rates, and their estimates are the least biased among the models considered.
A Framework for Modeling Emerging Diseases to Inform Management
Katz, Rachel A.; Richgels, Katherine L.D.; Walsh, Daniel P.; Grant, Evan H.C.
2017-01-01
The rapid emergence and reemergence of zoonotic diseases requires the ability to rapidly evaluate and implement optimal management decisions. Actions to control or mitigate the effects of emerging pathogens are commonly delayed because of uncertainty in the estimates and the predicted outcomes of the control tactics. The development of models that describe the best-known information regarding the disease system at the early stages of disease emergence is an essential step for optimal decision-making. Models can predict the potential effects of the pathogen, provide guidance for assessing the likelihood of success of different proposed management actions, quantify the uncertainty surrounding the choice of the optimal decision, and highlight critical areas for immediate research. We demonstrate how to develop models that can be used as a part of a decision-making framework to determine the likelihood of success of different management actions given current knowledge. PMID:27983501
Falk, Carl F; Cai, Li
2016-06-01
We present a semi-parametric approach to estimating item response functions (IRF) useful when the true IRF does not strictly follow commonly used functions. Our approach replaces the linear predictor of the generalized partial credit model with a monotonic polynomial. The model includes the regular generalized partial credit model at the lowest order polynomial. Our approach extends Liang's (A semi-parametric approach to estimate IRFs, Unpublished doctoral dissertation, 2007) method for dichotomous item responses to the case of polytomous data. Furthermore, item parameter estimation is implemented with maximum marginal likelihood using the Bock-Aitkin EM algorithm, thereby facilitating multiple group analyses useful in operational settings. Our approach is demonstrated on both educational and psychological data. We present simulation results comparing our approach to more standard IRF estimation approaches and other non-parametric and semi-parametric alternatives.
Maximum likelihood estimation for semiparametric transformation models with interval-censored data
Mao, Lu; Lin, D. Y.
2016-01-01
Abstract Interval censoring arises frequently in clinical, epidemiological, financial and sociological studies, where the event or failure of interest is known only to occur within an interval induced by periodic monitoring. We formulate the effects of potentially time-dependent covariates on the interval-censored failure time through a broad class of semiparametric transformation models that encompasses proportional hazards and proportional odds models. We consider nonparametric maximum likelihood estimation for this class of models with an arbitrary number of monitoring times for each subject. We devise an EM-type algorithm that converges stably, even in the presence of time-dependent covariates, and show that the estimators for the regression parameters are consistent, asymptotically normal, and asymptotically efficient with an easily estimated covariance matrix. Finally, we demonstrate the performance of our procedures through simulation studies and application to an HIV/AIDS study conducted in Thailand. PMID:27279656
A class of semiparametric cure models with current status data.
Diao, Guoqing; Yuan, Ao
2018-02-08
Current status data occur in many biomedical studies where we only know whether the event of interest occurs before or after a particular time point. In practice, some subjects may never experience the event of interest, i.e., a certain fraction of the population is cured or is not susceptible to the event of interest. We consider a class of semiparametric transformation cure models for current status data with a survival fraction. This class includes both the proportional hazards and the proportional odds cure models as two special cases. We develop efficient likelihood-based estimation and inference procedures. We show that the maximum likelihood estimators for the regression coefficients are consistent, asymptotically normal, and asymptotically efficient. Simulation studies demonstrate that the proposed methods perform well in finite samples. For illustration, we provide an application of the models to a study on the calcification of the hydrogel intraocular lenses.
NASA Astrophysics Data System (ADS)
He, Yi; Liwo, Adam; Scheraga, Harold A.
2015-12-01
Coarse-grained models are useful tools to investigate the structural and thermodynamic properties of biomolecules. They are obtained by merging several atoms into one interaction site. Such simplified models try to capture as much information as possible from the original all-atom representation of the biomolecular system, but the resulting parameters of these coarse-grained force fields still need further optimization. In this paper, a force field optimization method, which is based on maximum-likelihood fitting of the simulated to the experimental conformational ensembles and least-squares fitting of the simulated to the experimental heat-capacity curves, is applied to optimize the Nucleic Acid united-RESidue 2-point (NARES-2P) model for coarse-grained simulations of nucleic acids recently developed in our laboratory. The optimized NARES-2P force field reproduces the structural and thermodynamic data of small DNA molecules much better than the original force field.
NASA Technical Reports Server (NTRS)
Kogut, A.; Banday, A. J.; Bennett, C. L.; Hinshaw, G.; Lubin, P. M.; Smoot, G. F.
1995-01-01
We use the two-point correlation function of the extrema points (peaks and valleys) in the Cosmic Background Explorer (COBE) Differential Microwave Radiometers (DMR) 2 year sky maps as a test for non-Gaussian temperature distribution in the cosmic microwave background anisotropy. A maximum-likelihood analysis compares the DMR data to n = 1 toy models whose random-phase spherical harmonic components a(sub lm) are drawn from either Gaussian, chi-square, or log-normal parent populations. The likelihood of the 53 GHz (A+B)/2 data is greatest for the exact Gaussian model. There is less than 10% chance that the non-Gaussian models tested describe the DMR data, limited primarily by type II errors in the statistical inference. The extrema correlation function is a stronger test for this class of non-Gaussian models than topological statistics such as the genus.
Characterization, parameter estimation, and aircraft response statistics of atmospheric turbulence
NASA Technical Reports Server (NTRS)
Mark, W. D.
1981-01-01
A non-Gaussian three-component model of atmospheric turbulence is postulated that accounts for readily observable features of turbulence velocity records, their autocorrelation functions, and their spectra. Methods for computing probability density functions and mean exceedance rates of a generic aircraft response variable are developed using non-Gaussian turbulence characterizations readily extracted from velocity recordings. A maximum likelihood method is developed for optimal estimation of the integral scale and intensity of records possessing von Karman transverse or longitudinal spectra. Formulas for the variances of such parameter estimates are developed. The maximum likelihood and least-squares approaches are combined to yield a method for estimating the autocorrelation function parameters of a two-component model for turbulence.
Modelling default and likelihood reasoning as probabilistic
NASA Technical Reports Server (NTRS)
Buntine, Wray
1990-01-01
A probabilistic analysis of plausible reasoning about defaults and about likelihood is presented. 'Likely' and 'by default' are in fact treated as duals in the same sense as 'possibility' and 'necessity'. To model these four forms probabilistically, a logic QDP and its quantitative counterpart DP are derived that allow qualitative and corresponding quantitative reasoning. Consistency and consequence results for subsets of the logics are given that require at most a quadratic number of satisfiability tests in the underlying propositional logic. The quantitative logic shows how to track the propagation error inherent in these reasoning forms. The methodology and sound framework of the system highlights their approximate nature, the dualities, and the need for complementary reasoning about relevance.
ANA: Astrophysical Neutrino Anisotropy
NASA Astrophysics Data System (ADS)
Denton, Peter
2017-08-01
ANA calculates the likelihood function for a model comprised of two components to the astrophysical neutrino flux detected by IceCube. The first component is extragalactic. Since point sources have not been found and there is increasing evidence that one source catalog cannot describe the entire data set, ANA models the extragalactic flux as isotropic. The second component is galactic. A variety of catalogs of interest are also provided. ANA takes the galactic contribution to be proportional to the matter density of the universe. The likelihood function has one free parameter fgal that is the fraction of the astrophysical flux that is galactic. ANA finds the best fit value of fgal and scans over 0
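This is not ANA's actual implementation, but a toy sketch of the kind of one-parameter likelihood scan described above, in which each event's density is a mixture of a galactic template and an isotropic component weighted by fgal; the template values are invented.

```python
import numpy as np

# Toy likelihood scan: each event i contributes
# p_i(fgal) = fgal * p_gal_i + (1 - fgal) * p_iso_i, where p_gal_i and p_iso_i
# are normalized sky densities evaluated at the event's direction.
p_gal = np.array([0.8, 0.1, 0.05, 1.4, 0.3, 0.02])   # galactic template at the events (made up)
p_iso = np.ones_like(p_gal)                           # isotropic component (flat)

def log_like(fgal):
    return np.sum(np.log(fgal * p_gal + (1 - fgal) * p_iso))

grid = np.linspace(0.0, 1.0, 1001)
ll = np.array([log_like(f) for f in grid])
print("best-fit fgal ~", grid[np.argmax(ll)])
```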
Exact likelihood evaluations and foreground marginalization in low resolution WMAP data
NASA Astrophysics Data System (ADS)
Slosar, Anže; Seljak, Uroš; Makarov, Alexey
2004-06-01
The large scale anisotropies of Wilkinson Microwave Anisotropy Probe (WMAP) data have attracted a lot of attention and have been a source of controversy, with many favorite cosmological models being apparently disfavored by the power spectrum estimates at low l. All the existing analyses of theoretical models are based on approximations for the likelihood function, which are likely to be inaccurate on large scales. Here we present exact evaluations of the likelihood of the low multipoles by direct inversion of the theoretical covariance matrix for low resolution WMAP maps. We project out the unwanted galactic contaminants using the WMAP derived maps of these foregrounds. This improves over the template based foreground subtraction used in the original analysis, which can remove some of the cosmological signal and may lead to a suppression of power. As a result we find an increase in power at low multipoles. For the quadrupole the maximum likelihood values are rather uncertain and vary between 140 and 220 μK2. On the other hand, the probability distribution away from the peak is robust and, assuming a uniform prior between 0 and 2000 μK2, the probability of having the true value above 1200 μK2 (as predicted by the simplest cold dark matter model with a cosmological constant) is 10%, a factor of 2.5 higher than predicted by the WMAP likelihood code. We do not find the correlation function to be unusual beyond the low quadrupole value. We develop a fast likelihood evaluation routine that can be used instead of WMAP routines for low l values. We apply it to the Markov chain Monte Carlo analysis to compare the cosmological parameters between the two cases. The new analysis of WMAP either alone or jointly with the Sloan Digital Sky Survey (SDSS) and the Very Small Array (VSA) data reduces the evidence for running to less than 1σ, giving αs=-0.022±0.033 for the combined case. The new analysis prefers about a 1σ lower value of Ωm, a consequence of an increased integrated Sachs-Wolfe (ISW) effect contribution required by the increase in the spectrum at low l. These results suggest that the details of foreground removal and full likelihood analysis are important for parameter estimation from the WMAP data. They are robust in the sense that they do not change significantly with frequency, mask, or details of foreground template marginalization. The marginalization approach presented here is the most conservative method to remove the foregrounds and should be particularly useful in the analysis of polarization, where foreground contamination may be much more severe.
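A minimal sketch of the exact pixel-space Gaussian likelihood evaluated by Cholesky factorization of the model covariance; the map and covariance below are simulated stand-ins, not WMAP data.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Exact Gaussian likelihood of a low-resolution map d given a model covariance
# C = S(theory) + N(noise):  ln L = -0.5 * (d^T C^{-1} d + ln det C + n ln 2*pi).
rng = np.random.default_rng(2)
n = 300
A = rng.normal(size=(n, n))
C = A @ A.T + n * np.eye(n)          # symmetric positive-definite stand-in covariance
d = rng.multivariate_normal(np.zeros(n), C)

cf = cho_factor(C)
chi2 = d @ cho_solve(cf, d)
logdet = 2.0 * np.sum(np.log(np.diag(cf[0])))
loglike = -0.5 * (chi2 + logdet + n * np.log(2 * np.pi))
print(f"ln L = {loglike:.1f}")
```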
Burkhardt, John C; DesJardins, Stephen L; Teener, Carol A; Gay, Steven E; Santen, Sally A
2016-11-01
In higher education, enrollment management has been developed to accurately predict the likelihood of enrollment of admitted students. This allows evidence to dictate numbers of interviews scheduled, offers of admission, and financial aid package distribution. The applicability of enrollment management techniques for use in medical education was tested through creation of a predictive enrollment model at the University of Michigan Medical School (U-M). U-M and American Medical College Application Service data (2006-2014) were combined to create a database including applicant demographics, academic application scores, institutional financial aid offer, and choice of school attended. Binomial logistic regression and multinomial logistic regression models were estimated in order to study factors related to enrollment at the local institution versus elsewhere and to groupings of competing peer institutions. A predictive analytic "dashboard" was created for practical use. Both models were significant at P < .001 and had similar predictive performance. In the binomial model female, underrepresented minority students, grade point average, Medical College Admission Test score, admissions committee desirability score, and most individual financial aid offers were significant (P < .05). The significant covariates were similar in the multinomial model (excluding female) and provided separate likelihoods of students enrolling at different institutional types. An enrollment-management-based approach would allow medical schools to better manage the number of students they admit and target recruitment efforts to improve their likelihood of success. It also performs a key institutional research function for understanding failed recruitment of highly desirable candidates.
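A hedged sketch of the binomial-logistic step of such an enrollment model, using statsmodels; every covariate name and coefficient below is a hypothetical stand-in, not the U-M dashboard.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Predict the probability that an admitted applicant enrolls, from simulated
# application and financial-aid covariates.
rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "gpa": rng.normal(3.7, 0.2, n),
    "mcat": rng.normal(515, 5, n),
    "aid_offer": rng.integers(0, 2, n),      # 1 = scholarship offered
    "committee_score": rng.normal(0, 1, n),
})
logit = -1.0 + 1.2 * df["aid_offer"] + 0.5 * df["committee_score"]
df["enrolled"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = sm.add_constant(df[["gpa", "mcat", "aid_offer", "committee_score"]])
fit = sm.Logit(df["enrolled"].astype(float), X).fit(disp=0)
df["p_enroll"] = fit.predict(X)               # dashboard-style enrollment probabilities
print(fit.params.round(2))
```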
CONSTRUCTING A FLEXIBLE LIKELIHOOD FUNCTION FOR SPECTROSCOPIC INFERENCE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czekala, Ian; Andrews, Sean M.; Mandel, Kaisey S.
2015-10-20
We present a modular, extensible likelihood framework for spectroscopic inference based on synthetic model spectra. The subtraction of an imperfect model from a continuously sampled spectrum introduces covariance between adjacent datapoints (pixels) into the residual spectrum. For the high signal-to-noise data with large spectral range that is commonly employed in stellar astrophysics, that covariant structure can lead to dramatically underestimated parameter uncertainties (and, in some cases, biases). We construct a likelihood function that accounts for the structure of the covariance matrix, utilizing the machinery of Gaussian process kernels. This framework specifically addresses the common problem of mismatches in model spectral line strengths (with respect to data) due to intrinsic model imperfections (e.g., in the atomic/molecular databases or opacity prescriptions) by developing a novel local covariance kernel formalism that identifies and self-consistently downweights pathological spectral line “outliers.” By fitting many spectra in a hierarchical manner, these local kernels provide a mechanism to learn about and build data-driven corrections to synthetic spectral libraries. An open-source software implementation of this approach is available at http://iancze.github.io/Starfish, including a sophisticated probabilistic scheme for spectral interpolation when using model libraries that are sparsely sampled in the stellar parameters. We demonstrate some salient features of the framework by fitting the high-resolution V-band spectrum of WASP-14, an F5 dwarf with a transiting exoplanet, and the moderate-resolution K-band spectrum of Gliese 51, an M5 field dwarf.
Assessing compatibility of direct detection data: halo-independent global likelihood analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gelmini, Graciela B.; Huh, Ji-Haeng; Witte, Samuel J.
2016-10-18
We present two different halo-independent methods to assess the compatibility of several direct dark matter detection data sets for a given dark matter model using a global likelihood consisting of at least one extended likelihood and an arbitrary number of Gaussian or Poisson likelihoods. In the first method we find the global best fit halo function (we prove that it is a unique piecewise constant function with a number of down steps smaller than or equal to a maximum number that we compute) and construct a two-sided pointwise confidence band at any desired confidence level, which can then be compared with those derived from the extended likelihood alone to assess the joint compatibility of the data. In the second method we define a “constrained parameter goodness-of-fit” test statistic, whose p-value we then use to define a “plausibility region” (e.g. where p≥10%). For any halo function not entirely contained within the plausibility region, the level of compatibility of the data is very low (e.g. p<10%). We illustrate these methods by applying them to CDMS-II-Si and SuperCDMS data, assuming dark matter particles with elastic spin-independent isospin-conserving interactions or exothermic spin-independent isospin-violating interactions.
Ou, Lu; Chow, Sy-Miin; Ji, Linying; Molenaar, Peter C M
2017-01-01
The autoregressive latent trajectory (ALT) model synthesizes the autoregressive model and the latent growth curve model. The ALT model is flexible enough to produce a variety of discrepant model-implied change trajectories. While some researchers consider this a virtue, others have cautioned that this may confound interpretations of the model's parameters. In this article, we show that some-but not all-of these interpretational difficulties may be clarified mathematically and tested explicitly via likelihood ratio tests (LRTs) imposed on the initial conditions of the model. We show analytically the nested relations among three variants of the ALT model and the constraints needed to establish equivalences. A Monte Carlo simulation study indicated that LRTs, particularly when used in combination with information criterion measures, can allow researchers to test targeted hypotheses about the functional forms of the change process under study. We further demonstrate when and how such tests may justifiably be used to facilitate our understanding of the underlying process of change using a subsample (N = 3,995) of longitudinal family income data from the National Longitudinal Survey of Youth.
Posterior Predictive Bayesian Phylogenetic Model Selection
Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn
2014-01-01
We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
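A minimal sketch of the CPO/LPML computation from posterior output: the CPO for each site is the harmonic mean of that site's likelihood across posterior samples, and LPML is the sum of log CPOs. The per-site log-likelihood matrix here is simulated rather than taken from a real MCMC run.

```python
import numpy as np
from scipy.special import logsumexp

# CPO_i = 1 / mean_s(1 / L_is) (harmonic mean over posterior samples s), and
# LPML = sum_i log CPO_i, computed in log space for numerical stability.
rng = np.random.default_rng(4)
S, I = 2000, 50                                   # posterior samples x sites
loglik = -1.5 + 0.3 * rng.normal(size=(S, I))     # simulated per-site log-likelihoods

log_cpo = np.log(S) - logsumexp(-loglik, axis=0)  # log of the harmonic mean
lpml = log_cpo.sum()
print(f"LPML = {lpml:.2f}")
```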
Evaluation of dynamic coastal response to sea-level rise modifies inundation likelihood
Lentz, Erika E.; Thieler, E. Robert; Plant, Nathaniel G.; Stippa, Sawyer R.; Horton, Radley M.; Gesch, Dean B.
2016-01-01
Sea-level rise (SLR) poses a range of threats to natural and built environments, making assessments of SLR-induced hazards essential for informed decision making. We develop a probabilistic model that evaluates the likelihood that an area will inundate (flood) or dynamically respond (adapt) to SLR. The broad-area applicability of the approach is demonstrated by producing 30 × 30 m resolution predictions for more than 38,000 km2 of diverse coastal landscape in the northeastern United States. Probabilistic SLR projections, coastal elevation and vertical land movement are used to estimate likely future inundation levels. Then, conditioned on future inundation levels and the current land-cover type, we evaluate the likelihood of dynamic response versus inundation. We find that nearly 70% of this coastal landscape has some capacity to respond dynamically to SLR, and we show that inundation models over-predict land likely to submerge. This approach is well suited to guiding coastal resource management decisions that weigh future SLR impacts and uncertainty against ecological targets and economic constraints.
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua
2017-07-01
Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundant genes and edges around regulatory hubs.
Applications of non-standard maximum likelihood techniques in energy and resource economics
NASA Astrophysics Data System (ADS)
Moeltner, Klaus
Two important types of non-standard maximum likelihood techniques, Simulated Maximum Likelihood (SML) and Pseudo-Maximum Likelihood (PML), have only recently found consideration in the applied economic literature. The objective of this thesis is to demonstrate how these methods can be successfully employed in the analysis of energy and resource models. Chapter I focuses on SML. It constitutes the first application of this technique in the field of energy economics. The framework is as follows: Surveys on the cost of power outages to commercial and industrial customers usually capture multiple observations on the dependent variable for a given firm. The resulting pooled data set is censored and exhibits cross-sectional heterogeneity. We propose a model that addresses these issues by allowing regression coefficients to vary randomly across respondents and by using the Geweke-Hajivassiliou-Keane simulator and Halton sequences to estimate high-order cumulative distribution terms. This adjustment requires the use of SML in the estimation process. Our framework allows for a more comprehensive analysis of outage costs than existing models, which rely on the assumptions of parameter constancy and cross-sectional homogeneity. Our results strongly reject both of these restrictions. The central topic of the second Chapter is the use of PML, a robust estimation technique, in count data analysis of visitor demand for a system of recreation sites. PML has been popular with researchers in this context, since it guards against many types of mis-specification errors. We demonstrate, however, that estimation results will generally be biased even if derived through PML if the recreation model is based on aggregate, or zonal, data. To countervail this problem, we propose a zonal model of recreation that captures some of the underlying heterogeneity of individual visitors by incorporating distributional information on per-capita income into the aggregate demand function. This adjustment eliminates the unrealistic constraint of constant income across zonal residents, and thus reduces the risk of aggregation bias in estimated macro-parameters. The corrected aggregate specification reinstates the applicability of PML. It also increases model efficiency, and allows for the generation of welfare estimates for population subgroups.
A Model-Based Diagnosis Framework for Distributed Systems
2002-05-04
[Abstract garbled in extraction; legible fragments mention centralized compilation techniques applied to several areas, of which diagnosis is one, a survey by Marco Cadoli and Francesco M. Donini, tree-structured systems, and a diagnosis synthesis algorithm that assigns a likelihood weight ri to each assumable Ai, i = 1, ..., m, and combines them with a likelihood algebra.]
The Relation Among the Likelihood Ratio-, Wald-, and Lagrange Multiplier Tests and Their Applicability to Small Samples
1982-04-01
[Abstract garbled in extraction; the legible fragments are reference citations, including Econometrica papers on conflicts among criteria for testing hypotheses (1977, 1979) and Breusch and Pagan (1980).]
A Likelihood Ratio Test Regarding Two Nested But Oblique Order Restricted Hypotheses.
1982-11-01
[Abstract garbled in extraction; legible fragments give AMS 1979 subject classifications Primary 62F03 and Secondary 62E15, state that a likelihood ratio test for the two order restrictions is studied, and note that the investigation was stimulated partly by a problem encountered in psychiatric research (Winokur et al., 1971, data on psychiatric illnesses).]
Duchesne, Thierry; Fortin, Daniel; Rivest, Louis-Paul
2015-01-01
Animal movement has a fundamental impact on population and community structure and dynamics. Biased correlated random walks (BCRW) and step selection functions (SSF) are commonly used to study movements. Because no studies have contrasted the parameters and the statistical properties of their estimators for models constructed under these two Lagrangian approaches, it remains unclear whether or not they allow for similar inference. First, we used the Weak Law of Large Numbers to demonstrate that the log-likelihood function for estimating the parameters of BCRW models can be approximated by the log-likelihood of SSFs. Second, we illustrated the link between the two approaches by fitting BCRW with maximum likelihood and with SSF to simulated movement data in virtual environments and to the trajectory of bison (Bison bison L.) trails in natural landscapes. Using simulated and empirical data, we found that the parameters of a BCRW estimated directly from maximum likelihood and by fitting an SSF were remarkably similar. Movement analysis is increasingly used as a tool for understanding the influence of landscape properties on animal distribution. In the rapidly developing field of movement ecology, management and conservation biologists must decide which method they should implement to accurately assess the determinants of animal movement. We showed that BCRW and SSF can provide similar insights into the environmental features influencing animal movements. Both techniques have advantages. BCRW has already been extended to allow for multi-state modeling. Unlike BCRW, however, SSF can be estimated using most statistical packages, it can simultaneously evaluate habitat selection and movement biases, and can easily integrate a large number of movement taxes at multiple scales. SSF thus offers a simple, yet effective, statistical technique to identify movement taxis.
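A hedged sketch of the SSF side of this comparison: a conditional-logit log-likelihood in which each observed step is contrasted with its matched available steps. The covariates and the data-generating coefficients below are simulated, not bison data.

```python
import numpy as np
from scipy.optimize import minimize

# Step selection function log-likelihood: for each stratum (one observed step
# plus K available steps), the observed step's probability is
# exp(x_obs . beta) / sum_j exp(x_j . beta)  -- a conditional logit.
rng = np.random.default_rng(5)
n_strata, K, p = 200, 10, 2
X = rng.normal(size=(n_strata, K + 1, p))       # slot 0 will hold the observed step
true_beta = np.array([0.8, -0.4])
for s in range(n_strata):                        # simulate which step is actually taken
    probs = np.exp(X[s] @ true_beta); probs /= probs.sum()
    chosen = rng.choice(K + 1, p=probs)
    X[s, [0, chosen]] = X[s, [chosen, 0]]        # move the chosen step into slot 0

def neg_loglik(beta):
    scores = X @ beta                            # shape (n_strata, K + 1)
    return -(scores[:, 0] - np.log(np.exp(scores).sum(axis=1))).sum()

fit = minimize(neg_loglik, np.zeros(p))
print("estimated selection coefficients:", fit.x.round(2))
```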
The logistic model for predicting the non-gonoactive Aedes aegypti females.
Reyes-Villanueva, Filiberto; Rodríguez-Pérez, Mario A
2004-01-01
To estimate, using logistic regression, the likelihood of occurrence of a non-gonoactive Aedes aegypti female, previously fed human blood, with relation to body size and collection method. This study was conducted in Monterrey, Mexico, between 1994 and 1996. Ten samplings of 60 mosquitoes of Ae. aegypti females were carried out in three dengue endemic areas: six of biting females, two of emerging mosquitoes, and two of indoor resting females. Gravid females, as well as those with blood in the gut, were removed. Mosquitoes were taken to the laboratory and engorged on human blood. After 48 hours, ovaries were dissected to register whether they were gonoactive or non-gonoactive. Wing-length in mm was an indicator for body size. The logistic regression model was used to assess the likelihood of non-gonoactivity, as a binary variable, in relation to wing-length and collection method. Of the 600 females, 164 (27%) remained non-gonoactive, with a wing-length range of 1.9-3.2 mm, almost equal to that of all females (1.8-3.3 mm). The logistic regression model showed a significant likelihood of a female remaining non-gonoactive (Y=1). The collection method did not influence the binary response, but there was an inverse relationship between non-gonoactivity and wing-length. Dengue vector populations from Monterrey, Mexico display a wide range of body sizes. Logistic regression was a useful tool to estimate the likelihood for an engorged female to remain non-gonoactive. The necessity for a second blood meal is present in any female, but small mosquitoes are more likely to bite again within a 2-day interval, in order to attain egg maturation. The English version of this paper is also available at: http://www.insp.mx/salud/index.html.
NASA Astrophysics Data System (ADS)
Goodman, Steven N.
1989-11-01
This dissertation explores the use of a mathematical measure of statistical evidence, the log likelihood ratio, in clinical trials. The methods and thinking behind the use of an evidential measure are contrasted with traditional methods of analyzing data, which depend primarily on a p-value as an estimate of the statistical strength of an observed data pattern. It is contended that neither the behavioral dictates of Neyman-Pearson hypothesis testing methods, nor the coherency dictates of Bayesian methods are realistic models on which to base inference. The use of the likelihood alone is applied to four aspects of trial design or conduct: the calculation of sample size, the monitoring of data, testing for the equivalence of two treatments, and meta-analysis--the combining of results from different trials. Finally, a more general model of statistical inference, using belief functions, is used to see if it is possible to separate the assessment of evidence from our background knowledge. It is shown that traditional and Bayesian methods can be modeled as two ends of a continuum of structured background knowledge, methods which summarize evidence at the point of maximum likelihood assuming no structure, and Bayesian methods assuming complete knowledge. Both schools are seen to be missing a concept of ignorance- -uncommitted belief. This concept provides the key to understanding the problem of sampling to a foregone conclusion and the role of frequency properties in statistical inference. The conclusion is that statistical evidence cannot be defined independently of background knowledge, and that frequency properties of an estimator are an indirect measure of uncommitted belief. Several likelihood summaries need to be used in clinical trials, with the quantitative disparity between summaries being an indirect measure of our ignorance. This conclusion is linked with parallel ideas in the philosophy of science and cognitive psychology.
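A small worked example of the log likelihood ratio as an evidential summary for a binomial trial outcome; the counts and hypothesized response rates are illustrative, not taken from the dissertation.

```python
import math

# Log likelihood ratio as a measure of evidence: observed data are 18 responses
# in 40 patients; compare a null response rate of 0.25 against an alternative
# of 0.50 (all numbers are made up for illustration).
def binom_loglik(p, k, n):
    return k * math.log(p) + (n - k) * math.log(1 - p)

k, n = 18, 40
llr = binom_loglik(0.50, k, n) - binom_loglik(0.25, k, n)
print(f"log likelihood ratio (H1 vs H0) = {llr:.2f}")   # ~ 3.56
# Positive values favour H1, negative values H0; a likelihood ratio of 8
# (log ~ 2.08) is sometimes used as a benchmark for fairly strong evidence.
```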
Cure rate model with interval censored data.
Kim, Yang-Jin; Jhun, Myoungshic
2008-01-15
In cancer trials, a significant fraction of patients can be cured, that is, the disease is completely eliminated, so that it never recurs. In general, treatments are developed to both increase the patients' chances of being cured and prolong the survival time among non-cured patients. A cure rate model represents a combination of a cure fraction and a survival model, and can be applied to many clinical studies over several types of cancer. In this article, the cure rate model is considered for interval censored data, in which the event time of interest is known only to lie between two observation time points. Interval censored data commonly occur in the studies of diseases that often progress without symptoms, requiring clinical evaluation for detection (Encyclopedia of Biostatistics. Wiley: New York, 1998; 2090-2095). In our study, an approximate likelihood approach suggested by Goetghebeur and Ryan (Biometrics 2000; 56:1139-1144) is used to derive the likelihood in interval censored data. In addition, a frailty model is introduced to characterize the association between the cure fraction and survival model. In particular, the positive association between the cure fraction and the survival time is incorporated by imposing a common normal frailty effect. The EM algorithm is used to estimate parameters and a multiple imputation based on the profile likelihood is adopted for variance estimation. The approach is applied to the smoking cessation study in which the event of interest is a smoking relapse and several covariates, including an intensive care treatment, are evaluated for their effects on both the occurrence of relapse and the non-smoking duration. Copyright (c) 2007 John Wiley & Sons, Ltd.
DEVELOPING A PRODUCTION POSSIBILITY SET OF WILDLIFE SPECIES PERSISTENCE AND TIMBER HARVEST VALUE
An integrated model combing a wildlife population simulation model, and timber harvest and growth models was developed to explore the tradeoffs between the likelihood of persistence of a hypothetical wildlife species and timber harvest volumes on a landscape in the Central Oregon...
Bayesian image reconstruction - The pixon and optimal image modeling
NASA Technical Reports Server (NTRS)
Pina, R. K.; Puetter, R. C.
1993-01-01
In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
Ling, Cheng; Hamada, Tsuyoshi; Gao, Jingyang; Zhao, Guoguang; Sun, Donghong; Shi, Weifeng
2016-01-01
MrBayes is a widespread phylogenetic inference tool harnessing empirical evolutionary models and Bayesian statistics. However, the computational cost of likelihood estimation is very high, resulting in undesirably long execution times. Although a number of multi-threaded optimizations have been proposed to speed up MrBayes, there are bottlenecks that severely limit the GPU thread-level parallelism of likelihood estimations. This study proposes a high performance and resource-efficient method for GPU-oriented parallelization of likelihood estimations. Instead of having to rely on empirical programming, the proposed novel decomposition storage model implements high performance data transfers implicitly. In terms of performance improvement, a speedup factor of up to 178 can be achieved on the analysis of simulated datasets by four Tesla K40 cards. In comparison to other publicly available GPU-oriented versions of MrBayes, the tgMC3++ method (proposed herein) outperforms the tgMC3 (v1.0), nMC3 (v2.1.1) and oMC3 (v1.00) methods by speedup factors of up to 1.6, 1.9 and 2.9, respectively. Moreover, tgMC3++ supports more evolutionary models and gamma categories, which previous GPU-oriented methods fail to include in their analyses.
Estimating Divergence Parameters With Small Samples From a Large Number of Loci
Wang, Yong; Hey, Jody
2010-01-01
Most methods for studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be more suited to studying relatively ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data we developed a new maximum-likelihood method for genomic data under the isolation-with-migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster. PMID:19917765
Display size effects in visual search: analyses of reaction time distributions as mixtures.
Reynolds, Ann; Miller, Jeff
2009-05-01
In a reanalysis of data from Cousineau and Shiffrin (2004) and two new visual search experiments, we used a likelihood ratio test to examine the full distributions of reaction time (RT) for evidence that the display size effect is a mixture-type effect that occurs on only a proportion of trials, leaving RT on the remaining trials unaffected, as is predicted by serial self-terminating search models. Experiment 1 was a reanalysis of Cousineau and Shiffrin's data, for which a mixture effect had previously been established by a bimodal distribution of RTs, and the results confirmed that the likelihood ratio test could also detect this mixture. Experiment 2 applied the likelihood ratio test in a more standard visual search task with a relatively easy target/distractor discrimination, and Experiment 3 applied it in a target identification search task using the same types of stimuli. Neither of these experiments provided any evidence for the mixture-type display size effect predicted by serial self-terminating search models. Overall, these results suggest that serial self-terminating search models may generally be applicable only with relatively difficult target/distractor discriminations, and then only for some participants. In addition, they further illustrate the utility of analysing full RT distributions in addition to mean RT.
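As a rough illustration of the style of analysis described above (not the authors' actual model, which was tailored to RT distributions), the sketch below compares a single-distribution fit against a two-component mixture and forms the likelihood ratio statistic; because the null hypothesis lies on the boundary of the mixture parameter space, the usual chi-square reference distribution does not apply and a parametric bootstrap is typically used for the p-value. All data here are simulated.

import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# fake RTs in ms: a fast majority plus a slowed subset of trials
rt = np.concatenate([rng.normal(500, 60, 300), rng.normal(750, 80, 100)])

# single-distribution fit: Gaussian maximum likelihood
mu, sigma = rt.mean(), rt.std()
ll0 = norm.logpdf(rt, mu, sigma).sum()

# two-component mixture fit
gm = GaussianMixture(n_components=2, random_state=0).fit(rt.reshape(-1, 1))
ll1 = gm.score(rt.reshape(-1, 1)) * len(rt)   # score() returns mean log-likelihood

lr = 2 * (ll1 - ll0)                           # likelihood ratio statistic
print(f"likelihood ratio statistic: {lr:.1f}")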
Modeling Dynamic Functional Neuroimaging Data Using Structural Equation Modeling
ERIC Educational Resources Information Center
Price, Larry R.; Laird, Angela R.; Fox, Peter T.; Ingham, Roger J.
2009-01-01
The aims of this study were to present a method for developing a path analytic network model using data acquired from positron emission tomography. Regions of interest within the human brain were identified through quantitative activation likelihood estimation meta-analysis. Using this information, a "true" or population path model was then…
IRT Model Selection Methods for Dichotomous Items
ERIC Educational Resources Information Center
Kang, Taehoon; Cohen, Allan S.
2007-01-01
Fit of the model to the data is important if the benefits of item response theory (IRT) are to be obtained. In this study, the authors compared model selection results using the likelihood ratio test, two information-based criteria, and two Bayesian methods. An example illustrated the potential for inconsistency in model selection depending on…
DOT National Transportation Integrated Search
2012-04-01
A poroelastic model is developed that can predict stress and strain distributions and, thus, ostensibly damage likelihood in concrete under freezing conditions caused by aggregates with undesirable combinations of geometry and constitutive proper...
A Review of System Identification Methods Applied to Aircraft
NASA Technical Reports Server (NTRS)
Klein, V.
1983-01-01
Airplane identification, equation error method, maximum likelihood method, parameter estimation in frequency domain, extended Kalman filter, aircraft equations of motion, aerodynamic model equations, criteria for the selection of a parsimonious model, and online aircraft identification are addressed.
The Scientific Method, Diagnostic Bayes, and How to Detect Epistemic Errors
NASA Astrophysics Data System (ADS)
Vrugt, J. A.
2015-12-01
In the past decades, Bayesian methods have found widespread application and use in environmental systems modeling. Bayes' theorem states that the posterior probability, P(H|\hat{D}), of a hypothesis H is proportional to the product of the prior probability, P(H), of this hypothesis and the likelihood, L(H|\hat{D}), of the same hypothesis given the new/incoming observations, \hat{D}. In science and engineering, H often constitutes some numerical simulation model, D = F(x,\cdot), which summarizes, using algebraic, empirical, and differential equations describing state variables and fluxes, all our theoretical and/or practical knowledge of the system of interest, and x are the d unknown parameters which are subject to inference using some data, \hat{D}, of the observed system response. The Bayesian approach is intimately related to the scientific method and uses an iterative cycle of hypothesis formulation (model), experimentation and data collection, and theory/hypothesis refinement to elucidate the rules that govern the natural world. Unfortunately, model refinement has proven to be very difficult, in large part because of the poor diagnostic power of residual-based likelihood functions (Gupta et al., 2008). This has inspired Vrugt (2013) to advocate the use of 'likelihood-free' inference using approximate Bayesian computation (ABC). This approach uses one or more summary statistics, S(\hat{D}), of the original data, \hat{D}, designed ideally to be sensitive only to one particular process in the model. Any mismatch between the observed and simulated summary metrics is then easily linked to a specific model component. A recurrent issue with the application of ABC is self-sufficiency of the summary statistics. In theory, S(\cdot) should contain as much information as the original data itself, yet complex systems rarely admit sufficient statistics. In this article, we propose to combine the ideas of ABC and regular Bayesian inference to guarantee that no information is lost in diagnostic model evaluation. This hybrid approach, coined diagnostic Bayes, uses the summary metrics as prior distribution and the original data in the likelihood function, or P(x|\hat{D}) ∝ P(x|S(\hat{D})) L(x|\hat{D}). A case study illustrates the ability of the proposed methodology to diagnose epistemic errors and provide guidance on model refinement.
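A minimal numerical sketch of the diagnostic Bayes formula quoted above, P(x|\hat{D}) ∝ P(x|S(\hat{D})) L(x|\hat{D}), is given below; the toy model (a normal mean with the sample mean as summary statistic) and all numbers are my own illustration, not the case study from the abstract.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(2.0, 1.0, 50)           # observed system response D-hat
S = data.mean()                            # summary statistic S(D-hat)

x_grid = np.linspace(0, 4, 401)            # candidate parameter values
# "prior" from the summary metric: posterior of x given only S (S ~ N(x, 1/n))
log_prior = norm.logpdf(S, loc=x_grid, scale=1.0 / np.sqrt(len(data)))
# likelihood of the full original data
log_lik = np.array([norm.logpdf(data, loc=x, scale=1.0).sum() for x in x_grid])

log_post = log_prior + log_lik             # diagnostic-Bayes combination
post = np.exp(log_post - log_post.max())
post /= np.trapz(post, x_grid)
print("posterior mode:", x_grid[post.argmax()])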
Implementing Restricted Maximum Likelihood Estimation in Structural Equation Models
ERIC Educational Resources Information Center
Cheung, Mike W.-L.
2013-01-01
Structural equation modeling (SEM) is now a generic modeling framework for many multivariate techniques applied in the social and behavioral sciences. Many statistical models can be considered either as special cases of SEM or as part of the latent variable modeling framework. One popular extension is the use of SEM to conduct linear mixed-effects…
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
Planck 2013 results. XV. CMB power spectra and likelihood
NASA Astrophysics Data System (ADS)
Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Armitage-Caplan, C.; Arnaud, M.; Ashdown, M.; Atrio-Barandela, F.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartlett, J. G.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bobin, J.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bridges, M.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Chiang, L.-Y.; Christensen, P. R.; Church, S.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Couchot, F.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Delouis, J.-M.; Désert, F.-X.; Dickinson, C.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Dunkley, J.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Gaier, T. C.; Galeotta, S.; Galli, S.; Ganga, K.; Giard, M.; Giardino, G.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D.; Helou, G.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jewell, J.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kiiveri, K.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Knox, L.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Laureijs, R. J.; Lawrence, C. R.; Le Jeune, M.; Leach, S.; Leahy, J. P.; Leonardi, R.; León-Tavares, J.; Lesgourgues, J.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; Lindholm, V.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maffei, B.; Maino, D.; Mandolesi, N.; Marinucci, D.; Maris, M.; Marshall, D. J.; Martin, P. G.; Martínez-González, E.; Masi, S.; Massardi, M.; Matarrese, S.; Matthai, F.; Mazzotta, P.; Meinhold, P. R.; Melchiorri, A.; Mendes, L.; Menegoni, E.; Mennella, A.; Migliaccio, M.; Millea, M.; Mitra, S.; Miville-Deschênes, M.-A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; O'Dwyer, I. J.; Orieux, F.; Osborne, S.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Paladini, R.; Paoletti, D.; Partridge, B.; Pasian, F.; Patanchon, G.; Paykari, P.; Perdereau, O.; Perotto, L.; Perrotta, F.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Popa, L.; Poutanen, T.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rahlin, A.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Ricciardi, S.; Riller, T.; Ringeval, C.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Roudier, G.; Rowan-Robinson, M.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Sanselme, L.; Santos, D.; Savini, G.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Spencer, L. D.; Starck, J.-L.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sureau, F.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. 
A.; Tavagnacco, D.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Tuovinen, J.; Türler, M.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Varis, J.; Vielva, P.; Villa, F.; Vittorio, N.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; White, M.; White, S. D. M.; Yvon, D.; Zacchei, A.; Zonca, A.
2014-11-01
This paper presents the Planck 2013 likelihood, a complete statistical description of the two-point correlation function of the CMB temperature fluctuations that accounts for all known relevant uncertainties, both instrumental and astrophysical in nature. We use this likelihood to derive our best estimate of the CMB angular power spectrum from Planck over three decades in multipole moment, ℓ, covering 2 ≤ ℓ ≤ 2500. The main source of uncertainty at ℓ ≲ 1500 is cosmic variance. Uncertainties in small-scale foreground modelling and instrumental noise dominate the error budget at higher ℓs. For ℓ < 50, our likelihood exploits all Planck frequency channels from 30 to 353 GHz, separating the cosmological CMB signal from diffuse Galactic foregrounds through a physically motivated Bayesian component separation technique. At ℓ ≥ 50, we employ a correlated Gaussian likelihood approximation based on a fine-grained set of angular cross-spectra derived from multiple detector combinations between the 100, 143, and 217 GHz frequency channels, marginalising over power spectrum foreground templates. We validate our likelihood through an extensive suite of consistency tests, and assess the impact of residual foreground and instrumental uncertainties on the final cosmological parameters. We find good internal agreement among the high-ℓ cross-spectra with residuals below a few μK² at ℓ ≲ 1000, in agreement with estimated calibration uncertainties. We compare our results with foreground-cleaned CMB maps derived from all Planck frequencies, as well as with cross-spectra derived from the 70 GHz Planck map, and find broad agreement in terms of spectrum residuals and cosmological parameters. We further show that the best-fit ΛCDM cosmology is in excellent agreement with preliminary Planck EE and TE polarisation spectra. We find that the standard ΛCDM cosmology is well constrained by Planck from the measurements at ℓ ≲ 1500. One specific example is the spectral index of scalar perturbations, for which we report a 5.4σ deviation from scale invariance, ns = 1. Increasing the multipole range beyond ℓ ≃ 1500 does not increase our accuracy for the ΛCDM parameters, but instead allows us to study extensions beyond the standard model. We find no indication of significant departures from the ΛCDM framework. Finally, we report a tension between the Planck best-fit ΛCDM model and the low-ℓ spectrum in the form of a power deficit of 5-10% at ℓ ≲ 40, with a statistical significance of 2.5-3σ. Without a theoretically motivated model for this power deficit, we do not elaborate further on its cosmological implications, but note that this is our most puzzling finding in an otherwise remarkably consistent data set.
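The high-ℓ part of the likelihood described above is, schematically, a correlated Gaussian in the band powers, -2 ln L = (Ĉ - C(θ))ᵀ M⁻¹ (Ĉ - C(θ)) + ln|M| + const. A minimal sketch of evaluating such a likelihood is shown below; the dimensions and covariance are placeholders, not Planck data.

import numpy as np

def gauss_loglike(c_hat, c_model, cov):
    """Correlated Gaussian log-likelihood for a vector of binned band powers."""
    r = c_hat - c_model
    chol = np.linalg.cholesky(cov)             # cov = L L^T
    z = np.linalg.solve(chol, r)               # z = L^{-1} r, so z.z = r^T cov^{-1} r
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -0.5 * (z @ z + logdet + len(r) * np.log(2.0 * np.pi))

# toy usage with a 2-bin "spectrum"
cov = np.array([[2.0, 0.5], [0.5, 1.5]])
print(gauss_loglike(np.array([1.0, 2.0]), np.array([0.9, 2.2]), cov))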
Probabilistic Modeling and Evaluation of Surf Zone Injury Occurrence along the Delaware Coast
NASA Astrophysics Data System (ADS)
Doelp, M.; Puleo, J. A.
2017-12-01
Beebe Healthcare in Lewes, DE, collected surf zone injury (SZI) data along the DE coast for seven summer seasons from 2010 through 2016. Data include, but are not limited to, time of injury, gender, age, and activity. Over 2000 injuries were recorded over the seven-year period, including 116 spinal injuries and three fatalities. These injuries are predominantly wave-related incidents, including wading (41%), bodysurfing (26%), and body-boarding (20%). Despite the large number of injuries, beach-associated hazards do not receive the same level of awareness that rip currents receive. Injury population statistics revealed that those between the ages of 11 and 15 years old suffered the greatest proportion of injuries (18.8%). Male water users were twice as likely to sustain injury as their female counterparts. Also, non-locals were roughly six times more likely to sustain injury than locals. In 2016, five or more injuries occurred on 18.5% of the days sampled, and no injuries occurred on 31.4% of the sample days. The episodic nature of injury occurrence and the population statistics indicate the importance of environmental conditions and human behavior on surf zone injuries. Higher-order statistics are necessary to effectively assess SZI cause and likelihood of occurrence on a particular day. A Bayesian network using Netica software (Norsys) was constructed to model SZI and predict changes in injury likelihood on an hourly basis. The network incorporates environmental data collected by weather stations, NDBC buoy #44009, the USACE buoy at Bethany Beach, and research personnel on the beach. The Bayesian model includes prior (e.g., historic) information to infer relationships between the provided parameters. Sensitivity analysis determined that the most influential variables for injury likelihood are population, water temperature, nearshore wave height, beach slope, and the day of the week. Forecasting during the 2017 summer season will test the model's ability to predict injury likelihood.
Yuan, Quan; Lu, Meng; Theofilatos, Athanasios; Li, Yi-Bing
2017-02-01
Rear-end crashes account for a large portion of total crashes in China and lead to many casualties and much property damage, especially when commercial vehicles are involved. This paper aims to investigate the critical factors for occupant injury severity in the specific rear-end crash type involving trucks as the front vehicle (FV). We investigated crashes that occurred from 2011 to 2013 in the Beijing area, China, and selected 100 qualified cases, i.e., rear-end crashes involving trucks as the FV. The crash data were supplemented with interviews with police officers and vehicle inspections. A binary logistic regression model was used to relate occupant injury severity to the corresponding affecting factors. Moreover, a multinomial logistic model was used to predict the likelihood of fatal injury, severe injury, or no injury in a rear-end crash. The results provide insights into the characteristics of driver, vehicle, and environment, and their corresponding influences on the likelihood of a rear-end crash. The binary logistic model showed that drivers' age, the weight difference between vehicles, visibility conditions, and the number of lanes significantly increased the likelihood of severe injury in a rear-end crash. The multinomial logistic model and the average direct pseudo-elasticity of variables showed that night time, weekdays, drivers from other provinces, and passenger vehicles as rear vehicles significantly increased the likelihood of rear drivers being fatally injured. All of the above significant factors should be addressed, for example the lighting conditions and the layout of lanes on roads. Two of the most common driver factors are drivers' age and drivers' place of origin: young drivers and non-local drivers have a higher injury severity. It is therefore imperative to enhance safety education and management for young drivers who drive heavy-duty trucks from other cities to Beijing on weekdays. Copyright © 2016 Daping Hospital and the Research Institute of Surgery of the Third Military Medical University. Production and hosting by Elsevier B.V. All rights reserved.
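A hedged sketch of the two regression models described above is given below using statsmodels; the variable names and the synthetic data are placeholders chosen only to make the snippet self-contained, not the study's actual coding of the crash data.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
# hypothetical predictors
df = pd.DataFrame({
    "driver_age": rng.integers(20, 70, n),
    "weight_diff": rng.normal(8.0, 3.0, n),   # FV minus RV weight, tonnes
    "night": rng.integers(0, 2, n),           # visibility proxy
    "lanes": rng.integers(1, 4, n),
})
X = sm.add_constant(df.astype(float))

# synthetic outcomes, only so the example runs
severe = rng.integers(0, 2, n)                 # 0 = not severe, 1 = severe
severity = rng.integers(0, 3, n)               # 0 = none, 1 = severe, 2 = fatal

binary_fit = sm.Logit(severe, X).fit(disp=0)   # binary logistic model
print(binary_fit.params)

mnl_fit = sm.MNLogit(severity, X).fit(disp=0)  # multinomial logistic model
print(mnl_fit.params)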
Figler, Bradley D; Mack, Christopher D; Kaufman, Robert; Wessells, Hunter; Bulger, Eileen; Smith, Thomas G; Voelzke, Bryan
2014-03-01
The National Highway Traffic Safety Administration's New Car Assessment Program (NCAP) implemented side-impact crash testing on all new vehicles since 1998 to assess the likelihood of major thoracoabdominal injuries during a side-impact crash. Higher crash test rating is intended to indicate a safer car, but the real-world applicability of these ratings is unknown. Our objective was to determine the relationship between a vehicle's NCAP side-impact crash test rating and the risk of major thoracoabdominal injury among the vehicle's occupants in real-world side-impact motor vehicle crashes. The National Automotive Sampling System Crashworthiness Data System contains detailed crash and injury data in a sample of major crashes in the United States. For model years 1998 to 2010 and crash years 1999 to 2010, 68,124 occupants were identified in the Crashworthiness Data System database. Because 47% of cases were missing crash severity (ΔV), multiple imputation was used to estimate the missing values. The primary predictor of interest was the occupant vehicle's NCAP side-impact crash test rating, and the outcome of interest was the presence of major (Abbreviated Injury Scale [AIS] score ≥ 3) thoracoabdominal injury. In multivariate analysis, increasing NCAP crash test rating was associated with lower likelihood of major thoracoabdominal injury at high (odds ratio [OR], 0.8; 95% confidence interval [CI], 0.7-0.9; p < 0.01) and medium (OR, 0.9; 95% CI, 0.8-1.0; p < 0.05) crash severity (ΔV), but not at low ΔV (OR, 0.95; 95% CI, 0.8-1.2; p = 0.55). In our model, older age and absence of seat belt use were associated with greater likelihood of major thoracoabdominal injury at low and medium ΔV (p < 0.001), but not at high ΔV (p ≥ 0.09). Among adults in model year 1998 to 2010 vehicles involved in medium and high severity motor vehicle crashes, a higher NCAP side-impact crash test rating is associated with a lower likelihood of major thoracoabdominal trauma. Epidemiologic study, level III.
Assessing Individual Weather Risk-Taking and Its Role in Modeling Likelihood of Hurricane Evacuation
NASA Astrophysics Data System (ADS)
Stewart, A. E.
2017-12-01
This research focuses upon measuring an individual's level of perceived risk of different severe and extreme weather conditions using a new self-report measure, the Weather Risk-Taking Scale (WRTS). For 32 severe and extreme situations in which people could perform an unsafe behavior (e.g., remaining outside with lightning striking close by, driving over roadways covered with water, not evacuating ahead of an approaching hurricane, etc.), people rated: (1) their likelihood of performing the behavior, (2) the perceived risk of performing the behavior, (3) the expected benefits of performing the behavior, and (4) whether the behavior has actually been performed in the past. Initial development research with the measure using 246 undergraduate students examined its psychometric properties and found that it was internally consistent (Cronbach's α ranged from .87 to .93 for the four scales) and that the scales possessed good temporal (test-retest) reliability (r's ranged from .84 to .91). A second regression study involving 86 undergraduate students found that taking weather risks was associated with having taken similar risks in one's past and with the personality trait of sensation-seeking. Being more attentive to the weather and perceiving its risks when it became extreme was associated with lower likelihoods of taking weather risks (overall regression model, R²adj = 0.60). A third study involving 334 people examined the contributions of weather risk perceptions and risk-taking in modeling the self-reported likelihood of complying with a recommended evacuation ahead of a hurricane. Here, higher perceptions of hurricane risks and lower perceived benefits of risk-taking, along with fear of severe weather and hurricane personal self-efficacy ratings, were all statistically significant contributors to the likelihood of evacuating ahead of a hurricane. Psychological rootedness and attachment to one's home also tend to predict lack of evacuation. This research highlights the contributions that a psychological approach can offer in understanding preparations for severe weather. This approach also suggests that a great deal of individual variation exists in weather-protective behaviors, which may explain in part why some people take weather-related risks despite receiving warnings for severe weather.
Maximum likelihood: Extracting unbiased information from complex networks
NASA Astrophysics Data System (ADS)
Garlaschelli, Diego; Loffredo, Maria I.
2008-07-01
The choice of free parameters in network models is subjective, since it depends on what topological properties are being monitored. However, we show that the maximum likelihood (ML) principle indicates a unique, statistically rigorous parameter choice, associated with a well-defined topological feature. We then find that, if the ML condition is incompatible with the built-in parameter choice, network models turn out to be intrinsically ill defined or biased. To overcome this problem, we construct a class of safely unbiased models. We also propose an extension of these results that leads to the fascinating possibility to extract, only from topological data, the “hidden variables” underlying network organization, making them “no longer hidden.” We test our method on World Trade Web data, where we recover the empirical gross domestic product using only topological information.
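As a toy illustration of the point that the maximum likelihood condition ties each free parameter to a specific topological property, consider the simplest one-parameter network model, in which every pair of nodes is connected independently with probability p; maximizing the likelihood forces p to reproduce exactly one feature of the observed graph, the number of links. This example is mine and far simpler than the models analyzed in the paper.

import numpy as np

# observed undirected adjacency matrix (toy data)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]])

n = A.shape[0]
pairs = n * (n - 1) // 2
links = A[np.triu_indices(n, k=1)].sum()

# log L(p) = links*log(p) + (pairs - links)*log(1 - p), maximized at p = links / pairs
p_ml = links / pairs
print(f"ML estimate of connection probability: {p_ml:.3f}")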
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real-life examples are presented to justify the need for suitable robust statistical procedures in place of likelihood-based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
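For reference, the density power divergence underlying these estimators is usually written, following Basu et al. (1998), as

d_\alpha(g, f_\theta) = \int \Big\{ f_\theta^{1+\alpha}(z) - \Big(1 + \tfrac{1}{\alpha}\Big) g(z)\, f_\theta^{\alpha}(z) + \tfrac{1}{\alpha}\, g^{1+\alpha}(z) \Big\}\, dz, \qquad \alpha > 0,

where g is the true density and f_\theta the model density; the minimum density power divergence estimator minimizes the empirical version of this criterion, and as α → 0 it reduces to the maximum likelihood estimator. The regression-specific formulation in the article will differ in its details.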
On modeling animal movements using Brownian motion with measurement error.
Pozdnyakov, Vladimir; Meyer, Thomas; Wang, Yu-Bo; Yan, Jun
2014-02-01
Modeling animal movements with Brownian motion (or more generally by a Gaussian process) has a long tradition in ecological studies. The recent Brownian bridge movement model (BBMM), which incorporates measurement errors, has been quickly adopted by ecologists because of its simplicity and tractability. We discuss some nontrivial properties of the discrete-time stochastic process that results from observing a Brownian motion with added normal noise at discrete times. In particular, we demonstrate that the observed sequence of random variables is not Markov. Consequently the expected occupation time between two successively observed locations does not depend on just those two observations; the whole path must be taken into account. Nonetheless, the exact likelihood function of the observed time series remains tractable; it requires only sparse matrix computations. The likelihood-based estimation procedure is described in detail and compared to the BBMM estimation.
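A dense-matrix sketch of the exact likelihood referred to above is given below for the one-dimensional case with a known starting point: a Brownian motion with diffusion parameter σ_B observed with independent N(0, σ_e²) measurement error has jointly Gaussian observations with cov(Y_i, Y_j) = σ_B² min(t_i, t_j) + σ_e² 1{i = j}. The paper exploits sparse-matrix structure; this version is only illustrative.

import numpy as np
from scipy.stats import multivariate_normal

def bm_noise_loglik(y, t, sigma_b, sigma_e):
    """Exact log-likelihood of Brownian motion observed with Gaussian noise."""
    cov = sigma_b**2 * np.minimum.outer(t, t) + sigma_e**2 * np.eye(len(t))
    return multivariate_normal(mean=np.zeros(len(t)), cov=cov).logpdf(y)

# toy observation times and locations
t = np.array([0.5, 1.0, 1.7, 2.3, 3.0])
y = np.array([0.2, 0.5, 0.3, 0.9, 1.1])
print(bm_noise_loglik(y, t, sigma_b=1.0, sigma_e=0.3))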
Fang, Yun; Wu, Hulin; Zhu, Li-Xing
2011-07-01
We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach and its asymptotic properties for population parameters are established. The proposed method does not require repeatedly solving ODEs, and is computationally efficient although it does pay a price with the loss of some estimation efficiency. However, the method does offer an alternative approach when the exact likelihood approach fails due to model complexity and high-dimensional parameter space, and it can also serve as a method to obtain the starting estimates for more accurate estimation methods. In addition, the proposed method does not need to specify the initial values of state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations and the methodology is also illustrated with application to an AIDS clinical data set.
Hirose, H
1997-01-01
This paper proposes a new treatment of electrical insulation degradation. Some types of insulation that have been used under various circumstances are considered to degrade at various rates in accordance with their stress circumstances. The cross-linked polyethylene (XLPE) insulated cables inspected by major Japanese electric companies clearly indicate such phenomena. By assuming that each inspected specimen is sampled from one of the clustered groups, a mixed degradation model can be constructed. Since the degradation of insulation under common circumstances is considered to follow a Weibull distribution, a mixture model and a Weibull power law can be combined; this is called the mixture Weibull power law model. By applying maximum likelihood estimation for the newly proposed model to Japanese 22 and 33 kV insulation class cables, the cables are clustered into a certain number of groups using the AIC and the generalized likelihood ratio test method. The reliability of the cables at specified years is then assessed.
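A simplified sketch of fitting a two-component Weibull mixture by direct maximum likelihood is shown below; the Weibull power law (stress dependence), censoring, and the AIC/likelihood-ratio clustering used in the paper are omitted, and the data are simulated.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

# simulated lifetimes from two Weibull groups
x = np.concatenate([weibull_min.rvs(1.5, scale=10, size=200, random_state=1),
                    weibull_min.rvs(3.0, scale=30, size=100, random_state=2)])

def neg_loglik(theta):
    w = 1.0 / (1.0 + np.exp(-theta[0]))        # mixing weight in (0, 1)
    k1, c1, k2, c2 = np.exp(theta[1:])          # positive shapes and scales
    pdf = w * weibull_min.pdf(x, k1, scale=c1) + (1 - w) * weibull_min.pdf(x, k2, scale=c2)
    return -np.log(pdf).sum()

res = minimize(neg_loglik,
               x0=[0.0, np.log(1.5), np.log(10.0), np.log(3.0), np.log(30.0)],
               method="Nelder-Mead")
print(res.x)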
Liu, Xiang; Peng, Yingwei; Tu, Dongsheng; Liang, Hua
2012-10-30
Survival data with a sizable cure fraction are commonly encountered in cancer research. The semiparametric proportional hazards cure model has been recently used to analyze such data. As seen in the analysis of data from a breast cancer study, a variable selection approach is needed to identify important factors in predicting the cure status and risk of breast cancer recurrence. However, no specific variable selection method for the cure model is available. In this paper, we present a variable selection approach with penalized likelihood for the cure model. The estimation can be implemented easily by combining the computational methods for penalized logistic regression and the penalized Cox proportional hazards models with the expectation-maximization algorithm. We illustrate the proposed approach on data from a breast cancer study. We conducted Monte Carlo simulations to evaluate the performance of the proposed method. We used and compared different penalty functions in the simulation studies. Copyright © 2012 John Wiley & Sons, Ltd.
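Schematically, a penalized likelihood of the kind described above can be written as (my notation, with β for the incidence/logistic coefficients and γ for the latency/Cox coefficients)

\ell_p(\beta, \gamma) = \ell(\beta, \gamma) - n \sum_j p_{\lambda_1}(|\beta_j|) - n \sum_k p_{\lambda_2}(|\gamma_k|),

where p_λ(·) is a penalty function such as the LASSO or SCAD penalty; maximization alternates penalized logistic and penalized Cox steps inside the EM algorithm, as the abstract describes.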
Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H
2007-02-01
Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
Maximum Likelihood Analysis of Low Energy CDMS II Germanium Data
Agnese, R.
2015-03-30
We report on the results of a search for a Weakly Interacting Massive Particle (WIMP) signal in low-energy data of the Cryogenic Dark Matter Search experiment using a maximum likelihood analysis. A background model is constructed using GEANT4 to simulate the surface-event background from 210Pb decay-chain events, while using independent calibration data to model the gamma background. Fitting this background model to the data results in no statistically significant WIMP component. In addition, we also perform fits using an analytic ad hoc background model proposed by Collar and Fields, who claimed to find a large excess of signal-like events in our data. Finally, we confirm the strong preference for a signal hypothesis in their analysis under these assumptions, but excesses are observed in both single- and multiple-scatter events, which implies the signal is not caused by WIMPs, but rather reflects the inadequacy of their background model.
Parameter Estimation and Model Selection for Indoor Environments Based on Sparse Observations
NASA Astrophysics Data System (ADS)
Dehbi, Y.; Loch-Dehbi, S.; Plümer, L.
2017-09-01
This paper presents a novel method for the parameter estimation and model selection for the reconstruction of indoor environments based on sparse observations. While most approaches for the reconstruction of indoor models rely on dense observations, we predict scenes of the interior with high accuracy in the absence of indoor measurements. We use a model-based top-down approach and incorporate strong but profound prior knowledge. The latter includes probability density functions for model parameters and sparse observations such as room areas and the building footprint. The floorplan model is characterized by linear and bi-linear relations with discrete and continuous parameters. We focus on the stochastic estimation of model parameters based on a topological model derived by combinatorial reasoning in a first step. A Gauss-Markov model is applied for estimation and simulation of the model parameters. Symmetries are represented and exploited during the estimation process. Background knowledge as well as observations are incorporated in a maximum likelihood estimation and model selection is performed with AIC/BIC. The likelihood is also used for the detection and correction of potential errors in the topological model. Estimation results are presented and discussed.
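For reference, the model selection criteria mentioned above are the standard ones,

\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln n - 2\ln\hat{L},

where k is the number of estimated parameters, n the number of observations, and \hat{L} the maximized likelihood; lower values indicate the preferred model.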
Cognitive Processing of Fear-Arousing Message Content.
ERIC Educational Resources Information Center
Hale, Jerold L.; And Others
1995-01-01
Investigates two models (the Elaboration Likelihood Model and the Heuristic-Systematic Model) of the cognitive processing of fear-arousing messages in undergraduate students. Finds in three of the four conditions (low fear, high fear, high trait anxiety) that cognitive processing appears to be antagonistic. Finds some evidence of concurrent…
Sensitivity of Fit Indices to Misspecification in Growth Curve Models
ERIC Educational Resources Information Center
Wu, Wei; West, Stephen G.
2010-01-01
This study investigated the sensitivity of fit indices to model misspecification in within-individual covariance structure, between-individual covariance structure, and marginal mean structure in growth curve models. Five commonly used fit indices were examined, including the likelihood ratio test statistic, root mean square error of…
ERIC Educational Resources Information Center
Formann, Anton K.
1986-01-01
It is shown that for equal parameters explicit formulas exist, facilitating the application of the Newton-Raphson procedure to estimate the parameters in the Rasch model and related models according to the conditional maximum likelihood principle. (Author/LMO)
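The Newton-Raphson iteration referred to here has the generic form

\theta^{(t+1)} = \theta^{(t)} - \left[\frac{\partial^2 \ell_c(\theta^{(t)})}{\partial\theta\,\partial\theta^\top}\right]^{-1} \frac{\partial \ell_c(\theta^{(t)})}{\partial\theta},

where θ collects the item parameters and ℓ_c is the conditional log-likelihood; the explicit formulas referred to in the abstract supply the gradient and Hessian in closed form for the equal-parameter case.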
Measurement and Structural Model Class Separation in Mixture CFA: ML/EM versus MCMC
ERIC Educational Resources Information Center
Depaoli, Sarah
2012-01-01
Parameter recovery was assessed within mixture confirmatory factor analysis across multiple estimator conditions under different simulated levels of mixture class separation. Mixture class separation was defined in the measurement model (through factor loadings) and the structural model (through factor variances). Maximum likelihood (ML) via the…