Decadal climate predictions improved by ocean ensemble dispersion filtering
NASA Astrophysics Data System (ADS)
Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.
2017-06-01
Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in-between these two time scales. In recent years, more precise initialization techniques of coupled Earth system models and increased ensemble sizes have improved decadal predictions. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, yields more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean.
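The core operation described in the abstract above, shifting each member's ocean state toward the ensemble mean, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `dispersion_filter`, the relaxation factor `alpha`, and the toy two-value state vectors are hypothetical and not taken from the study, which applies the shift to a full ocean model state at seasonal intervals.

```python
from statistics import mean


def dispersion_filter(members, alpha=1.0):
    """Shift every member's state toward the ensemble mean.

    members: list of equally long lists, one state vector per member.
    alpha:   hypothetical relaxation factor (an assumption of this sketch);
             1.0 replaces each member by the ensemble mean,
             0.0 leaves the ensemble unchanged.
    """
    # Ensemble mean, computed separately for each state variable.
    ens_mean = [mean(values) for values in zip(*members)]
    return [
        [x + alpha * (m - x) for x, m in zip(member, ens_mean)]
        for member in members
    ]


# Toy ensemble: three members, two state variables each.
ocean = [[10.0, 4.0], [12.0, 6.0], [14.0, 8.0]]
filtered = dispersion_filter(ocean, alpha=0.5)
```

With `alpha=0.5` the ensemble mean is unchanged while the distance of every member from that mean is halved, which is the "dispersion filtering" effect in its simplest form.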
Numerical weather prediction model tuning via ensemble prediction system
NASA Astrophysics Data System (ADS)
Jarvinen, H.; Laine, M.; Ollinaho, P.; Solonen, A.; Haario, H.
2011-12-01
This paper discusses a novel approach to tune the predictive skill of numerical weather prediction (NWP) models. NWP models contain tunable parameters which appear in parameterization schemes of sub-grid scale physical processes. Currently, numerical values of these parameters are specified manually. In a recent dual manuscript (QJRMS, revised) we developed a new concept and method for on-line estimation of the NWP model parameters. The EPPES ("Ensemble prediction and parameter estimation system") method requires only minimal changes to the existing operational ensemble prediction infrastructure and it seems very cost-effective because practically no new computations are introduced. The approach provides an algorithmic decision making tool for model parameter optimization in operational NWP. In EPPES, statistical inference about the NWP model tunable parameters is made by (i) generating each member of the ensemble of predictions using different model parameter values, drawn from a proposal distribution, and (ii) feeding back the relative merits of the parameter values to the proposal distribution, based on evaluation of a suitable likelihood function against verifying observations. In the presentation, the method is first illustrated in low-order numerical tests using a stochastic version of the Lorenz-95 model which effectively emulates the principal features of ensemble prediction systems. The EPPES method correctly detects the unknown and wrongly specified parameter values, and leads to an improved forecast skill. Second, results with an atmospheric general circulation model based ensemble prediction system show that the NWP model tuning capacity of EPPES scales up to realistic models and ensemble prediction systems. Finally, a global top-end NWP model tuning exercise with preliminary results is published.
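The two-step loop described in the abstract above, (i) drawing parameter values from a proposal distribution and (ii) feeding their relative merits back into it, can be illustrated with a toy example. The sketch below is a simplified importance-weighting scheme and should not be read as the actual EPPES update, which the abstract does not specify. The Gaussian proposal, the function names, the toy likelihood with a "true" parameter of 2.0, and the `sigma_floor` that keeps the proposal from collapsing are all assumptions made for illustration.

```python
import math
import random


def update_proposal(mu, sigma, n_members, log_likelihood, rng, sigma_floor=0.2):
    """One illustrative cycle: sample, weight by likelihood, refit the proposal."""
    # (i) Draw one parameter value per ensemble member from the proposal.
    thetas = [rng.gauss(mu, sigma) for _ in range(n_members)]
    # (ii) Score each member against "observations" and normalise the weights.
    logw = [log_likelihood(t) for t in thetas]
    top = max(logw)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in logw]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Feed the merits back: weighted mean and spread become the new proposal.
    new_mu = sum(w * t for w, t in zip(weights, thetas))
    new_var = sum(w * (t - new_mu) ** 2 for w, t in zip(weights, thetas))
    return new_mu, max(math.sqrt(new_var), sigma_floor)


def toy_log_likelihood(theta, truth=2.0, obs_error=0.5):
    """Hypothetical stand-in for forecast verification against observations."""
    return -0.5 * ((theta - truth) / obs_error) ** 2


rng = random.Random(0)
mu, sigma = 0.0, 1.0  # deliberately wrong initial parameter guess
for _ in range(40):
    mu, sigma = update_proposal(mu, sigma, 50, toy_log_likelihood, rng)
```

In this toy setting the proposal mean drifts from the wrong initial value toward the parameter value favoured by the likelihood, while each cycle costs only the ensemble forecasts that would be run anyway.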
Ocean Predictability and Uncertainty Forecasts Using Local Ensemble Transfer Kalman Filter (LETKF)
NASA Astrophysics Data System (ADS)
Wei, M.; Hogan, P. J.; Rowley, C. D.; Smedstad, O. M.; Wallcraft, A. J.; Penny, S. G.
2017-12-01
Ocean predictability and uncertainty are studied with an ensemble system that has been developed based on the US Navy's operational HYCOM using the Local Ensemble Transfer Kalman Filter (LETKF) technology. One of the advantages of this method is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates operational observations using an ensemble method. The background covariance during this assimilation process is implicitly supplied by the ensemble, avoiding the difficult task of developing tangent linear and adjoint models out of HYCOM, with its complicated hybrid isopycnal vertical coordinate, for 4D-Var. The flow-dependent background covariance from the ensemble will be an indispensable part of the next generation hybrid 4D-Var/ensemble data assimilation system. The predictability and uncertainty for the ocean forecasts are studied initially for the Gulf of Mexico. The results are compared with another ensemble system using the Ensemble Transfer (ET) method, which has been used in the Navy's operational center. The advantages and disadvantages are discussed.
A comparison of breeding and ensemble transform vectors for global ensemble generation
NASA Astrophysics Data System (ADS)
Deng, Guo; Tian, Hua; Li, Xiaoli; Chen, Jing; Gong, Jiandong; Jiao, Meiyan
2012-02-01
To compare the initial perturbation techniques using breeding vectors and ensemble transform vectors, three ensemble prediction systems using both initial perturbation methods but with different ensemble member sizes based on the spectral model T213/L31 are constructed at the National Meteorological Center, China Meteorological Administration (NMC/CMA). A series of ensemble verification scores such as forecast skill of the ensemble mean, ensemble resolution, and ensemble reliability are introduced to identify the most important attributes of ensemble forecast systems. The results indicate that the ensemble transform technique is superior to the breeding vector method in light of the evaluation of the anomaly correlation coefficient (ACC), which is a deterministic attribute of the ensemble mean, the root-mean-square error (RMSE) and spread, which are probabilistic attributes, and the continuous ranked probability score (CRPS) and its decomposition. The advantage of the ensemble transform approach is attributed to the orthogonality among its ensemble perturbations as well as its consistency with the data assimilation system. Therefore, this study may serve as a reference for configuration of the best ensemble prediction system to be used in operation.
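Three of the verification scores named in the abstract above (RMSE of the ensemble mean, ensemble spread, and CRPS) can be written down compactly for a scalar forecast variable. This is a minimal sketch, not the verification code used in the study: the function names are hypothetical, the spread uses the population standard deviation (one of several conventions), and the CRPS uses the standard empirical-ensemble estimator, mean |x_i - y| minus half the mean |x_i - x_j|.

```python
import math


def ensemble_mean(members):
    return sum(members) / len(members)


def ensemble_spread(members):
    """Standard deviation of the members about the ensemble mean."""
    m = ensemble_mean(members)
    return math.sqrt(sum((x - m) ** 2 for x in members) / len(members))


def rmse_of_mean(forecasts, observations):
    """RMSE of the ensemble mean over a set of forecast cases."""
    errors = [(ensemble_mean(f) - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(errors) / len(errors))


def crps(members, obs):
    """Empirical CRPS of one ensemble forecast against one observation."""
    n = len(members)
    term_obs = sum(abs(x - obs) for x in members) / n
    term_pairs = sum(abs(a - b) for a in members for b in members) / (2 * n * n)
    return term_obs - term_pairs
```

For a one-member ensemble the CRPS reduces to the absolute error, which is why it is often described as a probabilistic generalisation of that score.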
Can decadal climate predictions be improved by ocean ensemble dispersion filtering?
NASA Astrophysics Data System (ADS)
Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.
2017-12-01
Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in-between these two time scales. The ocean memory due to its heat capacity holds big potential skill on the decadal scale. In recent years, more precise initialization techniques of coupled Earth system models (incl. atmosphere and ocean) have improved decadal predictions. Ensembles are another important aspect. A set of slightly perturbed predictions forms an ensemble. Using and evaluating the whole ensemble or its ensemble average, instead of a single prediction, improves a prediction system. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, yields more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean.
This study is part of MiKlip (fona-miklip.de), a major project on decadal climate prediction in Germany. We focus on the Max-Planck-Institute Earth System Model using the low-resolution version (MPI-ESM-LR) and MiKlip's basic initialization strategy, as in the decadal climate forecast published in 2017: http://www.fona-miklip.de/decadal-forecast-2017-2026/decadal-forecast-for-2017-2026/ More information about this study is available in JAMES, DOI: 10.1002/2016MS000787
Decadal climate prediction in the large ensemble limit
NASA Astrophysics Data System (ADS)
Yeager, S. G.; Rosenbloom, N. A.; Strand, G.; Lindsay, K. T.; Danabasoglu, G.; Karspeck, A. R.; Bates, S. C.; Meehl, G. A.
2017-12-01
In order to quantify the benefits of initialization for climate prediction on decadal timescales, two parallel sets of historical simulations are required: one "initialized" ensemble that incorporates observations of past climate states and one "uninitialized" ensemble whose internal climate variations evolve freely and without synchronicity. In the large ensemble limit, ensemble averaging isolates potentially predictable forced and internal variance components in the "initialized" set, but only the forced variance remains after averaging the "uninitialized" set. The ensemble size needed to achieve this variance decomposition, and to robustly distinguish initialized from uninitialized decadal predictions, remains poorly constrained. We examine a large ensemble (LE) of initialized decadal prediction (DP) experiments carried out using the Community Earth System Model (CESM). This 40-member CESM-DP-LE set of experiments represents the "initialized" complement to the CESM large ensemble of 20th century runs (CESM-LE) documented in Kay et al. (2015). Both simulation sets share the same model configuration, historical radiative forcings, and large ensemble sizes. The twin experiments afford an unprecedented opportunity to explore the sensitivity of DP skill assessment, and in particular the skill enhancement associated with initialization, to ensemble size. This talk will highlight the benefits of a large ensemble size for initialized predictions of seasonal climate over land in the Atlantic sector as well as predictions of shifts in the likelihood of climate extremes that have large societal impact.
NWP model forecast skill optimization via closure parameter variations
NASA Astrophysics Data System (ADS)
Järvinen, H.; Ollinaho, P.; Laine, M.; Solonen, A.; Haario, H.
2012-04-01
We present results of a novel approach to tune the predictive skill of numerical weather prediction (NWP) models. These models contain tunable parameters which appear in parameterization schemes of sub-grid scale physical processes. The current practice is to specify the numerical parameter values manually, based on expert knowledge. We recently developed a concept and method (QJRMS 2011) for on-line estimation of the NWP model parameters via closure parameter variations. The method, called EPPES ("Ensemble prediction and parameter estimation system"), utilizes ensemble prediction infrastructure for parameter estimation in a very cost-effective way: practically no new computations are introduced. The approach provides an algorithmic decision making tool for model parameter optimization in operational NWP. In EPPES, statistical inference about the NWP model tunable parameters is made by (i) generating an ensemble of predictions so that each member uses different model parameter values, drawn from a proposal distribution, and (ii) feeding back the relative merits of the parameter values to the proposal distribution, based on evaluation of a suitable likelihood function against verifying observations. In this presentation, the method is first illustrated in low-order numerical tests using a stochastic version of the Lorenz-95 model which effectively emulates the principal features of ensemble prediction systems. The EPPES method correctly detects the unknown and wrongly specified parameter values, and leads to an improved forecast skill. Second, results with an ensemble prediction system emulator, based on the ECHAM5 atmospheric GCM, show that the model tuning capability of EPPES scales up to realistic models and ensemble prediction systems. Finally, preliminary results of EPPES in the context of the ECMWF forecasting system are presented.
NASA Astrophysics Data System (ADS)
Abaza, Mabrouk; Anctil, François; Fortin, Vincent; Perreault, Luc
2017-12-01
Meteorological and hydrological ensemble prediction systems are imperfect. Their outputs could often be improved through the use of a statistical processor, opening up the question of the necessity of using both processors (meteorological and hydrological), only one of them, or none. This experiment compares the predictive distributions from four hydrological ensemble prediction systems (H-EPS) utilising the Ensemble Kalman filter (EnKF) probabilistic sequential data assimilation scheme. They differ in the inclusion or not of the Distribution Based Scaling (DBS) method for post-processing meteorological forecasts and the ensemble Bayesian Model Averaging (ensemble BMA) method for hydrological forecast post-processing. The experiment is implemented on three large watersheds and relies on the combination of two meteorological reforecast products: the 4-member Canadian reforecasts from the Canadian Centre for Meteorological and Environmental Prediction (CCMEP) and the 10-member American reforecasts from the National Oceanic and Atmospheric Administration (NOAA), leading to 14 members at each time step. Results show that all four tested H-EPS lead to resolution and sharpness values that are quite similar, with an advantage to DBS + EnKF. The ensemble BMA is unable to compensate for any bias left in the precipitation ensemble forecasts. On the other hand, it succeeds in calibrating ensemble members that are otherwise under-dispersed. If reliability is preferred over resolution and sharpness, DBS + EnKF + ensemble BMA performs best, making use of both processors in the H-EPS system. Conversely, for enhanced resolution and sharpness, DBS is the preferred method.
NASA Astrophysics Data System (ADS)
Lahmiri, Salim; Boukadoum, Mounir
2015-08-01
We present a new ensemble system for stock market returns prediction where the continuous wavelet transform (CWT) is used to analyze return series and backpropagation neural networks (BPNNs) are used for processing CWT-based coefficients, determining the optimal ensemble weights, and providing final forecasts. Particle swarm optimization (PSO) is used for finding optimal weights and biases for each BPNN. To capture symmetry/asymmetry in the underlying data, three wavelet functions with different shapes are adopted. The proposed ensemble system was tested on three Asian stock markets: the Hang Seng, KOSPI, and Taiwan stock market data. Three statistical metrics were used to evaluate the forecasting accuracy: mean of absolute errors (MAE), root mean of squared errors (RMSE), and mean of absolute deviations (MAD). Experimental results showed that our proposed ensemble system outperformed the individual CWT-ANN models, each with a different wavelet function. In addition, the proposed ensemble system outperformed the conventional autoregressive moving average process. As a result, the proposed ensemble system is suitable to capture symmetry/asymmetry in financial data fluctuations for better prediction accuracy.
The Development of Storm Surge Ensemble Prediction System and Case Study of Typhoon Meranti in 2016
NASA Astrophysics Data System (ADS)
Tsai, Y. L.; Wu, T. R.; Terng, C. T.; Chu, C. H.
2017-12-01
Taiwan, located in a zone of potentially severe storm generation, is under the threat of storm surge and associated inundation. The use of ensemble prediction can help forecasters understand the characteristics of storm surge under uncertainty in track and intensity. In addition, it can support deterministic forecasting. In this study, the kernel of the ensemble prediction system is based on COMCOT-SURGE (COrnell Multi-grid COupled Tsunami Model - Storm Surge). COMCOT-SURGE solves the nonlinear shallow water equations in open-ocean and coastal regions with a nested-grid scheme and adopts a wet-dry-cell treatment to calculate the potential inundation area. In order to consider tide-surge interaction, the global TPXO 7.1 tide model provides the tidal boundary conditions. After a series of validations and case studies, COMCOT-SURGE has become an official operational system of the Central Weather Bureau (CWB) in Taiwan. In this study, the strongest typhoon in 2016, Typhoon Meranti, is chosen as a case study. We adopt twenty ensemble members from the CWB WRF Ensemble Prediction System (CWB WEPS), which differ in their parameterizations of microphysics, boundary layer, cumulus, and surface processes. In the box-and-whisker results, the maximum observed storm surges fell between the first and third quartiles at more than 70% of gauge locations, e.g. Toucheng, Chengkung, and Jiangjyun. In conclusion, ensemble prediction can effectively help forecasters predict storm surge, especially under uncertainty in storm track and intensity.
Reliability of windstorm predictions in the ECMWF ensemble prediction system
NASA Astrophysics Data System (ADS)
Becker, Nico; Ulbrich, Uwe
2016-04-01
Windstorms caused by extratropical cyclones are one of the most dangerous natural hazards in the European region. Therefore, reliable predictions of such storm events are needed. Case studies have shown that ensemble prediction systems (EPS) are able to provide useful information about windstorms between two and five days prior to the event. In this work, ensemble predictions with the European Centre for Medium-Range Weather Forecasts (ECMWF) EPS are evaluated in a four year period. Within the 50 ensemble members, which are initialized every 12 hours and are run for 10 days, windstorms are identified and tracked in time and space. By using a clustering approach, different predictions of the same storm are identified in the different ensemble members and compared to reanalysis data. The occurrence probability of the predicted storms is estimated by fitting a bivariate normal distribution to the storm track positions. Our results show, for example, that predicted storm clusters with occurrence probabilities of more than 50% have a matching observed storm in 80% of all cases at a lead time of two days. The predicted occurrence probabilities are reliable up to 3 days lead time. At longer lead times the occurrence probabilities are overestimated by the EPS.
Tatinati, Sivanagaraja; Nazarpour, Kianoush; Tech Ang, Wei; Veluvolu, Kalyana C
2016-08-01
Successful treatment of tumors with motion-adaptive radiotherapy requires accurate prediction of respiratory motion, ideally with a prediction horizon larger than the latency in radiotherapy system. Accurate prediction of respiratory motion is however a non-trivial task due to the presence of irregularities and intra-trace variabilities, such as baseline drift and temporal changes in fundamental frequency pattern. In this paper, to enhance the accuracy of the respiratory motion prediction, we propose a stacked regression ensemble framework that integrates heterogeneous respiratory motion prediction algorithms. We further address two crucial issues for developing a successful ensemble framework: (1) selection of appropriate prediction methods to ensemble (level-0 methods) among the best existing prediction methods; and (2) finding a suitable generalization approach that can successfully exploit the relative advantages of the chosen level-0 methods. The efficacy of the developed ensemble framework is assessed with real respiratory motion traces acquired from 31 patients undergoing treatment. Results show that the developed ensemble framework improves the prediction performance significantly compared to the best existing methods. Copyright © 2016 IPEM. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Lee, H. S.; Liu, Y.; Ward, J.; Brown, J.; Maestre, A.; Herr, H.; Fresch, M. A.; Wells, E.; Reed, S. M.; Jones, E.
2017-12-01
The National Weather Service's (NWS) Office of Water Prediction (OWP) recently launched a nationwide effort to verify streamflow forecasts from the Hydrologic Ensemble Forecast Service (HEFS) for a majority of forecast locations across the 13 River Forecast Centers (RFCs). Known as the HEFS Baseline Validation (BV), the project involves a joint effort between the OWP and the RFCs. It aims to provide a geographically consistent, statistically robust validation, and a benchmark to guide the operational implementation of the HEFS, inform practical applications, such as impact-based decision support services, and to provide an objective framework for evaluating strategic investments in the HEFS. For the BV, HEFS hindcasts are issued once per day on a 12Z cycle for the period of 1985-2015 with a forecast horizon of 30 days. For the first two weeks, the hindcasts are forced with precipitation and temperature ensemble forecasts from the Global Ensemble Forecast System of the National Centers for Environmental Prediction, and by resampled climatology for the remaining period. The HEFS-generated ensemble streamflow hindcasts are verified using the Ensemble Verification System. Skill is assessed relative to streamflow hindcasts generated from NWS' current operational system, namely climatology-based Ensemble Streamflow Prediction. In this presentation, we summarize the results and findings to date.
NASA Astrophysics Data System (ADS)
Pegion, K.; DelSole, T. M.; Becker, E.; Cicerone, T.
2016-12-01
Predictability represents the upper limit of prediction skill if we had an infinite member ensemble and a perfect model. It is an intrinsic limit of the climate system associated with the chaotic nature of the atmosphere. Producing a forecast system that can make predictions very near to this limit is the ultimate goal of forecast system development. Estimates of predictability together with calculations of current prediction skill are often used to define the gaps in our prediction capabilities on subseasonal to seasonal timescales and to inform the scientific issues that must be addressed to build the next forecast system. Quantification of the predictability is also important for providing a scientific basis for relaying to stakeholders what kind of climate information can be provided to inform decision-making and what kind of information is not possible given the intrinsic predictability of the climate system. One challenge with predictability estimates is that different prediction systems can give different estimates of the upper limit of skill. How do we know which estimate of predictability is most representative of the true predictability of the climate system? Previous studies have used the spread-error relationship and the autocorrelation to evaluate the fidelity of the signal and noise estimates. Using a multi-model ensemble prediction system, we can quantify whether these metrics accurately indicate an individual model's ability to properly estimate the signal, noise, and predictability. We use this information to identify the best estimates of predictability for 2-meter temperature, precipitation, and sea surface temperature from the North American Multi-model Ensemble and compare with current skill to indicate the regions with potential for improving skill.
Examination of multi-model ensemble seasonal prediction methods using a simple climate system
NASA Astrophysics Data System (ADS)
Kang, In-Sik; Yoo, Jin Ho
2006-02-01
A simple climate model was designed as a proxy for the real climate system, and a number of prediction models were generated by slightly perturbing the physical parameters of the simple model. A set of long (240 years) historical hindcast predictions were performed with various prediction models, which are used to examine various issues of multi-model ensemble seasonal prediction, such as the best ways of blending multi-models and the selection of models. Based on these results, we suggest a feasible way of maximizing the benefit of using multi models in seasonal prediction. In particular, three types of multi-model ensemble prediction systems, i.e., the simple composite, superensemble, and the composite after statistically correcting individual predictions (corrected composite), are examined and compared to each other. The superensemble has more of an overfitting problem than the others, especially for the case of small training samples and/or weak external forcing, and the corrected composite produces the best prediction skill among the multi-model systems.
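Two of the blending strategies compared in the abstract above, the simple composite and the corrected composite, can be contrasted in a short sketch. The abstract does not say how individual predictions are "statistically corrected", so the per-model linear regression of observations on hindcasts used here is an assumption, as are the function names and the toy numbers. The superensemble (a multiple regression on all models at once) is not shown.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simple_composite(predictions):
    """Unweighted average of the raw model predictions."""
    return sum(predictions) / len(predictions)


def corrected_composite(train_forecasts, train_obs, new_predictions):
    """Correct each model with its own regression, then average."""
    corrected = []
    for hindcasts, prediction in zip(train_forecasts, new_predictions):
        intercept, slope = fit_line(hindcasts, train_obs)
        corrected.append(intercept + slope * prediction)
    return sum(corrected) / len(corrected)


# Toy hindcast period: two models with opposite systematic errors.
obs = [0.0, 1.0, 2.0, 3.0]
model_a = [2.0 * o + 1.0 for o in obs]   # too strong, biased high
model_b = [0.5 * o - 1.0 for o in obs]   # too weak, biased low
# New case whose true value would be 4.0:
raw = [2.0 * 4.0 + 1.0, 0.5 * 4.0 - 1.0]
plain = simple_composite(raw)
fixed = corrected_composite([model_a, model_b], obs, raw)
```

In this noise-free toy case the corrected composite recovers the true value exactly, whereas the simple composite inherits the models' systematic errors. With short or noisy training samples the fitted coefficients become uncertain, which is the overfitting concern the abstract raises most strongly for the superensemble.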
NASA Astrophysics Data System (ADS)
Edouard, Simon; Vincendon, Béatrice; Ducrocq, Véronique
2018-05-01
Intense precipitation events in the Mediterranean often lead to devastating flash floods (FF). FF modelling is affected by several kinds of uncertainties, and Hydrological Ensemble Prediction Systems (HEPS) are designed to take those uncertainties into account. The major source of uncertainty comes from rainfall forcing, and convective-scale meteorological ensemble prediction systems can manage it for forecasting purposes. But other sources are related to the hydrological modelling part of the HEPS. This study focuses on the uncertainties arising from the hydrological model parameters and initial soil moisture, with the aim of designing an ensemble-based version of a hydrological model dedicated to simulating fast-responding Mediterranean rivers, the ISBA-TOP coupled system. The first step consists in identifying the parameters that have the strongest influence on FF simulations by assuming perfect precipitation. A sensitivity study is carried out first using a synthetic framework and then for several real events and several catchments. Perturbation methods varying the most sensitive parameters as well as initial soil moisture make it possible to design an ensemble-based version of ISBA-TOP. The first results of this system on some real events are presented. The immediate next step of this work will be to drive this ensemble-based version with the members of a convective-scale meteorological ensemble prediction system to design a complete HEPS for FF forecasting.
Three-model ensemble wind prediction in southern Italy
NASA Astrophysics Data System (ADS)
Torcasio, Rosa Claudia; Federico, Stefano; Calidonna, Claudia Roberta; Avolio, Elenio; Drofa, Oxana; Landi, Tony Christian; Malguzzi, Piero; Buzzi, Andrea; Bonasoni, Paolo
2016-03-01
Quality of wind prediction is of great importance since a good wind forecast allows the prediction of available wind power, improving the penetration of renewable energies into the energy market. Here, a 1-year (1 December 2012 to 30 November 2013) three-model ensemble (TME) experiment for wind prediction is considered. The models employed, run operationally at National Research Council - Institute of Atmospheric Sciences and Climate (CNR-ISAC), are RAMS (Regional Atmospheric Modelling System), BOLAM (BOlogna Limited Area Model), and MOLOCH (MOdello LOCale in H coordinates). The area considered for the study is southern Italy and the measurements used for the forecast verification are those of the GTS (Global Telecommunication System). Comparison with observations is made every 3 h up to 48 h of forecast lead time. Results show that the three-model ensemble outperforms the forecast of each individual model. The RMSE improvement compared to the best model is between 22 and 30 %, depending on the season. It is also shown that the three-model ensemble outperforms the IFS (Integrated Forecasting System) of the ECMWF (European Centre for Medium-Range Weather Forecasts) for the surface wind forecasts. Notably, the three-model ensemble forecast performs better than each unbiased model, showing the added value of the ensemble technique. Finally, the sensitivity of the three-model ensemble RMSE to the length of the training period is analysed.
Viney, N.R.; Bormann, H.; Breuer, L.; Bronstert, A.; Croke, B.F.W.; Frede, H.; Graff, T.; Hubrechts, L.; Huisman, J.A.; Jakeman, A.J.; Kite, G.W.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Willems, P.
2009-01-01
This paper reports on a project to compare predictions from a range of catchment models applied to a mesoscale river basin in central Germany and to assess various ensemble predictions of catchment streamflow. The models encompass a large range in inherent complexity and input requirements. In approximate order of decreasing complexity, they are DHSVM, MIKE-SHE, TOPLATS, WASIM-ETH, SWAT, PRMS, SLURP, HBV, LASCAM and IHACRES. The models are calibrated twice using different sets of input data. The two predictions from each model are then combined by simple averaging to produce a single-model ensemble. The 10 resulting single-model ensembles are combined in various ways to produce multi-model ensemble predictions. Both the single-model ensembles and the multi-model ensembles are shown to give predictions that are generally superior to those of their respective constituent models, both during a 7-year calibration period and a 9-year validation period. This occurs despite a considerable disparity in performance of the individual models. Even the weakest of models is shown to contribute useful information to the ensembles they are part of. The best model combination methods are a trimmed mean (constructed using the central four or six predictions each day) and a weighted mean ensemble (with weights calculated from calibration performance) that places relatively large weights on the better performing models. Conditional ensembles, in which separate model weights are used in different system states (e.g. summer and winter, high and low flows) generally yield little improvement over the weighted mean ensemble. However a conditional ensemble that discriminates between rising and receding flows shows moderate improvement. An analysis of ensemble predictions shows that the best ensembles are not necessarily those containing the best individual models. 
Conversely, it appears that some models that predict well individually do not necessarily combine well with other models in multi-model ensembles. The reasons behind these observations may relate to the effects of the weighting schemes, non-stationarity of the climate series and possible cross-correlations between models. Crown Copyright © 2008.
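The two best-performing combination methods named in the abstract above, a trimmed mean over the central predictions and a performance-weighted mean, are simple to state in code. The sketch below is illustrative only: the function names and the toy flow values are hypothetical, and the weights are taken as given, since the abstract does not state how they were derived from calibration performance.

```python
def trimmed_mean(predictions, keep):
    """Average the central `keep` values of one day's model predictions."""
    ordered = sorted(predictions)
    drop = (len(ordered) - keep) // 2  # number discarded at each end
    central = ordered[drop:drop + keep]
    return sum(central) / len(central)


def weighted_mean(predictions, weights):
    """Weighted average; weights need not be normalised beforehand."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Ten hypothetical single-model streamflow predictions for one day,
# one of which is a gross outlier.
daily_flows = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
central_four = trimmed_mean(daily_flows, keep=4)
central_six = trimmed_mean(daily_flows, keep=6)
```

Because the extreme values are discarded before averaging, a single badly behaved model on a given day cannot dominate the trimmed mean, which may be one reason such a combination is robust to the disparity in individual model performance noted in the abstract.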
Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall over Northern Tropical Africa
NASA Astrophysics Data System (ADS)
Vogel, Peter; Knippertz, Peter; Fink, Andreas H.; Schlueter, Andreas; Gneiting, Tilmann
2018-04-01
Accumulated precipitation forecasts are of high socioeconomic importance for agriculturally dominated societies in northern tropical Africa. In this study, we analyze the performance of nine operational global ensemble prediction systems (EPSs) relative to climatology-based forecasts for 1 to 5-day accumulated precipitation based on the monsoon seasons 2007-2014 for three regions within northern tropical Africa. To assess the full potential of raw ensemble forecasts across spatial scales, we apply state-of-the-art statistical postprocessing methods in form of Bayesian Model Averaging (BMA) and Ensemble Model Output Statistics (EMOS), and verify against station and spatially aggregated, satellite-based gridded observations. Raw ensemble forecasts are uncalibrated, unreliable, and underperform relative to climatology, independently of region, accumulation time, monsoon season, and ensemble. Differences between raw ensemble and climatological forecasts are large, and partly stem from poor prediction for low precipitation amounts. BMA and EMOS postprocessed forecasts are calibrated, reliable, and strongly improve on the raw ensembles, but - somewhat disappointingly - typically do not outperform climatology. Most EPSs exhibit slight improvements over the period 2007-2014, but overall have little added value compared to climatology. We suspect that the parametrization of convection is a potential cause for the sobering lack of ensemble forecast skill in a region dominated by mesoscale convective systems.
HEPEX - achievements and challenges!
NASA Astrophysics Data System (ADS)
Pappenberger, Florian; Ramos, Maria-Helena; Thielen, Jutta; Wood, Andy; Wang, Qj; Duan, Qingyun; Collischonn, Walter; Verkade, Jan; Voisin, Nathalie; Wetterhall, Fredrik; Vuillaume, Jean-Francois Emmanuel; Lucatero Villasenor, Diana; Cloke, Hannah L.; Schaake, John; van Andel, Schalk-Jan
2014-05-01
HEPEX is an international initiative bringing together hydrologists, meteorologists, researchers and end-users to develop advanced probabilistic hydrological forecast techniques for improved flood, drought and water management. HEPEX was launched in 2004 as an independent, cooperative international scientific activity. During the first meeting, the overarching goal was defined as: "to develop and test procedures to produce reliable hydrological ensemble forecasts, and to demonstrate their utility in decision making related to the water, environmental and emergency management sectors." The applications of hydrological ensemble predictions span large spatio-temporal scales, ranging from short-term and localized predictions to global climate change and regional modeling. Within the HEPEX community, information is shared through its blog (www.hepex.org), meetings, testbeds and intercomparison experiments, as well as project reports. Key questions of HEPEX are: * What adaptations are required for meteorological ensemble systems to be coupled with hydrological ensemble systems? * How should the existing hydrological ensemble prediction systems be modified to account for all sources of uncertainty within a forecast? * What is the best way for the user community to take advantage of ensemble forecasts and to make better decisions based on them? This year HEPEX celebrates its 10th anniversary and this poster will present a review of the main operational and research achievements and challenges prepared by HEPEX contributors on data assimilation, post-processing of hydrologic predictions, forecast verification, communication and use of probabilistic forecasts in decision-making. Additionally, we will present the most recent activities implemented by HEPEX and illustrate how everyone can join the community and participate in the development of new approaches in hydrologic ensemble prediction.
NASA Astrophysics Data System (ADS)
Suzuki, Kazuyoshi; Zupanski, Milija
2018-01-01
In this study, we investigate the uncertainties associated with land surface processes in an ensemble prediction context. Specifically, we compare the uncertainties produced by a coupled atmosphere-land modeling system with two different land surface models, the Noah-MP land surface model (LSM) and the Noah LSM, by using the Maximum Likelihood Ensemble Filter (MLEF) data assimilation system as a platform for ensemble prediction. We carried out 24-hour prediction simulations in Siberia with 32 ensemble members beginning at 00:00 UTC on 5 March 2013. We then compared the model prediction uncertainty of snow depth and solid precipitation with observation-based research products and evaluated the standard deviation of the ensemble spread. The prediction skill and ensemble spread exhibited high positive correlation for both LSMs, indicating a realistic uncertainty estimation. The inclusion of a multiple snow-layer model in the Noah-MP LSM was beneficial for reducing the uncertainties of snow depth and snow depth change compared to the Noah LSM, but the uncertainty in daily solid precipitation showed minimal difference between the two LSMs. The impact of LSM choice in reducing temperature uncertainty was limited to surface layers of the atmosphere. In summary, we found that the more sophisticated Noah-MP LSM reduces uncertainties associated with land surface processes compared to the Noah LSM. Thus, using prediction models with improved skill implies improved predictability and greater certainty of prediction.
NASA Astrophysics Data System (ADS)
Saleh, F.; Ramaswamy, V.; Georgas, N.; Blumberg, A. F.; Wang, Y.
2016-12-01
Advances in computational resources and modeling techniques are opening the path to effectively integrate existing complex models. In the context of flood prediction, recent extreme events have demonstrated the importance of integrating components of the hydrosystem to better represent the interactions amongst different physical processes and phenomena. As such, there is a pressing need to develop holistic and cross-disciplinary modeling frameworks that effectively integrate existing models and better represent the operative dynamics. This work presents a novel Hydrologic-Hydraulic-Hydrodynamic Ensemble (H3E) flood prediction framework that operationally integrates existing predictive models representing coastal (New York Harbor Observing and Prediction System, NYHOPS), hydrologic (US Army Corps of Engineers Hydrologic Modeling System, HEC-HMS) and hydraulic (2-dimensional River Analysis System, HEC-RAS) components. The state-of-the-art framework is forced with 125 ensemble meteorological inputs from numerical weather prediction models including the Global Ensemble Forecast System, the European Centre for Medium-Range Weather Forecasts (ECMWF), the Canadian Meteorological Centre (CMC), the Short Range Ensemble Forecast (SREF) and the North American Mesoscale Forecast System (NAM). The framework produces, within a 96-hour forecast horizon, on-the-fly Google Earth flood maps that provide critical information for decision makers and emergency preparedness managers. The utility of the framework was demonstrated by retrospectively forecasting an extreme flood event, hurricane Sandy in the Passaic and Hackensack watersheds (New Jersey, USA). Hurricane Sandy caused significant damage to a number of critical facilities in this area including the New Jersey Transit's main storage and maintenance facility. 
The results of this work demonstrate that ensemble-based frameworks provide improved flood predictions and useful information about associated uncertainties, thus improving the assessment of risks when compared to a deterministic forecast. The work offers perspectives for short-term flood forecasts, flood mitigation strategies and best management practices for climate change scenarios.
NASA Astrophysics Data System (ADS)
Reyers, Mark; Moemken, Julia; Pinto, Joaquim; Feldmann, Hendrik; Kottmeier, Christoph; MiKlip Module-C Team
2017-04-01
Decadal climate predictions can provide a useful basis for decision making support systems for the public and private sectors. Several generations of decadal hindcasts and predictions have been generated throughout the German research program MiKlip. Together with the global climate predictions computed with MPI-ESM, the regional climate model (RCM) COSMO-CLM is used for regional downscaling by MiKlip Module-C. The RCMs provide climate information on spatial and temporal scales closer to the needs of potential users. In this study, two downscaled hindcast generations are analysed (named b0 and b1). The respective global generations are both initialized by nudging them towards different reanalysis anomaly fields. An ensemble of five starting years (1961, 1971, 1981, 1991, and 2001), each comprising ten ensemble members, is used for both generations in order to quantify the regional decadal prediction skill for precipitation and near-surface temperature and wind speed over Europe. All datasets (including hindcasts, observations, reanalysis, and historical MPI-ESM runs) are pre-processed in an analogous manner by (i) removing the long-term trend and (ii) re-gridding to a common grid. Our analysis shows that there is potential for skillful decadal predictions over Europe in the regional MiKlip ensemble, but the skill is not systematic and depends on the PRUDENCE region and the variable. Further, the differences between the two hindcast generations are mostly small. As we used detrended time series, the predictive skill found in our study can probably be attributed to reasonable predictions of anomalies which are associated with the natural climate variability. In a sensitivity study, it is shown that the results may strongly change when the long-term trend is kept in the datasets, as here the skill of predicting the long-term trend (e.g. for temperature) also plays a major role.
The regionalization of the global ensemble provides an added value for decadal predictions for some complex regions like the Mediterranean and Iberian Peninsula, while for other regions no systematic improvement is found. A clear dependence of the performance of the regional MiKlip system on the ensemble size is detected. For all variables in both hindcast generations, the skill increases when the ensemble is enlarged. The results indicate that a number of ten members is an appropriate ensemble size for decadal predictions over Europe.
NASA Astrophysics Data System (ADS)
Lee, Sang-Min; Nam, Ji-Eun; Choi, Hee-Wook; Ha, Jong-Chul; Lee, Yong Hee; Kim, Yeon-Hee; Kang, Hyun-Suk; Cho, ChunHo
2016-08-01
This study was conducted to evaluate the prediction accuracies of THe Observing system Research and Predictability EXperiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) data at six operational forecast centers using the root-mean square difference (RMSD) and Brier score (BS) from April to July 2012. It also tested the precipitation predictability of ensemble prediction systems (EPS) for the onset of the summer rainy season, the day on which the spring drought over South Korea ended (29 June 2012), using the ensemble mean precipitation, ensemble probability precipitation, 10-day lag ensemble forecasts (ensemble mean and probability precipitation), and effective drought index (EDI). The RMSD analysis of atmospheric variables (geopotential height at 500 hPa, temperature at 850 hPa, sea-level pressure and specific humidity at 850 hPa) showed that the prediction accuracies of the EPS at the Meteorological Service of Canada (CMC) and China Meteorological Administration (CMA) were poor and those at the European Center for Medium-Range Weather Forecasts (ECMWF) and Korea Meteorological Administration (KMA) were good. Also, ECMWF and KMA showed better results than other EPSs for predicting precipitation in the BS distributions. The evaluation also showed that the onset of the summer rainy season could be predicted using ensemble-mean precipitation from a 4-day lead time at all forecast centers. In addition, the spatial distributions of predicted precipitation of the EPS at KMA and the Met Office of the United Kingdom (UKMO) were similar to those of observed precipitation, indicating good predictability. The precipitation probability forecasts of EPS at CMA, the National Centers for Environmental Prediction (NCEP), and UKMO (ECMWF and KMA) at 1-day lead time produced over-forecasting (under-forecasting) in the reliability diagram. All of them showed under-forecasting at 2- to 4-day lead times.
Also, the precipitation on the onset day of the summer rainy season could be predicted from a 4-day lead time to the initial time by using the 10-day lag ensemble mean and probability forecasts. Additionally, the predictability of the day on which the spring drought ended, owing to precipitation on the onset day of the summer rainy season, was evaluated using the Effective Drought Index (EDI) calculated from ensemble mean precipitation forecasts and spreads at five EPSs.
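The Brier score named in this abstract can be illustrated with a minimal sketch. It assumes that the forecast probability of an event is the fraction of ensemble members at or above a threshold. The member values, observations and the 10 mm threshold are hypothetical and are not taken from the TIGGE data.

```python
def ensemble_probability(members, threshold):
    """Fraction of ensemble members at or above the event threshold."""
    return sum(m >= threshold for m in members) / len(members)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


# Hypothetical 5-member daily precipitation forecasts (mm) for three days.
forecasts = [
    [0.0, 2.0, 5.0, 12.0, 1.0],
    [15.0, 20.0, 11.0, 9.0, 30.0],
    [0.0, 0.0, 0.5, 0.0, 0.0],
]
observed_mm = [0.0, 18.0, 0.0]  # hypothetical observations
threshold_mm = 10.0             # hypothetical event threshold

probs = [ensemble_probability(f, threshold_mm) for f in forecasts]
events = [1 if o >= threshold_mm else 0 for o in observed_mm]
bs = brier_score(probs, events)
print(f"Brier score: {bs:.4f}")
```

A score of 0 is a perfect probabilistic forecast; larger values are worse.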
A seasonal hydrologic ensemble prediction system for water resource management
NASA Astrophysics Data System (ADS)
Luo, L.; Wood, E. F.
2006-12-01
A seasonal hydrologic ensemble prediction system, developed for the Ohio River basin, has been improved and expanded to several other regions including the Eastern U.S., Africa and East Asia. The prediction system adopts the traditional Extended Streamflow Prediction (ESP) approach, utilizing the VIC (Variable Infiltration Capacity) hydrological model as the central tool for producing ensemble predictions of soil moisture, snow and streamflow with lead times of up to 6 months. VIC is forced by observed meteorology to estimate the hydrological initial condition prior to the forecast, but during the forecast period the atmospheric forcing comes from statistically downscaled seasonal forecasts from dynamic climate models. The seasonal hydrologic ensemble prediction system is currently producing real-time seasonal hydrologic forecasts for these regions on a monthly basis. Using hindcasts from a 19-year period (1981-1999), during which seasonal hindcasts from the NCEP Climate Forecast System (CFS) and the European Union DEMETER project are available, we evaluate the performance of the forecast system over our forecast regions. The evaluation shows that the prediction system using the current forecast approach is able to produce reliable and accurate precipitation, soil moisture and streamflow predictions. The overall skill is much higher than that of the traditional ESP. In particular, forecasts based on multiple climate model forecasts are more skillful than single model-based forecasts. This emphasizes the significant need for producing seasonal climate forecasts with multiple climate models for hydrologic applications. Forecasts from this system are expected to provide very valuable information about future hydrologic states and associated risks for end users, including water resource management and financial sectors.
Multi-model analysis in hydrological prediction
NASA Astrophysics Data System (ADS)
Lanthier, M.; Arsenault, R.; Brissette, F.
2017-12-01
Hydrologic modelling, by nature, is a simplification of the real-world hydrologic system. Ensemble hydrological predictions thus obtained therefore do not present the full range of possible streamflow outcomes, producing ensembles which demonstrate errors in variance such as under-dispersion. Past studies show that lumped models used in prediction mode can return satisfactory results, especially when there is not enough information available on the watershed to run a distributed model. But all lumped models greatly simplify the complex processes of the hydrologic cycle. To generate more spread in the hydrologic ensemble predictions, multi-model ensembles have been considered. In this study, the aim is to propose and analyse a method that gives an ensemble streamflow prediction that properly represents the forecast probabilities and reduces ensemble bias. To achieve this, three simple lumped models are used to generate an ensemble. These will also be combined using multi-model averaging techniques, which generally generate a more accurate hydrograph than the best of the individual models in simulation mode. This new predictive combined hydrograph is added to the ensemble, thus creating a large ensemble which may improve the variability while also improving the ensemble mean bias. The quality of the predictions is then assessed on different periods: 2 weeks, 1 month, 3 months and 6 months using a PIT Histogram of the percentiles of the real observation volumes with respect to the volumes of the ensemble members. Initially, the models were run using historical weather data to generate synthetic flows. This worked for individual models, but not for the multi-model and for the large ensemble. Consequently, by performing data assimilation at each prediction period and thus adjusting the initial states of the models, the PIT Histogram could be constructed using the observed flows while allowing the use of the multi-model predictions.
The under-dispersion has been largely corrected on short-term predictions. For the longer term, the addition of the multi-model member has been beneficial to the quality of the predictions, although it is too early to determine whether the gain is related to the addition of a member or whether the multi-model member has added value itself.
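The PIT histogram used here to diagnose under-dispersion can be sketched as follows. This is only an illustration under stated assumptions: the PIT value is approximated by the fraction of ensemble members at or below the observation, and the volumes and the choice of five bins are hypothetical. An under-dispersed ensemble typically piles observations into the outermost bins, giving a U-shaped histogram.

```python
def pit_value(members, observation):
    """Empirical PIT: fraction of ensemble members at or below the observation."""
    return sum(m <= observation for m in members) / len(members)


def pit_histogram(pit_values, n_bins=5):
    """Count PIT values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for v in pit_values:
        # A PIT value of exactly 1.0 belongs to the last bin.
        index = min(int(v * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical narrow (under-dispersed) ensemble of flow volumes,
# reused for six forecast periods, with hypothetical observed volumes.
ensembles = [[9.0, 10.0, 11.0, 10.5] for _ in range(6)]
observations = [5.0, 8.0, 10.2, 12.0, 15.0, 20.0]

pits = [pit_value(e, o) for e, o in zip(ensembles, observations)]
counts = pit_histogram(pits)
print(counts)  # most observations fall in the first or last bin
```

A well-calibrated ensemble would instead give roughly equal counts in every bin.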
NASA Astrophysics Data System (ADS)
Wang, S.; Huang, G. H.; Baetz, B. W.; Huang, W.
2015-11-01
This paper presents a polynomial chaos ensemble hydrologic prediction system (PCEHPS) for an efficient and robust uncertainty assessment of model parameters and predictions, in which possibilistic reasoning is infused into probabilistic parameter inference with simultaneous consideration of randomness and fuzziness. The PCEHPS is developed through a two-stage factorial polynomial chaos expansion (PCE) framework, which consists of an ensemble of PCEs to approximate the behavior of the hydrologic model, significantly speeding up the exhaustive sampling of the parameter space. Multiple hypothesis testing is then conducted to construct an ensemble of reduced-dimensionality PCEs with only the most influential terms, which is meaningful for achieving uncertainty reduction and further acceleration of parameter inference. The PCEHPS is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability. A detailed comparison between the HYMOD hydrologic model, the ensemble of PCEs, and the ensemble of reduced PCEs is performed in terms of accuracy and efficiency. Results reveal temporal and spatial variations in parameter sensitivities due to the dynamic behavior of hydrologic systems, and the effects (magnitude and direction) of parametric interactions depending on different hydrological metrics. The case study demonstrates that the PCEHPS is capable not only of capturing both expert knowledge and probabilistic information in the calibration process, but also of implementing an acceleration of more than 10 times faster than the hydrologic model without compromising the predictive accuracy.
Nanni, Loris; Lumini, Alessandra
2009-01-01
This work has two aims: to propose a novel method for building an ensemble of classifiers for peptide classification based on substitution matrices, and to show the importance of selecting a proper set of parameters for the classifiers that make up the ensemble. The HIV-1 protease cleavage site prediction problem is studied here. Results obtained with a blind testing protocol are reported; comparison with other state-of-the-art approaches based on ensembles of classifiers quantifies the performance improvement obtained by the systems proposed in this paper. The simulation based on experimentally determined protease cleavage data has demonstrated the success of these new ensemble algorithms. It is particularly interesting to note that even if the HIV-1 protease cleavage site prediction problem is considered linearly separable, the best performance is obtained using an ensemble of non-linear classifiers.
Predictability of short-range forecasting: a multimodel approach
NASA Astrophysics Data System (ADS)
García-Moya, Jose-Antonio; Callado, Alfons; Escribà, Pau; Santos, Carlos; Santos-Muñoz, Daniel; Simarro, Juan
2011-05-01
Numerical weather prediction (NWP) models (including mesoscale) have limitations when it comes to dealing with severe weather events because extreme weather is highly unpredictable, even in the short range. A probabilistic forecast based on an ensemble of slightly different model runs may help to address this issue. Among other ensemble techniques, Multimodel ensemble prediction systems (EPSs) are proving to be useful for adding probabilistic value to mesoscale deterministic models. A Multimodel Short Range Ensemble Prediction System (SREPS) focused on forecasting the weather up to 72 h has been developed at the Spanish Meteorological Service (AEMET). The system uses five different limited area models (LAMs), namely HIRLAM (HIRLAM Consortium), HRM (DWD), the UM (UKMO), MM5 (PSU/NCAR) and COSMO (COSMO Consortium). These models run with initial and boundary conditions provided by five different global deterministic models, namely IFS (ECMWF), UM (UKMO), GME (DWD), GFS (NCEP) and CMC (MSC). AEMET-SREPS (AE) validation on the large-scale flow, using ECMWF analysis, shows a consistent and slightly underdispersive system. For surface parameters, the system shows high skill forecasting binary events. 24-h precipitation probabilistic forecasts are verified using an up-scaling grid of observations from European high-resolution precipitation networks, and compared with ECMWF-EPS (EC).
Real-Time Ensemble Forecasting of Coronal Mass Ejections Using the WSA-ENLIL+Cone Model
NASA Astrophysics Data System (ADS)
Mays, M. L.; Taktakishvili, A.; Pulkkinen, A. A.; Odstrcil, D.; MacNeice, P. J.; Rastaetter, L.; LaSota, J. A.
2014-12-01
Ensemble forecasting of coronal mass ejections (CMEs) provides significant information in that it provides an estimation of the spread or uncertainty in CME arrival time predictions. Real-time ensemble modeling of CME propagation is performed by forecasters at the Space Weather Research Center (SWRC) using the WSA-ENLIL+cone model available at the Community Coordinated Modeling Center (CCMC). To estimate the effect of uncertainties in determining CME input parameters on arrival time predictions, a distribution of n (routinely n=48) CME input parameter sets are generated using the CCMC Stereo CME Analysis Tool (StereoCAT) which employs geometrical triangulation techniques. These input parameters are used to perform n different simulations yielding an ensemble of solar wind parameters at various locations of interest, including a probability distribution of CME arrival times (for hits), and geomagnetic storm strength (for Earth-directed hits). We present the results of ensemble simulations for a total of 38 CME events in 2013-2014. For 28 of the ensemble runs containing hits, the observed CME arrival was within the range of ensemble arrival time predictions for 14 runs (half). The average arrival time prediction was computed for each of the 28 ensembles predicting hits and using the actual arrival time, an average absolute error of 10.0 hours (RMSE=11.4 hours) was found for all 28 ensembles, which is comparable to current forecasting errors. Some considerations for the accuracy of ensemble CME arrival time predictions include the importance of the initial distribution of CME input parameters, particularly the mean and spread. When the observed arrivals are not within the predicted range, this still allows the ruling out of prediction errors caused by tested CME input parameters. Prediction errors can also arise from ambient model parameters such as the accuracy of the solar wind background, and other limitations. 
Additionally, the ensemble modeling system was used to complete a parametric event case study of the sensitivity of the CME arrival time prediction to free parameters for the ambient solar wind model and CME. The parameter sensitivity study suggests future directions for the system, such as running ensembles using various magnetogram inputs to the WSA model.
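The verification statistics quoted in this abstract (observed arrival inside the ensemble range, mean absolute error and RMSE of the ensemble-mean arrival time) can be computed as in the sketch below. The arrival times are hypothetical hours after an arbitrary reference time, not values from the 2013-2014 events, and the function names are illustrative only.

```python
import math


def ensemble_mean(values):
    """Arithmetic mean of the ensemble members."""
    return sum(values) / len(values)


def within_range(members, observed):
    """True when the observation lies between the ensemble minimum and maximum."""
    return min(members) <= observed <= max(members)


def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)


def root_mean_square_error(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical events: (predicted arrival times per member, observed arrival),
# all in hours after an arbitrary reference time.
events = [
    ([40.0, 44.0, 48.0], 45.0),
    ([30.0, 32.0, 34.0], 40.0),
    ([60.0, 66.0, 72.0], 62.0),
]

errors = [ensemble_mean(members) - observed for members, observed in events]
n_in_range = sum(within_range(members, observed) for members, observed in events)
mae = mean_absolute_error(errors)
rmse = root_mean_square_error(errors)
print(f"in range: {n_in_range}/{len(events)}, MAE={mae:.2f} h, RMSE={rmse:.2f} h")
```

RMSE is never smaller than MAE, which is consistent with the 10.0 hour and 11.4 hour figures reported above.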
NASA Astrophysics Data System (ADS)
Ishitsuka, Y.; Yoshimura, K.
2016-12-01
Floods have the potential to be a major source of economic and human damage caused by natural disasters. Flood prediction systems have been developed all over the world, and ensemble simulation is commonly adopted to treat the uncertainty of the prediction. In this study, an ensemble flood prediction system using a global-scale land surface and hydrodynamic model was developed. The system takes surface atmospheric forcing as input, and the land surface model MATSIRO calculates runoff. The generated runoff is input to the hydrodynamic model CaMa-Flood to calculate discharge and flood inundation. CaMa-Flood can simulate flood area and its fraction by introducing a floodplain connected to the river channel. The forecast lead time was set to 39 hours according to the forcing data. As a case study, the flood that occurred in the Kinu river basin, Japan, in 2015 was hindcast. In the 1761 km² Kinu river basin, the 3-day accumulated average rainfall was 384 mm and over 4000 people were left in the inundated area. The ensemble numerical weather prediction data available at that time were input to the system at a resolution of 0.05 degrees and a 1-hour time step. As a result, the system predicted the flood occurrence by 45% and 84% at 23 and 11 hours before the water level exceeded the evacuation threshold, respectively. Such prediction lead times may provide a chance for early flood preparation such as levee reinforcement or evacuation. In addition to discharge, flood area predictability was also analyzed. Although the models were applied to the Japan region, this system can easily be applied to other regions or even at global scale. Areal flood prediction at meso to global scales would be useful for detecting hot zones or vulnerable areas in each region.
Mixture EMOS model for calibrating ensemble forecasts of wind speed.
Baran, S; Lerch, S
2016-03-01
Ensemble model output statistics (EMOS) is a statistical tool for post-processing forecast ensembles of weather variables obtained from multiple runs of numerical weather prediction models in order to produce calibrated predictive probability density functions. The EMOS predictive probability density function is given by a parametric distribution with parameters depending on the ensemble forecasts. We propose an EMOS model for calibrating wind speed forecasts based on weighted mixtures of truncated normal (TN) and log-normal (LN) distributions where model parameters and component weights are estimated by optimizing the values of proper scoring rules over a rolling training period. The new model is tested on wind speed forecasts of the 50-member European Centre for Medium-range Weather Forecasts ensemble, the 11-member Aire Limitée Adaptation dynamique Développement International-Hungary Ensemble Prediction System ensemble of the Hungarian Meteorological Service, and the eight-member University of Washington mesoscale ensemble, and its predictive performance is compared with that of various benchmark EMOS models based on single parametric families and combinations thereof. The results indicate improved calibration of probabilistic forecasts and accuracy of point forecasts in comparison with the raw ensemble and climatological forecasts. The mixture EMOS model significantly outperforms the TN and LN EMOS methods; moreover, it provides better calibrated forecasts than the TN-LN combination model and offers an increased flexibility while avoiding covariate selection problems. © 2016 The Authors Environmetrics Published by John Wiley & Sons Ltd.
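The shape of a weighted TN-LN mixture density can be sketched as below. This is a simplified illustration, not the authors' model: in EMOS the distribution parameters depend on the ensemble forecasts and are fitted by optimizing a proper scoring rule, whereas here the weight and all parameters are fixed, hypothetical numbers, and the truncation point of the normal component is assumed to be zero.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_pdf(x, mu, sigma):
    """Normal density with location mu and scale sigma, truncated below at zero."""
    if x < 0.0:
        return 0.0
    return normal_pdf((x - mu) / sigma) / (sigma * normal_cdf(mu / sigma))


def lognormal_pdf(x, m, s):
    """Log-normal density whose logarithm has mean m and standard deviation s."""
    if x <= 0.0:
        return 0.0
    return normal_pdf((math.log(x) - m) / s) / (x * s)


def mixture_pdf(x, weight, mu, sigma, m, s):
    """Weighted mixture: weight * TN + (1 - weight) * LN."""
    return (weight * truncated_normal_pdf(x, mu, sigma)
            + (1.0 - weight) * lognormal_pdf(x, m, s))


# Hypothetical fixed parameters (wind speed in m/s), for illustration only.
params = dict(weight=0.6, mu=5.0, sigma=2.0, m=1.5, s=0.4)
density_at_5 = mixture_pdf(5.0, **params)
print(f"mixture density at 5 m/s: {density_at_5:.4f}")
```

Because both components are proper densities on the non-negative half-line, any weight between 0 and 1 yields a valid predictive density.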
Ensemble forecasting has been used for operational numerical weather prediction in the United States and Europe since the early 1990s. An ensemble of weather or climate forecasts is used to characterize the two main sources of uncertainty in computer models of physical systems: ...
Analyses and forecasts of a tornadic supercell outbreak using a 3DVAR system ensemble
NASA Astrophysics Data System (ADS)
Zhuang, Zhaorong; Yussouf, Nusrat; Gao, Jidong
2016-05-01
As part of NOAA's "Warn-On-Forecast" initiative, a convective-scale data assimilation and prediction system was developed using the WRF-ARW model and ARPS 3DVAR data assimilation technique. The system was then evaluated using retrospective short-range ensemble analyses and probabilistic forecasts of the tornadic supercell outbreak event that occurred on 24 May 2011 in Oklahoma, USA. A 36-member multi-physics ensemble system provided the initial and boundary conditions for a 3-km convective-scale ensemble system. Radial velocity and reflectivity observations from four WSR-88Ds were assimilated into the ensemble using the ARPS 3DVAR technique. Five data assimilation and forecast experiments were conducted to evaluate the sensitivity of the system to data assimilation frequencies, in-cloud temperature adjustment schemes, and fixed- and mixed-microphysics ensembles. The results indicated that the experiment with 5-min assimilation frequency quickly built up the storm and produced a more accurate analysis compared with the 10-min assimilation frequency experiment. The predicted vertical vorticity from the moist-adiabatic in-cloud temperature adjustment scheme was larger in magnitude than that from the latent heat scheme. Cycled data assimilation yielded good forecasts, where the ensemble probability of high vertical vorticity matched reasonably well with the observed tornado damage path. Overall, the results of the study suggest that the 3DVAR analysis and forecast system can provide reasonable forecasts of tornadic supercell storms.
NASA Astrophysics Data System (ADS)
Velázquez, Juan Alberto; Anctil, François; Ramos, Maria-Helena; Perrin, Charles
2010-05-01
An ensemble forecasting system seeks to assess and to communicate the uncertainty of hydrological predictions by proposing, at each time step, an ensemble of forecasts from which one can estimate the probability distribution of the predictand (the probabilistic forecast), in contrast with a single estimate of the flow, for which no distribution is obtainable (the deterministic forecast). In the past years, efforts towards the development of probabilistic hydrological prediction systems were made with the adoption of ensembles of numerical weather predictions (NWPs). The additional information provided by the different available Ensemble Prediction Systems (EPS) was evaluated in a hydrological context on various case studies (see the review by Cloke and Pappenberger, 2009). For example, the European ECMWF-EPS was explored in case studies by Roulin et al. (2005), Bartholmes et al. (2005), Jaun et al. (2008), and Renner et al. (2009). The Canadian EC-EPS was also evaluated by Velázquez et al. (2009). Most of these case studies investigate the ensemble predictions of a given hydrological model, set up over a limited number of catchments. Uncertainty from weather predictions is assessed through the use of meteorological ensembles. However, uncertainty from the tested hydrological model and statistical robustness of the forecasting system when coping with different hydro-meteorological conditions are less frequently evaluated. The aim of this study is to evaluate and compare the performance and the reliability of 18 lumped hydrological models applied to a large number of catchments in an operational ensemble forecasting context. Some of these models were evaluated in a previous study (Perrin et al. 2001) for their ability to simulate streamflow. Results demonstrated that very simple models can achieve a level of performance almost as high (sometimes higher) as models with more parameters.
In the present study, we focus on the ability of the hydrological models to provide reliable probabilistic forecasts of streamflow, based on ensemble weather predictions. The models were therefore adapted to run in a forecasting mode, i.e., to update initial conditions according to the last observed discharge at the time of the forecast, and to cope with ensemble weather scenarios. All models are lumped, i.e., the hydrological behavior is integrated over the spatial scale of the catchment, and run at daily time steps. The complexity of tested models varies between 3 and 13 parameters. The models are tested on 29 French catchments. Daily streamflow time series extend over 17 months, from March 2005 to July 2006. Catchment areas range between 1470 km² and 9390 km², and represent a variety of hydrological and meteorological conditions. The 12 UTC 10-day ECMWF rainfall ensemble (51 members) was used, which led to daily streamflow forecasts for a 9-day lead time. In order to assess the performance and reliability of the hydrological ensemble predictions, we computed the Continuous Ranked Probability Score (CRPS) (Matheson and Winkler, 1976), as well as the reliability diagram (e.g., Wilks, 1995) and the rank histogram (Talagrand et al., 1999). Since the ECMWF deterministic forecasts are also available, the performance of the hydrological forecasting systems was also evaluated by comparing the deterministic score (MAE) with the probabilistic score (CRPS). The results obtained for the 18 hydrological models and the 29 studied catchments are discussed in the perspective of improving the operational use of ensemble forecasting in hydrology.
References
Bartholmes, J. and Todini, E.: Coupling meteorological and hydrological models for flood forecasting, Hydrol. Earth Syst. Sci., 9, 333-346, 2005.
Cloke, H. and Pappenberger, F.: Ensemble flood forecasting: a review, J. Hydrol., 375 (3-4), 613-626, 2009.
Jaun, S., Ahrens, B., Walser, A., Ewen, T., and Schär, C.: A probabilistic view on the August 2005 floods in the upper Rhine catchment, Nat. Hazards Earth Syst. Sci., 8, 281-291, 2008.
Matheson, J. E. and Winkler, R. L.: Scoring rules for continuous probability distributions, Manage. Sci., 22, 1087-1096, 1976.
Perrin, C., Michel, C., and Andréassian, V.: Does a large number of parameters enhance model performance? Comparative assessment of common catchment model structures on 429 catchments, J. Hydrol., 242, 275-301, 2001.
Renner, M., Werner, M. G. F., Rademacher, S., and Sprokkereef, E.: Verification of ensemble flow forecast for the River Rhine, J. Hydrol., 376, 463-475, 2009.
Roulin, E. and Vannitsem, S.: Skill of medium-range hydrological ensemble predictions, J. Hydrometeorol., 6, 729-744, 2005.
Talagrand, O., Vautard, R., and Strauss, B.: Evaluation of the probabilistic prediction systems, in: Proceedings, ECMWF Workshop on Predictability, Shinfield Park, Reading, Berkshire, ECMWF, 1-25, 1999.
Velázquez, J. A., Petit, T., Lavoie, A., Boucher, M.-A., Turcotte, R., Fortin, V., and Anctil, F.: An evaluation of the Canadian global meteorological ensemble prediction system for short-term hydrological forecasting, Hydrol. Earth Syst. Sci., 13, 2221-2231, 2009.
Wilks, D. S.: Statistical Methods in the Atmospheric Sciences, Academic Press, San Diego, CA, 465 pp., 1995.
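The two verification tools named in this abstract, the CRPS and the rank histogram, can be illustrated with a minimal sketch. It uses the standard sample estimator of the CRPS for a finite ensemble, which may differ in detail from what the authors computed. The function names and the toy numbers are illustrative assumptions, not taken from the study. For a one-member ensemble the CRPS reduces to the absolute error, which is why it can be compared with the MAE of a deterministic forecast.

```python
def crps_ensemble(members, obs):
    """Sample CRPS of one ensemble forecast against one observation.

    Uses the standard estimator  mean|x_i - y| - 0.5 * mean|x_i - x_j|.
    """
    m = len(members)
    term_obs = sum(abs(x - obs) for x in members) / m
    term_spread = sum(abs(xi - xj) for xi in members for xj in members) / (2 * m * m)
    return term_obs - term_spread


def observation_rank(members, obs):
    """Number of ensemble members strictly below the observation (0..m)."""
    return sum(1 for x in members if x < obs)


def rank_histogram(forecasts, observations):
    """Count how often the observation falls in each of the m + 1 rank bins."""
    m = len(forecasts[0])
    counts = [0] * (m + 1)
    for members, obs in zip(forecasts, observations):
        counts[observation_rank(members, obs)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical streamflow values (m3/s), for illustration only.
    print(crps_ensemble([1.0, 3.0], 2.0))
    print(rank_histogram([[1, 2, 3], [1, 2, 3]], [0.5, 2.5]))
```

A flat rank histogram over many forecast cases suggests a reliable ensemble spread, while a U shape suggests that the ensemble is too narrow.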
NASA Technical Reports Server (NTRS)
Kirtman, Ben P.; Min, Dughong; Infanti, Johnna M.; Kinter, James L., III; Paolino, Daniel A.; Zhang, Qin; vandenDool, Huug; Saha, Suranjana; Mendez, Malaquias Pena; Becker, Emily;
2013-01-01
The recent US National Academies report "Assessment of Intraseasonal to Interannual Climate Prediction and Predictability" was unequivocal in recommending the need for the development of a North American Multi-Model Ensemble (NMME) operational predictive capability. Indeed, this effort is required to meet the specific tailored regional prediction and decision support needs of a large community of climate information users. The multi-model ensemble approach has proven extremely effective at quantifying prediction uncertainty due to uncertainty in model formulation, and has proven to produce better prediction quality (on average) than any single model ensemble. This multi-model approach is the basis for several international collaborative prediction research efforts and an operational European system, and there are numerous examples of how this multi-model ensemble approach yields superior forecasts compared to any single model. Based on two NOAA Climate Test Bed (CTB) NMME workshops (February 18, and April 8, 2011) a collaborative and coordinated implementation strategy for a NMME prediction system has been developed and is currently delivering real-time seasonal-to-interannual predictions on the NOAA Climate Prediction Center (CPC) operational schedule. The hindcast and real-time prediction data are readily available (e.g., http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/) and in graphical format from CPC (http://origin.cpc.ncep.noaa.gov/products/people/wd51yf/NMME/index.html). Moreover, the NMME forecasts are already being used as guidance for operational forecasters. This paper describes the new NMME effort, and presents an overview of the multi-model forecast quality and the complementary skill associated with individual models.
NASA Astrophysics Data System (ADS)
Szunyogh, Istvan; Kostelich, Eric J.; Gyarmati, G.; Patil, D. J.; Hunt, Brian R.; Kalnay, Eugenia; Ott, Edward; Yorke, James A.
2005-08-01
The accuracy and computational efficiency of the recently proposed local ensemble Kalman filter (LEKF) data assimilation scheme are investigated on a state-of-the-art operational numerical weather prediction model using simulated observations. The model selected for this purpose is the T62 horizontal- and 28-level vertical-resolution version of the Global Forecast System (GFS) of the National Centers for Environmental Prediction. The performance of the data assimilation system is assessed for different configurations of the LEKF scheme. It is shown that a modest size (40-member) ensemble is sufficient to track the evolution of the atmospheric state with high accuracy. For this ensemble size, the computational time per analysis is less than 9 min on a cluster of PCs. The analyses are extremely accurate in the mid-latitude storm track regions. The largest analysis errors, which are typically much smaller than the observational errors, occur where parametrized physical processes play important roles. Because these are also the regions where model errors are expected to be the largest, limitations of a real-data implementation of the ensemble-based Kalman filter may be easily mistaken for model errors. In light of these results, the importance of testing the ensemble-based Kalman filter data assimilation systems on simulated observations is stressed.
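To make the idea of an ensemble-based Kalman analysis concrete, here is a minimal sketch of a deterministic ensemble square-root update for a single, directly observed scalar variable. This is not the LEKF of the abstract, which works on local patches of a high-dimensional model state. The function name and the numbers are illustrative assumptions.

```python
import math


def ensrf_scalar_update(ensemble, obs, obs_var):
    """Square-root ensemble Kalman update for one directly observed scalar.

    The ensemble mean is pulled toward the observation with the Kalman gain,
    and the deviations from the mean are shrunk so that the analysis variance
    equals (1 - gain) times the background variance.
    """
    m = len(ensemble)
    mean = sum(ensemble) / m
    var = sum((x - mean) ** 2 for x in ensemble) / (m - 1)
    gain = var / (var + obs_var)           # scalar Kalman gain, 0 <= gain < 1
    new_mean = mean + gain * (obs - mean)  # analysis mean
    shrink = math.sqrt(1.0 - gain)         # anomaly scaling factor
    return [new_mean + shrink * (x - mean) for x in ensemble]


if __name__ == "__main__":
    # Background ensemble with mean 2 and variance 4; observation 4 with variance 4.
    print(ensrf_scalar_update([0.0, 2.0, 4.0], obs=4.0, obs_var=4.0))
```

With equal background and observation variances the gain is 0.5, so the analysis mean lies halfway between the background mean and the observation.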
An operational mesoscale ensemble data assimilation and prediction system: E-RTFDDA
NASA Astrophysics Data System (ADS)
Liu, Y.; Hopson, T.; Roux, G.; Hacker, J.; Xu, M.; Warner, T.; Swerdlin, S.
2009-04-01
Mesoscale (2-2000 km) meteorological processes differ from synoptic circulations in that mesoscale weather changes rapidly in space and time, and physics processes that are parameterized in NWP models play a great role. Complex interactions of synoptic circulations, regional and local terrain, land-surface heterogeneity, and associated physical properties, and the physical processes of radiative transfer, cloud and precipitation and boundary layer mixing, are crucial in shaping regional weather and climate. Mesoscale ensemble analysis and prediction should sample the uncertainties of mesoscale modeling systems in representing these factors. An innovative mesoscale Ensemble Real-Time Four Dimensional Data Assimilation (E-RTFDDA) and forecasting system has been developed at NCAR. E-RTFDDA contains diverse ensemble perturbation approaches that consider uncertainties in all major system components to produce multi-scale continuously-cycling probabilistic data assimilation and forecasting. A 30-member E-RTFDDA system with three nested domains with grid sizes of 30, 10 and 3.33 km has been running on a Department of Defense high-performance computing platform since September 2007. It has been applied at two very different US geographical locations; one in the western inter-mountain area and the other in the northeastern states, producing 6 hour analyses and 48 hour forecasts, with 4 forecast cycles a day. The operational model outputs are analyzed to a) assess overall ensemble performance and properties, b) study terrain effect on mesoscale predictability, c) quantify the contribution of different ensemble perturbation approaches to the overall forecast skill, and d) assess the additional contributed skill from an ensemble calibration process based on a quantile-regression algorithm. The system and the results will be reported at the meeting.
Space weather forecasting with a Multimodel Ensemble Prediction System (MEPS)
NASA Astrophysics Data System (ADS)
Schunk, R. W.; Scherliess, L.; Eccles, V.; Gardner, L. C.; Sojka, J. J.; Zhu, L.; Pi, X.; Mannucci, A. J.; Butala, M.; Wilson, B. D.; Komjathy, A.; Wang, C.; Rosen, G.
2016-07-01
The goal of the Multimodel Ensemble Prediction System (MEPS) program is to improve space weather specification and forecasting with ensemble modeling. Space weather can have detrimental effects on a variety of civilian and military systems and operations, and many of the applications pertain to the ionosphere and upper atmosphere. Space weather can affect over-the-horizon radars, HF communications, surveying and navigation systems, surveillance, spacecraft charging, power grids, pipelines, and the Federal Aviation Administration (FAA's) Wide Area Augmentation System (WAAS). Because of its importance, numerous space weather forecasting approaches are being pursued, including those involving empirical, physics-based, and data assimilation models. Clearly, if there are sufficient data, the data assimilation modeling approach is expected to be the most reliable, but different data assimilation models can produce different results. Therefore, like the meteorology community, we created a Multimodel Ensemble Prediction System (MEPS) for the Ionosphere-Thermosphere-Electrodynamics (ITE) system that is based on different data assimilation models. The MEPS ensemble is composed of seven physics-based data assimilation models for the ionosphere, ionosphere-plasmasphere, thermosphere, high-latitude ionosphere-electrodynamics, and middle to low latitude ionosphere-electrodynamics. Hence, multiple data assimilation models can be used to describe each region. A selected storm event that was reconstructed with four different data assimilation models covering the middle and low latitude ionosphere is presented and discussed. In addition, the effect of different data types on the reconstructions is shown.
NASA Astrophysics Data System (ADS)
Bennett, J.; David, R. E.; Wang, Q.; Li, M.; Shrestha, D. L.
2016-12-01
Flood forecasting in Australia has historically relied on deterministic forecasting models run only when floods are imminent, with considerable forecaster input and interpretation. These now co-exist with a continually available 7-day streamflow forecasting service (also deterministic) aimed at operational water management applications such as environmental flow releases. The 7-day service is not optimised for flood prediction. We describe progress on developing a system for ensemble streamflow forecasting that is suitable for both flood prediction and water management applications. Precipitation uncertainty is handled through post-processing of Numerical Weather Prediction (NWP) output with a Bayesian rainfall post-processor (RPP). The RPP corrects biases, downscales NWP output, and produces reliable ensemble spread. Ensemble precipitation forecasts are used to force a semi-distributed conceptual rainfall-runoff model. Uncertainty in precipitation forecasts is insufficient to reliably describe streamflow forecast uncertainty, particularly at shorter lead-times. We characterise hydrological prediction uncertainty separately with a 4-stage error model. The error model relies on data transformation to ensure residuals are homoscedastic and symmetrically distributed. To ensure streamflow forecasts are accurate and reliable, the residuals are modelled using a mixture-Gaussian distribution with distinct parameters for the rising and falling limbs of the forecast hydrograph. In a case study of the Murray River in south-eastern Australia, we show ensemble predictions of floods generally have lower errors than deterministic forecasting methods. We also discuss some of the challenges in operationalising short-term ensemble streamflow forecasts in Australia, including meeting the needs for accurate predictions across all flow ranges and comparing forecasts generated by event and continuous hydrological models.
The role of ensemble post-processing for modeling the ensemble tail
NASA Astrophysics Data System (ADS)
Van De Vyver, Hans; Van Schaeybroeck, Bert; Vannitsem, Stéphane
2016-04-01
Over the past decades, the numerical weather prediction community has witnessed a paradigm shift from deterministic to probabilistic forecast and state estimation (Buizza and Leutbecher, 2015; Buizza et al., 2008), in an attempt to quantify the uncertainties associated with initial-condition and model errors. An important benefit of a probabilistic framework is the improved prediction of extreme events. However, one may ask to what extent such model estimates contain information on the occurrence probability of extreme events and how this information can be optimally extracted. Different approaches have been proposed and applied on real-world systems which, based on extreme value theory, allow the estimation of extreme-event probabilities conditional on forecasts and state estimates (Ferro, 2007; Friederichs, 2010). Using ensemble predictions generated with a model of low dimensionality, a thorough investigation is presented quantifying the change of predictability of extreme events associated with ensemble post-processing and other influencing factors, including the finite ensemble size, lead time, model assumption, and the use of different covariates (ensemble mean, maximum, spread...) for modeling the tail distribution. Tail modeling is performed by deriving extreme-quantile estimates using peak-over-threshold representation (generalized Pareto distribution) or quantile regression. Common ensemble post-processing methods aim to improve mostly the ensemble mean and spread of a raw forecast (Van Schaeybroeck and Vannitsem, 2015). Conditional tail modeling, on the other hand, is a post-processing in itself, focusing on the tails only. Therefore, it is unclear how applying ensemble post-processing prior to conditional tail modeling impacts the skill of extreme-event predictions. This work investigates this question in detail.
Buizza, Leutbecher, and Isaksen, 2008: Potential use of an ensemble of analyses in the ECMWF Ensemble Prediction System, Q. J. R. Meteorol. Soc. 134: 2051-2066.
Buizza and Leutbecher, 2015: The forecast skill horizon, Q. J. R. Meteorol. Soc. 141: 3366-3382.
Ferro, 2007: A probability model for verifying deterministic forecasts of extreme events. Weather and Forecasting 22 (5), 1089-1100.
Friederichs, 2010: Statistical downscaling of extreme precipitation events using extreme value theory. Extremes 13, 109-132.
Van Schaeybroeck and Vannitsem, 2015: Ensemble post-processing using member-by-member approaches: theoretical aspects. Q. J. R. Meteorol. Soc. 141: 807-818.
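The peak-over-threshold step mentioned in the abstract can be sketched as follows. This is a simplified, unconditional version: it fits a generalized Pareto distribution (GPD) to the excesses over a threshold by the method of moments and inverts the fitted tail to obtain an extreme quantile. The abstract's approach conditions the tail on covariates such as the ensemble mean, which is not attempted here. The fitting method, function names, and synthetic data are assumptions made for illustration.

```python
import math


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit; returns (shape, scale).

    Uses  mean = scale / (1 - shape)  and
          var  = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)),
    which are valid for shape < 0.5.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (1.0 + ratio)
    return shape, scale


def pot_quantile(data, threshold, exceed_prob):
    """Value exceeded with probability exceed_prob, from a GPD tail fit."""
    excesses = [x - threshold for x in data if x > threshold]
    rate = len(excesses) / len(data)  # empirical P(X > threshold)
    if not 0.0 < exceed_prob < rate:
        raise ValueError("exceed_prob must lie in the tail beyond the threshold")
    shape, scale = fit_gpd_moments(excesses)
    frac = exceed_prob / rate         # exceedance probability given X > threshold
    if abs(shape) < 1e-9:             # exponential-tail limit of the GPD
        return threshold - scale * math.log(frac)
    return threshold + scale / shape * (frac ** (-shape) - 1.0)


if __name__ == "__main__":
    # Synthetic sample: evenly spaced quantiles of a unit exponential distribution.
    sample = [-math.log(1.0 - (i + 0.5) / 1000.0) for i in range(1000)]
    print(pot_quantile(sample, threshold=1.0, exceed_prob=0.01))
```

For the unit exponential sample the exact 1-in-100 value is ln(100), about 4.6, so the printed estimate should land close to that.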
Potentialities of ensemble strategies for flood forecasting over the Milano urban area
NASA Astrophysics Data System (ADS)
Ravazzani, Giovanni; Amengual, Arnau; Ceppi, Alessandro; Homar, Víctor; Romero, Romu; Lombardi, Gabriele; Mancini, Marco
2016-08-01
Analysis of ensemble forecasting strategies, which can provide a tangible backing for flood early warning procedures and mitigation measures over the Mediterranean region, is one of the fundamental motivations of the international HyMeX programme. Here, we examine two severe hydrometeorological episodes that affected the Milano urban area and for which the complex flood protection system of the city did not completely succeed. Indeed, flood damage has increased exponentially during the last 60 years, due to industrial and urban developments. Thus, the improvement of the Milano flood control system needs a synergism between structural and non-structural approaches. First, we examine how land-use changes due to urban development have altered the hydrological response to intense rainfalls. Second, we test a flood forecasting system which comprises the Flash-flood Event-based Spatially distributed rainfall-runoff Transformation, including Water Balance (FEST-WB) and the Weather Research and Forecasting (WRF) models. Deep moist convection and extreme precipitation are difficult to forecast accurately because of uncertainties arising from the numerical weather prediction (NWP) physical parameterizations and high sensitivity to misrepresentation of the atmospheric state; however, two hydrological ensemble prediction systems (HEPS) have been designed to explicitly cope with uncertainties in the initial and lateral boundary conditions (IC/LBCs) and physical parameterizations of the NWP model. No substantial differences in skill have been found between the two ensemble strategies when considering an enhanced diversity of IC/LBCs for the perturbed initial conditions ensemble. Furthermore, no additional benefits have been found by considering more frequent LBCs in a mixed physics ensemble, as ensemble spread seems to be reduced.
These findings could help to design the most appropriate ensemble strategies before these hydrometeorological extremes, given the computational cost of running such advanced HEPSs for operational purposes.
Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA
Biggs, Matthew B; Papin, Jason A
2017-03-01
Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository. PMID:28263984
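One simple way to turn an ensemble of alternative network structures into a single prediction is a vote across members. The abstract does not state how EnsembleFBA combines member predictions, so the majority rule below is an assumption, and the gene identifiers and per-member calls are hypothetical. In practice each member's essentiality call would come from a flux balance simulation of that network, which is not shown here.

```python
def ensemble_essentiality(member_predictions, threshold=0.5):
    """Combine per-network essentiality calls by voting.

    member_predictions: list of dicts mapping gene id -> True if that
    ensemble member predicts the gene to be essential.
    A gene is called essential when more than `threshold` of the members agree.
    """
    n = len(member_predictions)
    genes = set().union(*(p.keys() for p in member_predictions))
    result = {}
    for gene in sorted(genes):
        votes = sum(1 for p in member_predictions if p.get(gene, False))
        result[gene] = votes / n > threshold
    return result


if __name__ == "__main__":
    # Hypothetical calls from three draft networks.
    members = [
        {"geneA": True, "geneB": False},
        {"geneA": True, "geneB": True},
        {"geneA": False, "geneB": False},
    ]
    print(ensemble_essentiality(members))
```

Raising the threshold trades recall for precision, since fewer genes are called essential but each call is supported by more networks.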
NASA Astrophysics Data System (ADS)
Clark, E.; Wood, A.; Nijssen, B.; Newman, A. J.; Mendoza, P. A.
2016-12-01
The System for Hydrometeorological Applications, Research and Prediction (SHARP), developed at the National Center for Atmospheric Research (NCAR), University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation, is a fully automated ensemble prediction system for short-term to seasonal applications. It incorporates uncertainty in initial hydrologic conditions (IHCs) and in hydrometeorological predictions. In this implementation, IHC uncertainty is estimated by propagating an ensemble of 100 plausible temperature and precipitation time series through the Sacramento/Snow-17 model. The forcing ensemble explicitly accounts for measurement and interpolation uncertainties in the development of gridded meteorological forcing time series. The resulting ensemble of derived IHCs exhibits a broad range of possible soil moisture and snow water equivalent (SWE) states. To select the IHCs that are most consistent with the observations, we employ a particle filter (PF) that weights IHC ensemble members based on observations of streamflow and SWE. These particles are then used to initialize ensemble precipitation and temperature forecasts downscaled from the Global Ensemble Forecast System (GEFS), generating a streamflow forecast ensemble. We test this method in two basins in the Pacific Northwest that are important for water resources management: 1) the Green River upstream of Howard Hanson Dam, and 2) the South Fork Flathead River upstream of Hungry Horse Dam. The first of these is characterized by mixed snow and rain, while the second is snow-dominated. The PF-based forecasts are compared to forecasts based on 1) a single IHC (corresponding to median streamflow) paired with the full GEFS ensemble, and 2) the full IHC ensemble, without filtering, paired with the full GEFS ensemble.
In addition to assessing improvements in the spread of IHCs, we perform a hindcast experiment to evaluate the utility of PF-based data assimilation on streamflow forecasts at 1- to 7-day lead times.
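The weighting step of a particle filter can be sketched in a few lines. The sketch assumes a Gaussian observation error and a single observed quantity, whereas the abstract uses both streamflow and SWE and does not specify the likelihood. The function names and numbers are illustrative assumptions.

```python
import math


def particle_weights(simulated, observed, obs_std):
    """Normalized particle weights from a Gaussian likelihood (assumed form).

    simulated: value of the observed quantity in each ensemble member.
    """
    log_w = [-0.5 * ((s - observed) / obs_std) ** 2 for s in simulated]
    peak = max(log_w)                          # subtract the maximum for numerical stability
    w = [math.exp(lw - peak) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def effective_sample_size(weights):
    """Diagnostic: equals the ensemble size for uniform weights, 1 if one particle dominates."""
    return 1.0 / sum(w * w for w in weights)


if __name__ == "__main__":
    # Hypothetical simulated streamflow (m3/s) in four members, and one observation.
    sim = [80.0, 95.0, 102.0, 130.0]
    w = particle_weights(sim, observed=100.0, obs_std=10.0)
    print(w, effective_sample_size(w))
```

Members whose simulated state is closest to the observation receive the largest weights, and a small effective sample size signals that only a few initial states remain consistent with the data.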
NASA Astrophysics Data System (ADS)
Gutiérrez, J. M.; Primo, C.; Rodríguez, M. A.; Fernández, J.
2008-02-01
We present a novel approach to characterize and graphically represent the spatiotemporal evolution of ensembles using a simple diagram. To this aim we analyze the fluctuations obtained as differences between each member of the ensemble and the control. The lognormal character of these fluctuations suggests a characterization in terms of the first two moments of the logarithmic transformed values. On one hand, the mean is associated with the exponential growth in time. On the other hand, the variance accounts for the spatial correlation and localization of fluctuations. In this paper we introduce the MVL (Mean-Variance of Logarithms) diagram to intuitively represent the interplay and evolution of these two quantities. We show that this diagram uncovers useful information about the spatiotemporal dynamics of the ensemble. Some universal features of the diagram are also described, associated either with the nonlinear system or with the ensemble method and illustrated using both toy models and numerical weather prediction systems.
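One point of an MVL diagram, as described above, is the pair (mean, variance) of the logarithm of the absolute fluctuation between a member and the control, taken over all grid points at one time. The following minimal sketch computes such a point. The function name and the toy fields are illustrative assumptions, and grid points with zero difference are skipped because their logarithm is undefined.

```python
import math


def mvl_point(member, control):
    """Mean and variance of log|member - control| over grid points."""
    logs = [math.log(abs(m - c)) for m, c in zip(member, control) if m != c]
    n = len(logs)
    mean = sum(logs) / n
    var = sum((v - mean) ** 2 for v in logs) / n
    return mean, var


if __name__ == "__main__":
    # Toy one-dimensional fields: control run and one perturbed member.
    control = [0.0, 0.0, 0.0, 0.0]
    member = [0.1, -0.2, 0.05, 0.4]
    print(mvl_point(member, control))
```

Computing this pair at successive forecast times and plotting mean against variance traces the trajectory that the diagram displays.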
NASA Astrophysics Data System (ADS)
Perera, Kushan C.; Western, Andrew W.; Robertson, David E.; George, Biju; Nawarathna, Bandara
2016-06-01
Irrigation demands fluctuate in response to weather variations and a range of irrigation management decisions, which creates challenges for water supply system operators. This paper develops a method for real-time ensemble forecasting of irrigation demand and applies it to irrigation command areas of various sizes for lead times of 1 to 5 days. The ensemble forecasts are based on a deterministic time series model coupled with ensemble representations of the various inputs to that model. Forecast inputs include past flow, precipitation, and potential evapotranspiration. These inputs are variously derived from flow observations from a modernized irrigation delivery system, short-term weather forecasts derived from numerical weather prediction models, and observed weather data available from automatic weather stations. The predictive performance for the ensemble spread of irrigation demand was quantified using rank histograms, the mean continuous ranked probability score (CRPS), the mean CRPS reliability and the temporal mean of the ensemble root mean squared error (MRMSE). The mean forecast was evaluated using root mean squared error (RMSE), Nash-Sutcliffe model efficiency (NSE) and bias. The NSE values for evaluation periods ranged between 0.96 (1 day lead time, whole study area) and 0.42 (5 days lead time, smallest command area). Rank histograms and comparison of MRMSE, mean CRPS, mean CRPS reliability and RMSE indicated that the ensemble spread is generally a reliable representation of the forecast uncertainty for short lead times but underestimates the uncertainty for long lead times.
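The three scores used for the mean forecast have standard definitions, sketched below. Bias is taken here as the mean error (forecast minus observation), which is an assumption since the abstract does not define it. The sample values are made up.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def nse(forecast, observed):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean as a predictor."""
    mean_obs = sum(observed) / len(observed)
    num = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def bias(forecast, observed):
    """Mean error, forecast minus observation (assumed definition)."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0]          # hypothetical daily demand
    fc = [2.0, 2.0, 2.0]           # forecast equal to the observed mean
    print(rmse(fc, obs), nse(fc, obs), bias(fc, obs))
```

An NSE of 0.42, as reported for the 5-day lead time in the smallest command area, means the forecast removes 42 percent of the squared error of always predicting the observed mean.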
NASA Astrophysics Data System (ADS)
Liu, Li; Xu, Yue-Ping
2017-04-01
Ensemble flood forecasting driven by numerical weather prediction products is becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system based on the Variable Infiltration Capacity (VIC) model and quantitative precipitation forecasts from the TIGGE dataset is constructed for the Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by a parallel-programmed ɛ-NSGAII multi-objective algorithm, and two separately parameterized models are determined to simulate daily flows and peak flows, coupled with a modular approach. The results indicate that the ɛ-NSGAII algorithm permits more efficient optimization and rational determination of parameter settings. It is demonstrated that the multimodel ensemble streamflow mean has better skill than the best single-model ensemble mean (ECMWF), and that the multimodel ensembles weighted on members and skill scores outperform other multimodel ensembles. For a typical flood event, it is shown that the flood can be predicted 3-4 days in advance, but the flows in the rising limb can be captured only 1-2 days ahead due to the flash feature. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from either single-model or multimodel ensembles generally underestimate the peaks, as the extreme values are smoothed out by the ensemble process.
NASA Astrophysics Data System (ADS)
Alessandri, Andrea; Felice, Matteo De; Catalano, Franco; Lee, June-Yi; Wang, Bin; Lee, Doo Young; Yoo, Jin-Ho; Weisheimer, Antje
2018-04-01
Multi-model ensembles (MMEs) are powerful tools in dynamical climate prediction as they account for the overconfidence and the uncertainties related to single-model ensembles. Previous works suggested that the potential benefit that can be expected by using a MME amplifies with the increase of the independence of the contributing Seasonal Prediction Systems. In this work we combine the two MME Seasonal Prediction Systems (SPSs) independently developed by the European (ENSEMBLES) and by the Asian-Pacific (APCC/CliPAS) communities. To this aim, all the possible multi-model combinations obtained by putting together the 5 models from ENSEMBLES and the 11 models from APCC/CliPAS have been evaluated. The grand ENSEMBLES-APCC/CliPAS MME significantly enhances the skill in predicting 2m temperature and precipitation compared to previous estimates from the contributing MMEs. Our results show that, in general, the better combinations of SPSs are obtained by mixing ENSEMBLES and APCC/CliPAS models and that only a limited number of SPSs is required to obtain the maximum performance. The number and selection of models that perform better usually differ depending on the region/phenomenon under consideration, so that all models are useful in some cases. It is shown that the incremental performance contribution tends to be higher when adding one model from ENSEMBLES to APCC/CliPAS MMEs and vice versa, confirming that the benefit of using MMEs amplifies with the increase of the independence of the contributing models. To verify the above results for a real world application, the Grand ENSEMBLES-APCC/CliPAS MME is used to predict retrospective energy demand over Italy as provided by TERNA (Italian Transmission System Operator) for the period 1990-2007. The results demonstrate the useful application of MME seasonal predictions for energy demand forecasting over Italy.
A significant enhancement of the potential economic value of forecasting energy demand is shown when using the better combinations from the Grand MME, by comparison to the maximum value obtained from the better combinations of each of the two contributing MMEs. The above results demonstrate for the first time the potential of the Grand MME to contribute significantly to obtaining useful predictions at the seasonal time-scale.
NASA Astrophysics Data System (ADS)
Niedzielski, Tomasz; Mizinski, Bartlomiej
2016-04-01
The HydroProg system has been developed within the framework of research project no. 2011/01/D/ST10/04171 of the National Science Centre of Poland and is steadily producing multimodel ensemble predictions of the hydrograph in real time. Although there are six ensemble members available at present, the longest record of predictions and their statistics is available for two data-based models (uni- and multivariate autoregressive models). Thus, we consider 3-hour predictions of water levels, with lead times ranging from 15 to 180 minutes, computed every 15 minutes since August 2013 for the Nysa Klodzka basin (SW Poland) using the two approaches and their two-model ensemble. Since the launch of the HydroProg system there have been 12 high flow episodes, and the objective of this work is to present the performance of the two-model ensemble in the process of forecasting these events. For the sake of brevity, we limit our investigation to a single gauge located on the Nysa Klodzka river in the town of Klodzko, which is centrally located in the studied basin. We identified certain regular scenarios of how the models perform in predicting the high flows in Klodzko. At the initial phase of the high flow, well before the rising limb of the hydrograph, the two-model ensemble is found to provide the most skilful prognoses of water levels. However, while forecasting the rising limb of the hydrograph, either the two-model solution or the vector autoregressive model offers the best predictive performance. In addition, it is hypothesized that along with the development of the rising limb phase, the vector autoregression becomes the most skilful approach amongst the scrutinized ones. Our simple two-model exercise confirms that multimodel hydrologic ensemble predictions cannot be treated as universal solutions suitable for forecasting the entire high flow event, but their superior performance may hold only for certain phases of a high flow.
NASA Astrophysics Data System (ADS)
Baehr, J.; Fröhlich, K.; Botzet, M.; Domeisen, D. I. V.; Kornblueh, L.; Notz, D.; Piontek, R.; Pohlmann, H.; Tietsche, S.; Müller, W. A.
2015-05-01
A seasonal forecast system is presented, based on the global coupled climate model MPI-ESM as used for CMIP5 simulations. We describe the initialisation of the system and analyse its predictive skill for surface temperature. The presented system is initialised in the atmospheric, oceanic, and sea ice component of the model from reanalysis/observations with full field nudging in all three components. For the initialisation of the ensemble, bred vectors with a vertically varying norm are implemented in the ocean component to generate initial perturbations. In a set of ensemble hindcast simulations, starting each May and November between 1982 and 2010, we analyse the predictive skill. Bias-corrected ensemble forecasts for each start date reproduce the observed surface temperature anomalies at 2-4 months lead time, particularly in the tropics. Niño3.4 sea surface temperature anomalies show a small root-mean-square error and predictive skill up to 6 months. Away from the tropics, predictive skill is mostly limited to the ocean, and to regions which are strongly influenced by ENSO teleconnections. In summary, the presented seasonal prediction system based on a coupled climate model shows predictive skill for surface temperature at seasonal time scales comparable to other seasonal prediction systems using different underlying models and initialisation strategies. As the same model underlying our seasonal prediction system—with a different initialisation—is presently also used for decadal predictions, this is an important step towards seamless seasonal-to-decadal climate predictions.
Ensemble Downscaling of Winter Seasonal Forecasts: The MRED Project
NASA Astrophysics Data System (ADS)
Arritt, R. W.; Mred Team
2010-12-01
The Multi-Regional climate model Ensemble Downscaling (MRED) project is a multi-institutional project that is producing large ensembles of downscaled winter seasonal forecasts from coupled atmosphere-ocean seasonal prediction models. Eight regional climate models each are downscaling 15-member ensembles from the National Centers for Environmental Prediction (NCEP) Climate Forecast System (CFS) and the new NASA seasonal forecast system based on the GEOS5 atmospheric model coupled with the MOM4 ocean model. This produces 240-member ensembles, i.e., 8 regional models x 15 global ensemble members x 2 global models, for each winter season (December-April) of 1982-2003. Results to date show that combined global-regional downscaled forecasts have greatest skill for seasonal precipitation anomalies during strong El Niño events such as 1982-83 and 1997-98. Ensemble means of area-averaged seasonal precipitation for the regional models generally track the corresponding results for the global model, though there is considerable inter-model variability amongst the regional models. For seasons and regions where area mean precipitation is accurately simulated the regional models bring added value by extracting greater spatial detail from the global forecasts, mainly due to better resolution of terrain in the regional models. Our results also emphasize that an ensemble approach is essential to realizing the added value from the combined global-regional modeling system.
NASA Astrophysics Data System (ADS)
Liechti, K.; Panziera, L.; Germann, U.; Zappa, M.
2013-10-01
This study explores the limits of radar-based forecasting for hydrological runoff prediction. Two novel radar-based ensemble forecasting chains for flash-flood early warning are investigated in three catchments in the southern Swiss Alps and set in relation to deterministic discharge forecasts for the same catchments. The first radar-based ensemble forecasting chain is driven by NORA (Nowcasting of Orographic Rainfall by means of Analogues), an analogue-based heuristic nowcasting system to predict orographic rainfall for the following eight hours. The second ensemble forecasting system evaluated is REAL-C2, where the numerical weather prediction COSMO-2 is initialised with 25 different initial conditions derived from a four-day nowcast with the radar ensemble REAL. Additionally, three deterministic forecasting chains were analysed. The performance of these five flash-flood forecasting systems was analysed for 1389 h between June 2007 and December 2010 for which NORA forecasts were issued, due to the presence of orographic forcing. A clear preference was found for the ensemble approach. Discharge forecasts perform better when forced by NORA and REAL-C2 rather than by deterministic weather radar data. Moreover, it was observed that using an ensemble of initial conditions at the forecast initialisation, as in REAL-C2, significantly improved the forecast skill. These forecasts also perform better than forecasts forced by ensemble rainfall forecasts (NORA) initialised from a single initial condition of the hydrological model. Thus the best results were obtained with the REAL-C2 forecasting chain. However, for regions where REAL cannot be produced, NORA might be an option for forecasting events triggered by orographic precipitation.
NASA Astrophysics Data System (ADS)
Bao, Hongjun; Zhao, Linna
2012-02-01
A coupled atmospheric-hydrologic-hydraulic ensemble flood forecasting model, driven by The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) data, has been developed for flood forecasting over the Huaihe River. The incorporation of numerical weather prediction (NWP) information into flood forecasting systems may increase forecast lead time from a few hours to a few days. A single NWP model forecast from a single forecast center, however, is insufficient as it involves considerable non-predictable uncertainties and leads to a high number of false alarms. The availability of global ensemble NWP systems through TIGGE offers a new opportunity for flood forecast. The Xinanjiang model used for hydrological rainfall-runoff modeling and the one-dimensional unsteady flow model applied to channel flood routing are coupled with ensemble weather predictions based on the TIGGE data from the Canadian Meteorological Centre (CMC), the European Centre for Medium-Range Weather Forecasts (ECMWF), the UK Met Office (UKMO), and the US National Centers for Environmental Prediction (NCEP). The developed ensemble flood forecasting model is applied to flood forecasting of the 2007 flood season as a test case. The test case is chosen over the upper reaches of the Huaihe River above Lutaizi station with flood diversion and retarding areas. The input flood discharge hydrograph from the main channel to the flood diversion area is estimated with the fixed split ratio of the main channel discharge. The flood flow inside the flood retarding area is calculated as a reservoir with the water balance method. The Muskingum method is used for flood routing in the flood diversion area. A probabilistic discharge and flood inundation forecast is provided as the end product to study the potential benefits of using the TIGGE ensemble forecasts. 
The results demonstrate satisfactory flood forecasting with clear signals of probability of floods up to a few days in advance, and show that TIGGE ensemble forecast data are a promising tool for forecasting of flood inundation, comparable with that driven by raingauge observations.
NASA Astrophysics Data System (ADS)
Greenway, D. P.; Hackett, E.
2017-12-01
Under certain atmospheric refractivity conditions, propagating electromagnetic (EM) waves can become trapped between the surface and the bottom of the atmosphere's mixed layer, which is referred to as surface duct propagation. Being able to predict the presence of these surface ducts can greatly benefit users and developers of sensing technologies and communication systems because ducts significantly influence the performance of these systems. However, directly measuring or modeling a surface ducting layer is challenging due to the high spatial resolution and large spatial coverage needed to make accurate refractivity estimates for EM propagation; thus, inverse methods have become an increasingly popular way of determining atmospheric refractivity. This study uses data from the Coupled Ocean/Atmosphere Mesoscale Prediction System developed by the Naval Research Laboratory and instrumented helicopter (helo) measurements taken during the Wallops Island Field Experiment to evaluate the use of ensemble forecasts in refractivity inversions. Helo measurements and ensemble forecasts are optimized to a parametric refractivity model, and three experiments are performed to evaluate whether incorporation of ensemble forecast data aids in more timely and accurate inverse solutions using genetic algorithms. The results suggest that using optimized ensemble members as an initial population for the genetic algorithms generally enhances the accuracy and speed of the inverse solution; however, use of the ensemble data to restrict parameter search space yields mixed results. Inaccurate results are related to parameterization of the ensemble members' refractivity profile and the subsequent extraction of the parameter ranges to limit the search space.
Application Bayesian Model Averaging method for ensemble system for Poland
NASA Astrophysics Data System (ADS)
Guzikowski, Jakub; Czerwinska, Agnieszka
2014-05-01
The aim of the project is to evaluate methods for generating numerical ensemble weather predictions using meteorological data from the Weather Research & Forecasting (WRF) Model and calibrating these data by means of a Bayesian Model Averaging (WRF BMA) approach. We are constructing high-resolution short-range ensemble forecasts using meteorological data (temperature) generated by nine WRF models. The WRF models have 35 vertical levels and 2.5 km x 2.5 km horizontal resolution. The key point is that the ensemble members differ in their parameterization of the physical phenomena occurring in the boundary layer. To calibrate an ensemble forecast we use the Bayesian Model Averaging (BMA) approach. The BMA predictive Probability Density Function (PDF) is a weighted average of predictive PDFs associated with each individual ensemble member, with weights that reflect the member's relative skill. As a test we chose a case with heat wave and convective weather conditions over Poland from 23rd July to 1st August 2013. From 23rd July to 29th July 2013 the temperature fluctuated around 30 degrees Celsius at many meteorological stations and new temperature records were set. During this time an increase in the number of patients hospitalized with cardiovascular problems was registered. On 29th July 2013 an advection of moist tropical air masses was recorded over Poland, causing a strong convective event with a mesoscale convective system (MCS). The MCS caused local flooding, damage to transport infrastructure, destroyed buildings and trees, injuries, and direct threat to life. The meteorological data from the ensemble system are compared with data recorded at 74 weather stations located in Poland. We prepare a set of model-observation pairs. Then, the data obtained from single ensemble members and the median from the WRF BMA system are evaluated using the deterministic error statistics Root Mean Square Error (RMSE) and Mean Absolute Error (MAE).
To evaluate the probabilistic data, the Brier Score (BS) and Continuous Ranked Probability Score (CRPS) were used. Finally, a comparison between the BMA-calibrated data and data from the ensemble members will be displayed.
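The weighted-average form of the BMA predictive PDF described in this abstract can be illustrated with a minimal sketch. It assumes Gaussian member kernels with one common spread, which is a common choice for temperature but is not stated in the abstract. The function names, the three forecast values and the weights are hypothetical; in practice the weights and spread would be estimated from a training period rather than set by hand.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and spread."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bma_pdf(x, member_forecasts, weights, sigma):
    """BMA predictive density: weighted average of member-centred kernels.

    Assumption: every member kernel is Gaussian with the same spread sigma.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("BMA weights must sum to 1")
    return sum(w * normal_pdf(x, f, sigma)
               for w, f in zip(weights, member_forecasts))

def bma_mean(member_forecasts, weights):
    """Mean of the BMA mixture, usable as a deterministic forecast."""
    return sum(w * f for w, f in zip(weights, member_forecasts))

# Hypothetical temperature forecasts (deg C) from three members and
# hypothetical weights reflecting each member's relative skill.
forecasts = [29.0, 31.0, 33.0]
weights = [0.5, 0.3, 0.2]
sigma = 1.5

print(bma_mean(forecasts, weights))
print(bma_pdf(30.0, forecasts, weights, sigma))
```

Because the weights sum to one, the mixture is itself a valid PDF, so probabilities such as the chance of exceeding 30 degrees Celsius can be read directly from it.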
Majid, Abdul; Ali, Safdar
2015-01-01
We developed genetic programming (GP)-based evolutionary ensemble system for the early diagnosis, prognosis and prediction of human breast cancer. This system has effectively exploited the diversity in feature and decision spaces. First, individual learners are trained in different feature spaces using physicochemical properties of protein amino acids. Their predictions are then stacked to develop the best solution during GP evolution process. Finally, results for HBC-Evo system are obtained with optimal threshold, which is computed using particle swarm optimization. Our novel approach has demonstrated promising results compared to state of the art approaches.
Quasi-most unstable modes: a window to 'À la carte' ensemble diversity?
NASA Astrophysics Data System (ADS)
Homar Santaner, Victor; Stensrud, David J.
2010-05-01
The atmospheric scientific community is nowadays facing the ambitious challenge of providing useful forecasts of atmospheric events that produce high societal impact. The low level of social resilience to false alarms creates tremendous pressure on forecasting offices to issue accurate, timely and reliable warnings. Currently, no operational numerical forecasting system is able to respond to the societal demand for high-resolution (in time and space) predictions in the 12-72h time span. The main reasons for such deficiencies are the lack of adequate observations and the high non-linearity of the numerical models that are currently used. The whole weather forecasting problem is intrinsically probabilistic and current methods aim at coping with the various sources of uncertainties and the error propagation throughout the forecasting system. This probabilistic perspective is often created by generating ensembles of deterministic predictions that are aimed at sampling the most important sources of uncertainty in the forecasting system. The ensemble generation/sampling strategy is a crucial aspect of their performance and various methods have been proposed. Although global forecasting offices have been using ensembles of perturbed initial conditions for medium-range operational forecasts since 1994, no consensus exists regarding the optimum sampling strategy for high resolution short-range ensemble forecasts. Bred vectors, however, have been hypothesized to better capture the growing modes in the highly nonlinear mesoscale dynamics of severe episodes than singular vectors or observation perturbations. Yet even this technique is not able to produce enough diversity in the ensembles to accurately and routinely predict extreme phenomena such as severe weather. Thus, we propose a new method to generate ensembles of initial conditions perturbations that is based on the breeding technique.
Given a standard bred mode, a set of customized perturbations is derived with specified amplitudes and horizontal scales. This allows the ensemble to excite growing modes across a wider range of scales. Results show that this approach produces significantly more spread in the ensemble prediction than standard bred modes alone. Several examples that illustrate the benefits from this approach for severe weather forecasts will be provided.
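The standard breeding cycle that this abstract builds on can be sketched on a toy system. The example below is an illustration only: it uses the Lorenz-63 equations with a forward-Euler step as a stand-in for a forecast model, and the function names, step size, cycle length and amplitude are all assumptions. It does not implement the customized amplitude and scale selection proposed by the authors.

```python
import math

def lorenz63_step(state, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 system (toy 'forecast model')."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def integrate(state, n_steps):
    """Advance a state by n_steps model steps."""
    for _ in range(n_steps):
        state = lorenz63_step(state)
    return state

def breed(control, perturbation, n_cycles, steps_per_cycle, amplitude):
    """Run breeding cycles and return the final control state and bred vector.

    Each cycle integrates a control and a perturbed run, takes their
    difference, and rescales it back to the prescribed amplitude.
    """
    for _ in range(n_cycles):
        perturbed = tuple(c + p for c, p in zip(control, perturbation))
        control = integrate(control, steps_per_cycle)
        perturbed = integrate(perturbed, steps_per_cycle)
        diff = [p - c for p, c in zip(perturbed, control)]
        norm = math.sqrt(sum(d * d for d in diff))
        perturbation = tuple(d * amplitude / norm for d in diff)
    return control, perturbation

final_control, bred_vector = breed(
    control=(1.0, 1.0, 1.0),
    perturbation=(1e-3, 0.0, 0.0),
    n_cycles=20,
    steps_per_cycle=100,
    amplitude=1e-3,
)
print(bred_vector)
```

After repeated rescaling the perturbation tends to align with the fastest-growing directions of the flow, which is why bred vectors are used to seed ensemble initial conditions.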
Dynamical predictive power of the generalized Gibbs ensemble revealed in a second quench.
Zhang, J M; Cui, F C; Hu, Jiangping
2012-04-01
We show that a quenched and relaxed completely integrable system is hardly distinguishable from the corresponding generalized Gibbs ensemble in a dynamical sense. To be specific, the response of the quenched and relaxed system to a second quench can be accurately reproduced by using the generalized Gibbs ensemble as a substitute. Remarkably, as demonstrated with the transverse Ising model and the hard-core bosons in one dimension, not only the steady values but even the transient, relaxation dynamics of the physical variables can be accurately reproduced by using the generalized Gibbs ensemble as a pseudoinitial state. This result is an important complement to the previously established result that a quenched and relaxed system is hardly distinguishable from the generalized Gibbs ensemble in a static sense. The relevance of the generalized Gibbs ensemble in the nonequilibrium dynamics of completely integrable systems is then greatly strengthened.
NCEP/NLDAS Drought Monitoring and Prediction
NASA Astrophysics Data System (ADS)
Xia, Y.; Ek, M.; Wood, E.; Luo, L.; Sheffield, J.; Lettenmaier, D.; Livneh, B.; Cosgrove, B.; Mocko, D.; Meng, J.; Wei, H.; Restrepo, P.; Schaake, J.; Mo, K.
2009-05-01
The NCEP Environmental Modeling Center (EMC) collaborated with its CPPA (Climate Prediction Program of the Americas) partners to develop a North American Land Data Assimilation System (NLDAS, http://www.emc.ncep.noaa.gov/mmb/nldas) to monitor and predict drought over the Continental United States (CONUS). The realtime NLDAS drought monitor, executed daily at NCEP/EMC, includes daily, weekly and monthly anomalies and percentiles of six fields (soil moisture, snow water equivalent, total runoff, streamflow, evaporation, precipitation) output from four land surface models (Noah, Mosaic, SAC, and VIC) on a common 1/8th degree grid using common hourly land surface forcing. The non-precipitation surface forcing is derived from NCEP's retrospective and realtime North American Regional Reanalysis System (NARR). The precipitation forcing is anchored to a daily gauge-only precipitation analysis over CONUS that applies a Parameter-elevation Regressions on Independent Slopes Model (PRISM) correction. This daily precipitation analysis is then temporally disaggregated to hourly precipitation amounts using radar and satellite precipitation. The NARR-based surface downward solar radiation is bias-corrected using seven years (1997-2004) of GOES satellite-derived solar radiation retrievals. The uncoupled ensemble seasonal drought prediction utilizes the following three independent approaches for generating downscaled ensemble seasonal forecasts of surface forcing: (1) Ensemble Streamflow Prediction, (2) CPC Official Seasonal Climate Outlook, and (3) NCEP CFS ensemble dynamical model prediction. For each of these three approaches, twenty ensemble members of forcing realizations are generated using a Bayesian merging algorithm developed by Princeton University. The three forcing methods are then used to drive the VIC model in seasonal prediction mode over thirteen large river basins that together span the CONUS domain.
One to nine month ensemble seasonal prediction products such as air temperature, precipitation, soil moisture, snowpack, total runoff, evaporation and streamflow are derived for each forcing approach. The anomalies and percentiles of the predicted products for each approach may be used for CONUS drought prediction. This system is executed at the beginning of each month and distributes its products by the 10th of each month. The prediction products are evaluated using corresponding monitoring products for the VIC model and are compared with the prediction products from other research groups (e.g., University of Washington at Seattle, NASA Goddard) in the CONUS.
A Technical Analysis Information Fusion Approach for Stock Price Analysis and Modeling
NASA Astrophysics Data System (ADS)
Lahmiri, Salim
In this paper, we address the problem of technical analysis information fusion in improving stock market index-level prediction. We present an approach for analyzing stock market price behavior based on different categories of technical analysis metrics and a multiple predictive system. Each category of technical analysis measures is used to characterize stock market price movements. The presented predictive system is based on an ensemble of neural networks (NN) coupled with particle swarm intelligence for parameter optimization where each single neural network is trained with a specific category of technical analysis measures. The experimental evaluation on three international stock market indices and three individual stocks shows that the presented ensemble-based technical indicators fusion system significantly improves forecasting accuracy in comparison with a single NN. Also, it outperforms the classical neural network trained with index-level lagged values and the NN trained with stationary wavelet transform details and approximation coefficients. As a result, technical information fusion in an NN ensemble architecture helps improve prediction accuracy.
2015-06-19
effective and scientifically valid method of making comparisons of clothing and equipment changes prior to conducting human research. predictive modeling...valid method of making comparisons of clothing and equipment changes prior to conducting human research. 2 INTRODUCTION Modern day...clothing and equipment changes prior to conducting human research. METHODS Ensembles Three different body armor (BA) plus clothing ensembles were
Impact of Damping Uncertainty on SEA Model Response Variance
NASA Technical Reports Server (NTRS)
Schiller, Noah; Cabell, Randolph; Grosveld, Ferdinand
2010-01-01
Statistical Energy Analysis (SEA) is commonly used to predict high-frequency vibroacoustic levels. This statistical approach provides the mean response over an ensemble of random subsystems that share the same gross system properties such as density, size, and damping. Recently, techniques have been developed to predict the ensemble variance as well as the mean response. However these techniques do not account for uncertainties in the system properties. In the present paper uncertainty in the damping loss factor is propagated through SEA to obtain more realistic prediction bounds that account for both ensemble and damping variance. The analysis is performed on a floor-equipped cylindrical test article that resembles an aircraft fuselage. Realistic bounds on the damping loss factor are determined from measurements acquired on the sidewall of the test article. The analysis demonstrates that uncertainties in damping have the potential to significantly impact the mean and variance of the predicted response.
NASA Astrophysics Data System (ADS)
Huang, Ling; Luo, Yali
2017-08-01
Based on The Observing System Research and Predictability Experiment Interactive Grand Global Ensemble (TIGGE) data set, this study evaluates the ability of global ensemble prediction systems (EPSs) from the European Centre for Medium-Range Weather Forecasts (ECMWF), U.S. National Centers for Environmental Prediction, Japan Meteorological Agency (JMA), Korean Meteorological Administration, and China Meteorological Administration (CMA) to predict presummer rainy season (April-June) precipitation in south China. Evaluation of 5 day forecasts in three seasons (2013-2015) demonstrates the higher skill of probability matching forecasts compared to simple ensemble mean forecasts and shows that the deterministic forecast is a close second. The EPSs overestimate light-to-heavy rainfall (0.1 to 30 mm/12 h) and underestimate heavier rainfall (>30 mm/12 h), with JMA being the worst. By analyzing the synoptic situations predicted by the identified more skillful (ECMWF) and less skillful (JMA and CMA) EPSs and the ensemble sensitivity for four representative cases of torrential rainfall, the transport of warm-moist air into south China by the low-level southwesterly flow, upstream of the torrential rainfall regions, is found to be a key synoptic factor that controls the quantitative precipitation forecast. The results also suggest that prediction of locally produced torrential rainfall is more challenging than prediction of more extensively distributed torrential rainfall. A slight improvement in the performance is obtained by shortening the forecast lead time from 30-36 h to 18-24 h to 6-12 h for the cases with large-scale forcing, but not for the locally produced cases.
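The difference between a simple ensemble mean and a probability-matched forecast can be made concrete with a small sketch. It follows one common formulation of probability matching: the ensemble mean supplies the spatial pattern, and the pooled member values supply the amounts. The abstract does not say which variant the authors used, so the choice of keeping every n-th pooled value, the function name and the sample rainfall values are assumptions.

```python
def probability_matched_mean(members):
    """Probability-matched mean of equally sized rainfall fields.

    members: list of lists, one flattened rainfall field per ensemble member.
    Assumption: amounts are taken as every n-th value of the pooled,
    descending-sorted member values, where n is the ensemble size.
    """
    n_members = len(members)
    n_cells = len(members[0])
    # The simple ensemble mean determines where rain falls.
    mean_field = [sum(m[c] for m in members) / n_members
                  for c in range(n_cells)]
    # The pooled distribution determines how much rain falls.
    pooled = sorted((v for m in members for v in m), reverse=True)
    amounts = pooled[::n_members]
    # Cells ordered from largest to smallest ensemble-mean rainfall.
    order = sorted(range(n_cells), key=lambda c: mean_field[c], reverse=True)
    result = [0.0] * n_cells
    for amount, cell in zip(amounts, order):
        result[cell] = amount
    return result

# Hypothetical 12 h rainfall (mm) at three grid cells from two members.
members = [[0.0, 5.0, 10.0],
           [2.0, 3.0, 20.0]]
print(probability_matched_mean(members))  # [2.0, 5.0, 20.0]
```

In this toy case the simple mean at the wettest cell is 15 mm, while the probability-matched value is 20 mm, which shows how the method restores heavy amounts that averaging smooths out.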
NASA Astrophysics Data System (ADS)
Wang, S.; Huang, G. H.; Baetz, B. W.; Cai, X. M.; Ancell, B. C.; Fan, Y. R.
2017-11-01
The ensemble Kalman filter (EnKF) is recognized as a powerful data assimilation technique that generates an ensemble of model variables through stochastic perturbations of forcing data and observations. However, relatively little guidance exists with regard to the proper specification of the magnitude of the perturbation and the ensemble size, posing a significant challenge in optimally implementing the EnKF. This paper presents a robust data assimilation system (RDAS), in which a multi-factorial design of the EnKF experiments is first proposed for hydrologic ensemble predictions. A multi-way analysis of variance is then used to examine potential interactions among factors affecting the EnKF experiments, achieving optimality of the RDAS with maximized performance of hydrologic predictions. The RDAS is applied to the Xiangxi River watershed which is the most representative watershed in China's Three Gorges Reservoir region to demonstrate its validity and applicability. Results reveal that the pairwise interaction between perturbed precipitation and streamflow observations has the most significant impact on the performance of the EnKF system, and their interactions vary dynamically across different settings of the ensemble size and the evapotranspiration perturbation. In addition, the interactions among experimental factors vary greatly in magnitude and direction depending on different statistical metrics for model evaluation including the Nash-Sutcliffe efficiency and the Box-Cox transformed root-mean-square error. It is thus necessary to test various evaluation metrics in order to enhance the robustness of hydrologic prediction systems.
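The roles of the ensemble size and the observation perturbation in the EnKF can be seen in a minimal sketch of the analysis step. The example is deliberately reduced to a single scalar state that is observed directly, so the Kalman gain becomes a ratio of variances. The function name and sample values are hypothetical, and this is not the RDAS implementation from the paper.

```python
import random
import statistics

def enkf_update(forecast_members, observation, obs_error_std, rng):
    """Stochastic EnKF analysis step for a directly observed scalar state.

    Each member is nudged toward its own perturbed copy of the observation,
    with a gain computed from the ensemble (sample) variance.
    """
    prior_var = statistics.variance(forecast_members)
    gain = prior_var / (prior_var + obs_error_std ** 2)
    analysis = []
    for x in forecast_members:
        # Perturbing the observation keeps the analysis spread consistent
        # with the assumed observation error.
        perturbed_obs = observation + rng.gauss(0.0, obs_error_std)
        analysis.append(x + gain * (perturbed_obs - x))
    return analysis, gain

# Hypothetical streamflow forecasts (m^3/s) and one gauge observation.
rng = random.Random(42)
members = [10.0, 12.0, 14.0, 16.0]
analysis, gain = enkf_update(members, observation=15.0,
                             obs_error_std=2.0, rng=rng)
print(gain)      # 0.625
print(analysis)
```

Both the assumed observation error and the number of members enter the gain through the variance terms, which is why their specification affects the quality of the resulting predictions.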
NASA Astrophysics Data System (ADS)
Kunii, Masaru; Saito, Kazuo; Seko, Hiromu; Hara, Masahiro; Hara, Tabito; Yamaguchi, Munehiko; Gong, Jiandong; Charron, Martin; Du, Jun; Wang, Yong; Chen, Dehui
2011-05-01
During the period around the Beijing 2008 Olympic Games, the Beijing 2008 Olympics Research and Development Project (B08RDP) was conducted as part of the World Weather Research Program short-range weather forecasting research project. Mesoscale ensemble prediction (MEP) experiments were carried out by six organizations in near-real time, in order to share their experiences in the development of MEP systems. The purpose of this study is to objectively verify these experiments and to clarify the problems associated with the current MEP systems through the same experiences. Verification was performed using the MEP outputs interpolated into a common verification domain with a horizontal resolution of 15 km. For all systems, the ensemble spreads grew as the forecast time increased, and the ensemble mean improved the forecast errors compared with individual control forecasts in the verification against the analysis fields. However, each system exhibited individual characteristics according to the MEP method. Some participants used physical perturbation methods. The significance of these methods was confirmed by the verification. However, the mean error (ME) of the ensemble forecast in some systems was worse than that of the individual control forecast. This result suggests that it is necessary to pay careful attention to physical perturbations.
Performance of Trajectory Models with Wind Uncertainty
NASA Technical Reports Server (NTRS)
Lee, Alan G.; Weygandt, Stephen S.; Schwartz, Barry; Murphy, James R.
2009-01-01
Typical aircraft trajectory predictors use wind forecasts but do not account for the forecast uncertainty. A method for generating estimates of wind prediction uncertainty is described and its effect on aircraft trajectory prediction uncertainty is investigated. The procedure for estimating the wind prediction uncertainty relies on a time-lagged ensemble of weather model forecasts from the hourly updated Rapid Update Cycle (RUC) weather prediction system. Forecast uncertainty is estimated using measures of the spread amongst various RUC time-lagged ensemble forecasts. This proof of concept study illustrates the estimated uncertainty and the actual wind errors, and documents the validity of the assumed ensemble-forecast accuracy relationship. Aircraft trajectory predictions are made using RUC winds with provision for the estimated uncertainty. Results for a set of simulated flights indicate this simple approach effectively translates the wind uncertainty estimate into an aircraft trajectory uncertainty. A key strength of the method is the ability to relate uncertainty to specific weather phenomena (contained in the various ensemble members) allowing identification of regional variations in uncertainty.
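The idea of a time-lagged ensemble can be sketched in a few lines: forecasts started at different hours but valid at the same time are treated as ensemble members, and their spread serves as the uncertainty estimate. The data layout, function name and wind values below are hypothetical, and the standard deviation is only one possible spread measure; the abstract does not specify which measures were used.

```python
import statistics

def time_lagged_spread(forecasts_by_init, valid_hour):
    """Mean and spread of all runs that provide a forecast for valid_hour.

    forecasts_by_init maps an initialization hour to a dictionary of
    {valid hour: forecast wind speed}.
    """
    members = [run[valid_hour] for run in forecasts_by_init.values()
               if valid_hour in run]
    if len(members) < 2:
        raise ValueError("need at least two lagged forecasts for a spread")
    return statistics.mean(members), statistics.stdev(members)

# Hypothetical wind speed forecasts (m/s) from three successive hourly runs.
runs = {
    0: {6: 20.0, 7: 21.0},
    1: {6: 22.0, 7: 21.5},
    2: {6: 24.0, 7: 22.0},
}
mean_wind, spread = time_lagged_spread(runs, valid_hour=6)
print(mean_wind, spread)  # 22.0 2.0
```

A large spread between successive runs suggests the forecast is changing from cycle to cycle, which is the signal the method uses as a proxy for wind uncertainty.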
The Canadian seasonal forecast and the APCC exchange.
NASA Astrophysics Data System (ADS)
Archambault, B.; Fontecilla, J.; Kharin, V.; Bourgouin, P.; Ashok, K.; Lee, D.
2009-05-01
In this talk, we will first describe the Canadian seasonal forecast system. This system uses a four-model ensemble approach, with each of these models generating a 10-member ensemble. Multi-model issues related to this system will be described. Secondly, we will describe an international multi-system initiative. The Asia-Pacific Economic Cooperation (APEC) is a forum for 21 Pacific Rim countries or regions including Canada. The APEC Climate Center (APCC) provides seasonal forecasts to their regional climate centers with a Multi Model Ensemble (MME) approach. The APCC MME is based on 13 ensemble prediction systems from different institutions including MSC (Canada), NCEP (USA), COLA (USA), KMA (Korea), JMA (Japan), BOM (Australia) and others. In this presentation, we will describe the basics of this international cooperation.
Enhancing Flood Prediction Reliability Using Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Liu, Z.; Merwade, V.
2017-12-01
Uncertainty analysis is an indispensable part of modeling the hydrology and hydrodynamics of non-idealized environmental systems. Compared to reliance on prediction from one model simulation, using an ensemble of predictions that considers uncertainty from different sources is more reliable. In this study, Bayesian model averaging (BMA) is applied to the Black River watershed in Arkansas and Missouri by combining multi-model simulations to get reliable deterministic water stage and probabilistic inundation extent predictions. The simulation ensemble is generated from 81 LISFLOOD-FP subgrid model configurations that include uncertainty from channel shape, channel width, channel roughness and discharge. Model simulation outputs are trained with observed water stage data during one flood event, and BMA prediction ability is validated for another flood event. Results from this study indicate that BMA does not always outperform all members in the ensemble, but it provides relatively robust deterministic flood stage predictions across the basin. Station-based BMA (BMA_S) water stage prediction performs better than global-based BMA (BMA_G) prediction, which in turn is superior to the ensemble mean prediction. Additionally, the high-frequency flood inundation extent (probability greater than 60%) in the BMA_G probabilistic map is more accurate than the probabilistic flood inundation extent based on equal weights.
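The contrast between an equal-weight and a weighted probabilistic inundation map can be sketched as follows. Each model configuration contributes a binary wet/dry map, and the inundation probability in a cell is the weighted fraction of configurations that flood it. The function names, the three tiny maps and the weights are hypothetical stand-ins for the 81 configurations and the trained BMA weights in the study.

```python
def inundation_probability(wet_maps, weights):
    """Weighted probability of inundation per cell.

    wet_maps: list of flattened binary maps (1 = wet, 0 = dry), one per model.
    weights:  one weight per model, assumed to sum to 1.
    """
    n_cells = len(wet_maps[0])
    return [sum(w * m[c] for w, m in zip(weights, wet_maps))
            for c in range(n_cells)]

def high_frequency_extent(probabilities, threshold=0.6):
    """Indices of cells whose inundation probability exceeds the threshold."""
    return [c for c, p in enumerate(probabilities) if p > threshold]

# Hypothetical wet/dry maps from three model configurations, four cells each.
wet_maps = [[1, 1, 0, 0],
            [1, 0, 1, 0],
            [1, 1, 1, 0]]
equal_weights = [1.0 / 3.0] * 3
bma_weights = [0.6, 0.3, 0.1]   # hypothetical skill-based weights

print(high_frequency_extent(inundation_probability(wet_maps, equal_weights)))
print(high_frequency_extent(inundation_probability(wet_maps, bma_weights)))
```

With equal weights, cells 0, 1 and 2 exceed the 60% threshold, whereas the skill-based weights keep only cells 0 and 1, because the configurations that flood cell 2 carry less weight.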
NASA Astrophysics Data System (ADS)
Saleh, Firas; Ramaswamy, Venkatsundar; Georgas, Nickitas; Blumberg, Alan F.; Pullen, Julie
2016-07-01
This paper investigates the uncertainties in hourly streamflow ensemble forecasts for an extreme hydrological event using a hydrological model forced with short-range ensemble weather prediction models. A state-of-the-art, automated, short-term hydrologic prediction framework was implemented using GIS and a regional scale hydrological model (HEC-HMS). The hydrologic framework was applied to the Hudson River basin (~36,000 km²) in the United States using gridded precipitation data from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) and was validated against streamflow observations from the United States Geological Survey (USGS). Finally, 21 precipitation ensemble members of the latest Global Ensemble Forecast System (GEFS/R) were forced into HEC-HMS to generate a retrospective streamflow ensemble forecast for an extreme hydrological event, Hurricane Irene. The work shows that ensemble stream discharge forecasts provide improved predictions and useful information about associated uncertainties, thus improving the assessment of risks when compared with deterministic forecasts. The uncertainties in weather inputs may result in false warnings and missed river flooding events, reducing the potential to effectively mitigate flood damage. The findings demonstrate how errors in the ensemble median streamflow forecast and time of peak, as well as the ensemble spread (uncertainty), are reduced 48 h pre-event by utilizing the ensemble framework. The methodology and implications of this work benefit efforts of short-term streamflow forecasts at regional scales, notably regarding the peak timing of an extreme hydrologic event when combined with a flood threshold exceedance diagram. Although the modeling framework was implemented on the Hudson River basin, it is flexible and applicable in other parts of the world where atmospheric reanalysis products and streamflow data are available.
Multi-Resolution Climate Ensemble Parameter Analysis with Nested Parallel Coordinates Plots.
Wang, Junpeng; Liu, Xiaotong; Shen, Han-Wei; Lin, Guang
2017-01-01
Due to the uncertain nature of weather prediction, climate simulations are usually performed multiple times with different spatial resolutions. The outputs of simulations are multi-resolution spatial temporal ensembles. Each simulation run uses a unique set of values for multiple convective parameters. Distinct parameter settings from different simulation runs in different resolutions constitute a multi-resolution high-dimensional parameter space. Understanding the correlation between the different convective parameters, and establishing a connection between the parameter settings and the ensemble outputs are crucial to domain scientists. The multi-resolution high-dimensional parameter space, however, presents a unique challenge to the existing correlation visualization techniques. We present Nested Parallel Coordinates Plot (NPCP), a new type of parallel coordinates plots that enables visualization of intra-resolution and inter-resolution parameter correlations. With flexible user control, NPCP integrates superimposition, juxtaposition and explicit encodings in a single view for comparative data visualization and analysis. We develop an integrated visual analytics system to help domain scientists understand the connection between multi-resolution convective parameters and the large spatial temporal ensembles. Our system presents intricate climate ensembles with a comprehensive overview and on-demand geographic details. We demonstrate NPCP, along with the climate ensemble visualization system, based on real-world use-cases from our collaborators in computational and predictive science.
NASA Astrophysics Data System (ADS)
Saito, Kazuo; Hara, Masahiro; Kunii, Masaru; Seko, Hiromu; Yamaguchi, Munehiko
2011-05-01
Different initial perturbation methods for the mesoscale ensemble prediction were compared by the Meteorological Research Institute (MRI) as a part of the intercomparison of mesoscale ensemble prediction systems (EPSs) of the World Weather Research Programme (WWRP) Beijing 2008 Olympics Research and Development Project (B08RDP). Five initial perturbation methods for mesoscale ensemble prediction were developed for B08RDP and compared at MRI: (1) a downscaling method of the Japan Meteorological Agency (JMA)'s operational one-week EPS (WEP), (2) a targeted global model singular vector (GSV) method, (3) a mesoscale model singular vector (MSV) method based on the adjoint model of the JMA non-hydrostatic model (NHM), (4) a mesoscale breeding growing mode (MBD) method based on the NHM forecast and (5) a local ensemble transform (LET) method based on the local ensemble transform Kalman filter (LETKF) using NHM. These perturbation methods were applied to the preliminary experiments of the B08RDP Tier-1 mesoscale ensemble prediction with a horizontal resolution of 15 km. To make the comparison easier, the same horizontal resolution (40 km) was employed for the three mesoscale model-based initial perturbation methods (MSV, MBD and LET). The GSV method completely outperformed the WEP method, confirming the advantage of targeting in mesoscale EPS. The GSV method generally performed well with regard to root mean square errors of the ensemble mean, large growth rates of ensemble spreads throughout the 36-h forecast period, and high detection rates and high Brier skill scores (BSSs) for weak rains. On the other hand, the mesoscale model-based initial perturbation methods showed good detection rates and BSSs for intense rains. The MSV method showed a rapid growth in the ensemble spread of precipitation up to a forecast time of 6 h, which suggests suitability of the mesoscale SV for short-range EPSs, but the initial large growth of the perturbation did not last long. 
The performance of the MBD method was good for ensemble prediction of intense rain with a relatively small computing cost. The LET method showed similar characteristics to the MBD method, but the spread and growth rate were slightly smaller and the relative operating characteristic area skill score and BSS did not surpass those of MBD. These characteristic features of the five methods were confirmed by checking the evolution of the total energy norms and their growth rates. Characteristics of the initial perturbations obtained by four methods (GSV, MSV, MBD and LET) were examined for the case of a synoptic low-pressure system passing over eastern China. With GSV and MSV, the regions of large spread were near the low-pressure system, but with MSV, the distribution was more concentrated on the mesoscale disturbance. On the other hand, large-spread areas were observed southwest of the disturbance in MBD and LET. The horizontal pattern of LET perturbation was similar to that of MBD, but the amplitude of the LET perturbation reflected the observation density.
NCAR's Experimental Real-time Convection-allowing Ensemble Prediction System
NASA Astrophysics Data System (ADS)
Schwartz, C. S.; Romine, G. S.; Sobash, R.; Fossell, K.
2016-12-01
Since April 2015, the National Center for Atmospheric Research's (NCAR's) Mesoscale and Microscale Meteorology (MMM) Laboratory, in collaboration with NCAR's Computational Information Systems Laboratory (CISL), has been producing daily, real-time, 10-member, 48-hr ensemble forecasts with 3-km horizontal grid spacing over the conterminous United States (http://ensemble.ucar.edu). These computationally-intensive, next-generation forecasts are produced on the Yellowstone supercomputer, have been embraced by both amateur and professional weather forecasters, are widely used by NCAR and university researchers, and receive considerable attention on social media. Initial conditions are supplied by NCAR's Data Assimilation Research Testbed (DART) software and the forecast model is NCAR's Weather Research and Forecasting (WRF) model; both WRF and DART are community tools. This presentation will focus on cutting-edge research results leveraging the ensemble dataset, including winter weather predictability, severe weather forecasting, and power outage modeling. Additionally, the unique design of the real-time analysis and forecast system and computational challenges and solutions will be described.
Ocean state and uncertainty forecasts using HYCOM with Local Ensemble Transfer Kalman Filter (LETKF)
NASA Astrophysics Data System (ADS)
Wei, Mozheng; Hogan, Pat; Rowley, Clark; Smedstad, Ole-Martin; Wallcraft, Alan; Penny, Steve
2017-04-01
An ensemble forecast system based on the US Navy's operational HYCOM using Local Ensemble Transfer Kalman Filter (LETKF) technology has been developed for ocean state and uncertainty forecasts. One of the advantages is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates the operational observations using an ensemble method. The background covariance during this assimilation process is supplied by the ensemble, which avoids the difficulty of developing tangent linear and adjoint models for 4D-VAR from the complicated hybrid isopycnal vertical coordinate in HYCOM. Another advantage is that the ensemble system provides a valuable uncertainty estimate corresponding to every state forecast from HYCOM. Uncertainty forecasts have been proven to be critical for downstream users and managers in the numerical prediction community to make more scientifically sound decisions. In addition, the ensemble mean is generally more accurate and skilful than the single traditional deterministic forecast with the same resolution. We will introduce the ensemble system design and setup, present some results from a 30-member ensemble experiment, and discuss scientific, technical and computational issues and challenges, such as covariance localization, inflation, model related uncertainties and sensitivity to the ensemble size.
IASI Radiance Data Assimilation in Local Ensemble Transform Kalman Filter
NASA Astrophysics Data System (ADS)
Cho, K.; Hyoung-Wook, C.; Jo, Y.
2016-12-01
The Korea Institute of Atmospheric Prediction Systems (KIAPS) is developing an NWP model with data assimilation systems. A Local Ensemble Transform Kalman Filter (LETKF) system, one of these data assimilation systems, has been developed for the KIAPS Integrated Model (KIM), which is based on a cubed-sphere grid, and has successfully assimilated real data. The LETKF data assimilation system has been extended to 4D-LETKF, which considers the time-evolving error covariance within the assimilation window, and to IASI radiance data assimilation using KPOP (KIAPS package for observation processing) with RTTOV (Radiative Transfer for TOVS). The LETKF system has been running semi-operational prediction with conventional (sonde, aircraft) observations and AMSU-A (Advanced Microwave Sounding Unit-A) radiance data since April. In July, the semi-operational prediction system was updated to include GPS-RO, AMV, and IASI (Infrared Atmospheric Sounding Interferometer) data. A set of KIM simulations with ne30np4 resolution and 50 vertical levels (model top at 0.3 hPa) was carried out for short-range forecasts (10 days) within the semi-operational LETKF prediction system with a 50-member ensemble forecast. To isolate the impact of IASI, our experiments used only conventional and IASI radiance data in the same semi-operational prediction configuration. We carried out a sensitivity test of the IASI thinning method (3D and 4D). The number of IASI observations was increased by temporal (4D) thinning, and an improvement in the impact of IASI radiance data on the forecast skill of the model is expected.
Supermodeling With A Global Atmospheric Model
NASA Astrophysics Data System (ADS)
Wiegerinck, Wim; Burgers, Willem; Selten, Frank
2013-04-01
In weather and climate prediction studies it often turns out to be the case that the multi-model ensemble mean prediction has the best prediction skill scores. One possible explanation is that the major part of the model error is random and is averaged out in the ensemble mean. In the standard multi-model ensemble approach, the models are integrated in time independently and the predicted states are combined a posteriori. Recently an alternative ensemble prediction approach has been proposed in which the models exchange information during the simulation and synchronize on a common solution that is closer to the truth than any of the individual model solutions in the standard multi-model ensemble approach or a weighted average of these. This approach is called the super modeling approach (SUMO). The potential of the SUMO approach has been demonstrated in the context of simple, low-order, chaotic dynamical systems. The information exchange takes the form of linear nudging terms in the dynamical equations that nudge the solution of each model to the solution of all other models in the ensemble. With a suitable choice of the connection strengths the models synchronize on a common solution that is indeed closer to the true system than any of the individual model solutions without nudging. This approach is called connected SUMO. An alternative approach is to integrate a weighted averaged model, weighted SUMO. At each time step all models in the ensemble calculate the tendency, these tendencies are weighted averaged and the state is integrated one time step into the future with this weighted averaged tendency. It was shown that in case the connected SUMO synchronizes perfectly, the connected SUMO follows the weighted averaged trajectory and both approaches yield the same solution. 
In this study we pioneer both approaches in the context of a global, quasi-geostrophic, three-level atmosphere model that is capable of simulating quite realistically the extra-tropical circulation in the Northern Hemisphere winter.
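The weighted SUMO step described above (each model computes a tendency, the tendencies are averaged with weights, and the state is advanced one time step with the averaged tendency) can be sketched as follows. The Lorenz 63 system, the forward Euler scheme, the parameter values and the weights are all assumptions chosen for illustration; the study itself uses a quasi-geostrophic atmosphere model.

```python
def lorenz63_tendency(state, sigma, rho, beta):
    """Time derivative of the Lorenz 63 system for one parameter set."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def weighted_sumo_step(state, param_sets, weights, dt):
    """Advance the shared state one forward Euler step with the
    weighted average of the tendencies of all imperfect models."""
    tendencies = [lorenz63_tendency(state, *params) for params in param_sets]
    combined = [
        sum(w * tend[i] for w, tend in zip(weights, tendencies))
        for i in range(3)
    ]
    return tuple(s + dt * c for s, c in zip(state, combined))


# Two hypothetical imperfect models whose parameters bracket (10, 28, 8/3).
imperfect_models = [(8.0, 24.0, 2.0), (12.0, 32.0, 10.0 / 3.0)]
state = (1.0, 1.0, 1.0)
for _ in range(100):
    state = weighted_sumo_step(state, imperfect_models, (0.5, 0.5), dt=0.01)
```

Because the Lorenz 63 tendency is linear in its parameters, equal weights here reproduce the tendency of the bracketed parameter set exactly; for a general model the weights would have to be learned.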
NASA Astrophysics Data System (ADS)
Multsch, S.; Exbrayat, J.-F.; Kirby, M.; Viney, N. R.; Frede, H.-G.; Breuer, L.
2014-11-01
Irrigation agriculture plays an increasingly important role in food supply. Many evapotranspiration models are used today to estimate the water demand for irrigation. They consider different stages of crop growth by empirical crop coefficients to adapt evapotranspiration throughout the vegetation period. We investigate the importance of the model structural vs. model parametric uncertainty for irrigation simulations by considering six evapotranspiration models and five crop coefficient sets to estimate irrigation water requirements for growing wheat in the Murray-Darling Basin, Australia. The study is carried out using the spatial decision support system SPARE:WATER. We find that structural model uncertainty is far more important than model parametric uncertainty to estimate irrigation water requirement. Using the Reliability Ensemble Averaging (REA) technique, we are able to reduce the overall predictive model uncertainty by more than 10%. The exceedance probability curve of irrigation water requirements shows that a certain threshold, e.g. an irrigation water limit due to water right of 400 mm, would be less frequently exceeded in case of the REA ensemble average (45%) in comparison to the equally weighted ensemble average (66%). We conclude that multi-model ensemble predictions and sophisticated model averaging techniques are helpful in predicting irrigation demand and provide relevant information for decision making.
NASA Astrophysics Data System (ADS)
Wood, Andy; Clark, Elizabeth; Mendoza, Pablo; Nijssen, Bart; Newman, Andy; Clark, Martyn; Nowak, Kenneth; Arnold, Jeffrey
2017-04-01
Many if not most national operational streamflow prediction systems rely on a forecaster-in-the-loop approach that requires the hands-on effort of an experienced human forecaster. This approach evolved from the need to correct for long-standing deficiencies in the models and datasets used in forecasting, and the practice often leads to skillful flow predictions despite the use of relatively simple, conceptual models. Yet the 'in-the-loop' forecast process is not reproducible, which limits opportunities to assess and incorporate new techniques systematically, and the effort required to make forecasts in this way is an obstacle to expanding forecast services - e.g., through adding new forecast locations or more frequent forecast updates, running more complex models, or producing forecasts and hindcasts that can support verification. In the last decade, the hydrologic forecasting community has begun to develop more centralized, 'over-the-loop' systems. The quality of these new forecast products will depend on their ability to leverage research in areas including earth system modeling, parameter estimation, data assimilation, statistical post-processing, weather and climate prediction, verification, and uncertainty estimation through the use of ensembles. Currently, many national operational streamflow forecasting and water management communities have little experience with the strengths and weaknesses of over-the-loop approaches, even as such systems are beginning to be deployed operationally in centers such as ECMWF. There is thus a need both to evaluate these forecasting advances and to demonstrate their potential in a public arena, raising awareness in forecast user communities and development programs alike.
To address this need, the US National Center for Atmospheric Research is collaborating with the University of Washington, the Bureau of Reclamation and the US Army Corps of Engineers, using the NCAR 'System for Hydromet Analysis Research and Prediction Applications' (SHARP) to implement, assess and demonstrate real-time over-the-loop ensemble flow forecasts in a range of US watersheds. The system relies on fully ensemble techniques, including: a 100-member ensemble of meteorological model forcings and an ensemble particle filter data assimilation for initializing watershed states; analog/regression-based downscaling of ensemble weather forecasts from GEFS; and statistical post-processing of ensemble forecast outputs, all of which run in real-time within a workflow managed by ECMWF's ecFlow libraries over large US regional domains. We describe SHARP and present early hindcast and verification results for short to seasonal range streamflow forecasts in a number of US case study watersheds.
NASA Astrophysics Data System (ADS)
Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso
2018-03-01
The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (day 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1 to 7 days weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km². Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios.
In some cases, however, the differences between this scenario and the scenario with postprocessing alone are not as significant. We conclude that implementing both preprocessing and postprocessing ensures the most skill improvements, but postprocessing alone can often be a competitive alternative.
A Diagnostics Tool to detect ensemble forecast system anomaly and guide operational decisions
NASA Astrophysics Data System (ADS)
Park, G. H.; Srivastava, A.; Shrestha, E.; Thiemann, M.; Day, G. N.; Draijer, S.
2017-12-01
The hydrologic community is moving toward using ensemble forecasts to take uncertainty into account during the decision-making process. The New York City Department of Environmental Protection (DEP) implements several types of ensemble forecasts in their decision-making process: ensemble products for a statistical model (Hirsch and enhanced Hirsch); the National Weather Service (NWS) Advanced Hydrologic Prediction Service (AHPS) forecasts based on the classical Ensemble Streamflow Prediction (ESP) technique; and the new NWS Hydrologic Ensemble Forecasting Service (HEFS) forecasts. To remove structural error and apply the forecasts to additional forecast points, the DEP post-processes both the AHPS and the HEFS forecasts. These ensemble forecasts provide mass quantities of complex data, and drawing conclusions from these forecasts is time-consuming and difficult. The complexity of these forecasts also makes it difficult to identify system failures resulting from poor data, missing forecasts, and server breakdowns. To address these issues, we developed a diagnostic tool that summarizes ensemble forecasts and provides additional information such as historical forecast statistics, forecast skill, and model forcing statistics. This additional information highlights the key information that enables operators to evaluate the forecast in real-time, dynamically interact with the data, and review additional statistics, if needed, to make better decisions. We used Bokeh, a Python interactive visualization library, and a multi-database management system to create this interactive tool. This tool compiles and stores data into HTML pages that allow operators to readily analyze the data with built-in user interaction features. This paper will present a brief description of the ensemble forecasts, forecast verification results, and the intended applications for the diagnostic tool.
NASA Astrophysics Data System (ADS)
Liu, Li; Gao, Chao; Xuan, Weidong; Xu, Yue-Ping
2017-11-01
Ensemble flood forecasts by hydrological models using numerical weather prediction products as forcing data are becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system comprised of an automatically calibrated Variable Infiltration Capacity model and quantitative precipitation forecasts from the TIGGE dataset is constructed for the Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by the parallel programmed ε-NSGA II multi-objective algorithm. According to the solutions by ε-NSGA II, two differently parameterized models are determined to simulate daily flows and peak flows at each of the three hydrological stations. Then a simple yet effective modular approach is proposed to combine these daily and peak flows at the same station into one composite series. Five ensemble methods and various evaluation metrics are adopted. The results show that ε-NSGA II can provide an objective determination on parameter estimation, and the parallel program permits a more efficient simulation. It is also demonstrated that the forecasts from ECMWF have more favorable skill scores than other Ensemble Prediction Systems. The multimodel ensembles have advantages over all the single model ensembles and the multimodel methods weighted on members and skill scores outperform other methods. Furthermore, the overall performance at three stations can be satisfactory up to ten days; however, the hydrological errors can degrade the skill score by approximately 2 days, and the influence persists until a lead time of 10 days with a weakening trend. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from single models or multimodels are generally underestimated, indicating that the ensemble mean can bring overall improvement in forecasting of flows.
For peak values, taking flood forecasts from each individual member into account is more appropriate.
Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng
2017-05-18
Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models ( http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/ ).
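The three figures of merit reported above (accuracy, sensitivity, specificity) follow directly from the confusion-matrix counts of a binary classifier. The sketch below is a generic illustration with made-up labels, not the authors' evaluation code; it assumes both classes occur in the true labels, so neither denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels,
    where 1 marks a carcinogen and 0 a non-carcinogen."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # fraction of carcinogens detected
    specificity = tn / (tn + fp)  # fraction of non-carcinogens cleared
    return accuracy, sensitivity, specificity


# Hypothetical labels for five compounds.
acc, sens, spec = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```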
NASA Astrophysics Data System (ADS)
Engelen, R. J.; Peuch, V. H.
2017-12-01
The European Copernicus Atmosphere Monitoring Service (CAMS) operationally provides daily forecasts of global atmospheric composition and regional air quality. The global forecasting system is using ECMWF's Integrated Forecasting System (IFS), which is used for numerical weather prediction and which has been extended with modules for atmospheric chemistry, aerosols and greenhouse gases. The regional forecasts are produced by an ensemble of seven operational European air quality models that take their boundary conditions from the global system and provide an ensemble median with ensemble spread as their main output. Both the global and regional forecasting systems are feeding their output into air quality models on a variety of scales in various parts of the world. We will introduce the CAMS service chain and provide illustrations of its use in downstream applications. Both the usage of the daily forecasts and the usage of global and regional reanalyses will be addressed.
A mesoscale hybrid data assimilation system based on the JMA nonhydrostatic model
NASA Astrophysics Data System (ADS)
Ito, K.; Kunii, M.; Kawabata, T. T.; Saito, K. K.; Duc, L. L.
2015-12-01
This work evaluates the potential of a hybrid ensemble Kalman filter and four-dimensional variational (4D-Var) data assimilation system for predicting severe weather events from a deterministic point of view. This hybrid system is an adjoint-based 4D-Var system using a background error covariance matrix constructed from the mixture of a so-called NMC method and perturbations in a local ensemble transform Kalman filter data assimilation system, both of which are based on the Japan Meteorological Agency nonhydrostatic model. To construct the background error covariance matrix, we investigated two types of schemes. One is a spatial localization scheme and the other is a neighboring ensemble approach, which regards the result at a horizontally spatially shifted point in each ensemble member as that obtained from a different realization of ensemble simulation. An assimilation of a pseudo single-observation located to the north of a tropical cyclone (TC) yielded an analysis increment of wind and temperature physically consistent with what is expected for a mature TC in both hybrid systems, whereas an analysis increment in a 4D-Var system using a static background error covariance distorted the structure of the mature TC. Real data assimilation experiments applied to 4 TCs and 3 local heavy rainfall events showed that hybrid systems and EnKF provided better initial conditions than the NMC-based 4D-Var, both for predicting the intensity and track forecast of TCs and for the location and amount of local heavy rainfall events.
Climatological Observations for Maritime Prediction and Analysis Support Service (COMPASS)
NASA Astrophysics Data System (ADS)
OConnor, A.; Kirtman, B. P.; Harrison, S.; Gorman, J.
2016-02-01
Current US Navy forecasting systems cannot easily incorporate extended-range forecasts that can improve mission readiness and effectiveness; ensure safety; and reduce cost, labor, and resource requirements. If Navy operational planners had systems that incorporated these forecasts, they could plan missions using more reliable and longer-term weather and climate predictions. Further, using multi-model forecast ensembles instead of single forecasts would produce higher predictive performance. Extended-range multi-model forecast ensembles, such as those available in the North American Multi-Model Ensemble (NMME), are ideal for system integration because of their high-skill predictions; however, even higher-skill predictions can be produced if forecast model ensembles are combined correctly. While many methods for weighting models exist, the best method in a given environment requires expert knowledge of the models and combination methods. We present an innovative approach that uses machine learning to combine extended-range predictions from multi-model forecast ensembles and generate a probabilistic forecast for any region of the globe up to 12 months in advance. Our machine-learning approach uses 30 years of hindcast predictions to learn patterns of forecast model successes and failures. Each model is assigned a weight for each environmental condition, 100 km² region, and day given any expected environmental information. These weights are then applied to the respective predictions for the region and time of interest to effectively stitch together a single, coherent probabilistic forecast. Our experimental results demonstrate the benefits of our approach to produce extended-range probabilistic forecasts for regions and time periods of interest that are superior, in terms of skill, to individual NMME forecast models and commonly weighted models.
The probabilistic forecast leverages the strengths of three NMME forecast models to predict environmental conditions for an area spanning from San Diego, CA to Honolulu, HI, seven months in advance. Key findings include: weighted combinations of models are strictly better than individual models; machine-learned combinations are especially better; and forecasts produced using our approach have the highest rank probability skill score most often.
Has the prediction of the South China Sea summer monsoon improved since the late 1970s?
NASA Astrophysics Data System (ADS)
Fan, Yi; Fan, Ke; Tian, Baoqiang
2016-12-01
Based on the evaluation of state-of-the-art coupled ocean-atmosphere general circulation models (CGCMs) from the ENSEMBLES (Ensemble-based Predictions of Climate Changes and Their Impacts) and DEMETER (Development of a European Multimodel Ensemble System for Seasonal to Interannual Prediction) projects, it is found that the prediction of the South China Sea summer monsoon (SCSSM) has improved since the late 1970s. These CGCMs show better skills in prediction of the atmospheric circulation and precipitation within the SCSSM domain during 1979-2005 than that during 1960-1978. Possible reasons for this improvement are investigated. First, the relationship between the SSTs over the tropical Pacific, North Pacific and tropical Indian Ocean, and SCSSM has intensified since the late 1970s. Meanwhile, the SCSSM-related SSTs, with their larger amplitude of interannual variability, have been better predicted. Moreover, the larger amplitude of the interannual variability of the SCSSM and improved initializations for CGCMs after the late 1970s contribute to the better prediction of the SCSSM. In addition, considering that the CGCMs have certain limitations in SCSSM rainfall prediction, we applied the year-to-year increment approach to these CGCMs from the DEMETER and ENSEMBLES projects to improve the prediction of SCSSM rainfall before and after the late 1970s.
Michael J. Erickson; Brian A. Colle; Joseph J. Charney
2012-01-01
The performance of a multimodel ensemble over the northeast United States is evaluated before and after applying bias correction and Bayesian model averaging (BMA). The 13-member Stony Brook University (SBU) ensemble at 0000 UTC is combined with the 21-member National Centers for Environmental Prediction (NCEP) Short-Range Ensemble Forecast (SREF) system at 2100 UTC....
NASA Astrophysics Data System (ADS)
Li, N.; Kinzelbach, W.; Li, H.; Li, W.; Chen, F.; Wang, L.
2017-12-01
Data assimilation techniques are widely used in hydrology to improve the reliability of hydrological models and to reduce model predictive uncertainties. This provides critical information for decision makers in water resources management. This study aims to evaluate a data assimilation system for the Guantao groundwater flow model coupled with a one-dimensional soil column simulation (Hydrus 1D) using an Unbiased Ensemble Square Root Filter (UnEnSRF) originating from the Ensemble Kalman Filter (EnKF) to update parameters and states, separately or simultaneously. To simplify the coupling between unsaturated and saturated zone, a linear relationship obtained from analyzing inputs to and outputs from Hydrus 1D is applied in the data assimilation process. Unlike EnKF, the UnEnSRF updates parameter ensemble mean and ensemble perturbations separately. In order to keep the ensemble filter working well during the data assimilation, two factors are introduced in the study. One is called damping factor to dampen the update amplitude of the posterior ensemble mean to avoid nonrealistic values. The other is called inflation factor to relax the posterior ensemble perturbations close to prior to avoid filter inbreeding problems. The sensitivities of the two factors are studied and their favorable values for the Guantao model are determined. The appropriate observation error and ensemble size were also determined to facilitate the further analysis. This study demonstrated that the data assimilation of both model parameters and states gives a smaller model prediction error but with larger uncertainty while the data assimilation of only model states provides a smaller predictive uncertainty but with a larger model prediction error. 
Data assimilation in a groundwater flow model will improve model prediction and at the same time make the model converge to the true parameters, which provides a successful base for applications in real time modelling or real time controlling strategies in groundwater resources management.
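The two stabilizing factors described above can be sketched for a single scalar parameter. This is a minimal illustration under stated assumptions: the damping factor is applied as a fraction of the mean increment, and the inflation factor is applied as a linear blend of posterior and prior perturbations. The abstract does not give the exact formulas, so both forms, the function name and the sample values are hypothetical.

```python
def damp_and_relax(prior, posterior, damping, relaxation):
    """Post-process a filter update of a scalar ensemble.

    damping:    fraction of the mean increment that is kept (0..1).
    relaxation: weight given to the prior perturbations (0..1);
                0 leaves the posterior perturbations unchanged.
    """
    n = len(prior)
    prior_mean = sum(prior) / n
    post_mean = sum(posterior) / n
    # Damp the mean update to avoid unrealistic jumps.
    new_mean = prior_mean + damping * (post_mean - prior_mean)
    # Relax perturbations toward the prior to retain ensemble spread.
    return [
        new_mean
        + (1.0 - relaxation) * (xa - post_mean)
        + relaxation * (xb - prior_mean)
        for xb, xa in zip(prior, posterior)
    ]


# Hypothetical prior and filter-updated ensembles of one parameter.
updated = damp_and_relax([0.0, 2.0, 4.0], [2.5, 3.0, 3.5], damping=0.5, relaxation=0.5)
```

With these values the mean moves only halfway toward the posterior mean, and the spread stays wider than the raw posterior spread, which is the intended guard against filter inbreeding.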
Prediction of Weather Impacted Airport Capacity using Ensemble Learning
NASA Technical Reports Server (NTRS)
Wang, Yao Xun
2011-01-01
Ensemble learning with the Bagging Decision Tree (BDT) model was used to assess the impact of weather on airport capacities at selected high-demand airports in the United States. The ensemble bagging decision tree models were developed and validated using the Federal Aviation Administration (FAA) Aviation System Performance Metrics (ASPM) data and weather forecast at these airports. The study examines the performance of BDT, along with traditional single Support Vector Machines (SVM), for airport runway configuration selection and airport arrival rates (AAR) prediction during weather impacts. Testing of these models was accomplished using observed weather, weather forecast, and airport operation information at the chosen airports. The experimental results show that ensemble methods are more accurate than a single SVM classifier. The airport capacity ensemble method presented here can be used as a decision support model that supports air traffic flow management to meet the weather impacted airport capacity in order to reduce costs and increase safety.
Verification of Meteorological and Oceanographic Ensemble Forecasts in the U.S. Navy
NASA Astrophysics Data System (ADS)
Klotz, S.; Hansen, J.; Pauley, P.; Sestak, M.; Wittmann, P.; Skupniewicz, C.; Nelson, G.
2013-12-01
The Navy Ensemble Forecast Verification System (NEFVS) has been promoted recently to operational status at the U.S. Navy's Fleet Numerical Meteorology and Oceanography Center (FNMOC). NEFVS processes FNMOC and National Centers for Environmental Prediction (NCEP) meteorological and ocean wave ensemble forecasts, gridded forecast analyses, and innovation (observational) data output by FNMOC's data assimilation system. The NEFVS framework consists of statistical analysis routines, a variety of pre- and post-processing scripts to manage data and plot verification metrics, and a master script to control application workflow. NEFVS computes metrics that include forecast bias, mean-squared error, conditional error, conditional rank probability score, and Brier score. The system also generates reliability and Receiver Operating Characteristic diagrams. In this presentation we describe the operational framework of NEFVS and show examples of verification products computed from ensemble forecasts, meteorological observations, and forecast analyses. The construction and deployment of NEFVS addresses important operational and scientific requirements within Navy Meteorology and Oceanography. These include computational capabilities for assessing the reliability and accuracy of meteorological and ocean wave forecasts in an operational environment, for quantifying effects of changes and potential improvements to the Navy's forecast models, and for comparing the skill of forecasts from different forecast systems. NEFVS also supports the Navy's collaboration with the U.S. Air Force, NCEP, and Environment Canada in the North American Ensemble Forecast System (NAEFS) project and with the Air Force and the National Oceanic and Atmospheric Administration (NOAA) in the National Unified Operational Prediction Capability (NUOPC) program. 
This program is tasked with eliminating unnecessary duplication within the three agencies, accelerating the transition of new technology, such as multi-model ensemble forecasting, to U.S. Department of Defense use, and creating a superior U.S. global meteorological and oceanographic prediction capability. Forecast verification is an important component of NAEFS and NUOPC. Distribution Statement A: Approved for Public Release; distribution is unlimited
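Three of the metrics named in this abstract (forecast bias, mean-squared error, and the Brier score) have standard textbook definitions, sketched below in Python. This is not NEFVS code; the function names and the sample values are hypothetical.

```python
def forecast_bias(forecasts, observations):
    """Mean of (forecast - observation); positive means over-forecasting."""
    return sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)


def mean_squared_error(forecasts, observations):
    """Mean of squared forecast errors."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / len(forecasts)


def event_probability(members, threshold):
    """Fraction of ensemble members that exceed the threshold."""
    return sum(1 for m in members if m > threshold) / len(members)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)


if __name__ == "__main__":
    # Hypothetical wave-height ensemble (m) for one forecast case.
    members = [1.8, 2.1, 2.6, 3.0, 3.4]
    p = event_probability(members, threshold=2.5)
    print("probability of exceeding 2.5 m:", p)
    print("Brier score if the event occurred:", brier_score([p], [1]))
```

A Brier score of 0 is a perfect probabilistic forecast, and lower is better.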
Verification of Meteorological and Oceanographic Ensemble Forecasts in the U.S. Navy
NASA Astrophysics Data System (ADS)
Klotz, S. P.; Hansen, J.; Pauley, P.; Sestak, M.; Wittmann, P.; Skupniewicz, C.; Nelson, G.
2012-12-01
The Navy Ensemble Forecast Verification System (NEFVS) has been promoted recently to operational status at the U.S. Navy's Fleet Numerical Meteorology and Oceanography Center (FNMOC). NEFVS processes FNMOC and National Centers for Environmental Prediction (NCEP) meteorological and ocean wave ensemble forecasts, gridded forecast analyses, and innovation (observational) data output by FNMOC's data assimilation system. The NEFVS framework consists of statistical analysis routines, a variety of pre- and post-processing scripts to manage data and plot verification metrics, and a master script to control application workflow. NEFVS computes metrics that include forecast bias, mean-squared error, conditional error, conditional rank probability score, and Brier score. The system also generates reliability and Receiver Operating Characteristic diagrams. In this presentation we describe the operational framework of NEFVS and show examples of verification products computed from ensemble forecasts, meteorological observations, and forecast analyses. The construction and deployment of NEFVS addresses important operational and scientific requirements within Navy Meteorology and Oceanography (METOC). These include computational capabilities for assessing the reliability and accuracy of meteorological and ocean wave forecasts in an operational environment, for quantifying effects of changes and potential improvements to the Navy's forecast models, and for comparing the skill of forecasts from different forecast systems. NEFVS also supports the Navy's collaboration with the U.S. Air Force, NCEP, and Environment Canada in the North American Ensemble Forecast System (NAEFS) project and with the Air Force and the National Oceanic and Atmospheric Administration (NOAA) in the National Unified Operational Prediction Capability (NUOPC) program. 
This program is tasked with eliminating unnecessary duplication within the three agencies, accelerating the transition of new technology, such as multi-model ensemble forecasting, to U.S. Department of Defense use, and creating a superior U.S. global meteorological and oceanographic prediction capability. Forecast verification is an important component of NAEFS and NUOPC.
NASA Astrophysics Data System (ADS)
Xue, Yan; Wen, C.; Kumar, A.; Balmaseda, M.; Fujii, Y.; Alves, O.; Martin, M.; Yang, X.; Vernieres, G.; Desportes, C.; Lee, T.; Ascione, I.; Gudgel, R.; Ishikawa, I.
2017-12-01
An ensemble of nine operational ocean reanalyses (ORAs) is now routinely collected, and is used to monitor the consistency across the tropical Pacific temperature analyses in real-time in support of ENSO monitoring, diagnostics, and prediction. The ensemble approach allows a more reliable estimate of the signal as well as an estimation of the noise among analyses. The real-time estimation of signal-to-noise ratio assists the prediction of ENSO. The ensemble approach also enables us to estimate the impact of the Tropical Pacific Observing System (TPOS) on the estimation of ENSO-related oceanic indicators. The ensemble mean is shown to have a better accuracy than individual ORAs, suggesting the ensemble approach is an effective tool to reduce uncertainties in temperature analysis for ENSO. The ensemble spread, as a measure of uncertainties in ORAs, is shown to be partially linked to the data counts of in situ observations. Despite the constraints by TPOS data, uncertainties in ORAs are still large in the northwestern tropical Pacific, in the SPCZ region, as well as in the central and northeastern tropical Pacific. The uncertainties in total temperature reduced significantly in 2015 due to the recovery of the TAO/TRITON array to approach the value before the TAO crisis in 2012. However, the uncertainties in anomalous temperature remained much higher than the pre-2012 value, probably due to uncertainties in the reference climatology. This highlights the importance of the long-term stability of the observing system for anomaly monitoring. The current data assimilation systems tend to constrain the solution very locally near the buoy sites, potentially damaging the larger-scale dynamical consistency. So there is an urgent need to improve data assimilation systems so that they can optimize the observation information from TPOS and contribute to improved ENSO prediction.
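The signal and noise estimates mentioned above can be sketched for a single grid point as follows. The abstract does not define its signal-to-noise ratio, so the ratio of the absolute ensemble mean to the sample standard deviation is an assumption here, and the nine anomaly values are invented for illustration.

```python
from statistics import fmean, stdev


def ensemble_summary(analyses):
    """Return (mean, spread, signal-to-noise) for one variable across analyses.

    The mean is taken as the signal and the sample standard deviation as
    the noise; the ratio definition is an assumed, common choice.
    """
    mean = fmean(analyses)
    spread = stdev(analyses)
    snr = abs(mean) / spread if spread > 0 else float("inf")
    return mean, spread, snr


if __name__ == "__main__":
    # Hypothetical temperature anomalies (K) from nine reanalyses.
    anomalies = [1.2, 0.9, 1.4, 1.1, 1.0, 1.3, 0.8, 1.5, 1.1]
    mean, spread, snr = ensemble_summary(anomalies)
    print(f"mean={mean:.2f} K  spread={spread:.2f} K  SNR={snr:.1f}")
```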
NASA Astrophysics Data System (ADS)
Wang, S.; Ancell, B. C.; Huang, G. H.; Baetz, B. W.
2018-03-01
Data assimilation using the ensemble Kalman filter (EnKF) has been increasingly recognized as a promising tool for probabilistic hydrologic predictions. However, little effort has been made to conduct the pre- and post-processing of assimilation experiments, posing a significant challenge in achieving the best performance of hydrologic predictions. This paper presents a unified data assimilation framework for improving the robustness of hydrologic ensemble predictions. Statistical pre-processing of assimilation experiments is conducted through the factorial design and analysis to identify the best EnKF settings with maximized performance. After the data assimilation operation, statistical post-processing analysis is also performed through the factorial polynomial chaos expansion to efficiently address uncertainties in hydrologic predictions, as well as to explicitly reveal potential interactions among model parameters and their contributions to the predictive accuracy. In addition, the Gaussian anamorphosis is used to establish a seamless bridge between data assimilation and uncertainty quantification of hydrologic predictions. Both synthetic and real data assimilation experiments are carried out to demonstrate feasibility and applicability of the proposed methodology in the Guadalupe River basin, Texas. Results suggest that statistical pre- and post-processing of data assimilation experiments provide meaningful insights into the dynamic behavior of hydrologic systems and enhance robustness of hydrologic ensemble predictions.
NASA Astrophysics Data System (ADS)
Clark, Elizabeth; Wood, Andy; Nijssen, Bart; Mendoza, Pablo; Newman, Andy; Nowak, Kenneth; Arnold, Jeffrey
2017-04-01
In an automated forecast system, hydrologic data assimilation (DA) performs the valuable function of correcting raw simulated watershed model states to better represent external observations, including measurements of streamflow, snow, soil moisture, and the like. Yet the incorporation of automated DA into operational forecasting systems has been a long-standing challenge due to the complexities of the hydrologic system, which include numerous lags between state and output variations. To help demonstrate that such methods can succeed in operational automated implementations, we present results from the real-time application of an ensemble particle filter (PF) for short-range (7 day lead) ensemble flow forecasts in western US river basins. We use the System for Hydromet Applications, Research and Prediction (SHARP), developed by the National Center for Atmospheric Research (NCAR) in collaboration with the University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation. SHARP is a fully automated platform for short-term to seasonal hydrologic forecasting applications, incorporating uncertainty in initial hydrologic conditions (IHCs) and in hydrometeorological predictions through ensemble methods. In this implementation, IHC uncertainty is estimated by propagating an ensemble of 100 temperature and precipitation time series through conceptual and physically-oriented models. The resulting ensemble of derived IHCs exhibits a broad range of possible soil moisture and snow water equivalent (SWE) states. The PF selects and/or weights and resamples the IHCs that are most consistent with external streamflow observations, and uses the particles to initialize a streamflow forecast ensemble driven by ensemble precipitation and temperature forecasts downscaled from the Global Ensemble Forecast System (GEFS). 
We apply this method in real time for several basins in the western US that are important for water resources management, and perform a hindcast experiment to evaluate the utility of PF-based data assimilation for streamflow forecast skill. This presentation describes findings, including a comparison of sequential and non-sequential particle weighting methods.
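The weight-and-resample step of a particle filter can be sketched in a few lines of Python. This is not SHARP code: the Gaussian observation likelihood, the systematic resampling scheme, and all names and numbers below are assumptions chosen for illustration.

```python
import math
import random


def gaussian_weights(simulated, observed, obs_sigma):
    """Normalised particle weights from a Gaussian observation likelihood."""
    log_w = [-0.5 * ((s - observed) / obs_sigma) ** 2 for s in simulated]
    peak = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - peak) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling."""
    n = len(weights)
    start = rng.random() / n
    indices = []
    cumulative = weights[0]
    i = 0
    for k in range(n):
        target = start + k / n
        # Advance to the particle whose cumulative weight covers the target.
        while cumulative < target and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


if __name__ == "__main__":
    # Hypothetical simulated flows (m3/s) from candidate initial states.
    simulated_flows = [10.0, 20.0, 30.0, 22.0]
    weights = gaussian_weights(simulated_flows, observed=21.0, obs_sigma=5.0)
    keep = systematic_resample(weights, random.Random(0))
    print("weights:", [round(w, 3) for w in weights])
    print("resampled particle indices:", keep)
```

The resampled indices would then select the initial hydrologic conditions used to start the forecast ensemble.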
An, Yi; Wang, Jiawei; Li, Chen; Leier, André; Marquez-Lago, Tatiana; Wilksch, Jonathan; Zhang, Yang; Webb, Geoffrey I; Song, Jiangning; Lithgow, Trevor
2018-01-01
Bacterial effector proteins secreted by various protein secretion systems play crucial roles in host-pathogen interactions. In this context, computational tools capable of accurately predicting effector proteins of the various types of bacterial secretion systems are highly desirable. Existing computational approaches use different machine learning (ML) techniques and heterogeneous features derived from protein sequences and/or structural information. These predictors differ not only in the ML methods used but also in the curated data sets used, the feature selection and their prediction performance. Here, we provide a comprehensive survey and benchmarking of currently available tools for the prediction of effector proteins of bacterial types III, IV and VI secretion systems (T3SS, T4SS and T6SS, respectively). We review core algorithms, feature selection techniques, tool availability and applicability and evaluate the prediction performance based on carefully curated independent test data sets. In an effort to improve predictive performance, we constructed three ensemble models based on ML algorithms by integrating the output of all individual predictors reviewed. Our benchmarks demonstrate that these ensemble models outperform all the reviewed tools for the prediction of effector proteins of T3SS and T4SS. The webserver of the proposed ensemble methods for T3SS and T4SS effector protein prediction is freely available at http://tbooster.erc.monash.edu/index.jsp. We anticipate that this survey will serve as a useful guide for interested users and that the new ensemble predictors will stimulate research into host-pathogen relationships and inspire the development of new bioinformatics tools for predicting effector proteins of T3SS, T4SS and T6SS. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The Hydrologic Ensemble Prediction Experiment (HEPEX)
NASA Astrophysics Data System (ADS)
Wood, A. W.; Thielen, J.; Pappenberger, F.; Schaake, J. C.; Hartman, R. K.
2012-12-01
The Hydrologic Ensemble Prediction Experiment was established in March, 2004, at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF). With support from the US National Weather Service (NWS) and the European Commission (EC), the HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support in emergency management and water resources sectors. The strategy to meet this goal includes meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. HEPEX has organized about a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Today, the HEPEX mission is to demonstrate the added value of hydrological ensemble prediction systems (HEPS) for emergency management and water resources sectors to make decisions that have important consequences for economy, public health, safety, and the environment. HEPEX is now organised around six major themes that represent core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.
NASA Astrophysics Data System (ADS)
Multsch, S.; Exbrayat, J.-F.; Kirby, M.; Viney, N. R.; Frede, H.-G.; Breuer, L.
2015-04-01
Irrigation agriculture plays an increasingly important role in food supply. Many evapotranspiration models are used today to estimate the water demand for irrigation. They consider different stages of crop growth by empirical crop coefficients to adapt evapotranspiration throughout the vegetation period. We investigate the importance of the model structural versus model parametric uncertainty for irrigation simulations by considering six evapotranspiration models and five crop coefficient sets to estimate irrigation water requirements for growing wheat in the Murray-Darling Basin, Australia. The study is carried out using the spatial decision support system SPARE:WATER. We find that structural model uncertainty among reference ET is far more important than model parametric uncertainty introduced by crop coefficients. These crop coefficients are used to estimate irrigation water requirement following the single crop coefficient approach. Using the reliability ensemble averaging (REA) technique, we are able to reduce the overall predictive model uncertainty by more than 10%. The exceedance probability curve of irrigation water requirements shows that a certain threshold, e.g. an irrigation water limit due to water right of 400 mm, would be less frequently exceeded in case of the REA ensemble average (45%) in comparison to the equally weighted ensemble average (66%). We conclude that multi-model ensemble predictions and sophisticated model averaging techniques are helpful in predicting irrigation demand and provide relevant information for decision making.
NASA Astrophysics Data System (ADS)
Cofino, A. S.; Santos, C.; Garcia-Moya, J. A.; Gutierrez, J. M.; Orfila, B.
2009-04-01
The Short-Range Ensemble Prediction System (SREPS) is a multi-LAM (UM, HIRLAM, MM5, LM and HRM) multi analysis/boundary conditions (ECMWF, UKMetOffice, DWD and GFS) run twice a day by AEMET (72 hours lead time) over a European domain, with a total of 5 (LAMs) x 4 (GCMs) = 20 members. One of the main goals of this project is analyzing the impact of models and boundary conditions in the short-range high-resolution forecasted precipitation. A previous validation of this method has been done considering a set of climate networks in Spain, France and Germany, by interpolating the prediction to the gauge locations (SREPS, 2008). In this work we compare these results with those obtained by using a statistical downscaling method to post-process the global predictions, obtaining an "advanced interpolation" for the local precipitation using climate network precipitation observations. In particular, we apply the PROMETEO downscaling system based on analogs and compare the SREPS ensemble of 20 members with the PROMETEO statistical ensemble of 5 (analog ensemble) x 4 (GCMs) = 20 members. Moreover, we will also compare the performance of a combined approach post-processing the SREPS outputs using the PROMETEO system. References: SREPS 2008. 2008 EWGLAM-SRNWP Meeting (http://www.aemet.es/documentos/va/divulgacion/conferencias/prediccion/Ewglam/PRED_CSantos.pdf)
Minimalist ensemble algorithms for genome-wide protein localization prediction.
Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun
2012-07-03
Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.
Minimalist ensemble algorithms for genome-wide protein localization prediction
2012-01-01
Background Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
Conclusions We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi. PMID:22759391
NASA Astrophysics Data System (ADS)
Rowley, C. D.; Hogan, P. J.; Martin, P.; Thoppil, P.; Wei, M.
2017-12-01
An extended range ensemble forecast system is being developed in the US Navy Earth System Prediction Capability (ESPC), and a global ocean ensemble generation capability to represent uncertainty in the ocean initial conditions has been developed. At extended forecast times, the uncertainty due to the model error overtakes the initial condition as the primary source of forecast uncertainty. Recently, stochastic parameterization or stochastic forcing techniques have been applied to represent the model error in research and operational atmospheric, ocean, and coupled ensemble forecasts. A simple stochastic forcing technique has been developed for application to US Navy high resolution regional and global ocean models, for use in ocean-only and coupled atmosphere-ocean-ice-wave ensemble forecast systems. Perturbation forcing is added to the tendency equations for state variables, with the forcing defined by random 3- or 4-dimensional fields with horizontal, vertical, and temporal correlations specified to characterize different possible kinds of error. Here, we demonstrate the stochastic forcing in regional and global ensemble forecasts with varying perturbation amplitudes and length and time scales, and assess the change in ensemble skill measured by a range of deterministic and probabilistic metrics.
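A temporally correlated perturbation of the kind described above can be sketched, for a single grid point, as a first-order autoregressive (AR(1)) process. The abstract does not say how the Navy system generates its random fields, so the AR(1) form, the exponential decorrelation, and every name and value below are assumptions.

```python
import math
import random


def ar1_forcing(n_steps, dt, decorrelation_time, amplitude, rng):
    """Generate a zero-mean AR(1) perturbation series with a fixed variance.

    rho = exp(-dt / decorrelation_time) sets the step-to-step correlation;
    the noise is scaled so the stationary standard deviation is `amplitude`.
    """
    rho = math.exp(-dt / decorrelation_time)
    noise_scale = amplitude * math.sqrt(1.0 - rho * rho)
    value = amplitude * rng.gauss(0.0, 1.0)
    series = [value]
    for _ in range(n_steps - 1):
        value = rho * value + noise_scale * rng.gauss(0.0, 1.0)
        series.append(value)
    return series


def perturbed_tendency(tendency, perturbation):
    """Add the stochastic forcing to a model tendency (additive form assumed)."""
    return tendency + perturbation


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical settings: hourly steps, 24 h decorrelation time.
    forcing = ar1_forcing(n_steps=5, dt=1.0, decorrelation_time=24.0,
                          amplitude=0.1, rng=rng)
    print([round(perturbed_tendency(0.5, f), 3) for f in forcing])
```

Each ensemble member would use a different random seed, so that the members diverge through model-error forcing as well as through their initial conditions.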
Exploring the calibration of a wind forecast ensemble for energy applications
NASA Astrophysics Data System (ADS)
Heppelmann, Tobias; Ben Bouallegue, Zied; Theis, Susanne
2015-04-01
In the German research project EWeLiNE, Deutscher Wetterdienst (DWD) and Fraunhofer Institute for Wind Energy and Energy System Technology (IWES) are collaborating with three German Transmission System Operators (TSOs) in order to provide the TSOs with improved probabilistic power forecasts. Probabilistic power forecasts are derived from probabilistic weather forecasts, themselves derived from ensemble prediction systems (EPS). Since the considered raw ensemble wind forecasts suffer from underdispersiveness and bias, calibration methods are developed for the correction of the model bias and the ensemble spread bias. The overall aim is to improve the ensemble forecasts such that the uncertainty of the possible weather development is depicted by the ensemble spread from the first forecast hours. Additionally, the ensemble members after calibration should remain physically consistent scenarios. We focus on probabilistic hourly wind forecasts with a horizon of 21 h delivered by the convection-permitting high-resolution ensemble system COSMO-DE-EPS, which became operational at DWD in 2012. The ensemble consists of 20 ensemble members driven by four different global models. The model area includes the whole of Germany and parts of Central Europe with a horizontal resolution of 2.8 km and a vertical resolution of 50 model levels. For verification we use wind mast measurements at around 100 m height, which corresponds to the hub height of wind energy plants that belong to wind farms within the model area. Calibration of the ensemble forecasts can be performed by different statistical methods applied to the raw ensemble output. Here, we explore local bivariate Ensemble Model Output Statistics at individual sites and quantile regression with different predictors. Applying different methods, we already show an improvement of ensemble wind forecasts from COSMO-DE-EPS for energy applications. 
In addition, an ensemble copula coupling approach transfers the time-dependencies of the raw ensemble to the calibrated ensemble. The calibrated wind forecasts are evaluated first with univariate probabilistic scores and additionally with diagnostics of wind ramps in order to assess the time-consistency of the calibrated ensemble members.
NASA Astrophysics Data System (ADS)
Singh, Shailesh Kumar
2014-05-01
Streamflow forecasts are essential for making critical decisions on the optimal allocation of water supplies among demands that include irrigation for agriculture, habitat for fisheries, hydropower production and flood warning. The major objective of this study is to explore Ensemble Streamflow Prediction (ESP) based forecasts in New Zealand catchments and to highlight the present seasonal flow forecasting capability of the National Institute of Water and Atmospheric Research (NIWA). In this study a probabilistic forecast framework for ESP is presented. The basic assumption in ESP is that future weather patterns were experienced historically. Hence, past forcing data can be used with the current initial conditions to generate an ensemble of predictions. Small differences in initial conditions can result in large differences in the forecast. The initial state of the catchment can be obtained by running the model continuously up to the current time; this initial state is then used with past forcing data to generate an ensemble of future flows. The approach taken here is to run the TopNet hydrological model with a range of past forcing data (precipitation, temperature, etc.) and the current initial conditions. The collection of runs is called the ensemble. ESP gives probabilistic forecasts of flow. Probability distributions can be derived from the ensemble members. These distributions capture part of the intrinsic uncertainty in weather or climate. An ensemble streamflow prediction that provides probabilistic hydrological forecasts with lead times of up to 3 months is presented for the Rangitata, Ahuriri, and Hooker and Jollie rivers in the South Island of New Zealand. ESP-based seasonal forecasts have better skill than climatology. This system can provide better overall information for holistic water resource management.
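The ESP procedure described above (one model run per historical forcing year, all started from the same current state) can be sketched in Python. A toy linear reservoir stands in for the TopNet model, which is not reproduced here; the reservoir, its recession coefficient, and the forcing values are all hypothetical.

```python
def linear_reservoir(storage, rainfall_series, k=0.1):
    """Toy stand-in for a hydrological model: outflow is k times storage."""
    flows = []
    for rain in rainfall_series:
        storage += rain
        flow = k * storage
        storage -= flow
        flows.append(flow)
    return flows


def esp_forecast(current_storage, historical_forcings, k=0.1):
    """Run the model once per historical year from the same initial state."""
    return {
        year: linear_reservoir(current_storage, series, k)
        for year, series in historical_forcings.items()
    }


def exceedance_probability(ensemble, threshold):
    """Fraction of members whose total flow exceeds the threshold."""
    totals = [sum(flows) for flows in ensemble.values()]
    return sum(1 for t in totals if t > threshold) / len(totals)


if __name__ == "__main__":
    # Hypothetical rainfall (mm per step) for three past years.
    forcings = {
        1990: [0.0, 0.0, 5.0],
        1991: [10.0, 10.0, 0.0],
        1992: [2.0, 3.0, 1.0],
    }
    ensemble = esp_forecast(current_storage=100.0, historical_forcings=forcings)
    print("P(total flow > 28):", exceedance_probability(ensemble, 28.0))
```

Because every member shares the same initial storage, the spread of the ensemble reflects only the range of historical weather, which matches the assumption stated in the abstract.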
NASA Astrophysics Data System (ADS)
Alessandri, A.; De Felice, M.; Catalano, F.; Lee, J. Y.; Wang, B.; Lee, D. Y.; Yoo, J. H.; Weisheimer, A.
2017-12-01
By initiating a novel cooperation between the European and the Asian-Pacific climate-prediction communities, this work demonstrates the potential of gathering together their Multi-Model Ensembles (MMEs) to obtain useful climate predictions at seasonal time-scale. MMEs are powerful tools in dynamical climate prediction as they account for the overconfidence and the uncertainties related to single-model ensembles, and increasing benefit is expected as the independence of the contributing Seasonal Prediction Systems (SPSs) increases. In this work we combine the two MME SPSs independently developed by the European (ENSEMBLES) and by the Asian-Pacific (APCC/CliPAS) communities by establishing an unprecedented partnership. To this aim, all the possible MME combinations obtained by putting together the 5 models from ENSEMBLES and the 11 models from APCC/CliPAS have been evaluated. The Grand ENSEMBLES-APCC/CliPAS MME significantly enhances the skill in predicting 2m temperature and precipitation. Our results show that, in general, the better combinations of SPSs are obtained by mixing ENSEMBLES and APCC/CliPAS models and that only a limited number of SPSs is required to obtain the maximum performance. The selection of models that perform better usually differs depending on the region/phenomenon under consideration, so that all models are useful in some cases. It is shown that the incremental performance contribution tends to be higher when adding one model from ENSEMBLES to APCC/CliPAS MMEs and vice versa, confirming that the benefit of using MMEs grows as the independence of the contributing models increases. To verify the above results for a real-world application, the Grand MME is used to predict energy demand over Italy as provided by TERNA (Italian Transmission System Operator) for the period 1990-2007. The results demonstrate the useful application of MME seasonal predictions for energy demand forecasting over Italy. 
A significant enhancement of the potential economic value of forecasting energy demand is shown when using the better combinations from the Grand MME, compared with the maximum value obtained from the better combinations of each of the two contributing MMEs. The above results are discussed in a Clim Dyn paper (Alessandri et al., 2017; doi:10.1007/s00382-016-3372-4).
A common fallacy in climate model evaluation
NASA Astrophysics Data System (ADS)
Annan, J. D.; Hargreaves, J. C.; Tachiiri, K.
2012-04-01
We discuss the assessment of model ensembles such as that arising from the CMIP3 coordinated multi-model experiments. An important aspect of this is not merely the closeness of the models to observations in absolute terms but also the reliability of the ensemble spread as an indication of uncertainty. In this context, it has been widely argued that the multi-model ensemble of opportunity is insufficiently broad to adequately represent uncertainties regarding future climate change. For example, the IPCC AR4 summarises the consensus with the sentence: "Those studies also suggest that the current AOGCMs may not cover the full range of uncertainty for climate sensitivity." Similar claims have been made in the literature for other properties of the climate system, including the transient climate response and efficiency of ocean heat uptake. Comparison of model outputs with observations of the climate system forms an essential component of model assessment and is crucial for building our confidence in model predictions. However, methods for undertaking this comparison are not always clearly justified and understood. Here we show that the popular approach which forms the basis for the above claims, of comparing the ensemble spread to a so-called "observationally-constrained pdf", can be highly misleading. Such a comparison will almost certainly result in disagreement, but in reality tells us little about the performance of the ensemble. We present an alternative approach based on an assessment of the predictive performance of the ensemble, and show how it may lead to very different, and rather more encouraging, conclusions. We additionally outline some necessary conditions for an ensemble (or more generally, a probabilistic prediction) to be challenged by an observation.
Hansen, Bjoern Oest; Meyer, Etienne H; Ferrari, Camilla; Vaid, Neha; Movahedi, Sara; Vandepoele, Klaas; Nikoloski, Zoran; Mutwil, Marek
2018-03-01
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Multiphysics superensemble forecast applied to Mediterranean heavy precipitation situations
NASA Astrophysics Data System (ADS)
Vich, M.; Romero, R.
2010-11-01
The high-impact precipitation events that regularly affect the western Mediterranean coastal regions are still difficult to predict with the current prediction systems. Bearing this in mind, this paper focuses on the superensemble technique applied to the precipitation field. Encouraged by the skill shown by a previous multiphysics ensemble prediction system applied to western Mediterranean precipitation events, the superensemble is fed with this ensemble. The training phase of the superensemble contributes to the actual forecast with weights obtained by comparing the past performance of the ensemble members and the corresponding observed states. The non-hydrostatic MM5 mesoscale model is used to run the multiphysics ensemble. Simulations are performed with a 22.5 km resolution domain (Domain 1 in http://mm5forecasts.uib.es) nested in the ECMWF forecast fields. The period between September and December 2001 is used to train the superensemble and a collection of 19 MEDEX cyclones is used to test it. The verification procedure involves testing the superensemble performance and comparing it with that of the poor-man and bias-corrected ensemble mean and the multiphysics EPS control member. The results emphasize the need for a well-behaved training phase to obtain good results with the superensemble technique. A strategy to obtain this improved training phase is already outlined.
Extended Range Prediction of Indian Summer Monsoon: Current status
NASA Astrophysics Data System (ADS)
Sahai, A. K.; Abhilash, S.; Borah, N.; Joseph, S.; Chattopadhyay, R.; S, S.; Rajeevan, M.; Mandal, R.; Dey, A.
2014-12-01
The main focus of this study is to develop forecast consensus in the extended range prediction (ERP) of monsoon intraseasonal oscillations using a suite of different variants of the Climate Forecast System (CFS) model. In this CFS based Grand MME prediction system (CGMME), the ensemble members are generated by perturbing the initial condition and using different configurations of CFSv2. This is to address the role of different physical mechanisms known to have control on the error growth in the ERP in the 15-20 day time scale. The final formulation of CGMME is based on 21 ensembles of the standalone Global Forecast System (GFS) forced with bias corrected forecasted SST from CFS, 11 low resolution CFST126 and 11 high resolution CFST382. Thus, we develop the multi-model consensus forecast for the ERP of Indian summer monsoon (ISM) using a suite of different variants of the CFS model. This coordinated international effort led to the development of specific tailor made regional forecast products over the Indian region. The skill of deterministic and probabilistic categorical rainfall forecasts, as well as the verification of large-scale low frequency monsoon intraseasonal oscillations, has been assessed using hindcasts from 2001-2012 during the monsoon season, in which all models are initialized every five days from 16 May to 28 September. The skill of the deterministic forecast from CGMME is better than that of the best participating single model ensemble configuration (SME). The CGMME approach is believed to quantify the uncertainty in both initial conditions and model formulation. The main improvement is attained in the probabilistic forecast, which is because of an increase in the ensemble spread, thereby reducing the error due to over-confident ensembles in a single model configuration. For the probabilistic forecast, three tercile ranges are determined by a ranking method based on the percentage of ensemble members from all the participating models that fall in those three categories.
CGMME further adds value to both deterministic and probabilistic forecasts compared to raw SMEs, and this better skill probably flows from the larger spread and improved spread-error relationship. The CGMME system is currently capable of generating ER predictions in real time and has been successfully delivering its experimental operational ER forecast of ISM for the last few years.
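The tercile-based probabilistic forecast described in this abstract can be illustrated with a minimal sketch. It assumes that the pooled members of all models are simply counted against two climatological tercile boundaries; the function name, the boundaries and the rainfall values are hypothetical and are not taken from the CGMME system.

```python
def tercile_probabilities(members, lower, upper):
    """Return the fraction of pooled ensemble members in the
    below-normal, near-normal and above-normal categories."""
    n = len(members)
    below = sum(1 for x in members if x < lower)
    above = sum(1 for x in members if x > upper)
    near = n - below - above
    return below / n, near / n, above / n


# Hypothetical pooled rainfall forecasts (mm/day) from all participating models.
pooled = [2.0, 3.5, 4.0, 6.5, 7.0, 8.0, 9.5, 10.0]

# Hypothetical climatological tercile boundaries for the same location.
probs = tercile_probabilities(pooled, lower=4.0, upper=8.0)
print(probs)
```

Pooling members from several model configurations before counting is what lets a wider ensemble spread translate directly into less over-confident category probabilities.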
Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing
NASA Astrophysics Data System (ADS)
Toye, Habib; Zhan, Peng; Gopalakrishnan, Ganesh; Kartadikaria, Aditya R.; Huang, Huang; Knio, Omar; Hoteit, Ibrahim
2017-07-01
We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation and of the Data Assimilation Research Testbed (DART) for ensemble data assimilation. DART has been configured to integrate all members of an ensemble adjustment Kalman filter (EAKF) in parallel, based on which we adapted the ensemble operations in DART to use an invariant ensemble, i.e., an ensemble Optimal Interpolation (EnOI) algorithm. This approach requires only a single forward model integration in the forecast step and therefore saves substantial computational cost. To deal with the strong seasonal variability of the Red Sea, the EnOI ensemble is then seasonally selected from a climatology of long-term model outputs. Observations of remote sensing sea surface height (SSH) and sea surface temperature (SST) are assimilated every 3 days. Real-time atmospheric fields from the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) are used as forcing in different assimilation experiments. We investigate the behaviors of the EAKF and (seasonal-) EnOI and compare their performances for assimilating and forecasting the circulation of the Red Sea. We further assess the sensitivity of the assimilation system to various filtering parameters (ensemble size, inflation) and atmospheric forcing.
NASA Technical Reports Server (NTRS)
Maggioni, V.; Anagnostou, E. N.; Reichle, R. H.
2013-01-01
The contribution of rainfall forcing errors relative to model (structural and parameter) uncertainty in the prediction of soil moisture is investigated by integrating the NASA Catchment Land Surface Model (CLSM), forced with hydro-meteorological data, in the Oklahoma region. Rainfall-forcing uncertainty is introduced using a stochastic error model that generates ensemble rainfall fields from satellite rainfall products. The ensemble satellite rain fields are propagated through CLSM to produce soil moisture ensembles. Errors in CLSM are modeled with two different approaches: either by perturbing model parameters (representing model parameter uncertainty) or by adding randomly generated noise (representing model structure and parameter uncertainty) to the model prognostic variables. Our findings highlight that the method currently used in the NASA GEOS-5 Land Data Assimilation System to perturb CLSM variables poorly describes the uncertainty in the predicted soil moisture, even when combined with rainfall model perturbations. On the other hand, by adding model parameter perturbations to rainfall forcing perturbations, a better characterization of uncertainty in soil moisture simulations is observed. Specifically, an analysis of the rank histograms shows that the most consistent ensemble of soil moisture is obtained by combining rainfall and model parameter perturbations. When rainfall forcing and model prognostic perturbations are added, the rank histogram shows a U-shape at the domain average scale, which corresponds to a lack of variability in the forecast ensemble. The more accurate estimation of the soil moisture prediction uncertainty obtained by combining rainfall and parameter perturbations is encouraging for the application of this approach in ensemble data assimilation systems.
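The rank histogram diagnostic mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' code: the function name and the soil moisture values are hypothetical, and ties between members and observations are ignored for simplicity.

```python
def rank_histogram(ensembles, observations):
    """Count, for each verification case, how many ensemble members
    lie below the observation, and tally those ranks."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


# Hypothetical soil moisture ensembles (3 members per case) and observations.
ensembles = [
    [0.20, 0.22, 0.24],
    [0.30, 0.31, 0.33],
    [0.25, 0.27, 0.28],
    [0.18, 0.19, 0.21],
]
observations = [0.18, 0.35, 0.29, 0.15]

hist = rank_histogram(ensembles, observations)
print(hist)
```

In this toy case every observation falls outside the ensemble range, so only the two outermost bins are populated. That is the U-shape the abstract associates with an ensemble that has too little variability.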
NASA Technical Reports Server (NTRS)
Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.
1999-01-01
Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
Potential Predictability of the Monsoon Subclimate Systems
NASA Technical Reports Server (NTRS)
Yang, Song; Lau, K.-M.; Chang, Y.; Schubert, S.
1999-01-01
While the El Nino/Southern Oscillation (ENSO) phenomenon can be predicted with some success using coupled oceanic-atmospheric models, the skill of predicting the tropical monsoons is low regardless of the methods applied. The low skill of monsoon prediction may be either because the monsoons are not defined appropriately or because they are not influenced significantly by boundary forcing. The latter characterizes the importance of internal dynamics in monsoon variability and leads to many eminent chaotic features of the monsoons. In this study, we analyze results from nine AMIP-type ensemble experiments with the NASA/GEOS-2 general circulation model to assess the potential predictability of the tropical climate system. We focus on the variability and predictability of tropical monsoon rainfall on seasonal-to-interannual time scales. It is known that the tropical climate is more predictable than its extratropical counterpart. However, predictability is different from one climate subsystem to another within the tropics. It is important to understand the differences among these subsystems in order to increase our skill of seasonal-to-interannual prediction. We assess potential predictability by comparing the magnitude of internal and forced variances as defined by Harzallah and Sadourny (1995). The internal variance measures the spread among the various ensemble members. The forced part of rainfall variance is determined by the magnitude of the ensemble mean rainfall anomaly and by the degree of consistency of the results from the various experiments.
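The comparison of internal and forced variance can be illustrated with a simplified sketch. It assumes that the internal variance is the year-averaged variance of the members about their ensemble mean, and that the forced variance is the year-to-year variance of the ensemble mean. This is not necessarily the exact estimator of Harzallah and Sadourny (1995); the function name and the data are hypothetical.

```python
from statistics import mean, pvariance


def internal_and_forced_variance(anomalies):
    """anomalies[year][member] holds the seasonal rainfall anomaly
    simulated by each ensemble member in each year."""
    ensemble_means = [mean(year) for year in anomalies]
    # Spread among members, averaged over all years.
    internal = mean(pvariance(year) for year in anomalies)
    # Year-to-year variability of the ensemble mean.
    forced = pvariance(ensemble_means)
    return internal, forced


# Hypothetical anomalies for 3 years and 2 ensemble members.
data = [[1.0, 3.0], [-2.0, -4.0], [0.0, 2.0]]

internal, forced = internal_and_forced_variance(data)
print(internal, forced, forced / internal)
```

A ratio of forced to internal variance well above one would indicate that boundary forcing dominates over internal dynamics in this simplified measure.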
NASA Astrophysics Data System (ADS)
Hu, Jianlin; Li, Xun; Huang, Lin; Ying, Qi; Zhang, Qiang; Zhao, Bin; Wang, Shuxiao; Zhang, Hongliang
2017-11-01
Accurate exposure estimates are required for health effect analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used to provide spatial distribution, chemical composition, particle size fractions, and source origins of air pollutants. The accuracy of air quality predictions in China is greatly affected by the uncertainties of emission inventories. The Community Multiscale Air Quality (CMAQ) model with meteorological inputs from the Weather Research and Forecasting (WRF) model was used in this study to simulate air pollutants in China in 2013. Four simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance of each simulation was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 generally meet the model performance criteria, but performance differences exist in different regions, for different pollutants, and among inventories. Ensemble predictions were calculated by linearly combining the results from different inventories to minimize the sum of the squared errors between the ensemble results and the observations in all cities. The ensemble concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFEs) of the ensemble annual PM2.5 in the 60 cities are -0.11 and 0.24, respectively, which are better than the MFB (-0.25 to -0.16) and MFE (0.26-0.31) of individual simulations.
The ensemble annual daily maximum 1 h O3 (O3-1h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB of 0.06-0.19 and MNE of 0.16-0.22 of the individual predictions. The ensemble predictions agree better with observations with daily, monthly, and annual averaging times in all regions of China for both PM2.5 and O3-1h. The study demonstrates that ensemble predictions from combining predictions from individual emission inventories can improve the accuracy of predicted temporal and spatial distributions of air pollutants. This study is the first ensemble model study in China using multiple emission inventories, and the results are publicly available for future health effect studies.
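The linear combination that minimizes the sum of squared errors can be sketched as an ordinary least-squares problem. The sketch below assumes unconstrained weights with no intercept, solved through the normal equations; the abstract does not state whether the authors constrained the weights, and all names and numbers here are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares_weights(predictions, observations):
    """predictions[k][i] is the value from simulation k at site i.
    Returns one weight per simulation minimizing the squared error."""
    k = len(predictions)
    gram = [
        [sum(p * q for p, q in zip(predictions[i], predictions[j])) for j in range(k)]
        for i in range(k)
    ]
    rhs = [sum(p * o for p, o in zip(predictions[i], observations)) for i in range(k)]
    return solve_linear_system(gram, rhs)


# Hypothetical concentrations from two simulations at three sites.
sims = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0]]
obs = [1.5, 1.5, 1.5]

weights = least_squares_weights(sims, obs)
ensemble = [sum(w * s[i] for w, s in zip(weights, sims)) for i in range(len(obs))]
print(weights, ensemble)
```

With only a handful of inventories the normal equations stay tiny, so a hand-written solver is enough here. A real application would more likely use a numerical library and may add constraints such as non-negative weights.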
Probabilistic Storm Surge Forecast For Venice
NASA Astrophysics Data System (ADS)
Mel, Riccardo; Lionello, Piero
2013-04-01
This study describes an ensemble storm surge prediction procedure for the city of Venice, which is potentially very useful for its management, maintenance and for operating the movable barriers that are presently being built. The Ensemble Prediction System (EPS) is meant to complement the existing sea level (SL) forecast system by providing a probabilistic forecast and information on the uncertainty of the SL prediction. The procedure is applied to storm surge events in the period 2009-2010, producing for each of them an ensemble of 50 simulations. It is shown that the EPS slightly increases the accuracy of SL prediction with respect to the deterministic forecast (DF) and is more reliable than the DF. Though results are low biased and forecast uncertainty is underestimated, the probability distribution of maximum sea level produced by the EPS is acceptably realistic. The error of the EPS mean is shown to be correlated with the EPS spread. SL peaks correspond to maxima of uncertainty, and uncertainty increases linearly with the forecast range. The quasi linear dynamics of the storm surges produces a modulation of the uncertainty after the SL peak with a period corresponding to that of the main Adriatic seiche.
A global perspective of the limits of prediction skill based on the ECMWF ensemble
NASA Astrophysics Data System (ADS)
Zagar, Nedjeljka
2016-04-01
This talk presents a new model of the global forecast error growth applied to the forecast errors simulated by the ensemble prediction system (ENS) of the ECMWF. The proxy for forecast errors is the total spread of the ECMWF operational ensemble forecasts obtained by the decomposition of the wind and geopotential fields in the normal-mode functions. In this way, the ensemble spread can be quantified separately for the balanced and inertio-gravity (IG) modes for every forecast range. Ensemble reliability is defined for the balanced and IG modes by comparing the ensemble spread with the control analysis in each scale. The results show that initial uncertainties in the ECMWF ENS are largest in the tropical large-scale modes and their spatial distribution is similar to the distribution of the short-range forecast errors. Initially the ensemble spread grows most in the smallest scales and in the synoptic range of the IG modes, but the overall growth is dominated by the increase of spread in balanced modes in synoptic and planetary scales in the midlatitudes. During the forecasts, the distribution of spread in the balanced and IG modes grows towards the climatological spread distribution characteristic of the analyses. The ENS system is found to be somewhat under-dispersive, which is associated with the lack of tropical variability, primarily the Kelvin waves. The new model of the forecast error growth has three fitting parameters to parameterize the initial fast growth and a slower exponential error growth later on. The asymptotic values of forecast errors are independent of the exponential growth rate. It is found that the errors due to unbalanced dynamics reach their asymptotic values in around 10 days, while the balanced and total errors saturate in 3 to 4 weeks. Reference: Žagar, N., R. Buizza, and J. Tribbia, 2015: A three-dimensional multivariate modal analysis of atmospheric predictability with application to the ECMWF ensemble. J. Atmos. Sci., 72, 4423-4444.
The Ensembl genome database project.
Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M
2002-01-01
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
NASA Astrophysics Data System (ADS)
Niedzielski, Tomasz; Mizinski, Bartlomiej; Swierczynska-Chlasciak, Malgorzata
2017-04-01
The HydroProg system, the real-time multimodel hydrologic ensemble system elaborated at the University of Wroclaw (Poland) within the framework of research grant no. 2011/01/D/ST10/04171 financed by the National Science Centre of Poland, was experimentally launched in 2013 in the Nysa Klodzka river basin (southwestern Poland). Since that time the system has been working operationally to provide water level predictions in real time. At present, depending on the hydrologic gauge, up to eight hydrologic models are run. They are data- and physically-based solutions, with the majority of them being data-based. The paper aims to report on the performance of the implementation of the HydroProg system for the basin in question. We focus on several high-flow episodes and discuss the skills of the individual models in forecasting them. In addition, we present the performance of the multimodel ensemble solution. We also introduce a new prognosis which is determined in the following way: for a given lead time we select the most skillful prediction (from the set of all individual models running at a given gauge and their multimodel ensemble) using the performance statistics computed operationally in real time as a function of lead time.
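The selection rule described at the end of this abstract can be sketched in a few lines. The sketch assumes that skill is summarized by a single error statistic per model and lead time, with lower values being better. The model names, water levels and error values are hypothetical and do not come from HydroProg.

```python
def best_prediction_by_lead_time(forecasts, errors):
    """forecasts[name][lead] is the predicted water level and
    errors[name][lead] the recent error statistic of that model.
    Returns a (model name, predicted value) pair per lead time."""
    n_leads = len(next(iter(forecasts.values())))
    selected = []
    for lead in range(n_leads):
        best = min(errors, key=lambda name: errors[name][lead])
        selected.append((best, forecasts[best][lead]))
    return selected


# Hypothetical water level forecasts (cm) for three lead times.
forecasts = {
    "model_a": [101.0, 105.0, 110.0],
    "model_b": [100.0, 107.0, 115.0],
    "ensemble": [100.5, 106.0, 112.5],
}
# Hypothetical error statistics (cm) computed from past performance.
errors = {
    "model_a": [2.0, 4.0, 9.0],
    "model_b": [3.0, 3.5, 8.0],
    "ensemble": [2.5, 3.0, 8.5],
}

prognosis = best_prediction_by_lead_time(forecasts, errors)
print(prognosis)
```

Because the choice is made separately for each lead time, the resulting prognosis can switch between individual models and the multimodel ensemble along the forecast horizon.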
Using ensemble of classifiers for predicting HIV protease cleavage sites in proteins.
Nanni, Loris; Lumini, Alessandra
2009-03-01
The focus of this work is the use of ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Due to the complex relationships in the biological data, several recent works show that often ensembles of learning algorithms outperform stand-alone methods. We show that the fusion of approaches based on different encoding models can be useful for improving the performance of this classification problem. In particular, in this work four different feature encodings for peptides are described and tested. An extensive evaluation on a large dataset according to a blind testing protocol is reported which demonstrates how different feature extraction methods and classifiers can be combined for obtaining a robust and reliable system. The comparison with other stand-alone approaches allows quantifying the performance improvement obtained by the ensembles proposed in this work.
Evaluation of the Plant-Craig stochastic convection scheme in an ensemble forecasting system
NASA Astrophysics Data System (ADS)
Keane, R. J.; Plant, R. S.; Tennant, W. J.
2015-12-01
The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic element only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.
Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses
NASA Astrophysics Data System (ADS)
Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong
2017-04-01
Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive, and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial post-processing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Due to the need of fitting only a single regression model for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km-horizontal resolution and 1 h-temporal resolution. The precipitation forecast used in this study is obtained from a limited area model ensemble prediction system also operated by ZAMG. The so called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The performed SAMOS approach statistically combines the in-house developed high resolution analysis and ensemble prediction system. 
The station-based validation of 6 hour precipitation sums shows a mean improvement of more than 40% in CRPS when compared to bilinearly interpolated uncalibrated ensemble forecasts. The validation on randomly selected grid points, representing the true height distribution over Austria, still indicates a mean improvement of 35%. The applied statistical model is currently set up for 6-hourly and daily accumulation periods, but will be extended to a temporal resolution of 1-3 hours within a new probabilistic nowcasting system operated by ZAMG.
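The transformation into standardized anomalies described in this abstract can be written out as a short sketch. It shows only the transformation and its inverse, not the regression model itself; the function names and the climatological values are hypothetical.

```python
def standardized_anomaly(value, clim_mean, clim_std):
    """Subtract the site-specific climatological mean and divide by
    the climatological standard deviation."""
    return (value - clim_mean) / clim_std


def back_transform(anomaly, clim_mean, clim_std):
    """Map a standardized anomaly back to physical units at a site."""
    return anomaly * clim_std + clim_mean


# Hypothetical forecast and climatology at one location.
forecast = 6.0
clim_mean, clim_std = 4.0, 2.0

z = standardized_anomaly(forecast, clim_mean, clim_std)
restored = back_transform(z, clim_mean, clim_std)
print(z, restored)
```

Since every location is expressed on the same dimensionless scale after this step, a single regression model can be fitted for the whole domain and its output converted back with the local climatology.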
An ensemble predictive modeling framework for breast cancer classification.
Nagarajan, Radhakrishnan; Upreti, Meenakshi
2017-12-01
Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict the clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process often resulting in sparse high-dimensional projection of the samples often comparable to that of the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen based on maximal sensitivity and minimal redundancy by choosing only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework in contrast to using them as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
An Intelligent Ensemble Neural Network Model for Wind Speed Prediction in Renewable Energy Systems
Ranganayaki, V.; Deepa, S. N.
2016-01-01
Various criteria are proposed to select the number of hidden neurons in artificial neural network (ANN) models, and based on the evolved criterion an intelligent ensemble neural network model is proposed to predict wind speed in renewable energy applications. The intelligent ensemble neural model for wind speed forecasting is designed by averaging the forecasted values from multiple neural network models, which include multilayer perceptron (MLP), multilayer adaptive linear neuron (Madaline), back propagation neural network (BPN), and probabilistic neural network (PNN), so as to obtain better accuracy in wind speed prediction with minimum error. Random selection of the number of hidden neurons in an artificial neural network results in overfitting or underfitting problems. This paper aims to avoid the occurrence of overfitting and underfitting problems. The number of hidden neurons is selected in this paper employing 102 criteria; these evolved criteria are verified by the various computed error values. The proposed criteria for fixing hidden neurons are validated employing the convergence theorem. The proposed intelligent ensemble neural model is applied for wind speed prediction considering the real time wind data collected from the nearby locations. The obtained simulation results substantiate that the proposed ensemble model reduces the error value to minimum and enhances the accuracy. The computed results prove the effectiveness of the proposed ensemble neural network (ENN) model with respect to the considered error factors in comparison with that of the earlier models available in the literature. PMID:27034973
Huisman, J.A.; Breuer, L.; Bormann, H.; Bronstert, A.; Croke, B.F.W.; Frede, H.-G.; Graff, T.; Hubrechts, L.; Jakeman, A.J.; Kite, G.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Viney, N.R.; Willems, P.
2009-01-01
An ensemble of 10 hydrological models was applied to the same set of land use change scenarios. There was general agreement about the direction of changes in the mean annual discharge and 90% discharge percentile predicted by the ensemble members, although a considerable range in the magnitude of predictions for the scenarios and catchments under consideration was obvious. Differences in the magnitude of the increase were attributed to the different mean annual actual evapotranspiration rates for each land use type. The ensemble of model runs was further analyzed with deterministic and probabilistic ensemble methods. The deterministic ensemble method based on a trimmed mean resulted in a single somewhat more reliable scenario prediction. The probabilistic reliability ensemble averaging (REA) method allowed a quantification of the model structure uncertainty in the scenario predictions. It was concluded that the use of a model ensemble has greatly increased our confidence in the reliability of the model predictions. © 2008 Elsevier Ltd.
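A trimmed-mean ensemble of the kind named in the abstract above might be sketched as follows. The abstract does not state how many members were trimmed, so the `n_trim` parameter, the function name and the example values are assumptions made for illustration only.

```python
def trimmed_mean(values, n_trim=1):
    """Mean of the ensemble after dropping the n_trim lowest and
    n_trim highest member predictions (n_trim is an assumed setting)."""
    ordered = sorted(values)
    if 2 * n_trim >= len(ordered):
        raise ValueError("n_trim removes every ensemble member")
    kept = ordered[n_trim:len(ordered) - n_trim]
    return sum(kept) / len(kept)

# Hypothetical predicted changes in mean annual discharge (%) from
# five ensemble members; the outlier 100 is discarded by the trimming.
combined = trimmed_mean([1.0, 2.0, 3.0, 4.0, 100.0], n_trim=1)
```

The trimming makes the combined prediction insensitive to a single extreme member, which a plain mean would not be.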
Prediction of drug synergy in cancer using ensemble-based machine learning techniques
NASA Astrophysics Data System (ADS)
Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder
2018-04-01
Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic successes. Examination of different drug-drug interaction can be done by drug synergy score. It needs efficient regression-based machine learning approaches to minimize the prediction errors. Numerous machine learning techniques such as neural networks, support vector machines, random forests, LASSO, Elastic Nets, etc., have been used in the past to realize requirement as mentioned above. However, these techniques individually do not provide significant accuracy in drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques have been implemented by considering the drug synergy data. Based on the accuracy of each model, four techniques with high accuracy are selected to develop ensemble-based machine learning model. These models are Random forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. adding more weights to the model with a higher prediction score) of predicted data by selected models. The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms others in terms of accuracy, root mean square error and coefficient of correlation.
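The biased weighted aggregation described above (more weight to the model with a higher prediction score) can be sketched under one simple assumption: weights proportional to each model's score. The abstract does not give the exact weighting formula, so `weighted_ensemble`, the proportional weighting and the numbers below are illustrative only.

```python
def weighted_ensemble(predictions, scores):
    """Aggregate model predictions with weights proportional to scores.

    predictions: list of lists, one list of predicted values per model.
    scores: one positive skill score per model (higher is better).
    """
    total = sum(scores)
    weights = [s / total for s in scores]  # normalized to sum to 1
    n_samples = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions))
        for i in range(n_samples)
    ]

# Two hypothetical models; the second has three times the score of the
# first and therefore receives a weight of 0.75.
synergy = weighted_ensemble([[10.0, 20.0], [20.0, 40.0]], [1.0, 3.0])
```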
Evaluation of Multi-Model Ensemble System for Seasonal and Monthly Prediction
NASA Astrophysics Data System (ADS)
Zhang, Q.; Van den Dool, H. M.
2013-12-01
Since August 2011, the realtime seasonal forecasts of U.S. National Multi-Model Ensemble (NMME) have been made on 8th of each month by NCEP Climate Prediction Center (CPC). During the first year, the participating models were NCEP/CFSv1&2, GFDL/CM2.2, NCAR/U.Miami/COLA/CCSM3, NASA/GEOS5, IRI/ ECHAM-a & ECHAM-f for the realtime NMME forecast. The Canadian Meteorological Center CanCM3 and CM4 replaced the CFSv1 and IRI's models in the second year. The NMME team at CPC collects three variables, including precipitation, 2-meter temperature and sea surface temperature from each modeling center on a 1x1 global grid, removes systematic errors, makes the grand ensemble mean with equal weight for each model and constructs a probability forecast with equal weight for each member. The team then provides the NMME forecast to the operational CPC forecaster responsible for the seasonal and monthly outlook each month. Verification of the seasonal and monthly prediction from NMME is conducted by calculating the anomaly correlation (AC) from the 30-year hindcasts (1982-2011) of individual model and NMME ensemble. The motivation of this study is to provide skill benchmarks for future improvements of the NMME seasonal and monthly prediction system. The experimental (Phase I) stage of the project already supplies routine guidance to users of the NMME forecasts.
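The distinction drawn above between equal weight per model (grand ensemble mean) and equal weight per member (probability forecast), together with the anomaly correlation used for verification, can be sketched as below. All function names and values are hypothetical, the systematic-error removal step is not shown, and the uncentered form of the anomaly correlation is an assumption, since the abstract does not say which variant is used.

```python
import math

def grand_ensemble_mean(models):
    """Average members within each model, then average the model means
    so that every model has equal weight regardless of ensemble size."""
    model_means = [sum(m) / len(m) for m in models.values()]
    return sum(model_means) / len(model_means)

def probability_above(models, threshold):
    """Fraction of all pooled members above a threshold, so that every
    member has equal weight."""
    pooled = [x for members in models.values() for x in members]
    return sum(1 for x in pooled if x > threshold) / len(pooled)

def anomaly_correlation(forecast, observed, climatology):
    """Uncentered anomaly correlation of forecast and observed
    anomalies relative to a common climatology."""
    f = [a - c for a, c in zip(forecast, climatology)]
    o = [a - c for a, c in zip(observed, climatology)]
    num = sum(x * y for x, y in zip(f, o))
    den = math.sqrt(sum(x * x for x in f) * sum(y * y for y in o))
    return num / den

# Hypothetical two-model ensemble with unequal member counts.
toy = {"model_a": [1.0, 3.0], "model_b": [4.0]}
mean = grand_ensemble_mean(toy)       # model-weighted mean
prob = probability_above(toy, 2.5)    # member-weighted probability
```

With unequal ensemble sizes the two weightings differ: the model-weighted mean here is not the same as the plain mean of the three pooled members.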
Verifying and Postprocessing the Ensemble Spread-Error Relationship
NASA Astrophysics Data System (ADS)
Hopson, Tom; Knievel, Jason; Liu, Yubao; Roux, Gregory; Wu, Wanli
2013-04-01
With the increased utilization of ensemble forecasts in weather and hydrologic applications, there is a need to verify their benefit over less expensive deterministic forecasts. One such potential benefit of ensemble systems is their capacity to forecast their own forecast error through the ensemble spread-error relationship. The paper begins by revisiting the limitations of the Pearson correlation alone in assessing this relationship. Next, we introduce two new metrics to consider in assessing the utility of an ensemble's varying dispersion. We argue there are two aspects of an ensemble's dispersion that should be assessed. First, and perhaps more fundamentally: is there enough variability in the ensemble's dispersion to justify the maintenance of an expensive ensemble prediction system (EPS), irrespective of whether the EPS is well-calibrated or not? To diagnose this, the factor that controls the theoretical upper limit of the spread-error correlation can be useful. Secondly, does the variable dispersion of an ensemble relate to variable expectation of forecast error? Representing the spread-error correlation in relation to its theoretical limit can provide a simple diagnostic of this attribute. A context for these concepts is provided by assessing two operational ensembles: 30-member Western US temperature forecasts for the U.S. Army Test and Evaluation Command and 51-member Brahmaputra River flow forecasts of the Climate Forecast and Applications Project for Bangladesh. Both of these systems utilize a postprocessing technique based on quantile regression (QR) under a step-wise forward selection framework leading to ensemble forecasts with both good reliability and sharpness. In addition, the methodology utilizes the ensemble's ability to self-diagnose forecast instability to produce calibrated forecasts with informative skill-spread relationships.
We will describe both ensemble systems briefly, review the steps used to calibrate the ensemble forecast, and present verification statistics using error-spread metrics, along with figures from operational ensemble forecasts before and after calibration.
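The basic spread-error relationship discussed above is commonly summarized as a Pearson correlation between ensemble spread and the absolute error of the ensemble mean across forecast cases. The sketch below computes only that baseline quantity; it does not reproduce the two new metrics the abstract introduces, and the function names and toy data are assumptions.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spread_error_correlation(ensembles, observations):
    """Correlate ensemble spread with absolute ensemble-mean error.

    ensembles: list of member lists, one per forecast case.
    observations: one verifying observation per forecast case.
    """
    # Spread: sample standard deviation of the members of each case.
    spreads = [statistics.stdev(members) for members in ensembles]
    # Error: absolute difference between ensemble mean and observation.
    errors = [
        abs(sum(members) / len(members) - obs)
        for members, obs in zip(ensembles, observations)
    ]
    return pearson(spreads, errors)

# Toy cases in which the error grows in proportion to the spread.
r = spread_error_correlation([[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],
                             [2.0, 4.0, 6.0])
```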
NASA Astrophysics Data System (ADS)
Flampouris, Stylianos; Penny, Steve; Alves, Henrique
2017-04-01
The National Centers for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA) provides the operational wave forecast for the US National Weather Service (NWS). Given the continuous efforts to improve forecast, NCEP is developing an ensemble-based data assimilation system, based on the local ensemble transform Kalman filter (LETKF), the existing operational global wave ensemble system (GWES) and on satellite and in-situ observations. While the LETKF was designed for atmospheric applications (Hunt et al 2007), and has been adapted for several ocean models (e.g. Penny 2016), this is the first time applied for oceanic waves assimilation. This new wave assimilation system provides a global estimation of the surface sea state and its approximate uncertainty. It achieves this by analyzing the 21-member ensemble of the significant wave height provided by GWES every 6h. Observations from four altimeters and all the available in-situ measurements are used in this analysis. The analysis of the significant wave height is used for initializing the next forecasting cycle; the data assimilation system is currently being tested for operational use.
Karp, Jerome M; Eryilmaz, Ertan; Cowburn, David
2015-01-01
There has been a longstanding interest in being able to accurately predict NMR chemical shifts from structural data. Recent studies have focused on using molecular dynamics (MD) simulation data as input for improved prediction. Here we examine the accuracy of chemical shift prediction for intein systems, which have regions of intrinsic disorder. We find that using MD simulation data as input for chemical shift prediction does not consistently improve prediction accuracy over use of a static X-ray crystal structure. This appears to result from the complex conformational ensemble of the disordered protein segments. We show that using accelerated molecular dynamics (aMD) simulations improves chemical shift prediction, suggesting that methods which better sample the conformational ensemble like aMD are more appropriate tools for use in chemical shift prediction for proteins with disordered regions. Moreover, our study suggests that data accurately reflecting protein dynamics must be used as input for chemical shift prediction in order to correctly predict chemical shifts in systems with disorder.
NASA Astrophysics Data System (ADS)
Counillon, Francois; Kimmritz, Madlen; Keenlyside, Noel; Wang, Yiguo; Bethke, Ingo
2017-04-01
The Norwegian Climate Prediction Model combines the Norwegian Earth System Model and the Ensemble Kalman Filter data assimilation method. The prediction skills of different versions of the system (with 30 members) are tested in the Nordic Seas and the Arctic region. Comparing the hindcasts branched from a SST-only assimilation run with a free ensemble run of 30 members, we are able to dissociate the predictability rooted in the external forcing from the predictability harvest from SST derived initial conditions. The latter adds predictability in the North Atlantic subpolar gyre and the Nordic Seas regions and overall there is very little degradation or forecast drift. Combined assimilation of SST and T-S profiles further improves the prediction skill in the Nordic Seas and into the Arctic. These lead to multi-year predictability in the high-latitudes. Ongoing developments of strongly coupled assimilation (ocean and sea ice) of ice concentration in idealized twin experiment will be shown, as way to further enhance prediction skill in the Arctic.
Alternative Approaches to Land Initialization for Seasonal Precipitation and Temperature Forecasts
NASA Technical Reports Server (NTRS)
Koster, Randal; Suarez, Max; Liu, Ping; Jambor, Urszula
2004-01-01
The seasonal prediction system of the NASA Global Modeling and Assimilation Office is used to generate ensembles of summer forecasts utilizing realistic soil moisture initialization. To derive the realistic land states, we drive offline the system's land model with realistic meteorological forcing over the period 1979-1993 (in cooperation with the Global Land Data Assimilation System project at GSFC) and then extract the state variables' values on the chosen forecast start dates. A parallel series of forecast ensembles is performed with a random (though climatologically consistent) set of land initial conditions; by comparing the two sets of ensembles, we can isolate the impact of land initialization on forecast skill from that of the imposed SSTs. The base initialization experiment is supplemented with several forecast ensembles that use alternative initialization techniques. One ensemble addresses the impact of minimizing climate drift in the system through the scaling of the initial conditions, and another is designed to isolate the importance of the precipitation signal from that of all other signals in the antecedent offline forcing. A third ensemble includes a more realistic initialization of the atmosphere along with the land initialization. The impact of each variation on forecast skill is quantified.
NASA Astrophysics Data System (ADS)
Zheng, Fei; Zhu, Jiang
2017-04-01
How to design a reliable ensemble prediction strategy with considering the major uncertainties of a forecasting system is a crucial issue for performing an ensemble forecast. In this study, a new stochastic perturbation technique is developed to improve the prediction skills of El Niño-Southern Oscillation (ENSO) through using an intermediate coupled model. We first estimate and analyze the model uncertainties from the ensemble Kalman filter analysis results through assimilating the observed sea surface temperatures. Then, based on the pre-analyzed properties of model errors, we develop a zero-mean stochastic model-error model to characterize the model uncertainties mainly induced by the missed physical processes of the original model (e.g., stochastic atmospheric forcing, extra-tropical effects, Indian Ocean Dipole). Finally, we perturb each member of an ensemble forecast at each step by the developed stochastic model-error model during the 12-month forecasting process, and add the zero-mean perturbations into the physical fields to mimic the presence of missing processes and high-frequency stochastic noises. The impacts of stochastic model-error perturbations on ENSO deterministic predictions are examined by performing two sets of 21-yr hindcast experiments, which are initialized from the same initial conditions and differentiated by whether they consider the stochastic perturbations. The comparison results show that the stochastic perturbations have a significant effect on improving the ensemble-mean prediction skills during the entire 12-month forecasting process. This improvement occurs mainly because the nonlinear terms in the model can form a positive ensemble-mean from a series of zero-mean perturbations, which reduces the forecasting biases and then corrects the forecast through this nonlinear heating mechanism.
HLPI-Ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy.
Hu, Huan; Zhang, Li; Ai, Haixin; Zhang, Hui; Fan, Yetian; Zhao, Qi; Liu, Hongsheng
2018-03-27
LncRNA plays an important role in many biological processes and in disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although there are a few models designed to predict ncRNA-protein interactions, they all share some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts the ensemble strategy based on three mainstream machine learning algorithms, Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB), to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, in the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with the previous models on an external validation dataset. The results show that the false positives (FPs) of the HLPI-Ensemble models are much lower than those of the previous models, and other evaluation indicators of the HLPI-Ensemble models are also higher than those of the previous models. It is further shown that the HLPI-Ensemble models are superior in predicting human lncRNA-protein interactions compared with previous models. The HLPI-Ensemble is publicly available at: http://ccsipb.lnu.edu.cn/hlpiensemble/ .
NASA Astrophysics Data System (ADS)
Abhilash, S.; Sahai, A. K.; Borah, N.; Chattopadhyay, R.; Joseph, S.; Sharmila, S.; De, S.; Goswami, B. N.; Kumar, Arun
2014-05-01
An ensemble prediction system (EPS) is devised for the extended range prediction (ERP) of monsoon intraseasonal oscillations (MISO) of Indian summer monsoon (ISM) using National Centers for Environmental Prediction Climate Forecast System model version 2 at T126 horizontal resolution. The EPS is formulated by generating 11 member ensembles through the perturbation of atmospheric initial conditions. The hindcast experiments were conducted at every 5-day interval for 45 days lead time starting from 16th May to 28th September during 2001-2012. The general simulation of ISM characteristics and the ERP skill of the proposed EPS at pentad mean scale are evaluated in the present study. Though the EPS underestimates both the mean and variability of ISM rainfall, it simulates the northward propagation of MISO reasonably well. It is found that the signal-to-noise ratio of the forecasted rainfall becomes unity by about 18 days. The potential predictability error of the forecasted rainfall saturates by about 25 days. Though useful deterministic forecasts could be generated up to 2nd pentad lead, significant correlations are found even up to 4th pentad lead. The skill in predicting large-scale MISO, which is assessed by comparing the predicted and observed MISO indices, is found to be ~17 days. It is noted that the prediction skill of actual rainfall is closely related to the prediction of large-scale MISO amplitude as well as the initial conditions related to the different phases of MISO. An analysis of categorical prediction skills reveals that break is more skillfully predicted, followed by active and then normal. The categorical probability skill scores suggest that useful probabilistic forecasts could be generated even up to 4th pentad lead.
NASA Astrophysics Data System (ADS)
Smiatek, G.; Kunstmann, H.; Werhahn, J.
2012-04-01
The Ammer River catchment, located in the Bavarian Ammergau Alps and alpine forelands, Germany, with elevations reaching 2185 m and annual mean precipitation between 1100 and 2000 mm, represents a very demanding test ground for a river runoff prediction system. Large flooding events in 1999 and 2005 motivated the development of a physically based prediction tool in this area. Such a tool is the coupled high resolution numerical weather and river runoff forecasting system AM-POE, which has been studied in several configurations in various experiments starting from the year 2005. Cornerstones of the coupled system are the hydrological water balance model WaSiM-ETH run at 100 m grid resolution, the numerical weather prediction (NWP) model MM5 driven at 3.5 km grid cell resolution and the Perl Object Environment (POE) framework. POE implements the input data download from various sources, the input data provision via SOAP based web services as well as the runs of the hydrology model both with observed and with NWP predicted meteorology input. The one-way coupled system utilizes a lagged ensemble prediction system (EPS) that takes into account a combination of recent and previous NWP forecasts. Results obtained in the years 2005-2011 reveal that river runoff simulations show high correlation with observed runoff when driven with monitored observations in hindcast experiments. The skill of the runoff forecasts depends on lead times in the lagged ensemble prediction and still shows limitations resulting from errors in timing and total amount of the predicted precipitation in the complex mountainous area. The presentation describes the system implementation, and demonstrates the application of the POE framework in networking, distributed computing and in the setup of various experiments as well as long-term results of the system application in the years 2005-2011.
NASA Astrophysics Data System (ADS)
Niu, Mingfei; Wang, Yufang; Sun, Shaolong; Li, Yongwu
2016-06-01
To enhance prediction reliability and accuracy, a hybrid model based on the promising principle of "decomposition and ensemble" and a recently proposed meta-heuristic called grey wolf optimizer (GWO) is introduced for daily PM2.5 concentration forecasting. Compared with existing PM2.5 forecasting methods, this proposed model has improved the prediction accuracy and hit rates of directional prediction. The proposed model involves three main steps, i.e., decomposing the original PM2.5 series into several intrinsic mode functions (IMFs) via complementary ensemble empirical mode decomposition (CEEMD) for simplifying the complex data; individually predicting each IMF with support vector regression (SVR) optimized by GWO; integrating all predicted IMFs for the ensemble result as the final prediction by another SVR optimized by GWO. Seven benchmark models, including single artificial intelligence (AI) models, other decomposition-ensemble models with different decomposition methods and models with the same decomposition-ensemble method but optimized by different algorithms, are considered to verify the superiority of the proposed hybrid model. The empirical study indicates that the proposed hybrid decomposition-ensemble model is remarkably superior to all considered benchmark models for its higher prediction accuracy and hit rates of directional prediction.
Summer drought predictability over Europe: empirical versus dynamical forecasts
NASA Astrophysics Data System (ADS)
Turco, Marco; Ceglar, Andrej; Prodhomme, Chloé; Soret, Albert; Toreti, Andrea; Doblas-Reyes, Francisco J.
2017-08-01
Seasonal climate forecasts could be an important planning tool for farmers, government and insurance companies that can lead to better and timely management of seasonal climate risks. However, climate seasonal forecasts are often under-used, because potential users are not well aware of the capabilities and limitations of these products. This study aims at assessing the merits and caveats of a statistical empirical method, the ensemble streamflow prediction system (ESP, an ensemble based on reordering historical data) and an operational dynamical forecast system, the European Centre for Medium-Range Weather Forecasts—System 4 (S4) in predicting summer drought in Europe. Droughts are defined using the Standardized Precipitation Evapotranspiration Index for the month of August integrated over 6 months. Both systems show useful and mostly comparable deterministic skill. We argue that this source of predictability is mostly attributable to the observed initial conditions. S4 shows only higher skill in terms of ability to probabilistically identify drought occurrence. Thus, currently, both approaches provide useful information and ESP represents a computationally fast alternative to dynamical prediction applications for drought prediction.
Kingsley, Laura J.; Lill, Markus A.
2014-01-01
Computational prediction of ligand entry and egress paths in proteins has become an emerging topic in computational biology and has proven useful in fields such as protein engineering and drug design. Geometric tunnel prediction programs, such as Caver3.0 and MolAxis, are computationally efficient methods to identify potential ligand entry and egress routes in proteins. Although many geometric tunnel programs are designed to accommodate a single input structure, the increasingly recognized importance of protein flexibility in tunnel formation and behavior has led to the more widespread use of protein ensembles in tunnel prediction. However, there has not yet been an attempt to directly investigate the influence of ensemble size and composition on geometric tunnel prediction. In this study, we compared tunnels found in a single crystal structure to ensembles of various sizes generated using different methods on both the apo and holo forms of cytochrome P450 enzymes CYP119, CYP2C9, and CYP3A4. Several protein structure clustering methods were tested in an attempt to generate smaller ensembles that were capable of reproducing the data from larger ensembles. Ultimately, we found that by including members from both the apo and holo data sets, we could produce ensembles containing less than 15 members that were comparable to apo or holo ensembles containing over 100 members. Furthermore, we found that, in the absence of either apo or holo crystal structure data, pseudo-apo or –holo ensembles (e.g. adding ligand to apo protein throughout MD simulations) could be used to resemble the structural ensembles of the corresponding apo and holo ensembles, respectively. Our findings not only further highlight the importance of including protein flexibility in geometric tunnel prediction, but also suggest that smaller ensembles can be as capable as larger ensembles at capturing many of the protein motions important for tunnel prediction at a lower computational cost. PMID:24956479
NASA Astrophysics Data System (ADS)
Keane, Richard J.; Plant, Robert S.; Tennant, Warren J.
2016-05-01
The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic scheme only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.
NASA Astrophysics Data System (ADS)
Lahmiri, S.; Boukadoum, M.
2015-10-01
Accurate forecasting of stock market volatility is an important issue in portfolio risk management. In this paper, an ensemble system for stock market volatility is presented. It is composed of three different models that hybridize the exponential generalized autoregressive conditional heteroscedasticity (GARCH) process and the artificial neural network trained with the backpropagation algorithm (BPNN) to forecast stock market volatility under normal, t-Student, and generalized error distribution (GED) assumption separately. The goal is to design an ensemble system where each single hybrid model is capable to capture normality, excess skewness, or excess kurtosis in the data to achieve complementarity. The performance of each EGARCH-BPNN and the ensemble system is evaluated by the closeness of the volatility forecasts to realized volatility. Based on mean absolute error and mean of squared errors, the experimental results show that proposed ensemble model used to capture normality, skewness, and kurtosis in data is more accurate than the individual EGARCH-BPNN models in forecasting the S&P 500 intra-day volatility based on one and five-minute time horizons data.
Ali, Safdar; Majid, Abdul
2015-04-01
The diagnosis of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Borah, Nabanita; Sukumarpillai, Abhilash; Sahai, Atul Kumar; Chattopadhyay, Rajib; Joseph, Susmitha; De, Soumyendu; Nath Goswami, Bhupendra; Kumar, Arun
2014-05-01
An ensemble prediction system (EPS) is devised for the extended range prediction (ERP) of monsoon intraseasonal oscillations (MISO) of Indian summer monsoon (ISM) using NCEP Climate Forecast System model version 2 at T126 horizontal resolution. The EPS is formulated by producing 11 member ensembles through the perturbation of atmospheric initial conditions. The hindcast experiments were conducted at every 5-day interval for 45 days lead time starting from 16th May to 28th September during 2001-2012. The general simulation of ISM characteristics and the ERP skill of the proposed EPS at pentad mean scale are evaluated in the present study. Though the EPS underestimates both the mean and variability of ISM rainfall, it simulates the northward propagation of MISO reasonably well. It is found that the signal-to-noise ratio becomes unity by about 18 days and the predictability error saturates by about 25 days. Though useful deterministic forecasts could be generated up to 2nd pentad lead, significant correlations are observed even up to 4th pentad lead. The skill in predicting large-scale MISO, which is assessed by comparing the predicted and observed MISO indices, is found to be ~17 days. It is noted that the prediction skill of actual rainfall is closely related to the prediction of amplitude of large scale MISO as well as the initial conditions related to the different phases of MISO. Categorical prediction skills reveal that break is more skillfully predicted, followed by active and then normal. The categorical probability skill scores suggest that useful probabilistic forecasts could be generated even up to 4th pentad lead.
NASA Astrophysics Data System (ADS)
Borah, N.; Abhilash, S.; Sahai, A. K.; Chattopadhyay, R.; Joseph, S.; Sharmila, S.; de, S.; Goswami, B.; Kumar, A.
2013-12-01
An ensemble prediction system (EPS) is devised for the extended range prediction (ERP) of monsoon intraseasonal oscillations (MISOs) of Indian summer monsoon (ISM) using NCEP Climate Forecast System model version 2 at T126 horizontal resolution. The EPS is formulated by producing 11 member ensembles through the perturbation of atmospheric initial conditions. The hindcast experiments were conducted at every 5-day interval for 45 days lead time starting from 16th May to 28th September during 2001-2012. The general simulation of ISM characteristics and the ERP skill of the proposed EPS at pentad mean scale are evaluated in the present study. Though the EPS underestimates both the mean and variability of ISM rainfall, it simulates the northward propagation of MISO reasonably well. It is found that the signal-to-noise ratio becomes unity by about 18 days and the predictability error saturates by about 25 days. Though useful deterministic forecasts could be generated up to 2nd pentad lead, significant correlations are observed even up to 4th pentad lead. The skill in predicting large-scale MISO, which is assessed by comparing the predicted and observed MISO indices, is found to be ~17 days. It is noted that the prediction skill of actual rainfall is closely related to the prediction of amplitude of large scale MISO as well as the initial conditions related to the different phases of MISO. Categorical prediction skills reveal that break is more skillfully predicted, followed by active and then normal. The categorical probability skill scores suggest that useful probabilistic forecasts could be generated even up to 4th pentad lead.
Improving medium-range ensemble streamflow forecasts through statistical post-processing
NASA Astrophysics Data System (ADS)
Mendoza, Pablo; Wood, Andy; Clark, Elizabeth; Nijssen, Bart; Clark, Martyn; Ramos, Maria-Helena; Nowak, Kenneth; Arnold, Jeffrey
2017-04-01
Probabilistic hydrologic forecasts are a powerful source of information for decision-making in water resources operations. A common approach is the hydrologic model-based generation of streamflow forecast ensembles, which can be implemented to account for different sources of uncertainties - e.g., from initial hydrologic conditions (IHCs), weather forecasts, and hydrologic model structure and parameters. In practice, hydrologic ensemble forecasts typically have biases and spread errors stemming from errors in the aforementioned elements, resulting in a degradation of probabilistic properties. In this work, we compare several statistical post-processing techniques applied to medium-range ensemble streamflow forecasts obtained with the System for Hydromet Applications, Research and Prediction (SHARP). SHARP is a fully automated prediction system for the assessment and demonstration of short-term to seasonal streamflow forecasting applications, developed by the National Center for Atmospheric Research, University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation. The suite of post-processing techniques includes linear blending, quantile mapping, extended logistic regression, quantile regression, ensemble analogs, and the generalized linear model post-processor (GLMPP). We assess and compare these techniques using multi-year hindcasts in several river basins in the western US. This presentation discusses preliminary findings about the effectiveness of the techniques for improving probabilistic skill, reliability, discrimination, sharpness and resolution.
Improving wave forecasting by integrating ensemble modelling and machine learning
NASA Astrophysics Data System (ADS)
O'Donncha, F.; Zhang, Y.; James, S. C.
2017-12-01
Modern smart-grid networks use technologies to instantly relay information on supply and demand to support effective decision making. Integration of renewable-energy resources with these systems demands accurate forecasting of energy production (and demand) capacities. For wave-energy converters, this requires wave-condition forecasting to enable estimates of energy production. Current operational wave forecasting systems exhibit substantial errors with wave-height RMSEs of 40 to 60 cm being typical, which limits the reliability of energy-generation predictions thereby impeding integration with the distribution grid. In this study, we integrate physics-based models with statistical learning aggregation techniques that combine forecasts from multiple, independent models into a single "best-estimate" prediction of the true state. The Simulating Waves Nearshore physics-based model is used to compute wind- and currents-augmented waves in the Monterey Bay area. Ensembles are developed based on multiple simulations perturbing input data (wave characteristics supplied at the model boundaries and winds) to the model. A learning-aggregation technique uses past observations and past model forecasts to calculate a weight for each model. The aggregated forecasts are compared to observation data to quantify the performance of the model ensemble and aggregation techniques. The appropriately weighted ensemble model outperforms an individual ensemble member with regard to forecasting wave conditions.
NASA Astrophysics Data System (ADS)
Courdent, Vianney; Grum, Morten; Mikkelsen, Peter Steen
2018-01-01
Precipitation constitutes a major contribution to the flow in urban storm- and wastewater systems. Forecasts of the anticipated runoff flows, created from radar extrapolation and/or numerical weather predictions, can potentially be used to optimize operation in both wet and dry weather periods. However, flow forecasts are inevitably uncertain and their use will ultimately require a trade-off between the value of knowing what will happen in the future and the probability and consequence of being wrong. In this study we examine how ensemble forecasts from the HIRLAM-DMI-S05 numerical weather prediction (NWP) model subject to three different ensemble post-processing approaches can be used to forecast flow exceedance in a combined sewer for a wide range of ratios between the probability of detection (POD) and the probability of false detection (POFD). We use a hydrological rainfall-runoff model to transform the forecasted rainfall into forecasted flow series and evaluate three different approaches to establishing the relative operating characteristics (ROC) diagram of the forecast, which is a plot of POD against POFD for each fraction of concordant ensemble members and can be used to select the weight of evidence that matches the desired trade-off between POD and POFD. In the first approach, the rainfall input to the model is calculated for each of 25 ensemble members as a weighted average of rainfall from the NWP cells over the catchment where the weights are proportional to the areal intersection between the catchment and the NWP cells. In the second approach, a total of 2825 flow ensembles are generated using rainfall input from the neighbouring NWP cells up to approximately 6 cells in all directions from the catchment. 
In the third approach, the first approach is extended spatially by successively increasing the area covered and for each spatial increase and each time step selecting only the cell with the highest intensity resulting in a total of 175 ensemble members. While the first and second approaches have the disadvantage of not covering the full range of the ROC diagram and being computationally heavy, respectively, the third approach leads to both a broad coverage of the ROC diagram range at a relatively low computational cost. A broad coverage of the ROC diagram offers a larger selection of prediction skill to choose from to best match to the prediction purpose. The study distinguishes itself from earlier research in being the first application to urban hydrology, with fast runoff and small catchments that are highly sensitive to local extremes. Furthermore, no earlier reference has been found on the highly efficient third approach using only neighbouring cells with the highest threat to expand the range of the ROC diagram. This study provides an efficient and robust approach to using ensemble rainfall forecasts affected by bias and misplacement errors for predicting flow threshold exceedance in urban drainage systems.
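The ROC construction described in the abstract above (POD against POFD for each fraction of concordant ensemble members) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `roc_points`, the boolean input layout, and the toy data are all hypothetical.

```python
def roc_points(member_exceedance, observed_exceedance):
    """Return (fraction, POD, POFD) for each minimum fraction of agreeing members.

    member_exceedance: one list of booleans per time step, one boolean per member,
        True when that member forecasts flow above the threshold.
    observed_exceedance: one boolean per time step, True when exceedance occurred.
    """
    n_members = len(member_exceedance[0])
    points = []
    for needed in range(1, n_members + 1):
        hits = misses = false_alarms = correct_negatives = 0
        for members, observed in zip(member_exceedance, observed_exceedance):
            # A warning is issued when at least `needed` members agree.
            warned = sum(members) >= needed
            if warned and observed:
                hits += 1
            elif warned and not observed:
                false_alarms += 1
            elif observed:
                misses += 1
            else:
                correct_negatives += 1
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        pod = hits / n_events if n_events else float("nan")
        pofd = false_alarms / n_non_events if n_non_events else float("nan")
        points.append((needed / n_members, pod, pofd))
    return points


# Hypothetical 4-member forecasts over four time steps.
forecasts = [
    [True, True, True, False],
    [True, False, False, False],
    [False, False, False, False],
    [True, True, False, False],
]
observed = [True, False, False, True]
for fraction, pod, pofd in roc_points(forecasts, observed):
    print(fraction, pod, pofd)
```

Each returned point corresponds to one "weight of evidence" setting; requiring more concordant members generally lowers both POD and POFD.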
NASA Astrophysics Data System (ADS)
Jha, Prakash K.; Athanasiadis, Panos; Gualdi, Silvio; Trabucco, Antonio; Mereu, Valentina; Shelia, Vakhtang; Hoogenboom, Gerrit
2018-03-01
Ensemble forecasts from dynamic seasonal prediction systems (SPSs) have the potential to improve decision-making for crop management to help cope with interannual weather variability. Because the reliability of crop yield predictions based on seasonal weather forecasts depends on the quality of the forecasts, it is essential to evaluate forecasts prior to agricultural applications. This study analyses the potential of Climate Forecast System version 2 (CFSv2) in predicting the Indian summer monsoon (ISM) for producing meteorological variables relevant to crop modeling. The focus area was Nepal's Terai region, and the local hindcasts were compared with weather station and reanalysis data. The results showed that the CFSv2 model accurately predicts monthly anomalies of daily maximum and minimum air temperature (Tmax and Tmin) as well as incoming total surface solar radiation (Srad). However, the daily climatologies of the respective CFSv2 hindcasts exhibit significant systematic biases compared to weather station data. The CFSv2 is less capable of predicting monthly precipitation anomalies and simulating the respective intra-seasonal variability over the growing season. Nevertheless, the observed daily climatologies of precipitation fall within the ensemble spread of the respective daily climatologies of CFSv2 hindcasts. These limitations in the CFSv2 seasonal forecasts, primarily in precipitation, restrict the potential application for predicting the interannual variability of crop yield associated with weather variability. Despite these limitations, ensemble averaging of the simulated yield using all CFSv2 members after applying bias correction may lead to satisfactory yield predictions.
Bayesian quantitative precipitation forecasts in terms of quantiles
NASA Astrophysics Data System (ADS)
Bentzien, Sabrina; Friederichs, Petra
2014-05-01
Ensemble prediction systems (EPS) for numerical weather predictions on the mesoscale are particularly developed to obtain probabilistic guidance for high impact weather. An EPS not only issues a deterministic future state of the atmosphere but a sample of possible future states. Ensemble postprocessing then translates such a sample of forecasts into probabilistic measures. This study focuses on probabilistic quantitative precipitation forecasts in terms of quantiles. Quantiles are particularly suitable to describe precipitation at various locations, since no assumption is required on the distribution of precipitation. The focus is on the prediction during high-impact events and is related to the Volkswagen Stiftung funded project WEX-MOP (Mesoscale Weather Extremes - Theory, Spatial Modeling and Prediction). Quantile forecasts are derived from the raw ensemble and via quantile regression. The neighborhood method and time-lagging are effective tools to inexpensively increase the ensemble spread, which results in more reliable forecasts especially for extreme precipitation events. Since an EPS provides a large amount of potentially informative predictors, a variable selection is required in order to obtain a stable statistical model. A Bayesian formulation of quantile regression allows for inference about the selection of predictive covariates by the use of appropriate prior distributions. Moreover, the implementation of an additional process layer for the regression parameters accounts for spatial variations of the parameters. Bayesian quantile regression and its spatially adaptive extension are illustrated for the German-focused mesoscale weather prediction ensemble COSMO-DE-EPS, which has run (pre)operationally since December 2010 at the German Meteorological Service (DWD). Objective out-of-sample verification uses the quantile score (QS), a weighted absolute error between quantile forecasts and observations. 
The QS is a proper scoring function and can be decomposed into reliability, resolution and uncertainty parts. A quantile reliability plot gives detailed insight into the predictive performance of the quantile forecasts.
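The quantile score described above as a weighted absolute error can be illustrated with a short Python sketch. It assumes the usual check-function (pinball loss) form, in which under-forecasts are weighted by tau and over-forecasts by 1 - tau; the function name and the numbers are illustrative and not taken from the study.

```python
def quantile_score(quantile_forecasts, observations, tau):
    """Mean check-function (pinball) loss for forecasts of the tau-quantile."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for q, y in zip(quantile_forecasts, observations):
        u = y - q
        # Observation above the forecast is weighted by tau,
        # observation below the forecast by (1 - tau).
        total += u * tau if u >= 0 else u * (tau - 1.0)
    return total / len(observations)


# Hypothetical 0.9-quantile precipitation forecasts and observations (mm).
forecasts = [2.0, 5.0]
observations = [3.0, 1.0]
print(quantile_score(forecasts, observations, tau=0.9))
```

For tau = 0.5 the score reduces to half the mean absolute error, which is a quick sanity check on any implementation.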
NASA Astrophysics Data System (ADS)
Subramanian, Aneesh C.; Palmer, Tim N.
2017-06-01
Stochastic schemes to represent model uncertainty in the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble prediction system have helped improve its probabilistic forecast skill over the past decade by both improving its reliability and reducing the ensemble mean error. The largest uncertainties in the model arise from the model physics parameterizations. In the tropics, the parameterization of moist convection presents a major challenge for the accurate prediction of weather and climate. Superparameterization is a promising alternative strategy for including the effects of moist convection through explicit turbulent fluxes calculated from a cloud-resolving model (CRM) embedded within a global climate model (GCM). In this paper, we compare the impact of initial random perturbations in embedded CRMs, within the ECMWF ensemble prediction system, with the stochastically perturbed physical tendency (SPPT) scheme as a way to represent model uncertainty in medium-range tropical weather forecasts. We especially focus on forecasts of tropical convection and dynamics during MJO events in October-November 2011. These are well-studied events for MJO dynamics as they were also heavily observed during the DYNAMO field campaign. We show that a multiscale ensemble modeling approach helps improve forecasts of certain aspects of tropical convection during the MJO events, while it also tends to deteriorate certain large-scale dynamic fields with respect to the stochastically perturbed physical tendencies approach that is used operationally at ECMWF.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we present several results about the prediction accuracies of ensemble methods for binary classification that are missed or misinterpreted in previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper- and lower-bound accuracies with random individual classifiers, and that better algorithms need to be developed. PMID:21853162
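The claim that an ensemble can exceed 0.5 accuracy while every individual classifier stays below 0.5 can be checked with a small constructed case. The Python sketch below assumes simple majority voting over three binary classifiers; the labels and predictions are hand-built for illustration and are not data from the paper.

```python
def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def majority_vote(*classifier_outputs):
    """Per-sample majority vote for 0/1 labels from an odd number of classifiers."""
    return [int(2 * sum(votes) > len(votes)) for votes in zip(*classifier_outputs)]


truth = [1, 0, 1, 0, 1]
# Each classifier is right on only two of five samples (accuracy 0.4),
# but their correct answers are spread so that samples 0-2 each get two correct votes.
clf_a = [1, 0, 0, 1, 0]  # correct on samples 0 and 1
clf_b = [0, 0, 1, 1, 0]  # correct on samples 1 and 2
clf_c = [1, 1, 1, 1, 0]  # correct on samples 0 and 2

ensemble = majority_vote(clf_a, clf_b, clf_c)
for name, preds in (("a", clf_a), ("b", clf_b), ("c", clf_c), ("ensemble", ensemble)):
    print(name, accuracy(preds, truth))
```

The gain comes entirely from how the errors are arranged: the same three accuracies with fully overlapping errors would give the ensemble no advantage.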
On the predictability of outliers in ensemble forecasts
NASA Astrophysics Data System (ADS)
Siegert, S.; Bröcker, J.; Kantz, H.
2012-03-01
In numerical weather prediction, ensembles are used to retrieve probabilistic forecasts of future weather conditions. We consider events where the verification is smaller than the smallest, or larger than the largest ensemble member of a scalar ensemble forecast. These events are called outliers. In a statistically consistent K-member ensemble, outliers should occur with a base rate of 2/(K+1). In operational ensembles this base rate tends to be higher. We study the predictability of outlier events in terms of the Brier Skill Score and find that forecast probabilities can be calculated which are more skillful than the unconditional base rate. This is shown analytically for statistically consistent ensembles. Using logistic regression, forecast probabilities for outlier events in an operational ensemble are calculated. These probabilities exhibit positive skill which is quantitatively similar to the analytical results. Possible causes of these results as well as their consequences for ensemble interpretation are discussed.
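The base rate of 2/(K+1) quoted above follows from exchangeability: if the verification and the K members come from the same distribution, the verification is equally likely to take any of the K+1 ranks, and two of those ranks lie outside the ensemble. A minimal Monte Carlo sketch in Python, assuming independent standard normal draws (the distribution and the function name `outlier_rate` are illustrative choices, not from the paper):

```python
import random


def outlier_rate(k, trials=50_000, seed=0):
    """Estimate how often the verification falls outside a consistent k-member ensemble."""
    rng = random.Random(seed)
    outliers = 0
    for _ in range(trials):
        # Assumption: members and verification share one distribution, here N(0, 1).
        members = [rng.gauss(0.0, 1.0) for _ in range(k)]
        verification = rng.gauss(0.0, 1.0)
        if verification < min(members) or verification > max(members):
            outliers += 1
    return outliers / trials


for k in (5, 10, 20):
    # Print the simulated rate next to the theoretical base rate 2 / (k + 1).
    print(k, outlier_rate(k), 2 / (k + 1))
```

An operational ensemble that is underdispersive would show a simulated rate above 2/(K+1), which is the behaviour the abstract reports.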
Muhlestein, Whitney E; Akagi, Dallin S; Kallos, Justiss A; Morone, Peter J; Weaver, Kyle D; Thompson, Reid C; Chambless, Lola B
2018-04-01
Objective Machine learning (ML) algorithms are powerful tools for predicting patient outcomes. This study pilots a novel approach to algorithm selection and model creation using prediction of discharge disposition following meningioma resection as a proof of concept. Materials and Methods A diversity of ML algorithms were trained on a single-institution database of meningioma patients to predict discharge disposition. Algorithms were ranked by predictive power and top performers were combined to create an ensemble model. The final ensemble was internally validated on never-before-seen data to demonstrate generalizability. The predictive power of the ensemble was compared with a logistic regression. Further analyses were performed to identify how important variables impact the ensemble. Results Our ensemble model predicted disposition significantly better than a logistic regression (area under the curve of 0.78 and 0.71, respectively, p = 0.01). Tumor size, presentation at the emergency department, body mass index, convexity location, and preoperative motor deficit most strongly influence the model, though the independent impact of individual variables is nuanced. Conclusion Using a novel ML technique, we built a guided ML ensemble model that predicts discharge destination following meningioma resection with greater predictive power than a logistic regression, and that provides greater clinical insight than a univariate analysis. These techniques can be extended to predict many other patient outcomes of interest.
Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan
2015-06-01
Conventional clinical decision support systems are based on individual classifiers or simple combinations of these classifiers, which tend to show moderate performance. This research paper presents a novel classifier ensemble framework based on an enhanced bagging approach with a multi-objective weighted voting scheme for prediction and analysis of heart disease. The proposed model overcomes the limitations of conventional performance by utilizing an ensemble of five heterogeneous classifiers: Naïve Bayes, linear regression, quadratic discriminant analysis, instance based learner and support vector machines. Five different datasets are used for experimentation, evaluation and validation. The datasets are obtained from publicly available data repositories. Effectiveness of the proposed ensemble is investigated by comparison of results with several classifiers. Prediction results of the proposed ensemble model are assessed by tenfold cross validation and ANOVA statistics. The experimental evaluation shows that the proposed framework deals with all types of attributes and achieved a high diagnosis accuracy of 84.16 %, 93.29 % sensitivity, 96.70 % specificity, and 82.15 % f-measure. An f-ratio higher than f-critical and a p value less than 0.05 for the 95 % confidence interval indicate that the results are extremely statistically significant for most of the datasets.
NASA Astrophysics Data System (ADS)
Zhu, Kefeng; Xue, Ming
2016-11-01
On 21 July 2012, an extreme rainfall event with a recorded 24-hour maximum rainfall amount of 460 mm occurred in Beijing, China. Most operational models failed to predict such an extreme amount. In this study, a convective-permitting ensemble forecast system (CEFS), at 4-km grid spacing, covering the entire mainland of China, is applied to this extreme rainfall case. CEFS consists of 22 members and uses multiple physics parameterizations. For the event, the predicted maximum is 415 mm d⁻¹ in the probability-matched ensemble mean. The predicted high-probability heavy rain region is located in southwest Beijing, as was observed. Ensemble-based verification scores are then investigated. For a small verification domain covering Beijing and its surrounding areas, the precipitation rank histogram of CEFS is much flatter than that of a reference global ensemble. CEFS has a lower (higher) Brier score and a higher resolution than the global ensemble for precipitation, indicating more reliable probabilistic forecasting by CEFS. Additionally, forecasts of different ensemble members are compared and discussed. Most of the extreme rainfall comes from convection in the warm sector east of an approaching cold front. A few members of CEFS successfully reproduce such precipitation, and orographic lift of highly moist low-level flows with a significant southeasterly component is suggested to have played an important role in producing the initial convection. Comparisons between good and bad forecast members indicate a strong sensitivity of the extreme rainfall to the mesoscale environmental conditions and, to a lesser extent, the model physics.
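The Brier score used in the verification above is the mean squared difference between forecast probabilities and binary outcomes, so lower values indicate better forecasts. A minimal Python sketch, assuming the event probability is taken as the fraction of ensemble members at or above a rainfall threshold (the helper names, the threshold, and all numbers are hypothetical and not from the study):

```python
def exceedance_probability(member_values, threshold):
    """Fraction of ensemble members at or above the threshold."""
    return sum(v >= threshold for v in member_values) / len(member_values)


def brier_score(probabilities, outcomes):
    """Mean squared error of probability forecasts against 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical 4-member rainfall forecasts (mm per day) at three grid points.
forecasts = [
    [120.0, 80.0, 30.0, 150.0],
    [10.0, 5.0, 60.0, 20.0],
    [70.0, 90.0, 110.0, 55.0],
]
observed = [130.0, 8.0, 40.0]
threshold = 50.0  # illustrative heavy-rain threshold, an assumption

probs = [exceedance_probability(members, threshold) for members in forecasts]
events = [int(value >= threshold) for value in observed]
print(probs, events, brier_score(probs, events))
```

In this toy case the third grid point dominates the score: every member exceeds the threshold but the event does not occur, which is the kind of confident miss the Brier score penalizes most.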
Ensemble-based docking: From hit discovery to metabolism and toxicity predictions
Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; ...
2016-07-29
The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.
NASA Astrophysics Data System (ADS)
Kutty, Govindan; Muraleedharan, Rohit; Kesarkar, Amit P.
2018-03-01
Uncertainties in numerical weather prediction models are generally not well represented in ensemble-based data assimilation (DA) systems. The performance of an ensemble-based DA system becomes suboptimal if the sources of error are undersampled in the forecast system. The present study examines the effect of accounting for model error treatments in the hybrid ensemble transform Kalman filter–three-dimensional variational (3DVAR) DA system (hybrid) in the track forecast of two tropical cyclones, viz. Hudhud and Thane, formed over the Bay of Bengal, using the Advanced Research Weather Research and Forecasting (ARW-WRF) model. We investigated the effect of two types of model error treatment schemes and their combination on the hybrid DA system: (i) a multiphysics approach, which uses different combinations of cumulus, microphysics and planetary boundary layer schemes, (ii) a stochastic kinetic energy backscatter (SKEB) scheme, which perturbs the horizontal wind and potential temperature tendencies, and (iii) a combination of both the multiphysics and SKEB schemes. Substantial improvements are noticed in the track positions of both cyclones when flow-dependent ensemble covariance is used in the 3DVAR framework. Explicit model error representation is found to be beneficial in treating underdispersive ensembles. Among the model error schemes used in this study, the combination of multiphysics and SKEB schemes outperformed the other two schemes, with improved track forecasts for both tropical cyclones.
Predicting Aircraft Spray Patterns on Crops
NASA Technical Reports Server (NTRS)
Teske, M. E.; Bilanin, A. J.
1986-01-01
Agricultural Dispersion Prediction (AGDISP) system developed to predict deposition of agricultural material released from rotary- and fixed-wing aircraft. AGDISP computes ensemble average mean motion resulting from turbulent fluid fluctuations. Used to examine ways of making dispersal process more efficient by insuring uniformity, reducing waste, and saving money. Programs in AGDISP system written in FORTRAN IV for interactive execution.
NASA Astrophysics Data System (ADS)
Miyoshi, Takemasa; Kunii, Masaru
2012-03-01
The local ensemble transform Kalman filter (LETKF) is implemented with the Weather Research and Forecasting (WRF) model, and real observations are assimilated to assess the newly-developed WRF-LETKF system. The WRF model is a widely-used mesoscale numerical weather prediction model, and the LETKF is an ensemble Kalman filter (EnKF) algorithm particularly efficient in parallel computer architecture. This study aims to provide the basis of future research on mesoscale data assimilation using the WRF-LETKF system, an additional testbed to the existing EnKF systems with the WRF model used in the previous studies. The particular LETKF system adopted in this study is based on the system initially developed in 2004 and has been continuously improved through theoretical studies and wide applications to many kinds of dynamical models including realistic geophysical models. Most recent and important improvements include an adaptive covariance inflation scheme which considers the spatial and temporal inhomogeneity of inflation parameters. Experiments show that the LETKF successfully assimilates real observations and that adaptive inflation is advantageous. Additional experiments with various ensemble sizes show that using more ensemble members improves the analyses consistently.
Universal LD50 predictions using deep learning
NICEATM Predictive Models for Acute Oral Systemic Toxicity LD50 entry Risa R. Sayre (sayre.risa@epa.gov) & Christopher M. Grulke Our approach uses an ensemble of multilayer perceptron regressions to predict rat acute oral LD50 values from chemical features. Features were genera...
Data-driven reverse engineering of signaling pathways using ensembles of dynamic models.
Henriques, David; Villaverde, Alejandro F; Rocha, Miguel; Saez-Rodriguez, Julio; Banga, Julio R
2017-02-01
Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). 
For this task, SELDOM's ensemble prediction is not only consistently better than predictions from individual models, but also often outperforms the state of the art represented by the methods used in the HPN-DREAM challenge.
Data-driven reverse engineering of signaling pathways using ensembles of dynamic models
Henriques, David; Villaverde, Alejandro F.; Banga, Julio R.
2017-01-01
Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). 
For this task, SELDOM’s ensemble prediction is not only consistently better than predictions from individual models, but also often outperforms the state of the art represented by the methods used in the HPN-DREAM challenge. PMID:28166222
Davey, James A; Chica, Roberto A
2014-05-01
Multistate computational protein design (MSD) with backbone ensembles approximating conformational flexibility can predict higher quality sequences than single-state design with a single fixed backbone. However, it is currently unclear what characteristics of backbone ensembles are required for the accurate prediction of protein sequence stability. In this study, we aimed to improve the accuracy of protein stability predictions made with MSD by using a variety of backbone ensembles to recapitulate the experimentally measured stability of 85 Streptococcal protein G domain β1 sequences. Ensembles tested here include an NMR ensemble as well as those generated by molecular dynamics (MD) simulations, by Backrub motions, and by PertMin, a new method that we developed involving the perturbation of atomic coordinates followed by energy minimization. MSD with the PertMin ensembles resulted in the most accurate predictions by providing the highest number of stable sequences in the top 25, and by correctly binning sequences as stable or unstable with the highest success rate (≈90%) and the lowest number of false positives. The performance of PertMin ensembles is due to the fact that their members closely resemble the input crystal structure and have low potential energy. Conversely, the NMR ensemble as well as those generated by MD simulations at 500 or 1000 K reduced prediction accuracy due to their low structural similarity to the crystal structure. The ensembles tested herein thus represent on- or off-target models of the native protein fold and could be used in future studies to design for desired properties other than stability. Copyright © 2013 Wiley Periodicals, Inc.
Probabilistic flood warning using grand ensemble weather forecasts
NASA Astrophysics Data System (ADS)
He, Y.; Wetterhall, F.; Cloke, H.; Pappenberger, F.; Wilson, M.; Freer, J.; McGregor, G.
2009-04-01
As the severity of floods increases, possibly due to climate and land-use change, there is an urgent need for more effective and reliable warning systems. The incorporation of numerical weather predictions (NWP) into a flood warning system can increase forecast lead times from a few hours to a few days. A single NWP forecast from a single forecast centre, however, is insufficient as it involves considerable non-predictable uncertainties and can lead to a high number of false or missed warnings. An ensemble of weather forecasts from one Ensemble Prediction System (EPS), when used on catchment hydrology, can provide improved early flood warning as some of the uncertainties can be quantified. EPS forecasts from a single weather centre only account for part of the uncertainties originating from initial conditions and stochastic physics. Other sources of uncertainties, including numerical implementations and/or data assimilation, can only be assessed if a grand ensemble of EPSs from different weather centres is used. When various models that produce EPS from different weather centres are aggregated, the probabilistic nature of the ensemble precipitation forecasts can be better retained and accounted for. The availability of twelve global EPSs through the 'THORPEX Interactive Grand Global Ensemble' (TIGGE) offers a new opportunity for the design of an improved probabilistic flood forecasting framework. This work presents a case study using the TIGGE database for flood warning on a meso-scale catchment. The upper reach of the River Severn catchment located in the Midlands Region of England is selected due to its abundant data for investigation and its relatively small size (4062 km²) compared to the resolution of the NWPs. 
This choice was deliberate as we hypothesize that the uncertainty in the forcing of smaller catchments cannot be represented by a single EPS with a very limited number of ensemble members, but only through the variance given by a large number of ensembles and ensemble systems. A coupled atmospheric-hydrologic-hydraulic cascade system driven by the TIGGE ensemble forecasts is set up to study the potential benefits of using the TIGGE database in early flood warning. The physically based and fully distributed LISFLOOD suite of models is selected to simulate discharge and flood inundation consecutively. The results show the TIGGE database is a promising tool to produce forecasts of discharge and flood inundation comparable with the observed discharge and simulated inundation driven by the observed discharge. The spread of discharge forecasts varies from centre to centre, but it is generally large, implying a significant level of uncertainty. Precipitation input uncertainties dominate and propagate through the cascade chain. The current NWPs fall short of representing the spatial variability of precipitation on a comparatively small catchment. This perhaps indicates the need to improve NWP resolution and/or disaggregation techniques to narrow down the spatial gap between meteorology and hydrology. It is not necessarily true that early flood warning becomes more reliable when more ensemble forecasts are employed. It is difficult to identify the best forecast centre(s), but in general the chance of detecting floods is increased by using the TIGGE database. Only one flood event was studied because most of the TIGGE data became available after October 2007. It is necessary to test the TIGGE ensemble forecasts with other flood events in other catchments with different hydrological and climatic regimes before general conclusions can be made on its robustness and applicability.
Predicting protein function and other biomedical characteristics with heterogeneous ensembles
Whalen, Sean; Pandey, Om Prakash
2015-01-01
Prediction problems in biomedical sciences, including protein function prediction (PFP), are generally quite difficult. This is due in part to incomplete knowledge of the cellular phenomenon of interest, the appropriateness and data quality of the variables and measurements used for prediction, as well as a lack of consensus regarding the ideal predictor for specific problems. In such scenarios, a powerful approach to improving prediction performance is to construct heterogeneous ensemble predictors that combine the output of diverse individual predictors that capture complementary aspects of the problems and/or datasets. In this paper, we demonstrate the potential of such heterogeneous ensembles, derived from stacking and ensemble selection methods, for addressing PFP and other similar biomedical prediction problems. Deeper analysis of these results shows that the superior predictive ability of these methods, especially stacking, can be attributed to their attention to the following aspects of the ensemble learning process: (i) better balance of diversity and performance, (ii) more effective calibration of outputs and (iii) more robust incorporation of additional base predictors. Finally, to make the effective application of heterogeneous ensembles to large complex datasets (big data) feasible, we present DataSink, a distributed ensemble learning framework, and demonstrate its sound scalability using the examined datasets. DataSink is publicly available from https://github.com/shwhalen/datasink. PMID:26342255
20180411 - Universal LD50 predictions using deep learning (ICCVAM)
NICEATM Predictive Models for Acute Oral Systemic Toxicity LD50 entry Risa R. Sayre (sayre.risa@epa.gov) & Christopher M. Grulke Our approach uses an ensemble of multilayer perceptron regressions to predict rat acute oral LD50 values from chemical features. Features were gene...
Decadal climate prediction (project GCEP).
Haines, Keith; Hermanson, Leon; Liu, Chunlei; Putt, Debbie; Sutton, Rowan; Iwi, Alan; Smith, Doug
2009-03-13
Decadal prediction uses climate models forced by changing greenhouse gases, as in the Intergovernmental Panel on Climate Change, but, unlike longer-range predictions, it also requires initialization with observations of the current climate. In particular, the upper-ocean heat content and circulation have a critical influence. Decadal prediction is still in its infancy and there is an urgent need to understand the important processes that determine predictability on these timescales. We have taken the first Hadley Centre Decadal Prediction System (DePreSys) and implemented it on several NERC institute compute clusters in order to study a wider range of initial condition impacts on decadal forecasting, eventually including the state of the land and cryosphere. The eScience methods are used to manage submission and output from the many ensemble model runs required to assess predictive skill. Early results suggest initial condition skill may extend for several years, even over land areas, but this depends sensitively on the definition used to measure skill, and alternatives are presented. The Grid for Coupled Ensemble Prediction (GCEP) system will allow the UK academic community to contribute to international experiments being planned to explore decadal climate predictability.
NASA Astrophysics Data System (ADS)
Chen, L. C.; Mo, K. C.; Zhang, Q.; Huang, J.
2014-12-01
Drought prediction from monthly to seasonal time scales is of critical importance to disaster mitigation, agricultural planning, and multi-purpose reservoir management. Starting in December 2012, NOAA Climate Prediction Center (CPC) has been providing operational Standardized Precipitation Index (SPI) Outlooks using the North American Multi-Model Ensemble (NMME) forecasts, to support CPC's monthly drought outlooks and briefing activities. The current NMME system consists of six model forecasts from U.S. and Canada modeling centers, including the CFSv2, CM2.1, GEOS-5, CCSM3.0, CanCM3, and CanCM4 models. In this study, we conduct an assessment of the predictive skill of meteorological drought using real-time NMME forecasts for the period from May 2012 to May 2014. The ensemble SPI forecasts are the equally weighted mean of the six model forecasts. Two performance measures, the anomaly correlation coefficient and root-mean-square errors against the observations, are used to evaluate forecast skill. Similar to the assessment based on NMME retrospective forecasts, predictive skill of monthly-mean precipitation (P) forecasts is generally low after the second month and errors vary among models. Although P forecast skill is not large, SPI predictive skill is high and the differences among models are small. The skill mainly comes from the P observations appended to the model forecasts. This factor also contributes to the similarity of SPI prediction among the six models. Still, NMME SPI ensemble forecasts have higher skill than those based on individual models or persistence, and the 6-month SPI forecasts are skillful out to four months. Three major drought events that occurred during the 2012-2014 period, the 2012 Central Great Plains drought, the 2013 Upper Midwest flash drought, and the 2013-2014 California drought, are used as examples to illustrate the system's strengths and limitations.
For precipitation-driven drought events, such as the 2012 Central Great Plains drought, NMME SPI forecasts perform well in predicting drought severity and spatial patterns. For fast-developing drought events, such as the 2013 Upper Midwest flash drought, the system failed to capture the onset of the drought.
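The evaluation described in the abstract above (an equally weighted multi-model mean scored with the anomaly correlation coefficient and root-mean-square error) can be sketched as follows. This is a minimal illustration and not the CPC implementation: the function names are hypothetical, the inputs are plain lists of values at matching grid points or times, and the uncentered form of the anomaly correlation is an assumption.

```python
import math

def ensemble_mean(model_forecasts):
    """Equally weighted mean over models; each model is a list of equal length."""
    n_models = len(model_forecasts)
    return [sum(values) / n_models for values in zip(*model_forecasts)]

def rmse(forecast, observed):
    """Root-mean-square error of the forecast against observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def anomaly_correlation(forecast, observed, climatology):
    """Uncentered anomaly correlation (one common form, assumed here)."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    numerator = sum(a * b for a, b in zip(f_anom, o_anom))
    denominator = math.sqrt(sum(a * a for a in f_anom) * sum(b * b for b in o_anom))
    return numerator / denominator
```

With six model forecasts, `ensemble_mean` would be called with a list of six sequences, and the result scored against observations with the two measures.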
Real-time Ensemble Forecasting of Coronal Mass Ejections using the WSA-ENLIL+Cone Model
NASA Astrophysics Data System (ADS)
Mays, M. L.; Taktakishvili, A.; Pulkkinen, A. A.; MacNeice, P. J.; Rastaetter, L.; Kuznetsova, M. M.; Odstrcil, D.
2013-12-01
Ensemble forecasting of coronal mass ejections (CMEs) provides significant information in that it provides an estimation of the spread or uncertainty in CME arrival time predictions due to uncertainties in determining CME input parameters. Ensemble modeling of CME propagation in the heliosphere is performed by forecasters at the Space Weather Research Center (SWRC) using the WSA-ENLIL cone model available at the Community Coordinated Modeling Center (CCMC). SWRC is an in-house research-based operations team at the CCMC which provides interplanetary space weather forecasting for NASA's robotic missions and performs real-time model validation. A distribution of n (routinely n=48) CME input parameters are generated using the CCMC Stereo CME Analysis Tool (StereoCAT) which employs geometrical triangulation techniques. These input parameters are used to perform n different simulations yielding an ensemble of solar wind parameters at various locations of interest (satellites or planets), including a probability distribution of CME shock arrival times (for hits), and geomagnetic storm strength (for Earth-directed hits). Ensemble simulations have been performed experimentally in real-time at the CCMC since January 2013. We present the results of ensemble simulations for a total of 15 CME events, 10 of which were performed in real-time. The observed CME arrival was within the range of ensemble arrival time predictions for 5 out of the 12 ensemble runs containing hits. The average arrival time prediction was computed for each of the twelve ensembles predicting hits and using the actual arrival time an average absolute error of 8.20 hours was found for all twelve ensembles, which is comparable to current forecasting errors. Some considerations for the accuracy of ensemble CME arrival time predictions include the importance of the initial distribution of CME input parameters, particularly the mean and spread. 
When the observed arrivals are not within the predicted range, this still allows the ruling out of prediction errors caused by tested CME input parameters. Prediction errors can also arise from ambient model parameters such as the accuracy of the solar wind background, and other limitations. Additionally the ensemble modeling setup was used to complete a parametric event case study of the sensitivity of the CME arrival time prediction to free parameters for ambient solar wind model and CME.
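The arrival-time statistics reported in the abstract above (whether the observed arrival falls within the ensemble range, and the average absolute error of the ensemble-mean prediction) can be illustrated with a short sketch. The function names and the representation of arrival times as hours relative to a reference epoch are assumptions for illustration only; this is not the SWRC/CCMC tooling.

```python
def summarize_arrival_ensemble(predicted_hours, observed_hour):
    """Summarize one ensemble of predicted arrival times (hours) against the observation."""
    mean_prediction = sum(predicted_hours) / len(predicted_hours)
    earliest, latest = min(predicted_hours), max(predicted_hours)
    return {
        "mean": mean_prediction,
        "range": (earliest, latest),
        "observed_in_range": earliest <= observed_hour <= latest,
        "abs_error": abs(mean_prediction - observed_hour),
    }

def mean_absolute_error(events):
    """Average absolute error over (predicted_hours, observed_hour) pairs."""
    errors = [summarize_arrival_ensemble(p, o)["abs_error"] for p, o in events]
    return sum(errors) / len(errors)
```

Calling `mean_absolute_error` on a list of events gives a single figure comparable in kind to the average absolute error quoted in the abstract, while `observed_in_range` supports the count of events whose observed arrival fell inside the predicted spread.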
Ensemble ecosystem modeling for predicting ecosystem response to predator reintroduction.
Baker, Christopher M; Gordon, Ascelin; Bode, Michael
2017-04-01
Introducing a new or extirpated species to an ecosystem is risky, and managers need quantitative methods that can predict the consequences for the recipient ecosystem. Proponents of keystone predator reintroductions commonly argue that the presence of the predator will restore ecosystem function, but this has not always been the case, and mathematical modeling has an important role to play in predicting how reintroductions will likely play out. We devised an ensemble modeling method that integrates species interaction networks and dynamic community simulations and used it to describe the range of plausible consequences of 2 keystone-predator reintroductions: wolves (Canis lupus) to Yellowstone National Park and dingoes (Canis dingo) to a national park in Australia. Although previous methods for predicting ecosystem responses to such interventions focused on predicting changes around a given equilibrium, we used Lotka-Volterra equations to predict changing abundances through time. We applied our method to interaction networks for wolves in Yellowstone National Park and for dingoes in Australia. Our model replicated the observed dynamics in Yellowstone National Park and produced a larger range of potential outcomes for the dingo network. However, we also found that changes in small vertebrates or invertebrates gave a good indication about the potential future state of the system. Our method allowed us to predict when the systems were far from equilibrium. Our results showed that the method can also be used to predict which species may increase or decrease following a reintroduction and can identify species that are important to monitor (i.e., species whose changes in abundance give extra insight into broad changes in the system). Ensemble ecosystem modeling can also be applied to assess the ecosystem-wide implications of other types of interventions including assisted migration, biocontrol, and invasive species eradication. © 2016 Society for Conservation Biology.
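The idea of combining a signed interaction network with Lotka-Volterra dynamics to obtain a range of plausible trajectories can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' method: interaction magnitudes are drawn uniformly from a hypothetical interval, integration uses a plain forward Euler step, and any screening of parameter sets (for example, for feasibility or stability) is omitted.

```python
import random

def simulate_glv(r, a, n0, dt=0.01, steps=1000):
    """Forward-Euler integration of dN_i/dt = N_i * (r_i + sum_j a_ij * N_j)."""
    n = list(n0)
    size = len(n)
    for _ in range(steps):
        growth = [n[i] * (r[i] + sum(a[i][j] * n[j] for j in range(size)))
                  for i in range(size)]
        # Clamp at zero so abundances never become negative.
        n = [max(n[i] + dt * growth[i], 0.0) for i in range(size)]
    return n

def sample_interactions(signs, rng, low=0.1, high=1.0):
    """Keep each interaction's sign from the network; draw its magnitude at random."""
    return [[s * rng.uniform(low, high) for s in row] for row in signs]

def ensemble_outcomes(r, signs, n0, members=50, seed=0):
    """Return (min, max) final abundance per species across ensemble members."""
    rng = random.Random(seed)
    finals = [simulate_glv(r, sample_interactions(signs, rng), n0)
              for _ in range(members)]
    return [(min(f[i] for f in finals), max(f[i] for f in finals))
            for i in range(len(n0))]
```

For a two-species prey and predator network one might pass `signs = [[-1, -1], [1, -1]]` and `r = [1.0, -0.5]`; the width of each returned interval then indicates how strongly the outcome depends on the unknown interaction strengths.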
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Delle Monache, L.; Alessandrini, S.
2016-12-01
Accuracy of weather forecasts in Northeast U.S. has become very important in recent years, given the serious and devastating effects of extreme weather events. Despite the use of evolved forecasting tools and techniques strengthened by increased super-computing resources, the weather forecasting systems still have their limitations in predicting extreme events. In this study, we examine the combination of analog ensemble and Bayesian regression techniques to improve the prediction of storms that have impacted NE U.S., mostly defined by the occurrence of high wind speeds (i.e. blizzards, winter storms, hurricanes and thunderstorms). The predicted wind speed, wind direction and temperature by two state-of-the-science atmospheric models (WRF and RAMS/ICLAMS) are combined using the mentioned techniques, exploring various ways that those variables influence the minimization of the prediction error (systematic and random). This study is focused on retrospective simulations of 146 storms that affected the NE U.S. in the period 2005-2016. In order to evaluate the techniques, leave-one-out cross validation procedure was implemented regarding 145 storms as the training dataset. The analog ensemble method selects a set of past observations that corresponded to the best analogs of the numerical weather prediction and provides a set of ensemble members of the selected observation dataset. The set of ensemble members can then be used in a deterministic or probabilistic way. In the Bayesian regression framework, optimal variances are estimated for the training partition by minimizing the root mean square error and are applied to the out-of-sample storm. The preliminary results indicate a significant improvement in the statistical metrics of 10-m wind speed for 146 storms using both techniques (20-30% bias and error reduction in all observation-model pairs). 
In this presentation, we discuss the various combinations of atmospheric predictors and techniques and illustrate how the long record of predicted storms is valuable in the improvement of wind speed prediction.
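The analog ensemble step described in the abstract above (select the past forecasts most similar to the current one and use their verifying observations as ensemble members) can be sketched as follows. This is a minimal illustration, assuming an unweighted Euclidean distance over the predictor values; operational analog ensemble implementations typically use a weighted, time-windowed metric, and the function name is hypothetical.

```python
import math

def analog_ensemble(current_forecast, past_forecasts, past_observations, k=3):
    """Return the observations that verified the k past forecasts closest to the current one."""
    distances = [(math.dist(current_forecast, past), index)
                 for index, past in enumerate(past_forecasts)]
    distances.sort()
    return [past_observations[index] for _, index in distances[:k]]
```

The returned members can be used deterministically (for example, by taking their mean) or probabilistically (by treating them as a sample of possible outcomes), matching the two uses mentioned in the abstract.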
Prediction skill of rainstorm events over India in the TIGGE weather prediction models
NASA Astrophysics Data System (ADS)
Karuna Sagar, S.; Rajeevan, M.; Vijaya Bhaskara Rao, S.; Mitra, A. K.
2017-12-01
Extreme rainfall events pose a serious threat of leading to severe floods in many countries worldwide. Therefore, advance prediction of their occurrence and spatial distribution is essential. In this paper, an analysis has been made to assess the skill of numerical weather prediction models in predicting rainstorms over India. Using a gridded daily rainfall data set and objective criteria, 15 rainstorms were identified during the monsoon season (June to September). The analysis was made using three TIGGE (The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble) models. The models considered are the European Centre for Medium-Range Weather Forecasts (ECMWF), National Centre for Environmental Prediction (NCEP) and the UK Met Office (UKMO). Verification of the TIGGE models for 43 observed rainstorm days from 15 rainstorm events has been made for the period 2007-2015. The comparison reveals that rainstorm events are predictable up to 5 days in advance, however with a bias in spatial distribution and intensity. The statistical parameters like mean error (ME) or Bias, root mean square error (RMSE) and correlation coefficient (CC) have been computed over the rainstorm region using the multi-model ensemble (MME) mean. The study reveals that the spread is large in ECMWF and UKMO followed by the NCEP model. Though the ensemble spread is quite small in NCEP, the ensemble member averages are not well predicted. The rank histograms suggest that the forecasts under-predict rainfall. The modified Contiguous Rain Area (CRA) technique was used to verify the spatial as well as the quantitative skill of the TIGGE models. Overall, the contribution from the displacement and pattern errors to the total RMSE is found to be more in magnitude. The volume error increases from 24 hr forecast to 48 hr forecast in all the three models.
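The rank histogram mentioned in the abstract above can be computed with a short routine. This is a generic sketch rather than the authors' code: the function name is hypothetical, and ties between members and the observation are handled by simple strict comparison.

```python
def rank_histogram(ensembles, observations):
    """Count, for each case, how many ensemble members fall below the observation."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, observed in zip(ensembles, observations):
        rank = sum(1 for member in members if member < observed)
        counts[rank] += 1
    return counts
```

A histogram weighted toward the highest ranks means the observation often exceeds most or all members, which is the signature of an under-predicting ensemble; a roughly flat histogram indicates a well-calibrated spread.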
NASA Astrophysics Data System (ADS)
Zunz, Violette; Goosse, Hugues; Dubinkina, Svetlana
2014-05-01
In this study, we assess systematically the impact of different initialisation procedures on the predictability of the sea ice in the Southern Ocean. These initialisation strategies are based on three data assimilation methods: the nudging, the particle filter with sequential resampling and the nudging proposal particle filter. An Earth-system model of intermediate complexity has been used to perform hindcast simulations in a perfect model approach. The predictability of the Southern Ocean sea ice is estimated through two aspects: the spread of the hindcast ensemble, indicating the uncertainty on the ensemble, and the correlation between the ensemble mean and the pseudo-observations, used to assess the accuracy of the prediction. Our results show that, at decadal timescales, more sophisticated data assimilation methods as well as denser pseudo-observations used to initialise the hindcasts decrease the spread of the ensemble but improve only slightly the accuracy of the prediction of the sea ice in the Southern Ocean. Overall, the predictability at interannual timescales is limited, at most, to three years ahead. At multi-decadal timescales, there is a clear improvement of the correlation of the trend in sea ice extent between the hindcasts and the pseudo-observations if the initialisation takes into account the pseudo-observations. The correlation reaches values larger than 0.5 and is due to the inertia of the ocean, showing the importance of the quality of the initialisation below the sea ice.
Coupled lagged ensemble weather- and river runoff prediction in complex Alpine terrain
NASA Astrophysics Data System (ADS)
Smiatek, Gerhard; Kunstmann, Harald; Werhahn, Johannes
2013-04-01
It is still a challenge to predict the fast-reacting streamflow response to precipitation in Alpine terrain. Civil protection measures require flood prediction with 24-48 h lead time. This holds particularly true for the Ammer River region, which was affected by century floods in 1999, 2003 and 2005. Since 2005 a coupled NWP/Hydrology model system has been operated to simulate and predict the Ammer River discharges. The Ammer River catchment is located in the Bavarian Ammergau Alps and alpine forelands, Germany. With elevations reaching 2185 m and annual mean precipitation between 1100 and 2000 mm, it represents a very demanding test ground for a river runoff prediction system. The one-way coupled system utilizes a lagged ensemble prediction system (EPS) taking into account a combination of recent and previous NWP forecasts. The major components of the system are the MM5 NWP model run at 3.5 km resolution and initialized twice a day, the hydrology model WaSiM-ETH run at 100 m resolution and Perl object environment (POE) implementing the networking and the system operation. Results obtained in the years 2005-2012 reveal that river runoff simulations already show high correlation (NSC in the range 0.53 to 0.95) with observed runoff in retrospective runs with monitored meteorology data, but suffer from errors in the quantitative precipitation forecast (QPF) from the employed numerical weather prediction model. We evaluate the NWP model accuracy, especially the precipitation intensity, frequency and location, and put a focus on the performance gain of bias adjustment procedures. We show how this enhanced QPF data helps to reduce the uncertainty in the discharge prediction. In addition to the HND (Hochwassernachrichtendienst, Bayern) observations, TERENO Longterm Observatory hydrometeorological observation data have been available since 2011.
They are used to evaluate the NWP performance and to set up a bias correction procedure based on ensemble postprocessing applying Bayesian model averaging (BMA). We first present briefly the technical setup of the operational coupled lagged NWP/Hydrology model system and then focus on the evaluation of the NWP model, the BMA enhanced QPF and its application within the Ammer simulation system in the period 2011-2012.
A Solar Time-Based Analog Ensemble Method for Regional Solar Power Forecasting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hodge, Brian S; Zhang, Xinmin; Li, Yuan
This paper presents a new analog ensemble method for day-ahead regional photovoltaic (PV) power forecasting with hourly resolution. By utilizing open weather forecast and power measurement data, this prediction method is processed within a set of historical data with similar meteorological data (temperature and irradiance), and astronomical date (solar time and earth declination angle). Further, clustering and blending strategies are applied to improve its accuracy in regional PV forecasting. The robustness of the proposed method is demonstrated with three different numerical weather prediction models, the North American Mesoscale Forecast System, the Global Forecast System, and the Short-Range Ensemble Forecast, for both region level and single site level PV forecasts. Using real measured data, the new forecasting approach is applied to the load zone in Southeastern Massachusetts as a case study. The normalized root mean square error (NRMSE) has been reduced by 13.80%-61.21% when compared with three tested baselines.
Synoptic Factors Affecting Structure Predictability of Hurricane Alex (2016)
NASA Astrophysics Data System (ADS)
Gonzalez-Aleman, J. J.; Evans, J. L.; Kowaleski, A. M.
2016-12-01
On January 7, 2016, a disturbance formed over the western North Atlantic basin. After undergoing tropical transition, the system became the first hurricane of 2016 - and the first North Atlantic hurricane to form in January since 1938. Already an extremely rare hurricane event, Alex then underwent extratropical transition [ET] just north of the Azores Islands. We examine the factors affecting Alex's structural evolution through a new technique called path-clustering. In this way, 51 ensembles from the European Centre for Medium-Range Weather Forecasts Ensemble Prediction System (ECMWF-EPS) are grouped based on similarities in the storm's path through the Cyclone Phase Space (CPS). The differing clusters group various possible scenarios of structural development represented in the ensemble forecasts. As a result, it is possible to shed light on the role of the synoptic scale in changing the structure of this hurricane in the midlatitudes through intercomparison of the most "realistic" forecast of the evolution of Alex and the other physically plausible modes of its development.
NASA Astrophysics Data System (ADS)
Ament, F.; Weusthoff, T.; Arpagaus, M.; Rotach, M.
2009-04-01
The main aim of the WWRP Forecast Demonstration Project MAP D-PHASE is to demonstrate the performance of today's models to forecast heavy precipitation and flood events in the Alpine region. Therefore an end-to-end, real-time forecasting system was installed and operated during the D-PHASE Operations Period from June to November 2007. Part of this system are 30 numerical weather prediction models (deterministic as well as ensemble systems) operated by weather services and research institutes, which issue alerts if predicted precipitation accumulations exceed critical thresholds. In addition to the real-time alerts, all relevant model fields of these simulations are stored in a central data archive. This comprehensive data set allows a detailed assessment of today's quantitative precipitation forecast (QPF) performance in the Alpine region. We will present results of QPF verifications against Swiss radar and rain gauge data both from a qualitative point of view, in terms of alerts, as well as from a quantitative perspective, in terms of precipitation rate. Various influencing factors like lead time, accumulation time, selection of warning thresholds, or bias corrections will be discussed. In addition to traditional verifications of area average precipitation amounts, the performance of the models to predict the correct precipitation statistics without requiring a point-to-point match will be described by using modern Fuzzy verification techniques. Both analyses reveal significant advantages of deep convection resolving models compared to coarser models with parameterized convection. An intercomparison of the model forecasts themselves reveals a remarkably high variability between different models, and makes it worthwhile to evaluate the potential of a multi-model ensemble. Various multi-model ensemble strategies will be tested by combining D-PHASE models to virtual ensemble systems.
SVM and SVM Ensembles in Breast Cancer Prediction
Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong
2017-01-01
Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers. PMID:28060807
Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza
2017-12-01
Colorectal cancer (CRC) is one of the most common malignancies and causes of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year survival of CRC patients using a variety of basic and ensemble data mining methods. The CRC dataset from The Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases was used for prediction and comparative study of the base and ensemble data mining techniques. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit and MedCalc software were respectively utilized for creating and comparing the models. The obtained results showed that the predictive performance of the developed models was altogether high (all greater than 90%). Overall, the performance of ensemble models was higher than that of basic classifiers and the best result was achieved by the ensemble voting model in terms of area under the ROC curve (AUC = 0.96). AUC comparison of the models showed that the ensemble voting method significantly outperformed all models except for two methods, Random Forest (RF) and Bayesian Network (BN), considering the overlapping 95% confidence intervals. This result may indicate high predictive power of these two methods along with ensemble voting for predicting 5-year survival of CRC patients.
Short-term Temperature Prediction Using Adaptive Computing on Dynamic Scales
NASA Astrophysics Data System (ADS)
Hu, W.; Cervone, G.; Jha, S.; Balasubramanian, V.; Turilli, M.
2017-12-01
When predicting temperature, there are specific places and times when high accuracy predictions are harder. For example, not all the sub-regions in the domain require the same amount of computing resources to generate an accurate prediction. Plateau areas might require less computing resources than mountainous areas because of the steeper gradient of temperature change in the latter. However, it is difficult to estimate beforehand the optimal allocation of computational resources because several parameters play a role in determining the accuracy of the forecasts, in addition to orography. The allocation of resources to perform simulations can become a bottleneck because it requires human intervention to stop jobs or start new ones. The goal of this project is to design and develop a dynamic approach to generate short-term temperature predictions that can automatically determine the required computing resources and the geographic scales of the predictions based on the spatial and temporal uncertainties. The predictions and the prediction quality metrics are computed using a numeric weather prediction model, Analog Ensemble (AnEn), and the parallelization on high performance computing systems is accomplished using Ensemble Toolkit, one component of the RADICAL-Cybertools family of tools. RADICAL-Cybertools decouple the science needs from the computational capabilities by building an intermediate layer to run general ensemble patterns, regardless of the science. In this research, we show how the Ensemble Toolkit allows generating high resolution temperature forecasts at different spatial and temporal resolutions. The AnEn algorithm is run using NAM analysis and forecast data for the continental United States for a period of 2 years. AnEn results show that temperature forecasts perform well according to different probabilistic and deterministic statistical tests.
NASA Astrophysics Data System (ADS)
Christensen, Hannah; Moroz, Irene; Palmer, Tim
2015-04-01
Forecast verification is important across scientific disciplines as it provides a framework for evaluating the performance of a forecasting system. In the atmospheric sciences, probabilistic skill scores are often used for verification as they provide a way of unambiguously ranking the performance of different probabilistic forecasts. In order to be useful, a skill score must be proper -- it must encourage honesty in the forecaster, and reward forecasts which are reliable and which have good resolution. A new score, the Error-spread Score (ES), is proposed which is particularly suitable for evaluation of ensemble forecasts. It is formulated with respect to the moments of the forecast. The ES is confirmed to be a proper score, and is therefore sensitive to both resolution and reliability. The ES is tested on forecasts made using the Lorenz '96 system, and found to be useful for summarising the skill of the forecasts. The European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble prediction system (EPS) is evaluated using the ES. Its performance is compared to a perfect statistical probabilistic forecast -- the ECMWF high resolution deterministic forecast dressed with the observed error distribution. This generates a forecast that is perfectly reliable if considered over all time, but which does not vary from day to day with the predictability of the atmospheric flow. The ES distinguishes between the dynamically reliable EPS forecasts and the statically reliable dressed deterministic forecasts. Other skill scores are tested and found to be comparatively insensitive to this desirable forecast quality. The ES is used to evaluate seasonal range ensemble forecasts made with the ECMWF System 4. The ensemble forecasts are found to be skilful when compared with climatological or persistence forecasts, though this skill is dependent on region and time of year.
NASA Astrophysics Data System (ADS)
Najafi, H.; Shahbazi, A.; Zohrabi, N.; Robertson, A. W.; Mofidi, A.; Massah Bavani, A. R.
2016-12-01
Each year, a number of high impact weather events occur worldwide. Since any level of predictability at sub-seasonal to seasonal timescale is highly beneficial to society, international efforts are now in progress to promote reliable Ensemble Prediction Systems for monthly forecasts within the WWRP/WCRP initiative (S2S) project and North American Multi Model Ensemble (NMME). For water resources managers in the face of extreme events, not only can reliable forecasts of high impact weather events prevent catastrophic losses caused by floods, but they can also contribute to benefits gained from hydropower generation and water markets. The aim of this paper is to analyze the predictability of recent severe weather events over Iran. Two recent heavy precipitation events are considered as an illustration to examine whether S2S forecasts can be used for developing flood alert systems, especially where large cascades of dams are in operation. Both events caused major damage to cities and infrastructure. The first severe precipitation event was in early November 2015, when heavy precipitation (more than 50 mm) occurred in 2 days. More recently, up to 300 mm of precipitation was observed within less than a week in April 2016, causing a consequent flash flood. Over some stations, the observed precipitation was even more than the total annual mean precipitation. To analyze the predictive capability, ensemble forecasts from several operational centers, including the European Centre for Medium-Range Weather Forecasts (ECMWF) system, Climate Forecast System Version 2 (CFSv2) and Chinese Meteorological Center (CMA), are evaluated. It has been observed that significant changes in precipitation anomalies were likely to be predicted days in advance. The next step will be to conduct thorough analysis based on comparing multi-model outputs over the full hindcast dataset, with a view to developing real-time high impact weather prediction systems.
Development and Testing of a Coupled Ocean-atmosphere Mesoscale Ensemble Prediction System
2011-06-28
wind, temperature, and moisture variables, while the oceanographic ET is derived from ocean current, temperature, and salinity variables. Estimates of...uncertainty in the model. Rigorously accurate ensemble methods for describing the distribution of future states given past information include particle
NASA Technical Reports Server (NTRS)
Chambon, Philippe; Zhang, Sara Q.; Hou, Arthur Y.; Zupanski, Milija; Cheung, Samson
2013-01-01
The forthcoming Global Precipitation Measurement (GPM) Mission will provide next generation precipitation observations from a constellation of satellites. Since precipitation by nature has large variability and low predictability at cloud-resolving scales, the impact of precipitation data on the skills of mesoscale numerical weather prediction (NWP) is largely affected by the characterization of background and observation errors and the representation of nonlinear cloud/precipitation physics in an NWP data assimilation system. We present a data impact study on the assimilation of precipitation-affected microwave (MW) radiances from a pre-GPM satellite constellation using the Goddard WRF Ensemble Data Assimilation System (Goddard WRF-EDAS). A series of assimilation experiments are carried out in a Weather Research Forecast (WRF) model domain of 9 km resolution in western Europe. Sensitivities to observation error specifications, background error covariance estimated from ensemble forecasts with different ensemble sizes, and MW channel selections are examined through single-observation assimilation experiments. An empirical bias correction for precipitation-affected MW radiances is developed based on the statistics of radiance innovations in rainy areas. The data impact is assessed by full data assimilation cycling experiments for a storm event that occurred in France in September 2010. Results show that the assimilation of MW precipitation observations from a satellite constellation mimicking GPM has a positive impact on the accumulated rain forecasts verified with surface radar rain estimates. The case-study on a convective storm also reveals that the accuracy of ensemble-based background error covariance is limited by sampling errors and model errors such as precipitation displacement and unresolved convective scale instability.
NASA Astrophysics Data System (ADS)
Miyoshi, T.; Teramura, T.; Ruiz, J.; Kondo, K.; Lien, G. Y.
2016-12-01
Convective weather is known to be highly nonlinear and chaotic, and it is hard to predict its location and timing precisely. Our Big Data Assimilation (BDA) effort has been exploring the use of dense and frequent observations to avoid non-Gaussian probability density functions (PDFs) and to apply an ensemble Kalman filter under the Gaussian error assumption. The phased array weather radar (PAWR) can observe a dense three-dimensional volume scan with 100-m range resolution and 100 elevation angles in only 30 seconds. The BDA system assimilates the PAWR reflectivity and Doppler velocity observations every 30 seconds into 100 ensemble members of a storm-scale numerical weather prediction (NWP) model at 100-m grid spacing. The 30-second-update, 100-m-mesh BDA system has been quite successful in multiple case studies of local severe rainfall events. However, with 1000 ensemble members, the reduced-resolution BDA system at 1-km grid spacing showed significantly non-Gaussian PDFs with every-30-second updates. With a 10240-member ensemble Kalman filter with a global NWP model at 112-km grid spacing, we found roughly 1000 members satisfactory to capture the non-Gaussian error structures. With these in mind, we explore how the density of observations in space and time affects the non-Gaussianity in an ensemble Kalman filter with a simple toy model. In this presentation, we will present the most up-to-date results of the BDA research, as well as the investigation with the toy model on the non-Gaussianity with dense and frequent observations.
Weather and seasonal climate prediction for South America using a multi-model superensemble
NASA Astrophysics Data System (ADS)
Chaves, Rosane R.; Ross, Robert S.; Krishnamurti, T. N.
2005-11-01
This work examines the feasibility of weather and seasonal climate predictions for South America using the multi-model synthetic superensemble approach for climate, and the multi-model conventional superensemble approach for numerical weather prediction, both developed at Florida State University (FSU). The effect on seasonal climate forecasts of the number of models used in the synthetic superensemble is investigated. It is shown that the synthetic superensemble approach for climate and the conventional superensemble approach for numerical weather prediction can reduce the errors over South America in seasonal climate prediction and numerical weather prediction. For climate prediction, a suite of 13 models is used. The forecast lead-time is 1 month for the climate forecasts, which consist of precipitation and surface temperature forecasts. The multi-model ensemble is comprised of four versions of the FSU-Coupled Ocean-Atmosphere Model, seven models from the Development of a European Multi-model Ensemble System for Seasonal to Interannual Prediction (DEMETER), a version of the Community Climate Model (CCM3), and a version of the predictive Ocean Atmosphere Model for Australia (POAMA). The results show that conditions over South America are appropriately simulated by the Florida State University Synthetic Superensemble (FSUSSE) in comparison to observations and that the skill of this approach increases with the use of additional models in the ensemble. When compared to observations, the forecasts are generally better than those from both a single climate model and the multi-model ensemble mean, for the variables tested in this study. For numerical weather prediction, the conventional Florida State University Superensemble (FSUSE) is used to predict the mass and motion fields over South America.
Predictions of mean sea level pressure, 500 hPa geopotential height, and 850 hPa wind are made with a multi-model superensemble comprised of six global models for the period January, February, and December of 2000. The six global models are from the following forecast centers: FSU, Bureau of Meteorology Research Center (BMRC), Japan Meteorological Agency (JMA), National Centers for Environmental Prediction (NCEP), Naval Research Laboratory (NRL), and Recherche en Prevision Numerique (RPN). Predictions of precipitation are made for the period January, February, and December of 2001 with a multi-analysis-multi-model superensemble where, in addition to the six forecast models just mentioned, five additional versions of the FSU model are used in the ensemble, each with a different initialization (analysis) based on different physical initialization procedures. On the basis of observations, the results show that the FSUSE provides the best forecasts of the mass and motion field variables to forecast day 5, when compared to both the models comprising the ensemble and the multi-model ensemble mean during the wet season of December-February over South America. Individual case studies show that the FSUSE provides excellent predictions of rainfall for particular synoptic events to forecast day 3.
NASA Astrophysics Data System (ADS)
Romanova, Vanya; Hense, Andreas; Wahl, Sabrina; Brune, Sebastian; Baehr, Johanna
2016-04-01
The decadal variability of the surface net freshwater fluxes and its predictability are compared in a set of retrospective predictions, all using the same model setup and differing only in the implemented ocean initialisation method and ensemble generation method. The basic aim is to deduce the differences between the initialization/ensemble generation methods in view of the uncertainty of the verifying observational data sets. The analysis will give an approximation of the uncertainties of the net freshwater fluxes, which up to now appear to be one of the most uncertain products in observational data and model outputs. All ensemble generation methods are implemented into the MPI-ESM earth system model in the framework of the ongoing MiKlip project (www.fona-miklip.de). Hindcast experiments are initialised annually between 2000 and 2004, and from each start year 10 ensemble members are initialized for 5 years each. Four different ensemble generation methods are compared: (i) a method based on the Anomaly Transform method (Romanova and Hense, 2015) in which the initial oceanic perturbations represent orthogonal and balanced anomaly structures in space and time and between the variables taken from a control run, (ii) one-day-lagged ocean states from the MPI-ESM-LR baseline system, (iii) one-day-lagged ocean and atmospheric states with preceding full-field nudging to re-analysis in both the atmospheric and the oceanic component of the system - the baseline one MPI-ESM-LR system, (iv) an Ensemble Kalman Filter (EnKF) implemented into the oceanic part of MPI-ESM (Brune et al. 2015), assimilating monthly subsurface oceanic temperature and salinity (EN3) using the Parallel Data Assimilation Framework (PDAF). The hindcasts are evaluated probabilistically using fresh water flux data sets from four different reanalysis data sets: MERRA, NCEP-R1, GFDL ocean reanalysis and GECCO2. The assessments show no clear differences in the evaluation scores on regional scales.
However, on the global scale the physically motivated methods (i) and (iv) provide probabilistic hindcasts with a consistently higher reliability than the lagged initialization methods (ii)/(iii) despite the large uncertainties in the verifying observations and in the simulations.
Bayesian energy landscape tilting: towards concordant models of molecular ensembles.
Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju
2014-03-18
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and (3)J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Towards the Prediction of Decadal to Centennial Climate Processes in the Coupled Earth System Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhengyu; Kutzbach, J.; Jacob, R.
2011-12-05
In this proposal, we have made major advances in the understanding of decadal and long term climate variability. (a) We performed a systematic study of multidecadal climate variability in FOAM-LPJ and CCSM-T31, and are starting to explore decadal variability in the IPCC AR4 models. (b) We developed several novel methods for the assessment of climate feedbacks in the observation. (c) We also developed a new initialization scheme DAI (Dynamical Analogue Initialization) for ensemble decadal prediction. (d) We also studied climate-vegetation feedback in the observation and models. (e) Finally, we started a pilot program using Ensemble Kalman Filter in CGCM for decadal climate prediction.
Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F
2016-01-01
Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry for improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% has been obtained by the AIME combining decorate on top tier with bagging on middle tier based on random forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.
Predicting September sea ice: Ensemble skill of the SEARCH Sea Ice Outlook 2008-2013
NASA Astrophysics Data System (ADS)
Stroeve, Julienne; Hamilton, Lawrence C.; Bitz, Cecilia M.; Blanchard-Wrigglesworth, Edward
2014-04-01
Since 2008, the Study of Environmental Arctic Change Sea Ice Outlook has solicited predictions of September sea-ice extent from the Arctic research community. Individuals and teams employ a variety of modeling, statistical, and heuristic approaches to make these predictions. Viewed as monthly ensembles each with one or two dozen individual predictions, they display a bimodal pattern of success. In years when observed ice extent is near its trend, the median predictions tend to be accurate. In years when the observed extent is anomalous, the median and most individual predictions are less accurate. Statistical analysis suggests that year-to-year variability, rather than methods, dominates the variation in ensemble prediction success. Furthermore, ensemble predictions do not improve as the season evolves. We consider the role of initial ice, atmosphere and ocean conditions, and summer storms and weather in contributing to the challenge of sea-ice prediction.
Multi-Model Ensemble Wake Vortex Prediction
NASA Technical Reports Server (NTRS)
Koerner, Stephan; Holzaepfel, Frank; Ahmad, Nash'at N.
2015-01-01
Several multi-model ensemble methods are investigated for predicting wake vortex transport and decay. This study is a joint effort between National Aeronautics and Space Administration and Deutsches Zentrum fuer Luft- und Raumfahrt to develop a multi-model ensemble capability using their wake models. An overview of different multi-model ensemble methods and their feasibility for wake applications is presented. The methods include Reliability Ensemble Averaging, Bayesian Model Averaging, and Monte Carlo Simulations. The methodologies are evaluated using data from wake vortex field experiments.
NASA Astrophysics Data System (ADS)
Yang, Xiu-Qun; Yang, Dejian; Xie, Qian; Zhang, Yaocun; Ren, Xuejuan; Tang, Youmin
2017-04-01
Based on historical forecasts of three quasi-operational multi-model ensemble (MME) systems, this study assesses the superiority of coupled MME over contributing single-model ensembles (SMEs) and over uncoupled atmospheric MME in predicting the Western North Pacific-East Asian summer monsoon variability. The probabilistic and deterministic forecast skills are measured by Brier skill score (BSS) and anomaly correlation (AC), respectively. A forecast-format dependent MME superiority over SMEs is found. The probabilistic forecast skill of the MME is always significantly better than that of each SME, while the deterministic forecast skill of the MME can be lower than that of some SMEs. The MME superiority arises from both the model diversity and the ensemble size increase in the tropics, and primarily from the ensemble size increase in the subtropics. The BSS is composed of reliability and resolution, two attributes characterizing probabilistic forecast skill. The probabilistic skill increase of the MME is dominated by the dramatic improvement in reliability, while resolution is not always improved, similar to AC. A monotonic resolution-AC relationship is further found and qualitatively explained, whereas little relationship can be identified between reliability and AC. It is argued that the MME's success in improving the reliability arises from an effective reduction of the overconfidence in forecast distributions. Moreover, it is shown that the seasonal predictions with coupled MME are more skillful than those with the uncoupled atmospheric MME forced by persisting sea surface temperature (SST) anomalies, since the coupled MME better predicts the SST anomaly evolution in three key regions.
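The two skill measures named in this abstract can be illustrated with a minimal sketch. It assumes the textbook definitions (Brier skill score relative to a climatological base-rate forecast, and anomaly correlation as a centered Pearson correlation); the abstract does not state which reference forecast or AC variant the authors used, and the function names and sample values below are hypothetical.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS / BS_ref, with the observed base rate as the reference forecast.

    Assumption: climatology (the sample base rate) is the reference.
    """
    base_rate = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([base_rate] * len(outcomes), outcomes)
    if bs_ref == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


def anomaly_correlation(forecast, observed):
    """Centered Pearson correlation between forecast and observed values."""
    n = len(forecast)
    if n < 2 or n != len(observed):
        raise ValueError("need two equal-length series with at least 2 values")
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    sfo = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    sff = sum((f - f_mean) ** 2 for f in forecast)
    soo = sum((o - o_mean) ** 2 for o in observed)
    if sff == 0 or soo == 0:
        raise ValueError("a constant series has no defined correlation")
    return sfo / math.sqrt(sff * soo)


if __name__ == "__main__":
    # Hypothetical probabilities for a binary event and what actually happened.
    probs = [0.9, 0.1, 0.8, 0.2]
    outcomes = [1, 0, 1, 0]
    print(brier_skill_score(probs, outcomes))
    print(anomaly_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A BSS above zero means the probabilistic forecast beats the reference; a forecast that always issues the base rate scores exactly zero under this definition.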
NASA Astrophysics Data System (ADS)
Gagnon, Patrick; Rousseau, Alain N.; Charron, Dominique; Fortin, Vincent; Audet, René
2017-11-01
Several businesses and industries rely on rainfall forecasts to support their day-to-day operations. To deal with the uncertainty associated with rainfall forecast, some meteorological organisations have developed products, such as ensemble forecasts. However, due to the intensive computational requirements of ensemble forecasts, the spatial resolution remains coarse. For example, Environment and Climate Change Canada's (ECCC) Global Ensemble Prediction System (GEPS) data is freely available on a 1-degree grid (about 100 km), while those of the so-called High Resolution Deterministic Prediction System (HRDPS) are available on a 2.5-km grid (about 40 times finer). Potential users are then left with the option of using either a high-resolution rainfall forecast without uncertainty estimation and/or an ensemble with a spectrum of plausible rainfall values, but at a coarser spatial scale. The objective of this study was to evaluate the added value of coupling the Gibbs Sampling Disaggregation Model (GSDM) with ECCC products to provide accurate, precise and consistent rainfall estimates at a fine spatial resolution (10-km) within a forecast framework (6-h). For 30 6-h rainfall events occurring within a 40,000-km² area (Québec, Canada), results show that, using 100-km aggregated reference rainfall depths as input, statistics of the rainfall fields generated by GSDM were close to those of the 10-km reference field. However, in forecast mode, GSDM outcomes inherit the ECCC forecast biases, resulting in a poor performance when GEPS data were used as input, mainly due to the inherent rainfall depth distribution of the latter product. Better performance was achieved when the Regional Deterministic Prediction System (RDPS), available on a 10-km grid and aggregated at 100-km, was used as input to GSDM. Nevertheless, most of the analyzed ensemble forecasts were weakly consistent. Some areas of improvement are identified herein.
NASA Technical Reports Server (NTRS)
Reynolds, David; Rasch, William; Kozlowski, Daniel; Burks, Jason; Zavodsky, Bradley; Bernardet, Ligia; Jankov, Isidora; Albers, Steve
2014-01-01
The Experimental Regional Ensemble Forecast (ExREF) system is a tool for the development and testing of new Numerical Weather Prediction (NWP) methodologies. ExREF is run in near-realtime by the Global Systems Division (GSD) of the NOAA Earth System Research Laboratory (ESRL) and its products are made available through a website, an ftp site, and via the Unidata Local Data Manager (LDM). The ExREF domain covers most of North America and has 9-km horizontal grid spacing. The ensemble has eight members, all employing WRF-ARW. The ensemble uses a variety of initial conditions from LAPS and the Global Forecasting System (GFS) and multiple boundary conditions from the GFS ensemble. Additionally, a diversity of physical parameterizations is used to increase ensemble spread and to account for the uncertainty in forecasting extreme precipitation events. ExREF has been a component of the Hydrometeorology Testbed (HMT) NWP suite in the 2012-2013 and 2013-2014 winters. A smaller domain covering just the West Coast was created to minimize band-width consumption for the NWS. This smaller domain has been and is being distributed to the National Weather Service (NWS) Weather Forecast Office and California Nevada River Forecast Center in Sacramento, California, where it is ingested into the Advanced Weather Interactive Processing System (AWIPS I and II) to provide guidance on the forecasting of extreme precipitation events. This paper will review the cooperative effort employed by NOAA ESRL, NASA SPoRT (Short-term Prediction Research and Transition Center), and the NWS to facilitate the ingest and display of ExREF data utilizing the AWIPS I and II D2D and GFE (Graphical Forecast Editor) software.
Within GFE is a very useful verification software package called BoiVer that allows the NWS to utilize the River Forecast Center's 4-km gridded QPE to compare with all operational NWP models' 6-hr QPF along with the ExREF mean 6-hr QPF so the forecasters can build confidence in the use of the ExREF in preparing their rainfall forecasts. Preliminary results will be presented.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
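The rankwise averaging step mentioned in this abstract can be sketched as follows. This is only an illustration of averaging ranks across methods, not the published NIMEFI implementation: the function names, the tie handling, and the example scores are all assumptions, and the per-method importance scores are taken as given rather than computed.

```python
def scores_to_ranks(scores):
    """Map each candidate link to its rank, where rank 1 is the highest score.

    Ties keep their insertion order (an arbitrary choice made for this sketch).
    """
    ordered = sorted(scores, key=lambda link: scores[link], reverse=True)
    return {link: position for position, link in enumerate(ordered, start=1)}


def rank_average(score_tables):
    """Average the per-method ranks of every candidate link.

    score_tables: list of dicts, one per method, mapping
    (regulator, target) -> importance score. All methods must score
    the same set of links.
    """
    if not score_tables:
        raise ValueError("at least one score table is required")
    links = set(score_tables[0])
    if any(set(table) != links for table in score_tables):
        raise ValueError("all methods must score the same candidate links")
    rank_tables = [scores_to_ranks(table) for table in score_tables]
    return {
        link: sum(ranks[link] for ranks in rank_tables) / len(rank_tables)
        for link in score_tables[0]
    }


if __name__ == "__main__":
    # Hypothetical importance scores from two different methods.
    method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
    method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
    combined = rank_average([method_a, method_b])
    # A lower average rank means stronger combined evidence for a link.
    for link, rank in sorted(combined.items(), key=lambda item: item[1]):
        print(link, rank)
```

Working on ranks rather than raw scores sidesteps the fact that different feature importance methods produce scores on incomparable scales.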
NASA Astrophysics Data System (ADS)
Warner, Thomas T.; Sheu, Rong-Shyang; Bowers, James F.; Sykes, R. Ian; Dodd, Gregory C.; Henn, Douglas S.
2002-05-01
Ensemble simulations made using a coupled atmospheric dynamic model and a probabilistic Lagrangian puff dispersion model were employed in a forensic analysis of the transport and dispersion of a toxic gas that may have been released near Al Muthanna, Iraq, during the Gulf War. The ensemble study had two objectives, the first of which was to determine the sensitivity of the calculated dosage fields to the choices that must be made about the configuration of the atmospheric dynamic model. In this test, various choices were used for model physics representations and for the large-scale analyses that were used to construct the model initial and boundary conditions. The second study objective was to examine the dispersion model's ability to use ensemble inputs to predict dosage probability distributions. Here, the dispersion model was used with the ensemble mean fields from the individual atmospheric dynamic model runs, including the variability in the individual wind fields, to generate dosage probabilities. These are compared with the explicit dosage probabilities derived from the individual runs of the coupled modeling system. The results demonstrate that the specific choices made about the dynamic-model configuration and the large-scale analyses can have a large impact on the simulated dosages. For example, the area near the source that is exposed to a selected dosage threshold varies by up to a factor of 4 among members of the ensemble. The agreement between the explicit and ensemble dosage probabilities is relatively good for both low and high dosage levels. Although only one ensemble was considered in this study, the encouraging results suggest that a probabilistic dispersion model may be of value in quantifying the effects of uncertainties in a dynamic-model ensemble on dispersion model predictions of atmospheric transport and dispersion.
Ensemble-based docking: From hit discovery to metabolism and toxicity predictions.
Evangelista, Wilfredo; Weir, Rebecca L; Ellingson, Sally R; Harris, Jason B; Kapoor, Karan; Smith, Jeremy C; Baudry, Jerome
2016-10-15
This paper describes and illustrates the use of ensemble-based docking, i.e., using a collection of protein structures in docking calculations for hit discovery, the exploration of biochemical pathways and toxicity prediction of drug candidates. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials. Copyright © 2016 Elsevier Ltd. All rights reserved.
The Nature and Variability of Ensemble Sensitivity Fields that Diagnose Severe Convection
NASA Astrophysics Data System (ADS)
Ancell, B. C.
2017-12-01
Ensemble sensitivity analysis (ESA) is a statistical technique that uses information from an ensemble of forecasts to reveal relationships between chosen forecast metrics and the larger atmospheric state at various forecast times. A number of studies have employed ESA from the perspectives of dynamical interpretation, observation targeting, and ensemble subsetting toward improved probabilistic prediction of high-impact events, mostly at synoptic scales. We tested ESA using convective forecast metrics at the 2016 HWT Spring Forecast Experiment to understand the utility of convective ensemble sensitivity fields in improving forecasts of severe convection and its individual hazards. The main purpose of this evaluation was to understand the temporal coherence and general characteristics of convective sensitivity fields toward future use in improving ensemble predictability within an operational framework. The magnitude and coverage of simulated reflectivity, updraft helicity, and surface wind speed were used as response functions, and the sensitivity of these functions to winds, temperatures, geopotential heights, and dew points at different atmospheric levels and at different forecast times were evaluated on a daily basis throughout the HWT Spring Forecast Experiment. These sensitivities were calculated within the Texas Tech real-time ensemble system, which possesses 42 members that run twice daily to 48-hr forecast time. Here we summarize both the findings regarding the nature of the sensitivity fields and the evaluation of the participants that reflects their opinions of the utility of operational ESA. The future direction of ESA for operational use will also be discussed.
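One common way to estimate an ensemble sensitivity is a univariate linear regression of the response function on a state variable across ensemble members, i.e. the ratio of their sample covariance to the state variance. The sketch below assumes that formulation; the abstract itself does not spell out the estimator, and the function name and sample values are hypothetical.

```python
def ensemble_sensitivity(response, state):
    """Regression slope of a response function on one state variable.

    response: one value of the forecast metric J per ensemble member
    state:    one value of the earlier state variable x per member
    Returns cov(J, x) / var(x), in units of J per unit of x.
    """
    n = len(response)
    if n < 2 or n != len(state):
        raise ValueError("need matching response and state values for >= 2 members")
    j_mean = sum(response) / n
    x_mean = sum(state) / n
    covariance = sum(
        (j - j_mean) * (x - x_mean) for j, x in zip(response, state)
    ) / (n - 1)
    variance = sum((x - x_mean) ** 2 for x in state) / (n - 1)
    if variance == 0:
        raise ValueError("state variable has no ensemble spread")
    return covariance / variance


if __name__ == "__main__":
    # Hypothetical 4-member ensemble: earlier temperature at one grid point (K)
    # and a later response such as maximum surface wind speed (m/s).
    temperature = [1.0, 2.0, 3.0, 4.0]
    wind_speed = [3.0, 5.0, 7.0, 9.0]
    print(ensemble_sensitivity(wind_speed, temperature))
```

Repeating this calculation at every grid point and level yields a sensitivity field of the kind the abstract describes; with a real ensemble, sampling error from the limited member count would also need to be considered.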
NASA Astrophysics Data System (ADS)
van Dijk, Albert I. J. M.; Peña-Arancibia, Jorge L.; Wood, Eric F.; Sheffield, Justin; Beck, Hylke E.
2013-05-01
Ideally, a seasonal streamflow forecasting system would ingest skilful climate forecasts and propagate these through calibrated hydrological models initialized with observed catchment conditions. At global scale, practical problems exist in each of these aspects. For the first time, we analyzed theoretical and actual skill in bimonthly streamflow forecasts from a global ensemble streamflow prediction (ESP) system. Forecasts were generated six times per year for 1979-2008 by an initialized hydrological model and an ensemble of 1° resolution daily climate estimates for the preceding 30 years. A post-ESP conditional sampling method was applied to 2.6% of forecasts, based on predictive relationships between precipitation and 1 of 21 climate indices prior to the forecast date. Theoretical skill was assessed against a reference run with historic forcing. Actual skill was assessed against streamflow records for 6192 small (<10,000 km2) catchments worldwide. The results show that initial catchment conditions provide the main source of skill. Post-ESP sampling enhanced skill in equatorial South America and Southeast Asia, particularly in terms of tercile probability skill, due to the persistence and influence of the El Niño Southern Oscillation. Actual skill was on average 54% of theoretical skill but considerably more for selected regions and times of year. The realized fraction of the theoretical skill probably depended primarily on the quality of precipitation estimates. Forecast skill could be predicted as the product of theoretical skill and historic model performance. Increases in seasonal forecast skill are likely to require improvement in the observation of precipitation and initial hydrological conditions.
Forecast cooling of the Atlantic subpolar gyre and associated impacts.
Hermanson, Leon; Eade, Rosie; Robinson, Niall H; Dunstone, Nick J; Andrews, Martin B; Knight, Jeff R; Scaife, Adam A; Smith, Doug M
2014-07-28
Decadal variability in the North Atlantic and its subpolar gyre (SPG) has been shown to be predictable in climate models initialized with the concurrent ocean state. Numerous impacts over ocean and land have also been identified. Here we use three versions of the Met Office Decadal Prediction System to provide a multimodel ensemble forecast of the SPG and related impacts. The recent cooling trend in the SPG is predicted to continue in the next 5 years due to a decrease in the SPG heat convergence related to a slowdown of the Atlantic Meridional Overturning Circulation. We present evidence that the ensemble forecast is able to skilfully predict these quantities over recent decades. We also investigate the ability of the forecast to predict impacts on surface temperature, pressure, precipitation, and Atlantic tropical storms and compare the forecast to recent boreal summer climate.
Probabilistic Predictions of PM2.5 Using a Novel Ensemble Design for the NAQFC
NASA Astrophysics Data System (ADS)
Kumar, R.; Lee, J. A.; Delle Monache, L.; Alessandrini, S.; Lee, P.
2017-12-01
Poor air quality (AQ) in the U.S. is estimated to cause about 60,000 premature deaths with costs of 100B-150B annually. To reduce such losses, the National AQ Forecasting Capability (NAQFC) at the National Oceanic and Atmospheric Administration (NOAA) produces forecasts of ozone, particulate matter less than 2.5 µm in diameter (PM2.5), and other pollutants so that advance notice and warning can be issued to help individuals and communities limit exposure and reduce health problems caused by air pollution. The current NAQFC, based on the U.S. Environmental Protection Agency Community Multi-scale AQ (CMAQ) modeling system, provides only deterministic AQ forecasts and does not quantify the uncertainty associated with the predictions, which could be large due to the chaotic nature of the atmosphere and nonlinearity in atmospheric chemistry. This project aims to take NAQFC a step further in the direction of probabilistic AQ prediction by exploring and quantifying the potential value of ensemble predictions of PM2.5, and perturbing three key aspects of PM2.5 modeling: the meteorology, emissions, and CMAQ secondary organic aerosol formulation. This presentation focuses on the impact of meteorological variability, which is represented by three members of NOAA's Short-Range Ensemble Forecast (SREF) system that were down-selected by hierarchical cluster analysis. These three SREF members provide the physics configurations and initial/boundary conditions for the Weather Research and Forecasting (WRF) model runs that generate required output variables for driving CMAQ that are missing in operational SREF output. We conducted WRF runs for Jan, Apr, Jul, and Oct 2016 to capture seasonal changes in meteorology. Estimated emissions of trace gases and aerosols via the Sparse Matrix Operator Kernel (SMOKE) system were developed using the WRF output. WRF and SMOKE output drive a 3-member CMAQ mini-ensemble of once-daily, 48-h PM2.5 forecasts for the same four months.
The CMAQ mini-ensemble is evaluated against both observations and the current operational deterministic NAQFC products, and analyzed to assess the impact of meteorological biases on PM2.5 variability. Quantification of the PM2.5 prediction uncertainty will prove a key factor to support cost-effective decision-making while protecting public health.
NASA Astrophysics Data System (ADS)
Lange, Heiner; Craig, George
2014-05-01
This study uses the Local Ensemble Transform Kalman Filter (LETKF) to perform storm-scale data assimilation of simulated Doppler radar observations into the non-hydrostatic, convection-permitting COSMO model. In perfect model experiments (OSSEs), it is investigated how the limited predictability of convective storms affects precipitation forecasts. The study compares a fine analysis scheme with small RMS errors to a coarse scheme that allows for errors in position, shape and occurrence of storms in the ensemble. The coarse scheme uses superobservations, a coarser grid for analysis weights, a larger localization radius and larger observation error that allow a broadening of the Gaussian error statistics. Three-hour forecasts of convective systems (with typical lifetimes exceeding 6 hours) from the detailed analyses of the fine scheme are found to be superior to those of the coarse scheme during the first 1-2 hours, with respect to the predicted storm positions. After 3 hours in the convective regime used here, the forecast quality of the two schemes appears indistinguishable, judging by RMSE and verification methods for rain-fields and objects. It is concluded that, for operational assimilation systems, the analysis scheme might not necessarily need to be detailed to the grid scale of the model. Depending on the forecast lead time, and on the presence of orographic or synoptic forcing that enhances the predictability of storm occurrences, analyses from a coarser scheme might suffice.
Upper Limits of Predictability in Long-Range Climate/Hydrologic Forecasts
NASA Technical Reports Server (NTRS)
Koster, R. D.; Suarez, M. J.; Heiser, M.
1998-01-01
The accurate forecasting of El Niño or La Niña conditions in the tropical Pacific can potentially lead to valuable predictions of hydrological anomalies over land at seasonal to interannual timescales. Even with highly accurate earth system models, though, our ability to generate these continental forecasts will always be limited by the chaotic nature of the atmospheric circulation. The nature of this fundamental limitation is explored through the use of 16-member ensembles of multi-decade GCM simulations. In each simulation of the first ensemble, sea surface temperatures (SSTs) are given the same realistic interannual variations over a 45-year period, and land surface state is allowed to evolve with that of the atmosphere. Analysis of the results shows that the SSTs control the temporal organization of continental precipitation anomalies to a significant extent in the tropics and to a much smaller extent in midlatitudes. In each simulation of the second ensemble, we prescribe SSTs as before, but we also prescribe interannual variations in the low frequency component of evaporation efficiency over land. Thus, in the second ensemble, we effectively make the extreme assumption that surface boundary conditions across the globe are perfectly predictable, and we quantify the consistency with which the atmosphere (particularly precipitation) responds to these boundary conditions. The resulting "absolute upper limit" on the predictability of precipitation is found to be quite high in the tropics yet only moderate in many midlatitude regions.
Alves, Pedro; Liu, Shuang; Wang, Daifeng; Gerstein, Mark
2018-01-01
Machine learning is an integral part of computational biology, and has already shown its use in various applications, such as prognostic tests. In the last few years in the non-biological machine learning community, ensembling techniques have shown their power in data mining competitions such as the Netflix challenge; however, such methods have not found wide use in computational biology. In this work, we endeavor to show how ensembling techniques can be applied to practical problems, including problems in the field of bioinformatics, and how they often outperform other machine learning techniques in both predictive power and robustness. Furthermore, we develop a methodology of ensembling, Multi-Swarm Ensemble (MSWE) by using multiple particle swarm optimizations and demonstrate its ability to further enhance the performance of ensembles.
Stochastic Parametrisations and Regime Behaviour of Atmospheric Models
NASA Astrophysics Data System (ADS)
Arnold, Hannah; Moroz, Irene; Palmer, Tim
2013-04-01
The presence of regimes is a characteristic of non-linear, chaotic systems (Lorenz, 2006). In the atmosphere, regimes emerge as familiar circulation patterns such as the El Niño-Southern Oscillation (ENSO), the North Atlantic Oscillation (NAO) and Scandinavian Blocking events. In recent years there has been much interest in the problem of identifying and studying atmospheric regimes (Solomon et al., 2007). In particular, how do these regimes respond to an external forcing such as anthropogenic greenhouse gas emissions? The importance of regimes in observed trends over the past 50-100 years indicates that in order to predict anthropogenic climate change, our climate models must be able to represent accurately natural circulation regimes, their statistics and variability. It is well established that representing model uncertainty as well as initial condition uncertainty is important for reliable weather forecasts (Palmer, 2001). In particular, stochastic parametrisation schemes have been shown to improve the skill of weather forecast models (e.g. Berner et al., 2009; Frenkel et al., 2012; Palmer et al., 2009). It is possible that including stochastic physics as a representation of model uncertainty could also be beneficial in climate modelling, enabling the simulator to explore larger regions of the climate attractor including other flow regimes. An alternative representation of model uncertainty is a perturbed parameter scheme, whereby physical parameters in subgrid parametrisation schemes are perturbed about their optimal value. Perturbing parameters gives greater control over the ensemble than multi-model or multiparametrisation ensembles, and has been used as a representation of model uncertainty in climate prediction (Stainforth et al., 2005; Rougier et al., 2009). We investigate the effect of including representations of model uncertainty on the regime behaviour of a simulator.
A simple chaotic model of the atmosphere, the Lorenz '96 system, is used to study the predictability of regime changes (Lorenz 1996, 2006). Three types of models are considered: a deterministic parametrisation scheme, stochastic parametrisation schemes with additive or multiplicative noise, and a perturbed parameter ensemble. Each forecasting scheme was tested on its ability to reproduce the attractor of the full system, defined in a reduced space based on EOF decomposition. None of the forecast models accurately capture the less common regime, though a significant improvement is observed over the deterministic parametrisation when a temporally correlated stochastic parametrisation is used. The attractor for the perturbed parameter ensemble improves on that forecast by the deterministic or white additive schemes, showing a distinct peak in the attractor corresponding to the less common regime. However, the 40 constituent members of the perturbed parameter ensemble each differ greatly from the true attractor, with many only showing one dominant regime with very rare transitions. These results indicate that perturbed parameter ensembles must be carefully analysed as individual members may have very different characteristics to the ensemble mean and to the true system being modelled. On the other hand, the stochastic parametrisation schemes tested performed well, improving the simulated climate, and motivating the development of a stochastic earth-system simulator for use in climate prediction.
J. Berner, G. J. Shutts, M. Leutbecher, and T. N. Palmer. A spectral stochastic kinetic energy backscatter scheme and its impact on flow dependent predictability in the ECMWF ensemble prediction system. J. Atmos. Sci., 66(3):603-626, 2009.
Y. Frenkel, A. J. Majda, and B. Khouider. Using the stochastic multicloud model to improve tropical convective parametrisation: A paradigm example. J. Atmos. Sci., 69(3):1080-1105, 2012.
E. N. Lorenz. Predictability: a problem partly solved. In Proceedings, Seminar on Predictability, 4-8 September 1995, volume 1, pages 1-18, Shinfield Park, Reading, 1996. ECMWF.
E. N. Lorenz. Regimes in simple systems. J. Atmos. Sci., 63(8):2056-2073, 2006.
T. N. Palmer. A nonlinear dynamical perspective on model error: A proposal for non-local stochastic-dynamic parametrisation in weather and climate prediction models. Q. J. Roy. Meteor. Soc., 127(572):279-304, 2001.
T. N. Palmer, R. Buizza, F. Doblas-Reyes, T. Jung, M. Leutbecher, G. J. Shutts, M. Steinheimer, and A. Weisheimer. Stochastic parametrization and model uncertainty. Technical Report 598, European Centre for Medium-Range Weather Forecasts, 2009.
J. Rougier, D. M. H. Sexton, J. M. Murphy, and D. Stainforth. Analyzing the climate sensitivity of the HadSM3 climate model using ensembles from different but related experiments. J. Climate, 22:3540-3557, 2009.
S. Solomon, D. Qin, M. Manning, Z. Chen, M. Marquis, K. B. Averyt, M. Tignor, and H. L. Miller. Climate models and their evaluation. In Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, Cambridge, United Kingdom and New York, NY, USA, 2007. Cambridge University Press.
D. A. Stainforth, T. Aina, C. Christensen, M. Collins, N. Faull, D. J. Frame, J. A. Kettleborough, S. Knight, A. Martin, J. M. Murphy, C. Piani, D. Sexton, L. A. Smith, R. A. Spicer, A. J. Thorpe, and M. R. Allen. Uncertainty in predictions of the climate response to rising levels of greenhouse gases. Nature, 433(7024):403-406, 2005.
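For readers unfamiliar with the Lorenz '96 system named in the abstract above, the following minimal sketch integrates the widely quoted single-scale form dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F on a cyclic domain. The abstract does not give its equations or parametrisation schemes, so the choice of this form, the forcing F = 8, the 8 variables, the time step and all function names are assumptions made for illustration only.

```python
import math


def l96_tendency(x, forcing=8.0):
    """Single-scale Lorenz '96 tendency with cyclic boundary conditions."""
    n = len(x)
    # Negative indices wrap around in Python, which gives the cyclic domain.
    return [(x[(k + 1) % n] - x[k - 2]) * x[k - 1] - x[k] + forcing
            for k in range(n)]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], forcing)
    k3 = l96_tendency([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], forcing)
    k4 = l96_tendency([xi + dt * ki for xi, ki in zip(x, k3)], forcing)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


# Start from the unstable fixed point X_k = F with a small perturbation
# (hypothetical initial condition) and integrate for 100 steps.
state = [8.0] * 8
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.05)
print([round(v, 3) for v in state])
```

A stochastic parametrisation of the kind discussed in the abstract would add a noise term to the tendency; that step is left out here because its form is not given in the text.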
Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif
2017-01-01
Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data from 32,555 patients free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients had developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of class imbalance on the constructed model was handled with the Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
Systems and methods for modeling and analyzing networks
Hill, Colin C; Church, Bruce W; McDonagh, Paul D; Khalil, Iya G; Neyarapally, Thomas A; Pitluk, Zachary W
2013-10-29
The systems and methods described herein utilize a probabilistic modeling framework for reverse engineering an ensemble of causal models from data and then forward simulating the ensemble of models to analyze and predict the behavior of the network. In certain embodiments, the systems and methods described herein include data-driven techniques for developing causal models for biological networks. Causal network models include computational representations of the causal relationships between independent variables such as a compound of interest and dependent variables such as measured DNA alterations, changes in mRNA, protein, and metabolites to phenotypic readouts of efficacy and toxicity.
Cerruela García, G; García-Pedrajas, N; Luque Ruiz, I; Gómez-Nieto, M Á
2018-03-01
This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.
The state of the art of flood forecasting - Hydrological Ensemble Prediction Systems
NASA Astrophysics Data System (ADS)
Thielen-Del Pozo, J.; Pappenberger, F.; Salamon, P.; Bogner, K.; Burek, P.; de Roo, A.
2010-09-01
Flood forecasting systems form a key part of 'preparedness' strategies for disastrous floods and provide hydrological services, civil protection authorities and the public with information on upcoming events. Provided the warning lead time is sufficiently long, adequate preparatory actions can be taken to efficiently reduce the impacts of the flooding. Because of the specific characteristics of each catchment, varying data availability and end-user demands, the design of the best flood forecasting system may differ from catchment to catchment. However, despite the differences in concept and data needs, there is one underlying issue that spans all systems. There has been a growing awareness and acceptance that uncertainty is a fundamental issue of flood forecasting and needs to be dealt with at the different spatial and temporal scales as well as the different stages of the flood generating processes. Today, operational flood forecasting centres are increasingly changing from single deterministic forecasts to probabilistic forecasts with various representations of the different contributions of uncertainty. The move towards these so-called Hydrological Ensemble Prediction Systems (HEPS) in flood forecasting represents the state of the art in forecasting science, following on the success of the use of ensembles for weather forecasting (Buizza et al., 2005) and paralleling the move towards ensemble forecasting in other related disciplines such as climate change predictions. The use of HEPS has been internationally fostered by initiatives such as "The Hydrologic Ensemble Prediction Experiment" (HEPEX), created with the aim to investigate how best to produce, communicate and use hydrologic ensemble forecasts in short-, medium- and long-term prediction of hydrological processes.
The advantages of quantifying the different contributions of uncertainty as well as the overall uncertainty to obtain reliable and useful flood forecasts, also for extreme events, have become evident. However, despite the demonstrated advantages, worldwide the incorporation of HEPS in operational flood forecasting is still limited. The applicability of HEPS for smaller river basins was tested in MAP D-Phase, an acronym for "Demonstration of Probabilistic Hydrological and Atmospheric Simulation of flood Events in the Alpine region", which was launched in 2005 as a Forecast Demonstration Project of the World Weather Research Programme of WMO, and entered a pre-operational and still active testing phase in 2007. In Europe, a comparatively high number of EPS-driven systems for medium to large rivers exist. National flood forecasting centres of Sweden, Finland and the Netherlands have already implemented HEPS in their operational forecasting chain, while in other countries including France, Germany, Czech Republic and Hungary, hybrids or experimental chains have been installed. As an example of HEPS, the European Flood Alert System (EFAS) is presented. EFAS provides medium-range probabilistic flood forecasting information for large trans-national river basins. It incorporates multiple sets of weather forecasts including different types of EPS and deterministic forecasts from different providers. EFAS products are evaluated and visualised as exceedance of critical levels only, in the form of both maps and time series. Different sources of uncertainty and their impact on the flood forecasting performance for every grid cell have been tested offline but not yet incorporated operationally into the forecasting chain for computational reasons. However, at stations where real-time discharges are available, a hydrological uncertainty processor is being applied to estimate the total predictive uncertainty from the hydrological and input uncertainties.
Research on long-term EFAS results has shown the need for complementing statistical analysis with case studies for which examples will be shown.
Evaluation of the North American Multi-Model Ensemble System for Monthly and Seasonal Prediction
NASA Astrophysics Data System (ADS)
Zhang, Q.
2014-12-01
Since August 2011, the real time seasonal forecasts of the U.S. National Multi-Model Ensemble (NMME) have been made on the 8th of each month by NCEP Climate Prediction Center (CPC). The participating models were NCEP/CFSv1&2, GFDL/CM2.2, NCAR/U.Miami/COLA/CCSM3, NASA/GEOS5, IRI/ECHAM-a & ECHAM-f in the first year of the real time NMME forecast. Two Canadian coupled models, CMC/CanCM3 and CM4, joined in and CFSv1 and IRI's models dropped out in the second year. The NMME team at CPC collects monthly means of three variables (precipitation, temperature at 2m and sea surface temperature) from each modeling center on a 1x1 global grid, removes systematic errors, and computes the grand ensemble mean with equal weight for each model mean and the probability forecast with equal weight for each member of each model. This provides the NMME forecast locked in schedule for the CPC operational seasonal and monthly outlook. The basic verification metrics of seasonal and monthly prediction of NMME are calculated as an evaluation of skill, including both deterministic and probabilistic forecasts for the 3-year real time (August 2011-July 2014) period and the 30-year retrospective forecast (1982-2011) of the individual models as well as the NMME ensemble. The motivation of this study is to provide skill benchmarks for future improvements of the NMME seasonal and monthly prediction system. We also want to establish whether the real time and hindcast periods (used for bias correction in real time) are consistent. The experimental phase I of the project already supplies routine guidance to users of the NMME forecasts.
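The two weighting rules in this abstract (equal weight per model for the grand ensemble mean, equal weight per member for the probability forecast) can be contrasted in a short sketch. The model names, member values and threshold below are hypothetical, and the bias-correction step mentioned in the abstract is assumed to have been applied already.

```python
def grand_ensemble_mean(models):
    """Average the per-model ensemble means, giving every model equal weight."""
    model_means = [sum(members) / len(members) for members in models.values()]
    return sum(model_means) / len(model_means)


def exceedance_probability(models, threshold):
    """Fraction of all pooled members above a threshold (equal weight per member)."""
    pooled = [value for members in models.values() for value in members]
    return sum(value > threshold for value in pooled) / len(pooled)


# Hypothetical bias-corrected anomalies at one grid point.
# Model "A" has 2 members, model "B" has 4.
models = {
    "A": [1.0, 3.0],
    "B": [0.0, 0.0, 0.0, 4.0],
}

mean_by_model = grand_ensemble_mean(models)          # (2.0 + 1.0) / 2
prob_above = exceedance_probability(models, 0.5)     # 3 of 6 members
print(mean_by_model, prob_above)
```

With unequal ensemble sizes the two rules differ: the per-model mean here is 1.5, whereas pooling all six members would give about 1.33, so the larger ensemble does not dominate the deterministic forecast.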
Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics
NASA Astrophysics Data System (ADS)
Lazarus, S. M.; Holman, B. P.; Splitt, M. E.
2017-12-01
A computationally efficient method is developed that performs gridded post processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations is generated to provide physically consistent high resolution winds over a coastal domain characterized by an intricate land / water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicates the post processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
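To make the EMOS predictive parameters (mean and variance) more concrete, here is a simplified univariate sketch: the predictive mean is an affine function of the ensemble mean, the predictive variance an affine function of the ensemble variance, and the result is scored with the closed-form CRPS of a Gaussian. The study itself calibrates wind vectors, which is a bivariate problem, so this scalar version, the coefficient values and the function names are all assumptions for illustration.

```python
import math


def emos_predictive(members, a, b, c, d):
    """Return (mean, variance) of a Gaussian EMOS predictive distribution.

    mean = a + b * ensemble_mean, variance = c + d * ensemble_variance.
    The coefficients a, b, c, d would normally be fitted on training data.
    """
    n = len(members)
    ens_mean = sum(members) / n
    ens_var = sum((m - ens_mean) ** 2 for m in members) / (n - 1)
    return a + b * ens_mean, c + d * ens_var


def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a normal distribution N(mu, sigma^2) at obs."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical 3-member wind-speed forecast (m/s) and made-up coefficients.
mu, var = emos_predictive([4.0, 6.0, 8.0], a=0.5, b=1.0, c=0.2, d=0.5)
score = crps_gaussian(mu, math.sqrt(var), obs=7.0)
print(mu, var, round(score, 3))
```

In practice the coefficients are usually chosen to minimise the average CRPS over a training period; that fitting step is omitted here.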
Baronio, Fabio; Andreana, Marco; Conforti, Matteo; Manili, Gabriele; Couderc, Vincent; De Angelis, Costantino; Barthélémy, Alain
2011-07-04
We consider the spectral theory of three-wave interactions to predict the initiation, formation and dynamics of an ensemble of bright-dark-bright soliton triads in frequency conversion processes. Spatial observation of non-interacting triads ensemble in a KTP crystal confirms theoretical prediction and numerical simulations.
NASA Astrophysics Data System (ADS)
Fernández, J.; Primo, C.; Cofiño, A. S.; Gutiérrez, J. M.; Rodríguez, M. A.
2009-08-01
In a recent paper, Gutiérrez et al. (Nonlinear Process Geophys 15(1):109-114, 2008) introduced a new characterization of spatiotemporal error growth—the so called mean-variance logarithmic (MVL) diagram—and applied it to study ensemble prediction systems (EPS); in particular, they analyzed single-model ensembles obtained by perturbing the initial conditions. In the present work, the MVL diagram is applied to multi-model ensembles analyzing also the effect of model formulation differences. To this aim, the MVL diagram is systematically applied to the multi-model ensemble produced in the EU-funded DEMETER project. It is shown that the shared building blocks (atmospheric and ocean components) impose similar dynamics among different models and, thus, contribute to poorly sampling the model formulation uncertainty. This dynamical similarity should be taken into account, at least as a pre-screening process, before applying any objective weighting method.
Deep biomarkers of human aging: Application of deep neural networks to biomarker development
Putin, Evgeny; Mamoshina, Polina; Aliper, Alexander; Korzinkin, Mikhail; Moskalev, Alexey; Kolosov, Alexey; Ostrovskiy, Alexander; Cantor, Charles; Vijg, Jan; Zhavoronkov, Alex
2016-01-01
One of the major impediments in human aging research is the absence of a comprehensive and actionable set of biomarkers that may be targeted and measured to track the effectiveness of therapeutic interventions. In this study, we designed a modular ensemble of 21 deep neural networks (DNNs) of varying depth, structure and optimization to predict human chronological age using a basic blood test. To train the DNNs, we used over 60,000 samples from common blood biochemistry and cell count tests from routine health exams performed by a single laboratory and linked to chronological age and sex. The best performing DNN in the ensemble demonstrated 81.5 % epsilon-accuracy r = 0.90 with R2 = 0.80 and MAE = 6.07 years in predicting chronological age within a 10 year frame, while the entire ensemble achieved 83.5% epsilon-accuracy r = 0.91 with R2 = 0.82 and MAE = 5.55 years. The ensemble also identified the 5 most important markers for predicting human chronological age: albumin, glucose, alkaline phosphatase, urea and erythrocytes. To allow for public testing and evaluate real-life performance of the predictor, we developed an online system available at http://www.aging.ai. The ensemble approach may facilitate integration of multi-modal data linked to chronological age and sex that may lead to simple, minimally invasive, and affordable methods of tracking integrated biomarkers of aging in humans and performing cross-species feature importance analysis. PMID:27191382
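The metrics quoted in this abstract can be illustrated with a small sketch. It assumes that epsilon-accuracy means the fraction of samples whose predicted age lies within ±epsilon years of the true age (with epsilon = 10 for the "10 year frame"), and it uses plain averaging as a stand-in for combining networks, since the abstract does not say how the 21 DNNs are combined. All names and numbers are hypothetical.

```python
def ensemble_average(predictions_per_model):
    """Average the predictions of several models sample by sample."""
    n_models = len(predictions_per_model)
    return [sum(sample) / n_models for sample in zip(*predictions_per_model)]


def mean_absolute_error(predicted, actual):
    """Mean absolute difference between predicted and true ages."""
    return sum(abs(p - t) for p, t in zip(predicted, actual)) / len(actual)


def epsilon_accuracy(predicted, actual, epsilon=10.0):
    """Share of predictions within +/- epsilon years of the true age."""
    hits = sum(abs(p - t) <= epsilon for p, t in zip(predicted, actual))
    return hits / len(actual)


# Two hypothetical models predicting the age of three people.
model_predictions = [
    [30.0, 52.0, 71.0],
    [34.0, 48.0, 95.0],
]
true_ages = [35.0, 45.0, 70.0]

combined = ensemble_average(model_predictions)   # [32.0, 50.0, 83.0]
mae = mean_absolute_error(combined, true_ages)   # (3 + 5 + 13) / 3
acc = epsilon_accuracy(combined, true_ages)      # 2 of 3 within 10 years
print(combined, mae, acc)
```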
Evolutionary Ensemble for In Silico Prediction of Ames Test Mutagenicity
NASA Astrophysics Data System (ADS)
Chen, Huanhuan; Yao, Xin
Driven by new regulations and animal welfare concerns, the need for in silico models as alternatives to animal testing in the safety assessment of chemicals has increased recently. This paper describes a novel machine learning ensemble approach to building an in silico model for the prediction of the Ames test mutagenicity, one of a battery of the most commonly used experimental in vitro and in vivo genotoxicity tests for safety evaluation of chemicals. Evolutionary random neural ensemble with negative correlation learning (ERNE) [1] was developed based on neural networks and evolutionary algorithms. ERNE combines bootstrap sampling of the training data with random subspace feature selection to ensure diversity when creating individuals within an initial ensemble. Furthermore, while evolving individuals within the ensemble, it makes use of negative correlation learning, enabling individual NNs to be trained to be as accurate as possible while remaining as diverse as possible. Therefore, the resulting individuals in the final ensemble are capable of cooperating collectively to achieve better generalization of prediction. The empirical experiments suggest that ERNE is an effective ensemble approach for predicting the Ames test mutagenicity of chemicals.
Prediction of plant lncRNA by ensemble machine learning classifiers.
Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian
2018-05-02
In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances in discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set composed only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical proteins.
Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.
A Wind Forecasting System for Energy Application
NASA Astrophysics Data System (ADS)
Courtney, Jennifer; Lynch, Peter; Sweeney, Conor
2010-05-01
Accurate forecasting of available energy is crucial for the efficient management and use of wind power in the national power grid. With energy output critically dependent upon wind strength there is a need to reduce the errors associated with wind forecasting. The objective of this research is to get the best possible wind forecasts for the wind energy industry. To achieve this goal, three methods are being applied. First, a mesoscale numerical weather prediction (NWP) model called WRF (Weather Research and Forecasting) is being used to predict wind values over Ireland. Currently, a grid resolution of 10 km is used and higher model resolutions are being evaluated to establish whether they are economically viable given the forecast skill improvement they produce. Second, the WRF model is being used in conjunction with ECMWF (European Centre for Medium-Range Weather Forecasts) ensemble forecasts to produce a probabilistic weather forecasting product. Due to the chaotic nature of the atmosphere, a single, deterministic weather forecast can only have limited skill. The ECMWF ensemble methods produce an ensemble of 51 global forecasts, twice a day, by perturbing initial conditions of a 'control' forecast which is the best estimate of the initial state of the atmosphere. This method provides an indication of the reliability of the forecast and a quantitative basis for probabilistic forecasting. The limitation of ensemble forecasting lies in the fact that the perturbed model runs behave differently under different weather patterns and each model run is equally likely to be closest to the observed weather situation. Models have biases, and involve assumptions about physical processes and forcing factors such as underlying topography. Third, Bayesian Model Averaging (BMA) is being applied to the output from the ensemble forecasts in order to statistically post-process the results and achieve a better wind forecasting system.
BMA is a promising technique that will offer calibrated probabilistic wind forecasts which will be invaluable in wind energy management. In brief, this method turns the ensemble forecasts into a calibrated predictive probability distribution. Each ensemble member is provided with a 'weight' determined by its relative predictive skill over a training period of around 30 days. Verification of data is carried out using observed wind data from operational wind farms. These are then compared to existing forecasts produced by ECMWF and Met Eireann in relation to skill scores. We are developing decision-making models to show the benefits achieved using the data produced by our wind energy forecasting system. An energy trading model will be developed, based on the rules currently used by the Single Electricity Market Operator for energy trading in Ireland. This trading model will illustrate the potential for financial savings by using the forecast data generated by this research.
Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.
Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang
2010-05-07
Lysine acetylation is an essentially reversible and highly regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from the Swiss-Prot database and the scientific literature. Experimental results show that an ensemble of support vector machine classifiers outperforms a single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation sites prediction available at http://www.aporc.org/EnsemblePail/.
An ensemble framework for identifying essential proteins.
Zhang, Xue; Xiao, Wangxin; Acencio, Marcio Luis; Lemke, Ney; Wang, Xujing
2016-08-25
Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small. In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins. This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.
NASA Astrophysics Data System (ADS)
Medina, Hanoi; Tian, Di; Srivastava, Puneet; Pelosi, Anna; Chirico, Giovanni B.
2018-07-01
Reference evapotranspiration (ET0) plays a fundamental role in agronomic, forestry, and water resources management. Estimating and forecasting ET0 have long been recognized as a major challenge for researchers and practitioners in these communities. This work explored the potential of multiple leading numerical weather predictions (NWPs) for estimating and forecasting summer ET0 at 101 U.S. Regional Climate Reference Network stations over nine climate regions across the contiguous United States (CONUS). Three leading global NWP model forecasts from the THORPEX Interactive Grand Global Ensemble (TIGGE) dataset were used in this study, including the single model ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (EC), the National Centers for Environmental Prediction Global Forecast System (NCEP), and the United Kingdom Meteorological Office forecasts (MO), as well as multi-model ensemble forecasts from the combinations of these NWP models. A regression calibration was employed to bias correct the ET0 forecasts. The impact of individual forecast variables on ET0 forecasts was also evaluated. The results showed that the EC forecasts provided the least error and highest skill and reliability, followed by the MO and NCEP forecasts. The multi-model ensembles constructed from the combination of EC and MO forecasts provided slightly better performance than the single model EC forecasts. The regression process greatly improved ET0 forecast performances, particularly for the regions involving stations near the coast, or with a complex orography. The performance of EC forecasts was only slightly influenced by the number of ensemble members, particularly at short lead times. Even with fewer ensemble members, EC still performed better than the other two NWPs. Errors in the radiation forecasts, followed by those in the wind, had the most detrimental effects on the ET0 forecast performances.
NASA Astrophysics Data System (ADS)
Li, Hui; Hong, Lu-Yao; Zhou, Qing; Yu, Hai-Jie
2015-08-01
The business failure of numerous companies results in financial crises. The high social costs associated with such crises have led people to search for effective tools for business risk prediction, among which the support vector machine is very effective. Several modelling means, including single-technique modelling, hybrid modelling, and ensemble modelling, have been suggested for forecasting business risk with support vector machines. However, the existing literature seldom focuses on a general modelling frame for business risk prediction, and seldom investigates performance differences among different modelling means. We reviewed research on forecasting business risk with support vector machines, proposed the general assisted prediction modelling frame with hybridisation and ensemble (APMF-WHAE), and finally, investigated the use of principal components analysis, support vector machine, random sampling, and group decision, under the general frame in forecasting business risk. Under the APMF-WHAE frame with support vector machine as the base predictive model, four specific predictive models were produced, namely, pure support vector machine, a hybrid support vector machine involved with principal components analysis, a support vector machine ensemble involved with random sampling and group decision, and an ensemble of hybrid support vector machines using group decision to integrate various hybrid support vector machines on variables produced from principal components analysis and samples from random sampling. The experimental results indicate that the hybrid support vector machine and the ensemble of hybrid support vector machines performed better than the pure support vector machine and the support vector machine ensemble.
Bassen, David M; Vilkhovoy, Michael; Minot, Mason; Butcher, Jonathan T; Varner, Jeffrey D
2017-01-25
Ensemble modeling is a promising approach for obtaining robust predictions and coarse grained population behavior in deterministic mathematical models. Ensemble approaches address model uncertainty by using parameter or model families instead of single best-fit parameters or fixed model structures. Parameter ensembles can be selected based upon simulation error, along with other criteria such as diversity or steady-state performance. Simulations using parameter ensembles can estimate confidence intervals on model variables, and robustly constrain model predictions, despite having many poorly constrained parameters. In this software note, we present a multiobjective based technique to estimate parameter or model ensembles, the Pareto Optimal Ensemble Technique in the Julia programming language (JuPOETs). JuPOETs integrates simulated annealing with Pareto optimality to estimate ensembles on or near the optimal tradeoff surface between competing training objectives. We demonstrate JuPOETs on a suite of multiobjective problems, including test functions with parameter bounds and system constraints as well as for the identification of a proof-of-concept biochemical model with four conflicting training objectives. JuPOETs identified optimal or near optimal solutions approximately six-fold faster than a corresponding implementation in Octave for the suite of test functions. For the proof-of-concept biochemical model, JuPOETs produced an ensemble of parameters that captured the mean of the training data for conflicting data sets, while simultaneously estimating parameter sets that performed well on each of the individual objective functions. JuPOETs is a promising approach for the estimation of parameter and model ensembles using multiobjective optimization. JuPOETs can be adapted to solve many problem types, including mixed binary and continuous variable types, bilevel optimization problems and constrained problems without altering the base algorithm.
JuPOETs is open source, available under an MIT license, and can be installed using the Julia package manager from the JuPOETs GitHub repository.
NASA Astrophysics Data System (ADS)
Elsberry, Russell L.; Jordan, Mary S.; Vitart, Frederic
2010-05-01
The objective of this study is to provide evidence of predictability on intraseasonal time scales (10-30 days) for western North Pacific tropical cyclone formation and subsequent tracks using the 51-member ECMWF 32-day forecasts made once a week from 5 June through 25 December 2008. Ensemble storms are defined by grouping ensemble member vortices whose positions are within a specified separation distance that is equal to 180 n mi at the initial forecast time t and increases linearly to 420 n mi at Day 14 and then is constant. The 12-h track segments are calculated with a Weighted-Mean Vector Motion technique in which the weighting factor is inversely proportional to the distance from the endpoint of the previous 12-h motion vector. Seventy-six percent of the ensemble storms had five or fewer member vortices. On average, the ensemble storms begin 2.5 days before the first entry of the Joint Typhoon Warning Center (JTWC) best-track file, tend to translate too slowly in the deep tropics, and persist for longer periods over land. A strict objective matching technique with the JTWC storms is combined with a second subjective procedure that is then applied to identify nearby ensemble storms that would indicate a greater likelihood of a tropical cyclone developing in that region with that track orientation. The ensemble storms identified in the ECMWF 32-day forecasts provided guidance on intraseasonal timescales of the formations and tracks of the three strongest typhoons and two other typhoons, but not for two early season typhoons and the late season Dolphin. Four strong tropical storms were predicted consistently over Week-1 through Week-4, as was one weak tropical storm. Two other weak tropical storms, three tropical cyclones that developed from precursor baroclinic systems, and three other tropical depressions were not predicted on intraseasonal timescales. 
At least for the strongest tropical cyclones during the peak season, the ECMWF 32-day ensemble provides guidance of formation and tracks on 10-30 day timescales.
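The grouping rule in the abstract above (a separation distance of 180 n mi at the initial forecast time, growing linearly to 420 n mi at Day 14 and constant afterwards) can be written as a small piecewise-linear function. This is only an illustrative sketch: the function name and the treatment of negative times are assumptions, not part of the study.

```python
def separation_limit_nmi(forecast_day: float) -> float:
    """Maximum separation (nautical miles) for grouping member vortices.

    Hypothetical helper: 180 n mi at day 0, linear growth to 420 n mi
    at day 14, constant beyond that, as described in the abstract.
    """
    start_nmi, end_nmi, ramp_days = 180.0, 420.0, 14.0
    if forecast_day <= 0.0:
        # Assumption: times before initialization use the initial value.
        return start_nmi
    if forecast_day >= ramp_days:
        return end_nmi
    return start_nmi + (end_nmi - start_nmi) * forecast_day / ramp_days
```

For example, halfway through the ramp (Day 7) the limit is 300 n mi.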
Generalized Gibbs ensemble in integrable lattice models
NASA Astrophysics Data System (ADS)
Vidmar, Lev; Rigol, Marcos
2016-06-01
The generalized Gibbs ensemble (GGE) was introduced ten years ago to describe observables in isolated integrable quantum systems after equilibration. Since then, the GGE has been demonstrated to be a powerful tool to predict the outcome of the relaxation dynamics of few-body observables in a variety of integrable models, a process we call generalized thermalization. This review discusses several fundamental aspects of the GGE and generalized thermalization in integrable systems. In particular, we focus on questions such as: which observables equilibrate to the GGE predictions and who should play the role of the bath; what conserved quantities can be used to construct the GGE; what are the differences between generalized thermalization in noninteracting systems and in interacting systems mappable to noninteracting ones; why is it that the GGE works when traditional ensembles of statistical mechanics fail. Despite a lot of interest in these questions in recent years, no definite answers have been given. We review results for the XX model and for the transverse field Ising model. For the latter model, we also report original results and show that the GGE describes spin-spin correlations over the entire system. This makes apparent that there is no need to trace out a part of the system in real space for equilibration to occur and for the GGE to apply. In the past, a spectral decomposition of the weights of various statistical ensembles revealed that generalized eigenstate thermalization occurs in the XX model (hard-core bosons). Namely, eigenstates of the Hamiltonian with similar distributions of conserved quantities have similar expectation values of few-spin observables. Here we show that generalized eigenstate thermalization also occurs in the transverse field Ising model.
NASA Astrophysics Data System (ADS)
Ehsan, Muhammad Azhar; Tippett, Michael K.; Almazroui, Mansour; Ismail, Muhammad; Yousef, Ahmed; Kucharski, Fred; Omar, Mohamed; Hussein, Mahmoud; Alkhalaf, Abdulrahman A.
2017-05-01
Northern Hemisphere winter precipitation reforecasts from the European Centre for Medium Range Weather Forecast System-4 and six of the models in the North American Multi-Model Ensemble are evaluated, focusing on two regions (Region-A: 20°N-45°N, 10°E-65°E and Region-B: 20°N-55°N, 205°E-255°E) where winter precipitation is a dominant fraction of the annual total and where precipitation from mid-latitude storms is important. Predictability and skill (deterministic and probabilistic) are assessed for 1983-2013 by the multimodel composite (MME) of seven prediction models. The MME climatological mean and variability over the two regions are comparable to observations, with some regional differences. The statistically significant decreasing trend observed in Region-B precipitation is captured well by the MME and most of the individual models. El Niño Southern Oscillation is a source of forecast skill, and the correlation coefficients between the Niño3.4 index and precipitation over Region-A and Region-B are 0.46 and 0.35, respectively, both statistically significant at the 95% level. The MME reforecasts weakly reproduce the observed teleconnection. Signal, noise and signal-to-noise ratio analysis shows that the signal variance over the two regions is very small compared to the noise variance, which tends to reduce the prediction skill. The MME ranked probability skill score is higher than that of individual models, showing the advantage of a multimodel ensemble. Observed Region-A rainfall anomalies are strongly associated with the North Atlantic Oscillation, but none of the models reproduce this relation, which may explain the low skill over Region-A. The superior quality of the multimodel ensemble compared with individual models is mainly due to its larger ensemble size.
NASA Astrophysics Data System (ADS)
Jha, Sanjeev K.; Shrestha, Durga L.; Stadnyk, Tricia A.; Coulibaly, Paulin
2018-03-01
Flooding in Canada is often caused by heavy rainfall during the snowmelt period. Hydrologic forecast centers rely on precipitation forecasts obtained from numerical weather prediction (NWP) models to force hydrological models for streamflow forecasting. The uncertainties in raw quantitative precipitation forecasts (QPFs) are enhanced by physiography and orography effects over a diverse landscape, particularly in the western catchments of Canada. A Bayesian post-processing approach called rainfall post-processing (RPP), developed in Australia (Robertson et al., 2013; Shrestha et al., 2015), has been applied to assess its forecast performance in a Canadian catchment. Raw QPFs obtained from two sources, the Global Ensemble Forecasting System (GEFS) Reforecast 2 project, from the National Centers for Environmental Prediction, and the Global Deterministic Prediction System (GDPS), from Environment and Climate Change Canada, are used in this study. The study period from January 2013 to December 2015 covered a major flood event in Calgary, Alberta, Canada. Post-processed results show that the RPP is able to remove the bias and reduce the errors of both GEFS and GDPS forecasts. Ensembles generated from the RPP reliably quantify the forecast uncertainty.
Trends in the predictive performance of raw ensemble weather forecasts
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Scheuerer, Michael; Pappenberger, Florian; Bogner, Konrad; Haiden, Thomas
2015-04-01
Over the last two decades the paradigm in weather forecasting has shifted from being deterministic to probabilistic. Accordingly, numerical weather prediction (NWP) models have been run increasingly as ensemble forecasting systems. The goal of such ensemble forecasts is to approximate the forecast probability distribution by a finite sample of scenarios. Global ensemble forecast systems, like the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble, are prone to probabilistic biases, and are therefore not reliable. They particularly tend to be underdispersive for surface weather parameters. Hence, statistical post-processing is required in order to obtain reliable and sharp forecasts. In this study we apply statistical post-processing to ensemble forecasts of near-surface temperature, 24-hour precipitation totals, and near-surface wind speed from the global ECMWF model. Our main objective is to evaluate the evolution of the difference in skill between the raw ensemble and the post-processed forecasts. The ECMWF ensemble is under continuous development, and hence its forecast skill improves over time. Parts of these improvements may be due to a reduction of probabilistic bias. Thus, we first hypothesize that the gain by post-processing decreases over time. Based on ECMWF forecasts from January 2002 to March 2014 and corresponding observations from globally distributed stations we generate post-processed forecasts by ensemble model output statistics (EMOS) for each station and variable. Parameter estimates are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over rolling training periods that consist of the n days preceding the initialization dates. Given the higher average skill in terms of CRPS of the post-processed forecasts for all three variables, we analyze the evolution of the difference in skill between raw ensemble and EMOS forecasts. 
The fact that the gap in skill remains almost constant over time, especially for near-surface wind speed, suggests that improvements to the atmospheric model have an effect quite different from what calibration by statistical post-processing is doing. That is, they are increasing potential skill. Thus this study indicates that (a) further model development is important even if one is just interested in point forecasts, and (b) statistical post-processing is important because it will keep adding skill in the foreseeable future.
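The EMOS fitting step described in the abstract above (parameter estimates obtained by minimizing the CRPS over a training period) can be illustrated with the closed-form CRPS of a Gaussian predictive distribution. This is a sketch under assumptions the abstract does not state: a Gaussian predictive distribution whose mean is a + b times the ensemble mean and whose variance is c + d times the ensemble variance. The function and parameter names are hypothetical, and the minimization itself is left to any numerical optimizer.

```python
import math


def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def mean_crps(params, ens_means, ens_vars, observations) -> float:
    """Training objective: average CRPS over past forecast/observation pairs.

    Assumed EMOS form (not taken from the abstract):
    mean = a + b * ensemble mean, variance = c + d * ensemble variance.
    """
    a, b, c, d = params
    total = 0.0
    for m, v, y in zip(ens_means, ens_vars, observations):
        mu = a + b * m
        sigma = math.sqrt(max(c + d * v, 1e-12))  # keep the variance positive
        total += crps_gaussian(mu, sigma, y)
    return total / len(observations)
```

Lower values are better; a rolling-window fit would re-minimize `mean_crps` over the n days preceding each initialization date.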
2013-01-01
Background: Many problems in protein modeling require obtaining a discrete representation of the protein conformational space as an ensemble of conformations. In ab-initio structure prediction, in particular, where the goal is to predict the native structure of a protein chain given its amino-acid sequence, the ensemble needs to satisfy energetic constraints. Given the thermodynamic hypothesis, an effective ensemble contains low-energy conformations which are similar to the native structure. The high-dimensionality of the conformational space and the ruggedness of the underlying energy surface currently make it very difficult to obtain such an ensemble. Recent studies have proposed that Basin Hopping is a promising probabilistic search framework to obtain a discrete representation of the protein energy surface in terms of local minima. Basin Hopping performs a series of structural perturbations followed by energy minimizations with the goal of hopping between nearby energy minima. This approach has been shown to be effective in obtaining conformations near the native structure for small systems. Recent work by us has extended this framework to larger systems through employment of the molecular fragment replacement technique, resulting in rapid sampling of large ensembles. Methods: This paper investigates the algorithmic components in Basin Hopping to both understand and control their effect on the sampling of near-native minima. Realizing that such an ensemble is reduced before further refinement in full ab-initio protocols, we take an additional step and analyze the quality of the ensemble retained by ensemble reduction techniques. We propose a novel multi-objective technique based on the Pareto front to filter the ensemble of sampled local minima.
Results and conclusions: We show that controlling the magnitude of the perturbation allows directly controlling the distance between consecutively-sampled local minima and, in turn, steering the exploration towards conformations near the native structure. For the minimization step, we show that the addition of Metropolis Monte Carlo-based minimization is no more effective than a simple greedy search. Finally, we show that the size of the ensemble of sampled local minima can be effectively and efficiently reduced by a multi-objective filter to obtain a simpler representation of the probed energy surface. PMID:24564970
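The Pareto-front filter mentioned in the abstract above can be illustrated with a generic non-dominated filter. This is a minimal sketch, not the authors' implementation: it assumes each sampled minimum is scored by a tuple of objectives that are all to be minimized (for example, separate energy terms), and the choice of objectives is hypothetical.

```python
def dominates(a, b) -> bool:
    """True if score tuple a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With scores `[(1, 5), (2, 2), (3, 3), (5, 1)]`, the point `(3, 3)` is dropped because `(2, 2)` is better on both objectives, while the other three remain as mutually non-dominated trade-offs. The pairwise comparison is quadratic in the ensemble size, which is acceptable for a sketch but may need a faster scheme for very large ensembles.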
The Role of the AMOC in Forecast Cooling of the Atlantic Subpolar Gyre and Its Associated Impacts
NASA Astrophysics Data System (ADS)
Eade, R.; Hermanson, L.; Robinson, N.; Dunstone, N.; Andrews, M.; Knight, J.; Scaife, A. A.; Smith, D.
2014-12-01
Decadal variability in the North Atlantic and its subpolar gyre (SPG) has been shown to be predictable in climate models initialized with the concurrent ocean state. Numerous impacts over ocean and land have also been identified. Here we use three versions of the Met Office Decadal Prediction System to provide a multimodel ensemble forecast of the SPG and related impacts. The recent cooling trend in the SPG is predicted to continue in the next 5 years due to a decrease in the SPG heat convergence related to a slowdown of the Atlantic Meridional Overturning Circulation. We present evidence that the ensemble forecast is able to skilfully predict these quantities over recent decades. We also investigate the ability of the forecast to predict impacts on surface temperature, pressure, precipitation, and Atlantic tropical storms and compare the forecast to recent boreal summer climate.
Forecast cooling of the Atlantic subpolar gyre and associated impacts
Hermanson, Leon; Eade, Rosie; Robinson, Niall H; Dunstone, Nick J; Andrews, Martin B; Knight, Jeff R; Scaife, Adam A; Smith, Doug M
2014-01-01
Decadal variability in the North Atlantic and its subpolar gyre (SPG) has been shown to be predictable in climate models initialized with the concurrent ocean state. Numerous impacts over ocean and land have also been identified. Here we use three versions of the Met Office Decadal Prediction System to provide a multimodel ensemble forecast of the SPG and related impacts. The recent cooling trend in the SPG is predicted to continue in the next 5 years due to a decrease in the SPG heat convergence related to a slowdown of the Atlantic Meridional Overturning Circulation. We present evidence that the ensemble forecast is able to skilfully predict these quantities over recent decades. We also investigate the ability of the forecast to predict impacts on surface temperature, pressure, precipitation, and Atlantic tropical storms and compare the forecast to recent boreal summer climate. PMID:25821269
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity relationship (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprising logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the numbers of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool.
Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
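As a rough illustration of the idea in the abstract above, the following sketch evaluates a beta-binomial probability for the number of positive votes in an ensemble and combines two such distributions into a per-tally error estimate. The combination via Bayes' rule, the function names, and the parameter values are assumptions made for illustration only; the abstract does not specify the authors' exact formula.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """Probability of k positive votes out of n submodels."""
    return comb(n, k) * exp(log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def error_probability(k, n, pred_params, err_params, base_error_rate):
    """Hypothetical estimate of P(error | k positive votes).

    Assumes Bayes' rule: P(error | k) = P(k | error) * P(error) / P(k),
    with P(k) and P(k | error) both modelled as beta-binomials.
    """
    p_k = beta_binomial_pmf(k, n, *pred_params)
    p_k_given_error = beta_binomial_pmf(k, n, *err_params)
    return min(1.0, p_k_given_error * base_error_rate / p_k)


if __name__ == "__main__":
    # Illustrative parameters only, not fitted to any data set.
    for votes in range(0, 11):
        print(votes, round(error_probability(votes, 10, (0.6, 0.6), (2.0, 2.0), 0.1), 3))
```

With these made-up parameters the estimated error probability is largest when the vote is evenly split, which matches the qualitative point that low consensus signals low reliability.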
Evaluation of an ensemble of genetic models for prediction of a quantitative trait.
Milton, Jacqueline N; Steinberg, Martin H; Sebastiani, Paola
2014-01-01
Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble.
Gender and Attraction: Predicting Middle School Performance Ensemble Participation
ERIC Educational Resources Information Center
Warnock, Emery C.
2009-01-01
This study was designed to predict middle school sixth graders' group membership in band (n = 81), chorus (n = 45), and as non-participants in music performance ensembles (n = 127), as determined by gender and factors on the Attraction Toward School Performance Ensemble (ATSPE) scale (alpha = 0.88). Students completed the ATSPE as elementary fifth…
Decision Support on the Sediments Flushing of Aimorés Dam Using Medium-Range Ensemble Forecasts
NASA Astrophysics Data System (ADS)
Mainardi Fan, Fernando; Schwanenberg, Dirk; Collischonn, Walter; Assis dos Reis, Alberto; Alvarado Montero, Rodolfo; Alencar Siqueira, Vinicius
2015-04-01
In the present study we investigate the use of medium-range streamflow forecasts in the Doce River basin (Brazil), at the reservoir of Aimorés Hydro Power Plant (HPP). During daily operations this reservoir acts as a "trap" to the sediments that originate from the upstream basin of the Doce River. This motivates a cleaning process called "pass through" to periodically remove the sediments from the reservoir. The "pass through" or "sediments flushing" process consists of a decrease of the reservoir's water level to a certain flushing level when a determined reservoir inflow threshold is forecasted. Then, the water in the approaching inflow is used to flush the sediments from the reservoir through the spillway and to recover the original reservoir storage. To be triggered, the sediments flushing operation requires an inflow larger than 3000 m³/s in a forecast horizon of 7 days. This lead time of 7 days is far beyond the basin's concentration time (around 2 days), meaning that the forecasts for the pass through procedure depend heavily on Numerical Weather Prediction (NWP) models that generate Quantitative Precipitation Forecasts (QPF). This dependency creates an environment with a high amount of uncertainty for the operator. To support the decision making at Aimorés HPP we developed a fully operational hydrological forecasting system for the basin. The system is capable of generating ensemble streamflow forecast scenarios when driven by QPF data from meteorological Ensemble Prediction Systems (EPS). This approach allows accounting for uncertainties in the NWP at a decision making level. This system is starting to be used operationally by CEMIG and is the one shown in the present study, including a hindcasting analysis to assess the performance of the system for the specific flushing problem. The QPF data used in the hindcasting study was derived from the TIGGE (THORPEX Interactive Grand Global Ensemble) database.
Among all EPS available on TIGGE, three were selected: ECMWF, GEFS, and CPTEC. As a deterministic reference forecast, we adopt the high resolution ECMWF forecast for comparison. The experiment consisted of running retrospective forecasts for a full five-year period. To verify the proposed objectives of the study, we use different metrics to evaluate the forecast: ROC curves, exceedance diagrams, and the Forecast Convergence Score (FCS). The metric results clarify the benefits of the hydrological ensemble prediction system as a decision making tool for the HPP operation. The ROC scores indicate that using the lower percentiles of the ensemble scenarios yields a true alarm rate of around 0.5 to 0.8 (depending on the model and on the percentile) for the lead time of seven days, while the false alarm rate is between 0 and 0.3. Those rates were better than the ones resulting from the deterministic reference forecast. Exceedance diagrams and forecast convergence scores indicate that the ensemble scenarios provide an early signal about the threshold crossing. Furthermore, the ensemble forecasts are more consistent between two subsequent forecasts in comparison to the deterministic forecast. The assessment results also give more credibility to CEMIG in carrying out the flushing operation and communicating it to the stakeholders involved.
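The decision logic described in this record can be sketched as follows: an ensemble of 7-day inflow forecasts is reduced to a percentile or an exceedance fraction against the 3000 m³/s threshold, and a series of past alarms is scored with the two rates that define a ROC point. The function names, the nearest-rank percentile rule, and the sample numbers are assumptions for illustration, not the operational CEMIG system.

```python
from math import ceil

FLUSHING_THRESHOLD_M3S = 3000.0  # inflow threshold stated in the abstract


def exceedance_fraction(members, threshold=FLUSHING_THRESHOLD_M3S):
    """Fraction of ensemble members forecasting an inflow above the threshold."""
    members = list(members)
    return sum(q > threshold for q in members) / len(members)


def percentile(members, pct):
    """Nearest-rank percentile of the ensemble (an assumed convention)."""
    ordered = sorted(members)
    rank = max(1, ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def alarm_rates(forecast_alarms, observed_events):
    """Return (true alarm rate, false alarm rate) from paired booleans."""
    pairs = list(zip(forecast_alarms, observed_events))
    hits = sum(f and o for f, o in pairs)
    misses = sum((not f) and o for f, o in pairs)
    false_alarms = sum(f and (not o) for f, o in pairs)
    correct_negatives = sum((not f) and (not o) for f, o in pairs)
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        raise ValueError("need at least one event and one non-event")
    return hits / (hits + misses), false_alarms / (false_alarms + correct_negatives)


if __name__ == "__main__":
    inflows = [2500.0, 2800.0, 3100.0, 3500.0, 4000.0]  # made-up members, m3/s
    print(exceedance_fraction(inflows))
    # Trigger an alarm when a low percentile already exceeds the threshold.
    print(percentile(inflows, 20) > FLUSHING_THRESHOLD_M3S)
```

Using a low percentile as the trigger means an alarm is raised only when most members agree on a threshold crossing, which trades a lower false alarm rate against some missed events.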
Wave ensemble forecast in the Western Mediterranean Sea, application to an early warning system.
NASA Astrophysics Data System (ADS)
Pallares, Elena; Hernandez, Hector; Moré, Jordi; Espino, Manuel; Sairouni, Abdel
2015-04-01
The Western Mediterranean Sea is a highly heterogeneous and variable area, as is reflected in the wind field, the current field, and the waves, mainly in the first kilometers offshore. As a result of this variability, wave forecasting in these regions is quite complicated, and usually suffers from accuracy problems during energetic storm events. Moreover, it is in these areas that most economic activities take place, including fisheries, sailing, tourism, coastal management and offshore renewable energy platforms. In order to introduce an indicator of the probability of occurrence of the different sea states and give more detailed forecast information to the end users, an ensemble wave forecast system is considered. Ensemble prediction systems have been used in recent decades for meteorological forecasting: to deal with the uncertainties in the initial conditions and in the different parametrizations used in the models, which may introduce errors in the forecast, a set of perturbed meteorological simulations is considered as possible future scenarios and compared with the deterministic forecast. In the present work, the SWAN wave model (v41.01) has been implemented for the Western Mediterranean Sea, forced with wind fields produced by the deterministic Global Forecast System (GFS) and the Global Ensemble Forecast System (GEFS). The wind fields include a deterministic forecast (also named control), between 11 and 21 ensemble members, and some "intelligent" members derived from the ensemble, such as the mean of all the members. Four buoys located in the study area, moored in coastal waters, have been used to validate the results. The outputs include all the time series, with a forecast horizon of 8 days and represented in spaghetti diagrams, the spread of the system and the probability at different thresholds.
The main goal of this exercise is to determine the degree of uncertainty of the wave forecast, which becomes meaningful between the 5th and the 8th day of the prediction. The information obtained is then included in an early warning system, designed in the framework of the European project iCoast (ECHO/SUB/2013/661009) with the aim of setting alarms in coastal areas depending on the wave conditions, the sea level, the flooding and the run-up at the coast.
Multimodel Ensemble Methods for Prediction of Wake-Vortex Transport and Decay Originating NASA
NASA Technical Reports Server (NTRS)
Korner, Stephan; Ahmad, Nashat N.; Holzapfel, Frank; VanValkenburg, Randal L.
2017-01-01
Several multimodel ensemble methods are selected and further developed to improve the deterministic and probabilistic prediction skills of individual wake-vortex transport and decay models. The different multimodel ensemble methods are introduced, and their suitability for wake applications is demonstrated. The selected methods include direct ensemble averaging, Bayesian model averaging, and Monte Carlo simulation. The different methodologies are evaluated employing data from wake-vortex field measurement campaigns conducted in the United States and Germany.
Reliable probabilities through statistical post-processing of ensemble predictions
NASA Astrophysics Data System (ADS)
Van Schaeybroeck, Bert; Vannitsem, Stéphane
2013-04-01
We develop post-processing or calibration approaches based on linear regression that make ensemble forecasts more reliable. We enforce climatological reliability in the sense that the total variability of the prediction is equal to the variability of the observations. Second, we impose ensemble reliability such that the spread around the ensemble mean of the observation coincides with the one of the ensemble members. In general the attractors of the model and reality are inhomogeneous. Therefore ensemble spread displays a variability not taken into account in standard post-processing methods. We overcome this by weighting the ensemble by a variable error. The approaches are tested in the context of the Lorenz 96 model (Lorenz 1996). The forecasts become more reliable at short lead times as reflected by a flatter rank histogram. Our best method turns out to be superior to well-established methods like EVMOS (Van Schaeybroeck and Vannitsem, 2011) and Nonhomogeneous Gaussian Regression (Gneiting et al., 2005). References [1] Gneiting, T., Raftery, A. E., Westveld, A., Goldman, T., 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Weather Rev. 133, 1098-1118. [2] Lorenz, E. N., 1996: Predictability - a problem partly solved. Proceedings, Seminar on Predictability ECMWF. 1, 1-18. [3] Van Schaeybroeck, B., and S. Vannitsem, 2011: Post-processing through linear regression, Nonlin. Processes Geophys., 18, 147.
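The rank histogram mentioned above as a reliability diagnostic can be computed as in the sketch below: for each forecast case, the observation is ranked within the ensemble, and the ranks are tallied. A reliable ensemble gives a roughly flat histogram. The function name and the tie handling (ties count as "not smaller") are assumptions for illustration.

```python
def rank_histogram(ensembles, observations):
    """Tally the rank of each observation within its ensemble.

    ensembles: list of equally sized lists of member values, one per case.
    observations: list of verifying values, one per case.
    Returns a list of n_members + 1 counts.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        if len(members) != n_members:
            raise ValueError("all ensembles must have the same size")
        rank = sum(m < obs for m in members)  # members strictly below the observation
        counts[rank] += 1
    return counts


if __name__ == "__main__":
    cases = [[1.0, 2.0, 3.0]] * 3
    print(rank_histogram(cases, [0.5, 2.5, 3.5]))
```

A U-shaped histogram would indicate an under-dispersive ensemble, and a dome-shaped one an over-dispersive ensemble.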
NOAA Climate Program Office Contributions to National ESPC
NASA Astrophysics Data System (ADS)
Higgins, W.; Huang, J.; Mariotti, A.; Archambault, H. M.; Barrie, D.; Lucas, S. E.; Mathis, J. T.; Legler, D. M.; Pulwarty, R. S.; Nierenberg, C.; Jones, H.; Cortinas, J. V., Jr.; Carman, J.
2016-12-01
NOAA is one of five federal agencies (DOD, DOE, NASA, NOAA, and NSF) which signed an updated charter in 2016 to partner on the National Earth System Prediction Capability (ESPC). Situated within NOAA's Office of Oceanic and Atmospheric Research (OAR), NOAA Climate Program Office (CPO) programs contribute significantly to the National ESPC goals and activities. This presentation will provide an overview of CPO contributions to National ESPC. First, we will discuss selected CPO research and transition activities that directly benefit the ESPC coupled model prediction capability, including: the North American Multi-Model Ensemble (NMME) seasonal prediction system; the Subseasonal Experiment (SubX) project to test real-time subseasonal ensemble prediction systems; and improvements to the NOAA operational Climate Forecast System (CFS), including software infrastructure and data assimilation. Next, we will show how CPO's foundational research activities are advancing future ESPC capabilities. Highlights will include: the Tropical Pacific Observing System (TPOS) to provide the basis for predicting climate on subseasonal to decadal timescales; Subseasonal-to-Seasonal (S2S) processes and predictability studies to improve understanding, modeling and prediction of the MJO; an Arctic Research Program to address urgent needs for advancing monitoring and prediction capabilities in this major area of concern; and advances towards building an experimental multi-decadal prediction system through studies on the Atlantic Meridional Overturning Circulation (AMOC). Finally, CPO has embraced Integrated Information Systems (IIS's) that build on the innovation of programs such as the National Integrated Drought Information System (NIDIS) to develop and deliver end-to-end environmental information for key societal challenges (e.g. extreme heat; coastal flooding). These contributions will help the National ESPC better understand and address societal needs and decision support requirements.
A short-term ensemble wind speed forecasting system for wind power applications
NASA Astrophysics Data System (ADS)
Baidya Roy, S.; Traiteur, J. J.; Callicutt, D.; Smith, M.
2011-12-01
This study develops an adaptive, blended forecasting system to provide accurate wind speed forecasts 1 hour ahead of time for wind power applications. The system consists of an ensemble of 21 forecasts with different configurations of the Weather Research and Forecasting Single Column Model (WRFSCM) and a persistence model. The ensemble is calibrated against observations for a 2 month period (June-July, 2008) at a potential wind farm site in Illinois using the Bayesian Model Averaging (BMA) technique. The forecasting system is evaluated against observations for August 2008 at the same site. The calibrated ensemble forecasts significantly outperform the forecasts from the uncalibrated ensemble while significantly reducing forecast uncertainty under all environmental stability conditions. The system also generates significantly better forecasts than persistence, autoregressive (AR) and autoregressive moving average (ARMA) models during the morning transition and the diurnal convective regimes. This forecasting system is computationally more efficient than traditional numerical weather prediction models and can generate a calibrated forecast, including model runs and calibration, in approximately 1 minute. Currently, hour-ahead wind speed forecasts are almost exclusively produced using statistical models. However, numerical models have several distinct advantages over statistical models including the potential to provide turbulence forecasts. Hence, there is an urgent need to explore the role of numerical models in short-term wind speed forecasting. This work is a step in that direction and is likely to trigger a debate within the wind speed forecasting community.
Fire spread estimation on forest wildfire using ensemble kalman filter
NASA Astrophysics Data System (ADS)
Syarifah, Wardatus; Apriliani, Erna
2018-04-01
Wildfire is one of the most frequent disasters in the world; forest wildfires, for example, cause forest populations to decrease. Forest wildfires, whether naturally occurring or prescribed, pose potential risks to ecosystems and human settlements. These risks can be managed by monitoring the weather, prescribing fires to limit available fuel, and creating firebreaks. With computer simulations we can predict and explore how fires may spread. A model of fire spread in forest wildfires was established to determine the fire properties. The fire spread model is based on a reaction-diffusion equation. There are many methods to estimate the spread of fire. The Ensemble Kalman Filter (EnKF) is a modification of the Kalman Filter algorithm that can be used to estimate both linear and non-linear system models. In this research we apply the EnKF to estimate the spread of fire in forest wildfires. Before the EnKF is applied, the fire spread model is discretized using the finite difference method. Finally, the analysis is illustrated by numerical simulation. The simulation results show that the EnKF estimate is closer to the system model when the ensemble size is larger and when the covariances of the system model and of the measurement are smaller.
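To make the EnKF analysis step concrete, here is a minimal sketch for a single scalar state that is observed directly. It uses the perturbed-observation variant of the filter, which is an assumption; the abstract does not say which variant the authors used, and their state is a discretized reaction-diffusion field rather than a scalar. All names and numbers are illustrative.

```python
import random
from statistics import mean, variance


def enkf_update(forecast, obs, obs_var, rng):
    """Perturbed-observation EnKF analysis for a scalar state with H = 1.

    forecast: list of forecast ensemble members.
    obs: the observed value; obs_var: observation error variance.
    rng: a random.Random instance used to perturb the observation.
    """
    p_f = variance(forecast)            # sample forecast error variance
    gain = p_f / (p_f + obs_var)        # Kalman gain for a directly observed scalar
    obs_sd = obs_var ** 0.5
    # Each member is nudged toward its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_sd) - x) for x in forecast]


if __name__ == "__main__":
    rng = random.Random(42)
    prior = [rng.gauss(0.0, 1.0) for _ in range(50)]   # made-up forecast ensemble
    posterior = enkf_update(prior, obs=2.0, obs_var=0.5, rng=rng)
    print(round(mean(prior), 3), round(mean(posterior), 3))
```

A smaller observation variance gives a gain closer to 1, so the analysis ensemble collapses toward the observation; a larger ensemble gives a less noisy estimate of the forecast variance and hence of the gain.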
Sea surface temperature predictions using a multi-ocean analysis ensemble scheme
NASA Astrophysics Data System (ADS)
Zhang, Ying; Zhu, Jieshun; Li, Zhongxian; Chen, Haishan; Zeng, Gang
2017-08-01
This study examined the global sea surface temperature (SST) predictions by a so-called multiple-ocean analysis ensemble (MAE) initialization method which was applied in the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2). Different from most operational climate prediction practices which are initialized by a specific ocean analysis system, the MAE method is based on multiple ocean analyses. In the paper, the MAE method was first justified by analyzing the ocean temperature variability in four ocean analyses which all are/were applied for operational climate predictions either at the European Centre for Medium-range Weather Forecasts or at NCEP. It was found that these systems exhibit substantial uncertainties in estimating the ocean states, especially at the deep layers. Further, a set of MAE hindcasts was conducted based on the four ocean analyses with CFSv2, starting from each April during 1982-2007. The MAE hindcasts were verified against a subset of hindcasts from the NCEP CFS Reanalysis and Reforecast (CFSRR) Project. Comparisons suggested that MAE shows better SST predictions than CFSRR over most regions where ocean dynamics plays a vital role in SST evolutions, such as the El Niño and Atlantic Niño regions. Furthermore, significant improvements were also found in summer precipitation predictions over the equatorial eastern Pacific and Atlantic oceans, for which the local SST prediction improvements should be responsible. The prediction improvements by MAE imply a problem for most current climate predictions which are based on a specific ocean analysis system. That is, their predictions would drift towards states biased by errors inherent in their ocean initialization system, and thus have large prediction errors. In contrast, MAE arguably has an advantage by sampling such structural uncertainties, and could efficiently cancel these errors out in their predictions.
The Predictability of Dry-Season Precipitation in Tropical West Africa
NASA Astrophysics Data System (ADS)
Knippertz, P.; Davis, J.; Fink, A. H.
2012-04-01
Precipitation during the boreal winter dry season in tropical West Africa is rare but occasionally has a high impact on the local population. Previous work has shown that these events are usually connected to a trough over northwestern Africa, an extensive cloud plume on its eastern side, unusual precipitation at the northern and western fringes of the Sahara, and reduced surface pressure over the southern Sahara and Sahel, which allows an inflow of moist southerlies from the Gulf of Guinea to feed the unusual dry-season rainfalls. These results also suggest that the extratropical influence enhances the predictability of these events on the synoptic timescale. Here we further investigate this question for the 11 dry seasons (November-March) 1998/99-2008/09 using rainfall estimates from TRMM (Tropical Rainfall Measuring Mission) and GPCP (Global Precipitation Climatology Project), and operational ensemble predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF). All fields are averaged over the study area 7.5-15°N, 10°W-10°E that spans most of southern West Africa. For each 0000 UTC analysis time, the daily precipitation estimates are accumulated to pentads and compared with 120-hour predictions starting at the same time. Compared to TRMM, the ensemble mean shows a weak positive bias, whereas there is a substantial negative bias with regard to GPCP. Temporal correlations reach a high value of 0.8 for both datasets, showing similar synoptic variability despite the differences in total amount. Standard probabilistic evaluation methods such as relative operating characteristic (ROC) diagrams indicate remarkably good reliability, resolution and skill, particularly for lower precipitation thresholds. Not surprisingly, forecasts cluster at low probabilities for higher thresholds, but the reliability and ROC score are still reasonably high.
The results show that global ensemble prediction systems are capable of predicting dry-season rainfall events in southern West Africa well, at least on regional spatial and synoptic time scales. These results should encourage West African weather services to capitalize more on the valuable information provided by ensemble prediction systems during the dry season.
Short-Range prediction of a Mediterranean Severe weather event using EnKF: Configuration tests
NASA Astrophysics Data System (ADS)
Carrio Carrio, Diego Saul; Homar Santaner, Víctor
2014-05-01
On the afternoon of 4th October 2007, severe damaging winds and torrential rainfall affected the Island of Mallorca. This storm produced F2-F3 tornadoes in the vicinity of Palma, with one person killed and estimated damage to property exceeding 10 M€. Several studies have analysed the meteorological context in which this episode unfolded, describing the formation of a train of multiple thunderstorms along a warm front and the evolution of a squall line organized from convective activity initiated offshore Murcia during that morning. Couhet et al. (2011) attributed the correct simulation of the convective system, and particularly its organization as a squall line, to the correct representation of a convergence line at low levels over the Alboran Sea during the first hours of the day. The numerical prediction of mesoscale phenomena which initiate, organize and evolve over the sea is an extremely demanding challenge of great importance for coastal regions. In this study, we investigate the skill of a mesoscale ensemble data assimilation system in predicting the severe phenomena that occurred on 4th October 2007. We use an Ensemble Kalman Filter which assimilates conventional (surface, radiosonde and AMDAR) data using the DART implementation from NCAR. On the one hand, we analyse the potential of the assimilation cycle to advect critical observational data towards decisive data-void areas over the sea. On the other hand, we assess the sensitivity of the ensemble products to the ensemble size, grid resolution, assimilation period and physics diversity in the mesoscale model. In particular, we focus on the effect of these numerical configurations on the representation of the convective activity and the precipitation field, as valuable predictands of high impact weather.
Results show that the 6-h EnKF assimilation period produces initial fields that successfully represent the environment in which initiation occurred, and thus the derived numerical predictions render improved evolutions of the squall line. Synthetic maps of severe convective risk reveal the improved predictability of the event using the EnKF as opposed to deterministic or downscaled configurations. Discussion on further improvements to the forecasting systems is provided.
NASA Astrophysics Data System (ADS)
Zunz, Violette; Goosse, Hugues; Dubinkina, Svetlana
2015-04-01
In this study, we systematically assess the impact of different initialisation procedures on the predictability of the sea ice in the Southern Ocean. These initialisation strategies are based on three data assimilation methods: nudging, the particle filter with sequential importance resampling, and the nudging proposal particle filter. An Earth system model of intermediate complexity is used to perform hindcast simulations in a perfect model approach. The predictability of the Antarctic sea ice at interannual to multi-decadal timescales is estimated through two aspects: the spread of the hindcast ensemble, indicating the uncertainty of the ensemble, and the correlation between the ensemble mean and the pseudo-observations, used to assess the accuracy of the prediction. Our results show that at decadal timescales more sophisticated data assimilation methods as well as denser pseudo-observations used to initialise the hindcasts decrease the spread of the ensemble. However, our experiments did not clearly demonstrate that any one of the initialisation methods systematically provides a more accurate prediction of the sea ice in the Southern Ocean than the others. Overall, the predictability at interannual timescales is limited to 3 years ahead at most. At multi-decadal timescales, the trends in sea ice extent computed over the time period just after the initialisation are clearly better correlated between the hindcasts and the pseudo-observations if the initialisation takes into account the pseudo-observations. The correlation reaches values larger than 0.5 in winter. This high correlation likely has its origin in the slow evolution of the ocean ensured by its strong thermal inertia, showing the importance of the quality of the initialisation below the sea ice.
NASA Astrophysics Data System (ADS)
Schunk, R. W.; Scherliess, L.; Eccles, V.; Gardner, L. C.; Sojka, J. J.; Zhu, L.; Pi, X.; Mannucci, A. J.; Komjathy, A.; Wang, C.; Rosen, G.
2016-12-01
As part of the NASA-NSF Space Weather Modeling Collaboration, we created a Multimodel Ensemble Prediction System (MEPS) for the Ionosphere-Thermosphere-Electrodynamics system that is based on Data Assimilation (DA) models. MEPS is composed of seven physics-based data assimilation models that cover the globe. Ensemble modeling can be conducted for the mid-low latitude ionosphere using the four GAIM data assimilation models, including the Gauss Markov (GM), Full Physics (FP), Band Limited (BL) and 4DVAR DA models. These models can assimilate Total Electron Content (TEC) from a constellation of satellites, bottom-side electron density profiles from digisondes, in situ plasma densities, occultation data and ultraviolet emissions. The four GAIM models were run for the March 16-17, 2013, geomagnetic storm period with the same data, but we also systematically added new data types and re-ran the GAIM models to see how the different data types affected the GAIM results, with the emphasis on elucidating differences in the underlying ionospheric dynamics and thermospheric coupling. Also, for each scenario the outputs from the four GAIM models were used to produce an ensemble mean for TEC, NmF2, and hmF2. A simple average of the models was used in the ensemble averaging to see if there was an improvement of the ensemble average over the individual models. For the scenarios considered, the ensemble average yielded better specifications than the individual GAIM models. The model differences and averages, and the consequent differences in ionosphere-thermosphere coupling and dynamics will be discussed.
No-Reference Image Quality Assessment by Wide-Perceptual-Domain Scorer Ensemble Method.
Liu, Tsung-Jung; Liu, Kuan-Hsien
2018-03-01
A no-reference (NR) learning-based approach to assess image quality is presented in this paper. The devised features are extracted from wide perceptual domains, including brightness, contrast, color, distortion, and texture. These features are used to train a model (scorer) which can predict scores. The scorer selection algorithms are utilized to help simplify the proposed system. In the final stage, the ensemble method is used to combine the prediction results from selected scorers. Two multiple-scale versions of the proposed approach are also presented along with the single-scale one. They turn out to have better performances than the original single-scale method. Because of having features from five different domains at multiple image scales and using the outputs (scores) from selected score prediction models as features for multi-scale or cross-scale fusion (i.e., ensemble), the proposed NR image quality assessment models are robust with respect to more than 24 image distortion types. They also can be used on the evaluation of images with authentic distortions. The extensive experiments on three well-known and representative databases confirm the performance robustness of our proposed model.
NASA Astrophysics Data System (ADS)
Alvarez-Garreton, C.; Ryu, D.; Western, A. W.; Su, C.-H.; Crow, W. T.; Robertson, D. E.; Leahy, C.
2014-09-01
Assimilation of remotely sensed soil moisture data (SM-DA) to correct soil water stores of rainfall-runoff models has shown skill in improving streamflow prediction. In the case of large and sparsely monitored catchments, SM-DA is a particularly attractive tool. Within this context, we assimilate active and passive satellite soil moisture (SSM) retrievals using an ensemble Kalman filter to improve operational flood prediction within a large semi-arid catchment in Australia (>40 000 km2). We assess the importance of accounting for channel routing and the spatial distribution of forcing data by applying SM-DA to a lumped and a semi-distributed scheme of the probability distributed model (PDM). Our scheme also accounts for model error representation and seasonal biases and errors in the satellite data. Before assimilation, the semi-distributed model provided more accurate streamflow prediction (Nash-Sutcliffe efficiency, NS = 0.77) than the lumped model (NS = 0.67) at the catchment outlet. However, this did not ensure good performance at the "ungauged" inner catchments. After SM-DA, the streamflow ensemble prediction at the outlet was improved in both the lumped and the semi-distributed schemes: the root mean square error of the ensemble was reduced by 27 and 31%, respectively; the NS of the ensemble mean increased by 7 and 38%, respectively; the false alarm ratio was reduced by 15 and 25%, respectively; and the ensemble prediction spread was reduced while its reliability was maintained. Our findings imply that even when rainfall is the main driver of flooding in semi-arid catchments, adequately processed SSM can be used to reduce errors in the model soil moisture, which in turn provides better streamflow ensemble prediction. We demonstrate that SM-DA efficacy is enhanced when the spatial distribution in forcing data and routing processes are accounted for. 
At ungauged locations, SM-DA is effective at improving streamflow ensemble prediction; however, the updated prediction is still poor since SM-DA does not address systematic errors in the model.
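For readers unfamiliar with the two skill scores quoted in this abstract, the sketch below computes the Nash-Sutcliffe efficiency (NS) and the root mean square error from paired observed and simulated streamflow. The function names and sample values are illustrative assumptions; the formulas are the standard definitions.

```python
from math import sqrt


def nash_sutcliffe(observed, simulated):
    """NS = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    obs_mean = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - residual / spread


def rmse(observed, simulated):
    """Root mean square error between paired series."""
    n = len(observed)
    return sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]      # made-up streamflow values
    sim = [1.1, 1.9, 3.2, 3.7]
    print(nash_sutcliffe(obs, sim), rmse(obs, sim))
```

NS equals 1 for a perfect simulation and 0 for a simulation that is no better than the observed mean, so the reported values of 0.67 and 0.77 indicate a clear gain over that baseline.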
Ensemble method for dengue prediction.
Buczak, Anna L; Baugher, Benjamin; Moniz, Linda J; Bagley, Thomas; Babin, Steven M; Guven, Erhan
2018-01-01
In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru.
Ensemble method for dengue prediction
Baugher, Benjamin; Moniz, Linda J.; Bagley, Thomas; Babin, Steven M.; Guven, Erhan
2018-01-01
Background In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Methods Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Principal findings Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. Conclusions The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for a transmission season and peak height for Iquitos, Peru. PMID:29298320
Post-processing of global model output to forecast point rainfall
NASA Astrophysics Data System (ADS)
Hewson, Tim; Pillosu, Fatima
2016-04-01
ECMWF (the European Centre for Medium range Weather Forecasts) has recently embarked upon a new project to post-process gridbox rainfall forecasts from its ensemble prediction system, to provide probabilistic forecasts of point rainfall. The new post-processing strategy relies on understanding how different rainfall generation mechanisms lead to different degrees of sub-grid variability in rainfall totals. We use a number of simple global model parameters, such as the convective rainfall fraction, to anticipate the sub-grid variability, and then post-process each ensemble forecast into a pdf (probability density function) for a point-rainfall total. The final forecast will comprise the sum of the different pdfs from all ensemble members. The post-processing is essentially a re-calibration exercise, which needs only rainfall totals from standard global reporting stations (and forecasts) to train it. High density observations are not needed. This presentation will describe results from the initial 'proof of concept' study, which has been remarkably successful. Reference will also be made to other useful outcomes of the work, such as gaining insights into systematic model biases in different synoptic settings. The special case of orographic rainfall will also be discussed. Work ongoing this year will also be described. This involves further investigations of which model parameters can provide predictive skill, and will then move on to development of an operational system for predicting point rainfall across the globe. The main practical benefit of this system will be a greatly improved capacity to predict extreme point rainfall, and thereby provide early warnings, for the whole world, of flash flood potential for lead times that extend beyond day 5. This will be incorporated into the suite of products output by GLOFAS (the GLObal Flood Awareness System) which is hosted at ECMWF. 
As such this work offers a very cost-effective approach to satisfying user needs right around the world. This field has hitherto relied on using very expensive high-resolution ensembles; by their very nature these can only run over small regions, and only for lead times up to about 2 days.
Simultaneous calibration of ensemble river flow predictions over an entire range of lead times
NASA Astrophysics Data System (ADS)
Hemri, S.; Fundel, F.; Zappa, M.
2013-10-01
Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess ensemble runoff raw forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage against univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
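The univariate BMA step described in this abstract can be illustrated with a minimal sketch. It assumes the BMA weights, the common standard deviation and the Box-Cox parameter have already been estimated (in practice typically by maximum likelihood or EM), and it omits the bias-correction regression that BMA usually applies to each member. The function names and the numbers are hypothetical and are not taken from the paper.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def bma_cdf(y, forecasts, weights, sd, lam):
    """Predictive CDF at runoff y: a weighted mixture of normals in Box-Cox space,
    each centred on one transformed member forecast (no bias correction here)."""
    z = box_cox(y, lam)
    return sum(w * normal_cdf(z, box_cox(f, lam), sd)
               for f, w in zip(forecasts, weights))

# Hypothetical example: two member forecasts of runoff with equal weights.
p_below_10 = bma_cdf(10.0, forecasts=[8.0, 12.0], weights=[0.5, 0.5], sd=0.5, lam=0.3)
```

The multivariate variant mentioned in the abstract would replace the univariate normals with multivariate normals over all lead times; that step is not sketched here.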
Visualization and classification of physiological failure modes in ensemble hemorrhage simulation
NASA Astrophysics Data System (ADS)
Zhang, Song; Pruett, William Andrew; Hester, Robert
2015-01-01
In an emergency situation such as hemorrhage, doctors need to predict which patients need immediate treatment and care. This task is difficult because of the diverse responses to hemorrhage in the human population. Ensemble physiological simulations provide a means to sample a diverse range of subjects and may have a better chance of containing the correct solution. However, revealing the patterns and trends in an ensemble simulation is a challenging task. We have developed a visualization framework for ensemble physiological simulations. The visualization helps users identify trends among ensemble members, classify ensemble members into subpopulations for analysis, and predict future events by matching a new patient's data to existing ensembles. We demonstrated the effectiveness of the visualization on simulated physiological data. The lessons learned here can be applied to clinically-collected physiological data in the future.
A Sequential Ensemble Prediction System at Convection Permitting Scales
NASA Astrophysics Data System (ADS)
Milan, M.; Simmer, C.
2012-04-01
A Sequential Assimilation Method (SAM) following some aspects of particle filtering with resampling, also called SIR (Sequential Importance Resampling), is introduced and applied in the framework of an Ensemble Prediction System (EPS) for weather forecasting on convection-permitting scales, with a focus on precipitation forecasts. At this scale and beyond, the atmosphere increasingly exhibits chaotic behaviour and nonlinear state space evolution due to convectively driven processes. One way to take full account of nonlinear state developments is to use particle filter methods, whose basic idea is to represent the model probability density function by a number of ensemble members weighted by their likelihood given the observations. In particular, a particle filter with resampling abandons ensemble members (particles) with low weights and restores the original number of particles by adding multiple copies of the members with high weights. In our SIR-like implementation we replace the likelihood-based definition of the weights with a metric which quantifies the "distance" between the observed atmospheric state and the states simulated by the ensemble members. We also introduce a methodology to counteract filter degeneracy, i.e. the collapse of the simulated state space. To this end we propose a combination of nudging and resampling that takes account of simulated state space clustering. By keeping cluster representatives during resampling and filtering, the method maintains the potential for nonlinear system state development. We assume that a particle cluster with initially low likelihood may evolve into a region of state space with higher likelihood at a subsequent filter time, thus mimicking nonlinear system state developments (e.g. sudden convection initiation) and remedying timing errors for convection due to model errors and/or imperfect initial conditions.
We apply a simplified version of the resampling: the particles with the highest weights in each cluster are duplicated, and for each particle pair one particle evolves using the forward model, while the second particle is nudged toward the radar and satellite observations during its forward-model evolution.
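As a rough illustration of the resampling idea described above, the following sketch converts member-to-observation distances into normalized weights and then applies generic systematic resampling. It is not the authors' SAM: the exponential distance-to-weight mapping, the `scale` parameter and the function names are assumptions made for this example, and the clustering and nudging steps are left out.

```python
import math
import random

def weights_from_distances(distances, scale=1.0):
    """Map distances to normalized weights; smaller distance gives larger weight.
    The exponential form is an assumption, not the metric used in the abstract."""
    raw = [math.exp(-d / scale) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]

def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling.
    Low-weight particles tend to be dropped, high-weight ones duplicated,
    and the number of particles is preserved."""
    n = len(weights)
    start = rng.random() / n          # single random offset in [0, 1/n)
    indices = []
    cumulative = weights[0]
    j = 0
    for i in range(n):
        u = start + i / n             # evenly spaced pointers
        while u >= cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        indices.append(j)
    return indices

# Hypothetical distances of four ensemble members to the observed state.
w = weights_from_distances([0.2, 3.0, 0.5, 4.0])
survivors = systematic_resample(w, random.Random(0))
```

Each entry of `survivors` names the member whose state would be copied into that slot of the new ensemble.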
How do I know if I’ve improved my continental scale flood early warning system?
NASA Astrophysics Data System (ADS)
Cloke, Hannah L.; Pappenberger, Florian; Smith, Paul J.; Wetterhall, Fredrik
2017-04-01
Flood early warning systems mitigate damages and loss of life and are an economically efficient way of enhancing disaster resilience. The use of continental scale flood early warning systems is rapidly growing. The European Flood Awareness System (EFAS) is a pan-European flood early warning system forced by a multi-model ensemble of numerical weather predictions. Responses to scientific and technical changes can be complex in these computationally expensive continental scale systems, and improvements need to be tested by evaluating runs of the whole system. It is demonstrated here that forecast skill is not correlated with the value of warnings. In order to tell if the system has been improved an evaluation strategy is required that considers both forecast skill and warning value. The combination of a multi-forcing ensemble of EFAS flood forecasts is evaluated with a new skill-value strategy. The full multi-forcing ensemble is recommended for operational forecasting, but, there are spatial variations in the optimal forecast combination. Results indicate that optimizing forecasts based on value rather than skill alters the optimal forcing combination and the forecast performance. Also indicated is that model diversity and ensemble size are both important in achieving best overall performance. The use of several evaluation measures that consider both skill and value is strongly recommended when considering improvements to early warning systems.
Simulation skill of APCC set of global climate models for Asian summer monsoon rainfall variability
NASA Astrophysics Data System (ADS)
Singh, U. K.; Singh, G. P.; Singh, Vikas
2015-04-01
The performance of 11 Asia-Pacific Economic Cooperation Climate Center (APCC) global climate models (both coupled and uncoupled) in simulating the seasonal summer (June-August) monsoon rainfall variability over Asia (especially over India and East Asia) has been evaluated in detail using hind-cast data (3 months advance) generated from APCC, which provides regional climate information product services based on multi-model ensemble dynamical seasonal prediction systems. The skill of each global climate model over Asia was tested separately in detail for the period of 21 years (1983-2003), and simulated Asian summer monsoon rainfall (ASMR) has been verified using various statistical measures for Indian and East Asian land masses separately. The analysis found a large variation in spatial ASMR simulated with uncoupled models compared to coupled models (like Predictive Ocean Atmosphere Model for Australia, National Centers for Environmental Prediction and Japan Meteorological Agency). The simulated ASMR in coupled models was closer to the Climate Prediction Centre Merged Analysis of Precipitation (CMAP) than in uncoupled models, although the amount of ASMR was underestimated in both types of model. The analysis also found a high spread in simulated ASMR among the ensemble members (suggesting that a model's performance is highly dependent on its initial conditions). The correlation analysis between sea surface temperature (SST) and ASMR shows that the SST-ASMR association is stronger in the coupled models than in the uncoupled models (suggesting that air-sea interaction is well represented in coupled models). The analysis of rainfall using various statistical measures suggests that the multi-model ensemble (MME) performed better than individual models, and also that separate study of the Indian and East Asian land masses is more useful than study of Asian monsoon rainfall as a whole.
The results of various statistical measures like skill of multi-model ensemble, large spread among the ensemble members of individual model, strong teleconnection (correlation analysis) with SST, coefficient of variation, inter-annual variability, analysis of Taylor diagram, etc. suggest that there is a need to improve coupled model instead of uncoupled model for the development of a better dynamical seasonal forecast system.
Evolutionary Wavelet Neural Network ensembles for breast cancer and Parkinson's disease prediction.
Khan, Maryam Mahsal; Mendes, Alexandre; Chalup, Stephan K
2018-01-01
Wavelet Neural Networks are a combination of neural networks and wavelets and have been mostly used in the area of time-series prediction and control. Recently, Evolutionary Wavelet Neural Networks have been employed to develop cancer prediction models. The present study proposes to use ensembles of Evolutionary Wavelet Neural Networks. The search for a high quality ensemble is directed by a fitness function that incorporates the accuracy of the classifiers both independently and as part of the ensemble itself. The ensemble approach is tested on three publicly available biomedical benchmark datasets, one on Breast Cancer and two on Parkinson's disease, using a 10-fold cross-validation strategy. Our experimental results show that, for the first dataset, the performance was similar to previous studies reported in literature. On the second dataset, the Evolutionary Wavelet Neural Network ensembles performed better than all previous methods. The third dataset is relatively new and this study is the first to report benchmark results.
Evolutionary Wavelet Neural Network ensembles for breast cancer and Parkinson’s disease prediction
Mendes, Alexandre; Chalup, Stephan K.
2018-01-01
Wavelet Neural Networks are a combination of neural networks and wavelets and have been mostly used in the area of time-series prediction and control. Recently, Evolutionary Wavelet Neural Networks have been employed to develop cancer prediction models. The present study proposes to use ensembles of Evolutionary Wavelet Neural Networks. The search for a high quality ensemble is directed by a fitness function that incorporates the accuracy of the classifiers both independently and as part of the ensemble itself. The ensemble approach is tested on three publicly available biomedical benchmark datasets, one on Breast Cancer and two on Parkinson’s disease, using a 10-fold cross-validation strategy. Our experimental results show that, for the first dataset, the performance was similar to previous studies reported in literature. On the second dataset, the Evolutionary Wavelet Neural Network ensembles performed better than all previous methods. The third dataset is relatively new and this study is the first to report benchmark results. PMID:29420578
Chaos and random matrices in supersymmetric SYK
NASA Astrophysics Data System (ADS)
Hunter-Jones, Nicholas; Liu, Junyu
2018-05-01
We use random matrix theory to explore late-time chaos in supersymmetric quantum mechanical systems. Motivated by the recent study of supersymmetric SYK models and their random matrix classification, we consider the Wishart-Laguerre unitary ensemble and compute the spectral form factors and frame potentials to quantify chaos and randomness. Compared to the Gaussian ensembles, we observe the absence of a dip regime in the form factor and a slower approach to Haar-random dynamics. We find agreement between our random matrix analysis and predictions from the supersymmetric SYK model, and discuss the implications for supersymmetric chaotic systems.
NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available. 
PMID:24667482
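The rank-wise averaging of several feature importance methods mentioned in this abstract can be sketched as follows. The tie-breaking rule (input order), the function names and the toy scores are assumptions for illustration and do not come from the NIMEFI implementation.

```python
def ranks(scores):
    """Rank candidate regulators by importance: rank 1 is the highest score.
    Ties are broken by input order (an arbitrary choice for this sketch)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def rankwise_average(score_lists):
    """Average the per-method ranks of each candidate; lower means more important."""
    per_method = [ranks(s) for s in score_lists]
    n_candidates = len(score_lists[0])
    return [sum(r[i] for r in per_method) / len(per_method)
            for i in range(n_candidates)]

# Hypothetical importance scores for three candidate regulators of one target
# gene, as produced by two different ensemble feature importance methods.
method_a = [0.1, 0.9, 0.5]
method_b = [0.2, 0.3, 0.8]
combined = rankwise_average([method_a, method_b])
```

Working with ranks rather than raw scores avoids having to put importance values from different methods on a common scale.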
Multi-RCM ensemble downscaling of global seasonal forecasts (MRED)
NASA Astrophysics Data System (ADS)
Arritt, R. W.
2008-12-01
The Multi-RCM Ensemble Downscaling (MRED) project was recently initiated to address the question, Can regional climate models provide additional useful information from global seasonal forecasts? MRED will use a suite of regional climate models to downscale seasonal forecasts produced by the new National Centers for Environmental Prediction (NCEP) Climate Forecast System (CFS) seasonal forecast system and the NASA GEOS5 system. The initial focus will be on wintertime forecasts in order to evaluate topographic forcing, snowmelt, and the potential usefulness of higher resolution, especially for near-surface fields influenced by high resolution orography. Each regional model will cover the conterminous US (CONUS) at approximately 32 km resolution, and will perform an ensemble of 15 runs for each year 1982-2003 for the forecast period 1 December - 30 April. MRED will compare individual regional and global forecasts as well as ensemble mean precipitation and temperature forecasts, which are currently being used to drive macroscale land surface models (LSMs), as well as wind, humidity, radiation, turbulent heat fluxes, which are important for more advanced coupled macro-scale hydrologic models. Metrics of ensemble spread will also be evaluated. Extensive analysis will be performed to link improvements in downscaled forecast skill to regional forcings and physical mechanisms. Our overarching goal is to determine what additional skill can be provided by a community ensemble of high resolution regional models, which we believe will eventually define a strategy for more skillful and useful regional seasonal climate forecasts.
Morabito, Marco; Pavlinic, Daniela Z; Crisci, Alfonso; Capecchi, Valerio; Orlandini, Simone; Mekjavic, Igor B
2011-07-01
Military and civil defense personnel are often involved in complex activities in a variety of outdoor environments. The choice of appropriate clothing ensembles represents an important strategy to establish the success of a military mission. The main aim of this study was to compare the known clothing insulation of the garment ensembles worn by soldiers during two winter outdoor field trials (hike and guard duty) with the estimated optimal clothing thermal insulations recommended to maintain thermoneutrality, assessed by using two different biometeorological procedures. The overall aim was to assess the applicability of such biometeorological procedures to weather forecast systems, thereby developing a comprehensive biometeorological tool for military operational forecast purposes. Military trials were carried out during winter 2006 in Pokljuka (Slovenia) by Slovene Armed Forces personnel. Gastrointestinal temperature, heart rate and environmental parameters were measured with portable data acquisition systems. The thermal characteristics of the clothing ensembles worn by the soldiers, namely thermal resistance, were determined with a sweating thermal manikin. Results showed that the clothing ensemble worn by the military was appropriate during guard duty but generally inappropriate during the hike. A general under-estimation of the biometeorological forecast model in predicting the optimal clothing insulation value was observed and an additional post-processing calibration might further improve forecast accuracy. This study represents the first step in the development of a comprehensive personalized biometeorological forecast system aimed at improving recommendations regarding the optimal thermal insulation of military garment ensembles for winter activities.
NASA Astrophysics Data System (ADS)
O'Connor, Alison; Kirtman, Benjamin; Harrison, Scott; Gorman, Joe
2016-05-01
The US Navy faces several limitations when planning operations in regard to forecasting environmental conditions. Currently, mission analysis and planning tools rely heavily on short-term (less than a week) forecasts or long-term statistical climate products. However, newly available data in the form of weather forecast ensembles provides dynamical and statistical extended-range predictions that can produce more accurate predictions if ensemble members can be combined correctly. Charles River Analytics is designing the Climatological Observations for Maritime Prediction and Analysis Support Service (COMPASS), which performs data fusion over extended-range multi-model ensembles, such as the North American Multi-Model Ensemble (NMME), to produce a unified forecast for several weeks to several seasons in the future. We evaluated thirty years of forecasts using machine learning to select predictions for an all-encompassing and superior forecast that can be used to inform the Navy's decision planning process.
NASA Astrophysics Data System (ADS)
Pinson, Pierre
2016-04-01
The operational management of renewable energy generation in power systems and electricity markets requires forecasts in various forms, e.g., deterministic or probabilistic, continuous or categorical, depending upon the decision process at hand. Besides, such forecasts may also be necessary at various spatial and temporal scales, from high temporal resolutions (in the order of minutes) and very localized for an offshore wind farm, to coarser temporal resolutions (hours) and covering a whole country for day-ahead power scheduling problems. As of today, weather predictions are a common input to forecasting methodologies for renewable energy generation. Since for most decision processes, optimal decisions can only be made if accounting for forecast uncertainties, ensemble predictions and density forecasts are increasingly seen as the product of choice. After discussing some of the basic approaches to obtaining ensemble forecasts of renewable power generation, it will be argued that space-time trajectories of renewable power production may or may not necessitate post-processing ensemble forecasts for relevant weather variables. Example approaches and test case applications will be covered, e.g., looking at the Horns Rev offshore wind farm in Denmark, or gridded forecasts for the whole continental Europe. Eventually, we will illustrate some of the limitations of current frameworks to forecast verification, which actually make it difficult to fully assess the quality of post-processing approaches to obtain renewable energy predictions.
Calibration of decadal ensemble predictions
NASA Astrophysics Data System (ADS)
Pasternack, Alexander; Rust, Henning W.; Bhend, Jonas; Liniger, Mark; Grieger, Jens; Müller, Wolfgang; Ulbrich, Uwe
2017-04-01
Decadal climate predictions are of great socio-economic interest due to the corresponding planning horizons of several political and economic decisions. Due to uncertainties in weather and climate forecasts (e.g. due to initial condition uncertainty), they are issued in a probabilistic way. One issue frequently observed for probabilistic forecasts is that they tend not to be reliable, i.e. the forecasted probabilities are not consistent with the relative frequency of the associated observed events. Thus, these kinds of forecasts need to be re-calibrated. While re-calibration methods for seasonal time scales are available and frequently applied, these methods still have to be adapted for decadal time scales and their characteristic problems, such as climate trend and lead-time-dependent bias. Regarding this, we propose a method to re-calibrate decadal ensemble predictions that takes the above-mentioned characteristics into account. Finally, this method will be applied to and validated on decadal forecasts from the MiKlip system (Germany's initiative for decadal prediction).
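The notion of reliability used in this abstract (forecast probabilities matching observed relative frequencies) can be checked with a simple binned comparison. The sketch below is a generic reliability table, not the re-calibration method proposed by the authors; the function name, the number of bins and the sample data are assumptions.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Group forecasts by issued probability and compare, per bin, the mean
    forecast probability with the observed relative frequency of the event.
    Returns (mean_forecast_probability, observed_frequency, count) per non-empty bin."""
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    event_sums = [0] * n_bins
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[k] += 1
        prob_sums[k] += p
        event_sums[k] += o
    return [(prob_sums[k] / counts[k], event_sums[k] / counts[k], counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Hypothetical forecasts: probabilities issued for an event and whether it occurred.
table = reliability_table([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], n_bins=2)
```

For a reliable forecast system the first two entries of each row are close to each other; systematic differences indicate that re-calibration is needed.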
Forecasting seasonal outbreaks of influenza.
Shaman, Jeffrey; Karspeck, Alicia
2012-12-11
Influenza recurs seasonally in temperate regions of the world; however, our ability to predict the timing, duration, and magnitude of local seasonal outbreaks of influenza remains limited. Here we develop a framework for initializing real-time forecasts of seasonal influenza outbreaks, using a data assimilation technique commonly applied in numerical weather prediction. The availability of real-time, web-based estimates of local influenza infection rates makes this type of quantitative forecasting possible. Retrospective ensemble forecasts are generated on a weekly basis following assimilation of these web-based estimates for the 2003-2008 influenza seasons in New York City. The findings indicate that real-time skillful predictions of peak timing can be made more than 7 wk in advance of the actual peak. In addition, confidence in those predictions can be inferred from the spread of the forecast ensemble. This work represents an initial step in the development of a statistically rigorous system for real-time forecast of seasonal influenza.
Forecasting seasonal outbreaks of influenza
Shaman, Jeffrey; Karspeck, Alicia
2012-01-01
Influenza recurs seasonally in temperate regions of the world; however, our ability to predict the timing, duration, and magnitude of local seasonal outbreaks of influenza remains limited. Here we develop a framework for initializing real-time forecasts of seasonal influenza outbreaks, using a data assimilation technique commonly applied in numerical weather prediction. The availability of real-time, web-based estimates of local influenza infection rates makes this type of quantitative forecasting possible. Retrospective ensemble forecasts are generated on a weekly basis following assimilation of these web-based estimates for the 2003–2008 influenza seasons in New York City. The findings indicate that real-time skillful predictions of peak timing can be made more than 7 wk in advance of the actual peak. In addition, confidence in those predictions can be inferred from the spread of the forecast ensemble. This work represents an initial step in the development of a statistically rigorous system for real-time forecast of seasonal influenza. PMID:23184969
A Signal to Noise Paradox in Climate Predictions
NASA Astrophysics Data System (ADS)
Eade, R.; Scaife, A. A.; Smith, D.; Dunstone, N. J.; MacLachlan, C.; Hermanson, L.; Ruth, C.
2017-12-01
Recent advances in climate modelling have resulted in the achievement of skilful long-range prediction, particularly that associated with the winter circulation over the north Atlantic (e.g. Scaife et al 2014, Stockdale et al 2015, Dunstone et al 2016) including impacts over Europe and North America, and further afield. However, while highly significant and potentially useful skill exists, the signal-to-noise ratio of the ensemble mean to total variability in these ensemble predictions is anomalously small (Scaife et al 2014) and the correlation between the ensemble mean and historical observations exceeds the proportion of predictable variance in the ensemble (Eade et al 2014). This means the real world is more predictable than our climate models. Here we discuss a series of hypothesis tests that have been carried out to assess issues with model mechanisms compared to the observed world, and present the latest findings in our attempt to determine the cause of the anomalously weak predicted signals in our seasonal-to-decadal hindcasts.
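The comparison described in this abstract can be made concrete with a small sketch that divides the correlation between the ensemble mean and the observations by the square root of the ensemble's signal-to-total variance ratio. This follows the general form of a "ratio of predictable components" diagnostic, but the exact estimators used in the cited papers may differ; the function names and toy data below are assumptions.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance (divides by n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation of two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))

def predictable_component_ratio(obs, members):
    """members[m][t] is member m at time t. Values above 1 mean the observations
    follow the ensemble mean more closely than the ensemble's own
    signal-to-total variance ratio would suggest."""
    n_times = len(obs)
    ens_mean = [mean([m[t] for m in members]) for t in range(n_times)]
    signal_var = variance(ens_mean)                  # variance of the ensemble mean
    total_var = mean([variance(m) for m in members]) # mean variance of single members
    return correlation(obs, ens_mean) / math.sqrt(signal_var / total_var)

# Toy data: noisy members whose mean tracks the observations exactly.
obs = [1.0, 2.0, 3.0, 4.0]
noisy = [[2.0, 1.0, 4.0, 3.0], [0.0, 3.0, 2.0, 5.0]]
ratio = predictable_component_ratio(obs, noisy)
```

In this toy case the ratio exceeds 1, which is the situation the abstract describes as the real world being more predictable than the model ensemble.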
Distinct cognitive mechanisms involved in the processing of single objects and object ensembles
Cant, Jonathan S.; Sun, Sol Z.; Xu, Yaoda
2015-01-01
Behavioral research has demonstrated that the shape and texture of single objects can be processed independently. Similarly, neuroimaging results have shown that an object's shape and texture are processed in distinct brain regions with shape in the lateral occipital area and texture in parahippocampal cortex. Meanwhile, objects are not always seen in isolation and are often grouped together as an ensemble. We recently showed that the processing of ensembles also involves parahippocampal cortex and that the shape and texture of ensemble elements are processed together within this region. These neural data suggest that the independence seen between shape and texture in single-object perception would not be observed in object-ensemble perception. Here we tested this prediction by examining whether observers could attend to the shape of ensemble elements while ignoring changes in an unattended texture feature and vice versa. Across six behavioral experiments, we replicated previous findings of independence between shape and texture in single-object perception. In contrast, we observed that changes in an unattended ensemble feature negatively impacted the processing of an attended ensemble feature only when ensemble features were attended globally. When they were attended locally, thereby making ensemble processing similar to single-object processing, interference was abolished. Overall, these findings confirm previous neuroimaging results and suggest that distinct cognitive mechanisms may be involved in single-object and object-ensemble perception. Additionally, they show that the scope of visual attention plays a critical role in determining which type of object processing (ensemble or single object) is engaged by the visual system. PMID:26360156
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunn, Nicholas J. H.; Noid, W. G., E-mail: wnoid@chem.psu.edu
This work investigates the promise of a “bottom-up” extended ensemble framework for developing coarse-grained (CG) models that provide predictive accuracy and transferability for describing both structural and thermodynamic properties. We employ a force-matching variational principle to determine system-independent, i.e., transferable, interaction potentials that optimally model the interactions in five distinct heptane-toluene mixtures. Similarly, we employ a self-consistent pressure-matching approach to determine a system-specific pressure correction for each mixture. The resulting CG potentials accurately reproduce the site-site rdfs, the volume fluctuations, and the pressure equations of state that are determined by all-atom (AA) models for the five mixtures. Furthermore, we demonstrate that these CG potentials provide similar accuracy for additional heptane-toluene mixtures that were not included in their parameterization. Surprisingly, the extended ensemble approach improves not only the transferability but also the accuracy of the calculated potentials. Additionally, we observe that the required pressure corrections strongly correlate with the intermolecular cohesion of the system-specific CG potentials. Moreover, this cohesion correlates with the relative “structure” within the corresponding mapped AA ensemble. Finally, the appendix demonstrates that the self-consistent pressure-matching approach corresponds to minimizing an appropriate relative entropy.
Dunn, Nicholas J H; Noid, W G
2016-05-28
This work investigates the promise of a "bottom-up" extended ensemble framework for developing coarse-grained (CG) models that provide predictive accuracy and transferability for describing both structural and thermodynamic properties. We employ a force-matching variational principle to determine system-independent, i.e., transferable, interaction potentials that optimally model the interactions in five distinct heptane-toluene mixtures. Similarly, we employ a self-consistent pressure-matching approach to determine a system-specific pressure correction for each mixture. The resulting CG potentials accurately reproduce the site-site rdfs, the volume fluctuations, and the pressure equations of state that are determined by all-atom (AA) models for the five mixtures. Furthermore, we demonstrate that these CG potentials provide similar accuracy for additional heptane-toluene mixtures that were not included in their parameterization. Surprisingly, the extended ensemble approach improves not only the transferability but also the accuracy of the calculated potentials. Additionally, we observe that the required pressure corrections strongly correlate with the intermolecular cohesion of the system-specific CG potentials. Moreover, this cohesion correlates with the relative "structure" within the corresponding mapped AA ensemble. Finally, the appendix demonstrates that the self-consistent pressure-matching approach corresponds to minimizing an appropriate relative entropy.
National Centers for Environmental Prediction
Ensemble Users Meetings 7th NCEP/NWS Ensemble User Workshop 13-15 June 2016 6th NCEP/NWS Ensemble User Workshop 25 - 27 March 2014 5th NCEP/NWS Ensemble User Workshop 10 - 12 May, 2011 4th NCEP/NWS Ensemble User Workshop 13 - 15 May, 2008 3rd NCEP/NWS Ensemble User Workshop 31 Oct - 2 Nov, 2006 2nd NCEP/NWS
Ehrhardt, Fiona; Soussana, Jean-François; Bellocchi, Gianni; Grace, Peter; McAuliffe, Russel; Recous, Sylvie; Sándor, Renáta; Smith, Pete; Snow, Val; de Antoni Migliorati, Massimiliano; Basso, Bruno; Bhatia, Arti; Brilli, Lorenzo; Doltra, Jordi; Dorich, Christopher D; Doro, Luca; Fitton, Nuala; Giacomini, Sandro J; Grant, Brian; Harrison, Matthew T; Jones, Stephanie K; Kirschbaum, Miko U F; Klumpp, Katja; Laville, Patricia; Léonard, Joël; Liebig, Mark; Lieffering, Mark; Martin, Raphaël; Massad, Raia S; Meier, Elizabeth; Merbold, Lutz; Moore, Andrew D; Myrgiotis, Vasileios; Newton, Paul; Pattey, Elizabeth; Rolinski, Susanne; Sharp, Joanna; Smith, Ward N; Wu, Lianhai; Zhang, Qing
2018-02-01
Simulation models are extensively used to predict agricultural productivity and greenhouse gas emissions. However, the uncertainties of (reduced) model ensemble simulations have not been assessed systematically for variables affecting food security and climate change mitigation, within multi-species agricultural contexts. We report an international model comparison and benchmarking exercise, showing the potential of multi-model ensembles to predict productivity and nitrous oxide (N2O) emissions for wheat, maize, rice and temperate grasslands. Using a multi-stage modelling protocol, from blind simulations (stage 1) to partial (stages 2-4) and full calibration (stage 5), 24 process-based biogeochemical models were assessed individually or as an ensemble against long-term experimental data from four temperate grassland and five arable crop rotation sites spanning four continents. Comparisons were performed by reference to the experimental uncertainties of observed yields and N2O emissions. Results showed that across sites and crop/grassland types, 23%-40% of the uncalibrated individual models were within two standard deviations (SD) of observed yields, while 42% (rice) to 96% (grasslands) of the models were within 1 SD of observed N2O emissions. At stage 1, ensembles formed by the three models with the lowest prediction errors predicted both yields and N2O emissions within experimental uncertainties for 44% and 33% of the crop and grassland growth cycles, respectively. Partial model calibration (stages 2-4) markedly reduced prediction errors of the full model ensemble E-median for crop grain yields (from 36% at stage 1 down to 4% on average) and grassland productivity (from 44% to 27%) and to a lesser and more variable extent for N2O emissions. Yield-scaled N2O emissions (N2O emissions divided by crop yields) were ranked accurately by three-model ensembles across crop species and field sites.
The potential of using process-based model ensembles to jointly predict productivity and N2O emissions at field scale is discussed. © 2017 John Wiley & Sons Ltd.
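As a rough illustration of the ensemble statistic mentioned above, the sketch below takes the per-case median across models and a relative error against observations. It assumes that "E-median" means the median of the individual model predictions for each growth cycle; the function names and the data layout are hypothetical and not taken from the study.

```python
from statistics import median


def ensemble_median(predictions):
    """Median across models for each case (e.g. each growth cycle).

    predictions: dict mapping a model name to a list of predicted values,
    one value per case; all lists must have the same length.
    """
    series = list(predictions.values())
    n_cases = len(series[0])
    if any(len(s) != n_cases for s in series):
        raise ValueError("all models must predict the same number of cases")
    # zip(*series) groups the predictions of all models case by case
    return [median(values) for values in zip(*series)]


def relative_error_pct(predicted, observed):
    """Absolute relative error in percent, case by case."""
    return [100.0 * abs(p - o) / o for p, o in zip(predicted, observed)]
```

The median is less sensitive than the mean to a single model with a very large error, which is one plausible reason to report it for a 24-model ensemble.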
Multi-model ensemble hydrologic prediction using Bayesian model averaging
NASA Astrophysics Data System (ADS)
Duan, Qingyun; Ajami, Newsha K.; Gao, Xiaogang; Sorooshian, Soroosh
2007-05-01
A multi-model ensemble strategy is a means to exploit the diversity of skillful predictions from different models. This paper studies the use of a Bayesian model averaging (BMA) scheme to develop more skillful and reliable probabilistic hydrologic predictions from multiple competing predictions made by several hydrologic models. BMA is a statistical procedure that infers consensus predictions by weighing individual predictions based on their probabilistic likelihood measures, with the better performing predictions receiving higher weights than the worse performing ones. Furthermore, BMA provides a more reliable description of the total predictive uncertainty than the original ensemble, leading to a sharper and better calibrated probability density function (PDF) for the probabilistic predictions. In this study, a nine-member ensemble of hydrologic predictions was used to test and evaluate the BMA scheme. This ensemble was generated by calibrating three different hydrologic models using three distinct objective functions. These objective functions were chosen in a way that forces the models to capture certain aspects of the hydrograph well (e.g., peaks, mid-flows and low flows). Two sets of numerical experiments were carried out on three test basins in the US to explore the best way of using the BMA scheme. In the first set, a single set of BMA weights was computed to obtain BMA predictions, while the second set employed multiple sets of weights, with distinct sets corresponding to different flow intervals. In both sets, the streamflow values were transformed using a Box-Cox transformation to ensure that the probability distribution of the prediction errors is approximately Gaussian. A split sample approach was used to obtain and validate the BMA predictions. The test results showed that the BMA scheme has the advantage of generating more skillful and equally reliable probabilistic predictions than the original ensemble.
The performance of the expected BMA predictions in terms of daily root mean square error (DRMS) and daily absolute mean error (DABS) is generally superior to that of the best individual predictions. Furthermore, the BMA predictions employing multiple sets of weights are generally better than those using a single set of weights.
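The weighting idea described above can be sketched with a small expectation-maximization loop. This is a minimal illustration, not the authors' implementation: it assumes Gaussian errors with one variance shared by all members, inputs that have already been transformed (for example by Box-Cox), and hypothetical function names.

```python
import math


def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_bma(forecasts, obs, n_iter=200):
    """Estimate BMA weights and a shared error standard deviation by EM.

    forecasts[k][t] is the prediction of member k at time t; obs[t] is the
    verifying observation. Assumption: Gaussian errors, common variance.
    """
    n_members, n_times = len(forecasts), len(obs)
    weights = [1.0 / n_members] * n_members
    sigma2 = sum((forecasts[k][t] - obs[t]) ** 2
                 for k in range(n_members) for t in range(n_times))
    sigma2 = max(sigma2 / (n_members * n_times), 1e-12)
    for _ in range(n_iter):
        sigma = math.sqrt(sigma2)
        # E-step: probability that member k "explains" observation t
        resp = []
        for t in range(n_times):
            dens = [weights[k] * gaussian_pdf(obs[t], forecasts[k][t], sigma)
                    for k in range(n_members)]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / n_members] * n_members)
            else:
                resp.append([d / total for d in dens])
        # M-step: weights are mean responsibilities; variance is the
        # responsibility-weighted mean squared error
        weights = [sum(resp[t][k] for t in range(n_times)) / n_times
                   for k in range(n_members)]
        sigma2 = sum(resp[t][k] * (forecasts[k][t] - obs[t]) ** 2
                     for t in range(n_times) for k in range(n_members))
        sigma2 = max(sigma2 / n_times, 1e-12)
    return weights, math.sqrt(sigma2)


def bma_mean(weights, member_forecasts):
    """Expected BMA prediction: weighted average of the member forecasts."""
    return sum(w * f for w, f in zip(weights, member_forecasts))
```

Fitting separate weight sets for different flow intervals, as in the second experiment set, would amount to calling `fit_bma` once per interval on the matching subset of time steps.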
Probabilistic forecasts based on radar rainfall uncertainty
NASA Astrophysics Data System (ADS)
Liguori, S.; Rico-Ramirez, M. A.
2012-04-01
The potential advantages resulting from integrating weather radar rainfall estimates in hydro-meteorological forecasting systems are limited by the inherent uncertainty affecting radar rainfall measurements, which is due to various sources of error [1-3]. The improvement of quality control and correction techniques is recognized to play a role for the future improvement of radar-based flow predictions. However, the knowledge of the uncertainty affecting radar rainfall data can also be effectively used to build a hydro-meteorological forecasting system in a probabilistic framework. This work discusses the results of the implementation of a novel probabilistic forecasting system developed to improve ensemble predictions over a small urban area located in the North of England. An ensemble of radar rainfall fields can be determined as the sum of a deterministic component and a perturbation field, the latter being informed by the knowledge of the spatial-temporal characteristics of the radar error assessed with reference to rain-gauge measurements. This approach is similar to the REAL system [4] developed for use in the Southern-Alps. The radar uncertainty estimate can then be propagated with a nowcasting model, used to extrapolate an ensemble of radar rainfall forecasts, which can ultimately drive hydrological ensemble predictions. A radar ensemble generator has been calibrated using radar rainfall data made available from the UK Met Office after applying post-processing and correction algorithms [5-6]. One-hour rainfall accumulations from 235 rain gauges recorded for the year 2007 have provided the reference to determine the radar error. Statistics describing the spatial characteristics of the error (i.e. mean and covariance) have been computed off-line at gauge locations, along with the parameters describing the error temporal correlation.
A system has then been set up to impose the space-time error properties on stochastic perturbations, generated in real-time at gauge locations, and then interpolated back onto the radar domain, in order to obtain probabilistic radar rainfall fields in real time. The deterministic nowcasting model integrated in the STEPS system [7-8] has been used for the purpose of propagating the uncertainty and assessing the benefit of implementing the radar ensemble generator for probabilistic rainfall forecasts and ultimately sewer flow predictions. For this purpose, events representative of different types of precipitation (i.e. stratiform/convective) and significant at the urban catchment scale (i.e. in terms of sewer overflow within the urban drainage system) have been selected. As high spatial/temporal resolution is required of the forecasts for their use in urban areas [9-11], the probabilistic nowcasts have been set up to be produced at 1 km resolution and 5 min intervals. The forecasting chain is completed by a hydrodynamic model of the urban drainage network. The aim of this work is to discuss the implementation of this probabilistic system, which takes into account the radar error to characterize the forecast uncertainty, with consequent potential benefits in the management of urban systems. It will also allow a comparison with previous findings related to the analysis of different approaches to uncertainty estimation and quantification in terms of rainfall [12] and flows at the urban scale [13]. Acknowledgements: The authors would like to acknowledge the BADC, the UK Met Office and Dr. Alan Seed from the Australian Bureau of Meteorology for providing the radar data and the nowcasting model. The authors acknowledge the support from the Engineering and Physical Sciences Research Council (EPSRC) via grant EP/I012222/1.
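The "deterministic component plus perturbation field" construction can be illustrated with a short sketch. It only covers the spatial part at gauge locations: perturbations with a prescribed mean and covariance are drawn through a Cholesky factor and added to the deterministic values. The temporal correlation and the interpolation back to the radar grid are left out, the error is treated as additive in whatever units the statistics were computed in, and all names are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = cov[i][i] - s
                if d <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def perturbed_members(deterministic, err_mean, err_cov, n_members, seed=0):
    """Ensemble members = deterministic values + correlated random perturbations.

    deterministic: values at the gauge locations (hypothetical layout)
    err_mean, err_cov: error mean vector and covariance matrix at those locations
    """
    rng = random.Random(seed)
    low = cholesky(err_cov)
    n = len(deterministic)
    members = []
    for _ in range(n_members):
        white = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiplying white noise by L gives noise with covariance err_cov
        pert = [err_mean[i] + sum(low[i][k] * white[k] for k in range(i + 1))
                for i in range(n)]
        members.append([deterministic[i] + pert[i] for i in range(n)])
    return members
```

In this form nothing prevents a member from going negative; a real generator would need to handle that, for instance by working with a transformed error variable.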
NASA Astrophysics Data System (ADS)
Re, Matteo; Valentini, Giorgio
2012-03-01
Ensemble methods are statistical and computational learning procedures reminiscent of the human social learning behavior of seeking several opinions before making any crucial decision. The idea of combining the opinions of different "experts" to obtain an overall “ensemble” decision has been rooted in our culture at least since the classical age of ancient Greece, and it was formalized during the Enlightenment with the Condorcet Jury Theorem [45], which proved that the judgment of a committee is superior to those of individuals, provided the individuals have reasonable competence. Ensembles are sets of learning machines that combine in some way their decisions, or their learning algorithms, or different views of data, or other specific characteristics to obtain more reliable and more accurate predictions in supervised and unsupervised learning problems [48,116]. A simple example is represented by the majority vote ensemble, by which the decisions of different learning machines are combined, and the class that receives the majority of “votes” (i.e., the class predicted by the majority of the learning machines) is the class predicted by the overall ensemble [158]. In the literature, a plethora of terms other than ensembles has been used, such as fusion, combination, aggregation, and committee, to indicate sets of learning machines that work together to solve a machine learning problem [19,40,56,66,99,108,123], but in this chapter we maintain the term ensemble in its widest meaning, in order to include the whole range of combination methods. Nowadays, ensemble methods represent one of the main current research lines in machine learning [48,116], and the interest of the research community on ensemble methods is witnessed by conferences and workshops specifically devoted to ensembles, first of all the multiple classifier systems (MCS) conference organized by Roli, Kittler, Windeatt, and other researchers of this area [14,62,85,149,173].
Several theories have been proposed to explain the characteristics and the successful application of ensembles to different application domains. For instance, Allwein, Schapire, and Singer interpreted the improved generalization capabilities of ensembles of learning machines in the framework of large margin classifiers [4,177], Kleinberg in the context of stochastic discrimination theory [112], and Breiman and Friedman in the light of the bias-variance analysis borrowed from classical statistics [21,70]. Empirical studies showed that both in classification and regression problems, ensembles improve on single learning machines, and moreover large experimental studies compared the effectiveness of different ensemble methods on benchmark data sets [10,11,49,188]. The interest in this research area is motivated also by the availability of very fast computers and networks of workstations at a relatively low cost that allow the implementation and the experimentation of complex ensemble methods using off-the-shelf computer platforms. However, as explained in Section 26.2, there are deeper reasons to use ensembles of learning machines, motivated by the intrinsic characteristics of the ensemble methods. The main aim of this chapter is to introduce ensemble methods and to provide an overview and a bibliography of the main areas of research, without pretending to be exhaustive or to explain the detailed characteristics of each ensemble method. The paper is organized as follows. In the next section, the main theoretical and practical reasons for combining multiple learners are introduced. Section 26.3 depicts the main taxonomies on ensemble methods proposed in the literature. In Sections 26.4 and 26.5, we present an overview of the main supervised ensemble methods reported in the literature, adopting a simple taxonomy, originally proposed in Ref. [201].
Applications of ensemble methods are only marginally considered, but a specific section on some relevant applications of ensemble methods in astronomy and astrophysics has been added (Section 26.6). The conclusion (Section 26.7) ends this paper and lists some issues not covered in this work.
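The majority vote ensemble and the Condorcet argument mentioned in this abstract can be made concrete in a few lines. The sketch assumes independent voters that are each correct with the same probability `p`, which is the textbook setting of the jury theorem; the function names are illustrative only.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Return the class predicted by the largest number of learners.

    On a tie, Counter.most_common keeps the class that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


def majority_accuracy(n_voters, p):
    """Probability that a strict majority of n independent voters is correct.

    Each voter is assumed to be correct with probability p; n_voters should
    be odd so that a strict majority always exists.
    """
    need = n_voters // 2 + 1
    return sum(comb(n_voters, k) * p ** k * (1.0 - p) ** (n_voters - k)
               for k in range(need, n_voters + 1))
```

With `p = 0.6`, three voters already reach a majority accuracy of 0.648, and the value keeps rising as more voters are added; with `p` below 0.5 the same formula shows the committee getting worse, which is the "reasonable competence" condition in the theorem.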
Drug-target interaction prediction using ensemble learning and dimensionality reduction.
Ezzat, Ali; Wu, Min; Li, Xiao-Li; Kwoh, Chee-Keong
2017-10-01
Experimental prediction of drug-target interactions is expensive, time-consuming and tedious. Fortunately, computational methods help narrow down the search space for interaction candidates to be further examined via wet-lab techniques. Nowadays, the number of attributes/features for drugs and targets, as well as the amount of their interactions, are increasing, making these computational methods inefficient or occasionally prohibitive. This motivates us to derive a reduced feature set for prediction. In addition, since ensemble learning techniques are widely used to improve the classification performance, it is also worthwhile to design an ensemble learning framework to enhance the performance for drug-target interaction prediction. In this paper, we propose a framework for drug-target interaction prediction leveraging both feature dimensionality reduction and ensemble learning. First, we conducted feature subspacing to inject diversity into the classifier ensemble. Second, we applied three different dimensionality reduction methods to the subspaced features. Third, we trained homogeneous base learners with the reduced features and then aggregated their scores to derive the final predictions. For base learners, we selected two classifiers, namely Decision Tree and Kernel Ridge Regression, resulting in two variants of ensemble models, EnsemDT and EnsemKRR, respectively. In our experiments, we utilized AUC (Area under ROC Curve) as an evaluation metric. We compared our proposed methods with various state-of-the-art methods under 5-fold cross validation. Experimental results showed EnsemKRR achieving the highest AUC (94.3%) for predicting drug-target interactions. In addition, dimensionality reduction helped improve the performance of EnsemDT. In conclusion, our proposed methods produced significant improvements for drug-target interaction prediction. Copyright © 2017 Elsevier Inc. All rights reserved.
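The first and third steps of the framework (feature subspacing, then aggregating base-learner scores) can be sketched as follows. This is a simplified stand-in rather than EnsemDT or EnsemKRR: the dimensionality reduction step is omitted, the base learner is a nearest-centroid scorer instead of a decision tree or kernel ridge regression, and every name and parameter is hypothetical.

```python
import random


def centroid(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_subspace_ensemble(X, y, n_learners=10, subspace_size=2, seed=0):
    """Train one nearest-centroid scorer per random feature subset.

    X: list of feature vectors; y: list of 0/1 labels (1 = interacting pair).
    Both classes must be present in y.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    learners = []
    for _ in range(n_learners):
        # Feature subspacing: each learner sees a different random subset
        feats = rng.sample(range(n_features), subspace_size)
        pos = [[row[f] for f in feats] for row, lab in zip(X, y) if lab == 1]
        neg = [[row[f] for f in feats] for row, lab in zip(X, y) if lab == 0]
        learners.append((feats, centroid(pos), centroid(neg)))
    return learners


def predict_score(learners, row):
    """Average the per-learner scores; values near 1 favor the positive class."""
    scores = []
    for feats, c_pos, c_neg in learners:
        sub = [row[f] for f in feats]
        d_pos, d_neg = sq_dist(sub, c_pos), sq_dist(sub, c_neg)
        total = d_pos + d_neg
        scores.append(0.5 if total == 0.0 else d_neg / total)
    return sum(scores) / len(scores)
```

Giving each learner a different feature subset is what injects diversity; averaging the scores then plays the role of the aggregation step described in the abstract.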
Satellite Data Assimilation within KIAPS-LETKF system
NASA Astrophysics Data System (ADS)
Jo, Y.; Lee, S., Sr.; Cho, K.
2016-12-01
Korea Institute of Atmospheric Prediction Systems (KIAPS) has been developing an ensemble data assimilation system using a four-dimensional local ensemble transform Kalman filter (LETKF; Hunt et al., 2007) within the KIAPS Integrated Model (KIM), referred to as "KIAPS-LETKF". The KIAPS-LETKF system was successfully evaluated in various Observing System Simulation Experiments (OSSEs) with the NCAR Community Atmospheric Model - Spectral Element (Kang et al., 2013), which has fully unstructured quadrilateral meshes based on the cubed-sphere grid, the same grid system as KIM. Recently, assimilation of real observations has been conducted within the KIAPS-LETKF system with four-dimensional covariance functions over the 6-hr assimilation window. Conventional (e.g., sonde, aircraft, and surface) and satellite (e.g., AMSU-A, IASI, GPS-RO, and AMV) observations have been provided by the KIAPS Package for Observation Processing (KPOP). Wind speed prediction benefited most from the ingestion of AMV, while the improvement in temperature prediction is mostly due to the ingestion of AMSU-A and IASI. However, some degradation in the simulation of GPS-RO appears in the upper stratosphere, even though GPS-RO has a positive impact on the analysis and forecasts. We plan to test the bias correction method and several vertical localization strategies for radiance observations to improve analysis and forecast impacts.
NASA Astrophysics Data System (ADS)
Tsai, Hsiao-Chung; Chen, Pang-Cheng; Elsberry, Russell L.
2017-04-01
The objective of this study is to evaluate the predictability of extended-range forecasts of tropical cyclones (TCs) in the western North Pacific using reforecasts from the National Centers for Environmental Prediction (NCEP) Global Ensemble Forecast System (GEFS) during 1996-2015, and from the Climate Forecast System (CFS) during 1999-2010. Tsai and Elsberry have demonstrated that an opportunity exists to support hydrological operations by using the extended-range TC formation and track forecasts in the western North Pacific from the ECMWF 32-day ensemble. To demonstrate this potential for the decision-making processes regarding water resource management and hydrological operation in Taiwan reservoir watershed areas, special attention is given to the skill of the NCEP GEFS and CFS models in predicting the TCs affecting the Taiwan area. The first objective of this study is to analyze the skill of NCEP GEFS and CFS TC forecasts and quantify the forecast uncertainties via verifications of categorical binary forecasts and probabilistic forecasts. The second objective is to investigate the relationships among the large-scale environmental factors [e.g., El Niño Southern Oscillation (ENSO), Madden-Julian Oscillation (MJO), etc.] and the model forecast errors by using the reforecasts. Preliminary results indicate that the skill of the TC activity forecasts based on the raw forecasts can be further improved if the model biases are minimized by utilizing these reforecasts.
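Verification of categorical binary forecasts and of probabilistic forecasts is usually built on a 2x2 contingency table and on the Brier score. The abstract does not say which scores were used, so the sketch below simply shows three widely used table-based scores and the Brier score; the function names are hypothetical.

```python
def contingency_scores(forecast_yes, observed_yes):
    """Scores from a 2x2 contingency table of yes/no forecasts.

    forecast_yes, observed_yes: equal-length sequences of truthy/falsy values
    (e.g. "a TC was forecast / observed in this week and region").
    Returns None for a score whose denominator is zero.
    """
    pairs = list(zip(forecast_yes, observed_yes))
    hits = sum(1 for f, o in pairs if f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    misses = sum(1 for f, o in pairs if not f and o)

    def ratio(num, den):
        return num / den if den else None

    return {
        "pod": ratio(hits, hits + misses),                 # probability of detection
        "far": ratio(false_alarms, hits + false_alarms),   # false alarm ratio
        "csi": ratio(hits, hits + misses + false_alarms),  # critical success index
    }


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    n = len(probabilities)
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / n
```

A probabilistic ensemble forecast can feed both functions: the fraction of members predicting an event gives the probability for `brier_score`, and thresholding that fraction gives the yes/no input for `contingency_scores`.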
Hierarchical Ensemble Methods for Protein Function Prediction
2014-01-01
Protein function prediction is a complex multiclass multilabel classification problem, characterized by multiple issues such as the incompleteness of the available annotations, the integration of multiple sources of high dimensional biomolecular data, the imbalance of several functional classes, and the difficulty of univocally determining negative examples. Moreover, the hierarchical relationships between functional classes that characterize both the Gene Ontology and FunCat taxonomies motivate the development of hierarchy-aware prediction methods, which have shown significantly better performance than hierarchy-unaware “flat” prediction methods. In this paper, we provide a comprehensive review of hierarchical methods for protein function prediction based on ensembles of learning machines. According to this general approach, a separate learning machine is trained to learn a specific functional term and then the resulting predictions are assembled in a “consensus” ensemble decision, taking into account the hierarchical relationships between classes. The main hierarchical ensemble methods proposed in the literature are discussed in the context of existing computational methods for protein function prediction, highlighting their characteristics, advantages, and limitations. Open problems of this exciting research area of computational biology are finally considered, outlining novel perspectives for future research. PMID:25937954
Ghose, Soumya; Mitra, Jhimli; Karunanithi, Mohan; Dowling, Jason
2015-01-01
Home monitoring of chronically ill or elderly patients can reduce frequent hospitalisations and hence provide improved quality of care at a reduced cost to the community, therefore reducing the burden on the healthcare system. Activity recognition of such patients is of high importance in such a design. In this work, a system for automatic human physical activity recognition from smart-phone inertial sensor data is proposed. An ensemble of decision trees framework is adopted to train and predict the multi-class human activity system. A comparison of our proposed method with a traditional multi-class support vector machine shows significant improvement in activity recognition accuracies.
NASA Astrophysics Data System (ADS)
Pantillon, Florian; Knippertz, Peter; Corsmeier, Ulrich
2017-10-01
New insights into the synoptic-scale predictability of 25 severe European winter storms of the 1995-2015 period are obtained using the homogeneous ensemble reforecast dataset from the European Centre for Medium-Range Weather Forecasts. The predictability of the storms is assessed with different metrics including (a) the track and intensity to investigate the storms' dynamics and (b) the Storm Severity Index to estimate the impact of the associated wind gusts. The storms are well predicted by the whole ensemble up to 2-4 days ahead. At longer lead times, the number of members predicting the observed storms decreases and the ensemble average is not clearly defined for the track and intensity. The Extreme Forecast Index and Shift of Tails are therefore computed from the deviation of the ensemble from the model climate. Based on these indices, the model has some skill in forecasting the area covered by extreme wind gusts up to 10 days, which indicates a clear potential for early warnings. However, large variability is found between the individual storms. The poor predictability of outliers appears related to their physical characteristics such as explosive intensification or small size. Longer datasets with more cases would be needed to further substantiate these points.
NASA Astrophysics Data System (ADS)
Tippett, Michael K.; Ranganathan, Meghana; L'Heureux, Michelle; Barnston, Anthony G.; DelSole, Timothy
2017-05-01
Here we examine the skill of three-, five-, and seven-category monthly ENSO probability forecasts (1982-2015) from single and multi-model ensemble integrations of the North American Multimodel Ensemble (NMME) project. Three-category forecasts are typical and provide probabilities for the ENSO phase (El Niño, La Niña or neutral). Additional forecast categories indicate the likelihood of ENSO conditions being weak, moderate or strong. The level of skill observed for differing numbers of forecast categories can help to determine the appropriate degree of forecast precision. However, the dependence of the skill score itself on the number of forecast categories must be taken into account. For reliable forecasts with the same quality, the ranked probability skill score (RPSS) is fairly insensitive to the number of categories, while the logarithmic skill score (LSS) is an information measure and increases as categories are added. The ignorance skill score decreases to zero as forecast categories are added, regardless of skill level. For all models, forecast formats and skill scores, the northern spring predictability barrier explains much of the dependence of skill on target month and forecast lead. RPSS values for monthly ENSO forecasts show little dependence on the number of categories. However, the LSS of multimodel ensemble forecasts with five and seven categories shows statistically significant advantages over the three-category forecasts for the targets and leads that are least affected by the spring predictability barrier. These findings indicate that current prediction systems are capable of providing more detailed probabilistic forecasts of ENSO phase and amplitude than are typically provided.
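For readers unfamiliar with the scores named above, the sketch below computes the ranked probability score (RPS), the RPSS against a climatological reference, and the ignorance (negative log probability of the observed category). It uses the unnormalized RPS and base-2 logarithms; those conventions, and the function names, are assumptions made here and may differ from the paper's exact definitions.

```python
import math


def rps(probs, obs_cat):
    """Ranked probability score for one forecast over ordered categories.

    probs: forecast probabilities for categories 0..K-1 (summing to 1)
    obs_cat: index of the category that was observed
    """
    cum_forecast = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed = 1.0 if k >= obs_cat else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score


def rpss(forecasts, climatology, observed):
    """Skill relative to always issuing the climatological probabilities."""
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observed))
    rps_reference = sum(rps(climatology, o) for o in observed)
    return 1.0 - rps_forecast / rps_reference


def ignorance(probs, obs_cat):
    """Information deficit in bits: -log2 of the probability given to the outcome."""
    return -math.log2(probs[obs_cat])
```

Because the RPS works on cumulative probabilities, it rewards forecasts that put weight near the observed category, whereas the ignorance only looks at the probability assigned to the category that occurred.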
Negative correlation learning for customer churn prediction: a comparison study.
Rodan, Ali; Fayyoumi, Ayham; Faris, Hossam; Alsakran, Jamal; Al-Kadi, Omar
2015-01-01
Recently, telecommunication companies have been paying more attention to the problem of identifying customer churn behavior. In business, it is well known among service providers that attracting new customers is much more expensive than retaining existing ones. Therefore, adopting accurate models that are able to predict customer churn can effectively help in customer retention campaigns and in maximizing profit. In this paper we use an ensemble of multilayer perceptrons (MLP) trained with negative correlation learning (NCL) to predict customer churn in a telecommunication company. Experimental results confirm that an NCL-based MLP ensemble can achieve better generalization performance (high churn rate) than an ensemble of MLPs without NCL (flat ensemble) and other common data mining techniques used for churn analysis.
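Negative correlation learning adds a penalty to each member's error so that members are pushed away from the ensemble mean. The sketch below evaluates that per-member loss for a single training example in the standard formulation, with the ensemble output taken as the simple average of the members; the penalty strength `lam` and the function name are illustrative, and the paper's exact settings are not given in the abstract.

```python
def ncl_losses(member_outputs, target, lam):
    """Per-member NCL loss for one training example.

    loss_i = 0.5 * (f_i - target)^2 + lam * p_i
    p_i    = (f_i - f_mean) * sum over j != i of (f_j - f_mean)

    Since the deviations from the mean sum to zero, p_i equals
    -(f_i - f_mean)^2: a member lowers its loss by moving away from the mean.
    """
    mean_out = sum(member_outputs) / len(member_outputs)
    losses = []
    for i, f_i in enumerate(member_outputs):
        others = sum(f_j - mean_out
                     for j, f_j in enumerate(member_outputs) if j != i)
        penalty = (f_i - mean_out) * others
        losses.append(0.5 * (f_i - target) ** 2 + lam * penalty)
    return losses
```

Setting `lam = 0` recovers independent training of each network, which corresponds to the "flat ensemble" used as a baseline in the abstract.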
NASA Astrophysics Data System (ADS)
Delaney, C.; Hartman, R. K.; Mendoza, J.; Whitin, B.
2017-12-01
Forecast informed reservoir operations (FIRO) is a methodology that incorporates short to mid-range precipitation and flow forecasts to inform the flood operations of reservoirs. The Ensemble Forecast Operations (EFO) alternative is a probabilistic approach to FIRO that incorporates ensemble streamflow predictions (ESPs) made by NOAA's California-Nevada River Forecast Center (CNRFC). With the EFO approach, release decisions are made to manage forecasted risk of reaching critical operational thresholds. A water management model was developed for Lake Mendocino, a 111,000 acre-foot reservoir located near Ukiah, California, to evaluate the viability of the EFO alternative to improve water supply reliability without increasing downstream flood risk. Lake Mendocino is a dual-use reservoir, which is owned and operated for flood control by the United States Army Corps of Engineers and is operated for water supply by the Sonoma County Water Agency. Due to recent changes in the operations of an upstream hydroelectric facility, this reservoir has suffered from water supply reliability issues since 2007. The EFO alternative was simulated using a 26-year (1985-2010) ESP hindcast generated by the CNRFC. The ESP hindcast was developed using Global Ensemble Forecast System version 10 precipitation reforecasts processed with the Hydrologic Ensemble Forecast System to generate daily reforecasts of 61 flow ensemble members for a 15-day forecast horizon. Model simulation results demonstrate that the EFO alternative may improve water supply reliability for Lake Mendocino without increasing flood risk for downstream areas. The developed operations framework can directly leverage improved skill in the second week of the forecast and is extendable into the subseasonal-to-seasonal (S2S) time domain, given the demonstration of improved skill through a reliable reforecast that is of adequate historical duration and consistent with operationally available numerical weather predictions.
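One way to read "managing forecasted risk of reaching critical operational thresholds" is to compute, for each forecast day, the fraction of ensemble members that exceed a storage threshold and compare it with a tolerance for that day. The sketch below shows only this bookkeeping. The data layout, the threshold, the tolerance curve and the function names are hypothetical, and the actual EFO release rule is not described in the abstract.

```python
def exceedance_risk(member_storage, threshold):
    """Fraction of ensemble members above the threshold on each lead day.

    member_storage[m][d]: forecast storage of member m on lead day d,
    in the same units as threshold.
    """
    n_members = len(member_storage)
    n_days = len(member_storage[0])
    return [sum(1 for member in member_storage if member[d] > threshold) / n_members
            for d in range(n_days)]


def days_over_tolerance(risk, tolerance):
    """Lead days on which the forecast risk exceeds the tolerated risk."""
    return [d for d, (r, t) in enumerate(zip(risk, tolerance)) if r > t]
```

For example, with four members and a tolerance that grows with lead time, only the days where too many members cross the threshold would trigger a release decision; the size of the release would have to come from a separate rule.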
NASA Astrophysics Data System (ADS)
Hill, A.; Weiss, C.; Ancell, B. C.
2017-12-01
The basic premise of observation targeting is that additional observations, when gathered and assimilated with a numerical weather prediction (NWP) model, will produce a more accurate forecast related to a specific phenomenon. Ensemble-sensitivity analysis (ESA; Ancell and Hakim 2007; Torn and Hakim 2008) is a tool capable of accurately estimating the proper location of targeted observations in areas that have initial model uncertainty and large error growth, as well as predicting the reduction of forecast variance due to the assimilated observation. ESA relates an ensemble of NWP model forecasts, specifically an ensemble of scalar forecast metrics, linearly to earlier model states. A thorough investigation is presented to determine how different factors of the forecast process are impacting our ability to successfully target new observations for mesoscale convection forecasts. Our primary goals for this work are to determine: (1) If targeted observations have a more positive impact than non-targeted (i.e. randomly chosen) observations; (2) If there are lead-time constraints to targeting for convection; (3) How inflation, localization, and the assimilation filter influence impact prediction and realized results; (4) If there exist differences between targeted observations at the surface versus aloft; and (5) How physics errors and nonlinearity may augment observation impacts. Ten cases of dryline-initiated convection between 2011 and 2013 are simulated within a simplified OSSE framework and presented here. Ensemble simulations are produced from a cycling system that utilizes the Weather Research and Forecasting (WRF) model v3.8.1 within the Data Assimilation Research Testbed (DART). A "truth" (nature) simulation is produced by supplying a 3-km WRF run with GFS analyses and integrating the model forward 90 hours, from the beginning of ensemble initialization through the end of the forecast.
Target locations for surface and radiosonde observations are computed 6, 12, and 18 hours into the forecast based on a chosen scalar forecast response metric (e.g., maximum reflectivity at convection initiation). A variety of experiments are designed to achieve the aforementioned goals and will be presented, along with their results, detailing the feasibility of targeting for mesoscale convection forecasts.
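The linear relation that ESA relies on can be illustrated as a regression of the scalar forecast metric on a single earlier state variable across ensemble members. The following is a minimal sketch, assuming the commonly used univariate form (sensitivity = cov(J, x) / var(x), and an expected variance reduction of cov(J, x)^2 / (var(x) + R) for one observation of x with error variance R). The function names and the toy numbers are hypothetical and are not taken from the study.

```python
def ensemble_sensitivity(metric, state):
    """Regress a scalar forecast metric J on an earlier state variable x.

    Returns (dJ/dx estimate, cov(J, x), var(x)), all computed from
    sample statistics over the ensemble members.
    """
    n = len(metric)
    if n != len(state) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 members")
    j_mean = sum(metric) / n
    x_mean = sum(state) / n
    cov_jx = sum((j - j_mean) * (x - x_mean) for j, x in zip(metric, state)) / (n - 1)
    var_x = sum((x - x_mean) ** 2 for x in state) / (n - 1)
    return cov_jx / var_x, cov_jx, var_x


def expected_variance_reduction(cov_jx, var_x, obs_error_var):
    """Hypothetical helper: expected drop in var(J) from one observation of x."""
    return cov_jx ** 2 / (var_x + obs_error_var)


# Toy ensemble (made-up numbers): J depends exactly linearly on x.
x0 = [1.0, 2.0, 3.0, 4.0, 5.0]
J = [2.0 * x + 1.0 for x in x0]
slope, cov_jx, var_x = ensemble_sensitivity(J, x0)
reduction = expected_variance_reduction(cov_jx, var_x, obs_error_var=2.5)
```

In a targeting setting, the grid point with the largest expected variance reduction would be the candidate observation location; localization and inflation, which the abstract lists as factors under study, are deliberately left out of this sketch.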
Medium-Range Forecast Skill for Extraordinary Arctic Cyclones in Summer of 2008-2016
NASA Astrophysics Data System (ADS)
Yamagami, Akio; Matsueda, Mio; Tanaka, Hiroshi L.
2018-05-01
Arctic cyclones (ACs) are a severe atmospheric phenomenon that affects the Arctic environment. This study assesses the forecast skill of five leading operational medium-range ensemble forecasts for 10 extraordinary ACs that occurred in summer during 2008-2016. Average existence probability of the predicted ACs was >0.9 at lead times of ≤3.5 days. Average central position error of the predicted ACs was less than half of the mean radius of the 10 ACs (469.1 km) at lead times of 2.5-4.5 days. Average central pressure error of the predicted ACs was 5.5-10.7 hPa at such lead times. Therefore, the operational ensemble prediction systems generally predict the position of ACs within 469.1 km 2.5-4.5 days before they mature. The forecast skill for the extraordinary ACs is lower than that for midlatitude cyclones in the Northern Hemisphere but similar to that in the Southern Hemisphere.
Blanton, Brian; Dresback, Kendra; Colle, Brian; Kolar, Randy; Vergara, Humberto; Hong, Yang; Leonardo, Nicholas; Davidson, Rachel; Nozick, Linda; Wachtendorf, Tricia
2018-04-25
Hurricane track and intensity can change rapidly in unexpected ways, thus making predictions of hurricanes and related hazards uncertain. This inherent uncertainty often translates into suboptimal decision-making outcomes, such as unnecessary evacuation. Representing this uncertainty is thus critical in evacuation planning and related activities. We describe a physics-based hazard modeling approach that (1) dynamically accounts for the physical interactions among hazard components and (2) captures hurricane evolution uncertainty using an ensemble method. This loosely coupled model system provides a framework for probabilistic water inundation and wind speed levels for a new, risk-based approach to evacuation modeling, described in a companion article in this issue. It combines the Weather Research and Forecasting (WRF) meteorological model, the Coupled Routing and Excess STorage (CREST) hydrologic model, and the ADvanced CIRCulation (ADCIRC) storm surge, tide, and wind-wave model to compute inundation levels and wind speeds for an ensemble of hurricane predictions. Perturbations to WRF's initial and boundary conditions and different model physics/parameterizations generate an ensemble of storm solutions, which are then used to drive the coupled hydrologic + hydrodynamic models. Hurricane Isabel (2003) is used as a case study to illustrate the ensemble-based approach. The inundation, river runoff, and wind hazard results are strongly dependent on the accuracy of the mesoscale meteorological simulations, which improves with decreasing lead time to hurricane landfall. The ensemble envelope brackets the observed behavior while providing "best-case" and "worst-case" scenarios for the subsequent risk-based evacuation model. © 2018 Society for Risk Analysis.
Predicting bioactive conformations and binding modes of macrocycles
NASA Astrophysics Data System (ADS)
Anighoro, Andrew; de la Vega de León, Antonio; Bajorath, Jürgen
2016-10-01
Macrocyclic compounds are attracting increasing interest in drug discovery. It is often thought that these large and chemically complex molecules provide promising candidates to address difficult targets and interfere with protein-protein interactions. From a computational viewpoint, these molecules are difficult to treat. For example, flexible docking of macrocyclic compounds is hindered by the limited ability of current docking approaches to optimize conformations of extended ring systems for pose prediction. Herein, we report predictions of bioactive conformations of macrocycles using conformational search and binding modes using docking. For about 70% of the tested macrocycles, conformational ensembles generated using a specialized search technique contained accurate bioactive conformations. However, these conformations were difficult to identify on the basis of conformational energies. Moreover, docking calculations with limited ligand flexibility starting from individual low energy conformations rarely yielded highly accurate binding modes. In about 40% of the test cases, binding modes were approximated with reasonable accuracy. However, when conformational ensembles were subjected to rigid body docking, an increase in meaningful binding mode predictions to more than 50% of the test cases was observed. Electrostatic effects did not contribute to these predictions in a positive or negative manner. Rather, achieving shape complementarity at macrocycle-target interfaces was a decisive factor. In summary, a combined computational protocol using pre-computed conformational ensembles of macrocycles as a starting point for docking shows promise in modeling binding modes of macrocyclic compounds.
Ensemble Kalman Filter Data Assimilation in a Solar Dynamo Model
NASA Astrophysics Data System (ADS)
Dikpati, M.
2017-12-01
Despite great advancement in solar dynamo models since the first model by Parker in 1955, there remain many challenges in the quest to build a dynamo-based prediction scheme that can accurately predict the solar cycle features. One of these challenges is to implement modern data assimilation techniques, which have been used in the oceanic and atmospheric prediction models. Development of data assimilation in solar models is in the early stages. Recently, observing system simulation experiments (OSSEs) have been performed using Ensemble Kalman Filter data assimilation, in the framework of the Data Assimilation Research Testbed of NCAR (NCAR-DART), for estimating parameters in a solar dynamo model. I will demonstrate how the selection of ensemble size, number of observations, amount of error in observations and the choice of assimilation interval play an important role in parameter estimation. I will also show how the results of parameter reconstruction improve when accuracy in low-latitude observations is increased, despite large error in polar region data. I will then describe how implementation of data assimilation in a solar dynamo model can bring more accuracy in the prediction of polar fields in the North and South hemispheres during the declining phase of cycle 24. Recent evidence indicates that the strength of the Sun's polar field during the cycle minima might be a reliable predictor for the next sunspot cycle's amplitude; therefore it is crucial to accurately predict the polar field strength and pattern.
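The role of ensemble size and observation error in parameter estimation can be made concrete with a single Kalman update of a scalar parameter. The following is a minimal sketch, assuming a generic perturbed-observation (stochastic) EnKF and a toy linear observation operator; it is not the DART implementation or the dynamo model of the abstract, and every name and number in it is hypothetical.

```python
import random


def enkf_parameter_update(params, predict_obs, y_obs, obs_error_var, rng):
    """One stochastic EnKF update of an ensemble of scalar parameters.

    params        : prior ensemble of parameter values
    predict_obs   : function mapping a parameter to its predicted observation
    y_obs         : the actual observation
    obs_error_var : observation error variance R
    rng           : random.Random instance used to perturb the observation
    """
    n = len(params)
    predicted = [predict_obs(p) for p in params]
    p_mean = sum(params) / n
    y_mean = sum(predicted) / n
    cov_py = sum((p - p_mean) * (y - y_mean) for p, y in zip(params, predicted)) / (n - 1)
    var_y = sum((y - y_mean) ** 2 for y in predicted) / (n - 1)
    gain = cov_py / (var_y + obs_error_var)  # scalar Kalman gain
    sigma = obs_error_var ** 0.5
    # Each member assimilates its own perturbed copy of the observation.
    return [p + gain * (y_obs + rng.gauss(0.0, sigma) - y) for p, y in zip(params, predicted)]


# Toy setup (made-up): true parameter is 3.0, observation operator is y = 2 * p.
prior = [0.1 * i for i in range(21)]  # evenly spread prior guesses, mean 1.0
posterior = enkf_parameter_update(prior, lambda p: 2.0 * p, 6.0, 0.01, random.Random(0))
prior_mean = sum(prior) / len(prior)
posterior_mean = sum(posterior) / len(posterior)
```

With an accurate observation (small R) the gain pulls the ensemble mean close to the true value in one step; raising `obs_error_var` or shrinking the ensemble weakens and destabilises the update, which is the kind of dependence the abstract describes.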
Ali, Safdar; Majid, Abdul; Javed, Syed Gibran; Sattar, Mohsin
2016-06-01
Early prediction of breast cancer is important for effective treatment and survival. We developed an effective Cost-Sensitive Classifier with GentleBoost Ensemble (Can-CSC-GBE) for the classification of breast cancer using protein amino acid features. In this work, first, discriminant information of the protein sequences related to breast tissue is extracted. Then, the physicochemical properties hydrophobicity and hydrophilicity of amino acids are employed to generate molecule descriptors in different feature spaces. For comparison, we obtained results by combining Cost-Sensitive learning with conventional ensemble of AdaBoostM1 and Bagging. The proposed Can-CSC-GBE system has effectively reduced the misclassification costs and thereby improved the overall classification performance. Our novel approach has highlighted promising results as compared to the state-of-the-art ensemble approaches. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ensemble Methods for MiRNA Target Prediction from Expression Data
Le, Thuc Duy; Zhang, Junpeng; Liu, Lin; Li, Jiuyong
2015-01-01
Background microRNAs (miRNAs) are short regulatory RNAs that are involved in several diseases, including cancers. Identifying miRNA functions is very important in understanding disease mechanisms and determining the efficacy of drugs. An increasing number of computational methods have been developed to explore miRNA functions by inferring the miRNA-mRNA regulatory relationships from data. Each of the methods is developed based on some assumptions and constraints, for instance, assuming linear relationships between variables. For such reasons, computational methods are often subject to the problem of inconsistent performance across different datasets. On the other hand, ensemble methods integrate the results from individual methods and have been proved to outperform each of their individual component methods in theory. Results In this paper, we investigate the performance of some ensemble methods over the commonly used miRNA target prediction methods. We apply eight different popular miRNA target prediction methods to three cancer datasets, and compare their performance with the ensemble methods which integrate the results from each combination of the individual methods. The validation results using experimentally confirmed databases show that the results of the ensemble methods complement those obtained by the individual methods and the ensemble methods perform better than the individual methods across different datasets. The ensemble method, Pearson+IDA+Lasso, which combines methods in different approaches, including a correlation method, a causal inference method, and a regression method, is the best performed ensemble method in this study. Further analysis of the results of this ensemble method shows that the ensemble method can obtain more targets which could not be found by any of the single methods, and the discovered targets are more statistically significant and functionally enriched. 
The source codes, datasets, miRNA target predictions by all methods, and the ground truth for validation are available in the Supplementary materials. PMID:26114448
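Integrating the results of several target prediction methods can be sketched as a rank aggregation. The following is a minimal illustration, assuming a simple average-rank (Borda-style) rule; the abstract does not state which integration rule the authors used, and the method names, targets, and scores below are made up.

```python
def average_rank_ensemble(score_tables):
    """Combine several {target: score} tables by averaging per-method ranks.

    A higher score means stronger evidence; rank 1 is the best.
    Returns a list of (target, mean_rank) sorted from best to worst.
    """
    targets = set(score_tables[0])
    for table in score_tables:
        if set(table) != targets:
            raise ValueError("all methods must score the same targets")
    rank_sums = {t: 0.0 for t in targets}
    for table in score_tables:
        # Sort by descending score; break ties alphabetically for determinism.
        ordered = sorted(table, key=lambda t: (-table[t], t))
        for rank, target in enumerate(ordered, start=1):
            rank_sums[target] += rank
    n = len(score_tables)
    return sorted(((t, rank_sums[t] / n) for t in targets), key=lambda item: (item[1], item[0]))


# Hypothetical scores from a correlation, a causal-inference, and a regression method.
pearson = {"A": 0.9, "B": 0.5, "C": 0.1}
ida = {"A": 0.4, "B": 0.7, "C": 0.2}
lasso = {"A": 0.8, "B": 0.3, "C": 0.0}
ranking = average_rank_ensemble([pearson, ida, lasso])
```

Averaging ranks rather than raw scores avoids having to put correlation coefficients, causal effects, and regression weights on a common scale.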
Ensemble perception of color in autistic adults.
Maule, John; Stanworth, Kirstie; Pellicano, Elizabeth; Franklin, Anna
2017-05-01
Dominant accounts of visual processing in autism posit that autistic individuals have an enhanced access to details of scenes [e.g., weak central coherence] which is reflected in a general bias toward local processing. Furthermore, the attenuated priors account of autism predicts that the updating and use of summary representations is reduced in autism. Ensemble perception describes the extraction of global summary statistics of a visual feature from a heterogeneous set (e.g., of faces, sizes, colors), often in the absence of local item representation. The present study investigated ensemble perception in autistic adults using a rapidly presented (500 msec) ensemble of four, eight, or sixteen elements representing four different colors. We predicted that autistic individuals would be less accurate when averaging the ensembles, but more accurate in recognizing individual ensemble colors. The results were consistent with the predictions. Averaging was impaired in autism, but only when ensembles contained four elements. Ensembles of eight or sixteen elements were averaged equally accurately across groups. The autistic group also showed a corresponding advantage in rejecting colors that were not originally seen in the ensemble. The results demonstrate the local processing bias in autism, but also suggest that the global perceptual averaging mechanism may be compromised under some conditions. The theoretical implications of the findings and future avenues for research on summary statistics in autism are discussed. Autism Res 2017, 10: 839-851. © 2016 International Society for Autism Research, Wiley Periodicals, Inc.
Ensemble perception of color in autistic adults
Stanworth, Kirstie; Pellicano, Elizabeth; Franklin, Anna
2016-01-01
Dominant accounts of visual processing in autism posit that autistic individuals have an enhanced access to details of scenes [e.g., weak central coherence] which is reflected in a general bias toward local processing. Furthermore, the attenuated priors account of autism predicts that the updating and use of summary representations is reduced in autism. Ensemble perception describes the extraction of global summary statistics of a visual feature from a heterogeneous set (e.g., of faces, sizes, colors), often in the absence of local item representation. The present study investigated ensemble perception in autistic adults using a rapidly presented (500 msec) ensemble of four, eight, or sixteen elements representing four different colors. We predicted that autistic individuals would be less accurate when averaging the ensembles, but more accurate in recognizing individual ensemble colors. The results were consistent with the predictions. Averaging was impaired in autism, but only when ensembles contained four elements. Ensembles of eight or sixteen elements were averaged equally accurately across groups. The autistic group also showed a corresponding advantage in rejecting colors that were not originally seen in the ensemble. The results demonstrate the local processing bias in autism, but also suggest that the global perceptual averaging mechanism may be compromised under some conditions. The theoretical implications of the findings and future avenues for research on summary statistics in autism are discussed. Autism Res 2017, 10: 839–851. © 2016 The Authors Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research PMID:27874263
Assessing skill of a global bimonthly streamflow ensemble prediction system
NASA Astrophysics Data System (ADS)
van Dijk, A. I.; Peña-Arancibia, J.; Sheffield, J.; Wood, E. F.
2011-12-01
Ideally, a seasonal streamflow forecasting system might be conceived of as a system that ingests skillful climate forecasts from general circulation models and propagates these through thoroughly calibrated hydrological models that are initialised using hydrometric observations. In practice, each of these aspects poses problems. Instead, we analysed whether a comparatively simple hydrological model-based Ensemble Prediction System (EPS) can provide global bimonthly streamflow forecasts with some skill and if so, under what circumstances the greatest skill may be expected. The system tested produces ensemble forecasts for each of six annual bimonthly periods based on the previous 30 years of global daily gridded 1° resolution climate variables and an initialised global hydrological model. To incorporate some of the skill derived from ocean conditions, a post-EPS analog method was used to sample from the ensemble based on El Niño Southern Oscillation (ENSO), Indian Ocean Dipole (IOD), North Atlantic Oscillation (NAO) and Pacific Decadal Oscillation (PDO) index values observed prior to the forecast. Forecast skill was assessed through a hind-casting experiment for the period 1979-2008. Potential skill was calculated with reference to a model run with the actual forcing for the forecast period (the 'perfect' model) and was compared to actual forecast skill calculated for each of the six forecast times for an average 411 Australian and 51 pan-tropical catchments. Significant potential skill in bimonthly forecasts was largely limited to northern regions during the snow melt period, seasonally wet tropical regions at the transition of wet to dry season, and the Indonesian region where rainfall is well correlated to ENSO. The actual skill was approximately 34-50% of the potential skill. We attribute this primarily to limitations in the model structure, parameterisation and global forcing data.
Use of better climate forecasts and remote sensing observations of initial catchment conditions should help to increase actual skill in future. Future work also could address the potential skill gain from using weather and climate forecasts and from a calibrated and/or alternative hydrological model or model ensemble. The approach and data might be useful as a benchmark for joint seasonal forecasting experiments planned under GEWEX.
NASA Astrophysics Data System (ADS)
Adams, T. E.
2016-12-01
Accurate and timely predictions of the lateral extent of floodwaters and water level depth in floodplain areas are critical globally. This paper demonstrates the coupling of hydrologic ensembles, derived from the use of numerical weather prediction (NWP) model forcings as input to a fully distributed hydrologic model. The resulting ensemble output from the distributed hydrologic model is used as upstream flow boundaries and lateral inflows to a 1-D hydrodynamic model. An example is presented for the Potomac River in the vicinity of Washington, DC (USA). The approach taken falls within the broader goals of the Hydrologic Ensemble Prediction EXperiment (HEPEX).
Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.
2007-01-01
To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.
Ramírez, J; Górriz, J M; Ortiz, A; Martínez-Murcia, F J; Segovia, F; Salas-Gonzalez, D; Castillo-Barnes, D; Illán, I A; Puntonet, C G
2018-05-15
Alzheimer's disease (AD) is the most common cause of dementia in the elderly and affects approximately 30 million individuals worldwide. Mild cognitive impairment (MCI) is very frequently a prodromal phase of AD, and existing studies have suggested that people with MCI tend to progress to AD at a rate of about 10-15% per year. However, the ability of clinicians and machine learning systems to predict AD based on MRI biomarkers at an early stage is still a challenging problem that can have a great impact in improving treatments. The proposed system, developed by the SiPBA-UGR team for this challenge, is based on feature standardization, ANOVA feature selection, partial least squares feature dimension reduction and an ensemble of One vs. Rest random forest classifiers. With the aim of improving its performance when discriminating healthy controls (HC) from MCI, a second binary classification level was introduced that reconsiders the HC and MCI predictions of the first level. The system was trained and evaluated on an ADNI dataset that consists of T1-weighted MRI morphological measurements from HC, stable MCI, converter MCI and AD subjects. The proposed system yields a 56.25% classification score on the test subset which consists of 160 real subjects. The classifier yielded the best performance when compared to: (i) One vs. One (OvO), One vs. Rest (OvR) and error correcting output codes (ECOC) as strategies for reducing the multiclass classification task to multiple binary classification problems, (ii) support vector machines, gradient boosting classifier and random forest as base binary classifiers, and (iii) bagging ensemble learning. A robust method has been proposed for the international challenge on MCI prediction based on MRI data. The system yielded the second best performance during the competition with an accuracy rate of 56.25% when evaluated on the real subjects of the test set. Copyright © 2017 Elsevier B.V. All rights reserved.
A new approach to human microRNA target prediction using ensemble pruning and rotation forest.
Mousavi, Reza; Eftekhari, Mahdi; Haghighi, Mehdi Ghezelbash
2015-12-01
MicroRNAs (miRNAs) are small non-coding RNAs that have important functions in gene regulation. Since finding miRNA targets experimentally is costly and time-consuming, the use of machine learning methods for miRNA target prediction is a growing research area. In this paper, a new approach is proposed by using two popular ensemble strategies, i.e. Ensemble Pruning and Rotation Forest (EP-RTF), to predict human miRNA targets. For EP, the approach utilizes Genetic Algorithm (GA). In other words, a subset of classifiers from the heterogeneous ensemble is first selected by GA. Next, the selected classifiers are trained based on the RTF method and then are combined using weighted majority voting. In addition to seeking a better subset of classifiers, the parameter of RTF is also optimized by GA. Findings of the present study confirm that the newly developed EP-RTF outperforms (in terms of classification accuracy, sensitivity, and specificity) the previously applied methods over four datasets in the field of human miRNA target prediction. Diversity-error diagrams reveal that the proposed ensemble approach constructs individual classifiers which are more accurate and usually more diverse than those of the other ensemble approaches. Given these experimental results, we highly recommend EP-RTF for improving the performance of miRNA target prediction.
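The final combination step named in the abstract, weighted majority voting, can be sketched in a few lines. This is a minimal illustration, assuming one fixed weight per classifier (for example a validation accuracy); how the authors derive their weights is not stated in the abstract, and the labels and weights below are made up.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions : one predicted label per classifier
    weights     : one non-negative weight per classifier
    Ties are broken alphabetically so the result is deterministic.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(sorted(totals), key=lambda label: totals[label])


# Three hypothetical classifiers: one strong vote outweighs two weak ones.
votes = ["target", "non-target", "non-target"]
vote_weights = [0.9, 0.3, 0.4]
decision = weighted_majority_vote(votes, vote_weights)
```

With equal weights this reduces to plain majority voting, so the weights are what let a pruned ensemble favour its more reliable members.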
NASA Astrophysics Data System (ADS)
Bogner, Konrad; Monhart, Samuel; Liniger, Mark; Spririg, Christoph; Jordan, Fred; Zappa, Massimiliano
2015-04-01
In recent years, considerable progress has been made in the operational prediction of floods and hydrological drought with up to ten days lead time. Both the public and the private sectors are currently using probabilistic runoff forecasts in order to monitor water resources and take action when critical conditions are to be expected. The use of extended-range predictions with lead times exceeding 10 days is not yet established. The hydropower sector in particular might benefit greatly from using hydrometeorological forecasts for the next 15 to 60 days in order to optimize the operations and the revenues from their watersheds, dams, captions, turbines and pumps. The new Swiss Competence Centers in Energy Research (SCCER) aim at boosting research related to energy issues in Switzerland. The objective of HEPS4POWER is to demonstrate that operational extended-range hydrometeorological forecasts have the potential to become very valuable tools for fine tuning the production of energy from hydropower systems. The project team covers a specific system-oriented value chain starting from the collection and forecast of meteorological data (MeteoSwiss), leading to the operational application of state-of-the-art hydrological models (WSL) and terminating with the experience in data presentation and power production forecasts for end-users (e-dric.ch). The first task of HEPS4POWER will be the downscaling and post-processing of ensemble extended-range meteorological forecasts (EPS). The goal is to provide well-tailored probabilistic forecasts that are statistically reliable and localized at catchment or even station level. The hydrology-related task will consist of feeding the post-processed meteorological forecasts into a HEPS using a multi-model approach by implementing models of differing complexity.
Also in the case of the hydrological ensemble predictions, post-processing techniques need to be tested in order to improve the quality of the forecasts against observed discharge. Analysis should be specifically oriented to the maximisation of hydroelectricity production. Thus, verification metrics should include economic measures like cost-loss approaches. The final step will include the transfer of the HEPS system to several hydropower systems, the connection with the energy market prices and the development of probabilistic multi-reservoir production and management optimization guidelines. The baseline model chain yielding three-day forecasts established for a hydropower system in southern Switzerland will be presented alongside the work plan to achieve seasonal ensemble predictions.
NASA Astrophysics Data System (ADS)
Saleh, F.; Ramaswamy, V.; Wang, Y.; Georgas, N.; Blumberg, A.; Pullen, J.
2017-12-01
Estuarine regions can experience compound impacts from coastal storm surge and riverine flooding. The challenges in forecasting flooding in such areas are multi-faceted due to uncertainties associated with meteorological drivers and interactions between hydrological and coastal processes. The objective of this work is to evaluate how uncertainties from meteorological predictions propagate through an ensemble-based flood prediction framework and translate into uncertainties in simulated inundation extents. A multi-scale framework, consisting of hydrologic, coastal and hydrodynamic models, was used to simulate two extreme flood events at the confluence of the Passaic and Hackensack rivers and Newark Bay. The events were Hurricane Irene (2011), a combination of inland flooding and coastal storm surge, and Hurricane Sandy (2012) where coastal storm surge was the dominant component. The hydrodynamic component of the framework was first forced with measured streamflow and ocean water level data to establish baseline inundation extents with the best available forcing data. The coastal and hydrologic models were then forced with meteorological predictions from 21 ensemble members of the Global Ensemble Forecast System (GEFS) to retrospectively represent potential future conditions up to 96 hours prior to the events. Inundation extents produced by the hydrodynamic model, forced with the 95th percentile of the ensemble-based coastal and hydrologic boundary conditions, were in good agreement with baseline conditions for both events. The USGS reanalysis of Hurricane Sandy inundation extents was encapsulated between the 50th and 95th percentile of the forecasted inundation extents, and that of Hurricane Irene was similar but with caveats associated with data availability and reliability. This work highlights the importance of accounting for meteorological uncertainty to represent a range of possible future inundation extents at high resolution (∼m).
DART: Tools and Support for Ensemble Data Assimilation Research, Operations, and Education
NASA Astrophysics Data System (ADS)
Hoar, T. J.; Anderson, J. L.; Collins, N.; Raeder, K.; Kershaw, H.; Romine, G. S.; Mizzi, A. P.; Chatterjee, A.; Karspeck, A. R.; Zarzycki, C. M.; Ha, S. Y.; Barre, J.; Gaubert, B.
2014-12-01
The Data Assimilation Research Testbed (DART) is a community facility for ensemble data assimilation developed and supported by the National Center for Atmospheric Research. DART provides a comprehensive suite of software, documentation, examples and tutorials that can be used for ensemble data assimilation research, operations, and education. Scientists and software engineers from the Data Assimilation Research Section at NCAR are available to actively support DART users who want to use existing DART products or develop their own new applications. Current DART users range from university professors teaching data assimilation, to individual graduate students working with simple models, through national laboratories doing operational prediction with large state-of-the-art models. DART runs efficiently on many computational platforms ranging from laptops through thousands of cores on the newest supercomputers. This poster focuses on several recent research activities using DART with geophysical models. First, DART is being used with the Community Atmosphere Model Spectral Element (CAM-SE) and Model for Prediction Across Scales (MPAS) global atmospheric models that support locally enhanced grid resolution. Initial results from ensemble assimilation with both models are presented. DART is also being used to produce ensemble analyses of atmospheric tracers, in particular CO, in both the global CAM-Chem model and the regional Weather Research and Forecast with chemistry (WRF-Chem) model by assimilating observations from the Measurements of Pollution in the Troposphere (MOPITT) and Infrared Atmospheric Sounding Interferometer (IASI) instruments. Results from ensemble analyses in both models are presented. An interface between DART and the Community Atmosphere Biosphere Land Exchange (CABLE) model has been completed and ensemble land surface analyses with DART/CABLE will be discussed. 
Finally, an update on ensemble analyses in the fully-coupled Community Earth System (CESM) is presented. The poster includes instructions on how to get started using DART for research or educational applications.
Benchmarking Ensemble Streamflow Prediction Skill in the UK
NASA Astrophysics Data System (ADS)
Harrigan, Shaun; Smith, Katie; Parry, Simon; Tanguy, Maliko; Prudhomme, Christel
2017-04-01
Skilful hydrological forecasts at weekly to seasonal lead times would be extremely beneficial for decision-making in operational water management, especially during drought conditions. Hydro-meteorological ensemble forecasting systems are an attractive approach as they use two sources of streamflow predictability: (i) initial hydrologic conditions (IHCs), where soil moisture, groundwater and snow storage states can provide an estimate of future streamflow situations, and (ii) atmospheric predictability, where skilful forecasts of weather and climate variables can be used to force hydrological models. In the UK, prediction of rainfall at long lead times and for summer months in particular is notoriously difficult given the large degree of natural climate variability in ocean influenced mid-latitude regions, but recent research has uncovered exciting prospects for improved rainfall skill at seasonal lead times due to improved prediction of the North Atlantic Oscillation. However, before we fully understand what this improved atmospheric predictability might mean in terms of improved hydrological forecasts, we must first evaluate how much skill can be gained from IHCs alone. Ensemble Streamflow Prediction (ESP) is a well-established method for generating an ensemble of streamflow forecasts in the absence of skilful future meteorological predictions. The aim of this study is therefore to benchmark when (lead time/forecast initialisation month) and where (spatial pattern/catchment characteristics) ESP is skilful across a diverse set of catchments in the UK. Forecast skill was evaluated seamlessly from lead times of 1-day to 12-months and forecasts were initialised at the first of each month over the 1965-2015 hindcast period. This ESP output also provides a robust benchmark against which to assess how much improvement in skill can be achieved when meteorological forecasts are incorporated (next steps). 
To provide a 'tough to beat' benchmark, several variants of ESP with increasing complexity were produced, including better model representation of hydrological processes and sub-sampling of historic climate sequences (e.g. NAO+/NAO- years). This work is part of the Improving Predictions of Drought for User Decision Making (IMPETUS) project and provides insight into where advancements in atmospheric predictability are most needed in the UK in the context of water management.
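The ESP idea described above can be sketched as follows. This is a minimal illustration, not the authors' setup: the toy linear-reservoir model, the parameter `k`, the function names and the rainfall numbers are all hypothetical. The point is only that every ensemble member starts from the same initial hydrologic state and is forced with a different historic climate sequence.

```python
def run_bucket_model(initial_storage, rainfall, k=0.1):
    """Toy linear-reservoir model (hypothetical): outflow each step is k * storage."""
    storage = initial_storage
    flows = []
    for rain in rainfall:
        storage += rain          # add forcing for this time step
        flow = k * storage       # streamflow released from storage
        storage -= flow
        flows.append(flow)
    return flows


def esp_forecast(initial_storage, historic_rainfall_by_year, k=0.1):
    """One ensemble member per historic year, all sharing the same initial state."""
    return {
        year: run_bucket_model(initial_storage, sequence, k)
        for year, sequence in historic_rainfall_by_year.items()
    }


# Made-up historic rainfall sequences, one per past year.
historic = {
    1990: [0.0, 5.0, 0.0],
    1991: [10.0, 0.0, 0.0],
    1992: [0.0, 0.0, 0.0],
}
ensemble = esp_forecast(initial_storage=100.0, historic_rainfall_by_year=historic)
```

Sub-sampling, as mentioned in the abstract, would amount to passing only a subset of years (for example those classified as NAO+) in `historic_rainfall_by_year`.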
NASA Astrophysics Data System (ADS)
Busuioc, Aristita; Dumitrescu, Alexandru; Dumitrache, Rodica; Iriza, Amalia
2017-04-01
Seasonal climate forecasts in Europe are currently issued at the European Centre for Medium-Range Weather Forecasts (ECMWF) in the form of multi-model ensemble predictions available within the "EUROSIP" system. Different statistical techniques to calibrate, downscale and combine the EUROSIP direct model output are used to optimize the quality of the final probabilistic forecasts. In this study, a statistical downscaling model (SDM) based on canonical correlation analysis (CCA) is used to downscale the EUROSIP seasonal forecast at a spatial resolution of 1km x 1km over the Movila farm placed in southeastern Romania. This application is achieved in the framework of the H2020 MOSES project (http://www.moses-project.eu). The combination between monthly standardized values of three climate variables (maximum/minimum temperatures-Tmax/Tmin, total precipitation-Prec) is used as predictand while combinations of various large-scale predictors are tested in terms of their availability as outputs in the seasonal EUROSIP probabilistic forecasting (sea level pressure, temperature at 850 hPa and geopotential height at 500 hPa). The predictors are taken from the ECMWF system considering 15 members of the ensemble, for which the hindcasts since 1991 until present are available. The model was calibrated over the period 1991-2014 and predictions for summers 2015 and 2016 were achieved. The calibration was made for the ensemble average as well as for each ensemble member. The model was developed for each lead time: one month anticipation for June, two months anticipation for July and three months anticipation for August. 
The main conclusions from these preliminary results are: best predictions (in terms of the anomaly sign) for Tmax (July-2 months anticipation, August-3 months anticipation) for both years (2015, 2016); for Tmin - good predictions only for August (3 months anticipation ) for both years; for precipitation, good predictions for July (2 months anticipation) in 2015 and August (3 months anticipation) in 2016; failed prediction for June (1-month anticipation) for all parameters. To see if the results obtained for 2015 and 2016 summers are in agreement with the general ECMWF model performance in forecast of the three predictors used in the CCA SDM calibration, the mean bias and root mean square errors (RMSE) calculated over the entire period in each grid point, for each ensemble member and ensemble average were computed. The obtained results are confirmed, showing highest ECMWF performance in forecasting of the three predictors for 3 months anticipation (August) and lowest performance for one month anticipation (June). The added value of the CCA SDM in forecasting local Tmax/Tmin and total precipitation was compared to the ECMWF performance using nearest grid point method. Comparisons were performed for the 1991-2014 period, taking into account the forecast made in May for July. An important improvement was found for the CCA SDM predictions in terms of the RMSE value (computed against observations) for Tmax/Tmin and less for precipitation. The tests are in progress for the other summer months (June, July).
Calibration of limited-area ensemble precipitation forecasts for hydrological predictions
NASA Astrophysics Data System (ADS)
Diomede, Tommaso; Marsigli, Chiara; Montani, Andrea; Nerozzi, Fabrizio; Paccagnella, Tiziana
2015-04-01
The main objective of this study is to investigate the impact of calibration for limited-area ensemble precipitation forecasts, to be used for driving discharge predictions up to 5 days in advance. A reforecast dataset, which spans 30 years, based on the Consortium for Small Scale Modeling Limited-Area Ensemble Prediction System (COSMO-LEPS) was used for testing the calibration strategy. Three calibration techniques were applied: quantile-to-quantile mapping, linear regression, and analogs. The performance of these methodologies was evaluated in terms of statistical scores for the precipitation forecasts operationally provided by COSMO-LEPS in the years 2003-2007 over Germany, Switzerland, and the Emilia-Romagna region (northern Italy). The analog-based method seemed to be preferred because of its capability to correct position errors and spread deficiencies. A suitable spatial domain for the analog search can help to handle model spatial errors as systematic errors. However, the performance of the analog-based method may degrade in cases where a limited training dataset is available. A sensitivity test on the length of the training dataset over which to perform the analog search has been performed. The quantile-to-quantile mapping and linear regression methods were less effective, mainly because the forecast-analysis relation was not so strong for the available training dataset. A comparison between the calibration based on the deterministic reforecast and the calibration based on the full operational ensemble used as training dataset has been considered, with the aim of evaluating whether reforecasts are really worthwhile for calibration, given their considerable computational cost. The verification of the calibration process was then performed by coupling ensemble precipitation forecasts with a distributed rainfall-runoff model. 
This test was carried out for a medium-sized catchment located in Emilia-Romagna, showing a beneficial impact of the analog-based method on the reduction of missed events for discharge predictions.
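Of the three calibration techniques named above, quantile-to-quantile mapping is the simplest to illustrate. The sketch below is a generic empirical version, assuming a nearest-rank quantile rule; the function names and the training numbers are hypothetical and are not taken from the COSMO-LEPS study.

```python
import math
from bisect import bisect_right


def empirical_cdf(sorted_sample, value):
    """Fraction of the training sample that is <= value."""
    return bisect_right(sorted_sample, value) / len(sorted_sample)


def empirical_quantile(sorted_sample, p):
    """Nearest-rank quantile of a sorted sample for probability p in [0, 1]."""
    index = math.ceil(p * len(sorted_sample)) - 1
    index = min(len(sorted_sample) - 1, max(0, index))
    return sorted_sample[index]


def quantile_map(forecast_value, train_forecasts, train_observations):
    """Replace a forecast by the observed value with the same non-exceedance probability."""
    forecasts = sorted(train_forecasts)
    observations = sorted(train_observations)
    p = empirical_cdf(forecasts, forecast_value)
    return empirical_quantile(observations, p)


# Made-up training data: the model systematically forecasts half the observed amount.
train_f = [1.0, 2.0, 3.0, 4.0]
train_o = [2.0, 4.0, 6.0, 8.0]
calibrated = quantile_map(3.0, train_f, train_o)
```

With these numbers, a raw forecast of 3.0 sits at the 0.75 quantile of the training forecasts and is mapped to the 0.75 quantile of the observations.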
Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation
NASA Astrophysics Data System (ADS)
Vašát, Radim; Kodešová, Radka; Borůvka, Luboš
2017-07-01
A myriad of signal pre-processing strategies and multivariate calibration techniques has been explored in an attempt to improve the spectroscopic prediction of soil organic carbon (SOC) over the last few decades. Therefore, coming up with a novel, more powerful, and accurate predictive approach that beats the existing ones has become a challenging task. However, there may be a way: to combine several individual predictions into a single final one (according to ensemble learning theory). As this approach performs best when combining predictive algorithms that differ in nature and are calibrated with structurally different predictor variables, we tested predictors of two different kinds: 1) reflectance values (or transforms) at each wavelength and 2) absorption feature parameters. Consequently, we applied four different calibration techniques, two for each type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights to be assigned to individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured the best solution among all possible ones was selected. The approach was tested on soil samples taken from the surface horizon of four sites differing in the prevailing soil units. By employing the ensemble predictive model, the prediction accuracy of SOC improved at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Site 1, 2, 3 and 4, respectively. Generally, the ensemble model affected the final prediction so that the maximal deviations of predicted vs. observed values of the individual predictions were reduced, and thus the correlation cloud became thinner as desired.
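The abstract does not say how the automated weight search works. One plausible reading, shown here purely as an assumption, is an exhaustive search over weights on a discrete grid that sum to one, scored by RMSE against observations. The function names, the grid step and the toy numbers are hypothetical.

```python
from itertools import product
from math import sqrt


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def best_weights(predictions, observed, steps=10):
    """Try every weight vector on a 1/steps grid that sums to 1; keep the lowest RMSE."""
    best_error, best_w = None, None
    for combo in product(range(steps + 1), repeat=len(predictions)):
        if sum(combo) != steps:
            continue  # weights must sum to one
        weights = [c / steps for c in combo]
        blended = [
            sum(w * model[j] for w, model in zip(weights, predictions))
            for j in range(len(observed))
        ]
        error = rmse(blended, observed)
        if best_error is None or error < best_error:
            best_error, best_w = error, weights
    return best_error, best_w


# Two made-up individual predictions that err in opposite directions.
model_a = [1.0, 2.0, 3.0]
model_b = [3.0, 4.0, 5.0]
observed = [2.0, 3.0, 4.0]
error, weights = best_weights([model_a, model_b], observed)
```

The search cost grows quickly with the number of models and the grid resolution, so this brute-force form is only practical for a handful of individual predictions such as the four used in the study.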
NASA Astrophysics Data System (ADS)
Wolff, J.; Jankov, I.; Beck, J.; Carson, L.; Frimel, J.; Harrold, M.; Jiang, H.
2016-12-01
It is well known that global and regional numerical weather prediction ensemble systems are under-dispersive, producing unreliable and overconfident ensemble forecasts. Typical approaches to alleviate this problem include the use of multiple dynamic cores, multiple physics suite configurations, or a combination of the two. While these approaches may produce desirable results, they have practical and theoretical deficiencies and are more difficult and costly to maintain. An active area of research that promotes a more unified and sustainable system for addressing the deficiencies in ensemble modeling is the use of stochastic physics to represent model-related uncertainty. Stochastic approaches include Stochastic Parameter Perturbations (SPP), Stochastic Kinetic Energy Backscatter (SKEB), Stochastic Perturbation of Physics Tendencies (SPPT), or some combination of all three. The focus of this study is to assess the model performance within a convection-permitting ensemble at 3-km grid spacing across the Contiguous United States (CONUS) when using stochastic approaches. For this purpose, the test utilized a single physics suite configuration based on the operational High-Resolution Rapid Refresh (HRRR) model, with ensemble members produced by employing stochastic methods. Parameter perturbations were employed in the Rapid Update Cycle (RUC) land surface model and Mellor-Yamada-Nakanishi-Niino (MYNN) planetary boundary layer scheme. Results will be presented in terms of bias, error, spread, skill, accuracy, reliability, and sharpness using the Model Evaluation Tools (MET) verification package. Due to the high level of complexity of running a frequently updating (hourly), high spatial resolution (3 km), large domain (CONUS) ensemble system, extensive high performance computing (HPC) resources were needed to meet this objective. 
Supercomputing resources were provided through the National Center for Atmospheric Research (NCAR) Strategic Capability (NSC) project support, allowing for a more extensive set of tests over multiple seasons, consequently leading to more robust results. Through the use of these stochastic innovations and powerful supercomputing at NCAR, further insights and advancements in ensemble forecasting at convection-permitting scales will be possible.
2013-09-30
using polar orbit microwave and infrared sounder measurements from the Global Telecommunication System (GTS). The SDAT system was developed as a...WRF/GSI initial conditions and WRF boundary conditions. • WRF system to do short-range forecasts (6 hours) to provide the background fields for GSI...UCAR is related to a NASA GNSS proposal: “Improving Tropical Prediction and Analysis using COSMIC Radio Occultation Observations and an Ensemble Data
A novel method for predicting kidney stone type using ensemble learning.
Kazemi, Yassaman; Mirroshandel, Seyed Abolghasem
2018-01-01
The high morbidity rate associated with kidney stone disease, which is a silent killer, is one of the main concerns in healthcare systems all over the world. Advanced data mining techniques such as classification can help in the early prediction of this disease and reduce its incidence and associated costs. The objective of the present study is to derive a model for the early detection of the type of kidney stone and the most influential parameters with the aim of providing a decision-support system. Information was collected from 936 patients with nephrolithiasis at the kidney center of the Razi Hospital in Rasht from 2012 through 2016. The prepared dataset included 42 features. Data pre-processing was the first step toward extracting the relevant features. The collected data was analyzed with Weka software, and various data mining models were used to prepare a predictive model. Various data mining algorithms such as the Bayesian model, different types of Decision Trees, Artificial Neural Networks, and Rule-based classifiers were used in these models. We also proposed four models based on ensemble learning to improve the accuracy of each learning algorithm. In addition, a novel technique for combining individual classifiers in ensemble learning was proposed. In this technique, for each individual classifier, a weight is assigned based on our proposed genetic algorithm based method. The generated knowledge was evaluated using a 10-fold cross-validation technique based on standard measures. However, the assessment of each feature for building a predictive model was another significant challenge. The predictive strength of each feature for creating a reproducible outcome was also investigated. Regarding the applied models, parameters such as sex, acid uric condition, calcium level, hypertension, diabetes, nausea and vomiting, flank pain, and urinary tract infection (UTI) were the most vital parameters for predicting the chance of nephrolithiasis. 
The final ensemble-based model (with an accuracy of 97.1%) was a robust one and could be safely applied to future studies to predict the chances of developing nephrolithiasis. This model provides a novel way to study stone disease by deciphering the complex interaction among different biological variables, thus helping in an early identification and reduction in diagnosis time. Copyright © 2017 Elsevier B.V. All rights reserved.
SIMULATION OF THE ICELAND VOLCANIC ERUPTION OF APRIL 2010 USING THE ENSEMBLE SYSTEM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buckley, R.
2011-05-10
The Eyjafjallajokull volcanic eruption in Iceland in April 2010 disrupted transportation in Europe which ultimately affected travel plans for many on a global basis. The Volcanic Ash Advisory Centre (VAAC) is responsible for providing guidance to the aviation industry on the transport of volcanic ash clouds. There are nine such centers located globally, and the London branch (headed by the United Kingdom Meteorological Office, or UKMet) was responsible for modeling the Iceland volcano. The guidance provided by the VAAC created some controversy due to the burdensome travel restrictions and uncertainty involved in the prediction of ash transport. The Iceland volcanic eruption provides a useful exercise of the European ENSEMBLE program, coordinated by the Joint Research Centre (JRC) in Ispra, Italy. ENSEMBLE, a decision support system for emergency response, uses transport model results from a variety of countries in an effort to better understand the uncertainty involved with a given accident scenario. Model results in the form of airborne concentration and surface deposition are required from each member of the ensemble in a prescribed format that may then be uploaded to a website for manipulation. The Savannah River National Laboratory (SRNL) is the lone regular United States participant throughout the 10-year existence of ENSEMBLE. For the Iceland volcano, four separate source term estimates have been provided to ENSEMBLE participants. This paper focuses only on one of those source terms. The SRNL results in relation to other modeling agency results along with useful information obtained using an ensemble of transport results will be discussed.
Negative Correlation Learning for Customer Churn Prediction: A Comparison Study
Faris, Hossam
2015-01-01
Recently, telecommunication companies have been paying more attention to the problem of identifying customer churn behavior. In business, it is well known among service providers that attracting new customers is much more expensive than retaining existing ones. Therefore, adopting accurate models that are able to predict customer churn can effectively help in customer retention campaigns and in maximizing profit. In this paper we utilize an ensemble of multilayer perceptrons (MLP) trained using negative correlation learning (NCL) for predicting customer churn in a telecommunication company. Experimental results confirm that the NCL-based MLP ensemble can achieve better generalization performance (high churn rate) compared with an ensemble of MLPs without NCL (flat ensemble) and other common data mining techniques used for churn analysis. PMID:25879060
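The abstract does not state the NCL objective. For orientation, the standard formulation of negative correlation learning from the literature is reproduced below; the symbols are introduced here and do not come from the abstract. $F_i(n)$ is the output of network $i$ on training pattern $n$, $d(n)$ the target, $M$ the number of networks, $N$ the number of patterns, and $\lambda \ge 0$ the penalty strength.

```latex
\begin{align}
  F(n)   &= \frac{1}{M} \sum_{i=1}^{M} F_i(n), \\
  p_i(n) &= \bigl(F_i(n) - F(n)\bigr) \sum_{j \neq i} \bigl(F_j(n) - F(n)\bigr)
          = -\bigl(F_i(n) - F(n)\bigr)^2, \\
  E_i    &= \frac{1}{N} \sum_{n=1}^{N}
            \left[ \tfrac{1}{2} \bigl(F_i(n) - d(n)\bigr)^2 + \lambda\, p_i(n) \right].
\end{align}
```

The second equality for $p_i(n)$ follows because the deviations from the ensemble mean sum to zero. Setting $\lambda = 0$ removes the penalty, so each network is trained independently, which corresponds to the "flat ensemble" baseline mentioned above.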
Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.
Abuassba, Adnan O. M.; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets. PMID:28546808
Hernández, Griselda; Anderson, Janet S.; LeMaster, David M.
2012-01-01
The acute sensitivity to conformation exhibited by amide hydrogen exchange reactivity provides a valuable test for the physical accuracy of model ensembles developed to represent the Boltzmann distribution of the protein native state. A number of molecular dynamics studies of ubiquitin have predicted a well-populated transition in the tight turn immediately preceding the primary site of proteasome-directed polyubiquitylation Lys 48. Amide exchange reactivity analysis demonstrates that this transition is 10^3-fold rarer than these predictions. More strikingly, for the most populated novel conformational basin predicted from a recent 1 ms MD simulation of bovine pancreatic trypsin inhibitor (at 13% of total), experimental hydrogen exchange data indicates a population below 10^-6. The most sophisticated efforts to directly incorporate experimental constraints into the derivation of model protein ensembles have been applied to ubiquitin, as illustrated by three recently deposited studies (PDB codes 2NR2, 2K39 and 2KOX). Utilizing the extensive set of experimental NOE constraints, each of these three ensembles yields a modestly more accurate prediction of the exchange rates for the highly exposed amides than does a standard unconstrained molecular simulation. However, for the less frequently exposed amide hydrogens, the 2NR2 ensemble offers no improvement in rate predictions as compared to the unconstrained MD ensemble. The other two NMR-constrained ensembles performed markedly worse, either underestimating (2KOX) or overestimating (2K39) the extent of conformational diversity. PMID:22425325
Probabilistic rainfall warning system with an interactive user interface
NASA Astrophysics Data System (ADS)
Koistinen, Jarmo; Hohti, Harri; Kauhanen, Janne; Kilpinen, Juha; Kurki, Vesa; Lauri, Tuomo; Nurmi, Pertti; Rossi, Pekka; Jokelainen, Miikka; Heinonen, Mari; Fred, Tommi; Moisseev, Dmitri; Mäkelä, Antti
2013-04-01
A real time 24/7 automatic alert system is in operational use at the Finnish Meteorological Institute (FMI). It consists of gridded forecasts of the exceedance probabilities of rainfall class thresholds in the continuous lead time range of 1 hour to 5 days. Nowcasting up to six hours applies ensemble member extrapolations of weather radar measurements. With 2.8 GHz processors using 8 threads it takes about 20 seconds to generate 51 radar based ensemble members in a grid of 760 x 1226 points. Nowcasting exploits also lightning density and satellite based pseudo rainfall estimates. The latter ones utilize convective rain rate (CRR) estimate from Meteosat Second Generation. The extrapolation technique applies atmospheric motion vectors (AMV) originally developed for upper wind estimation with satellite images. Exceedance probabilities of four rainfall accumulation categories are computed for the future 1 h and 6 h periods and they are updated every 15 minutes. For longer forecasts exceedance probabilities are calculated for future 6 and 24 h periods during the next 4 days. From approximately 1 hour to 2 days Poor man's Ensemble Prediction System (PEPS) is used applying e.g. the high resolution short range Numerical Weather Prediction models HIRLAM and AROME. The longest forecasts apply EPS data from the European Centre for Medium Range Weather Forecasts (ECMWF). The blending of the ensemble sets from the various forecast sources is performed applying mixing of accumulations with equal exceedance probabilities. The blending system contains a real time adaptive estimator of the predictability of radar based extrapolations. The uncompressed output data are written to file for each member, having total size of 10 GB. Ensemble data from other sources (satellite, lightning, NWP) are converted to the same geometry as the radar data and blended as was explained above. A verification system utilizing telemetering rain gauges has been established. Alert dissemination e.g. 
for citizens and professional end users applies SMS messages and, in the near future, smartphone maps. The present interactive user interface facilitates free selection of alert sites and two warning thresholds (any rain, heavy rain) at any location in Finland. The pilot service was tested by 1000-3000 users during summers 2010 and 2012. As an example of dedicated end-user services, gridded exceedance scenarios (of probabilities 5 %, 50 % and 90 %) of hourly rainfall accumulations for the next 3 hours have been utilized as online input data for the influent model at the Greater Helsinki Wastewater Treatment Plant.
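The exceedance probabilities at the core of this system can be illustrated for a single grid point as the fraction of ensemble members whose rainfall accumulation exceeds a class threshold. The sketch below assumes equally weighted members; the threshold values, names and member accumulations are hypothetical and not those used at FMI.

```python
def exceedance_probability(member_accumulations, threshold):
    """Fraction of ensemble members with accumulation strictly above the threshold."""
    exceeding = sum(1 for value in member_accumulations if value > threshold)
    return exceeding / len(member_accumulations)


def exceedance_table(member_accumulations, thresholds):
    """Exceedance probability for each named rainfall class threshold."""
    return {
        name: exceedance_probability(member_accumulations, limit)
        for name, limit in thresholds.items()
    }


# Made-up 1 h accumulations (mm) from four ensemble members at one grid point.
members = [0.0, 0.5, 2.0, 12.0]
# Hypothetical class thresholds in mm; the abstract names the classes but not their limits.
thresholds = {"any_rain": 0.1, "heavy_rain": 10.0}
probabilities = exceedance_table(members, thresholds)
```

An alert rule in the spirit of the user interface described above could then compare `probabilities["heavy_rain"]` with a user-chosen probability level at the selected site.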
EFS: an ensemble feature selection tool implemented as R-package and web-application.
Neumann, Ursula; Genze, Nikita; Heider, Dominik
2017-01-01
Feature selection methods aim at identifying a subset of features that improve the prediction performance of subsequent classification models and thereby also simplify their interpretability. Preceding studies demonstrated that single feature selection methods can have specific biases, whereas an ensemble feature selection has the advantage to alleviate and compensate for these biases. The software EFS (Ensemble Feature Selection) makes use of multiple feature selection methods and combines their normalized outputs to a quantitative ensemble importance. Currently, eight different feature selection methods have been integrated in EFS, which can be used separately or combined in an ensemble. EFS identifies relevant features while compensating specific biases of single methods due to an ensemble approach. Thereby, EFS can improve the prediction accuracy and interpretability in subsequent binary classification models. EFS can be downloaded as an R-package from CRAN or used via a web application at http://EFS.heiderlab.de.
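The combination step described above, normalizing the output of each feature selection method and merging the results into one ensemble importance, can be sketched generically. This is not the EFS package API: the function names, the divide-by-maximum normalization and the equal-weight average are assumptions made for illustration, and the scores are invented.

```python
def normalize(scores):
    """Scale one method's non-negative importance scores so the largest becomes 1."""
    largest = max(scores.values())
    if largest == 0:
        return {feature: 0.0 for feature in scores}
    return {feature: value / largest for feature, value in scores.items()}


def ensemble_importance(scores_by_method):
    """Average the normalized scores of every method for each feature."""
    normalized = [normalize(scores) for scores in scores_by_method.values()]
    features = normalized[0].keys()
    return {
        feature: sum(method[feature] for method in normalized) / len(normalized)
        for feature in features
    }


# Made-up raw importances on very different scales from two hypothetical methods.
raw_scores = {
    "method_a": {"f1": 10.0, "f2": 0.0},
    "method_b": {"f1": 0.2, "f2": 0.1},
}
importance = ensemble_importance(raw_scores)
ranking = sorted(importance, key=importance.get, reverse=True)
```

Normalizing first keeps a method with large raw scores from dominating the ensemble, which is one way the bias of a single method can be compensated.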
NASA Technical Reports Server (NTRS)
Sippel, Jason A.; Zhang, Fuqing
2009-01-01
This study uses short-range ensemble forecasts initialized with an Ensemble-Kalman filter to study the dynamics and predictability of Hurricane Humberto, which made landfall along the Texas coast in 2007. Statistical correlation is used to determine why some ensemble members strengthen the incipient low into a hurricane and others do not. It is found that deep moisture and high convective available potential energy (CAPE) are two of the most important factors for the genesis of Humberto. Variations in CAPE result in as much difference (ensemble spread) in the final hurricane intensity as do variations in deep moisture. CAPE differences here are related to the interaction between the cyclone and a nearby front, which tends to stabilize the lower troposphere in the vicinity of the circulation center. This subsequently weakens convection and slows genesis. Eventually the wind-induced surface heat exchange mechanism and differences in landfall time result in even larger ensemble spread.
Development of web-based services for an ensemble flood forecasting and risk assessment system
NASA Astrophysics Data System (ADS)
Yaw Manful, Desmond; He, Yi; Cloke, Hannah; Pappenberger, Florian; Li, Zhijia; Wetterhall, Fredrik; Huang, Yingchun; Hu, Yuzhong
2010-05-01
Flooding is a widespread and devastating natural disaster worldwide. Floods that took place in the last decade in China were ranked the worst amongst recorded floods worldwide in terms of the number of human fatalities and economic losses (Munich Re-Insurance). Rapid economic development and population expansion into low lying flood plains has worsened the situation. Current conventional flood prediction systems in China are suited neither to the perceptible climate variability nor to the rapid pace of urbanization sweeping the country. Flood prediction, from short-term (a few hours) to medium-term (a few days), needs to be revisited and adapted to changing socio-economic and hydro-climatic realities. The latest technology requires implementation of multiple numerical weather prediction systems. The availability of twelve global ensemble weather prediction systems through the 'THORPEX Interactive Grand Global Ensemble' (TIGGE) offers a good opportunity for an effective state-of-the-art early forecasting system. A prototype of a Novel Flood Early Warning System (NEWS) using the TIGGE database is tested in the Huai River basin in east-central China. It is the first early flood warning system in China that uses the massive TIGGE database cascaded with river catchment models, the Xinanjiang hydrologic model and a 1-D hydraulic model, to predict river discharge and flood inundation. The NEWS algorithm is also designed to provide web-based services to a broad spectrum of end-users. The latter presents challenges as both databases and proprietary codes reside in different locations and converge at dissimilar times. NEWS will thus make use of a ready-to-run grid system that makes distributed computing and data resources available in a seamless and secure way. An ability to run or function on different operating systems and provide an interface or front that is accessible to a broad spectrum of end-users is an additional requirement. 
The aim is to achieve robust interoperability through strong security and workflow capabilities. A physical network diagram and a work flow scheme of all the models, codes and databases used to achieve the NEWS algorithm are presented. They constitute a first step in the development of a platform for providing real time flood forecasting services on the web to mitigate 21st century weather phenomena.
Awad, Aya; Bader-El-Den, Mohamed; McNicholas, James; Briggs, Jim
2017-12-01
Mortality prediction of hospitalized patients is an important problem. Over the past few decades, several severity scoring systems and machine learning mortality prediction models have been developed for predicting hospital mortality. By contrast, early mortality prediction for intensive care unit patients remains an open challenge. Most research has focused on severity of illness scoring systems or data mining (DM) models designed for risk estimation at least 24 or 48h after ICU admission. This study highlights the main data challenges in early mortality prediction in ICU patients and introduces a new machine learning based framework for Early Mortality Prediction for Intensive Care Unit patients (EMPICU). The proposed method is evaluated on the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database. Mortality prediction models are developed for patients at the age of 16 or above in Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery Recovery Unit (CSRU). We employ the ensemble learning Random Forest (RF), the predictive Decision Trees (DT), the probabilistic Naive Bayes (NB) and the rule-based Projective Adaptive Resonance Theory (PART) models. The primary outcome was hospital mortality. The explanatory variables included demographic, physiological, vital signs and laboratory test variables. Performance measures were calculated using cross-validated area under the receiver operating characteristic curve (AUROC) to minimize bias. 11,722 patients with single ICU stays are considered. Only patients at the age of 16 years old and above in Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery Recovery Unit (CSRU) are considered in this study. The proposed EMPICU framework outperformed standard scoring systems (SOFA, SAPS-I, APACHE-II, NEWS and qSOFA) in terms of AUROC and time (i.e. at 6h compared to 48h or more after admission). 
The results show that although there are many values missing in the first few hours of ICU admission, there is enough signal to effectively predict mortality during the first 6h of admission. The proposed framework, in particular the one that uses the ensemble learning approach, EMPICU Random Forest (EMPICU-RF), offers a base to construct an effective and novel mortality prediction model in the early hours of an ICU patient admission, with an improved performance profile. Copyright © 2017 Elsevier B.V. All rights reserved.
Singla, Neeru; Srivastava, Vishal; Mehta, Dalip Singh
2018-05-01
Malaria is a life-threatening infectious blood disease affecting humans and other animals, caused by parasitic protozoans of the genus Plasmodium and found especially in developing countries. The gold standard method for the detection of malaria is microscopic examination of chemically treated blood smears. We developed an automated optical spatial coherence tomographic system using a machine learning approach for fast identification of malaria cells. In this study, 28 samples (15 healthy, 13 malaria infected stages of red blood cells) were imaged by the developed system and 13 features were extracted. We designed a multilevel ensemble-based classifier for the quantitative prediction of different stages of the malaria cells. The proposed classifier was evaluated with repeated k-fold cross-validation and achieved a high average accuracy of 97.9% for identifying the malaria infected late trophozoite stage of cells. Overall, our proposed system and multilevel ensemble model have substantial quantifiable potential to detect the different stages of malaria infection without staining or an expert. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Lessons from Climate Modeling on the Design and Use of Ensembles for Crop Modeling
NASA Technical Reports Server (NTRS)
Wallach, Daniel; Mearns, Linda O.; Ruane, Alexander C.; Roetter, Reimund P.; Asseng, Senthold
2016-01-01
Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions include defining criteria for acceptance of models in a crop MME, exploring criteria for evaluating the degree of relatedness of models in a MME, studying the effect of number of models in the ensemble, development of a statistical model of model sampling, creation of a repository for MME results, studies of possible differential weighting of models in an ensemble, creation of single model ensembles based on sampling from the uncertainty distribution of parameter values or inputs specifically oriented toward uncertainty estimation, the creation of super ensembles that sample more than one source of uncertainty, the analysis of super ensemble results to obtain information on total uncertainty and the separate contributions of different sources of uncertainty and finally further investigation of the use of the multi-model mean or median as a predictor.
NASA Astrophysics Data System (ADS)
Liu, Fang; Cao, San-xing; Lu, Rui
2012-04-01
This paper proposes a user credit assessment model based on a clustering ensemble, aiming to solve the problem of users illegally spreading pirated and pornographic media content on self-service oriented broadband network new media platforms. The idea is to assess new media user credit by establishing an index system based on user credit behaviors, so that illegal users can be identified from the credit assessment results and the transmission of harmful video and audio on the network can be curbed. The proposed clustering ensemble model combines two advantages: swarm intelligence clustering is suitable for user credit behavior analysis, and K-means clustering can eliminate the scattered users left in the result of swarm intelligence clustering. Together they classify all users' credit automatically. Verification experiments were carried out on a standard credit application dataset from the UCI machine learning repository. The statistical results of a comparative experiment with a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in finding the user clusters with the best and worst credit, which will help operators take incentive or punitive measures accurately. In addition, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is robust and has better prediction accuracy.
NASA Astrophysics Data System (ADS)
Pillai, Prasanth A.; Rao, Suryachandra A.; Das, Renu S.; Salunke, Kiran; Dhakate, Ashish
2017-10-01
The present study assesses the potential predictability of boreal summer (June through September, JJAS) tropical sea surface temperature (SST) and Indian summer monsoon rainfall (ISMR) using high resolution climate forecast system (CFSv2-T382) hindcasts. Potential predictability is computed using relative entropy (RE), which is the combined effect of signal strength and model spread, while the correlation between ensemble mean and observations represents the actual skill. Both actual and potential skills increase as lead time decreases for the Niño3 index and the equatorial East Indian Ocean (EEIO) SST anomaly, and the two skills are close to each other for May IC hindcasts at zero lead. At the same time, the actual skills of ISMR and the El Niño Modoki index (EMI) are close to the potential skill for Feb IC hindcasts (3 month lead). It is interesting to note that actual and potential skills are nearly equal when RE makes its maximum contribution to an individual year's prediction skill and its relationship with absolute error is insignificant or out of phase. The major contribution to potential predictability is from the ensemble mean, and the role of ensemble spread is limited for Pacific SST and ISMR hindcasts. RE values are able to capture the predictability contribution from both initial SST and simultaneous boundary forcing better than the ensemble mean, resulting in higher potential skill compared to actual skill for all ICs. For Feb IC hindcasts at 3 month lead time, the initial month SST (Feb SST) has an important predictive component for El Niño Modoki and ISMR, leading to a higher value of actual skill which is close to the potential skill. This study points out that even though the simultaneous relationship between ensemble mean ISMR and global SST is similar for all ICs, the predictive component from initial SST anomalies is captured well by Feb IC (3 month lead) hindcasts only. This resulted in better skill of ISMR for Feb IC (3 month lead) hindcasts compared to May IC (0 month lead) hindcasts. 
Lack of proper contribution from initial SST and teleconnections induces large absolute error for ISMR in May IC hindcasts, resulting in very low actual skill. Thus the collective use of potential predictability skill and actual skill helps to understand the fidelity of the model for further improvement by differentiating the role of initial SST and simultaneous boundary forcing to some extent.
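The description of relative entropy as a combination of signal strength and spread can be made concrete with a small sketch. It assumes that both the forecast ensemble and the climatology are univariate Gaussians, which the abstract does not state; the function and variable names are illustrative only.

```python
import math


def relative_entropy_gaussian(mu_f, var_f, mu_c, var_c):
    """Relative entropy (KL divergence, in nats) of a Gaussian forecast
    N(mu_f, var_f) with respect to a Gaussian climatology N(mu_c, var_c).

    Returns (signal, dispersion, total):
      signal     - contribution from the shift of the ensemble mean
      dispersion - contribution from the change in ensemble spread
    """
    if var_f <= 0 or var_c <= 0:
        raise ValueError("variances must be positive")
    signal = (mu_f - mu_c) ** 2 / (2.0 * var_c)
    dispersion = 0.5 * (math.log(var_c / var_f) + var_f / var_c - 1.0)
    return signal, dispersion, signal + dispersion
```

Under this assumption, a forecast whose mean and variance equal those of climatology has zero relative entropy, and a large shift in the ensemble mean raises it even when the spread is unchanged.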
Forecasting European Droughts using the North American Multi-Model Ensemble (NMME)
NASA Astrophysics Data System (ADS)
Thober, Stephan; Kumar, Rohini; Samaniego, Luis; Sheffield, Justin; Schäfer, David; Mai, Juliane
2015-04-01
Soil moisture droughts have the potential to diminish crop yields causing economic damage or even threatening the livelihood of societies. State-of-the-art drought forecasting systems incorporate seasonal meteorological forecasts to estimate future drought conditions. Meteorological forecasting skill (in particular that of precipitation), however, is limited to a few weeks because of the chaotic behaviour of the atmosphere. One of the most important challenges in drought forecasting is to understand how the uncertainty in the atmospheric forcings (e.g., precipitation and temperature) is further propagated into hydrologic variables such as soil moisture. The North American Multi-Model Ensemble (NMME) provides the latest collection of a multi-institutional seasonal forecasting ensemble for precipitation and temperature. In this study, we analyse the skill of NMME forecasts for predicting European drought events. The monthly NMME forecasts are downscaled to daily values to force the mesoscale hydrological model (mHM). The mHM soil moisture forecasts obtained with the forcings of the dynamical models are then compared against those obtained with the Ensemble Streamflow Prediction (ESP) approach. ESP recombines historical meteorological forcings to create a new ensemble forecast. Both forecasts are compared against reference soil moisture conditions obtained using observation based meteorological forcings. The study is conducted for the period from 1982 to 2009 and covers a large part of the Pan-European domain (10°W to 40°E and 35°N to 55°N). Results indicate that NMME forecasts are better at predicting the reference soil moisture variability as compared to ESP. For example, NMME explains 50% of the variability in contrast to only 31% by ESP at a six-month lead time. The Equitable Threat Skill Score (ETS), which combines the hit and false alarm rates, is analysed for drought events using a 0.2 threshold of a soil moisture percentile index. 
On average, the NMME based ensemble forecasts have consistently higher skill than the ESP based ones (ETS of 13% as compared to 5% at a six-month lead time). Additionally, the ETS ensemble spread of NMME forecasts is considerably narrower than that of ESP; the lower boundary of the NMME ensemble spread coincides most of the time with the ensemble median of ESP. Among the NMME models, NCEP-CFSv2 outperforms the other models in terms of ETS most of the time. Removing the three worst performing models does not deteriorate the ensemble performance (neither in skill nor in spread), but would substantially reduce the computational resources required in an operational forecasting system. For major European drought events (e.g., 1990, 1992, 2003, and 2007), NMME forecasts tend to underestimate area under drought and drought magnitude during times of drought development. During drought recovery, this underestimation is weaker for area under drought or even reversed into an overestimation for drought magnitude. This indicates that the NMME models are too wet during drought development and too dry during drought recovery. In summary, soil moisture drought forecasts by NMME are more skillful than those of an ESP based approach. However, they still show systematic biases in reproducing the observed drought dynamics during drought development and recovery.
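For readers unfamiliar with the ETS, the following sketch shows the standard contingency-table definition. Treating a drought event as a percentile index strictly below 0.2 is an assumption about how the threshold is applied, and the helper names are hypothetical.

```python
def drought_contingency(forecast_pct, reference_pct, threshold=0.2):
    """Count hits, misses, false alarms and correct negatives, treating a
    soil moisture percentile index below `threshold` as a drought event
    (assumed convention)."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, r in zip(forecast_pct, reference_pct):
        f_event, r_event = f < threshold, r < threshold
        if f_event and r_event:
            hits += 1
        elif r_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS: threat score corrected for hits expected by random chance."""
    n = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else 0.0
```

A perfect forecast scores 1, and a forecast with no more hits than expected by chance scores 0, which is why ETS values such as 13% and 5% can still be meaningfully compared.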
NASA Technical Reports Server (NTRS)
Kalnay, Eugenia; Dalcher, Amnon
1987-01-01
It is shown that it is possible to predict the skill of numerical weather forecasts - a quantity which is variable from day to day and region to region. This has been accomplished using as predictor the dispersion (measured by the average correlation) between members of an ensemble of forecasts started from five different analyses. The analyses had been previously derived for satellite-data-impact studies and included, in the Northern Hemisphere, moderate perturbations associated with the use of different observing systems. When the Northern Hemisphere was used as a verification region, the prediction of skill was rather poor. This is due to the fact that such a large area usually contains regions with excellent forecasts as well as regions with poor forecasts, and does not allow for discrimination between them. However, when regional verifications were used, the ensemble forecast dispersion provided a very good prediction of the quality of the individual forecasts.
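The dispersion measure mentioned above, the average correlation between ensemble members, might be computed along the following lines. This is a sketch only: it assumes each member is a flattened list of forecast values over the verification region and uses plain Pearson correlation, whereas the original study may have used anomaly correlations.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant field")
    return cov / (sx * sy)


def mean_pairwise_correlation(members):
    """Average correlation over all distinct pairs of ensemble members.
    High values indicate low dispersion (members agree)."""
    pairs = list(combinations(members, 2))
    if not pairs:
        raise ValueError("need at least two ensemble members")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

Computing this over a small region rather than the whole hemisphere matches the abstract's finding that regional verification discriminates between good and poor forecasts.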
Post-processing method for wind speed ensemble forecast using wind speed and direction
NASA Astrophysics Data System (ADS)
Sofie Eide, Siri; Bjørnar Bremnes, John; Steinsland, Ingelin
2017-04-01
Statistical methods are widely applied to enhance the quality of both deterministic and ensemble NWP forecasts. In many situations, such as wind speed forecasting, most of the predictive information is contained in one variable in the NWP models. However, in statistical calibration of deterministic forecasts it is often seen that including more variables can further improve forecast skill. For ensembles this is rarely taken advantage of, mainly because it is generally not straightforward to include multiple variables. In this study, it is demonstrated how multiple variables can be included in Bayesian model averaging (BMA) by using a flexible regression method for estimating the conditional means. The method is applied to wind speed forecasting at 204 Norwegian stations based on wind speed and direction forecasts from the ECMWF ensemble system. At about 85 % of the sites the ensemble forecasts were improved in terms of CRPS by adding wind direction as a predictor compared to only using wind speed. On average the improvements were about 5 %, but mainly for moderate to strong wind situations. For weak wind speeds, adding wind direction had a more or less neutral impact.
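As a rough illustration of the CRPS used for verification, the sketch below evaluates the score for a raw ensemble treated as an empirical distribution. The study itself scores BMA predictive distributions, so this is a simplified stand-in with an illustrative function name.

```python
def crps_ensemble(members, observation):
    """CRPS of an ensemble treated as an empirical distribution:
    mean |x_i - y| minus half the mean |x_i - x_j| over all pairs."""
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must not be empty")
    accuracy = sum(abs(x - observation) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (2.0 * m * m)
    return accuracy - spread
```

Lower values are better, and for a single-member ensemble the score reduces to the absolute error, which makes it comparable with deterministic forecasts.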
NASA Astrophysics Data System (ADS)
Soltanzadeh, I.; Azadi, M.; Vakili, G. A.
2011-07-01
Using Bayesian Model Averaging (BMA), an attempt was made to obtain calibrated probabilistic numerical forecasts of 2-m temperature over Iran. The ensemble employs three limited area models (WRF, MM5 and HRM), with WRF used in five different configurations. Initial and boundary conditions for MM5 and WRF are obtained from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS), and for HRM the initial and boundary conditions come from the analysis of the Global Model Europe (GME) of the German Weather Service. The resulting ensemble of seven members was run for a period of 6 months (from December 2008 to May 2009) over Iran. The 48-h raw ensemble outputs were calibrated using the BMA technique for 120 days, using a 40-day training sample of forecasts and corresponding verification data. The calibrated probabilistic forecasts were assessed using rank histograms and attribute diagrams. Results showed that application of BMA improved the reliability of the raw ensemble. Using the weighted ensemble mean forecast as a deterministic forecast, it was found that the deterministic-style BMA forecasts usually performed better than the best member's deterministic forecast.
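The BMA forecast described above can be sketched as a weighted mixture of normal densities centred on the member forecasts. The sketch assumes the members are already bias-corrected and share one standard deviation; the weights and the spread would normally be fitted on the training sample, which is not shown here, and all names are illustrative.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bma_mean(weights, forecasts):
    """Deterministic-style BMA forecast: the weighted ensemble mean."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("BMA weights must sum to 1")
    return sum(w * f for w, f in zip(weights, forecasts))


def bma_pdf(y, weights, forecasts, sd):
    """BMA predictive density at y: a weighted mixture of normal kernels,
    one per (bias-corrected) member forecast."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("BMA weights must sum to 1")
    return sum(w * normal_pdf(y, f, sd) for w, f in zip(weights, forecasts))
```

The mixture gives the calibrated probabilistic forecast, while `bma_mean` corresponds to the weighted ensemble mean that the abstract compares with the best single member.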
Multi-RCM ensemble downscaling of global seasonal forecasts (MRED)
NASA Astrophysics Data System (ADS)
Arritt, R.
2009-04-01
Regional climate models (RCMs) have long been used to downscale global climate simulations. In contrast the ability of RCMs to downscale seasonal climate forecasts has received little attention. The Multi-RCM Ensemble Downscaling (MRED) project was recently initiated to address the question, Does dynamical downscaling using RCMs provide additional useful information for seasonal forecasts made by global models? MRED is using a suite of RCMs to downscale seasonal forecasts produced by the National Centers for Environmental Prediction (NCEP) Climate Forecast System (CFS) seasonal forecast system and the NASA GEOS5 system. The initial focus is on wintertime forecasts in order to evaluate topographic forcing, snowmelt, and the usefulness of higher resolution for near-surface fields influenced by high resolution orography. Each RCM covers the conterminous U.S. at approximately 32 km resolution, comparable to the scale of the North American Regional Reanalysis (NARR) which will be used to evaluate the models. The forecast ensemble for each RCM is comprised of 15 members over a period of 22+ years (from 1982 to 2003+) for the forecast period 1 December - 30 April. Each RCM will create a 15-member lagged ensemble by starting on different dates in the preceding November. This results in a 120-member ensemble for each projection (8 RCMs by 15 members per RCM). The RCMs will be continually updated at their lateral boundaries using 6-hourly output from CFS or GEOS5. Hydrometeorological output will be produced in a standard netCDF-based format for a common analysis grid, which simplifies both model intercomparison and the generation of ensembles. MRED will compare individual RCM and global forecasts as well as ensemble mean precipitation and temperature forecasts, which are currently being used to drive macroscale land surface models (LSMs). Metrics of ensemble spread will also be evaluated. 
Extensive process-oriented analysis will be performed to link improvements in downscaled forecast skill to regional forcings and physical mechanisms. Our overarching goal is to determine what additional skill can be provided by a community ensemble of high resolution regional models, which we believe will define a strategy for more skillful and useful regional seasonal climate forecasts.
Integrated Data-Archive and Distributed Hydrological Modelling System for Optimized Dam Operation
NASA Astrophysics Data System (ADS)
Shibuo, Yoshihiro; Jaranilla-Sanchez, Patricia Ann; Koike, Toshio
2013-04-01
In 2012, typhoon Bopha, which passed through the southern part of the Philippines, devastated the nation, leaving hundreds dead and causing significant destruction. Indeed, deadly events related to cyclones occur almost every year in the region. Such extremes are expected to increase in both frequency and magnitude around Southeast Asia in the course of global climate change. Our ability to confront such hazardous events is limited by the best available engineering infrastructure and the performance of weather prediction. One countermeasure strategy is, for instance, early release of reservoir water (lowering the dam water level) during the flood season to protect the downstream region from impending floods. However, over-release of reservoir water adversely affects the regional economy by losing water resources that still have value for power generation and for agricultural and industrial water use. Furthermore, accurate precipitation forecasting is itself a difficult task, because the chaotic nature of the atmosphere yields uncertainty in model prediction over time. Under these circumstances we present a novel approach to optimize the conflicting objectives of preventing flood damage via prior dam release while sustaining sufficient water supply during the predicted storm events. By evaluating the forecast performance of the Meso-Scale Model Grid Point Value (GPV) against observed rainfall, uncertainty in model prediction is probabilistically taken into account, and it is then applied to the next GPV issuance to generate ensemble rainfalls. The ensemble rainfalls drive the coupled land-surface and distributed hydrological model to derive the ensemble flood forecast. With dam status information taken into account, our integrated system estimates the most desirable prior dam release through the shuffled complex evolution algorithm. 
The strength of the optimization system is further magnified by the online link to the Data Integration and Analysis System, a Japanese national project for collecting, integrating and analyzing massive amounts of global scale observation data, meaning that the present system is applicable worldwide. We demonstrate the integrated system with observed extreme events in the Angat Watershed, the Philippines, and the Upper Tone River basin, Japan. The results show promising performance for operational use of the system to support river and dam managers' decision-making.
Reducing Risk of Noise-Induced Hearing Loss in Collegiate Music Ensembles Using Ambient Technology.
Powell, Jason; Chesky, Kris
2017-09-01
Student musicians are at risk for noise-induced hearing loss (NIHL) as they develop skills and perform during instructional activities. Studies using longitudinal dosimeter data show that pedagogical procedures and instructor behaviors are highly predictive of NIHL risk, thus implying the need for innovative approaches to increase instructor competency in managing instructional activities without interfering with artistic and academic freedom. Ambient information systems, an emerging trend in human-computer interaction that infuses psychological behavioral theories into technologies, can help construct informative risk-regulating systems. The purpose of this study was to determine the effects of introducing an ambient information system into the ensemble setting. The system used two ambient displays and a counterbalanced within-subjects treatment study design with six jazz ensemble instructors to determine if the system could induce a behavior change that alters trends in measures resulting from dosimeter data. This study assessed efficacy using time series analysis to determine changes in eight statistical measures of behavior over a 9-wk period. Analysis showed that the system was effective, as all instructors showed changes in a combination of measures. This study is an important step in developing non-interfering technology to reduce NIHL among academic musicians.
NASA Astrophysics Data System (ADS)
Gelfan, Alexander; Moreido, Vsevolod
2017-04-01
Ensemble hydrological forecasting allows for describing uncertainty caused by variability of meteorological conditions in the river basin over the forecast lead-time. At the same time, in snowmelt-dependent river basins another significant source of uncertainty relates to variability of the initial conditions of the basin (snow water equivalent, soil moisture content, etc.) prior to forecast issue. Accurate long-term hydrological forecasting is most crucial for large water management systems, such as the Cheboksary reservoir (catchment area 374 000 sq.km) located on the Middle Volga river in Russia. Accurate forecasts of water inflow volume, maximum discharge and other flow characteristics are of great value for this basin, especially before the beginning of the spring freshet season, which lasts here from April to June. The semi-distributed hydrological model ECOMAG was used to develop long-term ensemble forecasts of daily water inflow into the Cheboksary reservoir. To describe variability of the meteorological conditions and construct an ensemble of possible weather scenarios for the lead-time of the forecast, two approaches were applied. The first one utilizes 50 weather scenarios observed in previous years (similar to the ensemble streamflow prediction (ESP) procedure); the second one uses 1000 synthetic scenarios simulated by a stochastic weather generator. We investigated the evolution of forecast uncertainty reduction, expressed as forecast efficiency, over consecutive forecast issue dates and lead times. We analyzed the Nash-Sutcliffe efficiency of inflow hindcasts for the period 1982 to 2016, starting from 1st of March at 15-day intervals, for lead-times of 1 to 6 months. This resulted in a forecast efficiency matrix of issue dates versus lead-time that allows the predictability of the basin to be identified. The matrix was constructed separately for observed and synthetic weather ensembles.
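The forecast efficiency matrix described above can be sketched as follows. The Nash-Sutcliffe formula is standard; the dictionary layout keyed by (issue date, lead time) is a hypothetical data structure chosen for illustration, not the authors' implementation.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared forecast
    error to the variance of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE undefined for constant observations")
    return 1.0 - error / variance


def efficiency_matrix(hindcasts):
    """Map each (issue_date, lead_months) key to the NSE of its hindcasts.
    `hindcasts` maps the key to an (observed, simulated) pair of sequences
    collected over the hindcast years (assumed layout)."""
    return {key: nash_sutcliffe(obs, sim)
            for key, (obs, sim) in hindcasts.items()}
```

An efficiency of 1 indicates a perfect hindcast, and 0 means the forecast is no better than always predicting the observed mean, so cells near or below 0 mark issue dates and lead times with little predictability.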
NASA Astrophysics Data System (ADS)
Ma, Feng; Ye, Aizhong; Duan, Qingyun
2017-03-01
An experimental seasonal drought forecasting system is developed based on 29-year (1982-2010) seasonal meteorological hindcasts generated by the climate models from the North American Multi-Model Ensemble (NMME) project. This system made use of a bias correction and spatial downscaling method, and a distributed time-variant gain model (DTVGM) hydrologic model. DTVGM was calibrated using observed daily hydrological data and its streamflow simulations achieved Nash-Sutcliffe efficiency values of 0.727 and 0.724 during calibration (1978-1995) and validation (1996-2005) periods, respectively, at the Danjiangkou reservoir station. The experimental seasonal drought forecasting system (known as NMME-DTVGM) is used to generate seasonal drought forecasts. The forecasts were evaluated against the reference forecasts (i.e., persistence forecast and climatological forecast). The NMME-DTVGM drought forecasts have higher detectability and accuracy and lower false alarm rate than the reference forecasts at different lead times (from 1 to 4 months) during the cold-dry season. No apparent advantage is shown in drought predictions during spring and summer seasons because of a long memory of the initial conditions in spring and a lower predictive skill for precipitation in summer. Overall, the NMME-based seasonal drought forecasting system has meaningful skill in predicting drought several months in advance, which can provide critical information for drought preparedness and response planning as well as the sustainable practice of water resource conservation over the basin.
NASA Astrophysics Data System (ADS)
Meißner, Dennis; Klein, Bastian; Ionita, Monica; Hemri, Stephan; Rademacher, Silke
2017-04-01
Inland waterway transport (IWT) is an important commercial sector significantly vulnerable to hydrological impacts. River ice and floods limit the availability of the waterway network and may cause considerable damage to waterway infrastructure. Low flows significantly affect IWT's operation efficiency, usually for several months a year, due to the close correlation of (low) water levels / water depths and (high) transport costs. Therefore "navigation-related" hydrological forecasts focussing on the specific requirements of water-bound transport (relevant forecast locations, target parameters, skill characteristics etc.) play a major role in mitigating IWT's vulnerability to hydro-meteorological impacts. In light of continuing transport growth within the European Union, hydrological forecasts for the waterways are essential to stimulate more consistent use of the free capacity IWT still offers. An overview of the current operational and pre-operational forecasting systems for the German waterways predicting water levels, discharges and river ice thickness on various time-scales will be presented. While short-term (deterministic) forecasts have a long tradition in navigation-related forecasting, (probabilistic) forecasting services offering extended lead-times are not yet well-established and are still subject to current research and development activities (e.g. within the EU-projects EUPORIAS and IMPREX). The focus is on improving technical aspects as well as on exploring adequate ways of disseminating and communicating probabilistic forecast information. 
For the German stretch of the River Rhine, one of the most frequented inland waterways worldwide, the existing deterministic forecast scheme has been extended by ensemble forecasts combined with statistical post-processing modules applying EMOS (Ensemble Model Output Statistics) and ECC (Ensemble Copula Coupling) in order to generate water level predictions up to 10 days and to estimate its predictive uncertainty properly. Additionally for the key locations at the international waterways Rhine, Elbe and Danube three competing forecast approaches are currently tested in a pre-operational set-up in order to generate monthly to seasonal (up to 3 months) forecasts: (1) the well-known Ensemble Streamflow Prediction approach (ensemble based on historical meteorology), (2) coupling hydrological models with post-processed outputs from ECMWF's general circulation model (System 4), and (3) a purely statistical approach based on the stable relationship (teleconnection) of global or regional oceanic, climate and hydrological data with river flows. The current results, still pre-operational, reveal the existence of a valuable predictability of water levels and streamflow also at monthly up to seasonal time-scales along the larger rivers used as waterways in Germany. Last but not least insight into the technical set-up of the aforementioned forecasting systems operated at the Federal Institute of Hydrology, which are based on a Delft-FEWS application, will be given focussing on the step-wise extension of the former system by integrating new components in order to meet the growing needs of the customers and to improve and extend the forecast portfolio for waterway users.
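The reordering step at the heart of Ensemble Copula Coupling can be sketched as below. It assumes the quantile variant, in which calibrated values (for example quantiles of an EMOS predictive distribution) are rearranged to follow the rank order of the raw ensemble at one location and lead time; the function name and the tie-breaking by member index are illustrative choices.

```python
def ecc_reorder(raw_ensemble, calibrated_samples):
    """Reorder calibrated samples so that they inherit the rank structure
    of the raw ensemble: the member that was lowest in the raw ensemble
    receives the lowest calibrated value, and so on."""
    if len(raw_ensemble) != len(calibrated_samples):
        raise ValueError("need one calibrated sample per raw member")
    # Member indices sorted from smallest to largest raw value
    # (ties are broken by member index, an arbitrary choice).
    order = sorted(range(len(raw_ensemble)), key=lambda i: raw_ensemble[i])
    sorted_calibrated = sorted(calibrated_samples)
    reordered = [0.0] * len(raw_ensemble)
    for rank, member in enumerate(order):
        reordered[member] = sorted_calibrated[rank]
    return reordered
```

Applying the same reordering independently at every lead time keeps each member's trajectory consistent in time, which is the reason for coupling calibrated margins back to the raw ensemble.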
A New Ensemble Canonical Correlation Prediction Scheme for Seasonal Precipitation
NASA Technical Reports Server (NTRS)
Kim, Kyu-Myong; Lau, William K. M.; Li, Guilong; Shen, Samuel S. P.; Lau, William K. M. (Technical Monitor)
2001-01-01
Department of Mathematical Sciences, University of Alberta, Edmonton, Canada. This paper describes the fundamental theory of the ensemble canonical correlation (ECC) algorithm for seasonal climate forecasting. The algorithm is a statistical regression scheme based on maximal correlation between the predictor and predictand. The prediction error is estimated by a spectral method using the basis of empirical orthogonal functions. The ECC algorithm treats the predictors and predictands as continuous fields and is an improvement on the traditional canonical correlation prediction. The improvements include the use of area-factor, estimation of prediction error, and the optimal ensemble of multiple forecasts. The ECC is applied to seasonal forecasting over various parts of the world. The example presented here is for North American precipitation. The predictor is the sea surface temperature (SST) from different ocean basins. The Climate Prediction Center's reconstructed SST (1951-1999) is used as the predictor's historical data. The optimally interpolated global monthly precipitation is used as the predictand's historical data. Our forecast experiments show that the ECC algorithm renders very high skill and the optimal ensemble is very important to the high value.
NASA Astrophysics Data System (ADS)
Noh, S. J.; Rakovec, O.; Kumar, R.; Samaniego, L. E.
2015-12-01
Accurate and reliable streamflow prediction is essential to mitigate social and economic damage coming from water-related disasters such as flood and drought. Sequential data assimilation (DA) may facilitate improved streamflow prediction using real-time observations to correct internal model states. In conventional DA methods such as state updating, parametric uncertainty is often ignored, mainly due to practical limitations of the methodology in specifying modeling uncertainty with limited ensemble members. However, if parametric uncertainty related to routing and runoff components is not incorporated properly, the predictive uncertainty of the model ensemble may be insufficient to capture the dynamics of observations, which may deteriorate predictability. Recently, a multi-scale parameter regionalization (MPR) method was proposed to make hydrologic predictions at different scales using the same set of model parameters without losing much of the model performance. The MPR method incorporated within the mesoscale hydrologic model (mHM, http://www.ufz.de/mhm) could effectively represent and control the uncertainty of high-dimensional parameters in a distributed model using global parameters. In this study, we evaluate the impacts of streamflow data assimilation over European river basins. In particular, a multi-parametric ensemble approach is tested to consider the effects of parametric uncertainty in DA. Because augmentation of parameters is not required within an assimilation window, the approach could be more stable with limited ensemble members and has potential for operational use. To consider the response times and non-Gaussian characteristics of internal hydrologic processes, lagged particle filtering is utilized. The presentation will focus on the gains and limitations of streamflow data assimilation and the multi-parametric ensemble method over large-scale basins.
Assimilation of sea ice concentration data in the Arctic via DART/CICE5 in the CESM1
NASA Astrophysics Data System (ADS)
Zhang, Y.; Bitz, C. M.; Anderson, J. L.; Collins, N.; Hendricks, J.; Hoar, T. J.; Raeder, K.
2016-12-01
Arctic sea ice cover has been experiencing significant reduction in the past few decades. Climate models predict that the Arctic Ocean may be ice-free in late summer within a few decades. Better sea ice prediction is crucial for regional and global climate predictions, which are vital to human activities such as maritime shipping and subsistence hunting, as well as to wildlife protection as animals face habitat loss. The physical processes involved with the persistence and re-emergence of sea ice cover are found to extend the predictability of sea ice concentration (SIC) and thickness at the regional scale up to several years. This motivates us to investigate sea ice predictability stemming from initial values of the sea ice cover. Data assimilation is a useful technique to combine observations and model forecasts to reconstruct the states of sea ice in the past and provide more accurate initial conditions for sea ice prediction. This work links the most recent version of the Los Alamos sea ice model (CICE5) within the Community Earth System Model version 1.5 (CESM1.5) and the Data Assimilation Research Testbed (DART). The linked DART/CICE5 is ideal for assimilating multi-scale and multivariate sea ice observations using an ensemble Kalman filter (EnKF). The study is focused on the assimilation of SIC data that impact SIC, sea ice thickness, and snow thickness. The ensemble sea ice model states are constructed by introducing uncertainties in atmospheric forcing and key model parameters. The ensemble atmospheric forcing is a reanalysis product generated with DART and the Community Atmosphere Model (CAM). We also perturb two model parameters that were found in previous studies to contribute significantly to the model uncertainty. This study applies perfect model observing system simulation experiments (OSSEs) to investigate data assimilation algorithms and post-processing methods. One of the ensemble members of a CICE5 free run is chosen as the truth. 
Daily synthetic observations are obtained by adding 15% random noise to the truth. Experiments assimilating the synthetic observations are then conducted to test the effectiveness of different data assimilation algorithms (e.g., localization and inflation) and post-processing methods (e.g., how to distribute the total increment of SIC into each ice thickness category).
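The synthetic-observation step of a perfect-model OSSE can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say how the "15% random noise" is distributed, so the sketch assumes zero-mean Gaussian noise with a standard deviation of 15% of the true value, and the function name and sample values are hypothetical.

```python
import random

def synthetic_observations(truth, rel_noise=0.15, seed=0):
    """Perturb each true SIC value with zero-mean Gaussian noise.

    Assumption: the noise standard deviation is rel_noise times the true
    value; the abstract does not specify the noise model.
    """
    rng = random.Random(seed)
    observations = []
    for value in truth:
        noisy = value + rng.gauss(0.0, rel_noise * value)
        # Sea ice concentration is a fraction, so clip it to [0, 1].
        observations.append(min(1.0, max(0.0, noisy)))
    return observations

# Hypothetical SIC values taken from the ensemble member chosen as "truth".
truth_sic = [0.0, 0.25, 0.6, 0.95]
obs_sic = synthetic_observations(truth_sic)
```

Because the noise scales with the true value in this reading, open-water points (SIC = 0) are observed without error; an additive noise model would behave differently there.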
NASA Astrophysics Data System (ADS)
Walcott, Sam
2013-03-01
Interactions between the proteins actin and myosin drive muscle contraction. Properties of a single myosin interacting with an actin filament are largely known, but a trillion myosins work together in muscle. We are interested in how single-molecule properties relate to ensemble function. Myosin's reaction rates depend on force, so ensemble models keep track of both molecular state and force on each molecule. These models make subtle predictions, e.g. that myosin, when part of an ensemble, moves actin faster than when isolated. This acceleration arises because forces between molecules speed reaction kinetics. Experiments support this prediction and allow parameter estimates. A model based on this analysis describes experiments from single molecule to ensemble. In vivo, actin is regulated by proteins that, when present, cause the binding of one myosin to speed the binding of its neighbors; binding becomes cooperative. Although such interactions preclude the mean field approximation, a set of linear ODEs describes these ensembles under simplified experimental conditions. In these experiments cooperativity is strong, with the binding of one molecule affecting ten neighbors on either side. We progress toward a description of myosin ensembles under physiological conditions.
An ensemble model of QSAR tools for regulatory risk assessment.
Pradeep, Prachi; Povinelli, Richard J; White, Shannon; Merrill, Stephen J
2016-01-01
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). 
Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa (κ): 0.63 and 0.62] for both the datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature provides an additional control to the regulators in grading a chemical based on the severity of the toxic endpoint under study.
An ensemble model of QSAR tools for regulatory risk assessment
Pradeep, Prachi; Povinelli, Richard J.; White, Shannon; ...
2016-09-22
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals).
Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa (κ): 0.63 and 0.62] for both the datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. In conclusion, this feature provides an additional control to the regulators in grading a chemical based on the severity of the toxic endpoint under study.
Soft sensor modeling based on variable partition ensemble method for nonlinear batch processes
NASA Astrophysics Data System (ADS)
Wang, Li; Chen, Xiangguang; Yang, Kai; Jin, Huaiping
2017-01-01
Batch processes are always characterized by nonlinearity and system uncertainty; therefore, a conventional single model may be ill-suited. A local-learning soft sensor based on a variable partition ensemble method is developed for quality prediction in nonlinear and non-Gaussian batch processes. Multiple input variable sets are obtained by bootstrapping and the PMI criterion. Then, a local GPR model is developed for each local input variable set. When a new test sample arrives, the posterior probability of each best-performing local model is estimated by Bayesian inference and used to combine these local GPR models into the final prediction. The proposed soft sensor is demonstrated on an industrial fed-batch chlortetracycline fermentation process.
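The final combination step described above can be illustrated with a small sketch. It is only a generic posterior-weighted average under stated assumptions: the local predictions, priors, and likelihood values are hypothetical inputs, and the abstract does not say how the authors compute the likelihood of a test sample under each local model.

```python
def combine_local_predictions(predictions, priors, likelihoods):
    """Weight local-model predictions by their posterior probabilities.

    posterior_k is proportional to prior_k * likelihood_k (Bayes' rule);
    the combined prediction is the posterior-weighted sum.
    """
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(unnormalized)
    if evidence <= 0.0:
        raise ValueError("at least one model needs non-zero posterior mass")
    posteriors = [u / evidence for u in unnormalized]
    combined = sum(w * y for w, y in zip(posteriors, predictions))
    return combined, posteriors

# Hypothetical outputs of three local GPR models for one test sample.
local_predictions = [2.0, 4.0, 6.0]
priors = [1 / 3, 1 / 3, 1 / 3]      # equal prior belief in each model
likelihoods = [0.1, 0.6, 0.3]       # assumed p(sample | model k)
estimate, weights = combine_local_predictions(
    local_predictions, priors, likelihoods
)
```

With equal priors the weights reduce to the normalized likelihoods, so the model that best explains the test sample dominates the final prediction.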
Simulating Quantitative Cellular Responses Using Asynchronous Threshold Boolean Network Ensembles
With increasing knowledge about the potential mechanisms underlying cellular functions, it is becoming feasible to predict the response of biological systems to genetic and environmental perturbations. Due to the lack of homogeneity in living tissues it is difficult to estimate t...
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
Machine Learning Predictions of a Multiresolution Climate Model Ensemble
NASA Astrophysics Data System (ADS)
Anderson, Gemma J.; Lucas, Donald D.
2018-05-01
Statistical models of high-resolution climate models are useful for many purposes, including sensitivity and uncertainty analyses, but building them can be computationally prohibitive. We generated a unique multiresolution perturbed parameter ensemble of a global climate model. We use a novel application of a machine learning technique known as random forests to train a statistical model on the ensemble to make high-resolution model predictions of two important quantities: global mean top-of-atmosphere energy flux and precipitation. The random forests leverage cheaper low-resolution simulations, greatly reducing the number of high-resolution simulations required to train the statistical model. We demonstrate that high-resolution predictions of these quantities can be obtained by training on an ensemble that includes only a small number of high-resolution simulations. We also find that global annually averaged precipitation is more sensitive to resolution changes than to any of the model parameters considered.
A Canonical Ensemble Correlation Prediction Model for Seasonal Precipitation Anomaly
NASA Technical Reports Server (NTRS)
Shen, Samuel S. P.; Lau, William K. M.; Kim, Kyu-Myong; Li, Guilong
2001-01-01
This report describes an optimal ensemble forecasting model for seasonal precipitation and its error estimation. Each individual forecast is based on the canonical correlation analysis (CCA) in the spectral spaces whose bases are empirical orthogonal functions (EOF). The optimal weights in the ensemble forecasting crucially depend on the mean square error of each individual forecast. An estimate of the mean square error of a CCA prediction is made also using the spectral method. The error is decomposed onto EOFs of the predictand and decreases linearly according to the correlation between the predictor and predictand. This new CCA model includes the following features: (1) the use of area-factor, (2) the estimation of prediction error, and (3) the optimal ensemble of multiple forecasts. The new CCA model is applied to the seasonal forecasting of the United States precipitation field. The predictor is the sea surface temperature.
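The dependence of the ensemble weights on each forecast's mean square error can be sketched under a simplifying assumption. If the individual forecasts are unbiased and their errors are mutually uncorrelated, the minimum-MSE linear combination weights each forecast in proportion to the inverse of its MSE. Whether the report uses exactly this form is not stated in the abstract; the function names and numbers below are illustrative.

```python
def optimal_weights(mse_values):
    """Inverse-MSE weights, normalized to sum to one.

    Assumption: unbiased forecasts with mutually uncorrelated errors.
    """
    inverse = [1.0 / m for m in mse_values]
    total = sum(inverse)
    return [v / total for v in inverse]

def ensemble_forecast(forecasts, mse_values):
    """Return the weighted forecast and its mean square error."""
    weights = optimal_weights(mse_values)
    combined = sum(w * f for w, f in zip(weights, forecasts))
    # For uncorrelated errors the combined MSE is the harmonic-type sum.
    combined_mse = 1.0 / sum(1.0 / m for m in mse_values)
    return combined, combined_mse

# Two hypothetical CCA forecasts of a precipitation anomaly.
combined, combined_mse = ensemble_forecast([1.0, 3.0], [1.0, 3.0])
```

In this example the weights are 0.75 and 0.25, and the combined MSE (0.75) is smaller than the MSE of either individual forecast.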
Altwaijry, Nojood A; Baron, Michael; Wright, David W; Coveney, Peter V; Townsend-Nicholson, Andrea
2017-05-09
The accurate identification of the specific points of interaction between G protein-coupled receptor (GPCR) oligomers is essential for the design of receptor ligands targeting oligomeric receptor targets. A coarse-grained molecular dynamics computer simulation approach would provide a compelling means of identifying these specific protein-protein interactions and could be applied both for known oligomers of interest and as a high-throughput screen to identify novel oligomeric targets. However, to be effective, this in silico modeling must provide accurate, precise, and reproducible information. This has been achieved recently in numerous biological systems using an ensemble-based all-atom molecular dynamics approach. In this study, we describe an equivalent methodology for ensemble-based coarse-grained simulations. We report the performance of this method when applied to four different GPCRs known to oligomerize using error analysis to determine the ensemble size and individual replica simulation time required. Our measurements of distance between residues shown to be involved in oligomerization of the fifth transmembrane domain from the adenosine A2A receptor are in very good agreement with the existing biophysical data and provide information about the nature of the contact interface that cannot be determined experimentally. Calculations of distance between rhodopsin, CXCR4, and β1AR transmembrane domains reported to form contact points in homodimers correlate well with the corresponding measurements obtained from experimental structural data, providing an ability to predict contact interfaces computationally. Interestingly, error analysis enables identification of noninteracting regions. Our results confirm that GPCR interactions can be reliably predicted using this novel methodology.
Development of a multi-ensemble Prediction Model for China
NASA Astrophysics Data System (ADS)
Brasseur, G. P.; Bouarar, I.; Petersen, A. K.
2016-12-01
As part of the EU-sponsored Panda and MarcoPolo Projects, a multi-model prediction system including 7 models has been developed. Most regional models use global air quality predictions provided by the Copernicus Atmospheric Monitoring Service and downscale the forecast at relatively high spatial resolution in eastern China. The paper will describe the forecast system and show examples of forecasts produced for several Chinese urban areas and displayed on a web site developed by the Dutch Meteorological service. A discussion on the accuracy of the predictions based on a detailed validation process using surface measurements from the Chinese monitoring network will be presented.
The GMAO Hybrid Ensemble-Variational Atmospheric Data Assimilation System: Version 2.0
NASA Technical Reports Server (NTRS)
Todling, Ricardo; El Akkraoui, Amal
2018-01-01
This document describes the implementation and usage of the Goddard Earth Observing System (GEOS) Hybrid Ensemble-Variational Atmospheric Data Assimilation System (Hybrid EVADAS). Its aim is to provide comprehensive guidance to users of GEOS ADAS interested in experimenting with its hybrid functionalities. The document is also aimed at providing a short summary of the state-of-science in this release of the hybrid system. As explained here, the ensemble data assimilation system (EnADAS) mechanism added to GEOS ADAS to enable hybrid data assimilation applications has been introduced to the pre-existing machinery of GEOS in the most non-intrusive possible way. Only very minor changes have been made to the original scripts controlling GEOS ADAS with the objective of facilitating its usage by both researchers and the GMAO's near-real-time Forward Processing applications. In a hybrid scenario two data assimilation systems run concurrently in a two-way feedback mode such that: the ensemble provides background ensemble perturbations required by the ADAS deterministic (typically high resolution) hybrid analysis; and the deterministic ADAS provides analysis information for recentering of the EnADAS analyses and information necessary to ensure that observation bias correction procedures are consistent between both the deterministic ADAS and the EnADAS. The nonintrusive approach to introducing hybrid capability to GEOS ADAS means, in particular, that previously existing features continue to be available. Thus, not only is this upgraded version of GEOS ADAS capable of supporting new applications such as Hybrid 3D-Var, 3D-EnVar, 4D-EnVar and Hybrid 4D-EnVar, it remains possible to use GEOS ADAS in its traditional 3D-Var mode which has been used in both MERRA and MERRA-2. Furthermore, as described in this document, GEOS ADAS also supports a configuration for exercising a purely ensemble-based assimilation strategy which can be fully decoupled from its variational component. 
We should point out that Release 1.0 of this document was made available to GMAO in mid-2013, when we introduced Hybrid 3D-Var capability to GEOS ADAS. This initial version of the documentation included a considerably different state-of-science introductory section but many of the same detailed description of the mechanisms of GEOS EnADAS. We are glad to report that a few of the desirable Future Works listed in Release 1.0 have now been added to the present version of GEOS EnADAS. These include the ability to exercise an Ensemble Prediction System that uses the ensemble analyses of GEOS EnADAS and (a very early, but functional version of) a tool to support Ensemble Forecast Sensitivity and Observation Impact applications.
Quantifying predictability in a model with statistical features of the atmosphere
Kleeman, Richard; Majda, Andrew J.; Timofeyev, Ilya
2002-01-01
The Galerkin truncated inviscid Burgers equation has recently been shown by the authors to be a simple model with many degrees of freedom, with many statistical properties similar to those occurring in dynamical systems relevant to the atmosphere. These properties include long time-correlated, large-scale modes of low frequency variability and short time-correlated “weather modes” at smaller scales. The correlation scaling in the model extends over several decades and may be explained by a simple theory. Here a thorough analysis of the nature of predictability in the idealized system is developed by using a theoretical framework developed by R.K. This analysis is based on a relative entropy functional that has been shown elsewhere by one of the authors to measure the utility of statistical predictions precisely. The analysis is facilitated by the fact that most relevant probability distributions are approximately Gaussian if the initial conditions are assumed to be so. Rather surprisingly this holds for both the equilibrium (climatological) and nonequilibrium (prediction) distributions. We find that in most cases the absolute difference in the first moments of these two distributions (the “signal” component) is the main determinant of predictive utility variations. Contrary to conventional belief in the ensemble prediction area, the dispersion of prediction ensembles is generally of secondary importance in accounting for variations in utility associated with different initial conditions. This conclusion has potentially important implications for practical weather prediction, where traditionally most attention has focused on dispersion and its variability. PMID:12429863
Operational hydrological forecasting in Bavaria. Part II: Ensemble forecasting
NASA Astrophysics Data System (ADS)
Ehret, U.; Vogelbacher, A.; Moritz, K.; Laurent, S.; Meyer, I.; Haag, I.
2009-04-01
In part I of this study, the operational flood forecasting system in Bavaria and an approach to identify and quantify forecast uncertainty were introduced. The approach is split into the calculation of an empirical 'overall error' from archived forecasts and the calculation of an empirical 'model error' based on hydrometeorological forecast tests, where rainfall observations were used instead of forecasts. Especially in upstream catchments, where forecast uncertainty depends strongly on the current predictability of the atmosphere, the 'model error' can be superimposed on the spread of a hydrometeorological ensemble forecast. In Bavaria, two meteorological ensemble prediction systems are currently tested for operational use: the 16-member COSMO-LEPS forecast and a poor man's ensemble composed of DWD GME, DWD Cosmo-EU, NCEP GFS, Aladin-Austria, MeteoSwiss Cosmo-7. The determination of the overall forecast uncertainty is dependent on the catchment characteristics:
1. Upstream catchment with high influence of weather forecast
a) A hydrological ensemble forecast is calculated using each of the meteorological forecast members as forcing.
b) Corresponding to the characteristics of the meteorological ensemble forecast, each resulting forecast hydrograph can be regarded as equally likely.
c) The 'model error' distribution, with parameters dependent on hydrological case and lead time, is added to each forecast timestep of each ensemble member.
d) For each forecast timestep, the overall error distribution (i.e. over the 'model error' distributions of all ensemble members) is calculated.
e) From this distribution, the uncertainty range on a desired level (here: the 10% and 90% percentile) is extracted and drawn as forecast envelope.
f) As the mean or median of an ensemble forecast does not necessarily exhibit meteorologically sound temporal evolution, a single hydrological forecast termed 'lead forecast' is chosen and shown in addition to the uncertainty bounds. This can be either an intermediate forecast between the extremes of the ensemble spread or a manually selected forecast based on a meteorologist's advice.
2. Downstream catchments with low influence of weather forecast
In downstream catchments with strong human impact on discharge (e.g. by reservoir operation) and large influence of upstream gauge observation quality on forecast quality, the 'overall error' may in most cases be larger than the combination of the 'model error' and an ensemble spread. Therefore, the overall forecast uncertainty bounds are calculated differently:
a) A hydrological ensemble forecast is calculated using each of the meteorological forecast members as forcing. Here, additionally the corresponding inflow hydrograph from all upstream catchments must be used.
b) As for an upstream catchment, the uncertainty range is determined by combination of 'model error' and the ensemble member forecasts.
c) In addition, the 'overall error' is superimposed on the 'lead forecast'. For reasons of consistency, the lead forecast must be based on the same meteorological forecast in the downstream and all upstream catchments.
d) From the resulting two uncertainty ranges (one from the ensemble forecast and 'model error', one from the 'lead forecast' and 'overall error'), the envelope is taken as the most prudent uncertainty range.
In sum, the uncertainty associated with each forecast run is calculated and communicated to the public in the form of 10% and 90% percentiles. As in part I of this study, the methodology as well as the usefulness or uselessness of the resulting uncertainty ranges will be presented and discussed by typical examples.
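Steps c) to e) of the upstream procedure can be sketched for a single forecast timestep. This is a minimal illustration under stated assumptions: the abstract describes an empirical 'model error' distribution, whereas the sketch substitutes a Gaussian with hypothetical mean and standard deviation, treats every ensemble member as equally likely, and finds the 10% and 90% percentiles of the resulting mixture by bisection. All names and discharge values are illustrative.

```python
from statistics import NormalDist

def mixture_quantile(members, error_std, prob, error_mean=0.0):
    """Quantile of an equal-weight mixture of per-member error distributions."""
    components = [NormalDist(m + error_mean, error_std) for m in members]

    def cdf(x):
        return sum(c.cdf(x) for c in components) / len(components)

    # Bracket the quantile well outside the ensemble range, then bisect.
    lo = min(members) + error_mean - 10.0 * error_std
    hi = max(members) + error_mean + 10.0 * error_std
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def forecast_envelope(members, error_std, error_mean=0.0, probs=(0.10, 0.90)):
    """Lower and upper uncertainty bounds for one forecast timestep."""
    return tuple(
        mixture_quantile(members, error_std, p, error_mean) for p in probs
    )

# Hypothetical discharge forecasts (m3/s) of five ensemble members.
lower, upper = forecast_envelope([80.0, 95.0, 100.0, 110.0, 130.0], error_std=10.0)
```

Repeating the call for every timestep, with error parameters that depend on lead time, would trace out the forecast envelope described in step e).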
DART: New Research Using Ensemble Data Assimilation in Geophysical Models
NASA Astrophysics Data System (ADS)
Hoar, T. J.; Raeder, K.
2015-12-01
The Data Assimilation Research Testbed (DART) is a community facility for ensemble data assimilation developed and supported by the National Center for Atmospheric Research. DART provides a comprehensive suite of software, documentation, and tutorials that can be used for ensemble data assimilation research, operations, and education. Scientists and software engineers at NCAR are available to support DART users who want to use existing DART products or develop their own applications. Current DART users range from university professors teaching data assimilation, to individual graduate students working with simple models, through national laboratories doing operational prediction with large state-of-the-art models. DART runs efficiently on many computational platforms ranging from laptops through thousands of cores on the newest supercomputers. This poster focuses on several recent research activities using DART with geophysical models. Using CAM/DART to understand whether OCO-2 Total Precipitable Water observations can be useful in numerical weather prediction. Impacts of the synergistic use of Infra-red CO retrievals (MOPITT, IASI) in CAM-CHEM/DART assimilations. Assimilation and Analysis of Observations of Amazonian Biomass Burning Emissions by MOPITT (aerosol optical depth), MODIS (carbon monoxide) and MISR (plume height). Long term evaluation of the chemical response of MOPITT-CO assimilation in CAM-CHEM/DART OSSEs for satellite planning and emission inversion capabilities. Improved forward observation operators for land models that have multiple land use/land cover segments in a single grid cell. Simulating mesoscale convective systems (MCSs) using a variable resolution, unstructured grid in the Model for Prediction Across Scales (MPAS) and DART. The mesoscale WRF+DART system generated an ensemble of year-long, real-time initializations of a convection allowing model over the United States. Constraining WACCM with observations in the tropical band (30S-30N) using DART also constrains the polar stratosphere during the same winter. Assimilation of MOPITT carbon monoxide Compact Phase Space Retrievals (CPSR) in WRF-Chem/DART. Future work: DART interface to the CICE (CESM) sea ice model. Fully coupled assimilations in CESM.
NASA Astrophysics Data System (ADS)
Ge, Cui; Wang, Jun; Reid, Jeffrey S.; Posselt, Derek J.; Xian, Peng; Hyer, Edward
2017-05-01
Atmospheric transport of smoke from the equatorial Southeast Asian Maritime Continent (Indonesia, Singapore, and Malaysia) to the Philippines was recently verified by the first-ever measurement of aerosol composition in the region of the Sulu Sea from a research vessel named Vasco. However, numerical modeling of such transport can have large uncertainties due to the lack of observations for parameterization schemes and for describing fire emission and meteorology in this region. These uncertainties are analyzed here, for the first time, with an ensemble of 24 Weather Research and Forecasting model with Chemistry (WRF-Chem) simulations. The ensemble reproduces the time series of surface non-sea-salt PM2.5 concentrations observed from the Vasco vessel during 17-30 September 2011 and overall agrees with satellite (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) and Moderate Resolution Imaging Spectroradiometer (MODIS)) and Aerosol Robotic Network (AERONET) data. The difference in meteorology between the National Centers for Environmental Prediction (NCEP's) Final (FNL) and the European Centre for Medium-Range Weather Forecasts (ECMWF's) ERA renders the biggest spread in the ensemble (up to 20 μg m⁻³ or 200% in surface PM2.5), with FNL showing systematically superior results. The second biggest uncertainty is from fire emissions; the 2 day maximum Fire Locating and Modelling of Burning Emissions (FLAMBE) emission is superior to the instantaneous one. While Grell-Devenyi (G3) and Betts-Miller-Janjić cumulus schemes only produce a difference of 3 μg m⁻³ of surface PM2.5 over the Sulu Sea, the ensemble mean agrees best with Climate Prediction Center (CPC) MORPHing (CMORPH)'s spatial distribution of precipitation. Simulation with FNL-G3, 2 day maximum FLAMBE, and 800 m injection height outperforms other ensemble members.
Finally, the global transport model (Navy Aerosol Analysis and Prediction System (NAAPS)) outperforms all WRF-Chem simulations in describing smoke transport on 20 September 2011, which points to the challenges of modeling tropical meteorology at mesoscale and finer scales.
eHive: an artificial intelligence workflow system for genomic analysis.
Severin, Jessica; Beal, Kathryn; Vilella, Albert J; Fitzgerald, Stephen; Schuster, Michael; Gordon, Leo; Ureta-Vidal, Abel; Flicek, Paul; Herrero, Javier
2010-05-11
The Ensembl project produces updates to its comparative genomics resources with each of its several releases per year. During each release cycle approximately two weeks are allocated to generate all the genomic alignments and the protein homology predictions. The number of calculations required for this task grows approximately quadratically with the number of species. We currently support 50 species in Ensembl and we expect the number to continue to grow in the future. We present eHive, a new fault tolerant distributed processing system initially designed to support comparative genomic analysis, based on blackboard systems, network distributed autonomous agents, dataflow graphs and block-branch diagrams. In the eHive system a MySQL database serves as the central blackboard and the autonomous agent, a Perl script, queries the system and runs jobs as required. The system allows us to define dataflow and branching rules to suit all our production pipelines. We describe the implementation of three pipelines: (1) pairwise whole genome alignments, (2) multiple whole genome alignments and (3) gene trees with protein homology inference. Finally, we show the efficiency of the system in real case scenarios. eHive allows us to produce computationally demanding results in a reliable and efficient way with minimal supervision and high throughput. Further documentation is available at: http://www.ensembl.org/info/docs/eHive/.
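The quadratic growth mentioned above follows directly from counting species pairs. As a small worked example, assume (for illustration only) that one pairwise alignment job is needed per unordered pair of species; the real Ensembl pipelines may select a subset of pairs.

```python
from math import comb

def pairwise_alignments(n_species):
    """Number of unordered species pairs: n * (n - 1) / 2."""
    return comb(n_species, 2)

for n in (25, 50, 100):
    # Doubling the species count roughly quadruples the number of pairs.
    print(n, pairwise_alignments(n))
```

Under this assumption the 50 species named in the abstract give 1225 pairs, and doubling to 100 species gives 4950.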
Yang, Wan; Karspeck, Alicia; Shaman, Jeffrey
2014-01-01
A variety of filtering methods enable the recursive estimation of system state variables and inference of model parameters. These methods have found application in a range of disciplines and settings, including engineering design and forecasting, and, over the last two decades, have been applied to infectious disease epidemiology. For any system of interest, the ideal filter depends on the nonlinearity and complexity of the model to which it is applied, the quality and abundance of observations being entrained, and the ultimate application (e.g. forecast, parameter estimation, etc.). Here, we compare the performance of six state-of-the-art filter methods when used to model and forecast influenza activity. Three particle filters—a basic particle filter (PF) with resampling and regularization, maximum likelihood estimation via iterated filtering (MIF), and particle Markov chain Monte Carlo (pMCMC)—and three ensemble filters—the ensemble Kalman filter (EnKF), the ensemble adjustment Kalman filter (EAKF), and the rank histogram filter (RHF)—were used in conjunction with a humidity-forced susceptible-infectious-recovered-susceptible (SIRS) model and weekly estimates of influenza incidence. The modeling frameworks, first validated with synthetic influenza epidemic data, were then applied to fit and retrospectively forecast the historical incidence time series of seven influenza epidemics during 2003–2012, for 115 cities in the United States. Results suggest that when using the SIRS model the ensemble filters and the basic PF are more capable of faithfully recreating historical influenza incidence time series, while the MIF and pMCMC do not perform as well for multimodal outbreaks. For forecast of the week with the highest influenza activity, the accuracies of the six model-filter frameworks are comparable; the three particle filters perform slightly better predicting peaks 1–5 weeks in the future; the ensemble filters are more accurate predicting peaks in the past. 
PMID:24762780
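To make the ensemble-filter idea in the abstract above concrete, here is a minimal sketch of one perturbed-observation EnKF update for a single, directly observed scalar state (for example, an incidence estimate). It is not the authors' implementation: the SIRS model, the humidity forcing, and the EAKF and RHF variants are omitted, and all names and numbers are hypothetical.

```python
import random
from statistics import variance

def enkf_update(ensemble, observation, obs_var, seed=0):
    """Perturbed-observation EnKF update with observation operator H = 1."""
    rng = random.Random(seed)
    prior_var = variance(ensemble)              # sample variance of the forecast
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member is nudged toward its own noisy copy of the observation.
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]

# Hypothetical forecast ensemble of weekly incidence and one observation.
forecast = [10.0, 12.0, 14.0, 16.0]
analysis = enkf_update(forecast, observation=20.0, obs_var=4.0)
```

The gain approaches 1 when the observation is much more certain than the forecast, pulling every member onto the observation, and approaches 0 in the opposite case, leaving the forecast unchanged.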
Dong, Chengliang; Wei, Peng; Jian, Xueqiu; Gibbs, Richard; Boerwinkle, Eric; Wang, Kai; Liu, Xiaoming
2015-01-01
Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthologous approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database. PMID:25552646
Forced synchronization of large-scale circulation to increase predictability of surface states
NASA Astrophysics Data System (ADS)
Shen, Mao-Lin; Keenlyside, Noel; Selten, Frank; Wiegerinck, Wim; Duane, Gregory
2016-04-01
Numerical models are key tools in the projection of future climate change. The lack of perfect initial conditions and perfect knowledge of the laws of physics, as well as inherent chaotic behavior, limits predictions. Conceptually, the atmospheric variables can be decomposed into a predictable component (signal) and an unpredictable component (noise). In ensemble prediction the anomaly of the ensemble mean is regarded as the signal and the ensemble spread as the noise. Naturally the prediction skill will be higher if the signal-to-noise ratio (SNR) is larger in the initial conditions. We run two ensemble experiments in order to explore a way to increase the SNR of surface winds and temperature. One ensemble experiment is an AGCM with prescribed sea surface temperature (SST); the other is an AGCM with both prescribed SST and high-level temperature and winds nudged to ERA-Interim. Each ensemble has 30 members. Larger SNR is expected and found over the tropical ocean in the first experiment because the tropical circulation is associated with convection and the associated surface wind convergence, which are to a large extent driven by the SST. However, small SNR is found over high-latitude ocean and land surface due to the chaotic and non-synchronized atmospheric states. In the second experiment the higher-level temperature and winds are forced to be synchronized (nudged to reanalysis) and hence a larger SNR of surface winds and temperature is expected. Furthermore, different nudging coefficients are also tested in order to understand the limitations of synchronizing both the large-scale circulation and the surface states. These experiments will be useful for developing strategies to synchronize the 3-D states of atmospheric models that can later be used to build a super model.
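The signal/noise decomposition described in this abstract can be illustrated with a minimal sketch. It assumes one common convention, which the abstract does not spell out: the signal is the variance over time of the ensemble mean, and the noise is the across-member variance averaged over time. The function name `signal_to_noise` and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def signal_to_noise(ensemble):
    """Estimate SNR from ensemble[m][t], the value of member m at time t.

    Assumed convention (not taken from the abstract):
      signal = variance over time of the ensemble mean
      noise  = across-member variance, averaged over time
    """
    n_times = len(ensemble[0])
    # Ensemble mean at each time step.
    ens_mean = [mean([member[t] for member in ensemble]) for t in range(n_times)]
    # Signal: how much the ensemble mean varies in time.
    signal_var = pvariance(ens_mean)
    # Noise: spread of the members about the ensemble mean, averaged over time.
    noise_var = mean(
        [pvariance([member[t] for member in ensemble]) for t in range(n_times)]
    )
    return signal_var / noise_var  # undefined if all members are identical


# Two toy 2-member ensembles with the same ensemble mean [2, 4]:
loose = [[1.0, 3.0], [3.0, 5.0]]   # members far apart
tight = [[1.9, 4.1], [2.1, 3.9]]   # members close together (e.g. nudged)
print(signal_to_noise(loose), signal_to_noise(tight))
```

In this toy case the tightly clustered ensemble yields the larger ratio, which is the effect the nudging experiment is designed to test.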
A hybrid neurogenetic approach for stock forecasting.
Kwon, Yung-Keun; Moon, Byung-Ro
2007-05-01
In this paper, we propose a hybrid neurogenetic system for stock trading. A recurrent neural network (NN) having one hidden layer is used for the prediction model. The input features are generated from a number of technical indicators being used by financial experts. The genetic algorithm (GA) optimizes the NN's weights under a 2-D encoding and crossover. We devised a context-based ensemble method of NNs which dynamically changes on the basis of the test day's context. To reduce the time in processing mass data, we parallelized the GA on a Linux cluster system using message passing interface. We tested the proposed method with 36 companies in NYSE and NASDAQ for 13 years from 1992 to 2004. The neurogenetic hybrid showed notable improvement on the average over the buy-and-hold strategy and the context-based ensemble further improved the results. We also observed that some companies were more predictable than others, which implies that the proposed neurogenetic hybrid can be used for financial portfolio construction.
Meta-heuristic CRPS minimization for the calibration of short-range probabilistic forecasts
NASA Astrophysics Data System (ADS)
Mohammadi, Seyedeh Atefeh; Rahmani, Morteza; Azadi, Majid
2016-08-01
This paper deals with the probabilistic short-range temperature forecasts over synoptic meteorological stations across Iran using non-homogeneous Gaussian regression (NGR). NGR creates a Gaussian forecast probability density function (PDF) from the ensemble output. The mean of the normal predictive PDF is a bias-corrected weighted average of the ensemble members and its variance is a linear function of the raw ensemble variance. The coefficients for the mean and variance are estimated by minimizing the continuous ranked probability score (CRPS) during a training period. CRPS is a scoring rule for distributional forecasts. In the paper of Gneiting et al. (Mon Weather Rev 133:1098-1118, 2005), Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is used to minimize the CRPS. Since BFGS is a conventional optimization method with its own limitations, we suggest using the particle swarm optimization (PSO), a robust meta-heuristic method, to minimize the CRPS. The ensemble prediction system used in this study consists of nine different configurations of the weather research and forecasting model for 48-h forecasts of temperature during autumn and winter 2011 and 2012. The probabilistic forecasts were evaluated using several common verification scores including Brier score, attribute diagram and rank histogram. Results show that both BFGS and PSO find the optimal solution and show the same evaluation scores, but PSO can do this with a feasible random first guess and much less computational complexity.
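As a rough illustration of the quantities named in this abstract, the sketch below builds the NGR predictive mean and standard deviation from an ensemble and evaluates the closed-form CRPS of a Gaussian forecast. The coefficient names `a`, `b`, `c`, `d`, the function names, and the use of the population variance for the raw ensemble variance are assumptions made for illustration; the optimization step (BFGS or PSO over the training period) is not shown.

```python
import math


def ngr_parameters(members, a, b, c, d):
    """Return (mu, sigma) of the NGR predictive Gaussian.

    mu      = a + sum_k b[k] * members[k]   (bias-corrected weighted average)
    sigma^2 = c + d * S^2                   (linear in raw ensemble variance)
    S^2 is computed here as the population variance (an assumption).
    """
    n = len(members)
    mu = a + sum(bk * xk for bk, xk in zip(b, members))
    ens_mean = sum(members) / n
    s2 = sum((x - ens_mean) ** 2 for x in members) / n
    return mu, math.sqrt(c + d * s2)


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of the forecast N(mu, sigma^2) for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical three-member ensemble with equal weights and no bias term.
mu, sigma = ngr_parameters([1.0, 2.0, 3.0], a=0.0, b=[1 / 3] * 3, c=0.0, d=1.0)
print(mu, sigma, crps_gaussian(mu, sigma, y=2.5))
```

Averaging `crps_gaussian` over the training cases gives the objective that an optimizer would minimize with respect to the four coefficient groups.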
NASA Astrophysics Data System (ADS)
Wood, A. W.; Clark, E.; Mendoza, P. A.; Nijssen, B.; Newman, A. J.; Clark, M. P.; Arnold, J.; Nowak, K. C.
2016-12-01
Many if not most national operational short-to-medium range streamflow prediction systems rely on a forecaster-in-the-loop approach in which some parts of the forecast workflow are automated, but others require the hands-on effort of an experienced human forecaster. This approach evolved out of the need to correct for deficiencies in the models and datasets that were available for forecasting, and often leads to skillful predictions despite the use of relatively simple, conceptual models. On the other hand, the process is not reproducible, which limits opportunities to assess and incorporate process variations, and the effort required to make forecasts in this way is an obstacle to expanding forecast services - e.g., through adding new forecast locations or more frequent forecast updates, running more complex models, or producing forecast ensembles and hindcasts that can support verification. In the last decade, the hydrologic forecasting community has begun to develop more centralized, 'over-the-loop' systems. The quality of these new forecast products will depend on their ability to leverage research in areas including earth system modeling, parameter estimation, data assimilation, statistical post-processing, weather and climate prediction, verification, and uncertainty estimation through the use of ensembles. Currently, the operational streamflow forecasting and water management communities have little experience with the strengths and weaknesses of over-the-loop approaches, even as the systems are being rolled out in major operational forecasting centers. There is thus a need both to evaluate these forecasting advances and to demonstrate their potential in a public arena, raising awareness in forecast user communities and development programs alike.
To address this need, the National Center for Atmospheric Research is collaborating with the University of Washington, the Bureau of Reclamation and the US Army Corps of Engineers, using the NCAR 'System for Hydromet Analysis, Research, and Prediction' (SHARP) to implement, assess and demonstrate real-time over-the-loop forecasts. We present early hindcast and verification results from SHARP for short to medium range streamflow forecasts in a number of US case study watersheds.
Competitive Learning Neural Network Ensemble Weighted by Predicted Performance
ERIC Educational Resources Information Center
Ye, Qiang
2010-01-01
Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…
Raposo, Letícia M; Nobre, Flavio F
2017-08-30
Resistance to antiretrovirals (ARVs) is a major problem faced by HIV-infected individuals. Different rule-based algorithms were developed to infer HIV-1 susceptibility to antiretrovirals from genotypic data. However, there is discordance between them, resulting in difficulties for clinical decisions about which treatment to use. Here, we developed ensemble classifiers integrating three interpretation algorithms: Agence Nationale de Recherche sur le SIDA (ANRS), Rega, and the genotypic resistance interpretation system from Stanford HIV Drug Resistance Database (HIVdb). Three approaches were applied to develop a classifier with a single resistance profile: stacked generalization, a simple plurality vote scheme and the selection of the interpretation system with the best performance. The strategies were compared with the Friedman's test and the performance of the classifiers was evaluated using the F-measure, sensitivity and specificity values. We found that the three strategies had similar performances for the selected antiretrovirals. For some cases, the stacking technique with naïve Bayes as the learning algorithm showed a statistically superior F-measure. This study demonstrates that ensemble classifiers can be an alternative tool for clinical decision-making since they provide a single resistance profile from the most commonly used resistance interpretation systems.
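The simple plurality vote scheme mentioned in this abstract can be sketched as follows. The labels "S", "I" and "R" (susceptible, intermediate, resistant) and the tie-breaking rule are assumptions made for illustration; the abstract does not say how ties between the three interpretation systems were resolved.

```python
from collections import Counter


def plurality_vote(calls, tie_break="R"):
    """Combine the calls of several interpretation systems into one profile.

    calls     -- one label per system, e.g. ["R", "S", "R"]
    tie_break -- label returned when no single label has the most votes
                 (hypothetical rule: fall back to the most cautious call)
    """
    counts = Counter(calls).most_common()
    top_label, top_count = counts[0]
    # Collect every label that shares the highest vote count.
    tied = [label for label, count in counts if count == top_count]
    return top_label if len(tied) == 1 else tie_break


# Hypothetical calls from three systems for one drug.
print(plurality_vote(["R", "R", "S"]))  # two of three systems agree
print(plurality_vote(["S", "I", "R"]))  # three-way tie, falls back to tie_break
```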
NASA Astrophysics Data System (ADS)
Efthimiou, G. C.; Andronopoulos, S.; Bartzis, J. G.
2018-02-01
One of the key issues of recent research on the dispersion inside complex urban environments is the ability to predict dosage-based parameters from the puff release of an airborne material from a point source in the atmospheric boundary layer inside the built-up area. The present work addresses the question of whether the computational fluid dynamics (CFD)-Reynolds-averaged Navier-Stokes (RANS) methodology can be used to predict ensemble-average dosage-based parameters that are related with the puff dispersion. RANS simulations with the ADREA-HF code were, therefore, performed, where a single puff was released in each case. The present method is validated against the data sets from two wind-tunnel experiments. In each experiment, more than 200 puffs were released from which ensemble-averaged dosage-based parameters were calculated and compared to the model's predictions. The performance of the model was evaluated using scatter plots and three validation metrics: fractional bias, normalized mean square error, and factor of two. The model presented a better performance for the temporal parameters (i.e., ensemble-average times of puff arrival, peak, leaving, duration, ascent, and descent) than for the ensemble-average dosage and peak concentration. The majority of the obtained values of validation metrics were inside established acceptance limits. Based on the obtained model performance indices, the CFD-RANS methodology as implemented in the code ADREA-HF is able to predict the ensemble-average temporal quantities related to transient emissions of airborne material in urban areas within the range of the model performance acceptance criteria established in the literature. The CFD-RANS methodology as implemented in the code ADREA-HF is also able to predict the ensemble-average dosage, but the dosage results should be treated with some caution; as in one case, the observed ensemble-average dosage was under-estimated slightly more than the acceptance criteria. 
Ensemble-average peak concentration was systematically underpredicted by the model to a degree higher than that allowed by the acceptance criteria in 1 of the 2 wind-tunnel experiments. The model performance depended on the positions of the examined sensors in relation to the emission source and the building configuration. The work presented in this paper was carried out (partly) within the scope of COST Action ES1006 "Evaluation, improvement, and guidance for the use of local-scale emergency prediction and response tools for airborne hazards in built environments".
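The three validation metrics named in this abstract have widely used definitions in dispersion-model evaluation, sketched below. The sign convention for fractional bias (positive when the model underpredicts) and the function name are assumptions; the abstract gives neither the formulas nor the acceptance limits, so no thresholds are encoded here.

```python
def validation_metrics(observed, predicted):
    """Return (FB, NMSE, FAC2) for paired positive observed/predicted values.

    FB   = (mean_o - mean_p) / (0.5 * (mean_o + mean_p))
    NMSE = mean((o - p)^2) / (mean_o * mean_p)
    FAC2 = fraction of pairs with 0.5 <= p / o <= 2
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    fb = (mean_o - mean_p) / (0.5 * (mean_o + mean_p))
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    nmse = mse / (mean_o * mean_p)
    within = sum(1 for o, p in zip(observed, predicted) if 0.5 <= p / o <= 2.0)
    return fb, nmse, within / n


# Hypothetical ensemble-average dosages at two sensors.
print(validation_metrics([2.0, 2.0], [1.0, 5.0]))
```

A perfect model gives FB = 0, NMSE = 0 and FAC2 = 1, which is a convenient sanity check before applying any acceptance criteria.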
CABS-flex predictions of protein flexibility compared with NMR ensembles
Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian
2014-01-01
Motivation: Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Results: Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. Availability and implementation: The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. Contact: sekmi@chem.uw.edu.pl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24735558
CABS-flex predictions of protein flexibility compared with NMR ensembles.
Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian
2014-08-01
Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. sekmi@chem.uw.edu.pl Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Error Estimation of An Ensemble Statistical Seasonal Precipitation Prediction Model
NASA Technical Reports Server (NTRS)
Shen, Samuel S. P.; Lau, William K. M.; Kim, Kyu-Myong; Li, Gui-Long
2001-01-01
This NASA Technical Memorandum describes an optimal ensemble canonical correlation forecasting model for seasonal precipitation. Each individual forecast is based on canonical correlation analysis (CCA) in the spectral spaces whose bases are empirical orthogonal functions (EOF). The optimal weights in the ensemble forecasting crucially depend on the mean square error of each individual forecast. An estimate of the mean square error of a CCA prediction is also made using the spectral method. The error is decomposed onto EOFs of the predictand and decreases linearly according to the correlation between the predictor and predictand. Since the new CCA scheme is derived for continuous fields of predictor and predictand, an area-factor is automatically included. Thus our model is an improvement of the spectral CCA scheme of Barnett and Preisendorfer. The improvements include (1) the use of an area-factor, (2) the estimation of prediction error, and (3) the optimal ensemble of multiple forecasts. The new CCA model is applied to the seasonal forecasting of the United States (US) precipitation field. The predictor is the sea surface temperature (SST). The US Climate Prediction Center's reconstructed SST is used as the predictor's historical data. The US National Center for Environmental Prediction's optimally interpolated precipitation (1951-2000) is used as the predictand's historical data. Our forecast experiments show that the new ensemble canonical correlation scheme renders a reasonable forecasting skill. For example, when using September-October-November SST to predict the next season's December-January-February precipitation, the spatial pattern correlations between the observed and predicted fields are positive in 46 of the 50 years of experiments. The positive correlations are close to or greater than 0.4 in 29 years, which indicates excellent performance of the forecasting model. The forecasting skill can be further enhanced when several predictors are used.
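To illustrate why the ensemble weights depend on each forecast's mean square error, here is a minimal sketch of inverse-MSE weighting. It assumes the individual forecasts are unbiased with mutually uncorrelated errors, in which case weights proportional to 1/MSE (normalized to sum to one) minimize the MSE of the combined forecast. The memorandum's own derivation may differ; the function names and numbers are hypothetical.

```python
def inverse_mse_weights(mse):
    """Weights w_i = (1 / mse_i) / sum_j (1 / mse_j).

    Optimal only under the stated assumption of unbiased forecasts
    with uncorrelated errors.
    """
    inverse = [1.0 / e for e in mse]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(forecasts, weights):
    """Weighted ensemble forecast."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Two hypothetical CCA forecasts of the same quantity with estimated MSEs.
weights = inverse_mse_weights([1.0, 3.0])
print(weights, combine([10.0, 14.0], weights))
```

The forecast with the smaller estimated error receives the larger weight, so an accurate error estimate for each CCA prediction is what makes the weighting meaningful.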
A deep learning-based multi-model ensemble method for cancer prediction.
Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong
2018-01-01
Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
Prediction of conformationally dependent atomic multipole moments in carbohydrates
Cardamone, Salvatore
2015-01-01
The conformational flexibility of carbohydrates is challenging within the field of computational chemistry. This flexibility causes the electron density to change, which leads to fluctuating atomic multipole moments. Quantum Chemical Topology (QCT) allows for the partitioning of an “atom in a molecule,” thus localizing electron density to finite atomic domains, which permits the unambiguous evaluation of atomic multipole moments. By selecting an ensemble of physically realistic conformers of a chemical system, one evaluates the various multipole moments at defined points in configuration space. The subsequent implementation of the machine learning method kriging delivers the evaluation of an analytical function, which smoothly interpolates between these points. This allows for the prediction of atomic multipole moments at new points in conformational space, not trained for but within prediction range. In this work, we demonstrate that the carbohydrates erythrose and threose are amenable to the above methodology. We investigate how kriging models respond when the training ensemble incorporates multiple energy minima and their environment in conformational space. Additionally, we evaluate the gains in predictive capacity of our models as the size of the training ensemble increases. We believe this approach to be entirely novel within the field of carbohydrates. For a modest training set size of 600, more than 90% of the external test configurations have an error in the total (predicted) electrostatic energy (relative to ab initio) of at most 1 kJ mol⁻¹ for open chains and just over 90% an error of at most 4 kJ mol⁻¹ for rings. © 2015 Wiley Periodicals, Inc. PMID:26547500
Prediction of conformationally dependent atomic multipole moments in carbohydrates.
Cardamone, Salvatore; Popelier, Paul L A
2015-12-15
The conformational flexibility of carbohydrates is challenging within the field of computational chemistry. This flexibility causes the electron density to change, which leads to fluctuating atomic multipole moments. Quantum Chemical Topology (QCT) allows for the partitioning of an "atom in a molecule," thus localizing electron density to finite atomic domains, which permits the unambiguous evaluation of atomic multipole moments. By selecting an ensemble of physically realistic conformers of a chemical system, one evaluates the various multipole moments at defined points in configuration space. The subsequent implementation of the machine learning method kriging delivers the evaluation of an analytical function, which smoothly interpolates between these points. This allows for the prediction of atomic multipole moments at new points in conformational space, not trained for but within prediction range. In this work, we demonstrate that the carbohydrates erythrose and threose are amenable to the above methodology. We investigate how kriging models respond when the training ensemble incorporates multiple energy minima and their environment in conformational space. Additionally, we evaluate the gains in predictive capacity of our models as the size of the training ensemble increases. We believe this approach to be entirely novel within the field of carbohydrates. For a modest training set size of 600, more than 90% of the external test configurations have an error in the total (predicted) electrostatic energy (relative to ab initio) of at most 1 kJ mol⁻¹ for open chains and just over 90% an error of at most 4 kJ mol⁻¹ for rings. © 2015 Wiley Periodicals, Inc.
Ensemble forecast of human West Nile virus cases and mosquito infection rates
NASA Astrophysics Data System (ADS)
DeFelice, Nicholas B.; Little, Eliza; Campbell, Scott R.; Shaman, Jeffrey
2017-02-01
West Nile virus (WNV) is now endemic in the continental United States; however, our ability to predict spillover transmission risk and human WNV cases remains limited. Here we develop a model depicting WNV transmission dynamics, which we optimize using a data assimilation method and two observed data streams, mosquito infection rates and reported human WNV cases. The coupled model-inference framework is then used to generate retrospective ensemble forecasts of historical WNV outbreaks in Long Island, New York for 2001-2014. Accurate forecasts of mosquito infection rates are generated before peak infection, and >65% of forecasts accurately predict seasonal total human WNV cases up to 9 weeks before the past reported case. This work provides the foundation for implementation of a statistically rigorous system for real-time forecast of seasonal outbreaks of WNV.
Ensemble forecast of human West Nile virus cases and mosquito infection rates.
DeFelice, Nicholas B; Little, Eliza; Campbell, Scott R; Shaman, Jeffrey
2017-02-24
West Nile virus (WNV) is now endemic in the continental United States; however, our ability to predict spillover transmission risk and human WNV cases remains limited. Here we develop a model depicting WNV transmission dynamics, which we optimize using a data assimilation method and two observed data streams, mosquito infection rates and reported human WNV cases. The coupled model-inference framework is then used to generate retrospective ensemble forecasts of historical WNV outbreaks in Long Island, New York for 2001-2014. Accurate forecasts of mosquito infection rates are generated before peak infection, and >65% of forecasts accurately predict seasonal total human WNV cases up to 9 weeks before the past reported case. This work provides the foundation for implementation of a statistically rigorous system for real-time forecast of seasonal outbreaks of WNV.
Total probabilities of ensemble runoff forecasts
NASA Astrophysics Data System (ADS)
Olav Skøien, Jon; Bogner, Konrad; Salamon, Peter; Smith, Paul; Pappenberger, Florian
2017-04-01
Ensemble forecasting has a long history in meteorological modelling as an indication of the uncertainty of the forecasts. However, it is necessary to calibrate and post-process the ensembles as they often exhibit both bias and dispersion errors. Two of the most common methods for this are Bayesian Model Averaging (Raftery et al., 2005) and Ensemble Model Output Statistics (EMOS) (Gneiting et al., 2005). There are also methods for regionalizing these methods (Berrocal et al., 2007) and for incorporating the correlation between lead times (Hemri et al., 2013). Engeland and Steinsland (2014) developed a framework which can estimate post-processing parameters varying in space and time, while giving a spatially and temporally consistent output. However, their method is computationally complex for our larger number of stations, which makes it unsuitable for our purpose. Our post-processing method for the ensembles is developed in the framework of the European Flood Awareness System (EFAS - http://www.efas.eu), where we are making forecasts for the whole of Europe, based on observations from around 700 catchments. As the target is flood forecasting, we are also more interested in improving the forecast skill for high flows than in a good prediction of the entire flow regime. EFAS uses a combination of ensemble forecasts and deterministic forecasts from different meteorological forecasters to force a distributed hydrologic model and to compute runoff ensembles for each river pixel within the model domain. Instead of showing the mean and the variability of each forecast ensemble individually, we will now post-process all model outputs to estimate the total probability, the post-processed mean and uncertainty of all ensembles. The post-processing parameters are first calibrated for each calibration location, but we are adding a spatial penalty in the calibration process to force a spatial correlation of the parameters.
The penalty takes distance, stream-connectivity and size of the catchment areas into account. This can in some cases have a slight negative impact on the calibration error, but avoids large differences between parameters of nearby locations, whether stream connected or not. The spatial calibration also makes it easier to interpolate the post-processing parameters to uncalibrated locations. We also look into different methods for handling the non-normal distributions of runoff data and the effect of different data transformations on forecasts skills in general and for floods in particular. Berrocal, V. J., Raftery, A. E. and Gneiting, T.: Combining Spatial Statistical and Ensemble Information in Probabilistic Weather Forecasts, Mon. Weather Rev., 135(4), 1386-1402, doi:10.1175/MWR3341.1, 2007. Engeland, K. and Steinsland, I.: Probabilistic postprocessing models for flow forecasts for a system of catchments and several lead times, Water Resour. Res., 50(1), 182-197, doi:10.1002/2012WR012757, 2014. Gneiting, T., Raftery, A. E., Westveld, A. H. and Goldman, T.: Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation, Mon. Weather Rev., 133(5), 1098-1118, doi:10.1175/MWR2904.1, 2005. Hemri, S., Fundel, F. and Zappa, M.: Simultaneous calibration of ensemble river flow predictions over an entire range of lead times, Water Resour. Res., 49(10), 6744-6755, doi:10.1002/wrcr.20542, 2013. Raftery, A. E., Gneiting, T., Balabdaoui, F. and Polakowski, M.: Using Bayesian Model Averaging to Calibrate Forecast Ensembles, Mon. Weather Rev., 133(5), 1155-1174, doi:10.1175/MWR2906.1, 2005.
Changing precipitation in western Europe, climate change or natural variability?
NASA Astrophysics Data System (ADS)
Aalbers, Emma; Lenderink, Geert; van Meijgaard, Erik; van den Hurk, Bart
2017-04-01
Multi-model RCM-GCM ensembles provide high-resolution climate projections, valuable for, among other things, climate impact assessment studies. While the application of multiple models (both GCMs and RCMs) provides a certain robustness with respect to model uncertainty, the interpretation of differences between ensemble members - the combined result of model uncertainty and natural variability of the climate system - is not straightforward. Natural variability is intrinsic to the climate system, and a potentially large source of uncertainty in climate change projections, especially for projections on the local to regional scale. To quantify the natural variability and get a robust estimate of the forced climate change response (given a certain model and forcing scenario), large ensembles of climate model simulations of the same model provide essential information. While for global climate models (GCMs) a number of such large single-model ensembles exist and have been analyzed, for regional climate models (RCMs) the number and size of single-model ensembles is limited, and the predictability of the forced climate response at the local to regional scale is still rather uncertain. We present a regional downscaling of a 16-member single-model ensemble over western Europe and the Alps at a resolution of 0.11 degrees (~12 km), similar to the highest-resolution EURO-CORDEX simulations. This 16-member ensemble was generated by the GCM EC-EARTH, which was downscaled with the RCM RACMO for the period 1951-2100. This single-model ensemble has been investigated in terms of the ensemble mean response (our estimate of the forced climate response), as well as the difference between the ensemble members, which measures natural variability. We focus on the response in seasonal mean and extreme precipitation (seasonal maxima and extremes with a return period up to 20 years) for the near to far future.
For most precipitation indices we can reliably determine the climate change signal, given the applied model chain and forcing scenario. However, the analysis also shows how limited the information in single ensemble members is on the local scale forced climate response, even for high levels of global warming when the forced response has emerged from natural variability. Analysis and application of multi-model ensembles like EURO-CORDEX should go hand-in-hand with single model ensembles, like the one presented here, to be able to correctly interpret the fine-scale information in terms of a forced signal and random noise due to natural variability.
Smith, Morgan E; Singh, Brajendra K; Irvine, Michael A; Stolk, Wilma A; Subramanian, Swaminathan; Hollingsworth, T Déirdre; Michael, Edwin
2017-03-01
Mathematical models of parasite transmission provide powerful tools for assessing the impacts of interventions. Owing to complexity and uncertainty, no single model may capture all features of transmission and elimination dynamics. Multi-model ensemble modelling offers a framework to help overcome biases of single models. We report on the development of a first multi-model ensemble of three lymphatic filariasis (LF) models (EPIFIL, LYMFASIM, and TRANSFIL), and evaluate its predictive performance in comparison with that of the constituents using calibration and validation data from three case study sites, one each from the three major LF endemic regions: Africa, Southeast Asia and Papua New Guinea (PNG). We assessed the performance of the respective models for predicting the outcomes of annual MDA strategies for various baseline scenarios thought to exemplify the current endemic conditions in the three regions. The results show that the constructed multi-model ensemble outperformed the single models when evaluated across all sites. Single models that best fitted calibration data tended to do less well in simulating the out-of-sample, or validation, intervention data. Scenario modelling results demonstrate that the multi-model ensemble is able to compensate for variance between single models in order to produce more plausible predictions of intervention impacts. Our results highlight the value of an ensemble approach to modelling parasite control dynamics. However, its optimal use will require further methodological improvements as well as consideration of the organizational mechanisms required to ensure that modelling results and data are shared effectively between all stakeholders. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Decadal prediction skill in the ocean with surface nudging in the IPSL-CM5A-LR climate model
NASA Astrophysics Data System (ADS)
Mignot, Juliette; García-Serrano, Javier; Swingedouw, Didier; Germe, Agathe; Nguyen, Sébastien; Ortega, Pablo; Guilyardi, Eric; Ray, Sulagna
2016-08-01
Two decadal prediction ensembles, based on the same climate model (IPSL-CM5A-LR) and the same surface nudging initialization strategy are analyzed and compared with a focus on upper-ocean variables in different regions of the globe. One ensemble consists of 3-member hindcasts launched every year since 1961 while the other ensemble benefits from 9 members but with start dates only every 5 years. Analysis includes anomaly correlation coefficients and root mean square errors computed against several reanalysis and gridded observational fields, as well as against the nudged simulation used to produce the hindcasts initial conditions. The last skill measure gives an upper limit of the predictability horizon one can expect in the forecast system, while the comparison with different datasets highlights uncertainty when assessing the actual skill. Results provide a potential prediction skill (verification against the nudged simulation) beyond the linear trend of the order of 10 years ahead at the global scale, but essentially associated with non-linear radiative forcings, in particular from volcanoes. At regional scale, we obtain 1 year in the tropical band, 10 years at midlatitudes in the North Atlantic and North Pacific, and 5 years at tropical latitudes in the North Atlantic, for both sea surface temperature (SST) and upper-ocean heat content. Actual prediction skill (verified against observational or reanalysis data) is overall more limited and less robust. Even so, large actual skill is found in the extratropical North Atlantic for SST and in the tropical to subtropical North Pacific for upper-ocean heat content. Results are analyzed with respect to the specific dynamics of the model and the way it is influenced by the nudging. The interplay between initialization and internal modes of variability is also analyzed for sea surface salinity. 
The study illustrates the importance of two key ingredients, both necessary for the success of future coordinated decadal prediction exercises: a high frequency of start dates is needed to achieve robust statistical significance, and a large ensemble size is required to increase the signal-to-noise ratio.
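As an illustration of the two skill measures named in this abstract, the sketch below computes an anomaly correlation coefficient and a root mean square error for one hindcast series. It is a minimal example and not the authors' code: the function names are hypothetical, anomalies are taken relative to each series' own time mean (an assumption; a separate climatology is often used instead), and the input numbers are made up.

```python
import math


def anomalies(series):
    """Subtract the series' own time mean (assumed reference climatology)."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def anomaly_correlation(hindcast, reference):
    """Correlation between hindcast anomalies and reference anomalies."""
    f = anomalies(hindcast)
    o = anomalies(reference)
    numerator = sum(a * b for a, b in zip(f, o))
    denominator = math.sqrt(sum(a * a for a in f) * sum(b * b for b in o))
    return numerator / denominator


def rmse(hindcast, reference):
    """Root mean square error between hindcast and reference values."""
    squared = [(a - b) ** 2 for a, b in zip(hindcast, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values: the hindcast has the right phase but half the amplitude.
hindcast = [1.0, 2.0, 3.0, 4.0]
reference = [2.0, 4.0, 6.0, 8.0]
acc = anomaly_correlation(hindcast, reference)
err = rmse(hindcast, reference)
```

In this toy case the correlation is perfect while the error is not zero, which is one reason the two measures are usually reported together.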
Seasonal Predictions with the GEOS GCM
NASA Technical Reports Server (NTRS)
Schubert, Siegfried; Chang, Yehui; Suarez, Max
1999-01-01
A number of ensembles of seasonal forecasts have recently been completed as part of NASA's Seasonal to Interannual Prediction Project (NSIPP). The focus is on the extratropical response of the atmosphere to observed sea surface temperature (SST) anomalies during boreal winter. The prediction experiments consist of nine forecasts starting from slightly different initial conditions for each year of the 15-year period 1981-95, employing version 2 of the Goddard Earth Observing System (GEOS) atmospheric general circulation model (GCM). The initial conditions are obtained from the NASA GEOS-1 reanalysis data. Comparisons with a companion set of six long-term simulations with observed SST (starting in 1978, so they have no memory of the initial conditions for the periods of interest) are used to assess the relative contributions of the initial conditions and SST anomalies to forecast skill on time scales ranging from daily to seasonal. The ensembles are used to isolate the signal, and to assess the nature of the inherent variability (noise) of the forecasts.
NASA Astrophysics Data System (ADS)
Merker, Claire; Ament, Felix; Clemens, Marco
2017-04-01
The quantification of measurement uncertainty for rain radar data remains challenging. Radar reflectivity measurements are affected, amongst other things, by calibration errors, noise, blocking and clutter, and attenuation. Their combined impact on measurement accuracy is difficult to quantify due to incomplete process understanding and complex interdependencies. An improved quality assessment of rain radar measurements is of interest for applications both in meteorology and hydrology, for example for precipitation ensemble generation, rainfall runoff simulations, or in data assimilation for numerical weather prediction. Especially a detailed description of the spatial and temporal structure of errors is beneficial in order to make best use of the areal precipitation information provided by radars. Radar precipitation ensembles are one promising approach to represent spatially variable radar measurement errors. We present a method combining ensemble radar precipitation nowcasting with data assimilation to estimate radar measurement uncertainty at each pixel. This combination of ensemble forecast and observation yields a consistent spatial and temporal evolution of the radar error field. We use an advection-based nowcasting method to generate an ensemble reflectivity forecast from initial data of a rain radar network. Subsequently, reflectivity data from single radars is assimilated into the forecast using the Local Ensemble Transform Kalman Filter. The spread of the resulting analysis ensemble provides a flow-dependent, spatially and temporally correlated reflectivity error estimate at each pixel. We will present first case studies that illustrate the method using data from a high-resolution X-band radar network.
Ensemble modeling to predict habitat suitability for a large-scale disturbance specialist
Latif, Quresh S; Saab, Victoria A; Dudley, Jonathan G; Hollenbeck, Jeff P
2013-01-01
To conserve habitat for disturbance specialist species, ecologists must identify where individuals will likely settle in newly disturbed areas. Habitat suitability models can predict which sites at new disturbances will most likely attract specialists. Without validation data from newly disturbed areas, however, the best approach for maximizing predictive accuracy can be unclear (Northwestern U.S.A.). We predicted habitat suitability for nesting Black-backed Woodpeckers (Picoides arcticus; a burned-forest specialist) at 20 recently (≤6 years postwildfire) burned locations in Montana using models calibrated with data from three locations in Washington, Oregon, and Idaho. We developed 8 models using three techniques (weighted logistic regression, Maxent, and Mahalanobis D2 models) and various combinations of four environmental variables describing burn severity, the north–south orientation of topographic slope, and prefire canopy cover. After translating model predictions into binary classifications (0 = low suitability to unsuitable, 1 = high to moderate suitability), we compiled “ensemble predictions,” consisting of the number of models (0–8) predicting any given site as highly suitable. The suitability status for 40% of the area burned by eastside Montana wildfires was consistent across models and therefore robust to uncertainty in the relative accuracy of particular models and in alternative ecological hypotheses they described. Ensemble predictions exhibited two desirable properties: (1) a positive relationship with apparent rates of nest occurrence at calibration locations and (2) declining model agreement outside surveyed environments consistent with our reduced confidence in novel (i.e., “no-analogue”) environments. Areas of disagreement among models suggested where future surveys could help validate and refine models for an improved understanding of Black-backed Woodpecker nesting habitat relationships. 
Ensemble predictions presented here can help guide managers attempting to balance salvage logging with habitat conservation in burned-forest landscapes where black-backed woodpecker nest location data are not immediately available. Ensemble modeling represents a promising tool for guiding conservation of large-scale disturbance specialists. PMID:24340177
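The vote-count style of ensemble prediction described in this abstract (binary classifications from several models, summed per site) can be sketched as follows. This is an illustrative example only: the function names, the thresholds, and the suitability scores are hypothetical and are not taken from the study.

```python
def binarize(scores, threshold):
    """Translate continuous suitability scores into 0/1 classes."""
    return [1 if s >= threshold else 0 for s in scores]


def ensemble_votes(binary_predictions):
    """Count, for every site, how many models classify it as suitable.

    binary_predictions holds one list of 0/1 values per model,
    all lists covering the same sites in the same order.
    """
    return [sum(site) for site in zip(*binary_predictions)]


def consistent_fraction(votes, n_models):
    """Fraction of sites on which all models agree (all 0 or all 1)."""
    agreed = [v for v in votes if v == 0 or v == n_models]
    return len(agreed) / len(votes)


# Hypothetical scores from three models at four sites.
model_scores = [
    [0.9, 0.1, 0.7, 0.2],
    [0.8, 0.3, 0.4, 0.1],
    [0.6, 0.2, 0.9, 0.8],
]
predictions = [binarize(scores, 0.5) for scores in model_scores]
votes = ensemble_votes(predictions)
agreement = consistent_fraction(votes, len(predictions))
```

Sites with intermediate vote counts are the ones where the models disagree, which the abstract suggests as candidates for further survey work.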
Global scale predictability of floods
NASA Astrophysics Data System (ADS)
Weerts, Albrecht; Gijsbers, Peter; Sperna Weiland, Frederiek
2016-04-01
Flood (and storm surge) forecasting at the continental and global scale has only become possible in recent years (Emmerton et al., 2016; Verlaan et al., 2015) due to the availability of meteorological forecasts, global-scale precipitation products, and global-scale hydrologic and hydrodynamic models. Deltares has set up GLOFFIS, a research-oriented multi-model operational flood forecasting system based on Delft-FEWS, in an open experimental ICT facility called Id-Lab. In GLOFFIS, both the W3RA and PCRGLOB-WB models are run in ensemble mode using GEFS and ECMWF-EPS (latency 2 days). GLOFFIS will be used for experiments into the predictability of floods (and droughts) and their dependency on initial state estimation, meteorological forcing, and the hydrologic model used. Here we present initial results of verification of the ensemble flood forecasts derived with the GLOFFIS system. Emmerton, R., Stephens, L., Pappenberger, F., Pagano, T., Weerts, A., Wood, A., Salamon, P., Brown, J., Hjerdt, N., Donnelly, C., Cloke, H. Continental and Global Scale Flood Forecasting Systems, WIREs Water (accepted), 2016. Verlaan, M., De Kleermaeker, S., Buckman, L. GLOSSIS: Global storm surge forecasting and information system, 2015, Australasian Coasts & Ports Conference, 15-18 September 2015, Auckland, New Zealand.
Ensemble Simulation of the Atmospheric Radionuclides Discharged by the Fukushima Nuclear Accident
NASA Astrophysics Data System (ADS)
Sekiyama, Thomas; Kajino, Mizuo; Kunii, Masaru
2013-04-01
Enormous amounts of radionuclides were discharged into the atmosphere by a nuclear accident at the Fukushima Daiichi nuclear power plant (FDNPP) after the earthquake and tsunami on 11 March 2011. The radionuclides were dispersed from the power plant and deposited mainly over eastern Japan and the North Pacific Ocean. Many numerical simulations of the radionuclide dispersion and deposition have been attempted since the nuclear accident. However, none of them was able to perfectly simulate the distribution of dose rates observed after the accident over eastern Japan. This was partly due to errors in the wind vectors and precipitation used in the numerical simulations; unfortunately, these deterministic simulations could not deal with the probability distribution of the simulation results and errors. Therefore, an ensemble simulation of the atmospheric radionuclides was performed using the ensemble Kalman filter (EnKF) data assimilation system coupled with the Japan Meteorological Agency (JMA) non-hydrostatic mesoscale model (NHM); this mesoscale model has been used operationally for daily weather forecasts by JMA. Meteorological observations were provided to the EnKF data assimilation system from the JMA operational-weather-forecast dataset. Through this ensemble data assimilation, twenty members of the meteorological analysis over eastern Japan from 11 to 31 March 2011 were successfully obtained. Using these meteorological ensemble analysis members, the radionuclide behavior in the atmosphere, such as advection, convection, diffusion, dry deposition, and wet deposition, was simulated. This ensemble simulation provided multiple results for the radionuclide dispersion and distribution. Because a large ensemble deviation indicates low accuracy of the numerical simulation, probabilistic information is obtainable from the ensemble simulation results.
For example, the uncertainty of precipitation triggered the uncertainty of wet deposition; the uncertainty of wet deposition triggered the uncertainty of atmospheric radionuclide amounts. The remaining radionuclides were then transported downwind; consequently, the uncertainty signal of the radionuclide amounts propagated downwind. This signal propagation was seen in the ensemble simulation by tracking the areas of large deviation in radionuclide concentration and deposition. These statistics can provide information useful for the probabilistic prediction of radionuclides.
Moučka, Filip; Lísal, Martin; Škvor, Jiří; Jirsák, Jan; Nezbeda, Ivo; Smith, William R
2011-06-23
We present a new and computationally efficient methodology using osmotic ensemble Monte Carlo (OEMC) simulation to calculate chemical potential-concentration curves and the solubility of aqueous electrolytes. The method avoids calculations for the solid phase, incorporating readily available data from thermochemical tables that are based on well-defined reference states. It performs simulations of the aqueous solution at a fixed number of water molecules, pressure, temperature, and specified overall electrolyte chemical potential. Insertion/deletion of ions to/from the system is implemented using fractional ions, which are coupled to the system via a coupling parameter λ that varies between 0 (no interaction between the fractional ions and the other particles in the system) and 1 (full interaction between the fractional ions and the other particles of the system). Transitions between λ-states are accepted with a probability following from the osmotic ensemble partition function. Biasing weights associated with the λ-states are used in order to efficiently realize transitions between them; these are determined by means of the Wang-Landau method. We also propose a novel scaling procedure for λ, which can be used for both nonpolarizable and polarizable models of aqueous electrolyte systems. The approach is readily extended to involve other solvents, multiple electrolytes, and species complexation reactions. The method is illustrated for NaCl, using SPC/E water and several force field models for NaCl from the literature, and the results are compared with experiment at ambient conditions. Good agreement is obtained for the chemical potential-concentration curve and the solubility prediction is reasonable. Future improvements to the predictions will require improved force field models.
Wind power application research on the fusion of the determination and ensemble prediction
NASA Astrophysics Data System (ADS)
Lan, Shi; Lina, Xu; Yuzhu, Hao
2017-07-01
A fused wind speed product for wind farms is designed from the ensemble prediction wind speed products of the European Centre for Medium-Range Weather Forecasts (ECMWF) and from specialized numerical model products for wind power based on the Mesoscale Model 5 (MM5) and the Beijing Rapid Update Cycle (BJ-RUC), which are suitable for short-term wind power forecasting and electric dispatch. The single-valued forecast is formed by calculating the different ensemble statistics of the Bayesian probabilistic forecast representing the uncertainty of the ECMWF ensemble prediction. An autoregressive integrated moving average (ARIMA) model is used to improve the time resolution of the single-valued forecast, and the optimal wind speed forecasting curve and its confidence interval are then derived from Bayesian model averaging (BMA) and the deterministic numerical model prediction. The results show that the fusion forecast clearly improves accuracy relative to the existing numerical forecasting products. Compared with the existing 0-24 h deterministic forecast in the validation period, the mean absolute error (MAE) is decreased by 24.3 % and the correlation coefficient (R) is increased by 12.5 %. In comparison with the ECMWF ensemble forecast, the MAE is reduced by 11.7 % and R is increased by 14.5 %. Additionally, the MAE did not increase with increasing forecast lead time.
Impact of Soil Moisture Initialization on Seasonal Weather Prediction
NASA Technical Reports Server (NTRS)
Koster, Randal D.; Suarez, Max J.; Houser, Paul (Technical Monitor)
2002-01-01
The potential role of soil moisture initialization in seasonal forecasting is illustrated through ensembles of simulations with the NASA Seasonal-to-Interannual Prediction Project (NSIPP) model. For each boreal summer during 1997-2001, we generated two 16-member ensembles of 3-month simulations. The first, "AMIP-style" ensemble establishes the degree to which a perfect prediction of SSTs would contribute to the seasonal prediction of precipitation and temperature over continents. The second ensemble is identical to the first, except that the land surface is also initialized with "realistic" soil moisture contents through the continuous prior application (within GCM simulations leading up to the start of the forecast period) of a daily observational precipitation data set and the associated avoidance of model drift through the scaling of all surface prognostic variables. A comparison of the two ensembles shows that soil moisture initialization has a statistically significant impact on summertime precipitation and temperature over only a handful of continental regions. These regions agree, to first order, with regions that satisfy three conditions: (1) a tendency toward large initial soil moisture anomalies, (2) a strong sensitivity of evaporation to soil moisture, and (3) a strong sensitivity of precipitation to evaporation. The degree to which the initialization improves forecasts relative to observations is mixed, reflecting a critical need for the continued development of model parameterizations and data analysis strategies.
Seamless hydrological predictions for a monsoon driven catchment in North-East India
NASA Astrophysics Data System (ADS)
Köhn, Lisei; Bürger, Gerd; Bronstert, Axel
2016-04-01
Improving hydrological forecasting systems on different time scales is interesting and challenging with regard to humanitarian as well as scientific aspects. In meteorological research, short-, medium-, and long-term forecasts are now being merged to form a system of seamless weather and climate predictions. Coupling of these meteorological forecasts with a hydrological model leads to seamless predictions of streamflow, ranging from one day to a season. While considerable effort is being made to analyse the uncertainties of probabilistic streamflow forecasts, knowledge of the single uncertainty contributions from meteorological and hydrological modeling is still limited. The overarching goal of this project is to gain knowledge in this subject by decomposing and quantifying the overall predictive uncertainty into its single factors for the entire seamless forecast horizon. Our study area is the Mahanadi River Basin in North-East India, which is prone to severe floods and droughts. Improved streamflow forecasts on different time scales would contribute to early flood warning as well as better water management operations in the agricultural sector. Because of strong inter-annual monsoon variations in this region, which are, unlike the mid-latitudes, partly predictable from long-term atmospheric-oceanic oscillations, the Mahanadi catchment represents an ideal study site. Regionalized precipitation forecasts are obtained by applying the method of expanded downscaling to the ensemble prediction systems of ECMWF and NCEP. The semi-distributed hydrological model HYPSO-RR, which was developed in the Eco-Hydrological Simulation Environment ECHSE, is set up for several sub-catchments of the Mahanadi River Basin. The model is calibrated automatically using the Dynamically Dimensioned Search algorithm, with a modified Nash-Sutcliffe efficiency as objective function.
Meteorological uncertainty is estimated from the existing ensemble simulations, while the hydrological uncertainty is derived from a statistical post-processor. After running the hydrological model with the precipitation forecasts and applying the hydrological post-processor, the predictive uncertainty of the streamflow forecast can be analysed. The decomposition of total uncertainty is done using a two-way analysis of variance. In this contribution we present the model set-up and the first results of our hydrological forecasts with lead times of up to 180 days, which are derived by using 15 downscaled members of the ECMWF multi-model seasonal forecast ensemble as model input.
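To make the idea of a two-way variance decomposition concrete, the sketch below splits the total sum of squares of a small forecast table into a row factor, a column factor, and a remainder. It is a generic two-way analysis of variance without replication, not the authors' implementation: the layout (rows as meteorological ensemble members, columns as hydrological post-processor realizations), the function name, and the streamflow values are all assumptions made for illustration.

```python
def two_way_variance_fractions(table):
    """Split total sum of squares into row, column, and residual shares.

    table[i][j] is the forecast for row factor i and column factor j.
    Returns fractions of the total sum of squares (they add up to 1).
    """
    n_rows = len(table)
    n_cols = len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]

    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    if ss_total == 0:
        raise ValueError("all forecasts are identical; nothing to decompose")
    ss_residual = ss_total - ss_rows - ss_cols  # interaction plus noise

    return {
        "rows": ss_rows / ss_total,
        "columns": ss_cols / ss_total,
        "residual": ss_residual / ss_total,
    }


# Hypothetical streamflow forecasts: 2 meteorological members (rows)
# by 2 hydrological post-processor realizations (columns).
forecasts = [[1.0, 2.0],
             [3.0, 4.0]]
fractions = two_way_variance_fractions(forecasts)
```

In this made-up table the row factor accounts for most of the spread, which would be read as the meteorological input dominating the predictive uncertainty.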
Folguera-Blasco, Núria; Cuyàs, Elisabet; Menéndez, Javier A; Alarcón, Tomás
2018-03-01
Understanding the control of epigenetic regulation is key to explain and modify the aging process. Because histone-modifying enzymes are sensitive to shifts in availability of cofactors (e.g. metabolites), cellular epigenetic states may be tied to changing conditions associated with cofactor variability. The aim of this study is to analyse the relationships between cofactor fluctuations, epigenetic landscapes, and cell state transitions. Using Approximate Bayesian Computation, we generate an ensemble of epigenetic regulation (ER) systems whose heterogeneity reflects variability in cofactor pools used by histone modifiers. The heterogeneity of epigenetic metabolites, which operates as regulator of the kinetic parameters promoting/preventing histone modifications, stochastically drives phenotypic variability. The ensemble of ER configurations reveals the occurrence of distinct epi-states within the ensemble. Whereas resilient states maintain large epigenetic barriers refractory to reprogramming cellular identity, plastic states lower these barriers, and increase the sensitivity to reprogramming. Moreover, fine-tuning of cofactor levels redirects plastic epigenetic states to re-enter epigenetic resilience, and vice versa. Our ensemble model agrees with a model of metabolism-responsive loss of epigenetic resilience as a cellular aging mechanism. Our findings support the notion that cellular aging, and its reversal, might result from stochastic translation of metabolic inputs into resilient/plastic cell states via ER systems.
An experimental investigation of the force network ensemble
NASA Astrophysics Data System (ADS)
Kollmer, Jonathan E.; Daniels, Karen E.
2017-06-01
We present an experiment in which a horizontal quasi-2D granular system with a fixed neighbor network is cyclically compressed and decompressed over 1000 cycles. We remove basal friction by floating the particles on a thin air cushion, so that particles only interact in-plane. As expected for a granular system, the applied load is not distributed uniformly, but is instead concentrated in force chains which form a network throughout the system. To visualize the structure of these networks, we use particles made from photoelastic material. The experimental setup and a new data-processing pipeline allow us to map out the evolution subject to the cyclic compressions. We characterize several statistical properties of the packing, including the probability density function of the contact force, and compare them with theoretical and numerical predictions from the force network ensemble theory.
A new Method for the Estimation of Initial Condition Uncertainty Structures in Mesoscale Models
NASA Astrophysics Data System (ADS)
Keller, J. D.; Bach, L.; Hense, A.
2012-12-01
The estimation of fast growing error modes of a system is a key interest of ensemble data assimilation when assessing uncertainty in initial conditions. Over the last two decades three methods (and variations of these methods) have evolved for global numerical weather prediction models: ensemble Kalman filter, singular vectors and breeding of growing modes (or now ensemble transform). While the first incorporates a priori model error information and observation error estimates to determine ensemble initial conditions, the latter two techniques directly address the error structures associated with Lyapunov vectors. However, in global models these structures are mainly associated with transient global wave patterns. When assessing initial condition uncertainty in mesoscale limited area models, several problems regarding the aforementioned techniques arise: (a) additional sources of uncertainty on the smaller scales contribute to the error and (b) error structures from the global scale may quickly move through the model domain (depending on the size of the domain). To address the latter problem, perturbation structures from global models are often included in the mesoscale predictions as perturbed boundary conditions. However, the initial perturbations (when used) are often generated with a variant of an ensemble Kalman filter which does not necessarily focus on the large scale error patterns. In the framework of the European regional reanalysis project of the Hans-Ertel-Center for Weather Research we use a mesoscale model with an implemented nudging data assimilation scheme which does not support ensemble data assimilation at all. In preparation of an ensemble-based regional reanalysis and for the estimation of three-dimensional atmospheric covariance structures, we implemented a new method for the assessment of fast growing error modes for mesoscale limited area models. The so-called self-breeding is a development based on the breeding of growing modes technique.
Initial perturbations are integrated forward for a short time period and then rescaled and added to the initial state again. Iterating this rapid breeding cycle provides estimates for the initial uncertainty structure (or local Lyapunov vectors) given a specific norm. To prevent all ensemble perturbations from converging towards the leading local Lyapunov vector, we apply an ensemble transform variant to orthogonalize the perturbations in the sub-space spanned by the ensemble. By choosing different kinds of norms to measure perturbation growth, this technique allows for estimating uncertainty patterns targeted at specific sources of errors (e.g. convection, turbulence). With case study experiments we show applications of the self-breeding method for different sources of uncertainty and different horizontal scales.
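The breeding cycle described in this abstract (integrate a perturbed state forward, take the difference from the unperturbed run, rescale it, and add it to the same initial state again) can be sketched on a toy system. This is an illustration under stated assumptions, not the authors' method: the two-variable linear `toy_model`, the Euclidean norm, the function names, and all numbers are hypothetical, and the ensemble-transform orthogonalization step is left out.

```python
import math


def norm(vector):
    """Euclidean norm (assumed; the abstract notes other norms can be chosen)."""
    return math.sqrt(sum(x * x for x in vector))


def toy_model(state):
    """Hypothetical short forecast: one direction grows, the other decays."""
    return [2.0 * state[0], 0.5 * state[1]]


def self_breed(model, initial_state, perturbation, amplitude, n_cycles):
    """Iterate the breeding cycle from one fixed initial state."""
    control = model(initial_state)  # unperturbed short forecast
    for _ in range(n_cycles):
        perturbed_start = [x + p for x, p in zip(initial_state, perturbation)]
        perturbed = model(perturbed_start)
        grown = [a - b for a, b in zip(perturbed, control)]
        scale = amplitude / norm(grown)  # rescale to the fixed amplitude
        perturbation = [scale * g for g in grown]
    return perturbation


bred = self_breed(toy_model, [1.0, 1.0], [0.06, 0.08], 0.1, 20)
```

After a few cycles the perturbation in this toy example points almost entirely along the growing direction, which is the convergence behaviour that motivates the orthogonalization step when several perturbations are bred at once.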
Ensemble Learning of QTL Models Improves Prediction of Complex Traits
Bian, Yang; Holland, James B.
2015-01-01
Quantitative trait locus (QTL) models can provide useful insights into trait genetic architecture because of their straightforward interpretability but are less useful for genetic prediction because of the difficulty in including the effects of numerous small effect loci without overfitting. Tight linkage between markers introduces near collinearity among marker genotypes, complicating the detection of QTL and estimation of QTL effects in linkage mapping, and this problem is exacerbated by very high density linkage maps. Here we developed a thinning and aggregating (TAGGING) method as a new ensemble learning approach to QTL mapping. TAGGING reduces collinearity problems by thinning dense linkage maps, maintains aspects of marker selection that characterize standard QTL mapping, and, by ensembling, incorporates information from many more marker-trait associations than traditional QTL mapping. The objective of TAGGING was to improve prediction power compared with QTL mapping while also providing more specific insights into genetic architecture than genome-wide prediction models. TAGGING was compared with standard QTL mapping using cross validation of empirical data from the maize (Zea mays L.) nested association mapping population. TAGGING-assisted QTL mapping substantially improved prediction ability for both biparental and multifamily populations by reducing both the variance and bias in prediction. Furthermore, an ensemble model combining predictions from TAGGING-assisted QTL and infinitesimal models improved prediction abilities over the component models, indicating some complementarity between model assumptions and suggesting that some trait genetic architectures involve a mixture of a few major QTL and polygenic effects. PMID:26276383
2017-01-01
The accurate identification of the specific points of interaction between G protein-coupled receptor (GPCR) oligomers is essential for the design of receptor ligands targeting oligomeric receptor targets. A coarse-grained molecular dynamics computer simulation approach would provide a compelling means of identifying these specific protein–protein interactions and could be applied both for known oligomers of interest and as a high-throughput screen to identify novel oligomeric targets. However, to be effective, this in silico modeling must provide accurate, precise, and reproducible information. This has been achieved recently in numerous biological systems using an ensemble-based all-atom molecular dynamics approach. In this study, we describe an equivalent methodology for ensemble-based coarse-grained simulations. We report the performance of this method when applied to four different GPCRs known to oligomerize using error analysis to determine the ensemble size and individual replica simulation time required. Our measurements of distance between residues shown to be involved in oligomerization of the fifth transmembrane domain from the adenosine A2A receptor are in very good agreement with the existing biophysical data and provide information about the nature of the contact interface that cannot be determined experimentally. Calculations of distance between rhodopsin, CXCR4, and β1AR transmembrane domains reported to form contact points in homodimers correlate well with the corresponding measurements obtained from experimental structural data, providing an ability to predict contact interfaces computationally. Interestingly, error analysis enables identification of noninteracting regions. Our results confirm that GPCR interactions can be reliably predicted using this novel methodology. PMID:28383913
NASA Astrophysics Data System (ADS)
Hawkins, L. R.; Rupp, D. E.; Li, S.; Sarah, S.; McNeall, D. J.; Mote, P.; Betts, R. A.; Wallom, D.
2017-12-01
Changing regional patterns of surface temperature, precipitation, and humidity may cause ecosystem-scale changes in vegetation, altering the distribution of trees, shrubs, and grasses. A changing vegetation distribution, in turn, alters the albedo, latent heat flux, and carbon exchanged with the atmosphere with resulting feedbacks onto the regional climate. However, a wide range of earth-system processes that affect the carbon, energy, and hydrologic cycles occur at sub grid scales in climate models and must be parameterized. The appropriate parameter values in such parameterizations are often poorly constrained, leading to uncertainty in predictions of how the ecosystem will respond to changes in forcing. To better understand the sensitivity of regional climate to parameter selection and to improve regional climate and vegetation simulations, we used a large perturbed physics ensemble and a suite of statistical emulators. We dynamically downscaled a super-ensemble (multiple parameter sets and multiple initial conditions) of global climate simulations using a 25-km resolution regional climate model HadRM3p with the land-surface scheme MOSES2 and dynamic vegetation module TRIFFID. We simultaneously perturbed land surface parameters relating to the exchange of carbon, water, and energy between the land surface and atmosphere in a large super-ensemble of regional climate simulations over the western US. Statistical emulation was used as a computationally cost-effective tool to explore uncertainties in interactions. Regions of parameter space that did not satisfy observational constraints were eliminated and an ensemble of parameter sets that reduce regional biases and span a range of plausible interactions among earth system processes were selected. 
This study demonstrated that by combining super-ensemble simulations with statistical emulation, simulations of regional climate could be improved while simultaneously accounting for a range of plausible land-atmosphere feedback strengths.
Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy
Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing
2016-01-01
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
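Two ingredients mentioned in this abstract, averaging the outputs of several base classifiers and summarizing performance by sensitivity, specificity, accuracy, and MCC, can be sketched as follows. The example is illustrative only: the function names, the 0.5 decision threshold, the classifier scores, and the confusion-matrix counts are hypothetical and are not taken from the study or from any particular library.

```python
import math


def average_scores(base_scores):
    """Average per-sample scores over base classifiers (one list each)."""
    n_models = len(base_scores)
    return [sum(sample) / n_models for sample in zip(*base_scores)]


def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return sensitivity, specificity, accuracy, mcc


# Hypothetical scores from three base classifiers for two proteins.
scores = average_scores([[0.9, 0.2], [0.7, 0.4], [0.8, 0.3]])
labels = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

# Hypothetical counts for a balanced test set of 200 proteins.
sens, spec, acc, mcc = classification_metrics(tp=95, fn=5, tn=93, fp=7)
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which is why it is commonly reported alongside sensitivity and specificity when class balance may vary.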
Ensemble positive unlabeled learning for disease gene identification.
Yang, Peng; Li, Xiaoli; Chua, Hon-Nian; Kwoh, Chee-Keong; Ng, See-Kiong
2014-01-01
An increasing number of genes have been experimentally confirmed in recent years as causative genes of various human diseases. The newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in the current scenario. Using only a single source of data for prediction can be susceptible to bias due to incompleteness and noise in the genomic data, and a single machine learning predictor is prone to bias caused by inherent limitations of individual methods. In this paper, we propose an effective PU learning framework that integrates multiple biological data sources and an ensemble of powerful machine learning classifiers for disease gene identification. Our proposed method integrates data from multiple biological sources for training PU learning classifiers. A novel ensemble-based PU learning method, EPU, is then used to integrate multiple PU learning classifiers to achieve accurate and robust disease gene predictions. Our evaluation experiments across six disease groups showed that EPU achieved significantly better results compared with various state-of-the-art prediction methods as well as ensemble learning classifiers. Through integrating multiple biological data sources for training and the outputs of an ensemble of PU learning classifiers for prediction, we are able to minimize the potential bias and errors in individual data sources and machine learning algorithms to achieve more accurate and robust disease gene predictions. In the future, our EPU method provides an effective framework to integrate additional biological and computational resources for better disease gene predictions.
NASA Astrophysics Data System (ADS)
Lee, E.; Koster, R. D.; Ott, L. E.; Weir, B.; Mahanama, S. P. P.; Chang, Y.; Zeng, F.
2017-12-01
Understanding the underlying processes that control the carbon cycle is key to predicting future global change. Much of the uncertainty in the magnitude and variability of the atmospheric carbon dioxide (CO2) stems from uncertainty in terrestrial carbon fluxes. Budget-based analyses show that such fluxes exhibit substantial interannual variability, but the relative impacts of temperature and moisture variations on regional and global scales are poorly understood. Here we investigate the impact of a regional drought on terrestrial carbon fluxes and CO2 mixing ratios over North America using the NASA Goddard Earth Observing System (GEOS) Model. Two 48-member ensembles of NASA GEOS-5 simulations with fully coupled land and atmosphere carbon components are performed - a control ensemble and an ensemble with an artificially imposed dry land surface anomaly for three months (April-June) over the lower Mississippi River Valley. Comparison of the results using the ensemble approach allows a direct quantification of the impact of the regional drought on local and proximate carbon exchange at the land surface via the carbon-water feedback processes.
Ensemble of classifiers for confidence-rated classification of NDE signal
NASA Astrophysics Data System (ADS)
Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish
2016-02-01
An ensemble of classifiers, in general, aims to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensembles of classifiers generate self-rated confidence scores which estimate the reliability of each of their predictions and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. In existing works, although ensembles of classifiers have been widely used in computational intelligence, the effect of all factors of unreliability on the confidence of classification is largely overlooked. With relevance to NDE, classification results are affected by inherent ambiguity of classification, non-discriminative features, inadequate training samples and noise due to measurement. In this paper, we extend the existing ensemble classification by maximizing the confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in classification performance of defect and non-defect indications.
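Weighted majority voting, as mentioned above, can be sketched in a few lines of Python. The weights and labels below are made up, and the returned vote share is only a simple stand-in for a confidence score; it is an assumption of this sketch, not the confidence measure developed in the paper.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Combine per-classifier labels using per-classifier weights.

    Returns the winning label and its share of the total weight.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    total = sum(scores.values())
    winner = max(scores, key=scores.get)
    return winner, scores[winner] / total

# Toy example: two weak classifiers say "defect", one stronger one disagrees.
vote_label, vote_share = weighted_majority_vote(
    ["defect", "non-defect", "defect"],
    [0.5, 1.5, 0.75],
)
```

Here the single classifier with weight 1.5 outvotes the two weaker ones (combined weight 1.25), which shows why the weights matter more than the head count.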
Hybrid vs Adaptive Ensemble Kalman Filtering for Storm Surge Forecasting
NASA Astrophysics Data System (ADS)
Altaf, M. U.; Raboudi, N.; Gharamti, M. E.; Dawson, C.; McCabe, M. F.; Hoteit, I.
2014-12-01
Recent storm surge events due to hurricanes in the Gulf of Mexico have motivated efforts to accurately forecast water levels. Toward this goal, a parallel architecture has been implemented based on a high resolution storm surge model, ADCIRC. However, the accuracy of the model notably depends on the quality and the recentness of the input data (mainly winds and bathymetry), model parameters (e.g. wind and bottom drag coefficients), and the resolution of the model grid. Given all these uncertainties in the system, the challenge is to build an efficient prediction system capable of providing accurate forecasts far enough ahead of time for the authorities to evacuate the areas at risk. We have developed an ensemble-based data assimilation system to frequently assimilate available data into the ADCIRC model in order to improve the accuracy of the model. In this contribution we study and analyze the performance of different ensemble Kalman filter methodologies for efficient short-range storm surge forecasting, the aim being to produce the most accurate forecasts at the lowest possible computing time. Using Hurricane Ike meteorological data to force the ADCIRC model over a domain including the Gulf of Mexico coastline, we implement and compare the forecasts of the standard EnKF, the hybrid EnKF and an adaptive EnKF. The last two schemes have been introduced as efficient tools for enhancing the behavior of the EnKF when implemented with small ensembles by exploiting information from a static background covariance matrix. Covariance inflation and localization are implemented in all these filters. Our results suggest that both the hybrid and the adaptive approach provide significantly better forecasts than those resulting from the standard EnKF, even when implemented with much smaller ensembles.
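For readers unfamiliar with the filters being compared, the following is a minimal sketch of a stochastic (perturbed-observation) EnKF analysis step with multiplicative inflation, written with NumPy. The optional linear blend with a static covariance (`static_cov`, `alpha`) is one commonly described form of hybridisation and is an assumption here, not necessarily the scheme used in the study; localization and the adaptive variant are omitted, and all names and numbers are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, obs_var, rng,
                  inflation=1.0, static_cov=None, alpha=0.0):
    """One EnKF analysis step.

    ensemble: (n_state, n_members) forecast states
    obs:      (n_obs,) observations
    H:        (n_obs, n_state) linear observation operator
    obs_var:  scalar observation error variance
    """
    n_state, n_mem = ensemble.shape
    mean = ensemble.mean(axis=1, keepdims=True)
    # Multiplicative covariance inflation about the ensemble mean.
    ensemble = mean + inflation * (ensemble - mean)
    anomalies = ensemble - mean
    P = anomalies @ anomalies.T / (n_mem - 1)
    if static_cov is not None:
        # Assumed hybrid form: blend sample and static covariances.
        P = (1.0 - alpha) * P + alpha * static_cov
    R = obs_var * np.eye(len(obs))
    S = H @ P @ H.T + R
    # Kalman gain K = P H^T S^-1, computed via a linear solve (S and P are symmetric).
    K = np.linalg.solve(S, H @ P).T
    # Each member assimilates its own perturbed copy of the observations.
    perturbed = obs[:, None] + rng.normal(0.0, np.sqrt(obs_var),
                                          size=(len(obs), n_mem))
    return ensemble + K @ (perturbed - H @ ensemble)

# Toy usage: two state variables, only the first one is observed.
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(2, 200))
H_toy = np.array([[1.0, 0.0]])
obs_toy = np.array([1.0])
posterior = enkf_analysis(prior, obs_toy, H_toy, obs_var=0.01, rng=rng)
```

With a small ensemble the sample covariance `P` is noisy and rank-deficient, which is the motivation the abstract gives for borrowing information from a static background covariance.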
eHive: An Artificial Intelligence workflow system for genomic analysis
2010-01-01
Background: The Ensembl project produces updates to its comparative genomics resources with each of its several releases per year. During each release cycle approximately two weeks are allocated to generate all the genomic alignments and the protein homology predictions. The number of calculations required for this task grows approximately quadratically with the number of species. We currently support 50 species in Ensembl and we expect the number to continue to grow in the future. Results: We present eHive, a new fault tolerant distributed processing system initially designed to support comparative genomic analysis, based on blackboard systems, network distributed autonomous agents, dataflow graphs and block-branch diagrams. In the eHive system a MySQL database serves as the central blackboard and the autonomous agent, a Perl script, queries the system and runs jobs as required. The system allows us to define dataflow and branching rules to suit all our production pipelines. We describe the implementation of three pipelines: (1) pairwise whole genome alignments, (2) multiple whole genome alignments and (3) gene trees with protein homology inference. Finally, we show the efficiency of the system in real case scenarios. Conclusions: eHive allows us to produce computationally demanding results in a reliable and efficient way with minimal supervision and high throughput. Further documentation is available at: http://www.ensembl.org/info/docs/eHive/. PMID:20459813
NASA Astrophysics Data System (ADS)
Matte, Simon; Boucher, Marie-Amélie; Boucher, Vincent; Fortier Filion, Thomas-Charles
2017-06-01
A large effort has been made over the past 10 years to promote the operational use of probabilistic or ensemble streamflow forecasts. Numerous studies have shown that ensemble forecasts are of higher quality than deterministic ones. Many studies also conclude that decisions based on ensemble rather than deterministic forecasts lead to better outcomes in the context of flood mitigation. Hence, it is believed that ensemble forecasts possess a greater economic and social value for both decision makers and the general population. However, the vast majority of, if not all, existing hydro-economic studies rely on a cost-loss ratio framework that assumes a risk-neutral decision maker. To overcome this important flaw, this study borrows from economics and evaluates the economic value of early warning flood systems using the well-known Constant Absolute Risk Aversion (CARA) utility function, which explicitly accounts for the level of risk aversion of the decision maker. This new framework allows for the full exploitation of the information related to a forecast's uncertainty, making it especially suited for the economic assessment of ensemble or probabilistic forecasts. Rather than comparing deterministic and ensemble forecasts, this study focuses on comparing different types of ensemble forecasts. There are multiple ways of assessing and representing forecast uncertainty. Consequently, there exist many different means of building an ensemble forecasting system for future streamflow. One such possibility is to dress deterministic forecasts using the statistics of past forecast errors. Such dressing methods are popular among operational agencies because of their simplicity and intuitiveness. Another approach is the use of ensemble meteorological forecasts for precipitation and temperature, which are then provided as inputs to one or many hydrological model(s).
In this study, three concurrent ensemble streamflow forecasting systems are compared: simple statistically dressed deterministic forecasts, forecasts based on meteorological ensembles, and a variant of the latter that also includes an estimation of state variable uncertainty. This comparison takes place for the Montmorency River, a small flood-prone watershed in southern central Quebec, Canada. The assessment of forecasts is performed for lead times of 1 to 5 days, both in terms of forecasts' quality (relative to the corresponding record of observations) and in terms of economic value, using the new proposed framework based on the CARA utility function. It is found that the economic value of a forecast for a risk-averse decision maker is closely linked to the forecast reliability in predicting the upper tail of the streamflow distribution. Hence, post-processing forecasts to avoid over-forecasting could help improve both the quality and the value of forecasts.
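To make the contrast between a risk-neutral cost-loss view and a CARA-based view concrete, here is a small Python sketch. It uses the standard normalized CARA form U(w) = (1 - exp(-a w)) / a, which reduces to U(w) = w as a approaches 0. The two-action setup (protect or do nothing), the threshold, the costs, and the member flows are all hypothetical and are not taken from the study.

```python
import math

def cara_utility(wealth, risk_aversion):
    """CARA utility; risk_aversion == 0 gives the risk-neutral case."""
    if risk_aversion == 0:
        return wealth
    return (1.0 - math.exp(-risk_aversion * wealth)) / risk_aversion

def best_action(member_flows, flood_threshold, protect_cost, flood_loss,
                risk_aversion):
    """Pick the action with the highest expected utility over ensemble members."""
    outcomes = {
        # Protecting costs the same whatever happens.
        "protect": [-protect_cost for _ in member_flows],
        # Doing nothing incurs the loss only in members that flood.
        "do_nothing": [-flood_loss if q > flood_threshold else 0.0
                       for q in member_flows],
    }
    expected = {
        action: sum(cara_utility(w, risk_aversion) for w in wealths) / len(wealths)
        for action, wealths in outcomes.items()
    }
    return max(expected, key=expected.get), expected

# Toy ensemble: 2 of 10 members exceed the flood threshold of 100.
flows = [60, 70, 80, 90, 95, 85, 75, 110, 120, 65]
neutral_choice, _ = best_action(flows, 100, protect_cost=30, flood_loss=100,
                                risk_aversion=0)
averse_choice, _ = best_action(flows, 100, protect_cost=30, flood_loss=100,
                               risk_aversion=0.05)
```

In this toy case the risk-neutral decision maker does nothing (expected loss 20 versus a certain cost of 30), while the risk-averse one protects, because the exponential utility penalizes the rare large loss much more heavily. This is why the upper tail of the ensemble matters for value.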
Hsieh, Nan-Chen; Hung, Lun-Ping; Shih, Chun-Che; Keh, Huan-Chao; Chan, Chien-Hui
2012-06-01
Endovascular aneurysm repair (EVAR) is an advanced minimally invasive surgical technology that is helpful for reducing patients' recovery time, postoperative morbidity and mortality. This study proposes an ensemble model to predict postoperative morbidity after EVAR. The ensemble model was developed using a training set of consecutive patients who underwent EVAR between 2000 and 2009. All data required for prediction modeling, including patient demographics, preoperative, co-morbidities, and complication as outcome variables, was collected prospectively and entered into a clinical database. A discretization approach was used to categorize numerical values into informative feature space. Then, the Bayesian network (BN), artificial neural network (ANN), and support vector machine (SVM) were adopted as base models, and stacking combined multiple models. The research outcomes consisted of an ensemble model to predict postoperative morbidity after EVAR, the occurrence of postoperative complications prospectively recorded, and the causal effect knowledge by BNs with Markov blanket concept.
Using ensembles in water management: forecasting dry and wet episodes
NASA Astrophysics Data System (ADS)
van het Schip-Haverkamp, Tessa; van den Berg, Wim; van de Beek, Remco
2015-04-01
Extreme weather situations such as droughts and extensive precipitation are becoming more frequent, which makes it more important to obtain accurate weather forecasts for the short and long term. Ensembles can provide a solution in terms of scenario forecasts. MeteoGroup uses ensembles in a new forecasting technique which presents a number of weather scenarios for a dynamical water management project, called Water-Rijk, in which water storage and water retention play a large role. The Water-Rijk is part of Park Lingezegen, which is located between Arnhem and Nijmegen in the Netherlands. In collaboration with the University of Wageningen, Alterra and Eijkelkamp, a forecasting system is developed for this area which can provide water boards with a number of weather and hydrology scenarios in order to assist in the decision whether or not water retention or water storage is necessary in the near future. In order to make a forecast for drought and extensive precipitation, the difference 'precipitation - evaporation' is used as a measure of drought in the weather forecasts. In case of an upcoming drought this difference will take larger negative values. In case of a wet episode, this difference will be positive. The Makkink potential evaporation is used, which gives the most accurate potential evaporation values during the summer, when evaporation plays an important role in the availability of surface water. Scenarios are determined by reducing the large number of forecasts in the ensemble to a number of averaged members, each with its own likelihood of occurrence. For the Water-Rijk project 5 scenario forecasts are calculated: extreme dry, dry, normal, wet and extreme wet. These scenarios are constructed for two forecasting periods, each using its own ensemble technique: up to 48 hours ahead and up to 15 days ahead.
The 48-hour forecast uses an ensemble constructed from forecasts of multiple high-resolution regional models: UKMO's Euro4 model, the ECMWF model, WRF and Hirlam. Using multiple model runs and additional post processing, an ensemble can be created from non-ensemble models. The 15-day forecast uses the ECMWF Ensemble Prediction System forecast, from which scenarios can be deduced directly. A combination of the ensembles from the two forecasting periods is used in order to have the highest possible resolution of the forecast for the first 48 hours, followed by the lower resolution long term forecast.
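The reduction of an ensemble to five averaged scenarios, each with a likelihood, could look roughly like the Python sketch below. The abstract does not say how members are grouped, so sorting members by their accumulated precipitation minus evaporation and splitting them into equal-sized groups is purely an assumption of this sketch, as are the function name and the toy numbers.

```python
SCENARIO_NAMES = ("extreme dry", "dry", "normal", "wet", "extreme wet")

def scenario_forecasts(precip, evap, names=SCENARIO_NAMES):
    """Reduce an ensemble to averaged scenarios.

    precip, evap: per-member accumulated totals in mm (hypothetical inputs).
    Returns a list of (name, mean P - E, likelihood) tuples.
    """
    balance = sorted(p - e for p, e in zip(precip, evap))
    n, k = len(balance), len(names)
    scenarios = []
    for i, name in enumerate(names):
        # Assumed grouping: consecutive, near-equal slices of the sorted members.
        group = balance[i * n // k:(i + 1) * n // k]
        if group:
            scenarios.append((name, sum(group) / len(group), len(group) / n))
    return scenarios

# Toy ensemble of 10 members with a constant evaporation of 12 mm.
toy_precip = [0, 2, 5, 10, 20, 1, 3, 8, 15, 30]
toy_evap = [12] * 10
toy_scenarios = scenario_forecasts(toy_precip, toy_evap)
```

In this toy case the driest scenario averages -11.5 mm and the wettest +13 mm, each with likelihood 0.2. A clustering-based grouping would instead give unequal likelihoods.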
NASA Astrophysics Data System (ADS)
Pribram-Jones, Aurora
Warm dense matter (WDM) is a high energy phase between solids and plasmas, with characteristics of both. It is present in the centers of giant planets, within the earth's core, and on the path to ignition of inertial confinement fusion. The high temperatures and pressures of warm dense matter lead to complications in its simulation, as both classical and quantum effects must be included. One of the most successful simulation methods is density functional theory-molecular dynamics (DFT-MD). Despite great success in a diverse array of applications, DFT-MD remains computationally expensive and it neglects the explicit temperature dependence of electron-electron interactions known to exist within exact DFT. Finite-temperature density functional theory (FT DFT) is an extension of the wildly successful ground-state DFT formalism via thermal ensembles, broadening its quantum mechanical treatment of electrons to include systems at non-zero temperatures. Exact mathematical conditions have been used to predict the behavior of approximations in limiting conditions and to connect FT DFT to the ground-state theory. An introduction to FT DFT is given within the context of ensemble DFT and the larger field of DFT is discussed for context. Ensemble DFT is used to describe ensembles of ground-state and excited systems. Exact conditions in ensemble DFT and the performance of approximations depend on ensemble weights. Using an inversion method, exact Kohn-Sham ensemble potentials are found and compared to approximations. The symmetry eigenstate Hartree-exchange approximation is in good agreement with exact calculations because of its inclusion of an ensemble derivative discontinuity. Since ensemble weights in FT DFT are temperature-dependent Fermi weights, this insight may help develop approximations well-suited to both ground-state and FT DFT. 
A novel, highly efficient approach to free energy calculations, finite-temperature potential functional theory, is derived, which has the potential to transform the simulation of warm dense matter. As a semiclassical method, it connects the normally disparate regimes of cold condensed matter physics and hot plasma physics. This orbital-free approach captures the smooth classical density envelope and quantum density oscillations that are both crucial to accurate modeling of materials where temperature and pressure effects are influential.
NASA Astrophysics Data System (ADS)
Tsai, Hsiao-Chung; Elsberry, Russell L.
2013-12-01
An opportunity exists to extend support to the decision-making processes of water resource management and hydrological operations by providing extended-range tropical cyclone (TC) formation and track forecasts in the western North Pacific from the 51-member ECMWF 32-day ensemble. A new objective verification technique demonstrates that the ECMWF ensemble can predict most of the formations and tracks of the TCs during July 2009 to December 2010, even for most of the tropical depressions. Due to the relatively large number of false-alarm TCs in the ECMWF ensemble forecasts that would cause problems for support of hydrological operations, characteristics of these false alarms are discussed. Special attention is given to the ability of the ECMWF ensemble to predict periods of no-TCs in the Taiwan area, since water resource management decisions also depend on the absence of typhoon-related rainfall. A three-tier approach is proposed to provide support for hydrological operations via extended-range forecasts twice weekly on the 30-day timescale, twice-daily on the 15-day timescale, and up to four times a day with a consensus of high-resolution deterministic models.
Men, Zhongxian; Yee, Eugene; Lien, Fue-Sang; Yang, Zhiling; Liu, Yongqian
2014-01-01
Short-term wind speed and wind power forecasts (for a 72 h period) are obtained using a nonlinear autoregressive exogenous artificial neural network (ANN) methodology which incorporates either numerical weather prediction or high-resolution computational fluid dynamics wind field information as an exogenous input. An ensemble approach is used to combine the predictions from many candidate ANNs in order to provide improved forecasts for wind speed and power, along with the associated uncertainties in these forecasts. More specifically, the ensemble ANN is used to quantify the uncertainties arising from the network weight initialization and from the unknown structure of the ANN. All members forming the ensemble of neural networks were trained using an efficient particle swarm optimization algorithm. The results of the proposed methodology are validated using wind speed and wind power data obtained from an operational wind farm located in Northern China. The assessment demonstrates that this methodology for wind speed and power forecasting generally provides an improvement in predictive skills when compared to the practice of using an "optimal" weight vector from a single ANN while providing additional information in the form of prediction uncertainty bounds. PMID:27382627
NASA Astrophysics Data System (ADS)
Kasiviswanathan, K.; Sudheer, K.
2013-05-01
Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists, owing to their potential for accurate prediction of flood flows as compared to conceptual or physics based hydrologic models. The ANN approximates the non-linear functional relationship between the complex hydrologic variables in arriving at the river flow forecast values. Despite a large number of applications, there is still some criticism that ANN point predictions lack reliability because their uncertainty is not quantified, which limits their use in practical applications. A major concern in applying traditional uncertainty analysis techniques to a neural network framework is its parallel computing architecture with large degrees of freedom, which makes the uncertainty assessment a challenging task. Very few studies have considered the assessment of predictive uncertainty of ANN based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: at stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain an optimal vector of weights and biases, and during stage 2, the optimal variability of the ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the 2nd stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) maximum number of measured data points falling within the estimated prediction interval and (iii) minimum width of the prediction interval. The method is illustrated using a real world case study of an Indian basin. The method was able to produce an ensemble that has an average prediction interval width of 23.03 m3/s, with 97.17% of the total validation data points (measured) lying within the interval.
The derived prediction interval for a selected hydrograph in the validation data set is presented in Fig 1. It is noted that most of the observed flows lie within the constructed prediction interval, and therefore provides information about the uncertainty of the prediction. One specific advantage of the method is that when ensemble mean value is considered as a forecast, the peak flows are predicted with improved accuracy by this method compared to traditional single point forecasted ANNs. Fig. 1 Prediction Interval for selected hydrograph
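The three quantities traded off in the second optimization stage (residual of the ensemble mean, coverage of the interval, and interval width) can be computed from an ensemble as in the Python sketch below. Taking the interval as the per-time-step minimum and maximum of the members is an assumption of this sketch; the abstract does not specify how the bounds are derived, and the toy flows are made up.

```python
def interval_stats(ensemble, observed):
    """Coverage (%), mean width and mean squared residual of the ensemble mean.

    ensemble: list of member series, all the same length as `observed`.
    """
    columns = list(zip(*ensemble))  # one tuple of member values per time step
    lower = [min(vals) for vals in columns]
    upper = [max(vals) for vals in columns]
    mean = [sum(vals) / len(vals) for vals in columns]
    n = len(observed)
    inside = sum(1 for lo, o, hi in zip(lower, observed, upper) if lo <= o <= hi)
    coverage = 100.0 * inside / n
    avg_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    mse = sum((o - m) ** 2 for o, m in zip(observed, mean)) / n
    return coverage, avg_width, mse

# Toy flows in m3/s: three members, four time steps.
toy_ensemble = [
    [10, 20, 30, 40],
    [12, 25, 28, 45],
    [8, 22, 35, 50],
]
toy_observed = [11, 26, 30, 44]
toy_coverage, toy_width, toy_mse = interval_stats(toy_ensemble, toy_observed)
```

For the toy data, three of the four observations fall inside the interval (75% coverage) and the mean width is 6.5 m3/s. Widening the interval raises coverage but works against objective (iii), which is why the abstract frames this as a multi-objective problem.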
Microcanonical fluctuations of the condensate in weakly interacting Bose gases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Idziaszek, Zbigniew
2005-05-15
We study fluctuations of the number of Bose condensed atoms in weakly interacting homogeneous and trapped gases. For a homogeneous system we apply the particle-number-conserving formulation of the Bogoliubov theory and calculate the condensate fluctuations within the canonical and the microcanonical ensembles. We demonstrate that, at least in the low-temperature regime, predictions of the particle-number-conserving and traditional, nonconserving theory are identical, and lead to the anomalous scaling of fluctuations. Furthermore, the microcanonical fluctuations differ from the canonical ones by a quantity which scales normally in the number of particles, thus predictions of both ensembles are equivalent in the thermodynamic limit. We observe a similar behavior for a weakly interacting gas in a harmonic trap. This is in contrast to the trapped, ideal gas, where microcanonical and canonical fluctuations are different in the thermodynamic limit.
Assessment of the forecast skill of spring onset in the NMME experiment
NASA Astrophysics Data System (ADS)
Carrillo, C. M.; Ault, T.
2017-12-01
This study assesses the predictability of spring onset using an index of its interannual variability. We use the North American Multi-Model Ensemble (NMME) experiment to assess this predictability. The input datasets used to compute the spring onset index, SI-x, were treated with a daily joint bias correction (JBC) approach, and the SI-x outputs were post-processed using three ensemble model output statistics (EMOS) approaches: logistic regression, Gaussian Ensemble Dressing, and non-homogeneous Gaussian regression. These EMOS approaches quantify the effect of training period length and ensemble size on forecast skill. The highest range of predictability for the timing of spring onset is from 10 to 60 days, and it is located along a narrow band between 35° and 45°N in the US. Using rank probability scores based on quantiles (q), a forecast threshold (q) of 0.5 provides a range of predictability that falls into two categories, 10-40 and 40-60 days, which seems to represent the effect of the intra-seasonal scale. Using higher thresholds (q=0.6 and 0.7), predictability shows a lower range, with values around 10-30 days. The post-processing work using JBC improves the predictability skill by 13% relative to uncorrected results. Using EMOS, a significant positive change in the skill score is noted in regions where the skill with JBC shows evidence of improvement. The consensus of these techniques shows that regions of better predictability can be expanded.
NASA Astrophysics Data System (ADS)
Dib, Alain; Kavvas, M. Levent
2018-03-01
The characteristic form of the Saint-Venant equations is solved in a stochastic setting by using a newly proposed Fokker-Planck Equation (FPE) methodology. This methodology computes the ensemble behavior and variability of the unsteady flow in open channels by directly solving for the flow variables' time-space evolutionary probability distribution. The new methodology is tested on a stochastic unsteady open-channel flow problem, with an uncertainty arising from the channel's roughness coefficient. The computed statistical descriptions of the flow variables are compared to the results obtained through Monte Carlo (MC) simulations in order to evaluate the performance of the FPE methodology. The comparisons show that the proposed methodology can adequately predict the results of the considered stochastic flow problem, including the ensemble averages, variances, and probability density functions in time and space. Unlike the large number of simulations performed by the MC approach, only one simulation is required by the FPE methodology. Moreover, the total computational time of the FPE methodology is smaller than that of the MC approach, which could prove to be a particularly crucial advantage in systems with a large number of uncertain parameters. As such, the results obtained in this study indicate that the proposed FPE methodology is a powerful and time-efficient approach for predicting the ensemble average and variance behavior, in both space and time, for an open-channel flow process under an uncertain roughness coefficient.
NASA Astrophysics Data System (ADS)
Khajehei, Sepideh; Moradkhani, Hamid
2015-04-01
Producing reliable and accurate hydrologic ensemble forecasts is subject to various sources of uncertainty, including meteorological forcing, initial conditions, model structure, and model parameters. Producing reliable and skillful precipitation ensemble forecasts is one approach to reduce the total uncertainty in hydrological applications. Currently, Numerical Weather Prediction (NWP) models are developing ensemble forecasts for various temporal ranges. It has been shown that raw products from NWP models are biased in mean and spread. Given the above, there is a need for methods that are able to generate reliable ensemble forecasts for hydrological applications. One of the common techniques is to apply statistical procedures in order to generate ensemble forecasts from NWP-generated single-value forecasts. The procedure is based on the bivariate probability distribution between the observation and the single-value precipitation forecast. However, one of the assumptions of the current method is that a Gaussian distribution is fitted to the marginal distributions of the observed and modeled climate variable. Here, we describe and evaluate a Bayesian approach based on Copula functions to develop an ensemble precipitation forecast from the conditional distribution of single-value precipitation forecasts. Copula functions are known as the multivariate joint distribution of univariate marginal distributions, and are presented as an alternative procedure for capturing the uncertainties related to meteorological forcing. Copulas are capable of modeling the joint distribution of two variables with any level of correlation and dependency. This study is conducted over a sub-basin in the Columbia River Basin in the USA using the monthly precipitation forecasts from the Climate Forecast System (CFS) with 0.5° x 0.5° spatial resolution to reproduce the observations.
The verification is conducted on a different period, and the performance of the procedure is compared with that of the Ensemble Pre-Processor approach currently used by National Weather Service River Forecast Centers in the USA.
Operational value of ensemble streamflow forecasts for hydropower production: A Canadian case study
NASA Astrophysics Data System (ADS)
Boucher, Marie-Amélie; Tremblay, Denis; Luc, Perreault; François, Anctil
2010-05-01
Ensemble and probabilistic forecasts have many advantages over deterministic ones, both in meteorology and hydrology (e.g. Krzysztofowicz, 2001). Mainly, they inform the user of the uncertainty linked to the forecast. It has been noted that such additional information could lead to improved decision making (e.g. Wilks and Hamill, 1995; Mylne, 2002; Roulin, 2007), but very few studies concentrate on operational situations involving the use of such forecasts. In addition, many authors have demonstrated that ensemble forecasts outperform deterministic forecasts in terms of performance (e.g. Jaun et al., 2005; Velazquez et al., 2009; Laio and Tamea, 2007). However, such performance is mostly assessed on the basis of numerical scoring rules, which compare the forecasts to the observations, and seldom in terms of management gains. The proposed case study adopts an operational point of view, on the basis that a novel forecasting system has value only if it leads to increased monetary and societal gains (e.g. Murphy, 1994; Laio and Tamea, 2007). More specifically, Environment Canada operational ensemble precipitation forecasts are used to drive the HYDROTEL distributed hydrological model (Fortin et al., 1995), calibrated on the Gatineau watershed located in Québec, Canada. The resulting hydrological ensemble forecasts are then incorporated into Hydro-Québec's SOHO stochastic management optimization tool, which automatically searches for optimal operation decisions for all the reservoirs and hydropower plants located on the basin. The timeline of the study is the fall season of year 2003. This period is especially relevant because of high precipitation that nearly caused a major spill and forced the preventive evacuation of a portion of the population located near one of the dams.
We show that the use of the ensemble forecasts would have reduced the occurrence of spills and flooding, which is of particular importance for dams located in populous areas, and increased hydropower production. The ensemble precipitation forecasts extend from March 1st of 2002 to December 31st of 2003. They were obtained using two atmospheric models, SEF (8 members plus the control deterministic forecast) and GEM (8 members). The corresponding deterministic precipitation forecast issued by the SEF model is also used within HYDROTEL in order to compare ensemble streamflow forecasts with their deterministic counterparts. Although this study does not incorporate all the sources of uncertainty, precipitation is certainly the most important input for hydrological modeling and conveys a great portion of the total uncertainty. References: Fortin, J.P., Moussa, R., Bocquillon, C. and Villeneuve, J.P. 1995: HYDROTEL, un modèle hydrologique distribué pouvant bénéficier des données fournies par la télédétection et les systèmes d'information géographique, Revue des Sciences de l'Eau, 8(1), 94-124. Jaun, S., Ahrens, B., Walser, A., Ewen, T. and Schaer, C. 2008: A probabilistic view on the August 2005 floods in the upper Rhine catchment, Natural Hazards and Earth System Sciences, 8 (2), 281-291. Krzysztofowicz, R. 2001: The case for probabilistic forecasting in hydrology, Journal of Hydrology, 249, 2-9. Murphy, A.H. 1994: Assessing the economic value of weather forecasts: An overview of methods, results and issues, Meteorological Applications, 1, 69-73. Mylne, K.R. 2002: Decision-Making from probability forecasts based on forecast value, Meteorological Applications, 9, 307-315. Laio, F. and Tamea, S. 2007: Verification tools for probabilistic forecasts of continuous hydrological variables, Hydrology and Earth System Sciences, 11, 1267-1277. Roulin, E. 2007: Skill and relative economic value of medium-range hydrological ensemble predictions, Hydrology and Earth System Sciences, 11, 725-737. Velazquez, J.-A., Petit, T., Lavoie, A., Boucher, M.-A., Turcotte, R., Fortin, V. and Anctil, F. 2009: An evaluation of the Canadian global meteorological ensemble prediction system for short-term hydrological forecasting, Hydrology and Earth System Sciences, 13(11), 2221-2231. Wilks, D.S. and Hamill, T.M. 1995: Potential economic value of ensemble-based surface weather forecasts, Monthly Weather Review, 123(12), 3565-3575.
Lavers, David A.; Waliser, Duane E.; Ralph, F. Martin; Dettinger, Michael
2016-01-01
The western United States is vulnerable to socioeconomic disruption due to extreme winter precipitation and floods. Traditionally, forecasts of precipitation and river discharge provide the basis for preparations. Herein we show that earlier event awareness may be possible through use of horizontal water vapor transport (integrated vapor transport (IVT)) forecasts. Applying the potential predictability concept to the National Centers for Environmental Prediction global ensemble reforecasts, across 31 winters, IVT is found to be more predictable than precipitation. IVT ensemble forecasts with the smallest spreads (least forecast uncertainty) are associated with initiation states with anomalously high geopotential heights south of Alaska, a setup conducive for anticyclonic conditions and weak IVT into the western United States. IVT ensemble forecasts with the greatest spreads (most forecast uncertainty) have initiation states with anomalously low geopotential heights south of Alaska and correspond to atmospheric rivers. The greater IVT predictability could provide warnings of impending storminess with additional lead times for hydrometeorological applications.
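The abstract above equates small ensemble spread with low forecast uncertainty. As a minimal sketch of that idea only (not the authors' potential-predictability calculation), the snippet below computes the ensemble mean and spread for one forecast variable. The function name, the choice of population standard deviation as the spread measure, and the IVT values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def ensemble_spread(members):
    """Return (ensemble mean, spread) for one forecast variable.

    Spread is taken here as the population standard deviation of the
    members about the ensemble mean (an assumption of this sketch).
    """
    return mean(members), pstdev(members)


# Hypothetical IVT forecasts (kg m^-1 s^-1) from two five-member ensembles.
tight = [250.0, 255.0, 245.0, 252.0, 248.0]   # members agree closely
wide = [150.0, 420.0, 300.0, 610.0, 90.0]     # members disagree strongly

tight_mean, tight_spread = ensemble_spread(tight)
wide_mean, wide_spread = ensemble_spread(wide)
```

In this toy case the second ensemble has the larger spread, which under the abstract's reading would correspond to the more uncertain forecast.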
Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong
2015-01-01
In an IaaS (infrastructure as a service) cloud environment, users are provisioned with virtual machines (VMs). To allocate resources for users dynamically and effectively, accurate prediction of resource demands is essential. For this purpose, this paper proposes a self-adaptive prediction method using an ensemble model and a subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characteristics of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt several base predictors to compose the ensemble model. Then the structure and learning algorithm of the fuzzy neural network are studied. To obtain the number of fuzzy rules and the initial values of the premise and consequent parameters, this paper proposes fuzzy c-means combined with the subtractive clustering algorithm, that is, subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experimental results show that the method is accurate and effective in predicting resource demands. PMID:25691896
Using HPC within an operational forecasting configuration
NASA Astrophysics Data System (ADS)
Jagers, H. R. A.; Genseberger, M.; van den Broek, M. A. F. H.
2012-04-01
Various natural disasters are caused by high-intensity events, for example: extreme rainfall can in a short time cause major damage in river catchments, storms can cause havoc in coastal areas. To assist emergency response teams in operational decisions, it's important to have reliable information and predictions as soon as possible. This starts before the event by providing early warnings about imminent risks and estimated probabilities of possible scenarios. In the context of various applications worldwide, Deltares has developed an open and highly configurable forecasting and early warning system: Delft-FEWS. Finding the right balance between simulation time (and hence prediction lead time) and simulation accuracy and detail is challenging. Model resolution may be crucial to capture certain critical physical processes. Uncertainty in forcing conditions may require running large ensembles of models; data assimilation techniques may require additional ensembles and repeated simulations. The computational demand is steadily increasing and data streams become bigger. Using HPC resources is a logical step; in different settings Delft-FEWS has been configured to take advantage of distributed computational resources available to improve and accelerate the forecasting process (e.g. Montanari et al, 2006). We will illustrate the system by means of a couple of practical applications including the real-time dynamic forecasting of wind driven waves, flow of water, and wave overtopping at dikes of Lake IJssel and neighboring lakes in the center of The Netherlands. Montanari et al., 2006. Development of an ensemble flood forecasting system for the Po river basin, First MAP D-PHASE Scientific Meeting, 6-8 November 2006, Vienna, Austria.
Improved Weather and Power Forecasts for Energy Operations - the German Research Project EWeLiNE
NASA Astrophysics Data System (ADS)
Lundgren, Kristina; Siefert, Malte; Hagedorn, Renate; Majewski, Detlev
2014-05-01
The German energy system is going through a fundamental change. Based on the energy plans of the German federal government, the share of electrical power production from renewables should increase to 35% by 2020. This means that, in the near future at certain times renewable energies will provide a major part of Germany's power production. Operating a power supply system with a large share of weather-dependent power sources in a secure way requires improved power forecasts. One of the most promising strategies to improve the existing wind power and PV power forecasts is to optimize the underlying weather forecasts and to enhance the collaboration between the meteorology and energy sectors. Deutscher Wetterdienst addresses these challenges in collaboration with Fraunhofer IWES within the research project EWeLiNE. The overarching goal of the project is to improve the wind and PV power forecasts by combining improved power forecast models and optimized weather forecasts. During the project, the numerical weather prediction models COSMO-DE and COSMO-DE-EPS (Ensemble Prediction System) by Deutscher Wetterdienst will be generally optimized towards improved wind power and PV forecasts. For instance, it will be investigated whether the assimilation of new types of data, e.g. power production data, can lead to improved weather forecasts. With regard to the probabilistic forecasts, the focus is on the generation of ensembles and ensemble calibration. One important aspect of the project is to integrate the probabilistic information into decision making processes by developing user-specified products. In this paper we give an overview of the project and present first results.
The Rise of Complexity in Flood Forecasting: Opportunities, Challenges and Tradeoffs
NASA Astrophysics Data System (ADS)
Wood, A. W.; Clark, M. P.; Nijssen, B.
2017-12-01
Operational flood forecasting is currently undergoing a major transformation. Most national flood forecasting services have relied for decades on lumped, highly calibrated conceptual hydrological models running on local office computing resources, providing deterministic streamflow predictions at gauged river locations that are important to stakeholders and emergency managers. A variety of recent technological advances now make it possible to run complex, high-to-hyper-resolution models for operational hydrologic prediction over large domains, and the US National Weather Service is now attempting to use hyper-resolution models to create new forecast services and products. Yet other 'increased-complexity' forecasting strategies also exist that pursue different tradeoffs between model complexity (i.e., spatial resolution, physics) and streamflow forecast system objectives. There is currently a pressing need for a greater understanding in the hydrology community of the opportunities, challenges and tradeoffs associated with these different forecasting approaches, and for a greater participation by the hydrology community in evaluating, guiding and implementing these approaches. Intermediate-resolution forecast systems, for instance, use distributed land surface model (LSM) physics but retain the agility to deploy ensemble methods (including hydrologic data assimilation and hindcast-based post-processing). Fully coupled numerical weather prediction (NWP) systems, another example, use still coarser LSMs to produce ensemble streamflow predictions either at the model scale or after sub-grid scale runoff routing. Based on the direct experience of the authors and colleagues in research and operational forecasting, this presentation describes examples of different streamflow forecast paradigms, from the traditional to the recent hyper-resolution, to illustrate the range of choices facing forecast system developers.
We also discuss the degree to which the strengths and weaknesses of each strategy map onto the requirements for different types of forecasting services (e.g., flash flooding, river flooding, seasonal water supply prediction).
Comparison of different filter methods for data assimilation in the unsaturated zone
NASA Astrophysics Data System (ADS)
Lange, Natascha; Berkhahn, Simon; Erdal, Daniel; Neuweiler, Insa
2016-04-01
The unsaturated zone is an important compartment, which plays a role for the division of terrestrial water fluxes into surface runoff, groundwater recharge and evapotranspiration. For data assimilation in coupled systems it is therefore important to have a good representation of the unsaturated zone in the model. Flow processes in the unsaturated zone have all the typical features of flow in porous media: Processes can have long memory and as observations are scarce, hydraulic model parameters cannot be determined easily. However, they are important for the quality of model predictions. On top of that, the established flow models are highly non-linear. For these reasons, the use of the popular Ensemble Kalman filter as a data assimilation method to estimate state and parameters in unsaturated zone models could be questioned. With respect to the long process memory in the subsurface, it has been suggested that iterative filters and smoothers may be more suitable for parameter estimation in unsaturated media. We test the performance of different iterative filters and smoothers for data assimilation with a focus on parameter updates in the unsaturated zone. In particular we compare the Iterative Ensemble Kalman Filter and Smoother as introduced by Bocquet and Sakov (2013) as well as the Confirming Ensemble Kalman Filter and the modified Restart Ensemble Kalman Filter proposed by Song et al. (2014) to the original Ensemble Kalman Filter (Evensen, 2009). This is done with simple test cases generated numerically. We consider also test examples with layering structure, as a layering structure is often found in natural soils. We assume that observations are water content, obtained from TDR probes or other observation methods sampling relatively small volumes. Particularly in larger data assimilation frameworks, a reasonable balance between computational effort and quality of results has to be found. 
Therefore, we compare computational costs of the different methods as well as the quality of open loop model predictions and the estimated parameters. Bocquet, M. and P. Sakov, 2013: Joint state and parameter estimation with an iterative ensemble Kalman smoother, Nonlinear Processes in Geophysics 20(5): 803-818. Evensen, G., 2009: Data assimilation: The ensemble Kalman filter. Springer Science & Business Media. Song, X.H., L.S. Shi, M. Ye, J.Z. Yang and I.M. Navon, 2014: Numerical comparison of iterative ensemble Kalman filters for unsaturated flow inverse modeling. Vadose Zone Journal 13(2), 10.2136/vzj2013.05.0083.
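For orientation, the analysis step of the original Ensemble Kalman Filter that serves as the baseline above can be sketched for the simplest possible case. This is an illustrative, assumption-laden sketch: it uses the perturbed-observation variant, a scalar state observed directly (observation operator equal to 1), and invented water-content numbers. It updates the state only, not parameters, and it is not one of the iterative filters or smoothers compared in the abstract.

```python
import random
from statistics import mean, variance


def enkf_update(ensemble, obs, obs_var, rng):
    """Perturbed-observation EnKF analysis for a directly observed scalar state."""
    prior_var = variance(ensemble)              # sample variance of the forecast ensemble
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member is nudged toward its own noisy copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_std) - x) for x in ensemble]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical prior ensemble of volumetric water content (dimensionless).
prior = [rng.gauss(0.30, 0.05) for _ in range(50)]

# Hypothetical probe reading of 0.22 with a standard error of 0.01.
posterior = enkf_update(prior, obs=0.22, obs_var=0.01 ** 2, rng=rng)
```

Because the assumed observation error is much smaller than the prior spread, the gain is close to 1 and the posterior ensemble collapses toward the observation. Such a strongly linear correction is the kind of update whose suitability for non-linear unsaturated flow the abstract questions.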
NASA Astrophysics Data System (ADS)
Sun, Hongyue; Luo, Shuai; Jin, Ran; He, Zhen
2017-07-01
Mathematical modeling is an important tool for investigating the performance of microbial fuel cells (MFCs) and optimizing their design. To overcome the shortcomings of traditional MFC models, this study develops an ensemble model that integrates an engineering model with statistical analytics for extrapolation scenarios. Such an ensemble model can reduce the labor of parameter calibration and requires fewer measurement data to achieve accuracy comparable to a traditional statistical model in both the normal and extreme operation regions. Based on different weights between current generation and organic removal efficiency, the ensemble model can recommend input factor settings to achieve the best current generation and organic removal efficiency. The model predicts a set of optimal design factors for the present tubular MFCs, including an anode flow rate of 3.47 mL min-1, an organic concentration of 0.71 g L-1, and a catholyte pumping flow rate of 14.74 mL min-1, to achieve the peak current of 39.2 mA. To maintain 100% organic removal efficiency, the anode flow rate and organic concentration should be kept below 1.04 mL min-1 and 0.22 g L-1, respectively. The developed ensemble model can potentially be modified to model other types of MFCs or bioelectrochemical systems.
NASA Astrophysics Data System (ADS)
Foresti, L.; Reyniers, M.; Seed, A.; Delobbe, L.
2016-01-01
The Short-Term Ensemble Prediction System (STEPS) is implemented in real-time at the Royal Meteorological Institute (RMI) of Belgium. The main idea behind STEPS is to quantify the forecast uncertainty by adding stochastic perturbations to the deterministic Lagrangian extrapolation of radar images. The stochastic perturbations are designed to account for the unpredictable precipitation growth and decay processes and to reproduce the dynamic scaling of precipitation fields, i.e., the observation that large-scale rainfall structures are more persistent and predictable than small-scale convective cells. This paper presents the development, adaptation and verification of the STEPS system for Belgium (STEPS-BE). STEPS-BE provides in real-time 20-member ensemble precipitation nowcasts at 1 km and 5 min resolutions up to 2 h lead time using a 4 C-band radar composite as input. In the context of the PLURISK project, STEPS forecasts were generated to be used as input in sewer system hydraulic models for nowcasting urban inundations in the cities of Ghent and Leuven. Comprehensive forecast verification was performed in order to detect systematic biases over the given urban areas and to analyze the reliability of probabilistic forecasts for a set of case studies in 2013 and 2014. The forecast biases over the cities of Leuven and Ghent were found to be small, which is encouraging for future integration of STEPS nowcasts into the hydraulic models. Probabilistic forecasts of exceeding 0.5 mm h-1 are reliable up to 60-90 min lead time, while the ones of exceeding 5.0 mm h-1 are only reliable up to 30 min. The STEPS ensembles are slightly under-dispersive and represent only 75-90 % of the forecast errors.
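The probabilistic forecasts of exceeding 0.5 and 5.0 mm h-1 mentioned above are commonly derived from an ensemble as the fraction of members above the threshold. The sketch below shows that calculation for a single pixel and lead time. The function name and the 20 rain-rate values are hypothetical, and the abstract does not state how STEPS-BE itself computes these probabilities.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members whose rain rate exceeds the threshold."""
    return sum(1 for m in members if m > threshold) / len(members)


# Hypothetical 20-member nowcast of rain rate (mm/h) at one pixel and lead time.
rain_rates = [0.0, 0.2, 0.4, 0.6, 0.7, 0.9, 1.1, 1.3, 1.8, 2.4,
              2.9, 3.3, 3.8, 4.4, 4.9, 5.2, 5.6, 6.1, 7.0, 8.5]

p_light = exceedance_probability(rain_rates, 0.5)   # 17 of 20 members exceed 0.5 mm/h
p_heavy = exceedance_probability(rain_rates, 5.0)   # 5 of 20 members exceed 5.0 mm/h
```

With 20 members the probabilities can only take steps of 0.05, which is one reason reliability is usually assessed over many cases rather than a single forecast.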
NASA Astrophysics Data System (ADS)
Foresti, L.; Reyniers, M.; Seed, A.; Delobbe, L.
2015-07-01
The Short-Term Ensemble Prediction System (STEPS) is implemented in real-time at the Royal Meteorological Institute (RMI) of Belgium. The main idea behind STEPS is to quantify the forecast uncertainty by adding stochastic perturbations to the deterministic Lagrangian extrapolation of radar images. The stochastic perturbations are designed to account for the unpredictable precipitation growth and decay processes and to reproduce the dynamic scaling of precipitation fields, i.e. the observation that large scale rainfall structures are more persistent and predictable than small scale convective cells. This paper presents the development, adaptation and verification of the system STEPS for Belgium (STEPS-BE). STEPS-BE provides in real-time 20 member ensemble precipitation nowcasts at 1 km and 5 min resolution up to 2 h lead time using a 4 C-band radar composite as input. In the context of the PLURISK project, STEPS forecasts were generated to be used as input in sewer system hydraulic models for nowcasting urban inundations in the cities of Ghent and Leuven. Comprehensive forecast verification was performed in order to detect systematic biases over the given urban areas and to analyze the reliability of probabilistic forecasts for a set of case studies in 2013 and 2014. The forecast biases over the cities of Leuven and Ghent were found to be small, which is encouraging for future integration of STEPS nowcasts into the hydraulic models. Probabilistic forecasts of exceeding 0.5 mm h-1 are reliable up to 60-90 min lead time, while the ones of exceeding 5.0 mm h-1 are only reliable up to 30 min. The STEPS ensembles are slightly under-dispersive and represent only 80-90 % of the forecast errors.
Modeling task-specific neuronal ensembles improves decoding of grasp
NASA Astrophysics Data System (ADS)
Smith, Ryan J.; Soares, Alcimar B.; Rouse, Adam G.; Schieber, Marc H.; Thakor, Nitish V.
2018-06-01
Objective. Dexterous movement involves the activation and coordination of networks of neuronal populations across multiple cortical regions. Attempts to model firing of individual neurons commonly treat the firing rate as directly modulating with motor behavior. However, motor behavior may additionally be associated with modulations in the activity and functional connectivity of neurons in a broader ensemble. Accounting for variations in neural ensemble connectivity may provide additional information about the behavior being performed. Approach. In this study, we examined neural ensemble activity in primary motor cortex (M1) and premotor cortex (PM) of two male rhesus monkeys during performance of a center-out reach, grasp and manipulate task. We constructed point process encoding models of neuronal firing that incorporated task-specific variations in the baseline firing rate as well as variations in functional connectivity with the neural ensemble. Models were evaluated both in terms of their encoding capabilities and their ability to properly classify the grasp being performed. Main results. Task-specific ensemble models correctly predicted the performed grasp with over 95% accuracy and were shown to outperform models of neuronal activity that assume only a variable baseline firing rate. Task-specific ensemble models exhibited superior decoding performance in 82% of units in both monkeys (p < 0.01). Inclusion of ensemble activity also broadly improved the ability of models to describe observed spiking. Encoding performance of task-specific ensemble models, measured by spike timing predictability, improved upon baseline models in 62% of units. Significance. 
These results suggest that additional discriminative information about motor behavior found in the variations in functional connectivity of neuronal ensembles located in motor-related cortical regions is relevant to decoding complex tasks such as grasping objects, and may serve as the basis for more reliable and accurate neural prostheses.
NASA Astrophysics Data System (ADS)
Yu, Wansik; Nakakita, Eiichi; Kim, Sunmin; Yamaguchi, Kosei
2016-08-01
The use of meteorological ensembles to produce sets of hydrological predictions has increased the capability to issue flood warnings. However, the spatial scale of the hydrological domain is still much finer than that of meteorological models, and NWP models have difficulty with the displacement of rainfall. The main objective of this study is to enhance the transposition method proposed in Yu et al. (2014) and to suggest a post-processing ensemble flood forecasting method for real-time updating and accuracy improvement of flood forecasts. The method considers the separation of orographic rainfall and the correction of misplaced rain distributions, using additional ensemble information obtained through the transposition of rain distributions. In the first step of the proposed method, ensemble forecast rainfalls from a numerical weather prediction (NWP) model are separated into orographic and non-orographic rainfall fields using atmospheric variables and the extraction of the topographic effect. Then the non-orographic rainfall fields are examined by the transposition scheme to produce additional ensemble information, and new ensemble NWP rainfall fields are calculated by recombining the transposition results of the non-orographic rain fields with the separated orographic rainfall fields to generate place-corrected ensemble information. Then, the additional ensemble information is applied to a hydrologic model for post-flood forecasting with a 6-h interval. The newly proposed method has a clear advantage in improving the accuracy of the mean value of ensemble flood forecasting. Our study is carried out and verified using the largest flood event, caused by typhoon 'Talas' in 2011, over two catchments, the Futatsuno (356.1 km2) and Nanairo (182.1 km2) dam catchments of the Shingu river basin (2360 km2), which is located on the Kii peninsula, Japan.
Avoiding drift related to linear analysis update with Lagrangian coordinate models
NASA Astrophysics Data System (ADS)
Wang, Yiguo; Counillon, Francois; Bertino, Laurent
2015-04-01
When applying data assimilation to Lagrangian coordinate models, it is profitable to correct the model grid (position, volume). In an isopycnal coordinate ocean model, such information is provided by the layer thickness, which can be massless but must remain positive (truncated Gaussian distribution). A linear Gaussian analysis does not ensure positivity for such a variable. Existing methods have been proposed to handle this issue (e.g. post-processing, anamorphosis or resampling), but none ensures conservation of the mean, which is imperative in climate applications. Here, a framework is introduced to test a new method, which proceeds as follows. First, layers for which the analysis yields negative values are iteratively grouped with neighboring layers, resulting in a probability density function with a larger mean and smaller standard deviation that prevents the appearance of negative values. Second, analysis increments of the grouped layer are uniformly distributed, which prevents massless layers from becoming filled and vice versa. The new method is proved fully conservative with e.g. OI or 3DVAR, but a small drift remains with ensemble-based methods (e.g. EnKF, DEnKF, …) during the update of the ensemble anomaly. However, the resulting drift with the latter is small (an order of magnitude smaller than with post-processing) and the increase in computational cost is moderate. The new method is demonstrated with a realistic application in the Norwegian Climate Prediction Model (NorCPM), which provides climate prediction by assimilating sea surface temperature with the Ensemble Kalman Filter in a fully coupled Earth System model (NorESM) with an isopycnal ocean model (MICOM). Over a 25-year analysis period, the new method does not impair the predictive skill of the system but corrects the artificial steric drift introduced by data assimilation, and provides estimates in good agreement with IPCC AR5.
Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods.
Notaro, Marco; Schubach, Max; Robinson, Peter N; Valentini, Giorgio
2017-10-12
The prediction of human gene-abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene-disease associations has been widely investigated, the related problem of gene-phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a "flat" learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
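The true path rule mentioned above requires that a gene predicted for a term also be predicted for all of that term's ancestors, so scores must not increase when moving from a parent to a child. The sketch below shows a generic top-down correction that enforces this on a small directed acyclic graph. It is an illustration under assumptions, not the implementation in the authors' R package: the function name, the term labels, and the flat scores are hypothetical.

```python
from graphlib import TopologicalSorter


def enforce_true_path_rule(parents, flat_scores):
    """Cap each term's flat score by the corrected scores of its parents.

    parents maps each term to the list of its parent terms (empty for the root).
    """
    corrected = {}
    # static_order() yields every term only after all of its parents.
    for term in TopologicalSorter(parents).static_order():
        cap = min((corrected[p] for p in parents.get(term, [])), default=1.0)
        corrected[term] = min(flat_scores[term], cap)
    return corrected


# Hypothetical four-term ontology: term "C" has two parents, "A" and "B".
parents = {"root": [], "A": ["root"], "B": ["root"], "C": ["A", "B"]}

# Hypothetical flat classifier scores for one gene; "C" scores above its parent "A".
flat_scores = {"root": 0.9, "A": 0.4, "B": 0.8, "C": 0.7}

corrected = enforce_true_path_rule(parents, flat_scores)
```

Here the score of "C" is lowered to that of its weakest parent, "A", so thresholding the corrected scores at any level yields a set of terms that is closed under ancestors.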
Ziminski, Joseph J; Hessler, Sabine; Margetts-Smith, Gabriella; Sieburg, Meike C; Crombag, Hans S; Koya, Eisuke
2017-03-22
Cues that predict the availability of food rewards influence motivational states and elicit food-seeking behaviors. If a cue no longer predicts food availability, then animals may adapt accordingly by inhibiting food-seeking responses. Sparsely activated sets of neurons, coined "neuronal ensembles," have been shown to encode the strength of reward-cue associations. Although alterations in intrinsic excitability have been shown to underlie many learning and memory processes, little is known about these properties specifically on cue-activated neuronal ensembles. We examined the activation patterns of cue-activated orbitofrontal cortex (OFC) and nucleus accumbens (NAc) shell ensembles using wild-type and Fos-GFP mice, which express green fluorescent protein (GFP) in activated neurons, after appetitive conditioning with sucrose and extinction learning. We also investigated the neuronal excitability of recently activated, GFP+ neurons in these brain areas using whole-cell electrophysiology in brain slices. Exposure to a sucrose cue elicited activation of neurons in both the NAc shell and OFC. In the NAc shell, but not the OFC, these activated GFP+ neurons were more excitable than surrounding GFP- neurons. After extinction, the number of neurons activated in both areas was reduced and activated ensembles in neither area exhibited altered excitability. These data suggest that learning-induced alterations in the intrinsic excitability of neuronal ensembles is regulated dynamically across different brain areas. Furthermore, we show that changes in associative strength modulate the excitability profile of activated ensembles in the NAc shell. SIGNIFICANCE STATEMENT Sparsely distributed sets of neurons called "neuronal ensembles" encode learned associations about food and cues predictive of its availability. 
Widespread changes in neuronal excitability have been observed in limbic brain areas after associative learning, but little is known about the excitability changes that occur specifically on neuronal ensembles that encode appetitive associations. Here, we reveal that sucrose cue exposure recruited a more excitable ensemble in the nucleus accumbens, but not orbitofrontal cortex, compared with their surrounding neurons. This excitability difference was not observed when the cue's salience was diminished after extinction learning. These novel data provide evidence that the intrinsic excitability of appetitive memory-encoding ensembles is regulated differentially across brain areas and adapts dynamically to changes in associative strength.
The Hydrologic Ensemble Prediction Experiment (HEPEX)
NASA Astrophysics Data System (ADS)
Wood, Andy; Wetterhall, Fredrik; Ramos, Maria-Helena
2015-04-01
The Hydrologic Ensemble Prediction Experiment was established in March, 2004, at a workshop hosted by the European Center for Medium Range Weather Forecasting (ECMWF), and co-sponsored by the US National Weather Service (NWS) and the European Commission (EC). The HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support. HEPEX pursues this goal through research efforts and practical implementations involving six core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. HEPEX has grown through meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. In the last decade, HEPEX has organized over a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Through these interactions and an active online blog (www.hepex.org), HEPEX has built a strong and active community of nearly 400 researchers & practitioners around the world. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.
Reagan, Andrew J.; Dubief, Yves; Dodds, Peter Sheridan; Danforth, Christopher M.
2016-01-01
A thermal convection loop is an annular chamber filled with water, heated on the bottom half and cooled on the top half. With sufficiently large forcing of heat, the direction of fluid flow in the loop oscillates chaotically, dynamics analogous to the Earth’s weather. As is the case for state-of-the-art weather models, we only observe the statistics over a small region of state space, making prediction difficult. To overcome this challenge, data assimilation (DA) methods, and specifically ensemble methods, use the computational model itself to estimate the uncertainty of the model to optimally combine these observations into an initial condition for predicting the future state. Here, we build and verify four distinct DA methods, and then we perform a twin model experiment with the computational fluid dynamics simulation of the loop using the Ensemble Transform Kalman Filter (ETKF) to assimilate observations and predict flow reversals. We show that using adaptively shaped localized covariance outperforms static localized covariance with the ETKF, and allows for the use of fewer observations in predicting flow reversals. We also show that a Dynamic Mode Decomposition (DMD) of the temperature and velocity fields recovers the low dimensional system underlying reversals, finding specific modes which together are predictive of reversal direction. PMID:26849061
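As a rough illustration of the ensemble update that an ETKF performs, the following Python sketch treats the simplest possible case: a single scalar state that is observed directly. In that case the transform reduces to shifting the ensemble mean by the Kalman gain and shrinking the perturbations by one common factor. The function name, the ensemble values and the observation are hypothetical and are not taken from the paper, which applies the filter to a full fluid dynamics simulation with localization.

```python
import math
import statistics


def scalar_etkf_update(ensemble, obs, obs_var):
    """Square-root ensemble update for one directly observed scalar state."""
    mean_b = statistics.fmean(ensemble)        # background (prior) mean
    var_b = statistics.variance(ensemble)      # sample variance, divisor m - 1
    gain = var_b / (var_b + obs_var)           # Kalman gain for H = 1
    mean_a = mean_b + gain * (obs - mean_b)    # analysis mean
    # Perturbations are rescaled so the analysis variance equals
    # var_b * obs_var / (var_b + obs_var).
    shrink = math.sqrt(obs_var / (var_b + obs_var))
    return [mean_a + shrink * (x - mean_b) for x in ensemble]


# Hypothetical background ensemble and observation (illustrative only).
background = [0.8, 1.1, 1.4, 0.9, 1.3]
analysis = scalar_etkf_update(background, obs=1.5, obs_var=0.04)
print(statistics.fmean(analysis), statistics.variance(analysis))
```

The analysis mean lands between the background mean and the observation, and the analysis spread is smaller than the background spread, which is the qualitative behaviour any Kalman-type ensemble update should show.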
NASA Astrophysics Data System (ADS)
Federico, S.; Avolio, E.; Bellecci, C.; Colacino, M.; Walko, R. L.
2006-03-01
This paper reports preliminary results for a Limited area model Ensemble Prediction System (LEPS), based on RAMS (Regional Atmospheric Modelling System), for eight case studies of moderate-intense precipitation over Calabria, the southernmost tip of the Italian peninsula. LEPS aims to transfer the benefits of a probabilistic forecast from global to regional scales in countries where local orographic forcing is a key factor to force convection. To accomplish this task and to limit computational time in an operational implementation of LEPS, we perform a cluster analysis of ECMWF-EPS runs. Starting from the 51 members that form the ECMWF-EPS we generate five clusters. For each cluster a representative member is selected and used to provide initial and dynamic boundary conditions to RAMS, whose integrations generate LEPS. RAMS runs have 12-km horizontal resolution. To analyze the impact of enhanced horizontal resolution on quantitative precipitation forecasts, LEPS forecasts are compared to a full Brute Force (BF) ensemble. This ensemble is based on RAMS, has 36 km horizontal resolution and is generated by 51 members, nested in each ECMWF-EPS member. LEPS and BF results are compared subjectively and by objective scores. Subjective analysis is based on precipitation and probability maps of case studies whereas objective analysis is made by deterministic and probabilistic scores. Scores and maps are calculated by comparing ensemble precipitation forecasts against reports from the Calabria regional raingauge network. Results show that LEPS provided better rainfall predictions than BF for all case studies selected. This strongly suggests the importance of the enhanced horizontal resolution, compared to ensemble population, for Calabria for these cases. To further explore the impact of local physiographic features on QPF (Quantitative Precipitation Forecasting), LEPS results are also compared with a 6-km horizontal resolution deterministic forecast. 
Due to local and mesoscale forcing, the high resolution forecast (Hi-Res) has better performance compared to the ensemble mean for rainfall thresholds larger than 10mm but it tends to overestimate precipitation for lower amounts. This yields larger false alarms that have a detrimental effect on objective scores for lower thresholds. To exploit the advantages of a probabilistic forecast compared to a deterministic one, the relation between the ECMWF-EPS 700 hPa geopotential height spread and LEPS performance is analyzed. Results are promising even if additional studies are required.
Ensemble density variational methods with self- and ghost-interaction-corrected functionals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pastorczak, Ewa; Pernal, Katarzyna, E-mail: pernalk@gmail.com
2014-05-14
Ensemble density functional theory (DFT) offers a way of predicting excited-states energies of atomic and molecular systems without referring to a density response function. Despite significant theoretical work, practical applications of the proposed approximations have been scarce and they do not allow for a fair judgement of the potential usefulness of ensemble DFT with available functionals. In the paper, we investigate two forms of ensemble density functionals formulated within ensemble DFT framework: the Gross, Oliveira, and Kohn (GOK) functional proposed by Gross et al. [Phys. Rev. A 37, 2809 (1988)] alongside the orbital-dependent eDFT form of the functional introduced by Nagy [J. Phys. B 34, 2363 (2001)] (the acronym eDFT proposed in analogy to eHF – ensemble Hartree-Fock method). Local and semi-local ground-state density functionals are employed in both approaches. Approximate ensemble density functionals contain not only spurious self-interaction but also the so-called ghost-interaction which has no counterpart in the ground-state DFT. We propose how to correct the GOK functional for both kinds of interactions in approximations that go beyond the exact-exchange functional. Numerical applications lead to a conclusion that functionals free of the ghost-interaction by construction, i.e., eDFT, yield much more reliable results than approximate self- and ghost-interaction-corrected GOK functional. Additionally, local density functional corrected for self-interaction employed in the eDFT framework yields excitations energies of the accuracy comparable to that of the uncorrected semi-local eDFT functional.
NASA Astrophysics Data System (ADS)
Xu, Lei; Chen, Nengcheng; Zhang, Xiang
2018-02-01
Drought is an extreme natural disaster that can lead to huge socioeconomic losses. Predicting drought months ahead is helpful for early drought warning and preparations. In this study, we developed a statistical model, two weighted dynamic models and a statistical-dynamic (hybrid) model for 1-6 month lead drought prediction in China. Specifically, the statistical component refers to climate signals weighted by support vector regression (SVR), the dynamic components consist of the ensemble mean (EM) and Bayesian model averaging (BMA) of the North American Multi-Model Ensemble (NMME) climatic models, and the hybrid part denotes a combination of statistical and dynamic components by assigning weights based on their historical performances. The results indicate that the statistical and hybrid models show better rainfall predictions than the NMME-EM and NMME-BMA models, which have good predictability only in southern China. In the 2011 China winter-spring drought event, the statistical model predicted the spatial extent and severity of drought nationwide well, although the severity was underestimated in the mid-lower reaches of Yangtze River (MLRYR) region. The NMME-EM and NMME-BMA models largely overestimated rainfall in northern and western China in the 2011 drought. In the 2013 China summer drought, the NMME-EM model forecasted the drought extent and severity in eastern China well, while the statistical and hybrid models falsely detected negative precipitation anomaly (NPA) in some areas. Model ensembles such as multiple statistical approaches, multiple dynamic models or multiple hybrid models for drought predictions were highlighted. These conclusions may be helpful for drought prediction and early drought warnings in China.
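The abstract does not say how the weights "based on historical performances" are computed. One common choice, assumed here purely for illustration, is to weight each component by the inverse of its mean squared error over a hindcast period. The Python sketch below uses hypothetical function names and made-up rainfall numbers; it is not the authors' scheme.

```python
def inverse_mse_weights(past_forecasts, past_obs):
    """Weight each model by 1 / MSE over a hindcast period (assumed scheme).

    past_forecasts maps a model name to a list of hindcasts aligned with past_obs.
    """
    mse = {
        name: sum((f - o) ** 2 for f, o in zip(series, past_obs)) / len(past_obs)
        for name, series in past_forecasts.items()
    }
    # Guard against division by zero for a model with a perfect hindcast record.
    inverse = {name: 1.0 / max(err, 1e-12) for name, err in mse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(weights, new_forecasts):
    """Weighted average of the component forecasts for one target month."""
    return sum(weights[name] * new_forecasts[name] for name in weights)


# Made-up hindcasts and observations (illustrative only).
hindcasts = {"statistical": [10.0, 12.0, 9.0], "dynamic": [14.0, 15.0, 6.0]}
observed = [11.0, 12.0, 8.0]
weights = inverse_mse_weights(hindcasts, observed)
hybrid = combine(weights, {"statistical": 10.0, "dynamic": 13.0})
print(weights, hybrid)
```

With these numbers the statistical component has the smaller hindcast error, so it receives the larger weight and the hybrid forecast stays close to it.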
NASA Astrophysics Data System (ADS)
Elders, Akiko; Pegion, Kathy
2017-12-01
Arctic sea ice plays an important role in the climate system, moderating the exchange of energy and moisture between the ocean and the atmosphere. An emerging area of research investigates how changes, particularly declines, in sea ice extent (SIE) impact climate in regions local to and remote from the Arctic. Therefore, both observations and model estimates of sea ice become important. This study investigates the skill of sea ice predictions from models participating in the North American Multi-Model Ensemble (NMME) project. Three of the models in this project provide sea-ice predictions. The ensemble average of these models is used to determine seasonal climate impacts on surface air temperature (SAT) and sea level pressure (SLP) in remote regions such as the mid-latitudes. It is found that declines in fall SIE are associated with cold temperatures in the mid-latitudes and pressure patterns across the Arctic and mid-latitudes similar to the negative phase of the Arctic Oscillation (AO). These findings are consistent with other studies that have investigated the relationship between declines in SIE and mid-latitude weather and climate. In an attempt to include additional NMME models for sea-ice predictions, a proxy for SIE is used to estimate ice extent in the remaining models, using sea surface temperature (SST). It is found that SST is a reasonable proxy for SIE estimation when compared to model SIE forecasts and observations. The proxy sea-ice estimates also show similar relationships to mid-latitude temperature and pressure as the actual sea-ice predictions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Voisin, Nathalie; Pappenberger, Florian; Lettenmaier, D. P.
2011-08-15
A 10-day globally applicable flood prediction scheme was evaluated using the Ohio River basin as a test site for the period 2003-2007. The Variable Infiltration Capacity (VIC) hydrology model was initialized with the European Centre for Medium Range Weather Forecasts (ECMWF) analysis temperatures and wind, and Tropical Rainfall Measuring Mission Multi Satellite Precipitation Analysis (TMPA) precipitation up to the day of forecast. In forecast mode, the VIC model was then forced with a calibrated and statistically downscaled ECMWF ensemble prediction system (EPS) 10-day ensemble forecast. A parallel set up was used where ECMWF EPS forecasts were interpolated to the spatial scale of the hydrology model. Each set of forecasts was extended by 5 days using monthly mean climatological variables and zero precipitation in order to account for the effect of initial conditions. The 15-day spatially distributed ensemble runoff forecasts were then routed to four locations in the basin, each with different drainage areas. Surrogates for observed daily runoff and flow were provided by the reference run, specifically VIC simulation forced with ECMWF analysis fields and TMPA precipitation fields. The flood prediction scheme using the calibrated and downscaled ECMWF EPS forecasts was shown to be more accurate and reliable than interpolated forecasts for both daily distributed runoff forecasts and daily flow forecasts. Initial and antecedent conditions dominated the flow forecasts for lead times shorter than the time of concentration depending on the flow forecast amounts and the drainage area sizes. The flood prediction scheme had useful skill for the 10 following days at all sites.
DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction
Jiang, Jinjian; Wang, Nian; Zhang, Jun
2017-01-01
Background: Drug-target interaction is key in drug discovery, especially in the design of new lead compounds. However, finding a new lead compound for a specific target is complicated and difficult, and it often leads to many mistakes. Therefore computational techniques are commonly adopted in drug design, which can save time and costs to a significant extent. Results: To address the issue, a new prediction system is proposed in this work to identify drug-target interaction. First, drug-target pairs are encoded with a fragment technique and the software “PaDEL-Descriptor.” The fragment technique is for encoding target proteins, which divides each protein sequence into several fragments in order and encodes each fragment with several physiochemical properties of amino acids. The software “PaDEL-Descriptor” creates encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled and several overlapped subsets are obtained, which are then input into kNN (k-Nearest Neighbor) classifier to build an ensemble system. Conclusion: Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors. PMID:28744468
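The second step described above (resampling into overlapping subsets, one kNN classifier per subset, then combining the outputs) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two-dimensional toy features stand in for the fragment and PaDEL-Descriptor encodings, sampling is done without replacement within each subset, and majority voting is used to combine members. The abstract does not specify these details.

```python
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Plain k-nearest-neighbour vote; train is a list of (features, label)."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))
    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def build_subsets(data, n_subsets, fraction, seed=0):
    """Draw overlapping subsets; each is sampled without replacement."""
    rng = random.Random(seed)
    size = max(1, int(fraction * len(data)))
    return [rng.sample(data, size) for _ in range(n_subsets)]


def ensemble_predict(subsets, query, k=3):
    """Majority vote over one kNN classifier per subset (assumed combiner)."""
    votes = Counter(knn_predict(subset, query, k) for subset in subsets)
    return votes.most_common(1)[0][0]


# Toy data: label 0 clusters near (0, 0), label 1 near (5, 5).
offsets = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1)]
data = [((x, y), 0) for x, y in offsets] + [((x + 5.0, y + 5.0), 1) for x, y in offsets]
subsets = build_subsets(data, n_subsets=5, fraction=0.7)
print(ensemble_predict(subsets, (0.2, 0.2)), ensemble_predict(subsets, (5.1, 5.0)))
```

Because the subsets overlap, the member classifiers are correlated but not identical, which is the usual motivation for this kind of resampled ensemble.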
Stable time filtering of strongly unstable spatially extended systems
Grote, Marcus J.; Majda, Andrew J.
2006-01-01
Many contemporary problems in science involve making predictions based on partial observation of extremely complicated spatially extended systems with many degrees of freedom and with physical instabilities on both large and small scale. Various new ensemble filtering strategies have been developed recently for these applications, and new mathematical issues arise. Because ensembles are extremely expensive to generate, one such issue is whether it is possible under appropriate circumstances to take long time steps in an explicit difference scheme and violate the classical Courant–Friedrichs–Lewy (CFL)-stability condition yet obtain stable accurate filtering by using the observations. These issues are explored here both through elementary mathematical theory, which provides simple guidelines, and the detailed study of a prototype model. The prototype model involves an unstable finite difference scheme for a convection–diffusion equation, and it is demonstrated below that appropriate observations can result in stable accurate filtering of this strongly unstable spatially extended system. PMID:16682626
Stable time filtering of strongly unstable spatially extended systems.
Grote, Marcus J; Majda, Andrew J
2006-05-16
Many contemporary problems in science involve making predictions based on partial observation of extremely complicated spatially extended systems with many degrees of freedom and with physical instabilities on both large and small scale. Various new ensemble filtering strategies have been developed recently for these applications, and new mathematical issues arise. Because ensembles are extremely expensive to generate, one such issue is whether it is possible under appropriate circumstances to take long time steps in an explicit difference scheme and violate the classical Courant-Friedrichs-Lewy (CFL)-stability condition yet obtain stable accurate filtering by using the observations. These issues are explored here both through elementary mathematical theory, which provides simple guidelines, and the detailed study of a prototype model. The prototype model involves an unstable finite difference scheme for a convection-diffusion equation, and it is demonstrated below that appropriate observations can result in stable accurate filtering of this strongly unstable spatially extended system.
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents an enhanced prediction accuracy of diagnosis of Parkinson's disease (PD) to prevent the delay and misdiagnosis of patients using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All the experiments are conducted at 95% and 99% confidence levels and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows the best results with supportive statistical inference.
NASA Astrophysics Data System (ADS)
Kim, Ji-in; Ryu, Kyongsik; Suh, Ae-sook
2016-04-01
In 2014, three major governmental organizations, the Korea Meteorological Administration (KMA), K-water, and the Korea Rural Community Corporation, established the Hydrometeorological Cooperation Center (HCC) to accomplish more effective water management for scarcely gauged river basins, where data are uncertain or inconsistent. To manage drought and flood control optimally over ungauged rivers, HCC aims to interconnect weather observations, forecasting information, and hydrological models over sparse regions with limited observation sites on the Korean peninsula. In this study, a long-term ensemble forecasting model, the Global Seasonal forecast system version 5 (GloSea5), a high-resolution seasonal forecast system provided by KMA, was used to produce a drought outlook. The GloSea5 ensemble prediction provides drought information for 1 and 3 months ahead using drought indices including the Standardized Precipitation Index (SPI3) and the Palmer Drought Severity Index (PDSI). In addition, data products from the Global Precipitation Measurement and Global Change Observation Mission - Water 1 satellites are used to estimate rainfall and soil moisture content over the ungauged region.
NASA Astrophysics Data System (ADS)
Ramos, Maria-Helena; Wetterhall, Fredrik; Wood, Andy; Wang, Qj; Pappenberger, Florian; Verkade, Jan
2017-04-01
Since 2004, HEPEX (Hydrologic Ensemble Prediction Experiment) has been fostering a community of researchers and practitioners around the world. Through the years, it has contributed to establishing a more integrative view of hydrological forecasting, where data assimilation, hydro-meteorological modelling chains, post-processing techniques, expert knowledge, and decision support systems are connected to enhance operational systems and water management applications. Here we present the community activities in HEPEX that have contributed to strengthening this unfunded, volunteer effort for more than a decade. These include the organization of workshops, conference sessions, testbeds and inter-comparison experiments. More recently, HEPEX has also prompted the development of several publicly available role-play games and, since 2013, it has been running a blog portal (www.hepex.org), which is used as an intersection point for members. Through this website, members can continuously share their research, make announcements, report on workshops, projects and meetings, and hear about related research and operational challenges. It also creates a platform for early career scientists to become increasingly involved in hydrological forecasting science and applications.
Verification of an ensemble prediction system for storm surge forecast in the Adriatic Sea
NASA Astrophysics Data System (ADS)
Mel, Riccardo; Lionello, Piero
2014-12-01
In the Adriatic Sea, storm surges present a significant threat to Venice and to the flat coastal areas of the northern coast of the basin. Sea level forecast is of paramount importance for the management of daily activities and for operating the movable barriers that are presently being built for the protection of the city. In this paper, an EPS (ensemble prediction system) for operational forecasting of storm surge in the northern Adriatic Sea is presented and applied to a 3-month-long period (October-December 2010). The sea level EPS is based on the HYPSE (hydrostatic Padua Sea elevation) model, which is a standard single-layer nonlinear shallow water model, whose forcings (mean sea level pressure and surface wind fields) are provided by the ensemble members of the ECMWF (European Center for Medium-Range Weather Forecasts) EPS. Results are verified against observations at five tide gauges located along the Croatian and Italian coasts of the Adriatic Sea. Forecast uncertainty increases with the predicted value of the storm surge and with the forecast lead time. The EMF (ensemble mean forecast) provided by the EPS has a rms (root mean square) error lower than the DF (deterministic forecast), especially for short (up to 3 days) lead times. Uncertainty for short lead times of the forecast and for small storm surges is mainly caused by uncertainty of the initial condition of the hydrodynamical model. Uncertainty for large lead times and large storm surges is mainly caused by uncertainty in the meteorological forcings. The EPS spread increases with the rms error of the forecast. For large lead times the EPS spread and the forecast error substantially coincide. However, the EPS spread in this study, which does not account for uncertainty in the initial condition, underestimates the error during the early part of the forecast and for small storm surge values. On the contrary, it overestimates the rms error for large surge values. 
The PF (probability forecast) of the EPS has a clear skill in predicting the actual probability distribution of sea level, and it outperforms simple "dressed" PF methods. A probability estimate based on the single DF is shown to be inadequate. However, a PF obtained with a prescribed Gaussian distribution and centered on the DF value performs very similarly to the EPS-based PF.
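The quantities compared in this abstract (ensemble mean forecast, ensemble spread, rms error, and a Gaussian probability forecast centred on a deterministic value) are simple to compute. The Python sketch below uses made-up sea level values in metres and hypothetical function names; it illustrates the definitions only and does not reproduce the verification carried out in the paper.

```python
import math
import statistics


def ensemble_mean_and_spread(members):
    """Ensemble mean forecast and spread (sample standard deviation)."""
    return statistics.fmean(members), statistics.stdev(members)


def rmse(forecasts, observations):
    """Root mean square error over a set of verification cases."""
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(statistics.fmean(squared))


def gaussian_exceedance(center, sigma, threshold):
    """P(level > threshold) for a Gaussian centred on a forecast value."""
    z = (threshold - center) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))


# Made-up cases: four ensemble members per case, one deterministic run,
# and the verifying observation (all values illustrative, in metres).
cases = [[0.42, 0.50, 0.47, 0.55], [0.90, 1.10, 0.95, 1.05]]
deterministic = [0.50, 0.90]
observed = [0.48, 1.02]

emf = [ensemble_mean_and_spread(members)[0] for members in cases]
spread = [ensemble_mean_and_spread(members)[1] for members in cases]
print(rmse(emf, observed), rmse(deterministic, observed), spread)
print(gaussian_exceedance(center=0.90, sigma=0.10, threshold=1.00))
```

Comparing the average spread with the rms error of the ensemble mean over many cases is the usual way to check whether an ensemble is under- or over-dispersive, which is the relationship the abstract discusses.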
Predicting patchy particle crystals: variable box shape simulations and evolutionary algorithms.
Bianchi, Emanuela; Doppelbauer, Günther; Filion, Laura; Dijkstra, Marjolein; Kahl, Gerhard
2012-06-07
We consider several patchy particle models that have been proposed in literature and we investigate their candidate crystal structures in a systematic way. We compare two different algorithms for predicting crystal structures: (i) an approach based on Monte Carlo simulations in the isobaric-isothermal ensemble and (ii) an optimization technique based on ideas of evolutionary algorithms. We show that the two methods are equally successful and provide consistent results on crystalline phases of patchy particle systems.
Climate Modeling and Causal Identification for Sea Ice Predictability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hunke, Elizabeth Clare; Urrego Blanco, Jorge Rolando; Urban, Nathan Mark
This project aims to better understand causes of ongoing changes in the Arctic climate system, particularly as decreasing sea ice trends have been observed in recent decades and are expected to continue in the future. As part of the Sea Ice Prediction Network, a multi-agency effort to improve sea ice prediction products on seasonal-to-interannual time scales, our team is studying the sensitivity of sea ice to a collection of physical processes and feedback mechanisms in the coupled climate system. During 2017 we completed a set of climate model simulations using the fully coupled ACME-HiLAT model. The simulations consisted of experiments in which cloud, sea ice, and air-ocean turbulent exchange parameters previously identified as important for driving output uncertainty in climate models were perturbed to account for parameter uncertainty in simulated climate variables. We conducted a sensitivity study to these parameters, which built upon a previous study we made for standalone simulations (Urrego-Blanco et al., 2016, 2017). Using the results from the ensemble of coupled simulations, we are examining robust relationships between climate variables that emerge across the experiments. We are also using causal discovery techniques to identify interaction pathways among climate variables which can help identify physical mechanisms and provide guidance in predictability studies. This work further builds on and leverages the large ensemble of standalone sea ice simulations produced in our previous w14_seaice project.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kraisler, Eli; Kronik, Leeor
2014-05-14
The fundamental gap is a central quantity in the electronic structure of matter. Unfortunately, the fundamental gap is not generally equal to the Kohn-Sham gap of density functional theory (DFT), even in principle. The two gaps differ precisely by the derivative discontinuity, namely, an abrupt change in slope of the exchange-correlation energy as a function of electron number, expected across an integer-electron point. Popular approximate functionals are thought to be devoid of a derivative discontinuity, strongly compromising their performance for prediction of spectroscopic properties. Here we show that, in fact, all exchange-correlation functionals possess a derivative discontinuity, which arises naturally from the application of ensemble considerations within DFT, without any empiricism. This derivative discontinuity can be expressed in closed form using only quantities obtained in the course of a standard DFT calculation of the neutral system. For small, finite systems, addition of this derivative discontinuity indeed results in a greatly improved prediction for the fundamental gap, even when based on the most simple approximate exchange-correlation density functional – the local density approximation (LDA). For solids, the same scheme is exact in principle, but when applied to LDA it results in a vanishing derivative discontinuity correction. This failure is shown to be directly related to the failure of LDA in predicting fundamental gaps from total energy differences in extended systems.
NASA Astrophysics Data System (ADS)
Athanasiadis, Panos; Gualdi, Silvio; Scaife, Adam A.; Bellucci, Alessio; Hermanson, Leon; MacLachlan, Craig; Arribas, Alberto; Materia, Stefano; Borelli, Andrea
2014-05-01
Low-frequency variability is a fundamental component of the atmospheric circulation. Extratropical teleconnections, the occurrence of blocking and the slow modulation of the jet streams and storm tracks are all different aspects of low-frequency variability. Part of the latter is attributed to the chaotic nature of the atmosphere and is inherently unpredictable. On the other hand, primarily as a response to boundary forcings, tropospheric low-frequency variability includes components that are potentially predictable. Seasonal forecasting faces the difficult task of predicting these components. Particularly referring to the extratropics, the current generation of seasonal forecasting systems seems to be approaching this target by realistically initializing most components of the climate system, using higher resolution and utilizing large ensemble sizes. Two seasonal prediction systems (Met-Office GloSea and CMCC-SPS-v1.5) are analyzed in terms of their representation of different aspects of extratropical low-frequency variability. The current operational Met-Office system achieves unprecedented high scores in predicting the winter-mean phase of the North Atlantic Oscillation (NAO, corr. 0.74 at 500 hPa) and the Pacific-N. American pattern (PNA, corr. 0.82). The CMCC system, considering its small ensemble size and coarse resolution, also achieves good scores (0.42 for NAO, 0.51 for PNA). Despite these positive features, both models suffer from biases in low-frequency variance, particularly in the N. Atlantic. Consequently, it is found that their intrinsic variability patterns (sectoral EOFs) differ significantly from the observed, and the known teleconnections are underrepresented. Regarding the representation of N. hemisphere blocking, after bias correction both systems exhibit a realistic climatology of blocking frequency. 
In this assessment, instantaneous blocking and large-scale persistent blocking events are identified using daily geopotential height fields at 500 hPa. Given a documented strong relationship between high-latitude N. Atlantic blocking and the NAO, one would expect a predictive skill for the seasonal frequency of blocking comparable to that of the NAO. However, this remains elusive. Future efforts should be in the direction of reducing model biases not only in the mean but also in variability (band-passed variances).
Mesoscale Climate Evaluation Using Grid Computing
NASA Astrophysics Data System (ADS)
Campos Velho, H. F.; Freitas, S. R.; Souto, R. P.; Charao, A. S.; Ferraz, S.; Roberti, D. R.; Streck, N.; Navaux, P. O.; Maillard, N.; Collischonn, W.; Diniz, G.; Radin, B.
2012-04-01
The CLIMARS project aims to establish an operational environment for seasonal climate prediction for the Rio Grande do Sul state, Brazil. The dynamical downscaling will be performed with the use of several software platforms and hardware infrastructure to carry out the investigation on mesoscale of the global change impact. Grid computing takes advantage of geographically spread out computer systems, connected by the internet, for enhancing the power of computation. Ensemble climate prediction is an appropriate application for grid computing, because the integration of each ensemble member does not depend on information from other ensemble members. The grid processing is employed to compute the 20-year climatology and the long range simulations under ensemble methodology. BRAMS (Brazilian Regional Atmospheric Model) is a mesoscale model developed from a version of the RAMS (from the Colorado State University - CSU, USA). The BRAMS model is the tool for carrying out the dynamical downscaling from the IPCC scenarios. Long range BRAMS simulations will provide data for some climate (data) analysis, and supply data for numerical integration of different models: (a) Regime of the extreme events for temperature and precipitation fields: statistical analysis will be applied on the BRAMS data, (b) CCATT-BRAMS (Coupled Chemistry Aerosol Tracer Transport - BRAMS) is an environmental prediction system that will be used to evaluate if the new standards of temperature, rain regime, and wind field have a significant impact on the pollutant dispersion in the analyzed regions, (c) MGB-IPH (Portuguese acronym for the Large Basin Model (MGB), developed by the Hydraulic Research Institute, (IPH) from the Federal University of Rio Grande do Sul (UFRGS), Brazil) will be employed to simulate the alteration of the river flux under new climate patterns. 
Important meteorological input variables for the MGB-IPH are the precipitation (most relevant), temperature, and wind field, all provided by BRAMS. The Uruguay river basin will be analyzed in the scope of this proposal. (d) INFOCROP: this crop model has been calibrated for Southern Brazil; three agricultural crops will be analyzed: rice, soybean and corn.
Cortical ensemble activity increasingly predicts behaviour outcomes during learning of a motor task
NASA Astrophysics Data System (ADS)
Laubach, Mark; Wessberg, Johan; Nicolelis, Miguel A. L.
2000-06-01
When an animal learns to make movements in response to different stimuli, changes in activity in the motor cortex seem to accompany and underlie this learning. The precise nature of modifications in cortical motor areas during the initial stages of motor learning, however, is largely unknown. Here we address this issue by chronically recording from neuronal ensembles located in the rat motor cortex, throughout the period required for rats to learn a reaction-time task. Motor learning was demonstrated by a decrease in the variance of the rats' reaction times and an increase in the time the animals were able to wait for a trigger stimulus. These behavioural changes were correlated with a significant increase in our ability to predict the correct or incorrect outcome of single trials based on three measures of neuronal ensemble activity: average firing rate, temporal patterns of firing, and correlated firing. This increase in prediction indicates that an association between sensory cues and movement emerged in the motor cortex as the task was learned. Such modifications in cortical ensemble activity may be critical for the initial learning of motor tasks.
NASA Technical Reports Server (NTRS)
Di Tomaso, Enza; Schutgens, Nick A. J.; Jorba, Oriol; Perez Garcia-Pando, Carlos
2017-01-01
A data assimilation capability has been built for the NMMB-MONARCH chemical weather prediction system, with a focus on mineral dust, a prominent type of aerosol. An ensemble-based Kalman filter technique (namely the local ensemble transform Kalman filter - LETKF) has been utilized to optimally combine model background and satellite retrievals. Our implementation of the ensemble is based on known uncertainties in the physical parametrizations of the dust emission scheme. Experiments showed that MODIS AOD retrievals using the Dark Target algorithm can help NMMB-MONARCH to better characterize atmospheric dust. This is particularly true for the analysis of the dust outflow in the Sahel region and over the African Atlantic coast. The assimilation of MODIS AOD retrievals based on the Deep Blue algorithm has a further positive impact in the analysis downwind from the strongest dust sources of the Sahara and in the Arabian Peninsula. An analysis-initialized forecast performs better (lower forecast error and higher correlation with observations) than a standard forecast, with the exception of underestimating dust in the long-range Atlantic transport and degradation of the temporal evolution of dust in some regions after day 1. Particularly relevant is the improved forecast over the Sahara throughout the forecast range thanks to the assimilation of Deep Blue retrievals over areas not easily covered by other observational datasets. The present study on mineral dust is a first step towards data assimilation with a complete aerosol prediction system that includes multiple aerosol species.
NASA Astrophysics Data System (ADS)
Di Tomaso, Enza; Schutgens, Nick A. J.; Jorba, Oriol; Pérez García-Pando, Carlos
2017-03-01
A data assimilation capability has been built for the NMMB-MONARCH chemical weather prediction system, with a focus on mineral dust, a prominent type of aerosol. An ensemble-based Kalman filter technique (namely the local ensemble transform Kalman filter - LETKF) has been utilized to optimally combine model background and satellite retrievals. Our implementation of the ensemble is based on known uncertainties in the physical parametrizations of the dust emission scheme. Experiments showed that MODIS AOD retrievals using the Dark Target algorithm can help NMMB-MONARCH to better characterize atmospheric dust. This is particularly true for the analysis of the dust outflow in the Sahel region and over the African Atlantic coast. The assimilation of MODIS AOD retrievals based on the Deep Blue algorithm has a further positive impact in the analysis downwind from the strongest dust sources of the Sahara and in the Arabian Peninsula. An analysis-initialized forecast performs better (lower forecast error and higher correlation with observations) than a standard forecast, with the exception of underestimating dust in the long-range Atlantic transport and degradation of the temporal evolution of dust in some regions after day 1. Particularly relevant is the improved forecast over the Sahara throughout the forecast range thanks to the assimilation of Deep Blue retrievals over areas not easily covered by other observational datasets. The present study on mineral dust is a first step towards data assimilation with a complete aerosol prediction system that includes multiple aerosol species.
NASA Astrophysics Data System (ADS)
Seyoum, Mesgana; van Andel, Schalk Jan; Xuan, Yunqing; Amare, Kibreab
Flow forecasting in poorly gauged, flood-prone Ribb and Gumara sub-catchments of the Blue Nile was studied with the aim of testing the performance of Quantitative Precipitation Forecasts (QPFs). Four types of QPFs namely MM5 forecasts with a spatial resolution of 2 km; the Maximum, Mean and Minimum members (MaxEPS, MeanEPS and MinEPS where EPS stands for Ensemble Prediction System) of the fixed, low resolution (2.5 by 2.5 degrees) National Oceanic and Atmospheric Administration Global Forecast System (NOAA GFS) ensemble forecasts were used. Both the MM5 and the EPS were not calibrated (bias correction, downscaling (for EPS), etc.). In addition, zero forecasts assuming no rainfall in the coming days, and monthly average forecasts assuming average monthly rainfall in the coming days, were used. These rainfall forecasts were then used to drive the Hydrologic Engineering Center’s-Hydrologic Modeling System, HEC-HMS, hydrologic model for flow predictions. The results show that flow predictions using MaxEPS and MM5 precipitation forecasts over-predicted the peak flow for most of the seven events analyzed, whereas under-predicted peak flow was found using zero- and monthly average rainfall. The comparison of observed and predicted flow hydrographs shows that MM5, MaxEPS and MeanEPS precipitation forecasts were able to capture the rainfall signal that caused peak flows. Flow predictions based on MaxEPS and MeanEPS gave results that were quantitatively close to the observed flow for most events, whereas flow predictions based on MM5 resulted in large overestimations for some events. In follow-up research for this particular case study, calibration of the MM5 model will be performed. The overall analysis shows that freely available atmospheric forecasting products can provide additional information on upcoming rainfall and peak flow events in areas where only base-line forecasts such as no-rainfall or climatology are available.
Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N
2017-11-14
Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.
An Analysis of Numerical Weather Prediction of the Diabatic Rossby Vortex
2014-06-01
Forecast SLP Mean and Spread ... 2. DRV02 72 Hour ... ECMWF Ensemble Forecast SLP Mean and Spread ... 3. ... DRV03 72 Hour ECMWF Ensemble Forecast SLP Mean and Spread
Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data
Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.
2016-01-01
We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and an interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872
Data Assimilation and Predictability Studies on Typhoon Sinlaku (2008) Using the WRF-LETKF System
NASA Astrophysics Data System (ADS)
Miyoshi, T.; Kunii, M.
2011-12-01
Data assimilation and predictability studies on tropical cyclones, with a particular focus on intensity forecasts, are performed with the newly developed Local Ensemble Transform Kalman Filter (LETKF) system with the WRF model. Taking advantage of intensive observations from the international collaborative T-PARC (THORPEX Pacific Asian Regional Campaign) project, we focus on Typhoon Sinlaku (2008), which intensified rapidly before making landfall in Taiwan. This study includes a number of data assimilation experiments, higher-resolution forecasts, and sensitivity analysis which quantifies impacts of observations on forecasts. This presentation includes the latest achievements up to the time of the conference.
NASA Astrophysics Data System (ADS)
Martin, A.; Pascal, C.; Leconte, R.
2014-12-01
Stochastic Dynamic Programming (SDP) is known to be an effective technique for finding the optimal operating policy of hydropower systems. To improve the performance of SDP, this project evaluates the impact of updating the policy at every time step using Ensemble Streamflow Prediction (ESP). We present a case study of the Kemano hydropower system on the Nechako River in British Columbia, Canada. Managed by Rio Tinto Alcan (RTA), this system is subject to large streamflow volumes in spring due to substantial snow accumulation during the winter season. Therefore, the operating policy should not only maximize production but also minimize the risk of flooding. The hydrological behavior of the system is simulated with CEQUEAU, a distributed and deterministic hydrological model developed by the Institut national de la recherche scientifique - Eau, Terre et Environnement (INRS-ETE) in Quebec, Canada. At each decision time step, CEQUEAU is used to generate ESP scenarios based on historical meteorological sequences and the current state of the hydrological model. These scenarios are fed into the SDP to optimize the release policy for the next time steps. This routine is then repeated over the entire simulation period. Results are compared with those obtained by using SDP on historical inflow scenarios.
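The ESP step described in this abstract (one streamflow trace per historical meteorological sequence, all started from the current model state) can be sketched as follows. This is a minimal illustration only: the linear reservoir is a hypothetical stand-in for CEQUEAU, and the function names, the recession constant `k`, and the forcing values are all assumptions made for the example.

```python
def linear_reservoir(storage, precip_series, k=0.1):
    """Toy hydrological model (assumed stand-in, not CEQUEAU):
    each step, outflow is a fixed fraction k of the stored water."""
    flows = []
    for p in precip_series:
        storage += p          # add this step's precipitation
        q = k * storage       # release a fraction of storage
        storage -= q
        flows.append(q)
    return flows


def esp_scenarios(current_storage, historical_forcings):
    """Build one streamflow trace per historical forcing sequence,
    every trace starting from the same current model state."""
    return {year: linear_reservoir(current_storage, series)
            for year, series in historical_forcings.items()}


# Hypothetical forcing sequences keyed by historical year.
history = {
    1990: [5.0, 0.0, 12.0],
    1991: [0.0, 0.0, 3.0],
    1992: [20.0, 8.0, 1.0],
}
scenarios = esp_scenarios(current_storage=100.0, historical_forcings=history)
```

In a scheme like the one described, the resulting set of traces would be passed to the optimizer at each decision step, and the whole procedure repeated once the model state has been advanced.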
Leong, Max K.; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng
2017-01-01
The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r2 = 0.928–0.988, = 0.894–0.954, RMSE = 0.002–0.412, s = 0.001–0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r2 = 0.967, = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q2 = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery. PMID:28059133
Leong, Max K; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng
2017-01-06
The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r2 = 0.928-0.988, = 0.894-0.954, RMSE = 0.002-0.412, s = 0.001-0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r2 = 0.967, = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q2 = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.
NASA Astrophysics Data System (ADS)
Leong, Max K.; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng
2017-01-01
The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r2 = 0.928-0.988, = 0.894-0.954, RMSE = 0.002-0.412, s = 0.001-0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r2 = 0.967, = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q2 = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.
Kinetic rate constant prediction supports the conformational selection mechanism of protein binding.
Moal, Iain H; Bates, Paul A
2012-01-01
The prediction of protein-protein kinetic rate constants provides a fundamental test of our understanding of molecular recognition, and will play an important role in the modeling of complex biological systems. In this paper, a feature selection and regression algorithm is applied to mine a large set of molecular descriptors and construct simple models for association and dissociation rate constants using empirical data. Using separate test data for validation, the predicted rate constants can be combined to calculate binding affinity with accuracy matching that of state of the art empirical free energy functions. The models show that the rate of association is linearly related to the proportion of unbound proteins in the bound conformational ensemble relative to the unbound conformational ensemble, indicating that the binding partners must adopt a geometry near to that of the bound prior to binding. Mirroring the conformational selection and population shift mechanism of protein binding, the models provide a strong separate line of evidence for the preponderance of this mechanism in protein-protein binding, complementing structural and theoretical studies.
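The statement that predicted rate constants can be combined into a binding affinity rests on a standard relation, written out here for reference. The symbols are not taken from the abstract: $k_{\mathrm{on}}$ (in M$^{-1}$ s$^{-1}$) and $k_{\mathrm{off}}$ (in s$^{-1}$) denote the association and dissociation rate constants, and $c^{\circ} = 1$ M is the standard-state concentration. The relation assumes simple two-state binding.

```latex
K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}},
\qquad
\Delta G^{\circ}_{\mathrm{bind}} = RT \,\ln\!\left(\frac{K_d}{c^{\circ}}\right)
```

For example, $k_{\mathrm{on}} = 10^{6}$ M$^{-1}$ s$^{-1}$ and $k_{\mathrm{off}} = 10^{-3}$ s$^{-1}$ give $K_d = 10^{-9}$ M, that is, nanomolar affinity.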
Bayesian model aggregation for ensemble-based estimates of protein pKa values
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gosink, Luke J.; Hogan, Emilie A.; Pulsipher, Trenton C.
2014-03-01
This paper investigates an ensemble-based technique called Bayesian Model Averaging (BMA) to improve the performance of protein amino acid p$$K_a$$ predictions. Structure-based p$$K_a$$ calculations play an important role in the mechanistic interpretation of protein structure and are also used to determine a wide range of protein properties. A diverse set of methods currently exists for p$$K_a$$ prediction, ranging from empirical statistical models to ab initio quantum mechanical approaches. However, each of these methods is based on a set of assumptions that have inherent bias and sensitivities that can affect a model's accuracy and generalizability for p$$K_a$$ prediction in complicated biomolecular systems. We use BMA to combine eleven diverse prediction methods that each estimate p$$K_a$$ values of amino acids in staphylococcal nuclease. These methods are based on work conducted for the pKa Cooperative and the p$$K_a$$ measurements are based on experimental work conducted by the García-Moreno lab. Our study demonstrates that the aggregated estimate obtained from BMA outperforms all individual prediction methods in our cross-validation study with improvements from 40-70% over other method classes. This work illustrates a new possible mechanism for improving the accuracy of p$$K_a$$ prediction and lays the foundation for future work on aggregate models that balance computational cost with prediction accuracy.
Alderman, Phillip D.; Stanfill, Bryan
2016-10-06
Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relative contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.
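For readers unfamiliar with the random walk Metropolis algorithm named in this abstract, a minimal sketch follows. It is not the authors' model: the single parameter (a mean number of days to heading), the flat prior, the Gaussian likelihood with known `sigma`, and the data values are all illustrative assumptions.

```python
import math
import random


def log_posterior(theta, data, sigma):
    """Log posterior up to a constant: flat prior and Gaussian likelihood
    with known sigma (both are assumptions made for this sketch)."""
    return -sum((y - theta) ** 2 for y in data) / (2.0 * sigma ** 2)


def random_walk_metropolis(data, sigma, start, step, n_iter, rng):
    """Sample theta by proposing Gaussian steps around the current value."""
    theta = start
    logp = log_posterior(theta, data, sigma)
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        chain.append(theta)
    return chain


# Hypothetical observations of days to heading for one genotype.
days_to_heading = [58.0, 61.0, 60.0, 63.0, 59.0, 62.0]
chain = random_walk_metropolis(days_to_heading, sigma=2.0, start=50.0,
                               step=1.0, n_iter=5000, rng=random.Random(1))
posterior_sample = chain[1000:]  # discard burn-in
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The retained chain approximates the full posterior distribution of the parameter, which is the quantity the abstract argues deserves more attention than a single point estimate.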
Zou, Lingyun; Nan, Chonghan; Hu, Fuquan
2013-12-15
Various human pathogens secrete effector proteins into host cells via the type IV secretion system (T4SS). These proteins play important roles in the interaction between bacteria and hosts. Computational methods for T4SS effector prediction have been developed for screening experimental targets in several isolated bacterial species; however, widely applicable prediction approaches are still unavailable. In this work, four types of distinctive features, namely, amino acid composition, dipeptide composition, position-specific scoring matrix composition and auto covariance transformation of position-specific scoring matrix, were calculated from primary sequences. A classifier, T4EffPred, was developed using the support vector machine with these features and their different combinations for effector prediction. Various theoretical tests were performed in a newly established dataset, and the results were measured with four indexes. We demonstrated that T4EffPred can discriminate IVA and IVB effectors in benchmark datasets with positive rates of 76.7% and 89.7%, respectively. The overall accuracy of 95.9% shows that the present method is accurate for distinguishing the T4SS effector in unidentified sequences. A classifier ensemble was designed to synthesize all single classifiers. Notable performance improvement was observed using this ensemble system in benchmark tests. To demonstrate the model's application, a genome-scale prediction of effectors was performed in Bartonella henselae, an important zoonotic pathogen. A number of putative candidates were distinguished. A web server implementing the prediction method and the source code are both available at http://bioinfo.tmmu.edu.cn/T4EffPred.
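Two of the four feature types listed in this abstract, amino acid composition and dipeptide composition, are simple enough to sketch directly. The code below is an illustration under assumptions, not the T4EffPred source: the function names and the example sequence are hypothetical, and sequences are assumed to contain only the 20 standard residues.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each residue type: a 20-dimensional feature vector."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair: a 400-dimensional vector."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


sequence = "MKTAYIAKQR"  # hypothetical protein fragment
aac = amino_acid_composition(sequence)
dpc = dipeptide_composition(sequence)
```

Vectors of this kind can be concatenated and passed to a support vector machine. The position-specific scoring matrix features mentioned in the abstract need an alignment step and are not covered here.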
NASA Astrophysics Data System (ADS)
Solvang Johansen, Stian; Steinsland, Ingelin; Engeland, Kolbjørn
2016-04-01
Running hydrological models with precipitation and temperature ensemble forcing to generate ensembles of streamflow is a commonly used method in operational hydrology. Evaluations of streamflow ensembles have, however, revealed that the ensembles are biased with respect to both mean and spread. Thus postprocessing of the ensembles is needed in order to improve the forecast skill. The aims of this study are (i) to evaluate how postprocessing of streamflow ensembles works for Norwegian catchments within different hydrological regimes and (ii) to demonstrate how postprocessed streamflow ensembles are used operationally by a hydropower producer. These aims were achieved by postprocessing forecasted daily discharge for 10 lead times for 20 catchments in Norway, using EPS forcing from ECMWF applied to the semi-distributed HBV model, which divides each catchment into 10 elevation zones. Statkraft Energi uses forecasts from these catchments for scheduling hydropower production. The catchments represent different hydrological regimes. Some catchments have stable winter conditions with winter low flow and a major flood event during spring or early summer caused by snow melting. Others have a more mixed snow-rain regime, often with a secondary flood season during autumn, and in the coastal areas the streamflow is dominated by rain, and the main flood season is autumn and winter. For postprocessing, a Bayesian model averaging (BMA) model close to that of Kleiber et al. (2011) is used. The model creates a predictive PDF that is a weighted average of PDFs centered on the individual bias-corrected forecasts. The weights are here equal since all ensemble members come from the same model, and thus have the same probability. For modeling streamflow, the gamma distribution is chosen as the predictive PDF. The bias correction parameters and the PDF parameters are estimated using a 30-day sliding window training period.
Preliminary results show that the improvement varies between catchments depending on where they are situated and on the hydrological regime. There is an improvement in CRPS for all catchments compared to raw EPS ensembles. The improvement extends up to lead times 5-7. The postprocessing also improves the MAE for the median of the predictive PDF compared to the median of the raw EPS, but less than for CRPS, often only up to lead times 2-3. The streamflow ensembles are to some extent used operationally in Statkraft Energi (hydropower company, Norway), with respect to early warning, risk assessment and decision-making. Presently all forecasts used operationally for short-term scheduling are deterministic, but ensembles are used visually for expert assessment of risk in difficult situations where, e.g., there is a chance of overflow in a reservoir. However, there are plans to incorporate ensembles in the daily scheduling of hydropower production.
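The predictive PDF described in this abstract, an equal-weight average of gamma densities centered on bias-corrected members, can be sketched as below. Several details are assumptions made for illustration: the linear bias correction `a + b * f`, a single shared variance, and the parameter and discharge values. The study estimates its parameters from a 30-day sliding window, which is not reproduced here.

```python
import math


def gamma_pdf(x, mean, var):
    """Gamma density parameterised by its mean and variance."""
    if x <= 0.0:
        return 0.0
    shape = mean * mean / var
    scale = var / mean
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def bma_pdf(x, forecasts, a, b, var):
    """Equal-weight mixture of gamma PDFs, one per bias-corrected member.
    The linear correction a + b * f is an assumed form."""
    corrected = [a + b * f for f in forecasts]
    return sum(gamma_pdf(x, m, var) for m in corrected) / len(corrected)


members = [40.0, 45.0, 52.0, 60.0]  # hypothetical raw ensemble discharge, m3/s

# Crude Riemann sum as a sanity check that the mixture integrates to about 1.
dx = 0.05
total = sum(bma_pdf(i * dx, members, a=1.0, b=0.95, var=36.0) * dx
            for i in range(1, 6000))
```

Because the weights are equal, the mean of the mixture is simply the mean of the bias-corrected members. The value added by the postprocessing lies in the corrected bias and in the spread contributed by each gamma kernel.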
Regional sea level variability in a high-resolution global coupled climate model
NASA Astrophysics Data System (ADS)
Palko, D.; Kirtman, B. P.
2016-12-01
The prediction of trends at regional scales is essential in order to adapt to and prepare for the effects of climate change. However, GCMs are unable to make reliable predictions at regional scales. The prediction of local sea level trends is particularly critical. The main goal of this research is to utilize high-resolution (HR) (0.1° resolution in the ocean) coupled model runs of CCSM4 to analyze regional sea surface height (SSH) trends. Unlike typical, lower resolution (1.0°) GCM runs these HR runs resolve features in the ocean, like the Gulf Stream, which may have a large effect on regional sea level. We characterize the variability of regional SSH along the Atlantic coast of the US using tide gauge observations along with fixed radiative forcing runs of CCSM4 and HR interactive ensemble runs. The interactive ensemble couples an ensemble mean atmosphere with a single ocean realization. This coupling results in a 30% decrease in the strength of the Atlantic meridional overturning circulation; therefore, the HR interactive ensemble is analogous to a HR hosing experiment. By characterizing the variability in these high-resolution GCM runs and observations we seek to understand what processes influence coastal SSH along the Eastern Coast of the United States and better predict future SLR.
Stability of Ensemble Models Predicts Productivity of Enzymatic Systems
Theisen, Matthew K.; Lafontaine Rivera, Jimmy G.; Liao, James C.
2016-03-10
Stability in a metabolic system may not be obtained if incorrect amounts of enzymes are used. Without stability, some metabolites may accumulate or deplete leading to the irreversible loss of the desired operating point. Even if initial enzyme amounts achieve a stable steady state, changes in enzyme amount due to stochastic variations or environmental changes may move the system to the unstable region and lose the steady-state or quasi-steady-state flux. This situation is distinct from the phenomenon characterized by typical sensitivity analysis, which focuses on the smooth change before loss of stability. Here we show that metabolic networks differ significantly in their intrinsic ability to attain stability due to the network structure and kinetic forms, and that after achieving stability, some enzymes are prone to cause instability upon changes in enzyme amounts. We use Ensemble Modelling for Robustness Analysis (EMRA) to analyze stability in four cell-free enzymatic systems when enzyme amounts are changed. Loss of stability in continuous systems can lead to lower production even when the system is tested experimentally in batch experiments. The predictions of instability by EMRA are supported by the lower productivity in batch experimental tests. Finally, the EMRA method incorporates properties of network structure, including stoichiometry and kinetic form, but does not require specific parameter values of the enzymes.
NASA Astrophysics Data System (ADS)
Verkade, J. S.; Brown, J. D.; Davids, F.; Reggiani, P.; Weerts, A. H.
2017-12-01
Two statistical post-processing approaches for estimation of predictive hydrological uncertainty are compared: (i) 'dressing' of a deterministic forecast by adding a single, combined estimate of both hydrological and meteorological uncertainty and (ii) 'dressing' of an ensemble streamflow forecast by adding an estimate of hydrological uncertainty to each individual streamflow ensemble member. Both approaches aim to produce an estimate of the 'total uncertainty' that captures both the meteorological and hydrological uncertainties. They differ in the degree to which they make use of statistical post-processing techniques. In the 'lumped' approach, both sources of uncertainty are lumped by post-processing deterministic forecasts using their verifying observations. In the 'source-specific' approach, the meteorological uncertainties are estimated by an ensemble of weather forecasts. These ensemble members are routed through a hydrological model and a realization of the probability distribution of hydrological uncertainties (only) is then added to each ensemble member to arrive at an estimate of the total uncertainty. The techniques are applied to one location in the Meuse basin and three locations in the Rhine basin. Resulting forecasts are assessed for their reliability and sharpness, as well as compared in terms of multiple verification scores including the relative mean error, Brier Skill Score, Mean Continuous Ranked Probability Skill Score, Relative Operating Characteristic Score and Relative Economic Value. The dressed deterministic forecasts are generally more reliable than the dressed ensemble forecasts, but the latter are sharper. On balance, however, they show similar quality across a range of verification metrics, with the dressed ensembles coming out slightly better. Some additional analyses are suggested. 
Notably, these include statistical post-processing of the meteorological forecasts in order to increase their reliability, thus increasing the reliability of the streamflow forecasts produced with ensemble meteorological forcings.
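The 'source-specific' approach in this abstract, where a realization of hydrological uncertainty is added to each streamflow ensemble member, can be sketched as follows. The additive zero-mean Gaussian error, its standard deviation, and the member values are assumptions for illustration; the study estimates the hydrological error distribution by statistical post-processing.

```python
import random
import statistics


def dress_ensemble(members, hydro_error_sd, draws_per_member, rng):
    """Add draws of a hydrological error to every ensemble member.
    The zero-mean Gaussian form is an assumption of this sketch."""
    dressed = []
    for member in members:
        for _ in range(draws_per_member):
            dressed.append(member + rng.gauss(0.0, hydro_error_sd))
    return dressed


# Hypothetical streamflow ensemble (m3/s) from routed weather forecasts.
raw = [100.0, 110.0, 120.0, 130.0, 140.0]
dressed = dress_ensemble(raw, hydro_error_sd=15.0, draws_per_member=200,
                         rng=random.Random(42))

raw_var = statistics.pvariance(raw)
dressed_var = statistics.pvariance(dressed)
```

The dressed sample is wider than the raw ensemble because it carries both the meteorological spread already present in the members and the added hydrological spread, which is what the abstract calls the total uncertainty.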
A Novel Multi-Class Ensemble Model for Classifying Imbalanced Biomedical Datasets
NASA Astrophysics Data System (ADS)
Bikku, Thulasi; Sambasiva Rao, N., Dr; Rao, Akepogu Ananda, Dr
2017-08-01
This paper mainly focuses on developing a Hadoop-based framework for feature selection and classification models to classify high-dimensional data in heterogeneous biomedical databases. Extensive research has been performed in the fields of machine learning, big data and data mining for identifying patterns. The main challenge is extracting useful features generated from diverse biological systems. The proposed model can be used for predicting diseases in various applications and identifying the features relevant to particular diseases. With the exponential growth of biomedical repositories such as PubMed and Medline, an accurate predictive model is essential for knowledge discovery in a Hadoop environment. Extracting key features from unstructured documents often leads to uncertain results due to outliers and missing values. In this paper, we propose a two-phase map-reduce framework with a text preprocessor and a classification model. In the first phase, a mapper-based preprocessing method was designed to eliminate irrelevant features, missing values and outliers from the biomedical data. In the second phase, a Map-Reduce based multi-class ensemble decision tree model was designed and implemented on the preprocessed mapper data to improve the true positive rate and computational time. The experimental results on complex biomedical datasets show that our proposed Hadoop-based multi-class ensemble model significantly outperforms state-of-the-art baselines.
NASA Technical Reports Server (NTRS)
Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max
1999-01-01
A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
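The central idea of this abstract, estimating error covariances from the ensemble spread and using them in the Kalman update, can be shown in a minimal scalar sketch. This is far simpler than the system described: the state is a single directly observed number, the perturbed-observation variant of the EnKF is an assumed choice, and all numerical values are hypothetical.

```python
import random
import statistics


def enkf_update(ensemble, obs, obs_var, rng):
    """Stochastic EnKF analysis step for a scalar state observed directly."""
    # The forecast error variance is estimated from the ensemble spread.
    forecast_var = statistics.variance(ensemble)
    # Kalman gain for an identity observation operator.
    gain = forecast_var / (forecast_var + obs_var)
    # Each member assimilates its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x)
            for x in ensemble]


rng = random.Random(0)
# Hypothetical forecast ensemble of ocean temperature (degrees C).
forecast = [rng.gauss(20.0, 2.0) for _ in range(200)]
analysis = enkf_update(forecast, obs=22.0, obs_var=0.5, rng=rng)
```

After the update the ensemble mean moves toward the observation and the spread contracts. Each member is updated independently of the others once the gain is known, which is the property that makes the algorithm straightforward to parallelize.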
Stimuli Reduce the Dimensionality of Cortical Activity
Mazzucato, Luca; Fontanini, Alfredo; La Camera, Giancarlo
2016-01-01
The activity of ensembles of simultaneously recorded neurons can be represented as a set of points in the space of firing rates. Even though the dimension of this space is equal to the ensemble size, neural activity can be effectively localized on smaller subspaces. The dimensionality of the neural space is an important determinant of the computational tasks supported by the neural activity. Here, we investigate the dimensionality of neural ensembles from the sensory cortex of alert rats during periods of ongoing (inter-trial) and stimulus-evoked activity. We find that dimensionality grows linearly with ensemble size, and grows significantly faster during ongoing activity compared to evoked activity. We explain these results using a spiking network model based on a clustered architecture. The model captures the difference in growth rate between ongoing and evoked activity and predicts a characteristic scaling with ensemble size that could be tested in high-density multi-electrode recordings. Moreover, we present a simple theory that predicts the existence of an upper bound on dimensionality. This upper bound is inversely proportional to the amount of pair-wise correlations and, compared to a homogeneous network without clusters, it is larger by a factor equal to the number of clusters. The empirical estimation of such bounds depends on the number and duration of trials and is well predicted by the theory. Together, these results provide a framework to analyze neural dimensionality in alert animals, its behavior under stimulus presentation, and its theoretical dependence on ensemble size, number of clusters, and correlations in spiking network models. PMID:26924968
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to the investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, given the variety of available features, it is still a great challenge to select a proper classification algorithm and extract the essential features for classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physicochemical features were adopted to represent proteins, and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, the thirty-eight algorithms were executed on a dataset in which proteins were represented by the features in the set. The predicted classes yielded by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. Algorithms were then taken from this list one by one to build ensemble prediction models. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to other models that adopt only the feature selection procedure or only the algorithm selection procedure. Both the feature selection procedure and the algorithm selection procedure are helpful for building an ensemble prediction model that yields better performance.
NASA Astrophysics Data System (ADS)
Tito Arandia Martinez, Fabian
2014-05-01
Adequate uncertainty assessment is an important issue in hydrological modelling. An important issue for hydropower producers is to obtain ensemble forecasts which truly grasp the uncertainty linked to upcoming streamflows. If properly assessed, this uncertainty can lead to optimal reservoir management and energy production (e.g., [1]). The meteorological inputs to the hydrological model account for an important part of the total uncertainty in streamflow forecasting. Since the creation of the THORPEX initiative and the TIGGE database, meteorological ensemble forecasts from nine agencies throughout the world have been made available. This allows for hydrological ensemble forecasts based on multiple meteorological ensemble forecasts. Consequently, both the uncertainty linked to the architecture of the meteorological model and the uncertainty linked to the initial condition of the atmosphere can be accounted for. The main objective of this work is to show that a weighted combination of meteorological ensemble forecasts based on different atmospheric models can lead to improved hydrological ensemble forecasts, for horizons from one to ten days. This experiment is performed for the Baskatong watershed, a head subcatchment of the Gatineau watershed in the province of Quebec, Canada. The Baskatong watershed is of great importance for hydropower production, as it comprises the main reservoir for the Gatineau watershed, on which there are six hydropower plants managed by Hydro-Québec. Since the 1970s, they have been using pseudo ensemble forecasts based on deterministic meteorological forecasts to which variability derived from past forecasting errors is added. We use a combination of meteorological ensemble forecasts from different models (precipitation and temperature) as the main inputs for the hydrological model HSAMI ([2]).
The meteorological ensembles from eight of the nine agencies available through TIGGE are weighted according to their individual performance and combined to form a grand ensemble. Results show that the hydrological forecasts derived from the grand ensemble perform better than the pseudo ensemble forecasts currently used operationally at Hydro-Québec. References: [1] M. Verbunt, A. Walser, J. Gurtz et al., "Probabilistic flood forecasting with a limited-area ensemble prediction system: Selected case studies," Journal of Hydrometeorology, vol. 8, no. 4, pp. 897-909, Aug. 2007. [2] N. Evora, Valorisation des prévisions météorologiques d'ensemble, Institut de recherche d'Hydro-Québec, 2005. [3] V. Fortin, Le modèle météo-apport HSAMI: historique, théorie et application, Institut de recherche d'Hydro-Québec, 2000.
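The abstract does not state how the performance weights are computed or applied, so the following is only one plausible sketch of a performance-weighted grand ensemble. It assumes each model has a past error score where lower is better, and it uses inverse-error weights shared equally among a model's members. The names `performance_weights` and `grand_ensemble_mean` are hypothetical.

```python
import numpy as np

def performance_weights(scores):
    """Convert per-model error scores (lower is better) into weights summing to 1.

    Inverse-error weighting is an assumption made for illustration.
    """
    inv = 1.0 / np.asarray(scores, dtype=float)
    return inv / inv.sum()

def grand_ensemble_mean(member_sets, weights):
    """Weighted mean of a grand ensemble.

    member_sets : list of 1-D arrays, the members of each model's ensemble
    weights     : one weight per model; each member receives its model's
                  weight divided by that model's member count, so a model
                  with many members does not dominate by size alone
    """
    values = np.concatenate(member_sets)
    member_w = np.concatenate(
        [np.full(len(m), w / len(m)) for m, w in zip(member_sets, weights)]
    )
    return float(np.sum(member_w * values))
```

For example, two models with error scores 1 and 3 receive weights 0.75 and 0.25, regardless of how many members each contributes.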
NASA Astrophysics Data System (ADS)
Mel, Riccardo; Lionello, Piero
2014-05-01
Advantages of an ensemble prediction forecast (EPF) technique that has been used for sea level (SL) prediction at the Northern Adriatic coast are investigated. The aim is to explore whether the EPF is more precise than the traditional deterministic forecast (DF) and to assess the value of the added information, mainly on forecast uncertainty. Improving the SL forecast for the city of Venice is of paramount importance for the management and maintenance of this historical city and for operating the movable barriers that are presently being built for its protection. The operational practice is simulated for three months, from 1 October to 31 December 2010. The EPF is based on the HYPSE model, a standard single-layer nonlinear shallow water model whose equations are derived from the depth-averaged momentum equations and which predicts the SL. A description of the model is available in the scientific literature. Forcing for HYPSE is provided by three different sets of 3-hourly ECMWF 10 m wind and MSLP fields: the high-resolution meteorological forecast (which is used for the deterministic SL forecast, DF), the control run forecast (CRF, which differs from the DF only in the lower resolution of its meteorological fields) and the 50 ensemble members of the ECMWF EPS (which are used for the SL-EPS). The resolution of the DF fields is T1279 and the resolution of both the CRF and ECMWF EPS fields is T639. The 10 m wind and MSLP fields have been downloaded at 0.125 degrees (DF) and 0.25 degrees (CRF and EPS) and linearly interpolated to the HYPSE grid (which is the same for all simulations). The version of HYPSE used in the SR EPS uses a rectangular mesh grid of variable size, which has the minimum grid step (0.03 degrees) in the northern part of the Adriatic Sea, from where the grid step increases by a factor of 1.01 in both latitude and longitude (in practice, resolution varies in the range from 3.3 to 7 km).
Results are analyzed considering the EPS spread, the rms of the simulations and the Brier Skill Score, and are compared to observations at tide gauges distributed along the Croatian and Italian coasts of the Adriatic Sea. It is shown that the ensemble spread is indeed a reliable indicator of the uncertainty of the storm surge prediction. Further, results show how uncertainty depends on the predicted value of sea level and how it increases with the forecast time range. The accuracy of the ensemble mean forecast is actually larger than that of the deterministic forecast, though the latter is produced by meteorological forcings at higher resolution.
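As background for the Brier Skill Score mentioned in this abstract, here is a minimal sketch of how the score is commonly computed from an ensemble. It assumes the event is "sea level exceeds a threshold" and that the event probability is the fraction of members above that threshold; the abstract does not specify the events or the reference forecast used, and the function names are hypothetical.

```python
import numpy as np

def exceedance_probability(ensemble, threshold):
    """Fraction of members above the threshold at each forecast time.

    ensemble : (n_times, n_members) array of predicted sea levels
    """
    return np.mean(np.asarray(ensemble, dtype=float) > threshold, axis=1)

def brier_score(prob, outcome):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean((prob - outcome) ** 2))

def brier_skill_score(prob, prob_ref, outcome):
    """BSS = 1 - BS / BS_ref; positive values mean the forecast beats the reference."""
    return 1.0 - brier_score(prob, outcome) / brier_score(prob_ref, outcome)
```

A deterministic forecast can be scored with the same functions by treating it as a one-member ensemble, which yields probabilities of exactly 0 or 1.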
Zhou, Jiyun; Lu, Qin; Xu, Ruifeng; He, Yulan; Wang, Hongpeng
2017-08-29
Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model and PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the prediction methods, with improvements of 0.02-0.07 for MCC, 4.18-21.47% for ST and 0.013-0.131 for AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. We propose a novel prediction method for the prediction of DNA-binding residues with the inclusion of the relationship of evolutionary information and ensemble learning.
Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.
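The abstract reports gains in MCC (Matthews correlation coefficient), a measure often preferred when classes are imbalanced, as binding and non-binding residues are here. The following minimal sketch computes MCC from confusion-matrix counts using the standard formula; the function name `mcc_from_counts` and the example counts are illustrative and not taken from the paper.

```python
import math

def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Illustrative imbalanced case: 10 binding residues among 100.
# Accuracy is 0.95, yet half of the binding residues are missed,
# and MCC reflects this more clearly than accuracy does.
example = mcc_from_counts(tp=5, tn=90, fp=0, fn=5)
```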
NASA Astrophysics Data System (ADS)
Noh, Seong Jin; Rakovec, Oldrich; Kumar, Rohini; Samaniego, Luis
2016-04-01
There have been tremendous improvements in distributed hydrologic modeling (DHM), which have made process-based simulation with a high spatiotemporal resolution applicable on a large spatial scale. Despite increasing information on the heterogeneous properties of a catchment, DHM is still subject to uncertainties inherently coming from model structure, parameters and input forcing. Sequential data assimilation (DA) may facilitate improved streamflow prediction via DHM by using real-time observations to correct internal model states. In conventional DA methods such as state updating, parametric uncertainty is, however, often ignored, mainly due to practical limitations of the methodology in specifying modeling uncertainty with limited ensemble members. If parametric uncertainty related to routing and runoff components is not incorporated properly, predictive uncertainty by DHM may be insufficient to capture the dynamics of observations, which may deteriorate predictability. Recently, a multi-scale parameter regionalization (MPR) method was proposed to make hydrologic predictions at different scales using the same set of model parameters without losing much of the model performance. The MPR method incorporated within the mesoscale hydrologic model (mHM, http://www.ufz.de/mhm) could effectively represent and control uncertainty of high-dimensional parameters in a distributed model using global parameters. In this study, we present a global multi-parametric ensemble approach to incorporate parametric uncertainty of DHM in DA to improve streamflow predictions. To effectively represent and control uncertainty of high-dimensional parameters with a limited number of ensemble members, the MPR method is incorporated with DA. Lagged particle filtering is utilized to consider the response times and non-Gaussian characteristics of internal hydrologic processes.
The hindcasting experiments are implemented to evaluate impacts of the proposed DA method on streamflow predictions in multiple European river basins having different climate and catchment characteristics. Because augmentation of parameters is not required within an assimilation window, the approach could be stable with limited ensemble members and viable for practical uses.
Pai, Priyadarshini P; Mondal, Sukanta
2016-10-01
Proteins interact with carbohydrates to perform various cellular interactions. Of the many carbohydrate ligands that proteins bind with, mannose constitutes an important class, playing important roles in host defense mechanisms. Accurate identification of mannose-interacting residues (MIR) may provide important clues to decipher the underlying mechanisms of protein-mannose interactions during infections. This study proposes an approach using an ensemble of base classifiers for prediction of MIR using their evolutionary information in the form of a position-specific scoring matrix. The base classifiers are random forests trained on different subsets of the training data set Dset128 using 10-fold cross-validation. The optimized ensemble of base classifiers, MOWGLI, is then used to predict MIR on protein chains of the test data set Dtestset29, where it showed promising performance with 92.0% accurate prediction. An overall improvement of 26.6% in precision was observed upon comparison with the state of the art. It is hoped that this approach, yielding enhanced predictions, could eventually be used for applications in drug design and vaccine development.
Confident Surgical Decision Making in Temporal Lobe Epilepsy by Heterogeneous Classifier Ensembles
Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh; Elisevich, Kost; Fotouhi, Farshad
2015-01-01
In medical domains with low tolerance for invalid predictions, classification confidence is highly important, and traditional performance measures such as overall accuracy cannot provide adequate insight into classification reliability. In this paper, a confident-prediction rate (CPR), which measures the upper limit of confident predictions, has been proposed based on receiver operating characteristic (ROC) curves. It has been shown that a heterogeneous ensemble of classifiers improves this measure. This ensemble approach has been applied to lateralization of focal epileptogenicity in temporal lobe epilepsy (TLE) and prediction of surgical outcomes. A goal of this study is to reduce the requirement for extraoperative electrocorticography (eECoG), which is the practice of using electrodes placed directly on the exposed surface of the brain. We have shown that such a goal is achievable with the application of data mining techniques. Furthermore, not all TLE surgical operations result in complete relief from seizures, and it is not always possible for human experts to identify such unsuccessful cases prior to surgery. This study demonstrates the capability of data mining techniques in predicting an undesirable outcome for a portion of such cases. PMID:26609547
NASA Astrophysics Data System (ADS)
Goyal, Sandeep K.; Singh, Rajeev; Ghosh, Sibasish
2016-01-01
Mixed states of a quantum system, represented by density operators, can be decomposed as a statistical mixture of pure states in a number of ways, where each decomposition can be viewed as a different preparation recipe. However, the fact that the density matrix contains full information about the ensemble makes it impossible to estimate the preparation basis for the quantum system. Here we present a measurement scheme to (seemingly) improve the performance of unsharp measurements. We argue that in some situations this scheme is capable of providing statistics from a single copy of the quantum system, thus making it possible to perform state tomography from a single copy. One of the by-products of the scheme is a way to distinguish between different preparation methods used to prepare the state of the quantum system. However, our numerical simulations disagree with our intuitive predictions. We show that a counterintuitive property of a biased classical random walk is responsible for the proposed mechanism not working.
Ensemble of ground subsidence hazard maps using fuzzy logic
NASA Astrophysics Data System (ADS)
Park, Inhye; Lee, Jiyeong; Saro, Lee
2014-06-01
Hazard maps of ground subsidence around abandoned underground coal mines (AUCMs) in Samcheok, Korea, were constructed using fuzzy ensemble techniques and a geographical information system (GIS). To evaluate the factors related to ground subsidence, a spatial database was constructed from topographic, geologic, mine tunnel, land use, groundwater, and ground subsidence maps. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 70/30 for training and validation of the models. The relationships between the detected ground-subsidence area and the factors were identified and quantified by frequency ratio (FR), logistic regression (LR) and artificial neural network (ANN) models. The relationships were used as factor ratings in the overlay analysis to create ground-subsidence hazard indexes and maps. The three GSH maps were then used as new input factors and integrated using fuzzy-ensemble methods to make better hazard maps. All of the hazard maps were validated by comparison with known subsidence areas that were not used directly in the analysis. As a result, the ensemble model was found to be more effective in terms of prediction accuracy than the individual models.
NASA Astrophysics Data System (ADS)
Gelb, Lev D.; Chakraborty, Somendra Nath
2011-12-01
The normal boiling points are obtained for a series of metals as described by the "quantum-corrected Sutton Chen" (qSC) potentials [S.-N. Luo, T. J. Ahrens, T. Çağın, A. Strachan, W. A. Goddard III, and D. C. Swift, Phys. Rev. B 68, 134206 (2003)]. Instead of conventional Monte Carlo simulations in an isothermal or expanded ensemble, simulations were done in the constant-NPH adiabatic variant of the Gibbs ensemble technique as proposed by Kristóf and Liszi [Chem. Phys. Lett. 261, 620 (1996)]. This simulation technique is shown to be a precise tool for direct calculation of boiling temperatures in high-boiling fluids, with results that are almost completely insensitive to system size or other arbitrary parameters as long as the potential truncation is handled correctly. Results obtained were validated using conventional NVT-Gibbs ensemble Monte Carlo simulations. The qSC predictions for boiling temperatures are found to be reasonably accurate, but substantially underestimate the enthalpies of vaporization in all cases. This appears to be largely due to the systematic overestimation of dimer binding energies by this family of potentials, which leads to an unsatisfactory description of the vapor phase.
Helmholtz and Gibbs ensembles, thermodynamic limit and bistability in polymer lattice models
NASA Astrophysics Data System (ADS)
Giordano, Stefano
2017-12-01
Representing polymers by random walks on a lattice is a fruitful approach, widely exploited to study the configurational statistics of polymer chains and to develop efficient Monte Carlo algorithms. Nevertheless, the stretching and the folding/unfolding of polymer chains within the Gibbs (isotensional) and the Helmholtz (isometric) ensembles of statistical mechanics have not yet been thoroughly analysed by means of the lattice methodology. This topic, motivated by the recent introduction of several single-molecule force spectroscopy techniques, is investigated in the present paper. In particular, we analyse the force-extension curves under the Gibbs and Helmholtz conditions and we give a proof of the equivalence of the ensembles in the thermodynamic limit for polymers represented by a standard random walk on a lattice. Then, we generalize these concepts for lattice polymers that can undergo conformational transitions or, equivalently, for chains composed of bistable or two-state elements (that can be either folded or unfolded). In this case, the isotensional condition leads to a plateau-like force-extension response, whereas the isometric condition causes a sawtooth-like force-extension curve, as predicted by numerous experiments. The equivalence of the ensembles is finally proved also for lattice polymer systems exhibiting conformational transitions.
A Bayesian Ensemble Approach for Epidemiological Projections
Lindström, Tom; Tildesley, Michael; Webb, Colleen
2015-01-01
Mathematical models are powerful tools for epidemiology and can be used to compare control actions. However, different models and model parameterizations may provide different predictions of outcomes. In other fields of research, ensemble modeling has been used to combine multiple projections. We explore the possibility of applying such methods to epidemiology by adapting Bayesian techniques developed for climate forecasting. We exemplify the implementation with single-model ensembles based on different parameterizations of the Warwick model run for the 2001 United Kingdom foot and mouth disease outbreak and compare the efficacy of different control actions. This allows us to investigate the effect that discrepancy among projections based on different modeling assumptions has on the ensemble prediction. A sensitivity analysis showed that the choice of prior can have a pronounced effect on the posterior estimates of quantities of interest, in particular for ensembles with large discrepancy among projections. However, by using a hierarchical extension of the method we show that prior sensitivity can be circumvented. We further extend the method to include a priori beliefs about different modeling assumptions and demonstrate that the effect of this can have different consequences depending on the discrepancy among projections. We propose that the method is a promising analytical tool for ensemble modeling of disease outbreaks. PMID:25927892
NASA Astrophysics Data System (ADS)
van der Zwan, Rene
2013-04-01
The Rijnland water system is situated in the western part of the Netherlands, and is a low-lying area of which 90% is below sea level. The area covers 1,100 square kilometres, where 1.3 million people live, work, travel and enjoy leisure. The District Water Control Board of Rijnland is responsible for flood defence and for water quantity and quality management. This includes design and maintenance of flood defence structures, control of regulating structures for adequate water level management, and waste water treatment. For water quantity management Rijnland uses, besides an online monitoring network for collecting water level and precipitation data, a real-time control decision support system. This decision support system consists of deterministic hydro-meteorological forecasts with a 24-hr forecast horizon, coupled with a control module that provides optimal operation schedules for the storage basin pumping stations. The uncertainty of the rainfall forecast is not propagated into the hydrological prediction. At this moment 65% of the pumping capacity of the storage basin pumping stations can be automatically controlled by the decision control system. Within 5 years, after renovation of two other pumping stations, the total capacity of 200 m3/s will be automatically controlled. In critical conditions there is a need for both a longer forecast horizon and a probabilistic forecast. Therefore ensemble precipitation forecasts of the ECMWF are already consulted off-line during dry spells, and Rijnland is running a pilot operational system providing 10-day water level ensemble forecasts. The use of EPS during dry spells and the findings of the pilot will be presented. Challenges and next steps towards on-line implementation of ensemble forecasts for risk-based operational management of the Rijnland water system will be discussed.
An important element in that discussion is the question: will policy and decision makers, operators and citizens adopt this Anticipatory Water Management, including temporarily lower storage basin levels and a reduction in extra investments for infrastructural measures?
NASA Astrophysics Data System (ADS)
Wood, A. W.; Clark, E.; Newman, A. J.; Nijssen, B.; Clark, M. P.; Gangopadhyay, S.; Arnold, J. R.
2015-12-01
The US National Weather Service River Forecasting Centers are beginning to operationalize short-range to medium-range ensemble predictions that have been in development for several years. This practice contrasts with the traditional single-value forecast practice at these lead times not only because the ensemble forecasts offer a basis for quantifying forecast uncertainty, but also because the use of ensembles requires a greater degree of automation in the forecast workflow than is currently used. For instance, individual ensemble member forcings cannot (practically) be manually adjusted, a step not uncommon with the current single-value paradigm; thus the forecaster is required to adopt a more 'over-the-loop' role than before. The relative lack of experience among operational forecasters and forecast users (e.g., water managers) in the US with over-the-loop approaches motivates the creation of a real-time demonstration and evaluation platform for exploring the potential of over-the-loop workflows to produce usable ensemble short-to-medium range forecasts, as well as long-range predictions. We describe the development and early results of such an effort by a collaboration between NCAR and two water agencies, the US Army Corps of Engineers and the US Bureau of Reclamation. Focusing on small to medium-sized headwater basins around the US, and using multi-decade series of ensemble streamflow hindcasts, we also describe early results assessing the skill of daily-updating, over-the-loop forecasts driven by a set of ensemble atmospheric outputs from the NCEP GEFS for lead times from 1-15 days.
NASA Technical Reports Server (NTRS)
Achuthavarier, Deepthi; Koster, Randal; Marshak, Jelena; Schubert, Siegfried; Molod, Andrea
2018-01-01
In this study, we examine the prediction skill and predictability of the Madden Julian Oscillation (MJO) in a recent version of the NASA GEOS-5 atmosphere-ocean coupled model run at 1/2 degree horizontal resolution. The results are based on a suite of hindcasts produced as part of the NOAA SubX project, consisting of seven ensemble members initialized every 5 days for the period 1999-2015. The atmospheric initial conditions were taken from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), and the ocean and the sea ice were taken from a GMAO ocean analysis. The land states were initialized from the MERRA-2 land output, which is based on observation-corrected precipitation fields. We investigated the MJO prediction skill in terms of the bivariate correlation coefficient for the real-time multivariate MJO (RMM) indices. The correlation coefficient stays at or above 0.5 out to forecast lead times of 26-36 days, with a pronounced increase in skill for forecasts initialized from phase 3, when the MJO convective anomaly is located in the central tropical Indian Ocean. A corresponding estimate of the upper limit of the predictability is calculated by considering a single ensemble member as the truth and verifying the ensemble mean of the remaining members against that. The predictability estimates fall between 35 and 37 days (taken as the forecast lead when the correlation reaches 0.5) and are rather insensitive to the initial MJO phase. The model shows slightly higher skill when the initial conditions contain strong MJO events compared to weak events, although the difference in skill is evident only from lead 1 to 20. Similar to other models, the RMM-index-based skill arises mostly from the circulation components of the index. The skill of the convective component of the index drops to 0.5 by day 20 as opposed to day 30 for circulation fields.
The propagation of the MJO anomalies over the Maritime Continent does not appear problematic in the GEOS-5 hindcasts implying that the Maritime Continent predictability barrier may not be a major concern in this model. Finally, the MJO prediction skill in this version of GEOS-5 is superior to that of the current seasonal prediction system at the GMAO; this could be partly attributed to a slightly better representation of the MJO in the free running version of this model and partly to the improved atmospheric initialization from MERRA-2.
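The bivariate correlation coefficient used in this abstract as the skill measure is commonly defined, for one lead time, as the normalized inner product of the observed and forecast (RMM1, RMM2) pairs summed over all forecasts. The sketch below follows that common definition and assumes it matches the one used in the study; the function name and array layout are hypothetical.

```python
import numpy as np

def bivariate_correlation(obs, fcst):
    """Bivariate correlation between observed and forecast RMM indices.

    obs, fcst : (n_forecasts, 2) arrays holding (RMM1, RMM2) at a single
                lead time, one row per forecast start date
    """
    obs = np.asarray(obs, dtype=float)
    fcst = np.asarray(fcst, dtype=float)
    # Sum of a1*b1 + a2*b2 over all forecasts
    numerator = np.sum(obs * fcst)
    # Product of the root-sum-square amplitudes of each series
    denominator = np.sqrt(np.sum(obs ** 2)) * np.sqrt(np.sum(fcst ** 2))
    return float(numerator / denominator)
```

Evaluating this function separately at each lead time and finding where it first drops below 0.5 gives the skill horizon quoted in the abstract. The same function applies to the predictability estimate if `obs` is replaced by the ensemble member treated as truth.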
Global system for hydrological monitoring and forecasting in real time at high resolution
NASA Astrophysics Data System (ADS)
Ortiz, Enrique; De Michele, Carlo; Todini, Ezio; Cifres, Enrique
2016-04-01
This project, presented at EGU 2016, was born of solidarity and of the need to dignify the most disadvantaged people living in the poorest countries (in Africa, South America and Asia), who are continually exposed to changes in the hydrologic cycle and suffer large floods and/or long periods of drought. 2016 is also a special year, the Year of Mercy, in which we must engage with the most disadvantaged of our planet (Gaia) by making available to them what we do professionally and scientifically. The project, called "Global system for hydrological monitoring and forecasting in real time at high resolution", is non-profit and aims to provide continuous, real-time hydrological monitoring and forecasting at high resolution (1 km²) globally. It couples weather forecasts from global circulation models, such as GFS-0.25° (deterministic and ensemble runs), with a computationally efficient, physically based distributed hydrological model: the latest extended version of the TOPKAPI model, named TOPKAPI-eXtended (TOPKAPI-X). Finally, the Model Conditional Processor (MCP) approach, which is essentially based on a multiple regression in the Normal space, is used for predictive uncertainty assessment; it can easily be extended to use ensembles to represent the local (in time) smaller or larger conditional predictive uncertainty as a function of the ensemble spread. In this way, each prediction in time accounts for both the predictive uncertainty of the ensemble mean and that of the ensemble spread. To perform continuous hydrological modelling with the TOPKAPI-X model and to have a hot start of the hydrological status of watersheds, the system assimilates rainfall and temperature products derived from remote sensing, such as the TRMM 3B42RT product from NASA and others. The system will be integrated into a Decision Support System (DSS) platform based on geographical data.
The DSS is a web application (for PC, tablet or mobile phone). It needs no installation (all that is required is a web browser and an internet connection) and no updates (all upgrades are deployed on the remote server). The DSS is a classical client-server application. The client side will be an HTML5/CSS3 application that runs in any of the most common browsers. The server side consists of a web server (Apache), a map server (GeoServer), a geographical relational database management system (PostgreSQL + PostGIS), and tools based on the GDAL libraries. A customized web page will be implemented to publish all hydrometeorological information and forecast runs, free of charge, for all users in the world. All scientific and technical people, universities and research centers (public or private) who want to collaborate are invited to attend this first presentation of the project, which opens a brainstorming to improve the system.
Regional crop yield forecasting: a probabilistic approach
NASA Astrophysics Data System (ADS)
de Wit, A.; van Diepen, K.; Boogaard, H.
2009-04-01
Information on the outlook on yield and production of crops over large regions is essential for government services dealing with import and export of food crops, for agencies with a role in food relief, for international organizations with a mandate in monitoring the world food production and trade, and for commodity traders. Process-based mechanistic crop models are an important tool for providing such information, because they can integrate the effect of crop management, weather and soil on crop growth. When properly integrated in a yield forecasting system, the aggregated model output can be used to predict crop yield and production at regional, national and continental scales. Nevertheless, given the scales at which these models operate, the results are subject to large uncertainties due to poorly known weather conditions and crop management. Current yield forecasting systems are generally deterministic in nature and provide no information about the uncertainty bounds on their output. To improve on this situation we present an ensemble-based approach where uncertainty bounds can be derived from the dispersion of results in the ensemble. The probabilistic information provided by this ensemble-based system can be used to quantify uncertainties (risk) on regional crop yield forecasts and can therefore be an important support to quantitative risk analysis in a decision making process.
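Deriving uncertainty bounds from the dispersion of an ensemble can be sketched as below. This is a minimal illustration, assuming each ensemble member yields one regional yield value; the 5-95% interval, the function names and the sample numbers are assumptions chosen for illustration and do not come from the abstract.

```python
import statistics

def yield_uncertainty_bounds(member_yields):
    """Summarize an ensemble of regional yield forecasts.

    Returns the median, a 5-95% interval and the sample standard
    deviation of the members. (Hypothetical helper; the interval
    width is an assumption.)
    """
    # quantiles(n=20) returns the 5%, 10%, ..., 95% cut points.
    cuts = statistics.quantiles(member_yields, n=20, method="inclusive")
    return {
        "median": statistics.median(member_yields),
        "lower": cuts[0],    # 5th percentile
        "upper": cuts[-1],   # 95th percentile
        "spread": statistics.stdev(member_yields),
    }

def probability_below(member_yields, threshold):
    """Fraction of ensemble members forecasting a yield below the threshold."""
    return sum(1 for y in member_yields if y < threshold) / len(member_yields)

# Usage with made-up yields in t/ha (illustrative values only).
bounds = yield_uncertainty_bounds([1.0, 2.0, 3.0, 4.0, 5.0])
risk = probability_below([1.0, 2.0, 3.0, 4.0, 5.0], 3.0)
```

The fraction returned by `probability_below` is one simple way such an ensemble could feed a quantitative risk analysis, for example as the estimated chance of a yield shortfall.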
Microcanonical entropy for classical systems
NASA Astrophysics Data System (ADS)
Franzosi, Roberto
2018-03-01
The entropy definition in the microcanonical ensemble is revisited. We propose a novel definition for the microcanonical entropy that resolves the debate on the correct definition of the microcanonical entropy. In particular, we show that this entropy definition fixes the problem inherent in the exact extensivity of the caloric equation. Furthermore, this entropy reproduces results which are in agreement with the ones predicted with the standard Boltzmann entropy when applied to macroscopic systems. On the contrary, the predictions obtained with the standard Boltzmann entropy and with the entropy we propose are different for small system sizes. Thus, we conclude that the Boltzmann entropy provides a correct description for macroscopic systems, whereas extremely small systems should be better described with the entropy that we propose here.
NASA Astrophysics Data System (ADS)
Slater, L. J.; Villarini, G.; Bradley, A.
2015-12-01
Model predictions of precipitation and temperature are crucial to mitigate the impacts of major flood and drought events through informed planning and response. However, the potential value and applicability of these predictions is inescapably linked to their forecast quality. The North-American Multi-Model Ensemble (NMME) is a multi-agency supported forecasting system for intraseasonal to interannual (ISI) climate predictions. Retrospective forecasts and real-time information are provided by each agency free of charge to facilitate collaborative research efforts for predicting future climate conditions as well as extreme weather events such as floods and droughts. Using the PRISM climate mapping system as the reference data, we examine the skill of five General Circulation Models (GCMs) from the NMME project to forecast monthly and seasonal precipitation and temperature over seven sub-regions of the continental United States. For each model, we quantify the seasonal accuracy of the forecast relative to observed precipitation using the mean square error skill score. This score is decomposed to assess the accuracy of the forecast in the absence of biases (potential skill), and in the presence of conditional (slope reliability) and unconditional (standardized mean error) biases. The quantification of these biases allows us to diagnose each model's skill over a full range of temporal and spatial scales. Finally, we test each model's forecasting skill by evaluating its ability to predict extended periods of extreme temperature and precipitation that were conducive to 'billion-dollar' historical flood and drought events in different regions of the continental USA. The forecasting skill of the individual climate models is summarized and presented along with a discussion of different multi-model averaging techniques for predicting such events.
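The decomposition of the mean square error skill score described above can be sketched as follows. This is a minimal illustration, assuming the standard decomposition with the observed climatological mean as the reference forecast, in which the score equals the squared correlation (potential skill) minus a conditional-bias term and an unconditional-bias term; whether the authors used exactly this form is an assumption, and the function name and sample values are hypothetical.

```python
import math

def msess_decomposition(forecasts, observations):
    """Decompose the MSE skill score (reference: observed mean) into
    potential skill, conditional bias and unconditional bias.

    Uses population (divide-by-n) statistics so that
    msess == potential - conditional - unconditional holds exactly.
    """
    n = len(forecasts)
    f_mean = sum(forecasts) / n
    o_mean = sum(observations) / n
    s_f = math.sqrt(sum((f - f_mean) ** 2 for f in forecasts) / n)
    s_o = math.sqrt(sum((o - o_mean) ** 2 for o in observations) / n)
    cov = sum((f - f_mean) * (o - o_mean)
              for f, o in zip(forecasts, observations)) / n
    r = cov / (s_f * s_o)

    potential = r ** 2                                 # skill with no biases
    conditional = (r - s_f / s_o) ** 2                 # slope reliability
    unconditional = ((f_mean - o_mean) / s_o) ** 2     # standardized mean error

    mse = sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n
    return {
        "msess": 1.0 - mse / s_o ** 2,
        "potential": potential,
        "conditional_bias": conditional,
        "unconditional_bias": unconditional,
    }

# Usage with made-up values: a forecast that is perfect apart from a
# constant offset of +1 loses skill only through the unconditional bias.
parts = msess_decomposition([2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 2.0, 3.0, 4.0, 5.0])
```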
Genetic Programming Based Ensemble System for Microarray Data Classification
Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To
2015-01-01
Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved. PMID:25810748
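How a GP individual might combine base classifiers with the Min, Max and Average operators can be sketched as below. This is a minimal illustration, assuming that each individual is an expression tree whose leaves are base classifiers and whose internal nodes apply a binary operator to class-probability outputs; the tree encoding, the binary arity and the probability values are assumptions for illustration, not the GPES implementation.

```python
# Operators applied element-wise to two class-probability values.
OPERATORS = {
    "min": min,
    "max": max,
    "avg": lambda a, b: (a + b) / 2.0,
}

def combine(node, base_probs):
    """Evaluate an ensemble tree.

    node is either an int (index of a base classifier) or a tuple
    (operator_name, left_subtree, right_subtree).
    base_probs[k][i][c] is the probability that base classifier k
    assigns to class c for sample i.
    """
    if isinstance(node, int):
        return base_probs[node]
    op_name, left, right = node
    op = OPERATORS[op_name]
    left_out = combine(left, base_probs)
    right_out = combine(right, base_probs)
    return [[op(a, b) for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left_out, right_out)]

def predict(node, base_probs):
    """Predicted class per sample: the class with the largest combined score."""
    return [row.index(max(row)) for row in combine(node, base_probs)]

# Usage: three hypothetical decision trees, two samples, two classes.
base_probs = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.6, 0.4], [0.7, 0.3]],
    [[0.2, 0.8], [0.1, 0.9]],
]
labels = predict(("avg", ("min", 0, 1), 2), base_probs)
```

In an evolutionary loop, trees of this kind would be mutated and recombined, with classification accuracy on held-out data as a plausible fitness measure.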
A multiphysical ensemble system of numerical snow modelling
NASA Astrophysics Data System (ADS)
Lafaysse, Matthieu; Cluzet, Bertrand; Dumont, Marie; Lejeune, Yves; Vionnet, Vincent; Morin, Samuel
2017-05-01
Physically based multilayer snowpack models suffer from various modelling errors. To represent these errors, we built the new multiphysical ensemble system ESCROC (Ensemble System Crocus) by implementing new representations of different physical processes in the deterministic coupled multilayer ground/snowpack model SURFEX/ISBA/Crocus. This ensemble was driven and evaluated at Col de Porte (1325 m a.s.l., French Alps) over 18 years with a high-quality meteorological and snow data set. A total of 7776 simulations were evaluated separately, accounting for the uncertainties of evaluation data. The ability of the ensemble to capture the uncertainty associated with modelling errors is assessed for snow depth, snow water equivalent, bulk density, albedo and surface temperature. Different sub-ensembles of the ESCROC system were studied with probabilistic tools to compare their performance. Results show that optimal members of the ESCROC system are able to explain more than half of the total simulation errors. Integrating members with biases exceeding the range corresponding to observational uncertainty is necessary to obtain an optimal dispersion, but this issue can also be a consequence of the fact that meteorological forcing uncertainties were not accounted for. The ESCROC system promises the integration of numerical snow-modelling errors in ensemble forecasting and ensemble assimilation systems in support of avalanche hazard forecasting and other snowpack-modelling applications.
NASA Astrophysics Data System (ADS)
Kamal, S.; Maslowski, W.; Roberts, A.; Osinski, R.; Cassano, J. J.; Seefeldt, M. W.
2017-12-01
The Regional Arctic System Model (RASM) has been developed and used to advance the current state of Arctic modeling and increase the skill of sea ice forecasts. RASM is a fully coupled, limited-area model that includes atmosphere, ocean, sea ice, land hydrology and runoff routing components, and a flux coupler to exchange information among them. Boundary conditions are derived from the NCEP Climate Forecast System Reanalysis (CFSR) or ERA-Interim (ERA-I) for hindcast simulations, or from the NCEP Coupled Forecast System Model version 2 (CFSv2) for seasonal forecasts. We have used RASM to produce sea ice forecasts for September 2016 and 2017, in contribution to the Sea Ice Outlook (SIO) of the Sea Ice Prediction Network (SIPN). Each year, we produced three SIOs for the September minimum, initialized on June 1, July 1 and August 1. In 2016, predictions used a simple linear regression model to correct for systematic biases and included the mean September sea ice extent, the daily minimum and the week of the minimum. In 2017, we produced a 12-member ensemble on June 1 and July 1, and a 28-member ensemble on August 1. The predictions for September 2017 included the pan-Arctic and regional Alaskan sea ice extent, and daily and monthly mean pan-Arctic maps of sea ice probability, concentration and thickness. No bias correction was applied to the 2017 forecasts. Finally, we will also discuss future plans for RASM forecasts, which include increased resolution for model components, ecosystem predictions with marine biogeochemistry extensions (mBGC) to the ocean and sea ice components, and the feasibility of optional boundary conditions using the Navy Global Environmental Model (NAVGEM).
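A linear-regression bias correction of the kind mentioned for the 2016 forecasts can be sketched as follows. This is a minimal illustration, assuming the correction is fitted by ordinary least squares between hindcast and observed September sea ice extent and then applied to a new forecast; the abstract does not describe the actual predictors or fitting period, so the function names and all numbers here are hypothetical.

```python
def fit_bias_correction(hindcast_extent, observed_extent):
    """Least-squares fit of observed = slope * hindcast + intercept.

    Returns (slope, intercept). Inputs are paired sequences of
    modelled and observed extents for past years.
    """
    n = len(hindcast_extent)
    x_mean = sum(hindcast_extent) / n
    y_mean = sum(observed_extent) / n
    sxy = sum((x - x_mean) * (y - y_mean)
              for x, y in zip(hindcast_extent, observed_extent))
    sxx = sum((x - x_mean) ** 2 for x in hindcast_extent)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def apply_bias_correction(raw_forecast, slope, intercept):
    """Map a raw model forecast onto the observed scale."""
    return slope * raw_forecast + intercept

# Usage with made-up extents in million km^2 (illustrative values only):
# the model runs 0.5 too high in every hindcast year.
slope, intercept = fit_bias_correction([5.0, 6.0, 7.0], [4.5, 5.5, 6.5])
corrected = apply_bias_correction(8.0, slope, intercept)
```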